Alvin Šipraga [Sat, 6 Jun 2026 08:29:30 +0000 (05:29 -0300)]
net: dsa: realtek: rtl8365mb: add VLAN support
Realtek RTL8365MB switches (a.k.a. RTL8367C family) use two different
structures for VLANs:
- VLAN4K: A full table with 4096 entries defining port membership and
tagging.
- VLANMC: A smaller table with 32 entries used primarily for PVID
assignment.
In this hardware, a port's PVID must point to an index in the VLANMC
table rather than a VID directly. Since the VLANMC table is limited to
32 entries, the driver implements a dynamic allocation scheme to
maximize resource usage:
- VLAN4K is treated by the driver as the source of truth for membership.
- A VLANMC entry is only allocated when a port is configured to use a
specific VID as its PVID.
- VLANMC entries are deleted when no longer needed as a PVID by any port.
Although VLANMC has a members field, the switch only checks membership
in the VLAN4K table. The driver uses the VLANMC members field as a way to
track which ports are using that entry as their PVID.
VLANMC index 0, although a valid entry, is reserved in this driver as a
neutral PVID value for ports not using a specific PVID.
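The allocation scheme described above can be sketched in plain C. This is a minimal userspace model, not the driver's code: the struct and function names are hypothetical, and only the table size (32), the reserved index 0, and the use of the members field to track PVID users come from the description.

```c
#include <assert.h>
#include <stdint.h>

#define VLANMC_ENTRIES 32   /* table size stated in the commit message */
#define VLANMC_NEUTRAL 0    /* index 0 is reserved as the neutral PVID */

/* Hypothetical software mirror of one VLANMC entry. */
struct vlanmc_entry {
	uint16_t vid;      /* VID this entry points at */
	uint32_t members;  /* bitmask of ports using this entry as PVID */
};

struct vlanmc_table {
	struct vlanmc_entry e[VLANMC_ENTRIES];
};

/* Return the entry already holding vid, else a free one, else -1 (table full). */
int vlanmc_find_or_alloc(const struct vlanmc_table *t, uint16_t vid)
{
	int free_idx = -1;

	for (int i = VLANMC_NEUTRAL + 1; i < VLANMC_ENTRIES; i++) {
		if (t->e[i].members && t->e[i].vid == vid)
			return i;              /* entry is in use for this VID */
		if (!t->e[i].members && free_idx < 0)
			free_idx = i;          /* remember the first unused entry */
	}
	return free_idx;
}

/* Make vid the PVID of port. Returns the VLANMC index, or -1 when full. */
int vlanmc_set_pvid(struct vlanmc_table *t, unsigned int port, uint16_t vid)
{
	int idx;

	assert(port < 32);
	idx = vlanmc_find_or_alloc(t, vid);
	if (idx < 0)
		return -1;
	t->e[idx].vid = vid;
	t->e[idx].members |= UINT32_C(1) << port;
	return idx;
}

/* Drop port from entry idx. The entry is free again once no port uses it. */
void vlanmc_clear_pvid(struct vlanmc_table *t, unsigned int port, int idx)
{
	assert(port < 32 && idx > VLANMC_NEUTRAL && idx < VLANMC_ENTRIES);
	t->e[idx].members &= ~(UINT32_C(1) << port);
}
```

In this model an entry counts as allocated exactly while its members mask is non-zero, so no separate "in use" flag is needed. A caller changing a port's PVID is assumed to clear the old entry itself.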
In the subsequent RTL8367D switch family, the VLANMC table was
removed and PVID assignment was delegated to a dedicated set of
registers.
The use of FIELD_PREP for reconstructing LO/HI values was suggested by
Yury Norov.
Fix for vlan_setup and vlan_filtering was suggested by Abdulkader
Alrezej.
net: dsa: realtek: rtl8365mb: use dsa helpers for port iteration
Convert open-coded port iteration loops to use the DSA helpers and
restructure rtl8365mb_setup() into clear blocking, user, and
CPU port phases.
As part of this refactoring, unused ports are explicitly placed into a
blocked, isolated state with learning disabled, ensuring safe default
hardware behavior. The driver also does not allocate a virtual IRQ
mapping for unused ports. To accommodate this, a guard check is added to
the interrupt handler (rtl8365mb_irq) to safely skip ports without a
valid IRQ mapping. The irq domain teardown, however, does clean all
ports as external PHYs may still map the IRQ.
Furthermore, since the new initialization loop starts with all ports
administratively isolated by default, CPU port forwarding and isolation
masks are explicitly configured at the end of the setup phase to prevent
egress traffic from being blocked.
Explicitly enforce the presence of a CPU port (-EINVAL) and reject DSA
cascade links (-EOPNOTSUPP) during setup to prevent silent failures.
These topologies were already non-functional. Without a CPU port, the
driver does not activate CPU tagging. Additionally, the switch hardware
was not designed to be cascaded, and DSA links never worked because
CPU tagging is not enabled for them.
Convert numeric error codes into human-readable strings by using %pe
together with ERR_PTR() in dev_err() messages. Also use dev_err_probe()
instead of checking for -EPROBE_DEFER.
Guangshuo Li [Sun, 7 Jun 2026 14:57:47 +0000 (22:57 +0800)]
net: lan966x: restore RX state on reload failure
lan966x_fdma_reload() backs up rx->page_pool and rx->fdma before
reallocating the RX resources for the new MTU. If the allocation fails,
the restore path puts these fields back before restarting RX.
However, the reload path also updates rx->page_order and rx->max_mtu
before calling lan966x_fdma_rx_alloc(). These fields are not restored on
failure, so RX can be restarted with the old pages, old FDMA state and
old page pool, but with the page geometry from the failed new MTU.
This can make the XDP path advertise a frame size derived from the new
page_order while the actual RX pages still come from the old allocation.
For example, after a failed reload to a jumbo MTU, xdp_init_buff() may be
called with a frame size larger than the restored RX pages.
lan966x_fdma_rx_alloc_page_pool() also registers the newly allocated page
pool with each port's XDP RXQ before fdma_alloc_coherent() is called. If
fdma_alloc_coherent() fails, the new page pool is destroyed, but the
rollback path does not restore the per-port XDP RXQ mem model
registration either.
Save and restore rx->page_order and rx->max_mtu, and restore the old page
pool registration for each port's XDP RXQ before RX is started again.
This keeps the restored RX state consistent after a failed reload.
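The save/restore idea can be illustrated with a small userspace model. It is only a sketch under assumptions: struct rx_state, rx_alloc() and rx_reload() are hypothetical stand-ins, and page_pool_id merely represents the page pool together with its XDP RXQ registration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the RX fields named in the commit message. */
struct rx_state {
	int page_order;
	int max_mtu;
	int page_pool_id;   /* represents rx->page_pool and its RXQ registration */
};

/* Hypothetical allocator that fails on request to exercise the rollback. */
int rx_alloc(struct rx_state *rx, bool fail)
{
	if (fail)
		return -ENOMEM;
	rx->page_pool_id++;   /* pretend a new page pool was created */
	return 0;
}

/* Reload for a new MTU, restoring every touched field if allocation fails. */
int rx_reload(struct rx_state *rx, int new_mtu, int new_order, bool fail)
{
	struct rx_state old;
	int err;

	assert(rx != NULL);
	old = *rx;                    /* snapshot everything the reload touches */

	rx->max_mtu = new_mtu;        /* geometry is updated before allocating */
	rx->page_order = new_order;

	err = rx_alloc(rx, fail);
	if (err)
		*rx = old;                /* roll back geometry together with the pool */
	return err;
}
```

Taking one snapshot of the whole state, rather than saving selected fields, is a design choice of this sketch that makes it hard to forget a field on the failure path.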
Vadim Fedorenko [Mon, 8 Jun 2026 15:59:52 +0000 (15:59 +0000)]
ptp: ocp: fix resource freeing order
Commit a60fc3294a37 ("ptp: rework ptp_clock_unregister() to disable
events") added a call to ptp_disable_all_events(), which changes the
configuration of pins if they support EXTTS events. In ptp_ocp_detach(),
pin resources are freed before ptp_clock_unregister() is called, which
leads to a use-after-free during driver removal. Fix it by changing the
order of the free/unregister calls. To keep the irq handler from running
on another core while the ptp device is being unregistered, call
synchronize_irq() after the HW is configured to stop producing irqs, so
that no irqs are in flight.
Xiang Mei [Sun, 7 Jun 2026 05:44:28 +0000 (22:44 -0700)]
tun: zero the whole vnet header in tun_put_user()
tun_put_user() declares an on-stack struct virtio_net_hdr_v1_hash_tunnel
without zeroing it. For a non-tunnel skb, virtio_net_hdr_tnl_from_skb()
only initializes the first 10 bytes (sizeof(struct virtio_net_hdr)),
leaving bytes 10..23 (num_buffers and the hash/tunnel fields) as stack
garbage.
An unprivileged user can set the vnet header size to 24 with
TUNSETVNETHDRSZ, so __tun_vnet_hdr_put() copies all 24 bytes of the
partially-initialized struct to userspace, leaking 14 bytes of kernel
stack on every read of a non-tunnel packet.
Fix it the same way tun_get_user() already does by zeroing the whole
header right after declaration.
Fixes: 288f30435132 ("tun: enable gso over UDP tunnel support.")
Reported-by: Weiming Shi <bestswngs@gmail.com>
Signed-off-by: Xiang Mei <xmei5@asu.edu>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260607054428.3050243-1-xmei5@asu.edu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
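The leak and its fix can be modeled in userspace. This is a sketch, not the tun code: the helper names are hypothetical, and only the sizes (10 initialized bytes out of a 24-byte header) are taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BASE_HDR_LEN 10   /* sizeof(struct virtio_net_hdr), per the commit message */
#define FULL_HDR_LEN 24   /* vnet header size selectable via TUNSETVNETHDRSZ */

/* Hypothetical stand-in: fills only the base header, like the non-tunnel path. */
void fill_base_hdr(unsigned char *hdr)
{
	memset(hdr, 0xAB, BASE_HDR_LEN);
}

/* Build the header the safe way: zero all of it, then fill the base part. */
void build_hdr(unsigned char *hdr)
{
	assert(hdr != NULL);
	memset(hdr, 0, FULL_HDR_LEN);
	fill_base_hdr(hdr);
}

/* Count bytes past the base header that still hold non-zero (stale) data. */
size_t count_stale_bytes(const unsigned char *hdr)
{
	size_t n = 0;

	for (size_t i = BASE_HDR_LEN; i < FULL_HDR_LEN; i++)
		n += hdr[i] != 0;
	return n;
}
```

Without the memset() in build_hdr(), the 14 trailing bytes would keep whatever the buffer held before, which is the pattern that leaks stack contents when the whole header is copied out.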
Weiming Shi [Sat, 6 Jun 2026 19:24:48 +0000 (12:24 -0700)]
net/rds: fix NULL deref in rds_ib_send_cqe_handler() on masked atomic completion
rds_ib_xmit_atomic() always programs a masked atomic opcode
(IB_WR_MASKED_ATOMIC_CMP_AND_SWP or IB_WR_MASKED_ATOMIC_FETCH_AND_ADD)
for every RDS atomic cmsg. But the completion-side switch in
rds_ib_send_unmap_op() only handles the non-masked opcodes, so a masked
atomic completion falls through to default and returns rm == NULL while
send->s_op is left set. rds_ib_send_cqe_handler() then dereferences the
NULL rm via rm->m_final_op, oopsing in softirq context. An unprivileged
AF_RDS sendmsg() of an atomic cmsg over an active RDS/IB connection
triggers it; on hardware that natively accepts masked atomics (mlx4,
mlx5) no extra setup is needed.
RDS/IB: rds_ib_send_unmap_op: unexpected opcode 0xd in WR!
Oops: general protection fault [#1] SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000190-0x0000000000000197]
RIP: rds_ib_send_cqe_handler+0x25c/0xb10 (net/rds/ib_send.c:282)
Call Trace:
<IRQ>
rds_ib_send_cqe_handler (net/rds/ib_send.c:282)
poll_scq (net/rds/ib_cm.c:274)
rds_ib_tasklet_fn_send (net/rds/ib_cm.c:294)
tasklet_action_common (kernel/softirq.c:943)
handle_softirqs (kernel/softirq.c:573)
run_ksoftirqd (kernel/softirq.c:479)
</IRQ>
Kernel panic - not syncing: Fatal exception in interrupt
Handle the masked atomic opcodes in the same case as the non-masked
ones: they map to the same struct rds_message.atomic union member, so
the existing container_of()/rds_ib_send_unmap_atomic() body is correct
for them.
Fixes: 20c72bd5f5f9 ("RDS: Implement masked atomic operations")
Reported-by: Xiang Mei <xmei5@asu.edu>
Signed-off-by: Weiming Shi <bestswngs@gmail.com>
Reviewed-by: Allison Henderson <achender@kernel.org>
Link: https://patch.msgid.link/20260606192447.1179255-2-bestswngs@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
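The shape of the fix, masked opcodes sharing the case body of the non-masked ones, can be sketched as below. The enums are hypothetical mirrors made up for this example and do not carry the real IB opcode values.

```c
#include <assert.h>

/* Hypothetical mirror of the work-request opcodes discussed above. */
enum wr_opcode {
	WR_SEND,
	WR_RDMA_WRITE,
	WR_RDMA_READ,
	WR_ATOMIC_CMP_AND_SWP,
	WR_ATOMIC_FETCH_AND_ADD,
	WR_MASKED_ATOMIC_CMP_AND_SWP,
	WR_MASKED_ATOMIC_FETCH_AND_ADD,
	WR_UNKNOWN,
};

/* Which part of the message a completion belongs to; OP_NONE means "no rm". */
enum op_kind { OP_NONE, OP_DATA, OP_RDMA, OP_ATOMIC };

enum op_kind classify_completion(enum wr_opcode op)
{
	assert(op <= WR_UNKNOWN);   /* must be one of the enum values */

	switch (op) {
	case WR_SEND:
		return OP_DATA;
	case WR_RDMA_WRITE:
	case WR_RDMA_READ:
		return OP_RDMA;
	case WR_ATOMIC_CMP_AND_SWP:
	case WR_ATOMIC_FETCH_AND_ADD:
	case WR_MASKED_ATOMIC_CMP_AND_SWP:    /* masked variants share the body */
	case WR_MASKED_ATOMIC_FETCH_AND_ADD:
		return OP_ATOMIC;
	default:
		return OP_NONE;                   /* caller must not dereference */
	}
}
```

If the two masked cases were missing, a masked completion would land in the default branch and yield OP_NONE, which corresponds to the NULL rm in the report.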
Kyle Zeng [Sun, 7 Jun 2026 02:18:19 +0000 (19:18 -0700)]
net: guard timestamp cmsgs to real error queue skbs
skb_is_err_queue() treats PACKET_OUTGOING as the sole marker for an skb
from sk_error_queue. That assumption is not true for AF_PACKET sockets:
outgoing packet taps are also delivered to packet sockets with
skb->pkt_type == PACKET_OUTGOING, but their skb->cb is owned by AF_PACKET
instead of struct sock_exterr_skb.
If such an skb is received with timestamping enabled, the generic
timestamp cmsg path can read AF_PACKET control-buffer state as
sock_exterr_skb::opt_stats. With SO_RXQ_OVFL enabled, the packet drop
counter overlaps opt_stats. An odd drop count makes the path emit
SCM_TIMESTAMPING_OPT_STATS with skb->len and skb->data. For non-linear
skbs this copies past the linear head and can trigger hardened usercopy or
disclose adjacent heap contents.
Keep skb_is_err_queue() local to net/socket.c, but make it verify that
the PACKET_OUTGOING marker is paired with the sock_rmem_free destructor
installed by sock_queue_err_skb(). AF_PACKET receive skbs use normal
receive ownership and no longer pass as error-queue skbs, while legitimate
sk_error_queue entries keep the PACKET_OUTGOING marker and sock_rmem_free
ownership.
Fixes: 8605330aac5a ("tcp: fix SCM_TIMESTAMPING_OPT_STATS for normal skbs")
Signed-off-by: Kyle Zeng <kylebot@openai.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260607021819.49698-1-kylebot@openai.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
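The paired check can be modeled as follows. This is a userspace sketch with hypothetical fake_* types and functions. The real check operates on struct sk_buff and the kernel's destructors, and the receive-side destructor here is only an assumed placeholder for "normal receive ownership".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FAKE_PACKET_OUTGOING 4   /* stand-in for PACKET_OUTGOING */

struct fake_skb {
	int pkt_type;
	void (*destructor)(struct fake_skb *skb);
};

/* Stand-in for sock_rmem_free(), installed by sock_queue_err_skb(). */
void fake_sock_rmem_free(struct fake_skb *skb)
{
	skb->destructor = NULL;
}

/* Placeholder for the destructor of an ordinary received skb. */
void fake_sock_rfree(struct fake_skb *skb)
{
	skb->pkt_type = 0;
}

/* An skb is an error-queue skb only if marker and ownership both match. */
bool fake_skb_is_err_queue(const struct fake_skb *skb)
{
	assert(skb != NULL);
	return skb->pkt_type == FAKE_PACKET_OUTGOING &&
	       skb->destructor == fake_sock_rmem_free;
}
```

An outgoing tap delivered to a packet socket carries the same pkt_type but a different destructor, so it no longer passes the check.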
Xin Long [Sun, 7 Jun 2026 23:03:47 +0000 (19:03 -0400)]
sctp: validate embedded INIT chunk and address list lengths in cookie
sctp_unpack_cookie() only checked that the embedded INIT chunk length
did not exceed the remaining cookie payload, but did not ensure that the
INIT chunk is large enough to contain a complete INIT header.
A malformed COOKIE_ECHO can therefore carry a truncated INIT chunk whose
length field is smaller than sizeof(struct sctp_init_chunk). Later,
sctp_process_init() accesses INIT parameters unconditionally, which may
lead to out-of-bounds reads.
In addition, raw_addr_list_len is not fully validated against the
remaining cookie payload. When cookie authentication is disabled, an
attacker can supply an oversized raw_addr_list_len and cause
sctp_raw_to_bind_addrs() to read beyond the end of the cookie. The
address parser also lacks sufficient bounds checks for parameter headers
and lengths, allowing malformed address parameters to trigger
out-of-bounds reads.
Fix this by:
- requiring the embedded INIT chunk length to be at least sizeof(struct
sctp_init_chunk);
- validating that the INIT chunk and raw address list together fit
within the cookie payload;
- verifying sufficient data exists for each address parameter header and
payload before parsing it.
Note that sctp_verify_init() must be called after sctp_unpack_cookie()
and before sctp_process_init() when cookie authentication is disabled.
This will be addressed in a separate patch.
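The first two checks in the list above amount to a few overflow-safe length comparisons. The function below is an illustrative sketch with hypothetical names. min_init_len stands in for sizeof(struct sctp_init_chunk) and is passed in rather than hard-coded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Validate the lengths carried in a cookie.
 *   payload_len       bytes available after the fixed cookie fields
 *   init_len          length field of the embedded INIT chunk
 *   raw_addr_list_len length of the raw address list following the INIT
 *   min_init_len      smallest INIT chunk that holds a complete header
 */
bool cookie_lengths_ok(size_t payload_len, size_t init_len,
		       size_t raw_addr_list_len, size_t min_init_len)
{
	assert(min_init_len > 0);

	if (init_len < min_init_len)
		return false;   /* truncated INIT: header would be incomplete */
	if (init_len > payload_len)
		return false;   /* INIT runs past the end of the cookie */
	/* Subtract instead of adding so a huge list length cannot wrap. */
	if (raw_addr_list_len > payload_len - init_len)
		return false;   /* INIT + address list do not fit together */
	return true;
}
```

For example, with a 100-byte payload and a 20-byte minimum, an INIT of 20 bytes leaves room for at most 80 bytes of address list.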
====================
net: add retry mechanism to ndo_set_rx_mode_async
The original async ndo_set_rx_mode work left one place where we do netdev_WARN
in response to an ENOMEM. The intent was to see whether actual real
users can hit that (adding uc/mc under memory pressure seems like a
very unlikely thing to do). However, it was quickly triggered by
syzbot's failslab. Add a retry mechanism and downgrade netdev_WARN
to netdev_err. The retry logic is a typical exponential backoff:
1, 2, 4, 8 seconds, 15 in total, hopefully enough for a system to resolve
memory pressure.
====================
Eric Dumazet [Mon, 8 Jun 2026 15:59:18 +0000 (15:59 +0000)]
ip6_vti: set netns_immutable on the fallback device.
john1988 and Noam Rathaus reported that vti6_init_net() does not set the
netns_immutable flag on the per-netns fallback tunnel device (ip6_vti0).
Other similar tunnel drivers (like ip6_tunnel, sit, ip6_gre, and ip_tunnel)
correctly set this flag during their fallback device initialization to
prevent them from being moved to another network namespace.
Fixes: 61220ab34948 ("vti6: Enable namespace changing")
Reported-by: Noam Rathaus <noamr@ssd-disclosure.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Reviewed-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Link: https://patch.msgid.link/20260608155918.787644-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Remove the driver-specific BNXT_STATE_L2_FILTER_RETRY + timer + sp_task
retry mechanism and rely on the core stack's ndo_set_rx_mode_async retry
instead.
bnxt_cfg_rx_mode() now returns errors instead of swallowing them. The
PF-unavailable case (-ENODEV from HWRM on a VF) is normalized to
-EAGAIN at the boundary so callers can match on a single "retry me"
errno without re-implementing the VF/-ENODEV check. Other errors
propagate unchanged.
This removes:
- BNXT_STATE_L2_FILTER_RETRY state bit
- BNXT_RX_MASK_SP_EVENT sp_event bit
- Retry trigger from bnxt_timer()
- BNXT_RX_MASK_SP_EVENT handling from bnxt_sp_task()
bnxt_init_chip() still calls bnxt_cfg_rx_mode() directly during open.
On a fresh open dev->uc is empty and the call effectively cannot fail
on the unicast path. But on FW reset reopen (bnxt_fw_reset_task ->
bnxt_open) a VF may have a populated dev->uc and the PF may be
transiently unavailable; since that path doesn't go through
__dev_open(), the follow-up rx_mode call that would otherwise drive
the core retry doesn't fire. On -EAGAIN, swallow the error and call
netif_rx_mode_schedule_retry() explicitly. The unicast filter loop
truncates vnic->uc_filter_count on failure, so the retry's delta check
sees pending work and reinstalls.
Cc: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20260608154014.227538-4-sdf@fomichev.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When ndo_set_rx_mode_async returns an error, schedule a retry with
exponential backoff (1s, 2s, 4s, 8s -- 15s total). Give up after the
4th retry and log an error via netdev_err().
This moves retry logic from individual drivers into the core stack.
Timer callback does not hold a ref on dev. Safe because the timer can
only be armed when dev is IFF_UP, and __dev_close_many runs
timer_delete_sync before clearing IFF_UP. Unregister always closes
IFF_UP devices first, so by the time dev can be freed the timer is
dead and cannot be re-armed.
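The backoff schedule itself is simple arithmetic. The sketch below uses hypothetical helper names and models only the delays (1s, 2s, 4s, 8s, giving up after the 4th retry), not the timer or locking.

```c
#include <assert.h>

#define RX_MODE_MAX_RETRIES 4   /* give up after the 4th retry */

/*
 * Delay in seconds before retry number `attempt` (0-based): 1, 2, 4, 8.
 * Returns 0 once the retries are exhausted, meaning "log and give up".
 */
unsigned int rx_mode_retry_delay(unsigned int attempt)
{
	if (attempt >= RX_MODE_MAX_RETRIES)
		return 0;
	return 1u << attempt;
}

/* Total time spent waiting if every retry fails. */
unsigned int rx_mode_retry_budget(void)
{
	unsigned int total = 0;

	for (unsigned int i = 0; i < RX_MODE_MAX_RETRIES; i++)
		total += rx_mode_retry_delay(i);

	/* Geometric sum: 1 + 2 + ... + 2^(n-1) == 2^n - 1. */
	assert(total == (1u << RX_MODE_MAX_RETRIES) - 1);
	return total;
}
```

With four retries the budget is 2^4 - 1 = 15 seconds, matching the total quoted in the commit message.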
net: change ndo_set_rx_mode_async return type to int
Change the return type of ndo_set_rx_mode_async from void to int to
allow drivers to report failures back to the core stack. This is a
prerequisite for adding retry logic in the core when drivers fail to
program RX filters (e.g. bnxt VF when PF is unavailable).
All existing implementations return 0 for now, maintaining current
behavior.
sctp: fix uninit-value in __sctp_rcv_asconf_lookup()
__sctp_rcv_asconf_lookup() in net/sctp/input.c only checks that the ASCONF
chunk can hold the ADDIP header and a parameter header, then calls
af->from_addr_param(), which reads the full address (16 bytes for IPv6)
trusting the parameter's declared length.
An unauthenticated peer can send a truncated trailing ASCONF chunk that
declares an IPv6 address parameter but stops after the 4-byte parameter
header; reached from the no-association lookup path, from_addr_param() then
reads uninitialized bytes past the parameter.
Impact: an unauthenticated SCTP peer makes the receive path read up to 16
bytes of uninitialized memory past a truncated ASCONF address parameter.
The sibling __sctp_rcv_init_lookup() bounds parameters with
sctp_walk_params(); this path open-codes the fetch and omits the bound.
Verify the whole address parameter lies within the chunk before
from_addr_param() reads it, the same class of fix as commit 51e5ad549c43
("net: sctp: fix KMSAN uninit-value in sctp_inq_pop").
Fixes: df2185771439 ("[SCTP]: Update association lookup to look at ASCONF chunks as well")
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Acked-by: Xin Long <lucien.xin@gmail.com>
Link: https://patch.msgid.link/20260608122234.459098-1-michael.bommarito@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
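The bound being added can be sketched as a standalone check over a byte buffer. param_fits() is a hypothetical helper written for this example. It assumes the usual parameter layout of a 2-byte type followed by a 2-byte big-endian length that covers the 4-byte header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PARAM_HDR_LEN 4   /* 2-byte type + 2-byte length, both big-endian */

/*
 * Return true when the parameter starting at offset `off` lies entirely
 * inside a chunk of `chunk_len` bytes, judged by its declared length.
 */
bool param_fits(const uint8_t *chunk, size_t chunk_len, size_t off)
{
	size_t room, plen;

	assert(chunk != NULL);
	if (off > chunk_len)
		return false;
	room = chunk_len - off;
	if (room < PARAM_HDR_LEN)
		return false;                 /* not even a full header */

	plen = ((size_t)chunk[off + 2] << 8) | chunk[off + 3];
	return plen >= PARAM_HDR_LEN && plen <= room;
}
```

A parameter that declares 20 bytes (4-byte header plus a 16-byte IPv6 address) but sits in a chunk that ends right after the header is rejected before anything reads the address.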
net: mana: Cache MANA_QUERY_LINK_CONFIG result to avoid repeated HWC queries
mana_query_link_cfg() sends an HWC command to firmware on every call,
but the link speed and QoS values it returns only change when the
driver explicitly calls mana_set_bw_clamp(). This function is called
not only by userspace via ethtool get_link_ksettings, but also
periodically by hv_netvsc through netvsc_get_link_ksettings and by
the sysfs speed_show attribute via dev_attr_show, resulting in
unnecessary HWC traffic every few minutes.
Add a link_cfg_error field to mana_port_context to cache the query
result. The field uses three states: 1 (not yet queried, initial
value set during mana_probe_port), 0 (success, speed/max_speed are
valid), or a negative errno for permanent errors like -EOPNOTSUPP
when the hardware does not support the command. Transient errors and
qos_unconfigured responses are not cached so that subsequent calls
will retry.
MANA is ops-locked because it implements net_shaper_ops, so the core
already takes netdev_lock() around all ethtool_ops and net_shaper_ops
entry points. Reuse that lock to serialize mana_query_link_cfg() and
mana_set_bw_clamp(). This prevents a concurrent mana_set_bw_clamp()
from racing with an in-flight query and publishing stale pre-clamp
speed/max_speed.
Invalidate the cache inside mana_set_bw_clamp() on success, so all
current and future callers that change the link configuration
automatically trigger a fresh query on the next mana_query_link_cfg()
call. Also reset link_cfg_error during resume in mana_probe() under
netdev_lock(), so that any query already in flight cannot later
store 0 and silently overwrite the post-resume invalidation.
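The three-state cache can be modeled in a few lines. This is a simplified sketch with hypothetical names. The firmware answer is passed in as hw_result/hw_speed instead of being fetched over HWC, qos_unconfigured handling is left out, and locking is omitted on the assumption that the caller already holds the relevant lock.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define LINK_CFG_UNQUERIED 1   /* initial state per the commit message */

struct port_ctx {
	int link_cfg_error;        /* 1, 0, or a negative errno */
	unsigned int speed;        /* valid only when link_cfg_error == 0 */
	unsigned int hw_queries;   /* counts simulated HWC round trips */
};

/* Return the cached result when there is one, otherwise "query firmware". */
int query_link_cfg(struct port_ctx *p, int hw_result, unsigned int hw_speed,
		   unsigned int *speed)
{
	assert(p != NULL && speed != NULL);

	if (p->link_cfg_error <= 0) {        /* cached success or permanent error */
		if (p->link_cfg_error == 0)
			*speed = p->speed;
		return p->link_cfg_error;
	}

	p->hw_queries++;                     /* cache miss: one round trip */
	if (hw_result == 0) {
		p->speed = hw_speed;
		p->link_cfg_error = 0;
		*speed = hw_speed;
	} else if (hw_result == -EOPNOTSUPP) {
		p->link_cfg_error = hw_result;   /* permanent: remember it */
	}
	/* Any other error is treated as transient and left uncached. */
	return hw_result;
}

/* Changing the clamp invalidates the cache so the next query is fresh. */
void set_bw_clamp(struct port_ctx *p)
{
	p->link_cfg_error = LINK_CFG_UNQUERIED;
}
```

Using the positive value 1 for "not yet queried" keeps 0 and negative errnos free for real results, so a single integer covers all three states.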
David Yang [Sat, 6 Jun 2026 12:52:44 +0000 (20:52 +0800)]
net: mscc: ocelot: validate netdev belongs to switch in .netdev_to_port()
The .netdev_to_port() currently takes only a net_device and returns the
port index, without verifying the netdev actually belongs to the switch
being operated on. This can cause flower rule parsing to silently
resolve to a wrong port on the local hardware.
Update both implementations felix_netdev_to_port() and
ocelot_netdev_to_port() to validate ownership. Also update the callers
in ocelot_flower.c to pass through the ocelot context.
Kyle Meyer [Fri, 5 Jun 2026 22:25:24 +0000 (17:25 -0500)]
bnxt_en: Fix NULL pointer dereference
PCIe errors detected by a Root Port or Downstream Port cause error
recovery services to run on all subordinate devices regardless of
administrative state.
The .error_detected() callback, bnxt_io_error_detected(), disables
and synchronizes IRQs via bnxt_disable_int_sync(), which calls
bnxt_cp_num_to_irq_num() to map completion rings to IRQs using
bp->bnapi.
Since bp->bnapi is allocated on NIC open and freed on NIC close, PCIe
error recovery on a closed NIC can dereference a NULL pointer.
Check if bp->bnapi is NULL before disabling and synchronizing IRQs.
Haiyang Zhang [Fri, 5 Jun 2026 21:22:56 +0000 (14:22 -0700)]
net: mana: Add support for PF device 0x00C1
Update the device id table to include the new device id 0x00C1.
This device's BAR layout is similar to the VF's, so update
mana_gd_init_registers() accordingly.
net: dsa: qca8k: Add support for force mode for fixed link topology
A fixed link topology is commonly used to connect this switch (on port
0 or 6) to a SoC's MAC over SGMII. When inband negotiation is not used,
the switch needs to be configured to operate in force mode. As such,
enable support for force mode.
====================
Add motorcomm 8531s set ds func and 8522 driver
This series is for the Starfive JHB100 EVB board. The JHB100 contains
one RGMII/RMII and one RMII Synopsys GMAC core. On the EVB board, the
RGMII interface connects to a YT8531s Ethernet PHY and the RMII
interface connects to a YT8522 Ethernet PHY. Patches 1-2 are for the
RGMII interface and patch 3 is for the RMII interface.
JHB100 is a new Starfive RISC-V SoC for datacenter BMCs (Baseboard
Management Controllers), similar to the Aspeed 27x0.
The JHB100 minimal system upstream is in progress:
https://patchwork.kernel.org/project/linux-riscv/cover/20260508053632.818548-1-changhuang.liang@starfivetech.com/
====================
Minda Chen [Fri, 5 Jun 2026 06:02:11 +0000 (14:02 +0800)]
net: motorcomm: phy: set drive strength in YT8531s RGMII
Set the RXD and RX CLK pin drive strength when the YT8531s is connected
over RGMII. The 8531s PHY ID must be checked because the 8521 and 8531s
have different pin drive strength settings, so the 8521 cannot call
yt8531_set_ds().
Minda Chen [Fri, 5 Jun 2026 06:02:10 +0000 (14:02 +0800)]
net: phy: motorcomm: move mdio lock out from yt8531_set_ds()
yt8531_set_ds() takes the MDIO lock itself when writing registers and
has so far only been called for the YT8531 PHY. The newer YT8531s
supports RGMII and has the same pin drive strength settings as the
YT8531, so it also needs to call yt8531_set_ds(). However, the YT8531s
config init function, yt8521_config_init(), already holds the MDIO lock
via phy_select_page(), so calling yt8531_set_ds() from it would
deadlock. Move the MDIO locking out of yt8531_set_ds() so that callers
take the lock before calling it.
Wyatt Feng [Fri, 5 Jun 2026 05:53:42 +0000 (13:53 +0800)]
sctp: stream: fully roll back denied add-stream state
When ADD_OUT_STREAMS is denied, SCTP only shrinks the queued chunks and
then lowers outcnt. That leaves removed stream metadata behind, so a
later re-add can reuse a stale ext and hit a null-pointer dereference in
the scheduler get path.
Fix the rollback by tearing down the removed stream state the same way
other stream resizes do. Unschedule the current scheduler state, drop
the removed stream ext state with sctp_stream_outq_migrate(), and then
reschedule the remaining streams.
This keeps scheduler-private RR/FC/PRIO lists consistent while fully
rolling back denied outgoing stream additions.
Fixes: 637784ade221 ("sctp: introduce priority based stream scheduler")
Cc: stable@kernel.org
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Zhengchuan Liang <zcliangcn@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Signed-off-by: Wyatt Feng <bronzed_45_vested@icloud.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Acked-by: Xin Long <lucien.xin@gmail.com>
Link: https://patch.msgid.link/d78954ecd94954653ee299400e98d74a03a6f7d3.1780603399.git.bronzed_45_vested@icloud.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Wed, 10 Jun 2026 00:23:42 +0000 (17:23 -0700)]
Merge branch 'mana-per-vport-eq'
Long Li says:
====================
net: mana: Per-vPort EQ and MSI-X management
This series moves EQ ownership from the shared mana_context to per-vPort
mana_port_context, enabling each vPort to have dedicated MSI-X vectors
when the hardware provides enough vectors. When vectors are limited, the
driver falls back to sharing MSI-X among vPorts.
The series introduces a GDMA IRQ Context (GIC) abstraction with reference
counting to manage interrupt context lifecycle. This allows both Ethernet
and RDMA EQs to dynamically acquire dedicated or shared MSI-X vectors at
vPort creation time rather than pre-allocating all vectors at probe time.
====================
Long Li [Fri, 5 Jun 2026 00:57:15 +0000 (17:57 -0700)]
RDMA/mana_ib: Allocate interrupt contexts on EQs
Use the GIC functions to allocate interrupt contexts for RDMA EQs. These
interrupt contexts may be shared with Ethernet EQs when MSI-X vectors
are limited.
The driver now supports allocating dedicated MSI-X for each EQ. Indicate
this capability through driver capability bits. The RDMA EQs pass
use_msi_bitmap=false to share MSI-X vectors with Ethernet, while the
capability flag advertises that the driver supports per-vPort EQ
separation when hardware has sufficient vectors.
Populate eq.irq on all RDMA EQs for consistency with the Ethernet path.
Also relocate the GDMA_DRV_CAP_FLAG_1_HW_VPORT_LINK_AWARE define to its
numeric BIT(6) position among the other capability flags.
Long Li [Fri, 5 Jun 2026 00:57:14 +0000 (17:57 -0700)]
net: mana: Allocate interrupt context for each EQ when creating vPort
Use GIC functions to create a dedicated interrupt context or acquire a
shared interrupt context for each EQ when setting up a vPort.
The caller now owns the GIC reference across the EQ create/destroy
lifecycle: mana_create_eq() calls mana_gd_get_gic() before creating
each EQ and mana_destroy_eq() calls mana_gd_put_gic() after destroying
it. The msix_index invalidation is moved from mana_gd_deregister_irq()
to the mana_gd_create_eq() error path so that mana_destroy_eq() can
read the index before teardown.
Long Li [Fri, 5 Jun 2026 00:57:13 +0000 (17:57 -0700)]
net: mana: Use GIC functions to allocate global EQs
Replace the GDMA global interrupt setup code with the new GIC allocation
and release functions for managing interrupt contexts.
This changes the per-queue interrupt names in /proc/interrupts from
mana_q0, mana_q1, ... to mana_msi1, mana_msi2, ... to reflect the
MSI-X index rather than a zero-based queue number. The HWC interrupt
name (mana_hwc) is unchanged.
Long Li [Fri, 5 Jun 2026 00:57:12 +0000 (17:57 -0700)]
net: mana: Introduce GIC context with refcounting for interrupt management
To allow Ethernet EQs to use dedicated or shared MSI-X vectors and RDMA
EQs to share the same MSI-X, introduce a GIC (GDMA IRQ Context) with
reference counting. This allows the driver to create an interrupt context
on an assigned or unassigned MSI-X vector and share it across multiple
EQ consumers.
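A reference-counted interrupt context shared by several EQ consumers can be sketched as below. The names gic_get()/gic_put() and the fixed-size table are assumptions made for this example. They are not the driver's mana_gd_get_gic()/mana_gd_put_gic() and leave out IRQ registration entirely.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_MSIX 8   /* hypothetical vector count for the sketch */

/* One interrupt context per MSI-X index, shared by any number of EQs. */
struct gic {
	int refs;
};

struct gic_table {
	struct gic ctx[MAX_MSIX];
};

/*
 * Take a reference on the context for `msix`. The first caller is the one
 * that would set up the interrupt. Returns NULL for an invalid index.
 */
struct gic *gic_get(struct gic_table *t, int msix)
{
	if (msix < 0 || msix >= MAX_MSIX)
		return NULL;
	t->ctx[msix].refs++;
	return &t->ctx[msix];
}

/*
 * Drop a reference. Returns 1 when the last user is gone, which is the
 * point where the interrupt would be torn down, and 0 otherwise.
 */
int gic_put(struct gic_table *t, int msix)
{
	assert(msix >= 0 && msix < MAX_MSIX && t->ctx[msix].refs > 0);
	return --t->ctx[msix].refs == 0;
}
```

An Ethernet EQ and an RDMA EQ sharing one vector would each call gic_get() with the same index, and the context survives until both have called gic_put().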
Long Li [Fri, 5 Jun 2026 00:57:11 +0000 (17:57 -0700)]
net: mana: Query device capabilities and configure MSI-X sharing for EQs
When querying the device, adjust the max number of queues to allow
dedicated MSI-X vectors for each vPort. The per-vPort queue count is
clamped towards MANA_DEF_NUM_QUEUES but will not exceed the hardware
maximum reported by the device.
MSI-X sharing among vPorts is enabled when there are not enough MSI-X
vectors for dedicated allocation, or when the platform does not support
dynamic MSI-X allocation (in which case all vectors are pre-allocated
at probe time and sharing is always used). The msi_sharing flag is
reset at the top of mana_gd_query_max_resources() so it is recomputed
from current hardware state on each probe or resume cycle.
Clamp apc->max_queues to gc->max_num_queues_vport in mana_init_port()
so that on resume, if max_num_queues_vport has decreased due to fewer
MSI-X vectors, num_queues is reduced accordingly before EQ allocation.
A device reporting zero ports now results in a fatal probe error since
the per-vPort MSI-X math requires at least one port.
Rename mana_query_device_cfg() to mana_gd_query_device_cfg() as it is
used at GDMA device probe time for querying device capabilities.
Long Li [Fri, 5 Jun 2026 00:57:10 +0000 (17:57 -0700)]
net: mana: Create separate EQs for each vPort
To prepare for assigning vPorts to dedicated MSI-X vectors, remove EQ
sharing among the vPorts and create dedicated EQs for each vPort.
Move the EQ definition from struct mana_context to struct mana_port_context
and update related support functions. Export mana_create_eq() and
mana_destroy_eq() for use by the MANA RDMA driver.
RSS QPs now take a vport reference via pd->vport_use_count to ensure
EQs outlive all QP consumers. The vport must already be configured by
a raw QP before an RSS QP can be created. EQs are only destroyed when
the last QP (raw or RSS) on the PD releases its reference.
Restrict each vport to a single RSS QP. The hardware only supports one
steering configuration (indirection table / hash key) per vport, and
mana_disable_vport_rx() on QP destroy disables RX globally for the
vport. Previously, creating a second RSS QP would silently overwrite
the first QP's steering config and destroy would blackhole all traffic.
This is now explicitly rejected with -EBUSY. Existing applications
(DPDK being the primary RDMA consumer) always create one RSS QP per
vport, so no real-world flows are affected.
Reject cross-port PD sharing for both raw and RSS QPs. Since EQs and
vport configuration are per-port, a PD is bound to the port used by
its first raw QP. Subsequent QPs on the same PD must use the same
port or the creation fails with -EINVAL. Previously this was silently
broken: with shared EQs it appeared to work, but with per-vPort EQs
a cross-port PD would cause wrong-port EQ teardown and corruption.
DPDK creates one PD per port so no existing flows are affected.
Serialize mana_set_channels() and the async per-port queue reset
handler against RDMA vport configuration to prevent RDMA from claiming
the vport during the detach/attach window. A channel_changing flag is
set under apc->vport_mutex before detach and checked by
mana_cfg_vport() when called from the RDMA path, blocking RDMA from
grabbing the vport during the entire window. When the port is down
and RDMA already holds the vport, the channel change is rejected with
-EBUSY.
Linus Torvalds [Wed, 10 Jun 2026 00:20:00 +0000 (17:20 -0700)]
Merge tag 'trace-rv-v7.1-rc6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull runtime verifier fixes from Steven Rostedt:
- Fix reset ordering on per-task destruction
Reset the task before dropping the slot instead of after, which was
causing out-of-bound memory accesses.
- Fix HA monitor synchronization and cleanup
Ensure synchronous cleanup for HA monitors by running timer callbacks
in RCU read-side critical sections and using synchronize_rcu() during
destruction.
- Avoid armed timers after tasks exit
Add automatic cleanup for per-task HA monitors to prevent timers from
firing after task exit.
- Fix memory ordering for DA/HA monitors
Fix race conditions during monitor start by using release-acquire
semantics for the monitoring flag.
- Fix initialization for DA/HA monitors
Ensure monitor initialization does not rely on potentially corrupted
state such as the monitoring flag, which is not reset by all monitor
types and may have an unknown state in monitors reusing the storage
(per-task).
- Fix memory safety in per-task and per-object monitors
Prevent use-after-free and out-of-bounds access by synchronizing with
in-flight tracepoint probes using tracepoint_synchronize_unregister()
before freeing monitor storage or releasing task slots.
- Adjust monitors for preemptible tracepoints
Fix monitors that relied on tracepoints disabling preemption.
Explicitly disable task migration when per-CPU monitors handle events
to avoid accessing the wrong state and update the opid monitor logic.
- Fix incorrect __user specifier usage
Remove __user from a non-pointer variable in the extract_params()
helper.
- Fix bugs in the rv tool
Ensure strings are NUL-terminated, fix substring matching in monitor
searches, and improve cleanup and exit status handling.
- Fix several bugs in rvgen
Fix LTL literal stringification, subparsers' options handling, and
suffix stripping in dot2k.
* tag 'trace-rv-v7.1-rc6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
verification/rvgen: Fix ltl2k writing True as a literal
verification/rvgen: Fix options shared among commands
verification/rvgen: Fix suffix strip in dot2k
tools/rv: Fix cleanup after failed trace setup
tools/rv: Fix substring match when listing container monitors
tools/rv: Fix substring match bug in monitor name search
tools/rv: Ensure monitor name and desc are NUL-terminated
rv: Use 0 to check preemption enabled in opid
rv: Prevent task migration while handling per-CPU events
rv: Ensure synchronous cleanup for HA monitors
rv: Add automatic cleanup handlers for per-task HA monitors
rv: Do not rely on clean monitor when initialising HA
rv: Fix monitor start ordering and memory ordering for monitoring flag
rv: Ensure all pending probes terminate on per-obj monitor destroy
rv: Prevent in-flight per-task handlers from using invalid slots
rv: Reset per-task DA monitors before releasing the slot
rv: Fix __user specifier usage in extract_params()
Linus Torvalds [Wed, 10 Jun 2026 00:05:19 +0000 (17:05 -0700)]
Merge tag 'trace-tools-v7.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull RTLA fix from Steven Rostedt:
- Fix multi-character short option parsing
Fix regression in parsing of multiple-character short options
(eg -p100 /= -p 100/, -un /= -u -n/) caused by getopt_long()
internal state corruption after a refactoring.
* tag 'trace-tools-v7.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
rtla: Fix parsing of multi-character short options
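The equivalence that the RTLA fix restores can be sketched with a small standalone parser. This is an illustration only, not the rtla code: the option names, the `struct demo_opts` type and the `demo_parse()` helper are hypothetical. It assumes a POSIX/glibc getopt_long() and restarts scanning by setting optind to 1 only between complete passes, never in the middle of a clustered option.

```c
#include <assert.h>
#include <getopt.h>
#include <stdlib.h>

/* Hypothetical option set: -p takes a value, -u and -n are flags. */
struct demo_opts {
	long period;
	int user;
	int nano;
};

/* Parse argv from scratch; returns 0 on success, -1 on an unknown option. */
static int demo_parse(int argc, char **argv, struct demo_opts *o)
{
	static const struct option long_opts[] = {
		{ "period", required_argument, NULL, 'p' },
		{ "user",   no_argument,       NULL, 'u' },
		{ "nano",   no_argument,       NULL, 'n' },
		{ 0, 0, 0, 0 }
	};
	int c;

	assert(o != NULL);
	o->period = 0;
	o->user = 0;
	o->nano = 0;

	/* Restart scanning. Doing this mid-cluster would confuse getopt. */
	optind = 1;
	opterr = 0;

	while ((c = getopt_long(argc, argv, "p:un", long_opts, NULL)) != -1) {
		switch (c) {
		case 'p':
			o->period = strtol(optarg, NULL, 10);
			break;
		case 'u':
			o->user = 1;
			break;
		case 'n':
			o->nano = 1;
			break;
		default:
			return -1;
		}
	}
	return 0;
}
```

With this sketch, the argument vectors `-p100 -un` and `-p 100 -u -n` are expected to produce the same `struct demo_opts`.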
====================
Consolidate FCrypt and PCBC code into net/rxrpc/
The FCrypt "block cipher" and the PCBC mode of operation are obsolete
and insecure. Since their only user is net/rxrpc/, they belong there,
not in the crypto API.
Therefore, this series removes these algorithms from the crypto API and
replaces them with local implementations in net/rxrpc/.
The local implementations are simpler too, as they avoid the crypto API
boilerplate.
I don't know how to test all the code in net/rxrpc/, but everything
should still work. I added a KUnit test for the crypto functions.
====================
Eric Biggers [Fri, 22 May 2026 05:07:36 +0000 (00:07 -0500)]
crypto: pcbc - Remove support for PCBC mode
The only user of PCBC mode (Propagating Cipher Block Chaining mode) was
net/rxrpc/rxkad.c, which now uses local code instead.
While PCBC was an interesting cryptographic experiment, it has largely
been relegated to the history books and academic exercises. It is
non-parallelizable (i.e., very slow) and doesn't actually achieve the
integrity properties it was apparently intended to achieve.
Remove support for it from the crypto API.
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Tested-by: Marc Dionne <marc.dionne@auristor.com>
Link: https://patch.msgid.link/20260522050740.84561-6-ebiggers@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Biggers [Fri, 22 May 2026 05:07:35 +0000 (00:07 -0500)]
crypto: fcrypt - Remove support for FCrypt block cipher
Remove the insecure FCrypt block cipher from the crypto API. Its only
user was net/rxrpc/, but now net/rxrpc/ implements it locally. The
crypto API implementation is no longer needed.
For some additional context: FCrypt was designed in 1988 and is
essentially a weakened version of DES. It has the same 56-bit key size
as DES, which is easily brute forced. Moreover, it's cryptographically
weak and doesn't even provide the intended 56-bit security level. Its
author considers it to be a mistake, as well
(https://lists.openafs.org/pipermail/openafs-devel/2000-December/005320.html).
But fortunately this 1980s-era homebrew block cipher was never adopted
outside of net/rxrpc/. So its code can just be kept there.
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Tested-by: Marc Dionne <marc.dionne@auristor.com>
Link: https://patch.msgid.link/20260522050740.84561-5-ebiggers@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Biggers [Fri, 22 May 2026 05:07:34 +0000 (00:07 -0500)]
net/rxrpc: Reimplement DES-PCBC using DES library
Since the use of "pcbc(des)" in rxkad_decrypt_ticket() is the only
remaining user of the crypto API "pcbc" template, just implement
DES-PCBC by locally implementing PCBC mode on top of the DES library.
Note that only the decryption direction is needed.
This will allow support for the obsolete PCBC mode to be removed from
the crypto API.
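As a rough illustration of what "PCBC mode on top of a block cipher" means, here is a minimal sketch. It is not the rxrpc code: the block cipher below is a toy, trivially breakable permutation standing in for DES, and all names are hypothetical. In PCBC each block is chained with the XOR of the previous plaintext and previous ciphertext, with the IV playing that role for the first block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy invertible 64-bit permutation. NOT DES or FCrypt, illustration only. */
static uint64_t toy_encrypt_block(uint64_t key, uint64_t x)
{
	x ^= key;
	return (x << 13) | (x >> 51);
}

static uint64_t toy_decrypt_block(uint64_t key, uint64_t x)
{
	x = (x >> 13) | (x << 51);
	return x ^ key;
}

/* C_i = E(P_i ^ P_{i-1} ^ C_{i-1}), with P_0 ^ C_0 replaced by the IV. */
static void pcbc_encrypt(uint64_t key, uint64_t iv,
			 const uint64_t *pt, uint64_t *ct, size_t n)
{
	uint64_t chain = iv;

	assert(n == 0 || (pt != NULL && ct != NULL));
	for (size_t i = 0; i < n; i++) {
		uint64_t p = pt[i];
		uint64_t c = toy_encrypt_block(key, p ^ chain);

		ct[i] = c;
		chain = p ^ c;
	}
}

/* P_i = D(C_i) ^ P_{i-1} ^ C_{i-1}; each block needs the previous result. */
static void pcbc_decrypt(uint64_t key, uint64_t iv,
			 const uint64_t *ct, uint64_t *pt, size_t n)
{
	uint64_t chain = iv;

	assert(n == 0 || (pt != NULL && ct != NULL));
	for (size_t i = 0; i < n; i++) {
		uint64_t c = ct[i];
		uint64_t p = toy_decrypt_block(key, c) ^ chain;

		pt[i] = p;
		chain = p ^ c;
	}
}
```

Because the chaining value depends on the plaintext just recovered, decryption cannot be parallelized across blocks, which matches the "non-parallelizable" remark in the pcbc removal commit.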
Eric Biggers [Fri, 22 May 2026 05:07:33 +0000 (00:07 -0500)]
net/rxrpc: Use local FCrypt-PCBC implementation
Use the local implementation of FCrypt-PCBC instead of the crypto API
one. This will allow the crypto API one to be removed. It also
simplifies the code quite a bit.
The local FCrypt-PCBC implementation is also significantly faster than
the crypto API one, since the crypto API one had a lot of overhead. For
example, benchmarking on an x86_64 CPU, I see that FCrypt-PCBC
decryption throughput improved from 83 MB/s to 157 MB/s.
(Meanwhile, AES-256-GCM decryption is 8064 MB/s on the same CPU.
Clearly, anyone looking for good performance, or anything that is
actually secure for that matter, needs to look elsewhere anyway.)
Eric Biggers [Fri, 22 May 2026 05:07:32 +0000 (00:07 -0500)]
net/rxrpc: Add local FCrypt-PCBC implementation
Add a local implementation of FCrypt-PCBC encryption and decryption.
This will be used instead of the crypto API one, allowing the crypto API
one to be removed. It will also simplify rxkad.c quite a bit.
A KUnit test is included. The FCrypt-PCBC test vectors are borrowed
from the existing ones in crypto/testmgr.h. Note that this adds the
first KUnit test for net/rxrpc/, which previously had no KUnit tests.
The FCrypt code is based on crypto/fcrypt.c, but I simplified it a bit.
The PCBC part is straightforward and I just wrote it from scratch.
Add fsi_clk_prepare() and fsi_clk_unprepare() helpers and call them
from fsi_dai_startup() and fsi_dai_shutdown().
This ensures clk_prepare() and clk_unprepare() are executed from
sleepable contexts and keeps clocks prepared only while audio streams
are active.
Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Suggested-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: bui duc phuc <phucduc.bui@gmail.com>
Link: https://patch.msgid.link/20260609113836.45079-11-phucduc.bui@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
FSI register accesses on the r8a7740 require the SPU bus clock to be
enabled. Add support for acquiring and managing the SPU clock via the
device tree to ensure proper register access.
Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Suggested-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: bui duc phuc <phucduc.bui@gmail.com>
Link: https://patch.msgid.link/20260609113836.45079-10-phucduc.bui@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Move fsi_clk_init() from set_fmt() to the probe path.
This ensures that clock resources are acquired only once during device
initialization, instead of being looked up repeatedly whenever set_fmt()
is called.
Together with the previous conversion to devm_clk_get_optional(), the
driver can now probe successfully even when optional clocks are absent.
The set_rate() callbacks continue to validate that all required clocks
are available before applying hardware-specific configuration.
Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Suggested-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: bui duc phuc <phucduc.bui@gmail.com>
Link: https://patch.msgid.link/20260609113836.45079-9-phucduc.bui@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
ASoC: renesas: fsi: Use devm_clk_get_optional() for optional clocks
The xck, ick, and div clocks are optional. Switch from devm_clk_get()
to devm_clk_get_optional() to correctly handle cases where these clocks
are missing.
Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Suggested-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: bui duc phuc <phucduc.bui@gmail.com>
Link: https://patch.msgid.link/20260609113836.45079-8-phucduc.bui@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
ASoC: renesas: fsi: Fix register access from in-flight IRQ after shutdown
In-flight IRQs may still be running when the SPU clock is disabled,
leading to register access after shutdown and causing system hangs.
Fix this to use fsi_stream_is_working() when handling in-flight IRQ
handlers. If no streams are active, the handler now returns immediately
to prevent hardware access.
Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Suggested-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: bui duc phuc <phucduc.bui@gmail.com>
Link: https://patch.msgid.link/20260609113836.45079-6-phucduc.bui@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Move fsi_stream_is_working() before fsi_count_fifo_err().
This prepares for a subsequent patch that needs to check stream status
when handling in-flight IRQ handlers. No functional changes intended.
Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Suggested-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: bui duc phuc <phucduc.bui@gmail.com>
Link: https://patch.msgid.link/20260609113836.45079-5-phucduc.bui@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Call fsi_stream_stop() before fsi_hw_shutdown(). This matches the existing
order in the suspend path.
This change ensures all register accesses during stream shutdown are fully
completed before disabling the clocks.
Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Suggested-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: bui duc phuc <phucduc.bui@gmail.com>
Link: https://patch.msgid.link/20260609113836.45079-4-phucduc.bui@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
ASoC: dt-bindings: renesas,fsi: add support multiple clocks
The FSI on r8a7740 requires the SPU bus/bridge clock to be enabled before
accessing its registers. Without this clock, any register access leads to
a system hang as the FSI block sits behind the SPU bus.
Update the binding to support multiple clocks to properly describe the
hardware clock tree, including:
- SPU bus/bridge clock (spu) for register access.
- CPG DIV6 clocks (icka/b) as functional clock.
- FSI dividers (diva/b) for audio clock generation.
- External clock inputs (xcka/b) provided by the board.
The hardware supports several valid clock configurations. For example,
when both FSIA and FSIB operate as slaves, only the fck and spu clocks
are required. When a port operates as a master, it can use either an
internal clock source (ickx + divx) or an external clock source
(ickx + xckx). Therefore, while fck and spu are mandatory on r8a7740,
the remaining clocks (icka/b, diva/b and xcka/b) are optional and depend
on the selected master/slave configuration and clock source.
Both sh73a0 and r8a7740 define the SPU DIV6 clock control register at
0xe6150084. The binding therefore documents the clocks supported by the
FSI driver for these variants.
Mark Brown [Tue, 9 Jun 2026 23:09:13 +0000 (00:09 +0100)]
ASoC: codecs: aw88261: fixes and cleanup
Val Packett <val@packett.cool> says:
The Awinic smart speaker/amp drivers were merged in a very
"downstream-brained" state, where configuration was only really
determined by the binary "firmware" (register list) file instead
of properly participating in the ASoC system. Let's start
untangling this mess. This series makes aw88261 actually usable
on devices like fairphone-fp5, motorola-dubai and xiaomi-pipa.
Val Packett [Fri, 29 May 2026 20:05:14 +0000 (17:05 -0300)]
ASoC: codecs: aw88261: make volume control usable
- Invert the value to match userspace expectations (in the hardware,
positive numbers represent negative dB attenuation)
- Provide TLV metadata for the dB scale (and divide the raw values by 2
as the excessive precision used by HW is not representable in TLV)
- Do not unnecessarily reset the volume while switching profiles
- Simplify aw88261_dev_set_volume using regmap_update_bits
- Do not add the initial volume from the profile to the requested volume
as that would throw off the dB mapping (if a lower max limit is
desired, it can be set in the UCM profile in userspace)
With this change, it's actually possible to use this hardware volume
control as PlaybackVolume in an ALSA UCM profile.
Val Packett [Fri, 29 May 2026 20:05:13 +0000 (17:05 -0300)]
ASoC: codecs: aw88261: fix incorrect masks for boost regs
The boost-related register fields used in aw88261_reg_force_set use the
exact same definitions as the rest of the fields, where the mask must be
inverted when passing it to regmap_update_bits, but they weren't
inverted here.
Fixes: 028a2ae25691 ("ASoC: codecs: Add aw88261 amplifier driver")
Signed-off-by: Val Packett <val@packett.cool>
Tested-by: Luca Weiss <luca.weiss@fairphone.com>
Link: https://patch.msgid.link/20260529200550.529719-7-val@packett.cool
Signed-off-by: Mark Brown <broonie@kernel.org>
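The boost register mask bug can be modelled in userspace as follows. This is a sketch under stated assumptions: `update_bits()` only mirrors the read-modify-write semantics of regmap_update_bits(), and the `DEMO_FIELD_*` definitions are hypothetical stand-ins for a header that stores each field mask already inverted.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 3-bit field at bits 6..4, with the mask stored inverted. */
#define DEMO_FIELD_BITS		3
#define DEMO_FIELD_SHIFT	4
#define DEMO_FIELD_MASK		(~(((1u << DEMO_FIELD_BITS) - 1u) << DEMO_FIELD_SHIFT))

/* Model of regmap_update_bits(): only bits set in mask are replaced. */
static uint32_t update_bits(uint32_t reg, uint32_t mask, uint32_t val)
{
	return (reg & ~mask) | (val & mask);
}

/* Correct use: undo the inversion before passing the mask down. */
static uint32_t set_demo_field(uint32_t reg, uint32_t v)
{
	assert(v < (1u << DEMO_FIELD_BITS));
	return update_bits(reg, ~DEMO_FIELD_MASK, v << DEMO_FIELD_SHIFT);
}

/* Buggy use: the inverted mask selects every bit except the field. */
static uint32_t set_demo_field_buggy(uint32_t reg, uint32_t v)
{
	return update_bits(reg, DEMO_FIELD_MASK, v << DEMO_FIELD_SHIFT);
}
```

In this model the buggy variant leaves the target field untouched and clears the unrelated bits instead, which is why a missing inversion is easy to overlook when only the target field is inspected.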
Val Packett [Fri, 29 May 2026 20:05:12 +0000 (17:05 -0300)]
ASoC: codecs: aw88261: remove async start
Codec drivers are not supposed to do anything like this. The result was
that the first second or so of playback was essentially inaudible, and
very short alert sounds could be missed entirely. Let's not do this.
Val Packett [Fri, 29 May 2026 20:05:10 +0000 (17:05 -0300)]
ASoC: codecs: aw88261: reduce log spam
This driver would create a wall of logspam during initialization due to
e.g. the PLL not being ready while waiting for it to stabilize. Change
intermediate dev_err() calls to dev_dbg() to reduce the noise.
While here, log the detected chip ID when that check fails.
Val Packett [Fri, 29 May 2026 20:05:08 +0000 (17:05 -0300)]
ASoC: codecs: aw88261: support changing sample rate and bit width
The aw88261 driver only worked with 32-bit 48kHz streams so far due to
the lack of a proper PLL initialization sequence. Fix by selecting all
the necessary PLL settings based on what was passed to us by the
hw_params/set_fmt ops. This replaces the strange downstream routine
that tries two divider modes in sequence.
Fixes: 028a2ae25691 ("ASoC: codecs: Add aw88261 amplifier driver")
Tested-by: Luca Weiss <luca.weiss@fairphone.com> # qcm6490-fairphone-fp5
Signed-off-by: Val Packett <val@packett.cool>
Link: https://patch.msgid.link/20260529200550.529719-2-val@packett.cool
Signed-off-by: Mark Brown <broonie@kernel.org>
Peng Yang [Mon, 8 Jun 2026 09:58:49 +0000 (17:58 +0800)]
spi: dw: fix race between IRQ handler and error handler on SMP
On SMP systems, dw_spi_handle_err() can be called from the SPI core
kthread while the IRQ handler is still accessing the FIFO on another
CPU. Resetting the chip via dw_spi_reset_chip() during an active FIFO
read/write causes a bus error.
Fix this by calling disable_irq() before the chip reset, which masks
the IRQ and waits for any in-flight handler to complete via
synchronize_irq(). This ensures no handler is accessing the FIFO when
the reset occurs.
Ruoyu Wang [Tue, 9 Jun 2026 05:26:47 +0000 (13:26 +0800)]
spi: meson-spifc: fix runtime PM leak on remove
pm_runtime_get_sync() increments the runtime PM usage counter even when it
returns an error. meson_spifc_remove() uses it to resume the controller
before disabling runtime PM, but never drops the usage counter again.
Balance the get with pm_runtime_put_noidle() after disabling runtime PM,
matching the teardown pattern used by other SPI controller drivers.
Found by static analysis. I do not have hardware to test this.
Dmitry Torokhov [Sun, 7 Jun 2026 03:51:29 +0000 (20:51 -0700)]
software node: allow passing reference args to PROPERTY_ENTRY_REF()
When dynamically creating software nodes and properties for subsequent
use with software_node_register() current implementation of
PROPERTY_ENTRY_REF() is not suitable because it creates a temporary
instance of struct software_node_ref_args on stack which will later
disappear, and software_node_register() only does shallow copy of
properties.
Fix this by allowing to pass address of reference arguments structure
directly into PROPERTY_ENTRY_REF(), so that caller can manage lifetime
of the object properly.
Mark Brown [Tue, 9 Jun 2026 21:46:07 +0000 (22:46 +0100)]
regulator: qcom_smd-regulator: Add PM8019
Stephan Gerhold <stephan.gerhold@linaro.org> says:
Add the definitions and dt-bindings for the regulators in PM8019 to allow
controlling them through the RPM firmware. PM8019 is typically used
together with the MDM9607 SoC.
Stephan Gerhold [Mon, 8 Jun 2026 12:05:44 +0000 (14:05 +0200)]
regulator: qcom_smd-regulator: Add PM8019
Add the definitions for the regulators in PM8019 to allow controlling them
through the RPM firmware. Reading the TYPE/SUBTYPE registers using SPMI
reveals that PM8019 uses a mixture of regulators from PMA8084 (hfsmps,
pldo) and PM8916 (nldo).
spi: Use named initializers for platform_device_id arrays
Named initializers are more readable and more robust to changes of the
struct definition. This robustness is relevant for a planned change to
struct platform_device_id replacing .driver_data by an anonymous union.
While touching these arrays unify spacing and usage of commas.
Tommaso Merciai [Mon, 8 Jun 2026 20:25:08 +0000 (22:25 +0200)]
spi: rzv2h-rspi: Add suspend/resume support
Add suspend/resume support to the rzv2h-rspi driver by implementing
suspend and resume callbacks that delegate to spi_controller_suspend()
and spi_controller_resume() respectively.
Viken Dadhaniya [Tue, 9 Jun 2026 08:43:09 +0000 (14:13 +0530)]
spi: qcom-geni: Fix cs_change handling on the last transfer
TPM TIS SPI probe fails with:
tpm_tis_spi: probe of spi11.0 failed with error -110
TPM TIS SPI sets cs_change=1 on single-transfer messages to keep CS
asserted across the header, wait-state, and data phases of a transaction.
CS deassertion between these phases violates the TCG SPI flow control
specification.
This bug was introduced by commit b99181cdf9fa ("spi-geni-qcom: remove
manual CS control"), which replaced manual CS control with automatic CS
control via the FRAGMENTATION bit. The FRAGMENTATION bit controls CS
behavior after a transfer: when set to 1, CS remains asserted; when
cleared to 0, CS is deasserted.
The commit correctly sets FRAGMENTATION for non-last transfers with
cs_change=0 to keep CS asserted between chained transfers, but misses the
case where cs_change=1 is set on the last transfer. When cs_change=1 on
the last transfer, the client requests CS to remain asserted after the
message completes, so FRAGMENTATION must be set to 1 in this case as well.
Fix setup_se_xfer() to set FRAGMENTATION when cs_change=1 on the last
transfer.
Also fix the same missing case in setup_gsi_xfer() and correct it to
write 1 instead of the raw bitmask FRAGMENTATION (value 4) to
peripheral.fragmentation. This field is a 1-bit boolean consumed by
gpi_create_spi_tre() via u32_encode_bits(..., TRE_SPI_GO_FRAG). Writing 4
to a 1-bit field causes u32_encode_bits() to mask it to 0, silently
disabling the FRAGMENTATION bit in the GPI TRE regardless of the
cs_change logic.
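Both halves of the qcom-geni fix can be sketched in plain C. The helpers below are a userspace model, not the driver code: `encode_bits()` imitates the mask-then-shift behaviour of u32_encode_bits(), `DEMO_TRE_FRAG_FIELD` is a hypothetical position for a 1-bit field, and the non-last cs_change=1 case in `want_fragmentation()` is assumed from generic SPI cs_change semantics rather than stated in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_FRAG_RAW_MASK	4u		/* raw bitmask value quoted above */
#define DEMO_TRE_FRAG_FIELD	(1u << 26)	/* hypothetical 1-bit field */

/* Model of u32_encode_bits(): mask v to the field width, then shift it. */
static uint32_t encode_bits(uint32_t v, uint32_t field)
{
	assert(field != 0);

	uint32_t lsb = field & (~field + 1u);	/* lowest set bit of the field */

	return (v & (field / lsb)) * lsb;
}

/*
 * Should CS stay asserted after this transfer?
 * - non-last transfer: keep CS asserted unless cs_change asks to toggle it
 * - last transfer: keep CS asserted only if cs_change asks for it
 */
static bool want_fragmentation(bool is_last, bool cs_change)
{
	return is_last ? cs_change : !cs_change;
}
```

Passing the raw mask value 4 into a 1-bit field encodes to 0 in this model, while passing the boolean 1 sets the bit, which is the difference the setup_gsi_xfer() fix relies on.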
The INA238 family supports eight conversion time steps from 50 us to
4120 us (SQ52206: 66 us to 8230 us). At the millisecond granularity of
update_interval, the four shortest steps (50, 84, 150, 280 us) all
round to the same value and cannot be individually selected.
Add support for the generic update_interval_us attribute, which reports
and programs the same ADC cycle time as update_interval but in
microseconds, giving userspace full access to all conversion time steps.
Both attributes reflect the total cycle time including the active
averaging count: the reported value is the raw conversion time
multiplied by the number of averaged samples, and writes apply the
inverse mapping.
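The rounding problem and the inverse mapping can be illustrated with a short sketch. It uses only the four shortest conversion times quoted above; the helper names and the round-to-nearest policy are assumptions for illustration, not the driver's actual implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The four shortest INA238 conversion times quoted above, in microseconds. */
static const uint32_t short_steps_us[] = { 50, 84, 150, 280 };
#define N_SHORT_STEPS (sizeof(short_steps_us) / sizeof(short_steps_us[0]))

/* Total cycle time: raw conversion time times the averaging count. */
static uint32_t cycle_us(uint32_t conv_us, uint32_t samples)
{
	return conv_us * samples;
}

/* Millisecond view of the same interval, rounded to nearest (assumed). */
static uint32_t us_to_ms(uint32_t us)
{
	return (us + 500u) / 1000u;
}

/* Inverse mapping: divide out the averaging count, pick the closest step. */
static uint32_t pick_step_us(uint32_t interval_us, uint32_t samples)
{
	assert(samples > 0);

	uint32_t target = interval_us / samples;
	uint32_t best = short_steps_us[0];

	for (size_t i = 1; i < N_SHORT_STEPS; i++) {
		uint32_t step = short_steps_us[i];
		uint32_t d_best = best > target ? best - target : target - best;
		uint32_t d_step = step > target ? step - target : target - step;

		if (d_step < d_best)
			best = step;
	}
	return best;
}
```

With one sample, all four steps collapse to 0 ms in this model, while the microsecond value keeps them distinct.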
Some hardware monitoring chips support update intervals below one
millisecond. The existing update_interval attribute uses millisecond
granularity, which causes sub-millisecond steps to round to the same
value and become inaccessible from userspace.
Introduce update_interval_us, a companion chip-level attribute that
expresses the same update interval in microseconds. Drivers
implementing this attribute should also implement update_interval for
compatibility with millisecond-based userspace interfaces.
hwmon: (ina238) Add support for samples and update_interval
Expose INA238 ADC averaging count (AVG) and conversion timing
(VBUSCT/VSHCT/VTCT) through chip-level hwmon attributes:
chip/samples
chip/update_interval
Use per-chip conversion-time lookup tables so the same helpers work
for INA228/INA237/INA238/INA700/INA780 and SQ52206. Cache ADC_CONFIG
in driver data and update it on writes to avoid extra register reads
during read-modify-write updates.
Report update_interval in milliseconds as required by the hwmon ABI.
Compute it from raw ADC cycle time multiplied by the active averaging
count, and apply the inverse mapping on writes so programmed conversion
time tracks the selected sample count.
Clamp user-provided update_interval before unit scaling to prevent
overflow in arithmetic conversions.
Also combine chip attributes in HWMON_CHANNEL_INFO using a bitwise OR
for a single logical chip channel.
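The "clamp before unit scaling" step can be sketched as below. The helper name and the limit passed in are hypothetical; the point is only that the user value is bounded while still in milliseconds, so the later multiplication cannot overflow a 32-bit result.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp in the user's unit first, then scale to microseconds. */
static uint32_t ms_to_us_clamped(unsigned long ms, uint32_t max_ms)
{
	/* The chosen limit itself must survive the scaling. */
	assert(max_ms <= UINT32_MAX / 1000u);

	if (ms > max_ms)
		ms = max_ms;

	return (uint32_t)ms * 1000u;
}
```

Scaling first and clamping afterwards would not help, because the wrapped product could already look like a small, valid interval.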
Jason Gunthorpe [Mon, 8 Jun 2026 18:10:04 +0000 (15:10 -0300)]
iommu/dma: Do not try to iommu_map a 0 length region in swiotlb
iommu_dma_iova_link_swiotlb() processes a mapping that is unaligned in three
parts, the head, middle and trailer. If the middle is empty because there
are no aligned pages it will call down to iommu_map() with a 0 size
which the iommupt implementation will fail as illegal.
It then tries to do an error unwind and starts from the wrong spot
corrupting the mapping so the eventual destruction triggers a WARN_ON.
Check for 0 length and avoid mapping and use offset not 0 as the starting
point to unlink.
This is frequently triggered by using some kinds of thunderbolt NVMe
drives that trigger forced SWIOTLB for unaligned memory. NVMe seems to
pass in oddly aligned buffers for the passthrough commands from smartctl
that hit this condition.
Cc: stable@vger.kernel.org
Fixes: 433a76207dcf ("dma-mapping: Implement link/unlink ranges API")
Reported-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Samiullah Khawaja <skhawaja@google.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/0-v1-8536728bc89f+469-swiotlb_warn_jgg@nvidia.com
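The empty-middle case in the swiotlb commit above can be reproduced with a simplified model of the three-part split. This is not iommu_dma_iova_link_swiotlb() itself: the `struct demo_split` type and the helpers are hypothetical, and the granule is assumed to be a power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_split {
	size_t head;	/* unaligned bytes before the first granule boundary */
	size_t middle;	/* whole granules that can be mapped directly */
	size_t tail;	/* unaligned bytes after the last granule boundary */
};

static struct demo_split split_range(uint64_t addr, size_t len, size_t granule)
{
	struct demo_split s = { 0, 0, 0 };

	assert(granule != 0 && (granule & (granule - 1)) == 0);

	size_t off = (size_t)(addr & (granule - 1));

	if (off) {
		s.head = granule - off;
		if (s.head > len)
			s.head = len;
	}
	len -= s.head;
	s.middle = len & ~(granule - 1);
	s.tail = len - s.middle;
	return s;
}

/* A zero-length middle must be skipped instead of being mapped. */
static bool needs_middle_map(const struct demo_split *s)
{
	return s->middle != 0;
}
```

A buffer that starts mid-granule and ends mid-granule without covering a full granule in between yields middle == 0, which is the situation where the mapping call has to be skipped.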
====================
bpf, lpm_trie: Allow sleepable BPF programs to use LPM tries
trie_lookup_elem() annotates its rcu_dereference_check() walks with only
rcu_read_lock_bh_held(), so a sleepable BPF program that touches an LPM
trie (e.g. a sleepable LSM hook calling bpf_map_lookup_elem()) trips a
"suspicious RCU usage" lockdep splat on debug kernels: it holds only
rcu_read_lock_trace(), which that annotation does not accept.
Patch 1 relaxes the rcu_dereference annotations in the trie walks so they
no longer trip lockdep from the Tasks Trace context, including the
trie_update_elem()/trie_delete_elem() writer walks (protected by
trie->lock). Patch 2 adds BPF_MAP_TYPE_LPM_TRIE to the verifier's
sleepable map whitelist so sleepable programs can reference an LPM trie
directly, not just as the inner map of a map-of-maps. LPM trie nodes are
reclaimed via bpf_mem_cache_free_rcu(), which chains a regular RCU grace
period into a Tasks Trace grace period before freeing -- the same
discipline BPF_MAP_TYPE_HASH relies on for sleepable access.
Changes since v1:
- Split into a 2-patch series.
- Patch 1 now also converts the trie_update_elem()/trie_delete_elem()
walks from rcu_dereference() to rcu_dereference_protected(*p, 1),
addressing review feedback that v1 only fixed the lookup path and left
the same splat on the writer paths.
- New patch 2 adds the verifier whitelist entry so the fix is actually
reachable for directly-referenced LPM tries.
- Retitled v1 ("Allow lookups from sleepable BPF programs").
Vlad Poenaru [Tue, 9 Jun 2026 13:55:58 +0000 (06:55 -0700)]
bpf: Allow sleepable programs to use LPM trie maps directly
The previous change relaxed the rcu_dereference annotations in
lpm_trie.c so the trie walks no longer trip lockdep when reached from a
sleepable BPF program holding only rcu_read_lock_trace(). By itself
that only helps tries reached as the inner map of a map-of-maps, or
from the classic-RCU syscall path: a sleepable program that references
an LPM trie directly is still rejected at load time by
check_map_prog_compatibility(), whose sleepable whitelist omits
BPF_MAP_TYPE_LPM_TRIE:
Sleepable programs can only use array, hash, ringbuf and local storage maps
LPM trie nodes are allocated from a bpf_mem_alloc (trie->ma) and freed
with bpf_mem_cache_free_rcu(), which chains a regular RCU grace period
into a Tasks Trace grace period before the node -- and the value
embedded in it that trie_lookup_elem() returns to the program -- is
released. That is the same reclaim discipline BPF_MAP_TYPE_HASH relies
on for sleepable access, so a value handed to a sleepable reader cannot
be freed while the program is still running under rcu_read_lock_trace().
The writer paths take trie->lock across the walk and never relied on the
RCU read-side lock to keep nodes alive.
Add BPF_MAP_TYPE_LPM_TRIE to the sleepable map whitelist so these
programs can use LPM tries directly.
Vlad Poenaru [Tue, 9 Jun 2026 13:55:57 +0000 (06:55 -0700)]
bpf: Allow LPM map access from sleepable BPF programs
trie_lookup_elem() annotates its rcu_dereference_check() walks with
only rcu_read_lock_bh_held(). Because rcu_dereference_check(p, c)
resolves to "c || rcu_read_lock_held()", this passes for XDP/NAPI and
classic RCU readers but fails for sleepable BPF programs, which enter
via __bpf_prog_enter_sleepable() and hold only rcu_read_lock_trace().
trie_update_elem() and trie_delete_elem() have the same problem in a
different form: they walk the trie with plain rcu_dereference(), which
asserts rcu_read_lock_held() unconditionally. Both are reachable from
sleepable BPF programs via the bpf_map_update_elem / bpf_map_delete_elem
helpers, and from the syscall path under classic rcu_read_lock(). In
the writer paths the trie is actually protected by trie->lock (an
rqspinlock taken across the walk); we never relied on the RCU read-side
lock to keep nodes alive there.
A sleepable LSM hook that ends up touching an LPM trie therefore
triggers lockdep on debug kernels:
=============================
WARNING: suspicious RCU usage
7.1.0-... Tainted: G E
-----------------------------
kernel/bpf/lpm_trie.c:249 suspicious rcu_dereference_check() usage!
1 lock held by net_tests/540:
#0: (rcu_tasks_trace_srcu_struct){....}-{0:0},
at: __bpf_prog_enter_sleepable+0x26/0x280
Call Trace:
dump_stack_lvl
lockdep_rcu_suspicious
trie_lookup_elem
bpf_prog_..._enforce_security_socket_connect
bpf_trampoline_...
security_socket_connect
__sys_connect
do_syscall_64
This is lockdep-only -- no UAF, since Tasks Trace RCU does serialize
against the trie's reclaim path -- but it spams the console once per
distinct callsite on every debug kernel running a sleepable BPF LSM
that touches an LPM trie, which is increasingly common.
For the lookup path, switch the rcu_dereference_check() annotation
from rcu_read_lock_bh_held() to bpf_rcu_lock_held(), which accepts all
three contexts (classic, BH, Tasks Trace). Other map types already
follow this convention.
For trie_update_elem() and trie_delete_elem(), annotate the walks as
rcu_dereference_protected(*p, 1) -- matching trie_free() in the same
file -- since trie->lock is held across the walk. rqspinlock has no
lockdep_map, so the predicate degenerates to '1' rather than
lockdep_is_held(&trie->lock); the protection is real but not
machine-verifiable. trie_get_next_key() also uses bare
rcu_dereference() but is reachable only from the BPF syscall, which
holds classic rcu_read_lock() before dispatching, so it is left
untouched.
Fixes: 694cea395fde ("bpf: Allow RCU-protected lookups to happen from bh context")
Cc: stable@vger.kernel.org
Signed-off-by: Vlad Poenaru <vlad.wing@gmail.com>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Link: https://lore.kernel.org/r/20260609135558.193287-2-vlad.wing@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Don't just overwrite the original pointer passed to krealloc()
with its return value without checking the latter:
MEM = krealloc(MEM, SZ, GFP);
If krealloc() returns NULL, that erases the pointer
to the still allocated memory, hence leaks this memory.
Instead, use a temporary variable, check it's not NULL
and only then assign it to the original pointer:
TMP = krealloc(MEM, SZ, GFP);
if (!TMP) return;
MEM = TMP;
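The same pattern applies to realloc() in userspace, which makes it easy to try out. The helper below is a hypothetical sketch, not code from the patch: on failure it returns an error and leaves the caller's pointer pointing at the still-valid original allocation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Resize *buf to hold new_count ints; returns 0 on success, -1 on failure. */
static int grow_ints(int **buf, size_t new_count)
{
	int *tmp;

	assert(buf != NULL);

	/* Reject zero (implementation-defined) and sizes that would overflow. */
	if (new_count == 0 || new_count > SIZE_MAX / sizeof(**buf))
		return -1;

	tmp = realloc(*buf, new_count * sizeof(**buf));
	if (!tmp)
		return -1;	/* *buf is untouched and still owned by the caller */

	*buf = tmp;
	return 0;
}
```

The caller can keep using, or free, the old buffer after a failed call, which is exactly what the direct `MEM = krealloc(MEM, ...)` form makes impossible.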
Praveen Talari [Wed, 20 May 2026 07:14:29 +0000 (12:44 +0530)]
i2c: qcom-geni: Use pm_runtime_force_{suspend,resume} helpers
The driver carries custom system suspend/resume handling that manually
tracks a suspended state and conditionally calls geni_i2c_runtime_suspend()
from the noirq suspend path, then adjusts runtime PM state by hand. This
duplicates PM core behavior and adds unnecessary complexity.
Drop the manual state tracking and switch to pm_runtime_force_suspend()
and pm_runtime_force_resume() for system sleep. These helpers already
perform the required checks, call the runtime PM callbacks when needed,
and keep runtime PM state transitions consistent.
Emil Tsalapatis [Tue, 9 Jun 2026 06:36:30 +0000 (02:36 -0400)]
selftests/bpf: Avoid spurious spmc parallel selftest errors in libarena
The libarena parallel spmc selftest is nondeterministic by design.
As a result it depends up to a point on the relative timing between the
producer and consumer threads. This introduces the possibility for two
kinds of spurious failures that this patch addresses.
1) Spurious timeouts. The test proceeds in phases, and threads use a
common counter as a barrier to avoid proceeding to the next phase
until all threads are ready to do so. If a thread takes too long to
reach the barrier, the already waiting threads may time out.
Increase the current timeout. The timeout's value is a balance
between the maximum amount of time spent on the test and the
possibility of spurious failures. Right now the timeout is too short.
Err on the side of caution and significantly increase it to avoid
spurious failures.
2) Spurious resize failures. Some selftests require the spmc queue to
resize itself. This in turn requires the producer side to be
materially faster than the consumer side so that the queue gets full
enough for a resize. However, in the benchmark the spmc queue's producer
is outnumbered 3:1. To offset it we add busy waits for consume
queues. However, we still see occasional failures due to the queue
never resizing.
Minimize the possibility for this in two ways: First, remove one of
the consumers. The 2 consumers still exercise the "race between
consumers" scenario. Second, increase the busy wait duration to
decrease the rate by which the consumers act on the queue.
While at it, also replace a stray invalid error value "153" with EINVAL.
Fixes: 42998f819256 ("selftests/bpf: libarena: parallel test harness and spmc parallel selftest")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com>
Link: https://lore.kernel.org/r/20260609063630.10245-1-emil@etsalapatis.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Chen-Yu Tsai [Tue, 9 Jun 2026 08:36:27 +0000 (16:36 +0800)]
regulator: mt6359: Fix vbbck default internal supply name
This issue was pointed out by Sashiko.
vbbck is fed internally from vio18. For the MT6359, the default supply
name was incorrectly set as "VIO18", instead of the supply's default
"VIO18". In practice this still works, but it causes the regulator
description copy and replace to always happen. For the MT6359P the
name is correct.
Fix the supply name for MT6359 so that both instances are the same and
correct. Also copy the comment about the internal supply from the MT6359
list to the MT6359P list.
Fixes: 10be8fc1d534 ("regulator: mt6359: Add regulator supply names")
Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://patch.msgid.link/20260609083630.1600070-1-wenst@chromium.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Cássio Gabriel [Tue, 9 Jun 2026 12:03:56 +0000 (09:03 -0300)]
ASoC: sma1307: Fix uevent string leaks in fault worker
sma1307_check_fault_worker() stores dynamically allocated uevent strings in
envp[0]. Several fault conditions are checked in sequence, so a later fault
can overwrite envp[0] before the final kfree() and leak the previous
allocation.
The same flow can leave an OT1 volume entry in envp[1] while envp[0]
has been overwritten by a later non-OT1 fault, causing an inconsistent
uevent payload.
Use static STATUS strings and a stack buffer for the optional VOLUME entry.
This removes the allocations from the worker and keeps VOLUME tied only
to the OT1 events that produce it.
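The approach above can be sketched in plain userspace C. This is only an illustration of the pattern, not the driver code: the fault bits, the exact STATUS/VOLUME string formats, and the helper name build_fault_env() are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical fault identifiers; the real status layout is not shown here. */
#define FAULT_OT1 (1u << 0)
#define FAULT_OCP (1u << 1)

#define VOLUME_BUF_LEN 32

/*
 * Fill envp[] for a single fault event. STATUS entries are string
 * literals, so there is nothing to free and nothing to leak when a later
 * fault is reported. VOLUME is formatted into a caller-provided (stack)
 * buffer and is attached only to OT1 events.
 */
static void build_fault_env(unsigned int fault, int volume,
			    char *vol_buf, size_t vol_len,
			    const char *envp[3])
{
	assert(vol_buf && vol_len > 0);

	/* Start from a clean payload so entries never carry over. */
	envp[0] = envp[1] = envp[2] = NULL;

	if (fault == FAULT_OT1) {
		envp[0] = "STATUS=OT1";
		snprintf(vol_buf, vol_len, "VOLUME=0x%02x",
			 (unsigned int)volume & 0xff);
		envp[1] = vol_buf;
	} else if (fault == FAULT_OCP) {
		envp[0] = "STATUS=OCP";
	}
}
```

Because every call rebuilds envp[] from scratch, a non-OT1 fault reported after an OT1 fault cannot inherit a stale VOLUME entry.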
Peter Ujfalusi <peter.ujfalusi@linux.intel.com> says:
This series hardens SOF kcontrol data paths for both IPC3 and IPC4 by
fixing size-handling bugs in put/get/update flows and tightening bounds
checks around firmware/user-provided payload lengths.
The changes include:
- Fix TOCTOU-style size misuse in IPC3/IPC4 bytes put paths by validating and
  using the incoming payload size.
- Add notification/update payload size validation before parsing control data.
- Use overflow-checked arithmetic when computing expected IPC3 control sizes.
- Ensure update/copy bounds are validated against actual allocation limits.
- Fix IPC3 bytes_ext bounds checks to account for struct header offset, closing
  a heap overflow/over-read issue from unprivileged userspace TLV access.
Overall, the series makes control payload processing robust against malformed or
inconsistent sizes and prevents out-of-bounds accesses.
Peter Ujfalusi [Tue, 9 Jun 2026 08:34:58 +0000 (11:34 +0300)]
ASoC: SOF: ipc3-control: Fix heap overflow in bytes_ext put/get
The ipc_control_data buffer is allocated as kzalloc(max_size), where
max_size covers the entire struct sof_ipc_ctrl_data including its
flexible array payload. However, the bounds checks in bytes_ext_put
and _bytes_ext_get compared user data lengths against max_size
directly, ignoring that cdata->data sits at an offset of
sizeof(struct sof_ipc_ctrl_data) bytes into the allocation.
This allowed writing up to sizeof(struct sof_ipc_ctrl_data) bytes past
the end of the heap buffer from unprivileged userspace via the ALSA TLV
kcontrol interface, and similarly allowed over-reading adjacent heap
data on the get path.
Fix all bounds checks to subtract sizeof(*cdata) from max_size so they
reflect the actual space available at the cdata->data offset. Also fix
the error-path restore in bytes_ext_put which wrote to cdata->data
instead of cdata, causing the same overflow.
Fixes: 67ec2a091630 ("ASoC: SOF: Add bytes_ext control IPC ops for IPC3")
Cc: stable@vger.kernel.org
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Liam Girdwood <liam.r.girdwood@intel.com>
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Link: https://patch.msgid.link/20260609083458.31193-7-peter.ujfalusi@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
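The header-offset arithmetic behind the bytes_ext fix can be illustrated with a small userspace sketch. struct ctrl_data is a hypothetical stand-in for struct sof_ipc_ctrl_data (its real layout is not reproduced here), and payload_len_ok() is an illustrative helper, not a function from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in: a fixed header followed by a flexible payload. */
struct ctrl_data {
	uint32_t hdr[4];
	uint8_t data[];
};

/*
 * The allocation is max_size bytes and starts at the struct itself, so
 * only max_size - sizeof(struct ctrl_data) bytes exist at ->data.
 * Returns 1 if len bytes fit at ->data, 0 otherwise.
 */
static int payload_len_ok(size_t max_size, size_t len)
{
	/* An allocation smaller than the header has no payload space. */
	if (max_size < sizeof(struct ctrl_data))
		return 0;

	return len <= max_size - sizeof(struct ctrl_data);
}
```

With a 64-byte allocation and the 16-byte header assumed here, a 48-byte payload is accepted, while a length equal to max_size itself (the old comparison) is rejected.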
Peter Ujfalusi [Tue, 9 Jun 2026 08:34:57 +0000 (11:34 +0300)]
ASoC: SOF: ipc3-control: Fix TOCTOU in bytes_put and bytes_get
In sof_ipc3_bytes_put(), the size used for the memcpy is derived from
the old data->size already in the buffer, not the incoming new data's
size field. If the new data has a different size, the copy length is
wrong: it may truncate valid data or copy stale bytes.
Similarly, sof_ipc3_bytes_get() checks data->size against max_size
without accounting for the sizeof(struct sof_ipc_ctrl_data) offset
of the flex array within the allocation.
Fix bytes_put to validate and use the incoming data's sof_abi_hdr.size
from ucontrol before copying. Fix bytes_get to subtract sizeof(*cdata)
from the bounds check to match the actual available space.
Fixes: 544ac8858f24 ("ASoC: SOF: Add bytes_get/put control IPC ops for IPC3")
Cc: stable@vger.kernel.org
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Liam Girdwood <liam.r.girdwood@intel.com>
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Link: https://patch.msgid.link/20260609083458.31193-6-peter.ujfalusi@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
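A minimal sketch of the bytes_put fix, assuming a simplified blob layout: the copy length comes from the incoming header and is validated before anything is written. struct abi_hdr is a hypothetical stand-in for struct sof_abi_hdr, and blob_put() with its parameters is illustrative only.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in: a size field followed by that many payload bytes. */
struct abi_hdr {
	uint32_t size;
	uint8_t data[];
};

/*
 * Copy an incoming blob into dst. The length is derived from src->size
 * (the new data), never from the stale dst->size, and is checked against
 * both buffers first. dst_space and src_space are the total bytes
 * available starting at dst and src respectively.
 */
static int blob_put(struct abi_hdr *dst, size_t dst_space,
		    const struct abi_hdr *src, size_t src_space)
{
	assert(dst && src);

	if (dst_space < sizeof(*dst) || src_space < sizeof(*src))
		return -EINVAL;
	/* The claimed size must lie within the incoming buffer... */
	if (src->size > src_space - sizeof(*src))
		return -EINVAL;
	/* ...and must fit in the destination after its header. */
	if (src->size > dst_space - sizeof(*dst))
		return -EINVAL;

	memcpy(dst, src, sizeof(*src) + src->size);
	return 0;
}
```

A rejected request leaves dst untouched, so a failed put cannot leave a size field that disagrees with the stored payload.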
Peter Ujfalusi [Tue, 9 Jun 2026 08:34:56 +0000 (11:34 +0300)]
ASoC: SOF: ipc3-control: Validate size in snd_sof_update_control
In snd_sof_update_control(), firmware-provided cdata->num_elems is
checked against local_cdata->data->size but never against the actual
allocation size. If local_cdata->data->size was previously set to an
inconsistent value, the memcpy could write past the allocated buffer.
Add a bounds check to ensure num_elems fits within the available space
in the ipc_control_data allocation before copying.
Fixes: 10f461d79c2d ("ASoC: SOF: Add IPC3 topology control ops")
Cc: stable@vger.kernel.org
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Liam Girdwood <liam.r.girdwood@intel.com>
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Link: https://patch.msgid.link/20260609083458.31193-5-peter.ujfalusi@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Peter Ujfalusi [Tue, 9 Jun 2026 08:34:55 +0000 (11:34 +0300)]
ASoC: SOF: ipc3-control: Use overflow checks in control_update size calc
In sof_ipc3_control_update(), the expected_size calculation uses
firmware-provided cdata->num_elems in arithmetic that could overflow
on 32-bit platforms, wrapping to a small value. This would allow the
cdata->rhdr.hdr.size comparison to pass with mismatched sizes,
potentially leading to out-of-bounds access in snd_sof_update_control.
Use check_mul_overflow() and check_add_overflow() to detect and reject
overflowed size calculations.
Fixes: 10f461d79c2d ("ASoC: SOF: Add IPC3 topology control ops")
Cc: stable@vger.kernel.org
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Liam Girdwood <liam.r.girdwood@intel.com>
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Link: https://patch.msgid.link/20260609083458.31193-4-peter.ujfalusi@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
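The overflow-checked size calculation can be sketched in userspace with the GCC/Clang builtins that the kernel's check_mul_overflow() and check_add_overflow() helpers wrap. The function name, its parameters, and the use of uint32_t to model a 32-bit size are assumptions made for illustration; this is not the code from sof_ipc3_control_update().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compute hdr_size + num_elems * elem_size into *out.
 * Returns 0 on success or -EOVERFLOW if either step wraps, so a wrapped
 * (too small) result can never be compared against a message size.
 */
static int expected_ctrl_size(uint32_t hdr_size, uint32_t num_elems,
			      uint32_t elem_size, uint32_t *out)
{
	uint32_t payload;

	assert(out);

	/* Both builtins return true when the result does not fit. */
	if (__builtin_mul_overflow(num_elems, elem_size, &payload))
		return -EOVERFLOW;
	if (__builtin_add_overflow(hdr_size, payload, out))
		return -EOVERFLOW;

	return 0;
}
```

For example, 0x40000000 elements of 4 bytes each would wrap to 0 in unchecked 32-bit arithmetic; here the multiplication is rejected instead.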
Peter Ujfalusi [Tue, 9 Jun 2026 08:34:53 +0000 (11:34 +0300)]
ASoC: SOF: ipc4-control: Fix TOCTOU in sof_ipc4_bytes_put
In sof_ipc4_bytes_put(), the copy size is derived from the old
data->size in the buffer rather than the incoming new data's size
field from ucontrol. If the new data has a different size, the copy
uses the wrong length: it may truncate valid data or copy stale bytes.
Fix by validating and using the incoming data's sof_abi_hdr.size from
ucontrol before copying.
Fixes: a062c8899fed ("ASoC: SOF: ipc4-control: Add support for bytes control get and put")
Cc: stable@vger.kernel.org
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Liam Girdwood <liam.r.girdwood@intel.com>
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Link: https://patch.msgid.link/20260609083458.31193-2-peter.ujfalusi@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>