Juergen Gross [Thu, 9 Oct 2025 14:54:58 +0000 (16:54 +0200)]
xen/privcmd: restrict usage in unprivileged domU
The Xen privcmd driver allows to issue arbitrary hypercalls from
user space processes. This is normally no problem, as access is
usually limited to root and the hypervisor will deny any hypercalls
affecting other domains.
In case the guest is booted using secure boot, however, the privcmd
driver would be enabling a root user process to modify e.g. kernel
memory contents, thus breaking the secure boot feature.
The only known case where an unprivileged domU is really needing to
use the privcmd driver is the case when it is acting as the device
model for another guest. In this case all hypercalls issued via the
privcmd driver will target that other guest.
Fortunately the privcmd driver can already be locked down to allow
only hypercalls targeting a specific domain, but this mode can be
activated from user land only today.
The target domain can be obtained from Xenstore, so when not running
in dom0 restrict the privcmd driver to that target domain from the
beginning, resolving the potential problem of breaking secure boot.
This is XSA-482
Reported-by: Teddy Astie <teddy.astie@vates.tech> Fixes: 1c5de1939c20 ("xen: add privcmd driver") Signed-off-by: Juergen Gross <jgross@suse.com>
---
V2:
- defer reading from Xenstore if Xenstore isn't ready yet (Jan Beulich)
- wait in open() if target domain isn't known yet
- issue message in case no target domain found (Jan Beulich)
Linus Torvalds [Thu, 19 Mar 2026 23:13:51 +0000 (16:13 -0700)]
Merge tag 'pci-v7.0-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull pci fixes from Bjorn Helgaas:
- Create pwrctrl devices only for DT nodes below a PCI controller that
describe PCI devices and are related to a power supply; this prevents
waiting indefinitely for pwrctrl drivers that will never probe
(Manivannan Sadhasivam)
- Restore endpoint BAR mapping on subrange setup failure to make
selftest reliable (Koichiro Den)
* tag 'pci-v7.0-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
PCI: endpoint: pci-epf-test: Roll back BAR mapping when subrange setup fails
PCI/pwrctrl: Create pwrctrl devices only for PCI device nodes
PCI/pwrctrl: Ensure that remote endpoint node parent has supply requirement
* tag 'net-7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (78 commits)
MPTCP: fix lock class name family in pm_nl_create_listen_socket
icmp: fix NULL pointer dereference in icmp_tag_validation()
net: dsa: bcm_sf2: fix missing clk_disable_unprepare() in error paths
net: shaper: protect from late creation of hierarchy
net: shaper: protect late read accesses to the hierarchy
net: mvpp2: guard flow control update with global_tx_fc in buffer switching
nfnetlink_osf: validate individual option lengths in fingerprints
netfilter: nf_tables: release flowtable after rcu grace period on error
netfilter: bpf: defer hook memory release until rcu readers are done
net: bonding: fix NULL deref in bond_debug_rlb_hash_show
udp_tunnel: fix NULL deref caused by udp_sock_create6 when CONFIG_IPV6=n
net/mlx5e: Fix race condition during IPSec ESN update
net/mlx5e: Prevent concurrent access to IPSec ASO context
net/mlx5: qos: Restrict RTNL area to avoid a lock cycle
ipv6: add NULL checks for idev in SRv6 paths
NFC: nxp-nci: allow GPIOs to sleep
net: macb: fix uninitialized rx_fs_lock
net: macb: fix use-after-free access to PTP clock
netdevsim: drop PSP ext ref on forward failure
wifi: mac80211: always free skb on ieee80211_tx_prepare_skb() failure
...
Li Xiasong [Thu, 19 Mar 2026 11:21:59 +0000 (19:21 +0800)]
MPTCP: fix lock class name family in pm_nl_create_listen_socket
In mptcp_pm_nl_create_listen_socket(), use entry->addr.family
instead of sk->sk_family for lock class setup. The 'sk' parameter
is a netlink socket, not the MPTCP subflow socket being created.
Fixes: cee4034a3db1 ("mptcp: fix lockdep false positive in mptcp_pm_nl_create_listen_socket()") Signed-off-by: Li Xiasong <lixiasong1@huawei.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20260319112159.3118874-1-lixiasong1@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Weiming Shi [Wed, 18 Mar 2026 13:06:01 +0000 (21:06 +0800)]
icmp: fix NULL pointer dereference in icmp_tag_validation()
icmp_tag_validation() unconditionally dereferences the result of
rcu_dereference(inet_protos[proto]) without checking for NULL.
The inet_protos[] array is sparse -- only about 15 of 256 protocol
numbers have registered handlers. When ip_no_pmtu_disc is set to 3
(hardened PMTU mode) and the kernel receives an ICMP Fragmentation
Needed error with a quoted inner IP header containing an unregistered
protocol number, the NULL dereference causes a kernel panic in
softirq context.
Add a NULL check before accessing icmp_strict_tag_validation. If the
protocol has no registered handler, return false since it cannot
perform strict tag validation.
Fixes: 8ed1dc44d3e9 ("ipv4: introduce hardened ip_no_pmtu_disc mode") Reported-by: Xiang Mei <xmei5@asu.edu> Signed-off-by: Weiming Shi <bestswngs@gmail.com> Link: https://patch.msgid.link/20260318130558.1050247-4-bestswngs@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linus Torvalds [Thu, 19 Mar 2026 15:45:34 +0000 (08:45 -0700)]
Merge tag 'pm-7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix an idle loop issue exposed by recent changes and a race
condition related to device removal in the runtime PM core code:
- Consolidate the handling of two special cases in the idle loop that
occur when only one CPU idle state is present (Rafael Wysocki)
- Fix a race condition related to device removal in the runtime PM
core code that may cause a stale device object pointer to be
dereferenced (Bart Van Assche)"
* tag 'pm-7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM: runtime: Fix a race condition related to device removal
sched: idle: Consolidate the handling of two special cases
Linus Torvalds [Thu, 19 Mar 2026 15:42:59 +0000 (08:42 -0700)]
Merge tag 'acpi-7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI support fixes from Rafael Wysocki:
"These fix an MFD child automatic modprobe issue introduced recently,
an ACPI processor driver issue introduced by a previous fix and an
ACPICA issue causing confusing messages regarding _DSM arguments to be
printed:
- Update the format of the last argument of _DSM to avoid printing
confusing error messages in some cases (Saket Dumbre)
- Fix MFD child automatic modprobe issue by removing a stale check
from acpi_companion_match() (Pratap Nirujogi)
- Prevent possible use-after-free in acpi_processor_errata_piix4()
from occurring by rearranging the code to print debug messages
while holding references to relevant device objects (Rafael
Wysocki)"
* tag 'acpi-7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: bus: Fix MFD child automatic modprobe issue
ACPI: processor: Fix previous acpi_processor_errata_piix4() fix
ACPICA: Update the format of Arg3 of _DSM
Paolo Abeni [Thu, 19 Mar 2026 14:39:33 +0000 (15:39 +0100)]
Merge tag 'nf-26-03-19' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Florian Westphal says:
====================
netfilter: updates for net
The following patchset contains Netfilter fixes for *net*:
1) Fix UaF when netfilter bpf link goes away while nfnetlink dumps
current hook list, we have to wait until rcu readers are gone.
2) Fix UaF when flowtable fails to register all devices, similar
bug as 1). From Pablo Neira Ayuso.
3) nfnetlink_osf fails to properly validate option length fields.
From Weiming Shi.
netfilter pull request nf-26-03-19
* tag 'nf-26-03-19' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
nfnetlink_osf: validate individual option lengths in fingerprints
netfilter: nf_tables: release flowtable after rcu grace period on error
netfilter: bpf: defer hook memory release until rcu readers are done
====================
Jakub Kicinski [Tue, 17 Mar 2026 16:10:14 +0000 (09:10 -0700)]
net: shaper: protect from late creation of hierarchy
We look up a netdev during prep of Netlink ops (pre- callbacks)
and take a ref to it. Then later in the body of the callback
we take its lock or RCU which are the actual protections.
The netdev may get unregistered in between the time we take
the ref and the time we lock it. We may allocate the hierarchy
after flush has already run, which would lead to a leak.
Take the instance lock in pre- already, this saves us from the race
and removes the need for dedicated lock/unlock callbacks completely.
After all, if there's any chance of write happening concurrently
with the flush - we're back to leaking the hierarchy.
We may take the lock for devices which don't support shapers but
we're only dealing with SET operations here, not taking the lock
would be optimizing for an error case.
Jakub Kicinski [Tue, 17 Mar 2026 16:10:13 +0000 (09:10 -0700)]
net: shaper: protect late read accesses to the hierarchy
We look up a netdev during prep of Netlink ops (pre- callbacks)
and take a ref to it. Then later in the body of the callback
we take its lock or RCU which are the actual protections.
This is not proper, a conversion from a ref to a locked netdev
must include a liveness check (a check if the netdev hasn't been
unregistered already). Fix the read cases (those under RCU).
Writes needs a separate change to protect from creating the
hierarchy after flush has already run.
net: mvpp2: guard flow control update with global_tx_fc in buffer switching
mvpp2_bm_switch_buffers() unconditionally calls
mvpp2_bm_pool_update_priv_fc() when switching between per-cpu and
shared buffer pool modes. This function programs CM3 flow control
registers via mvpp2_cm3_read()/mvpp2_cm3_write(), which dereference
priv->cm3_base without any NULL check.
When the CM3 SRAM resource is not present in the device tree (the
third reg entry added by commit 60523583b07c ("dts: marvell: add CM3
SRAM memory to cp11x ethernet device tree")), priv->cm3_base remains
NULL and priv->global_tx_fc is false. Any operation that triggers
mvpp2_bm_switch_buffers(), for example an MTU change that crosses
the jumbo frame threshold, will crash:
Unable to handle kernel NULL pointer dereference at
virtual address 0000000000000000
Mem abort info:
ESR = 0x0000000096000006
EC = 0x25: DABT (current EL), IL = 32 bits
pc : readl+0x0/0x18
lr : mvpp2_cm3_read.isra.0+0x14/0x20
Call trace:
readl+0x0/0x18
mvpp2_bm_pool_update_fc+0x40/0x12c
mvpp2_bm_pool_update_priv_fc+0x94/0xd8
mvpp2_bm_switch_buffers.isra.0+0x80/0x1c0
mvpp2_change_mtu+0x140/0x380
__dev_set_mtu+0x1c/0x38
dev_set_mtu_ext+0x78/0x118
dev_set_mtu+0x48/0xa8
dev_ifsioc+0x21c/0x43c
dev_ioctl+0x2d8/0x42c
sock_ioctl+0x314/0x378
Every other flow control call site in the driver already guards
hardware access with either priv->global_tx_fc or port->tx_fc.
mvpp2_bm_switch_buffers() is the only place that omits this check.
Add the missing priv->global_tx_fc guard to both the disable and
re-enable calls in mvpp2_bm_switch_buffers(), consistent with the
rest of the driver.
Fixes: 3a616b92a9d1 ("net: mvpp2: Add TX flow control support for jumbo frames") Signed-off-by: Muhammad Hammad Ijaz <mhijaz@amazon.com> Reviewed-by: Gunnar Kudrjavets <gunnarku@amazon.com> Link: https://patch.msgid.link/20260316193157.65748-1-mhijaz@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Weiming Shi [Thu, 19 Mar 2026 07:32:44 +0000 (15:32 +0800)]
nfnetlink_osf: validate individual option lengths in fingerprints
nfnl_osf_add_callback() validates opt_num bounds and string
NUL-termination but does not check individual option length fields.
A zero-length option causes nf_osf_match_one() to enter the option
matching loop even when foptsize sums to zero, which matches packets
with no TCP options where ctx->optp is NULL:
Oops: general protection fault
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
RIP: 0010:nf_osf_match_one (net/netfilter/nfnetlink_osf.c:98)
Call Trace:
nf_osf_match (net/netfilter/nfnetlink_osf.c:227)
xt_osf_match_packet (net/netfilter/xt_osf.c:32)
ipt_do_table (net/ipv4/netfilter/ip_tables.c:293)
nf_hook_slow (net/netfilter/core.c:623)
ip_local_deliver (net/ipv4/ip_input.c:262)
ip_rcv (net/ipv4/ip_input.c:573)
Additionally, an MSS option (kind=2) with length < 4 causes
out-of-bounds reads when nf_osf_match_one() unconditionally accesses
optp[2] and optp[3] for MSS value extraction. While RFC 9293
section 3.2 specifies that the MSS option is always exactly 4
bytes (Kind=2, Length=4), the check uses "< 4" rather than
"!= 4" because lengths greater than 4 do not cause memory
safety issues -- the buffer is guaranteed to be at least
foptsize bytes by the ctx->optsize == foptsize check.
Reject fingerprints where any option has zero length, or where an MSS
option has length less than 4, at add time rather than trusting these
values in the packet matching hot path.
Fixes: 11eeef41d5f6 ("netfilter: passive OS fingerprint xtables match") Reported-by: Xiang Mei <xmei5@asu.edu> Signed-off-by: Weiming Shi <bestswngs@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de>
netfilter: nf_tables: release flowtable after rcu grace period on error
Call synchronize_rcu() after unregistering the hooks from error path,
since a hook that already refers to this flowtable can be already
registered, exposing this flowtable to packet path and nfnetlink_hook
control plane.
This error path is rare, it should only happen by reaching the maximum
number hooks or by failing to set up to hardware offload, just call
synchronize_rcu().
There is a check for already used device hooks by different flowtable
that could result in EEXIST at this late stage. The hook parser can be
updated to perform this check earlier to this error path really becomes
rarely exercised.
Uncovered by KASAN reported as use-after-free from nfnetlink_hook path
when dumping hooks.
Jakub Kicinski [Thu, 19 Mar 2026 02:25:40 +0000 (19:25 -0700)]
Merge tag 'wireless-2026-03-18' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless
Johannes Berg says:
====================
Just a few updates:
- cfg80211:
- guarantee pmsr work is cancelled
- mac80211:
- reject TDLS operations on non-TDLS stations
- fix crash in AP_VLAN bandwidth change
- fix leak or double-free on some TX preparation
failures
- remove keys needed for beacons _after_ stopping
those
- fix debugfs static branch race
- avoid underflow in inactive time
- fix another NULL dereference in mesh on invalid
frames
- ti/wlcore: avoid infinite realloc loop
* tag 'wireless-2026-03-18' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
wifi: mac80211: always free skb on ieee80211_tx_prepare_skb() failure
wifi: wlcore: Return -ENOMEM instead of -EAGAIN if there is not enough headroom
wifi: mac80211: fix NULL deref in mesh_matches_local()
wifi: mac80211: check tdls flag in ieee80211_tdls_oper
wifi: cfg80211: cancel pmsr_free_wk in cfg80211_pmsr_wdev_down
wifi: mac80211: Fix static_branch_dec() underflow for aql_disable.
mac80211: fix crash in ieee80211_chan_bw_change for AP_VLAN stations
wifi: mac80211: use jiffies_delta_to_msecs() for sta_info inactive times
wifi: mac80211: remove keys after disabling beaconing
wifi: mac80211_hwsim: fully initialise PMSR capabilities
====================
Xiang Mei [Tue, 17 Mar 2026 00:50:34 +0000 (17:50 -0700)]
net: bonding: fix NULL deref in bond_debug_rlb_hash_show
rlb_clear_slave intentionally keeps RLB hash-table entries on
the rx_hashtbl_used_head list with slave set to NULL when no
replacement slave is available. However, bond_debug_rlb_hash_show
visites client_info->slave without checking if it's NULL.
Other used-list iterators in bond_alb.c already handle this NULL-slave
state safely:
- rlb_update_client returns early on !client_info->slave
- rlb_req_update_slave_clients, rlb_clear_slave, and rlb_rebalance
compare slave values before visiting
- lb_req_update_subnet_clients continues if slave is NULL
The following NULL deref crash can be trigger in
bond_debug_rlb_hash_show:
Add a NULL check and print "(none)" for entries with no assigned slave.
Fixes: caafa84251b88 ("bonding: add the debugfs interface to see RLB hash table") Reported-by: Weiming Shi <bestswngs@gmail.com> Signed-off-by: Xiang Mei <xmei5@asu.edu> Link: https://patch.msgid.link/20260317005034.1888794-1-xmei5@asu.edu Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Xiang Mei [Tue, 17 Mar 2026 01:02:41 +0000 (18:02 -0700)]
udp_tunnel: fix NULL deref caused by udp_sock_create6 when CONFIG_IPV6=n
When CONFIG_IPV6 is disabled, the udp_sock_create6() function returns 0
(success) without actually creating a socket. Callers such as
fou_create() then proceed to dereference the uninitialized socket
pointer, resulting in a NULL pointer dereference.
This patch makes udp_sock_create6 return -EPFNOSUPPORT instead, so
callers correctly take their error paths. There is only one caller of
the vulnerable function and only privileged users can trigger it.
Fixes: fd384412e199b ("udp_tunnel: Seperate ipv6 functions into its own file.") Reported-by: Weiming Shi <bestswngs@gmail.com> Signed-off-by: Xiang Mei <xmei5@asu.edu> Link: https://patch.msgid.link/20260317010241.1893893-1-xmei5@asu.edu Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jianbo Liu [Mon, 16 Mar 2026 09:46:03 +0000 (11:46 +0200)]
net/mlx5e: Fix race condition during IPSec ESN update
In IPSec full offload mode, the device reports an ESN (Extended
Sequence Number) wrap event to the driver. The driver validates this
event by querying the IPSec ASO and checking that the esn_event_arm
field is 0x0, which indicates an event has occurred. After handling
the event, the driver must re-arm the context by setting esn_event_arm
back to 0x1.
A race condition exists in this handling path. After validating the
event, the driver calls mlx5_accel_esp_modify_xfrm() to update the
kernel's xfrm state. This function temporarily releases and
re-acquires the xfrm state lock.
So, need to acknowledge the event first by setting esn_event_arm to
0x1. This prevents the driver from reprocessing the same ESN update if
the hardware sends events for other reason. Since the next ESN update
only occurs after nearly 2^31 packets are received, there's no risk of
missing an update, as it will happen long after this handling has
finished.
Processing the event twice causes the ESN high-order bits (esn_msb) to
be incremented incorrectly. The driver then programs the hardware with
this invalid ESN state, which leads to anti-replay failures and a
complete halt of IPSec traffic.
Fix this by re-arming the ESN event immediately after it is validated,
before calling mlx5_accel_esp_modify_xfrm(). This ensures that any
spurious, duplicate events are correctly ignored, closing the race
window.
Fixes: fef06678931f ("net/mlx5e: Fix ESN update kernel panic") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260316094603.6999-4-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jianbo Liu [Mon, 16 Mar 2026 09:46:02 +0000 (11:46 +0200)]
net/mlx5e: Prevent concurrent access to IPSec ASO context
The query or updating IPSec offload object is through Access ASO WQE.
The driver uses a single mlx5e_ipsec_aso struct for each PF, which
contains a shared DMA-mapped context for all ASO operations.
A race condition exists because the ASO spinlock is released before
the hardware has finished processing WQE. If a second operation is
initiated immediately after, it overwrites the shared context in the
DMA area.
When the first operation's completion is processed later, it reads
this corrupted context, leading to unexpected behavior and incorrect
results.
This commit fixes the race by introducing a private context within
each IPSec offload object. The shared ASO context is now copied to
this private context while the ASO spinlock is held. Subsequent
processing uses this saved, per-object context, ensuring its integrity
is maintained.
Fixes: 1ed78fc03307 ("net/mlx5e: Update IPsec soft and hard limits") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260316094603.6999-3-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Fix that by restricting the rtnl-protected section to just the necessary
part, the call to netdev_master_upper_dev_get and speed querying, so
that the last lock dependency is avoided and the cycle doesn't close.
This is safe because mlx5_uplink_netdev_get uses netdev_hold to keep the
uplink netdev alive while its master device is queried.
Use this opportunity to rename the ambiguously-named "hold_rtnl_lock"
argument to "take_rtnl" and remove the "_locked" suffix from
mlx5_esw_qos_lag_link_speed_get_locked.
Jakub Kicinski [Thu, 19 Mar 2026 00:41:00 +0000 (17:41 -0700)]
Merge tag 'batadv-net-pullrequest-20260317' of https://git.open-mesh.org/linux-merge
Simon Wunderlich says:
====================
Here is a batman-adv bugfix:
- avoid OGM aggregation when skb tailroom is insufficient, by Yang Yang
* tag 'batadv-net-pullrequest-20260317' of https://git.open-mesh.org/linux-merge:
batman-adv: avoid OGM aggregation when skb tailroom is insufficient
====================
Jakub Kicinski [Thu, 19 Mar 2026 00:38:15 +0000 (17:38 -0700)]
Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2026-03-17 (igc, iavf, libie)
Kohei Enju adds use of helper function to add missing update of
skb->tail when padding is needed for igc.
Zdenek Bouska clears stale XSK timestamps when taking down Tx rings on
igc.
Petr Oros changes handling of iavf VLAN filter handling when an added
VLAN is also on the delete list to which can race and cause the VLAN
filter to not be added.
Michal frees cmd_buf for libie firmware logging to stop memory leaks.
* '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
libie: prevent memleak in fwlog code
iavf: fix VLAN filter lost on add/delete race
igc: fix page fault in XDP TX timestamps handling
igc: fix missing update of skb->tail in igc_xmit_frame()
====================
Minhong He [Mon, 16 Mar 2026 07:33:01 +0000 (15:33 +0800)]
ipv6: add NULL checks for idev in SRv6 paths
__in6_dev_get() can return NULL when the device has no IPv6 configuration
(e.g. MTU < IPV6_MIN_MTU or after NETDEV_UNREGISTER).
Add NULL checks for idev returned by __in6_dev_get() in both
seg6_hmac_validate_skb() and ipv6_srh_rcv() to prevent potential NULL
pointer dereferences.
Fixes: 1ababeba4a21 ("ipv6: implement dataplane support for rthdr type 4 (Segment Routing Header)") Fixes: bf355b8d2c30 ("ipv6: sr: add core files for SR HMAC support") Signed-off-by: Minhong He <heminhong@kylinos.cn> Reviewed-by: Andrea Mayer <andrea.mayer@uniroma2.it> Link: https://patch.msgid.link/20260316073301.106643-1-heminhong@kylinos.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Fedor Pchelkin [Mon, 16 Mar 2026 10:38:25 +0000 (13:38 +0300)]
net: macb: fix uninitialized rx_fs_lock
If hardware doesn't support RX Flow Filters, rx_fs_lock spinlock is not
initialized leading to the following assertion splat triggerable via
set_rxnfc callback.
INFO: trying to register non-static key.
The code is fine but needs lockdep annotation, or maybe
you didn't initialize this object before use?
turning off the locking correctness validator.
CPU: 1 PID: 949 Comm: syz.0.6 Not tainted 6.1.164+ #113
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.1-0-g3208b098f51a-prebuilt.qemu.org 04/01/2014
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x8d/0xba lib/dump_stack.c:106
assign_lock_key kernel/locking/lockdep.c:974 [inline]
register_lock_class+0x141b/0x17f0 kernel/locking/lockdep.c:1287
__lock_acquire+0x74f/0x6c40 kernel/locking/lockdep.c:4928
lock_acquire kernel/locking/lockdep.c:5662 [inline]
lock_acquire+0x190/0x4b0 kernel/locking/lockdep.c:5627
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x33/0x50 kernel/locking/spinlock.c:162
gem_del_flow_filter drivers/net/ethernet/cadence/macb_main.c:3562 [inline]
gem_set_rxnfc+0x533/0xac0 drivers/net/ethernet/cadence/macb_main.c:3667
ethtool_set_rxnfc+0x18c/0x280 net/ethtool/ioctl.c:961
__dev_ethtool net/ethtool/ioctl.c:2956 [inline]
dev_ethtool+0x229c/0x6290 net/ethtool/ioctl.c:3095
dev_ioctl+0x637/0x1070 net/core/dev_ioctl.c:510
sock_do_ioctl+0x20d/0x2c0 net/socket.c:1215
sock_ioctl+0x577/0x6d0 net/socket.c:1320
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:870 [inline]
__se_sys_ioctl fs/ioctl.c:856 [inline]
__x64_sys_ioctl+0x18c/0x210 fs/ioctl.c:856
do_syscall_x64 arch/x86/entry/common.c:46 [inline]
do_syscall_64+0x35/0x80 arch/x86/entry/common.c:76
entry_SYSCALL_64_after_hwframe+0x6e/0xd8
A more straightforward solution would be to always initialize rx_fs_lock,
just like rx_fs_list. However, in this case the driver set_rxnfc callback
would return with a rather confusing error code, e.g. -EINVAL. So deny
set_rxnfc attempts directly if the RX filtering feature is not supported
by hardware.
Fedor Pchelkin [Mon, 16 Mar 2026 10:38:24 +0000 (13:38 +0300)]
net: macb: fix use-after-free access to PTP clock
PTP clock is registered on every opening of the interface and destroyed on
every closing. However it may be accessed via get_ts_info ethtool call
which is possible while the interface is just present in the kernel.
BUG: KASAN: use-after-free in ptp_clock_index+0x47/0x50 drivers/ptp/ptp_clock.c:426
Read of size 4 at addr ffff8880194345cc by task syz.0.6/948
Set the PTP clock pointer to NULL after unregistering.
Fixes: c2594d804d5c ("macb: Common code to enable ptp support for MACB/GEM") Cc: stable@vger.kernel.org Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru> Link: https://patch.msgid.link/20260316103826.74506-1-pchelkin@ispras.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Wesley Atwell [Tue, 17 Mar 2026 06:14:31 +0000 (00:14 -0600)]
netdevsim: drop PSP ext ref on forward failure
nsim_do_psp() takes an extra reference to the PSP skb extension so the
extension survives __dev_forward_skb(). That forward path scrubs the skb
and drops attached skb extensions before nsim_psp_handle_ext() can
reattach the PSP metadata.
If __dev_forward_skb() fails in nsim_forward_skb(), the function returns
before nsim_psp_handle_ext() can attach that extension to the skb, leaving
the extra reference leaked.
Drop the saved PSP extension reference before returning from the
forward-failure path. Guard the put because plain or non-decapsulated
traffic can also fail forwarding without ever taking the extra PSP
reference.
Fixes: f857478d6206 ("netdevsim: a basic test PSP implementation") Signed-off-by: Wesley Atwell <atwellwea@gmail.com> Reviewed-by: Daniel Zahka <daniel.zahka@gmail.com> Link: https://patch.msgid.link/20260317061431.1482716-1-atwellwea@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linus Torvalds [Wed, 18 Mar 2026 22:50:29 +0000 (15:50 -0700)]
Merge tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library fixes from Eric Biggers:
- Disable the "padlock" SHA-1 and SHA-256 driver on Zhaoxin
processors, since it does not compute hash values correctly
- Make a generated file be removed by 'make clean'
- Fix excessive stack usage in some of the arm64 AES code
* tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
lib/crypto: powerpc: Add powerpc/aesp8-ppc.S to clean-files
crypto: padlock-sha - Disable for Zhaoxin processor
crypto: arm64/aes-neonbs - Move key expansion off the stack
Linus Torvalds [Wed, 18 Mar 2026 21:27:11 +0000 (14:27 -0700)]
Merge tag 'nfsd-7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd fixes from Chuck Lever:
- Fix cache_request leak in cache_release()
- Fix heap overflow in the NFSv4.0 LOCK replay cache
- Hold net reference for the lifetime of /proc/fs/nfs/exports fd
- Defer sub-object cleanup in export "put" callbacks
* tag 'nfsd-7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
nfsd: fix heap overflow in NFSv4.0 LOCK replay cache
sunrpc: fix cache_request leak in cache_release
NFSD: Hold net reference for the lifetime of /proc/fs/nfs/exports fd
NFSD: Defer sub-object cleanup in export put callbacks
Linus Torvalds [Wed, 18 Mar 2026 15:28:54 +0000 (08:28 -0700)]
Merge tag 'soc-fixes-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC fixes from Arnd Bergmann:
"The firmware drivers for ARM SCMI, FF-A and the Tee subsystem, as
well as the reset controller and cache controller subsystem all see
small bugfixes for reference ounting errors, ABI correctness, and
NULL pointer dereferences.
Similarly, there are multiple reference counting fixes in drivers/soc/
for vendor specific drivers (rockchips, microchip), while the
freescale drivers get a fix for a race condition and error handling.
The devicetree fixes for Rockchips and NXP got held up, so for
the moment there is only Renesas fixing problesm with SD card
initialization, a boot hang on one board and incorrect descriptions
for interrupts and clock registers on some SoCs. The Microchip
polarfire gets a dts fix for a boot time warning.
A defconfig fix avoids a warning about a conflicting assignment"
* tag 'soc-fixes-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (21 commits)
ARM: multi_v7_defconfig: Drop duplicate CONFIG_TI_PRUSS=m
firmware: arm_scmi: Spelling s/mulit/multi/, s/currenly/currently/
firmware: arm_scmi: Fix NULL dereference on notify error path
firmware: arm_scpi: Fix device_node reference leak in probe path
firmware: arm_ffa: Remove vm_id argument in ffa_rxtx_unmap()
arm64: dts: renesas: r8a78000: Fix out-of-range SPI interrupt numbers
arm64: dts: renesas: rzg3s-smarc-som: Set bypass for Versa3 PLL2
arm64: dts: renesas: r9a09g087: Fix CPG register region sizes
arm64: dts: renesas: r9a09g077: Fix CPG register region sizes
arm64: dts: renesas: r9a09g057: Remove wdt{0,2,3} nodes
arm64: dts: renesas: rzv2-evk-cn15-sd: Add ramp delay for SD0 regulator
arm64: dts: renesas: rzt2h-n2h-evk: Add ramp delay for SD0 card regulator
tee: shm: Remove refcounting of kernel pages
reset: rzg2l-usbphy-ctrl: Check pwrrdy is valid before using it
soc: fsl: cpm1: qmc: Fix error check for devm_ioremap_resource() in qmc_qe_init_resources()
soc: fsl: qbman: fix race condition in qman_destroy_fq
soc: rockchip: grf: Add missing of_node_put() when returning
cache: ax45mp: Fix device node reference leak in ax45mp_cache_init()
cache: starfive: fix device node leak in starlink_cache_init()
riscv: dts: microchip: add can resets to mpfs
...
Linus Torvalds [Wed, 18 Mar 2026 15:06:30 +0000 (08:06 -0700)]
Merge tag 'loongarch-fixes-7.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch fixes from Huacai Chen:
- only use SC.Q when supported by the assembler to fix a build failure
- fix calling smp_processor_id() in preemptible code
- make a BPF helper arch_protect_bpf_trampoline() return 0 to fix a
kernel memory access failure
- fix a typo issue in kvm_vm_init_features()
* tag 'loongarch-fixes-7.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
LoongArch: KVM: Fix typo issue in kvm_vm_init_features()
LoongArch: BPF: Make arch_protect_bpf_trampoline() return 0
LoongArch: No need to flush icache if text copy failed
LoongArch: Check return values for set_memory_{rw,rox}
LoongArch: Give more information if kmem access failed
LoongArch: Fix calling smp_processor_id() in preemptible code
LoongArch: Only use SC.Q when supported by the assembler
Arnd Bergmann [Wed, 18 Mar 2026 13:06:34 +0000 (14:06 +0100)]
Merge tag 'scmi-fixes-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into arm/fixes
Arm SCMI fixes for v7.0
Few fixes to:
1. Address a NULL dereference in the SCMI notify error path by ensurin
__scmi_event_handler_get_ops() consistently returns an ERR_PTR on
failure, as expected by callers.
2. Fix a device_node reference leak in the SCPI probe path by introducing
scope-based cleanup for acquired DT nodes.
3. Correct minor spelling errors.
* tag 'scmi-fixes-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux:
firmware: arm_scmi: Spelling s/mulit/multi/, s/currenly/currently/
firmware: arm_scmi: Fix NULL dereference on notify error path
firmware: arm_scpi: Fix device_node reference leak in probe path
Arnd Bergmann [Wed, 18 Mar 2026 13:05:30 +0000 (14:05 +0100)]
Merge tag 'ffa-fix-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into arm/fixes
Arm FF-A fix for v7.0
Fix removing the vm_id argument from ffa_rxtx_unmap(), as the FF-A
specification mandates this field be zero in all contexts except a
non-secure physical FF-A instance, where the ID is inherently 0.
* tag 'ffa-fix-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux:
firmware: arm_ffa: Remove vm_id argument in ffa_rxtx_unmap()
Pratap Nirujogi [Wed, 18 Mar 2026 03:47:57 +0000 (23:47 -0400)]
ACPI: bus: Fix MFD child automatic modprobe issue
MFD child devices sharing parent's ACPI Companion fails to probe as
acpi_companion_match() returns incompatible ACPI Companion handle for
binding with the check for pnp.type.backlight added recently. Remove this
pnp.type.backlight check in acpi_companion_match() to fix the automatic
modprobe issue.
After commi f132e089fe89 ("ACPI: processor: Fix NULL-pointer dereference
in acpi_processor_errata_piix4()"), device pointers may be dereferenced
after dropping references to the device objects pointed to by them,
which may cause a use-after-free to occur.
Moreover, debug messages about enabling the errata may be printed
if the errata flags corresponding to them are unset.
Address all of these issues by moving message printing to the points
in the code where the errata flags are set.
Fixes: f132e089fe89 ("ACPI: processor: Fix NULL-pointer dereference in acpi_processor_errata_piix4()") Reported-by: Guenter Roeck <linux@roeck-us.net> Closes: https://lore.kernel.org/linux-acpi/938e2206-def5-4b7a-9b2c-d1fd37681d8a@roeck-us.net/ Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/5975693.DvuYhMxLoT@rafael.j.wysocki
Felix Fietkau [Sat, 14 Mar 2026 06:54:55 +0000 (06:54 +0000)]
wifi: mac80211: always free skb on ieee80211_tx_prepare_skb() failure
ieee80211_tx_prepare_skb() has three error paths, but only two of them
free the skb. The first error path (ieee80211_tx_prepare() returning
TX_DROP) does not free it, while invoke_tx_handlers() failure and the
fragmentation check both do.
Add kfree_skb() to the first error path so all three are consistent,
and remove the now-redundant frees in callers (ath9k, mt76,
mac80211_hwsim) to avoid double-free.
Document the skb ownership guarantee in the function's kdoc.
Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://patch.msgid.link/20260314065455.2462900-1-nbd@nbd.name Fixes: 06be6b149f7e ("mac80211: add ieee80211_tx_prepare_skb() helper function") Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Guenter Roeck [Wed, 18 Mar 2026 06:46:36 +0000 (23:46 -0700)]
wifi: wlcore: Return -ENOMEM instead of -EAGAIN if there is not enough headroom
Since upstream commit e75665dd0968 ("wifi: wlcore: ensure skb headroom
before skb_push"), wl1271_tx_allocate() and with it
wl1271_prepare_tx_frame() returns -EAGAIN if pskb_expand_head() fails.
However, in wlcore_tx_work_locked(), a return value of -EAGAIN from
wl1271_prepare_tx_frame() is interpreted as the aggregation buffer being
full. This causes the code to flush the buffer, put the skb back at the
head of the queue, and immediately retry the same skb in a tight while
loop.
Because wlcore_tx_work_locked() holds wl->mutex, and the retry happens
immediately with GFP_ATOMIC, this will result in an infinite loop and a
CPU soft lockup. Return -ENOMEM instead so the packet is dropped and
the loop terminates.
The problem was found by an experimental code review agent based on
gemini-3.1-pro while reviewing backports into v6.18.y.
Assisted-by: Gemini:gemini-3.1-pro Fixes: e75665dd0968 ("wifi: wlcore: ensure skb headroom before skb_push") Cc: Peter Astrand <astrand@lysator.liu.se> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Link: https://patch.msgid.link/20260318064636.3065925-1-linux@roeck-us.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Xiang Mei [Wed, 18 Mar 2026 03:42:44 +0000 (20:42 -0700)]
wifi: mac80211: fix NULL deref in mesh_matches_local()
mesh_matches_local() unconditionally dereferences ie->mesh_config to
compare mesh configuration parameters. When called from
mesh_rx_csa_frame(), the parsed action-frame elements may not contain a
Mesh Configuration IE, leaving ie->mesh_config NULL and triggering a
kernel NULL pointer dereference.
The other two callers are already safe:
- ieee80211_mesh_rx_bcn_presp() checks !elems->mesh_config before
calling mesh_matches_local()
- mesh_plink_get_event() is only reached through
mesh_process_plink_frame(), which checks !elems->mesh_config, too
mesh_rx_csa_frame() is the only caller that passes raw parsed elements
to mesh_matches_local() without guarding mesh_config. An adjacent
attacker can exploit this by sending a crafted CSA action frame that
includes a valid Mesh ID IE but omits the Mesh Configuration IE,
crashing the kernel.
This patch adds a NULL check for ie->mesh_config at the top of
mesh_matches_local() to return false early when the Mesh Configuration
IE is absent.
Fixes: 2e3c8736820b ("mac80211: support functions for mesh") Reported-by: Weiming Shi <bestswngs@gmail.com> Signed-off-by: Xiang Mei <xmei5@asu.edu> Link: https://patch.msgid.link/20260318034244.2595020-1-xmei5@asu.edu Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Junrui Luo [Sat, 14 Mar 2026 09:41:04 +0000 (17:41 +0800)]
bnxt_en: fix OOB access in DBG_BUF_PRODUCER async event handler
The ASYNC_EVENT_CMPL_EVENT_ID_DBG_BUF_PRODUCER handler in
bnxt_async_event_process() uses a firmware-supplied 'type' field
directly as an index into bp->bs_trace[] without bounds validation.
The 'type' field is a 16-bit value extracted from DMA-mapped completion
ring memory that the NIC writes directly to host RAM. A malicious or
compromised NIC can supply any value from 0 to 65535, causing an
out-of-bounds access into kernel heap memory.
The bnxt_bs_trace_check_wrap() call then dereferences bs_trace->magic_byte
and writes to bs_trace->last_offset and bs_trace->wrapped, leading to
kernel memory corruption or a crash.
Fix by adding a bounds check and defining BNXT_TRACE_MAX as
DBG_LOG_BUFFER_FLUSH_REQ_TYPE_ERR_QPC_TRACE + 1 to cover all currently
defined firmware trace types (0x0 through 0xc).
Linus Torvalds [Tue, 17 Mar 2026 22:19:58 +0000 (15:19 -0700)]
Merge tag 'libnvdimm-fixes-7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fix from Ira Weiny:
- Fix old potential use after free bug
* tag 'libnvdimm-fixes-7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
nvdimm/bus: Fix potential use after free in asynchronous initialization
Linus Torvalds [Tue, 17 Mar 2026 22:00:53 +0000 (15:00 -0700)]
Merge tag 'linux_kselftest-kunit-fixes-7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kunit fix from Shuah Khan:
- Add documentation for --list_suites feature
* tag 'linux_kselftest-kunit-fixes-7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
kunit: Add documentation of --list_suites
Fixes: 96a9a9341cda ("ice: configure FW logging") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Linus Torvalds [Tue, 17 Mar 2026 20:55:51 +0000 (13:55 -0700)]
Merge tag 'hid-for-linus-2026031701' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID fixes from Jiri Kosina:
- various fixes dealing with (intentionally) broken devices in HID
core, logitech-hidpp and multitouch drivers (Lee Jones)
- fix for OOB in wacom driver (Benoît Sevens)
- fix for potentialy HID-bpf-induced buffer overflow in () (Benjamin
Tissoires)
- various other small fixes and device ID / quirk additions
* tag 'hid-for-linus-2026031701' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
HID: multitouch: Check to ensure report responses match the request
HID: logitech-hidpp: Prevent use-after-free on force feedback initialisation failure
HID: bpf: prevent buffer overflow in hid_hw_request
selftests/hid: fix compilation when bpf_wq and hid_device are not exported
HID: core: Mitigate potential OOB by removing bogus memset()
HID: intel-thc-hid: Set HID_PHYS with PCI BDF
HID: appletb-kbd: add .resume method in PM
HID: logitech-hidpp: Enable MX Master 4 over bluetooth
HID: input: Add HID_BATTERY_QUIRK_DYNAMIC for Elan touchscreens
HID: input: Drop Asus UX550* touchscreen ignore battery quirks
HID: asus: add xg mobile 2022 external hardware support
HID: wacom: fix out-of-bounds read in wacom_intuos_bt_irq
Petr Oros [Wed, 25 Feb 2026 10:01:37 +0000 (11:01 +0100)]
iavf: fix VLAN filter lost on add/delete race
When iavf_add_vlan() finds an existing filter in IAVF_VLAN_REMOVE
state, it transitions the filter to IAVF_VLAN_ACTIVE assuming the
pending delete can simply be cancelled. However, there is no guarantee
that iavf_del_vlans() has not already processed the delete AQ request
and removed the filter from the PF. In that case the filter remains in
the driver's list as IAVF_VLAN_ACTIVE but is no longer programmed on
the NIC. Since iavf_add_vlans() only picks up filters in
IAVF_VLAN_ADD state, the filter is never re-added, and spoof checking
drops all traffic for that VLAN.
CPU0 CPU1 Workqueue
---- ---- ---------
iavf_del_vlan(vlan 100)
f->state = REMOVE
schedule AQ_DEL_VLAN
iavf_add_vlan(vlan 100)
f->state = ACTIVE
iavf_del_vlans()
f is ACTIVE, skip
iavf_add_vlans()
f is ACTIVE, skip
Filter is ACTIVE in driver but absent from NIC.
Transition to IAVF_VLAN_ADD instead and schedule
IAVF_FLAG_AQ_ADD_VLAN_FILTER so iavf_add_vlans() re-programs the
filter. A duplicate add is idempotent on the PF.
Fixes: 0c0da0e95105 ("iavf: refactor VLAN filter states") Signed-off-by: Petr Oros <poros@redhat.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Zdenek Bouska [Wed, 25 Feb 2026 09:58:29 +0000 (10:58 +0100)]
igc: fix page fault in XDP TX timestamps handling
If an XDP application that requested TX timestamping is shutting down
while the link of the interface in use is still up the following kernel
splat is reported:
Kohei Enju [Sat, 14 Feb 2026 19:46:32 +0000 (19:46 +0000)]
igc: fix missing update of skb->tail in igc_xmit_frame()
igc_xmit_frame() misses updating skb->tail when the packet size is
shorter than the minimum one.
Use skb_put_padto() in alignment with other Intel Ethernet drivers.
Fixes: 0507ef8a0372 ("igc: Add transmit and receive fastpath and interrupt handlers") Signed-off-by: Kohei Enju <kohei@enjuk.jp> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Avigail Dahan <avigailx.dahan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Koichiro Den [Mon, 16 Mar 2026 14:02:25 +0000 (23:02 +0900)]
PCI: endpoint: pci-epf-test: Roll back BAR mapping when subrange setup fails
When the BAR subrange mapping test on DWC-based platforms fails due to
insufficient free inbound iATU regions, pci_epf_test_bar_subrange_setup()
returns an error (-ENOSPC) but does not restore the original BAR mapping.
This causes subsequent test runs to become confusing, since the failure may
leave room for the next subrange mapping test to pass.
Fix this by restoring the original BAR mapping when preparation of the
subrange mapping fails, so that no side effect remains regardless of the
test success or failure.
Fixes: 6c5e6101423b ("PCI: endpoint: pci-epf-test: Add BAR subrange mapping test support") Reported-by: Christian Bruel <christian.bruel@foss.st.com> Closes: https://lore.kernel.org/linux-pci/b2b03ebe-9482-4a13-b22f-7b44da096eed@foss.st.com/ Signed-off-by: Koichiro Den <den@valinux.co.jp> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Christian Bruel <christian.bruel@foss.st.com> Reviewed-by: Niklas Cassel <cassel@kernel.org> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org> Link: https://patch.msgid.link/20260316140225.1481658-1-den@valinux.co.jp
Nikola Z. Ivanov [Fri, 13 Mar 2026 14:16:43 +0000 (16:16 +0200)]
net: usb: aqc111: Do not perform PM inside suspend callback
syzbot reports "task hung in rpm_resume"
This is caused by aqc111_suspend calling
the PM variant of its write_cmd routine.
The simplified call trace looks like this:
rpm_suspend()
usb_suspend_both() - here udev->dev.power.runtime_status == RPM_SUSPENDING
aqc111_suspend() - called for the usb device interface
aqc111_write32_cmd()
usb_autopm_get_interface()
pm_runtime_resume_and_get()
rpm_resume() - here we call rpm_resume() on our parent
rpm_resume() - Here we wait for a status change that will never happen.
At this point we block another task which holds
rtnl_lock and locks up the whole networking stack.
Fix this by replacing the write_cmd calls with their _nopm variants
Reported-by: syzbot+48dc1e8dfc92faf1124c@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=48dc1e8dfc92faf1124c Fixes: e58ba4544c77 ("net: usb: aqc111: Add support for wake on LAN by MAGIC packet") Signed-off-by: Nikola Z. Ivanov <zlatistiv@gmail.com> Link: https://patch.msgid.link/20260313141643.1181386-1-zlatistiv@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Daniel Borkmann [Fri, 13 Mar 2026 06:55:31 +0000 (07:55 +0100)]
clsact: Fix use-after-free in init/destroy rollback asymmetry
Fix a use-after-free in the clsact qdisc upon init/destroy rollback asymmetry.
The latter is achieved by first fully initializing a clsact instance, and
then in a second step having a replacement failure for the new clsact qdisc
instance. clsact_init() initializes ingress first and then takes care of the
egress part. This can fail midway, for example, via tcf_block_get_ext(). Upon
failure, the kernel will trigger the clsact_destroy() callback.
Commit 1cb6f0bae504 ("bpf: Fix too early release of tcx_entry") details the
way how the transition is happening. If tcf_block_get_ext on the q->ingress_block
ends up failing, we took the tcx_miniq_inc reference count on the ingress
side, but not yet on the egress side. clsact_destroy() tests whether the
{ingress,egress}_entry was non-NULL. However, even in midway failure on the
replacement, both are in fact non-NULL with a valid egress_entry from the
previous clsact instance.
What we really need to test for is whether the qdisc instance-specific ingress
or egress side previously got initialized. This adds a small helper for checking
the miniq initialization called mini_qdisc_pair_inited, and utilizes that upon
clsact_destroy() in order to fix the use-after-free scenario. Convert the
ingress_destroy() side as well so both are consistent to each other.
Fixes: 1cb6f0bae504 ("bpf: Fix too early release of tcx_entry") Reported-by: Keenan Dong <keenanat2000@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20260313065531.98639-1-daniel@iogearbox.net Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Lee Jones [Fri, 27 Feb 2026 16:30:25 +0000 (16:30 +0000)]
HID: multitouch: Check to ensure report responses match the request
It is possible for a malicious (or clumsy) device to respond to a
specific report's feature request using a completely different report
ID. This can cause confusion in the HID core resulting in nasty
side-effects such as OOB writes.
Add a check to ensure that the report ID in the response, matches the
one that was requested. If it doesn't, omit reporting the raw event and
return early.
Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
Merge tag 'v7.0-rockchip-drvfixes1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into arm/fixes
Fixing a missing of_node_put() call.
* tag 'v7.0-rockchip-drvfixes1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
soc: rockchip: grf: Add missing of_node_put() when returning
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Paul SAGE [Sat, 14 Mar 2026 21:54:30 +0000 (03:24 +0530)]
tg3: replace placeholder MAC address with device property
On some systems (e.g. iMac 20,1 with BCM57766), the tg3 driver reads
a default placeholder mac address (00:10:18:00:00:00) from the
mailbox. The correct value on those systems are stored in the
'local-mac-address' property.
This patch, detect the default value and tries to retrieve
the correct address from the device_get_mac_address
function instead.
The patch has been tested on two different systems:
- iMac 20,1 (BCM57766) model which use the local-mac-address property
- iMac 13,2 (BCM57766) model which can use the mailbox,
NVRAM or MAC control registers
Tested-by: Rishon Jonathan R <mithicalaviator85@gmail.com> Co-developed-by: Vincent MORVAN <vinc@42.fr> Signed-off-by: Vincent MORVAN <vinc@42.fr> Signed-off-by: Paul SAGE <paul.sage@42.fr> Signed-off-by: Atharva Tiwari <atharvatiwarilinuxdev@gmail.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20260314215432.3589-1-atharvatiwarilinuxdev@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The nframes bounds check in cdc_ncm_rx_verify_ndp16() and
cdc_ncm_rx_verify_ndp32() does not account for ndpoffset,
allowing out-of-bounds reads when the NDP is placed near the
end of the NTB.
====================
Tobi Gaertner [Sat, 14 Mar 2026 05:46:40 +0000 (22:46 -0700)]
net: usb: cdc_ncm: add ndpoffset to NDP32 nframes bounds check
The same bounds-check bug fixed for NDP16 in the previous patch also
exists in cdc_ncm_rx_verify_ndp32(). The DPE array size is validated
against the total skb length without accounting for ndpoffset, allowing
out-of-bounds reads when the NDP32 is placed near the end of the NTB.
Add ndpoffset to the nframes bounds check and use struct_size_t() to
express the NDP-plus-DPE-array size more clearly.
Tobi Gaertner [Sat, 14 Mar 2026 05:46:39 +0000 (22:46 -0700)]
net: usb: cdc_ncm: add ndpoffset to NDP16 nframes bounds check
cdc_ncm_rx_verify_ndp16() validates that the NDP header and its DPE
entries fit within the skb. The first check correctly accounts for
ndpoffset:
if ((ndpoffset + sizeof(struct usb_cdc_ncm_ndp16)) > skb_in->len)
but the second check omits it:
if ((sizeof(struct usb_cdc_ncm_ndp16) +
ret * (sizeof(struct usb_cdc_ncm_dpe16))) > skb_in->len)
This validates the DPE array size against the total skb length as if
the NDP were at offset 0, rather than at ndpoffset. When the NDP is
placed near the end of the NTB (large wNdpIndex), the DPE entries can
extend past the skb data buffer even though the check passes.
cdc_ncm_rx_fixup() then reads out-of-bounds memory when iterating
the DPE array.
Add ndpoffset to the nframes bounds check and use struct_size_t() to
express the NDP-plus-DPE-array size more clearly.
Lorenzo Bianconi [Fri, 13 Mar 2026 11:27:00 +0000 (12:27 +0100)]
net: airoha: Remove airoha_dev_stop() in airoha_remove()
Do not run airoha_dev_stop routine explicitly in airoha_remove()
since ndo_stop() callback is already executed by unregister_netdev() in
__dev_close_many routine if necessary and, doing so, we will end up causing
an underflow in the qdma users atomic counters. Rely on networking subsystem
to stop the device removing the airoha_eth module.
Jamal Hadi Salim [Sun, 15 Mar 2026 15:54:22 +0000 (11:54 -0400)]
net/sched: teql: Fix double-free in teql_master_xmit
Whenever a TEQL devices has a lockless Qdisc as root, qdisc_reset should
be called using the seq_lock to avoid racing with the datapath. Failure
to do so may cause crashes like the following:
Workflow to reproduce:
1. Initialize a TEQL topology (dummy0 and ifb0 as slaves, teql0 up).
2. Start multiple sender workers continuously transmitting packets
through teql0 to drive teql_master_xmit().
3. In parallel, repeatedly delete and re-add the root qdisc on
dummy0 and ifb0 via RTNETLINK, forcing frequent teardown and reset activity
(teql_destroy() / qdisc_reset()).
4. After running both workloads concurrently for several iterations,
KASAN reports slab-use-after-free or double-free in the skb free path.
Fix this by moving dev_reset_queue to sch_generic.h and calling it, instead
of qdisc_reset, in teql_destroy since it handles both the lock and lockless
cases correctly for root qdiscs.
Fixes: 96009c7d500e ("sched: replace __QDISC_STATE_RUNNING bit with a spin lock") Reported-by: Xianrui Dong <keenanat2000@gmail.com> Tested-by: Xianrui Dong <keenanat2000@gmail.com> Co-developed-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://patch.msgid.link/20260315155422.147256-1-jhs@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiayuan Chen [Thu, 12 Mar 2026 09:29:07 +0000 (17:29 +0800)]
net/smc: fix NULL dereference and UAF in smc_tcp_syn_recv_sock()
Syzkaller reported a panic in smc_tcp_syn_recv_sock() [1].
smc_tcp_syn_recv_sock() is called in the TCP receive path
(softirq) via icsk_af_ops->syn_recv_sock on the clcsock (TCP
listening socket). It reads sk_user_data to get the smc_sock
pointer. However, when the SMC listen socket is being closed
concurrently, smc_close_active() sets clcsock->sk_user_data
to NULL under sk_callback_lock, and then the smc_sock itself
can be freed via sock_put() in smc_release().
This leads to two issues:
1) NULL pointer dereference: sk_user_data is NULL when
accessed.
2) Use-after-free: sk_user_data is read as non-NULL, but the
smc_sock is freed before its fields (e.g., queued_smc_hs,
ori_af_ops) are accessed.
The race window looks like this (the syzkaller crash [1]
triggers via the SYN cookie path: tcp_get_cookie_sock() ->
smc_tcp_syn_recv_sock(), but the normal tcp_check_req() path
has the same race):
CPU A (softirq) CPU B (process ctx)
tcp_v4_rcv()
TCP_NEW_SYN_RECV:
sk = req->rsk_listener
sock_hold(sk)
/* No lock on listener */
smc_close_active():
write_lock_bh(cb_lock)
sk_user_data = NULL
write_unlock_bh(cb_lock)
...
smc_clcsock_release()
sock_put(smc->sk) x2
-> smc_sock freed!
tcp_check_req()
smc_tcp_syn_recv_sock():
smc = user_data(sk)
-> NULL or dangling
smc->queued_smc_hs
-> crash!
Note that the clcsock and smc_sock are two independent objects
with separate refcounts. TCP stack holds a reference on the
clcsock, which keeps it alive, but this does NOT prevent the
smc_sock from being freed.
Fix this by using RCU and refcount_inc_not_zero() to safely
access smc_sock. Since smc_tcp_syn_recv_sock() is called in
the TCP three-way handshake path, taking read_lock_bh on
sk_callback_lock is too heavy and would not survive a SYN
flood attack. Using rcu_read_lock() is much more lightweight.
- Set SOCK_RCU_FREE on the SMC listen socket so that
smc_sock freeing is deferred until after the RCU grace
period. This guarantees the memory is still valid when
accessed inside rcu_read_lock().
- Use rcu_read_lock() to protect reading sk_user_data.
- Use refcount_inc_not_zero(&smc->sk.sk_refcnt) to pin the
smc_sock. If the refcount has already reached zero (close
path completed), it returns false and we bail out safely.
Note: smc_hs_congested() has a similar lockless read of
sk_user_data without rcu_read_lock(), but it only checks for
NULL and accesses the global smc_hs_wq, never dereferencing
any smc_sock field, so it is not affected.
Reproducer was verified with mdelay injection and smc_run,
the issue no longer occurs with this patch applied.
Eric Dumazet [Sun, 15 Mar 2026 10:41:52 +0000 (10:41 +0000)]
bonding: prevent potential infinite loop in bond_header_parse()
bond_header_parse() can loop if a stack of two bonding devices is setup,
because skb->dev always points to the hierarchy top.
Add new "const struct net_device *dev" parameter to
(struct header_ops)->parse() method to make sure the recursion
is bounded, and that the final leaf parse method is called.
Fixes: 950803f72547 ("bonding: fix type confusion in bond_setup_by_slave()") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jiayuan Chen <jiayuan.chen@shopee.com> Tested-by: Jiayuan Chen <jiayuan.chen@shopee.com> Cc: Jay Vosburgh <jv@jvosburgh.net> Cc: Andrew Lunn <andrew+netdev@lunn.ch> Link: https://patch.msgid.link/20260315104152.1436867-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
PCI/pwrctrl: Ensure that remote endpoint node parent has supply requirement
If OF graph is used in the PCI device node, the pwrctrl core creates a
pwrctrl device even if the remote endpoint doesn't have power supply
requirements. Since the device doesn't have any power supply requirements,
there was no pwrctrl driver to probe, leading to PCI controller driver
probe deferral as it waits for all pwrctrl drivers to probe before starting
bus scan.
This issue happens with Qcom ath12k devices with WSI interface attached to
the Qcom IPQ platforms.
Fix this issue by checking for the existence of at least one power supply
property in the remote endpoint parent node. To consolidate all the checks,
create a new helper pci_pwrctrl_is_required() and move all the checks
there.
Jeff Layton [Tue, 24 Feb 2026 16:33:35 +0000 (11:33 -0500)]
nfsd: fix heap overflow in NFSv4.0 LOCK replay cache
The NFSv4.0 replay cache uses a fixed 112-byte inline buffer
(rp_ibuf[NFSD4_REPLAY_ISIZE]) to store encoded operation responses.
This size was calculated based on OPEN responses and does not account
for LOCK denied responses, which include the conflicting lock owner as
a variable-length field up to 1024 bytes (NFS4_OPAQUE_LIMIT).
When a LOCK operation is denied due to a conflict with an existing lock
that has a large owner, nfsd4_encode_operation() copies the full encoded
response into the undersized replay buffer via read_bytes_from_xdr_buf()
with no bounds check. This results in a slab-out-of-bounds write of up
to 944 bytes past the end of the buffer, corrupting adjacent heap memory.
This can be triggered remotely by an unauthenticated attacker with two
cooperating NFSv4.0 clients: one sets a lock with a large owner string,
then the other requests a conflicting lock to provoke the denial.
We could fix this by increasing NFSD4_REPLAY_ISIZE to allow for a full
opaque, but that would increase the size of every stateowner, when most
lockowners are not that large.
Instead, fix this by checking the encoded response length against
NFSD4_REPLAY_ISIZE before copying into the replay buffer. If the
response is too large, set rp_buflen to 0 to skip caching the replay
payload. The status is still cached, and the client already received the
correct response on the original request.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@kernel.org Reported-by: Nicholas Carlini <npc@anthropic.com> Tested-by: Nicholas Carlini <npc@anthropic.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
sched: idle: Consolidate the handling of two special cases
There are two special cases in the idle loop that are handled
inconsistently even though they are analogous.
The first one is when a cpuidle driver is absent and the default CPU
idle time power management implemented by the architecture code is used.
In that case, the scheduler tick is stopped every time before invoking
default_idle_call().
The second one is when a cpuidle driver is present, but there is only
one idle state in its table. In that case, the scheduler tick is never
stopped at all.
Since each of these approaches has its drawbacks, reconcile them with
the help of one simple heuristic. Namely, stop the tick if the CPU has
been woken up by it in the previous iteration of the idle loop, or let
it tick otherwise.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Christian Loehle <christian.loehle@arm.com> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: Qais Yousef <qyousef@layalina.io> Reviewed-by: Aboorva Devarajan <aboorvad@linux.ibm.com> Fixes: ed98c3491998 ("sched: idle: Do not stop the tick before cpuidle_idle_call()")
[ rjw: Added Fixes tag, changelog edits ] Link: https://patch.msgid.link/4741364.LvFx2qVVIh@rafael.j.wysocki Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Linus Torvalds [Mon, 16 Mar 2026 19:21:00 +0000 (12:21 -0700)]
Merge tag 'mm-hotfixes-stable-2026-03-16-12-15' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"6 hotfixes. 4 are cc:stable. 3 are for MM.
All are singletons - please see the changelogs for details"
* tag 'mm-hotfixes-stable-2026-03-16-12-15' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
MAINTAINERS: update email address for Ignat Korchagin
mm/huge_memory: fix early failure try_to_migrate() when split huge pmd for shared THP
mm/rmap: fix incorrect pte restoration for lazyfree folios
mm/huge_memory: fix use of NULL folio in move_pages_huge_pmd()
build_bug.h: correct function parameters names in kernel-doc
crash_dump: don't log dm-crypt key bytes in read_key_from_user_keying
Linus Torvalds [Mon, 16 Mar 2026 15:53:06 +0000 (08:53 -0700)]
Merge tag 'for-7.0-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- fix logging of new dentries when logging parent directory and there
are conflicting inodes (e.g. deleted directory)
- avoid taking big device lock for zone setup, this is not necessary
during mount
- tune message verbosity when auto-reclaiming zones when low on space
- fix slightly misleading message of root item check
* tag 'for-7.0-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: tree-checker: fix misleading root drop_level error message
btrfs: log new dentries when logging parent dir of a conflicting inode
btrfs: don't take device_list_mutex when querying zone info
btrfs: pass 'verbose' parameter to btrfs_relocate_block_group
Lee Jones [Fri, 27 Feb 2026 10:09:38 +0000 (10:09 +0000)]
HID: logitech-hidpp: Prevent use-after-free on force feedback initialisation failure
Presently, if the force feedback initialisation fails when probing the
Logitech G920 Driving Force Racing Wheel for Xbox One, an error number
will be returned and propagated before the userspace infrastructure
(sysfs and /dev/input) has been torn down. If userspace ignores the
errors and continues to use its references to these dangling entities, a
UAF will promptly follow.
We have 2 options; continue to return the error, but ensure that all of
the infrastructure is torn down accordingly or continue to treat this
condition as a warning by emitting the message but returning success.
It is thought that the original author's intention was to emit the
warning but keep the device functional, less the force feedback feature,
so let's go with that.
Signed-off-by: Lee Jones <lee@kernel.org> Reviewed-by: Günther Noack <gnoack@google.com> Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
HID: bpf: prevent buffer overflow in hid_hw_request
right now the returned value is considered to be always valid. However,
when playing with HID-BPF, the return value can be arbitrary big,
because it's the return value of dispatch_hid_bpf_raw_requests(), which
calls the struct_ops and we have no guarantees that the value makes
sense.
Fixes: 8bd0488b5ea5 ("HID: bpf: add HID-BPF hooks for hid_hw_raw_requests") Cc: stable@vger.kernel.org Acked-by: Jiri Kosina <jkosina@suse.com> Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
Lee Jones [Mon, 9 Mar 2026 14:59:29 +0000 (14:59 +0000)]
HID: core: Mitigate potential OOB by removing bogus memset()
The memset() in hid_report_raw_event() has the good intention of
clearing out bogus data by zeroing the area from the end of the incoming
data string to the assumed end of the buffer. However, as we have
previously seen, doing so can easily result in OOB reads and writes in
the subsequent thread of execution.
The current suggestion from one of the HID maintainers is to remove the
memset() and simply return if the incoming event buffer size is not
large enough to fill the associated report.
Suggested-by Benjamin Tissoires <bentiss@kernel.org>
Signed-off-by: Lee Jones <lee@kernel.org>
[bentiss: changed the return value] Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
Daniel Schaefer [Fri, 13 Mar 2026 13:39:25 +0000 (21:39 +0800)]
HID: intel-thc-hid: Set HID_PHYS with PCI BDF
Currently HID_PHYS is empty, which means userspace tools (e.g. fwupd)
that depend on it for distinguishing the devices, are unable to do so.
Other drivers like i2c-hid, usbhid, surface-hid, all populate it.
With this change it's set to, for example: HID_PHYS=0000:00:10.0
Each function has just a single HID device, as far as I can tell, so
there is no need to add a suffix.
Tested with fwupd 2.1.1, can avoid https://github.com/fwupd/fwupd/pull/9995
Cc: Even Xu <even.xu@intel.com> Cc: Xinpeng Sun <xinpeng.sun@intel.com> Cc: Jiri Kosina <jikos@kernel.org> Cc: Benjamin Tissoires <bentiss@kernel.org> Cc: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Daniel Schaefer <git@danielschaefer.me> Reviewed-by: Even Xu <even.xu@intel.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>
Bibo Mao [Mon, 16 Mar 2026 02:36:02 +0000 (10:36 +0800)]
LoongArch: KVM: Fix typo issue in kvm_vm_init_features()
Most of VM feature detections are integer OR operations, and integer
assignment operation will clear previous integer OR operation. So here
change all integer assignment operations to integer OR operations.
Fixes: 82db90bf461b ("LoongArch: KVM: Move feature detection in kvm_vm_init_features()") Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Tiezhu Yang [Mon, 16 Mar 2026 02:36:01 +0000 (10:36 +0800)]
LoongArch: BPF: Make arch_protect_bpf_trampoline() return 0
Occasionally there exist "text_copy_cb: operation failed" when executing
the bpf selftests, the reason is copy_to_kernel_nofault() failed and the
ecode of ESTAT register is 0x4 (PME: Page Modification Exception) due to
the pte is not writeable. The root cause is that there is another place
to set the pte entry as readonly which is in the generic weak version of
arch_protect_bpf_trampoline().
There are two ways to fix this race condition issue: the direct way is
to modify the generic weak arch_protect_bpf_trampoline() to add a mutex
lock for set_memory_rox(), but the other simple and proper way is to
just make arch_protect_bpf_trampoline() return 0 in the arch-specific
code because LoongArch has already use the BPF prog pack allocator for
trampoline.
Tiezhu Yang [Mon, 16 Mar 2026 02:36:01 +0000 (10:36 +0800)]
LoongArch: Give more information if kmem access failed
If memory access such as copy_{from, to}_kernel_nofault() failed, its
users do not know what happened, so it is very useful to print the
exception code for such cases. Furthermore, it is better to print the
caller function to know where is the entry.
Xi Ruoyao [Mon, 16 Mar 2026 02:36:01 +0000 (10:36 +0800)]
LoongArch: Fix calling smp_processor_id() in preemptible code
Fix the warning:
BUG: using smp_processor_id() in preemptible [00000000] code: systemd/1
caller is larch_insn_text_copy+0x40/0xf0
Simply changing it to raw_smp_processor_id() is not enough: if preempt
and CPU hotplug happens after raw_smp_processor_id() but before calling
stop_machine(), the CPU where raw_smp_processor_id() has run may become
offline when stop_machine() and no CPU will run copy_to_kernel_nofault()
in text_copy_cb(). Thus guard the larch_insn_text_copy() calls with
cpus_read_lock() and change stop_machine() to stop_machine_cpuslocked()
to prevent this.
I've considered moving the locks inside larch_insn_text_copy() but
doing so seems not an easy hack. In bpf_arch_text_poke() obviously the
memcpy() call must be guarded by text_mutex, so we have to leave the
acquire of text_mutex out of larch_insn_text_copy(). But in the entire
kernel the acquire of mutexes is always after cpus_read_lock(), so we
cannot put cpus_read_lock() into larch_insn_text_copy() while leaving
the text_mutex acquire out (or we risk a deadlock due to inconsistent
lock acquire order). So let's fix the bug first and leave the posssible
refactor as future work.
Thomas Weißschuh [Mon, 16 Mar 2026 02:36:00 +0000 (10:36 +0800)]
LoongArch: Only use SC.Q when supported by the assembler
The 128-bit atomic cmpxchg implementation uses the SC.Q instruction.
Older versions of GNU AS do not support that instruction, erroring out:
ERROR:root:{standard input}: Assembler messages:
{standard input}:4831: Error: no match insn: sc.q $t0,$t1,$r14
{standard input}:6407: Error: no match insn: sc.q $t0,$t1,$r23
{standard input}:10856: Error: no match insn: sc.q $t0,$t1,$r14
Linus Torvalds [Sun, 15 Mar 2026 20:15:39 +0000 (13:15 -0700)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"The one core change is a re-roll of the tag allocation fix from the
last pull request that uses the correct goto to unroll all the
allocations. The remianing fixes are all small ones in drivers"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: hisi_sas: Fix NULL pointer exception during user_scan()
scsi: qla2xxx: Completely fix fcport double free
scsi: ufs: core: Fix SError in ufshcd_rtc_work() during UFS suspend
scsi: core: Fix error handling for scsi_alloc_sdev()
Linus Torvalds [Sun, 15 Mar 2026 20:08:05 +0000 (13:08 -0700)]
Merge tag 'probes-fixes-v7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull probes fixes from Masami Hiramatsu:
- Avoid crash when rmmod/insmod after ftrace killed
This fixes a kernel crash caused by kprobes on the symbol in a module
which is unloaded after ftrace_kill() is called.
- Remove unneeded warnings from __arm_kprobe_ftrace()
Remove unneeded WARN messages which can be triggered if the kprobe is
using ftrace and it fails to enable the ftrace. Since kprobes
correctly handle such failure, we don't need to warn it.
* tag 'probes-fixes-v7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
kprobes: Remove unneeded warnings from __arm_kprobe_ftrace()
kprobes: avoid crash when rmmod/insmod after ftrace killed
Linus Torvalds [Sun, 15 Mar 2026 19:50:05 +0000 (12:50 -0700)]
Merge tag 'bootconfig-fixes-v7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull bootconfig fixes from Masami Hiramatsu:
- fix off-by-one in xbc_verify_tree() unclosed brace error. This fixes
a wrong error place in unclosed brace error message
- check bounds before writing in __xbc_open_brace(). This fixes to
check the array index before setting array, so that the bootconfig
can support 16th-depth nested brace correctly
- fix snprintf truncation check in xbc_node_compose_key_after(). This
fixes to handle the return value of snprintf() correctly in case of
the return value == size
- Add bootconfig tests about braces Add test cases for checking error
position about unclosed brace and ensuring supporting 16th depth
nested braces correctly
* tag 'bootconfig-fixes-v7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
bootconfig: Add bootconfig tests about braces
lib/bootconfig: fix snprintf truncation check in xbc_node_compose_key_after()
lib/bootconfig: check bounds before writing in __xbc_open_brace()
lib/bootconfig: fix off-by-one in xbc_verify_tree() unclosed brace error
Linus Torvalds [Sun, 15 Mar 2026 19:22:10 +0000 (12:22 -0700)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"Quite a large pull request, partly due to skipping last week and
therefore having material from ~all submaintainers in this one. About
a fourth of it is a new selftest, and a couple more changes are large
in number of files touched (fixing a -Wflex-array-member-not-at-end
compiler warning) or lines changed (reformatting of a table in the API
documentation, thanks rST).
But who am I kidding---it's a lot of commits and there are a lot of
bugs being fixed here, some of them on the nastier side like the
RISC-V ones.
ARM:
- Correctly handle deactivation of interrupts that were activated
from LRs. Since EOIcount only denotes deactivation of interrupts
that are not present in an LR, start EOIcount deactivation walk
*after* the last irq that made it into an LR
- Avoid calling into the stubs to probe for ICH_VTR_EL2.TDS when pKVM
is already enabled -- not only thhis isn't possible (pKVM will
reject the call), but it is also useless: this can only happen for
a CPU that has already booted once, and the capability will not
change
- Fix a couple of low-severity bugs in our S2 fault handling path,
affecting the recently introduced LS64 handling and the even more
esoteric handling of hwpoison in a nested context
- Address yet another syzkaller finding in the vgic initialisation,
where we would end-up destroying an uninitialised vgic with nasty
consequences
- Address an annoying case of pKVM failing to boot when some of the
memblock regions that the host is faulting in are not page-aligned
- Inject some sanity in the NV stage-2 walker by checking the limits
against the advertised PA size, and correctly report the resulting
faults
PPC:
- Fix a PPC e500 build error due to a long-standing wart that was
exposed by the recent conversion to kmalloc_obj(); rip out all the
ugliness that led to the wart
RISC-V:
- Prevent speculative out-of-bounds access using array_index_nospec()
in APLIC interrupt handling, ONE_REG regiser access, AIA CSR
access, float register access, and PMU counter access
- Fix potential use-after-free issues in kvm_riscv_gstage_get_leaf(),
kvm_riscv_aia_aplic_has_attr(), and kvm_riscv_aia_imsic_has_attr()
- Fix potential null pointer dereference in
kvm_riscv_vcpu_aia_rmw_topei()
- Fix off-by-one array access in SBI PMU
- Skip THP support check during dirty logging
- Fix error code returned for Smstateen and Ssaia ONE_REG interface
- Check host Ssaia extension when creating AIA irqchip
x86:
- Fix cases where CPUID mitigation features were incorrectly marked
as available whenever the kernel used scattered feature words for
them
- Validate _all_ GVAs, rather than just the first GVA, when
processing a range of GVAs for Hyper-V's TLB flush hypercalls
- Fix a brown paper bug in add_atomic_switch_msr()
- Use hlist_for_each_entry_srcu() when traversing mask_notifier_list,
to fix a lockdep warning; KVM doesn't hold RCU, just irq_srcu
- Ensure AVIC VMCB fields are initialized if the VM has an in-kernel
local APIC (and AVIC is enabled at the module level)
- Update CR8 write interception when AVIC is (de)activated, to fix a
bug where the guest can run in perpetuity with the CR8 intercept
enabled
- Add a quirk to skip the consistency check on FREEZE_IN_SMM, i.e. to
allow L1 hypervisors to set FREEZE_IN_SMM. This reverts (by
default) an unintentional tightening of userspace ABI in 6.17, and
provides some amount of backwards compatibility with hypervisors
who want to freeze PMCs on VM-Entry
- Validate the VMCS/VMCB on return to a nested guest from SMM,
because either userspace or the guest could stash invalid values in
memory and trigger the processor's consistency checks
Generic:
- Remove a subtle pseudo-overlay of kvm_stats_desc, which, aside from
being unnecessary and confusing, triggered compiler warnings due to
-Wflex-array-member-not-at-end
- Document that vcpu->mutex is take outside of kvm->slots_lock and
kvm->slots_arch_lock, which is intentional and desirable despite
being rather unintuitive
Selftests:
- Increase the maximum number of NUMA nodes in the guest_memfd
selftest to 64 (from 8)"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (43 commits)
KVM: selftests: Verify SEV+ guests can read and write EFER, CR0, CR4, and CR8
Documentation: kvm: fix formatting of the quirks table
KVM: x86: clarify leave_smm() return value
selftests: kvm: add a test that VMX validates controls on RSM
selftests: kvm: extract common functionality out of smm_test.c
KVM: SVM: check validity of VMCB controls when returning from SMM
KVM: VMX: check validity of VMCS controls when returning from SMM
KVM: SVM: Set/clear CR8 write interception when AVIC is (de)activated
KVM: SVM: Initialize AVIC VMCB fields if AVIC is enabled with in-kernel APIC
KVM: x86: Introduce KVM_X86_QUIRK_VMCS12_ALLOW_FREEZE_IN_SMM
KVM: x86: Fix SRCU list traversal in kvm_fire_mask_notifiers()
KVM: VMX: Fix a wrong MSR update in add_atomic_switch_msr()
KVM: x86: hyper-v: Validate all GVAs during PV TLB flush
KVM: x86: synthesize CPUID bits only if CPU capability is set
KVM: PPC: e500: Rip out "struct tlbe_ref"
KVM: PPC: e500: Fix build error due to using kmalloc_obj() with wrong type
KVM: selftests: Increase 'maxnode' for guest_memfd tests
KVM: arm64: pkvm: Don't reprobe for ICH_VTR_EL2.TDS on CPU hotplug
KVM: arm64: vgic: Pick EOIcount deactivations from AP-list tail
KVM: arm64: Remove the redundant ISB in __kvm_at_s1e2()
...
Linus Torvalds [Sun, 15 Mar 2026 18:36:11 +0000 (11:36 -0700)]
Merge tag 'powerpc-7.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Madhavan Srinivasan:
- Fix KUAP warning in VMX usercopy path
- Fix lockdep warning during PCI enumeration
- Fix to move CMA reservations to arch_mm_preinit
- Fix to check current->mm is alive before getting user callchain
Thanks to Aboorva Devarajan, Christophe Leroy (CS GROUP), Dan Horák,
Nicolin Chen, Nilay Shroff, Qiao Zhao, Ritesh Harjani (IBM), Saket Kumar
Bhaskar, Sayali Patil, Shrikanth Hegde, Venkat Rao Bagalkote, and Viktor
Malik.
* tag 'powerpc-7.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/iommu: fix lockdep warning during PCI enumeration
powerpc/selftests/copyloops: extend selftest to exercise __copy_tofrom_user_power7_vmx
powerpc: fix KUAP warning in VMX usercopy path
powerpc, perf: Check that current->mm is alive before getting user callchain
powerpc/mem: Move CMA reservations to arch_mm_preinit
Linus Torvalds [Sun, 15 Mar 2026 18:26:36 +0000 (11:26 -0700)]
Merge tag 'x86-urgent-2026-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Ingo Molnar:
"Work around S2RAM hang if the firmware unexpectedly re-enables the
x2apic hardware while it was disabled by the kernel.
Force-disable it again and issue a warning into the syslog"
* tag 'x86-urgent-2026-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/apic: Disable x2apic on resume if the kernel expects so
- Fix CID hangs due to a race between concurrent forks
- Fix vfork()/CLONE_VM MMCID bug causing hangs
- Remove pointless preemption guard
- Fix CID task list walk performance regression on large systems
by removing the known-flaky and slow counting logic using
for_each_process_thread() in mm_cid_*fixup_tasks_to_cpus(), and
implementing a simple sched_mm_cid::node list instead"
* tag 'sched-urgent-2026-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/mmcid: Avoid full tasklist walks
sched/mmcid: Remove pointless preempt guard
sched/mmcid: Handle vfork()/CLONE_VM correctly
sched/mmcid: Prevent CID stalls due to concurrent forks
- Fix another objtool stack overflow in validate_branch()
* tag 'objtool-urgent-2026-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Fix another stack overflow in validate_branch()
objtool: Handle Clang RSP musical chairs
objtool: Fix ERROR_INSN() error message
objtool: Fix data alignment in elf_add_data()
objtool: Use HOSTCFLAGS for HAVE_XXHASH test
objtool/klp: Avoid NULL pointer dereference when printing code symbol name
objtool/klp: Disable unsupported pr_debug() usage
objtool/klp: Fix detection of corrupt static branch/call entries
Linus Torvalds [Sun, 15 Mar 2026 17:32:57 +0000 (10:32 -0700)]
Merge tag 'irq-urgent-2026-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Ingo Molnar:
"Two fixes for the riscv-aplic irqchip driver:
- Fix probing dependency bug on probing failure
- Fix double register_syscore() bug"
* tag 'irq-urgent-2026-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/riscv-aplic: Register syscore operations only once
irqchip/riscv-aplic: Do not clear ACPI dependencies on probe failure
Linus Torvalds [Sat, 14 Mar 2026 23:25:10 +0000 (16:25 -0700)]
Merge tag 'i3c/fixes-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux
Pull i3c fixes from Alexandre Belloni:
"This introduces the I3C_OR_I2C symbol which is not a fix per se but is
affecting multiple subsystems so it is included to ease
synchronization.
Apart from that, Adrian is mostly fixing the mipi-i3c-hci driver DMA
handling, and I took the opportunity to add two fixes for the dw-i3c
driver.
Drivers:
- dw: handle 2C properly, fix possible race condition
- mipi-i3c-hci: many DMA related fixes"
* tag 'i3c/fixes-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux:
i3c: dw-i3c-master: Set SIR_REJECT in DAT on device attach and reattach
i3c: master: dw-i3c: Fix missing of_node for virtual I2C adapter
i3c: mipi-i3c-hci: Fallback to software reset when bus disable fails
i3c: mipi-i3c-hci: Fix handling of shared IRQs during early initialization
i3c: mipi-i3c-hci: Fix race in DMA error handling in interrupt context
i3c: mipi-i3c-hci: Consolidate common xfer processing logic
i3c: mipi-i3c-hci: Restart DMA ring correctly after dequeue abort
i3c: mipi-i3c-hci: Add missing TID field to no-op command descriptor
i3c: mipi-i3c-hci: Correct RING_CTRL_ABORT handling in DMA dequeue
i3c: mipi-i3c-hci: Fix race between DMA ring dequeue and interrupt handler
i3c: mipi-i3c-hci: Fix race in DMA ring dequeue
i3c: mipi-i3c-hci: Fix race in DMA ring enqueue for parallel xfers
i3c: mipi-i3c-hci: Consolidate spinlocks
i3c: mipi-i3c-hci: Factor out DMA mapping from queuing path
i3c: mipi-i3c-hci: Fix Hot-Join NACK
i3c: mipi-i3c-hci: Use ETIMEDOUT instead of ETIME for timeout errors
i3c: simplify combined i3c/i2c dependencies