Yonghong Song [Sun, 17 May 2026 15:07:02 +0000 (08:07 -0700)]
bpf,x86: Fix exception unwinding with outgoing stack arguments
When a main program with exception_boundary has outgoing stack
arguments (e.g. from calling subprogs with >5 args), bpf_throw() fails
to correctly restore callee-saved registers, causing a kernel crash.
The x86 JIT allocates the outgoing stack arg area below the
callee-saved registers via 'sub rsp, outgoing_rsp' in the prologue.
When bpf_throw() unwinds, it captures the main program's sp (which
includes this outgoing area) and passes it to the exception callback.
The callback gets rsp and rbp, followed by pop_callee_regs, but rsp
points into the outgoing arg area rather than the callee-saved
registers, so the pops restore garbage values. Returning to the
kernel with corrupted callee-saved registers causes a crash.
Fix this by adjusting the sp (adding stack_arg_sp_adjust) passed to
the exception callback, so it points to the bottom of the callee-saved
registers instead of the outgoing arg area. When stack_arg_sp_adjust
is 0 (the common case), this is a no-op.
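The adjustment described above can be sketched as plain address arithmetic. This is a minimal userspace model, not the JIT code: struct frame_model and both helper functions are hypothetical names introduced for illustration, and only stack_arg_sp_adjust comes from the text.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the frame; addresses grow downward as on x86-64. */
struct frame_model {
    uint64_t callee_saved_bottom; /* lowest address of the pushed callee-saved regs */
    uint64_t stack_arg_sp_adjust; /* bytes reserved below them for outgoing stack args */
};

/* sp as captured during unwinding: the prologue's 'sub rsp' is already applied. */
static uint64_t captured_sp(const struct frame_model *f)
{
    assert(f->callee_saved_bottom >= f->stack_arg_sp_adjust);
    return f->callee_saved_bottom - f->stack_arg_sp_adjust;
}

/* sp handed to the exception callback after the fix: skip the outgoing area. */
static uint64_t callback_sp(const struct frame_model *f)
{
    return captured_sp(f) + f->stack_arg_sp_adjust;
}
```

With a 16-byte outgoing area the two values differ by 16, so pops starting at the captured sp would read the outgoing area; with an adjustment of 0 both helpers return the same address.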
Fixes: 324c3ca6eed6 ("bpf,x86: Implement JIT support for stack arguments")
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20260517150702.288031-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tejun Heo [Sun, 17 May 2026 20:31:10 +0000 (10:31 -1000)]
Merge branch 'for-7.1-fixes' into for-7.2
Pull to receive:
39e25a210060 ("sched_ext: Drop NONE early return in scx_disable_and_exit_task()")
b273b75b8d67 ("sched_ext: INIT_LIST_HEAD() &sch->all in scx_alloc_and_add_sched()")
cceb874eee46 ("sched_ext: Defer sub_kset base put to scx_sched_free_rcu_work")
6ae315d37924 ("sched_ext: Use HK_TYPE_DOMAIN_BOOT to detect isolcpus= domain isolation")
515e3996a4c2 ("sched_ext: Fix deadlock between scx_root_disable() and concurrent forks")
Takashi Iwai [Sun, 17 May 2026 16:51:20 +0000 (18:51 +0200)]
ALSA: pcm: Don't setup bogus iov_iter for silencing
At the transition to iov_iter for PCM data transfer, we blindly
applied the iov_iter setup to silencing as well (i.e. data = NULL),
which leads to the calculation of a bogus iov_iter. Fortunately this
didn't cause trouble on most architectures, but it now goes wrong on
RISC-V, causing a NULL dereference.
Handle the NULL data case in interleaved_copy() so that silencing is
treated properly, addressing the bug above. noninterleaved_copy()
already handles NULL data, so it doesn't need changes.
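The NULL data handling described above can be sketched in userspace as follows. This is a minimal illustration, not the ALSA code: copy_or_silence() and its parameters are hypothetical, and memset()/memcpy() stand in for whatever the PCM core does for each case.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical stand-in for one interleaved copy step: 'data' may be NULL,
 * which means "fill with silence" rather than "copy from a source buffer".
 */
static void copy_or_silence(unsigned char *dst, const unsigned char *data,
                            size_t bytes, unsigned char silence_byte)
{
    assert(dst != NULL);
    if (data == NULL) {
        /* Silencing: never build a source descriptor from a NULL pointer. */
        memset(dst, silence_byte, bytes);
        return;
    }
    memcpy(dst, data, bytes);
}
```

The point of the early return is that the source-side setup is skipped entirely when there is no source, instead of relying on it being harmless.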
Tejun Heo [Sun, 17 May 2026 17:43:16 +0000 (07:43 -1000)]
sched_ext: Fix deadlock between scx_root_disable() and concurrent forks
scx_root_disable() enters SCX_DISABLING before it grabs scx_enable_mutex to
clear __scx_switched_all and scx_switching_all. task_should_scx() short-circuits on DISABLING,
so forks in that window land on fair while next_active_class() still skips
fair - the new tasks stall.
This can deadlock the disable path itself: scx_alloc_and_add_sched() runs
under scx_enable_mutex and creates a helper kthread; if that new kthread is
one of the stalled fair tasks, the mutex holder waits forever and
scx_root_disable() can never make progress. Only sub-sched support exposes
this, since sub-sched enables are the only path where
scx_alloc_and_add_sched() can race the root's disable.
Move the DISABLING check after @scx_switching_all. @scx_switching_all
serves as a proxy for __scx_switched_all, so while it's set, forks keep
going to scx. Once cleared, DISABLING applies normally.
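The reordered check can be sketched as a small decision function. This is a simplified model under assumptions: struct scx_model and fork_goes_to_scx() are hypothetical names, the fields only mirror the flags named in the text, and the real function may consult more state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of the state the fork path looks at. */
struct scx_model {
    bool enabled;
    bool disabling;     /* models the DISABLING state */
    bool switching_all; /* models scx_switching_all */
};

/* Check order after the fix: switching_all is consulted before DISABLING. */
static bool fork_goes_to_scx(const struct scx_model *s, bool policy_is_ext)
{
    assert(s != NULL);
    if (!s->enabled)
        return false;
    if (s->switching_all)
        return true;  /* still switched-all: keep new tasks on scx */
    if (s->disabling)
        return false; /* flag cleared: DISABLING applies normally */
    return policy_is_ext;
}
```

In the window where DISABLING is set but switching_all has not yet been cleared, this model still sends forks to scx, which is the behavior the description asks for.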
v2: Reword in-source comment and description. (Andrea)
Fixes: 337ec00b1d9c ("sched_ext: Implement cgroup sub-sched enabling and disabling")
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Andrea Righi <arighi@nvidia.com>
Linus Torvalds [Sun, 17 May 2026 19:02:31 +0000 (12:02 -0700)]
Merge tag 'trace-v7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
- Add more functions to the remote allowed list
randconfig found more functions that are allowed for the remote code
for s390 and arm. Add them to the allowed list.
- Fix remote_test error path
If one of the simple ring buffers fails to load, the code is supposed
to rollback its initialized buffers. Instead of rolling back the
buffers for the failed load, it uses the global variable and rolls
back all the successfully loaded buffers.
* tag 'trace-v7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: Fix desc in error path for the trace remote test module
ring-buffer remote: Avoid unexpected symbol warnings (arm, s390)
Carlos López [Tue, 12 May 2026 10:00:41 +0000 (12:00 +0200)]
virt: sev-guest: Do not use host-controlled page order in cleanup path
When issuing an extended guest request (SVM_VMGEXIT_EXT_GUEST_REQUEST),
get_ext_report() allocates a buffer to retrieve a certificate blob from the
host, keeping track of its size in report_req->certs_len.
However, the host may return SNP_GUEST_VMM_ERR_INVALID_LEN, indicating
an invalid buffer size, as well as the expected length of such buffer.
get_ext_report() subsequently updates report_req->certs_len with the
host-controlled value, and cleans up the buffer by computing a page order
from such value. This is incorrect, as the host-provided length may not
match the page order of the original allocation, potentially resulting
in corruption in the page allocator.
Fix this by using alloc_pages_exact() instead, and reusing @npages to
compute the size passed to free_pages_exact(). For consistency, also
use @npages to compute the size when allocating the pages, even though
this last change has no functional effect.
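Why a host-supplied length is unsafe in the cleanup path can be shown with a little arithmetic. The sketch below is a userspace model: MODEL_PAGE_SIZE, order_for_size() and exact_size() are hypothetical names, and a 4096-byte page is an assumption.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u /* assumed page size for this sketch */

/* Smallest order such that (MODEL_PAGE_SIZE << order) >= size. */
static unsigned int order_for_size(size_t size)
{
    unsigned int order = 0;

    assert(size > 0);
    while (((size_t)MODEL_PAGE_SIZE << order) < size)
        order++;
    return order;
}

/* Size derived from the page count recorded at allocation time. */
static size_t exact_size(size_t npages)
{
    return npages * MODEL_PAGE_SIZE;
}
```

A buffer allocated for 4 pages corresponds to order 2, while a returned length of 5 pages maps to order 3, so freeing by the returned length would release a different block than was allocated. Deriving the size from the original page count avoids depending on the returned value at all.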
Fixes: 3e385c0d6ce8 ("virt: sev-guest: Move SNP Guest Request data pages handling under snp_cmd_mutex")
Signed-off-by: Carlos López <clopez@suse.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Tested-by: Michael Roth <michael.roth@amd.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
====================
Fix bpf_throw() vs global subprogs interaction
There is a bug where the verifier misses bpf_throw()'s reachability
across global subprogs, so verification succeeds even when a kernel
resource or lock is held across a global subprog call boundary.
Fix this by summarizing the effect, as is done for other related side
effects, and propagating exception reachability into callees.
selftests/bpf: Cover global subprog exception leaks
Add a verifier failure case where the caller holds a reference across a
global subprog call that may throw. The program must be rejected because
the exceptional path would skip the caller's reference release.
Global subprogs are verified independently and are not descended into
when their callers are symbolically executed. This means a caller can
hold references or locks across a global subprog call that may throw,
while the verifier only checks the non-exceptional return path at the
call site.
Record whether a subprog might throw in the CFG summary pass, alongside
the existing might_sleep and packet-data-changing summaries, and
propagate that effect through reachable callees.
When a global subprog is marked as possibly throwing, push the normal
continuation and validate the exceptional path immediately at the call
site, avoiding a synthetic exception state and associated special case
in the pruning checks.
Linus Torvalds [Sun, 17 May 2026 18:07:09 +0000 (11:07 -0700)]
Merge tag 'timers-urgent-2026-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Ingo Molnar:
- Fix potential garbage reads in the vDSO gettimeofday code
(Thomas Weißschuh)
* tag 'timers-urgent-2026-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
vdso/gettimeofday: Reload sequence counter after switch to time page in do_aux()
Linus Torvalds [Sun, 17 May 2026 17:34:15 +0000 (10:34 -0700)]
Merge tag 'irq-urgent-2026-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull IRQ fixes from Ingo Molnar:
- Fix use-after-free in irq_work_single() on PREEMPT_RT (Jiayuan Chen)
- Don't call add_interrupt_randomness() for NMIs in
handle_percpu_devid_irq() (Mark Rutland)
- Remove unused function in the ath79-cpu irqchip driver causing LKP
CI build warnings (Rosen Penev)
- Fix IRQ allocation/teardown leakage regressions in the GICv5 irqchip
driver (Sascha Bischoff)
- Fix an IRQ trigger type regression in the Meson S4 SoC irqchip driver
(Xianwei Zhao)
- Fix CPU offlining regression in the RiscV IMSIC irqchip driver
(Yong-Xuan Wang)
* tag 'irq-urgent-2026-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irq_work: Fix use-after-free in irq_work_single() on PREEMPT_RT
irqchip/riscv-imsic: Clear interrupt move state during CPU offlining
irqchip/meson-gpio: Use the correct register in meson_s4_gpio_irq_set_type()
irqchip/ath79-cpu: Remove unused function
genirq/chip: Don't call add_interrupt_randomness() for NMIs
irqchip/gic-v5: Allocate ITS parent LPIs as a range
irqchip/gic-v5: Support range allocation for LPIs
irqchip/gic-v5: Move LPI allocation into the LPI domain
Add Vol+/Vol-/Mute panel button mappings for iMON VFD HID OEM v1.2.
This version differs in the codes that generate the
KEY_VOLUMEUP, KEY_VOLUMEDOWN and KEY_MUTE events.
Signed-off-by: Alessandro Baldi <baldovic@virgilio.it>
Signed-off-by: Sean Young <sean@mess.org>
Linus Torvalds [Sun, 17 May 2026 16:33:49 +0000 (09:33 -0700)]
Merge tag 'riscv-for-linus-7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Paul Walmsley:
"Relatively low-impact fixes. Probably the most notable one is that we
no longer ask the monitor-mode firmware to delegate misaligned access
handling to the kernel by default, since the kernel code needs
significant improvement to match the functionality of the firmware.
This change avoids functional problems at some cost in performance,
but shouldn't affect any system with misaligned access handling in
hardware.
- Disable satp register probing when no5lvl is specified on the
kernel command line
- Fix a CFI-related issue with the misaligned access speed
measurement code
- Reduce the CFI shadow stack size limit from 4GB to 2GB (following
ARM64 GCS)
- Prevent the kernel from requesting delegation of misaligned access
faults unless a new Kconfig option, RISCV_SBI_FWFT_DELEGATE_MISALIGNED,
is enabled. This will depend on CONFIG_NONPORTABLE until the
deficiencies of the kernel misaligned access fixup code are fixed
- Fix some potential uninitialized memory accesses in error paths in
compat_riscv_gpr_set() and compat_restore_sigcontext()
- Fix a bug in the RISC-V MIPS vendor errata patching code where a
logical-and was used in place of a bitwise-and
- Drop some unnecessary code in riscv_fill_hwcap_from_isa_string()
- Use macros for isa2hwcap indices in riscv_fill_hwcap(), rather than
open-coding them
- Fix some documentation typos (one affecting 'make htmldocs')"
* tag 'riscv-for-linus-7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: misaligned: Make enabling delegation depend on NONPORTABLE
riscv: Docs: fix unmatched quote warning
riscv: cfi: reduce shadow stack size limit from 4GB to 2GB
riscv: cpufeature: Use pre-defined ISA ext macros to index isa2hwcap
riscv: mm: Fixup no5lvl failure when vaddr is invalid
riscv: Fix register corruption from uninitialized cregs on error
riscv: errata: Fix bitwise vs logical AND in MIPS errata patching
Documentation: riscv: cmodx: fix typos
riscv: cpufeature: Drop this_hwcap clear in T-Head vector workaround
riscv: Define __riscv_copy_{,vec_}{words,bytes}_unaligned() using SYM_TYPED_FUNC_START
- sy7636a: Fix sysfs attribute name in documentation
* tag 'hwmon-for-v7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (lm90) Add lock protection to lm90_alert
hwmon: (lm90) Stop work before releasing hwmon device
docs: hwmon: sy7636a: fix temperature sysfs attribute name
hwmon: (asus_atk0110) Check ACPI_COMPANION() against NULL
hwmon: (acpi_power_meter) Check ACPI_COMPANION() against NULL
Sudeep Holla [Fri, 8 May 2026 17:54:18 +0000 (18:54 +0100)]
firmware: arm_ffa: Defer probe until pKVM is initialized
When protected KVM is enabled, the kernel includes a pKVM FF-A proxy
that sits in front of the normal FF-A driver. The proxy has to perform
its own FF-A version negotiation and setup first, so that it can mediate
subsequent FF-A traffic correctly.
Defer FF-A core probing until pKVM has completed initialization. This
keeps the normal driver from negotiating the FF-A version or performing
other transport setup before the pKVM proxy is ready, and lets the
driver model retry probing once the protected KVM state required by the
FF-A transport is available.
Sudeep Holla [Fri, 8 May 2026 17:54:17 +0000 (18:54 +0100)]
firmware: arm_ffa: Set the core device as FF-A device parent
Pass a parent device into ffa_device_register() and use the synthetic
arm-ffa platform device as the parent for each registered FF-A device.
This keeps the enumerated FF-A partition devices anchored below the FF-A
core device in the driver model, matching the platform-driver conversion
of the core transport.
Sudeep Holla [Fri, 8 May 2026 17:54:16 +0000 (18:54 +0100)]
firmware: arm_ffa: Register core as a platform driver
Move the FF-A core bring-up and teardown paths into platform driver
probe and remove callbacks, and register a synthetic arm-ffa platform
device to bind the driver.
This makes the FF-A core lifetime follow the driver model while keeping
the device creation internal to the FF-A core. Use normal platform driver
registration so the probe path has standard driver-core semantics.
The synthetic platform device is a temporary bridge until ACPI and
devicetree describe the FF-A core device or object. Once those firmware
description paths are defined, the internal platform device creation can
be dropped and the driver can bind to the firmware-described device
directly.
Since the transport selection now happens from the platform probe path,
drop the __init annotation from ffa_transport_init().
Yeoreum Yun [Fri, 8 May 2026 17:54:15 +0000 (18:54 +0100)]
Revert "firmware: arm_ffa: Change initcall level of ffa_init() to rootfs_initcall"
This reverts commit 0e0546eabcd6c19765a8dbf5b5db3723e7b0ea75, which was
added to address ordering issues with the IMA LSM initialisation where
the TPM would not be fully ready by the time IMA wanted it. This has
been resolved within IMA by retrying setup during late_initcall_sync if
the TPM is not available at first.
Stepan Ionichev [Fri, 15 May 2026 13:30:04 +0000 (18:30 +0500)]
auxdisplay: Kconfig: drop unneeded quotes in PANEL_BOOT_MESSAGE dep
The PANEL_BOOT_MESSAGE dependency uses a quoted-string comparison
against the PANEL_CHANGE_MESSAGE bool symbol:
depends on PANEL_CHANGE_MESSAGE="y"
This is the only such pattern under drivers/auxdisplay/ (grep shows
no other Kconfig file in the tree uses depends on FOO="y" with
quotes for a plain bool symbol). The quoted form is parsed by
Kconfig but is not idiomatic; the common form for the same intent
is the unquoted tristate-style dependency:
depends on PANEL_CHANGE_MESSAGE
which evaluates true when PANEL_CHANGE_MESSAGE is y or m. Since
PANEL_CHANGE_MESSAGE is declared as bool (not tristate), there is
no behaviour change in practice: y is the only enabled value
either form can match.
Drop the quoted comparison so the dependency matches the prevailing
kernel Kconfig style and so it is obvious to readers that the
comparison works.
Stepan Ionichev [Thu, 14 May 2026 17:43:42 +0000 (22:43 +0500)]
auxdisplay: line-display: fix OOB read on zero-length message_store()
linedisp_display() unconditionally reads msg[count - 1] before
checking whether count is zero, so a write of zero bytes to the
message sysfs attribute hits msg[-1]:
The kernfs write buffer for that store is a 1-byte allocation
(kernfs_fop_write_iter() does kmalloc(len + 1) with len == 0),
so msg[-1] is a 1-byte read before the slab object. On a
KASAN-enabled kernel this trips an out-of-bounds report and
panics; on stock kernels it silently reads adjacent slab data
and, if that byte happens to be '\n', the following count--
wraps ssize_t 0 to -1 and is then passed to kmemdup_nul().
linedisp_display() is reached from the message_store() sysfs
callback (drivers/auxdisplay/line-display.c message attribute,
mode 0644) and from the in-tree initial-message setup with
count == -1, so the OOB path is only userspace-triggerable via
zero-byte writes; vfs_write() does not short-circuit on
count == 0 and kernfs_fop_write_iter() dispatches the store
callback regardless.
Guard the trailing-newline trim with a count check. The
existing if (!count) block then takes the clear-display path
unchanged.
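The guarded trim can be sketched as a standalone helper. This is an illustration only: trimmed_len() is a hypothetical name, 'long' stands in for the kernel's ssize_t, and treating count == -1 as "measure the NUL-terminated string" is an assumption based on the initial-message case mentioned above.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: return the message length with one trailing newline
 * trimmed. count == -1 is assumed to mean "msg is NUL-terminated, measure it".
 */
static long trimmed_len(const char *msg, long count)
{
    assert(msg != NULL);
    if (count == -1)
        count = (long)strlen(msg);
    /* Only look at msg[count - 1] when there is at least one byte. */
    if (count > 0 && msg[count - 1] == '\n')
        count--;
    return count;
}
```

With the guard in place a zero-byte write returns 0 without touching msg[-1], so a later zero-length check can take the clear-display path.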
Affects every auxdisplay driver that registers via
linedisp_register() / linedisp_attach(): ht16k33, max6959,
img-ascii-lcd, seg-led-gpio.
Fixes: 7e76aece6f03 ("auxdisplay: Extract character line display core support")
Signed-off-by: Stepan Ionichev <sozdayvek@gmail.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
====================
bpf: Follow-up fixes for stack argument support
Commit cd59fa185a03 ("bpf: Support stack arguments for BPF functions and kfuncs")
added stack argument support for BPF functions and kfuncs. This patch set
fixes various issues related to stack arguments, mainly:
- Validate outgoing stack args when btf_prepare_func_args fails
- Fix arg_track_join log to use sa prefix for stack arg slots
- Clean up redundant stack arg checks for non-JITed programs
Yonghong Song [Fri, 15 May 2026 22:51:01 +0000 (15:51 -0700)]
bpf: Clean up redundant stack arg checks for non-JITed programs
Remove a redundant stack_arg_cnt check in __bpf_prog_select_runtime()
and start the stack arg loop from index 0 in bpf_fixup_call_args().
Both changes are no-ops that simplify the code:
In __bpf_prog_select_runtime(), the subprog_info[0].stack_arg_cnt
check is unreachable:
- when there is only a main program (no bpf-to-bpf calls),
subprog_info[0].stack_arg_cnt is always 0 because the main
program's arg_cnt is forced to 1
- when bpf-to-bpf calls use stack args and JIT succeeds,
fp->bpf_func is set and this code is skipped
- when JIT fails, bpf_fixup_call_args() rejects the program
before we get to __bpf_prog_select_runtime().
In bpf_fixup_call_args(), starting the loop at i=1 skipped subprog 0,
which is safe since the main program always has arg_cnt=1 and thus
bpf_in_stack_arg_cnt() returns 0. Starting at i=0 removes the need
to reason about this invariant.
Yonghong Song [Fri, 15 May 2026 22:50:56 +0000 (15:50 -0700)]
bpf: Fix arg_track_join log to use sa prefix for stack arg slots
arg_track_join() logs state transitions at CFG merge points. For
stack arg slots (r >= MAX_BPF_REG), it printed "r11:", "r12:", etc.,
which is misleading since r11 is a special register (BPF_REG_PARAMS)
not meaningful to the user.
Fix it to print "sa0:", "sa1:", etc., matching the per-instruction
transition log in arg_track_log() which already uses the "sa" prefix.
Update the existing stack_arg_pruning_type_mismatch selftest to expect
the corrected format.
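The label choice can be sketched as a small formatting helper. This is not the verifier's logging code: slot_label() is a hypothetical name, and MODEL_MAX_BPF_REG = 11 is an assumption inferred from the text, where the first stack arg slot was previously printed as "r11".

```c
#include <assert.h>
#include <stdio.h>

#define MODEL_MAX_BPF_REG 11 /* assumed: indices 0..10 are ordinary registers */

/* Write "rN" for ordinary registers and "saK" for stack arg slot K. */
static void slot_label(char *buf, size_t len, int r)
{
    assert(r >= 0);
    if (r < MODEL_MAX_BPF_REG)
        snprintf(buf, len, "r%d", r);
    else
        snprintf(buf, len, "sa%d", r - MODEL_MAX_BPF_REG);
}
```

Under these assumptions index 1 yields "r1", index 11 yields "sa0" and index 12 yields "sa1".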
Yonghong Song [Fri, 15 May 2026 22:50:51 +0000 (15:50 -0700)]
selftests/bpf: Log arg_track_join for stack arg slots in liveness analysis
Commit 2af4e792773f ("bpf: Extend liveness analysis to track stack argument slots")
added stack arg support. For the selftest
verifier_stack_arg/stack_arg: pruning with different stack arg types
the following are the two arg JOIN messages:
arg JOIN insn 9 -> 10 r1: fp0-8 + _ => fp0-8|fp0+0
arg JOIN insn 9 -> 10 r11: fp0-8 + _ => fp0-8|fp0+0
Here the "r11:" label for stack arg slot 0 is misleading since r11
is a special register (BPF_REG_PARAMS). The next patch corrects
this to "sa0:", properly representing the 'stack arg slot 0'.
Yonghong Song [Fri, 15 May 2026 22:50:45 +0000 (15:50 -0700)]
selftests/bpf: Add test for stack arg read without caller write
Add negative tests for the outgoing stack arg validation.
A static subprog with a 'long *' arg causes
btf_prepare_func_args() to fail after setting arg_cnt. The
validation ensures check_outgoing_stack_args() still runs.
Also update two existing tests (release_ref, stale_pkt_ptr) whose
expected error messages changed: invalidated stack arg slots are now
caught by check_outgoing_stack_args() at the call site instead of
at the callee's dereference.
Yonghong Song [Fri, 15 May 2026 22:50:40 +0000 (15:50 -0700)]
bpf: Validate outgoing stack args when btf_prepare_func_args fails
btf_prepare_func_args() sets sub->arg_cnt before validating arg types.
If validation fails (e.g. unsupported pointer type in a static subprog),
check_outgoing_stack_args() is skipped because btf_check_func_arg_match()
returns early. For static subprogs, check_func_call() ignores non-EFAULT
errors and proceeds with the call.
This causes the callee to read stack arg slots that the caller never
stored or initialized, potentially dereferencing a NULL caller->stack_arg_regs
or reading an uninitialized value.
To fix the issue, when btf_prepare_func_args() fails and the subprog expects
stack args, call check_outgoing_stack_args() to verify the caller initialized
the slots. Return -EFAULT on failure so the error is not ignored.
wifi: iwlwifi: mld: disconnect only after 6 beacons without Rx
After 4 missed beacons since last Rx, the firmware will send an NDP to the
AP. If the NDP is ACK'ed, it'll reset the missed_beacons_since_last_rx
counter.
Disconnecting after 4 beacons doesn't give enough time to the firmware
to send the NDP.
Wait until we get 6 missed beacons since last Rx before disconnecting.
Clearly, from a user perspective, it must be valid to configure
WoWLAN (which can include network detection) and then suspend
while not connected to a network, or even without an interface
at all (WoWLAN config is handled on a per-wiphy basis). Since
mac80211 doesn't distinguish these cases and simply calls the
driver to suspend whenever WoWLAN is configured, the driver has
to cleanly handle the case where it's called for WoWLAN but no
(BSS) interface exists.
Remove the WARN_ON(), move the print so it doesn't get done in
this case, and keep returning 1 to disconnect everything.
Johannes Berg [Fri, 15 May 2026 12:14:57 +0000 (15:14 +0300)]
wifi: iwlwifi: mvm: fix driver-set TX rates on old devices
On old devices such as 7265D, rates are still encoded in version 1
format, which doesn't use the CCK/OFDM rate index (0-3/0-7) but
rather their PLCP value (e.g. 10 for 1 Mbps CCK rate.)
While introducing v3 rates, I changed the driver from internally
handling v1 rates and converting to v2, to internally handling v3
and converting to v1 or v2 according to the firmware. I accordingly
changed the code in iwl_mvm_mac80211_idx_to_hwrate() to no longer
have different values for different APIs. This was correct.
However, I later reverted this part of the change, because it was
reported that I had broken beacon rates, causing a FW assert/crash.
This caused TX_CMD rates to be set incorrectly, potentially causing
a warning when reported back from the device as having been used.
Fix this (hopefully correctly now) by handling beacon rates in the
TX_CMD that's embedded in the beacon template command separately.
Restore iwl_mvm_mac80211_idx_to_hwrate() to return only the rate
index, not PLCP value, fixing the real TX_CMD.
Sheroz Juraev [Sun, 15 Mar 2026 08:12:21 +0000 (13:12 +0500)]
wifi: iwlwifi: mld: stop TX during firmware restart
When iwlwifi firmware crashes (e.g., NMI_INTERRUPT_UNKNOWN on Intel
BE201/Wi-Fi 7), iwl_mld_nic_error() sets mld->fw_status.in_hw_restart
to true. However, iwl_mld_tx_from_txq() does not check this flag before
dequeuing frames from mac80211 and pushing them to the transport layer.
Since the firmware is dead, iwl_trans_tx() returns -EIO for each frame,
which then gets freed immediately. Under high-throughput conditions
(e.g., Tailscale UDP traffic or active SSH sessions), this creates a
tight dequeue-send-fail-free loop that wastes CPU cycles and generates
rapid skb allocation churn, leading to memory pressure from slab
fragmentation.
The RX path already has this guard (iwl_mld_rx_mpdu checks
in_hw_restart at rx.c:1906), and so does the TXQ allocation worker
(iwl_mld_add_txqs_wk at tx.c:156). Add the same guard to
iwl_mld_tx_from_txq() to stop all TX during firmware restart.
Frames left in mac80211's TXQs are naturally drained after restart
completes, when queue reallocation triggers iwl_mld_tx_from_txq()
via iwl_mld_add_txq_list(), or when new upper-layer traffic invokes
wake_tx_queue.
Tested on ASUS Zenbook 14 UX3405CA with Intel BE201 (Wi-Fi 7) on
kernel 6.19.5 where the firmware crashes approximately every 10-15
minutes under Tailscale traffic.
wifi: iwlwifi: mld: fix TSO segmentation explosion when AMSDU is disabled
When the TLC notification disables AMSDU for a TID, the MLD driver sets
max_tid_amsdu_len to the sentinel value 1. The TSO segmentation path in
iwl_mld_tx_tso_segment() checks for zero but not for this sentinel,
allowing it to reach the num_subframes calculation:
This zero propagates to iwl_tx_tso_segment() which sets:
gso_size = num_subframes * mss = 0
Calling skb_gso_segment() with gso_size=0 creates over 32000 tiny
segments from a single GSO skb. This floods the TX ring with ~1024
micro-frames (the rest are purged), creating a massive burst of TX
completion events that can lead to memory corruption and a subsequent
use-after-free in TCP's retransmit queue (refcount underflow in
tcp_shifted_skb, NULL deref in tcp_rack_detect_loss).
The MVM driver is immune because it checks mvmsta->amsdu_enabled before
reaching the num_subframes calculation. The MLD driver has no equivalent
bitmap check and relies solely on max_tid_amsdu_len, which does not
catch the sentinel value.
Fix this by detecting the sentinel value (max_tid_amsdu_len == 1) at the
existing check and falling back to non-AMSDU TSO segmentation. Also add
a WARN_ON_ONCE guard after the num_subframes division as defense-in-depth
to catch any future code paths that produce zero through a different
mechanism.
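How a sentinel length of 1 ends up as gso_size = 0 can be sketched with integer arithmetic. This is illustrative only: the driver's actual num_subframes formula is not reproduced here, and subframes_for(), gso_size_for(), amsdu_unusable() and the sample lengths are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define AMSDU_DISABLED_SENTINEL 1u /* value the text says marks "AMSDU off" */

/* True when TSO should fall back to non-AMSDU segmentation. */
static bool amsdu_unusable(unsigned int max_tid_amsdu_len)
{
    return max_tid_amsdu_len == 0 ||
           max_tid_amsdu_len == AMSDU_DISABLED_SENTINEL;
}

/*
 * Illustrative only: a truncating division by a per-subframe length.
 * This is an assumed shape, not the driver's num_subframes computation.
 */
static unsigned int subframes_for(unsigned int max_len, unsigned int subf_len)
{
    assert(subf_len > 0);
    return max_len / subf_len;
}

/* gso_size = num_subframes * mss, as stated in the text. */
static unsigned int gso_size_for(unsigned int num_subframes, unsigned int mss)
{
    return num_subframes * mss;
}
```

Any truncating division of the sentinel by a realistic subframe length gives 0 subframes and therefore a gso_size of 0, which is why the sentinel has to be rejected before the division rather than after it.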
Sven Eckelmann [Sat, 16 May 2026 20:10:08 +0000 (22:10 +0200)]
batman-adv: fix batadv_skb_is_frag() kernel-doc
The kernel-doc comment for batadv_skb_is_frag() contained two errors:
* the function description referred to "gain a unicast packet" instead
of "contains unicast fragment".
* the Return section omitted "merged" from "newly skb", leaving the
description grammatically incorrect and inconsistent with the
function description.
Fixes: bc62216dc8e2 ("batman-adv: frag: disallow unicast fragment in fragment")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
tracing: Fix desc in error path for the trace remote test module
During initialisation in remote_test_load(), if one of the
simple_ring_buffer instances fails to initialise, the error path attempts
to roll back the initialised buffers. However, the rollback incorrectly
uses the global pointer to the trace descriptor, which is only set upon
successful load completion. Fix the error path by using the local
pointer to the descriptor.
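The error-path pattern can be sketched in userspace. This is a minimal model, not the test module: struct desc_model, init_one(), load() and global_desc are hypothetical names, and the fail_at parameter exists only to force a failure for demonstration.

```c
#include <assert.h>
#include <stddef.h>

#define NBUF 4

struct desc_model {
    int loaded[NBUF]; /* 1 if buffer i is currently initialised */
};

/* Global descriptor: only published after a fully successful load. */
static struct desc_model *global_desc;

/* Hypothetical init step that fails at index fail_at (never if -1). */
static int init_one(struct desc_model *d, int i, int fail_at)
{
    if (i == fail_at)
        return -1;
    d->loaded[i] = 1;
    return 0;
}

static int load(struct desc_model *local, int fail_at)
{
    int i;

    assert(local != NULL);
    for (i = 0; i < NBUF; i++) {
        if (init_one(local, i, fail_at) != 0)
            goto rollback;
    }
    global_desc = local; /* publish only on success */
    return 0;

rollback:
    /* Undo through the local pointer; global_desc is not set by this load. */
    while (--i >= 0)
        local->loaded[i] = 0;
    return -1;
}
```

Because the global pointer is assigned only at the end of a successful load, any rollback that goes through it during a failed load is looking at the wrong object or at nothing.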
Heechan Kang [Sat, 16 May 2026 18:47:09 +0000 (03:47 +0900)]
io_uring/waitid: clear waitid info before copying it to userspace
IORING_OP_WAITID stores its result fields in struct io_waitid::info and
later copies them to userspace siginfo. The prep path initializes the
request arguments, but it does not initialize info itself.
If the wait operation completes without reporting a child event, the common
wait code can return without writing wo_info. In that case io_waitid_finish()
still copies iw->info to userspace, exposing stale bytes from the reused
io_kiocb command storage.
Clear the result storage during prep so the io_uring path matches the
regular waitid syscall, which uses a zero-initialized struct waitid_info.
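The effect of clearing the result storage during prep can be sketched as follows. The structures here are hypothetical stand-ins rather than the io_uring types, and the 0xAA fill in the usage merely simulates stale bytes left in reused storage.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical result record, standing in for the per-request info storage. */
struct info_model {
    int pid;
    int status;
    int cause;
};

struct request_model {
    struct info_model info;
};

/* Prep step: clear result storage so a no-event completion copies zeroes. */
static void prep_request(struct request_model *req)
{
    assert(req != NULL);
    memset(&req->info, 0, sizeof(req->info));
}
```

If the wait later completes without writing the record, whatever is copied out is all zeroes rather than leftovers from an earlier request.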
Sander Vanheule [Fri, 15 May 2026 21:23:51 +0000 (23:23 +0200)]
watchdog: realtek-otto: enable clock before using I/O
As the watchdog is normally on the same bus as the UART peripheral, the
bootloader will have ensured the bus' clock is up and running before the
watchdog driver is probed. Nevertheless, let's do things the right way
and enable the watchdog's clock before performing I/O accesses.
Sander Vanheule [Fri, 15 May 2026 21:23:50 +0000 (23:23 +0200)]
watchdog: realtek-otto: prevent PHASE2 underflows
For small pretimeout values, ((timeout - pretimeout) / tick) might be
rounded up to the same value as (timeout / tick). As a result, the
number of PHASE2 ticks may be zero, causing an underflow when
subtracting 1 to configure the hardware. While this results in a
longer-than-expected time to system reset, the duration of PHASE1 and
minimum ping interval for the watchdog would still be correct.
As the watchdog core ensures pretimeout is strictly less than timeout,
ceil(timeout / tick) is strictly greater than floor(pretimeout / tick)
and the number of PHASE1 ticks cannot be 0. So instead of rounding up
the number of PHASE1 ticks, we can round down the number of PHASE2
ticks, maintaining the current behavior while avoiding underflows.
The original helper function is now inlined, as it doesn't save any
duplication anymore.
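The rounding argument can be checked with small numbers. The sketch below models only the tick counts: the helper names are hypothetical, the units are arbitrary, and it does not cover how the driver encodes the counts into hardware registers or applies any minimum tick count.

```c
#include <assert.h>

static unsigned int ceil_div(unsigned int a, unsigned int b)
{
    assert(b > 0);
    return (a + b - 1) / b;
}

/* Old derivation: round PHASE1 up, take PHASE2 as the remainder. */
static unsigned int old_phase2_ticks(unsigned int timeout,
                                     unsigned int pretimeout,
                                     unsigned int tick)
{
    assert(pretimeout < timeout);
    return ceil_div(timeout, tick) - ceil_div(timeout - pretimeout, tick);
}

/* New derivation: round PHASE2 down, take PHASE1 as the remainder. */
static unsigned int new_phase1_ticks(unsigned int timeout,
                                     unsigned int pretimeout,
                                     unsigned int tick)
{
    assert(pretimeout < timeout);
    return ceil_div(timeout, tick) - pretimeout / tick;
}
```

With timeout = 10, pretimeout = 1 and tick = 4, the old derivation yields 0 PHASE2 ticks, so subtracting 1 from it would wrap. The new derivation yields 3 PHASE1 ticks for the same inputs and 2 when pretimeout = 5, consistent with the claim that PHASE1 cannot be 0.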
Linus Torvalds [Sat, 16 May 2026 16:53:14 +0000 (09:53 -0700)]
Merge tag 'powerpc-7.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Madhavan Srinivasan:
- Fix preempt count leak in sysfs show paths
- Fix error handling in pika_dtm_thread
- Remove pmac_low_i2c_{lock,unlock}()
- Enable all windfarms by default
- Fix dead default for GUEST_STATE_BUFFER_TEST
- Remove redundant preempt_disable|enable() calls from
arch_irq_work_raise()
Thanks to Aboorva Devarajan, Ally Heev, Amit Machhiwal, Bart Van Assche,
Christophe Leroy, Christophe Leroy (CS GROUP), Dan Carpenter, Gautam
Menghani, Harsh Prateek Bora, Julian Braha, Krzysztof Kozlowski, Linus
Walleij, Ma Ke, Ritesh Harjani (IBM), and Sayali Patil
* tag 'powerpc-7.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/time: Remove redundant preempt_disable|enable() calls from arch_irq_work_raise()
powerpc/hv-gpci: fix preempt count leak in sysfs show paths
powerpc: fix dead default for GUEST_STATE_BUFFER_TEST
powerpc/powermac: Remove pmac_low_i2c_{lock,unlock}()
powerpc/warp: Fix error handling in pika_dtm_thread
powerpc: 82xx: fix uninitialized pointers with free attribute
powerpc/g5: Enable all windfarms by default
The GMAC node incorrectly listed four clocks, including a separate tx_clk
and a TSU GCK clock sourced from ID 67. According to the SAM9X7 clocking
scheme, the GMAC uses only three clocks: HCLK, PCLK, and the TSU GCK
derived from the GMAC peripheral clock (ID 24).
Remove the unused tx_clk, update the clock-names accordingly, and correct
the assigned clock to use GCK 24 instead of GCK 67. This aligns the device
tree with the actual hardware clock topology and prevents misconfiguration
of the GMAC clock tree.
Linus Torvalds [Sat, 16 May 2026 16:32:30 +0000 (09:32 -0700)]
Merge tag 'sound-7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A collection of small fixes. All device-specific small changes:
HD-audio:
- Fix NULL pointer dereference in snd_hda_ctl_add()
- ACPI and Kconfig fixes for Cirrus drivers
- A regression fix for the CA0132 codec
- Various device-specific quirks for HP, Lenovo, Samsung, Framework etc
- Documentation path fix
USB-audio:
- Boundary checks for MIDI endpoint descriptors
- Offload mapping error handling for Qualcomm
- A new device quirk for TTGK Technology USB-C Audio
- A fix for Focusrite Scarlett2 mixer"
* tag 'sound-7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda/ca0132: Disable auto-detect on manual output select
ALSA: hda/realtek: Add mute LED quirk for HP Pavilion Laptop 16-ag0xxx
ALSA: hda/realtek: ALC269 fixup for Lenovo Yoga Pro 7 15ASH111 audio
ALSA: hda: Fix NULL pointer dereference in snd_hda_ctl_add()
ALSA: hda/realtek: Add quirk for Samsung Galaxy Book5 360 headphone
ALSA: hda/cs35l56: Drop malformed default N from Kconfig
ALSA: hda/realtek: fix mic boost on Framework PTL
ALSA: hda/realtek: Limit mic boost on Positivo DN50E
ALSA: doc: cs35l56: Update path to HDA driver source
ALSA: usb-audio: qcom: Check offload mapping failures
ALSA: hda/realtek: Fix Legion 7 16ITHG6 speaker amp binding
ALSA: usb-audio: Add iface reset and delay quirk for TTGK Technology USB-C Audio
ALSA: scarlett2: Add missing error check when initialise Autogain Status
ALSA: hda: cs35l41: Put ACPI device on missing physical node
ALSA: hda: cs35l56: Put ACPI device after setting companion
ALSA: usb-audio: Bound MIDI 2.0 endpoint descriptor scans
ALSA: usb-audio: Bound MIDI endpoint descriptor scans
ALSA: hda/realtek: Add codec SSID quirk for Lenovo Yoga Pro 9 16IMH9 (17aa:38d5)
Guenter Roeck [Thu, 14 May 2026 21:41:00 +0000 (14:41 -0700)]
hwmon: (lm90) Add lock protection to lm90_alert
Sashiko reports:
lm90_alert() executes in the smbus alert context and calls
lm90_update_confreg() to disable the hardware alert line, without
acquiring hwmon_lock.
Concurrently, sysfs write operations (such as lm90_write_convrate) hold
the hwmon_lock, temporarily modify data->config, and then restore it.
If an alert interrupt occurs concurrently with a sysfs write, the sysfs
path will overwrite the alert handler's modifications to data->config
and the hardware register.
This unintentionally re-enables the hardware alert line while the alarm is
still active, causing an interrupt storm.
Add the missing lock to lm90_alert() to solve the problem.
Fixes: 7a1d220ccb0cc ("hwmon: (lm90) Introduce function to update configuration register") Reported-by: Sashiko <sashiko-bot@kernel.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Guenter Roeck [Thu, 14 May 2026 21:31:49 +0000 (14:31 -0700)]
hwmon: (lm90) Stop work before releasing hwmon device
Sashiko reports:
In lm90_probe(), the devm action to cancel the alert_work and report_work
(lm90_restore_conf) is registered in lm90_init_client() before
devm_hwmon_device_register_with_info() is called.
Because devm executes cleanup actions in reverse order during module
unbind or probe failure, the hwmon device is unregistered and freed first.
If lm90_alert_work() or lm90_report_alarms() runs in the window between
the hwmon device being freed and the delayed works being cancelled,
lm90_update_alarms() will dereference the freed data->hwmon_dev here.
Fix the problem by canceling the workers separately after registering
the hwmon device and before registering the interrupt handler. This ensures
that the workers are canceled after interrupts are disabled and before
the hwmon device is released. Add a "shutdown" flag to indicate that device
shutdown is in progress to prevent workers from being re-armed.
Mirror of Mark Brown's ASoC: hdac_hdmi rate-limit patch (commit
[lkml.kernel.org/lkml/2025/6/13/1380]) for the generic snd_parse_eld()
helper used by ASoC hdmi-codec.
When an HDMI sink is disconnected (e.g. a board with two HDMI outputs and
only one cable), userspace audio servers like PipeWire keep probing the
disconnected card and trigger:
HDMI: Unknown ELD version 0
at every probe — easily 30+ messages per burst on rk3588. The same
applies to malformed ELD (MNL out of range). Both conditions are
expected when no sink is attached; rate-limit the dev_info() so the
kernel ring buffer does not fill up.
Dmitry Baryshkov [Sat, 16 May 2026 11:53:45 +0000 (14:53 +0300)]
drm/msm/snapshot: fix dumping of the unaligned regions
The snapshotting code internally aligns data segment to 16 bytes. This
works fine for DPU code (where most of the regions are aligned), but
fails for snapshotting of the DSI data (because DSI data region is
shifted by 4 bytes). Fix the code by removing length alignment and by
accurately printing last registers in the region. While reworking the
code also fix the 16x memory overallocation in
msm_disp_state_dump_regs().
Fixes: 98659487b845 ("drm/msm: add support to take dpu snapshot") Reported-by: Salendarsingh Gaud <sgaud@qti.qualcomm.com> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/725449/
Message-ID: <20260516-msm-fix-dsi-dump-2-v2-1-9e49fb2d240e@oss.qualcomm.com> Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Takashi Iwai [Fri, 15 May 2026 10:56:59 +0000 (12:56 +0200)]
ALSA: hda: Avoid quirk matching with zero PCI SSID
Heiko reported that the BIOS on some recent machines doesn't set up the
PCI SSID properly but leaves it at zero (e.g. on HP Dragonfly Folio 13.5
inch G3 with SSID 103c:8a05/8a06), which confuses the quirk table
matching and results in a non-functional state.
Fix it by skipping the PCI SSID matching when either vendor or device
ID is zero and falling back to the codec SSID that is supposed to be
more stable for those cases.
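The lookup order described above can be sketched in plain C. This is only an illustration under stated assumptions: struct quirk, lookup() and pick_fixup() are hypothetical names, not the HD-audio core's real types or helpers, and the table is a simple exact-match list.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical quirk entry; not the kernel's real quirk structure. */
struct quirk {
	uint16_t vendor;
	uint16_t device;
	int fixup_id;
};

/* Exact-match lookup; returns the entry or NULL when nothing matches. */
static const struct quirk *lookup(const struct quirk *tbl, size_t n,
				  uint16_t vendor, uint16_t device)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].vendor == vendor && tbl[i].device == device)
			return &tbl[i];
	return NULL;
}

/* Returns the fixup id, or -1 when neither SSID matches. */
static int pick_fixup(const struct quirk *tbl, size_t n,
		      uint16_t pci_vendor, uint16_t pci_device,
		      uint16_t codec_vendor, uint16_t codec_device)
{
	const struct quirk *q = NULL;

	/* Skip the PCI SSID entirely when either half is zero. */
	if (pci_vendor != 0 && pci_device != 0)
		q = lookup(tbl, n, pci_vendor, pci_device);

	/* Fall back to the codec SSID. */
	if (!q)
		q = lookup(tbl, n, codec_vendor, codec_device);

	return q ? q->fixup_id : -1;
}
```

With a zero PCI SSID the first lookup is never attempted, so the result depends only on the codec SSID.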
Sergio Boglione [Sat, 16 May 2026 13:16:50 +0000 (10:16 -0300)]
ALSA: hda/realtek: Add quirk for HP 250 G10 (103c:8b34)
HP 250 15.6 inch G10 Notebook PC uses the same ALC236 codec
as the HP 255 15.6 inch G10 (103c:8b2f) and requires the same
fixup to enable the internal speaker EAPD and microphone routing.
Eric Naim [Sat, 16 May 2026 11:15:31 +0000 (19:15 +0800)]
ALSA: hda/realtek: Use ALC287_FIXUP_TXNW2781_I2C for ASUS Strix Gxx5
These devices were incorrectly using the ALC287_FIXUP_TAS2781_I2C quirk
leading to errors:
[ 18.765990] Serial bus multi instantiate pseudo device driver TXNW2781:00: error -ENXIO: IRQ index 0 not found
[ 18.768153] Serial bus multi instantiate pseudo device driver TXNW2781:00: error -ENXIO: IRQ index 0 not found
[ 18.768476] Serial bus multi instantiate pseudo device driver TXNW2781:00: error -ENXIO: IRQ index 0 not found
[ 18.768899] Serial bus multi instantiate pseudo device driver TXNW2781:00: Instantiated 3 I2C devices.
Use the ALC287_FIXUP_TXNW2781_I2C quirk instead to fix this and restore
speaker audio on affected devices.
The VirtIO sound UAPI defines VIRTIO_SND_PCM_RATE_384000, and ALSA
has SNDRV_PCM_RATE_384000. However, virtio-snd's rate conversion
tables stop at 192 kHz.
A device advertising only 384 kHz is rejected as having no supported
PCM frame rates. A device advertising 384 kHz together with lower rates
does not expose 384 kHz through the ALSA hardware constraints. The
selected ALSA rate also needs a reverse mapping for SET_PARAMS.
Add the missing 384 kHz entries to both conversion tables.
Takashi Iwai [Fri, 15 May 2026 08:55:58 +0000 (10:55 +0200)]
ALSA: asihpi: Fix potential OOB array access at reading cache
find_control(), which retrieves cached info, accesses the array with the
given index without validating it, which may lead to an OOB array access.
Add a sanity check to avoid it.
Haoze Xie [Fri, 15 May 2026 03:19:02 +0000 (11:19 +0800)]
netfilter: nf_queue: hold bridge skb->dev while queued
br_pass_frame_up() rewrites skb->dev from the ingress port to the bridge
master before queueing bridge LOCAL_IN packets. NFQUEUE only holds
references on state.in/out and bridge physdevs, so a queued bridge
packet can retain a freed bridge master in skb->dev until reinjection.
When the verdict is reinjected later, br_netif_receive_skb() re-enters
the receive path with skb->dev still pointing at the freed bridge master,
triggering a use-after-free.
Store skb->dev in the queue entry, hold a reference on it for the queue
lifetime, and use the saved device when dropping queued packets during
NETDEV_DOWN handling.
Fixes: ac2863445686 ("netfilter: bridge: add nf_afinfo to enable queuing to userspace") Cc: stable@kernel.org Reported-by: Yuan Tan <yuantan098@gmail.com> Reported-by: Yifan Wu <yifanwucs@gmail.com> Reported-by: Juefei Pu <tomapufckgml@gmail.com> Reported-by: Xin Liu <bird@lzu.edu.cn> Signed-off-by: Haoze Xie <royenheart@gmail.com> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Lorenzo Bianconi [Thu, 14 May 2026 14:46:38 +0000 (16:46 +0200)]
netfilter: br_netfilter: Reallocate headroom if necessary in neigh_hh_bridge()
neigh_hh_bridge() assumes the skb always has sufficient headroom to copy
the aligned L2 header. This assumption can trigger the crash reported
below using the following netfilter setup:
Fix the issue by reallocating the skb headroom if necessary in neigh_hh_bridge().
Fixes: e179e6322ac33 ("netfilter: bridge-netfilter: Fix MAC header handling with IP DNAT") Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Jozsef Kadlecsik [Thu, 14 May 2026 08:55:13 +0000 (10:55 +0200)]
netfilter: ipset: annotate "pos" for concurrent readers/writers
The "pos" structure member of struct hbucket stores the first
free slot in the hash bucket of a hash type of set and there
are concurrent readers/writers. Annotate accesses properly.
Fixes: 18f84d41d34f ("netfilter: ipset: Introduce RCU locking in hash:* types") Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
netfilter: ipset: Fix data race between add and dump in all hash types
When adding a new entry to the next position in the existing hash bucket,
the position index was incremented too early and parallel dump could
read it before the entry was populated with the value. Move the setting
of the position index after populating the entry.
v2: Position counting fixed, noticed by Florian Westphal.
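The ordering rule (populate the entry first, publish the position index afterwards) can be sketched in userspace with C11 atomics. This is an analogue under assumptions, not the ipset code: struct bucket, bucket_add() and bucket_dump() are hypothetical, a single writer is assumed, and the kernel uses its own primitives rather than <stdatomic.h>.

```c
#include <stdatomic.h>
#include <stdint.h>

#define SLOTS 4 /* hypothetical bucket capacity */

/* Hypothetical bucket: pos is the index of the first free slot. */
struct bucket {
	uint32_t value[SLOTS];
	atomic_uint pos;
};

/* Writer (single writer assumed): fill the slot, then publish it. */
static int bucket_add(struct bucket *b, uint32_t v)
{
	unsigned int n = atomic_load_explicit(&b->pos, memory_order_relaxed);

	if (n >= SLOTS)
		return -1;

	b->value[n] = v; /* populate the entry first */
	/* Release store: the entry is visible before the new pos is. */
	atomic_store_explicit(&b->pos, n + 1, memory_order_release);
	return 0;
}

/* Reader: copy only the slots below the published position. */
static unsigned int bucket_dump(struct bucket *b, uint32_t *out)
{
	unsigned int n = atomic_load_explicit(&b->pos, memory_order_acquire);

	for (unsigned int i = 0; i < n; i++)
		out[i] = b->value[i];
	return n;
}
```

If the index were stored before the value, a concurrent bucket_dump() could copy a slot that has not been written yet, which is the race the commit describes.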
Jozsef Kadlecsik [Thu, 14 May 2026 08:55:11 +0000 (10:55 +0200)]
netfilter: ipset: Fix data race between add and list header in all hash types
The "ipset list -terse" command is actually a dump operation which
may run parallel with "ipset add" commands, which can trigger an
internal resizing of the hash type of sets just being dumped. However,
dumping just the header part of the set was not protected against
underlying resizing. Fix it by protecting the header dumping part
as well.
Fixes: c4c997839cf9 ("netfilter: ipset: Fix parallel resizing and listing of the same set") Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
struct ip6t_opts stores at most IP6T_OPTS_OPTSNR option descriptors,
but hbh_mt6_check() does not reject larger optsnr values supplied from
userspace.
Validate optsnr in the rule setup path so only match data that fits the
fixed-size opts array can be installed. This follows the existing xtables
pattern of rejecting invalid user-provided counts in checkentry() and
keeps the packet matching path unchanged.
Since `struct ip6t_opts` has a fixed `opts[IP6T_OPTS_OPTSNR]` array,
where `IP6T_OPTS_OPTSNR` is 16, an off-by-one array access is possible:
[ 137.924693][ T8692] UBSAN: array-index-out-of-bounds in ../net/ipv6/netfilter/ip6t_hbh.c:110:29
[ 137.926167][ T8692] index 16 is out of range for type '__u16 [16]'
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@kernel.org Reported-by: Yuan Tan <yuantan098@gmail.com> Reported-by: Yifan Wu <yifanwucs@gmail.com> Reported-by: Juefei Pu <tomapufckgml@gmail.com> Reported-by: Xin Liu <bird@lzu.edu.cn> Signed-off-by: Zhengchuan Liang <zcliangcn@gmail.com> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
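A minimal sketch of the setup-time validation, assuming a simplified stand-in structure: struct opts_info and opts_check() are hypothetical and only mirror the two fields discussed above, not the full uapi layout or the real checkentry() signature.

```c
#include <errno.h>
#include <stdint.h>

#define OPTS_MAX 16 /* mirrors IP6T_OPTS_OPTSNR as stated in the text */

/* Simplified stand-in for the match data supplied by userspace. */
struct opts_info {
	uint16_t opts[OPTS_MAX];
	uint8_t optsnr; /* number of valid entries in opts[] */
};

/* Reject rule data whose count does not fit the fixed-size array. */
static int opts_check(const struct opts_info *info)
{
	if (info->optsnr > OPTS_MAX)
		return -EINVAL;
	return 0;
}
```

Because the count is rejected when the rule is installed, the per-packet path can keep indexing opts[] up to optsnr without an extra bounds check.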
The iterator must stop once the last address in the requested range has
been processed. Advancing it once more can move the traversal state past
the end of the request, so a later retry may continue from an unintended
position.
Handle the iterator increment explicitly at the end of the loop and stop
once the upper bound has been processed. This keeps the existing retry
behaviour intact for valid ranges while preventing traversal from
continuing past the original boundary.
Fixes: 48596a8ddc46 ("netfilter: ipset: Fix adding an IPv4 range containing more than 2^31 addresses") Cc: stable@kernel.org Reported-by: Yuan Tan <yuantan098@gmail.com> Reported-by: Yifan Wu <yifanwucs@gmail.com> Reported-by: Juefei Pu <tomapufckgml@gmail.com> Reported-by: Xin Liu <bird@lzu.edu.cn> Signed-off-by: Nan Li <tonanli66@gmail.com> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
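One way to write a loop with the explicit end-of-loop increment is sketched below. walk_range() is a hypothetical helper, not the ipset code; it only counts the addresses it visits. Note that with a 32-bit iterator, a conventional `ip <= to; ip++` loop cannot terminate normally when `to` is UINT32_MAX, because the increment wraps to 0.

```c
#include <stdint.h>

/* Visit every address in [from, to] inclusive; return how many were seen. */
static uint64_t walk_range(uint32_t from, uint32_t to)
{
	uint64_t visited = 0;
	uint32_t ip = from;

	if (from > to)
		return 0;

	for (;;) {
		visited++; /* placeholder for processing one address */

		/* Stop once the upper bound has been processed. */
		if (ip == to)
			break;

		ip++; /* advance only while still inside the range */
	}
	return visited;
}
```

The iterator never moves past `to`, so any state saved for a later retry stays within the requested range.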
Yizhou Zhao [Mon, 11 May 2026 17:30:41 +0000 (01:30 +0800)]
netfilter: nft_inner: Fix IPv6 inner_thoff desync
In nft_inner_parse_l2l3(), when processing inner IPv6 packets,
ipv6_find_hdr() correctly computes the transport header offset
traversing all extension headers, but the result is immediately
overwritten with nhoff + sizeof(_ip6h) (40 bytes), which only
accounts for the IPv6 base header. This creates a desync between
inner_thoff (wrong — points to extension header start) and l4proto
(correct — e.g., IPPROTO_TCP), enabling transport header forgery
and potential firewall bypass. This issue affects stable versions
from Linux 6.2.
For comparison, the normal (non-inner) IPv6 path correctly
preserves ipv6_find_hdr()'s result. Removing the incorrect overwrite
ensures that ipv6_find_hdr()'s calculated transport header offset is
preserved, thereby fixing the desynchronization.
Fixes: 3a07327d10a0 ("netfilter: nft_inner: support for inner tunnel header matching") Cc: stable@vger.kernel.org Reported-by: Yizhou Zhao <zhaoyz24@mails.tsinghua.edu.cn> Reported-by: Yuxiang Yang <yangyx22@mails.tsinghua.edu.cn> Reported-by: Xuewei Feng <fengxw06@126.com> Reported-by: Qi Li <qli01@tsinghua.edu.cn> Reported-by: Ke Xu <xuke@tsinghua.edu.cn> Assisted-by: GLM:5.1 Z.ai Signed-off-by: Yizhou Zhao <zhaoyz24@mails.tsinghua.edu.cn> Reviewed-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
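The difference between the two offsets can be shown with a simplified extension-header walk. find_thoff() is a hypothetical stand-in, not ipv6_find_hdr(): it handles only hop-by-hop, routing and destination-options headers, and ignores fragments, AH and everything else the real helper covers.

```c
#include <stddef.h>
#include <stdint.h>

#define IPV6_BASE_LEN   40 /* fixed IPv6 header size */
#define NEXTHDR_HOP      0
#define NEXTHDR_TCP      6
#define NEXTHDR_ROUTING 43
#define NEXTHDR_DEST    60

/*
 * Return the transport header offset and store the L4 protocol,
 * or return -1 when the packet is truncated.
 */
static int find_thoff(const uint8_t *pkt, size_t len, uint8_t *l4proto)
{
	size_t off = IPV6_BASE_LEN;
	uint8_t nh;

	if (len < IPV6_BASE_LEN)
		return -1;

	nh = pkt[6]; /* Next Header field of the base header */
	while (nh == NEXTHDR_HOP || nh == NEXTHDR_ROUTING ||
	       nh == NEXTHDR_DEST) {
		uint8_t next;
		size_t hlen;

		if (off + 2 > len)
			return -1;
		next = pkt[off];
		/* Hdr Ext Len counts 8-octet units beyond the first 8. */
		hlen = ((size_t)pkt[off + 1] + 1) * 8;
		if (off + hlen > len)
			return -1;
		nh = next;
		off += hlen;
	}

	*l4proto = nh;
	return (int)off;
}
```

For a packet carrying one 8-byte hop-by-hop header before TCP, the walk yields offset 48, while nhoff + sizeof(_ip6h) with nhoff = 0 would give 40 and point at the extension header instead.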
Jozsef Kadlecsik [Thu, 14 May 2026 08:55:10 +0000 (10:55 +0200)]
netfilter: ipset: fix a potential dump-destroy race
When dumping sets, the list type of sets are dumped last in order to
create the proper order for restore. Therefore internally we run the
dumping loop twice: first with all non-list type of sets, skipping
the list type ones, and then a second time for the list type of sets.
Sashiko noticed that there's a potential race between dump and destroy
if in the first loop the last set was a list type of set: its pointer
remains unreferenced and a concurrent destroy can free it.
Fix the issue by resetting the variable holding the pointer.
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Julian Anastasov [Sun, 10 May 2026 10:46:05 +0000 (13:46 +0300)]
ipvs: avoid possible loop in ip_vs_dst_event on resizing
Sashiko points out that an unprivileged user can frequently
call ip_vs_flush() or ip_vs_del_service() to trigger
svc_table_changes updates that can lead to an infinite loop
in ip_vs_dst_event(). This can also happen if the user
triggers frequent table resizing without deleting all
services. We should also consider the possible effects
if the user triggers many NETDEV_DOWN events.
One way to solve it is to hold svc_resize_sem in
ip_vs_dst_event() but this can block the dev notifier
during the whole resizing process.
Instead, use new rw_semaphore svc_replace_sem to protect just
the svc_table replacement which is a short code section.
Then hold svc_replace_sem in ip_vs_dst_event() to serialize
with replacing the svc_table. As a result, the loop is avoided
as there is no need to repeat the table walking from the
start. This way, changes in svc_table_changes can happen
only when all services are removed and all dev references
dropped, which allows us to abort the table walking.
As IP_VS_WORK_SVC_NORESIZE is the flag used to stop the
svc_resize_work under service_mutex, we should check only
this flag often but not while under service_mutex.
To remove the mutex_trylock() for service_mutex in the
second phase where the resizer installs the new table
after rehashing, we will avoid holding the service_mutex
there. As a result, the code in configuration context which
is under service_mutex should access ipvs->svc_table under
RCU because it can be replaced at any time and released
after an RCU grace period. As for ip_vs_zero_all(), it needs
a different solution as a table walker which can escape a
single RCU read-side critical section: to hold the
svc_replace_sem to prevent the table from being replaced.
In ip_vs_status_show() prefer to hold svc_replace_sem
to avoid many loops, just detect if the svc_table is
removed.
Prefer the newly attached table for the u_thresh/l_thresh
checks to know when to grow/shrink while adding or deleting
services because the new table size is based on the latest
parameters.
Jani Nikula [Wed, 13 May 2026 16:13:29 +0000 (19:13 +0300)]
drm/i915/irq: add platform specific display irq handler functions
Add a number of *_display_irq_handler() functions to group together the
various display irq handler parts for the platforms, to declutter the
core i915 irq code from the details.
Add master_ctl to struct intel_display_irq_state, and pass the state
pointer to the handlers where necessary. The handler function signatures
are intentionally the same to allow for more refactoring.
Jani Nikula [Wed, 13 May 2026 16:13:28 +0000 (19:13 +0300)]
drm/i915/irq: add platform specific display irq ack functions
Add i9xx_display_irq_ack() and vlv_display_irq_ack() to group together
the various irq ack parts for the platforms, to declutter the core i915
irq code from the details.
Introduce struct intel_display_irq_state to group together all the data
the ack functions need. In the follow-up, this state will be passed on
to similar platform specific handler functions.
Jani Nikula [Wed, 13 May 2026 16:13:26 +0000 (19:13 +0300)]
drm/i915/irq: add display irq funcs, start with intel_display_irq_reset()
Introduce display irq hooks with struct intel_display_irq_funcs, and add
the ->reset hook as the first thing. Call the reset hooks from i915 and
xe core via intel_display_irq_reset().
Relocate the gen8 and gen11 HAS_DISPLAY() check to
intel_display_irq_reset(), as the funcs pointer won't be initialized for
no display.
Note: We're increasingly moving to the territory of not touching display
at all if there's no display or it has been fused off. Which is good,
but care must be taken to not have hardware setup required also for no
display cases in display code. Also note that the line is fuzzy for
older platforms, but there we also don't have fusing.
v2:
- make the structs static const (Sashiko)
- relocate HAS_DISPLAY() (Sashiko)
Jani Nikula [Wed, 13 May 2026 16:13:24 +0000 (19:13 +0300)]
drm/i915/irq: deduplicate dg1_de_irq_postinstall() and gen11_de_irq_postinstall()
dg1_de_irq_postinstall() and gen11_de_irq_postinstall() are exactly the
same. Remove dg1_de_irq_postinstall() and call
gen11_de_irq_postinstall() instead.
xfrm: ah: use skb_to_full_sk in async output callbacks
When AH output is offloaded to an asynchronous crypto provider
(hardware accelerators such as AMD CCP, or a forced-async software
shim used for testing), the digest completion fires
ah_output_done() / ah6_output_done() on a workqueue. The egress
skb at that point may have been originated by a TCP listener
sending a SYN-ACK, which sets skb->sk to a request_sock via
skb_set_owner_edemux(); it may also have been originated by an
inet_timewait_sock retransmit. Neither is a full struct sock, and
passing the raw skb->sk to xfrm_output_resume() then forwards a
non-full socket through the rest of the xfrm output chain.
xfrm_output_resume() and its downstream consumers expect a full
sk where they dereference at all. The natural egress path
through ah_output_done() does not crash today because the
consumers that read past sock_common are either gated by
sk_fullsock() or short-circuit on flags that are clear on a fresh
request_sock; an exhaustive walk of the 50 most plausible
consumers under sch_fq, dev_queue_xmit, netfilter, tc-egress and
cgroup-egress BPF found no current unguarded deref. The bug is
still a real type confusion that future consumer changes could
turn into a memory-corruption primitive.
This is the same bug class fixed for ESP in commit 1620c88887b1
("xfrm: Fix the usage of skb->sk"). Apply the analogous fix to
AH: convert skb->sk to a full socket pointer (or NULL) via
skb_to_full_sk() before handing it to xfrm_output_resume().
The same async AH callbacks were touched recently for an
independent ESN-related ICV layout bug in commit ec54093e6a8f
("xfrm: ah: account for ESN high bits in async callbacks"); the
sk type-confusion addressed here is orthogonal. This patch is
part of an ongoing audit of the AH callback paths; an ah_output
ihl-validation hardening series is also currently under review on
netdev.
Reproduced under UML + KASAN + lockdep with a forced-async
hmac(sha1) shim that registers at priority 9999 and wraps the
sync in-tree hmac-sha1-lib. With the shim loaded, ah_output_done
runs on every SYN-ACK egress through a transport-mode AH SA and
skb->sk arrives as a request_sock (TCP_NEW_SYN_RECV); after this
patch, xfrm_output_resume() receives the listener (the result of
sk_to_full_sk()) and consumer derefs land on full-sock fields as
intended.
Fixes: 9ab1265d5231 ("xfrm: Use actual socket sk instead of skb socket for xfrm_output_resume") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Jani Nikula [Fri, 15 May 2026 16:09:20 +0000 (19:09 +0300)]
drm/xe/display: fix oops in suspend/shutdown without display
The xe driver keeps track of whether to probe display, and whether
display hardware is there, using xe->info.probe_display. It gets set to
false if there's no display after intel_display_device_probe(). However,
the display may also be disabled via fuses, detected at a later time in
intel_display_device_info_runtime_init().
In this case, the xe driver does for_each_intel_crtc() on uninitialized
mode config in xe_display_flush_cleanup_work(), leading to a NULL
pointer dereference, and generally calls display code with display info
cleared.
Check for intel_display_device_present() after
intel_display_device_info_runtime_init(), and reset
xe->info.probe_display as necessary. Also do unset_display_features()
for completeness, although display runtime init has already done
that. This will need to be unified across all cases later.
Move intel_display_device_info_runtime_init() call slightly earlier,
similar to i915, to avoid a bunch of unnecessary setup for no display
cases.
Note #1: The xe driver has no business doing low level display plumbing
like for_each_intel_crtc() to begin with. It all needs to happen in
display code.
Note #2: The actual bug is present already in commit 44e694958b95
("drm/xe/display: Implement display support"), but the oops was likely
introduced later at commit ddf6492e0e50 ("drm/xe/display: Make display
suspend/resume work on discrete").
Jasper Smet [Wed, 13 May 2026 05:21:37 +0000 (07:21 +0200)]
ASoC: amd: acp: Add DMI quirk for ASUS Zenbook S16 UM5606GA
The ASUS Zenbook S16 (UM5606GA) with AMD Ryzen AI 9 465 (Strix Point,
ACP 7.0) has a BIOS that incorrectly sets the ACPI property
'acp-audio-config-flag' to 0x10 (FLAG_AMD_LEGACY_ONLY_DMIC) for the ACP
device. This prevents snd_pci_ps from probing the SoundWire bus, resulting
in no internal audio (dummy output only).
The hardware uses a Cirrus Logic CS42L43 (headphone/jack) and four CS35L56
smart amplifiers (speakers), all on SoundWire link 1. The corresponding
machine table entry (acp70_cs42l43_l1u0_cs35l56x4_l1u0123) already exists
in amd-acp70-acpi-match.c and correctly describes this topology.
Add a DMI quirk to override the flag to 0, consistent with the existing
entry for the HN7306EA.
Felix Gu [Sat, 9 May 2026 17:55:37 +0000 (01:55 +0800)]
spi: mtk-snfi: Fix resource leak in mtk_snand_read_page_cache()
When a DMA read times out in mtk_snand_read_page_cache(), the original code
erroneously jumped to a cleanup label that skips DMA unmapping and ECC
disable, causing a resource leak.
Cássio Gabriel [Mon, 11 May 2026 16:42:02 +0000 (13:42 -0300)]
ASoC: amd: acp-sdw-legacy: check CPU DAI name before logging
devm_kasprintf() can fail and return NULL. The legacy AMD SoundWire
machine driver logs cpus->dai_name before checking the allocation result.
Move the debug print after the NULL check, matching the ordering used by
the SOF AMD SoundWire path after commit 5726b68473f7 ("ASoC: amd/sdw_utils:
avoid NULL deref when devm_kasprintf() fails").
Felix Gu [Thu, 7 May 2026 14:06:36 +0000 (22:06 +0800)]
spi: rspi: Simplify reset control handling
Use devm_reset_control_get_optional_exclusive_deasserted() to combine
get + deassert + cleanup in a single call, removing the redundant
rspi_reset_control_assert() helper.
ASoC: qcom: q6apm-dai: Allocate an extra page for PCM buffers
Some old DSP firmware versions use 32-bit address arithmetic and size for
validating the PCM buffer address range. If a buffer is allocated near
the top of the 32-bit address space, arithmetic calculations involving
the end address can overflow and fail checks.
Work around this by increasing the preallocated PCM buffer size by one
page. The DSP is still passed the usable buffer size, excluding the extra
page, which prevents the firmware from seeing an end address that crosses
the 32-bit boundary.
This was not hit before because PCM buffer allocation and DSP-side
mapping happened at different points, and the size mapped on the DSP was
usually nperiods * period_size. Therefore the mapped size was unlikely to
match the full preallocated buffer size exactly, although the issue was
still possible. With early buffer mapping on the DSP, the full
preallocated buffer is mapped during PCM creation, making the failure
reproducible at boot.
Fixes: 8ea6e25c8536 ("ASoC: qcom: q6apm: Add support for early buffer mapping on DSP") Cc: Stable@vger.kernel.org Reported-by: Jens Glathe <jens.glathe@oldschoolsolutions.biz> Closes: https://lore.kernel.org/all/7f10abbd-fb78-4c3a-ab90-7ca78239891a@oldschoolsolutions.biz/ Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@oss.qualcomm.com> Tested-by: Jens Glathe <jens.glathe@oldschoolsolutions.biz> Link: https://patch.msgid.link/20260514090607.2435484-1-srinivas.kandagatla@oss.qualcomm.com Signed-off-by: Mark Brown <broonie@kernel.org>
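The overflow and the one-page workaround can be modelled with plain 32-bit arithmetic. Everything here is an assumption for illustration: fw_range_ok() is a guess at the kind of check such firmware might perform, PAGE_SZ is an assumed 4 KiB page, and alloc_size() is a hypothetical helper, not the q6apm-dai code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SZ 4096u /* assumed page size */

/* Hypothetical firmware-side check done in 32-bit arithmetic. */
static bool fw_range_ok(uint32_t addr, uint32_t size)
{
	uint32_t end = addr + size; /* wraps modulo 2^32 */

	return end > addr;
}

/* Allocate one page more than the size reported to the DSP. */
static size_t alloc_size(size_t usable)
{
	return usable + PAGE_SZ;
}
```

A 64 KiB buffer at 0xFFFF0000 ends exactly at 2^32, so the 32-bit end address wraps to 0 and the check fails. If the same 64 KiB allocation carries only 60 KiB of usable size, the end address seen by the DSP is 0xFFFFF000 and the check passes.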
Iker Pedrosa [Fri, 15 May 2026 10:49:01 +0000 (12:49 +0200)]
riscv: dts: spacemit: k1-bananapi-f3: add SD card support with UHS modes
Add complete SD card controller support with UHS high-speed modes.
- Enable sdhci0 controller with 4-bit bus width
- Configure card detect GPIO with GPIO_ACTIVE_LOW and internal pull-up
support
- Connect vmmc-supply to buck4 for 3.3V card power
- Connect vqmmc-supply to aldo1 for 1.8V/3.3V I/O switching
- Add dual pinctrl states for voltage-dependent pin configuration
- Support UHS-I SDR25, SDR50, and SDR104 modes
- Add stable MMC device aliases (mmc0 = eMMC, mmc1 = SD card)
This enables full SD card functionality including high-speed UHS modes
for improved performance.
net: hsr: defer node table free until after RCU readers
HSR node-list and node-status generic-netlink operations run under
rcu_read_lock(). They walk hsr->node_db through hsr_get_next_node() and
hsr_get_node_data(), but RTM_DELLINK teardown removes the same node table
with plain list_del() and frees each node immediately.
That lets a generic-netlink reader hold a struct hsr_node pointer across
hsr_dellink(). In a KASAN build, widening the reader window after
hsr_get_next_node() obtains the node reproduces a slab-use-after-free
when the reader copies node->macaddress_A; the freeing stack is
hsr_del_nodes() from hsr_dellink().
Use list_del_rcu() and defer the free through the existing
hsr_free_node_rcu() callback. This matches the lifetime rule used by the
HSR prune paths, which already delete nodes with list_del_rcu() and
call_rcu().
Linmao Li [Wed, 13 May 2026 02:55:09 +0000 (10:55 +0800)]
ipv6: addrconf: bail out of dad_failure when state is no longer POSTDAD
addrconf_dad_failure() transitions ifp->state from DAD to POSTDAD
via addrconf_dad_end(), which drops ifp->lock on return. The lock
is re-acquired after net_info_ratelimited(). A concurrent
ipv6_del_addr() can take the lock in that window, set ifp->state
to DEAD and run list_del_rcu(&ifp->if_list).
addrconf_dad_failure() then overwrites DEAD with ERRDAD at errdad:
and schedules a new dad_work. The work calls ipv6_del_addr()
again, hitting the already-poisoned list entry:
Fold the addrconf_dad_end() logic into addrconf_dad_failure() under
a single ifp->lock critical section. The STABLE_PRIVACY branch
temporarily drops ifp->lock around address regeneration, so at
lock_errdad: verify the state is still POSTDAD before transitioning
to ERRDAD; bail out otherwise to avoid overwriting a state set by
another path while the lock was released.
Fixes: c15b1ccadb32 ("ipv6: move DAD and addrconf_verify processing to workqueue") Signed-off-by: Linmao Li <lilinmao@kylinos.cn> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20260513025509.3776405-1-lilinmao@kylinos.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>