Mark Brown [Thu, 23 Apr 2026 19:17:45 +0000 (20:17 +0100)]
selftests/rseq: Don't run tests with runner scripts outside of the scripts
The rseq selftests include two runner scripts run_param_test.sh and
run_syscall_errors_test.sh which set up the environment for test binaries
and run them with various parameters. Currently we list these test binaries
in TEST_GEN_PROGS but this results in the kselftest framework running them
directly as well as via the runners, resulting in duplication and spurious
failures when the environment is not correctly set up (eg, if glibc tries
to use rseq).
Move the binaries the runners invoke to TEST_GEN_PROGS_EXTENDED, binaries
listed there are built but not run by the framework. The param_test
benchmarks are not moved since they are not run by run_param_test.sh.
Fixes: 830969e7821a ("selftests/rseq: Implement time slice extension test") Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20260423-selftests-rseq-use-runner-v1-1-e13a133754c1@kernel.org Cc: stable@vger.kernel.org
Linus Torvalds [Fri, 1 May 2026 18:26:15 +0000 (11:26 -0700)]
Merge tag 'block-7.1-20260430' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull block fixes from Jens Axboe:
- MD pull request via Yu:
- Fix a raid5 UAF on IO across the reshape position
- Avoid failing RAID1/RAID10 devices for invalid IO errors
- Fix RAID10 divide-by-zero when far_copies is zero
- Restore bitmap grow through sysfs
- Use mddev_is_dm() instead of open-coding gendisk checks
- Use ATTRIBUTE_GROUPS() for md default sysfs attributes
- Replace open-coded wait loops with wait_event helpers
- NVMe pull request via Keith:
- Target data transfer size configuation (Aurelien)
- Enable P2P for RDMA (Shivaji Kant)
- TCP target updates (Maurizio, Alistair, Chaitanya, Shivam Kumar)
- TCP host updates (Alistair, Chaitanya)
- Authentication updates (Alistair, Daniel, Chris Leech)
- Multipath fixes (John Garry)
- New quirks (Alan Cui, Tao Jiang)
- Apple driver fix (Fedor Pchelkin)
- PCI admin doorbell update fix (Keith)
- Properly propagate CDROM read-only state to the block layer
* tag 'block-7.1-20260430' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (35 commits)
md: use ATTRIBUTE_GROUPS() for md default sysfs attributes
md: use mddev_is_dm() instead of open-coding gendisk checks
md/raid1: replace wait loop with wait_event_idle() in raid1_write_request()
md/md-bitmap: add a none backend for bitmap grow
md/md-bitmap: split bitmap sysfs groups
md: factor bitmap creation away from sysfs handling
md: use mddev_lock_nointr() in mddev_suspend_and_lock_nointr()
md: replace wait loop with wait_event() in md_handle_request()
md/raid10: fix divide-by-zero in setup_geo() with zero far_copies
md/raid1,raid10: don't fail devices for invalid IO errors
MAINTAINERS: Add Xiao Ni as md/raid reviewer
md/raid5: Fix UAF on IO across the reshape position
cdrom, scsi: sr: propagate read-only status to block layer via set_disk_ro()
nvme-auth: Hash DH shared secret to create session key
nvme-pci: fix missed admin queue sq doorbell write
nvme-auth: Include SC_C in RVAL controller hash
nvme-tcp: teardown circular locking fixes
nvmet-tcp: Don't clear tls_key when freeing sq
Revert "nvmet-tcp: Don't free SQ on authentication success"
nvme: skip trace completion for host path errors
...
Linus Torvalds [Fri, 1 May 2026 18:01:31 +0000 (11:01 -0700)]
Merge tag 'io_uring-7.1-20260430' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull io_uring fixes from Jens Axboe:
- Remove dead struct io_buffer_list member
- Fix for incrementally consumed buffers with recvmsg multishot, which
requires a minimum value left in a buffer for any receive for the
headers. If there's still a bit of buffer left but it's smaller than
that value, then userspace will see a spurious -EFAULT returned in
the CQE
- Locking fix for the DEFER_TASKRUN retry list, which otherwise could
race with fallback cancelations. If the task is exiting with
task_work left in both the normal and retry list AND the exit cleanup
races with the task running task work, then entries could either be
doubly completed or lost
- Cap NAPI busy poll timeout to something sane, to avoid syzbot running
into excessive polling and triggering warnings around that
* tag 'io_uring-7.1-20260430' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
io_uring/tw: serialize ctx->retry_llist with ->uring_lock
io_uring/napi: cap busy_poll_to 10 msec
io_uring/kbuf: support min length left for incremental buffers
io_uring/kbuf: kill dead struct io_buffer_list 'nr_entries' member
Linus Torvalds [Fri, 1 May 2026 16:51:38 +0000 (09:51 -0700)]
Merge tag 'spi-fix-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"There are a couple of nasty issues fixed here in the axiado and
rockchip drivers. We've also got more of the fixes from Johan here,
this time for the two Cadence drivers, plus a couple of other similar
fixes from John and Felix"
* tag 'spi-fix-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: amlogic-spisg: initialize completion before requesting IRQ
spi: axiado: replace usleep_range() with udelay() in IRQ path
spi: cadence-quadspi: fix runtime pm and clock imbalance on unbind
spi: cadence-quadspi: fix unclocked access on unbind
spi: cadence-quadspi: fix clock imbalance on probe failure
spi: cadence-quadspi: fix runtime pm disable imbalance on probe failure
spi: cadence: fix clock imbalance on probe failure
spi: cadence: fix unclocked access on unbind
spi: rockchip: Drop unused and broken CR0 macros
spi: rockchip: Read ISR, not IMR, to detect cs-inactive IRQ
spi: rzv2h-rspi: Fix silent failure in clock setup error path
Kevin Brodsky [Mon, 27 Apr 2026 12:03:33 +0000 (13:03 +0100)]
arm64: signal: Preserve POR_EL0 if poe_context is missing
Commit 2e8a1acea859 ("arm64: signal: Improve POR_EL0 handling to
avoid uaccess failures") delayed the write to POR_EL0 in
rt_sigreturn to avoid spurious uaccess failures. This change however
relies on the poe_context frame record being present: on a system
supporting POE, calling sigreturn without a poe_context record now
results in writing arbitrary data from the kernel stack into POR_EL0.
Fix this by adding a __valid_fields member to struct
user_access_state, and zeroing the struct on allocation.
restore_poe_context() then indicates that the por_el0 field is valid
by setting the corresponding bit in __valid_fields, and
restore_user_access_state() only touches POR_EL0 if there is a valid
value to set it to. This is in line with how POR_EL0 was originally
handled; all frame records are currently optional, except
fpsimd_context.
To ensure that __valid_fields is kept in sync, fields (currently
just por_el0) are now accessed via accessors and prefixed with __ to
discourage direct access.
Fixes: 2e8a1acea859 ("arm64: signal: Improve POR_EL0 handling to avoid uaccess failures") Cc: <stable@vger.kernel.org> Reported-by: Will Deacon <will@kernel.org> Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Linus Torvalds [Fri, 1 May 2026 16:25:12 +0000 (09:25 -0700)]
Merge tag 'regulator-fix-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fix from Mark Brown:
"A fix from Arnd re-adding a dependency on gpiolib which was implicitly
pulled in via an OF specific route which got removed as part of a
cleanup"
* tag 'regulator-fix-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: rpi-panel-attiny: add back GPIOLIB dependency
Linus Torvalds [Fri, 1 May 2026 15:45:23 +0000 (08:45 -0700)]
Merge tag 'mm-hotfixes-stable-2026-04-30-15-39' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM fixes from Andrew Morton:
"20 hotfixes. All are for MM (and for MMish maintainers). 9 are
cc:stable and the remainder are for post-7.0 issues or aren't deemed
suitable for backporting.
There are two DAMON series from SeongJae Park which address races
which could lead to use-after-free errors, and avoid the possibility
of presenting stale parameter values to users"
* tag 'mm-hotfixes-stable-2026-04-30-15-39' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm: memcontrol: fix rcu unbalance in get_non_dying_memcg_end()
mm/userfaultfd: detect VMA type change after copy retry in mfill_copy_folio_retry()
MAINTAINERS: remove stale kdump project URL
mm/damon/stat: detect and use fresh enabled value
mm/damon/lru_sort: detect and use fresh enabled and kdamond_pid values
mm/damon/reclaim: detect and use fresh enabled and kdamond_pid values
selftests/mm: specify requirement for PROC_MEM_ALWAYS_FORCE=y
mm/damon/sysfs-schemes: protect path kfree() with damon_sysfs_lock
mm/damon/sysfs-schemes: protect memcg_path kfree() with damon_sysfs_lock
MAINTAINERS: update Li Wang's email address
MAINTAINERS, mailmap: update email address for Qi Zheng
MAINTAINERS: update Liam's email address
mm/hugetlb_cma: round up per_node before logging it
MAINTAINERS: fix regex pattern in CORE MM category
mm/vma: do not try to unmap a VMA if mmap_prepare() invoked from mmap()
mm: start background writeback based on per-wb threshold for strictlimit BDIs
kho: fix error handling in kho_add_subtree()
liveupdate: fix return value on session allocation failure
mailmap: update entry for Dan Carpenter
vmalloc: fix buffer overflow in vrealloc_node_align()
For 4K pages, the early kernel mapping may use 2MB block entries but the
kernel segments are only 64KB aligned. Segment boundaries that fall
within a 2MB block therefore require a PTE table so that different
attributes can be applied on either side of the boundary.
KERNEL_SEGMENT_COUNT still correctly counts the five permanent kernel
VMAs registered by declare_kernel_vmas(). However, since commit 5973a62efa34 ("arm64: map [_text, _stext) virtual address range
non-executable+read-only"), the early mapper also maps [_text, _stext)
separately from [_stext, _etext). This adds one more early-only split
and can require one more page-table page than the existing
EARLY_SEGMENT_EXTRA_PAGES allowance reserves.
Increase the 4K-page early mapping allowance by one page to cover that
additional split.
Fixes: 5973a62efa34 ("arm64: map [_text, _stext) virtual address range non-executable+read-only") Assisted-by: TRAE:GLM-5.1 Suggested-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
[catalin.marinas@arm.com: rewrote part of the commit log]
[catalin.marinas@arm.com: expanded the code comment] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Leo Yan [Wed, 29 Apr 2026 14:30:10 +0000 (15:30 +0100)]
kselftest/arm64: Include <asm/ptrace.h> for user_gcs definition
kselftest includes kernel uAPI headers with option:
-isystem $(top_srcdir)/usr/include
Include <asm/ptrace.h> in libc-gcs.c for the definition of struct
user_gcs from the uAPI headers, and remove the redundant definition in
gcs-util.h. This fixes a compilation error on systems where the
toolchain defines NT_ARM_GCS.
Fixes: a505a52b4e29 ("kselftest/arm64: Add a GCS test program built with the system libc") Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
ALSA: hda/realtek: Add codec SSID quirk for Lenovo Yoga Pro 9 16IMH9
The Yoga Pro 9 16IMH9 (codec SSID 17aa:38d6) shares PCI audio device
subsystem ID 17aa:3811 with the Legion S7 15IMH05. The existing
SND_PCI_QUIRK entry for the Legion routes both machines to
ALC287_FIXUP_LEGION_15IMHG05_SPEAKERS, which does not bind the TAS2781
smart amplifiers, resulting in near-silent built-in speakers.
Add an HDA_CODEC_QUIRK entry immediately before the conflicting PCI quirk
that matches the Yoga Pro 9's unique codec SSID and routes it to
ALC287_FIXUP_TAS2781_I2C. Codec quirks are evaluated after PCI quirks and
take precedence, leaving the Legion S7 15IMH05 entry unaffected.
This follows the same pattern used to disambiguate PCI SSID 17aa:3847
(shared between Yoga Pro 7 14IMH9 and Legion 7 16ACHG6), where a
HDA_CODEC_QUIRK for codec SSID 17aa:38cf resolves the conflict.
Jens Axboe [Fri, 1 May 2026 11:23:12 +0000 (19:23 +0800)]
ublk: don't issue uring_cmd from fallback task work
When ublk_ch_uring_cmd_cb() runs as fallback task work (e.g., because
the submitting task is exiting), the command should not be issued as
current is a kworker, not the daemon task. This can cause io->task
to capture the wrong task in __ublk_fetch(), leading to a task
mismatch warning in ublk_uring_cmd_cancel_fn().
Check tw.cancel and return -ECANCELED instead of issuing the command
from fallback context.
netfilter: flowtable: use skb_pull_rcsum() to pop vlan/pppoe header
This adjusts the checksum, if required, after pulling the layer 2
header, either the pppoe header or the inner vlan header in the
double-tagged vlan packets.
Add the VID/PIDs for the ASUS ROG RAIKIRI II controller to xpad_device
and the VID to xpad_table. The controller has a physical PC/XBOX toggle
which switches between XBOX360 and XBOXONE protocols.
Dave Airlie [Fri, 1 May 2026 02:49:22 +0000 (12:49 +1000)]
Merge tag 'drm-xe-fixes-2026-04-30' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
API Fixes:
- Add missing pad and extensions check (Jonathan)
- Reject unsafe PAT indices for CPU cached memory (Jia)
Driver Fixes:
- Drop registration of guc_submit_wedged_fini from xe_guc_submit_wedge (Brost)
- Xe3p tuning and workaround fixes (Roper, Gustavo)
- USE drm mm instead of drm SA for CCS read/write (Satya)
- Fix leaks and null derefs (Shuicheng)
- Fix Wa_18022495364 (Tvrtko)
Michael Neuling [Thu, 9 Apr 2026 09:11:39 +0000 (09:11 +0000)]
riscv: errata: Fix bitwise vs logical AND in MIPS errata patching
The condition checking whether a specific errata needs patching uses
logical AND (&&) instead of bitwise AND (&). Since logical AND only
checks that both operands are non-zero, this causes all errata patches
to be applied whenever any single errata is detected, rather than only
applying the matching one.
The SiFive errata implementation correctly uses bitwise AND for the same
check.
Fixes: 0b0ca959d206 ("riscv: errata: Fix the PAUSE Opcode for MIPS P8700") Signed-off-by: Michael Neuling <mikey@neuling.org> Assisted-by: Cursor:claude-4.6-opus-high-thinking Link: https://patch.msgid.link/20260409091143.1348853-2-mikey@neuling.org
[pjw@kernel.org: fixed checkpatch warning] Signed-off-by: Paul Walmsley <pjw@kernel.org>
This series tightens Marvell OcteonTX2 AF NPC support for CN20K silicon
around MCAM key typing, optional debugfs setup, defrag allocation rollback,
defrag entry relocation bookkeeping, logical MCAM clear and programming,
default-rule index handling with explicit teardown, and NIXLF reserved-slot
lookup when default rules are missing.
Patches 1 through 3 focus on AF error handling: propagate
npc_mcam_idx_2_key_type() failures through cn20k MCAM enable, config, copy,
and read paths; treat cn20k NPC debugfs nodes as optional so probe does not
fail when debugfs is unavailable; and fix defrag MCAM allocation rollback
so allocation errno is not overwritten during subbank index resolution.
Patch 4 fixes npc_defrag_move_vdx_to_free(): when an MCAM line is moved to
a new physical index, move entry2target_pffunc[] association to the new
slot, clear the old slot, and retarget the matching mcam_rules entry so
software state matches hardware after defrag.
Patches 5 through 7 refine cn20k MCAM programming: clear entries using the
logical MCAM index and resolved key width, fix bank/CFG sequencing in
npc_cn20k_config_mcam_entry(), and read action metadata from the correct
bank in npc_cn20k_read_mcam_entry().
Patches 8 through 10 complete default-rule lifecycle handling: initialize
default-rule index outputs eagerly, tear down reserved default MCAM rules
explicitly (coordinated with npc_mcam_free_all_entries()), and reject
USHRT_MAX sentinel indices from npc_get_nixlf_mcam_index() on cn20k.
====================
octeontx2-af: npc: cn20k: Reject missing default-rule MCAM indices
When cn20k default L2 rules are not installed,
npc_cn20k_dft_rules_idx_get() leaves broadcast, multicast, promiscuous, and
unicast slots at USHRT_MAX. npc_get_nixlf_mcam_index() previously returned
that sentinel as a valid MCAM index, so callers could program hardware with
an invalid index.
Return -EINVAL from the cn20k branches of npc_get_nixlf_mcam_index() when
the requested slot is still USHRT_MAX. Harden cn20k NPC MCAM entry helpers
to reject out-of-range indices before touching hardware.
Drop the early bounds check in npc_enable_mcam_entry() for cn20k so invalid
indices are validated inside npc_cn20k_enable_mcam_entry() instead of being
silently ignored.
In rvu_npc_update_flowkey_alg_idx(), treat negative MCAM indices like
out-of-range values, and only update RSS actions for promiscuous and
all-multi paths when the resolved index is non-negative.
octeontx2-af: npc: cn20k: Tear down default MCAM rules explicitly on free
npc_cn20k_dft_rules_free() used the NPC MCAM mbox "free all" path, which
does not match how cn20k tracks default-rule MCAM slots indexes.
Resolve the default-rule indices, then for each valid slot clear the bitmap
entry, drop the PF/VF map, disable the MCAM line, clear the target
function, and npc_cn20k_idx_free(). Remove any matching software mcam_rules
nodes. On hard failure from idx_free, WARN and stop so the box stays up for
analysis.
In npc_mcam_free_all_entries(), prefetch the same default-rule indices and,
on cn20k, skip bitmap clear and idx_free when the scanned entry is one of
those reserved defaults (they are released by npc_cn20k_dft_rules_free).
octeontx2-af: npc: cn20k: Initialize default-rule index outputs up front
npc_cn20k_dft_rules_idx_get() wrote USHRT_MAX into individual outputs only
on some error paths (lbk promisc lookup, VF ucast lookup, and the PF rule
walk), which could leave other caller slots stale across retries.
Set every non-NULL bcast/mcast/promisc/ucast pointer to USHRT_MAX once at
entry, then drop the duplicate assignments on failure. Successful lookups
still overwrite the relevant slot before returning.
npc_cn20k_read_mcam_entry() always reloaded action and vtag_action from
bank 0 after programming the CAM words. Use the bank returned by
npc_get_bank() for the ACTION reads as well, and read those registers once
up front so both X2 and X4 paths share the same metadata.
Return directly from the X2 keyword path now that the action fields are
already populated.
Cc: Suman Ghosh <sumang@marvell.com> Fixes: 6d1e70282f76 ("octeontx2-af: npc: cn20k: Use common APIs") Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Link: https://patch.msgid.link/20260429022722.1110289-8-rkannoth@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
For X4 keys its loop reused the bank parameter as the loop counter, so bank
no longer reflected the caller's bank after the loop and the control flow
was hard to follow.
Program NPC_AF_CN20K_MCAMEX_BANKX_CFG_EXT directly in
npc_cn20k_config_mcam_entry(): one CFG write for X2 using the computed
bank, and one CFG write per bank inside the X4 action loop. Enable the
entry at the end with npc_cn20k_enable_mcam_entry(..., true) instead of
embedding the enable bit in bank_cfg via the removed helper.
Cc: Suman Ghosh <sumang@marvell.com> Fixes: 4e527f1e5c15 ("octeontx2-af: npc: cn20k: Add new mailboxes for CN20K silicon") Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Link: https://patch.msgid.link/20260429022722.1110289-7-rkannoth@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
octeontx2-af: npc: cn20k: Clear MCAM entries by index and key width
Replace the old four-argument CN20K MCAM clear with a per-bank static
helper and npc_cn20k_clear_mcam_entry() that takes a logical MCAM index,
resolves the key width via npc_mcam_idx_2_key_type(), and clears either one
bank (X2) or every bank (X4).
Call it from npc_clear_mcam_entry() on cn20k and log when key-type lookup
fails. Use the per-bank helper from npc_cn20k_config_mcam_entry() for
pre-program clears.
For loopback VFs, use the promisc MCAM index as ucast_idx when copying RSS
action for promisc, matching cn20k default-rule layout.
Cc: Suman Ghosh <sumang@marvell.com> Fixes: 6d1e70282f76 ("octeontx2-af: npc: cn20k: Use common APIs") Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Link: https://patch.msgid.link/20260429022722.1110289-6-rkannoth@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
npc_defrag_move_vdx_to_free() disables, copies, and enables the MCAM entry
at a new index but previously left entry2target_pffunc[] and the mcam_rules
list still keyed to the old index. Copy the target PF association to the
new slot, clear the old one, and retarget the rule entry so software state
matches the relocated hardware context.
octeontx2-af: npc: cn20k: Propagate errors in defrag MCAM alloc rollback
npc_defrag_alloc_free_slots() allocates MCAM indexes in up to two passes on
bank0 then bank1. On failure it rolls back by freeing entries already
placed in save[].
__npc_subbank_alloc() can return a negative errno while only part of the
indexes are valid. The rollback loop used rc for
npc_mcam_idx_2_subbank_idx() as well, so a successful lookup stored zero in
rc and a later __npc_subbank_free() failure could still end with return 0
when the allocation path had also left rc at zero (for example shortfall
after zero return values from the alloc helpers).
Jump to the rollback path immediately when either __npc_subbank_alloc()
call fails, preserving its errno. If both calls succeed but the total
allocated count is still less than cnt, set rc to -ENOSPC before rollback.
Use a separate err variable for npc_mcam_idx_2_subbank_idx() so a
successful lookup no longer clears a non-zero rc from the allocation phase.
octeontx2-af: npc: cn20k: Drop debugfs_create_file() error checks in init
debugfs is not intended to be checked for allocation failures the way other
kernel APIs are: callers should not fail probe or subsystem init because a
debugfs node could not be created, including when debugfs is disabled in
Kconfig. Replacing NULL checks with IS_ERR() checks is similarly wrong for
optional debugfs.
Remove dentry checks and -EFAULT returns from npc_cn20k_debugfs_init().
See:
https://staticthinking.wordpress.com/2023/07/24/
debugfs-functions-are-not-supposed-to-be-checked/
octeontx2-af: npc: cn20k: Propagate MCAM key-type errors on cn20k
npc_mcam_idx_2_key_type() can fail; callers used to ignore it and still
used kw_type when enabling, configuring, copying, and reading MCAM entries.
That could program or decode hardware with an undefined key type.
Return -EINVAL when key-type lookup fails. Return -EINVAL from
npc_cn20k_copy_mcam_entry() when src and dest key types differ instead of
failing silently.
Change npc_cn20k_{enable,config,copy,read}_mcam_entry() to return int on
success or error. Thread those errors through the cn20k MCAM write and read
mbox handlers, the cn20k baseline steer read path, NPC defrag move
(disable/copy/enable with dev_err and -EFAULT), and the DMAC update path in
rvu_npc_fs.c.
Make npc_copy_mcam_entry() return int so the cn20k branch can return
npc_cn20k_copy_mcam_entry() without a void/int mismatch, and fail
NPC_MCAM_SHIFT_ENTRY when copy fails.
Lorenzo Bianconi [Wed, 29 Apr 2026 12:02:31 +0000 (14:02 +0200)]
net: airoha: Move entries to queue head in case of DMA mapping failure in airoha_dev_xmit()
In order to respect the original descriptor order and avoid any
potential IOMMU fault or memory corruption, move pending queue entries
to the head of hw queue tx_list if the DMA mapping of current inflight
packet fails in airoha_dev_xmit routine.
Currently, request_threaded_irq() is used with a primary handler but a
NULL threaded handler, while also setting the IRQF_ONESHOT flag. This
specific combination triggers a WARNING since the commit aef30c8d569c
("genirq: Warn about using IRQF_ONESHOT without a threaded handler").
WARNING: kernel/irq/manage.c:1502 at __setup_irq+0x4fa/0x760
Fix the issue by switching to request_irq(), which is the appropriate
interface or a non-threaded interrupt handler, and removing the
unnecessary IRQF_ONESHOT flag.
Register WX_CFG_PORT_ST is a PF restricted register. When a VF is
initialized, attempting to read this register triggers an illegal
register access, which lead to a system hang.
When the device is VF, the bus function ID can be obtained directly from
the PCI_FUNC(pdev->devfn).
Linus Torvalds [Fri, 1 May 2026 00:36:48 +0000 (17:36 -0700)]
Merge tag 'mtd/fixes-for-7.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux
Pull mtd fixes from Miquel Raynal:
"Besides an out-of-bound bug, this is about properly supporting Winbond
octal SPI NAND chips which use a specific pattern for stuffing more
address bits in some operations. This uses the spi-mem flag in SPI
NAND that was added to the spi-mem layer just before the merge window
through the spi tree"
* tag 'mtd/fixes-for-7.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
mtd: spinand: winbond: Fix ODTR write VCR on W35NxxJW
mtd: spinand: winbond: Set the packed page read flag to W35N02/04JW
mtd: spinand: Add support for packed read data ODTR commands
mtd: spi-nor: debugfs: fix out-of-bounds read in spi_nor_params_show()
net: enetc: fix VSI mailbox timeout handling and DMA lifecycle
In the current VSI mailbox implementation, the VSI allocates a DMA buffer
to store the message sent to the PSI. When the PSI receives the message
request from the VSI, the hardware copies the message data from this DMA
buffer to PSI's DMA buffer for processing.
When enetc_msg_vsi_send() times out, two scenarios can occur:
1) Use-after-free: If the hardware hasn't completed message copying when
the VSI frees the buffer, the hardware may subsequently copy the data
from freed memory to PSI's DMA buffer.
2) Message race: If PSI hasn't processed the previous message when the
next message is sent, the VSI may receive the previous message's
reply, leading to incorrect handling.
To address these issues, implement the following changes:
- Check the mailbox busy status before sending a new message. If the
mailbox is in busy state, it indicates the previous message is still
being processed, so return an error immediately.
- Add the 'msg' field to struct enetc_si to preserve the DMA buffer
information. The caller of enetc_msg_vsi_send() no longer frees the
DMA buffer. Instead, defer freeing until it is safe to do so (when
mailbox is not busy on next send).
- Add cleanup in enetc_vf_remove() to free the last message buffer.
This ensures the DMA buffer remains valid during message copying and
prevents message reply mismatches.
Daniel Borkmann [Wed, 29 Apr 2026 15:46:48 +0000 (17:46 +0200)]
ipv6: Implement limits on extension header parsing
ipv6_{skip_exthdr,find_hdr}() and ip6_{tnl_parse_tlv_enc_lim,
protocol_deliver_rcu}() iterate over IPv6 extension headers until they
find a non-extension-header protocol or run out of packet data. The
loops have no iteration counter, relying solely on the packet length
to bound them. For a crafted packet with 8-byte extension headers
filling a 64KB jumbogram, this means a worst case of up to ~8k
iterations with a skb_header_pointer call each. ipv6_skip_exthdr(),
for example, is used where it parses the inner quoted packet inside
an incoming ICMPv6 error:
- icmpv6_rcv
- checksum validation
- case ICMPV6_DEST_UNREACH
- icmpv6_notify
- pskb_may_pull() <- pull inner IPv6 header
- ipv6_skip_exthdr() <- iterates here
- pskb_may_pull()
- ipprot->err_handler() <- sk lookup
The per-iteration cost of ipv6_skip_exthdr itself is generally
light, but skb_header_pointer becomes more costly on reassembled
packets: the first ~1232 bytes of the inner packet are in the skb's
linear area, but the remaining ~63KB are in the frag_list where
skb_copy_bits is needed to read data.
Initially, the idea was to add a configurable limit via a new
sysctl knob with default 8, in line with knobs from commit 47d3d7ac656a ("ipv6: Implement limits on Hop-by-Hop and Destination
options"), but two reasons eventually argued against it:
- It adds to UAPI that needs to be maintained forever, and
upcoming work is restricting extension header ordering anyway,
leaving little reason for another sysctl knob
- exthdrs_core.c is always built-in even when CONFIG_IPV6=n,
where struct net has no .ipv6 member, so the read site would
need an ifdef'd fallback to a constant anyway
Therefore, just use a constant (IP6_MAX_EXT_HDRS_CNT). All four
extension header walking functions are now bound by this limit.
Note that the check in ip6_protocol_deliver_rcu() happens right
before the goto resubmit, such that we don't have to have a test
for ipv6_ext_hdr() in the fast-path.
There's an ongoing IETF draft-iurman-6man-eh-occurrences to enforce
IPv6 extension headers ordering and occurrence. The latter also
discusses security implications. As per RFC8200 section 4.1, the
occurrence rules for extension headers provide a practical upper
bound which is 8. In order to be conservative, let's define
IP6_MAX_EXT_HDRS_CNT as 12 to leave enough room for quirky setups.
In the unlikely event that this is still not enough, then we might
need to reconsider a sysctl.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Justin Iurman <justin.iurman@gmail.com> Link: https://patch.msgid.link/20260429154648.809751-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linus Torvalds [Fri, 1 May 2026 00:20:45 +0000 (17:20 -0700)]
Merge tag 'acpi-7.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI support fixes from Rafael Wysocki:
"These fix leftover issues in the ACPI Time and Alarm Device (TAD)
driver on top of the recently merged updates of it and address
assorted issues in the ACPI support code:
- Fix removal code ordering in the ACPI TAD driver, refine timer
value computations and checks in its RTC class device interface,
make it use the __ATTRIBUTE_GROUPS() macro, and fix a comment in it
(Rafael Wysocki)
- Fix EINJV2 memory error injection in APEI (Tony Luck)
- Fix related_cpus inconsistency during CPU hotplug in the ACPI CPPC
library (Jinjie Ruan)
- Add a quirk to force native backlight on HP OMEN 16 (8A44) in the
ACPI video bus driver (Shivam Kalra)"
* tag 'acpi-7.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: bus: add missing forward declaration to acpi_bus.h
ACPI: video: force native backlight on HP OMEN 16 (8A44)
ACPI: TAD: Fix up a comment in acpi_tad_probe()
ACPI: TAD: RTC: Refine timer value computations and checks
ACPI: TAD: Use devres for all driver cleanup
ACPI: TAD: Use __ATTRIBUTE_GROUPS() macro
ACPI: CPPC: Fix related_cpus inconsistency during CPU hotplug
ACPI: APEI: EINJ: Fix EINJV2 memory error injection
ACPICA: Provide #defines for EINJV2 error types
Linus Torvalds [Fri, 1 May 2026 00:07:21 +0000 (17:07 -0700)]
Merge tag 'v7.1-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
- multichannel crediting fix
- memory allocation improvement for smb2_compound_op
- remove some dead code
* tag 'v7.1-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: change_conf needs to be called for session setup
smb: client: change allocation requirements in smb2_compound_op
smb/client: remove unused smb3_parse_opt()
Robert Marko [Tue, 28 Apr 2026 13:41:01 +0000 (15:41 +0200)]
net: phy: micrel: fix LAN8814 QSGMII soft reset
LAN8814 QSGMII soft reset was moved into the probe function to avoid
triggering it for each of 4 PHY-s in the package.
However, that broke QSGMII link between the MAC and PHY on most LAN8814
PHY-s, specificaly for us on the Microchip LAN969x switch.
Reading the QSGMII status registers it was visible that lanes were only
partially synced.
It looks like the reset timing is crucial, so lets move the reset back
into the .config_init function but guard it with phy_package_init_once()
to avoid it being triggered on each of 4 PHY-s in the package.
Change the probe function to use phy_package_probe_once() for coma and PtP
setup.
netfilter: flowtable: fix inline pppoe encapsulation in xmit path
Address two issues in the inline pppoe encapsulation:
- Add needs_gso_segment flag to segment PPPoE packets in software
given that there is no GSO support for this.
- Use FLOW_OFFLOAD_XMIT_DIRECT since neighbour cache is not available
in point-to-point device, use the hardware address that is obtained
via flowtable path discovery (ie. fill_forward_path).
Fixes: 18d27bed0880 ("netfilter: flowtable: inline pppoe encapsulation in xmit path") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
netfilter: flowtable: fix inline vlan encapsulation in xmit path
Several issues in the inline vlan support:
- The layer 2 encapsulation representation in the tuple takes encap[0] as
the outer header and encap[1] as the inner header as seen from the ingress
path. Reverse the encap loop to push first the inner then the outer vlan
header.
- Postpone pushing the layer 2 header once destination device is known.
This allows to calculate the needed hearoom via LL_RESERVED_SPACE to
accommodate the layer 2 headers.
- Add and use nf_flow_vlan_push() as suggested by Eric Woudstra, this
is a simplified version of skb_vlan_push() for egress path only.
Fixes: c653d5a78f34 ("netfilter: flowtable: inline vlan encapsulation in xmit path") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Despite claiming to add GA100 support, that commit actually has quite
a few problems. It falsely claims that there is no VBIOS. GA100 does
have a VBIOS, but it has no display engine, so it cannot use the
PRAMIN method the read VBIOS and must fall back to using PROM.
For whatever reason, the VBIOS on GA100 has an "Init-from-ROM"
(IFR) header where the PCI Expansion ROM would normally be found.
So to find that ROM, Nouveau needs to parse the IFR header.
The commit also falsely claimed that there is no graphics (GR) engine.
So rather than try to fix that commit, just revert it and start over
from scratch.
Gui-Dong Han [Thu, 16 Apr 2026 13:57:03 +0000 (21:57 +0800)]
hwmon: (lm63) Add locking to avoid TOCTOU
The functions show_fan(), show_pwm1(), show_temp11(),
temp2_crit_hyst_show(), and show_lut_temp_hyst() access shared cached
data without holding the update lock. This can cause TOCTOU races if
the cached values change between the checks and the later calculations.
Those cached values are updated in lm63_update_device(). In the general
case, the affected functions combine multiple cached values without
locking and can therefore observe a mixed old/new snapshot. In
addition, show_fan() reads data->fan[nr] locklessly while
lm63_update_device() updates data->fan[0] in two steps, which can
expose an intermediate torn value and potentially trigger a
divide-by-zero error. This means that converting the macro to a
function is not sufficient to fix show_fan().
Hold the update lock across the whole read and calculation sequence so
that the values remain stable.
Check the other functions in the driver as well. Keep them unchanged
because they either do not access shared cached values multiple times
or already do so under lock.
Miguel Ojeda [Sun, 26 Apr 2026 14:42:01 +0000 (16:42 +0200)]
rust: allow `clippy::collapsible_if` globally
Similar to `clippy::collapsible_match` (globally allowed in the previous
commit), the `clippy::collapsible_if` lint [1] can make code harder to
read in certain cases.
Miguel Ojeda [Sun, 26 Apr 2026 14:42:00 +0000 (16:42 +0200)]
rust: allow `clippy::collapsible_match` globally
The `clippy::collapsible_match` lint [1] can make code harder to read
in certain cases [2], e.g.
CLIPPY P rust/libmacros.so - due to command line change
warning: this `if` can be collapsed into the outer `match`
--> rust/pin-init/internal/src/helpers.rs:91:17
|
91 | / if nesting == 1 {
92 | | impl_generics.push(tt.clone());
93 | | impl_generics.push(tt);
94 | | skip_until_comma = false;
95 | | }
| |_________________^
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#collapsible_match
= note: `-W clippy::collapsible-match` implied by `-W clippy::all`
= help: to override `-W clippy::all` add `#[allow(clippy::collapsible_match)]`
help: collapse nested if block
|
90 ~ TokenTree::Punct(p) if skip_until_comma && p.as_char() == ','
91 ~ && nesting == 1 => {
92 | impl_generics.push(tt.clone());
93 | impl_generics.push(tt);
94 | skip_until_comma = false;
95 ~ }
|
The lint does not have much upside -- when the suggestion may be a good
one, it would still read fine when nested anyway. And it is the kind of
lint that may easily bias people to just apply the suggestion instead
of allowing it.
[ In addition, as Gary points out [3], the suggestion is also wrong [4] and
in the process of being fixed [5], possibly for Rust 1.97.0:
When a field has been initialized, `init!`/`pin_init!` create a reference
or pinned reference to the field so it can be accessed later during the
initialization of other fields. However, the reference it created is
incorrectly `&'static` rather than just the scope of the initializer.
This is caused by `&mut (*#slot).#ident`, which actually allows arbitrary
lifetime, so this is effectively `'static`. Somewhat ironically, the safety
justification of creating the accessor is.. "SAFETY: TODO".
Fix it by adding `let_binding` method on `DropGuard` to shorten lifetime.
This results in exactly what we want for these accessors. The safety and
invariant comments of `DropGuard` have been reworked; instead of reasoning
about what caller can do with the guard, express it in a way that the
ownership is transferred to the guard and `forget` takes it back, so the
unsafe operations within the `DropGuard` can be more easily justified.
Fixes: 42415d163e5d ("rust: pin-init: add references to previously initialized fields") Cc: stable@vger.kernel.org Signed-off-by: Gary Guo <gary@garyguo.net> Link: https://patch.msgid.link/20260427-pin-init-fix-v3-2-496a699674dd@garyguo.net
[ Reworded for missing word. - Miguel ] Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Gary Guo [Mon, 27 Apr 2026 15:43:00 +0000 (16:43 +0100)]
rust: pin-init: internal: move alignment check to `make_field_check`
Instead of having the reference creation serving dual-purpose as both for
let bindings and alignment check, detangle them so that the alignment check
is done explicitly in `make_field_check`. This is more robust against
refactors that may change the way let bindings are created.
David Gow [Sat, 25 Apr 2026 03:41:23 +0000 (11:41 +0800)]
rust: arch: um: Fix building 32-bit UML with GCC
32-bit UML builds can be configured either by setting CONFIG_64BIT=n or
with SUBARCH=i386. Both work with Rust-for-Linux when clang is the
compiler, but when SUBARCH=i386, we don't set a bindgen target correctly if
gcc is the compiler.
Add the appropriate bindgen target configuration for i386, as is done in
Makefile.clang.
[ For reference, the errors look like:
BINDGEN rust/bindings/bindings_generated.rs
error: unsupported option '-mno-sse' for target ''
...
error: unknown target triple 'unknown'
panicked at .../bindgen-0.72.1/ir/context.rs:562:15:
libclang error; possible causes include:
...
- Miguel ]
Fixes: ab0f4cedc355 ("arch: um: rust: Add i386 support for Rust") Signed-off-by: David Gow <david@davidgow.net> Link: https://patch.msgid.link/20260425034125.53866-1-david@davidgow.net
[ Added space in title. - Miguel ] Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
====================
net: mctp: test: minor kunit test fixes
This series provides two fixes in the MCTP kunit tests - one exposed by
ktr, and one found while debugging the former on different VM configs.
====================
Jeremy Kerr [Wed, 29 Apr 2026 08:21:42 +0000 (16:21 +0800)]
net: mctp: test: Use dev_direct_xmit for TX to our test device
In our test cases, we typically feed a packet sequence into the routing
code, then inspect the device's TXed skbs to assert specific behaviours.
Using dev_queue_xmit() for our TX path introduces a fair bit of
complexity between the test packet sequence and the test device's
ndo_start_xmit callback; which may mean that the skbs have not hit the
device at the point we're inspecting the TXed skb list.
Use dev_direct_xmit instead, as we want a direct a path as possible
here, and the test dev does not need any queueing, scheduling or flow
control.
Fixes: 6ab578739a4c ("net: mctp: test: move TX packetqueue from dst to dev") Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202604281320.525eee17-lkp@intel.com Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260429-dev-mctp-test-fixes-v1-2-1127b7425809@codeconstruct.com.au Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hui Wang [Thu, 30 Apr 2026 04:53:50 +0000 (12:53 +0800)]
riscv: cpufeature: Drop this_hwcap clear in T-Head vector workaround
The variable this_hwcap is initialized to 0 for each loop, it is not
necessary to do the bit clearance since this_hwcap is still 0 at this
point, clearing the source_isa is enough here.
Myeonghun Pak [Fri, 24 Apr 2026 13:50:51 +0000 (22:50 +0900)]
hwmon: (corsair-psu) Close HID device on probe errors
corsairpsu_probe() opens the HID device before sending the device init
and firmware-info commands. If either command fails, the error path jumps
directly to fail_and_stop and skips hid_hw_close().
Use the existing fail_and_close label for those post-open failures so the
open count and low-level close callback are balanced before hid_hw_stop().
hwmon: Remove stale CONFIG_SENSORS_SBRMI Makefile reference
kconfiglint reports:
X001: CONFIG_SENSORS_SBRMI referenced in Makefile but not defined
in any Kconfig
The SB-RMI hardware monitoring driver was originally introduced in
commit 5a0f50d110b3 ("hwmon: Add support for SB-RMI power module") with
both a Kconfig entry (CONFIG_SENSORS_SBRMI) and a Makefile line
(obj-$(CONFIG_SENSORS_SBRMI) += sbrmi.o) in drivers/hwmon/.
Commit e156586764050 ("hwmon/misc: amd-sbi: Move core sbrmi from hwmon to
misc")
moved the driver to drivers/misc/amd-sbi/ to support additional
functionality beyond hardware monitoring. That commit correctly removed the
Kconfig entry from drivers/hwmon/Kconfig, moved the source file
drivers/hwmon/sbrmi.c to drivers/misc/amd-sbi/sbrmi.c, and created new
Kconfig/Makefile entries in drivers/misc/amd-sbi/ with a renamed symbol
(CONFIG_AMD_SBRMI_I2C).
However, the Makefile line in drivers/hwmon/Makefile was not removed in
that commit. The orphaned line references a CONFIG symbol that no longer
exists and a source file that is no longer present, so it has no effect
on the build — but it is dead code that should be cleaned up.
hwmon: (ltc2992) Fix u32 overflow in power read path
ltc2992_get_power() computes the divisor for mul_u64_u32_div() as
r_sense_uohm * 1000. This multiplication overflows u32 when
r_sense_uohm exceeds about 4.29 ohms (4294967 micro-ohms), producing
a truncated divisor and an incorrect power reading.
Cancel the factor of 1000 from both the numerator
(VADC_UV_LSB * IADC_NANOV_LSB = 312500000) and the divisor
(r_sense_uohm * 1000), giving (VADC_UV_LSB / 1000) * IADC_NANOV_LSB
= 312500 as the numerator and plain r_sense_uohm as the divisor.
The cancellation is exact because LTC2992_VADC_UV_LSB (25000) is
divisible by 1000.
This is the read-path counterpart of the write-path fix applied in
the preceding patch.
hwmon: (ltc2992) Clamp threshold writes to hardware range
ltc2992_set_voltage(), ltc2992_set_current(), and ltc2992_set_power()
do not validate the user-supplied value before converting it to a
register value. This can result in:
1. Negative input values wrapping to large positive register values.
For power, the negative long is implicitly cast to u64 in
mul_u64_u32_div(), producing an incorrect value. For voltage and
current, the negative converted value wraps when passed to
ltc2992_write_reg() as a u32.
2. Intermediate arithmetic exceeding the range representable in u64 on
64-bit platforms. In ltc2992_set_voltage(), (u64)val * 1000 can
exceed U64_MAX when val is a large positive long. In
ltc2992_set_current(), (u64)val * r_sense_uohm can overflow
similarly. In ltc2992_set_power(), the computed value may not fit
in u64.
3. Register values exceeding the hardware field width. Voltage and
current threshold registers are 12-bit (stored left-justified in
16 bits), and power threshold registers are 24-bit. Without
clamping, bits above the field width are truncated in
ltc2992_write_reg().
Fix by clamping negative values to zero, clamping positive values to
the rounded hardware-representable maximum (the value returned by the
read path for a full-scale register) to prevent intermediate overflow,
and clamping the converted register value to the hardware field width
before writing. The existing conversion formula and rounding behavior
are preserved.
In the power write path, cancel the factor of 1000 from both the
numerator (r_sense_uohm * 1000) and the denominator
(VADC_UV_LSB * IADC_NANOV_LSB) to also eliminate a u32 overflow of
r_sense_uohm * 1000 when r_sense_uohm exceeds about 4.29 ohms.
Thomas Richter [Mon, 27 Apr 2026 05:17:19 +0000 (07:17 +0200)]
s390/pai: Disable duplicate read of kernel PAI counter value
The PAI crypto counter design allows for user space and kernel space
PAI counter increment recording. This is achieved by splitting the
recording page in half. The upper part of the 4KB page records user
space increments of PAI crypto counter and the lower half records
kernel space increments. The page itself looks like:
lowcore ptr ---> ++++++++++++++++++++++++
|user space area |
+----------------------+
|kernel space area |
++++++++++++++++++++++++
User space and kernel space entries are handled via a kernel_offset
value when wrting. For PAI crypto counters this offset is 2048 or
half of a page size.
For PAI NNPA counter design this distinction was not needed. There is
no user and kernel space part for the page pointed to by lowcore.
The set up is:
There is always only one counter value recorded and saved.
Depending on number of CPUs and machine load, the number of PAI NNPA
counter increment differs between counting (perf stat) and recording
(perf record). The number reported by sampling was double the number
shown by counting.
This was caused by a double read of the PAI NNPA values in function
pai_copy(). The first part of that function reads the kernel space part.
The offset into the kernel page part must be larger than zero.
The second part of that function reads the user space part, which
begins of offset zero. This works fine for PAI crypto counters.
It fails for PAI NNPA counters because the PMU device driver does
not support that feature and has a kernel_offset value of 0x0.
Executing both user and kernel space read out might end up reading
user space value twice.
For the PAI NNPA PMU prohibit the kernel space part read out.
Cc: stable@vger.kernel.org Fixes: f12473541356 ("s390/pai_crypto: Rename paicrypt_copy() to pai_copy()") Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
netfilter: flowtable: ensure sufficient headroom in xmit path
Check for headroom and call skb_expand_head() like in the IP output
path to ensure there is sufficient headroom for the mac header when
forwarding this packet as suggested by sashiko.
netfilter: xtables: fix L4 header parsing for non-first fragments
Multiple targets and matches relies on L4 header to operate. For
fragmented packets, every fragment carries the transport protocol
identifier, but only the first fragment contains the L4 header.
As the 'raw' table can be configured to run at priority -450 (before
defragmentation at -400), the target/match can be reached before
reassembly. In this case, non-first fragments have their payload
incorrectly parsed as a TCP/UDP header. This would be of course a
misconfiguration scenario. In most of the cases this just lead to a
unreliable behavior for fragmented traffic.
Add a fragment check to ensure target/match only evaluates unfragmented
packets or the first fragment in the stream.
Fixes: 902d6a4c2a4f ("netfilter: nf_defrag: Skip defrag if NOTRACK is set") Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
netfilter: nf_tables: skip L4 header parsing for non-first fragments
The tproxy, osf and exthdr (SCTP) expressions rely on the presence of
transport layer headers to perform socket lookups, fingerprint matching,
or chunk extraction. For fragmented packets, while the IP protocol
remains constant across all fragments, only the first fragment contains
the actual L4 header.
The expressions could be attached to a chain with a priority lower than
-400, bypassing defragmentation. Or could be used in stateless
environments where defragmentation is not happening at all. This could
result in garbage data being used for the matching.
Add a check for pkt->fragoff so only unfragmented packets or the first
fragment is processed.
Fixes: 133dc203d77d ("netfilter: nft_exthdr: Support SCTP chunks") Fixes: 4ed8eb6570a4 ("netfilter: nf_tables: Add native tproxy support") Fixes: b96af92d6eaf ("netfilter: nf_tables: implement Passive OS fingerprint module in nft_osf") Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
netfilter: nf_socket: skip socket lookup for non-first fragments
Both nft_socket and xt_socket relies on L4 headers to perform socket
lookup in the slow path. For fragmented packets, while the IP protocol
remains constant across all fragments, only the first fragment contains
the actual L4 header.
As the expression/match could be attached to a chain with a priority
lower than -400, it could bypass defragmentation.
Add a check for fragmentation in the lookup functions directly so the
problem is handled for both nft_socket and xt_socket at the same time.
In addition, future users of the functions would not need to care about
this.
Fixes: 902d6a4c2a4f ("netfilter: nf_defrag: Skip defrag if NOTRACK is set") Fixes: 554ced0a6e29 ("netfilter: nf_tables: add support for native socket matching") Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
- eth:
- stmmac: prevent NULL deref when RX memory exhausted
- airoha: do not read uninitialized fragment address
- rtl8150: fix use-after-free in rtl8150_start_xmit()
Misc:
- add Ido Schimmel as IPv4/IPv6 maintainer
- add David Heidelberg as NFC subsystem maintainer"
* tag 'net-7.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (79 commits)
net/sched: cls_flower: revert unintended changes
sfc: fix error code in efx_devlink_info_running_versions()
net: tls: fix strparser anchor skb leak on offload RX setup failure
ice: add dpll peer notification for paired SMA and U.FL pins
ice: fix missing dpll notifications for SW pins
dpll: export __dpll_pin_change_ntf() for use under dpll_lock
ice: fix SMA and U.FL pin state changes affecting paired pin
ice: fix missing SMA pin initialization in DPLL subsystem
ice: fix infinite recursion in ice_cfg_tx_topo via ice_init_dev_hw
ice: fix NULL pointer dereference in ice_reset_all_vfs()
iavf: add VIRTCHNL_OP_ADD_VLAN to success completion handler
iavf: wait for PF confirmation before removing VLAN filters
iavf: stop removing VLAN filters from PF on interface down
iavf: rename IAVF_VLAN_IS_NEW to IAVF_VLAN_ADDING
page_pool: fix memory-provider leak in page_pool_create_percpu() error path
bonding: 3ad: implement proper RCU rules for port->aggregator
net: airoha: Do not return err in ndo_stop() callback
hv_sock: fix ARM64 support
MAINTAINERS: update the IPv4/IPv6 entry and add Ido Schimmel
selftests: drv-net: clarify linters and frameworks in README
...
Merge tag 'sound-7.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A bunch of small fixes. One minor fix is found in the core side for
data race in PCM OSS layer, while remaining changes are various
device-specific fixes and quirks.
- Core: PCM OSS data race fix
- HD-audio: Fixes for TAS2781, CS35L56, and Realtek/Conexant quirks;
avoidance of a WARN_ON for HDMI channel mapping
- USB-audio: Improvements in UAC3 parsing robustness (leaks, size
checks) and fixes for potential endless loops
- ASoC: Driver-specific fixes for CS35L56, Intel bytcr_wm5102,
Spacemit, AW88395, and others, plus a new quirk for Steam Deck
OLED
- Misc: A UAF fix in aloop driver, division by zero fix in ua101
driver and leak fixes in caiaq driver"
* tag 'sound-7.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (32 commits)
ALSA: hda/tas2781: Fix incorrect bit update for non-book-zero or book 0 pages >1
ALSA: hda: cs35l56: Fix uninitialized value in cs35l56_hda_read_acpi()
ALSA: hda/conexant: Fix missing error check for jack detection
ALSA: hda: Avoid WARN_ON() for HDMI chmap slot checks
ALSA: usb-audio: Fix quirk entry placement for PreSonus AudioBox USB
ASoC: spacemit: adjust FIFO trigger threshold to half FIFO size
ASoC: spacemit: move hw constraints from hw_params to startup
ASoC: codecs: ab8500: Fix casting of private data
ASoC: cs35l56: Fix illegal writes to OTP_MEM registers
ASoC: Intel: bytcr_wm5102: Fix MCLK leak on platform_clock_control error
ALSA: usb-audio: Avoid potential endless loop in convert_chmap_v3()
ALSA: usb-audio: Fix potential leak of pd at parsing UAC3 streams
ALSA: caiaq: Don't abort when no input device is available
ALSA: caiaq: Fix potentially leftover ep1_in_urb at error path
ASoC: aw88395: Fix kernel panic caused by invalid GPIO error pointer
ALSA: caiaq: fix usb_dev refcount leak on probe failure
sound: ua101: fix division by zero at probe
ALSA: usb-audio: apply quirk for Playstation PDP Riffmaster
ALSA: hda: Remove duplicate cmedia entries in codecs Makefile
ALSA: hda/realtek: Add micmute LED quirk for Acer Aspire A315-44P
...
ALSA: hda/realtek: Fix speaker silence after S3 resume on Xiaomi Mi Laptop Pro 15
The Xiaomi Mi Laptop Pro 15 (TM1905, subsystem 1d72:1905) ships with the
Realtek ALC256 codec on Intel Comet Lake PCH-LP. After S3 resume the
codec sets coefficient register 0x10 to 0x0220 instead of 0x0020 — bit 9
is erroneously set, which silences the internal speaker. Bluetooth and
HDMI audio are unaffected because they use different paths.
This is the same mechanism fixed for Clevo NJ51CU by commit edca7cc4b0ac
("ALSA: hda/realtek: Fix quirk for Clevo NJ51CU"), but the existing
ALC256_FIXUP_MIC_NO_PRESENCE_AND_RESUME also reconfigures pin 0x19 as a
front mic, which is wrong for this Xiaomi where pin 0x19 default is
0x411111f0 (disabled). Add a minimal fixup that only clears the stuck
coef bit, and add the Xiaomi SSID to the quirk table.
Verified by reading coef 0x10 with hda-verb after resume (returns
0x0220), writing 0x0020, and confirming the internal speaker resumes
output. With this fixup applied the bit is cleared on every codec init,
including post-resume.
mm: memcontrol: fix rcu unbalance in get_non_dying_memcg_end()
Currently, get_non_dying_memcg_start() and get_non_dying_memcg_end() both
evaluate cgroup_subsys_on_dfl(memory_cgrp_subsys) independently to
determine whether to acquire or release the RCU read lock.
However, the result of cgroup_subsys_on_dfl() can change dynamically at
runtime due to cgroup hierarchy rebinding (e.g., when the memory
controller is moved between cgroup v1 and v2 hierarchies). This can cause
the following warning:
=====================================
WARNING: bad unlock balance detected!
7.0.0-next-20260420+ #83 Tainted: G W
-------------------------------------
memcg-repro/270 is trying to release lock (rcu_read_lock) at:
[<ffffffff815f57f7>] rcu_read_unlock+0x17/0x60
but there are no more locks to release!
other info that might help us debug this:
1 lock held by memcg-repro/270:
#0: ffff888102fa2088 (vm_lock){++++}-{0:0}, at: do_user_addr_fault+0x285/0x880
Fix this by explicitly tracking the RCU lock state, ensuring that
rcu_read_unlock() in get_non_dying_memcg_end() is strictly paired with the
lock acquisition, regardless of any runtime rebinding events.
Link: https://lore.kernel.org/20260429073105.44472-1-qi.zheng@linux.dev Fixes: 8285917d6f38 ("mm: memcontrol: prepare for reparenting non-hierarchical stats") Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Reviewed-by: Muchun Song <muchun.song@linux.dev> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
io_uring/tw: serialize ctx->retry_llist with ->uring_lock
The DEFER_TASKRUN local task work paths all run under ctx->uring_lock,
which serializes them with each other and with the rest of the ring's
hot paths. io_move_task_work_from_local() is the exception - it's called
from io_ring_exit_work() on a kworker without holding the lock and from
the iopoll cancelation side right after dropping it.
->work_llist is fine with this, as it's only ever updated via the
expected paths. But the ->retry_llist is updated while runing, and hence
it could potentially race between normal task_work running and the
task-has-exited shutdown path.
Simply grab ->uring_lock while moving the local work to the fallback
list for exit purposes, which nicely serializes it across both the
normal additions and the exit prune path.
Cc: stable@vger.kernel.org Fixes: f46b9cdb22f7 ("io_uring: limit local tw done") Reported-by: Robert Femmer <robert.femmer@x41-dsec.de> Reported-by: Christian Reitter <invd@inhq.net> Reported-by: Michael Rodler <michael.rodler@x41-dsec.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
platform/x86: lenovo: wmi-other: Fix uninitialized variable in lwmi_om_hwmon_write()
When the flag relax_fan_constraint is set, local variable 'raw'
is never assigned, and lwmi_om_hwmon_write() will pass uninitialized
value to lwmi_om_fan_get_set() resulting in undefined behavior.
This flag allows user to bypass minimum fan RPM divisor rounding,
but assignment to 'raw' only happens in the non-relaxed path.
Fix by defaulting 'raw' to user provided 'val' in the else branch.
Fixes: 51ed34282f63 ("platform/x86: lenovo-wmi-other: Add HWMON for fan reporting/tuning") Reviewed-by: Rong Zhang <i@rong.moe> Signed-off-by: Yufei CHENG <cd345al@gmail.com> Link: https://patch.msgid.link/20260426165034.9073-1-cd345al@gmail.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
platform/x86: hp-wmi: silence unknown board warning for 8D41
The HP Omen Max 16-ah0xxx, DMI board ID 8D41 is currently marked with
victus_s_thermal_params in the victus_s_thermal_profile_boards[] list.
This disables thermal profile readback and adds a dmesg warning during
driver init for "unknown board".
After testing we know that (similar to another HP Omen Max 16 device,
board ID 8D87), the embedded controller on this board does not expose
thermal profile which means we have to intentionally disable EC
readback.
Changing its driver_data to omen_v1_no_ec_thermal_params is sufficient
to silence the warning.
Tested-by: Benjamin Y <hectare.plains1h@icloud.com> Signed-off-by: Krishna Chomal <krishna.chomal108@gmail.com> Link: https://patch.msgid.link/20260429180953.129885-1-krishna.chomal108@gmail.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Kurt Borja [Wed, 29 Apr 2026 13:20:56 +0000 (08:20 -0500)]
platform/wmi: Fix unchecked min_size in wmidev_invoke_method()
After calling wmidev_evaluate_method(), if the ACPI core does not return
an out object, then wmidev_invoke_method() bypasses the min_size check
and returns 0. Add a check for min_size if there is not an out object.
Fixes: 1aeded2f55f0 ("platform/wmi: Extend wmidev_query_block() to reject undersized data") Closes: https://sashiko.dev/#/patchset/20260406203237.2970-1-W_Armin%40gmx.de Signed-off-by: Kurt Borja <kuurtb@gmail.com> Reviewed-by: Armin Wolf <W_Armin@gmx.de> Link: https://patch.msgid.link/20260429-invoke-fix-v1-1-ce938eb80cd3@gmail.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Prevent re-exporting of imported GEM buffers by adding a custom
prime_handle_to_fd callback that checks if the object is imported
and returns -EOPNOTSUPP if so.
Re-exporting imported GEM buffers causes loss of buffer flags settings,
leading to incorrect device access and data corruption.
Reported-by: Yametsu <yam3tsu@gmail.com> Fixes: 57557964b582 ("accel/ivpu: Add support for userptr buffer objects") Reviewed-by: Andrzej Kacprowski <andrzej.kacprowski@linux.intel.com> Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com> Cc: <stable@vger.kernel.org> # v6.19+
Paolo Abeni [Wed, 29 Apr 2026 07:39:11 +0000 (09:39 +0200)]
net/sched: cls_flower: revert unintended changes
While applying the blamed commit 4ca07b9239bd ("net: mctp i2c: check
length before marking flow active"), I unintentionally included
unrelated and unacceptable changes.
Revert them.
Fixes: 4ca07b9239bd ("net: mctp i2c: check length before marking flow active") Reported-by: Jeremy Kerr <jk@codeconstruct.com.au> Closes: https://lore.kernel.org/netdev/bd8704fe0bd53e278add5cde4873256656623e2e.camel@codeconstruct.com.au/ Signed-off-by: Paolo Abeni <pabeni@redhat.com> Link: https://patch.msgid.link/043026a53ff84da88b17648c4b0d17f0331749cb.1777447863.git.pabeni@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Dan Carpenter [Wed, 29 Apr 2026 06:48:17 +0000 (09:48 +0300)]
sfc: fix error code in efx_devlink_info_running_versions()
Return -EIO if efx_mcdi_rpc() doesn't return enough space.
Fixes: 14743ddd2495 ("sfc: add devlink info support for ef100") Signed-off-by: Dan Carpenter <error27@gmail.com> Reviewed-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://patch.msgid.link/afGpsbLRHL4_H0KS@stanley.mountain Signed-off-by: Paolo Abeni <pabeni@redhat.com>
When tls_set_device_offload_rx() fails at tls_dev_add(), the error path
calls tls_sw_free_resources_rx() to clean up the SW context that was
initialized by tls_set_sw_offload(). This function calls
tls_sw_release_resources_rx() (which stops the strparser via
tls_strp_stop()) and tls_sw_free_ctx_rx() (which kfrees the context),
but never frees the anchor skb that was allocated by alloc_skb(0) in
tls_strp_init().
Note that tls_sw_free_resources_rx() is exclusively used for this
"failed to start offload" code path, there's no other caller.
The leak did not exist before commit 84c61fe1a75b ("tls: rx: do not use
the standard strparser"), because the standard strparser doesn't try
to pre-allocate an skb.
The normal close path in tls_sk_proto_close() handles cleanup by calling
tls_sw_strparser_done() (which calls tls_strp_done()) after dropping
the socket lock, because tls_strp_done() does cancel_work_sync() and
the strparser work handler takes the socket lock.
Fixes: 84c61fe1a75b ("tls: rx: do not use the standard strparser") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20260428231559.1358502-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
====================
Intel Wired LAN Update 2026-04-27 (ice, iavf)
Petr Oros from RedHat has accumulated a number of fixes for the Intel ice
and iavf drivers, bundled together in this series.
First, a series of 4 fixes to resolve issues with the iavf driver logic for
handling VLAN filters. This includes keeping VLAN filters while the
interface is brought down, waiting for confirmation on filter deletion
before deleting filters from the driver tracking structures, and handling
the VIRTCHNL_OP_ADD_VLAN for the old v1 VLAN_ADD command.
A fix for a crash in ice_reset_all_vfs(), properly checking for errors when
ice_vf_rebuild_vsi() fails.
A fix for a possible infinite recursion in ice_cfg_tx_topo() that occurs
when trying to apply invalid Tx topology configuration.
A fix to initialize the SMA pins in the DPLL subsystem properly.
A fix to change the SMA and U.FL pin state for paired pins, ensuring that
all flows changing one pin will also update its shared pin appropriately.
A preparatory patch to export __dpll_pin_change_ntf() so that drivers can
notify pin changes while already holding the dpll_lock.
A fix to ensure DPLL notifications are sent for the software-controlled
pins which wrap the physical CGU input/output pins.
A fix to add DPLL notifications for peer pins when changing the SMA or U.FL
pins, ensuring DPLL subsystem is notified about the paired connected pins.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
====================
Petr Oros [Tue, 28 Apr 2026 05:22:23 +0000 (22:22 -0700)]
ice: add dpll peer notification for paired SMA and U.FL pins
SMA and U.FL pins share physical signal paths in pairs (SMA1/U.FL1 and
SMA2/U.FL2). When one pin's state changes via a PCA9575 GPIO write,
the paired pin's state also changes, but no notification is sent for
the peer pin. Userspace consumers monitoring the peer via dpll netlink
subscribe never learn about the update.
Add ice_dpll_sw_pin_notify_peer() which sends a change notification for
the paired SW pin. Call it from ice_dpll_pin_sma_direction_set(),
ice_dpll_sma_pin_state_set(), and ice_dpll_ufl_pin_state_set() after
pf->dplls.lock is released. Use __dpll_pin_change_ntf() because
dpll_lock is still held by the dpll netlink layer (dpll_pin_pre_doit).
Fixes: 2dd5d03c77e2 ("ice: redesign dpll sma/u.fl pins control") Signed-off-by: Petr Oros <poros@redhat.com> Tested-by: Alexander Nowlin <alexander.nowlin@intel.com> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-11-cdcb48303fd8@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Petr Oros [Tue, 28 Apr 2026 05:22:22 +0000 (22:22 -0700)]
ice: fix missing dpll notifications for SW pins
The SMA/U.FL pin redesign (commit 2dd5d03c77e2 ("ice: redesign dpll
sma/u.fl pins control")) introduced software-controlled pins that wrap
backing CGU input/output pins, but never updated the notification and
data paths to propagate pin events to these SW wrappers.
The periodic work sends dpll_pin_change_ntf() only for direct CGU input
pins. SW pins that wrap these inputs never receive change or phase
offset notifications, so userspace consumers such as synce4l monitoring
SMA pins via dpll netlink never learn about state transitions or phase
offset updates. Similarly, ice_dpll_phase_offset_get() reads the SW
pin's own phase_offset field which is never updated; the PPS monitor
writes to the backing CGU input's field instead.
Fix by introducing ice_dpll_pin_ntf(), a wrapper around
dpll_pin_change_ntf() that also notifies any registered SMA/U.FL pin
whose backing CGU input matches. Replace all direct
dpll_pin_change_ntf() calls in the periodic notification paths with
this wrapper. Fix ice_dpll_phase_offset_get() to return the backing
CGU input's phase_offset for input-direction SW pins.
Fixes: 2dd5d03c77e2 ("ice: redesign dpll sma/u.fl pins control") Signed-off-by: Petr Oros <poros@redhat.com> Tested-by: Alexander Nowlin <alexander.nowlin@intel.com> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-10-cdcb48303fd8@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Ivan Vecera [Tue, 28 Apr 2026 05:22:21 +0000 (22:22 -0700)]
dpll: export __dpll_pin_change_ntf() for use under dpll_lock
Export __dpll_pin_change_ntf() so that drivers can send pin change
notifications from within pin callbacks, which are already called
under dpll_lock. Using dpll_pin_change_ntf() in that context would
deadlock.
Add lockdep_assert_held() to catch misuse without the lock held.
Acked-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: Petr Oros <poros@redhat.com> Tested-by: Alexander Nowlin <alexander.nowlin@intel.com> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-9-cdcb48303fd8@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Petr Oros [Tue, 28 Apr 2026 05:22:20 +0000 (22:22 -0700)]
ice: fix SMA and U.FL pin state changes affecting paired pin
SMA and U.FL pins share physical signal paths in pairs (SMA1/U.FL1 and
SMA2/U.FL2) controlled by the PCA9575 GPIO expander. Each pair can
only have one active pin at a time: SMA1 output and U.FL1 output share
the same CGU output, SMA2 input and U.FL2 input share the same CGU
input. The PCA9575 register bits determine which connector in each
pair owns the signal path.
The driver does not account for this pairing in two places:
ice_dpll_ufl_pin_state_set() modifies PCA9575 bits and disables the
backing CGU pin without checking whether the U.FL pin is currently
active. Disconnecting an already inactive U.FL pin flips bits that
the paired SMA pin relies on, breaking its connection.
ice_dpll_sma_direction_set() does not propagate direction changes to
the paired U.FL pin. For SMA2/U.FL2 the ICE_SMA2_UFL2_RX_DIS bit is
never managed, so U.FL2 stays disconnected after SMA2 switches to
output. For both pairs the backing CGU pin of the U.FL side is never
enabled when a direction change activates it, so userspace sees the
pin as disconnected even though the routing is correct.
Fix by guarding the U.FL disconnect path against inactive pins and by
updating the paired U.FL pin fully on SMA direction changes: manage
ICE_SMA2_UFL2_RX_DIS for the SMA2/U.FL2 pair and enable the backing
CGU pin whenever the peer becomes active.
Fixes: 2dd5d03c77e2 ("ice: redesign dpll sma/u.fl pins control") Signed-off-by: Petr Oros <poros@redhat.com> Tested-by: Alexander Nowlin <alexander.nowlin@intel.com> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-8-cdcb48303fd8@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Petr Oros [Tue, 28 Apr 2026 05:22:19 +0000 (22:22 -0700)]
ice: fix missing SMA pin initialization in DPLL subsystem
The DPLL SMA/U.FL pin redesign introduced ice_dpll_sw_pin_frequency_get()
which gates frequency reporting on the pin's active flag. This flag is
determined by ice_dpll_sw_pins_update() from the PCA9575 GPIO expander
state. Before the redesign, SMA pins were exposed as direct HW
input/output pins and ice_dpll_frequency_get() returned the CGU
frequency unconditionally — the PCA9575 state was never consulted.
The PCA9575 powers on with all outputs high, setting ICE_SMA1_DIR_EN,
ICE_SMA1_TX_EN, ICE_SMA2_DIR_EN and ICE_SMA2_TX_EN. Nothing in the
driver writes the register during initialization, so
ice_dpll_sw_pins_update() sees all pins as inactive and
ice_dpll_sw_pin_frequency_get() permanently returns 0 Hz for every
SW pin.
Fix this by writing a default SMA configuration in
ice_dpll_init_info_sw_pins(): clear all SMA bits, then set SMA1 and
SMA2 as active inputs (DIR_EN=0) with U.FL1 output and U.FL2 input
disabled. Each SMA/U.FL pair shares a physical signal path so only
one pin per pair can be active at a time. U.FL pins still report
frequency 0 after this fix: U.FL1 (output-only) is disabled by
ICE_SMA1_TX_EN which keeps the TX output buffer off, and U.FL2
(input-only) is disabled by ICE_SMA2_UFL2_RX_DIS. They can be
activated by changing the corresponding SMA pin direction via dpll
netlink.
Fixes: 2dd5d03c77e2 ("ice: redesign dpll sma/u.fl pins control") Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Alexander Nowlin <alexander.nowlin@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-7-cdcb48303fd8@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Petr Oros [Tue, 28 Apr 2026 05:22:18 +0000 (22:22 -0700)]
ice: fix infinite recursion in ice_cfg_tx_topo via ice_init_dev_hw
On certain E810 configurations where firmware supports Tx scheduler
topology switching (tx_sched_topo_comp_mode_en), ice_cfg_tx_topo()
may need to apply a new 5-layer or 9-layer topology from the DDP
package. If the AQ command to set the topology fails (e.g. due to
invalid DDP data or firmware limitations), the global configuration
lock must still be cleared via a CORER reset.
Commit 86aae43f21cf ("ice: don't leave device non-functional if Tx
scheduler config fails") correctly fixed this by refactoring
ice_cfg_tx_topo() to always trigger CORER after acquiring the global
lock and re-initialize hardware via ice_init_hw() afterwards.
However, commit 8a37f9e2ff40 ("ice: move ice_deinit_dev() to the end
of deinit paths") later moved ice_init_dev_hw() into ice_init_hw(),
breaking the reinit path introduced by 86aae43f21cf. This creates an
infinite recursive call chain:
Fix by moving ice_init_dev_hw() back out of ice_init_hw() and calling
it explicitly from ice_probe() and ice_devlink_reinit_up(). The third
caller, ice_cfg_tx_topo(), intentionally does not need ice_init_dev_hw()
during its reinit, it only needs the core HW reinitialization. This
breaks the recursion cleanly without adding flags or guards.
The deinit ordering changes from commit 8a37f9e2ff40 ("ice: move
ice_deinit_dev() to the end of deinit paths") which fixed slow rmmod
are preserved, only the init-side placement of ice_init_dev_hw() is
reverted.
Fixes: 8a37f9e2ff40 ("ice: move ice_deinit_dev() to the end of deinit paths") Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Alexander Nowlin <alexander.nowlin@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-6-cdcb48303fd8@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Petr Oros [Tue, 28 Apr 2026 05:22:17 +0000 (22:22 -0700)]
ice: fix NULL pointer dereference in ice_reset_all_vfs()
ice_reset_all_vfs() ignores the return value of ice_vf_rebuild_vsi().
When the VSI rebuild fails (e.g. during NVM firmware update via
nvmupdate64e), ice_vsi_rebuild() tears down the VSI on its error path,
leaving txq_map and rxq_map as NULL. The subsequent unconditional call
to ice_vf_post_vsi_rebuild() leads to a NULL pointer dereference in
ice_ena_vf_q_mappings() when it accesses vsi->txq_map[0].
The single-VF reset path in ice_reset_vf() already handles this
correctly by checking the return value of ice_vf_reconfig_vsi() and
skipping ice_vf_post_vsi_rebuild() on failure.
Apply the same pattern to ice_reset_all_vfs(): check the return value
of ice_vf_rebuild_vsi() and skip ice_vf_post_vsi_rebuild() and
ice_eswitch_attach_vf() on failure. The VF is left safely disabled
(ICE_VF_STATE_INIT not set, VFGEN_RSTAT not set to VFACTIVE) and can
be recovered via a VFLR triggered by a PCI reset of the VF
(sysfs reset or driver rebind).
Note that this patch does not prevent the VF VSI rebuild from failing
during NVM update — the underlying cause is firmware being in a
transitional state while the EMP reset is processed, which can cause
Admin Queue commands (ice_add_vsi, ice_cfg_vsi_lan) to fail. This
patch only prevents the subsequent NULL pointer dereference that
crashes the kernel when the rebuild does fail.
dmesg:
ice 0000:13:00.0: firmware recommends not updating fw.mgmt, as it
may result in a downgrade. continuing anyways
ice 0000:13:00.1: ice_init_nvm failed -5
ice 0000:13:00.1: Rebuild failed, unload and reload driver
Petr Oros [Tue, 28 Apr 2026 05:22:16 +0000 (22:22 -0700)]
iavf: add VIRTCHNL_OP_ADD_VLAN to success completion handler
The V1 ADD_VLAN opcode had no success handler; filters sent via V1
stayed in ADDING state permanently. Add a fallthrough case so V1
filters also transition ADDING -> ACTIVE on PF confirmation.
Critically, add an `if (v_retval) break` guard: the error switch in
iavf_virtchnl_completion() does NOT return after handling errors,
it falls through to the success switch. Without this guard, a
PF-rejected ADD would incorrectly mark ADDING filters as ACTIVE,
creating a driver/HW mismatch where the driver believes the filter
is installed but the PF never accepted it.
For V2, this is harmless: iavf_vlan_add_reject() in the error
block already kfree'd all ADDING filters, so the success handler
finds nothing to transition.
Fixes: 968996c070ef ("iavf: Fix VLAN_V2 addition/rejection") Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-4-cdcb48303fd8@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Petr Oros [Tue, 28 Apr 2026 05:22:15 +0000 (22:22 -0700)]
iavf: wait for PF confirmation before removing VLAN filters
The VLAN filter DELETE path was asymmetric with the ADD path: ADD
waits for PF confirmation (ADD -> ADDING -> ACTIVE), but DELETE
immediately frees the filter struct after sending the DEL message
without waiting for the PF response.
This is problematic because:
- If the PF rejects the DEL, the filter remains in HW but the driver
has already freed the tracking structure, losing sync.
- Race conditions between DEL pending and other operations
(add, reset) cannot be properly resolved if the filter struct
is already gone.
Add IAVF_VLAN_REMOVING state to make the DELETE path symmetric:
In iavf_del_vlans(), transition filters from REMOVE to REMOVING
instead of immediately freeing them. The new DEL completion handler
in iavf_virtchnl_completion() frees filters on success or reverts
them to ACTIVE on error.
Update iavf_add_vlan() to handle the REMOVING state: if a DEL is
pending and the user re-adds the same VLAN, queue it for ADD so
it gets re-programmed after the PF processes the DEL.
The !VLAN_FILTERING_ALLOWED early-exit path still frees filters
directly since no PF message is sent in that case.
Also update iavf_del_vlan() to skip filters already in REMOVING
state: DEL has been sent to PF and the completion handler will
free the filter when PF confirms. Without this guard, the sequence
DEL(pending) -> user-del -> second DEL could cause the PF to return
an error for the second DEL (filter already gone), causing the
completion handler to incorrectly revert a deleted filter back to
ACTIVE.
Fixes: 968996c070ef ("iavf: Fix VLAN_V2 addition/rejection") Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-3-cdcb48303fd8@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Petr Oros [Tue, 28 Apr 2026 05:22:14 +0000 (22:22 -0700)]
iavf: stop removing VLAN filters from PF on interface down
When a VF goes down, the driver currently sends DEL_VLAN to the PF for
every VLAN filter (ACTIVE -> DISABLE -> send DEL -> INACTIVE), then
re-adds them all on UP (INACTIVE -> ADD -> send ADD -> ADDING ->
ACTIVE). This round-trip is unnecessary because:
1. The PF disables the VF's queues via VIRTCHNL_OP_DISABLE_QUEUES,
which already prevents all RX/TX traffic regardless of VLAN filter
state.
2. The VLAN filters remaining in PF HW while the VF is down is
harmless - packets matching those filters have nowhere to go with
queues disabled.
3. The DEL+ADD cycle during down/up creates race windows where the
VLAN filter list is incomplete. With spoofcheck enabled, the PF
enables TX VLAN filtering on the first non-zero VLAN add, blocking
traffic for any VLANs not yet re-added.
Remove the entire DISABLE/INACTIVE state machinery:
- Remove IAVF_VLAN_DISABLE and IAVF_VLAN_INACTIVE enum values
- Remove iavf_restore_filters() and its call from iavf_open()
- Remove VLAN filter handling from iavf_clear_mac_vlan_filters(),
rename it to iavf_clear_mac_filters()
- Remove DEL_VLAN_FILTER scheduling from iavf_down()
- Remove all DISABLE/INACTIVE handling from iavf_del_vlans()
VLAN filters now stay ACTIVE across down/up cycles. Only explicit
user removal (ndo_vlan_rx_kill_vid) or PF/VF reset triggers VLAN
filter deletion/re-addition.
Fixes: ed1f5b58ea01 ("i40evf: remove VLAN filters on close") Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-2-cdcb48303fd8@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Petr Oros [Tue, 28 Apr 2026 05:22:13 +0000 (22:22 -0700)]
iavf: rename IAVF_VLAN_IS_NEW to IAVF_VLAN_ADDING
Rename the IAVF_VLAN_IS_NEW state to IAVF_VLAN_ADDING to better
describe what the state represents: an ADD request has been sent to
the PF and is waiting for a response.
This is a pure rename with no behavioral change, preparing for a
cleanup of the VLAN filter state machine.
Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-1-cdcb48303fd8@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
parisc: Fix build failure for 32-bit kernel with PA2.0 instruction set
The CONFIG_PA11 option can not be used as a reliable check if we build a
32-bit kernel which needs the 32-bit VDSO.
Instead depend on CONFIG_64BIT and CONFIG_COMPAT only.
Reported-by: Christoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de> Tested-by: Christoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de> Signed-off-by: Helge Deller <deller@gmx.de>