Matthew Brost [Fri, 27 Feb 2026 01:52:25 +0000 (17:52 -0800)]
drm/xe: Disable garbage collector work item on SVM close
When an SVM is closed, the garbage collector work item must be stopped
synchronously and any future queuing must be prevented. Replace
flush_work() with disable_work_sync() to ensure both conditions are
met.
On PTL, older GSC FWs have a bug that can cause them to crash during
PXP invalidation events, which leads to a complete loss of power
management on the media GT. Therefore, we can't use PXP on FWs that
have this bug, which was fixed in PTL GSC build 1396.
Fixes: b1dcec9bd8a1 ("drm/xe/ptl: Enable PXP for PTL") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Julia Filipchuk <julia.filipchuk@intel.com> Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patch.msgid.link/20260324153718.3155504-10-daniele.ceraolospurio@intel.com
(cherry picked from commit 6eb04caaa972934c9b6cea0e0c29e466bf9a346f) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
drm/xe/pxp: Remove incorrect handling of impossible state during suspend
The default case of the PXP suspend switch is incorrectly exiting
without releasing the lock. However, this case is impossible to hit
because we're switching on an enum and all the valid enum values have
their own cases. Therefore, we can just get rid of the default case
and rely on the compiler to warn us if a new enum value is added and
we forget to add it to the switch.
Fixes: 51462211f4a9 ("drm/xe/pxp: add PXP PM support") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn Teres Alexis <alan.previn.teres.alexis@intel.com> Cc: Julia Filipchuk <julia.filipchuk@intel.com> Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com> Link: https://patch.msgid.link/20260324153718.3155504-8-daniele.ceraolospurio@intel.com
(cherry picked from commit f1b5a77fc9b6a90cd9a5e3db9d4c73ae1edfcfac) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
drm/xe/pxp: Clean up termination status on failure
If the PXP HW termination fails during PXP start, the normal completion
code won't be called, so the termination will remain uncomplete. To avoid
unnecessary waits, mark the termination as completed from the error path.
Note that we already do this if the termination fails when handling a
termination irq from the HW.
Fixes: f8caa80154c4 ("drm/xe/pxp: Add PXP queue tracking and session start") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn Teres Alexis <alan.previn.teres.alexis@intel.com> Cc: Julia Filipchuk <julia.filipchuk@intel.com> Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com> Link: https://patch.msgid.link/20260324153718.3155504-7-daniele.ceraolospurio@intel.com
(cherry picked from commit 5d9e708d2a69ab1f64a17aec810cd7c70c5b9fab) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Arvind Yadav [Thu, 26 Mar 2026 13:08:38 +0000 (18:38 +0530)]
drm/xe/madvise: Accept canonical GPU addresses in xe_vm_madvise_ioctl
Userspace passes canonical (sign-extended) GPU addresses where bits 63:48
mirror bit 47. The internal GPUVM uses non-canonical form (upper bits
zeroed), so passing raw canonical addresses into GPUVM lookups causes
mismatches for addresses above 128TiB.
Strip the sign extension with xe_device_uncanonicalize_addr() at the
top of xe_vm_madvise_ioctl(). Non-canonical addresses are unaffected.
Fixes: ada7486c5668 ("drm/xe: Implement madvise ioctl for xe") Suggested-by: Matthew Brost <matthew.brost@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Signed-off-by: Arvind Yadav <arvind.yadav@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260326130843.3545241-13-arvind.yadav@intel.com
(cherry picked from commit 05c8b1cdc54036465ea457a0501a8c2f9409fce7) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Felix Gu [Sat, 28 Mar 2026 16:07:06 +0000 (00:07 +0800)]
spi: stm32-ospi: Fix reset control leak on probe error
When spi_register_controller() fails after reset_control_acquire()
succeeds, the reset control is never released. This causes a resource
leak in the error path.
Add the missing reset_control_release() call in the error path.
Fixes: cf2c3eceb757 ("spi: stm32-ospi: Make usage of reset_control_acquire/release() API") Signed-off-by: Felix Gu <ustc.gu@gmail.com> Reviewed-by: Patrice Chotard <patrice.chotard@foss.st.com> Link: https://patch.msgid.link/20260329-stm32-ospi-v1-1-142122466412@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
Currently, __device_set_driver_override() handles clearing the override
via empty string ("") and newline ("\n") in two separate paths. The "\n"
case also performs an unnecessary memory allocation and immediate free.
Simplify the logic by initializing 'new' to NULL and only allocating
memory if the string length remains non-zero after stripping the
trailing newline.
Reduce code size, improve readability, and avoid unnecessary memory
operations.
Aleksandr Nogikh [Wed, 25 Mar 2026 15:48:24 +0000 (16:48 +0100)]
x86/kexec: Disable KCOV instrumentation after load_segments()
The load_segments() function changes segment registers, invalidating GS base
(which KCOV relies on for per-cpu data). When CONFIG_KCOV is enabled, any
subsequent instrumented C code call (e.g. native_gdt_invalidate()) begins
crashing the kernel in an endless loop.
To reproduce the problem, it's sufficient to do kexec on a KCOV-instrumented
kernel:
$ kexec -l /boot/otherKernel
$ kexec -e
The real-world context for this problem is enabling crash dump collection in
syzkaller. For this, the tool loads a panic kernel before fuzzing and then
calls makedumpfile after the panic. This workflow requires both CONFIG_KEXEC
and CONFIG_KCOV to be enabled simultaneously.
Adding safeguards directly to the KCOV fast-path (__sanitizer_cov_trace_pc())
is also undesirable as it would introduce an extra performance overhead.
Disabling instrumentation for the individual functions would be too fragile,
so disable KCOV instrumentation for the entire machine_kexec_64.c and
physaddr.c. If coverage-guided fuzzing ever needs these components in the
future, other approaches should be considered.
The problem is not relevant for 32 bit kernels as CONFIG_KCOV is not supported
there.
Randy Dunlap [Thu, 12 Mar 2026 05:14:44 +0000 (22:14 -0700)]
powercap: correct kernel-doc function parameter names
Use the correct function parameter names in kernel-doc comments to
avoid these warnings:
Warning: include/linux/powercap.h:254 function parameter 'name' not
described in 'powercap_register_control_type'
Warning: include/linux/powercap.h:298 function parameter 'nr_constraints'
not described in 'powercap_register_zone'
This occurs when allocating MSI-X vectors for an NVMe device. During
allocation the XIVE code creates a struct xive_irq_data and stores it
in irq_data->chip_data.
When the MSI-X irqdomain is later freed, xive_irq_free_data() is
responsible for retrieving this structure and freeing it. However,
after commit cc0cc23babc9 ("powerpc/xive: Untangle xive from child
interrupt controller drivers"), xive_irq_free_data() retrieves the
chip_data using irq_get_chip_data(), which looks up the data through
the child domain.
This is incorrect because the XIVE-specific irq data is associated with
the XIVE (parent) domain. As a result the lookup fails and the allocated
struct xive_irq_data is never freed, leading to the kmemleak report
shown above.
Fix this by retrieving the irq_data from the correct domain using
irq_domain_get_irq_data() and then accessing the chip_data via
irq_data_get_irq_chip_data().
This uses _RPAGE_SW2 bit for the PMD and PUDs similar to PTEs.
This also adds support for {pte,pmd,pud}_pgprot helpers needed for
follow_pfnmap APIs.
This allows us to extend the PFN mappings, e.g. PCI MMIO bars where
it can grow as large as 8GB or even bigger, to map at PMD / PUD level.
VFIO PCI core driver already supports fault handling at PMD / PUD level
for more efficient BAR mappings.
drivers/vfio_pci_core: Change PXD_ORDER check from switch case to if/else block
Architectures like PowerPC uses runtime defined values for
PMD_ORDER/PUD_ORDER. This is because it can use either RADIX or HASH MMU
at runtime using kernel cmdline. So the pXd_index_size is not known at
compile time. Without this fix, when we add huge pfn support on powerpc
in the next patch, vfio_pci_core driver compilation can fail with the
following errors.
CC [M] drivers/vfio/vfio_main.o
CC [M] drivers/vfio/group.o
CC [M] drivers/vfio/container.o
CC [M] drivers/vfio/virqfd.o
CC [M] drivers/vfio/vfio_iommu_spapr_tce.o
CC [M] drivers/vfio/pci/vfio_pci_core.o
CC [M] drivers/vfio/pci/vfio_pci_intrs.o
CC [M] drivers/vfio/pci/vfio_pci_rdwr.o
CC [M] drivers/vfio/pci/vfio_pci_config.o
CC [M] drivers/vfio/pci/vfio_pci.o
AR kernel/built-in.a
../drivers/vfio/pci/vfio_pci_core.c: In function ‘vfio_pci_vmf_insert_pfn’:
../drivers/vfio/pci/vfio_pci_core.c:1678:9: error: case label does not reduce to an integer constant
1678 | case PMD_ORDER:
| ^~~~
../drivers/vfio/pci/vfio_pci_core.c:1682:9: error: case label does not reduce to an integer constant
1682 | case PUD_ORDER:
| ^~~~
make[6]: *** [../scripts/Makefile.build:289: drivers/vfio/pci/vfio_pci_core.o] Error 1
make[6]: *** Waiting for unfinished jobs....
make[5]: *** [../scripts/Makefile.build:546: drivers/vfio/pci] Error 2
make[5]: *** Waiting for unfinished jobs....
make[4]: *** [../scripts/Makefile.build:546: drivers/vfio] Error 2
make[3]: *** [../scripts/Makefile.build:546: drivers] Error 2
Tom Lendacky [Tue, 24 Mar 2026 16:13:01 +0000 (10:13 -0600)]
crypto/ccp: Update HV_FIXED page states to allow freeing of memory
After SNP is disabled, any pages allocated as HV_FIXED can now be freed.
Update the page state of these pages and the snp_leak_hv_fixed_pages()
function to free pages on SNP_SHUTDOWN.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Tycho Andersen (AMD) <tycho@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Link: https://patch.msgid.link/20260324161301.1353976-8-tycho@kernel.org
The SEV firmware has support to disable SNP during an SNP_SHUTDOWN_EX command.
Verify that this support is available and set the flag so that SNP is disabled
when it is not being used.
In cases where SNP is disabled, skip the call to amd_iommu_snp_disable(), as
all of the IOMMU pages have already been made shared. Also skip the panic
case, since snp_shutdown() does IPIs.
Signed-off-by: Tycho Andersen (AMD) <tycho@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Link: https://patch.msgid.link/20260324161301.1353976-7-tycho@kernel.org
Ingo Molnar [Sun, 14 Dec 2025 08:46:49 +0000 (09:46 +0100)]
x86/cpu: Remove M486/M486SX/ELAN support
In the x86 architecture we have various complicated hardware emulation
facilities on x86-32 to support ancient 32-bit CPUs that very very few
people are using with modern kernels. This compatibility glue is sometimes
even causing problems that people spend time to resolve, which time could
be spent on other things.
As Linus recently remarked:
> I really get the feeling that it's time to leave i486 support behind.
> There's zero real reason for anybody to waste one second of
> development effort on this kind of issue.
Implement the first step and remove M486/M486SX/ELAN support:
CONFIG_M486SX
CONFIG_M486
CONFIG_MELAN
[ There's no recent M486=y kernel package for any mainstream x86
32-bit distribution available that I've been able to find, so
actual users should not be impacted, and any legacy users can
keep using older kernels. ]
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Ahmed S. Darwish <darwi@linutronix.de> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Arnd Bergmann <arnd@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: John Ogness <john.ogness@linutronix.de> Link: https://patch.msgid.link/20251214084710.3606385-2-mingo@kernel.org
Ast's DP501 initialization reads the register SCU2C at offset 0x1202c
and tries to set it to source data from VGA. But writes the update to
offset 0x0, with unknown results. Write the result to SCU instead.
The bug only happens in ast_init_analog(). There's similar code in
ast_init_dvo(), which works correctly.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Fixes: 83c6620bae3f ("drm/ast: initial DP501 support (v0.2)") Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Jocelyn Falempe <jfalempe@redhat.com> Cc: dri-devel@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v3.16+ Link: https://patch.msgid.link/20260327133532.79696-2-tzimmermann@suse.de
gpio: shared: shorten the critical section in gpiochip_setup_shared()
Commit 710abda58055 ("gpio: shared: call gpio_chip::of_xlate() if set")
introduced a critical section around the adjustmenet of entry->offset.
However this may cause a deadlock if we create the auxiliary shared
proxy devices with this lock taken. We only need to protect
entry->offset while it's read/written so shorten the critical section
and release the lock before creating the proxy device as the field in
question is no longer accessed at this point.
Mikhail Gavrilov [Fri, 27 Mar 2026 12:41:56 +0000 (17:41 +0500)]
dma-debug: suppress cacheline overlap warning when arch has no DMA alignment requirement
When CONFIG_DMA_API_DEBUG is enabled, the DMA debug infrastructure
tracks active mappings per cacheline and warns if two different DMA
mappings share the same cacheline ("cacheline tracking EEXIST,
overlapping mappings aren't supported").
On x86_64, ARCH_KMALLOC_MINALIGN defaults to 8, so small kmalloc
allocations (e.g. the 8-byte hub->buffer and hub->status in the USB
hub driver) frequently land in the same 64-byte cacheline. When both
are DMA-mapped, this triggers a false positive warning.
This has been reported repeatedly since v5.14 (when the EEXIST check
was added) across various USB host controllers and devices including
xhci_hcd with USB hubs, USB audio devices, and USB ethernet adapters.
The cacheline overlap is only a real concern on architectures that
require DMA buffer alignment to cacheline boundaries (i.e. where
ARCH_DMA_MINALIGN >= L1_CACHE_BYTES). On architectures like x86_64
where dma_get_cache_alignment() returns 1, the hardware is
cache-coherent and overlapping cacheline mappings are harmless.
Suppress the EEXIST warning when dma_get_cache_alignment() is less
than L1_CACHE_BYTES, indicating the architecture does not require
cacheline-aligned DMA buffers.
Verified with a kernel module reproducer that performs two kmalloc(8)
allocations back-to-back and DMA-maps both:
Before: allocations share a cacheline, EEXIST fires within ~50 pairs
After: same cacheline pair found, but no warning emitted
Berk Cem Goksel [Sun, 29 Mar 2026 13:38:25 +0000 (16:38 +0300)]
ALSA: caiaq: fix stack out-of-bounds read in init_card
The loop creates a whitespace-stripped copy of the card shortname
where `len < sizeof(card->id)` is used for the bounds check. Since
sizeof(card->id) is 16 and the local id buffer is also 16 bytes,
writing 16 non-space characters fills the entire buffer,
overwriting the terminating nullbyte.
When this non-null-terminated string is later passed to
snd_card_set_id() -> copy_valid_id_string(), the function scans
forward with `while (*nid && ...)` and reads past the end of the
stack buffer, reading the contents of the stack.
A USB device with a product name containing many non-ASCII, non-space
characters (e.g. multibyte UTF-8) will reliably trigger this as follows:
BUG: KASAN: stack-out-of-bounds in copy_valid_id_string
sound/core/init.c:696 [inline]
BUG: KASAN: stack-out-of-bounds in snd_card_set_id_no_lock+0x698/0x74c
sound/core/init.c:718
The off-by-one has been present since commit bafeee5b1f8d ("ALSA:
snd_usb_caiaq: give better shortname") from June 2009 (v2.6.31-rc1),
which first introduced this whitespace-stripping loop. The original
code never accounted for the null terminator when bounding the copy.
Fix this by changing the loop bound to `sizeof(card->id) - 1`,
ensuring at least one byte remains as the null terminator.
Fixes: bafeee5b1f8d ("ALSA: snd_usb_caiaq: give better shortname") Cc: stable@vger.kernel.org Cc: Andrey Konovalov <andreyknvl@gmail.com> Reported-by: Berk Cem Goksel <berkcgoksel@gmail.com> Signed-off-by: Berk Cem Goksel <berkcgoksel@gmail.com> Link: https://patch.msgid.link/20260329133825.581585-1-berkcgoksel@gmail.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
Uros Bizjak [Mon, 30 Mar 2026 05:57:45 +0000 (07:57 +0200)]
x86/asm/segment: Implement loadsegment()/savesegment() macros with static inline helpers
Convert the __loadsegment_simple() and savesegment() macro
implementations into static inline helper functions generated
via small helper macros.
Historically loadsegment() and savesegment() relied on macros that
embedded inline assembly. This approach obscures types, complicates
debugging, and makes the call sites harder for the compiler and static
analysis tools to reason about.
This change is purely mechanical and does not alter the generated code,
but improves readability, type safety, and compiler visibility of the
helpers.
Suggested-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://patch.msgid.link/20260330055823.5793-4-ubizjak@gmail.com
Uros Bizjak [Mon, 30 Mar 2026 05:57:44 +0000 (07:57 +0200)]
x86/asm/segment: Use ASM_INPUT_RM in __loadsegment_fs()
Use the ASM_INPUT_RM macro in __loadsegment_fs() to work around Clang
problems with the "rm" asm constraint. Clang seems to always chose the
memory input, while it is almost always the worst choice.
Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Nathan Chancellor <nathan@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Link: https://patch.msgid.link/20260330055823.5793-3-ubizjak@gmail.com
Uros Bizjak [Mon, 30 Mar 2026 05:57:43 +0000 (07:57 +0200)]
x86/asm/segment: Remove unnecessary "memory" clobber from savesegment()
The savesegment() macro uses inline assembly to copy a segment register
into a general-purpose register:
movl %seg, reg
This instruction does not access memory, yet the inline asm currently
declares a "memory" clobber, which unnecessarily acts as a compiler
barrier and may inhibit optimization.
Remove the "memory" clobber and mark the asm as `asm volatile` instead.
Segment register loads in the kernel are implemented using `asm volatile`,
so the compiler will not schedule segment register reads before those
loads. Using `asm volatile` preserves the intended ordering with other
segment register operations without imposing an unnecessary global memory
barrier.
No functional change intended.
Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Link: https://patch.msgid.link/20260330055823.5793-2-ubizjak@gmail.com
Uros Bizjak [Mon, 30 Mar 2026 05:57:42 +0000 (07:57 +0200)]
x86/asm/fsgsbase: Remove unnecessary "memory" clobbers from FS/GS base (read-) accessors
The rdfsbase() and rdgsbase() helpers currently include a "memory"
clobber in their inline assembly definitions. However, the RDFSBASE
and RDGSBASE instructions only read the FS/GS base MSRs into a
general-purpose register and do not access memory. The "memory" clobber,
which acts as a compiler barrier and may inhibit optimization,
is therefore unnecessary.
The "memory" clobber was historically used as a scheduling constraint
to prevent the compiler from moving the instructions before preceding
segment register loads. This is not required because both the segment
register loads and the RDFSBASE/RDGSBASE accessors are implemented
with `asm volatile`, which already prevents reordering between them.
No functional change intended.
Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Link: https://patch.msgid.link/20260330055823.5793-1-ubizjak@gmail.com
Ville Syrjälä [Thu, 26 Mar 2026 11:18:10 +0000 (13:18 +0200)]
drm/i915/dsi: Don't do DSC horizontal timing adjustments in command mode
Stop adjusting the horizontal timing values based on the
compression ratio in command mode. Bspec seems to be telling
us to do this only in video mode, and this is also how the
Windows driver does things.
This should also fix a div-by-zero on some machines because
the adjusted htotal ends up being so small that we end up with
line_time_us==0 when trying to determine the vtotal value in
command mode.
Note that this doesn't actually make the display on the
Huawei Matebook E work, but at least the kernel no longer
explodes when the driver loads.
Cc: stable@vger.kernel.org Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12045 Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/20260326111814.9800-2-ville.syrjala@linux.intel.com Fixes: 53693f02d80e ("drm/i915/dsi: account for DSC in horizontal timings") Reviewed-by: Jani Nikula <jani.nikula@intel.com>
(cherry picked from commit 0b475e91ecc2313207196c6d7fd5c53e1a878525) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Keenan Dong [Thu, 26 Mar 2026 12:36:39 +0000 (20:36 +0800)]
xfrm: account XFRMA_IF_ID in aevent size calculation
xfrm_get_ae() allocates the reply skb with xfrm_aevent_msgsize(), then
build_aevent() appends attributes including XFRMA_IF_ID when x->if_id is
set.
xfrm_aevent_msgsize() does not include space for XFRMA_IF_ID. For states
with if_id, build_aevent() can fail with -EMSGSIZE and hit BUG_ON(err < 0)
in xfrm_get_ae(), turning a malformed netlink interaction into a kernel
panic.
Account XFRMA_IF_ID in the size calculation unconditionally and replace
the BUG_ON with normal error unwinding.
Fixes: 7e6526404ade ("xfrm: Add a new lookup key to match xfrm interfaces.") Reported-by: Keenan Dong <keenanat2000@gmail.com> Signed-off-by: Keenan Dong <keenanat2000@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Yasuaki Torimaru [Thu, 26 Mar 2026 05:58:00 +0000 (14:58 +0900)]
xfrm: clear trailing padding in build_polexpire()
build_expire() clears the trailing padding bytes of struct
xfrm_user_expire after setting the hard field via memset_after(),
but the analogous function build_polexpire() does not do this for
struct xfrm_user_polexpire.
The padding bytes after the __u8 hard field are left
uninitialized from the heap allocation, and are then sent to
userspace via netlink multicast to XFRMNLGRP_EXPIRE listeners,
leaking kernel heap memory contents.
Add the missing memset_after() call, matching build_expire().
A recent strengthening of -Wunused-but-set-variable (enabled with -Wall)
in clang under a new subwarning, -Wunused-but-set-global, points out an
unused static global variable in scripts/mod/modpost.c:
scripts/mod/modpost.c:59:13: error: variable 'extra_warn' set but not used [-Werror,-Wunused-but-set-global]
59 | static bool extra_warn;
| ^
This variable has been unused since commit 6c6c1fc09de3 ("modpost:
require a MODULE_DESCRIPTION()") but that is expected, as there are
currently no extra warnings at W=1 right now. Declare the variable with
the unused attribute to make it clear to the compiler that this variable
may be unused.
The modules-cpio-pkg target added in commit 2a9c8c0b59d3 ("kbuild: add
target to build a cpio containing modules") is incompatible with
initramfs with merged /lib and /usr/lib directories [1]. "/lib" cannot
be a link and directory at the same time.
Respect a non-empty INSTALL_MOD_PATH in the modules-cpio-pkg target so
that `make INSTALL_MOD_PATH=/usr modules-cpio-pkg` results in the same
module install location as `make INSTALL_MOD_PATH=/usr modules_install`.
Tested with Fedora distribution initramfs produced by dracut.
Link: https://systemd.io/THE_CASE_FOR_THE_USR_MERGE/ Fixes: 2a9c8c0b59d3 ("kbuild: add target to build a cpio containing modules") Cc: stable@vger.kernel.org Reviewed-by: Simon Glass <sjg@chromium.org> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Janne Grunau <j@jannau.net> Reviewed-by: Nicolas Schier <nsc@kernel.org> Tested-by: Nicolas Schier <nsc@kernel.org> Reviewed-by: Ahmad Fatoum <a.fatoum@pengutronix.de> Link: https://patch.msgid.link/20260327-kbuild-modules-cpio-pkg-usr-merge-v3-1-ef507dfa006c@jannau.net Signed-off-by: Nathan Chancellor <nathan@kernel.org>
ksmbd: fix OOB write in QUERY_INFO for compound requests
When a compound request such as READ + QUERY_INFO(Security) is received,
and the first command (READ) consumes most of the response buffer,
ksmbd could write beyond the allocated buffer while building a security
descriptor.
The root cause was that smb2_get_info_sec() checked buffer space using
ppntsd_size from xattr, while build_sec_desc() often synthesized a
significantly larger descriptor from POSIX ACLs.
This patch introduces smb_acl_sec_desc_scratch_len() to accurately
compute the final descriptor size beforehand, performs proper buffer
checking with smb2_calc_max_out_buf_len(), and uses exact-sized
allocation + iov pinning.
Cc: stable@vger.kernel.org Fixes: e2b76ab8b5c9 ("ksmbd: add support for read compound") Signed-off-by: Asim Viladi Oglu Manizada <manizada@pm.me> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
Utkal Singh [Tue, 17 Mar 2026 15:24:39 +0000 (15:24 +0000)]
erofs: harden h_shared_count in erofs_init_inode_xattrs()
`u8 h_shared_count` indicates the shared xattr count of an inode. It is
read from the on-disk xattr ibody header, which should be corrupted if
the size of the shared xattr array exceeds the space available in
`xattr_isize`.
It does not cause harmful consequence (e.g. crashes), since the image is
already considered corrupted, it indeed results in the silent processing
of garbage metadata.
Bitterblue Smith [Wed, 25 Mar 2026 22:27:15 +0000 (00:27 +0200)]
wifi: rtw88: coex: Ignore BT info byte 5 from RTL8821A
Sometimes while watching a Youtube video with Bluetooth headphones the
audio has a lot of interruptions, because the 5th byte of the BT info
sent by RTL8821AU has strange values, which result in
coex_stat->bt_hid_pair_num being 2 or 3. When this happens
rtw_coex_freerun_check() returns true, which causes
rtw_coex_action_wl_connected() to call rtw_coex_action_freerun() instead
of rtw_coex_action_bt_a2dp().
The RTL8821AU vendor driver doesn't do anything with the 5th byte of the
BT info, so ignore it here as well.
Zong-Zhe Yang [Wed, 25 Mar 2026 07:21:30 +0000 (15:21 +0800)]
wifi: rtw89: fw: load TX power elements according to AID
For different A-die, there will be different TX power parameters.
In FW element header, the corresponding A-die ID will be described.
So, compare runtime AID with that to load the target TX power
parameters.
Ping-Ke Shih [Wed, 25 Mar 2026 07:21:29 +0000 (15:21 +0800)]
wifi: rtw89: phy: load RF parameters relying on ACV for RTL8922D
RF parameters are conditional formats with RFE type and CV as arguments,
but RTL8922D has many variants and use ACV as argument instead of CV.
Add to select proper register values.
Ping-Ke Shih [Wed, 25 Mar 2026 07:21:27 +0000 (15:21 +0800)]
wifi: rtw89: mac: disable pre-load function for RTL8922DE
The pre-load function is a MAC function to pre-load TX packets into
WiFi device's memory, so it can enhance performance. However, RTL8922DE
has not fully verified and fine tune this function, temporarily disable
this function.
Ping-Ke Shih [Wed, 25 Mar 2026 07:21:26 +0000 (15:21 +0800)]
wifi: rtw89: mac: add specific case to dump mac memory for RTL8922D
The RTL8922D can reuse most mac memory addresses, but only
RTW89_MAC_MEM_SECURITY_CAM is different from existing one. Add a function
to return the specific memory address for RTL8922D.
Ping-Ke Shih [Wed, 25 Mar 2026 07:21:25 +0000 (15:21 +0800)]
wifi: rtw89: pci: clear SER ISR when initial and leaving WoWLAN for WiFi 7 chips
The PCIE SER is to diagnose PCIE becomes abnormal, relying on IMR settings
to trigger interrupt when status is weird. Update settings to disable
PHY error flag 9, and clear ISR when initial and leaving WoWLAN.
Chin-Yen Lee [Wed, 25 Mar 2026 07:21:24 +0000 (15:21 +0800)]
wifi: rtw89: wow: enable MLD address for Magic packet wakeup
Under MLO connections, the original Magic Packet configuration
only supported Link Addresses for wakeup. Update the setting
to support both MLD Address and Link Addresses for wakeup process.
Ping-Ke Shih [Tue, 24 Mar 2026 06:20:49 +0000 (14:20 +0800)]
wifi: rtw89: 8922d: add set channel of RF part
The set channel of RF part is to configure channel and bandwidth on a
register. The function to encode channel and bandwidth into register
value will be implemented by coming patch.
Ping-Ke Shih [Tue, 24 Mar 2026 06:20:48 +0000 (14:20 +0800)]
wifi: rtw89: 8922d: add set channel of BB part
The set channel of BB part is the main part, which according to channel
and bandwidth to configure CCK SCO, RX gain of LNA and TIA programmed
in efuse for CCK and OFDM rate, and spur elimination of CSI and NBI tones.
Ping-Ke Shih [Tue, 24 Mar 2026 06:20:47 +0000 (14:20 +0800)]
wifi: rtw89: 8922d: add set channel of MAC part
The set channel is a key function to switch to specific operating channel.
For MAC part, configure hardware according to channel bandwidth, and
enable CCK rate for 2GHz band only.
Ping-Ke Shih [Tue, 24 Mar 2026 06:20:46 +0000 (14:20 +0800)]
wifi: rtw89: 8922d: read and configure RF by calibration data from efuse physical map
The calibration data is from physical map, including 1) thermal trim to
align output thermal value across chips, and 2) PA bias to transmit
expected power by controller.
Ping-Ke Shih [Tue, 24 Mar 2026 06:20:45 +0000 (14:20 +0800)]
wifi: rtw89: 8922d: define efuse map and read necessary fields
Define specific efuse map for RTL8922D, including TSSI, RX gain, MAC
address, RFE type and etc. The additional fields comparing to existing
chips are BT setting (define BT switch GPIO, antenna number and etc) and
gain offset2 (define more fields like existing RX gain offset).
Sanman Pradhan [Sun, 29 Mar 2026 17:09:48 +0000 (17:09 +0000)]
hwmon: (pxe1610) Check return value of page-select write in probe
pxe1610_probe() writes PMBUS_PAGE to select page 0 but does not check
the return value. If the write fails, subsequent register reads operate
on an indeterminate page, leading to silent misconfiguration.
Check the return value and propagate the error using dev_err_probe(),
which also handles -EPROBE_DEFER correctly without log spam.
Sanman Pradhan [Sun, 29 Mar 2026 17:09:40 +0000 (17:09 +0000)]
hwmon: (tps53679) Fix array access with zero-length block read
i2c_smbus_read_block_data() can return 0, indicating a zero-length
read. When this happens, tps53679_identify_chip() accesses buf[ret - 1]
which is buf[-1], reading one byte before the buffer on the stack.
Fix by changing the check from "ret < 0" to "ret <= 0", treating a
zero-length read as an error (-EIO), which prevents the out-of-bounds
array access.
Also fix a typo in the adjacent comment: "if present" instead of
duplicate "if".
Ping-Ke Shih [Tue, 24 Mar 2026 06:20:44 +0000 (14:20 +0800)]
wifi: rtw89: 8922d: add power on/off functions
The power on function is the first entry to power on hardware including
all MAC/BB/RF circuits, and then it becomes possible to do high level
operations, such as WiFi scan, connection.
If connection becomes unavailable, device stays into idle mode, calling
power off function to cut power.
Ping-Ke Shih [Tue, 24 Mar 2026 06:20:43 +0000 (14:20 +0800)]
wifi: rtw89: 8922d: add definition of quota, registers and efuse block
The quota is used to configure memory size for TX/RX, and the definition
of registers includes H2C command, C2H event, WoWLAN reason, IMR of CMAC
and DMAC, ACK rate selector, RF kill GPIO, and BB functions of dynamic
initial gain and EDCCA. The layout of efuse block is to define logic
map of efuse, such as MAC address and RF calibration values.
Ping-Ke Shih [Tue, 24 Mar 2026 01:10:01 +0000 (09:10 +0800)]
wifi: rtw88: validate RX rate to prevent out-of-bound
The reported RX rate might be unexpected, causing kernel warns:
Rate marked as a VHT rate but data is invalid: MCS: 0, NSS: 0
WARNING: net/mac80211/rx.c:5491 at ieee80211_rx_list+0x183/0x1020 [mac80211]
As the RX rate can be index of an array under certain conditions, validate
it to prevent accessing array out-of-bound potentially.
Tested on HP Notebook P3S95EA#ACB (kernel 6.19.9-1-cachyos):
- No WARNING: net/mac80211/rx.c:5491 observed after the v2 patch.
The unexpected `NSS: 0, MCS: 0` VHT rate warnings are successfully
mitigated.
- The system remains fully stable through prolonged idle periods,
high network load, active Bluetooth A2DP usage, and multiple deep
suspend/resume cycles.
- Zero h2c timeouts or firmware lps state errors observed in dmesg.
wifi: rtw89: phy: fix uninitialized variable access in rtw89_phy_cfo_set_crystal_cap()
In the rtw89_phy_cfo_set_crystal_cap() function, for chips other than
RTL8852A/RTL8851B, the values read by rtw89_mac_read_xtal_si() are
stored into the local variables sc_xi_val and sc_xo_val. If either
read fails, these variables remain uninitialized, they are later
used to update cfo->crystal_cap and in debug print statements. This
can lead to undefined behavior.
Fix the issue by initializing sc_xi_val and sc_xo_val to zero,
like is implemented in vendor driver.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Christian Hewitt [Tue, 17 Mar 2026 11:21:55 +0000 (11:21 +0000)]
wifi: rtw89: retry efuse physical map dump on transient failure
On Radxa Rock 5B with a RTL8852BE combo WiFi/BT card, the efuse
physical map dump intermittently fails with -EBUSY during probe.
The failure occurs in rtw89_dump_physical_efuse_map_ddv() where
read_poll_timeout_atomic() times out waiting for the B_AX_EF_RDY
bit after 1 second.
The root cause is a timing race during boot: the WiFi driver's
chip initialization (firmware download via PCIe) overlaps with
Bluetooth firmware download to the same combo chip via USB. This
can leave the efuse controller temporarily unavailable when the
WiFi driver attempts to read the efuse map.
The firmware download path retries up to 5 times, but the efuse
read that follows has no similar logic. Address this by adding
retry loop logic (also up to 5 attempts) around physical efuse
map dump.
Bitterblue Smith [Wed, 18 Mar 2026 17:45:13 +0000 (19:45 +0200)]
wifi: rtw88: TX QOS Null data the same way as Null data
When filling out the TX descriptor, Null data frames are treated like
management frames, but QOS Null data frames are treated like normal
data frames. Somehow this causes a problem for the firmware.
When connected to a network in the 2.4 GHz band, wpa_supplicant (or
NetworkManager?) triggers a scan every five minutes. During these scans
mac80211 transmits many QOS Null frames in quick succession. Because
these frames are marked with IEEE80211_TX_CTL_REQ_TX_STATUS, rtw88
asks the firmware to report the TX ACK status for each of these frames.
Sometimes the firmware can't process the TX status requests quickly
enough, they add up, it only processes some of them, and then marks
every subsequent TX status report with the wrong number.
The symptom is that after a while the warning "failed to get tx report
from firmware" appears every five minutes.
This problem apparently happens only with the older RTL8723D, RTL8821A,
RTL8812A, and probably RTL8703B chips.
Treat QOS Null data frames the same way as Null data frames. This seems
to avoid the problem.
Tested with RTL8821AU, RTL8723DU, RTL8811CU, and RTL8812BU.
Ping-Ke Shih [Mon, 16 Mar 2026 03:56:35 +0000 (11:56 +0800)]
wifi: rtw88: add quirks to disable PCI ASPM and deep LPS for HP P3S95EA#ACB
On an HP laptop (P3S95EA#ACB) equipped with a Realtek RTL8821CE 802.11ac
PCIe adapter (PCI ID: 10ec:c821), the system experiences a hard lockup
(complete freeze of the UI and kernel, sysrq doesn't work, requires
holding the power button) when the WiFi adapter enters the power
saving state. Disable PCI ASPM to avoid system freeze.
In addition, driver throws messages periodically. Though this doesn't
always cause unstable connection, missing H2C commands might cause
unpredictable results. Disable deep LPS to avoid this as well.
rtw88_8821ce 0000:13:00.0: firmware failed to leave lps state
rtw88_8821ce 0000:13:00.0: failed to send h2c command
rtw88_8821ce 0000:13:00.0: failed to send h2c command
Tested on HP Notebook P3S95EA#ACB (kernel 6.19.7-1-cachyos):
- No hard freeze observed during idle or active usage.
- Zero h2c or lps errors in dmesg across idle (10 min),
load stress (100MB download), and suspend/resume cycle.
- Both quirk flags confirmed active via sysfs without any
manual modprobe parameters.
Gary Guo [Tue, 3 Feb 2026 11:34:10 +0000 (11:34 +0000)]
kbuild: rust: provide an option to inline C helpers into Rust
A new experimental Kconfig option, `RUST_INLINE_HELPERS` is added to
allow C helpers (which were created to allow Rust to call into
inline/macro C functions without having to re-implement the logic in
Rust) to be inlined into Rust crates without performing global LTO.
If the option is enabled, the following is performed:
* For helpers, instead of compiling them to an object file to be linked
into vmlinux, they're compiled to LLVM IR bitcode. Two versions are
generated: one for built-in code (`helpers.bc`) and one for modules
(`helpers_module.bc`, with -DMODULE defined). This ensures that C
macros/inlines that behave differently for modules (e.g. static calls)
function correctly when inlined.
* When a Rust crate or object is compiled, instead of generating an
object file, LLVM bitcode is generated.
* llvm-link is invoked with --internalize to combine the helper bitcode
with the crate bitcode. This step is similar to LTO, but this is much
faster since it only needs to inline the helpers.
* clang is invoked to turn the combined bitcode into a final object file.
* Since clang may produce LLVM bitcode when LTO is enabled, and objtool
requires ELF input, $(cmd_ld_single) is invoked to ensure the object
is converted to ELF before objtool runs.
The --internalize flag tells llvm-link to treat all symbols in
helpers.bc using `internal` linkage [1]. This matches the behavior of
`clang` on `static inline` functions, and avoids exporting the symbol
from the object file.
To ensure that RUST_INLINE_HELPERS is not incompatible with BTF, we pass
the -g0 flag when building helpers. See commit 5daa0c35a1f0 ("rust:
Disallow BTF generation with Rust + LTO") for details.
We have an intended triple mismatch of `aarch64-unknown-none` vs
`aarch64-unknown-linux-gnu`, so we pass --suppress-warnings to llvm-link
to suppress it.
I considered adding some sort of check that KBUILD_MODNAME is not
present in helpers_module.bc, but this is actually not so easy to carry
out because .bc files store strings in a weird binary format, so you
cannot just grep it for a string to check whether it ended up using
KBUILD_MODNAME anywhere.
[ Andreas writes:
For the rnull driver, enabling helper inlining with this patch
gives an average speedup of 2% over the set of 120 workloads that
we publish on [2].
Link: https://rust-for-linux.com/null-block-driver
This series also uncovered a pre-existing UB instance thanks to an
`objtool` warning which I noticed while testing the series (details
in the mailing list).
- Miguel ]
Link: https://github.com/llvm/llvm-project/pull/170397 Co-developed-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Co-developed-by: Matthew Maurer <mmaurer@google.com> Signed-off-by: Matthew Maurer <mmaurer@google.com> Signed-off-by: Gary Guo <gary@garyguo.net> Co-developed-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Andreas Hindborg <a.hindborg@kernel.org> Reviewed-by: Andreas Hindborg <a.hindborg@kernel.org> Link: https://patch.msgid.link/20260203-inline-helpers-v2-3-beb8547a03c9@google.com
[ Some changes, apart from the rebase:
- Added "(EXPERIMENTAL)" to Kconfig as the commit mentions.
- Added `depends on ARM64 || X86_64` and `!UML` for now, since this is
experimental, other architectures may require other changes (e.g.
the issues I mentioned in the mailing list for ARM and UML) and they
are not really tested so far. So let arch maintainers pick this up
if they think it is worth it.
- Gated the `cmd_ld_single` step also into the new mode, which also
means that any possible future `objcopy` step is done after the
translation, as expected.
- Added `.gitignore` for `.bc` with exception for existing script.
- Added `part-of-*` for helpers bitcode files as discussed, and
dropped `$(if $(filter %_module.bc,$@),-DMODULE)` since `-DMODULE`
is already there (would be duplicated otherwise).
- Moved `LLVM_LINK` to keep binutils list alphabetized.
- Fixed typo in title.
- Dropped second `cmd_ld_single` commit message paragraph.
- Miguel ] Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Gary Guo [Tue, 3 Feb 2026 11:34:09 +0000 (11:34 +0000)]
rust: helpers: #define __rust_helper
Because of LLVM inling checks, it's generally not possible to inline a C
helper into Rust code, even with LTO:
* LLVM doesn't want to inline functions compiled with
`-fno-delete-null-pointer-checks` with code compiled without. The C
CGUs all have this enabled and Rust CGUs don't. Inlining is okay since
this is one of the hardening features that does not change the ABI,
and we shouldn't have null pointer dereferences in these helpers.
* LLVM doesn't want to inline functions with different list of builtins. C
side has `-fno-builtin-wcslen`; `wcslen` is not a Rust builtin, so
they should be compatible, but LLVM does not perform inlining due to
attributes mismatch.
* clang and Rust doesn't have the exact target string. Clang generates
`+cmov,+cx8,+fxsr` but Rust doesn't enable them (in fact, Rust will
complain if `-Ctarget-feature=+cmov,+cx8,+fxsr` is used). x86-64
always enable these features, so they are in fact the same target
string, but LLVM doesn't understand this and so inlining is inhibited.
This can be bypassed with `--ignore-tti-inline-compatible`, but this
is a hidden option.
To fix this, we can add __always_inline on every helper, which skips
these LLVM inlining checks. For this purpose, introduce a new
__rust_helper macro that needs to be added to every helper.
Most helpers already have __rust_helper specified, but there are a few
missing. The only consequence of this is that those specific helpers do
not get inlined.
Signed-off-by: Gary Guo <gary@garyguo.net> Signed-off-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Andreas Hindborg <a.hindborg@kernel.org> Reviewed-by: Andreas Hindborg <a.hindborg@kernel.org> Link: https://patch.msgid.link/20260203-inline-helpers-v2-2-beb8547a03c9@google.com Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
This config detects if Rust and Clang have matching LLVM major version.
All IR or bitcode operations (e.g. LTO) rely on LLVM major version to be
matching, otherwise it may generate errors, or worse, miscompile
silently due to change of IR semantics.
It's usually suggested to use the exact same LLVM version, but this can
be difficult to guarantee. Rust's suggestion [1] is also major-version
only, so I think this check is sufficient for the kernel.
Link: https://doc.rust-lang.org/rustc/linker-plugin-lto.html Reviewed-by: Andreas Hindborg <a.hindborg@kernel.org> Signed-off-by: Gary Guo <gary@garyguo.net> Signed-off-by: Matthew Maurer <mmaurer@google.com> Signed-off-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Nicolas Schier <nsc@kernel.org> Tested-by: Nicolas Schier <nsc@kernel.org> Tested-by: Andreas Hindborg <a.hindborg@kernel.org> Link: https://patch.msgid.link/20260203-inline-helpers-v2-1-beb8547a03c9@google.com
[ Fixed typo. - Miguel ] Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Gary Guo [Thu, 19 Mar 2026 12:16:47 +0000 (12:16 +0000)]
rust: rework `build_assert!` documentation
Add a detailed comparison and recommendation of the three types of
build-time assertion macro as module documentation (and un-hide the module
to render them).
The documentation on the macro themselves are simplified to only cover the
scenarios where they should be used; links to the module documentation is
added instead.
Reviewed-by: Yury Norov <ynorov@nvidia.com> Signed-off-by: Gary Guo <gary@garyguo.net> Reviewed-by: Alexandre Courbot <acourbot@nvidia.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Danilo Krummrich <dakr@kernel.org> Link: https://patch.msgid.link/20260319121653.2975748-4-gary@kernel.org
[ Added periods on comments. - Miguel ] Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Gary Guo [Thu, 19 Mar 2026 12:16:46 +0000 (12:16 +0000)]
rust: add `const_assert!` macro
The macro is a more powerful version of `static_assert!` for use inside
function contexts. This is powered by inline consts, so enable the feature
for old compiler versions that does not have it stably.
While it is possible already to write `const { assert!(...) }`, this
provides a short hand that is more uniform with other assertions. It also
formats nicer with rustfmt where it will not be formatted into multiple
lines.
Two users that would route via the Rust tree are converted.
Reviewed-by: Yury Norov <ynorov@nvidia.com> Signed-off-by: Gary Guo <gary@garyguo.net> Reviewed-by: Alexandre Courbot <acourbot@nvidia.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Danilo Krummrich <dakr@kernel.org> Link: https://patch.msgid.link/20260319121653.2975748-3-gary@kernel.org
[ Rebased. Fixed period typo. - Miguel ] Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Linus Torvalds [Sun, 29 Mar 2026 22:24:28 +0000 (15:24 -0700)]
Merge tag 'vfs-7.0-rc6.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
- Fix netfs_limit_iter() hitting BUG() when an ITER_KVEC iterator
reaches it via core dump writes to 9P filesystems. Add ITER_KVEC
handling following the same pattern as the existing ITER_BVEC code.
- Fix a NULL pointer dereference in the netfs unbuffered write retry
path when the filesystem (e.g., 9P) doesn't set the prepare_write
operation.
- Clear I_DIRTY_TIME in sync_lazytime for filesystems implementing
->sync_lazytime. Without this the flag stays set and may cause
additional unnecessary calls during inode deactivation.
- Increase tmpfs size in mount_setattr selftests. A recent commit
bumped the ext4 image size to 2 GB but didn't adjust the tmpfs
backing store, so mkfs.ext4 fails with ENOSPC writing metadata.
- Fix an invalid folio access in iomap when i_blkbits matches the folio
size but differs from the I/O granularity. The cur_folio pointer
would not get invalidated and iomap_read_end() would still be called
on it despite the IO helper owning it.
- Fix hash_name() docstring.
- Fix read abandonment during netfs retry where the subreq variable
used for abandonment could be uninitialized on the first pass or
point to a deleted subrequest on later passes.
- Don't block sync for filesystems with no data integrity guarantees.
Add a SB_I_NO_DATA_INTEGRITY superblock flag replacing the per-inode
AS_NO_DATA_INTEGRITY mapping flag so sync kicks off writeback but
doesn't wait for flusher threads. This fixes a suspend-to-RAM hang on
fuse-overlayfs where the flusher thread blocks when the fuse daemon
is frozen.
- Fix a lockdep splat in iomap when reads fail. iomap_read_end_io()
invokes fserror_report() which calls igrab() taking i_lock in hardirq
context while i_lock is normally held with interrupts enabled. Kick
failed read handling to a workqueue.
- Remove the redundant netfs_io_stream::front member and use
stream->subrequests.next instead, fixing a potential issue in the
direct write code path.
* tag 'vfs-7.0-rc6.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
netfs: Fix the handling of stream->front by removing it
iomap: fix lockdep complaint when reads fail
writeback: don't block sync for filesystems with no data integrity guarantees
netfs: Fix read abandonment during retry
vfs: fix docstring of hash_name()
iomap: fix invalid folio access when i_blkbits differs from I/O granularity
selftests/mount_setattr: increase tmpfs size for idmapped mount tests
fs: clear I_DIRTY_TIME in sync_lazytime
netfs: Fix NULL pointer dereference in netfs_unbuffered_write() on retry
netfs: Fix kernel BUG in netfs_limit_iter() for ITER_KVEC iterators
====================
net: hsr: subsystem cleanups and modernization
This series contains two focused HSR cleanups with practical benefit.
It constifies protocol operation tables and replaces a hardcoded
function name with __func__ to keep diagnostics correct across
refactoring.
====================
Luka Gejak [Thu, 26 Mar 2026 17:46:00 +0000 (18:46 +0100)]
net: hsr: use __func__ instead of hardcoded function name
Replace the hardcoded string "hsr_get_untagged_frame" with the
standard __func__ macro in netdev_warn_once() call to make the code
more robust to refactoring.
Luka Gejak [Thu, 26 Mar 2026 17:45:59 +0000 (18:45 +0100)]
net: hsr: constify hsr_ops and prp_ops protocol operation structures
The hsr_ops and prp_ops structures are assigned to hsr->proto_ops during
device initialization and are never modified at runtime. Declaring them
as const allows the compiler to place these structures in read-only
memory, which improves security by preventing accidental or malicious
modification of the function pointers they contain.
The proto_ops field in struct hsr_priv is also updated to a const
pointer to maintain type consistency.
Jakub Kicinski [Sun, 29 Mar 2026 21:35:29 +0000 (14:35 -0700)]
Merge branch 'macb-usrio-tsu-patches'
Conor Dooley says:
====================
macb usrio/tsu patches
At the very least, it'd be good of the soc vendor folks could check
their platforms and see if their usrio stuff actually lines up with what
the driver currently calls "macb_default_usrio". Ours didn't and it was
a nasty surprise.
Ryan and I figured out that the sama7g5 stuff is not actually using the
same usrio bits as earlier devices, so there's now more patches in this
series to split them apart. I've not tested the split or the new
property due to lack of hardware, but Ryan has.
Marking this stuff net-next, because although they're fixes I don't see
any particular urgency, and it avoids creating some dependencies between
cleanup items and the fixes.
====================
Théo Lebrun [Wed, 25 Mar 2026 16:28:18 +0000 (16:28 +0000)]
net: macb: drop usrio pointer on EyeQ5 config
USRIO is disabled on this platform, drop its inherited usrio config.
We will end up with MACB_CAPS_USRIO_DISABLED on this platform:
- We have no config->usrio so macb_configure_caps() deduces that the
feature is disabled.
- Anecdotally, we would also land in the runtime detection codepath
that reads DCFG1.
Théo Lebrun [Wed, 25 Mar 2026 16:28:17 +0000 (16:28 +0000)]
net: macb: set MACB_CAPS_USRIO_DISABLED if no usrio config is provided
bp->usrio is copied directly from dt_conf->usrio in macb_probe().
If dt_conf->usrio is NULL, we do not want to land in USRIO write
codepaths which dereference bp->usrio. Inherit automatically
MACB_CAPS_USRIO_DISABLED to avoid those.
This means a macb_config that wants to disable usrio can simply drop
its .usrio field, rather than add the disabled capability explicitly.
Nit: drop the dt_conf NULL check because the pointer is always valid.
DCFG1 (design config 1 register) carries a bit indicating whether User
I/O feature has been enabled or not. The MACB/GEM driver has a cap flag
indicating that HW has the feature disabled (default is enabled). Add
the missing connection between DCFG1 bit and MACB_CAPS_USRIO_DISABLED.
Indirect impact: avoid useless writel() on USERIO register; this is not
an important fix because USERIO is anyway read-only when feature is
disabled.
If for some reason a compatible sets USRIO_DISABLED but DCFG1 indicates
it is enabled, we still keep the disabled capability flag. This ensures
we don't break "cdns,np4-macb" that sets the flag from compatible match
data.
Conor Dooley [Wed, 25 Mar 2026 16:28:15 +0000 (16:28 +0000)]
net: macb: timer adjust mode is not supported
The ptp portion of this driver controls the tsu's timer using the
controls for "increment mode", which is not compatible with the hardware
trying to control it via the gem_tsu_inc_ctrl and gem_tsu_ms inputs in
"timer adjust mode". Abort probe if the property signalling that the
relevant signals have been wired up is present.
The GEM IP has two methods for modifying the ptp timer. The first of
these, named "increment mode", relies on software controlling the timer
by setting tsu_timer_incr and tsu_timer_incr_sub_nsec and performing
once-off adjustments via the tsu_timer_adjust register. This is what the
macb driver uses. The second mechanism, "timer adjust mode" uses the
gem_tsu_inc_ctrl and gem_tsu_ms signals to control the timer. These
modes are not intended to be used in parallel, but both can be possible
on the same device and which mode is used cannot be determined from the
compatible on all devices, because some users of the GEM IP are SoC
FPGAs that permit configuring how the IP is wired up.
Add a property to indicate that gem_tsu_inc_ctrl and gem_tsu_ms are wired
up for timer adjust mode.
Conor Dooley [Wed, 25 Mar 2026 16:28:13 +0000 (16:28 +0000)]
net: macb: clean up tsu clk rate acquisition
tsu_clk is grabbed during probe, so doesn't need to be re-grabbed here.
pclk is mandatory, probe will fail if it is err/NULL, so there's no need
to check it here or have a !pclk 3rd arm. Simplify gem_get_tsu_rate() to
account for these facts.
Conor Dooley [Wed, 25 Mar 2026 16:28:12 +0000 (16:28 +0000)]
net: macb: warn on pclk use as a tsu_clk fallback
The Candence GEM IP has a configuration parameter which determines the
source of the clock used for the timestamp unit (if it is enabled),
switching it between using the pclk and a dedicated input.
When ptp support was added to the macb driver, a new tsu_clk was added
to represent the dedicated input. While this is understandable, I think
it is bug prone and that the tsu_clk should represent whatever clock is
used for the timestamper and not just that specific input.
>From a driver point of view, the benefit of taking the conceptual
approach is avoiding misconfiguring the driver when the hardware
supports ptp (and it is set as a capability in the relevant per-device
structure) but no tsu_clk is provided in devicetree. At the moment, the
timestamper will be registered and programmed with an increment that
reflects the pclk in these cases, but will malfunction if the pclk and
tsu_clk frequencies do not match. Obviously, this means the devicetree
incorrectly represents the hardware, but this change in approach would
make the driver more resilient without meaningfully impacting correctly
described users.
Out of the devices that claim MACB_CAPS_GEM_HAS_PTP the fu540, mpfs,
sama5d2 and sama7g5-emac (but not sama7g5-gem) are at risk of having
this problem with the in-kernel devicetrees. mpfs and sama7g5-emac
have been confirmed to be incorrect, and sama5d2 is correct. It may be
that the other platforms actually do use the pclk for the timestamper
(either by supplying pclk to the tsu_clk input of the IP, or by having
the IP block configured to use pclk instead of the tsu_clk input), but
at least two are wrong, as they do not use pclk for the tsu_clk, so the
driver is registering the ptp clock incorrectly.
Add a warning if no tsu_clk is provided on a platform that uses the
timerstamper, to encourage people to specifically provide a tsu_clk and
avoid silently registering the timerstamper with the wrong clock. If the
pclk is actually used, it can be provided as a tsu_clk for improved
clarity in devicetrees.
While this changes the meaning of the devicetree property, it is
backwards compatible as there's no functional change for platforms that
didn't provide a tsu_clk and the changed meaning of providing a tsu_clk
in the devicetree does not impact platforms that already provided one as
the decision about the tsu clock source is at IP instantiation time
rather than at runtime, so there's no driver behaviour that needs to
change based on the input to the IP used for the timestamping unit.
Conor Dooley [Wed, 25 Mar 2026 16:28:11 +0000 (16:28 +0000)]
net: macb: add mpfs specific usrio configuration
On mpfs the driver needs to make sure the tsu clock source is not the
fabric, as this requires that the hardware is in Timer Adjust mode,
which is not compatible with the linux driver trying to control the
hardware. It is unlikely that this will be set, as the peripheral is
reset during probe, but if the resets are not provided in devicetree
it's probable that this bit is set incorrectly, as U-Boot's macb driver
has the same issue with using usrio settings for at91 platforms as the
default.
Conor Dooley [Wed, 25 Mar 2026 16:28:09 +0000 (16:28 +0000)]
net: macb: rework usrio refclk selection code
The USRIO based refclk selection code abuses a capability flag to set
the refclk to an external source based on match data/compatible on
sama7g5-emac and use an internal source for the gmac.
Ryan previously added a property in an attempt to decouple the refclk
source from the compatible, because this is not fixed by compatible
and there's variance based on the choices made by board designers.
Originally when Ryan added it, he removed the capability flag entirely
from match data, but this changed the default for the sama7g5-emac and
the removal had to be reverted for these devices. Because these devices
default to an external refclk, and the current property is only capable
of communicating external refclks, there's no way to make the
sama7g5-emac use an internal refclk.
Additionally, this property has no limiting based on compatible, and
if used on a platform with an external refclk that is not controlled
by USRIO the capability would be erroneously set. Because of the reuse
of the at91_default_usrio struct by non-at91 devices, this could cause
the refclk bit to be set in error, on a system where the refclk is
externally provided without usrio settings being required.
Change the new capability flag so that it actually represents the
hardware being capable of controlling the refclk source via USRIO,
and move the selection of default behaviour into the macb_usrio_config
struct provided as part of match data.
Modify the devicetree code to support a new property,
"cdns,refclk-source" which will support devices with either default,
retaining support for "cdns,refclk-external" for compatibility reasons.
Conor Dooley [Wed, 25 Mar 2026 16:28:08 +0000 (16:28 +0000)]
dt-bindings: net: cdns,macb: replace cdns,refclk-ext with cdns,refclk-source
Ryan added cdns,refclk-ext with the intent of decoupling the source of
the reference clock on sama7g5 (and related platforms) from the
compatible. Unfortunately, the default for sama7g5-emac is an external
reference clock, so this property had no effect there, so that
compatibility with older devicetrees is preserved.
Replace cdns,refclk-ext with one that supports both default states and
therefore is usable for sama7g5-emac.
For now, limit it to only the platforms that have USRIO controlled
reference clock selection, but this could be generalised in the future.
The existing property only works on devices that are compatible with
sama7g5-gem, so mark it deprecated, and limit its use to that specific
scenario.
Conor Dooley [Wed, 25 Mar 2026 16:28:07 +0000 (16:28 +0000)]
net: macb: split USRIO_HAS_CLKEN capability in two
While trying to rework the internal/external refclk selection on
sama7g5, Ryan and I noticed that the sama7g5 was "overloading" the
meaning of MACB_CAPS_USRIO_HAS_CLKEN, using it differently to how it was
originally intended.
Originally, on the macb hardware on sam9620 et al,
MACB_CAPS_USRIO_HAS_CLKEN represented the hardware having a bit that
needed to be set to turn on the input clock to the transceivers. The
sama7g5 doesn't have this bit, so for some reason the decision was made
to reuse this capability flag to control selection of internal/external
references.
Split the caps in two, so that capabilities do what they say on the tin,
and allow reworking the refclk selection handling without impacting the
older devices that use MACB_CAPS_USRIO_CLKEN for its original purpose.
Conor Dooley [Wed, 25 Mar 2026 16:28:06 +0000 (16:28 +0000)]
net: macb: rename macb_default_usrio to at91_default_usrio as not all platforms have mii mode control in usrio
Calling this structure macb_default_usrio is misleading, I believe, as
it implies that it should be used if your platform has nothing special
to do in usrio. Since usrio is platform dependent, the default here is
probably for each usrio to do nothing, with the macb documentation I
have access to prescribing no standard behaviour here. We noticed that
this was problematic because on mpfs, a bit that macb_default_usrio
sets to deal with the MII mode actually changes the source for the
tsu_clk to something with how the majority of mpfs devices are actually
configured!
Rename it to at91_default_usrio, since that's where the values actually
come from for these. I have no idea if any of the other platforms that
use the default actually copied at91's usrio configuration or if they
have usrio configurations where what the driver does has no impact.
Gate touching these bits behind a capability, like the clken refclock
usrio knob, so that platforms without the MII mode stuff can avoid
running this code.
Conor Dooley [Wed, 25 Mar 2026 16:28:05 +0000 (16:28 +0000)]
Revert "net: macb: Clean up the .usrio settings in macb_config instances"
Commit 0ae998c4efd69 ("net: macb: Clean up the .usrio settings in
macb_config instances") was a misguided attempt to clean up the driver
that actually just propagated problematic code. The default for usrio is
actually no usrio, and already there are issues with people using the
problematically named "macb_default_usrio" on platforms where the usrio
does not have this so-called default behaviour. usrio is platform
specific and using the default at91 usrio settings should be opt-in
only. Revert the "cleanup" patch.
====================
bnxt_en: Add XDP RSS hash metadata support
This series adds XDP RSS hash metadata extraction support for the bnxt_en
driver and includes selftests to validate the functionality. I was able
to test this on a BCM57414 NIC.
====================
This test loads xdp_metadata.bpf which calls bpf_xdp_metadata_rx_hash() on
incoming packets. The metadata from that packet is then sent to a BPF
map for validation. It borrows structure from xdp.py, reusing common
functions.
The test checks the device's xdp-rx-metadata-features via netlink
before running and skips on devices that do not advertise hash support.
This can be run on veth devices as well as real hardware.
The test is fairly simple and just verifies that a TCP or UDP packet can be
identified as an L4 flow. This minimal test also passes if run on a veth
device.
Chris J Arges [Wed, 25 Mar 2026 20:09:51 +0000 (15:09 -0500)]
selftests: net: move common xdp.py functions into lib
This moves a few functions which can be useful to other python programs
that manipulate XDP programs. This also refactors xdp.py to use the
refactored functions.
Chris J Arges [Wed, 25 Mar 2026 20:09:49 +0000 (15:09 -0500)]
bnxt_en: Move bnxt_rss_ext_op into header
This allows bnxt_rss_ext_op to be used by other functions. In addition this
modifies the rxcmp argument to be const since the function only reads from
this structure.
Add support for extracting RSS hash values and hash types from hardware
completion descriptors in XDP programs for bnxt_en.
Add IP_TYPE definition for determining if completion is ipv4 or ipv6. In
addition add ITYPE_ICMP flag for identifying ICMP completions.
Signed-off-by: Chris J Arges <carges@cloudflare.com> Reviewed-by: Joe Damato <joe@dama.to> Reviewed-by: Andy Gospodarek <gospo@broadcom.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Chris J Arges [Wed, 25 Mar 2026 20:09:47 +0000 (15:09 -0500)]
bnxt_en: use bnxt_xdp_buff for xdp context
This adds bnxt_xdp_buff which embeds the xdp_buff struct and stores
pointers to hardware RX completion descriptors (rx_cmp and rx_cmp_ext)
along with the completion type.
Implement an optimized CRC64 (NVMe) algorithm for ARM64 using NEON
Polynomial Multiply Long (PMULL) instructions. The generic shift-and-XOR
software implementation is slow, which creates a bottleneck in NVMe and
other storage subsystems.
The acceleration is implemented using C intrinsics (<arm_neon.h>) rather
than raw assembly for better readability and maintainability.
Key highlights of this implementation:
- Uses 4KB chunking inside scoped_ksimd() to avoid preemption latency
spikes on large buffers.
- Pre-calculates and loads fold constants via vld1q_u64() to minimize
register spilling.
- Benchmarks show the break-even point against the generic implementation
is around 128 bytes. The PMULL path is enabled only for len >= 128.
validate_fixed_range() admits buf_addr at the exact end of the
registered region when len is zero, because the check uses strict
greater-than (buf_end > imu->ubuf + imu->len). io_import_fixed()
then computes offset == imu->len, which causes the bvec skip logic
to advance past the last bio_vec entry and read bv_offset from
out-of-bounds slab memory.
Return early from io_import_fixed() when len is zero. A zero-length
import has no data to transfer and should not walk the bvec array
at all.
BUG: KASAN: slab-out-of-bounds in io_import_reg_buf+0x697/0x7f0
Read of size 4 at addr ffff888002bcc254 by task poc/103
Call Trace:
io_import_reg_buf+0x697/0x7f0
io_write_fixed+0xd9/0x250
__io_issue_sqe+0xad/0x710
io_issue_sqe+0x7d/0x1100
io_submit_sqes+0x86a/0x23c0
__do_sys_io_uring_enter+0xa98/0x1590
Allocated by task 103:
The buggy address is located 12 bytes to the right of
allocated 584-byte region [ffff888002bcc000, ffff888002bcc248)