Linus Torvalds [Tue, 18 Nov 2025 18:02:22 +0000 (10:02 -0800)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"Arm:
- Only adjust the ID registers when no irqchip has been created once
per VM run, instead of doing it once per vcpu, as this otherwise
triggers a pretty bad conbsistency check failure in the sysreg code
- Make sure the per-vcpu Fine Grain Traps are computed before we load
the system registers on the HW, as we otherwise start running
without anything set until the first preemption of the vcpu
x86:
- Fix selftests failure on AMD, checking for an optimization that was
not happening anymore"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: SVM: Fix redundant updates of LBR MSR intercepts
KVM: arm64: VHE: Compute fgt traps before activating them
KVM: arm64: Finalize ID registers only once per VM
Yosry Ahmed [Wed, 12 Nov 2025 01:30:17 +0000 (01:30 +0000)]
KVM: SVM: Fix redundant updates of LBR MSR intercepts
Don't update the LBR MSR intercept bitmaps if they're already up-to-date,
as unconditionally updating the intercepts forces KVM to recalculate the
MSR bitmaps for vmcb02 on every nested VMRUN. The redundant updates are
functionally okay; however, they neuter an optimization in Hyper-V
nested virtualization enlightenments and this manifests as a self-test
failure.
In particular, Hyper-V lets L1 mark "nested enlightenments" as clean, i.e.
tell KVM that no changes were made to the MSR bitmap since the last VMRUN.
The hyperv_svm_test KVM selftest intentionally changes the MSR bitmap
"without telling KVM about it" to verify that KVM honors the clean hint,
correctly fails because KVM notices the changed bitmap anyway:
==== Test Assertion Failure ====
x86/hyperv_svm_test.c:120: vmcb->control.exit_code == 0x081
pid=193558 tid=193558 errno=4 - Interrupted system call
1 0x0000000000411361: assert_on_unhandled_exception at processor.c:659
2 0x0000000000406186: _vcpu_run at kvm_util.c:1699
3 (inlined by) vcpu_run at kvm_util.c:1710
4 0x0000000000401f2a: main at hyperv_svm_test.c:175
5 0x000000000041d0d3: __libc_start_call_main at libc-start.o:?
6 0x000000000041f27c: __libc_start_main_impl at ??:?
7 0x00000000004021a0: _start at ??:?
vmcb->control.exit_code == SVM_EXIT_VMMCALL
Do *not* fix this by skipping svm_hv_vmcb_dirty_nested_enlightenments()
when svm_set_intercept_for_msr() performs a no-op change. changes to
the L0 MSR interception bitmap are only triggered by full CPUID updates
and MSR filter updates, both of which should be rare. Changing
svm_set_intercept_for_msr() risks hiding unintended pessimizations
like this one, and is actually more complex than this change.
Fixes: fbe5e5f030c2 ("KVM: nSVM: Always recalculate LBR MSR intercepts in svm_update_lbrv()") Cc: stable@vger.kernel.org Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251112013017.1836863-1-yosry.ahmed@linux.dev
[Rewritten commit message based on mailing list discussion. - Paolo] Reviewed-by: Sean Christopherson <seanjc@google.com> Tested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Tue, 18 Nov 2025 16:38:01 +0000 (17:38 +0100)]
Merge tag 'kvmarm-fixes-6.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 6.18, take #3
- Only adjust the ID registers when no irqchip has been created once
per VM run, instead of doing it once per vcpu, as this otherwise
triggers a pretty bad conbsistency check failure in the sysreg code.
- Make sure the per-vcpu Fine Grain Traps are computed before we load
the system registers on the HW, as we otherwise start running without
anything set until the first preemption of the vcpu.
Linus Torvalds [Tue, 18 Nov 2025 16:21:27 +0000 (08:21 -0800)]
mm/huge_memory: Fix initialization of huge zero folio
The recent fix to properly initialize the tags of the huge zero folio
had an unfortunate not-so-subtle side effect: it caused the actual
*contents* of the huge zero folio to not be initialized at all when the
hardware didn't support the memory tagging.
The reason was the unfortunate semantics of tag_clear_highpage(): on
hardware that didn't do the tagging, it would silently just not do
anything at all. And since this is done only on arm64 with MTE support,
that basically meant most hardware.
It wasn't necessarily immediately obvious since the huge zero page isn't
necessarily very heavily used - or because it might already be zero
because all-zeroes is the most common pattern. But it ends up causing
random odd user space failures when you do hit it.
The unfortunate semantics have been around for a while, but became a
real bug only when we started actively using __GFP_ZEROTAGS in the
generic get_huge_zero_folio() function - before that, it had only ever
been used in code that checked that the hardware supported it.
Fix this by simply changing the semantics of tag_clear_highpage() to
return whether it actually successfully did something or not. While at
it, also make it initialize multiple pages in one go, since that's
actually what the only caller wants it to do and it simplifies the whole
logic.
Fixes: adfb6609c680 ("mm/huge_memory: initialise the tags of the huge zero folio") Link: https://lore.kernel.org/all/20251117082023.90176-1-00107082@163.com/ Reviewed-by: David Hildenbrand (Red Hat) <david@kernel.org> Reported-and-tested-by: David Wang <00107082@163.com> Reported-and-tested-by: Carlos Llamas <cmllamas@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Mon, 17 Nov 2025 17:11:27 +0000 (09:11 -0800)]
Merge tag 'vfs-6.18-rc7.fixes' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
- Fix unitialized variable in statmount_string()
- Fix hostfs mounting when passing host root during boot
- Fix dynamic lookup to fail on cell lookup failure
- Fix missing file type when reading bfs inodes from disk
- Enforce checking of sb_min_blocksize() calls and update all callers
accordingly
- Restore write access before closing files opened by open_exec() in
binfmt_misc
- Always freeze efivarfs during suspend/hibernate cycles
- Fix statmount()'s and listmount()'s grab_requested_mnt_ns() helper to
actually allow mount namespace file descriptor in addition to mount
namespace ids
- Fix tmpfs remount when noswap is specified
- Switch Landlock to iput_not_last() to remove false-positives from
might_sleep() annotations in iput()
- Remove dead node_to_mnt_ns() code
- Ensure that per-queue kobjects are successfully created
* tag 'vfs-6.18-rc7.fixes' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs:
landlock: fix splats from iput() after it started calling might_sleep()
fs: add iput_not_last()
shmem: fix tmpfs reconfiguration (remount) when noswap is set
fs/namespace: correctly handle errors returned by grab_requested_mnt_ns
power: always freeze efivarfs
binfmt_misc: restore write access before closing files opened by open_exec()
block: add __must_check attribute to sb_min_blocksize()
virtio-fs: fix incorrect check for fsvq->kobj
xfs: check the return value of sb_min_blocksize() in xfs_fs_fill_super
isofs: check the return value of sb_min_blocksize() in isofs_fill_super
exfat: check return value of sb_min_blocksize in exfat_read_boot_sector
vfat: fix missing sb_min_blocksize() return value checks
mnt: Remove dead code which might prevent from building
bfs: Reconstruct file type when loading from disk
afs: Fix dynamic lookup to fail on cell lookup failure
hostfs: Fix only passing host root in boot stage with new mount
fs: Fix uninitialized 'offp' in statmount_string()
Linus Torvalds [Mon, 17 Nov 2025 17:01:22 +0000 (09:01 -0800)]
Merge tag 'sched_ext-for-6.18-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext
Pull sched_ext fixes from Tejun Heo:
"Five fixes addressing PREEMPT_RT compatibility and locking issues.
Three commits fix potential deadlocks and sleeps in atomic contexts on
RT kernels by converting locks to raw spinlocks and ensuring IRQ work
runs in hard-irq context. The remaining two fix unsafe locking in the
debug dump path and a variable dereference typo"
* tag 'sched_ext-for-6.18-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext:
sched_ext: Use IRQ_WORK_INIT_HARD() to initialize rq->scx.kick_cpus_irq_work
sched_ext: Fix possible deadlock in the deferred_irq_workfn()
sched/ext: convert scx_tasks_lock to raw spinlock
sched_ext: Fix unsafe locking in the scx_dump_state()
sched_ext: Fix use of uninitialized variable in scx_bpf_cpuperf_set()
Linus Torvalds [Mon, 17 Nov 2025 16:57:21 +0000 (08:57 -0800)]
Merge tag 'mtd/fixes-for-6.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux
Pull MTD fixes from Miquel Raynal:
"Mostly small misc fixes, here they are sorted by sub-subsystem:
ECC fixes:
- Realtek Kconfig fix
SPI NAND fixes:
- Remove nonexistent QE bit on FMSH FM25S01A
Raw NAND fixes:
- Prevent DMA device NULL pointer dereference in Cadence driver
MTD device fixes:
- Possible integer overflow in read/write ioctls
- Fix the IRQ handler pointer in the onenand driver, even if in
practice it is never dereferenced.
* tag 'mtd/fixes-for-6.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
mtd: onenand: Pass correct pointer to IRQ handler
mtd: spinand: fmsh: remove QE bit for FM25S01A flash
mtd: rawnand: cadence: fix DMA device NULL pointer dereference
mtd: rawnand: realtek: Make rtl_ecc_engine_ops const
mtd: nand: MTD_NAND_ECC_REALTEK should depend on HAS_DMA
mtd: nand: realtek-ecc: Fix a IS_ERR() vs NULL bug in probe
mtdchar: fix integer overflow in read/write ioctls
Zqiang [Mon, 17 Nov 2025 12:53:10 +0000 (20:53 +0800)]
sched_ext: Use IRQ_WORK_INIT_HARD() to initialize rq->scx.kick_cpus_irq_work
For PREEMPT_RT kernels, the kick_cpus_irq_workfn() be invoked in
the per-cpu irq_work/* task context and there is no rcu-read critical
section to protect. this commit therefore use IRQ_WORK_INIT_HARD() to
initialize the per-cpu rq->scx.kick_cpus_irq_work in the
init_sched_ext_class().
Linus Torvalds [Sun, 16 Nov 2025 21:45:03 +0000 (13:45 -0800)]
Merge tag 'perf-tools-fixes-for-v6.18-2-2025-11-16' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Fix writing bpf_prog (infos|btfs)_cnt to data file, to not generate
invalid perf.data files in some corner cases.
- Fix 'perf top' segfault by ensuring libbfd is initialized. This is an
opt-in feature due to license incompatibilities.
- Fix segfault in 'perf lock' due to missing kernel map.
- Fix 'perf lock contention' test.
- Don't fail fast path detection if binutils-devel isn't available.
- Sync KVM's vmx.h with the kernel to pick SEAMCALL exit reason.
* tag 'perf-tools-fixes-for-v6.18-2-2025-11-16' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
perf libbfd: Ensure libbfd is initialized prior to use
perf test: Fix lock contention test
perf lock: Fix segfault due to missing kernel map
tools headers UAPI: Sync KVM's vmx.h with the kernel to pick SEAMCALL exit reason
perf build: Don't fail fast path feature detection when binutils-devel is not available
perf header: Write bpf_prog (infos|btfs)_cnt to data file
Linus Torvalds [Sun, 16 Nov 2025 21:31:14 +0000 (13:31 -0800)]
Merge tag 'mm-hotfixes-stable-2025-11-16-10-40' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"7 hotfixes. 5 are cc:stable, 4 are against mm/
All are singletons - please see the respective changelogs for details"
* tag 'mm-hotfixes-stable-2025-11-16-10-40' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm, swap: fix potential UAF issue for VMA readahead
selftests/user_events: fix type cast for write_index packed member in perf_test
lib/test_kho: check if KHO is enabled
mm/huge_memory: fix folio split check for anon folios in swapcache
MAINTAINERS: update David Hildenbrand's email address
crash: fix crashkernel resource shrink
mm: fix MAX_FOLIO_ORDER on powerpc configs with hugetlb
Linus Torvalds [Sun, 16 Nov 2025 15:08:28 +0000 (07:08 -0800)]
Merge tag 'firewire-fixes-6.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394
Pull firewire fixes from Takashi Sakamoto:
"This includes some fixes for the topology map, newly introduced in
v6.18 kernel"
* tag 'firewire-fixes-6.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394:
firewire: core: fix to update generation field in topology map
firewire: core: Initialize topology_map.lock
Linus Torvalds [Sun, 16 Nov 2025 15:05:24 +0000 (07:05 -0800)]
Merge tag 'edac_urgent_for_v6.18_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras
Pull EDAC fixes from Borislav Petkov:
- In Versalnet, handle the reporting of non-standard hw errors whose
information can come in more than one remote processor message.
- Explicitly reenable ECC checking after a warm reset in Altera OCRAM
as those registers are reset to default otherwise
- Fix single-bit error injection in Altera EDAC to not inject errors
directly in ECC RAM and thus lead to false double-bit errors due to
same ECC RAM being in concurrent use
* tag 'edac_urgent_for_v6.18_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
EDAC/altera: Use INTTEST register for Ethernet and USB SBE injection
EDAC/altera: Handle OCRAM ECC enable after warm reset
EDAC/versalnet: Handle split messages for non-standard errors
Takashi Sakamoto [Fri, 14 Nov 2025 14:44:21 +0000 (23:44 +0900)]
firewire: core: fix to update generation field in topology map
The generation field of topology map is updated after initialized by zero.
The updated value of generation field is always zero, and is against
specification.
Kairui Song [Tue, 11 Nov 2025 13:36:08 +0000 (21:36 +0800)]
mm, swap: fix potential UAF issue for VMA readahead
Since commit 78524b05f1a3 ("mm, swap: avoid redundant swap device
pinning"), the common helper for allocating and preparing a folio in the
swap cache layer no longer tries to get a swap device reference
internally, because all callers of __read_swap_cache_async are already
holding a swap entry reference. The repeated swap device pinning isn't
needed on the same swap device.
Caller of VMA readahead is also holding a reference to the target entry's
swap device, but VMA readahead walks the page table, so it might encounter
swap entries from other devices, and call __read_swap_cache_async on
another device without holding a reference to it.
So it is possible to cause a UAF when swapoff of device A raced with
swapin on device B, and VMA readahead tries to read swap entries from
device A. It's not easy to trigger, but in theory, it could cause real
issues.
Make VMA readahead try to get the device reference first if the swap
device is a different one from the target entry.
Link: https://lkml.kernel.org/r/20251111-swap-fix-vma-uaf-v1-1-41c660e58562@tencent.com Fixes: 78524b05f1a3 ("mm, swap: avoid redundant swap device pinning") Suggested-by: Huang Ying <ying.huang@linux.alibaba.com> Signed-off-by: Kairui Song <kasong@tencent.com> Acked-by: Chris Li <chrisl@kernel.org> Cc: Baoquan He <bhe@redhat.com> Cc: Barry Song <baohua@kernel.org> Cc: Kemeng Shi <shikemeng@huaweicloud.com> Cc: Nhat Pham <nphamcs@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Ankit Khushwaha [Thu, 6 Nov 2025 09:55:32 +0000 (15:25 +0530)]
selftests/user_events: fix type cast for write_index packed member in perf_test
Accessing 'reg.write_index' directly triggers a -Waddress-of-packed-member
warning due to potential unaligned pointer access:
perf_test.c:239:38: warning: taking address of packed member 'write_index'
of class or structure 'user_reg' may result in an unaligned pointer value
[-Waddress-of-packed-member]
239 | ASSERT_NE(-1, write(self->data_fd, ®.write_index,
| ^~~~~~~~~~~~~~~
Since write(2) works with any alignment. Casting '®.write_index'
explicitly to 'void *' to suppress this warning.
Zi Yan [Wed, 5 Nov 2025 16:29:10 +0000 (11:29 -0500)]
mm/huge_memory: fix folio split check for anon folios in swapcache
Both uniform and non uniform split check missed the check to prevent
splitting anon folios in swapcache to non-zero order.
Splitting anon folios in swapcache to non-zero order can cause data
corruption since swapcache only support PMD order and order-0 entries.
This can happen when one use split_huge_pages under debugfs to split
anon folios in swapcache.
In-tree callers do not perform such an illegal operation. Only debugfs
interface could trigger it. I will put adding a test case on my TODO
list.
Fix the check.
Link: https://lkml.kernel.org/r/20251105162910.752266-1-ziy@nvidia.com Fixes: 58729c04cf10 ("mm/huge_memory: add buddy allocator like (non-uniform) folio_split()") Signed-off-by: Zi Yan <ziy@nvidia.com> Reported-by: "David Hildenbrand (Red Hat)" <david@kernel.org> Closes: https://lore.kernel.org/all/dc0ecc2c-4089-484f-917f-920fdca4c898@kernel.org/ Acked-by: David Hildenbrand (Red Hat) <david@kernel.org> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Barry Song <baohua@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Nico Pache <npache@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
MAINTAINERS: update David Hildenbrand's email address
Switch to kernel.org email address as I will be leaving Red Hat. The old
address will remain active until end of January 2026, so performing the
change now should make sure that most mails will reach me.
If crashkernel is then shrunk to 50MB (echo 52428800 >
/sys/kernel/kexec_crash_size), /proc/iomem still shows 256MB reserved: af000000-beffffff : Crash kernel
This happens because __crash_shrink_memory()/kernel/crash_core.c
incorrectly updates the crashk_res resource object even when
crashk_low_res should be updated.
Fix this by ensuring the correct crashkernel resource object is updated
when shrinking crashkernel memory.
Link: https://lkml.kernel.org/r/20251101193741.289252-1-sourabhjain@linux.ibm.com Fixes: 16c6006af4d4 ("kexec: enable kexec_crash_size to support two crash kernel regions") Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com> Acked-by: Baoquan He <bhe@redhat.com> Cc: Zhen Lei <thunder.leizhen@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm: fix MAX_FOLIO_ORDER on powerpc configs with hugetlb
In the past, CONFIG_ARCH_HAS_GIGANTIC_PAGE indicated that we support
runtime allocation of gigantic hugetlb folios. In the meantime it evolved
into a generic way for the architecture to state that it supports gigantic
hugetlb folios.
In commit fae7d834c43c ("mm: add __dump_folio()") we started using
CONFIG_ARCH_HAS_GIGANTIC_PAGE to decide MAX_FOLIO_ORDER: whether we could
have folios larger than what the buddy can handle. In the context of that
commit, we started using MAX_FOLIO_ORDER to detect page corruptions when
dumping tail pages of folios. Before that commit, we assumed that we
cannot have folios larger than the highest buddy order, which was
obviously wrong.
In commit 7b4f21f5e038 ("mm/hugetlb: check for unreasonable folio sizes
when registering hstate"), we used MAX_FOLIO_ORDER to detect
inconsistencies, and in fact, we found some now.
Powerpc allows for configs that can allocate gigantic folio during boot
(not at runtime), that do not set CONFIG_ARCH_HAS_GIGANTIC_PAGE and can
exceed PUD_ORDER.
To fix it, let's make powerpc select CONFIG_ARCH_HAS_GIGANTIC_PAGE with
hugetlb on powerpc, and increase the maximum folio size with hugetlb to 16
GiB on 64bit (possible on arm64 and powerpc) and 1 GiB on 32 bit
(powerpc). Note that on some powerpc configurations, whether we actually
have gigantic pages depends on the setting of CONFIG_ARCH_FORCE_MAX_ORDER,
but there is nothing really problematic about setting it unconditionally:
we just try to keep the value small so we can better detect problems in
__dump_folio() and inconsistencies around the expected largest folio in
the system.
Ideally, we'd have a better way to obtain the maximum hugetlb folio size
and detect ourselves whether we really end up with gigantic folios. Let's
defer bigger changes and fix the warnings first.
While at it, handle gigantic DAX folios more clearly: DAX can only end up
creating gigantic folios with HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD.
Add a new Kconfig option HAVE_GIGANTIC_FOLIOS to make both cases clearer.
In particular, worry about ARCH_HAS_GIGANTIC_PAGE only with HUGETLB_PAGE.
Note: with enabling CONFIG_ARCH_HAS_GIGANTIC_PAGE on powerpc, we will now
also allow for runtime allocations of folios in some more powerpc configs.
I don't think this is a problem, but if it is we could handle it through
__HAVE_ARCH_GIGANTIC_PAGE_RUNTIME_SUPPORTED.
While __dump_page()/__dump_folio was also problematic (not handling
dumping of tail pages of such gigantic folios correctly), it doesn't seem
critical enough to mark it as a fix.
Link: https://lkml.kernel.org/r/20251114214920.2550676-1-david@kernel.org Fixes: 7b4f21f5e038 ("mm/hugetlb: check for unreasonable folio sizes when registering hstate") Reported-by: Christophe Leroy <christophe.leroy@csgroup.eu> Closes: https://lore.kernel.org/r/3e043453-3f27-48ad-b987-cc39f523060a@csgroup.eu/ Reported-by: Sourabh Jain <sourabhjain@linux.ibm.com> Closes: https://lore.kernel.org/r/94377f5c-d4f0-4c0f-b0f6-5bf1cd7305b1@linux.ibm.com/ Signed-off-by: David Hildenbrand (Red Hat) <david@kernel.org> Cc: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Donet Tom <donettom@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Mike Rapoport <rppt@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Linus Torvalds [Sat, 15 Nov 2025 16:51:43 +0000 (08:51 -0800)]
Merge tag 'timers-urgent-2025-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Ingo Molnar:
"Fix a memory leak in the posix timer creation logic"
* tag 'timers-urgent-2025-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
posix-timers: Plug potential memory leak in do_timer_create()
Linus Torvalds [Sat, 15 Nov 2025 16:48:51 +0000 (08:48 -0800)]
Merge tag 'irq-urgent-2025-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fix from Ingo Molnar:
"Fix an irqchip driver release bug in the riscv-intc irqchip driver"
* tag 'irq-urgent-2025-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/riscv-intc: Add missing free() callback in riscv_intc_domain_ops
Linus Torvalds [Sat, 15 Nov 2025 16:46:18 +0000 (08:46 -0800)]
Merge tag 'core-urgent-2025-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core fix from Ingo Molnar:
"Fix a broken #ifndef in the <linux/entry-virt.h> header.
It hasn't caused problems upstream yet because no arch overrides
arch_xfer_to_guest_mode_handle_work() at this moment"
* tag 'core-urgent-2025-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
entry: Fix ifndef around arch_xfer_to_guest_mode_handle_work() stub
Linus Torvalds [Fri, 14 Nov 2025 23:45:31 +0000 (15:45 -0800)]
Merge tag 'pci-v6.18-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull pci fixes from Bjorn Helgaas:
- Cache the ASPM L0s/L1 Supported bits early so quirks can override
them if necessary (Bjorn Helgaas)
- Add quirks for PA Semi and Freescale Root Ports and a HiSilicon Wi-Fi
device that are reported to have broken L0s and L1 (Shawn Lin, Bjorn
Helgaas)
* tag 'pci-v6.18-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
PCI/ASPM: Avoid L0s and L1 on Hi1105 [19e5:1105] Wi-Fi
PCI/ASPM: Avoid L0s and L1 on PA Semi [1959:a002] Root Ports
PCI/ASPM: Avoid L0s and L1 on Freescale [1957:0451] Root Ports
PCI/ASPM: Convert quirks to override advertised link states
PCI/ASPM: Add pcie_aspm_remove_cap() to override advertised link states
PCI/ASPM: Cache L0s/L1 Supported so advertised link states can be overridden
Linus Torvalds [Fri, 14 Nov 2025 23:39:39 +0000 (15:39 -0800)]
Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Pull bpf fixes from Alexei Starovoitov:
- Fix interaction between livepatch and BPF fexit programs (Song Liu)
With Steven and Masami acks.
- Fix stack ORC unwind from BPF kprobe_multi (Jiri Olsa)
With Steven and Masami acks.
- Fix out of bounds access in widen_imprecise_scalars() in the verifier
(Eduard Zingerman)
- Fix conflicts between MPTCP and BPF sockmap (Jiayuan Chen)
- Fix net_sched storage collision with BPF data_meta/data_end (Eric
Dumazet)
- Add _impl suffix to BPF kfuncs with implicit args to avoid breaking
them in bpf-next when KF_IMPLICIT_ARGS is added (Mykyta Yatsenko)
* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
selftests/bpf: Test widen_imprecise_scalars() with different stack depth
bpf: account for current allocated stack depth in widen_imprecise_scalars()
bpf: Add bpf_prog_run_data_pointers()
selftests/bpf: Add mptcp test with sockmap
mptcp: Fix proto fallback detection with BPF
mptcp: Disallow MPTCP subflows from sockmap
selftests/bpf: Add stacktrace ips test for raw_tp
selftests/bpf: Add stacktrace ips test for kprobe_multi/kretprobe_multi
x86/fgraph,bpf: Fix stack ORC unwind from kprobe_multi return probe
Revert "perf/x86: Always store regs->ip in perf_callchain_kernel()"
bpf: add _impl suffix for bpf_stream_vprintk() kfunc
bpf:add _impl suffix for bpf_task_work_schedule* kfuncs
selftests/bpf: Add tests for livepatch + bpf trampoline
ftrace: bpf: Fix IPMODIFY + DIRECT in modify_ftrace_direct()
ftrace: Fix BPF fexit with livepatch
Linus Torvalds [Fri, 14 Nov 2025 23:36:15 +0000 (15:36 -0800)]
Merge tag 'rust-fixes-6.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux
Pull Rust fix from Miguel Ojeda:
- Fix a Rust 1.91.0 build issue due to 'bindings.o' not containing
DWARF debug information anymore by teaching gendwarfksyms to skip
object files without exports
* tag 'rust-fixes-6.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux:
gendwarfksyms: Skip files with no exports
Linus Torvalds [Fri, 14 Nov 2025 21:44:23 +0000 (13:44 -0800)]
Merge tag 'nfs-for-6.18-3' of git://git.linux-nfs.org/projects/anna/linux-nfs
Pull NFS client fixes from Anna Schumaker:
- Various fixes when using NFS with TLS
- Localio direct-IO fixes
- Fix error handling in nfs_atomic_open_v23()
- Fix sysfs memory leak when nfs_client kobject add fails
- Fix an incorrect parameter when calling nfs4_call_sync()
- Fix a failing LTP test when using delegated timestamps
* tag 'nfs-for-6.18-3' of git://git.linux-nfs.org/projects/anna/linux-nfs:
NFS: Fix LTP test failures when timestamps are delegated
NFSv4: Fix an incorrect parameter when calling nfs4_call_sync()
NFS: sysfs: fix leak when nfs_client kobject add fails
NFSv2/v3: Fix error handling in nfs_atomic_open_v23()
nfs/localio: do not issue misaligned DIO out-of-order
nfs/localio: Ensure DIO WRITE's IO on stable storage upon completion
nfs/localio: backfill missing partial read support for misaligned DIO
nfs/localio: add refcounting for each iocb IO associated with NFS pgio header
nfs/localio: remove unecessary ENOTBLK handling in DIO WRITE support
NFS: Check the TLS certificate fields in nfs_match_client()
pnfs: Set transport security policy to RPC_XPRTSEC_NONE unless using TLS
pnfs: Fix TLS logic in _nfs4_pnfs_v4_ds_connect()
pnfs: Fix TLS logic in _nfs4_pnfs_v3_ds_connect()
amdkfd:
- Save area check fix
- Fix GPU mappings for APU after prefetch
i915:
- Fix PSR's pipe to vblank conversion
- Disable Panel Replay on MST links
xe:
- New HW workarounds affecting PTL and WCL platforms
* tag 'drm-fixes-2025-11-15' of https://gitlab.freedesktop.org/drm/kernel:
drm/client: fix MODULE_PARM_DESC string for "active"
drm/i915/dp_mst: Disable Panel Replay
drm/amdkfd: Fix GPU mappings for APU after prefetch
drm/amdkfd: relax checks for over allocation of save area
drm/amdgpu/jpeg: Add parse_cs for JPEG5_0_1
drm/amd/amdgpu: Ensure isp_kernel_buffer_alloc() creates a new BO
drm/amd/display: Allow VRR params change if unsynced with the stream
drm/amdgpu: fix lock warning in amdgpu_userq_fence_driver_process
drm/amdgpu: jump to the correct label on failure
drm/amdgpu: disable peer-to-peer access for DCC-enabled GC12 VRAM surfaces
drm/xe/xe3lpg: Extend Wa_15016589081 for xe3lpg
drm/xe/xe3: Extend wa_14023061436
drm/xe/xe3: Add WA_14024681466 for Xe3_LPG
drm/i915/psr: fix pipe to vblank conversion
drm/panthor: Flush shmem writes before mapping buffers CPU-uncached
drm/vmwgfx: Restore Guest-Backed only cursor plane support
drm/vmwgfx: Use kref in vmw_bo_dirty
drm/vmwgfx: Validate command header size against SVGA_CMD_MAX_DATASIZE
Linus Torvalds [Fri, 14 Nov 2025 21:04:35 +0000 (13:04 -0800)]
Merge tag 'spi-fix-v6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A few standard fixes here, plus one more interesting one from Hans
which addresses an issue where a move in when we requested GPIOs on
ACPI systems caused us to stop doing pinmuxing and leave things
floating that we'd really rather not have floating"
* tag 'spi-fix-v6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: Add TODO comment about ACPI GPIO setup
spi: xilinx: increase number of retries before declaring stall
spi: imx: keep dma request disabled before dma transfer setup
spi: Try to get ACPI GPIO IRQ earlier
Linus Torvalds [Fri, 14 Nov 2025 21:01:23 +0000 (13:01 -0800)]
Merge tag 'regulator-fix-v6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fix from Mark Brown:
"One simple fix for a GPIO descriptor leak in the probe error handling
for the fixed regulator"
* tag 'regulator-fix-v6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: fixed: fix GPIO descriptor leak on register failure
Linus Torvalds [Fri, 14 Nov 2025 18:18:45 +0000 (10:18 -0800)]
Merge tag 'block-6.18-20251114' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull block fixlet from Jens Axboe:
"Been sitting on this one for a week or two, planning on sending it out
when there were other block changes for 6.18. But as that hasn't
materialized in the second week of sitting on it, let's flush it out.
A previous commit updated my git tree locations, but one was missed as
it was already set to the git.kernel.org one. But the git location swap
also renamed the actual tree from linux-block to just linux, let's get
that last one updated too"
* tag 'block-6.18-20251114' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
MAINTAINERS: correct git location for block layer tree
Linus Torvalds [Fri, 14 Nov 2025 17:57:30 +0000 (09:57 -0800)]
Merge tag 'io_uring-6.18-20251113' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull io_uring fixes from Jens Axboe:
- Use the actual segments in a request when for bvec based buffers
- Fix an odd case where the iovec might get leaked for a read/write
request, if it was newly allocated, overflowed the alloc cache, and
hit an early error
- Minor tweak to the query API added in this release, returning the
number of available entries
* tag 'io_uring-6.18-20251113' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
io_uring/rsrc: don't use blk_rq_nr_phys_segments() as number of bvecs
io_uring/query: return number of available queries
io_uring/rw: ensure allocated iovec gets cleared for early failure
Eduard Zingerman [Fri, 14 Nov 2025 02:57:30 +0000 (18:57 -0800)]
selftests/bpf: Test widen_imprecise_scalars() with different stack depth
A test case for a situation when widen_imprecise_scalars() is called
with old->allocated_stack > cur->allocated_stack. Test structure:
def widening_stack_size_bug():
r1 = 0
for r6 in 0..1:
iterator_with_diff_stack_depth(r1)
r1 = 42
def iterator_with_diff_stack_depth(r1):
if r1 != 42:
use 128 bytes of stack
iterator based loop
iterator_with_diff_stack_depth() is verified with r1 == 0 first and
r1 == 42 next. Causing stack usage of 128 bytes on a first visit and 8
bytes on a second. Such arrangement triggered a KASAN error in
widen_imprecise_scalars().
Where prev_st is an ancestor of the queued_st in the explored states
tree. This ancestor is not guaranteed to have same allocated stack
depth as queued_st. E.g. in the following case:
def main():
for i in 1..2:
foo(i) // same callsite, differnt param
def foo(i):
if i == 1:
use 128 bytes of stack
iterator based loop
Here, for a second 'foo' call prev_st->allocated_stack is 128,
while queued_st->allocated_stack is much smaller.
widen_imprecise_scalars() needs to take this into account and avoid
accessing bpf_verifier_state->frame[*]->stack out of bounds.
Fixes: 2793a8b015f7 ("bpf: exact states comparison for iterator convergence checks") Reported-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20251114025730.772723-1-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Eric Dumazet [Wed, 12 Nov 2025 12:55:16 +0000 (12:55 +0000)]
bpf: Add bpf_prog_run_data_pointers()
syzbot found that cls_bpf_classify() is able to change
tc_skb_cb(skb)->drop_reason triggering a warning in sk_skb_reason_drop().
WARNING: CPU: 0 PID: 5965 at net/core/skbuff.c:1192 __sk_skb_reason_drop net/core/skbuff.c:1189 [inline]
WARNING: CPU: 0 PID: 5965 at net/core/skbuff.c:1192 sk_skb_reason_drop+0x76/0x170 net/core/skbuff.c:1214
struct tc_skb_cb has been added in commit ec624fe740b4 ("net/sched:
Extend qdisc control block with tc control block"), which added a wrong
interaction with db58ba459202 ("bpf: wire in data and data_end for
cls_act_bpf").
drop_reason was added later.
Add bpf_prog_run_data_pointers() helper to save/restore the net_sched
storage colliding with BPF data_meta/data_end.
Fixes: ec624fe740b4 ("net/sched: Extend qdisc control block with tc control block") Reported-by: syzbot <syzkaller@googlegroups.com> Closes: https://lore.kernel.org/netdev/6913437c.a70a0220.22f260.013b.GAE@google.com/ Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://patch.msgid.link/20251112125516.1563021-1-edumazet@google.com
Linus Torvalds [Fri, 14 Nov 2025 16:32:58 +0000 (08:32 -0800)]
Merge tag 'v6.18-p5' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
- Fix device reference leak in hisilicon
* tag 'v6.18-p5' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: hisilicon/qm - Fix device reference leak in qm_get_qos_value
Linus Torvalds [Fri, 14 Nov 2025 16:30:48 +0000 (08:30 -0800)]
Merge tag 'v6.18-rc5-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
- Multichannel reconnect channel selection fix
- Fix for smbdirect (RDMA) disconnect bug
- Fix for incorrect username length check
- Fix memory leak in mount parm processing
* tag 'v6.18-rc5-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smb: client: let smbd_disconnect_rdma_connection() turn CREATED into DISCONNECTED
smb: fix invalid username check in smb3_fs_context_parse_param()
cifs: client: fix memory leak in smb3_fs_context_parse_param
smb: client: fix cifs_pick_channel when channel needs reconnect
Eslam Khafagy [Fri, 14 Nov 2025 12:27:39 +0000 (14:27 +0200)]
posix-timers: Plug potential memory leak in do_timer_create()
When posix timer creation is set to allocate a given timer ID and the
access to the user space value faults, the function terminates without
freeing the already allocated posix timer structure.
Move the allocation after the user space access to cure that.
[ tglx: Massaged change log ]
Fixes: ec2d0c04624b3 ("posix-timers: Provide a mechanism to allocate a given timer ID") Reported-by: syzbot+9c47ad18f978d4394986@syzkaller.appspotmail.com Suggested-by: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Eslam Khafagy <eslam.medhat1993@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://patch.msgid.link/20251114122739.994326-1-eslam.medhat1993@gmail.com Closes: https://lore.kernel.org/all/69155df4.a70a0220.3124cb.0017.GAE@google.com/T/
Nick Hu [Fri, 14 Nov 2025 07:28:44 +0000 (15:28 +0800)]
irqchip/riscv-intc: Add missing free() callback in riscv_intc_domain_ops
The irq_domain_free_irqs() helper requires that the irq_domain_ops->free
callback is implemented. Otherwise, the kernel reports the warning message
"NULL pointer, cannot free irq" when irq_dispose_mapping() is invoked to
release the per-HART local interrupts.
Set irq_domain_ops->free to irq_domain_free_irqs_top() to cure that.
Heiko Carstens [Thu, 13 Nov 2025 12:21:47 +0000 (13:21 +0100)]
s390/mm: Fix __ptep_rdp() inline assembly
When a zero ASCE is passed to the __ptep_rdp() inline assembly, the
generated instruction should have the R3 field of the instruction set to
zero. However the inline assembly is written incorrectly: for such cases a
zero is loaded into a register allocated by the compiler and this register
is then used by the instruction.
This means that selected TLB entries may not be flushed since the specified
ASCE does not match the one which was used when the selected TLB entries
were created.
Fix this by removing the asce and opt parameters of __ptep_rdp(), since
all callers always pass zero, and use a hard-coded register zero for
the R3 field.
Fixes: 0807b856521f ("s390/mm: add support for RDP (Reset DAT-Protection)") Cc: stable@vger.kernel.org Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Lushih Hsieh [Fri, 14 Nov 2025 05:20:53 +0000 (13:20 +0800)]
ALSA: usb-audio: Add native DSD quirks for PureAudio DAC series
The PureAudio APA DAC and Lotus DAC5 series are USB Audio
2.0 Class devices that support native Direct Stream Digital (DSD)
playback via specific vendor protocols.
Without these quirks, the devices may only function in standard
PCM mode, or fail to correctly report their DSD format capabilities
to the ALSA framework, preventing native DSD playback under Linux.
This commit adds new quirk entries for the mentioned DAC models
based on their respective Vendor/Product IDs (VID:PID), for example:
0x16d0:0x0ab1 (APA DAC), 0x16d0:0xeca1 (DAC5 series), etc.
The quirk ensures correct DSD format handling by setting the required
SNDRV_PCM_FMTBIT_DSD_U32_BE format bit and defining the DSD-specific
Audio Class 2.0 (AC2.0) endpoint configurations. This allows the ALSA
DSD API to correctly address the device for high-bitrate DSD streams,
bypassing the need for DoP (DSD over PCM).
Test on APA DAC and Lotus DAC5 SE under Arch Linux.
Microcode that resolves the RDSEED failure (SB-7055 [1]) has been released for
additional Zen5 models to linux-firmware [2]. Update the zen5_rdseed_microcode
array to cover these new models.
Linus Torvalds [Fri, 14 Nov 2025 01:00:40 +0000 (17:00 -0800)]
Merge tag 'vfio-v6.18-rc6' of https://github.com/awilliam/linux-vfio
Pull VFIO seftest fixes from Alex Williamson:
- Fix vfio selftests to remove the expectation that the IOMMU supports
a 64-bit IOVA space.
These manifest both in the original set of tests introduced this
development cycle in identity mapping the IOVA to buffer virtual
address space, as well as the more recent boundary testing.
Implement facilities for collecting the valid IOVA ranges from the
backend, implement a simple IOVA allocator, and use the information
for determining extents (Alex Mastro)
* tag 'vfio-v6.18-rc6' of https://github.com/awilliam/linux-vfio:
vfio: selftests: replace iova=vaddr with allocated iovas
vfio: selftests: add iova allocator
vfio: selftests: fix map limit tests to use last available iova
vfio: selftests: add iova range query helpers
Linus Torvalds [Fri, 14 Nov 2025 00:54:36 +0000 (16:54 -0800)]
Merge tag 'hwmon-for-v6.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
- gpd-fan: Fix compilation error for non-ACPI builds, and initialize EC
when loading the driver
* tag 'hwmon-for-v6.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (gpd-fan) initialize EC on driver load for Win 4
hwmon: (gpd-fan) Fix compilation error in non-ACPI builds
Linus Torvalds [Fri, 14 Nov 2025 00:31:07 +0000 (16:31 -0800)]
Merge tag 'pm-6.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix issues related to the handling of compressed hibernation
images and a recent intel_pstate driver regression:
- Fix issues related to using inadequate data types and incorrect use
of atomic variables in the compressed hibernation images handling
code that were introduced during the 6.9 development cycle (Mario
Limonciello)
- Move a X86_FEATURE_IDA check from turbo_is_disabled() to the places
where a new value for MSR_IA32_PERF_CTL is computed in intel_pstate
to address a regression preventing users from enabling turbo
frequencies post-boot (Srinivas Pandruvada)"
* tag 'pm-6.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: intel_pstate: Check IDA only before MSR_IA32_PERF_CTL writes
PM: hibernate: Fix style issues in save_compressed_image()
PM: hibernate: Use atomic64_t for compressed_size variable
PM: hibernate: Emit an error when image writing fails
Linus Torvalds [Fri, 14 Nov 2025 00:22:36 +0000 (16:22 -0800)]
Merge tag 'acpi-6.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
"These fix issues in the ACPI CPPC library and in the recently added
parser for the ACPI MRRM table:
- Limit some checks in the ACPI CPPC library to online CPUs to avoid
accessing uninitialized per-CPU variables when some CPUs are
offline to start with, like during boot with 'nosmt=force' (Gautham
Shenoy)
- Rework add_boot_memory_ranges() in the ACPI MRRM table parser to
fix memory leaks and improve error handling (Kaushlendra Kumar)"
* tag 'acpi-6.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: MRRM: Fix memory leaks and improve error handling
ACPI: CPPC: Limit perf ctrs in PCC check only to online CPUs
ACPI: CPPC: Perform fast check switch only for online CPUs
ACPI: CPPC: Check _CPC validity for only the online CPUs
ACPI: CPPC: Detect preferred core availability on online CPUs
====================
mptcp: Fix conflicts between MPTCP and sockmap
Overall, we encountered a warning [1] that can be triggered by running the
selftest I provided.
sockmap works by replacing sk_data_ready, recvmsg, sendmsg operations and
implementing fast socket-level forwarding logic:
1. Users can obtain file descriptors through userspace socket()/accept()
interfaces, then call BPF syscall to perform these replacements.
2. Users can also use the bpf_sock_hash_update helper (in sockops programs)
to replace handlers when TCP connections enter ESTABLISHED state
(BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB/BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB)
However, when combined with MPTCP, an issue arises: MPTCP creates subflow
sk's and performs TCP handshakes, so the BPF program obtains subflow sk's
and may incorrectly replace their sk_prot. We need to reject such
operations. In patch 1, we set psock_update_sk_prot to NULL in the
subflow's custom sk_prot.
Additionally, if the server's listening socket has MPTCP enabled and the
client's TCP also uses MPTCP, we should allow the combination of subflow
and sockmap. This is because the latest Golang programs have enabled MPTCP
for listening sockets by default [2]. For programs already using sockmap,
upgrading Golang should not cause sockmap functionality to fail.
Patch 2 prevents the WARNING from occurring.
Despite these patches fixing stream corruption, users of sockmap must set
GODEBUG=multipathtcp=0 to disable MPTCP until sockmap fully supports it.
Jiayuan Chen [Tue, 11 Nov 2025 06:02:51 +0000 (14:02 +0800)]
mptcp: Fix proto fallback detection with BPF
The sockmap feature allows bpf syscall from userspace, or based
on bpf sockops, replacing the sk_prot of sockets during protocol stack
processing with sockmap's custom read/write interfaces.
'''
tcp_rcv_state_process()
syn_recv_sock()/subflow_syn_recv_sock()
tcp_init_transfer(BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB)
bpf_skops_established <== sockops
bpf_sock_map_update(sk) <== call bpf helper
tcp_bpf_update_proto() <== update sk_prot
'''
When the server has MPTCP enabled but the client sends a TCP SYN
without MPTCP, subflow_syn_recv_sock() performs a fallback on the
subflow, replacing the subflow sk's sk_prot with the native sk_prot.
'''
subflow_syn_recv_sock()
subflow_ulp_fallback()
subflow_drop_ctx()
mptcp_subflow_ops_undo_override()
'''
Then, this subflow can be normally used by sockmap, which replaces the
native sk_prot with sockmap's custom sk_prot. The issue occurs when the
user executes accept::mptcp_stream_accept::mptcp_fallback_tcp_ops().
Here, it uses sk->sk_prot to compare with the native sk_prot, but this
is incorrect when sockmap is used, as we may incorrectly set
sk->sk_socket->ops.
This fix uses the more generic sk_family for the comparison instead.
Additionally, this also prevents a WARNING from occurring:
result from ./scripts/decode_stacktrace.sh:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 337 at net/mptcp/protocol.c:68 mptcp_stream_accept \
(net/mptcp/protocol.c:4005)
Modules linked in:
...
Ian Rogers [Wed, 12 Nov 2025 07:43:11 +0000 (23:43 -0800)]
perf libbfd: Ensure libbfd is initialized prior to use
Multiple threads may be creating and destroying BFD objects in
situations like `perf top`.
Without appropriate initialization crashes may occur during libbfd's
cache management.
BFD's locks require recursive mutexes, add support for these.
Committer testing:
This happens only when building with 'make BUILD_NONDISTRO=1' and having
the binutils-devel package (or equivalent) installed, i.e. linking with
binutils devel files, an opt-in perf build.
Before:
root@x1:~# perf top
perf: Segmentation fault
-------- backtrace --------
<SNIP multiple failed attempts at printing a backtrace>
root@x1:~#
After this patch it works as before.
Closes: https://lore.kernel.org/lkml/aQt66zhfxSA80xwt@gentoo.org/ Fixes: 95931d9a594dd0b5 ("perf libbfd: Move libbfd functionality to its own file") Reported-by: Guilherme Amadio <amadio@gentoo.org> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ravi Bangoria [Thu, 13 Nov 2025 16:01:24 +0000 (16:01 +0000)]
perf test: Fix lock contention test
Couple of independent fixes:
1. Wire in SIGSEGV handler that terminates the test with a failure code.
2. Use "--lock-cgroup" instead of "-g"; "-g" was proposed but never
merged. See commit 4d1792d0a2564caf ("perf lock contention: Add
--lock-cgroup option")
3. Call cleanup() on every normal exit so trap_cleanup() doesn't mistake
it for an unexpected signal and emit a false-negative "Unexpected
signal in main" message.
Before patch:
# ./perf test -vv "lock contention"
85: kernel lock contention analysis test:
--- start ---
test child forked, pid 610711
Testing perf lock record and perf lock contention
Testing perf lock contention --use-bpf
Testing perf lock record and perf lock contention at the same time
Testing perf lock contention --threads
Testing perf lock contention --lock-addr
Testing perf lock contention --lock-cgroup
Unexpected signal in test_aggr_cgroup
---- end(0) ----
85: kernel lock contention analysis test : Ok
After patch:
# ./perf test -vv "lock contention"
85: kernel lock contention analysis test:
--- start ---
test child forked, pid 602637
Testing perf lock record and perf lock contention
Testing perf lock contention --use-bpf
Testing perf lock record and perf lock contention at the same time
Testing perf lock contention --threads
Testing perf lock contention --lock-addr
Testing perf lock contention --lock-cgroup
Testing perf lock contention --type-filter (w/ spinlock)
Testing perf lock contention --lock-filter (w/ tasklist_lock)
Testing perf lock contention --callstack-filter (w/ unix_stream)
[Skip] Could not find 'unix_stream'
Testing perf lock contention --callstack-filter with task aggregation
[Skip] Could not find 'unix_stream'
Testing perf lock contention --cgroup-filter
Testing perf lock contention CSV output
---- end(0) ----
85: kernel lock contention analysis test : Ok
Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ananth Narayan <ananth.narayan@amd.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Santosh Shukla <santosh.shukla@amd.com> Cc: Tycho Andersen <tycho@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
tools headers UAPI: Sync KVM's vmx.h with the kernel to pick SEAMCALL exit reason
To pick the changes in:
9d7dfb95da2cb5c1 ("KVM: VMX: Inject #UD if guest tries to execute SEAMCALL or TDCALL")
The 'perf kvm-stat' tool uses the exit reasons that are included in the
VMX_EXIT_REASONS define, this new SEAMCALL isn't included there (TDCALL
is), so shouldn't be causing any change in behaviour, this patch ends up
being just addressess the following perf build warning:
Warning: Kernel ABI header differences:
diff -u tools/arch/x86/include/uapi/asm/vmx.h arch/x86/include/uapi/asm/vmx.h
Please see tools/include/uapi/README for further details.
Cc: Sean Christopherson <seanjc@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf build: Don't fail fast path feature detection when binutils-devel is not available
This is one more remnant of the BUILD_NONDISTRO series to make building
with binutils-devel opt-in due to license incompatibility.
In this case just the references at link time were still in place, which
make building the test-all.bin file fail, which wasn't detected before
probably because the last test was done with binutils-devel available,
doh.
Now:
$ rpm -q binutils-devel
package binutils-devel is not installed
$ file /tmp/build/perf-tools/feature/test-all.bin
/tmp/build/perf-tools/feature/test-all.bin: ELF 64-bit LSB executable, x86-64, version 1 (SYSV),
dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2,
BuildID[sha1]=4b5388a346b51f1b993f0b0dbd49f4570769b03c, for GNU/Linux 3.2.0, not stripped
$
Fixes: 970ae86307718c34 ("perf build: The bfd features are opt-in, stop testing for them by default") Reviewed-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Thomas Falcon [Fri, 7 Nov 2025 17:31:50 +0000 (11:31 -0600)]
perf header: Write bpf_prog (infos|btfs)_cnt to data file
With commit f0d0f978f3f5830a ("perf header: Don't write empty BPF/BTF
info"), the write_bpf_( prog_info() | btf() ) functions exit without
writing anything if env->bpf_prog.(infos| btfs)_cnt is zero.
process_bpf_( prog_info() | btf() ), however, still expect a "count"
value to exist in the data file. If btf information is empty, for
example, process_bpf_btf will read garbage or some other data as the
number of btf nodes in the data file. As a result, the data file will
not be processed correctly.
Instead, write the count to the data file and exit if it is zero.
Fixes: f0d0f978f3f5830a ("perf header: Don't write empty BPF/BTF info") Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Thomas Falcon <thomas.falcon@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Merge fixes for issues related to the handling of compressed hibernation
images that were introduced during the 6.9 development cycle.
* pm-sleep:
PM: hibernate: Fix style issues in save_compressed_image()
PM: hibernate: Use atomic64_t for compressed_size variable
PM: hibernate: Emit an error when image writing fails
Merge ACPI CPPC library fixes and an ACPI MRRM table parser fix for
6.18-rc6.
* acpi-cppc:
ACPI: CPPC: Limit perf ctrs in PCC check only to online CPUs
ACPI: CPPC: Perform fast check switch only for online CPUs
ACPI: CPPC: Check _CPC validity for only the online CPUs
ACPI: CPPC: Detect preferred core availability on online CPUs
Linus Torvalds [Thu, 13 Nov 2025 19:37:40 +0000 (11:37 -0800)]
Merge tag 'linux_kselftest-fixes-6.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kselftest fix from Shuah Khan:
"Fixes event-filter-function.tc tracing test failure caused when a
first run to sample events triggers kmem_cache_free which interferes
with the rest of the test.
Fix this by calling sample_events twice to eliminate the
kmem_cache_free related noise from the sampling"
* tag 'linux_kselftest-fixes-6.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests/tracing: Run sample events to clear page cache events
Harry Yoo [Tue, 11 Nov 2025 12:53:31 +0000 (21:53 +0900)]
mm/slub: fix memory leak in free_to_pcs_bulk()
The commit 989b09b73978 ("slab: skip percpu sheaves for remote object
freeing") introduced the remote_objects array in free_to_pcs_bulk() to
skip sheaves when objects from a remote node are freed.
However, the array is flushed only when:
1) the array becomes full (++remote_nr >= PCS_BATCH_MAX), or
2) slab_free_hook() returns false and size becomes zero.
When neither of the conditions is met, objects in the array are leaked.
This resulted in a memory leak [1], where 82 GiB of memory was allocated
for the maple_node cache.
Flush the array after successfully freeing objects to sheaves
in the do_free: path.
In the meantime, move the snippet if (!size) goto flush_remote; outside
the while loop for readability. Let's say all objects in the array are
from a remote node: then we acquire s->cpu_sheaves->lock and try to free
an object even when size is zero. This doesn't appear to be harmful,
but isn't really readable.
Reported-by: Tytus Rogalewski <admin@simplepod.ai> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220765 [1] Closes: https://lore.kernel.org/linux-mm/20251107094809.12e9d705b7bf4815783eb184@linux-foundation.org Closes: https://lore.kernel.org/all/aRGDTwbt2EIz2CYn@hyeyoo Fixes: 989b09b73978 ("slab: skip percpu sheaves for remote object freeing") Signed-off-by: Harry Yoo <harry.yoo@oracle.com> Link: https://patch.msgid.link/20251111125331.12246-1-harry.yoo@oracle.com Acked-by: Liam R. Howlett <Liam.Howlett@oracle.com> Tested-by: Darrick J. Wong <djwong@kernel.org> Tested-by: Tytus Rogalewski <admin@simplepod.ai> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Zqiang [Thu, 13 Nov 2025 11:43:55 +0000 (19:43 +0800)]
sched_ext: Fix possible deadlock in the deferred_irq_workfn()
For PREEMPT_RT=y kernels, the deferred_irq_workfn() is executed in
the per-cpu irq_work/* task context and not disable-irq, if the rq
returned by container_of() is current CPU's rq, the following scenarios
may occur:
lock(&rq->__lock);
<Interrupt>
lock(&rq->__lock);
This commit use IRQ_WORK_INIT_HARD() to replace init_irq_work() to
initialize rq->scx.deferred_irq_work, make the deferred_irq_workfn()
is always invoked in hard-irq context.
Jiayuan Chen [Tue, 11 Nov 2025 06:02:50 +0000 (14:02 +0800)]
mptcp: Disallow MPTCP subflows from sockmap
The sockmap feature allows bpf syscall from userspace, or based on bpf
sockops, replacing the sk_prot of sockets during protocol stack processing
with sockmap's custom read/write interfaces.
'''
tcp_rcv_state_process()
subflow_syn_recv_sock()
tcp_init_transfer(BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB)
bpf_skops_established <== sockops
bpf_sock_map_update(sk) <== call bpf helper
tcp_bpf_update_proto() <== update sk_prot
'''
Consider two scenarios:
1. When the server has MPTCP enabled and the client also requests MPTCP,
the sk passed to the BPF program is a subflow sk. Since subflows only
handle partial data, replacing their sk_prot is meaningless and will
cause traffic disruption.
2. When the server has MPTCP enabled but the client sends a TCP SYN
without MPTCP, subflow_syn_recv_sock() performs a fallback on the
subflow, replacing the subflow sk's sk_prot with the native sk_prot.
'''
subflow_ulp_fallback()
subflow_drop_ctx()
mptcp_subflow_ops_undo_override()
'''
Subsequently, accept::mptcp_stream_accept::mptcp_fallback_tcp_ops()
converts the subflow to plain TCP.
For the first case, we should prevent it from being combined with sockmap
by setting sk_prot->psock_update_sk_prot to NULL, which will be blocked by
sockmap's own flow.
For the second case, since subflow_syn_recv_sock() has already restored
sk_prot to native tcp_prot/tcpv6_prot, no further action is needed.
Fixes: cec37a6e41aa ("mptcp: Handle MP_CAPABLE options for outgoing connections") Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20251111060307.194196-2-jiayuan.chen@linux.dev
entry: Fix ifndef around arch_xfer_to_guest_mode_handle_work() stub
The stub implementation of arch_xfer_to_guest_mode_handle_work() is
guarded by an #ifndef that incorrectly checks for the name
arch_xfer_to_guest_mode_work instead. It seems the function was renamed
to add "_handle" as a late change to the original patch, and the #ifndef
wasn't updated to go with it.
Change the #ifndef to match the name of the function. No users right now,
so no need to update any architecture code.
Hangbin recently reported that the hsr selftests were failing and noted
that the entries in the node table were not merged, i.e., had
00:00:00:00:00:00 as MacAddressB forever [1].
This failure only occured with HSRv0 because it was not sending
supervision frames anymore. While debugging this I found that we were
not really following the HSRv0 standard for the supervision frames we
sent, so I additionally made a few changes to get closer to the standard
and restore a more correct behavior we had a while ago.
The selftests can still fail because they take a while and run into the
timeout. I did not include a change of the timeout because I have more
improvements to the selftests mostly ready that change the test duration
but are net-next material.
Felix Maurer [Tue, 11 Nov 2025 16:29:33 +0000 (17:29 +0100)]
hsr: Follow standard for HSRv0 supervision frames
For HSRv0, the path_id has the following meaning:
- 0000: PRP supervision frame
- 0001-1001: HSR ring identifier
- 1010-1011: Frames from PRP network (A/B, with RedBoxes)
- 1111: HSR supervision frame
Follow the IEC 62439-3:2010 standard more closely by setting the right
path_id for HSRv0 supervision frames (actually, it is correctly set when
the frame is constructed, but hsr_set_path_id() overwrites it) and set a
fixed HSR ring identifier of 1. The ring identifier seems to be generally
unused and we ignore it anyways on reception, but some fixed identifier is
definitely better than using one identifier in one direction and a wrong
identifier in the other.
This was also the behavior before commit f266a683a480 ("net/hsr: Better
frame dispatch") which introduced the alternating path_id. This was later
moved to hsr_set_path_id() in commit 451d8123f897 ("net: prp: add packet
handling support").
The IEC 62439-3:2010 also contains 6 unused bytes after the MacAddressA in
the HSRv0 supervision frames. Adjust a TODO comment accordingly.
Fixes: f266a683a480 ("net/hsr: Better frame dispatch") Fixes: 451d8123f897 ("net: prp: add packet handling support") Signed-off-by: Felix Maurer <fmaurer@redhat.com> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://patch.msgid.link/ea0d5133cd593856b2fa673d6e2067bf1d4d1794.1762876095.git.fmaurer@redhat.com Tested-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Felix Maurer [Tue, 11 Nov 2025 16:29:32 +0000 (17:29 +0100)]
hsr: Fix supervision frame sending on HSRv0
On HSRv0, no supervision frames were sent. The supervison frames were
generated successfully, but failed the check for a sufficiently long mac
header, i.e., at least sizeof(struct hsr_ethhdr), in hsr_fill_frame_info()
because the mac header only contained the ethernet header.
Fix this by including the HSR header in the mac header when generating HSR
supervision frames. Note that the mac header now also includes the TLV
fields. This matches how we set the headers on rx and also the size of
struct hsrv0_ethhdr_sp.
Reported-by: Hangbin Liu <liuhangbin@gmail.com> Closes: https://lore.kernel.org/netdev/aMONxDXkzBZZRfE5@fedora/ Fixes: 9cfb5e7f0ded ("net: hsr: fix hsr_init_sk() vs network/transport headers.") Signed-off-by: Felix Maurer <fmaurer@redhat.com> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://patch.msgid.link/4354114fea9a642fe71f49aeeb6c6159d1d61840.1762876095.git.fmaurer@redhat.com Tested-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Randy Dunlap [Wed, 12 Nov 2025 01:09:20 +0000 (17:09 -0800)]
drm/client: fix MODULE_PARM_DESC string for "active"
The MODULE_PARM_DESC string for the "active" parameter is missing a
space and has an extraneous trailing ']' character. Correct these.
Before patch:
$ modinfo -p ./drm_client_lib.ko
active:Choose which drm client to start, default isfbdev] (string)
After patch:
$ modinfo -p ./drm_client_lib.ko
active:Choose which drm client to start, default is fbdev (string)
Fixes: f7b42442c4ac ("drm/log: Introduce a new boot logger to draw the kmsg on the screen") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patch.msgid.link/20251112010920.2355712-1-rdunlap@infradead.org
Linus Torvalds [Thu, 13 Nov 2025 13:02:59 +0000 (05:02 -0800)]
Merge tag 'erofs-for-6.18-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
Pull erofs fixes from Gao Xiang:
- Add Chunhai Guo as a EROFS reviewer to get more eyes from interested
industry vendors
- Fix infinite loop caused by incomplete crafted zstd-compressed data
(thanks to Robert again!)
* tag 'erofs-for-6.18-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
erofs: avoid infinite loop due to incomplete zstd-compressed data
MAINTAINERS: erofs: add myself as reviewer
* tag 'v6.18-rc5-smb-server-fixes' of git://git.samba.org/ksmbd:
smb: server: let smb_direct_disconnect_rdma_connection() turn CREATED into DISCONNECTED
ksmbd: close accepted socket when per-IP limit rejects connection
smb: server: rdma: avoid unmapping posted recv on accept failure
Xuan Zhuo [Tue, 11 Nov 2025 09:08:28 +0000 (17:08 +0800)]
virtio-net: fix incorrect flags recording in big mode
The purpose of commit 703eec1b2422 ("virtio_net: fixing XDP for fully
checksummed packets handling") is to record the flags in advance, as
their value may be overwritten in the XDP case. However, the flags
recorded under big mode are incorrect, because in big mode, the passed
buf does not point to the rx buffer, but rather to the page of the
submitted buffer. This commit fixes this issue.
For the small mode, the commit c11a49d58ad2 ("virtio_net: Fix mismatched
buf address when unmapping for small packets") fixed it.
Tested-by: Alyssa Ross <hi@alyssa.is> Fixes: 703eec1b2422 ("virtio_net: fixing XDP for fully checksummed packets handling") Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://patch.msgid.link/20251111090828.23186-1-xuanzhuo@linux.alibaba.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Linus Torvalds [Thu, 13 Nov 2025 02:41:01 +0000 (18:41 -0800)]
Merge tag 'nfsd-6.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd fixes from Chuck Lever:
"Address recently reported issues or issues found at the recent NFS
bake-a-thon held in Raleigh, NC.
Issues reported with v6.18-rc:
- Address a kernel build issue
- Reorder SEQUENCE processing to avoid spurious NFS4ERR_SEQ_MISORDERED
Issues that need expedient stable backports:
- Close a refcount leak exposure
- Report support for NFSv4.2 CLONE correctly
- Fix oops during COPY_NOTIFY processing
- Prevent rare crash after XDR encoding failure
- Prevent crash due to confused or malicious NFSv4.1 client"
* tag 'nfsd-6.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
Revert "SUNRPC: Make RPCSEC_GSS_KRB5 select CRYPTO instead of depending on it"
nfsd: ensure SEQUENCE replay sends a valid reply.
NFSD: Never cache a COMPOUND when the SEQUENCE operation fails
NFSD: Skip close replay processing if XDR encoding fails
NFSD: free copynotify stateid in nfs4_free_ol_stateid()
nfsd: add missing FATTR4_WORD2_CLONE_BLKSIZE from supported attributes
nfsd: fix refcount leak in nfsd_set_fh_dentry()
Linus Torvalds [Thu, 13 Nov 2025 02:31:22 +0000 (18:31 -0800)]
Merge tag 'dma-mapping-6.18-2025-11-12' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux
Pull dma-mapping fixes from Marek Szyprowski:
- two minor fixes for DMA API infrastructure: restoring proper
structure padding used in benchmark tests (Qinxin Xia) and global
DMA_BIT_MASK macro rework to make it a bit more clang friendly (James
Clark)
* tag 'dma-mapping-6.18-2025-11-12' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux:
dma-mapping: Allow use of DMA_BIT_MASK(64) in global scope
dma-mapping: benchmark: Restore padding to ensure uABI remained consistent
* tag 'loongarch-fixes-6.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
LoongArch: KVM: Fix max supported vCPUs set with EIOINTC
LoongArch: KVM: Skip PMU checking on vCPU context switch
LoongArch: KVM: Restore guest PMU if it is enabled
LoongArch: KVM: Add delay until timer interrupt injected
LoongArch: KVM: Set page with write attribute if dirty track disabled
LoongArch: kexec: Print out debugging message if required
LoongArch: kexec: Initialize the kexec_buf structure
LoongArch: Use correct accessor to read FWPC/MWPC
LoongArch: Refine the init_hw_perf_events() function
LoongArch: Remove __GFP_HIGHMEM masking in pud_alloc_one()
LoongArch: Let {pte,pmd}_modify() record the status of _PAGE_DIRTY
LoongArch: Consolidate max_pfn & max_low_pfn calculation
LoongArch: Consolidate early_ioremap()/ioremap_prot()
LoongArch: Use physical addresses for CSR_MERRENTRY/CSR_TLBRENTRY
LoongArch: Clarify 3 MSG interrupt features
rust: Add -fno-isolate-erroneous-paths-dereference to bindgen_skip_c_flags
Linus Torvalds [Thu, 13 Nov 2025 02:18:12 +0000 (18:18 -0800)]
Merge tag 'alpha-fixes-v6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha
Pull alpha fix from Matt Turner:
"Add Magnus as a maintainer of the alpha port"
* tag 'alpha-fixes-v6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha:
MAINTAINERS: Add Magnus Lindholm as maintainer for alpha port
Bjorn Helgaas [Thu, 13 Nov 2025 00:36:24 +0000 (18:36 -0600)]
PCI/ASPM: Avoid L0s and L1 on PA Semi [1959:a002] Root Ports
Christian reported that f3ac2ff14834 ("PCI/ASPM: Enable all ClockPM and
ASPM states for devicetree platforms") broke booting on the A-EON AmigaOne
X1000.
Override the L0s and L1 Support advertised in Link Capabilities by the
X1000 Root Ports ([1959:a002]) so we don't try to enable those states.
Fixes: f3ac2ff14834 ("PCI/ASPM: Enable all ClockPM and ASPM states for devicetree platforms") Fixes: df5192d9bb0e ("PCI/ASPM: Enable only L0s and L1 for devicetree platforms") Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de> Link: https://lore.kernel.org/r/a41d2ca1-fcd9-c416-b111-a958e92e94bf@xenosoft.de Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Bjorn Helgaas [Mon, 10 Nov 2025 22:22:27 +0000 (16:22 -0600)]
PCI/ASPM: Convert quirks to override advertised link states
Existing quirks to disable ASPM L0s and L1 use pci_disable_link_state(),
which disables ASPM states and prevents their use in the future. But since
they are FINAL quirks, they happen after ASPM has already been enabled.
Here's a typical call path:
pci_host_probe
pci_scan_root_bus_bridge
pci_scan_child_bus
pci_scan_slot
pci_scan_single_device
pci_device_add
pci_fixup_device(pci_fixup_header) # HEADER quirks
pcie_aspm_init_link_state
pcie_config_aspm_path
pcie_config_aspm_link
pcie_config_aspm_dev # ASPM may be enabled
pci_bus_add_devices
pci_bus_add_devices
pci_fixup_device(pci_fixup_final) # FINAL quirks
quirk_disable_aspm_l0s
pci_disable_link_state(dev, PCIE_LINK_STATE_L0S)
Sometimes enabling ASPM can make the link non-functional, so if we know
ASPM is broken on a device, we shouldn't enable it at all, even
temporarily.
Convert the existing quirks to use pcie_aspm_remove_cap() instead, which
overrides the ASPM Support advertised in PCIe Link Capabilities, and make
them HEADER quirks so they run before pcie_aspm_init_link_state() has a
chance to enable ASPM.
Bjorn Helgaas [Mon, 10 Nov 2025 22:22:25 +0000 (16:22 -0600)]
PCI/ASPM: Cache L0s/L1 Supported so advertised link states can be overridden
Defective devices sometimes advertise support for ASPM L0s or L1 states
even if they don't work correctly.
Cache the L0s Supported and L1 Supported bits early in enumeration so
HEADER quirks can override the ASPM states advertised in Link Capabilities
before pcie_aspm_cap_init() enables ASPM.
Haotian Zhang [Wed, 12 Nov 2025 06:57:09 +0000 (14:57 +0800)]
ASoC: rsnd: fix OF node reference leak in rsnd_ssiu_probe()
rsnd_ssiu_probe() leaks an OF node reference obtained by
rsnd_ssiu_of_node(). The node reference is acquired but
never released across all return paths.
Fix it by declaring the device node with the __free(device_node)
cleanup construct to ensure automatic release when the variable goes
out of scope.
Dave Jiang [Wed, 5 Nov 2025 23:51:15 +0000 (16:51 -0700)]
acpi/hmat: Fix lockdep warning for hmem_register_resource()
The following lockdep splat was observed while kernel auto-online a CXL
memory region:
======================================================
WARNING: possible circular locking dependency detected
6.17.0djtest+ #53 Tainted: G W
------------------------------------------------------
systemd-udevd/3334 is trying to acquire lock: ffffffff90346188 (hmem_resource_lock){+.+.}-{4:4}, at: hmem_register_resource+0x31/0x50
but task is already holding lock: ffffffff90338890 ((node_chain).rwsem){++++}-{4:4}, at: blocking_notifier_call_chain+0x2e/0x70
which lock already depends on the new lock.
[..]
Chain exists of:
hmem_resource_lock --> mem_hotplug_lock --> (node_chain).rwsem
The lock ordering can cause potential deadlock. There are instances
where hmem_resource_lock is taken after (node_chain).rwsem, and vice
versa.
Split out the target update section of hmat_register_target() so that
hmat_callback() only envokes that section instead of attempt to register
hmem devices that it does not need to.
[ dj: Fix up comment to be closer to 80cols. (Jonathan) ]
Fixes: cf8741ac57ed ("ACPI: NUMA: HMAT: Register "soft reserved" memory as an "hmem" device") Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Tested-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Reviewed-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20251105235115.85062-3-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
hwmon: (gpd-fan) initialize EC on driver load for Win 4
The original implement will re-init the EC when it reports a zero
value, and it's a workaround for the black box buggy firmware.
Now a contributer test and report that, the bug is that, the firmware
won't initialize the EC on boot, so the EC ramains in unusable status.
And it won't need to re-init it during runtime. The original implement
is not perfect, any write command will be ignored until we first read
it. Just re-init it unconditionally when the driver load could work.