Linus Torvalds [Thu, 13 Nov 2025 02:18:12 +0000 (18:18 -0800)]
Merge tag 'alpha-fixes-v6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha
Pull alpha fix from Matt Turner:
"Add Magnus as a maintainer of the alpha port"
* tag 'alpha-fixes-v6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha:
MAINTAINERS: Add Magnus Lindholm as maintainer for alpha port
Magnus Lindholm [Tue, 4 Nov 2025 10:33:43 +0000 (11:33 +0100)]
MAINTAINERS: Add Magnus Lindholm as maintainer for alpha port
Acked-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Acked-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Magnus Lindholm <linmag7@gmail.com>
Signed-off-by: Matt Turner <mattst88@gmail.com>
Linus Torvalds [Tue, 11 Nov 2025 18:31:17 +0000 (10:31 -0800)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"There's more here than I would ideally like at this stage, but there's
been a steady trickle of fixes and some of them took a few rounds of
review.
The bulk of the changes are fixing some fallout from the recent BBM
level two support which allows the linear map to be split from block
to page mappings at runtime, but inadvertently led to sleeping in
atomic context on some paths where the linear map was already mapped
with page granularity. The fix is simply to avoid splitting in those
cases but the implementation of that is a little involved.
The other interesting fix is addressing a catastrophic performance
issue with our per-cpu atomics discovered by Paul in the SRCU locking
code but which took some interactions with the hardware folks to
resolve.
Summary:
- Avoid sleeping in atomic context when changing linear map
permissions for DEBUG_PAGEALLOC or KFENCE
- Rework printing of Spectre mitigation status to avoid hardlockup
when enabling per-task mitigations on the context-switch path
- Reject kernel modules when instruction patching fails either due to
the DWARF-based SCS patching or because of an alternatives callback
residing outside of the core kernel text
- Propagate error when updating kernel memory permissions in kprobes
- Drop pointless, incorrect message when enabling the ACPI SPCR
console
- Use value-returning LSE instructions for per-cpu atomics to reduce
latency in SRCU locking routines"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: Reject modules with internal alternative callbacks
arm64: Fail module loading if dynamic SCS patching fails
arm64: proton-pack: Fix hard lockup due to print in scheduler context
arm64: proton-pack: Drop print when !CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY
arm64: mm: Tidy up force_pte_mapping()
arm64: mm: Optimize range_split_to_ptes()
arm64: mm: Don't sleep in split_kernel_leaf_mapping() when in atomic context
arm64: kprobes: check the return value of set_memory_rox()
arm64: acpi: Drop message logging SPCR default console
Revert "ACPI: Suppress misleading SPCR console message when SPCR table is absent"
arm64: Use load LSE atomics for the non-return per-CPU atomic operations
Linus Torvalds [Tue, 11 Nov 2025 18:13:17 +0000 (10:13 -0800)]
Merge tag 'for-6.18-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- fix new inode name tracking in tree-log
- fix conventional zone and stripe calculations in zoned mode
- fix bio reference counts on error paths in relocation and scrub
* tag 'for-6.18-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: release root after error in data_reloc_print_warning_inode()
btrfs: scrub: put bio after errors in scrub_raid56_parity_stripe()
btrfs: do not update last_log_commit when logging inode due to a new name
btrfs: zoned: fix stripe width calculation
btrfs: zoned: fix conventional zone capacity calculation
Linus Torvalds [Tue, 11 Nov 2025 17:49:56 +0000 (09:49 -0800)]
Merge tag 'mm-hotfixes-stable-2025-11-10-19-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"26 hotfixes. 22(!) are cc:stable, 22 are MM.
- address some Kexec Handover issues (Pasha Tatashin)
- fix handling of large folios which are mapped outside i_size (Kiryl
Shutsemau)
- fix some DAMON time issues on 32-bit machines (Quanmin Yan)
Plus the usual shower of singletons"
* tag 'mm-hotfixes-stable-2025-11-10-19-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (26 commits)
kho: warn and exit when unpreserved page wasn't preserved
kho: fix unpreservation of higher-order vmalloc preservations
kho: fix out-of-bounds access of vmalloc chunk
MAINTAINERS: add Chris and Kairui as the swap maintainer
mm/secretmem: fix use-after-free race in fault handler
mm/huge_memory: initialise the tags of the huge zero folio
nilfs2: avoid having an active sc_timer before freeing sci
scripts/decode_stacktrace.sh: fix build ID and PC source parsing
mm/damon/sysfs: change next_update_jiffies to a global variable
mm/damon/stat: change last_refresh_jiffies to a global variable
maple_tree: fix tracepoint string pointers
codetag: debug: handle existing CODETAG_EMPTY in mark_objexts_empty for slabobj_ext
mm/mremap: honour writable bit in mremap pte batching
gcov: add support for GCC 15
mm/mm_init: fix hash table order logging in alloc_large_system_hash()
mm/truncate: unmap large folio on split failure
mm/memory: do not populate page table entries beyond i_size
fs/proc: fix uaf in proc_readdir_de()
mm/huge_memory: preserve PG_has_hwpoisoned if a folio is split to >0 order
ksm: use range-walk function to jump over holes in scan_get_next_rmap_item
...
Linus Torvalds [Mon, 10 Nov 2025 23:35:45 +0000 (15:35 -0800)]
Merge tag 'riscv-for-linus-6.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Paul Walmsley:
- fix broken clang build on versions earlier than 19 and binutils
versions earlier than 2.38.
(This exposed that we're not properly testing earlier toolchain
versions in our linux-next builds and PR submissions. This was fixed
for this PR, and is being addressed more generally for -next builds.)
- remove some redundant Makefile code
- avoid building Canaan Kendryte K210-specific code on targets that
don't build for the K210
* tag 'riscv-for-linus-6.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: Fix CONFIG_AS_HAS_INSN for new .insn usage
riscv: Remove redundant judgment for the default build target
riscv: Build loader.bin exclusively for Canaan K210
Linus Torvalds [Mon, 10 Nov 2025 16:54:36 +0000 (08:54 -0800)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"Arm:
- Fix trapping regression when no in-kernel irqchip is present
- Check host-provided, untrusted ranges and offsets in pKVM
- Fix regression restoring the ID_PFR1_EL1 register
- Fix vgic ITS locking issues when LPIs are not directly injected
Arm selftests:
- Correct target CPU programming in vgic_lpi_stress selftest
- Fix exposure of SCTLR2_EL2 and ZCR_EL2 in get-reg-list selftest
RISC-V:
- Fix check for local interrupts on riscv32
- Read HGEIP CSR on the correct cpu when checking for IMSIC
interrupts
- Remove automatic I/O mapping from kvm_arch_prepare_memory_region()
x86:
- Inject #UD if the guest attempts to execute SEAMCALL or TDCALL as
KVM doesn't support virtualizing the instructions, but the
instructions are gated only by VMXON. That is, they will VM-Exit
instead of taking a #UD and until now this resulted in KVM exiting
to userspace with an emulation error.
- Unload the "FPU" when emulating INIT of XSTATE features if and only
if the FPU is actually loaded, instead of trying to predict when
KVM will emulate an INIT (CET support missed the MP_STATE path).
Add sanity checks to detect and harden against similar bugs in the
future.
- Unregister KVM's GALog notifier (for AVIC) when kvm-amd.ko is
unloaded.
- Use a raw spinlock for svm->ir_list_lock as the lock is taken
during schedule(), and "normal" spinlocks are sleepable locks when
PREEMPT_RT=y.
- Remove guest_memfd bindings on memslot deletion when a gmem file is
dying to fix a use-after-free race found by syzkaller.
- Fix a goof in the EPT Violation handler where KVM checks the wrong
variable when determining if the reported GVA is valid.
- Fix and simplify the handling of LBR virtualization on AMD, which
was made buggy and unnecessarily complicated by nested VM support
Misc:
- Update Oliver's email address"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (28 commits)
KVM: nSVM: Fix and simplify LBR virtualization handling with nested
KVM: nSVM: Always recalculate LBR MSR intercepts in svm_update_lbrv()
KVM: SVM: Mark VMCB_LBR dirty when MSR_IA32_DEBUGCTLMSR is updated
MAINTAINERS: Switch myself to using kernel.org address
KVM: arm64: vgic-v3: Release reserved slot outside of lpi_xa's lock
KVM: arm64: vgic-v3: Reinstate IRQ lock ordering for LPI xarray
KVM: arm64: Limit clearing of ID_{AA64PFR0,PFR1}_EL1.GIC to userspace irqchip
KVM: arm64: Set ID_{AA64PFR0,PFR1}_EL1.GIC when GICv3 is configured
KVM: arm64: Make all 32bit ID registers fully writable
KVM: VMX: Fix check for valid GVA on an EPT violation
KVM: guest_memfd: Remove bindings on memslot deletion when gmem is dying
KVM: SVM: switch to raw spinlock for svm->ir_list_lock
KVM: SVM: Make avic_ga_log_notifier() local to avic.c
KVM: SVM: Unregister KVM's GALog notifier on kvm-amd.ko exit
KVM: SVM: Initialize per-CPU svm_data at the end of hardware setup
KVM: x86: Call out MSR_IA32_S_CET is not handled by XSAVES
KVM: x86: Harden KVM against imbalanced load/put of guest FPU state
KVM: x86: Unload "FPU" state on INIT if and only if its currently in-use
KVM: arm64: Check the untrusted offset in FF-A memory share
KVM: arm64: Check range args for pKVM mem transitions
...
Pratyush Yadav [Mon, 3 Nov 2025 18:02:32 +0000 (19:02 +0100)]
kho: warn and exit when unpreserved page wasn't preserved
Calling __kho_unpreserve() on a pair of (pfn, end_pfn) that wasn't
preserved is a bug. Currently, if that is done, the physxa or bits can be
NULL. This results in a soft lockup since a NULL physxa or bits results
in redoing the loop without ever making any progress.
Return when physxa or bits are not found, but WARN first to loudly
indicate invalid behaviour.
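The failure mode described above can be sketched in userspace C. This is a minimal illustration, not the kernel code: struct toy_tracker and toy_unpreserve() are hypothetical names, and a byte-per-PFN array stands in for the real per-order bitmaps. The point is only that a failed lookup must end the loop rather than retry without advancing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the tracking structure; one byte per PFN. */
struct toy_tracker {
	unsigned char *bits;	/* NULL models "nothing was preserved here" */
	unsigned long nr_bits;
};

/*
 * Clear the preservation marks for [pfn, end_pfn).
 * Returns the number of PFNs cleared, or -1 if the range was never
 * preserved.
 */
static long toy_unpreserve(struct toy_tracker *t, unsigned long pfn,
			   unsigned long end_pfn)
{
	long cleared = 0;

	assert(t != NULL);
	while (pfn < end_pfn) {
		if (!t->bits || pfn >= t->nr_bits) {
			/*
			 * A bare "continue" here would retry the same pfn
			 * forever. Warn loudly and bail out instead.
			 */
			fprintf(stderr, "WARN: pfn %lu was never preserved\n",
				pfn);
			return -1;
		}
		t->bits[pfn] = 0;
		cleared++;
		pfn++;
	}
	return cleared;
}
```

Calling toy_unpreserve() on a tracker whose bits pointer is NULL returns -1 after one warning instead of spinning.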
Link: https://lkml.kernel.org/r/20251103180235.71409-3-pratyush@kernel.org
Fixes: fc33e4b44b27 ("kexec: enable KHO support for memory preservation")
Signed-off-by: Pratyush Yadav <pratyush@kernel.org>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Pratyush Yadav [Mon, 3 Nov 2025 18:02:31 +0000 (19:02 +0100)]
kho: fix unpreservation of higher-order vmalloc preservations
kho_vmalloc_unpreserve_chunk() calls __kho_unpreserve() with end_pfn as
pfn + 1. This happens to work for 0-order pages, but leaks higher order
pages.
For example, say order 2 pages back the allocation. During preservation,
they get preserved in the order 2 bitmaps, but
kho_vmalloc_unpreserve_chunk() would try to unpreserve them from the order
0 bitmaps, which should not have these bits set anyway, leaving the order
2 bitmaps untouched. This results in the pages being carried over to the
next kernel. Nothing will free those pages in the next boot, leaking
them.
Fix this by taking the order into account when calculating the end PFN for
__kho_unpreserve().
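As a rough illustration of the arithmetic, assuming an order-N page covers 2^N consecutive PFNs: the helper name toy_end_pfn() is made up for this sketch and does not appear in the patch.

```c
#include <assert.h>

/*
 * Hypothetical helper: one past the last PFN covered by a page of the
 * given order. The buggy caller effectively always used order 0,
 * i.e. pfn + 1.
 */
static unsigned long toy_end_pfn(unsigned long pfn, unsigned int order)
{
	assert(order < 8 * sizeof(unsigned long));
	return pfn + (1UL << order);
}
```

For an order 2 page starting at PFN 0x1000 this yields 0x1004, so the range passed down spans all four PFNs rather than just the first.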
Link: https://lkml.kernel.org/r/20251103180235.71409-2-pratyush@kernel.org
Fixes: a667300bd53f ("kho: add support for preserving vmalloc allocations")
Signed-off-by: Pratyush Yadav <pratyush@kernel.org>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Pratyush Yadav [Mon, 3 Nov 2025 11:01:57 +0000 (12:01 +0100)]
kho: fix out-of-bounds access of vmalloc chunk
The list of pages in a vmalloc chunk is NULL-terminated. So when looping
through the pages in a vmalloc chunk, both kho_restore_vmalloc() and
kho_vmalloc_unpreserve_chunk() rightly make sure to stop when encountering
a NULL page. But when the chunk is full, the loops do not stop and go
past the bounds of chunk->phys, resulting in out-of-bounds memory access,
and possibly the restoration or unpreservation of an invalid page.
Fix this by making sure the processing of chunk stops at the end of the
array.
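A small sketch of the bounded walk, under the assumption that a chunk is a fixed-size array that is zero-terminated only when it is not full. TOY_CHUNK_SLOTS, struct toy_chunk and toy_chunk_count() are illustrative names; the real chunk layout is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CHUNK_SLOTS 4	/* hypothetical capacity, small for the demo */

struct toy_chunk {
	/* A zero entry terminates a partially filled chunk. */
	unsigned long phys[TOY_CHUNK_SLOTS];
};

/* Count the valid entries without ever reading past the array. */
static size_t toy_chunk_count(const struct toy_chunk *chunk)
{
	size_t i;

	assert(chunk != NULL);
	/* Check the bound first: a full chunk has no terminator. */
	for (i = 0; i < TOY_CHUNK_SLOTS && chunk->phys[i]; i++)
		;
	return i;
}
```

With the bound check dropped, a full chunk would send the loop on to whatever memory follows phys[].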
Link: https://lkml.kernel.org/r/20251103110159.8399-1-pratyush@kernel.org
Fixes: a667300bd53f ("kho: add support for preserving vmalloc allocations")
Signed-off-by: Pratyush Yadav <pratyush@kernel.org>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Chris Li [Sun, 2 Nov 2025 15:11:07 +0000 (07:11 -0800)]
MAINTAINERS: add Chris and Kairui as the swap maintainer
We have been collaborating on a systematic effort to clean up and improve
the Linux swap system, and might as well take responsibility for it.
Link: https://lkml.kernel.org/r/20251102-swap-m-v1-1-582f275d5bce@kernel.org
Signed-off-by: Chris Li <chrisl@kernel.org>
Acked-by: Kairui Song <kasong@tencent.com>
Acked-by: Barry Song <baohua@kernel.org>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: SeongJae Park <sj@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Lance Yang [Fri, 31 Oct 2025 12:09:55 +0000 (20:09 +0800)]
mm/secretmem: fix use-after-free race in fault handler
When a page fault occurs in a secret memory file created with
`memfd_secret(2)`, the kernel will allocate a new folio for it, mark the
underlying page as not-present in the direct map, and add it to the file
mapping.
If two tasks cause a fault in the same page concurrently, both could end
up allocating a folio and removing the page from the direct map, but only
one would succeed in adding the folio to the file mapping. The task that
failed undoes the effects of its attempt by (a) freeing the folio again
and (b) putting the page back into the direct map. However, by doing
these two operations in this order, the page becomes available to the
allocator again before it is placed back in the direct mapping.
If another task attempts to allocate the page between (a) and (b), and the
kernel tries to access it via the direct map, it would result in a
supervisor not-present page fault.
Fix the ordering to restore the direct map before the folio is freed.
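The ordering argument can be modelled with two flags per page. This is a toy state machine, not the secretmem code: struct toy_page and the toy_undo_*() helpers are invented for illustration, and the only invariant checked is "a page the allocator can hand out must be present in the direct map".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_page {
	bool in_direct_map;		/* reachable via the direct map */
	bool owned_by_allocator;	/* may be handed out again */
};

/* Invariant: anything the allocator can hand out must be mapped. */
static bool toy_page_is_safe(const struct toy_page *p)
{
	return !p->owned_by_allocator || p->in_direct_map;
}

/* Old order: (a) free first, then (b) restore. Leaves a window. */
static bool toy_undo_buggy(struct toy_page *p)
{
	bool ok;

	assert(p != NULL);
	p->owned_by_allocator = true;	/* (a) */
	ok = toy_page_is_safe(p);	/* false: free but unmapped */
	p->in_direct_map = true;	/* (b) */
	return ok && toy_page_is_safe(p);
}

/* Fixed order: (b) restore the direct map, then (a) free. */
static bool toy_undo_fixed(struct toy_page *p)
{
	bool ok;

	assert(p != NULL);
	p->in_direct_map = true;	/* (b) */
	ok = toy_page_is_safe(p);
	p->owned_by_allocator = true;	/* (a) */
	return ok && toy_page_is_safe(p);
}
```

Both helpers end in the same final state; only the fixed order keeps the invariant true at every intermediate step, which is what closes the race window.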
Link: https://lkml.kernel.org/r/20251031120955.92116-1-lance.yang@linux.dev
Fixes: 1507f51255c9 ("mm: introduce memfd_secret system call to create "secret" memory areas")
Signed-off-by: Lance Yang <lance.yang@linux.dev>
Reported-by: Google Big Sleep <big-sleep-vuln-reports@google.com>
Closes: https://lore.kernel.org/linux-mm/CAEXGt5QeDpiHTu3K9tvjUTPqo+d-=wuCNYPa+6sWKrdQJ-ATdg@mail.gmail.com/
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Catalin Marinas [Fri, 31 Oct 2025 16:57:50 +0000 (16:57 +0000)]
mm/huge_memory: initialise the tags of the huge zero folio
On arm64 with MTE enabled, a page mapped as Normal Tagged (PROT_MTE) in
user space will need to have its allocation tags initialised. This is
normally done in the arm64 set_pte_at() after checking the memory
attributes. Such a page is also marked with the PG_mte_tagged flag to avoid
subsequent clearing. Since this relies on having a struct page,
pte_special() mappings are ignored.
Commit d82d09e48219 ("mm/huge_memory: mark PMD mappings of the huge zero
folio special") maps the huge zero folio special and the arm64
set_pmd_at() will no longer zero the tags. There is no guarantee that the
tags are zero, especially if parts of this huge page have been previously
tagged.
It's fairly easy to detect this by regularly dropping the caches to
force the reallocation of the huge zero folio.
Allocate the huge zero folio with the __GFP_ZEROTAGS flag. In addition,
do not warn in the arm64 __access_remote_tags() when reading tags from the
huge zero page.
I bundled the arm64 change in here as well since they are both related to
the commit mapping the huge zero folio as special.
[catalin.marinas@arm.com: handle arch mte_zero_clear_page_tags() code issuing MTE instructions]
Link: https://lkml.kernel.org/r/aQi8dA_QpXM8XqrE@arm.com
Link: https://lkml.kernel.org/r/20251031170133.280742-1-catalin.marinas@arm.com
Fixes: d82d09e48219 ("mm/huge_memory: mark PMD mappings of the huge zero folio special")
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Lance Yang <lance.yang@linux.dev>
Tested-by: Beleswar Padhi <b-padhi@ti.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Aishwarya TCV <aishwarya.tcv@arm.com>
Cc: David Hildenbrand (Red Hat) <david@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
nilfs2: avoid having an active sc_timer before freeing sci
Because kthread_stop did not stop sc_task properly and returned -EINTR,
the sc_timer was not properly closed, ultimately causing the problem [1]
reported by syzbot when freeing sci due to the sc_timer not being closed.
Because the thread sc_task main function nilfs_segctor_thread() returns 0
when it succeeds, when the return value of kthread_stop() is not 0 in
nilfs_segctor_destroy(), we believe that it has not properly closed
sc_timer.
We use timer_shutdown_sync() to wait synchronously for sc_timer to shut
down, and set the value of sc_task to NULL under the protection of lock
sc_state_lock, so as to avoid the issue caused by sc_timer not being
properly shut down.
Link: https://lkml.kernel.org/r/20251029225226.16044-1-konishi.ryusuke@gmail.com
Fixes: 3f66cc261ccb ("nilfs2: use kthread_create and kthread_stop for the log writer thread")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Reported-by: syzbot+24d8b70f039151f65590@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=24d8b70f039151f65590
Tested-by: syzbot+24d8b70f039151f65590@syzkaller.appspotmail.com
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
Cc: <stable@vger.kernel.org> [6.12+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Carlos Llamas [Thu, 30 Oct 2025 01:03:33 +0000 (01:03 +0000)]
scripts/decode_stacktrace.sh: fix build ID and PC source parsing
Support for parsing PC source info in stacktraces (e.g. '(P)') was added
in commit 2bff77c665ed ("scripts/decode_stacktrace.sh: fix decoding of
lines with an additional info"). However, this logic was placed after the
build ID processing. This incorrect order fails to parse lines containing
both elements, e.g.:
This patch fixes the problem by extracting the PC source info first and
then processing the module build ID. With this change, the line above is
now properly parsed as such:
Quanmin Yan [Thu, 30 Oct 2025 02:07:46 +0000 (10:07 +0800)]
mm/damon/sysfs: change next_update_jiffies to a global variable
In DAMON's damon_sysfs_repeat_call_fn(), time_before() is used to compare
the current jiffies with next_update_jiffies to determine whether to
update the sysfs files at this moment.
On 32-bit systems, the kernel initializes jiffies to "-5 minutes" to make
jiffies wrap bugs appear earlier. However, this causes time_before() in
damon_sysfs_repeat_call_fn() to unexpectedly return true during the first
5 minutes after boot on 32-bit systems (see [1] for more explanation,
which fixes another jiffies-related issue before). As a result, DAMON
does not update sysfs files during that period.
There is also an issue unrelated to the system's word size[2]: if the
user stops DAMON just after next_update_jiffies is updated and restarts
it after 'refresh_ms' or a longer delay, next_update_jiffies will retain
an older value, causing time_before() to return false and the update to
happen earlier than expected.
Fix these issues by making next_update_jiffies a global variable and
initializing it each time DAMON is started.
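The 32-bit behaviour can be reproduced in userspace with fixed-width integers. The sketch below assumes HZ=1000 and mirrors the signed-difference comparison that time_before() performs; all toy_* names are illustrative, and the zero "next update" value models a timestamp that was never initialised.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_HZ 1000u	/* assumed tick rate */

/* 32-bit analogue of time_before(a, b): true if a is earlier than b. */
static bool toy_time_before32(uint32_t a, uint32_t b)
{
	return (int32_t)(uint32_t)(a - b) < 0;
}

/* On 32-bit, the counter starts 5 minutes before it wraps to zero. */
static uint32_t toy_initial_jiffies32(void)
{
	return (uint32_t)(0u - 300u * TOY_HZ);
}

/* Counter value a given number of seconds after boot. */
static uint32_t toy_jiffies_at(uint32_t seconds_since_boot)
{
	assert(seconds_since_boot <= UINT32_MAX / TOY_HZ);
	return toy_initial_jiffies32() + seconds_since_boot * TOY_HZ;
}
```

With a zero timestamp, toy_time_before32(toy_jiffies_at(0), 0) is true and stays true until the counter wraps at the 300 second mark, which matches the "no updates for the first 5 minutes" symptom. Initialising the timestamp from the current counter value at start avoids comparing against zero at all.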
Quanmin Yan [Thu, 30 Oct 2025 02:07:45 +0000 (10:07 +0800)]
mm/damon/stat: change last_refresh_jiffies to a global variable
Patch series "mm/damon: fixes for the jiffies-related issues", v2.
On 32-bit systems, the kernel initializes jiffies to "-5 minutes" to make
jiffies wrap bugs appear earlier. However, this may cause the
time_before() series of functions to return unexpected values, resulting
in DAMON not functioning as intended. Meanwhile, similar issues exist in
some specific user operation scenarios.
This patchset addresses these issues. The first patch is about the
DAMON_STAT module, and the second patch is about the core layer's sysfs.
This patch (of 2):
In DAMON_STAT's damon_stat_damon_call_fn(), time_before_eq() is used to
avoid unnecessarily frequent stat update.
On 32-bit systems, the kernel initializes jiffies to "-5 minutes" to make
jiffies wrap bugs appear earlier. However, this causes time_before_eq()
in DAMON_STAT to unexpectedly return true during the first 5 minutes after
boot on 32-bit systems (see [1] for more explanation, which fixes another
jiffies-related issue before). As a result, DAMON_STAT does not update
any monitoring results during that period, which becomes more confusing
when DAMON_STAT_ENABLED_DEFAULT is enabled.
There is also an issue unrelated to the system's word size[2]: if the user
stops DAMON_STAT just after last_refresh_jiffies is updated and restarts
it after 5 seconds or a longer delay, last_refresh_jiffies will retain an
older value, causing time_before_eq() to return false and the update to
happen earlier than expected.
Fix these issues by making last_refresh_jiffies a global variable and
initializing it each time DAMON_STAT is started.
Martin Kaiser [Thu, 30 Oct 2025 15:55:05 +0000 (16:55 +0100)]
maple_tree: fix tracepoint string pointers
maple_tree tracepoints contain pointers to function names. Such a pointer
is saved when a tracepoint logs an event. There's no guarantee that it's
still valid when the event is parsed later and the pointer is dereferenced.
The kernel warns about these unsafe pointers.
event 'ma_read' has unsafe pointer field 'fn'
WARNING: kernel/trace/trace.c:3779 at ignore_event+0x1da/0x1e4
Mark the function names as tracepoint_string() to fix the events.
One case that doesn't work without my patch would be trace-cmd record
to save the binary ringbuffer and trace-cmd report to parse it in
userspace. The address of __func__ can't be dereferenced from
userspace but tracepoint_string will add an entry to
/sys/kernel/tracing/printk_formats
Link: https://lkml.kernel.org/r/20251030155537.87972-1-martin@kaiser.cx
Fixes: 54a611b60590 ("Maple Tree: add new data structure")
Signed-off-by: Martin Kaiser <martin@kaiser.cx>
Acked-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Hao Ge [Wed, 29 Oct 2025 01:43:17 +0000 (09:43 +0800)]
codetag: debug: handle existing CODETAG_EMPTY in mark_objexts_empty for slabobj_ext
When alloc_slab_obj_exts() fails and then later succeeds in allocating a
slab extension vector, it calls handle_failed_objexts_alloc() to mark all
objects in the vector as empty. As a result all objects in this slab
(slabA) will have their extensions set to CODETAG_EMPTY.
Later on if this slabA is used to allocate a slabobj_ext vector for
another slab (slabB), we end up with the slabB->obj_exts pointing to a
slabobj_ext vector that itself has a non-NULL slabobj_ext equal to
CODETAG_EMPTY. When slabB gets freed, free_slab_obj_exts() is called to
free slabB->obj_exts vector.
free_slab_obj_exts() calls mark_objexts_empty(slabB->obj_exts) which will
generate a warning because it expects slabobj_ext vectors to have a NULL
obj_ext, not CODETAG_EMPTY.
Modify mark_objexts_empty() to skip the warning and setting the obj_ext
value if it's already set to CODETAG_EMPTY.
To quickly detect this WARN, I modified the code from
WARN_ON(slab_exts[offs].ref.ct) to BUG_ON(slab_exts[offs].ref.ct == 1);
Dev Jain [Tue, 28 Oct 2025 06:39:52 +0000 (12:09 +0530)]
mm/mremap: honour writable bit in mremap pte batching
Currently the mremap folio pte batching ignores the writable bit when
figuring out a set of similar ptes mapping the same folio. Suppose that
the first pte of the batch is writable while the others are not -
set_ptes will end up setting the writable bit on the other ptes, which is
a violation of mremap semantics. Therefore, use FPB_RESPECT_WRITE to
check the writable bit while determining the pte batch.
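A simplified model of the batching decision, assuming a batch is a run of ptes with consecutive PFNs: struct toy_pte and toy_pte_batch() are hypothetical, and the respect_write parameter plays the role that FPB_RESPECT_WRITE plays in the real helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_pte {
	unsigned long pfn;
	bool writable;
};

/*
 * Length of the batch starting at ptes[0]: consecutive PFNs and, when
 * respect_write is set, the same writable bit as the first entry.
 */
static size_t toy_pte_batch(const struct toy_pte *ptes, size_t max_nr,
			    bool respect_write)
{
	size_t nr;

	assert(ptes != NULL || max_nr == 0);
	if (max_nr == 0)
		return 0;
	for (nr = 1; nr < max_nr; nr++) {
		if (ptes[nr].pfn != ptes[0].pfn + nr)
			break;
		if (respect_write && ptes[nr].writable != ptes[0].writable)
			break;
	}
	return nr;
}
```

For a writable pte followed by two read-only ones, the batch is 3 entries long when the bit is ignored and 1 when it is respected, so the read-only entries are never rewritten from the first pte's template.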
Link: https://lkml.kernel.org/r/20251028063952.90313-1-dev.jain@arm.com
Signed-off-by: Dev Jain <dev.jain@arm.com>
Fixes: f822a9a81a31 ("mm: optimize mremap() by PTE batching")
Reported-by: David Hildenbrand <david@redhat.com>
Debugged-by: David Hildenbrand <david@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Pedro Falcato <pfalcato@suse.de>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org> [6.17+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
gcov: add support for GCC 15
Using gcov on kernels compiled with GCC 15 results in truncated 16-byte
long .gcda files with no usable data. To fix this, update GCOV_COUNTERS
to match the value defined by GCC 15.
mm/mm_init: fix hash table order logging in alloc_large_system_hash()
When emitting the order of the allocation for a hash table,
alloc_large_system_hash() unconditionally subtracts PAGE_SHIFT from log
base 2 of the allocation size. This is not correct if the allocation size
is smaller than a page, and yields a negative value for the order as seen
below:
Use get_order() to compute the order when emitting the hash table
information to correctly handle cases where the allocation size is smaller
than a page:
Link: https://lkml.kernel.org/r/20251028191020.413002-1-isaacmanjarres@google.com
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
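For the alloc_large_system_hash() fix above, the difference between the two order computations can be sketched as follows. This assumes 4 KiB pages (TOY_PAGE_SHIFT of 12); toy_get_order() is a userspace approximation of what get_order() computes, not the kernel implementation, and the other names are made up for the example.

```c
#include <assert.h>

#define TOY_PAGE_SHIFT 12	/* assumed 4 KiB pages */

/* floor(log2(v)) for v > 0 */
static int toy_ilog2(unsigned long v)
{
	int r = -1;

	assert(v > 0);
	while (v) {
		v >>= 1;
		r++;
	}
	return r;
}

/* The old logging expression: goes negative below one page. */
static int toy_naive_order(unsigned long size)
{
	return toy_ilog2(size) - TOY_PAGE_SHIFT;
}

/* Smallest order whose 2^order pages hold size bytes; never negative. */
static int toy_get_order(unsigned long size)
{
	unsigned long pages;
	int order = 0;

	assert(size > 0);
	pages = (size - 1) >> TOY_PAGE_SHIFT;
	while (pages) {
		pages >>= 1;
		order++;
	}
	return order;
}
```

For a 1024-byte table the naive expression gives -2 while the get_order-style computation gives 0; for sizes that are a power-of-two number of pages the two agree.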
Kiryl Shutsemau [Mon, 27 Oct 2025 11:56:36 +0000 (11:56 +0000)]
mm/truncate: unmap large folio on split failure
Accesses within VMA, but beyond i_size rounded up to PAGE_SIZE are
supposed to generate SIGBUS.
This behavior might not be respected on truncation.
During truncation, the kernel splits a large folio in order to reclaim
memory. As a side effect, it unmaps the folio and destroys PMD mappings
of the folio. The folio will be refaulted as PTEs and SIGBUS semantics
are preserved.
However, if the split fails, PMD mappings are preserved and the user will
not receive SIGBUS on any accesses within the PMD.
Unmap the folio on split failure. It will lead to refault as PTEs and
preserve SIGBUS semantics.
Make an exception for shmem/tmpfs, which has for a long time intentionally
mapped with PMDs across i_size.
Link: https://lkml.kernel.org/r/20251027115636.82382-3-kirill@shutemov.name
Fixes: b9a8a4195c7d ("truncate,shmem: Handle truncates that split large folios")
Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Kiryl Shutsemau [Mon, 27 Oct 2025 11:56:35 +0000 (11:56 +0000)]
mm/memory: do not populate page table entries beyond i_size
Patch series "Fix SIGBUS semantics with large folios", v3.
Accessing memory within a VMA, but beyond i_size rounded up to the next
page size, is supposed to generate SIGBUS.
Darrick reported[1] an xfstests regression in v6.18-rc1. generic/749
failed due to missing SIGBUS. This was caused by my recent changes that
try to fault in the whole folio where possible:
19773df031bc ("mm/fault: try to map the entire file folio in finish_fault()")
357b92761d94 ("mm/filemap: map entire large folio faultaround")
These changes did not consider i_size when setting up PTEs, leading to
xfstest breakage.
However, the problem has been present in the kernel for a long time -
since huge tmpfs was introduced in 2016. The kernel happily maps
PMD-sized folios as PMD without checking i_size. And huge=always tmpfs
allocates PMD-size folios on any writes.
I considered this corner case when I implemented a large tmpfs, and my
conclusion was that no one in their right mind should rely on receiving a
SIGBUS signal when accessing beyond i_size. I cannot imagine how it could
be useful for the workload.
But apparently filesystem folks care a lot about preserving strict SIGBUS
semantics.
Generic/749 was introduced last year with reference to POSIX, but no real
workloads were mentioned. It also acknowledged the tmpfs deviation from
the test case.
POSIX indeed says[3]:
References within the address range starting at pa and
continuing for len bytes to whole pages following the end of an
object shall result in delivery of a SIGBUS signal.
The patchset fixes the regression introduced by recent changes as well as
more subtle SIGBUS breakage due to split failure on truncation.
This patch (of 2):
Accesses within VMA, but beyond i_size rounded up to PAGE_SIZE are
supposed to generate SIGBUS.
Recent changes attempted to fault in full folio where possible. They did
not respect i_size, which led to populating PTEs beyond i_size and
breaking SIGBUS semantics.
Darrick reported generic/749 breakage because of this.
However, the problem existed before the recent changes. With huge=always
tmpfs, any write to a file leads to PMD-size allocation. Following the
fault-in of the folio will install PMD mapping regardless of i_size.
Fix filemap_map_pages() and finish_fault() to not install:
- PTEs beyond i_size;
- PMD mappings across i_size;
Make an exception for shmem/tmpfs, which has for a long time intentionally
mapped with PMDs across i_size.
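The "no PTEs beyond i_size" rule boils down to clamping the last page offset a fault handler is allowed to populate. The sketch below assumes 4 KiB pages and uses a hypothetical helper, toy_clamp_end_pgoff(); it illustrates the rule and is not the code added to filemap_map_pages() or finish_fault().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096UL	/* assumed page size */

/*
 * Clamp *end_pgoff so that no page offset past the one containing the
 * last byte of the file gets mapped. Returns false when the start
 * offset is already beyond the end of the file, i.e. nothing may be
 * mapped.
 */
static bool toy_clamp_end_pgoff(unsigned long i_size,
				unsigned long start_pgoff,
				unsigned long *end_pgoff)
{
	/* i_size rounded up to whole pages */
	unsigned long nr_pages = (i_size + TOY_PAGE_SIZE - 1) / TOY_PAGE_SIZE;

	assert(end_pgoff != NULL);
	if (nr_pages == 0 || start_pgoff >= nr_pages)
		return false;
	if (*end_pgoff > nr_pages - 1)
		*end_pgoff = nr_pages - 1;
	return true;
}
```

With a 5000-byte file and a fault-around window covering page offsets 0 to 15, only offsets 0 and 1 remain eligible; an access to offset 2 or later then still faults and can raise SIGBUS.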
Link: https://lkml.kernel.org/r/20251027115636.82382-1-kirill@shutemov.name
Link: https://lkml.kernel.org/r/20251027115636.82382-2-kirill@shutemov.name
Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
Fixes: 6795801366da ("xfs: Support large folios")
Reported-by: "Darrick J. Wong" <djwong@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Wei Yang [Sat, 25 Oct 2025 02:42:33 +0000 (10:42 +0800)]
fs/proc: fix uaf in proc_readdir_de()
The pde is erased from the subdir rbtree through rb_erase(), but the node is
not set to EMPTY, which may result in a UAF access. We should use
RB_CLEAR_NODE() to set the erased node to EMPTY, so that pde_subdir_next()
returns NULL and the UAF access is avoided.
We found a UAF issue during stress-ng testing; the getdent and tun test
cases need to run at the same time. The steps to trigger the issue are as follows:
1) use getdent to traverse dir /proc/pid/net/dev_snmp6/, and current
pde is tun3;
2) in the [time window] unregister netdevice tun3 and tun2, and erase
them from the rbtree. tun3 is erased first, and then tun2. The
pde(tun2) will be released to slab;
3) continue the getdent process; pde_subdir_next() will then return
pde(tun2), which has been released, and this will cause a UAF access.
CPU 0                                      |    CPU 1
-------------------------------------------------------------------------
traverse dir /proc/pid/net/dev_snmp6/      |   unregister_netdevice(tun->dev)   //tun3 tun2
sys_getdents64()                           |
  iterate_dir()                            |
    proc_readdir()                         |
      proc_readdir_de()                    |     snmp6_unregister_dev()
        pde_get(de);                       |       proc_remove()
        read_unlock(&proc_subdir_lock);    |         remove_proc_subtree()
                                           |           write_lock(&proc_subdir_lock);
        [time window]                      |           rb_erase(&root->subdir_node, &parent->subdir);
                                           |           write_unlock(&proc_subdir_lock);
        read_lock(&proc_subdir_lock);      |
        next = pde_subdir_next(de);        |
        pde_put(de);                       |
        de = next;    //UAF                |
rbtree of dev_snmp6
                    |
                pde(tun3)
                 /    \
              NULL  pde(tun2)
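The race can be modelled in user space. This is a simplified sketch, not the kernel rbtree: the Node class, the erase() helper (which only handles nodes with at most one child) and the "freed" marker are assumptions for illustration. The one property carried over from the kernel is that an erased node keeps its own stale links unless it is cleared, and that the "next" lookup returns nothing for a cleared (EMPTY) node.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.parent = None
        self.left = None
        self.right = None
        self.freed = False  # stands in for "released to slab"

def rb_empty_node(node):
    # Models RB_EMPTY_NODE(): a cleared node points at itself.
    return node.parent is node

def rb_clear_node(node):
    # Models RB_CLEAR_NODE().
    node.parent = node

def rb_next(node):
    if rb_empty_node(node):
        return None
    if node.right is not None:
        n = node.right
        while n.left is not None:
            n = n.left
        return n
    while node.parent is not None and node is node.parent.right:
        node = node.parent
    return node.parent

def erase(tree, node, clear):
    # Unlink node from the tree; node's own links are left stale.
    assert node.left is None or node.right is None
    child = node.left or node.right
    if child is not None:
        child.parent = node.parent
    if node.parent is None:
        tree["root"] = child
    elif node.parent.left is node:
        node.parent.left = child
    else:
        node.parent.right = child
    if clear:
        rb_clear_node(node)

def scenario(clear):
    tun3, tun2 = Node("tun3"), Node("tun2")
    tun3.right, tun2.parent = tun2, tun3
    tree = {"root": tun3}
    # Reader holds a reference to tun3 and has dropped the lock.
    erase(tree, tun3, clear)
    erase(tree, tun2, clear)
    tun2.freed = True
    # Reader re-takes the lock and asks for the entry after tun3.
    return rb_next(tun3)
```

Without clearing, scenario(False) hands back the freed tun2 node; with clearing, scenario(True) returns None and the walk stops.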
Link: https://lkml.kernel.org/r/20251025024233.158363-1-albin_yang@163.com Signed-off-by: Wei Yang <albinwyang@tencent.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: wangzijie <wangzijie1@honor.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Zi Yan [Thu, 23 Oct 2025 03:05:21 +0000 (23:05 -0400)]
mm/huge_memory: preserve PG_has_hwpoisoned if a folio is split to >0 order
folio split clears PG_has_hwpoisoned, but the flag should be preserved in
after-split folios containing pages with PG_hwpoisoned flag if the folio
is split to >0 order folios. Scan all pages in a to-be-split folio to
determine which after-split folios need the flag.
An alternative is to change PG_has_hwpoisoned to PG_maybe_hwpoisoned to
avoid the scan and set it on all after-split folios, but the resulting false
positives have an undesirable negative impact. To remove the false positives,
callers of folio_test_has_hwpoisoned() and folio_contain_hwpoisoned_page()
would need to do the scan. That could be a hassle for current and
future callers and more costly than doing the scan in the split code.
More details are discussed in [1].
This issue can be exposed via:
1. splitting a has_hwpoisoned folio to >0 order from debugfs interface;
2. truncating part of a has_hwpoisoned folio in
truncate_inode_partial_folio().
Later accesses to a hwpoisoned page are then possible because the
has_hwpoisoned folio flag is missing, which will lead to MCE errors.
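The scan described above can be sketched as follows. This is a user-space model under stated assumptions: the per-page hwpoison state is a plain list of booleans and the function name is hypothetical. The result is only meaningful for new_order > 0, since the flag applies to large folios.

```python
def after_split_has_hwpoisoned(page_hwpoison, new_order):
    """For each after-split folio of order new_order, report whether it
    contains at least one hwpoisoned page and so needs the flag."""
    pages_per_folio = 1 << new_order
    assert len(page_hwpoison) % pages_per_folio == 0
    return [
        any(page_hwpoison[start:start + pages_per_folio])
        for start in range(0, len(page_hwpoison), pages_per_folio)
    ]
```

For an order-3 folio with page 5 poisoned, a split to order 2 marks only the second after-split folio, and a split to order 1 marks only the third.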
Link: https://lore.kernel.org/all/CAHbLzkoOZm0PXxE9qwtF4gKR=cpRXrSrJ9V9Pm2DJexs985q4g@mail.gmail.com/ Link: https://lkml.kernel.org/r/20251023030521.473097-1-ziy@nvidia.com Fixes: c010d47f107f ("mm: thp: split huge page to any lower order pages") Signed-off-by: Zi Yan <ziy@nvidia.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Yang Shi <yang@os.amperecomputing.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Lance Yang <lance.yang@linux.dev> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Wei Yang <richard.weiyang@gmail.com> Cc: Pankaj Raghav <kernel@pankajraghav.com> Cc: Barry Song <baohua@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Jane Chu <jane.chu@oracle.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Naoya Horiguchi <nao.horiguchi@gmail.com> Cc: Nico Pache <npache@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ksm: use range-walk function to jump over holes in scan_get_next_rmap_item
Currently, scan_get_next_rmap_item() walks every page address in a VMA to
locate mergeable pages. This becomes highly inefficient when scanning
large virtual memory areas that contain mostly unmapped regions, causing
ksmd to use a large amount of CPU without deduplicating many pages.
This patch replaces the per-address lookup with a range walk using
walk_page_range(). The range walker allows KSM to skip over entire
unmapped holes in a VMA, avoiding unnecessary lookups. This problem was
previously discussed in [1].
Consider a test program that creates a 32 TiB mapping in the
virtual address space but only populates a single page.
Without this patch ksmd uses 100% of the CPU for a long time (more than 1
hour on my test machine) scanning the entire 32 TiB virtual address space,
which contains only one mapped page. This leaves ksmd effectively stalled,
unable to deduplicate anything of value. With this patch ksmd walks
only the one mapped page and skips the rest of the 32 TiB virtual address
space, making the scan fast and using little CPU.
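The difference in work between the two approaches can be illustrated with a toy model. It is only a sketch: the address space is a range of page numbers, the mapped pages are a sorted list, and the function names are hypothetical. The kernel's walk_page_range() skips holes at page-table granularity rather than through a sorted list.

```python
import bisect

def scan_per_address(start, end, mapped):
    """Visit every page number in [start, end), as the old loop did."""
    mapped_set = set(mapped)
    found, steps = [], 0
    for page in range(start, end):
        steps += 1
        if page in mapped_set:
            found.append(page)
    return found, steps

def scan_range_walk(start, end, mapped_sorted):
    """Jump straight to mapped pages, skipping unmapped holes."""
    lo = bisect.bisect_left(mapped_sorted, start)
    hi = bisect.bisect_left(mapped_sorted, end)
    found = mapped_sorted[lo:hi]
    return found, len(found)
```

Both functions find the same pages, but the per-address scan takes one step per page in the range while the range walk takes one step per mapped page.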
mm/kmsan: fix kmsan kmalloc hook when no stack depots are allocated yet
If no stack depot is allocated yet, kmsan called from kmalloc cannot
allocate one because the __GFP_RECLAIM flags are masked out. kmsan then
fails to record the origin, which may result in KMSAN failing to
report issues.
Reusing flags from kmalloc without modifying them should be safe for kmsan.
For example, such chain of calls is possible:
test_uninit_kmalloc -> kmalloc -> __kmalloc_cache_noprof ->
slab_alloc_node -> slab_post_alloc_hook ->
kmsan_slab_alloc -> kmsan_internal_poison_memory.
Only when it is called in a context without flags present should
__GFP_RECLAIM flags be masked.
With this change all kmsan tests start working reliably.
Eric reported:
: Yes, KMSAN seems to be at least partially broken currently. Besides the
: fact that the kmsan KUnit test is currently failing (which I reported at
: https://lore.kernel.org/r/20250911175145.GA1376@sol), I've confirmed that
: the poly1305 KUnit test causes a KMSAN warning with Aleksei's patch
: applied but does not cause a warning without it. The warning did get
: reached via syzbot somehow
: (https://lore.kernel.org/r/751b3d80293a6f599bb07770afcef24f623c7da0.1761026343.git.xiaopei01@kylinos.cn/),
: so KMSAN must still work in some cases. But it didn't work for me.
Link: https://lkml.kernel.org/r/20250930115600.709776-2-aleksei.nikiforov@linux.ibm.com Link: https://lkml.kernel.org/r/20251022030213.GA35717@sol Fixes: 97769a53f117 ("mm, bpf: Introduce try_alloc_pages() for opportunistic page allocation") Signed-off-by: Aleksei Nikiforov <aleksei.nikiforov@linux.ibm.com> Reviewed-by: Alexander Potapenko <glider@google.com> Tested-by: Eric Biggers <ebiggers@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Marco Elver <elver@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Kairui Song [Wed, 22 Oct 2025 10:57:19 +0000 (18:57 +0800)]
mm/shmem: fix THP allocation and fallback loop
The order check and fallback loop is updating the index value on every
loop. This will cause the index to be wrongly aligned by a larger value
while the loop shrinks the order.
This may result in inserting and returning a folio of the wrong index and
cause data corruption with some userspace workloads [1].
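The effect can be shown with a small model of the fallback loop. This is a sketch, not the shmem code: the function names and the aligned_index variable are assumptions for illustration.

```python
def round_down(x, align):
    # align must be a power of two
    return x & ~(align - 1)

def buggy_fallback(index, orders):
    """Overwrites index on every iteration, so the alignment applied
    for a larger order leaks into the smaller orders tried later."""
    tried = []
    for order in orders:
        index = round_down(index, 1 << order)
        tried.append((order, index))
    return tried

def fixed_fallback(index, orders):
    """Keeps the original index and aligns into a separate variable."""
    tried = []
    for order in orders:
        aligned_index = round_down(index, 1 << order)
        tried.append((order, aligned_index))
    return tried
```

For index 5 and orders 2, 1, 0, the buggy loop ends up trying order 0 at index 4, whereas the fixed loop tries order 0 at the requested index 5.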
Pasha Tatashin [Tue, 21 Oct 2025 00:08:52 +0000 (20:08 -0400)]
kho: allocate metadata directly from the buddy allocator
KHO allocates metadata for its preserved memory map using the slab
allocator via kzalloc(). This metadata is temporary and is used by the
next kernel during early boot to find preserved memory.
A problem arises when KFENCE is enabled. kzalloc() calls can be randomly
intercepted by kfence_alloc(), which services the allocation from a
dedicated KFENCE memory pool. This pool is allocated early in boot via
memblock.
When booting via KHO, the memblock allocator is restricted to a "scratch
area", forcing the KFENCE pool to be allocated within it. This creates a
conflict, as the scratch area is expected to be ephemeral and
overwriteable by a subsequent kexec. If KHO metadata is placed in this
KFENCE pool, it leads to memory corruption when the next kernel is loaded.
To fix this, modify KHO to allocate its metadata directly from the buddy
allocator instead of slab.
Link: https://lkml.kernel.org/r/20251021000852.2924827-4-pasha.tatashin@soleen.com Fixes: fc33e4b44b27 ("kexec: enable KHO support for memory preservation") Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: David Matlack <dmatlack@google.com> Cc: Alexander Graf <graf@amazon.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Samiullah Khawaja <skhawaja@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Pasha Tatashin [Tue, 21 Oct 2025 00:08:51 +0000 (20:08 -0400)]
kho: increase metadata bitmap size to PAGE_SIZE
KHO memory preservation metadata is preserved in 512 byte chunks, which
requires their allocation from the slab allocator. Slabs are not safe to be
used with KHO because of kfence, and because partial slabs may leak
into the next kernel. Change the size to be PAGE_SIZE.
kfence specifically may cause memory corruption, as it randomly
provides slab objects that can be within the scratch area. The reason for
that is that kfence allocates its objects before the KHO scratch area is marked
as a CMA region.
While this change could potentially increase metadata overhead on systems
with sparsely preserved memory, this is being mitigated by ongoing work to
reduce sparseness during preservation via 1G guest pages. Furthermore,
this change aligns with future work on a stateless KHO, which will also
use page-sized bitmaps for its radix tree metadata.
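As a rough worked example of the size change, the amount of memory one bitmap chunk can describe is computed below. The assumptions are loud ones: a 4 KiB PAGE_SIZE, one bit per block of 2^order pages, and a hypothetical helper name.

```python
def bitmap_coverage_bytes(bitmap_bytes, page_size=4096, order=0):
    """Bytes of memory described by one bitmap chunk, assuming one bit
    per block of 2**order pages."""
    bits = bitmap_bytes * 8
    return bits * (page_size << order)

MiB = 1024 * 1024
old_cover = bitmap_coverage_bytes(512)    # 512-byte chunk
new_cover = bitmap_coverage_bytes(4096)   # PAGE_SIZE chunk
```

Under these assumptions a 512-byte chunk covers 16 MiB of order-0 pages and a page-sized chunk covers 128 MiB, which is why sparsely preserved memory may see more metadata overhead per preserved page.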
Link: https://lkml.kernel.org/r/20251021000852.2924827-3-pasha.tatashin@soleen.com Fixes: fc33e4b44b27 ("kexec: enable KHO support for memory preservation") Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Matlack <dmatlack@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Samiullah Khawaja <skhawaja@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Pasha Tatashin [Tue, 21 Oct 2025 00:08:50 +0000 (20:08 -0400)]
kho: warn and fail on metadata or preserved memory in scratch area
Patch series "KHO: kfence + KHO memory corruption fix", v3.
This series fixes a memory corruption bug in KHO that occurs when KFENCE
is enabled.
The root cause is that KHO metadata, allocated via kzalloc(), can be
randomly serviced by kfence_alloc(). When a kernel boots via KHO, the
early memblock allocator is restricted to a "scratch area". This forces
the KFENCE pool to be allocated within this scratch area, creating a
conflict. If KHO metadata is subsequently placed in this pool, it gets
corrupted during the next kexec operation.
Google is using KHO and has had obscure crashes due to this memory
corruption, with stacks all over the place. I would prefer this fix to be
properly backported to stable so we can also automatically consume it once
we switch to the upstream KHO.
Patch 1/3 introduces a debug-only feature (CONFIG_KEXEC_HANDOVER_DEBUG)
that adds checks to detect and fail any operation that attempts to place
KHO metadata or preserved memory within the scratch area. This serves as
a validation and diagnostic tool to confirm the problem without affecting
production builds.
Patch 2/3 increases the bitmap size to PAGE_SIZE, so the buddy allocator can be used.
Patch 3/3 provides the fix by modifying KHO to allocate its metadata
directly from the buddy allocator instead of slab. This bypasses the
KFENCE interception entirely.
This patch (of 3):
It is invalid for KHO metadata or preserved memory regions to be located
within the KHO scratch area, as this area is overwritten when the next
kernel is loaded, and used early in boot by the next kernel. This can
lead to memory corruption.
Add checks to kho_preserve_* and KHO's internal metadata allocators
(xa_load_or_alloc, new_chunk) to verify that the physical address of the
memory does not overlap with any defined scratch region. If an overlap is
detected, the operation will fail and a WARN_ON is triggered. To avoid
performance overhead in production kernels, these checks are enabled only
when CONFIG_KEXEC_HANDOVER_DEBUG is selected.
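The overlap test at the heart of such a check is a plain interval intersection. The sketch below is a user-space illustration; the function name and the (start, size) representation of scratch regions are assumptions, not the KHO data structures.

```python
def overlaps_scratch(phys, size, scratch_regions):
    """Return True if [phys, phys + size) intersects any scratch region.
    scratch_regions is a list of (start, size) tuples."""
    end = phys + size
    return any(
        phys < s_start + s_size and s_start < end
        for s_start, s_size in scratch_regions
    )
```

A caller in the spirit of the patch would refuse the operation and warn when this returns True, and would skip the check entirely when the debug option is off.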
Zi Yan [Fri, 17 Oct 2025 01:36:30 +0000 (21:36 -0400)]
mm/huge_memory: do not change split_huge_page*() target order silently
Page cache folios from a file system that supports large block size (LBS)
can have a minimum folio order greater than 0, thus a high-order folio might
not be able to be split down to order-0. Commit e220917fa507 ("mm: split
a folio in minimum folio order chunks") bumps the target order of
split_huge_page*() to the minimum allowed order when splitting an LBS
folio. This causes confusion for some split_huge_page*() callers like
memory failure handling code, since they expect after-split folios to all
have order-0 when the split succeeds, but in reality they get min_order_for_split()
order folios and give warnings.
Fix it by failing a split if the folio cannot be split to the target
order. Rename try_folio_split() to try_folio_split_to_order() to reflect
the added new_order parameter. Remove its unused list parameter.
[The test poisons LBS folios, which cannot be split to order-0 folios, and
also tries to poison all memory. The non split LBS folios take more
memory than the test anticipated, leading to OOM. The patch fixed the
kernel warning and the test needs some change to avoid OOM.]
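The behavioural change can be modelled as follows. This is a sketch with hypothetical function names; raising ValueError stands in for the split failing, and the actual error value used by the kernel is not asserted here.

```python
def old_target_order(new_order, min_order):
    """Old behaviour: silently bump the target to the minimum order."""
    return max(new_order, min_order)

def try_split_to_order(folio_order, new_order, min_order):
    """New behaviour: fail if the requested target order is not allowed.
    Returns the orders of the after-split folios on success."""
    if new_order < min_order:
        raise ValueError("cannot split below the minimum folio order")
    if new_order >= folio_order:
        raise ValueError("target order must be smaller than folio order")
    return [new_order] * (1 << (folio_order - new_order))
```

A caller asking for order-0 on a folio whose minimum order is 2 now sees a failure instead of unexpectedly receiving order-2 folios.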
Link: https://lkml.kernel.org/r/20251017013630.139907-1-ziy@nvidia.com Fixes: e220917fa507 ("mm: split a folio in minimum folio order chunks") Signed-off-by: Zi Yan <ziy@nvidia.com> Reported-by: syzbot+e6367ea2fdab6ed46056@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/68d2c943.a70a0220.1b52b.02b3.GAE@google.com/ Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Pankaj Raghav <p.raghav@samsung.com> Reviewed-by: Wei Yang <richard.weiyang@gmail.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Barry Song <baohua@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Jane Chu <jane.chu@oracle.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Nico Pache <npache@redhat.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Naoya Horiguchi <nao.horiguchi@gmail.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Christian Brauner <brauner@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Linus Torvalds [Sun, 9 Nov 2025 17:29:44 +0000 (09:29 -0800)]
Merge tag 'i2c-for-6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fix from Wolfram Sang:
"Two reverts merged into one commit to handle a regression caused by a
wrong cleanup because the underlying implications were unclear"
* tag 'i2c-for-6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: muxes: pca954x: Fix broken reset-gpio usage
Linus Torvalds [Sun, 9 Nov 2025 17:22:08 +0000 (09:22 -0800)]
Merge tag 'kbuild-fixes-6.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux
Pull Kbuild fixes from Nathan Chancellor:
- Strip trailing padding bytes from modules.builtin.modinfo to fix
error during modules_install with certain versions of kmod
- Drop unused static inline function warning in .c files with clang
from W=1 to W=2
- Ensure kernel-doc.py invocations use the PYTHON3 make variable to
ensure user's choice of Python interpreter is always respected
* tag 'kbuild-fixes-6.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux:
kbuild: Let kernel-doc.py use PYTHON3 override
compiler_types: Move unused static inline functions warning to W=2
kbuild: Strip trailing padding bytes from modules.builtin.modinfo
Yosry Ahmed [Sat, 8 Nov 2025 00:45:21 +0000 (00:45 +0000)]
KVM: nSVM: Fix and simplify LBR virtualization handling with nested
The current scheme for handling LBRV when nested is used is very
complicated, especially when L1 does not enable LBRV (i.e. does not set
LBR_CTL_ENABLE_MASK).
To avoid copying LBRs between VMCB01 and VMCB02 on every nested
transition, the current implementation switches between using VMCB01 or
VMCB02 as the source of truth for the LBRs while L2 is running. If L2
enables LBR, VMCB02 is used as the source of truth. When L2 disables
LBR, the LBRs are copied to VMCB01 and VMCB01 is used as the source of
truth. This introduces significant complexity, and incorrect behavior in
some cases.
For example, on a nested #VMEXIT, the LBRs are only copied from VMCB02
to VMCB01 if LBRV is enabled in VMCB01. This is because L2's writes to
MSR_IA32_DEBUGCTLMSR to enable LBR are intercepted and propagated to
VMCB01 instead of VMCB02. However, LBRV is only enabled in VMCB02 when
L2 is running.
This means that if L2 enables LBR and exits to L1, the LBRs will not be
propagated from VMCB02 to VMCB01, because LBRV is disabled in VMCB01.
There is no meaningful difference in CPUID rate in L2 when copying LBRs
on every nested transition vs. the current approach, so do the simple
and correct thing and always copy LBRs between VMCB01 and VMCB02 on
nested transitions (when LBRV is disabled by L1). Drop the conditional
LBRs copying in __svm_{enable/disable}_lbrv() as it is now unnecessary.
VMCB02 becomes the only source of truth for LBRs when L2 is running,
regardless of LBRV being enabled by L1, so drop svm_get_lbr_vmcb() and use
svm->vmcb directly in its place.
Fixes: 1d5a1b5860ed ("KVM: x86: nSVM: correctly virtualize LBR msrs when L2 is running") Cc: stable@vger.kernel.org Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251108004524.1600006-4-yosry.ahmed@linux.dev Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Yosry Ahmed [Sat, 8 Nov 2025 00:45:20 +0000 (00:45 +0000)]
KVM: nSVM: Always recalculate LBR MSR intercepts in svm_update_lbrv()
svm_update_lbrv() is called when MSR_IA32_DEBUGCTLMSR is updated, and on
nested transitions where LBRV is used. It checks whether LBRV enablement
needs to be changed in the current VMCB, and if it does, it also
recalculates intercepts to LBR MSRs.
However, there are cases where intercepts need to be updated even when
LBRV enablement doesn't. Example scenario:
- L1 has MSR_IA32_DEBUGCTLMSR cleared.
- L1 runs L2 without LBR_CTL_ENABLE (no LBRV).
- L2 sets DEBUGCTLMSR_LBR in MSR_IA32_DEBUGCTLMSR, svm_update_lbrv()
sets LBR_CTL_ENABLE in VMCB02 and disables intercepts to LBR MSRs.
- L2 exits to L1, svm_update_lbrv() is not called on this transition.
- L1 clears MSR_IA32_DEBUGCTLMSR, svm_update_lbrv() finds that
LBR_CTL_ENABLE is already cleared in VMCB01 and does nothing.
- Intercepts remain disabled, L1 reads of LBR MSRs read the host MSRs.
Fix it by always recalculating intercepts in svm_update_lbrv().
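The scenario above can be replayed with a heavily simplified model. Everything here is an assumption made for illustration (the Vcpu class, a single DEBUGCTL LBR bit shared by L1 and L2, one intercept flag); it is not the KVM implementation, only a way to see why skipping the recalculation leaves stale intercept state.

```python
class Vcpu:
    def __init__(self):
        # LBR_CTL_ENABLE per VMCB, both start cleared.
        self.lbr_ctl_enable = {"vmcb01": False, "vmcb02": False}
        self.current = "vmcb01"
        self.debugctl_lbr = False         # models DEBUGCTLMSR_LBR
        self.lbr_msrs_intercepted = True  # intercepts start enabled

def update_lbrv(vcpu, always_recalc):
    want = vcpu.debugctl_lbr
    changed = vcpu.lbr_ctl_enable[vcpu.current] != want
    vcpu.lbr_ctl_enable[vcpu.current] = want
    # Old behaviour: recalculate intercepts only when enablement changed.
    if changed or always_recalc:
        vcpu.lbr_msrs_intercepted = not want

def run_scenario(always_recalc):
    vcpu = Vcpu()
    vcpu.current = "vmcb02"     # L1 runs L2 without LBRV
    vcpu.debugctl_lbr = True    # L2 sets DEBUGCTLMSR_LBR
    update_lbrv(vcpu, always_recalc)
    vcpu.current = "vmcb01"     # L2 exits to L1, no update on this path
    vcpu.debugctl_lbr = False   # L1 clears MSR_IA32_DEBUGCTLMSR
    update_lbrv(vcpu, always_recalc)
    return vcpu.lbr_msrs_intercepted
```

In this model run_scenario(False) ends with the intercepts still disabled, while run_scenario(True) re-enables them.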
Fixes: 1d5a1b5860ed ("KVM: x86: nSVM: correctly virtualize LBR msrs when L2 is running") Cc: stable@vger.kernel.org Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251108004524.1600006-3-yosry.ahmed@linux.dev Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Yosry Ahmed [Sat, 8 Nov 2025 00:45:19 +0000 (00:45 +0000)]
KVM: SVM: Mark VMCB_LBR dirty when MSR_IA32_DEBUGCTLMSR is updated
The APM lists the DbgCtlMsr field as being tracked by the VMCB_LBR clean
bit. Always clear the bit when MSR_IA32_DEBUGCTLMSR is updated.
The history is complicated: it was correctly cleared for L1 before
commit 1d5a1b5860ed ("KVM: x86: nSVM: correctly virtualize LBR msrs when
L2 is running"). At that point svm_set_msr() started to rely on
svm_update_lbrv() to clear the bit, but when nested virtualization
is enabled the latter does not always clear it even if MSR_IA32_DEBUGCTLMSR
changed. Go back to clearing it directly in svm_set_msr().
Fixes: 1d5a1b5860ed ("KVM: x86: nSVM: correctly virtualize LBR msrs when L2 is running") Reported-by: Matteo Rizzo <matteorizzo@google.com> Reported-by: evn@google.com Co-developed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251108004524.1600006-2-yosry.ahmed@linux.dev Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
- Fix regression restoring the ID_PFR1_EL1 register
(20251030122707.2033690-1-maz@kernel.org)
- Fix vgic ITS locking issues when LPIs are not directly injected
(20251107184847.1784820-1-oupton@kernel.org)
* Test fixes
- Correct target CPU programming in vgic_lpi_stress selftest
(20251020145946.48288-1-mdittgen@amazon.de)
- Fix exposure of SCTLR2_EL2 and ZCR_EL2 in get-reg-list selftest
(20251023-b4-kvm-arm64-get-reg-list-sctlr-el2-v1-1-088f88ff992a@kernel.org)
(20251024-kvm-arm64-get-reg-list-zcr-el2-v1-1-0cd0ff75e22f@kernel.org)
Paolo Bonzini [Sun, 9 Nov 2025 07:07:32 +0000 (08:07 +0100)]
Merge tag 'kvm-x86-fixes-6.18-rc5' of https://github.com/kvm-x86/linux into HEAD
KVM x86 fixes for 6.18:
- Inject #UD if the guest attempts to execute SEAMCALL or TDCALL as KVM
doesn't support virtualizing the instructions, but the instructions
are gated only by VMXON, i.e. will VM-Exit instead of taking a #UD and
thus result in KVM exiting to userspace with an emulation error.
- Unload the "FPU" when emulating INIT of XSTATE features if and only if
the FPU is actually loaded, instead of trying to predict when KVM will
emulate an INIT (CET support missed the MP_STATE path). Add sanity
checks to detect and harden against similar bugs in the future.
- Unregister KVM's GALog notifier (for AVIC) when kvm-amd.ko is unloaded.
- Use a raw spinlock for svm->ir_list_lock as the lock is taken during
schedule(), and "normal" spinlocks are sleepable locks when PREEMPT_RT=y.
- Remove guest_memfd bindings on memslot deletion when a gmem file is dying
to fix a use-after-free race found by syzkaller.
- Fix a goof in the EPT Violation handler where KVM checks the wrong
variable when determining if the reported GVA is valid.
Paolo Bonzini [Sun, 9 Nov 2025 07:07:03 +0000 (08:07 +0100)]
Merge tag 'kvm-riscv-fixes-6.18-2' of https://github.com/kvm-riscv/linux into HEAD
KVM/riscv fixes for 6.18, take #2
- Fix check for local interrupts on riscv32
- Read HGEIP CSR on the correct cpu when checking for IMSIC interrupts
- Remove automatic I/O mapping from kvm_arch_prepare_memory_region()
Jean Delvare [Fri, 7 Nov 2025 18:29:33 +0000 (19:29 +0100)]
kbuild: Let kernel-doc.py use PYTHON3 override
It is possible to force a specific version of python to be used when
building the kernel by passing PYTHON3= on the make command line.
However kernel-doc.py is currently called with python3 hard-coded and
thus ignores this setting.
Use $(PYTHON3) to run $(KERNELDOC) so that the desired version of
python is used.
Signed-off-by: Jean Delvare <jdelvare@suse.de> Reviewed-by: Nicolas Schier <nsc@kernel.org> Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Link: https://patch.msgid.link/20251107192933.2bfe9e57@endymion Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Linus Torvalds [Sat, 8 Nov 2025 23:37:03 +0000 (15:37 -0800)]
Merge tag 'drm-fixes-2025-11-09' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fix from Dave Airlie:
"Brown paper bag, the dma mask fix which I applied and actually looked
through for bad things, actually broke newer GPUs, there might be some
latent part in the boot path that is assuming 32-bit still, but we
will figure that out elsewhere.
nouveau:
- revert DMA mask change"
* tag 'drm-fixes-2025-11-09' of https://gitlab.freedesktop.org/drm/kernel:
Revert "drm/nouveau: set DMA mask before creating the flush page"
Linus Torvalds [Sat, 8 Nov 2025 23:34:23 +0000 (15:34 -0800)]
Merge tag 'rtc-6.18-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux
Pull RTC fixes from Alexandre Belloni:
"The two reverts are for patches that I shouldn't have applied. The
rx8025 patch fixes an issue present since 2022:
Linus Torvalds [Sat, 8 Nov 2025 17:01:11 +0000 (09:01 -0800)]
Merge tag 'x86-urgent-2025-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
- Fix AMD PCI root device caching regression that triggers
on certain firmware variants
- Fix the zen5_rdseed_microcode[] array to be NULL-terminated
- Add more AMD models to microcode signature checking
* tag 'x86-urgent-2025-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/microcode/AMD: Add more known models to entry sign checking
x86/CPU/AMD: Add missing terminator for zen5_rdseed_microcode
x86/amd_node: Fix AMD root device caching
Linus Torvalds [Sat, 8 Nov 2025 16:59:05 +0000 (08:59 -0800)]
Merge tag 'sched-urgent-2025-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fix from Ingo Molnar:
"Fix a group-throttling bug in the fair scheduler"
* tag 'sched-urgent-2025-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/fair: Prevent cfs_rq from being unthrottled with zero runtime_remaining
Linus Torvalds [Sat, 8 Nov 2025 16:43:01 +0000 (08:43 -0800)]
Merge tag 'xfs-fixes-6.18-rc5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fixes from Carlos Maiolino:
"This contains fixes for the RT and zoned allocator, and a few fixes for
atomic writes"
* tag 'xfs-fixes-6.18-rc5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: free xfs_busy_extents structure when no RT extents are queued
xfs: fix zone selection in xfs_select_open_zone_mru
xfs: fix a rtgroup leak when xfs_init_zone fails
xfs: fix various problems in xfs_atomic_write_cow_iomap_begin
xfs: fix delalloc write failures in software-provided atomic writes
Oliver Upton [Fri, 7 Nov 2025 01:28:25 +0000 (17:28 -0800)]
MAINTAINERS: Switch myself to using kernel.org address
I've been running into issues with the linux.dev email
semi-periodically, switching to my kernel.org address while I go figure
out a better home for my inbox.
Oliver Upton [Fri, 7 Nov 2025 18:48:47 +0000 (10:48 -0800)]
KVM: arm64: vgic-v3: Release reserved slot outside of lpi_xa's lock
xa_release() expects to be called outside of the xa_lock. Fix
vgic_add_lpi() to drop the lock before calling and restructure to get
rid of the goto label.
Reported-by: Zenghui Yu <yuzenghui@huawei.com> Closes: https://lore.kernel.org/kvmarm/d0853e82-7d95-5025-7abf-c6f1e0cdf7b5@huawei.com/ Fixes: 481c9ee846d2 ("KVM: arm64: vgic-its: Get rid of the lpi_list_lock") Signed-off-by: Oliver Upton <oupton@kernel.org> Link: https://patch.msgid.link/20251107184847.1784820-3-oupton@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
Oliver Upton [Fri, 7 Nov 2025 18:48:46 +0000 (10:48 -0800)]
KVM: arm64: vgic-v3: Reinstate IRQ lock ordering for LPI xarray
Zenghui reports that running a KVM guest with an assigned device and
lockdep enabled produces an unfriendly splat due to an inconsistent irq
context when taking the lpi_xa's spinlock.
This is no good as in rare cases the last reference to an LPI can get
dropped after injection of a cached LPI translation. In this case,
vgic_put_irq() will release the IRQ struct and take the lpi_xa's
spinlock to erase it from the xarray.
Reinstate the IRQ ordering and update the lockdep hint accordingly. Note
that there is no irqsave equivalent of might_lock(), so just explicitly
grab and release the spinlock on lockdep kernels.
Reported-by: Zenghui Yu <yuzenghui@huawei.com> Closes: https://lore.kernel.org/kvmarm/b4d7cb0f-f007-0b81-46d1-998b15cc14bc@huawei.com/ Fixes: 982f31bbb5b0 ("KVM: arm64: vgic-v3: Don't require IRQs be disabled for LPI xarray lock") Signed-off-by: Oliver Upton <oupton@kernel.org> Link: https://patch.msgid.link/20251107184847.1784820-2-oupton@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
Marc Zyngier [Thu, 30 Oct 2025 12:27:07 +0000 (12:27 +0000)]
KVM: arm64: Limit clearing of ID_{AA64PFR0,PFR1}_EL1.GIC to userspace irqchip
Now that the idreg's GIC field is in sync with the irqchip, limit
the runtime clearing of these fields to the pathological case where
we do not have an in-kernel GIC.
While we're at it, use the existing API instead of open-coded
accessors to access the ID regs.
Fixes: 5cb57a1aff755 ("KVM: arm64: Zero ID_AA64PFR0_EL1.GIC when no GICv3 is presented to the guest") Reviewed-by: Oliver Upton <oupton@kernel.org> Link: https://patch.msgid.link/20251030122707.2033690-4-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
Marc Zyngier [Thu, 30 Oct 2025 12:27:06 +0000 (12:27 +0000)]
KVM: arm64: Set ID_{AA64PFR0,PFR1}_EL1.GIC when GICv3 is configured
Drive the idreg fields indicating the presence of GICv3 directly from
the vgic code. This avoids having to do any sort of runtime clearing
of the idreg.
Fixes: 5cb57a1aff755 ("KVM: arm64: Zero ID_AA64PFR0_EL1.GIC when no GICv3 is presented to the guest") Reviewed-by: Oliver Upton <oupton@kernel.org> Link: https://patch.msgid.link/20251030122707.2033690-3-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
Marc Zyngier [Thu, 30 Oct 2025 12:27:05 +0000 (12:27 +0000)]
KVM: arm64: Make all 32bit ID registers fully writable
32bit ID registers aren't getting much love these days, and are
often missed in updates. One of these updates broke restoring
a GICv2 guest on a GICv3 machine.
Instead of performing a piecemeal fix, just bite the bullet
and make all 32bit ID regs fully writable. KVM itself never
relies on them for anything, and if the VMM wants to mess up
the guest, so be it.
Fixes: 5cb57a1aff755 ("KVM: arm64: Zero ID_AA64PFR0_EL1.GIC when no GICv3 is presented to the guest") Reported-by: Peter Maydell <peter.maydell@linaro.org> Cc: stable@vger.kernel.org Reviewed-by: Oliver Upton <oupton@kernel.org> Link: https://patch.msgid.link/20251030122707.2033690-2-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
Tested the latest kernel on my GB203 and this seems to break it somehow.
Nov 09 04:16:14 bighp kernel: nouveau 0000:02:00.0: gsp: GSP-FMC boot failed (mbox: 0x0000000b)
Nov 09 04:16:14 bighp kernel: nouveau 0000:02:00.0: gsp: init failed, -5
Nov 09 04:16:14 bighp kernel: nouveau 0000:02:00.0: init failed with -5
Nov 09 04:16:14 bighp kernel: nouveau: drm:00000000:00000080: init failed with -5
Nov 09 04:16:14 bighp kernel: nouveau 0000:02:00.0: drm: Device allocation failed: -5
Nov 09 04:16:14 bighp kernel: nouveau 0000:02:00.0: probe with driver nouveau failed with error -5
Not sure why, I went over the patch and thought it should have worked, but there must be some
32-bit problem maybe in the FMC boot path.
After commit 44aa25c000b4 ("riscv: asm: use .insn for making custom
instructions"), builds using LLVM older than 19 or binutils older than
2.38 fail with:
In file included from <built-in>:4:
In file included from lib/vdso/gettimeofday.c:6:
In file included from include/vdso/datapage.h:21:
In file included from include/vdso/processor.h:10:
arch/riscv/include/asm/vdso/processor.h:23:2: error: expected instruction format
23 | ALT_RISCV_PAUSE();
| ^
arch/riscv/include/asm/errata_list.h:47:3: note: expanded from macro 'ALT_RISCV_PAUSE'
47 | RISCV_PAUSE, /* Original RISC‑V pause insn */ \
| ^
arch/riscv/include/asm/insn-def.h:259:21: note: expanded from macro 'RISCV_PAUSE'
259 | #define RISCV_PAUSE ASM_INSN_I("0x100000f")
| ^
arch/riscv/include/asm/asm.h:16:26: note: expanded from macro 'ASM_INSN_I'
16 | #define ASM_INSN_I(__x) ".insn " __x
| ^
<inline asm>:5:7: note: instantiated into assembly here
5 | .insn 0x100000f
| ^
binutils gained support for '.insn <value>' in 2.38 [1] and LLVM gained
support in 19 [2]. Adjust the test for CONFIG_AS_HAS_INSN to ensure that
all versions of .insn are supported before being used.
Feng Jiang [Wed, 29 Oct 2025 09:44:28 +0000 (17:44 +0800)]
riscv: Build loader.bin exclusively for Canaan K210
According to the explanation in commit ef10bdf9c3e6 ("riscv:
Kconfig.socs: Split ARCH_CANAAN and SOC_CANAAN_K210"),
loader.bin is a special feature of the Canaan K210 and
is not applicable to other SoCs.
Fixes: e79dfcbfb902 ("riscv: make image compression configurable") Signed-off-by: Feng Jiang <jiangfeng@kylinos.cn> Reviewed-by: Emil Renner Berthing <emil.renner.berthing@canonical.com> Link: https://lore.kernel.org/r/20251029094429.553842-1-jiangfeng@kylinos.cn Signed-off-by: Paul Walmsley <pjw@kernel.org>
Pavel Begunkov [Fri, 7 Nov 2025 18:41:26 +0000 (18:41 +0000)]
io_uring: fix regbuf vector size truncation
There is a report of io_estimate_bvec_size() truncating the calculated
number of segments that leads to corruption issues. Check it doesn't
overflow "int"s used later. Rough but simple, can be improved on top.
Cc: stable@vger.kernel.org Fixes: 9ef4cbbcb4ac3 ("io_uring: add infra for importing vectored reg buffers") Reported-by: Google Big Sleep <big-sleep-vuln-reports+bigsleep-458654612@google.com> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Reviewed-by: Günther Noack <gnoack@google.com> Tested-by: Günther Noack <gnoack@google.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
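The class of bug described in this commit, a wide segment count silently narrowed to an "int", can be illustrated with a minimal userspace sketch. The helper estimate_segs() and its parameters are hypothetical and are not the io_uring code; the sketch only shows rejecting a count that would not fit in an int before it is narrowed.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical helper, not the kernel's io_estimate_bvec_size():
 * count how many seg_size-byte segments are needed to cover each
 * length in lens[], and fail instead of truncating if the total
 * does not fit in an int.
 *
 * Returns the segment count, or -1 on overflow.
 */
static int estimate_segs(const size_t *lens, size_t n, size_t seg_size)
{
	size_t total = 0;

	assert(seg_size != 0);

	for (size_t i = 0; i < n; i++) {
		/* Round up without computing len + seg_size - 1, which could wrap. */
		size_t segs = lens[i] / seg_size + (lens[i] % seg_size != 0);

		/* total never exceeds INT_MAX, so this subtraction cannot wrap. */
		if (segs > (size_t)INT_MAX - total)
			return -1;	/* would overflow the int used later */
		total += segs;
	}
	return (int)total;
}
```

A caller would treat a negative return as an error and refuse the request rather than continue with a truncated count.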
Linus Torvalds [Fri, 7 Nov 2025 22:51:11 +0000 (14:51 -0800)]
Merge tag 'drm-fixes-2025-11-08' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
"Back from travel, thanks to Simona for handling things. regular fixes,
seems about the right size, but spread out a bit.
amdgpu has the usual range of fixes, xe has a few fixes, and nouveau
has a couple of fixes, one for blackwell modifiers on 8/16 bit
surfaces.
Otherwise a few small fixes for mediatek, sched, imagination and
pixpaper.
xe:
- Fix missing synchronization on unbind
- Fix device shutdown when doing FLR
- Fix user fence signaling order
i915:
- Avoid lock inversion when pinning to GGTT on CHV/BXT+VTD
- Fix conversion between clock ticks and nanoseconds
mediatek:
- Disable AFBC support on Mediatek DRM driver
- Add pm_runtime support for GCE power control
imagination:
- kconfig: Fix dependencies
nouveau:
- Set DMA mask earlier
- Advertize correct modifiers for GB20x
pixpaper:
- kconfig: Fix dependencies"
* tag 'drm-fixes-2025-11-08' of https://gitlab.freedesktop.org/drm/kernel: (26 commits)
drm/xe: Enforce correct user fence signaling order using
drm/xe: Do clean shutdown also when using flr
drm/xe: Move declarations under conditional branch
drm/xe/guc: Synchronize Dead CT worker with unbind
drm/amd/display: Enable mst when it's detected but yet to be initialized
drm/amdgpu: Fix wait after reset sequence in S3
drm/amd: Fix suspend failure with secure display TA
drm/amdgpu: fix gpu page fault after hibernation on PF passthrough
drm/tiny: pixpaper: add explicit dependency on MMU
drm/nouveau: Advertise correct modifiers on GB20x
drm: define NVIDIA DRM format modifiers for GB20x
drm/nouveau: set DMA mask before creating the flush page
drm/sched: Fix deadlock in drm_sched_entity_kill_jobs_cb
drm/amd/display: Fix NULL deref in debugfs odm_combine_segments
drm/amdkfd: Don't clear PT after process killed
drm/amdgpu/smu: Handle S0ix for vangogh
drm/amdgpu: Drop PMFW RLC notifier from amdgpu_device_suspend()
drm/amd/display: Fix black screen with HDMI outputs
drm/amd/display: Don't stretch non-native images by default in eDP
drm/amd/pm: fix missing device_attr cleanup in amdgpu_pm_sysfs_init()
...
Linus Torvalds [Fri, 7 Nov 2025 21:19:18 +0000 (13:19 -0800)]
Merge tag 'parisc-for-6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc fix from Helge Deller:
- fix crash triggered by unaligned access in parisc unwinder
* tag 'parisc-for-6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Avoid crash due to unaligned access in unwinder
Linus Torvalds [Fri, 7 Nov 2025 21:13:09 +0000 (13:13 -0800)]
Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd
Pull iommufd fixes from Jason Gunthorpe:
- Syzkaller found a case where maths overflows can cause divide by 0
- Typo in a compiler bug warning fix in the selftests broke the
selftests
- type1 compatibility had a mismatch when unmapping an already unmapped
range, it should succeed
* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd:
iommufd: Make vfio_compat's unmap succeed if the range is already empty
iommufd/selftest: Fix ioctl return value in _test_cmd_trigger_vevents()
iommufd: Don't overflow during division for dirty tracking
Peter Zijlstra [Thu, 6 Nov 2025 10:50:00 +0000 (11:50 +0100)]
compiler_types: Move unused static inline functions warning to W=2
Per Nathan, clang catches unused "static inline" functions in C files
since commit 6863f5643dd7 ("kbuild: allow Clang to find unused static
inline functions for W=1 build").
Linus said:
> So I entirely ignore W=1 issues, because I think so many of the extra
> warnings are bogus.
>
> But if this one in particular is causing more problems than most -
> some teams do seem to use W=1 as part of their test builds - it's fine
> to send me a patch that just moves bad warnings to W=2.
>
> And if anybody uses W=2 for their test builds, that's THEIR problem..
Here is the change to bump the warning from W=1 to W=2.
Fixes: 6863f5643dd7 ("kbuild: allow Clang to find unused static inline functions for W=1 build") Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://patch.msgid.link/20251106105000.2103276-1-andriy.shevchenko@linux.intel.com
[nathan: Adjust comment as well] Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Linus Torvalds [Fri, 7 Nov 2025 16:10:55 +0000 (08:10 -0800)]
Merge tag 'gpio-fixes-for-v6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
- use the firmware node of the GPIO chip, not its label for software
node lookup
- fix invalid pointer access in GPIO debugfs
- drop unused functions from gpio-tb10x
- fix a regression in gpio-aggregator: restore the set_config()
callback in the driver
- correct schema $id path in ti,twl4030 DT bindings
* tag 'gpio-fixes-for-v6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio: tb10x: Drop unused tb10x_set_bits() function
gpio: aggregator: restore the set_config operation
gpiolib: fix invalid pointer access in debugfs
gpio: swnode: don't use the swnode's name as the key for GPIO lookup
dt-bindings: gpio: ti,twl4030: Correct the schema $id path
Linus Torvalds [Fri, 7 Nov 2025 16:07:11 +0000 (08:07 -0800)]
Merge tag 'trace-v6.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
- Check for reader catching up in ring_buffer_map_get_reader()
If the reader catches up to the writer in the memory mapped ring
buffer then calling rb_get_reader_page() will return NULL as there's
no pages left. But this isn't checked for before calling
rb_get_reader_page() and the return of NULL causes a warning.
If it is detected that the reader caught up to the writer, then
simply exit the routine
- Fix memory leak in histogram create_field_var()
A couple of the error paths in create_field_var() did not properly
clean up what was allocated. Make sure everything is freed properly
on error
- Fix help message of tools latency_collector
The help message incorrectly stated that "-t" was the same as
"--threads" whereas "--threads" is actually represented by "-e"
* tag 'trace-v6.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing/tools: Fix incorrcet short option in usage text for --threads
tracing: Fix memory leaks in create_field_var()
ring-buffer: Do not warn in ring_buffer_map_get_reader() when reader catches up
Linus Torvalds [Fri, 7 Nov 2025 15:52:45 +0000 (07:52 -0800)]
Merge tag 'io_uring-6.18-20251106' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull io_uring fixes from Jens Axboe:
- Remove the sync refill API that was added in this release, in
anticipation of doing it in a better way for the next release
- Fix type extension for calculating size off nr_pages, like we do
in other spots
* tag 'io_uring-6.18-20251106' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
io_uring: fix types for region size calulation
io_uring/zcrx: remove sync refill uapi
Linus Torvalds [Fri, 7 Nov 2025 15:47:08 +0000 (07:47 -0800)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"All fixes in the UFS driver.
The big contributor to the diffstats is the Intel controller S0ix/S3
fix which has to special case the suspend/resume patch for intel
controllers in ufshcd-pci.c"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ufs: core: Fix invalid probe error return value
scsi: ufs: ufs-pci: Set UFSHCD_QUIRK_PERFORM_LINK_STARTUP_ONCE for Intel ADL
scsi: ufs: core: Add a quirk to suppress link_startup_again
scsi: ufs: ufs-pci: Fix S0ix/S3 for Intel controllers
scsi: ufs: core: Revert "Make HID attributes visible"
scsi: ufs: core: Reduce link startup failure logging
scsi: ufs: core: Fix a race condition related to the "hid" attribute group
scsi: ufs: ufs-qcom: Fix UFS OCP issue during UFS power down (PC=3)
Linus Torvalds [Fri, 7 Nov 2025 15:39:57 +0000 (07:39 -0800)]
Merge tag 'v6.18-rc4-smb-server-fixes' of git://git.samba.org/ksmbd
Pull smb server fixes from Steve French:
- More safely detect RDMA capable devices correctly
* tag 'v6.18-rc4-smb-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: detect RDMA capable netdevs include IPoIB
ksmbd: detect RDMA capable lower devices when bridge and vlan netdev is used
Adrian Barnaś [Mon, 22 Sep 2025 13:04:27 +0000 (13:04 +0000)]
arm64: Reject modules with internal alternative callbacks
During module loading, check if a callback function used by the
alternatives specified in the '.altinstruction' ELF section (if present)
is located in core kernel .text. If not, fail module loading before the
callback is called.
Reported-by: Fanqin Cui <cuifq1@chinatelecom.cn> Closes: https://lore.kernel.org/all/20250807072700.348514-1-fanqincui@163.com/ Signed-off-by: Adrian Barnaś <abarnas@google.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
[will: Folded in 'noinstr' tweak from Mark] Signed-off-by: Will Deacon <will@kernel.org>
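The check described in this commit amounts to an address-range test performed before any callback runs. The following is a minimal userspace sketch under that assumption; struct text_range and check_alt_callbacks() are hypothetical names for illustration and do not reflect the actual arm64 module loader code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bounds of the core text region, passed in explicitly. */
struct text_range {
	uintptr_t start;	/* inclusive */
	uintptr_t end;		/* exclusive */
};

/*
 * Return 0 if every callback address lies inside the core text range,
 * or -1 as soon as one does not, so the caller can refuse the load
 * before any callback is invoked.
 */
static int check_alt_callbacks(const uintptr_t *cbs, size_t n,
			       struct text_range core)
{
	assert(core.start <= core.end);

	for (size_t i = 0; i < n; i++) {
		if (cbs[i] < core.start || cbs[i] >= core.end)
			return -1;
	}
	return 0;
}
```

The point of the design is ordering: validation happens over the whole list first, and nothing is called unless every address passes.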
Adrian Barnaś [Mon, 22 Sep 2025 13:04:26 +0000 (13:04 +0000)]
arm64: Fail module loading if dynamic SCS patching fails
Do not allow a module to load if dynamic SCS patching fails for its code. For
module loading, instead of running a dry-run to check for patching errors,
try to run patching in the first run and propagate any errors so module
loading will fail.
Signed-off-by: Adrian Barnaś <abarnas@google.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
shechenglong [Fri, 31 Oct 2025 09:15:06 +0000 (17:15 +0800)]
arm64: proton-pack: Fix hard lockup due to print in scheduler context
Relocate the printk() calls from spectre_v4_mitigations_off() and
spectre_v2_mitigations_off() into setup_system_capabilities() function,
preventing hard lockups caused by printk calls in scheduler context:
shechenglong [Fri, 31 Oct 2025 09:15:05 +0000 (17:15 +0800)]
arm64: proton-pack: Drop print when !CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY
Following the pattern established with other Spectre mitigations,
do not print a message when the CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY
Kconfig option is disabled.
Suggested-by: Will Deacon <will@kernel.org> Signed-off-by: shechenglong <shechenglong@xfusion.com> Signed-off-by: Will Deacon <will@kernel.org>
Ryan Roberts [Thu, 6 Nov 2025 16:09:43 +0000 (16:09 +0000)]
arm64: mm: Tidy up force_pte_mapping()
Tidy up the implementation of force_pte_mapping() to make it easier to
read and introduce the split_leaf_mapping_possible() helper to reduce
code duplication in split_kernel_leaf_mapping() and
arch_kfence_init_pool().
Suggested-by: David Hildenbrand (Red Hat) <david@kernel.org> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Reviewed-by: David Hildenbrand (Red Hat) <david@kernel.org> Reviewed-by: Yang Shi <yang@os.amperecomputing.com> Signed-off-by: Will Deacon <will@kernel.org>
Ryan Roberts [Thu, 6 Nov 2025 16:09:42 +0000 (16:09 +0000)]
arm64: mm: Optimize range_split_to_ptes()
Enter lazy_mmu mode while splitting a range of memory to pte mappings.
This causes barriers, which would otherwise be emitted after every pte
(and pmd/pud) write, to be deferred until exiting lazy_mmu mode.
For large systems, this is expected to significantly speed up fallback
to pte-mapping the linear map for the case where the boot CPU has
BBML2_NOABORT, but secondary CPUs do not. I haven't directly measured
it, but this is equivalent to commit 1fcb7cea8a5f ("arm64: mm: Batch dsb
and isb when populating pgtables").
Note that for the path from arch_kfence_init_pool(), we may sleep while
allocating memory inside the lazy_mmu mode. Sleeping is not allowed by
generic code inside lazy_mmu, but we know that the arm64 implementation
is sleep-safe. So this is ok and follows the same pattern already used
by split_kernel_leaf_mapping().
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Reviewed-by: Yang Shi <yang@os.amperecomputing.com> Signed-off-by: Will Deacon <will@kernel.org>
Ryan Roberts [Thu, 6 Nov 2025 16:09:41 +0000 (16:09 +0000)]
arm64: mm: Don't sleep in split_kernel_leaf_mapping() when in atomic context
It has been reported that split_kernel_leaf_mapping() is trying to sleep
in non-sleepable context. It does this when acquiring the
pgtable_split_lock mutex, when either CONFIG_DEBUG_PAGEALLOC or
CONFIG_KFENCE are enabled, which change linear map permissions within
softirq context during memory allocation and/or freeing. All other paths
into this function are called from sleepable context and so are safe.
But it turns out that the memory for which these 2 features may attempt
to modify the permissions is always mapped by pte, so there is no need
to attempt to split the mapping. So let's exit early in these cases and
avoid attempting to take the mutex.
There is one wrinkle to this approach; late-initialized kfence allocates
its pool from the buddy which may be block mapped. So we must hook that
allocation and convert it to pte-mappings up front. Previously this was
done as a side-effect of kfence protecting all the individual pages in
its pool at init-time, but this no longer works due to the added early
exit path in split_kernel_leaf_mapping().
So instead, do this via the existing arch_kfence_init_pool() arch hook,
and reuse the existing linear_map_split_to_ptes() infrastructure.
Closes: https://lore.kernel.org/all/f24b9032-0ec9-47b1-8b95-c0eeac7a31c5@roeck-us.net/ Fixes: a166563e7ec3 ("arm64: mm: support large block mapping when rodata=full") Reported-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <groeck@google.com> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Reviewed-by: David Hildenbrand (Red Hat) <david@kernel.org> Reviewed-by: Yang Shi <yang@os.amperecomputing.com> Signed-off-by: Will Deacon <will@kernel.org>
Yang Shi [Tue, 4 Nov 2025 21:49:47 +0000 (13:49 -0800)]
arm64: kprobes: check the return value of set_memory_rox()
Since commit a166563e7ec3 ("arm64: mm: support large block mapping when
rodata=full"), __change_memory_common has more chance to fail due to
memory allocation failure when splitting page tables. So check the return
value of set_memory_rox() and bail out if it fails; otherwise we may be
left with an RW memory mapping for the kprobes insn page.
Fixes: 195a1b7d8388 ("arm64: kprobes: call set_memory_rox() for kprobe page") Reviewed-by: Ryan Roberts <ryan.roberts@arm.com> Reviewed-by: Dev Jain <dev.jain@arm.com> Signed-off-by: Yang Shi <yang@os.amperecomputing.com> Signed-off-by: Will Deacon <will@kernel.org>
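The pattern in this commit, checking the result of a permission change and releasing the page on failure, can be sketched in userspace as follows. Everything here is hypothetical: set_rox_fn stands in for set_memory_rox(), malloc() stands in for the kernel's page allocation, and alloc_insn_page_sketch() is not the arm64 function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical permission hook standing in for set_memory_rox(). */
typedef int (*set_rox_fn)(void *addr, size_t len);

/*
 * Allocate a buffer intended to hold instructions. If the permission
 * change fails, free the buffer and return NULL rather than handing
 * out memory that is still writable.
 */
static void *alloc_insn_page_sketch(size_t len, set_rox_fn set_rox)
{
	void *page;

	assert(set_rox != NULL);

	page = malloc(len);
	if (!page)
		return NULL;
	if (set_rox(page, len) != 0) {
		free(page);
		return NULL;
	}
	return page;
}

/* Stub hooks for the sketch: one always succeeds, one always fails. */
static int rox_ok(void *addr, size_t len)
{
	(void)addr;
	(void)len;
	return 0;
}

static int rox_fail(void *addr, size_t len)
{
	(void)addr;
	(void)len;
	return -12;	/* the value of -ENOMEM on Linux */
}
```

Passing rox_fail makes the allocation return NULL, which is the bail-out path the commit adds.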
Punit Agrawal [Fri, 31 Oct 2025 11:11:38 +0000 (11:11 +0000)]
arm64: acpi: Drop message logging SPCR default console
Commit f5a4af3c7527 ("ACPI: Add acpi=nospcr to disable ACPI SPCR as
default console on ARM64") introduced a command line parameter to
prevent using SPCR provided console as default. It also introduced a
message to log this choice.
Drop the message as it is not particularly useful and can be incorrect
in situations where no SPCR is provided by the firmware.
Commit bad3fa2fb920 ("ACPI: Suppress misleading SPCR console message
when SPCR table is absent") mistakenly assumes acpi_parse_spcr()
returning 0 to indicate a failure to parse SPCR. While addressing the
resultant incorrect logging it was deemed that dropping the message is
a better approach as it is not particularly useful.
Roll back the commit introducing the bug as a step towards dropping
the log message.
Catalin Marinas [Thu, 6 Nov 2025 15:52:13 +0000 (15:52 +0000)]
arm64: Use load LSE atomics for the non-return per-CPU atomic operations
The non-return per-CPU this_cpu_*() atomic operations are implemented as
STADD/STCLR/STSET when FEAT_LSE is available. On many microarchitecture
implementations, these instructions tend to be executed "far" in the
interconnect or memory subsystem (unless the data is already in the L1
cache). This is in general more efficient when there is contention as it
avoids bouncing cache lines between CPUs. The load atomics (e.g. LDADD
without XZR as destination), OTOH, tend to be executed "near" with the
data loaded into the L1 cache.
STADD executed back to back as in srcu_read_{lock,unlock}*() incur an
additional overhead due to the default posting behaviour on several CPU
implementations. Since the per-CPU atomics are unlikely to be used
concurrently on the same memory location, encourage the hardware to
execute them "near" by issuing load atomics - LDADD/LDCLR/LDSET - with
the destination register unused (but not XZR).
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/e7d539ed-ced0-4b96-8ecd-048a5b803b85@paulmck-laptop Reported-by: Paul E. McKenney <paulmck@kernel.org> Tested-by: Paul E. McKenney <paulmck@kernel.org> Cc: Will Deacon <will@kernel.org> Reviewed-by: Palmer Dabbelt <palmer@dabbelt.com>
[will: Add comment and link to the discussion thread] Signed-off-by: Will Deacon <will@kernel.org>
Zhang Chujun [Thu, 6 Nov 2025 03:10:40 +0000 (11:10 +0800)]
tracing/tools: Fix incorrcet short option in usage text for --threads
The help message incorrectly listed '-t' as the short option for
--threads, but the actual getopt_long configuration uses '-e'.
This mismatch can confuse users and lead to incorrect command-line
usage. This patch updates the usage string to correctly show:
"-e, --threads NRTHR"
to match the implementation.
Note: checkpatch.pl reports a false-positive spelling warning on
'Run', which is intentional.
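The mismatch fixed here arises because getopt_long() takes the short option letter from the option table, not from the usage text. The following is a minimal sketch, assuming only that "--threads" maps to 'e' as stated above; parse_threads() is a hypothetical function and not the latency-collector source.

```c
#include <assert.h>
#include <getopt.h>
#include <stdlib.h>

/*
 * Hypothetical parser: the short option 'e' and the long option
 * "threads" are declared together, so the usage text must advertise
 * "-e, --threads NRTHR".
 *
 * Returns the thread count, or -1 if the option is absent or an
 * undeclared option is seen.
 */
static int parse_threads(int argc, char **argv)
{
	static const struct option long_opts[] = {
		{ "threads", required_argument, NULL, 'e' },
		{ NULL, 0, NULL, 0 },
	};
	int nr_threads = -1;
	int c;

	assert(argv != NULL);

	optind = 1;	/* restart scanning so the function can be reused */
	opterr = 0;	/* suppress getopt's own error messages */
	while ((c = getopt_long(argc, argv, "e:", long_opts, NULL)) != -1) {
		if (c == 'e')
			nr_threads = atoi(optarg);
		else
			return -1;	/* e.g. "-t", not declared in this sketch */
	}
	return nr_threads;
}
```

Both "-e 4" and "--threads 4" yield 4 here, while "-t 4" is rejected, which is why usage text that advertises "-t" misleads users.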
Matthew Brost [Fri, 31 Oct 2025 23:40:45 +0000 (16:40 -0700)]
drm/xe: Enforce correct user fence signaling order using
Prevent application hangs caused by out-of-order fence signaling when
user fences are attached. Use drm_syncobj (via dma-fence-chain) to
guarantee that each user fence signals in order, regardless of the
signaling order of the attached fences. Ensure user fence writebacks to
user space occur in the correct sequence.
Jouni Högander [Fri, 31 Oct 2025 12:23:11 +0000 (14:23 +0200)]
drm/xe: Do clean shutdown also when using flr
Currently Xe driver is triggering flr without any clean-up on
shutdown. This is causing random warnings from pending related works as the
underlying hardware is reset in the middle of their execution.
Fix this by performing clean shutdown also when using flr.
Fixes: 501d799a47e2 ("drm/xe: Wire up device shutdown handler") Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Maarten Lankhorst <dev@lankhorst.se> Link: https://patch.msgid.link/20251031122312.1836534-1-jouni.hogander@intel.com Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
(cherry picked from commit a4ff26b7c8ef38e4dd34f77cbcd73576fdde6dd4) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Tejas Upadhyay [Tue, 7 Oct 2025 10:02:08 +0000 (15:32 +0530)]
drm/xe: Move declarations under conditional branch
The xe_device_shutdown() function needed a few declarations
that were only required under a specific condition. This change
moves those declarations to be within that conditional branch
to avoid unnecessary declarations.
drm/xe/guc: Synchronize Dead CT worker with unbind
Cancel and wait for any Dead CT worker to complete before continuing
with device unbinding. Else the worker will end up using resources freed
by the unbind operation.
Cc: Zhanjun Dong <zhanjun.dong@intel.com> Fixes: d2c5a5a926f4 ("drm/xe/guc: Dead CT helper") Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://patch.msgid.link/20251103123144.3231829-6-balasubramani.vivekanandan@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 492671339114e376aaa38626d637a2751cdef263) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Zilin Guan [Thu, 6 Nov 2025 12:01:32 +0000 (12:01 +0000)]
tracing: Fix memory leaks in create_field_var()
The function create_field_var() allocates memory for 'val' through
create_hist_field() inside parse_atom(), and for 'var' through
create_var(), which in turn allocates var->type and var->var.name
internally. Simply calling kfree() to release these structures will
result in memory leaks.
Use destroy_hist_field() to properly free 'val', and explicitly release
the memory of var->type and var->var.name before freeing 'var' itself.
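The leak pattern described here, freeing a structure without freeing the memory it owns, can be shown with a small userspace sketch. The struct and helpers below are hypothetical stand-ins and not the tracing code; a crude counter tracks live allocations so the effect of a proper destructor is visible.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a variable that owns two strings. */
struct field_var {
	char *type;
	char *name;
};

static int live_allocs;	/* crude allocation counter for the sketch */

static char *dup_counted(const char *s)
{
	char *p = malloc(strlen(s) + 1);

	if (p) {
		strcpy(p, s);
		live_allocs++;
	}
	return p;
}

static void free_counted(void *p)
{
	if (p) {
		free(p);
		live_allocs--;
	}
}

/* Release the owned members first, then the structure itself. */
static void var_destroy(struct field_var *v)
{
	if (!v)
		return;
	free_counted(v->type);
	free_counted(v->name);
	free_counted(v);
}

static struct field_var *var_create(const char *type, const char *name)
{
	struct field_var *v;

	assert(type != NULL && name != NULL);

	v = calloc(1, sizeof(*v));
	if (!v)
		return NULL;
	live_allocs++;

	v->type = dup_counted(type);
	v->name = dup_counted(name);
	if (!v->type || !v->name) {
		/* Error path: free everything allocated so far. */
		var_destroy(v);
		return NULL;
	}
	return v;
}
```

Calling plain free() on the structure returned by var_create() would leave two allocations live, which is the kind of leak the commit removes from the error paths.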
Steven Rostedt [Thu, 16 Oct 2025 17:28:48 +0000 (13:28 -0400)]
ring-buffer: Do not warn in ring_buffer_map_get_reader() when reader catches up
The function ring_buffer_map_get_reader() is a bit more strict than the
other get reader functions, and except for certain situations the
rb_get_reader_page() should not return NULL. If it does, it triggers a
warning.
This warning was triggering but after looking at why, it was because
another acceptable situation was happening and it wasn't checked for.
If the reader catches up to the writer and there's still data to be read
on the reader page, then the rb_get_reader_page() will return NULL as
there's no new page to get.
In this situation, the reader page should not be updated and no warning
should trigger.
Linus Torvalds [Fri, 7 Nov 2025 00:24:12 +0000 (16:24 -0800)]
Merge tag 'probes-fixes-v6.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull probe fixes from Masami Hiramatsu:
- tprobe-events: Fix to register tracepoint correctly
tprobe-events failed to set the tracepoint data structure before
registering the callback when enabling it. This sets it correctly.
- tprobe-events: Fix to put tracepoint_user when disable the event
tprobe-events failed to unregister the tracepoint callback when the event
is disabled. This ensures that it is unregistered.
* tag 'probes-fixes-v6.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: tprobe-events: Fix to put tracepoint_user when disable the tprobe
tracing: tprobe-events: Fix to register tracepoint correctly
* tag 'perf-tools-fixes-for-v6.18-1-2025-11-06' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
perf symbols: Handle '1' symbols in /proc/kallsyms
tools headers asm: Sync fls headers header with the kernel sources
tools headers UAPI: Sync KVM's vmx.h header with the kernel sources to handle new exit reasons
tools headers svm: Sync svm headers with the kernel sources
tools headers UAPI: Sync x86's asm/kvm.h with the kernel sources
MAINTAINERS: Add James Clark as a perf tools reviewer
tools headers UAPI: Sync linux/kvm.h with the kernel sources
tools headers UAPI: Update tools's copy of drm.h to pick DRM_IOCTL_GEM_CHANGE_HANDLE
tools headers x86 cpufeatures: Sync with the kernel sources
tools headers x86: Sync table due to introducion of uprobe syscall
tools headers: Sync uapi/linux/fcntl.h with the kernel sources
tools headers: Sync uapi/linux/prctl.h with the kernel source
tools headers uapi: Update fs.h with the kernel sources
tools arch x86: Sync msr-index.h to pick AMD64_{PERF_CNTR_GLOBAL_STATUS_SET,SAVIC_CONTROL}, IA32_L3_QOS_{ABMC,EXT}_CFG
Linus Torvalds [Thu, 6 Nov 2025 23:44:18 +0000 (15:44 -0800)]
Merge tag 'riscv-for-linus-6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Paul Walmsley:
- A fix to disable KASAN checks while walking a non-current task's
stackframe (following x86)
- A fix for a kvrealloc()-related memory leak in
module_frob_arch_sections()
- Two replacements of strcpy() with strscpy()
- A change to use the RISC-V .insn assembler directive when possible to
assemble instructions from hex opcodes
- Some low-impact fixes in the ptdump code and kprobes test code
* tag 'riscv-for-linus-6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
cpuidle: riscv-sbi: Replace deprecated strcpy in sbi_cpuidle_init_cpu
riscv: KGDB: Replace deprecated strcpy in kgdb_arch_handle_qxfer_pkt
riscv: asm: use .insn for making custom instructions
riscv: tests: Make RISCV_KPROBES_KUNIT tristate
riscv: tests: Rename kprobes_test_riscv to kprobes_riscv
riscv: Fix memory leak in module_frob_arch_sections()
riscv: ptdump: use seq_puts() in pt_dump_seq_puts() macro
riscv: stacktrace: Disable KASAN checks for non-current tasks
Linus Torvalds [Thu, 6 Nov 2025 23:40:14 +0000 (15:40 -0800)]
Merge tag 'acpi-6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
"These fix a coding mistake in the ACPI Smart Battery Subsystem (SBS)
driver and two documentation issues:
- Fix computation of the battery->present value in acpi_battery_read()
to work when battery->id is not zero (Dan Carpenter)
- Fix comment typo in the ACPI CPPC library (Chu Guangqing)
- Fix I2C device references in two ASL examples in the firmware guide
that were broken by a previous update (Jonas Gorski)"
* tag 'acpi-6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: SBS: Fix present test in acpi_battery_read()
ACPI: CPPC: Fix typo in a comment
Documentation: ACPI: i2c-muxes: fix I2C device references