liveupdate: luo_file: do not clear serialized_data on unfreeze
Patch series "liveupdate: fixes in error handling".
This series contains some fixes in LUO's error handling paths.
The first patch deals with failed freeze() attempts. The cleanup path
calls unfreeze, and that clears some data needed by later unpreserve
calls.
The second patch is a bit more involved. It deals with failed retrieve()
attempts. To do so properly, it reworks some of the error handling logic
in luo_file core.
Both these fixes are "theoretical" -- in the sense that I have not been
able to reproduce either of them in normal operation. The only supported
file type right now is memfd, and there is nothing userspace can do right
now to make it fail its retrieve or freeze. I need to make the retrieve
or freeze fail by artificially injecting errors. The injected errors
trigger a use-after-free and a double-free.
That said, once more complex file handlers are added or memfd preservation
is used in ways not currently expected or covered by the tests, we will be
able to see them on real systems.
This patch (of 2):
The unfreeze operation is supposed to undo the effects of the freeze
operation. serialized_data is not set by freeze, but by preserve.
Consequently, the unpreserve operation needs to access serialized_data to
undo the effects of the preserve operation. This includes freeing the
serialized data structures for example.
If a freeze callback fails, unfreeze is called for all frozen files. This
would clear serialized_data for them. Since live update has failed, it
can be expected that userspace aborts, releasing all sessions. When the
sessions are released, unpreserve will be called for all files. The
unfrozen files will see 0 in their serialized_data. This is not expected
by file handlers, and they might either fail, leaking data and state, or
might even crash or cause invalid memory access.
Do not clear serialized_data on unfreeze so it gets passed on to
unpreserve. There is no need to clear it on unpreserve since luo_file
will be freed immediately after.
Andrew Cooper [Mon, 26 Jan 2026 21:10:46 +0000 (21:10 +0000)]
x86/kfence: fix booting on 32bit non-PAE systems
The original patch inverted the PTE unconditionally to avoid
L1TF-vulnerable PTEs, but Linux doesn't make this adjustment in 2-level
paging.
Adjust the logic to use the flip_protnone_guard() helper, which is a nop
on 2-level paging but inverts the address bits in all other paging modes.
This doesn't matter for the Xen aspect of the original change. Linux no
longer supports running 32bit PV under Xen, and Xen doesn't support
running any 32bit PV guests without using PAE paging.
Link: https://lkml.kernel.org/r/20260126211046.2096622-1-andrew.cooper3@citrix.com Fixes: b505f1944535 ("x86/kfence: avoid writing L1TF-vulnerable PTEs") Reported-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Closes: https://lore.kernel.org/lkml/CAKFNMokwjw68ubYQM9WkzOuH51wLznHpEOMSqtMoV1Rn9JV_gw@mail.gmail.com/ Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Tested-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: Alexander Potapenko <glider@google.com> Cc: Marco Elver <elver@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Jann Horn <jannh@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Breno Leitao [Thu, 22 Jan 2026 10:39:36 +0000 (02:39 -0800)]
vmcoreinfo: make hwerr_data visible for debugging
If the kernel is compiled with LTO, hwerr_data symbol might be lost, and
vmcoreinfo doesn't have it dumped. This is currently seen in some
production kernels with LTO enabled.
Remove the static qualifier from hwerr_data so that the information is
still preserved when the kernel is built with LTO. Making hwerr_data a
global symbol ensures its debug info survives the LTO link process and
appears in kallsyms. Also document it, so it doesn't get removed in
the future as suggested by akpm.
Link: https://lkml.kernel.org/r/20260122-fix_vmcoreinfo-v2-1-2d6311f9e36c@debian.org Fixes: 3fa805c37dd4 ("vmcoreinfo: track and log recoverable hardware errors") Signed-off-by: Breno Leitao <leitao@debian.org> Acked-by: Baoquan He <bhe@redhat.com> Cc: Dave Young <dyoung@redhat.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Omar Sandoval <osandov@osandov.com> Cc: Shuai Xue <xueshuai@linux.alibaba.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Zhiquan Li <zhiquan1.li@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Matthew Brost [Fri, 16 Jan 2026 11:10:16 +0000 (12:10 +0100)]
mm/zone_device: reinitialize large zone device private folios
Reinitialize metadata for large zone device private folios in
zone_device_page_init prior to creating a higher-order zone device private
folio. This step is necessary when the folio's order changes dynamically
between zone_device_page_init calls to avoid building a corrupt folio. As
part of the metadata reinitialization, the dev_pagemap must be passed in
from the caller because the pgmap stored in the folio page may have been
overwritten with a compound head.
Without this fix, individual pages could have invalid pgmap fields and
flags (with PG_locked being notably problematic) due to prior different
order allocations, which can, and will, result in kernel crashes.
Link: https://lkml.kernel.org/r/20260116111325.1736137-2-francois.dugast@intel.com Fixes: d245f9b4ab80 ("mm/zone_device: support large zone device private folios") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com> Acked-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Balbir Singh <balbirs@nvidia.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Zi Yan <ziy@nvidia.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Christophe Leroy (CS GROUP)" <chleroy@kernel.org> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: David Airlie <airlied@gmail.com> Cc: Simona Vetter <simona@ffwll.ch> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Lyude Paul <lyude@redhat.com> Cc: Danilo Krummrich <dakr@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Leon Romanovsky <leon@kernel.org> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Waiman Long [Thu, 22 Jan 2026 18:43:43 +0000 (13:43 -0500)]
mm/mm_init: don't cond_resched() in deferred_init_memmap_chunk() if called from deferred_grow_zone()
Commit 3acb913c9d5b ("mm/mm_init: use deferred_init_memmap_chunk() in
deferred_grow_zone()") made deferred_grow_zone() call
deferred_init_memmap_chunk() within a pgdat_resize_lock() critical section
with irqs disabled. It did check for irqs_disabled() in
deferred_init_memmap_chunk() to avoid calling cond_resched(). For a
PREEMPT_RT kernel build, however, spin_lock_irqsave() does not disable
interrupt but rcu_read_lock() is called. This leads to the following bug
report.
Fix it adding a new argument to deferred_init_memmap_chunk() to explicitly
tell it if cond_resched() is allowed or not instead of relying on some
current state information which may vary depending on the exact kernel
configuration options that are enabled.
Link: https://lkml.kernel.org/r/20260122184343.546627-1-longman@redhat.com Fixes: 3acb913c9d5b ("mm/mm_init: use deferred_init_memmap_chunk() in deferred_grow_zone()") Signed-off-by: Waiman Long <longman@redhat.com> Suggested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: "Paul E . McKenney" <paulmck@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: <stable@vger.kernrl.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Pimyn Girgis [Tue, 20 Jan 2026 16:15:10 +0000 (17:15 +0100)]
mm/kfence: randomize the freelist on initialization
Randomize the KFENCE freelist during pool initialization to make
allocation patterns less predictable. This is achieved by shuffling the
order in which metadata objects are added to the freelist using
get_random_u32_below().
Additionally, ensure the error path correctly calculates the address range
to be reset if initialization fails, as the address increment logic has
been moved to a separate loop.
Ran Xiaokai [Thu, 22 Jan 2026 13:27:40 +0000 (13:27 +0000)]
kho: init alloc tags when restoring pages from reserved memory
Memblock pages (including reserved memory) should have their allocation
tags initialized to CODETAG_EMPTY via clear_page_tag_ref() before being
released to the page allocator. When kho restores pages through
kho_restore_page(), missing this call causes mismatched
allocation/deallocation tracking and below warning message:
alloc_tag was not set
WARNING: include/linux/alloc_tag.h:164 at ___free_pages+0xb8/0x260, CPU#1: swapper/0/1
RIP: 0010:___free_pages+0xb8/0x260
kho_restore_vmalloc+0x187/0x2e0
kho_test_init+0x3c4/0xa30
do_one_initcall+0x62/0x2b0
kernel_init_freeable+0x25b/0x480
kernel_init+0x1a/0x1c0
ret_from_fork+0x2d1/0x360
Add missing clear_page_tag_ref() annotation in kho_restore_page() to
fix this.
Link: https://lkml.kernel.org/r/20260122132740.176468-1-ranxiaokai627@163.com Fixes: fc33e4b44b27 ("kexec: enable KHO support for memory preservation") Signed-off-by: Ran Xiaokai <ran.xiaokai@zte.com.cn> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Suren Baghdasaryan <surenb@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm: memfd_luo: use memfd_alloc_file() instead of shmem_file_setup()
When restoring a memfd, the file is created using shmem_file_setup().
While memfd creation also calls this function to get the file, it also
does other things:
1. The O_LARGEFILE flag is set on the file. If this is not done,
writes on the memfd exceeding 2 GiB fail.
2. FMODE_LSEEK, FMODE_PREAD, and FMODE_PWRITE are set on the file.
This makes sure the file is seekable and can be used with pread() and
pwrite().
3. Initializes the security field for the inode and makes sure that
inode creation is permitted by the security module.
Currently, none of those things are done. This means writes above 2 GiB
fail, pread(), and pwrite() fail, and so on. lseek() happens to work
because file_init_path() sets it because shmem defines fop->llseek.
Fix this by using memfd_alloc_file() to get the file to make sure the
initialization sequence for normal and preserved memfd is the same.
This series contains a couple of fixes for memfd preservation using LUO.
This patch (of 3):
The Live Update Orchestrator's (LUO) memfd preservation works by
preserving all the folios of a memfd, re-creating an empty memfd on the
next boot, and then inserting back the preserved folios.
Currently it creates the file by directly calling shmem_file_setup().
This leaves out other work done by alloc_file() like setting up the file
mode, flags, or calling the security hooks.
Export alloc_file() to let memfd_luo use it. Rename it to
memfd_alloc_file() since it is no longer private and thus needs a
subsystem prefix.
Jan Kara [Wed, 21 Jan 2026 11:27:30 +0000 (12:27 +0100)]
flex_proportions: make fprop_new_period() hardirq safe
Bernd has reported a lockdep splat from flexible proportions code that is
essentially complaining about the following race:
<timer fires>
run_timer_softirq - we are in softirq context
call_timer_fn
writeout_period
fprop_new_period
write_seqcount_begin(&p->sequence);
<hardirq is raised>
...
blk_mq_end_request()
blk_update_request()
ext4_end_bio()
folio_end_writeback()
__wb_writeout_add()
__fprop_add_percpu_max()
if (unlikely(max_frac < FPROP_FRAC_BASE)) {
fprop_fraction_percpu()
seq = read_seqcount_begin(&p->sequence);
- sees odd sequence so loops indefinitely
Note that a deadlock like this is only possible if the bdi has configured
maximum fraction of writeout throughput which is very rare in general but
frequent for example for FUSE bdis. To fix this problem we have to make
sure write section of the sequence counter is irqsafe.
Link: https://lkml.kernel.org/r/20260121112729.24463-2-jack@suse.cz Fixes: a91befde3503 ("lib/flex_proportions.c: remove local_irq_ops in fprop_new_period()") Signed-off-by: Jan Kara <jack@suse.cz> Reported-by: Bernd Schubert <bernd@bsbernd.com> Link: https://lore.kernel.org/all/9b845a47-9aee-43dd-99bc-1a82bea00442@bsbernd.com/ Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Joanne Koong <joannelkoong@gmail.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Map my address <adeep@lexina.in> to new personal address <v@baodeep.com>
Old domain lexina.in will no longer be accessible due to registration
expiration.
Jane Chu [Tue, 20 Jan 2026 23:22:34 +0000 (16:22 -0700)]
mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn
When a hugetlb folio is being poisoned again, try_memory_failure_hugetlb()
passed head pfn to kill_accessing_process(), that is not right. The
precise pfn of the poisoned page should be used in order to determine the
precise vaddr as the SIGBUS payload.
This issue has already been taken care of in the normal path, that is,
hwpoison_user_mappings(), see [1][2]. Further more, for [3] to work
correctly in the hugetlb repoisoning case, it's essential to inform VM the
precise poisoned page, not the head page.
Link: https://lkml.kernel.org/r/20260120232234.3462258-2-jane.chu@oracle.com Signed-off-by: Jane Chu <jane.chu@oracle.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Acked-by: Miaohe Lin <linmiaohe@huawei.com> Cc: Chris Mason <clm@meta.com> Cc: David Hildenbrand <david@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Jiaqi Yan <jiaqiyan@google.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Naoya Horiguchi <nao.horiguchi@gmail.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Suren Baghdasaryan <surenb@google.com> Cc: William Roche <william.roche@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Jane Chu [Tue, 20 Jan 2026 23:22:33 +0000 (16:22 -0700)]
mm/memory-failure: fix missing ->mf_stats count in hugetlb poison
When a newly poisoned subpage ends up in an already poisoned hugetlb
folio, 'num_poisoned_pages' is incremented, but the per node ->mf_stats is
not. Fix the inconsistency by designating action_result() to update them
both.
While at it, define __get_huge_page_for_hwpoison() return values in terms
of symbol names for better readibility. Also rename
folio_set_hugetlb_hwpoison() to hugetlb_update_hwpoison() since the
function does more than the conventional bit setting and the fact three
possible return values are expected.
Link: https://lkml.kernel.org/r/20260120232234.3462258-1-jane.chu@oracle.com Fixes: 18f41fa616ee ("mm: memory-failure: bump memory failure stats to pglist_data") Signed-off-by: Jane Chu <jane.chu@oracle.com> Acked-by: Miaohe Lin <linmiaohe@huawei.com> Cc: Chris Mason <clm@meta.com> Cc: David Hildenbrand <david@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Jiaqi Yan <jiaqiyan@google.com> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Naoya Horiguchi <nao.horiguchi@gmail.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Suren Baghdasaryan <surenb@google.com> Cc: William Roche <william.roche@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
robin.kuo [Fri, 16 Jan 2026 06:25:00 +0000 (14:25 +0800)]
mm, swap: restore swap_space attr aviod kernel panic
commit 8b47299a411a ("mm, swap: mark swap address space ro and add context
debug check") made the swap address space read-only. It may lead to
kernel panic if arch_prepare_to_swap returns a failure under heavy memory
pressure as follows,
Restore swap address space as not ro to avoid the panic.
Link: https://lkml.kernel.org/r/20260116062535.306453-2-robin.kuo@mediatek.com Fixes: 8b47299a411a ("mm, swap: mark swap address space ro and add context debug check") Signed-off-by: robin.kuo <robin.kuo@mediatek.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: andrew.yang <andrew.yang@mediatek.com> Cc: AngeloGiaocchino Del Regno <angelogioacchino.delregno@collabora.com> Cc: Baoquan He <bhe@redhat.com> Cc: Barry Song <baohua@kernel.org> Cc: Chinwen Chang <chinwen.chang@mediatek.com> Cc: Chris Li <chrisl@kernel.org> Cc: Kairui Song <kasong@tencent.com> Cc: Kairui Song <ryncsn@gmail.com> Cc: Kemeng Shi <shikemeng@huaweicloud.com> Cc: Mathias Brugger <matthias.bgg@gmail.com> Cc: Nhat Pham <nphamcs@gmail.com> Cc: Qun-wei Lin <Qun-wei.Lin@mediatek.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Andrey Ryabinin [Tue, 13 Jan 2026 19:15:15 +0000 (20:15 +0100)]
mm/kasan: fix KASAN poisoning in vrealloc()
A KASAN warning can be triggered when vrealloc() changes the requested
size to a value that is not aligned to KASAN_GRANULE_SIZE.
------------[ cut here ]------------
WARNING: CPU: 2 PID: 1 at mm/kasan/shadow.c:174 kasan_unpoison+0x40/0x48
...
pc : kasan_unpoison+0x40/0x48
lr : __kasan_unpoison_vmalloc+0x40/0x68
Call trace:
kasan_unpoison+0x40/0x48 (P)
vrealloc_node_align_noprof+0x200/0x320
bpf_patch_insn_data+0x90/0x2f0
convert_ctx_accesses+0x8c0/0x1158
bpf_check+0x1488/0x1900
bpf_prog_load+0xd20/0x1258
__sys_bpf+0x96c/0xdf0
__arm64_sys_bpf+0x50/0xa0
invoke_syscall+0x90/0x160
Introduce a dedicated kasan_vrealloc() helper that centralizes KASAN
handling for vmalloc reallocations. The helper accounts for KASAN granule
alignment when growing or shrinking an allocation and ensures that partial
granules are handled correctly.
Use this helper from vrealloc_node_align_noprof() to fix poisoning logic.
Kairui Song [Mon, 19 Jan 2026 16:11:21 +0000 (00:11 +0800)]
mm/shmem, swap: fix race of truncate and swap entry split
The helper for shmem swap freeing is not handling the order of swap
entries correctly. It uses xa_cmpxchg_irq to erase the swap entry, but it
gets the entry order before that using xa_get_order without lock
protection, and it may get an outdated order value if the entry is split
or changed in other ways after the xa_get_order and before the
xa_cmpxchg_irq.
And besides, the order could grow and be larger than expected, and cause
truncation to erase data beyond the end border. For example, if the
target entry and following entries are swapped in or freed, then a large
folio was added in place and swapped out, using the same entry, the
xa_cmpxchg_irq will still succeed, it's very unlikely to happen though.
To fix that, open code the Xarray cmpxchg and put the order retrieval and
value checking in the same critical section. Also, ensure the order won't
exceed the end border, skip it if the entry goes across the border.
Skipping large swap entries crosses the end border is safe here. Shmem
truncate iterates the range twice, in the first iteration,
find_lock_entries already filtered such entries, and shmem will swapin the
entries that cross the end border and partially truncate the folio (split
the folio or at least zero part of it). So in the second loop here, if we
see a swap entry that crosses the end order, it must at least have its
content erased already.
I observed random swapoff hangs and kernel panics when stress testing
ZSWAP with shmem. After applying this patch, all problems are gone.
Link: https://lkml.kernel.org/r/20260120-shmem-swap-fix-v3-1-3d33ebfbc057@tencent.com Fixes: 809bc86517cc ("mm: shmem: support large folio swap out") Signed-off-by: Kairui Song <kasong@tencent.com> Reviewed-by: Nhat Pham <nphamcs@gmail.com> Acked-by: Chris Li <chrisl@kernel.org> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Baoquan He <bhe@redhat.com> Cc: Barry Song <baohua@kernel.org> Cc: Hugh Dickins <hughd@google.com> Cc: Kemeng Shi <shikemeng@huaweicloud.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Linus Torvalds [Tue, 20 Jan 2026 23:01:15 +0000 (15:01 -0800)]
Merge tag 'devicetree-fixes-for-6.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull devicetree fixes from Rob Herring:
- Fix a refcount leak in of_alias_scan()
- Support descending into child nodes when populating nodes
in /firmware
* tag 'devicetree-fixes-for-6.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
of: fix reference count leak in of_alias_scan()
of: platform: Use default match table for /firmware
Linus Torvalds [Tue, 20 Jan 2026 21:32:16 +0000 (13:32 -0800)]
Merge tag 'mm-hotfixes-stable-2026-01-20-13-09' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
- A patch series from David Hildenbrand which fixes a few things
related to hugetlb PMD sharing
- The remainder are singletons, please see their changelogs for details
* tag 'mm-hotfixes-stable-2026-01-20-13-09' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm: restore per-memcg proactive reclaim with !CONFIG_NUMA
mm/kfence: fix potential deadlock in reboot notifier
Docs/mm/allocation-profiling: describe sysctrl limitations in debug mode
mm: do not copy page tables unnecessarily for VM_UFFD_WP
mm/hugetlb: fix excessive IPI broadcasts when unsharing PMD tables using mmu_gather
mm/rmap: fix two comments related to huge_pmd_unshare()
mm/hugetlb: fix two comments related to huge_pmd_unshare()
mm/hugetlb: fix hugetlb_pmd_shared()
mm: remove unnecessary and incorrect mmap lock assert
x86/kfence: avoid writing L1TF-vulnerable PTEs
mm/vma: do not leak memory when .mmap_prepare swaps the file
migrate: correct lock ordering for hugetlb file folios
panic: only warn about deprecated panic_print on write access
fs/writeback: skip AS_NO_DATA_INTEGRITY mappings in wait_sb_inodes()
mm: take into account mm_cid size for mm_struct static definitions
mm: rename cpu_bitmap field to flexible_array
mm: add missing static initializer for init_mm::mm_cid.lock
Linus Torvalds [Tue, 20 Jan 2026 17:46:29 +0000 (09:46 -0800)]
Merge tag 'pwm/for-6.19-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux
Pull pwm fixes and a maintainer update from Uwe Kleine-König:
- pwm: Ensure ioctl() returns a negative errno on error
This affects two ioctls on /dev/pwmchipX where the return value of
copy_to_user() was passed to userspace. This is fixed to return
-EFAULT now instead.
- pwm: max7360: Populate missing .sizeof_wfhw in max7360_pwm_ops
This fixes an oversight in the original commit that added support for
the max7360 driver (d93a75d94b79: "pwm: max7360: Add MAX7360 PWM
support"). There is no user-visible effect because the .sizeof_wfhw
member is just a safe guard that the memory provided by the core is
big enough. While it currently is big enough and there is no reason
to assume that will change, doing that correctly is necessary.
- MAINTAINERS: Add Michal Wilczynski as reviewer for PWM rust drivers
Michal cares for the Rust parts of the pwm subsystem. Several of the
patches sent recently for the (for now) only Rust pwm driver did not
add Michal to Cc which resulted in the patches waiting for review as
I thought Michal would care but he wasn't aware of them.
* tag 'pwm/for-6.19-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux:
MAINTAINERS: Add myself as reviewer for PWM rust drivers
pwm: max7360: Populate missing .sizeof_wfhw in max7360_pwm_ops
pwm: Ensure ioctl() returns a negative errno on error
Yosry Ahmed [Fri, 16 Jan 2026 20:52:47 +0000 (20:52 +0000)]
mm: restore per-memcg proactive reclaim with !CONFIG_NUMA
Commit 2b7226af730c ("mm/memcg: make memory.reclaim interface generic")
moved proactive reclaim logic from memory.reclaim handler to a generic
user_proactive_reclaim() helper to be used for per-node proactive reclaim.
However, user_proactive_reclaim() was only defined under CONFIG_NUMA, with
a stub always returning 0 otherwise. This broke memory.reclaim on
!CONFIG_NUMA configs, causing it to report success without actually
attempting reclaim.
Move the definition of user_proactive_reclaim() outside CONFIG_NUMA, and
instead define a stub for __node_reclaim() in the !CONFIG_NUMA case.
__node_reclaim() is only called from user_proactive_reclaim() when a write
is made to sys/devices/system/node/nodeX/reclaim, which is only defined
with CONFIG_NUMA.
Link: https://lkml.kernel.org/r/20260116205247.928004-1-yosry.ahmed@linux.dev Fixes: 2b7226af730c ("mm/memcg: make memory.reclaim interface generic") Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@kernel.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Xu <weixugc@google.com> Cc: Yuanchu Xie <yuanchu@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Breno Leitao [Fri, 16 Jan 2026 14:10:11 +0000 (06:10 -0800)]
mm/kfence: fix potential deadlock in reboot notifier
The reboot notifier callback can deadlock when calling
cancel_delayed_work_sync() if toggle_allocation_gate() is blocked in
wait_event_idle() waiting for allocations, that might not happen on
shutdown path.
The issue is that cancel_delayed_work_sync() waits for the work to
complete, but the work is waiting for kfence_allocation_gate > 0 which
requires allocations to happen (each allocation is increased by 1) -
allocations that may have stopped during shutdown.
Fix this by:
1. Using cancel_delayed_work() (non-sync) to avoid blocking. Now the
callback succeeds and return.
2. Adding wake_up() to unblock any waiting toggle_allocation_gate()
3. Adding !kfence_enabled to the wait condition so the wake succeeds
The static_branch_disable() IPI will still execute after the wake, but at
this early point in shutdown (reboot notifier runs with INT_MAX priority),
the system is still functional and CPUs can respond to IPIs.
Link: https://lkml.kernel.org/r/20260116-kfence_fix-v1-1-4165a055933f@debian.org Fixes: ce2bba89566b ("mm/kfence: add reboot notifier to disable KFENCE on shutdown") Signed-off-by: Breno Leitao <leitao@debian.org> Reported-by: Chris Mason <clm@meta.com> Closes: https://lore.kernel.org/all/20260113140234.677117-1-clm@meta.com/ Reviewed-by: Marco Elver <elver@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Breno Leitao <leitao@debian.org> Cc: Chris Mason <clm@meta.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Docs/mm/allocation-profiling: describe sysctrl limitations in debug mode
When CONFIG_MEM_ALLOC_PROFILING_DEBUG=y, /proc/sys/vm/mem_profiling is
read-only to avoid debug warnings in a scenario when an allocation is
made while profiling is disabled (allocation does not get an allocation
tag), then profiling gets enabled and allocation gets freed (warning due
to the allocation missing allocation tag).
Link: https://lkml.kernel.org/r/20260116184423.2708363-1-surenb@google.com Fixes: ebdf9ad4ca98 ("memprofiling: documentation") Reported-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Suren Baghdasaryan <surenb@google.com> Acked-by: SeongJae Park <sj@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Ran Xiaokai <ran.xiaokai@zte.com.cn> Cc: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Lorenzo Stoakes [Wed, 14 Jan 2026 11:00:06 +0000 (11:00 +0000)]
mm: do not copy page tables unnecessarily for VM_UFFD_WP
Commit ab04b530e7e8 ("mm: introduce copy-on-fork VMAs and make
VM_MAYBE_GUARD one") aggregates flags checks in vma_needs_copy(),
including VM_UFFD_WP.
However in doing so, it incorrectly performed this check against src_vma.
This check was done on the assumption that all relevant flags are copied
upon fork.
However the userfaultfd logic is very innovative in that it implements
custom logic on fork in dup_userfaultfd(), including a rather well hidden
case where lacking UFFD_FEATURE_EVENT_FORK causes VM_UFFD_WP to not be
propagated to the destination VMA.
And indeed, vma_needs_copy(), prior to this patch, did check this property
on dst_vma, not src_vma.
Since all the other relevant flags are copied on fork, we can simply fix
this by checking against dst_vma.
While we're here, we fix a comment against VM_COPY_ON_FORK (noting that it
did indeed already reference dst_vma) to make it abundantly clear that we
must check against the destination VMA.
Link: https://lkml.kernel.org/r/20260114110006.1047071-1-lorenzo.stoakes@oracle.com Fixes: ab04b530e7e8 ("mm: introduce copy-on-fork VMAs and make VM_MAYBE_GUARD one") Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reported-by: Chris Mason <clm@meta.com> Closes: https://lore.kernel.org/all/20260113231257.3002271-1-clm@meta.com/ Acked-by: David Hildenbrand (Red Hat) <david@kernel.org> Acked-by: Pedro Falcato <pfalcato@suse.de> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm/hugetlb: fix excessive IPI broadcasts when unsharing PMD tables using mmu_gather
As reported, ever since commit 1013af4f585f ("mm/hugetlb: fix
huge_pmd_unshare() vs GUP-fast race") we can end up in some situations
where we perform so many IPI broadcasts when unsharing hugetlb PMD page
tables that it severely regresses some workloads.
In particular, when we fork()+exit(), or when we munmap() a large
area backed by many shared PMD tables, we perform one IPI broadcast per
unshared PMD table.
There are two optimizations to be had:
(1) When we process (unshare) multiple such PMD tables, such as during
exit(), it is sufficient to send a single IPI broadcast (as long as
we respect locking rules) instead of one per PMD table.
Locking prevents that any of these PMD tables could get reused before
we drop the lock.
(2) When we are not the last sharer (> 2 users including us), there is
no need to send the IPI broadcast. The shared PMD tables cannot
become exclusive (fully unshared) before an IPI will be broadcasted
by the last sharer.
Concurrent GUP-fast could walk into a PMD table just before we
unshared it. It could then succeed in grabbing a page from the
shared page table even after munmap() etc succeeded (and supressed
an IPI). But there is not difference compared to GUP-fast just
sleeping for a while after grabbing the page and re-enabling IRQs.
Most importantly, GUP-fast will never walk into page tables that are
no-longer shared, because the last sharer will issue an IPI
broadcast.
(if ever required, checking whether the PUD changed in GUP-fast
after grabbing the page like we do in the PTE case could handle
this)
So let's rework PMD sharing TLB flushing + IPI sync to use the mmu_gather
infrastructure so we can implement these optimizations and demystify the
code at least a bit. Extend the mmu_gather infrastructure to be able to
deal with our special hugetlb PMD table sharing implementation.
To make initialization of the mmu_gather easier when working on a single
VMA (in particular, when dealing with hugetlb), provide
tlb_gather_mmu_vma().
We'll consolidate the handling for (full) unsharing of PMD tables in
tlb_unshare_pmd_ptdesc() and tlb_flush_unshared_tables(), and track
in "struct mmu_gather" whether we had (full) unsharing of PMD tables.
Because locking is very special (concurrent unsharing+reuse must be
prevented), we disallow deferring flushing to tlb_finish_mmu() and instead
require an explicit earlier call to tlb_flush_unshared_tables().
From hugetlb code, we call huge_pmd_unshare_flush() where we make sure
that the expected lock protecting us from concurrent unsharing+reuse is
still held.
Check with a VM_WARN_ON_ONCE() in tlb_finish_mmu() that
tlb_flush_unshared_tables() was properly called earlier.
Document it all properly.
Notes about tlb_remove_table_sync_one() interaction with unsharing:
There are two fairly tricky things:
(1) tlb_remove_table_sync_one() is a NOP on architectures without
CONFIG_MMU_GATHER_RCU_TABLE_FREE.
Here, the assumption is that the previous TLB flush would send an
IPI to all relevant CPUs. Careful: some architectures like x86 only
send IPIs to all relevant CPUs when tlb->freed_tables is set.
The relevant architectures should be selecting
MMU_GATHER_RCU_TABLE_FREE, but x86 might not do that in stable
kernels and it might have been problematic before this patch.
Also, the arch flushing behavior (independent of IPIs) is different
when tlb->freed_tables is set. Do we have to enlighten them to also
take care of tlb->unshared_tables? So far we didn't care, so
hopefully we are fine. Of course, we could be setting
tlb->freed_tables as well, but that might then unnecessarily flush
too much, because the semantics of tlb->freed_tables are a bit
fuzzy.
This patch changes nothing in this regard.
(2) tlb_remove_table_sync_one() is not a NOP on architectures with
CONFIG_MMU_GATHER_RCU_TABLE_FREE that actually don't need a sync.
Take x86 as an example: in the common case (!pv, !X86_FEATURE_INVLPGB)
we still issue IPIs during TLB flushes and don't actually need the
second tlb_remove_table_sync_one().
This optimized can be implemented on top of this, by checking e.g., in
tlb_remove_table_sync_one() whether we really need IPIs. But as
described in (1), it really must honor tlb->freed_tables then to
send IPIs to all relevant CPUs.
Notes on TLB flushing changes:
(1) Flushing for non-shared PMD tables
We're converting from flush_hugetlb_tlb_range() to
tlb_remove_huge_tlb_entry(). Given that we properly initialize the
MMU gather in tlb_gather_mmu_vma() to be hugetlb aware, similar to
__unmap_hugepage_range(), that should be fine.
(2) Flushing for shared PMD tables
We're converting from various things (flush_hugetlb_tlb_range(),
tlb_flush_pmd_range(), flush_tlb_range()) to tlb_flush_pmd_range().
tlb_flush_pmd_range() achieves the same that
tlb_remove_huge_tlb_entry() would achieve in these scenarios.
Note that tlb_remove_huge_tlb_entry() also calls
__tlb_remove_tlb_entry(), however that is only implemented on
powerpc, which does not support PMD table sharing.
Similar to (1), tlb_gather_mmu_vma() should make sure that TLB
flushing keeps on working as expected.
Further, note that the ptdesc_pmd_pts_dec() in huge_pmd_share() is not a
concern, as we are holding the i_mmap_lock the whole time, preventing
concurrent unsharing. That ptdesc_pmd_pts_dec() usage will be removed
separately as a cleanup later.
There are plenty more cleanups to be had, but they have to wait until
this is fixed.
[david@kernel.org: fix kerneldoc] Link: https://lkml.kernel.org/r/f223dd74-331c-412d-93fc-69e360a5006c@kernel.org Link: https://lkml.kernel.org/r/20251223214037.580860-5-david@kernel.org Fixes: 1013af4f585f ("mm/hugetlb: fix huge_pmd_unshare() vs GUP-fast race") Signed-off-by: David Hildenbrand (Red Hat) <david@kernel.org> Reported-by: Uschakow, Stanislav" <suschako@amazon.de> Closes: https://lore.kernel.org/all/4d3878531c76479d9f8ca9789dc6485d@amazon.de/ Tested-by: Laurence Oberman <loberman@redhat.com> Acked-by: Harry Yoo <harry.yoo@oracle.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liu Shixin <liushixin2@huawei.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Rik van Riel <riel@surriel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm/rmap: fix two comments related to huge_pmd_unshare()
PMD page table unsharing no longer touches the refcount of a PMD page
table. Also, it is not about dropping the refcount of a "PMD page" but
the "PMD page table".
Let's just simplify by saying that the PMD page table was unmapped,
consequently also unmapping the folio that was mapped into this page.
This code should be deduplicated in the future.
Link: https://lkml.kernel.org/r/20251223214037.580860-4-david@kernel.org Fixes: 59d9094df3d7 ("mm: hugetlb: independent PMD page table shared count") Signed-off-by: David Hildenbrand (Red Hat) <david@kernel.org> Reviewed-by: Rik van Riel <riel@surriel.com> Tested-by: Laurence Oberman <loberman@redhat.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Oscar Salvador <osalvador@suse.de> Cc: Liu Shixin <liushixin2@huawei.com> Cc: Harry Yoo <harry.yoo@oracle.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: "Uschakow, Stanislav" <suschako@amazon.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm/hugetlb: fix two comments related to huge_pmd_unshare()
Ever since we stopped using the page count to detect shared PMD page
tables, these comments are outdated.
The only reason we have to flush the TLB early is because once we drop the
i_mmap_rwsem, the previously shared page table could get freed (to then
get reallocated and used for other purpose). So we really have to flush
the TLB before that could happen.
So let's simplify the comments a bit.
The "If we unshared PMDs, the TLB flush was not recorded in mmu_gather."
part introduced as in commit a4a118f2eead ("hugetlbfs: flush TLBs
correctly after huge_pmd_unshare") was confusing: sure it is recorded in
the mmu_gather, otherwise tlb_flush_mmu_tlbonly() wouldn't do anything.
So let's drop that comment while at it as well.
We'll centralize these comments in a single helper as we rework the code
next.
Link: https://lkml.kernel.org/r/20251223214037.580860-3-david@kernel.org Fixes: 59d9094df3d7 ("mm: hugetlb: independent PMD page table shared count") Signed-off-by: David Hildenbrand (Red Hat) <david@kernel.org> Reviewed-by: Rik van Riel <riel@surriel.com> Tested-by: Laurence Oberman <loberman@redhat.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Oscar Salvador <osalvador@suse.de> Reviewed-by: Harry Yoo <harry.yoo@oracle.com> Cc: Liu Shixin <liushixin2@huawei.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: "Uschakow, Stanislav" <suschako@amazon.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "mm/hugetlb: fixes for PMD table sharing (incl. using
mmu_gather)", v3.
One functional fix, one performance regression fix, and two related
comment fixes.
I cleaned up my prototype I recently shared [1] for the performance fix,
deferring most of the cleanups I had in the prototype to a later point.
While doing that I identified the other things.
The goal of this patch set is to be backported to stable trees "fairly"
easily. At least patch #1 and #4.
Patch #1 fixes hugetlb_pmd_shared() not detecting any sharing
Patch #2 + #3 are simple comment fixes that patch #4 interacts with.
Patch #4 is a fix for the reported performance regression due to excessive
IPI broadcasts during fork()+exit().
The last patch is all about TLB flushes, IPIs and mmu_gather.
Read: complicated
There are plenty of cleanups in the future to be had + one reasonable
optimization on x86. But that's all out of scope for this series.
Runtime tested, with a focus on fixing the performance regression using
the original reproducer [2] on x86.
This patch (of 4):
We switched from (wrongly) using the page count to an independent shared
count. Now, shared page tables have a refcount of 1 (excluding
speculative references) and instead use ptdesc->pt_share_count to identify
sharing.
We didn't convert hugetlb_pmd_shared(), so right now, we would never
detect a shared PMD table as such, because sharing/unsharing no longer
touches the refcount of a PMD table.
Page migration, like mbind() or migrate_pages() would allow for migrating
folios mapped into such shared PMD tables, even though the folios are not
exclusive. In smaps we would account them as "private" although they are
"shared", and we would be wrongly setting the PM_MMAP_EXCLUSIVE in the
pagemap interface.
Fix it by properly using ptdesc_pmd_is_shared() in hugetlb_pmd_shared().
Lorenzo Stoakes [Wed, 14 Jan 2026 11:56:19 +0000 (11:56 +0000)]
mm: remove unnecessary and incorrect mmap lock assert
This check was introduced by commit 42fc541404f2 ("mmap locking API: add
mmap_assert_locked() and mmap_assert_write_locked()") which replaced a
VM_BUG_ON_VMA() over rwsem_is_locked from commit a00cc7d9dd93 ("mm, x86:
add support for PUD-sized transparent hugepages"), i.e. the commit that
introduced PUD THPs.
These seem to be careful asserts introduced to ensure that locks are held
in general, however for a zap we require that VMAs are kept stable, and
this is a requirement that has held perfectly well for a long time.
These were long before VMA locks and thus there appears to be no reason to
think this is assert is there for anything other than 'stabilised VMA'.
Asserting that the VMA under examination is stable only in the case of a
THP PUD is strange and unnecessary. If we wish to be careful and assert
such things, we should do so at the zap level.
However in any case the current situation is already simply incorrect - a
VMA lock suffices here.
Remove the assert for now as it is unnecessarily, incorrect and unhelpful,
subsequent work can introduce an assert in general for zapping if
required.
Link: https://lkml.kernel.org/r/20260114115619.1087466-1-lorenzo.stoakes@oracle.com Fixes: 2ab7f1bbafc9 ("mm/madvise: allow guard page install/remove under VMA lock") Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reported-by: Chris Mason <clm@meta.com> Closes: https://lore.kernel.org/all/20260113220856.2358195-1-clm@meta.com/ Acked-by: David Hildenbrand (Red Hat) <david@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: SeongJae Park <sj@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
MAINTAINERS: Add myself as reviewer for PWM rust drivers
I would like to help with reviewing the Rust part of the PWM drivers.
While I maintain the Rust bindings, adding this separate entry ensures I
am automatically CC-ed on the driver implementations (drivers/pwm/*.rs)
Xen PV guests are control their own pagetables; they choose the new
PTE value, and use hypercalls to make changes so Xen can audit for
safety.
In addition to a regular reference count, Xen also maintains a type
reference count. e.g. SegDesc (referenced by vGDT/vLDT), Writable
(referenced with _PAGE_RW) or L{1..4} (referenced by vCR3 or a lower
pagetable level). This is in order to prevent e.g. a page being
inserted into the pagetables for which the guest has a writable mapping.
For non-present mappings, all other bits become software accessible,
and typically contain metadata rather a real frame address. There is
nothing that a reference count could sensibly be tied to. As such, even
if Xen could recognise the address as currently safe, nothing would
prevent that frame from changing owner to another VM in the future.
When Xen detects a PV guest writing a L1TF-PTE, it responds by
activating shadow paging. This is normally only used for the live phase
of migration, and comes with a reasonable overhead.
KFENCE only cares about getting #PF to catch wild accesses; it doesn't
care about the value for non-present mappings. Use a fully inverted PTE,
to avoid hitting the slow path when running under Xen.
While adjusting the logic, take the opportunity to skip all actions if the
PTE is already in the right state, half the number PVOps callouts, and
skip TLB maintenance on a !P -> P transition which benefits non-Xen cases
too.
Link: https://lkml.kernel.org/r/20260106180426.710013-1-andrew.cooper3@citrix.com Fixes: 1dc0da6e9ec0 ("x86, kfence: enable KFENCE for x86") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Tested-by: Marco Elver <elver@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Marco Elver <elver@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Jann Horn <jannh@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Lorenzo Stoakes [Mon, 12 Jan 2026 15:51:43 +0000 (15:51 +0000)]
mm/vma: do not leak memory when .mmap_prepare swaps the file
The current implementation of mmap() is set up such that a struct file
object is obtained for the input fd in ksys_mmap_pgoff() via fget(), and
its reference count decremented at the end of the function via. fput().
If a merge can be achieved, we are fine to simply decrement the refcount
on the file. Otherwise, in __mmap_new_file_vma(), we increment the
reference count on the file via get_file() such that the fput() in
ksys_mmap_pgoff() does not free the now-referenced file object.
The introduction of the f_op->mmap_prepare hook changes things, as it
becomes possible for a driver to replace the file object right at the
beginning of the mmap operation.
The current implementation is buggy if this happens because it
unconditionally calls get_file() on the mapping's file whether or not it
was replaced (and thus whether or not its reference count will be
decremented at the end of ksys_mmap_pgoff()).
This results in a memory leak, and was exposed in commit ab04945f91bc
("mm: update mem char driver to use mmap_prepare").
This patch solves the problem by explicitly tracking whether we actually
need to call get_file() on the file or not, and only doing so if required.
Link: https://lkml.kernel.org/r/20260112155143.661284-1-lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Fixes: ab04945f91bc ("mm: update mem char driver to use mmap_prepare") Reported-by: syzbot+bf5de69ebb4bdf86f59f@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/6964a92b.050a0220.eaf7.008a.GAE@google.com/ Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The migration path is the one taking locks in the wrong order according to
the documentation at the top of mm/rmap.c. So expand the scope of the
existing i_mmap_lock to cover the calls to remove_migration_ptes() too.
This is (mostly) how it used to be after commit c0d0381ade79. That was
removed by 336bf30eb765 for both file & anon hugetlb pages when it should
only have been removed for anon hugetlb pages.
Link: https://lkml.kernel.org/r/20260109041345.3863089-2-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Fixes: 336bf30eb765 ("hugetlbfs: fix anon huge page migration race") Reported-by: syzbot+2d9c96466c978346b55f@syzkaller.appspotmail.com Link: https://lore.kernel.org/all/68e9715a.050a0220.1186a4.000d.GAE@google.com Debugged-by: Lance Yang <lance.yang@linux.dev> Acked-by: David Hildenbrand (Red Hat) <david@kernel.org> Acked-by: Zi Yan <ziy@nvidia.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Byungchul Park <byungchul@sk.com> Cc: Gregory Price <gourry@gourry.net> Cc: Jann Horn <jannh@google.com> Cc: Joshua Hahn <joshua.hahnjy@gmail.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Rakie Kim <rakie.kim@sk.com> Cc: Rik van Riel <riel@surriel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Ying Huang <ying.huang@linux.alibaba.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Gal Pressman [Tue, 6 Jan 2026 16:33:21 +0000 (18:33 +0200)]
panic: only warn about deprecated panic_print on write access
The panic_print_deprecated() warning is being triggered on both read and
write operations to the panic_print parameter.
This causes spurious warnings when users run 'sysctl -a' to list all
sysctl values, since that command reads /proc/sys/kernel/panic_print and
triggers the deprecation notice.
Modify the handlers to only emit the deprecation warning when the
parameter is actually being set:
- sysctl_panic_print_handler(): check 'write' flag before warning.
- panic_print_get(): remove the deprecation call entirely.
This way, users are only warned when they actively try to use the
deprecated parameter, not when passively querying system state.
Link: https://lkml.kernel.org/r/20260106163321.83586-1-gal@nvidia.com Fixes: ee13240cd78b ("panic: add note that panic_print sysctl interface is deprecated") Fixes: 2683df6539cb ("panic: add note that 'panic_print' parameter is deprecated") Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Nimrod Oren <noren@nvidia.com> Cc: Feng Tang <feng.tang@linux.alibaba.com> Cc: Joel Granados <joel.granados@kernel.org> Cc: Petr Mladek <pmladek@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Joanne Koong [Mon, 5 Jan 2026 21:17:27 +0000 (13:17 -0800)]
fs/writeback: skip AS_NO_DATA_INTEGRITY mappings in wait_sb_inodes()
Above the while() loop in wait_sb_inodes(), we document that we must wait
for all pages under writeback for data integrity. Consequently, if a
mapping, like fuse, traditionally does not have data integrity semantics,
there is no need to wait at all; we can simply skip these inodes.
This restores fuse back to prior behavior where syncs are no-ops. This
fixes a user regression where if a system is running a faulty fuse server
that does not reply to issued write requests, this causes wait_sb_inodes()
to wait forever.
Link: https://lkml.kernel.org/r/20260105211737.4105620-2-joannelkoong@gmail.com Fixes: 0c58a97f919c ("fuse: remove tmp folio for writebacks and internal rb tree") Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Reported-by: Athul Krishna <athul.krishna.kr@protonmail.com> Reported-by: J. Neuschäfer <j.neuschaefer@gmx.net> Reviewed-by: Bernd Schubert <bschubert@ddn.com> Tested-by: J. Neuschäfer <j.neuschaefer@gmx.net> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Bernd Schubert <bschubert@ddn.com> Cc: Bonaccorso Salvatore <carnil@debian.org> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The cpu_bitmap flexible array now contains more than just the cpu_bitmap.
In preparation for changing the static mm_struct definitions to cover for
the additional space required, change the cpu_bitmap type from "unsigned
long" to "char", require an unsigned long alignment of the flexible array,
and rename the field from "cpu_bitmap" to "flexible_array".
Introduce the MM_STRUCT_FLEXIBLE_ARRAY_INIT macro to statically initialize
the flexible array. This covers the init_mm and efi_mm static
definitions.
This is a preparation step for fixing the missing mm_cid size for static
mm_struct definitions.
Link: https://lkml.kernel.org/r/20251224173358.647691-3-mathieu.desnoyers@efficios.com Fixes: af7f588d8f73 ("sched: Introduce per-memory-map concurrency ID") Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Reviewed-by: Thomas Gleixner <tglx@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Aboorva Devarajan <aboorvad@linux.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Christan König <christian.koenig@amd.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Christoph Lameter <cl@linux.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Dennis Zhou <dennis@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: "Liam R . Howlett" <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Martin Liu <liumartin@google.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: SeongJae Park <sj@kernel.org> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Cc: Tejun Heo <tj@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yu Zhao <yuzhao@google.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Linus Torvalds [Mon, 19 Jan 2026 20:24:45 +0000 (12:24 -0800)]
Merge tag 'ata-6.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux
Pull ata fixes from Damien Le Moal:
- A set of fixes for link power management as the recent changes/fixes
introduced regressions with ATAPI devices and with adapters that have
DUMMY ports, preventing an adapter to fully reach a low power state
and thus preventing the system CPU from reaching low power C-states
(from Niklas)
* tag 'ata-6.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
ata: libata: Print features also for ATAPI devices
ata: libata: Add DIPM and HIPM to ata_dev_print_features() early return
ata: libata: Add cpr_log to ata_dev_print_features() early return
ata: libata-sata: Improve link_power_management_supported sysfs attribute
ata: libata: Call ata_dev_config_lpm() for ATAPI devices
ata: ahci: Do not read the per port area for unimplemented ports
Richard Genoud [Tue, 13 Jan 2026 16:39:07 +0000 (17:39 +0100)]
pwm: max7360: Populate missing .sizeof_wfhw in max7360_pwm_ops
The sizeof_wfhw field wasn't populated in max7360_pwm_ops so it was set
to 0 by default.
While this is ok for now because:
sizeof(struct max7360_pwm_waveform) < PWM_WFHWSIZE
in the future, if struct max7360_pwm_waveform grows, it could lead to
stack corruption.
Uwe Kleine-König [Mon, 19 Jan 2026 15:13:26 +0000 (16:13 +0100)]
pwm: Ensure ioctl() returns a negative errno on error
copy_to_user() returns the number of bytes not copied, thus if there is
a problem a positive number. However the ioctl callback is supposed to
return a negative error code on error.
This error is a unfortunate as strictly speaking it became ABI with the
introduction of pwm character devices. However I never saw the issue in
real life -- I found this by code inspection -- and it only affects an
error case where readonly memory is passed to the ioctls or the address
mapping changes while the ioctl is active. Also there are already error
cases returning negative values, so the calling code must be prepared to
see such values already.
Linus Torvalds [Sun, 18 Jan 2026 23:15:47 +0000 (15:15 -0800)]
Merge tag 'landlock-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux
Pull landlock fixes from Mickaël Salaün:
"This fixes TCP handling, tests, documentation, non-audit elided code,
and minor cosmetic changes"
* tag 'landlock-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
landlock: Clarify documentation for the IOCTL access right
selftests/landlock: Properly close a file descriptor
landlock: Improve the comment for domain_is_scoped
selftests/landlock: Use scoped_base_variants.h for ptrace_test
selftests/landlock: Fix missing semicolon
selftests/landlock: Fix typo in fs_test
landlock: Optimize stack usage when !CONFIG_AUDIT
landlock: Fix spelling
landlock: Clean up hook_ptrace_access_check()
landlock: Improve erratum documentation
landlock: Remove useless include
landlock: Fix wrong type usage
selftests/landlock: NULL-terminate unix pathname addresses
selftests/landlock: Remove invalid unix socket bind()
selftests/landlock: Add missing connect(minimal AF_UNSPEC) test
selftests/landlock: Fix TCP bind(AF_UNSPEC) test case
landlock: Fix TCP handling of short AF_UNSPEC addresses
landlock: Fix formatting
Linus Torvalds [Sun, 18 Jan 2026 21:18:40 +0000 (13:18 -0800)]
Merge tag 'phy-fixes-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy
Pull phy fixes from Vinod Koul:
"A bunch of driver fixes:
- Freescale typec orientation switch fix, clearing register fix,
assertion of phy reset during power on
- Qualcomm pcs register clear before using
- stm one off fix
- TI runtimepm error handling, regmap leak fixes
- Rockchip gadget mode disconnection and disruption fixes
- Tegra register level fix
- Broadcom pointer cast warning fix"
* tag 'phy-fixes-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy:
phy: freescale: imx8m-pcie: assert phy reset during power on
phy: rockchip: inno-usb2: Fix a double free bug in rockchip_usb2phy_probe()
phy: broadcom: ns-usb3: Fix Wvoid-pointer-to-enum-cast warning (again)
phy: tegra: xusb: Explicitly configure HS_DISCON_LEVEL to 0x7
phy: rockchip: inno-usb2: fix communication disruption in gadget mode
phy: rockchip: inno-usb2: fix disconnection in gadget mode
phy: ti: gmii-sel: fix regmap leak on probe failure
phy: sparx5-serdes: make it selectable for ARCH_LAN969X
phy: ti: da8xx-usb: Handle devm_pm_runtime_enable() errors
phy: stm32-usphyc: Fix off by one in probe()
phy: qcom-qusb2: Fix NULL pointer dereference on early suspend
phy: fsl-imx8mq-usb: Clear the PCS_TX_SWING_FULL field before using it
dt-bindings: phy: qcom,sc8280xp-qmp-pcie-phy: Update pcie phy bindings for qcs8300
phy: fsl-imx8mq-usb: fix typec orientation switch when built as module
Linus Torvalds [Sun, 18 Jan 2026 20:29:12 +0000 (12:29 -0800)]
Merge tag 'soundwire-6.19-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire
Pull soundwire fix from Vinod Koul:
- Single off-by-one fix for allocating slave id
* tag 'soundwire-6.19-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
soundwire: bus: fix off-by-one when allocating slave IDs
Linus Torvalds [Sun, 18 Jan 2026 20:09:13 +0000 (12:09 -0800)]
Merge tag 'usb-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some small USB fixes and new device ids for 6.19-rc6
Included in here are:
- new usb-serial device ids
- dwc3-apple driver fixes to get things working properly on that
hardware platform
- ohci/uhci platfrom driver module soft-deps with ehci to remove a
runtime warning that sometimes shows up on some platforms.
- quirk for broken devices that can not handle reading the BOS
descriptor from them without going crazy.
- usb-serial driver fixes
- xhci driver fixes
- usb gadget driver fixes
All of these except for the last xhci fix has been in linux-next for a
while. The xhci fix has been reported by others to solve the issue for
them, so should be ok"
* tag 'usb-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
xhci: sideband: don't dereference freed ring when removing sideband endpoint
usb: gadget: uvc: retry vb2_reqbufs() with vb_vmalloc_memops if use_sg fail
usb: gadget: uvc: return error from uvcg_queue_init()
usb: gadget: uvc: fix interval_duration calculation
usb: gadget: uvc: fix req_payload_size calculation
usb: dwc3: apple: Ignore USB role switches to the active role
usb: host: xhci-tegra: Use platform_get_irq_optional() for wake IRQs
USB: OHCI/UHCI: Add soft dependencies on ehci_platform
usb: dwc3: apple: Set USB2 PHY mode before dwc3 init
USB: serial: f81232: fix incomplete serial port generation
USB: serial: ftdi_sio: add support for PICAXE AXE027 cable
USB: serial: option: add Telit LE910 MBIM composition
usb: core: add USB_QUIRK_NO_BOS for devices that hang on BOS descriptor
dt-bindings: usb: qcom,dwc3: Correct MSM8994 interrupts
dt-bindings: usb: qcom,dwc3: Correct IPQ5018 interrupts
tcpm: allow looking for role_sw device in the main node
usb: dwc3: Check for USB4 IP_NAME
Linus Torvalds [Sun, 18 Jan 2026 20:00:35 +0000 (12:00 -0800)]
Merge tag 'i2c-for-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
- riic, imx-lpi2c: suspend/resume fixes
- qcom-geni: DMA handling fix
- iproc: correct DT binding description
* tag 'i2c-for-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: imx-lpi2c: change to PIO mode in system-wide suspend/resume progress
i2c: qcom-geni: make sure I2C hub controllers can't use SE DMA
i2c: riic: Move suspend handling to NOIRQ phase
dt-bindings: i2c: brcm,iproc-i2c: Allow 2 reg entries for brcm,iproc-nic-i2c
Linus Torvalds [Sun, 18 Jan 2026 19:39:56 +0000 (11:39 -0800)]
Merge tag 'edac_urgent_for_v6.19_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras
Pull EDAC fixes from Borislav Petkov:
"Make sure the memory-mapped memory controller registers BAR gets
unmapped when the driver memory allocation fails
Fix that in both x38 and i3200 EDAC drivers as former has copied the
bug from the latter, it looks like"
* tag 'edac_urgent_for_v6.19_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
EDAC/x38: Fix a resource leak in x38_probe1()
EDAC/i3200: Fix a resource leak in i3200_probe1()
Linus Torvalds [Sun, 18 Jan 2026 19:34:11 +0000 (11:34 -0800)]
Merge tag 'x86-urgent-2026-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc x86 fixes from Ingo Molnar:
- Fix resctrl initialization on Hygon CPUs
- Fix resctrl memory bandwidth counters on Hygon CPUs
- Fix x86 self-tests build bug
* tag 'x86-urgent-2026-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
selftests/x86: Add selftests include path for kselftest.h after centralization
x86/resctrl: Fix memory bandwidth counter width for Hygon
x86/resctrl: Add missing resctrl initialization for Hygon
Linus Torvalds [Sun, 18 Jan 2026 18:17:40 +0000 (10:17 -0800)]
Merge tag 'sched-urgent-2026-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
"Misc deadline scheduler fixes, mainly for a new category of bugs that
were discovered and fixed recently:
- Fix a race condition in the DL server
- Fix a DL server bug which can result in incorrectly going idle when
there's work available
- Fix DL server bug which triggers a WARN() due to broken
get_prio_dl() logic and subsequent misbehavior
- Fix double update_rq_clock() calls
- Fix setscheduler() assumption about static priorities
- Make sure balancing callbacks are always called
- Plus a handful of preparatory commits for the fixes"
* tag 'sched-urgent-2026-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/deadline: Use ENQUEUE_MOVE to allow priority change
sched: Deadline has dynamic priority
sched: Audit MOVE vs balance_callbacks
sched: Fold rq-pin swizzle into __balance_callbacks()
sched/deadline: Avoid double update_rq_clock()
sched/deadline: Ensure get_prio_dl() is up-to-date
sched/deadline: Fix server stopping with runnable tasks
sched: Provide idle_rq() helper
sched/deadline: Fix potential race in dl_add_task_root_domain()
sched/deadline: Remove unnecessary comment in dl_add_task_root_domain()
Linus Torvalds [Sun, 18 Jan 2026 17:09:32 +0000 (09:09 -0800)]
Merge tag 'objtool-urgent-2026-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool fixes from Ingo Molnar:
"Fix two objtool build failures that trigger in uncommon build
environments"
* tag 'objtool-urgent-2026-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: fix build failure due to missing libopcodes check
objtool: fix compilation failure with the x32 toolchain
Julian Sun [Mon, 8 Dec 2025 12:37:13 +0000 (20:37 +0800)]
ext4: add missing down_write_data_sem in mext_move_extent().
Commit 962e8a01eab9 ("ext4: introduce mext_move_extent()") attempts to
call ext4_swap_extents() on the failure path to recover the swapped
extents, but fails to acquire locks for the two inode->i_data_sem,
triggering the BUG_ON statement in ext4_swap_extents().
This issue can be fixed by calling ext4_double_down_write_data_sem()
before ext4_swap_extents().
Signed-off-by: Julian Sun <sunjunchao@bytedance.com> Reported-by: syzbot+4ea6bd8737669b423aae@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/69368649.a70a0220.38f243.0093.GAE@google.com/ Fixes: 962e8a01eab9 ("ext4: introduce mext_move_extent()") Reviewed-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Link: https://patch.msgid.link/20251208123713.1971068-1-sunjunchao@bytedance.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Arnd Bergmann [Thu, 4 Dec 2025 10:19:10 +0000 (11:19 +0100)]
ext4: fix ext4_tune_sb_params padding
The padding at the end of struct ext4_tune_sb_params is architecture
specific and in particular is different between x86-32 and x86-64,
since the __u64 member only enforces struct alignment on the latter.
This shows up as a new warning when test-building the headers with
-Wpadded:
include/linux/ext4.h:144:1: error: padding struct size to alignment boundary with 4 bytes [-Werror=padded]
All members inside the structure are naturally aligned, so the only
difference here is the amount of padding at the end. Make the padding
explicit, to have a consistent sizeof(struct ext4_tune_sb_params) of
232 on all architectures and avoid adding compat ioctl handling for
EXT4_IOC_GET_TUNE_SB_PARAM/EXT4_IOC_SET_TUNE_SB_PARAM.
This is an ABI break on x86-32 but hopefully this can go into 6.18.y early
enough as a fixup so no actual users will be affected. Alternatively, the
kernel could handle the ioctl commands for both sizes (232 and 228 bytes)
on all architectures.
Fixes: 04a91570ac67 ("ext4: implemet new ioctls to set and get superblock parameters") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20251204101914.1037148-1-arnd@kernel.org Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
Linus Torvalds [Sun, 18 Jan 2026 03:29:32 +0000 (19:29 -0800)]
Merge tag 'for-6.19-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- with large folios in use, fix partial incorrect update of a reflinked
range
- fix potential deadlock in iget when lookup fails and eviction is
needed
- in send, validate inline extent type while detecting file holes
- fix memory leak after an error when creating a space info
- remove zone statistics from sysfs again, the output size limitations
make it unusable, we'll do it in another way in another release
- test fixes:
- return proper error codes from block remapping tests
- fix tree root leaks in qgroup tests after errors
* tag 'for-6.19-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: remove zoned statistics from sysfs
btrfs: fix memory leaks in create_space_info() error paths
btrfs: invalidate pages instead of truncate after reflinking
btrfs: update the Kconfig string for CONFIG_BTRFS_EXPERIMENTAL
btrfs: send: check for inline extents in range_is_hole_in_parent()
btrfs: tests: fix return 0 on rmap test failure
btrfs: tests: fix root tree leak in btrfs_test_qgroups()
btrfs: release path before iget_failed() in btrfs_read_locked_inode()
Linus Torvalds [Sun, 18 Jan 2026 03:24:48 +0000 (19:24 -0800)]
Merge tag 'loongarch-fixes-6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch fixes from Huacai Chen:
"Remove redundant code in head.S, fix PMU counter allocation for mixed-
type event groups, fix a lot of dts build warnings, and fix kvm_device
memory leaks"
* tag 'loongarch-fixes-6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
LoongArch: KVM: Fix kvm_device leak in kvm_pch_pic_destroy()
LoongArch: KVM: Fix kvm_device leak in kvm_eiointc_destroy()
LoongArch: KVM: Fix kvm_device leak in kvm_ipi_destroy()
LoongArch: dts: loongson-2k1000: Fix i2c-gpio node names
LoongArch: dts: loongson-2k2000: Add default interrupt controller address cells
LoongArch: dts: loongson-2k1000: Add default interrupt controller address cells
LoongArch: dts: loongson-2k0500: Add default interrupt controller address cells
LoongArch: dts: Describe PCI sideband IRQ through interrupt-extended
LoongArch: Fix PMU counter allocation for mixed-type event groups
LoongArch: Remove redundant code in head.S
Linus Torvalds [Sat, 17 Jan 2026 16:52:45 +0000 (08:52 -0800)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
"An arm64/mpam fix to use non-atomic bitops on struct mmap_props member
(atomicity not required).
For kunit testing, the structure is packed to avoid memcmp() errors
but this affects atomic bitops as they have strict alignment
requirements.
Also remove a duplicate include in the mpam driver"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm_mpam: Use non-atomic bitops when modifying feature bitmap
arm_mpam: Remove duplicate linux/srcu.h header
Weigang He [Sat, 17 Jan 2026 09:12:38 +0000 (09:12 +0000)]
of: fix reference count leak in of_alias_scan()
of_find_node_by_path() returns a device_node with its refcount
incremented. When kstrtoint() fails or dt_alloc() fails, the function
continues to the next iteration without calling of_node_put(), causing
a reference count leak.
Add of_node_put(np) before continue on both error paths to properly
release the device_node reference.
selftests/x86: Add selftests include path for kselftest.h after centralization
The previous change centralizing kselftest.h include path in lib.mk caused x86
selftests to fail, as x86 Makefile overwrites CFLAGS using ":=", dropping the
include path added in lib.mk. Therefore, helpers.h could not find kselftest.h
during compilation.
Fix this by adding the tools/testing/sefltest to CFLAGS in x86 Makefile.
Linus Torvalds [Sat, 17 Jan 2026 04:59:46 +0000 (20:59 -0800)]
Merge tag 'block-6.19-20260116' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull block fixes from Jens Axboe:
- NVMe pull request via Keith:
- Device quirk to disable faulty temperature (Ilikara)
- TCP target null pointer fix from bad host protocol usage (Shivam)
- Add apple,t8103-nvme-ans2 as a compatible apple controller
(Janne)
- FC tagset leak fix (Chaitanya)
- TCP socket deadlock fix (Hannes)
- Target name buffer overrun fix (Shin'ichiro)
- Fix for an underflow for rnbd during device unmap
- Zero the non-PI part of the auto integrity buffer
- Fix for a configfs memory leak in the null block driver
* tag 'block-6.19-20260116' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
rnbd-clt: fix refcount underflow in device unmap path
nvme: fix PCIe subsystem reset controller state transition
nvmet: do not copy beyond sybsysnqn string length
nvmet-tcp: fixup hang in nvmet_tcp_listen_data_ready()
null_blk: fix kmemleak by releasing references to fault configfs items
block: zero non-PI portion of auto integrity buffer
nvme-fc: release admin tagset if init fails
nvme-apple: add "apple,t8103-nvme-ans2" as compatible
nvme-tcp: fix NULL pointer dereferences in nvmet_tcp_build_pdu_iovec
nvme-pci: disable secondary temp for Wodposit WPBSNM8
Qiang Ma [Sat, 17 Jan 2026 02:57:03 +0000 (10:57 +0800)]
LoongArch: KVM: Fix kvm_device leak in kvm_pch_pic_destroy()
In kvm_ioctl_create_device(), kvm_device has allocated memory,
kvm_device->destroy() seems to be supposed to free its kvm_device
struct, but kvm_pch_pic_destroy() is not currently doing this, that
would lead to a memory leak.
So, fix it.
Cc: stable@vger.kernel.org Reviewed-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Qiang Ma <maqianga@uniontech.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Qiang Ma [Sat, 17 Jan 2026 02:57:02 +0000 (10:57 +0800)]
LoongArch: KVM: Fix kvm_device leak in kvm_eiointc_destroy()
In kvm_ioctl_create_device(), kvm_device has allocated memory,
kvm_device->destroy() seems to be supposed to free its kvm_device
struct, but kvm_eiointc_destroy() is not currently doing this, that
would lead to a memory leak.
So, fix it.
Cc: stable@vger.kernel.org Reviewed-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Qiang Ma <maqianga@uniontech.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Qiang Ma [Sat, 17 Jan 2026 02:57:02 +0000 (10:57 +0800)]
LoongArch: KVM: Fix kvm_device leak in kvm_ipi_destroy()
In kvm_ioctl_create_device(), kvm_device has allocated memory,
kvm_device->destroy() seems to be supposed to free its kvm_device
struct, but kvm_ipi_destroy() is not currently doing this, that
would lead to a memory leak.
So, fix it.
Cc: stable@vger.kernel.org Reviewed-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Qiang Ma <maqianga@uniontech.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Add missing address-cells 0 to the Local I/O, Extend I/O and PCH-PIC
Interrupt Controller node to silence W=1 warning:
loongson-2k2000.dtsi:364.5-49: Warning (interrupt_map): /bus@10000000/pcie@1a000000/pcie@9,0:interrupt-map:
Missing property '#address-cells' in node /bus@10000000/interrupt-controller@10000000, using 0 as fallback
Value '0' is correct because:
1. The LIO/EIO/PCH interrupt controller does not have children,
2. interrupt-map property (in PCI node) consists of five components and
the fourth component "parent unit address", which size is defined by
'#address-cells' of the node pointed to by the interrupt-parent
component, is not used (=0)
Add missing address-cells 0 to the Local I/O interrupt controller node
to silence W=1 warning:
loongson-2k1000.dtsi:498.5-55: Warning (interrupt_map): /bus@10000000/pcie@1a000000/pcie@9,0:interrupt-map:
Missing property '#address-cells' in node /bus@10000000/interrupt-controller@1fe01440, using 0 as fallback
Value '0' is correct because:
1. The Local I/O interrupt controller does not have children,
2. interrupt-map property (in PCI node) consists of five components and
the fourth component "parent unit address", which size is defined by
'#address-cells' of the node pointed to by the interrupt-parent
component, is not used (=0)
Add missing address-cells 0 to the Local I/O and Extend I/O interrupt
controller node to silence W=1 warning:
loongson-2k0500.dtsi:513.5-51: Warning (interrupt_map): /bus@10000000/pcie@1a000000/pcie@0,0:interrupt-map:
Missing property '#address-cells' in node /bus@10000000/interrupt-controller@1fe11600, using 0 as fallback
Value '0' is correct because:
1. The Local I/O & Extend I/O interrupt controller do not have children,
2. interrupt-map property (in PCI node) consists of five components and
the fourth component "parent unit address", which size is defined by
'#address-cells' of the node pointed to by the interrupt-parent
component, is not used (=0)
Yao Zi [Sat, 17 Jan 2026 02:56:52 +0000 (10:56 +0800)]
LoongArch: dts: Describe PCI sideband IRQ through interrupt-extended
SoC integrated peripherals on LS2K1000 and LS2K2000 could be discovered
as PCI devices, but require sideband interrupts to function, which are
previously described by interrupts and interrupt-parent properties.
However, pci/pci-device.yaml allows interrupts property to only specify
PCI INTx interrupts, not sideband ones. Convert these devices to use
interrupt-extended property, which describes sideband interrupts used by
PCI devices since dt-schema commit e6ea659d2baa ("schemas: pci-device:
Allow interrupts-extended for sideband interrupts"), eliminating
dtbs_check warnings.
Cc: stable@vger.kernel.org Fixes: 30a5532a3206 ("LoongArch: dts: DeviceTree for Loongson-2K1000") Signed-off-by: Yao Zi <me@ziyao.cc> Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Lisa Robinson [Sat, 17 Jan 2026 02:56:43 +0000 (10:56 +0800)]
LoongArch: Fix PMU counter allocation for mixed-type event groups
When validating a perf event group, validate_group() unconditionally
attempts to allocate hardware PMU counters for the leader, sibling
events and the new event being added.
This is incorrect for mixed-type groups. If a PERF_TYPE_SOFTWARE event
is part of the group, the current code still tries to allocate a hardware
PMU counter for it, which can wrongly consume hardware PMU resources and
cause spurious allocation failures.
Fix this by only allocating PMU counters for hardware events during group
validation, and skipping software events.
of: platform: Use default match table for /firmware
Calling of_platform_populate() without a match table will only populate
the immediate child nodes under /firmware. This is usually fine, but in
the case of something like a "simple-mfd" node such as
"raspberrypi,bcm2835-firmware", those child nodes will not be populated.
And subsequent calls won't work either because the /firmware node is
marked as processed already.
Switch the call to of_platform_default_populate() to solve this problem.
It should be a nop for existing cases.
Linus Torvalds [Fri, 16 Jan 2026 21:48:18 +0000 (13:48 -0800)]
Merge tag 'drm-fixes-2026-01-16' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Simona Vetter:
"We've had nothing aside of a compiler noise fix until today, when the
amd and drm-misc fixes showed up after Dave already went into weekend
mode. So it's on me to push these out, since there's a bunch of
important fixes in here I think that shouldn't be delayed for a week.
Core Changes:
- take gem lock when preallocating in gpuvm
- add single byte read fallback to dp for broken usb-c adapters
- remove duplicate drm_sysfb declarations
Driver Changes:
- i915: compiler noise fix
- amdgpu/amdkfd: pile of fixes all over
- vmwgfx:
- v10 cursor regression fix
- other fixes
- rockchip:
- waiting for cfgdone regression fix
- other fixes
- gud: fix oops on disconnect
- simple-panel:
- regression fix when connector is not set
- fix for DataImage SCF0700C48GGU18
- nouveau: cursor handling locking fix"
* tag 'drm-fixes-2026-01-16' of https://gitlab.freedesktop.org/drm/kernel: (33 commits)
drm/amd/display: Add an hdmi_hpd_debounce_delay_ms module
drm/amdgpu/userq: Fix fence reference leak on queue teardown v2
drm/amdkfd: No need to suspend whole MES to evict process
Revert "drm/amdgpu: don't attach the tlb fence for SI"
drm/amdgpu: validate the flush_gpu_tlb_pasid()
drm/amd/pm: fix smu overdrive data type wrong issue on smu 14.0.2
drm/amd/display: Initialise backlight level values from hw
drm/amd/display: Bump the HDMI clock to 340MHz
drm/amd/display: Show link name in PSR status message
drm/amdkfd: fix a memory leak in device_queue_manager_init()
drm/amdgpu: make sure userqs are enabled in userq IOCTLs
drm/amdgpu: Use correct address to setup gart page table for vram access
Revert duplicate "drm/amdgpu: disable peer-to-peer access for DCC-enabled GC12 VRAM surfaces"
drm/amd: Clean up kfd node on surprise disconnect
drm/amdgpu: fix drm panic null pointer when driver not support atomic
drm/amdgpu: Fix gfx9 update PTE mtype flag
drm/sysfb: Remove duplicate declarations
drm/nouveau/kms/nv50-: Assert we hold nv50_disp->lock in nv50_head_flush_*
drm/nouveau/disp/nv50-: Set lock_core in curs507a_prepare
drm/gud: fix NULL fb and crtc dereferences on USB disconnect
...
Linus Torvalds [Fri, 16 Jan 2026 21:09:28 +0000 (13:09 -0800)]
Merge tag 'cxl-fixes-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull Compute Express Link (CXL) fixes from Dave Jiang:
- Recognize all ZONE_DEVICE users as physaddr consumers
- Fix format string for extended_linear_cache_size_show()
- Fix target list setup for multiple decoders sharing the same
downstream port
- Restore HBIW check before derefernce platform data
- Fix potential infinite loop in __cxl_dpa_reserve()
- Check for invalid addresses returned from translation functions on
error
* tag 'cxl-fixes-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
cxl: Check for invalid addresses returned from translation functions on errors
cxl/hdm: Fix potential infinite loop in __cxl_dpa_reserve()
cxl/acpi: Restore HBIW check before dereferencing platform_data
cxl/port: Fix target list setup for multiple decoders sharing the same dport
cxl/region: fix format string for resource_size_t
x86/kaslr: Recognize all ZONE_DEVICE users as physaddr consumers
Linus Torvalds [Fri, 16 Jan 2026 20:08:19 +0000 (12:08 -0800)]
Merge tag 'pm-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix an error path memory leak in the energy model management
code, fix a kerneldoc comment in it, and fix and revamp the energy
model YNL specification added recently along with the new energy model
management netlink interface (that received feedback after being
added):
- Fix a memory leak in em_create_pd() error path (Malaya Kumar Rout)
- Fix stale description of the cost field in struct em_perf_state to
reflect the current code (Yaxiong Tian)
- Fix and revamp the energy model YNL specification added recently
along with the energy model netlink interface (Changwoo Min)"
* tag 'pm-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM: EM: Add dump to get-perf-domains in the EM YNL spec
PM: EM: Change cpus' type from string to u64 array in the EM YNL spec
PM: EM: Rename em.yaml to dev-energymodel.yaml
PM: EM: Fix yamllint warnings in the EM YNL spec
PM: EM: Fix memory leak in em_create_pd() error path
PM: EM: Fix incorrect description of the cost field in struct em_perf_state
Simona Vetter [Fri, 16 Jan 2026 19:27:20 +0000 (20:27 +0100)]
Merge tag 'drm-misc-fixes-2026-01-16' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
drm-misc-fixes for v6.19-rc6:
vmwgfx:
- Fix hw regression from refactoring cursor handling on v10 'hardware'
- Fix warnings in destructor by merging the 2 release functions
- kernel doc fix
- error handling in vmw_compat_shader_add()
rockchip:
- fix vop2 polling
- fix regression waiting for cfgdone without config change
- fix warning when enabling encoder
core:
- take gem lock when preallocating in gpuvm.
- add single byte read fallback to dp for broken usb-c adapters
- remove duplicate drm_sysfb declarations
gud:
- Fix oops on usb disconnect
Simple panel:
- Re-add fallback when connector is not set to fix regressions
- Set correct type in DataImage SCF0700C48GGU18
Linus Torvalds [Fri, 16 Jan 2026 19:03:17 +0000 (11:03 -0800)]
Merge tag 'acpi-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
"Add checks missed by a previous recent update to the ACPI
suspend-to-idle code and add a debug module parameter to it
to work around a platform firmware issue exposed by that
update (Rafael Wysocki)"
* tag 'acpi-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: PM: s2idle: Add module parameter for LPS0 constraints checking
ACPI: PM: s2idle: Add missing checks to acpi_s2idle_begin_lps0()
Linus Torvalds [Fri, 16 Jan 2026 18:48:17 +0000 (10:48 -0800)]
Merge tag 'sound-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"This became a bit larger than wished for, often seen as a bump at the
middle, but almost all changes are small device-specific fixes, so the
risk must be pretty low.
- SoundWire fix for missing symbol export
- Fixes for device-tree bindings
- A fix for OOB access in USB-audio, spotted by fuzzer
- Quirks for HD-audio, SoundWire, AMD ACP
- A series of ASoC tlv320 and wsa codec fixes
- Other misc fixes in PCM OSS error-handling, Cirrus scodec test,
ASoC ops endianess, davinci, simple-card, and tegra"
* tag 'sound-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (33 commits)
ALSA: hda/tas2781: Add newly-released HP laptop
ASoC: rt5640: Fix duplicate clock properties in DT binding
ALSA: hda/realtek: Add quirk for HP Pavilion x360 to enable mute LED
ASoC: tlv320adcx140: fix word length
ASoC: tlv320adcx140: Propagate error codes during probe
ASoC: tlv320adcx140: fix null pointer
ASoC: tlv320adcx140: invert DRE_ENABLE
ASoC: sdw_utils: cs42l43: Enable Headphone pin for LINEOUT jack type
ASoC: sdw_utils: Call init callbacks on the correct codec DAI
soundwire: Add missing EXPORT for sdw_slave_type
ALSA: usb-audio: Prevent excessive number of frames
ALSA: hda/cirrus_scodec_test: Fix test suite name
ALSA: hda/cirrus_scodec_test: Fix incorrect setup of gpiochip
ALSA: hda/realtek: Add quirk for Asus Zephyrus G14 2025 using CS35L56, fix speakers
ASoC: amd: yc: Fix microphone on ASUS M6500RE
ASoC: tegra: Revert fix for uninitialized flat cache warning in tegra210_ahub
ASoC: dt-bindings: rockchip-spdif: Allow "port" node
ASoC: dt-bindings: realtek,rt5640: Allow 7 for realtek,jack-detect-source
ASoC: dt-bindings: realtek,rt5640: Add missing properties/node
ASoC: dt-bindings: realtek,rt5640: Document port node
...
Linus Torvalds [Fri, 16 Jan 2026 17:46:59 +0000 (09:46 -0800)]
Merge tag 'printk-for-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux
Pull printk fix from Petr Mladek:
- Prevent softlockup by restoring IRQs in atomic flush after each
record
* tag 'printk-for-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
printk/nbcon: Restore IRQ in atomic flush after each emitted record
Linus Torvalds [Fri, 16 Jan 2026 17:09:41 +0000 (09:09 -0800)]
Merge tag 'xfs-fixes-6.19-rc6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fixes from Carlos Maiolino:
"Just a few obvious fixes and some 'cosmetic' changes"
* tag 'xfs-fixes-6.19-rc6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: set max_agbno to allow sparse alloc of last full inode chunk
xfs: Fix xfs_grow_last_rtg()
xfs: improve the assert at the top of xfs_log_cover
xfs: fix an overly long line in xfs_rtgroup_calc_geometry
xfs: mark __xfs_rtgroup_extents static
xfs: Fix the return value of xfs_rtcopy_summary()
xfs: fix memory leak in xfs_growfs_check_rtgeom()
Merge fixes related to the energy model management for 6.19-rc6:
- Fix a memory leak in em_create_pd() error path (Malaya Kumar Rout)
- Fix stale description of the cost field in struct em_perf_state to
reflect the current code (Yaxiong Tian)
- Fix and revamp the energy model YNL specification added recently
along with the energy model netlink interface (Changwoo Min)
* pm-em:
PM: EM: Add dump to get-perf-domains in the EM YNL spec
PM: EM: Change cpus' type from string to u64 array in the EM YNL spec
PM: EM: Rename em.yaml to dev-energymodel.yaml
PM: EM: Fix yamllint warnings in the EM YNL spec
PM: EM: Fix memory leak in em_create_pd() error path
PM: EM: Fix incorrect description of the cost field in struct em_perf_state
Ben Horgan [Mon, 12 Jan 2026 16:58:29 +0000 (16:58 +0000)]
arm_mpam: Use non-atomic bitops when modifying feature bitmap
In the test__props_mismatch() kunit test we rely on the struct mpam_props
being packed to ensure memcmp doesn't consider packing. Making it packed
reduces the alignment of the features bitmap and so breaks a requirement
for the use of atomics. As we don't rely on the set/clear of these bits
being atomic, just make them non-atomic.
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Ben Horgan <ben.horgan@arm.com> Fixes: 8c90dc68a5de ("arm_mpam: Probe the hardware features resctrl supports") Reviewed-by: Gavin Shan <gshan@redhat.com> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Mathias Nyman [Thu, 15 Jan 2026 23:37:58 +0000 (01:37 +0200)]
xhci: sideband: don't dereference freed ring when removing sideband endpoint
xhci_sideband_remove_endpoint() incorrecly assumes that the endpoint is
running and has a valid transfer ring.
Lianqin reported a crash during suspend/wake-up stress testing, and
found the cause to be dereferencing a non-existing transfer ring
'ep->ring' during xhci_sideband_remove_endpoint().
The endpoint and its ring may be in unknown state if this function
is called after xHCI was reinitialized in resume (lost power), or if
device is being re-enumerated, disconnected or endpoint already dropped.
Fix this by both removing unnecessary ring access, and by checking
ep->ring exists before dereferencing it. Also make sure endpoint is
running before attempting to stop it.
Remove the xhci_initialize_ring_info() call during sideband endpoint
removal as is it only initializes ring structure enqueue, dequeue and
cycle state values to their starting values without changing actual
hardware enqueue, dequeue and cycle state. Leaving them out of sync
is worse than leaving it as it is. The endpoint will get freed in after
this in most usecases.
If the (audio) class driver want's to reuse the endpoint after offload
then it is up to the class driver to ensure endpoint is properly set up.
Merge tag 'usb-serial-6.19-rc6' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus
Johan writes:
USB serial fix for 6.19-rc6
Here's a fix for an f81232 enumeration issue that could prevent some
ports from being enabled (e.g. during driver rebind).
Included are also some new device ids.
All have been in linux-next with no reported issues.
* tag 'usb-serial-6.19-rc6' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial:
USB: serial: f81232: fix incomplete serial port generation
USB: serial: ftdi_sio: add support for PICAXE AXE027 cable
USB: serial: option: add Telit LE910 MBIM composition
Tim Bird [Fri, 16 Jan 2026 00:04:31 +0000 (17:04 -0700)]
kernel: modules: Add SPDX license identifier to kmod.c
Add a GPL-2.0 license identifier line for this file.
kmod.c was originally introduced in the kernel in February
of 1998 by Linus Torvalds - who was familiar with kernel
licensing at the time this was introduced.
Signed-off-by: Tim Bird <tim.bird@sony.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 15 Jan 2026 23:13:05 +0000 (15:13 -0800)]
Merge tag 'ftrace-v6.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull ftrace fix from Steven Rostedt:
- Fix allocation accounting on boot up
The ftrace records for each function that ftrace can attach to is
done in a group of pages. At boot up, the number of pages are
calculated and allocated. After that, the pages are filled with data.
It may allocate more than needed due to some functions not being
recorded (because they are unused weak functions), this too is
recorded.
After the data is filled in, a check is made to make sure the right
number of pages were allocated. But this was off due to the
assumption that the same number of entries fit per every page.
Because the size of an entry does not evenly divide into PAGE_SIZE,
there is a rounding error when a large number of pages is allocated
to hold the events. This causes the check to fail and triggers a
warning.
Fix the accounting by finding out how many pages are actually
allocated from the functions that allocate them and use that to see
if all the pages allocated were used and the ones not used are
properly freed.
* tag 'ftrace-v6.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
ftrace: Do not over-allocate ftrace memory