git.ipfire.org Git - thirdparty/kernel/linux.git/log
2 weeks ago mm, swap: consolidate bad slots setup and make it more robust
Kairui Song [Tue, 17 Feb 2026 20:06:29 +0000 (04:06 +0800)] 
mm, swap: consolidate bad slots setup and make it more robust

In preparation for using the swap table to track bad slots directly, move
the bad slot setup to one place so that the swap_map mark and the cluster
counter update are done together.

While at it, provide more informative logs and a more robust fallback if
any bad slot info looks incorrect.

This fixes a potential issue where a malformed swap file may leave the
cluster unusable upon swapon, and provides a more verbose warning on a
malformed swap file.

Link: https://lkml.kernel.org/r/20260218-swap-table-p3-v3-4-f4e34be021a7@tencent.com
Signed-off-by: Kairui Song <kasong@tencent.com>
Acked-by: Chris Li <chrisl@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kairui Song <ryncsn@gmail.com>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: kernel test robot <lkp@intel.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Nhat Pham <nphamcs@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks ago mm, swap: remove redundant arguments and locking for enabling a device
Kairui Song [Tue, 17 Feb 2026 20:06:28 +0000 (04:06 +0800)] 
mm, swap: remove redundant arguments and locking for enabling a device

There is no need to repeatedly pass the zeromap and priority values.  The
zeromap is similar to cluster info and swap_map, which are only used once
the swap device is exposed.  The prio values are currently read-only once
set, and are only used for list insertion upon expose or for swap info
display.

Link: https://lkml.kernel.org/r/20260218-swap-table-p3-v3-3-f4e34be021a7@tencent.com
Signed-off-by: Kairui Song <kasong@tencent.com>
Acked-by: Chris Li <chrisl@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kairui Song <ryncsn@gmail.com>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: kernel test robot <lkp@intel.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Nhat Pham <nphamcs@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks ago mm, swap: clean up swapon process and locking
Kairui Song [Tue, 17 Feb 2026 20:06:27 +0000 (04:06 +0800)] 
mm, swap: clean up swapon process and locking

Slightly clean up the swapon process.  Add comments about what swap_lock
protects, introduce and rename helpers that wrap swap_map and cluster_info
setup, and perform that setup outside of swap_lock.

This lock protection is not needed for swap_map and cluster_info setup
because all swap users must either hold the percpu ref or hold a stable
allocated swap entry (e.g., locking a folio in the swap cache) before
accessing.  So before the swap device is exposed by enable_swap_info,
nothing would use the swap device's map or cluster.

So we are safe to allocate and set up swap data freely first, then expose
the swap device and set the SWP_WRITEOK flag.

Link: https://lkml.kernel.org/r/20260218-swap-table-p3-v3-2-f4e34be021a7@tencent.com
Signed-off-by: Kairui Song <kasong@tencent.com>
Acked-by: Chris Li <chrisl@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kairui Song <ryncsn@gmail.com>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: kernel test robot <lkp@intel.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Nhat Pham <nphamcs@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks ago mm, swap: protect si->swap_file properly and use as a mount indicator
Kairui Song [Tue, 17 Feb 2026 20:06:26 +0000 (04:06 +0800)] 
mm, swap: protect si->swap_file properly and use as a mount indicator

Patch series "mm, swap: swap table phase III: remove swap_map", v3.

This series removes the static swap_map and uses the swap table for the
swap count directly.  This saves about 30% of the memory used by the
static swap metadata.  For example, this saves 256MB of memory when
mounting a 1TB swap device.  Performance is slightly better too, since the
double update of the swap table and swap_map is now gone.

Test results:

Mounting a swap device:
=======================
Mount a 1TB brd device as SWAP, just to verify the memory saving:

`free -m` before:
               total        used        free      shared  buff/cache   available
Mem:            1465        1051         417           1          61         413
Swap:        1054435           0     1054435

`free -m` after:
               total        used        free      shared  buff/cache   available
Mem:            1465         795         672           1          62         670
Swap:        1054435           0     1054435

Idle memory usage is reduced by ~256MB, just as expected.  Following this
design, we should be able to save another ~512MB in a later phase.
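
The ~256MB figure follows from simple arithmetic.  The sketch below is
illustrative only: it assumes 4 KiB pages and one byte of swap_map per
swap slot (neither is stated in this series), and the helper name is
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only: bytes of static per-slot
 * metadata for a swap device, given the device size, the page size, and
 * the number of metadata bytes kept per slot.
 */
static uint64_t swap_static_meta_bytes(uint64_t dev_bytes, uint64_t page_bytes,
				       uint64_t bytes_per_slot)
{
	assert(page_bytes != 0);

	/* One slot per page; each slot costs bytes_per_slot of metadata. */
	return (dev_bytes / page_bytes) * bytes_per_slot;
}
```

Under those assumptions, swap_static_meta_bytes(1ULL << 40, 4096, 1)
yields 2^28 bytes, i.e. 256 MiB for a 1 TiB device, which matches the drop
in "used" between the two `free -m` readings above.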

Build kernel test:
==================
Test using ZSWAP with NVME SWAP, make -j48, defconfig, in an x86_64 VM
with 5G RAM, under global pressure, average of 32 test runs:

                Before            After
System time:    1038.97s          1013.75s (-2.4%)

Test using ZRAM as SWAP, make -j12, tinyconfig, in an ARM64 VM with 1.5G
RAM, under global pressure, average of 32 test runs:

                Before            After
System time:    67.75s            66.65s (-1.6%)

The result is slightly better.

Redis / Valkey benchmark:
=========================
Test using ZRAM as SWAP, in an ARM64 VM with 1.5G RAM, under global pressure,
average of 64 test runs:

Server: valkey-server --maxmemory 2560M
Client: redis-benchmark -r 3000000 -n 3000000 -d 1024 -c 12 -P 32 -t get

        no persistence              with BGSAVE
Before: 472705.71 RPS               369451.68 RPS
After:  481197.93 RPS (+1.8%)       374922.32 RPS (+1.5%)

In conclusion, performance is better in all cases, and memory usage is
much lower.

The swap cgroup array will also be merged into the swap table in a later
phase, saving the other ~60% part of the static swap metadata and making
all the swap metadata dynamic.  The improved API for swap operations also
reduces the lock contention and makes more batching operations possible.

This patch (of 12):

/proc/swaps uses si->swap_map as the indicator to check if the swap
device is mounted. swap_map will be removed soon, so change it to use
si->swap_file instead because:

- si->swap_file is the only dynamic content that /proc/swaps is
  interested in. Previously, it checked si->swap_map just to ensure that
  si->swap_file is available. si->swap_map is set under mutex
  protection, and after si->swap_file is set, so having si->swap_map set
  guarantees that si->swap_file is set.

- Checking si->flags doesn't work here. SWP_WRITEOK is cleared during
  swapoff, but /proc/swaps is supposed to show the device under swapoff
  too to report the swapoff progress. And SWP_USED is set even if the
  device hasn't been properly set up.

We could add another flag, but the easier way is to just check
si->swap_file directly. So protect the setting of si->swap_file with the
mutex, and set si->swap_file only when the swap device is truly enabled.

/proc/swaps is only interested in si->swap_file and in reading a few
static fields. Only si->swap_file needs protection. Reading the other
static fields is always fine.
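
The pattern described above can be sketched in userspace as follows.
This is not the kernel code: the struct, the mutex, and the function
names are hypothetical stand-ins, and a pthread mutex is used in place of
the kernel mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's per-device swap info. */
struct demo_swap_info {
	const char *swap_file;	/* NULL until the device is truly enabled */
	unsigned int pages;	/* static once set; fine to read unlocked */
};

static pthread_mutex_t demo_swapon_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Writer side: publish swap_file last, under the mutex. */
static void demo_enable(struct demo_swap_info *si, const char *path,
			unsigned int pages)
{
	assert(path != NULL);

	si->pages = pages;
	pthread_mutex_lock(&demo_swapon_mutex);
	si->swap_file = path;
	pthread_mutex_unlock(&demo_swapon_mutex);
}

/* Reader side: the device counts as mounted only if swap_file is set. */
static int demo_is_mounted(struct demo_swap_info *si)
{
	int mounted;

	pthread_mutex_lock(&demo_swapon_mutex);
	mounted = si->swap_file != NULL;
	pthread_mutex_unlock(&demo_swapon_mutex);
	return mounted;
}
```

Because swap_file is written only after the rest of the setup, a reader
that sees it set can rely on the static fields being valid as well.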

Link: https://lkml.kernel.org/r/20260218-swap-table-p3-v3-0-f4e34be021a7@tencent.com
Link: https://lkml.kernel.org/r/20260218-swap-table-p3-v3-1-f4e34be021a7@tencent.com
Signed-off-by: Kairui Song <kasong@tencent.com>
Acked-by: Chris Li <chrisl@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Kairui Song <ryncsn@gmail.com>
Cc: kernel test robot <lkp@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks ago mm: fix typo in the comment of mod_zone_state()
Miquel Sabaté Solà [Thu, 19 Feb 2026 23:44:07 +0000 (00:44 +0100)] 
mm: fix typo in the comment of mod_zone_state()

Use the proper function name, followed by parentheses as usual.

Link: https://lkml.kernel.org/r/20260219234407.3261196-1-mssola@mssola.com
Signed-off-by: Miquel Sabaté Solà <mssola@mssola.com>
Acked-by: SeongJae Park <sj@kernel.org>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks ago mm: move pgscan, pgsteal, pgrefill to node stats
JP Kobryn (Meta) [Thu, 19 Feb 2026 23:58:46 +0000 (15:58 -0800)] 
mm: move pgscan, pgsteal, pgrefill to node stats

There are situations where reclaim kicks in on a system with free memory.
One possible cause is a NUMA imbalance scenario where one or more nodes
are under pressure.  It would help if we could easily identify such nodes.

Move the pgscan, pgsteal, and pgrefill counters from vm_event_item to
node_stat_item to provide per-node reclaim visibility.  With these
counters as node stats, the values are now displayed in the per-node
section of /proc/zoneinfo, which allows for quick identification of the
affected nodes.

/proc/vmstat continues to report the same counters, aggregated across all
nodes.  But the ordering of these items within the readout changes as they
move from the vm events section to the node stats section.

Memcg accounting of these counters is preserved.  The relocated counters
remain visible in memory.stat alongside the existing aggregate pgscan and
pgsteal counters.

However, this change affects how the global counters are accumulated.
Previously, the global event count update was gated on !cgroup_reclaim(),
excluding memcg-based reclaim from /proc/vmstat.  Now that
mod_lruvec_state() is being used to update the counters, the global
counters will include all reclaim.  This is consistent with how pgdemote
counters are already tracked.
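
The accounting change can be illustrated with a simplified model.  The
types and function names below are hypothetical and do not correspond to
kernel interfaces; the sketch only contrasts the old gating on
!cgroup_reclaim() with the new unconditional update.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one counter seen at two levels. */
struct demo_counters {
	unsigned long global_pgscan;	/* what /proc/vmstat would report */
	unsigned long memcg_pgscan;	/* what memory.stat would report */
};

/* Old behaviour: memcg-based reclaim is excluded from the global count. */
static void demo_account_old(struct demo_counters *c, bool cgroup_reclaim,
			     unsigned long nr)
{
	if (!cgroup_reclaim)
		c->global_pgscan += nr;
	c->memcg_pgscan += nr;
}

/* New behaviour: every reclaim pass feeds both levels. */
static void demo_account_new(struct demo_counters *c, bool cgroup_reclaim,
			     unsigned long nr)
{
	(void)cgroup_reclaim;	/* no longer gates the global update */
	c->global_pgscan += nr;
	c->memcg_pgscan += nr;
}
```

In this model, a memcg-triggered scan of 10 pages leaves global_pgscan
unchanged under the old scheme and raises it by 10 under the new one, so
global values read after this change may be higher than before for the
same workload.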

Finally, the virtio_balloon driver is updated to use
global_node_page_state() to fetch the counters, as they are no longer
accessible through the vm_events array.

Link: https://lkml.kernel.org/r/20260219235846.161910-1-jp.kobryn@linux.dev
Signed-off-by: JP Kobryn <jp.kobryn@linux.dev>
Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Byungchul Park <byungchul@sk.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Eugenio Pérez <eperezma@redhat.com>
Cc: Gregory Price <gourry@gourry.net>
Cc: "Huang, Ying" <ying.huang@linux.alibaba.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mathew Brost <matthew.brost@intel.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Rakie Kim <rakie.kim@sk.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Wei Xu <weixugc@google.com>
Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Cc: Yuanchu Xie <yuanchu@google.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks ago selftests/mm: skip migration tests if NUMA is unavailable
AnishMulay [Wed, 18 Feb 2026 16:39:41 +0000 (11:39 -0500)] 
selftests/mm: skip migration tests if NUMA is unavailable

Currently, the migration test asserts that numa_available() returns 0.  On
systems where NUMA is not available (returning -1), such as certain ARM64
configurations or single-node systems, this assertion fails and crashes
the test.

Update the test to check the return value of numa_available().  If it is
less than 0, skip the test gracefully instead of failing.

This aligns the behavior with other MM selftests (like rmap) that skip
when NUMA support is missing.
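
A minimal sketch of the skip decision, assuming the caller passes in the
return value of numa_available().  The enum and function name are
hypothetical; the value 4 is chosen to mirror the kselftest skip exit
code, and this is not the actual test code.

```c
#include <assert.h>

/* Hypothetical result codes; DEMO_SKIP mirrors the kselftest skip code. */
enum demo_result {
	DEMO_RUN = 0,
	DEMO_SKIP = 4,
};

/*
 * Decide whether the migration tests should run, given the return value
 * of numa_available().  A negative value means NUMA is unavailable, so
 * the test is skipped rather than failed.
 */
static enum demo_result demo_migration_precheck(int numa_available_rc)
{
	if (numa_available_rc < 0)
		return DEMO_SKIP;
	return DEMO_RUN;
}
```

The point of the design is that a missing capability is reported as a
skip, which test runners treat differently from a failure.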

Link: https://lkml.kernel.org/r/20260218163941.13499-1-anishm7030@gmail.com
Fixes: 0c2d08728470 ("mm: add selftests for migration entries")
Signed-off-by: AnishMulay <anishm7030@gmail.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Tested-by: Sayali Patil <sayalip@linux.ibm.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks ago mm/pkeys: remove unused tsk parameter from arch_set_user_pkey_access()
Seongsu Park [Thu, 19 Feb 2026 06:35:06 +0000 (15:35 +0900)] 
mm/pkeys: remove unused tsk parameter from arch_set_user_pkey_access()

The tsk parameter in arch_set_user_pkey_access() is never used in the
function implementations across all architectures (arm64, powerpc, x86).

Link: https://lkml.kernel.org/r/20260219063506.545148-1-sgsu.park@samsung.com
Signed-off-by: Seongsu Park <sgsu.park@samsung.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks ago maple_tree: clean up mas_wr_node_store()
Liam R. Howlett [Fri, 30 Jan 2026 20:59:35 +0000 (15:59 -0500)] 
maple_tree: clean up mas_wr_node_store()

The new_end does not need to be passed in as the data is already being
checked.  This allows for other areas to skip getting the node new_end in
the calling function.

The type was incorrectly void * instead of void __rcu *, which isn't an
issue but is technically incorrect.

Move the variable assignment to after the declarations to clean up the
initial setup.

Ensure there is something to copy before calling memcpy().
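
The memcpy() guard can be sketched as below.  The helper name and
signature are hypothetical and are not taken from the maple tree code;
the sketch only shows the idea of skipping the call when the count is
zero.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy `count` slot pointers from src to dst and
 * return the number copied.  When there is nothing to copy, memcpy() is
 * not called at all, so the pointers need not be valid in that case.
 */
static size_t demo_copy_slots(void **dst, void *const *src, size_t count)
{
	if (!count)
		return 0;

	assert(dst != NULL && src != NULL);
	memcpy(dst, src, count * sizeof(*src));
	return count;
}
```

Checking the count first keeps the zero-length case away from memcpy(),
whose pointer arguments must be valid even when the length is zero.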

Link: https://lkml.kernel.org/r/20260130205935.2559335-31-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrew Ballance <andrewjballance@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks ago maple_tree: don't pass end to mas_wr_append()
Liam R. Howlett [Fri, 30 Jan 2026 20:59:34 +0000 (15:59 -0500)] 
maple_tree: don't pass end to mas_wr_append()

Figure out the end internally.  This is necessary for future cleanups.

Link: https://lkml.kernel.org/r/20260130205935.2559335-30-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrew Ballance <andrewjballance@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks ago maple_tree: pass maple copy node to mas_wmb_replace()
Liam R. Howlett [Fri, 30 Jan 2026 20:59:33 +0000 (15:59 -0500)] 
maple_tree: pass maple copy node to mas_wmb_replace()

mas_wmb_replace() is called in three places with the same setup, so move
the setup into the function itself.  The function needs to be relocated as
it calls mtree_range_walk().

Link: https://lkml.kernel.org/r/20260130205935.2559335-29-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrew Ballance <andrewjballance@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks ago maple_tree: remove maple big node and subtree structs
Liam R. Howlett [Fri, 30 Jan 2026 20:59:32 +0000 (15:59 -0500)] 
maple_tree: remove maple big node and subtree structs

Now that no one uses the structures and functions, drop the dead code.

Link: https://lkml.kernel.org/r/20260130205935.2559335-28-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrew Ballance <andrewjballance@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks ago maple_tree: use maple copy node for mas_wr_split()
Liam R. Howlett [Fri, 30 Jan 2026 20:59:31 +0000 (15:59 -0500)] 
maple_tree: use maple copy node for mas_wr_split()

Instead of using the maple big node, use the maple copy node to reduce
stack usage and to align with mas_wr_rebalance() and
mas_wr_spanning_store().

Splitting a node is similar to rebalancing, but a new evaluation of when
to ascend is needed.  The only other difference is that the data is pushed
and never rebalanced at each level.

The tests are updated to match the changes in this commit so that the
test suite continues to pass.

Link: https://lkml.kernel.org/r/20260130205935.2559335-27-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrew Ballance <andrewjballance@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks ago maple_tree: add cp_converged() helper
Liam R. Howlett [Fri, 30 Jan 2026 20:59:30 +0000 (15:59 -0500)] 
maple_tree: add cp_converged() helper

When the maple copy node converges into a single entry, then certain
operations can stop ascending the tree.

This is used more later.

Link: https://lkml.kernel.org/r/20260130205935.2559335-26-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrew Ballance <andrewjballance@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks ago maple_tree: add copy_tree_location() helper
Liam R. Howlett [Fri, 30 Jan 2026 20:59:29 +0000 (15:59 -0500)] 
maple_tree: add copy_tree_location() helper

Extract the copying of the tree location from one maple state to another
into its own function.  This is used more later.

Link: https://lkml.kernel.org/r/20260130205935.2559335-25-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrew Ballance <andrewjballance@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks ago maple_tree: add test for rebalance calculation off-by-one
Liam R. Howlett [Fri, 30 Jan 2026 20:59:28 +0000 (15:59 -0500)] 
maple_tree: add test for rebalance calculation off-by-one

During the big node removal, an incorrect rebalance step went too far up
the tree causing insufficient nodes.  Test the faulty condition by
recreating the scenario in the userspace testing.

Link: https://lkml.kernel.org/r/20260130205935.2559335-24-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrew Ballance <andrewjballance@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks ago maple_tree: use maple copy node for mas_wr_rebalance() operation
Liam R. Howlett [Fri, 30 Jan 2026 20:59:27 +0000 (15:59 -0500)] 
maple_tree: use maple copy node for mas_wr_rebalance() operation

Stop using the maple big node for rebalance operations by aligning more
closely with the spanning store.  The rebalance operation needs its own
data calculation in rebalance_data().

In the event of too much data, the rebalance tries to push the data using
push_data_sib().  If there is insufficient data, the rebalance operation
will rebalance against a sibling (found with rebalance_sib()).

The rebalance starts at the leaf and works its way upward in the tree
using rebalance_ascend().  Most of the code is shared with spanning store
such as the copy node having a new root, but is fundamentally different in
that the data must come from a sibling.

A parent maple state is used to track the parent location to avoid
multiple mas_ascend() calls.  The maple state tree location is copied from
the parent to the mas (child) in the ascend step.  Ascending itself is
done in the main loop.

Link: https://lkml.kernel.org/r/20260130205935.2559335-23-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrew Ballance <andrewjballance@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks ago maple_tree: add cp_is_new_root() helper
Liam R. Howlett [Fri, 30 Jan 2026 20:59:26 +0000 (15:59 -0500)] 
maple_tree: add cp_is_new_root() helper

Add a helper to do what is needed when the maple copy node contains a new
root node.  This is useful for future commits and is self-documenting
code.

[Liam.Howlett@oracle.com: remove warnings on older compilers]
Link: https://lkml.kernel.org/r/malwmirqnpuxqkqrobcmzfkmmxipoyzwfs2nwc5fbpxlt2r2ej@wchmjtaljvw3
[akpm@linux-foundation.org: s/cp->slot[0]/&cp->slot[0]/, per Liam]
Link: https://lkml.kernel.org/r/20260130205935.2559335-22-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrew Ballance <andrewjballance@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks ago maple_tree: separate wr_split_store and wr_rebalance store type code path
Liam R. Howlett [Fri, 30 Jan 2026 20:59:25 +0000 (15:59 -0500)] 
maple_tree: separate wr_split_store and wr_rebalance store type code path

The split and rebalance store types both go through the same function that
uses the big node.  Separate the code paths so that each can be updated
independently.

No functional change intended.

Link: https://lkml.kernel.org/r/20260130205935.2559335-21-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrew Ballance <andrewjballance@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks ago maple_tree: remove unnecessary return statements
Liam R. Howlett [Fri, 30 Jan 2026 20:59:24 +0000 (15:59 -0500)] 
maple_tree: remove unnecessary return statements

Functions do not need to state return at the end, unless skipping unwind.
These can safely be dropped.

Link: https://lkml.kernel.org/r/20260130205935.2559335-20-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrew Ballance <andrewjballance@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks ago maple_tree: inline mas_wr_spanning_rebalance()
Liam R. Howlett [Fri, 30 Jan 2026 20:59:23 +0000 (15:59 -0500)] 
maple_tree: inline mas_wr_spanning_rebalance()

Now that the spanning rebalance is small, fully inline it in
mas_wr_spanning_store().

No functional change.

Link: https://lkml.kernel.org/r/20260130205935.2559335-19-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrew Ballance <andrewjballance@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks ago maple_tree: start using maple copy node for destination
Liam R. Howlett [Fri, 30 Jan 2026 20:59:22 +0000 (15:59 -0500)] 
maple_tree: start using maple copy node for destination

Stop using the maple subtree state and big node in favour of using three
destinations in the maple copy node.  That is, expand the way leaves were
handled to all levels of the tree and use the maple copy node to track the
new nodes.

Extract out the sibling init into the data calculation since this is where
the insufficient data can be detected.  The remainder of the sibling code
to shift the next iteration is moved to the spanning_ascend() function,
since it is not always needed.

Next introduce the dst_setup() function which will decide how many nodes
are needed to contain the data at this level.  Using the destination
count, populate the copy node's dst array with the new nodes and set
d_count to the correct value.  Note that this can be tricky in the case of
a leaf node with exactly enough room because of the rule against NULLs at
the end of leaves.

Once the destinations are ready, copy the data by altering the
cp_data_write() function to copy from the sources to the destinations
directly.  This eliminates the use of the big node in this code path.  On
node completion, node_finalise() will zero out the remaining area and set
the metadata, if necessary.

spanning_ascend() is used to decide if the operation is complete.  It may
create a new root, converge into one destination, or continue upwards by
ascending the left and right write maple states.

One test case setup needed to be tweaked so that the targeted node was
surrounded by full nodes.

[akpm@linux-foundation.org: coding-style cleanups]
Link: https://lkml.kernel.org/r/20260130205935.2559335-18-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrew Ballance <andrewjballance@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks ago maple_tree: add gap support, slot and pivot sizes for maple copy
Liam R. Howlett [Fri, 30 Jan 2026 20:59:21 +0000 (15:59 -0500)] 
maple_tree: add gap support, slot and pivot sizes for maple copy

Add plumbing work for using maple copy as a normal node for a source of
copy operations.  This is needed later.

Link: https://lkml.kernel.org/r/20260130205935.2559335-17-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrew Ballance <andrewjballance@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks agomaple_tree: introduce ma_leaf_max_gap()
Liam R. Howlett [Fri, 30 Jan 2026 20:59:20 +0000 (15:59 -0500)] 
maple_tree: introduce ma_leaf_max_gap()

This is the same as mas_leaf_max_gap(), but the information necessary is
known without a maple state in future code.  Adding this function now
simplifies the review for a subsequent patch.

Link: https://lkml.kernel.org/r/20260130205935.2559335-16-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrew Ballance <andrewjballance@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks agomaple_tree: change initial big node setup in mas_wr_spanning_rebalance()
Liam R. Howlett [Fri, 30 Jan 2026 20:59:19 +0000 (15:59 -0500)] 
maple_tree: change initial big node setup in mas_wr_spanning_rebalance()

Instead of copying the data into the big node and finding out that the
data may need to be moved or appended to, calculate the data space up
front (in the maple copy node) and set up another source for the copy.

The additional copy source is tracked in the maple state sib (short for
sibling), and is put into the maple write states for future operations
after the data is in the big node.

To facilitate the newly moved node, some initial setup of the maple
subtree state is relocated after the potential shift caused by the new
way of rebalancing against a sibling.

Link: https://lkml.kernel.org/r/20260130205935.2559335-15-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrew Ballance <andrewjballance@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks agomaple_tree: inline mas_spanning_rebalance_loop() into mas_wr_spanning_rebalance()
Liam R. Howlett [Fri, 30 Jan 2026 20:59:18 +0000 (15:59 -0500)] 
maple_tree: inline mas_spanning_rebalance_loop() into mas_wr_spanning_rebalance()

Just copy the code and replace count with height.  This is done to avoid
affecting other code paths into mas_spanning_rebalance_loop() for the next
change.

No functional change intended.

Link: https://lkml.kernel.org/r/20260130205935.2559335-14-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrew Ballance <andrewjballance@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks agomaple_tree: testing update for spanning store
Liam R. Howlett [Fri, 30 Jan 2026 20:59:17 +0000 (15:59 -0500)] 
maple_tree: testing update for spanning store

Spanning store had some corner cases which showed up during rcu stress
testing.  Add explicit tests for those cases.

At the same time add some locking for easier visibility of the rcu stress
testing.  Only a single dump of the tree will happen on the first detected
issue instead of flooding the console with output.

Link: https://lkml.kernel.org/r/20260130205935.2559335-13-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrew Ballance <andrewjballance@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks agomaple_tree: introduce maple_copy node and use it in mas_spanning_rebalance()
Liam R. Howlett [Fri, 30 Jan 2026 20:59:16 +0000 (15:59 -0500)] 
maple_tree: introduce maple_copy node and use it in mas_spanning_rebalance()

Introduce an internal-memory only node type called maple_copy to
facilitate internal copy operations.  Use it in mas_spanning_rebalance()
for just the leaf nodes.  Initially, the maple_copy node is used to
configure the source nodes and copy the data into the big_node.

The maple_copy contains a list of source entries with start and end
offsets.  One of the maple_copy entries can be itself with an offset of 0
to 2, representing the data where the store partially overwrites entries,
or fully overwrites the entry.  The side effect is that the source nodes
no longer have to worry about partially copying the existing offset if it
is not fully overwritten.

This is in preparation of removal of the maple big_node, but for the time
being the data is copied to the big node to limit the change size.

Link: https://lkml.kernel.org/r/20260130205935.2559335-12-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrew Ballance <andrewjballance@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks agomaple_tree: correct right ma_wr_state end pivot in mas_wr_spanning_store()
Liam R. Howlett [Fri, 30 Jan 2026 20:59:15 +0000 (15:59 -0500)] 
maple_tree: correct right ma_wr_state end pivot in mas_wr_spanning_store()

The end_piv will be needed in the next patch set and has not been set
correctly in this code path.  Correct the oversight before using it.

Link: https://lkml.kernel.org/r/20260130205935.2559335-11-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrew Ballance <andrewjballance@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks agomaple_tree: move maple_subtree_state from mas_wr_spanning_store to mas_wr_spanning_re...
Liam R. Howlett [Fri, 30 Jan 2026 20:59:14 +0000 (15:59 -0500)] 
maple_tree: move maple_subtree_state from mas_wr_spanning_store to mas_wr_spanning_rebalance

Moving the maple_subtree_state is necessary for future cleanups and is
only set up in mas_wr_spanning_rebalance() but never used.

Link: https://lkml.kernel.org/r/20260130205935.2559335-10-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrew Ballance <andrewjballance@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks agomaple_tree: don't pass through height in mas_wr_spanning_store
Liam R. Howlett [Fri, 30 Jan 2026 20:59:13 +0000 (15:59 -0500)] 
maple_tree: don't pass through height in mas_wr_spanning_store

Height is not used locally in the function, so obtain the height argument
closer to where it is passed to the next level.

Link: https://lkml.kernel.org/r/20260130205935.2559335-9-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrew Ballance <andrewjballance@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks agomaple_tree: remove l_wr_mas from mas_wr_spanning_rebalance
Liam R. Howlett [Fri, 30 Jan 2026 20:59:12 +0000 (15:59 -0500)] 
maple_tree: remove l_wr_mas from mas_wr_spanning_rebalance

Use the wr_mas instead of creating another variable on the stack.  Take
the opportunity to remove l_mas from being used anywhere but in the
maple_subtree_state.

Link: https://lkml.kernel.org/r/20260130205935.2559335-8-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrew Ballance <andrewjballance@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks agomaple_tree: make ma_wr_states reliable for reuse in spanning store
Liam R. Howlett [Fri, 30 Jan 2026 20:59:11 +0000 (15:59 -0500)] 
maple_tree: make ma_wr_states reliable for reuse in spanning store

mas_extend_spanning_null() was not modifying the range min and range max
of the resulting store operation.  The result was that the maple write
state no longer matched what the write was doing.  This was not an issue
as the values were previously not used, but to make the ma_wr_state usable
in future changes, the range min/max stored in the ma_wr_state for left
and right need to be consistent with the operation.

Link: https://lkml.kernel.org/r/20260130205935.2559335-7-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrew Ballance <andrewjballance@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks agomaple_tree: inline mas_spanning_rebalance() into mas_wr_spanning_rebalance()
Liam R. Howlett [Fri, 30 Jan 2026 20:59:10 +0000 (15:59 -0500)] 
maple_tree: inline mas_spanning_rebalance() into mas_wr_spanning_rebalance()

Copy the contents of mas_spanning_rebalance() into
mas_wr_spanning_rebalance(), in preparation of removing initial big node
use.

No functional changes intended.

Link: https://lkml.kernel.org/r/20260130205935.2559335-6-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrew Ballance <andrewjballance@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks agomaple_tree: remove unnecessary assignment of orig_l index
Liam R. Howlett [Fri, 30 Jan 2026 20:59:09 +0000 (15:59 -0500)] 
maple_tree: remove unnecessary assignment of orig_l index

The index value is already a copy of the maple state so there is no need
to set it again.

Link: https://lkml.kernel.org/r/20260130205935.2559335-5-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrew Ballance <andrewjballance@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks agomaple_tree: extract use of big node from mas_wr_spanning_store()
Liam R. Howlett [Fri, 30 Jan 2026 20:59:08 +0000 (15:59 -0500)] 
maple_tree: extract use of big node from mas_wr_spanning_store()

Isolate the use of the big node in its own function.

No functional changes intended.

Link: https://lkml.kernel.org/r/20260130205935.2559335-4-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrew Ballance <andrewjballance@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks agomaple_tree: move mas_spanning_rebalance loop to function
Liam R. Howlett [Fri, 30 Jan 2026 20:59:07 +0000 (15:59 -0500)] 
maple_tree: move mas_spanning_rebalance loop to function

Move the loop over the tree levels to its own function.

No intended functional changes.

Link: https://lkml.kernel.org/r/20260130205935.2559335-3-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrew Ballance <andrewjballance@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks agomaple_tree: fix mas_dup_alloc() sparse warning
Liam R. Howlett [Fri, 30 Jan 2026 20:59:06 +0000 (15:59 -0500)] 
maple_tree: fix mas_dup_alloc() sparse warning

Patch series "maple_tree: Replace big node with maple copy", v3.

The big node struct was created for simplicity of splitting, rebalancing,
and spanning store operations by using a copy buffer to create the data
necessary prior to breaking it up into 256B nodes.  Certain operations
were rather tricky due to the restriction of keeping NULL entries together
and never at the end of a node (except the right-most node).

The big node struct is incompatible with future features that are
currently in development.  Specifically different node types and different
data type sizes for pivots.

The big node struct was also a stack variable, which caused issues with
certain configurations of kernel build.

This series removes big node by introducing another node type which will
never be written to the tree: maple_copy.  The maple copy node operates
more like a scatter/gather operation with a number of sources and
destinations of allocated nodes.

The sources are copied to the destinations, in turn, until the sources are
exhausted.  The destination is changed if it is filled or the split
location is reached prior to the source data end.
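
The scatter/gather behaviour described above can be sketched in user space
as follows.  This is only an illustration under stated assumptions: the
sketch_src and sketch_dst structures and sketch_copy() are hypothetical
names, not the kernel's struct maple_copy or its helpers, and a plain
capacity limit stands in for "filled or split location reached".

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the kernel's struct maple_copy. */
struct sketch_src {
	const unsigned long *data;
	size_t start;	/* first offset to copy (inclusive) */
	size_t end;	/* last offset to copy (inclusive) */
};

struct sketch_dst {
	unsigned long *data;
	size_t capacity;	/* fill limit: stands in for "full or split point" */
	size_t used;		/* entries written so far */
};

/*
 * Copy each source range, in turn, into the destinations.  Advance to the
 * next destination whenever the current one reaches its limit.  Returns the
 * number of entries copied, which is short if the destinations run out.
 */
static size_t sketch_copy(const struct sketch_src *src, size_t nsrc,
			  struct sketch_dst *dst, size_t ndst)
{
	size_t d = 0, copied = 0;

	for (size_t s = 0; s < nsrc; s++) {
		for (size_t i = src[s].start; i <= src[s].end; i++) {
			/* Skip destinations that are already at their limit. */
			while (d < ndst && dst[d].used == dst[d].capacity)
				d++;
			if (d == ndst)
				return copied;
			dst[d].data[dst[d].used++] = src[s].data[i];
			copied++;
		}
	}
	return copied;
}
```

In this model, new data would simply be one more sketch_src entry in the
source list, which is the property the series relies on.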

New data is inserted by using the maple copy node itself as a source with
up to 3 slots and pivots.  The data in the maple copy node is the data
being written to the tree along with any fragment of the range(s) being
overwritten.

As with all nodes, the maple copy node is of size 256B.  Using a node type
allows for the copy operation to treat the new data stored in the maple
copy node the same as any other source node.

Analysis of the runtime shows no regression or benefit of removing the
larger stack structure.  The motivation is the ground work to use new node
types and to help those with odd configurations that have had issues.

The change was tested by myself using mm_tests on amd64 and by Suren on
android (arm64).  Limited testing on s390 qemu was also performed using
stress-ng on the virtual memory, which should cover many corner cases.

This patch (of 30):

Use RCU_INIT_POINTER to initialize an rcu pointer to an initial value
since there are no readers within the tree being created during
duplication.  There is no risk of readers seeing the initialized or
uninitialized value until after the synchronization call in
mas_dup_build().

Link: https://lkml.kernel.org/r/20260130205935.2559335-1-Liam.Howlett@oracle.com
Link: https://lkml.kernel.org/r/20260130205935.2559335-2-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrew Ballance <andrewjballance@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks agomm/fadvise: validate offset in generic_fadvise
Kevin Lourenco [Mon, 22 Dec 2025 14:18:17 +0000 (15:18 +0100)] 
mm/fadvise: validate offset in generic_fadvise

When converted to (u64) for page calculations, a negative offset can
produce extremely large page indices.  This may lead to issues in certain
advice modes (excessive readahead or cache invalidation).

Reject negative offsets with -EINVAL for consistent argument validation
and to avoid silent misbehavior.
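
The wrap-around can be illustrated in user space.  This is a minimal
sketch, not the kernel code: SKETCH_PAGE_SHIFT, offset_to_page_index() and
check_offset() are hypothetical names, and a 4 KiB page size is assumed.

```c
#include <errno.h>
#include <stdint.h>

/* Assumption: 4 KiB pages; the real shift depends on the architecture. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Convert a signed byte offset to a page index the way an unchecked cast
 * to an unsigned 64-bit type would: a negative offset wraps to a huge index.
 */
static uint64_t offset_to_page_index(int64_t offset)
{
	return (uint64_t)offset >> SKETCH_PAGE_SHIFT;
}

/* Hypothetical validation helper mirroring the fix: reject negative offsets. */
static int check_offset(int64_t offset)
{
	return offset < 0 ? -EINVAL : 0;
}
```

With these definitions, an offset of -4096 maps to page index
0x000fffffffffffff, while check_offset(-4096) returns -EINVAL before any
such index is computed.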

POSIX and the man page do not clearly define behavior for negative
offset/len.  FreeBSD rejects negative offsets as well, so failing with
-EINVAL is consistent with existing practice.  The man page can be updated
separately to document the Linux behavior.

Link: https://lkml.kernel.org/r/20260208135738.18992-1-klourencodev@gmail.com
Link: https://lkml.kernel.org/r/20251222141817.13335-1-klourencodev@gmail.com
Signed-off-by: Kevin Lourenco <k.lourenco@criteo.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks agoksm: initialize the addr only once in rmap_walk_ksm
xu xin [Thu, 12 Feb 2026 11:29:32 +0000 (19:29 +0800)] 
ksm: initialize the addr only once in rmap_walk_ksm

This is a minor performance optimization, especially when there are many
for-loop iterations, because the addr variable doesn't change across
iterations.

Therefore, it only needs to be initialized once before the loop.

Link: https://lkml.kernel.org/r/20260212192820223O_r2NQzSEPG_C56cs-z4l@zte.com.cn
Link: https://lkml.kernel.org/r/20260212192932941MSsJEAyoRW4YdLBN7_myn@zte.com.cn
Signed-off-by: xu xin <xu.xin16@zte.com.cn>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Hugh Dickins <hughd@google.com>
Cc: Wang Yaxin <wang.yaxin@zte.com.cn>
Cc: Yang Yang <yang.yang29@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 weeks agoMerge tag 'sched-urgent-2026-04-05' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 5 Apr 2026 20:45:37 +0000 (13:45 -0700)] 
Merge tag 'sched-urgent-2026-04-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fixes from Ingo Molnar:

 - Fix zero_vruntime tracking again (Peter Zijlstra)

 - Fix avg_vruntime() usage in sched_debug (Peter Zijlstra)

* tag 'sched-urgent-2026-04-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/debug: Fix avg_vruntime() usage
  sched/fair: Fix zero_vruntime tracking fix

2 weeks agoMerge tag 'perf-urgent-2026-04-05' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 5 Apr 2026 20:43:26 +0000 (13:43 -0700)] 
Merge tag 'perf-urgent-2026-04-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fix from Ingo Molnar:

 - Fix potential bad container_of() in intel_pmu_hw_config() (Ian
   Rogers)

* tag 'perf-urgent-2026-04-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86: Fix potential bad container_of in intel_pmu_hw_config

2 weeks agoMerge tag 'irq-urgent-2026-04-05' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 5 Apr 2026 20:40:58 +0000 (13:40 -0700)] 
Merge tag 'irq-urgent-2026-04-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fix from Ingo Molnar:

 - Fix RISC-V APLIC irqchip driver setup errors on ACPI systems (Jessica
   Liu)

* tag 'irq-urgent-2026-04-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/riscv-aplic: Restrict genpd notifier to device tree only

2 weeks agoi915: don't use a vma that didn't match the context VM
Linus Torvalds [Sun, 5 Apr 2026 19:42:25 +0000 (12:42 -0700)] 
i915: don't use a vma that didn't match the context VM

In eb_lookup_vma(), the code checks that the context vm matches before
incrementing the i915 vma usage count, but for the non-matching case it
didn't clear the non-matching vma pointer, so it would then mistakenly
be returned, causing potential UaF and refcount issues.
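
The pattern behind the fix can be sketched as follows.  This is an
illustration only: struct sketch_vma and sketch_lookup_vma() are
hypothetical stand-ins, not the i915 types or the actual eb_lookup_vma()
code.

```c
#include <stddef.h>

/* Hypothetical stand-in for a vma that belongs to one address space (vm). */
struct sketch_vma {
	const void *vm;
	int refcount;
};

/*
 * Return the candidate only if it belongs to the context's vm, taking a
 * reference in that case.  On a mismatch the pointer is cleared so the
 * caller never sees a vma for which no reference was taken.
 */
static struct sketch_vma *sketch_lookup_vma(struct sketch_vma *candidate,
					    const void *ctx_vm)
{
	struct sketch_vma *vma = candidate;

	if (vma && vma->vm == ctx_vm)
		vma->refcount++;
	else
		vma = NULL;	/* the step that was missing */

	return vma;
}
```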

Reported-by: Yassine Mounir <sosohero200@gmail.com>
Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 weeks agoclk: qcom: gcc-x1e80100: Keep GCC USB QTB clock always ON
Jagadeesh Kona [Fri, 27 Mar 2026 15:06:46 +0000 (20:36 +0530)] 
clk: qcom: gcc-x1e80100: Keep GCC USB QTB clock always ON

In Hamoa, SMMU invalidation requires the GCC_AGGRE_USB_NOC_AXI_CLK
to be on for the USB QTB to be functional. This is currently
explicitly enabled by the DWC3 glue driver, so an invalidation
happening while the USB controller is suspended will fault.

Solve this by voting for the GCC MMU USB QTB clock.

Fixes: 161b7c401f4b ("clk: qcom: Add Global Clock controller (GCC) driver for X1E80100")
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Jagadeesh Kona <jagadeesh.kona@oss.qualcomm.com>
Reviewed-by: Taniya Das <taniya.das@oss.qualcomm.com>
Reviewed-by: Abel Vesa <abel.vesa@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260327-hamoa-usb-qtb-clk-always-on-v2-1-7d8a406e650f@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2 weeks agoclk: qcom: Constify list of critical CBCR registers
Krzysztof Kozlowski [Tue, 31 Mar 2026 09:17:23 +0000 (11:17 +0200)] 
clk: qcom: Constify list of critical CBCR registers

The static array 'xxx_critical_cbcrs' contains probe match-like data and
is not modified: neither by the driver defining it nor by common.c code
using it.

Make it const for code safety and code readability.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Reviewed-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260331091721.61613-4-krzysztof.kozlowski@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2 weeks agoclk: qcom: Constify qcom_cc_driver_data
Krzysztof Kozlowski [Tue, 31 Mar 2026 09:17:22 +0000 (11:17 +0200)] 
clk: qcom: Constify qcom_cc_driver_data

The static 'struct qcom_cc_driver_data' contains probe match-like data
and is not modified: neither by the driver defining it nor by common.c
code using it.

Make it const for code safety and code readability.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Reviewed-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260331091721.61613-3-krzysztof.kozlowski@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2 weeks agoclk: qcom: videocc-glymur: Constify qcom_cc_desc
Krzysztof Kozlowski [Tue, 31 Mar 2026 08:55:22 +0000 (10:55 +0200)] 
clk: qcom: videocc-glymur: Constify qcom_cc_desc

Static 'struct qcom_cc_desc' is not modified by drivers and can be made
const for code safety.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Reviewed-by: Taniya Das <taniya.das@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260331085521.37337-2-krzysztof.kozlowski@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2 weeks agotools/power/x86/intel-speed-select: v1.26 release
Srinivas Pandruvada [Sun, 5 Apr 2026 18:54:25 +0000 (11:54 -0700)] 
tools/power/x86/intel-speed-select: v1.26 release

This version includes the following changes:
- Setting current base frequency as maximum for SST-BF with
  kernel QOS changes
- Harmonize extended family decoding with the rest of the kernel
- Minor changes for error codes and messages

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2 weeks agotools/power/x86/intel-speed-select: Fix output when running on unsupported CLX platforms
Zhang Rui [Thu, 19 Mar 2026 05:52:56 +0000 (13:52 +0800)] 
tools/power/x86/intel-speed-select: Fix output when running on unsupported CLX platforms

When running intel-speed-select on unsupported CLX platforms, it prints
 intel-speed-select: Invalid CPU model (85)
 : Success
The trailing ": Success" appears because this is not a system error, so
errno is not set.

Replace err() with exit().
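
The stray ": Success" follows from how err(3) builds its output: it appends
": " and strerror(errno) to the message.  The sketch below reproduces that
format with snprintf() instead of calling err(), so it does not terminate
the process; format_like_err() is a hypothetical helper, not part of the
tool.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the text err(3) would print: "prog: msg: strerror(errno)\n".
 * With errno == 0, glibc's strerror() returns "Success".
 */
static int format_like_err(char *buf, size_t len, const char *prog,
			   const char *msg)
{
	return snprintf(buf, len, "%s: %s: %s\n", prog, msg, strerror(errno));
}
```

Because the tool's message already ends in a newline, the appended
": Success" lands on a line of its own, as in the output quoted above.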

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2 weeks agotools/power/x86/intel-speed-select: Print Version info when Incompatible API version...
Zhang Rui [Thu, 19 Mar 2026 05:52:55 +0000 (13:52 +0800)] 
tools/power/x86/intel-speed-select: Print Version info when Incompatible API version is detected

When running an old version of the intel-speed-select tool on newer platforms,
even with "intel-speed-select -v", the tool only complains about
"Incompatible API version", without giving the current version info.

Print Version info whenever Incompatible API version is detected.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2 weeks agotools/power/x86/intel-speed-select: Fix some program return value
Zhang Rui [Thu, 19 Mar 2026 05:52:54 +0000 (13:52 +0800)] 
tools/power/x86/intel-speed-select: Fix some program return value

When running the "intel-speed-select -h" command, it returns
1. 0 when using a version that is API incompatible.
2. 1 when using a version that is API compatible.
And this is confusing.

Fix the program to return 0 for "-h" parameter, and return 1 whenever
"Incompatible API versions" is detected.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2 weeks agotools/power/x86/intel-speed-select: Fix cpu extended family ID decoding
Zhang Rui [Mon, 26 Jan 2026 00:27:01 +0000 (08:27 +0800)] 
tools/power/x86/intel-speed-select: Fix cpu extended family ID decoding

When decoding and using the CPU extended family ID in intel-speed-select,
there are several potential issues:
1. Masking with 0x0f to get the CPU extended family ID is bogus because
   the CPU extended family ID takes 8 bits (bits 27:20).
2. Using the CPU extended family ID field without checking the CPU family
   ID is risky, because the Intel SDM says, "The Extended Family ID needs
   to be examined only when the Family ID is 0FH."
3. Saving cpu family ID and cpu extended family ID separately doesn't
   align with the Linux kernel, and it may bring extra complexity when
   making family specific changes in the future.
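
A decoding that avoids all three issues can be sketched as below.  The
helper name cpuid_display_family() is hypothetical; the field positions
(family ID in bits 11:8, extended family ID in bits 27:20 of CPUID leaf 1
EAX) follow the Intel SDM layout referred to above.

```c
#include <stdint.h>

/*
 * Combine family ID and extended family ID into a single value.  The
 * extended field is 8 bits wide and is only added when the family ID
 * is 0xf.
 */
static unsigned int cpuid_display_family(uint32_t eax)
{
	unsigned int family = (eax >> 8) & 0xf;		/* bits 11:8 */

	if (family == 0xf)
		family += (eax >> 20) & 0xff;		/* bits 27:20 */

	return family;
}
```

For example, an EAX value of 0x00050654 (model 85) decodes to family 6,
and the extended family bits are ignored because the family ID is not 0xf.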

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2 weeks agotools/power/x86/intel-speed-select: Avoid current base freq as maximum
Srinivas Pandruvada [Thu, 12 Mar 2026 19:41:08 +0000 (12:41 -0700)] 
tools/power/x86/intel-speed-select: Avoid current base freq as maximum

SST-PP level change results in online/offline of CPUs with -o option.
The Linux intel-pstate driver internally stores the current HWP_REQ MSR
value during offline and restores it during online.

It is possible that during SST-PP level change, the new HWP_CAP limits
can be updated. So, when a CPU is online, the HWP_REQ MSR should be
updated to new values based on HWP_CAP values.

This is particularly problematic when either turbo is disabled or the
current HWP_REQ value (stored before online) is less than the base
frequency from the updated HWP_CAP MSR guaranteed value. If the HWP_REQ
MSR is not updated, then the performance will be limited to the value
before perf level change.

Hence the tool updates cpufreq scaling_max_freq to the newer
base_frequency value in this case. This step is not required when HWP
interrupts are enabled, as the perf level change should result in a new
interrupt with HWP_GUARANTEED_PERF_CHANGE_STATUS and the intel_pstate
driver will update to new limits.

But the tool needs to handle the case when HWP interrupts are not
enabled, and there is no way for the tool to know whether HWP interrupts
are enabled.  So, it still has to update scaling_max_freq.

With the QOS changes in the kernel, user space writes to scaling_max_freq
are treated as hard limits. So, when base frequency is increased with
SST-BF enabled, the cpufreq subsystem will still not allow setting to the
SST-BF high priority core frequency. So, the HWP_REQ MSR will still be
capped to the user-set scaling_max_freq after SST-PP level change.

To address this, instead of setting scaling_max_freq to the current HWP_CAP
highest frequency, set it to the maximum integer value to set the QOS limit
as unconstrained. In this case, the actual HWP_REQ maximum frequency will
still be capped to HWP_CAP highest performance by the intel-pstate driver.
So, it will not result in invalid HWP_REQ values.
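
The capping argument above can be sketched as a small model. This is only
an illustration under stated assumptions: the helper names and the use of
INT_MAX for "maximum integer value" are assumptions, and none of this is
code from the tool or from the intel_pstate driver.

```c
#include <limits.h>

/*
 * Model of the capping described above: whatever user space requests
 * through scaling_max_freq, the effective maximum never exceeds the
 * HWP_CAP highest performance. Hypothetical helper, not driver code.
 */
static int effective_max_khz(int requested_max_khz, int hwp_cap_highest_khz)
{
	if (requested_max_khz > hwp_cap_highest_khz)
		return hwp_cap_highest_khz;
	return requested_max_khz;
}

/* Value assumed to leave the QOS limit unconstrained in this sketch. */
static int unconstrained_request(void)
{
	return INT_MAX;
}
```

With an unconstrained request the result follows the hardware limit, while
a stale lower request keeps capping the result after the limit rises.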

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2 weeks agoMerge tag 'mips-fixes_7.0_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips...
Linus Torvalds [Sun, 5 Apr 2026 18:29:07 +0000 (11:29 -0700)] 
Merge tag 'mips-fixes_7.0_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux

Pull MIPS fixes from Thomas Bogendoerfer:

 - Fix TLB uniquification for systems with TLB not initialised by
   firmware

 - Fix allocation in TLB uniquification

 - Fix SiByte cache initialisation

 - Check uart parameters from firmware on Loongson64 systems

 - Fix clock id mismatch for Ralink SoCs

 - Fix GCC version check for __multi3 workaround

* tag 'mips-fixes_7.0_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  mips: mm: Allocate tlb_vpn array atomically
  MIPS: mm: Rewrite TLB uniquification for the hidden bit feature
  MIPS: mm: Suppress TLB uniquification on EHINV hardware
  MIPS: Always record SEGBITS in cpu_data.vmbits
  MIPS: Fix the GCC version check for `__multi3' workaround
  MIPS: SiByte: Bring back cache initialisation
  mips: ralink: update CPU clock index
  MIPS: Loongson64: env: Check UARTs passed by LEFI cautiously

2 weeks agoMerge tag 'char-misc-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Sun, 5 Apr 2026 17:09:33 +0000 (10:09 -0700)] 
Merge tag 'char-misc-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc/iio driver fixes from Greg KH:
 "Here are a relatively large number of small char/misc/iio and other
  driver fixes for 7.0-rc7. There's a bunch, but overall they are all
  small fixes for issues that people have been having that I finally
  caught up with getting merged due to delays on my end.

  The "largest" change overall is just some documentation updates to the
  security-bugs.rst file to hopefully tell the AI tools (and any users
  that actually read the documentation), how to send us better security
  bug reports as the quantity of reports these past few weeks has
  increased dramatically due to tools getting better at "finding"
  things.

  Included in here are:
   - lots of small IIO driver fixes for issues reported in 7.0-rc
   - gpib driver fixes
   - comedi driver fixes
   - interconnect driver fix
   - nvmem driver fixes
   - mei driver fix
   - counter driver fix
   - binder rust driver fixes
   - some other small misc driver fixes

  All of these have been in linux-next this week with no reported issues"

* tag 'char-misc-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (63 commits)
  Documentation: fix two typos in latest update to the security report howto
  Documentation: clarify the mandatory and desirable info for security reports
  Documentation: explain how to find maintainers addresses for security reports
  Documentation: minor updates to the security contacts
  .get_maintainer.ignore: add myself
  nvmem: zynqmp_nvmem: Fix buffer size in DMA and memcpy
  nvmem: imx: assign nvmem_cell_info::raw_len
  misc: fastrpc: check qcom_scm_assign_mem() return in rpmsg_probe
  misc: fastrpc: possible double-free of cctx->remote_heap
  comedi: dt2815: add hardware detection to prevent crash
  comedi: runflags cannot determine whether to reclaim chanlist
  comedi: Reinit dev->spinlock between attachments to low-level drivers
  comedi: me_daq: Fix potential overrun of firmware buffer
  comedi: me4000: Fix potential overrun of firmware buffer
  comedi: ni_atmio16d: Fix invalid clean-up after failed attach
  gpib: fix use-after-free in IO ioctl handlers
  gpib: lpvo_usb: fix memory leak on disconnect
  gpib: Fix fluke driver s390 compile issue
  lis3lv02d: Omit IRQF_ONESHOT if no threaded handler is provided
  lis3lv02d: fix kernel-doc warnings
  ...

2 weeks agoMerge tag 'tty-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Linus Torvalds [Sun, 5 Apr 2026 17:04:28 +0000 (10:04 -0700)] 
Merge tag 'tty-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty fixes from Greg KH:
 "Here are two small tty vt fixes for 7.0-rc7 to resolve some reported
  issues with the resize ability of the alt screen buffer. Both of these
  have been in linux-next all week with no reported issues"

* tag 'tty-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  vt: resize saved unicode buffer on alt screen exit after resize
  vt: discard stale unicode buffer on alt screen exit after resize

2 weeks agoMerge tag 'usb-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Linus Torvalds [Sun, 5 Apr 2026 17:00:26 +0000 (10:00 -0700)] 
Merge tag 'usb-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB/Thunderbolt fixes from Greg KH:
 "Here are a bunch of USB and Thunderbolt fixes (most all are USB) for
  7.0-rc7. More than I normally like this late in the release cycle,
  partly due to my recent travels, and partly due to people banging away
  on the USB gadget interfaces and apis more than normal (big shoutout
  to Android for getting the vendors to actually work upstream on this,
  that's a huge win overall for everyone here)

  Included in here are:
   - Small thunderbolt fix
   - new USB serial driver ids added
   - typec driver fixes
   - gadget driver fixes for some disconnect issues
   - other usb gadget driver fixes for reported problems with binding
     and unbinding devices as happens when a gadget device connects /
     disconnects from a system it is plugged into (or it switches device
     mode at a user's request, these things are complex little
     beasts...)
   - usb offload fixes (where USB audio tunnels through the controller
     while the main CPU is asleep) for when EMP spikes hit the system
     causing disconnects to happen (as often happens with static
     electricity in the winter months). This has been much reported by
     at least one vendor, and resolves the issues they have been seeing
     with this codepath. Can't wait for the "formal methods are the
     answer!" people to try to model that one properly...
   - Other small usb driver fixes for issues reported.

  All of these have been in linux-next this week, and before, with no
  reported issues, and I've personally been stressing these harder than
  normal on my systems here with no problems"

* tag 'usb-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (39 commits)
  usb: gadget: f_hid: move list and spinlock inits from bind to alloc
  usb: host: xhci-sideband: delegate offload_usage tracking to class drivers
  usb: core: use dedicated spinlock for offload state
  usb: cdns3: gadget: fix state inconsistency on gadget init failure
  usb: dwc3: imx8mp: fix memory leak on probe failure path
  usb: gadget: f_uac1_legacy: validate control request size
  usb: ulpi: fix double free in ulpi_register_interface() error path
  usb: misc: usbio: Fix URB memory leak on submit failure
  USB: core: add NO_LPM quirk for Razer Kiyo Pro webcam
  usb: cdns3: gadget: fix NULL pointer dereference in ep_queue
  usb: core: phy: avoid double use of 'usb3-phy'
  USB: serial: option: add MeiG Smart SRM825WN
  usb: gadget: f_rndis: Fix net_device lifecycle with device_move
  usb: gadget: f_subset: Fix net_device lifecycle with device_move
  usb: gadget: f_eem: Fix net_device lifecycle with device_move
  usb: gadget: f_ecm: Fix net_device lifecycle with device_move
  usb: gadget: u_ncm: Add kernel-doc comments for struct f_ncm_opts
  usb: gadget: f_rndis: Protect RNDIS options with mutex
  usb: gadget: f_subset: Fix unbalanced refcnt in geth_free
  dt-bindings: connector: add pd-disable dependency
  ...

2 weeks agoEDAC/mc: Fix error path ordering in edac_mc_alloc()
Borislav Petkov (AMD) [Tue, 31 Mar 2026 12:16:23 +0000 (14:16 +0200)] 
EDAC/mc: Fix error path ordering in edac_mc_alloc()

When the mci->pvt_info allocation in edac_mc_alloc() fails, the error path
will call put_device() which will end up calling the device's release
function.

However, the init ordering is wrong such that device_initialize() happens
*after* the failed allocation and thus the device itself and the release
function pointer are not initialized yet when they're called:

  MCE: In-kernel MCE decoding enabled.
  ------------[ cut here ]------------
  kobject: '(null)': is not initialized, yet kobject_put() is being called.
  WARNING: lib/kobject.c:734 at kobject_put, CPU#22: systemd-udevd
  CPU: 22 UID: 0 PID: 538 Comm: systemd-udevd Not tainted 7.0.0-rc1+ #2 PREEMPT(full)
  RIP: 0010:kobject_put
  Call Trace:
   <TASK>
   edac_mc_alloc+0xbe/0xe0 [edac_core]
   amd64_edac_init+0x7a4/0xff0 [amd64_edac]
   ? __pfx_amd64_edac_init+0x10/0x10 [amd64_edac]
   do_one_initcall
   ...

Reorder the calling sequence so that the device is initialized and thus the
release function pointer is properly set before it can be used.
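
The ordering requirement can be illustrated with a small user-space model.
This is a sketch only: struct dev, dev_init(), dev_put() and
alloc_example() are hypothetical stand-ins for the device model calls
named above, not the EDAC code.

```c
#include <stddef.h>
#include <stdlib.h>

struct dev {
	int initialized;
	void (*release)(struct dev *d);
};

static int released;	/* counts release callbacks, for illustration */

static void dev_release(struct dev *d)
{
	released++;
	free(d);
}

/* Stand-in for device initialization: sets up the release callback. */
static void dev_init(struct dev *d)
{
	d->initialized = 1;
	d->release = dev_release;
}

/* Drop the only reference; only valid once dev_init() has run. */
static int dev_put(struct dev *d)
{
	if (!d->initialized || !d->release)
		return -1;	/* the situation the warning above reports */
	d->release(d);
	return 0;
}

/* Returns NULL on failure; *put_ok reports whether the error path was safe. */
static struct dev *alloc_example(int fail_private_alloc, int *put_ok)
{
	struct dev *d = calloc(1, sizeof(*d));

	if (!d)
		return NULL;

	dev_init(d);	/* initialize first ... */

	if (fail_private_alloc) {
		/* ... so that the error path can drop the reference safely. */
		*put_ok = (dev_put(d) == 0);
		return NULL;
	}
	return d;
}
```

Moving dev_init() below the failing branch in this model would make
dev_put() hit the uninitialized case, which mirrors the splat above.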

This was found by Claude while reviewing another EDAC patch.

Fixes: 0bbb265f7089 ("EDAC/mc: Get rid of silly one-shot struct allocation in edac_mc_alloc()")
Reported-by: Claude Code:claude-opus-4.5
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: stable@kernel.org
Link: https://patch.msgid.link/20260331121623.4871-1-bp@kernel.org
2 weeks agogpu: nova-core: vbios: use from_le_bytes() for PCI ROM header parsing
John Hubbard [Sat, 4 Apr 2026 21:28:29 +0000 (14:28 -0700)] 
gpu: nova-core: vbios: use from_le_bytes() for PCI ROM header parsing

Clippy fires two clippy::precedence warnings on the manual
byte-shifting expression:
  warning: operator precedence can trip the unwary
     --> drivers/gpu/nova-core/vbios.rs:511:17
      |
  511 | /                 u32::from(data[29]) << 24
  512 | |                     | u32::from(data[28]) << 16
  513 | |                     | u32::from(data[27]) << 8
      | |______________________________________________^

Clear the warnings by replacing manual byte-shifting with
u32::from_le_bytes(). Using from_le_bytes() is also a tiny code
improvement, because it uses less code and is clearer about the intent.
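
The patch itself is Rust. As a rough C analogue of the same idea, a single
helper can assemble the value from little-endian bytes with the grouping
made explicit. The helper name is an assumption for illustration and is
not part of nova-core.

```c
#include <stdint.h>

/* Assemble a 32-bit value from four bytes stored in little-endian order. */
static uint32_t le32_from_bytes(const uint8_t b[4])
{
	return ((uint32_t)b[0]) |
	       ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) |
	       ((uint32_t)b[3] << 24);
}
```

Keeping the conversion in one helper states the intent once instead of
repeating shift expressions at every parsing site.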

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Link: https://patch.msgid.link/20260404212831.78971-2-jhubbard@nvidia.com
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2 weeks agogpu: nova-core: bitfield: fix broken Default implementation
Eliot Courtney [Wed, 1 Apr 2026 01:42:28 +0000 (10:42 +0900)] 
gpu: nova-core: bitfield: fix broken Default implementation

The current implementation does not actually set the default values for
the fields in the bitfield.

Fixes: 3fa145bef533 ("gpu: nova-core: register: generate correct `Default` implementation")
Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
Link: https://patch.msgid.link/20260401-fix-bitfield-v2-1-2fa68c98114a@nvidia.com
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2 weeks agogpu: nova-core: falcon: pad firmware DMA object size to required block alignment
Alexandre Courbot [Sun, 5 Apr 2026 02:22:54 +0000 (11:22 +0900)] 
gpu: nova-core: falcon: pad firmware DMA object size to required block alignment

Commit a88831502c8f ("gpu: nova-core: falcon: use dma::Coherent")
dropped the nova-local `DmaObject` device memory type for the
kernel-global `Coherent` one.

This switch had a side-effect: `DmaObject` always aligned the requested
size to `PAGE_SIZE`, and also reported that adjusted size when queried.
`Coherent`, on the other hand, does page-align allocation sizes but only
allows CPU access on the exact size provided by the caller.

This change runs into a limitation of falcon DMA copies, namely that DMA
accesses are done on blocks of exactly 256 bytes. If the provided data
does not have a length that is a multiple of 256, `dma_wr` returns
an error.

It was expected that all firmwares would present the proper adjusted
size, but this is not the case at least on my GA107:

    NovaCore 0000:08:00.0: DMA transfer goes beyond range of DMA object
    NovaCore 0000:08:00.0: Failed to load FWSEC firmware: EINVAL
    NovaCore 0000:08:00.0: probe with driver NovaCore failed with error -22

Fix this by padding the `Coherent`'s size to `MEM_BLOCK_ALIGNMENT` (i.e.
256) when allocating it and filling it with zeroes, before copying the
firmware on top of it.
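
The padding step can be sketched in C as follows. This is a minimal
illustration, assuming a plain byte buffer in place of the `Coherent`
allocation; BLOCK_ALIGNMENT, padded_size() and stage_firmware() are
hypothetical names, not the driver's.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_ALIGNMENT	((size_t)256)	/* block size named above */

/* Round len up to the next multiple of BLOCK_ALIGNMENT. */
static size_t padded_size(size_t len)
{
	return (len + BLOCK_ALIGNMENT - 1) & ~(BLOCK_ALIGNMENT - 1);
}

/*
 * Zero the padded region, then copy the firmware over its start.
 * Returns the padded size, or 0 if dst cannot hold it.
 */
static size_t stage_firmware(uint8_t *dst, size_t dst_cap,
			     const uint8_t *fw, size_t len)
{
	size_t padded = padded_size(len);

	if (padded > dst_cap)
		return 0;

	memset(dst, 0, padded);
	memcpy(dst, fw, len);
	return padded;
}
```

A 3-byte image, for example, is staged as one 256-byte block whose tail is
zero, so a block-sized transfer never reads past the object.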

Fixes: a88831502c8f ("gpu: nova-core: falcon: use dma::Coherent")
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Link: https://patch.msgid.link/20260405-falcon-dma-roundup-v2-1-4af5b2ff9c16@nvidia.com
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2 weeks agogpu: nova-core: gsp: fix undefined behavior in command queue code
Alexandre Courbot [Sat, 4 Apr 2026 05:04:24 +0000 (14:04 +0900)] 
gpu: nova-core: gsp: fix undefined behavior in command queue code

`driver_read_area` and `driver_write_area` are internal methods that
return slices containing the area of the command queue buffer to which
the driver has exclusive read or write access, respectively.

While their returned value is correct and safe to use, internally they
temporarily create a reference to the whole command-buffer slice,
including GSP-owned regions. These regions can change without notice,
and thus creating a slice to them, even if never accessed, is undefined
behavior.

Fix this by making these methods create slices to valid regions only.

Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling")
Reported-by: Danilo Krummrich <dakr@kernel.org>
Closes: https://lore.kernel.org/all/DH47AVPEKN06.3BERUSJIB4M1R@kernel.org/
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Link: https://patch.msgid.link/20260404-cmdq-ub-fix-v5-1-53d21f4752f5@nvidia.com
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2 weeks agox86/mce/amd: Filter bogus hardware errors on Zen3 clients
Yazen Ghannam [Sat, 28 Feb 2026 14:08:14 +0000 (09:08 -0500)] 
x86/mce/amd: Filter bogus hardware errors on Zen3 clients

Users have been observing multiple L3 cache deferred errors after recent
kernel rework of deferred error handling.¹ ⁴

The errors are bogus due to inconsistent status values. Also, a user
verified that bogus MCA_DESTAT values are present on the system even with
an older kernel.²

The errors seem to be garbage values present in the MCA_DESTAT of some L3
cache banks. These were implicitly ignored before the recent kernel rework
because these do not generate a deferred error interrupt.

A later revision of the rework patch was merged for v6.19. This naturally
filtered out most of the bogus error logs. However, a few signatures still
remain.³

Minimize the scope of the filter to the reported CPU
family/model/stepping and only for errors which don't have the Enabled
bit in the MCi status MSR.
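
A rough sketch of a filter with this scope is shown below. It is an
assumption-laden illustration: struct cpu_sig and should_filter() are
hypothetical, the affected family/model/stepping is passed in rather than
guessed, and the merged condition may differ given the maintainer note
further down.

```c
#include <stdbool.h>
#include <stdint.h>

#define STATUS_EN	(1ULL << 60)	/* Enabled bit of an MCA status value */

/* Hypothetical CPU signature used only by this sketch. */
struct cpu_sig {
	unsigned int family;
	unsigned int model;
	unsigned int stepping;
};

/* Filter only on the affected CPU, and only when the Enabled bit is clear. */
static bool should_filter(const struct cpu_sig *cpu,
			  const struct cpu_sig *affected, uint64_t status)
{
	if (cpu->family != affected->family ||
	    cpu->model != affected->model ||
	    cpu->stepping != affected->stepping)
		return false;

	return !(status & STATUS_EN);
}
```

Errors with the Enabled bit set, or seen on any other CPU, pass through
unfiltered in this model.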

¹ https://lore.kernel.org/20250915010010.3547-1-spasswolf@web.de
² https://lore.kernel.org/6e1eda7dd55f6fa30405edf7b0f75695cf55b237.camel@web.de
³ https://lore.kernel.org/21ba47fa8893b33b94370c2a42e5084cf0d2e975.camel@web.de
⁴ https://lore.kernel.org/r/CAKFB093B2k3sKsGJ_QNX1jVQsaXVFyy=wNwpzCGLOXa_vSDwXw@mail.gmail.com

  [ bp: Generalize the condition according to which errors are bogus. ]

Fixes: 7cb735d7c0cb ("x86/mce: Unify AMD DFR handler with MCA Polling")
Closes: https://lore.kernel.org/20250915010010.3547-1-spasswolf@web.de
Reported-by: Bert Karwatzki <spasswolf@web.de>
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-By: Bert Karwatzki <spasswolf@web.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/20250915010010.3547-1-spasswolf@web.de
2 weeks agokconfig: forbid multiple entries with the same symbol in a choice
Masahiro Yamada [Mon, 30 Mar 2026 11:57:35 +0000 (20:57 +0900)] 
kconfig: forbid multiple entries with the same symbol in a choice

Commit 6a859f1a19d1 ("powerpc: unify two CONFIG_POWERPC64_CPU entries
in the same choice block") removed the only occurrence of this tricky
use case.

Disallow this pattern in choice_check_sanity() and revert commit
4d46b5b623e0 ("kconfig: fix infinite loop in sym_calc_choice()").

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://patch.msgid.link/20260330115736.1559962-1-masahiroy@kernel.org
Reviewed-by: Nicolas Schier <nsc@kernel.org>
Signed-off-by: Nicolas Schier <nsc@kernel.org>
2 weeks agoDocumentation: kbuild: Update the debug information notes in reproducible-builds.rst
Nathan Chancellor [Fri, 13 Mar 2026 23:37:29 +0000 (16:37 -0700)] 
Documentation: kbuild: Update the debug information notes in reproducible-builds.rst

The debug information part of the "Absolute filenames" section in the
reproducible builds document only mentions providing
'-fdebug-prefix-map' to KCFLAGS but it needs to be provided to KAFLAGS
as well since debug information has been generated for assembly files
for a long time.

Additionally, mention that the build directory may also appear as an
absolute path in the debug information (via DW_AT_comp_dir), so it needs
to be overridden via '-fdebug-prefix-map' as well.

Reported-by: Alexander Coffin <alex@cyberialabs.net>
Closes: https://lore.kernel.org/b8dfe7035d19fd611b9be55ee3145fdb@purelymail.com/
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Link: https://patch.msgid.link/20260313-kbuild-docs-repro-builds-fdebug-prefix-map-updates-v1-1-3aeeef7fa710@kernel.org
Signed-off-by: Nicolas Schier <nsc@kernel.org>
2 weeks agochecksyscalls: move instance functionality into generic code
Thomas Weißschuh [Thu, 2 Apr 2026 14:36:20 +0000 (16:36 +0200)] 
checksyscalls: move instance functionality into generic code

On MIPS the checksyscalls.sh script may be executed multiple times.
Currently these multiple executions are repeated on each build, as kbuild
sees that the commands have changed each time.

Use a dedicated stamp file for each different invocation to avoid the
spurious executions.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Nicolas Schier <nsc@kernel.org>
Link: https://patch.msgid.link/20260402-kbuild-missing-syscalls-v3-3-6641be1de2db@weissschuh.net
Signed-off-by: Nicolas Schier <nsc@kernel.org>
2 weeks agochecksyscalls: only run when necessary
Thomas Weißschuh [Thu, 2 Apr 2026 14:36:19 +0000 (16:36 +0200)] 
checksyscalls: only run when necessary

Currently checksyscalls.sh is unconditionally executed during each build.
Most of these executions are unnecessary.

Only run checksyscalls.sh if one of its inputs has changed.

This new logic does not work for the multiple invocations done for MIPS.
The effect is that checksyscalls.sh is still executed unconditionally.
However this is not worse than before.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Nicolas Schier <nsc@kernel.org>
Link: https://patch.msgid.link/20260402-kbuild-missing-syscalls-v3-2-6641be1de2db@weissschuh.net
Signed-off-by: Nicolas Schier <nsc@kernel.org>
2 weeks agochecksyscalls: fail on all intermediate errors
Thomas Weißschuh [Sat, 4 Apr 2026 12:23:10 +0000 (14:23 +0200)] 
checksyscalls: fail on all intermediate errors

Make sure that a failure of any intermediate step also fails the
overall execution.

Link: https://sashiko.dev/#/patchset/20260402-kbuild-missing-syscalls-v3-0-6641be1de2db%40weissschuh.net
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://patch.msgid.link/20260404-checksyscalls-set-e-v1-1-206400e78668@weissschuh.net
Signed-off-by: Nicolas Schier <nsc@kernel.org>
2 weeks agochecksyscalls: move path to reference table to a variable
Thomas Weißschuh [Thu, 2 Apr 2026 14:36:18 +0000 (16:36 +0200)] 
checksyscalls: move path to reference table to a variable

An upcoming patch will need to reuse this path.

Move it into a reusable variable.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Nicolas Schier <nsc@kernel.org>
Link: https://patch.msgid.link/20260402-kbuild-missing-syscalls-v3-1-6641be1de2db@weissschuh.net
Signed-off-by: Nicolas Schier <nsc@kernel.org>
2 weeks agoKVM: arm64: Advertise ID_AA64PFR2_EL1.GCIE
Marc Zyngier [Wed, 1 Apr 2026 17:00:17 +0000 (18:00 +0100)] 
KVM: arm64: Advertise ID_AA64PFR2_EL1.GCIE

As we are missing ID_AA64PFR2_EL1.GCIE from the kernel feature set,
userspace cannot write ID_AA64PFR2_EL1 with GCIE set, even if we are
on a GICv5 host.

Add the required field description.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://patch.msgid.link/20260401170017.369529-1-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
2 weeks agocoda_flag_children(): fix a UAF
Al Viro [Sun, 1 Feb 2026 17:33:37 +0000 (12:33 -0500)] 
coda_flag_children(): fix a UAF

if de goes negative right under us, there's nothing to prevent inode
getting freed just as we call coda_flag_inode().  We are not holding
->d_lock, so it's not impossible.  Not going to be reproducible on
bare hardware unless it's a realtime config, but it could happen on KVM.

Trivial to fix - just hold rcu_read_lock() over that loop.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2 weeks agosanitize coda_dentry_delete()
Al Viro [Sun, 1 Feb 2026 17:18:30 +0000 (12:18 -0500)] 
sanitize coda_dentry_delete()

d_really_is_negative(dentry) is a check for d_inode(dentry) being NULL;
rechecking that is pointless (and no, it can't race - the caller is holding
->d_lock, so ->d_inode is stable)

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2 weeks agocoda: is_bad_inode() is always false there
Al Viro [Sun, 1 Feb 2026 17:10:54 +0000 (12:10 -0500)] 
coda: is_bad_inode() is always false there

... since dbd822046445 ("[PATCH] Coda FS update") back in 2002

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2 weeks agoprctl: cfi: change the branch landing pad prctl()s to be more descriptive
Paul Walmsley [Sun, 5 Apr 2026 00:40:58 +0000 (18:40 -0600)] 
prctl: cfi: change the branch landing pad prctl()s to be more descriptive

Per Linus' comments requesting the replacement of "INDIR_BR_LP" in the
indirect branch tracking prctl()s with something more readable, and
suggesting the use of the speculation control prctl()s as an exemplar,
reimplement the prctl()s and related constants that control per-task
forward-edge control flow integrity.

This primarily involves two changes.  First, the prctls are
restructured to resemble the style of the speculative execution
workaround control prctls PR_{GET,SET}_SPECULATION_CTRL, to make them
easier to extend in the future.  Second, the "indir_br_lp" abbreviation
is expanded to "branch_landing_pads" to be less telegraphic.  The
kselftest and documentation are adjusted accordingly.

Link: https://lore.kernel.org/linux-riscv/CAHk-=whhSLGZAx3N5jJpb4GLFDqH_QvS07D+6BnkPWmCEzTAgw@mail.gmail.com/
Cc: Deepak Gupta <debug@rivosinc.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2 weeks agoriscv: ptrace: cfi: expand "SS" references to "shadow stack" in uapi headers
Paul Walmsley [Sun, 5 Apr 2026 00:40:58 +0000 (18:40 -0600)] 
riscv: ptrace: cfi: expand "SS" references to "shadow stack" in uapi headers

Similar to the recent change to expand "LP" to "branch landing pad",
let's expand "SS" in the ptrace uapi macros to "shadow stack" as well.
This aligns with the existing prctl() arguments, which use the
expanded "shadow stack" names, rather than just the abbreviation.

Link: https://lore.kernel.org/linux-riscv/CAHk-=whhSLGZAx3N5jJpb4GLFDqH_QvS07D+6BnkPWmCEzTAgw@mail.gmail.com/
Cc: Deepak Gupta <debug@rivosinc.com>
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2 weeks agoprctl: rename branch landing pad implementation functions to be more explicit
Paul Walmsley [Sun, 5 Apr 2026 00:40:58 +0000 (18:40 -0600)] 
prctl: rename branch landing pad implementation functions to be more explicit

Per Linus' comments about the unreadability of abbreviations such as
"indir_br_lp", rename the three prctl() implementation functions to be more
explicit.  This involves renaming "indir_br_lp_status" in the function
names to "branch_landing_pad_state".

While here, add _prctl_ into the function names, following the
speculation control prctl implementation functions.

Link: https://lore.kernel.org/linux-riscv/CAHk-=whhSLGZAx3N5jJpb4GLFDqH_QvS07D+6BnkPWmCEzTAgw@mail.gmail.com/
Cc: Deepak Gupta <debug@rivosinc.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2 weeks agoriscv: ptrace: expand "LP" references to "branch landing pads" in uapi headers
Paul Walmsley [Sun, 5 Apr 2026 00:40:58 +0000 (18:40 -0600)] 
riscv: ptrace: expand "LP" references to "branch landing pads" in uapi headers

Per Linus' comments about the unreadability of abbreviations such as
"LP", rename the RISC-V ptrace landing pad CFI macro names to be more
explicit.  This primarily involves expanding "LP" in the names to some
variant of "branch landing pad."

Link: https://lore.kernel.org/linux-riscv/CAHk-=whhSLGZAx3N5jJpb4GLFDqH_QvS07D+6BnkPWmCEzTAgw@mail.gmail.com/
Cc: Deepak Gupta <debug@rivosinc.com>
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2 weeks agoriscv: cfi: clear CFI lock status in start_thread()
Zong Li [Sun, 5 Apr 2026 00:40:58 +0000 (18:40 -0600)] 
riscv: cfi: clear CFI lock status in start_thread()

When libc locks the CFI status through the following prctls:
 - PR_LOCK_SHADOW_STACK_STATUS
 - PR_LOCK_INDIR_BR_LP_STATUS

A newly execd address space will inherit the lock status
if it does not clear the lock bits. Since the lock bits
remain set, libc will later fail to enable the landing
pad and shadow stack.

Signed-off-by: Zong Li <zong.li@sifive.com>
Link: https://patch.msgid.link/20260323065640.4045713-1-zong.li@sifive.com
[pjw@kernel.org: ensure we unlock before changing state; cleaned up subject line]
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2 weeks agoriscv: ptrace: cfi: fix "PRACE" typo in uapi header
Paul Walmsley [Sun, 5 Apr 2026 00:40:57 +0000 (18:40 -0600)] 
riscv: ptrace: cfi: fix "PRACE" typo in uapi header

A CFI-related macro defined in arch/riscv/uapi/asm/ptrace.h misspells
"PTRACE" as "PRACE"; fix this.

Fixes: 2af7c9cf021c ("riscv/ptrace: expose riscv CFI status and state via ptrace and in core files")
Cc: Deepak Gupta <debug@rivosinc.com>
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2 weeks agoACPI: RIMT: Add dependency between iommu and devices
Sunil V L [Tue, 3 Mar 2026 06:16:05 +0000 (11:46 +0530)] 
ACPI: RIMT: Add dependency between iommu and devices

EPROBE_DEFER ensures IOMMU devices are probed before the devices that
depend on them. During shutdown, however, the IOMMU may be removed
first, leading to issues. To avoid this, a device link is added
which enforces the correct removal order.

Fixes: 8f7729552582 ("ACPI: RISC-V: Add support for RIMT")
Signed-off-by: Sunil V L <sunilvl@oss.qualcomm.com>
Link: https://patch.msgid.link/20260303061605.722949-1-sunilvl@oss.qualcomm.com
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2 weeks agoselftests: riscv: Add braces around EXPECT_EQ()
Charlie Jenkins [Tue, 10 Mar 2026 01:52:11 +0000 (18:52 -0700)] 
selftests: riscv: Add braces around EXPECT_EQ()

EXPECT_EQ() expands to multiple lines, breaking up one-line if
statements. This issue was not present in the patch on the mailing list
but was instead introduced by the maintainer when attempting to fix up
checkpatch warnings. Add braces around EXPECT_EQ() to avoid the error
even though checkpatch suggests them to be removed:

validate_v_ptrace.c:626:17: error: ‘else’ without a previous ‘if’
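
The failure mode can be reproduced with any macro that expands to more
than one statement. The sketch below uses a hypothetical CHECK_EQ() as a
stand-in; it is not the kselftest harness definition of EXPECT_EQ().

```c
static int checks;
static int failures;

/* Hypothetical stand-in: a check macro that expands to two statements. */
#define CHECK_EQ(expected, seen)	\
	if ((expected) != (seen))	\
		failures++;		\
	checks++

static void compare_one(int use_first, int expected, int first, int second)
{
	/*
	 * Without the braces, the if body would end inside the macro
	 * expansion, leaving the else without a matching if.
	 */
	if (use_first) {
		CHECK_EQ(expected, first);
	} else {
		CHECK_EQ(expected, second);
	}
}
```

Dropping the braces in compare_one(), as checkpatch would suggest for a
single-statement body, yields the same class of compile error as above.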

Fixes: 3789d5eecd5a ("selftests: riscv: verify syscalls discard vector context")
Fixes: 30eb191c895b ("selftests: riscv: verify ptrace rejects invalid vector csr inputs")
Fixes: 849f05ae1ea6 ("selftests: riscv: verify ptrace accepts valid vector csr values")
Signed-off-by: Charlie Jenkins <thecharlesjenkins@gmail.com>
Reviewed-and-tested-by: Sergey Matyukevich <geomatsi@gmail.com>
Link: https://patch.msgid.link/20260309-fix_selftests-v2-2-9d5a553a531e@gmail.com
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2 weeks agoriscv: use _BITUL macro rather than BIT() in ptrace uapi and kselftests
Paul Walmsley [Thu, 2 Apr 2026 23:18:03 +0000 (17:18 -0600)] 
riscv: use _BITUL macro rather than BIT() in ptrace uapi and kselftests

Fix the build of non-kernel code that includes the RISC-V ptrace uapi
header, and the RISC-V validate_v_ptrace.c kselftest, by using the
_BITUL() macro rather than BIT().  BIT() is not available outside
the kernel.
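
A minimal user-space sketch of the pattern follows. EXAMPLE_STATE_FLAG is
a hypothetical name and not a macro from the RISC-V header; the fallback
definition exists only so the sketch builds where the uapi headers are not
installed.

```c
/* Pick up _BITUL() from the uapi header when it is installed. */
#if defined(__has_include)
#if __has_include(<linux/const.h>)
#include <linux/const.h>
#endif
#endif

/* Fallback for this sketch only: bit x as an unsigned long. */
#ifndef _BITUL
#define _BITUL(x)	(1UL << (x))
#endif

/* Hypothetical flag, named for illustration. */
#define EXAMPLE_STATE_FLAG	_BITUL(3)

static unsigned long example_flag(void)
{
	return EXAMPLE_STATE_FLAG;
}
```

A header written this way builds in ordinary programs, whereas one that
relies on BIT() needs a definition that user space does not have.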

Based on patches and comments from Charlie Jenkins, Michael Neuling,
and Andreas Schwab.

Fixes: 30eb191c895b ("selftests: riscv: verify ptrace rejects invalid vector csr inputs")
Fixes: 2af7c9cf021c ("riscv/ptrace: expose riscv CFI status and state via ptrace and in core files")
Cc: Andreas Schwab <schwab@suse.de>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Charlie Jenkins <thecharlesjenkins@gmail.com>
Link: https://patch.msgid.link/20260330024248.449292-1-mikey@neuling.org
Link: https://lore.kernel.org/linux-riscv/20260309-fix_selftests-v2-1-9d5a553a531e@gmail.com/
Link: https://lore.kernel.org/linux-riscv/20260309-fix_selftests-v2-3-9d5a553a531e@gmail.com/
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2 weeks agoriscv: Reset pmm when PR_TAGGED_ADDR_ENABLE is not set
Zishun Yi [Sun, 22 Mar 2026 16:00:22 +0000 (00:00 +0800)] 
riscv: Reset pmm when PR_TAGGED_ADDR_ENABLE is not set

In set_tagged_addr_ctrl(), when PR_TAGGED_ADDR_ENABLE is not set, pmlen
is correctly set to 0, but it forgets to reset pmm. This results in the
CPU pmm state not corresponding to the software pmlen state.

Fix this by resetting pmm along with pmlen.
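
The intended behaviour can be modelled as below. This is a sketch under
assumptions: struct pm_state, enum pmm_mode and set_tagged_addr_model()
are hypothetical, and the mode values are not the hardware field
encodings.

```c
#define TAGGED_ADDR_ENABLE	(1UL << 0)	/* same value as PR_TAGGED_ADDR_ENABLE */

/* Illustrative modes only. */
enum pmm_mode { PMM_OFF, PMM_ON };

struct pm_state {
	unsigned int pmlen;	/* software view */
	enum pmm_mode pmm;	/* what would be programmed into the CPU */
};

static void set_tagged_addr_model(struct pm_state *st, unsigned long arg,
				  unsigned int requested_pmlen)
{
	unsigned int pmlen = requested_pmlen;
	enum pmm_mode pmm = requested_pmlen ? PMM_ON : PMM_OFF;

	if (!(arg & TAGGED_ADDR_ENABLE)) {
		pmlen = 0;
		pmm = PMM_OFF;	/* the reset that keeps both views in sync */
	}

	st->pmlen = pmlen;
	st->pmm = pmm;
}
```

Omitting the pmm reset in this model leaves pmlen at 0 while pmm stays on,
which is the mismatch described above.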

Fixes: 2e1743085887 ("riscv: Add support for the tagged address ABI")
Signed-off-by: Zishun Yi <vulab@iscas.ac.cn>
Reviewed-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://patch.msgid.link/20260322160022.21908-1-vulab@iscas.ac.cn
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2 weeks agoriscv: make runtime const not usable by modules
Jisheng Zhang [Sat, 21 Feb 2026 02:37:31 +0000 (10:37 +0800)] 
riscv: make runtime const not usable by modules

Similar to what commit 284922f4c563 ("x86: uaccess: don't use runtime-const
rewriting in modules") does, make riscv's runtime const not usable by
modules too, to "make sure this doesn't get forgotten the next time
somebody wants to do runtime constant optimizations". The reason is
well explained in the above commit: "The runtime-const infrastructure
was never designed to handle the modular case, because the constant
fixup is only done at boot time for core kernel code."

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Link: https://patch.msgid.link/20260221023731.3476-1-jszhang@kernel.org
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2 weeks agoriscv: patch: Avoid early phys_to_page()
Vivian Wang [Mon, 23 Mar 2026 23:43:47 +0000 (17:43 -0600)] 
riscv: patch: Avoid early phys_to_page()

Similarly to commit 8d09e2d569f6 ("arm64: patching: avoid early
page_to_phys()"), avoid using phys_to_page() for the kernel address case
in patch_map().

Since this is called from apply_boot_alternatives() in setup_arch(), and
commit 4267739cabb8 ("arch, mm: consolidate initialization of SPARSE
memory model") has moved sparse_init() to after setup_arch(),
phys_to_page() is not available there yet, and it panics on boot with
SPARSEMEM on RV32, which does not use SPARSEMEM_VMEMMAP.

Reported-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Closes: https://lore.kernel.org/r/20260223144108-dcace0b9-02e8-4b67-a7ce-f263bed36f26@linutronix.de/
Fixes: 4267739cabb8 ("arch, mm: consolidate initialization of SPARSE memory model")
Suggested-by: Mike Rapoport <rppt@kernel.org>
Signed-off-by: Vivian Wang <wangruikang@iscas.ac.cn>
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Link: https://patch.msgid.link/20260310-riscv-sparsemem-alternatives-fix-v1-1-659d5dd257e2@iscas.ac.cn
[pjw@kernel.org: fix the subject line to align with the patch description]
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2 weeks agoriscv: kgdb: fix several debug register assignment bugs
Paul Walmsley [Mon, 23 Mar 2026 23:43:47 +0000 (17:43 -0600)] 
riscv: kgdb: fix several debug register assignment bugs

Fix several bugs in the RISC-V kgdb implementation:

- The element of dbg_reg_def[] that is supposed to pertain to the S1
  register embeds instead the struct pt_regs offset of the A1
  register.  Fix this to use the S1 register offset in struct pt_regs.

- The sleeping_thread_to_gdb_regs() function copies the value of the
  S10 register into the gdb_regs[] array element meant for the S9
  register, and copies the value of the S11 register into the array
  element meant for the S10 register.  It also neglects to copy the
  value of the S11 register.  Fix all of these issues.
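
The second bullet can be illustrated with a minimal userspace sketch.
The struct, the function names and the index constants below are
hypothetical stand-ins chosen for illustration; they are not the
kernel's definitions, and the real code differs in detail.

```c
/* Hypothetical, trimmed-down stand-in for a sleeping task's saved context. */
struct demo_ctx {
	unsigned long s[12];	/* callee-saved registers s0..s11 */
};

/* Illustrative gdb_regs[] slots; assumed values, not taken from the kernel. */
enum { DEMO_S9 = 25, DEMO_S10 = 26, DEMO_S11 = 27, DEMO_NREGS = 33 };

/* Shape of the bug: values land one slot too low and s11 is never stored. */
static void demo_copy_buggy(unsigned long *gdb_regs, const struct demo_ctx *ctx)
{
	gdb_regs[DEMO_S9]  = ctx->s[10];
	gdb_regs[DEMO_S10] = ctx->s[11];
}

/* Shape of the fix: each register goes to its own slot, including s11. */
static void demo_copy_fixed(unsigned long *gdb_regs, const struct demo_ctx *ctx)
{
	gdb_regs[DEMO_S9]  = ctx->s[9];
	gdb_regs[DEMO_S10] = ctx->s[10];
	gdb_regs[DEMO_S11] = ctx->s[11];
}
```

With the buggy variant a debugger would display the S10 value under S9
and leave the S11 slot at whatever it held before.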

Fixes: fe89bd2be8667 ("riscv: Add KGDB support")
Cc: Vincent Chen <vincent.chen@sifive.com>
Link: https://patch.msgid.link/fde376f8-bcfd-bfe4-e467-07d8f7608d05@kernel.org
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2 weeks agoDrivers: hv: Move add_interrupt_randomness() to hypervisor callback sysvec
Michael Kelley [Thu, 2 Apr 2026 20:24:00 +0000 (13:24 -0700)] 
Drivers: hv: Move add_interrupt_randomness() to hypervisor callback sysvec

The Hyper-V ISRs, for normal guests and when running in the hypervisor root
partition, are calling add_interrupt_randomness() as a primary source of
entropy. The call is currently in the ISRs as a common place to handle both
x86/x64 and arm64.

On x86/x64, hypervisor interrupts come through a custom sysvec entry, and
do not go through a generic interrupt handler.

On arm64, hypervisor interrupts come through an emulated GICv3. GICv3 uses
the generic handler handle_percpu_devid_irq(), which does not do
add_interrupt_randomness() -- unlike its counterpart
handle_percpu_irq().

But handle_percpu_devid_irq() has now been updated to call
add_interrupt_randomness(). So add_interrupt_randomness() is now needed
only in Hyper-V's x86/x64 custom sysvec path.

Move add_interrupt_randomness() from the Hyper-V ISRs into the Hyper-V
x86/x64 custom sysvec path, matching the existing STIMER0 sysvec path.

With this change, add_interrupt_randomness() is no longer called from any
device drivers, which is appropriate.

Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Acked-by: Wei Liu <wei.liu@kernel.org>
Link: https://patch.msgid.link/20260402202400.1707-3-mhklkml@zohomail.com
2 weeks agoMerge tag 'v7.0-rc6' into irq/core
Thomas Gleixner [Sat, 4 Apr 2026 18:59:34 +0000 (20:59 +0200)] 
Merge tag 'v7.0-rc6' into irq/core

to be able to merge the hyper-v patch related to randomness.

2 weeks agoMerge tag 'devfreq-next-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git...
Rafael J. Wysocki [Sat, 4 Apr 2026 18:58:54 +0000 (20:58 +0200)] 
Merge tag 'devfreq-next-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux

Pull devfreq updates for v7.1 from Chanwoo Choi:

"- Remove unneeded casting for HZ_PER_KHZ on devfreq.c

 - Use _visible attribute to replace create/remove_sysfs_files() to fix
   sysfs attribute race conditions on devfreq.c

 - Add support for Tegra114 activity monitor device on tegra30-devfreq.c"

* tag 'devfreq-next-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux:
  PM / devfreq: tegra30-devfreq: add support for Tegra114
  PM / devfreq: use _visible attribute to replace create/remove_sysfs_files()
  PM / devfreq: Remove unneeded casting for HZ_PER_KHZ

2 weeks agoMerge tag 'amd-pstate-v7.1-2026-04-02' of ssh://gitolite.kernel.org/pub/scm/linux...
Rafael J. Wysocki [Sat, 4 Apr 2026 18:55:56 +0000 (20:55 +0200)] 
Merge tag 'amd-pstate-v7.1-2026-04-02' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux

Pull amd-pstate new content for 7.1 (2026-04-02) from Mario Limonciello:

"Add support for new features:
  * CPPC performance priority
  * Dynamic EPP
  * Raw EPP
  * New unit tests for new features
 Fixes for:
  * PREEMPT_RT
  * sysfs files being present when HW missing
  * Broken/outdated documentation"

* tag 'amd-pstate-v7.1-2026-04-02' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux: (22 commits)
  MAINTAINERS: amd-pstate: Step down as maintainer, add Prateek as reviewer
  cpufreq: Pass the policy to cpufreq_driver->adjust_perf()
  cpufreq/amd-pstate: Pass the policy to amd_pstate_update()
  cpufreq/amd-pstate-ut: Add a unit test for raw EPP
  cpufreq/amd-pstate: Add support for raw EPP writes
  cpufreq/amd-pstate: Add support for platform profile class
  cpufreq/amd-pstate: add kernel command line to override dynamic epp
  cpufreq/amd-pstate: Add dynamic energy performance preference
  Documentation: amd-pstate: fix dead links in the reference section
  cpufreq/amd-pstate: Cache the max frequency in cpudata
  Documentation/amd-pstate: Add documentation for amd_pstate_floor_{freq,count}
  Documentation/amd-pstate: List amd_pstate_prefcore_ranking sysfs file
  Documentation/amd-pstate: List amd_pstate_hw_prefcore sysfs file
  amd-pstate-ut: Add a testcase to validate the visibility of driver attributes
  amd-pstate-ut: Add module parameter to select testcases
  amd-pstate: Introduce a tracepoint trace_amd_pstate_cppc_req2()
  amd-pstate: Add sysfs support for floor_freq and floor_count
  amd-pstate: Add support for CPPC_REQ2 and FLOOR_PERF
  x86/cpufeatures: Add AMD CPPC Performance Priority feature.
  amd-pstate: Make certain freq_attrs conditionally visible
  ...

2 weeks agocpuidle: Simplify cpuidle_register_device() with guard()
Huisong Li [Fri, 3 Apr 2026 08:45:42 +0000 (16:45 +0800)] 
cpuidle: Simplify cpuidle_register_device() with guard()

Use the guard() macro for the mutex to simplify the control flow in
cpuidle_register_device().
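
As a rough userspace illustration of why a scope-based guard simplifies
control flow, the sketch below builds a toy guard on the GCC/Clang
cleanup attribute, the same compiler feature the kernel's guard() relies
on. The toy lock, DEMO_GUARD() and demo_register() are hypothetical
names; this is not the cpuidle code.

```c
/* Toy lock for illustration only; not a real mutex. */
struct demo_mutex { int locked; };

static void demo_mutex_lock(struct demo_mutex *m)   { m->locked = 1; }
static void demo_mutex_unlock(struct demo_mutex *m) { m->locked = 0; }

/* Called automatically when the guard variable goes out of scope. */
static void demo_guard_release(struct demo_mutex **m)
{
	demo_mutex_unlock(*m);
}

/* Hypothetical stand-in for guard(mutex)(&lock): lock now, unlock at scope exit. */
#define DEMO_GUARD(m) \
	struct demo_mutex *demo_guard_ \
		__attribute__((cleanup(demo_guard_release), unused)) = \
		(demo_mutex_lock(m), (m))

static struct demo_mutex demo_lock;
static int demo_registered;

static int demo_register(int dev_id)
{
	DEMO_GUARD(&demo_lock);

	if (dev_id < 0)
		return -1;	/* no unlock label needed */
	if (demo_registered)
		return -2;	/* lock is dropped here as well */

	demo_registered = 1;
	return 0;
}
```

Every return path releases the lock without a goto-based unwind, which
is the kind of simplification a guard provides.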

Signed-off-by: Huisong Li <lihuisong@huawei.com>
Link: https://patch.msgid.link/20260403084542.708104-1-lihuisong@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2 weeks agoACPI: processor: idle: Fix NULL pointer dereference in hotplug path
Huisong Li [Fri, 3 Apr 2026 09:02:53 +0000 (17:02 +0800)] 
ACPI: processor: idle: Fix NULL pointer dereference in hotplug path

A cpuidle_device might fail to register during boot, but the system can
continue to run. In such cases, acpi_processor_hotplug() can trigger
a NULL pointer dereference when accessing the per-cpu acpi_cpuidle_device.

So add a NULL pointer check for the per-cpu acpi_cpuidle_device in
acpi_processor_hotplug().

Signed-off-by: Huisong Li <lihuisong@huawei.com>
Link: https://patch.msgid.link/20260403090253.998322-1-lihuisong@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2 weeks agobus: fsl-mc: use generic driver_override infrastructure
Danilo Krummrich [Tue, 24 Mar 2026 00:59:06 +0000 (01:59 +0100)] 
bus: fsl-mc: use generic driver_override infrastructure

When a driver is probed through __driver_attach(), the bus' match()
callback is called without the device lock held, thus accessing the
driver_override field without a lock, which can cause a UAF.

Fix this by using the driver-core driver_override infrastructure taking
care of proper locking internally.

Note that calling match() from __driver_attach() without the device lock
held is intentional. [1]

Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Acked-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Acked-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org>
Link: https://lore.kernel.org/driver-core/DGRGTIRHA62X.3RY09D9SOK77P@kernel.org/ [1]
Reported-by: Gui-Dong Han <hanguidong02@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220789
Fixes: 1f86a00c1159 ("bus/fsl-mc: add support for 'driver_override' in the mc-bus")
Link: https://patch.msgid.link/20260324005919.2408620-3-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2 weeks agoACPI: processor: idle: Reset power_setup_done flag on initialization failure
Huisong Li [Fri, 3 Apr 2026 08:53:43 +0000 (16:53 +0800)] 
ACPI: processor: idle: Reset power_setup_done flag on initialization failure

The 'power_setup_done' flag is a key indicator used across the ACPI
processor driver to determine whether cpuidle is properly configured and
available for a given CPU.

Currently, this flag is set during the early stages of initialization.
However, if the subsequent registration of the cpuidle driver in
acpi_processor_register_idle_driver() or the per-CPU device registration
in acpi_processor_power_init() fails, this flag remains set. This may
lead to issues when other functions in the ACPI idle driver check this
flag.

Fix this by explicitly resetting this flag to 0 in these error paths.
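
The error-path pattern can be sketched as follows. The struct and both
functions are hypothetical userspace stand-ins and do not reproduce the
ACPI processor driver; only the idea of clearing an early-set flag on
failure is taken from the description above.

```c
/* Hypothetical stand-in for the per-CPU processor state. */
struct demo_proc {
	int power_setup_done;
};

/* Pretend registration step that can be forced to fail. */
static int demo_register_device(int should_fail)
{
	return should_fail ? -1 : 0;
}

static int demo_power_init(struct demo_proc *pr, int should_fail)
{
	int ret;

	pr->power_setup_done = 1;	/* set early, as described above */

	ret = demo_register_device(should_fail);
	if (ret) {
		/* Without this reset the flag would claim a working setup. */
		pr->power_setup_done = 0;
		return ret;
	}

	return 0;
}
```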

Signed-off-by: Huisong Li <lihuisong@huawei.com>
Link: https://patch.msgid.link/20260403085343.866440-1-lihuisong@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2 weeks agoACPI: TAD: Add alarm support to the RTC class device interface
Rafael J. Wysocki [Tue, 31 Mar 2026 19:38:52 +0000 (21:38 +0200)] 
ACPI: TAD: Add alarm support to the RTC class device interface

Add alarm support, based on Section 9.17 of ACPI 6.6 [1], to the RTC
class device interface of the driver.

The ACPI time and alarm device (TAD) can support two separate alarm
timers, one for waking up the system when it is on AC power, and one
for waking it up when it is on DC power.  In principle, each of them
can be set to a different value representing the number of seconds
till the given alarm timer expires.

However, the RTC class device can only set one alarm, so it will set
both the alarm timers of the ACPI TAD (if the DC one is supported) to
the same value.  That is somewhat cumbersome because there is no way in
the ACPI TAD firmware interface to set both timers in one go, so they
need to be set sequentially, but that's how it goes.

On the alarm read side, the driver assumes that both timers have been
set to the same value, so it is sufficient to access one of them (the
AC one specifically).
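
A minimal sketch of that set-both, read-one policy, assuming a
hypothetical in-memory model of the two timers (none of these names
come from the driver):

```c
enum { DEMO_AC_TIMER, DEMO_DC_TIMER, DEMO_NR_TIMERS };

/* Hypothetical model of a TAD with an optional DC timer. */
struct demo_tad {
	int dc_supported;
	unsigned int timer[DEMO_NR_TIMERS];	/* seconds until expiry */
};

/* Stand-in for one firmware call; only one timer can be set per call. */
static int demo_timer_set(struct demo_tad *tad, int id, unsigned int secs)
{
	if (id == DEMO_DC_TIMER && !tad->dc_supported)
		return -1;

	tad->timer[id] = secs;
	return 0;
}

/* Set the AC timer, then the DC timer (if present), to the same value. */
static int demo_set_alarm(struct demo_tad *tad, unsigned int secs)
{
	int ret = demo_timer_set(tad, DEMO_AC_TIMER, secs);

	if (ret)
		return ret;

	if (tad->dc_supported)
		ret = demo_timer_set(tad, DEMO_DC_TIMER, secs);

	return ret;
}

/* Reading assumes both timers match, so the AC one is enough. */
static unsigned int demo_read_alarm(const struct demo_tad *tad)
{
	return tad->timer[DEMO_AC_TIMER];
}
```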

Link: https://uefi.org/specs/ACPI/6.6/09_ACPI_Defined_Devices_and_Device_Specific_Objects.html#time-and-alarm-device [1]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://patch.msgid.link/2076980.usQuhbGJ8B@rafael.j.wysocki
2 weeks agoACPI: TAD: Split acpi_tad_rtc_read_time()
Rafael J. Wysocki [Tue, 31 Mar 2026 19:26:23 +0000 (21:26 +0200)] 
ACPI: TAD: Split acpi_tad_rtc_read_time()

Move the code converting a struct acpi_tad_rt into a struct rtc_time
from acpi_tad_rtc_read_time() into a new function, acpi_tad_rt_to_tm(),
to facilitate adding alarm support to the driver's RTC class device
interface going forward.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
[ rjw: Subject and changelog edits ]
Link: https://patch.msgid.link/9619488.CDJkKcVGEf@rafael.j.wysocki
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2 weeks agoACPI: TAD: Relocate two functions
Rafael J. Wysocki [Tue, 31 Mar 2026 19:25:40 +0000 (21:25 +0200)] 
ACPI: TAD: Relocate two functions

Move two functions introduced previously, __acpi_tad_wake_set() and
__acpi_tad_wake_read(), to the part of the code preceding the sysfs
interface implementation, since subsequently they will be used by
the RTC device interface too.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://patch.msgid.link/3960639.kQq0lBPeGt@rafael.j.wysocki
2 weeks agoACPI: TAD: Split three functions to untangle runtime PM handling
Rafael J. Wysocki [Tue, 31 Mar 2026 19:24:44 +0000 (21:24 +0200)] 
ACPI: TAD: Split three functions to untangle runtime PM handling

Move the core functionality of acpi_tad_get_real_time(),
acpi_tad_wake_set(), and acpi_tad_wake_read() into separate functions
called __acpi_tad_get_real_time(), __acpi_tad_wake_set(), and
__acpi_tad_wake_read(), respectively, which can be called from
code blocks following a single runtime resume of the device.
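
The shape of such a split can be sketched in userspace as below. The
device model, the usage counter and all demo_*() names are hypothetical;
the "_core" suffix stands in for the double-underscore prefix used in
the kernel, which is reserved in ordinary C programs.

```c
/* Hypothetical device with a toy runtime-PM usage counter. */
struct demo_dev {
	int usage;		/* active users */
	int resume_count;	/* number of actual resumes */
	unsigned int now;	/* current time in seconds */
	unsigned int wake;	/* programmed wake time */
};

static void demo_pm_get(struct demo_dev *d)
{
	if (d->usage++ == 0)
		d->resume_count++;
}

static void demo_pm_put(struct demo_dev *d)
{
	d->usage--;
}

/* Core helpers: the caller must already hold a runtime-PM reference. */
static unsigned int demo_get_time_core(struct demo_dev *d)
{
	return d->now;
}

static void demo_wake_set_core(struct demo_dev *d, unsigned int when)
{
	d->wake = when;
}

/* Thin wrapper keeping the old behavior: resume, do one thing, release. */
static unsigned int demo_get_time(struct demo_dev *d)
{
	unsigned int t;

	demo_pm_get(d);
	t = demo_get_time_core(d);
	demo_pm_put(d);
	return t;
}

/* A combined operation can now run both steps under a single resume. */
static void demo_set_alarm_in(struct demo_dev *d, unsigned int secs)
{
	demo_pm_get(d);
	demo_wake_set_core(d, demo_get_time_core(d) + secs);
	demo_pm_put(d);
}
```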

This will facilitate adding alarm support to the RTC class device
interface of the driver going forward.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://patch.msgid.link/23076728.EfDdHjke4D@rafael.j.wysocki
2 weeks agoACPI: processor: Rearrange and clean up acpi_processor_errata_piix4()
Rafael J. Wysocki [Tue, 31 Mar 2026 19:09:42 +0000 (21:09 +0200)] 
ACPI: processor: Rearrange and clean up acpi_processor_errata_piix4()

In acpi_processor_errata_piix4() it is not necessary to use three
struct pci_dev pointers.  One is sufficient, so use it everywhere and
drop the other two.

Additionally, define the auxiliary local variables value1 and value2
in the code block in which they are used.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/2846888.mvXUDI8C0e@rafael.j.wysocki