+++ /dev/null
-From c5eef0e2402ded238d0a8469208de6080139fd9e Mon Sep 17 00:00:00 2001
-From: Sasha Levin <sashal@kernel.org>
-Date: Tue, 20 May 2025 10:39:29 +0800
-Subject: crypto: powerpc/poly1305 - add depends on BROKEN for now
-
-From: Eric Biggers <ebiggers@google.com>
-
-[ Upstream commit bc8169003b41e89fe7052e408cf9fdbecb4017fe ]
-
-As discussed in the thread containing
-https://lore.kernel.org/linux-crypto/20250510053308.GB505731@sol/, the
-Power10-optimized Poly1305 code is currently not safe to call in softirq
-context. Disable it for now. It can be re-enabled once it is fixed.
-
-Fixes: ba8f8624fde2 ("crypto: poly1305-p10 - Glue code for optmized Poly1305 implementation for ppc64le")
-Cc: stable@vger.kernel.org
-Signed-off-by: Eric Biggers <ebiggers@google.com>
-Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-Signed-off-by: Sasha Levin <sashal@kernel.org>
----
- arch/powerpc/lib/crypto/Kconfig | 22 ++++++++++++++++++++++
- 1 file changed, 22 insertions(+)
- create mode 100644 arch/powerpc/lib/crypto/Kconfig
-
-diff --git a/arch/powerpc/lib/crypto/Kconfig b/arch/powerpc/lib/crypto/Kconfig
-new file mode 100644
-index 0000000000000..3f9e1bbd9905b
---- /dev/null
-+++ b/arch/powerpc/lib/crypto/Kconfig
-@@ -0,0 +1,22 @@
-+# SPDX-License-Identifier: GPL-2.0-only
-+
-+config CRYPTO_CHACHA20_P10
-+ tristate
-+ depends on PPC64 && CPU_LITTLE_ENDIAN && VSX
-+ default CRYPTO_LIB_CHACHA
-+ select CRYPTO_LIB_CHACHA_GENERIC
-+ select CRYPTO_ARCH_HAVE_LIB_CHACHA
-+
-+config CRYPTO_POLY1305_P10
-+ tristate
-+ depends on PPC64 && CPU_LITTLE_ENDIAN && VSX
-+ depends on BROKEN # Needs to be fixed to work in softirq context
-+ default CRYPTO_LIB_POLY1305
-+ select CRYPTO_ARCH_HAVE_LIB_POLY1305
-+ select CRYPTO_LIB_POLY1305_GENERIC
-+
-+config CRYPTO_SHA256_PPC_SPE
-+ tristate
-+ depends on SPE
-+ default CRYPTO_LIB_SHA256
-+ select CRYPTO_ARCH_HAVE_LIB_SHA256
---
-2.39.5
-
--- /dev/null
+From 0ea148a799198518d8ebab63ddd0bb6114a103bc Mon Sep 17 00:00:00 2001
+From: Kairui Song <kasong@tencent.com>
+Date: Wed, 4 Jun 2025 23:10:38 +0800
+Subject: mm: userfaultfd: fix race of userfaultfd_move and swap cache
+
+From: Kairui Song <kasong@tencent.com>
+
+commit 0ea148a799198518d8ebab63ddd0bb6114a103bc upstream.
+
+This commit fixes two kinds of races; they may have different results:
+
+Barry reported a BUG_ON in commit c50f8e6053b0. We may see the same
+BUG_ON if the filemap lookup returns NULL and a folio is added to the
+swap cache after that lookup.
+
+If the other kind of race is triggered (the folio changed after the
+lookup), we may see the RSS counter get corrupted:
+
+[ 406.893936] BUG: Bad rss-counter state mm:ffff0000c5a9ddc0
+type:MM_ANONPAGES val:-1
+[ 406.894071] BUG: Bad rss-counter state mm:ffff0000c5a9ddc0
+type:MM_SHMEMPAGES val:1
+
+This happens because the folio is being accounted to the wrong VMA.
+
+I'm not sure whether this causes any data corruption, though it seems
+not. The issues above are already critical.
+
+
+On seeing a swap entry PTE, userfaultfd_move does a lockless swap cache
+lookup, and tries to move the found folio to the faulting vma. Currently,
+it relies on checking the PTE value to ensure that the moved folio still
+belongs to the src swap entry and that no new folio has been added to the
+swap cache, which turns out to be unreliable.
+
+While working on and reviewing the swap table series with Barry, the
+following existing races were observed and reproduced [1]:
+
+In the example below, move_pages_pte is moving src_pte to dst_pte, where
+src_pte is a swap entry PTE holding swap entry S1, and S1 is not in the
+swap cache:
+
+CPU1                               CPU2
+userfaultfd_move
+  move_pages_pte()
+    entry = pte_to_swp_entry(orig_src_pte);
+    // Here it got entry = S1
+    ... < interrupted> ...
+                                   <swapin src_pte, alloc and use folio A>
+                                   // folio A is a newly allocated folio
+                                   // and gets installed into src_pte
+                                   <frees swap entry S1>
+                                   // src_pte now points to folio A, S1
+                                   // has swap count == 0, it can be freed
+                                   // by folio_free_swap or the swap
+                                   // allocator's reclaim.
+                                   <try to swap out another folio B>
+                                   // folio B is a folio in another VMA.
+                                   <put folio B to swap cache using S1>
+                                   // S1 is freed, folio B can use it
+                                   // for swap out with no problem.
+                                   ...
+    folio = filemap_get_folio(S1)
+    // Got folio B here !!!
+    ... < interrupted again> ...
+                                   <swapin folio B and free S1>
+                                   // Now S1 is free to be used again.
+                                   <swapout src_pte & folio A using S1>
+                                   // Now src_pte is a swap entry PTE
+                                   // holding S1 again.
+    folio_trylock(folio)
+    move_swap_pte
+      double_pt_lock
+      is_pte_pages_stable
+      // Check passed because src_pte == S1
+      folio_move_anon_rmap(...)
+      // Moved invalid folio B here !!!
+
+The race window is very short and requires multiple rare events to
+collide, so it's very unlikely to happen in practice, but with a
+deliberately constructed reproducer and an enlarged time window it can
+be reproduced easily.
+
+This can be fixed by checking if the folio returned by filemap is the
+valid swap cache folio after acquiring the folio lock.
+
+Another similar race is possible: filemap_get_folio may return NULL, but
+folio (A) could be swapped in and then swapped out again using the same
+swap entry after the lookup. In such a case, folio (A) may remain in the
+swap cache, so it must be moved too:
+
+CPU1                               CPU2
+userfaultfd_move
+  move_pages_pte()
+    entry = pte_to_swp_entry(orig_src_pte);
+    // Here it got entry = S1, and S1 is not in swap cache
+    folio = filemap_get_folio(S1)
+    // Got NULL
+    ... < interrupted again> ...
+                                   <swapin folio A and free S1>
+                                   <swapout folio A re-using S1>
+    move_swap_pte
+      double_pt_lock
+      is_pte_pages_stable
+      // Check passed because src_pte == S1
+      folio_move_anon_rmap(...)
+      // folio A is ignored !!!
+
+Fix this by checking the swap cache again after acquiring the src_pte
+lock. And to avoid the filemap overhead, we check swap_map directly [2].
+
+The SWP_SYNCHRONOUS_IO path does make the problem more complex, but so far
+we don't need to worry about that, since folios can only be exposed to the
+swap cache in the swap out path, and this is covered in this patch by
+checking the swap cache again after acquiring the src_pte lock.
+
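+A condensed sketch of the two checks described above, as they land in
+move_swap_pte() in the hunks below (the PTL lock/unlock calls and the
+existing pte_same() check are elided here):
+
+	/*
+	 * A folio was found by the lockless lookup: after folio_trylock(),
+	 * verify it is still the swap cache folio backing entry S1.
+	 */
+	if (src_folio && (!folio_test_swapcache(src_folio) ||
+			  src_folio->swap.val != entry.val))
+		return -EAGAIN;
+
+	/*
+	 * No folio was found by the lookup: once the src_pte lock is held,
+	 * check swap_map directly so a folio that entered the swap cache
+	 * in the meantime is not missed.
+	 */
+	if (!src_folio &&
+	    (READ_ONCE(si->swap_map[swp_offset(entry)]) & SWAP_HAS_CACHE))
+		return -EAGAIN;
+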
+Testing with a simple C program that allocates and moves several GB of
+memory did not show any observable performance change.
+
+Link: https://lkml.kernel.org/r/20250604151038.21968-1-ryncsn@gmail.com
+Fixes: adef440691ba ("userfaultfd: UFFDIO_MOVE uABI")
+Signed-off-by: Kairui Song <kasong@tencent.com>
+Closes: https://lore.kernel.org/linux-mm/CAMgjq7B1K=6OOrK2OUZ0-tqCzi+EJt+2_K97TPGoSt=9+JwP7Q@mail.gmail.com/ [1]
+Link: https://lore.kernel.org/all/CAGsJ_4yJhJBo16XhiC-nUzSheyX-V3-nFE+tAi=8Y560K8eT=A@mail.gmail.com/ [2]
+Reviewed-by: Lokesh Gidra <lokeshgidra@google.com>
+Acked-by: Peter Xu <peterx@redhat.com>
+Reviewed-by: Suren Baghdasaryan <surenb@google.com>
+Reviewed-by: Barry Song <baohua@kernel.org>
+Reviewed-by: Chris Li <chrisl@kernel.org>
+Cc: Andrea Arcangeli <aarcange@redhat.com>
+Cc: David Hildenbrand <david@redhat.com>
+Cc: Kairui Song <kasong@tencent.com>
+Cc: <stable@vger.kernel.org>
+Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
+(cherry picked from commit 0ea148a799198518d8ebab63ddd0bb6114a103bc)
+[ lokeshgidra: resolved merge conflict caused by the difference in
+  move_swap_pte() arguments ]
+Signed-off-by: Lokesh Gidra <lokeshgidra@google.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ mm/userfaultfd.c | 33 +++++++++++++++++++++++++++++++--
+ 1 file changed, 31 insertions(+), 2 deletions(-)
+
+--- a/mm/userfaultfd.c
++++ b/mm/userfaultfd.c
+@@ -1078,8 +1078,18 @@ static int move_swap_pte(struct mm_struc
+ pte_t *dst_pte, pte_t *src_pte,
+ pte_t orig_dst_pte, pte_t orig_src_pte,
+ spinlock_t *dst_ptl, spinlock_t *src_ptl,
+- struct folio *src_folio)
++ struct folio *src_folio,
++ struct swap_info_struct *si, swp_entry_t entry)
+ {
++ /*
++ * Check if the folio still belongs to the target swap entry after
++ * acquiring the lock. Folio can be freed in the swap cache while
++ * not locked.
++ */
++ if (src_folio && unlikely(!folio_test_swapcache(src_folio) ||
++ entry.val != src_folio->swap.val))
++ return -EAGAIN;
++
+ double_pt_lock(dst_ptl, src_ptl);
+
+ if (!pte_same(ptep_get(src_pte), orig_src_pte) ||
+@@ -1096,6 +1106,25 @@ static int move_swap_pte(struct mm_struc
+ if (src_folio) {
+ folio_move_anon_rmap(src_folio, dst_vma);
+ src_folio->index = linear_page_index(dst_vma, dst_addr);
++ } else {
++ /*
++ * Check if the swap entry is cached after acquiring the src_pte
++ * lock. Otherwise, we might miss a newly loaded swap cache folio.
++ *
++ * Check swap_map directly to minimize overhead, READ_ONCE is sufficient.
++ * We are trying to catch newly added swap cache, the only possible case is
++ * when a folio is swapped in and out again staying in swap cache, using the
++ * same entry before the PTE check above. The PTL is acquired and released
++ * twice, each time after updating the swap_map's flag. So holding
++ * the PTL here ensures we see the updated value. False positive is possible,
++ * e.g. SWP_SYNCHRONOUS_IO swapin may set the flag without touching the
++ * cache, or during the tiny synchronization window between swap cache and
++ * swap_map, but it will be gone very quickly, worst result is retry jitters.
++ */
++ if (READ_ONCE(si->swap_map[swp_offset(entry)]) & SWAP_HAS_CACHE) {
++ double_pt_unlock(dst_ptl, src_ptl);
++ return -EAGAIN;
++ }
+ }
+
+ orig_src_pte = ptep_get_and_clear(mm, src_addr, src_pte);
+@@ -1391,7 +1420,7 @@ retry:
+ }
+ err = move_swap_pte(mm, dst_vma, dst_addr, src_addr, dst_pte, src_pte,
+ orig_dst_pte, orig_src_pte,
+- dst_ptl, src_ptl, src_folio);
++ dst_ptl, src_ptl, src_folio, si, entry);
+ }
+
+ out:
--- /dev/null
+From 5c5f0468d172ddec2e333d738d2a1f85402cf0bc Mon Sep 17 00:00:00 2001
+From: Jeongjun Park <aha310510@gmail.com>
+Date: Fri, 9 May 2025 01:56:20 +0900
+Subject: mm/vmalloc: fix data race in show_numa_info()
+
+From: Jeongjun Park <aha310510@gmail.com>
+
+commit 5c5f0468d172ddec2e333d738d2a1f85402cf0bc upstream.
+
+The following data-race was found in show_numa_info():
+
+==================================================================
+BUG: KCSAN: data-race in vmalloc_info_show / vmalloc_info_show
+
+read to 0xffff88800971fe30 of 4 bytes by task 8289 on cpu 0:
+ show_numa_info mm/vmalloc.c:4936 [inline]
+ vmalloc_info_show+0x5a8/0x7e0 mm/vmalloc.c:5016
+ seq_read_iter+0x373/0xb40 fs/seq_file.c:230
+ proc_reg_read_iter+0x11e/0x170 fs/proc/inode.c:299
+....
+
+write to 0xffff88800971fe30 of 4 bytes by task 8287 on cpu 1:
+ show_numa_info mm/vmalloc.c:4934 [inline]
+ vmalloc_info_show+0x38f/0x7e0 mm/vmalloc.c:5016
+ seq_read_iter+0x373/0xb40 fs/seq_file.c:230
+ proc_reg_read_iter+0x11e/0x170 fs/proc/inode.c:299
+....
+
+value changed: 0x0000008f -> 0x00000000
+==================================================================
+
+According to this report, there is a read/write data-race because
+m->private is accessible to multiple CPUs. To fix this, instead of
+allocating the buffer in proc_vmalloc_init() and passing its address to
+m->private, vmalloc_info_show() should allocate the buffer itself.
+
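+Roughly, the fix takes the following shape (a sketch only; it mirrors
+the diff below and omits the walk over the vmap nodes):
+
+	static int vmalloc_info_show(struct seq_file *m, void *p)
+	{
+		unsigned int *counters = NULL;
+
+		if (IS_ENABLED(CONFIG_NUMA))
+			counters = kmalloc(nr_node_ids * sizeof(unsigned int),
+					   GFP_KERNEL);
+
+		/* ... walk the vmap nodes and their busy lists, printing each
+		 * area and calling show_numa_info(m, v, counters) ... */
+
+		if (IS_ENABLED(CONFIG_NUMA))
+			kfree(counters);
+		return 0;
+	}
+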
+Link: https://lkml.kernel.org/r/20250508165620.15321-1-aha310510@gmail.com
+Fixes: 8e1d743f2c26 ("mm: vmalloc: support multiple nodes in vmallocinfo")
+Signed-off-by: Jeongjun Park <aha310510@gmail.com>
+Suggested-by: Eric Dumazet <edumazet@google.com>
+Suggested-by: Andrew Morton <akpm@linux-foundation.org>
+Reviewed-by: "Uladzislau Rezki (Sony)" <urezki@gmail.com>
+Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ mm/vmalloc.c | 63 ++++++++++++++++++++++++++++++++---------------------------
+ 1 file changed, 35 insertions(+), 28 deletions(-)
+
+--- a/mm/vmalloc.c
++++ b/mm/vmalloc.c
+@@ -3095,7 +3095,7 @@ static void clear_vm_uninitialized_flag(
+ /*
+ * Before removing VM_UNINITIALIZED,
+ * we should make sure that vm has proper values.
+- * Pair with smp_rmb() in show_numa_info().
++ * Pair with smp_rmb() in vread_iter() and vmalloc_info_show().
+ */
+ smp_wmb();
+ vm->flags &= ~VM_UNINITIALIZED;
+@@ -4938,28 +4938,29 @@ bool vmalloc_dump_obj(void *object)
+ #endif
+
+ #ifdef CONFIG_PROC_FS
+-static void show_numa_info(struct seq_file *m, struct vm_struct *v)
+-{
+- if (IS_ENABLED(CONFIG_NUMA)) {
+- unsigned int nr, *counters = m->private;
+- unsigned int step = 1U << vm_area_page_order(v);
+
+- if (!counters)
+- return;
++/*
++ * Print number of pages allocated on each memory node.
++ *
++ * This function can only be called if CONFIG_NUMA is enabled
++ * and VM_UNINITIALIZED bit in v->flags is disabled.
++ */
++static void show_numa_info(struct seq_file *m, struct vm_struct *v,
++ unsigned int *counters)
++{
++ unsigned int nr;
++ unsigned int step = 1U << vm_area_page_order(v);
+
+- if (v->flags & VM_UNINITIALIZED)
+- return;
+- /* Pair with smp_wmb() in clear_vm_uninitialized_flag() */
+- smp_rmb();
++ if (!counters)
++ return;
+
+- memset(counters, 0, nr_node_ids * sizeof(unsigned int));
++ memset(counters, 0, nr_node_ids * sizeof(unsigned int));
+
+- for (nr = 0; nr < v->nr_pages; nr += step)
+- counters[page_to_nid(v->pages[nr])] += step;
+- for_each_node_state(nr, N_HIGH_MEMORY)
+- if (counters[nr])
+- seq_printf(m, " N%u=%u", nr, counters[nr]);
+- }
++ for (nr = 0; nr < v->nr_pages; nr += step)
++ counters[page_to_nid(v->pages[nr])] += step;
++ for_each_node_state(nr, N_HIGH_MEMORY)
++ if (counters[nr])
++ seq_printf(m, " N%u=%u", nr, counters[nr]);
+ }
+
+ static void show_purge_info(struct seq_file *m)
+@@ -4987,6 +4988,10 @@ static int vmalloc_info_show(struct seq_
+ struct vmap_area *va;
+ struct vm_struct *v;
+ int i;
++ unsigned int *counters;
++
++ if (IS_ENABLED(CONFIG_NUMA))
++ counters = kmalloc(nr_node_ids * sizeof(unsigned int), GFP_KERNEL);
+
+ for (i = 0; i < nr_vmap_nodes; i++) {
+ vn = &vmap_nodes[i];
+@@ -5003,6 +5008,11 @@ static int vmalloc_info_show(struct seq_
+ }
+
+ v = va->vm;
++ if (v->flags & VM_UNINITIALIZED)
++ continue;
++
++ /* Pair with smp_wmb() in clear_vm_uninitialized_flag() */
++ smp_rmb();
+
+ seq_printf(m, "0x%pK-0x%pK %7ld",
+ v->addr, v->addr + v->size, v->size);
+@@ -5037,7 +5047,9 @@ static int vmalloc_info_show(struct seq_
+ if (is_vmalloc_addr(v->pages))
+ seq_puts(m, " vpages");
+
+- show_numa_info(m, v);
++ if (IS_ENABLED(CONFIG_NUMA))
++ show_numa_info(m, v, counters);
++
+ seq_putc(m, '\n');
+ }
+ spin_unlock(&vn->busy.lock);
+@@ -5047,19 +5059,14 @@ static int vmalloc_info_show(struct seq_
+ * As a final step, dump "unpurged" areas.
+ */
+ show_purge_info(m);
++ if (IS_ENABLED(CONFIG_NUMA))
++ kfree(counters);
+ return 0;
+ }
+
+ static int __init proc_vmalloc_init(void)
+ {
+- void *priv_data = NULL;
+-
+- if (IS_ENABLED(CONFIG_NUMA))
+- priv_data = kmalloc(nr_node_ids * sizeof(unsigned int), GFP_KERNEL);
+-
+- proc_create_single_data("vmallocinfo",
+- 0400, NULL, vmalloc_info_show, priv_data);
+-
++ proc_create_single("vmallocinfo", 0400, NULL, vmalloc_info_show);
+ return 0;
+ }
+ module_init(proc_vmalloc_init);
--- /dev/null
+From 93bd4a80efeb521314485a06d8c21157240497bb Mon Sep 17 00:00:00 2001
+From: Madhavan Srinivasan <maddy@linux.ibm.com>
+Date: Sun, 11 May 2025 09:41:11 +0530
+Subject: powerpc/kernel: Fix ppc_save_regs inclusion in build
+
+From: Madhavan Srinivasan <maddy@linux.ibm.com>
+
+commit 93bd4a80efeb521314485a06d8c21157240497bb upstream.
+
+A recent patch modified an old commit,
+fc2a5a6161a2 ("powerpc/64s: ppc_save_regs is now needed for all 64s builds"),
+so that ppc_save_regs.c is built only when XMON, KEXEC_CORE or
+PPC_BOOK3S is enabled. That was valid at the time, since
+ppc_save_regs() was called only from replay_system_reset() in the old
+irq.c, which was under BOOK3S.
+
+But irq.c has since been refactored several times, which added a call
+to ppc_save_regs() from __replay_soft_interrupts() ->
+replay_soft_interrupts(), part of irq_64.c and built under
+CONFIG_PPC64. And since ppc_save_regs() is called in the CRASH_DUMP
+path as part of crash_setup_regs() in kexec.h, CONFIG_PPC32 also
+needs it.
+
+So the recent patch, which made the building of ppc_save_regs.c
+conditional again, caused a build break when none of XMON, KEXEC_CORE
+or PPC_BOOK3S were enabled in the config.
+Build ppc_save_regs.c by default to fix this.
+
+Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
+Link: https://patch.msgid.link/20250511041111.841158-1-maddy@linux.ibm.com
+Cc: Guenter Roeck <linux@roeck-us.net>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ arch/powerpc/kernel/Makefile | 2 --
+ 1 file changed, 2 deletions(-)
+
+--- a/arch/powerpc/kernel/Makefile
++++ b/arch/powerpc/kernel/Makefile
+@@ -162,9 +162,7 @@ endif
+
+ obj64-$(CONFIG_PPC_TRANSACTIONAL_MEM) += tm.o
+
+-ifneq ($(CONFIG_XMON)$(CONFIG_KEXEC_CORE)$(CONFIG_PPC_BOOK3S),)
+ obj-y += ppc_save_regs.o
+-endif
+
+ obj-$(CONFIG_EPAPR_PARAVIRT) += epapr_paravirt.o epapr_hcalls.o
+ obj-$(CONFIG_KVM_GUEST) += kvm.o kvm_emul.o
drm-amdgpu-add-kicker-fws-loading-for-gfx11-smu13-ps.patch
drm-amd-display-add-more-checks-for-dsc-hubp-ono-gua.patch
arm64-dts-qcom-x1e80100-crd-mark-l12b-and-l15b-alway.patch
-crypto-powerpc-poly1305-add-depends-on-broken-for-no.patch
drm-amdgpu-mes-add-missing-locking-in-helper-functio.patch
sched_ext-make-scx_group_set_weight-always-update-tg.patch
scsi-lpfc-restore-clearing-of-nlp_unreg_inp-in-ndlp-.patch
platform-x86-think-lmi-fix-kobject-cleanup.patch
platform-x86-think-lmi-fix-sysfs-group-cleanup.patch
usb-typec-displayport-fix-potential-deadlock.patch
+powerpc-kernel-fix-ppc_save_regs-inclusion-in-build.patch
+mm-vmalloc-fix-data-race-in-show_numa_info.patch
+mm-userfaultfd-fix-race-of-userfaultfd_move-and-swap-cache.patch