git.ipfire.org Git - thirdparty/kernel/linux.git/commitdiff
mm: hugetlb: add huge page size param to set_huge_pte_at()
author    Ryan Roberts <ryan.roberts@arm.com>  Fri, 22 Sep 2023 11:58:03 +0000 (12:58 +0100)
committer Andrew Morton <akpm@linux-foundation.org>  Sat, 30 Sep 2023 00:20:47 +0000 (17:20 -0700)

Patch series "Fix set_huge_pte_at() panic on arm64", v2.

This series fixes a bug in arm64's implementation of set_huge_pte_at(),
which can result in an unprivileged user causing a kernel panic.  The
problem was triggered when running the new uffd poison mm selftest for
HUGETLB memory.  This test (and the uffd poison feature) was merged for
v6.5-rc7.

Ideally, I'd like to get this fix in for v6.6 and I've cc'ed stable
(correctly this time) to get it backported to v6.5, where the issue first
showed up.

Description of Bug
==================

arm64's huge pte implementation supports multiple huge page sizes, some of
which are implemented in the page table with multiple contiguous entries.
So set_huge_pte_at() needs to work out how big the logical pte is, so that
it can also work out how many physical ptes (or pmds) need to be written.
It previously did this by grabbing the folio out of the pte and querying
its size.
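
As a rough illustration of the size-to-entry mapping described above, here
is a user-space sketch. It assumes the arm64 geometry for a 4K base page (16
contiguous ptes or pmds per block); the SK_* constants and
sk_num_contig_entries() are hypothetical names for this example and are not
the kernel's own definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed geometry for a 4K base page; other granules differ. */
#define SK_PAGE_SIZE      (1UL << 12)                    /* 4K  */
#define SK_CONT_PTES      16
#define SK_CONT_PTE_SIZE  (SK_CONT_PTES * SK_PAGE_SIZE)  /* 64K */
#define SK_PMD_SIZE       (1UL << 21)                    /* 2M  */
#define SK_CONT_PMDS      16
#define SK_CONT_PMD_SIZE  (SK_CONT_PMDS * SK_PMD_SIZE)   /* 32M */
#define SK_PUD_SIZE       (1UL << 30)                    /* 1G  */

/*
 * Hypothetical helper: return how many page table entries back one
 * logical huge pte of size sz, and store the span of each entry in
 * *pgsize. Returns 0 for a size this sketch does not know about.
 */
int sk_num_contig_entries(unsigned long sz, size_t *pgsize)
{
    assert(pgsize != NULL);

    *pgsize = sz;

    switch (sz) {
    case SK_PUD_SIZE:
    case SK_PMD_SIZE:
        /* A single block entry covers the whole huge page. */
        return 1;
    case SK_CONT_PMD_SIZE:
        *pgsize = SK_PMD_SIZE;
        return SK_CONT_PMDS;
    case SK_CONT_PTE_SIZE:
        *pgsize = SK_PAGE_SIZE;
        return SK_CONT_PTES;
    default:
        return 0;
    }
}
```

The point of the sketch is that the entry count follows from the size
alone, so whoever writes the entries must know the size by some means.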

However, there are cases when the pte being set is actually a swap entry.
But this also used to work fine, because for huge ptes, we only ever saw
migration entries and hwpoison entries.  And both of these types of swap
entries have a PFN embedded, so the code would grab that and everything
still worked out.

But over time, more calls to set_huge_pte_at() have been added that set
swap entry types that do not embed a PFN.  And this causes the code to go
bang.  The triggering case is for the uffd poison test, commit
99aa77215ad0 ("selftests/mm: add uffd unit test for UFFDIO_POISON"), which
causes a PTE_MARKER_POISONED swap entry to be set, courtesy of commit
8a13897fb0da ("mm: userfaultfd: support UFFDIO_POISON for hugetlbfs") -
added in v6.5-rc7.  Although review shows that there are other call sites
that set PTE_MARKER_UFFD_WP (which also has no PFN), these don't trigger
on arm64 because arm64 doesn't support UFFD WP.

If CONFIG_DEBUG_VM is enabled, we do at least get a BUG(), but otherwise,
it will dereference a bad pointer in page_folio():

    static inline struct folio *hugetlb_swap_entry_to_folio(swp_entry_t entry)
    {
        VM_BUG_ON(!is_migration_entry(entry) && !is_hwpoison_entry(entry));

        return page_folio(pfn_to_page(swp_offset_pfn(entry)));
    }
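
To make the failure mode concrete, the following toy model (plain user-space
C, not kernel code) separates the entry kinds that embed a PFN from the
marker kinds that do not. Every sk_* name and the lookup table are
assumptions made for this sketch; where the real helper would BUG() or
dereference a bad pointer, the model simply reports failure.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the swap entry kinds named above. */
enum sk_entry_type {
    SK_MIGRATION,        /* payload is a PFN */
    SK_HWPOISON,         /* payload is a PFN */
    SK_MARKER_POISONED,  /* payload is marker bits, not a PFN */
    SK_MARKER_UFFD_WP,   /* payload is marker bits, not a PFN */
};

struct sk_swp_entry {
    enum sk_entry_type type;
    unsigned long payload;
};

/* Only migration and hwpoison entries carry a usable PFN. */
bool sk_entry_has_pfn(struct sk_swp_entry e)
{
    return e.type == SK_MIGRATION || e.type == SK_HWPOISON;
}

/*
 * Old-style size lookup: derive the huge page size from the PFN in the
 * entry. folio_size_by_pfn is a toy table indexed by PFN. Returns false
 * for entries without a valid PFN, which is the case the real code did
 * not handle.
 */
bool sk_size_from_entry(struct sk_swp_entry e,
                        const unsigned long *folio_size_by_pfn,
                        unsigned long nr_pfns, unsigned long *sz)
{
    assert(folio_size_by_pfn != 0 && sz != 0);

    if (!sk_entry_has_pfn(e) || e.payload >= nr_pfns)
        return false;

    *sz = folio_size_by_pfn[e.payload];
    return true;
}
```

In this model a PTE_MARKER_POISONED-style entry has no path to a size at
all, which is why the size has to come from somewhere other than the entry.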

Fix
===

The simplest fix would have been to revert the dodgy cleanup commit
18f3962953e4 ("mm: hugetlb: kill set_huge_swap_pte_at()"), but since
things have moved on, this would have required an audit of all the new
set_huge_pte_at() call sites to see if they should be converted to
set_huge_swap_pte_at().  As per the original intent of the change, it
would also leave us open to future bugs when people inevitably get it
wrong and call the wrong helper.

So instead, I've added a huge page size parameter to set_huge_pte_at().
This means that the arm64 code has the size in all cases.  It's a bigger
change, due to needing to touch the arches that implement the function,
but it is entirely mechanical, so in my view, low risk.

I've compile-tested all touched arches: arm64, parisc, powerpc, riscv,
s390, sparc (and additionally x86_64).  I've also booted and run the
mm selftests on arm64, where I observe that the uffd poison test is fixed
and there are no other regressions.

This patch (of 2):

In order to fix a bug, arm64 needs to be told the size of the huge page
for which the pte is being set in set_huge_pte_at().  Provide for this by
adding an `unsigned long sz` parameter to the function.  This follows the
same pattern as huge_pte_clear().
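
The calling convention this introduces can be sketched in user space as
follows. The sk3_* names are hypothetical, and the "page table" is just an
array with one slot per base page, which is a simplification of how
contiguous entries are really written. What the sketch shows is that the
caller derives the size from its hstate and passes it down, so a PFN-less
marker value is handled the same way as any other entry.

```c
#include <assert.h>
#include <stddef.h>

#define SK3_PAGE_SHIFT 12
#define SK3_PAGE_SIZE  (1UL << SK3_PAGE_SHIFT)

/* Toy hstate: the huge page size is the base page size << order. */
struct sk3_hstate {
    unsigned int order;
};

unsigned long sk3_huge_page_size(const struct sk3_hstate *h)
{
    assert(h != NULL);
    return SK3_PAGE_SIZE << h->order;
}

/*
 * Toy set_huge_pte_at() with an explicit size: write val into every
 * base-page slot covered by sz and return the number of slots written.
 * Nothing here inspects val to work out the size.
 */
size_t sk3_set_huge_pte_at(unsigned long *ptep, unsigned long val,
                           unsigned long sz)
{
    size_t n = sz >> SK3_PAGE_SHIFT;

    assert(ptep != NULL);

    for (size_t i = 0; i < n; i++)
        ptep[i] = val;

    return n;
}
```

A caller would compute sk3_huge_page_size(h) once and hand it to
sk3_set_huge_pte_at(), mirroring the huge_page_size(hstate_vma(vma))
pattern used at the call sites in the diff below.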

This commit makes the required interface modifications to the core mm as
well as all arches that implement this function (arm64, parisc, powerpc,
riscv, s390, sparc).  The actual arm64 bug will be fixed in a separate
commit.

No behavioral changes intended.

Link: https://lkml.kernel.org/r/20230922115804.2043771-1-ryan.roberts@arm.com
Link: https://lkml.kernel.org/r/20230922115804.2043771-2-ryan.roberts@arm.com
Fixes: 8a13897fb0da ("mm: userfaultfd: support UFFDIO_POISON for hugetlbfs")
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> [powerpc 8xx]
Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com> [vmalloc change]
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Cc: <stable@vger.kernel.org> [6.5+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
22 files changed:
arch/arm64/include/asm/hugetlb.h
arch/arm64/mm/hugetlbpage.c
arch/parisc/include/asm/hugetlb.h
arch/parisc/mm/hugetlbpage.c
arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h
arch/powerpc/mm/book3s64/hugetlbpage.c
arch/powerpc/mm/book3s64/radix_hugetlbpage.c
arch/powerpc/mm/nohash/8xx.c
arch/powerpc/mm/pgtable.c
arch/riscv/include/asm/hugetlb.h
arch/riscv/mm/hugetlbpage.c
arch/s390/include/asm/hugetlb.h
arch/s390/mm/hugetlbpage.c
arch/sparc/include/asm/hugetlb.h
arch/sparc/mm/hugetlbpage.c
include/asm-generic/hugetlb.h
include/linux/hugetlb.h
mm/damon/vaddr.c
mm/hugetlb.c
mm/migrate.c
mm/rmap.c
mm/vmalloc.c

index f43a38ac17799d26317ab985daccd35978b2b2f3..2ddc33d93b13b28c39d84d4253150d32c61888bd 100644
--- a/arch/arm64/include/asm/hugetlb.h
+++ b/arch/arm64/include/asm/hugetlb.h
@@ -28,7 +28,7 @@ pte_t arch_make_huge_pte(pte_t entry, unsigned int shift, vm_flags_t flags);
 #define arch_make_huge_pte arch_make_huge_pte
 #define __HAVE_ARCH_HUGE_SET_HUGE_PTE_AT
 extern void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
-                           pte_t *ptep, pte_t pte);
+                           pte_t *ptep, pte_t pte, unsigned long sz);
 #define __HAVE_ARCH_HUGE_PTEP_SET_ACCESS_FLAGS
 extern int huge_ptep_set_access_flags(struct vm_area_struct *vma,
                                      unsigned long addr, pte_t *ptep,
index 9c52718ea7509a88ddbafa2eceef58b2e51615a6..a7f8c8db3425c6b860997f423cb5f0c84891a7cf 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -249,7 +249,7 @@ static inline struct folio *hugetlb_swap_entry_to_folio(swp_entry_t entry)
 }
 
 void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
-                           pte_t *ptep, pte_t pte)
+                           pte_t *ptep, pte_t pte, unsigned long sz)
 {
        size_t pgsize;
        int i;
@@ -571,5 +571,7 @@ pte_t huge_ptep_modify_prot_start(struct vm_area_struct *vma, unsigned long addr
 void huge_ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep,
                                  pte_t old_pte, pte_t pte)
 {
-       set_huge_pte_at(vma->vm_mm, addr, ptep, pte);
+       unsigned long psize = huge_page_size(hstate_vma(vma));
+
+       set_huge_pte_at(vma->vm_mm, addr, ptep, pte, psize);
 }
index f7f078c2872c439307783fd81961e1b2faf577da..72daacc472a0a305772d7473aa8601a5316fbbe4 100644
--- a/arch/parisc/include/asm/hugetlb.h
+++ b/arch/parisc/include/asm/hugetlb.h
@@ -6,7 +6,7 @@
 
 #define __HAVE_ARCH_HUGE_SET_HUGE_PTE_AT
 void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
-                    pte_t *ptep, pte_t pte);
+                    pte_t *ptep, pte_t pte, unsigned long sz);
 
 #define __HAVE_ARCH_HUGE_PTEP_GET_AND_CLEAR
 pte_t huge_ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
index a8a1a7c1e16eb4ef1b751144377a943be3eecaa9..a9f7e21f66567aad1d7244a47664c356346097aa 100644
--- a/arch/parisc/mm/hugetlbpage.c
+++ b/arch/parisc/mm/hugetlbpage.c
@@ -140,7 +140,7 @@ static void __set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 }
 
 void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
-                    pte_t *ptep, pte_t entry)
+                    pte_t *ptep, pte_t entry, unsigned long sz)
 {
        __set_huge_pte_at(mm, addr, ptep, entry);
 }
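index de092b04ee1a1292b6352c74888291e179f5b0ae..92df40c6cc6b5e0b4015192aff3c74f9f0537ac8 100644
--- a/arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h
+++ b/arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h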
@@ -46,7 +46,8 @@ static inline int check_and_get_huge_psize(int shift)
 }
 
 #define __HAVE_ARCH_HUGE_SET_HUGE_PTE_AT
-void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pte);
+void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
+                    pte_t pte, unsigned long sz);
 
 #define __HAVE_ARCH_HUGE_PTE_CLEAR
 static inline void huge_pte_clear(struct mm_struct *mm, unsigned long addr,
index 3bc0eb21b2a005538a7493cf8ebc7c34a6332801..5a2e512e96db6b310dde8b7c9a25dd1199acf31e 100644
--- a/arch/powerpc/mm/book3s64/hugetlbpage.c
+++ b/arch/powerpc/mm/book3s64/hugetlbpage.c
@@ -143,11 +143,14 @@ pte_t huge_ptep_modify_prot_start(struct vm_area_struct *vma,
 void huge_ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr,
                                  pte_t *ptep, pte_t old_pte, pte_t pte)
 {
+       unsigned long psize;
 
        if (radix_enabled())
                return radix__huge_ptep_modify_prot_commit(vma, addr, ptep,
                                                           old_pte, pte);
-       set_huge_pte_at(vma->vm_mm, addr, ptep, pte);
+
+       psize = huge_page_size(hstate_vma(vma));
+       set_huge_pte_at(vma->vm_mm, addr, ptep, pte, psize);
 }
 
 void __init hugetlbpage_init_defaultsize(void)
index 17075c78d4bc3dfef9c662f98cbfc3429dbc6d41..35fd2a95be24c662e0ac1514d6ec51daa9402a14 100644
--- a/arch/powerpc/mm/book3s64/radix_hugetlbpage.c
+++ b/arch/powerpc/mm/book3s64/radix_hugetlbpage.c
@@ -47,6 +47,7 @@ void radix__huge_ptep_modify_prot_commit(struct vm_area_struct *vma,
                                         pte_t old_pte, pte_t pte)
 {
        struct mm_struct *mm = vma->vm_mm;
+       unsigned long psize = huge_page_size(hstate_vma(vma));
 
        /*
         * POWER9 NMMU must flush the TLB after clearing the PTE before
@@ -58,5 +59,5 @@ void radix__huge_ptep_modify_prot_commit(struct vm_area_struct *vma,
            atomic_read(&mm->context.copros) > 0)
                radix__flush_hugetlb_page(vma, addr);
 
-       set_huge_pte_at(vma->vm_mm, addr, ptep, pte);
+       set_huge_pte_at(vma->vm_mm, addr, ptep, pte, psize);
 }
index dbbfe897455dc443e1c804aa2d8ff8ad1a2aab28..a642a79298929dda334d712678e2930f83375bcd 100644
--- a/arch/powerpc/mm/nohash/8xx.c
+++ b/arch/powerpc/mm/nohash/8xx.c
@@ -91,7 +91,8 @@ static int __ref __early_map_kernel_hugepage(unsigned long va, phys_addr_t pa,
        if (new && WARN_ON(pte_present(*ptep) && pgprot_val(prot)))
                return -EINVAL;
 
-       set_huge_pte_at(&init_mm, va, ptep, pte_mkhuge(pfn_pte(pa >> PAGE_SHIFT, prot)));
+       set_huge_pte_at(&init_mm, va, ptep,
+                       pte_mkhuge(pfn_pte(pa >> PAGE_SHIFT, prot)), psize);
 
        return 0;
 }
index 3f86fd217690be96e6868f7f9670e6a570ef240b..3ba9fe41160469842186963aae0e2abc880e7198 100644
--- a/arch/powerpc/mm/pgtable.c
+++ b/arch/powerpc/mm/pgtable.c
@@ -288,7 +288,8 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 }
 
 #if defined(CONFIG_PPC_8xx)
-void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pte)
+void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
+                    pte_t pte, unsigned long sz)
 {
        pmd_t *pmd = pmd_off(mm, addr);
        pte_basic_t val;
index 34e24f078cc1b3e00cdc9655269e55d90f906e9a..4c5b0e929890fadcebb3caace0afe97dfa46d8bf 100644
--- a/arch/riscv/include/asm/hugetlb.h
+++ b/arch/riscv/include/asm/hugetlb.h
@@ -18,7 +18,8 @@ void huge_pte_clear(struct mm_struct *mm, unsigned long addr,
 
 #define __HAVE_ARCH_HUGE_SET_HUGE_PTE_AT
 void set_huge_pte_at(struct mm_struct *mm,
-                    unsigned long addr, pte_t *ptep, pte_t pte);
+                    unsigned long addr, pte_t *ptep, pte_t pte,
+                    unsigned long sz);
 
 #define __HAVE_ARCH_HUGE_PTEP_GET_AND_CLEAR
 pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
index 96225a8533ad8002e39518f87c6bab9d0e8ffb5a..e4a2ace92dbeba5b522ebca0f0fe9dfa28a9655a 100644
--- a/arch/riscv/mm/hugetlbpage.c
+++ b/arch/riscv/mm/hugetlbpage.c
@@ -180,7 +180,8 @@ pte_t arch_make_huge_pte(pte_t entry, unsigned int shift, vm_flags_t flags)
 void set_huge_pte_at(struct mm_struct *mm,
                     unsigned long addr,
                     pte_t *ptep,
-                    pte_t pte)
+                    pte_t pte,
+                    unsigned long sz)
 {
        int i, pte_num;
 
index f07267875a198760adfb149721c3497044fe478a..deb198a610395bc410bd09a30bdddbfb33b41d65 100644
--- a/arch/s390/include/asm/hugetlb.h
+++ b/arch/s390/include/asm/hugetlb.h
@@ -16,6 +16,8 @@
 #define hugepages_supported()                  (MACHINE_HAS_EDAT1)
 
 void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
+                    pte_t *ptep, pte_t pte, unsigned long sz);
+void __set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
                     pte_t *ptep, pte_t pte);
 pte_t huge_ptep_get(pte_t *ptep);
 pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
@@ -65,7 +67,7 @@ static inline int huge_ptep_set_access_flags(struct vm_area_struct *vma,
        int changed = !pte_same(huge_ptep_get(ptep), pte);
        if (changed) {
                huge_ptep_get_and_clear(vma->vm_mm, addr, ptep);
-               set_huge_pte_at(vma->vm_mm, addr, ptep, pte);
+               __set_huge_pte_at(vma->vm_mm, addr, ptep, pte);
        }
        return changed;
 }
@@ -74,7 +76,7 @@ static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
                                           unsigned long addr, pte_t *ptep)
 {
        pte_t pte = huge_ptep_get_and_clear(mm, addr, ptep);
-       set_huge_pte_at(mm, addr, ptep, pte_wrprotect(pte));
+       __set_huge_pte_at(mm, addr, ptep, pte_wrprotect(pte));
 }
 
 static inline pte_t mk_huge_pte(struct page *page, pgprot_t pgprot)
index c718f2a0de9485ecb695b972509e99dd2eb0fbfa..297a6d897d5a0c0e2e00f271ae23d918c4c6862a 100644
--- a/arch/s390/mm/hugetlbpage.c
+++ b/arch/s390/mm/hugetlbpage.c
@@ -142,7 +142,7 @@ static void clear_huge_pte_skeys(struct mm_struct *mm, unsigned long rste)
                __storage_key_init_range(paddr, paddr + size - 1);
 }
 
-void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
+void __set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
                     pte_t *ptep, pte_t pte)
 {
        unsigned long rste;
@@ -163,6 +163,12 @@ void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
        set_pte(ptep, __pte(rste));
 }
 
+void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
+                    pte_t *ptep, pte_t pte, unsigned long sz)
+{
+       __set_huge_pte_at(mm, addr, ptep, pte);
+}
+
 pte_t huge_ptep_get(pte_t *ptep)
 {
        return __rste_to_pte(pte_val(*ptep));
index 0a26cca24232c0811a0bda5c347c9db8dd206fa3..c714ca6a05aa04b154d0b64f52cb3f7d4c20452b 100644
--- a/arch/sparc/include/asm/hugetlb.h
+++ b/arch/sparc/include/asm/hugetlb.h
@@ -14,6 +14,8 @@ extern struct pud_huge_patch_entry __pud_huge_patch, __pud_huge_patch_end;
 
 #define __HAVE_ARCH_HUGE_SET_HUGE_PTE_AT
 void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
+                    pte_t *ptep, pte_t pte, unsigned long sz);
+void __set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
                     pte_t *ptep, pte_t pte);
 
 #define __HAVE_ARCH_HUGE_PTEP_GET_AND_CLEAR
@@ -32,7 +34,7 @@ static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
                                           unsigned long addr, pte_t *ptep)
 {
        pte_t old_pte = *ptep;
-       set_huge_pte_at(mm, addr, ptep, pte_wrprotect(old_pte));
+       __set_huge_pte_at(mm, addr, ptep, pte_wrprotect(old_pte));
 }
 
 #define __HAVE_ARCH_HUGE_PTEP_SET_ACCESS_FLAGS
@@ -42,7 +44,7 @@ static inline int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 {
        int changed = !pte_same(*ptep, pte);
        if (changed) {
-               set_huge_pte_at(vma->vm_mm, addr, ptep, pte);
+               __set_huge_pte_at(vma->vm_mm, addr, ptep, pte);
                flush_tlb_page(vma, addr);
        }
        return changed;
index d7018823206c13647fa5ddde4103fa9ff18f5cd7..b432500c13a5d8794b2296dec7ac45973b33dbad 100644
--- a/arch/sparc/mm/hugetlbpage.c
+++ b/arch/sparc/mm/hugetlbpage.c
@@ -328,7 +328,7 @@ pte_t *huge_pte_offset(struct mm_struct *mm,
        return pte_offset_huge(pmd, addr);
 }
 
-void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
+void __set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
                     pte_t *ptep, pte_t entry)
 {
        unsigned int nptes, orig_shift, shift;
@@ -364,6 +364,12 @@ void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
                                    orig_shift);
 }
 
+void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
+                    pte_t *ptep, pte_t entry, unsigned long sz)
+{
+       __set_huge_pte_at(mm, addr, ptep, entry);
+}
+
 pte_t huge_ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
                              pte_t *ptep)
 {
index 4da02798a00bb6cbc8d5491ce89e72eb22616788..6dcf4d576970c4b43ce8be9e0373bc39d10e3518 100644
--- a/include/asm-generic/hugetlb.h
+++ b/include/asm-generic/hugetlb.h
@@ -76,7 +76,7 @@ static inline void hugetlb_free_pgd_range(struct mmu_gather *tlb,
 
 #ifndef __HAVE_ARCH_HUGE_SET_HUGE_PTE_AT
 static inline void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
-               pte_t *ptep, pte_t pte)
+               pte_t *ptep, pte_t pte, unsigned long sz)
 {
        set_pte_at(mm, addr, ptep, pte);
 }
index 5b2626063f4fddd521c831eb17c1aae8a0e9d41e..a30686e649f7ac971835c215bb7675cc84cb4b6b 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -984,7 +984,9 @@ static inline void huge_ptep_modify_prot_commit(struct vm_area_struct *vma,
                                                unsigned long addr, pte_t *ptep,
                                                pte_t old_pte, pte_t pte)
 {
-       set_huge_pte_at(vma->vm_mm, addr, ptep, pte);
+       unsigned long psize = huge_page_size(hstate_vma(vma));
+
+       set_huge_pte_at(vma->vm_mm, addr, ptep, pte, psize);
 }
 #endif
 
@@ -1173,7 +1175,7 @@ static inline pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
 }
 
 static inline void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
-                                  pte_t *ptep, pte_t pte)
+                                  pte_t *ptep, pte_t pte, unsigned long sz)
 {
 }
 
index 4c81a9dbd0444801a6578dd4e9fb2a47622da1af..cf8a9fc5c9d1a61770572a15c8e55738601b6e7a 100644
--- a/mm/damon/vaddr.c
+++ b/mm/damon/vaddr.c
@@ -341,13 +341,14 @@ static void damon_hugetlb_mkold(pte_t *pte, struct mm_struct *mm,
        bool referenced = false;
        pte_t entry = huge_ptep_get(pte);
        struct folio *folio = pfn_folio(pte_pfn(entry));
+       unsigned long psize = huge_page_size(hstate_vma(vma));
 
        folio_get(folio);
 
        if (pte_young(entry)) {
                referenced = true;
                entry = pte_mkold(entry);
-               set_huge_pte_at(mm, addr, pte, entry);
+               set_huge_pte_at(mm, addr, pte, entry, psize);
        }
 
 #ifdef CONFIG_MMU_NOTIFIER
index ba6d39b71cb14326393f53d7c6068e2648932ffb..52d26072dfda55ddefa97457cdb11f9d716eb20f 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4980,7 +4980,7 @@ static bool is_hugetlb_entry_hwpoisoned(pte_t pte)
 
 static void
 hugetlb_install_folio(struct vm_area_struct *vma, pte_t *ptep, unsigned long addr,
-                     struct folio *new_folio, pte_t old)
+                     struct folio *new_folio, pte_t old, unsigned long sz)
 {
        pte_t newpte = make_huge_pte(vma, &new_folio->page, 1);
 
@@ -4988,7 +4988,7 @@ hugetlb_install_folio(struct vm_area_struct *vma, pte_t *ptep, unsigned long add
        hugepage_add_new_anon_rmap(new_folio, vma, addr);
        if (userfaultfd_wp(vma) && huge_pte_uffd_wp(old))
                newpte = huge_pte_mkuffd_wp(newpte);
-       set_huge_pte_at(vma->vm_mm, addr, ptep, newpte);
+       set_huge_pte_at(vma->vm_mm, addr, ptep, newpte, sz);
        hugetlb_count_add(pages_per_huge_page(hstate_vma(vma)), vma->vm_mm);
        folio_set_hugetlb_migratable(new_folio);
 }
@@ -5065,7 +5065,7 @@ again:
                } else if (unlikely(is_hugetlb_entry_hwpoisoned(entry))) {
                        if (!userfaultfd_wp(dst_vma))
                                entry = huge_pte_clear_uffd_wp(entry);
-                       set_huge_pte_at(dst, addr, dst_pte, entry);
+                       set_huge_pte_at(dst, addr, dst_pte, entry, sz);
                } else if (unlikely(is_hugetlb_entry_migration(entry))) {
                        swp_entry_t swp_entry = pte_to_swp_entry(entry);
                        bool uffd_wp = pte_swp_uffd_wp(entry);
@@ -5080,18 +5080,18 @@ again:
                                entry = swp_entry_to_pte(swp_entry);
                                if (userfaultfd_wp(src_vma) && uffd_wp)
                                        entry = pte_swp_mkuffd_wp(entry);
-                               set_huge_pte_at(src, addr, src_pte, entry);
+                               set_huge_pte_at(src, addr, src_pte, entry, sz);
                        }
                        if (!userfaultfd_wp(dst_vma))
                                entry = huge_pte_clear_uffd_wp(entry);
-                       set_huge_pte_at(dst, addr, dst_pte, entry);
+                       set_huge_pte_at(dst, addr, dst_pte, entry, sz);
                } else if (unlikely(is_pte_marker(entry))) {
                        pte_marker marker = copy_pte_marker(
                                pte_to_swp_entry(entry), dst_vma);
 
                        if (marker)
                                set_huge_pte_at(dst, addr, dst_pte,
-                                               make_pte_marker(marker));
+                                               make_pte_marker(marker), sz);
                } else {
                        entry = huge_ptep_get(src_pte);
                        pte_folio = page_folio(pte_page(entry));
@@ -5145,7 +5145,7 @@ again:
                                        goto again;
                                }
                                hugetlb_install_folio(dst_vma, dst_pte, addr,
-                                                     new_folio, src_pte_old);
+                                                     new_folio, src_pte_old, sz);
                                spin_unlock(src_ptl);
                                spin_unlock(dst_ptl);
                                continue;
@@ -5166,7 +5166,7 @@ again:
                        if (!userfaultfd_wp(dst_vma))
                                entry = huge_pte_clear_uffd_wp(entry);
 
-                       set_huge_pte_at(dst, addr, dst_pte, entry);
+                       set_huge_pte_at(dst, addr, dst_pte, entry, sz);
                        hugetlb_count_add(npages, dst);
                }
                spin_unlock(src_ptl);
@@ -5184,7 +5184,8 @@ again:
 }
 
 static void move_huge_pte(struct vm_area_struct *vma, unsigned long old_addr,
-                         unsigned long new_addr, pte_t *src_pte, pte_t *dst_pte)
+                         unsigned long new_addr, pte_t *src_pte, pte_t *dst_pte,
+                         unsigned long sz)
 {
        struct hstate *h = hstate_vma(vma);
        struct mm_struct *mm = vma->vm_mm;
@@ -5202,7 +5203,7 @@ static void move_huge_pte(struct vm_area_struct *vma, unsigned long old_addr,
                spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
 
        pte = huge_ptep_get_and_clear(mm, old_addr, src_pte);
-       set_huge_pte_at(mm, new_addr, dst_pte, pte);
+       set_huge_pte_at(mm, new_addr, dst_pte, pte, sz);
 
        if (src_ptl != dst_ptl)
                spin_unlock(src_ptl);
@@ -5259,7 +5260,7 @@ int move_hugetlb_page_tables(struct vm_area_struct *vma,
                if (!dst_pte)
                        break;
 
-               move_huge_pte(vma, old_addr, new_addr, src_pte, dst_pte);
+               move_huge_pte(vma, old_addr, new_addr, src_pte, dst_pte, sz);
        }
 
        if (shared_pmd)
@@ -5337,7 +5338,8 @@ static void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct
                        if (pte_swp_uffd_wp_any(pte) &&
                            !(zap_flags & ZAP_FLAG_DROP_MARKER))
                                set_huge_pte_at(mm, address, ptep,
-                                               make_pte_marker(PTE_MARKER_UFFD_WP));
+                                               make_pte_marker(PTE_MARKER_UFFD_WP),
+                                               sz);
                        else
                                huge_pte_clear(mm, address, ptep, sz);
                        spin_unlock(ptl);
@@ -5371,7 +5373,8 @@ static void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct
                if (huge_pte_uffd_wp(pte) &&
                    !(zap_flags & ZAP_FLAG_DROP_MARKER))
                        set_huge_pte_at(mm, address, ptep,
-                                       make_pte_marker(PTE_MARKER_UFFD_WP));
+                                       make_pte_marker(PTE_MARKER_UFFD_WP),
+                                       sz);
                hugetlb_count_sub(pages_per_huge_page(h), mm);
                page_remove_rmap(page, vma, true);
 
@@ -5676,7 +5679,7 @@ retry_avoidcopy:
                hugepage_add_new_anon_rmap(new_folio, vma, haddr);
                if (huge_pte_uffd_wp(pte))
                        newpte = huge_pte_mkuffd_wp(newpte);
-               set_huge_pte_at(mm, haddr, ptep, newpte);
+               set_huge_pte_at(mm, haddr, ptep, newpte, huge_page_size(h));
                folio_set_hugetlb_migratable(new_folio);
                /* Make the old page be freed below */
                new_folio = old_folio;
@@ -5972,7 +5975,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
         */
        if (unlikely(pte_marker_uffd_wp(old_pte)))
                new_pte = huge_pte_mkuffd_wp(new_pte);
-       set_huge_pte_at(mm, haddr, ptep, new_pte);
+       set_huge_pte_at(mm, haddr, ptep, new_pte, huge_page_size(h));
 
        hugetlb_count_add(pages_per_huge_page(h), mm);
        if ((flags & FAULT_FLAG_WRITE) && !(vma->vm_flags & VM_SHARED)) {
@@ -6261,7 +6264,8 @@ int hugetlb_mfill_atomic_pte(pte_t *dst_pte,
                }
 
                _dst_pte = make_pte_marker(PTE_MARKER_POISONED);
-               set_huge_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte);
+               set_huge_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte,
+                               huge_page_size(h));
 
                /* No need to invalidate - it was non-present before */
                update_mmu_cache(dst_vma, dst_addr, dst_pte);
@@ -6412,7 +6416,7 @@ int hugetlb_mfill_atomic_pte(pte_t *dst_pte,
        if (wp_enabled)
                _dst_pte = huge_pte_mkuffd_wp(_dst_pte);
 
-       set_huge_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte);
+       set_huge_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte, huge_page_size(h));
 
        hugetlb_count_add(pages_per_huge_page(h), dst_mm);
 
@@ -6598,7 +6602,7 @@ long hugetlb_change_protection(struct vm_area_struct *vma,
                        else if (uffd_wp_resolve)
                                newpte = pte_swp_clear_uffd_wp(newpte);
                        if (!pte_same(pte, newpte))
-                               set_huge_pte_at(mm, address, ptep, newpte);
+                               set_huge_pte_at(mm, address, ptep, newpte, psize);
                } else if (unlikely(is_pte_marker(pte))) {
                        /* No other markers apply for now. */
                        WARN_ON_ONCE(!pte_marker_uffd_wp(pte));
@@ -6623,7 +6627,8 @@ long hugetlb_change_protection(struct vm_area_struct *vma,
                        if (unlikely(uffd_wp))
                                /* Safe to modify directly (none->non-present). */
                                set_huge_pte_at(mm, address, ptep,
-                                               make_pte_marker(PTE_MARKER_UFFD_WP));
+                                               make_pte_marker(PTE_MARKER_UFFD_WP),
+                                               psize);
                }
                spin_unlock(ptl);
        }
index b7fa020003f34e4f539491d3755735607469b1c1..2053b54556ca5b558dbdba1b02c0081cd348d113 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -243,7 +243,9 @@ static bool remove_migration_pte(struct folio *folio,
 
 #ifdef CONFIG_HUGETLB_PAGE
                if (folio_test_hugetlb(folio)) {
-                       unsigned int shift = huge_page_shift(hstate_vma(vma));
+                       struct hstate *h = hstate_vma(vma);
+                       unsigned int shift = huge_page_shift(h);
+                       unsigned long psize = huge_page_size(h);
 
                        pte = arch_make_huge_pte(pte, shift, vma->vm_flags);
                        if (folio_test_anon(folio))
@@ -251,7 +253,8 @@ static bool remove_migration_pte(struct folio *folio,
                                                       rmap_flags);
                        else
                                page_dup_file_rmap(new, true);
-                       set_huge_pte_at(vma->vm_mm, pvmw.address, pvmw.pte, pte);
+                       set_huge_pte_at(vma->vm_mm, pvmw.address, pvmw.pte, pte,
+                                       psize);
                } else
 #endif
                {
index ec7f8e6c9e483a6ff768272d4e89221044974bee..9f795b93cf40f5fa57c3dc38f7f18c4d4020d17d 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1480,6 +1480,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
        struct mmu_notifier_range range;
        enum ttu_flags flags = (enum ttu_flags)(long)arg;
        unsigned long pfn;
+       unsigned long hsz = 0;
 
        /*
         * When racing against e.g. zap_pte_range() on another cpu,
@@ -1511,6 +1512,9 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
                 */
                adjust_range_if_pmd_sharing_possible(vma, &range.start,
                                                     &range.end);
+
+               /* We need the huge page size for set_huge_pte_at() */
+               hsz = huge_page_size(hstate_vma(vma));
        }
        mmu_notifier_invalidate_range_start(&range);
 
@@ -1628,7 +1632,8 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
                        pteval = swp_entry_to_pte(make_hwpoison_entry(subpage));
                        if (folio_test_hugetlb(folio)) {
                                hugetlb_count_sub(folio_nr_pages(folio), mm);
-                               set_huge_pte_at(mm, address, pvmw.pte, pteval);
+                               set_huge_pte_at(mm, address, pvmw.pte, pteval,
+                                               hsz);
                        } else {
                                dec_mm_counter(mm, mm_counter(&folio->page));
                                set_pte_at(mm, address, pvmw.pte, pteval);
@@ -1820,6 +1825,7 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
        struct mmu_notifier_range range;
        enum ttu_flags flags = (enum ttu_flags)(long)arg;
        unsigned long pfn;
+       unsigned long hsz = 0;
 
        /*
         * When racing against e.g. zap_pte_range() on another cpu,
@@ -1855,6 +1861,9 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
                 */
                adjust_range_if_pmd_sharing_possible(vma, &range.start,
                                                     &range.end);
+
+               /* We need the huge page size for set_huge_pte_at() */
+               hsz = huge_page_size(hstate_vma(vma));
        }
        mmu_notifier_invalidate_range_start(&range);
 
@@ -2020,7 +2029,8 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
                        pteval = swp_entry_to_pte(make_hwpoison_entry(subpage));
                        if (folio_test_hugetlb(folio)) {
                                hugetlb_count_sub(folio_nr_pages(folio), mm);
-                               set_huge_pte_at(mm, address, pvmw.pte, pteval);
+                               set_huge_pte_at(mm, address, pvmw.pte, pteval,
+                                               hsz);
                        } else {
                                dec_mm_counter(mm, mm_counter(&folio->page));
                                set_pte_at(mm, address, pvmw.pte, pteval);
@@ -2044,7 +2054,8 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
 
                        if (arch_unmap_one(mm, vma, address, pteval) < 0) {
                                if (folio_test_hugetlb(folio))
-                                       set_huge_pte_at(mm, address, pvmw.pte, pteval);
+                                       set_huge_pte_at(mm, address, pvmw.pte,
+                                                       pteval, hsz);
                                else
                                        set_pte_at(mm, address, pvmw.pte, pteval);
                                ret = false;
@@ -2058,7 +2069,8 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
                        if (anon_exclusive &&
                            page_try_share_anon_rmap(subpage)) {
                                if (folio_test_hugetlb(folio))
-                                       set_huge_pte_at(mm, address, pvmw.pte, pteval);
+                                       set_huge_pte_at(mm, address, pvmw.pte,
+                                                       pteval, hsz);
                                else
                                        set_pte_at(mm, address, pvmw.pte, pteval);
                                ret = false;
@@ -2090,7 +2102,8 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
                        if (pte_uffd_wp(pteval))
                                swp_pte = pte_swp_mkuffd_wp(swp_pte);
                        if (folio_test_hugetlb(folio))
-                               set_huge_pte_at(mm, address, pvmw.pte, swp_pte);
+                               set_huge_pte_at(mm, address, pvmw.pte, swp_pte,
+                                               hsz);
                        else
                                set_pte_at(mm, address, pvmw.pte, swp_pte);
                        trace_set_migration_pte(address, pte_val(swp_pte),
index ef8599d394fd0657b644bdeafaca9b8a781c6d6c..a3fedb3ee0dbd48a3bae9c712bb2602a94dd3fbf 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -111,7 +111,7 @@ static int vmap_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
                        pte_t entry = pfn_pte(pfn, prot);
 
                        entry = arch_make_huge_pte(entry, ilog2(size), 0);
-                       set_huge_pte_at(&init_mm, addr, pte, entry);
+                       set_huge_pte_at(&init_mm, addr, pte, entry, size);
                        pfn += PFN_DOWN(size);
                        continue;
                }