From: Kiryl Shutsemau (Meta)
Date: Fri, 29 May 2026 17:23:25 +0000 (+0100)
Subject: fs/proc/task_mmu: fix make_uffd_wp_huge_pte() prot-update race
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=04718f7c9290f95385f0dd328758753dc1c36dec;p=thirdparty%2Fkernel%2Flinux.git

fs/proc/task_mmu: fix make_uffd_wp_huge_pte() prot-update race

Patch series "userfaultfd/pagemap: pre-existing fixes".

These are pre-existing bug fixes that were carried at the front of the
userfaultfd RWP working-set-tracking series up to v5 [1].  Per review
feedback that fixes should not sit in the middle of a feature series,
they are split out and sent on their own; the RWP series is reposted
rebased on top of this.

All six were flagged by the Sashiko AI review of the RWP series and
carry Reported-by: Sashiko AI review.  They are independent of RWP,
apply to mm-new directly, and carry Cc: stable@.

1: fs/proc/task_mmu: a missing huge_ptep_modify_prot_start() in
   make_uffd_wp_huge_pte() can lose hardware Dirty/Accessed updates
   when PAGEMAP_SCAN write-protects a hugetlb PTE.

2: fs/proc/task_mmu: pagemap_scan_hugetlb_entry() compares the range
   against HPAGE_SIZE rather than the hstate page size, so it never
   write-protects gigantic hugetlb pages.

3: fs/proc/task_mmu: PAGEMAP_SCAN with PM_SCAN_WP_MATCHING over an
   unpopulated hugetlb range self-deadlocks -- pagemap_scan_pte_hole()
   calls uffd_wp_range() while walk_hugetlb_range() holds the hugetlb
   vma lock for read, and hugetlb_change_protection() then takes it
   for write.  Install the marker inline instead.

4: mm/huge_memory: change_non_present_huge_pmd() drops pmd_swp_uffd_wp
   on a device-private PMD permission downgrade, silently losing the
   uffd-wp marker.

5: userfaultfd: must_wait() applies pte_write() to a locklessly read
   PTE without checking pte_present(), so swap/migration entries
   decode random offset bits and a thread can stay parked on a stale
   fault.

6: userfaultfd: __VMA_UFFD_FLAGS feeds VMA_UFFD_MINOR_BIT (41) to
   mk_vma_flags() unconditionally, an out-of-bounds write into the
   single-word vma_flags_t on 32-bit.  Build the mask from config-gated
   per-mode masks so an unavailable bit is never materialised.

This patch (of 6):

make_uffd_wp_huge_pte() arms the UFFD_WP bit on a present HugeTLB PTE by
calling huge_ptep_modify_prot_commit() with a ptent snapshot that was
fetched without the corresponding huge_ptep_modify_prot_start().

The start helper is what atomically clears the entry so the kernel-owned
snapshot stays consistent until the commit; without it, the hardware may
set Dirty or Accessed in the live PTE between the original read and the
commit, and huge_ptep_modify_prot_commit() (whose generic implementation
just calls set_huge_pte_at()) then writes the stale snapshot back over
the live hardware bits, losing the update.

The non-hugetlb sibling make_uffd_wp_pte() does this correctly via
ptep_modify_prot_start() / ptep_modify_prot_commit().  Mirror that
pattern for the present-PTE branch.  The migration case stays as-is --
migration entries are non-present, so there's no hardware update to race
against.

Link: https://lore.kernel.org/20260529172331.356655-1-kas@kernel.org
Link: https://lore.kernel.org/20260529172331.356655-2-kas@kernel.org
Link: https://lore.kernel.org/all/20260526130509.2748441-1-kirill@shutemov.name/ [1]
Fixes: 52526ca7fdb9 ("fs/proc/task_mmu: implement IOCTL to get and optionally clear info about PTEs")
Signed-off-by: Kiryl Shutsemau
Reported-by: Sashiko AI review
Reviewed-by: Lorenzo Stoakes
Reviewed-by: Dev Jain
Cc: David Hildenbrand
Cc: Michal Hocko
Cc: Mike Rapoport
Cc: Peter Xu
Cc: Suren Baghdasaryan
Cc: Vlastimil Babka
Cc: Balbir Singh
Cc:
Signed-off-by: Andrew Morton
---

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 1e3a15bf46f4..e21a38ac745b 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -2610,12 +2610,16 @@ static void make_uffd_wp_huge_pte(struct vm_area_struct *vma,
 	if (softleaf_is_hwpoison(entry) || softleaf_is_marker(entry))
 		return;
 
-	if (softleaf_is_migration(entry))
+	if (softleaf_is_migration(entry)) {
 		set_huge_pte_at(vma->vm_mm, addr, ptep,
 				pte_swp_mkuffd_wp(ptent), psize);
-	else
-		huge_ptep_modify_prot_commit(vma, addr, ptep, ptent,
-					     huge_pte_mkuffd_wp(ptent));
+	} else {
+		pte_t old_pte, new_pte;
+
+		old_pte = huge_ptep_modify_prot_start(vma, addr, ptep);
+		new_pte = huge_pte_mkuffd_wp(old_pte);
+		huge_ptep_modify_prot_commit(vma, addr, ptep, old_pte, new_pte);
+	}
 }
 #endif /* CONFIG_HUGETLB_PAGE */
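The lost-update window described in the changelog can be modelled in
userspace.  The following is only a sketch under stated assumptions, not
kernel code: the bit layout, the sim_* names and the single-threaded
"hardware" helper are all hypothetical, and C11 atomics stand in for the
arch helpers.  It replays the same interleaving against both patterns: a
commit from a plain snapshot, and a start that atomically clears the
entry before the commit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout for this simulation, not a real PTE format. */
#define SIM_PRESENT  (1u << 0)
#define SIM_DIRTY    (1u << 1)
#define SIM_UFFD_WP  (1u << 2)

typedef _Atomic uint32_t sim_pte_t;

/*
 * Stand-in for the MMU: set Dirty only while the entry is present.
 * Returns false when the entry is clear, i.e. the access would fault.
 */
static bool sim_hw_write(sim_pte_t *ptep)
{
	uint32_t cur = atomic_load(ptep);

	while (cur & SIM_PRESENT) {
		if (atomic_compare_exchange_weak(ptep, &cur, cur | SIM_DIRTY))
			return true;
	}
	return false;
}

/* Old pattern: commit from a snapshot that was read but never cleared. */
static uint32_t sim_wp_from_snapshot(sim_pte_t *ptep)
{
	uint32_t snap = atomic_load(ptep);

	sim_hw_write(ptep);			/* simulated race: Dirty lands in the live entry */
	atomic_store(ptep, snap | SIM_UFFD_WP);	/* stale snapshot overwrites it */
	return atomic_load(ptep);
}

/* Fixed pattern: start atomically clears the entry and returns its value. */
static uint32_t sim_wp_start_commit(sim_pte_t *ptep)
{
	uint32_t old;
	bool updated;

	sim_hw_write(ptep);			/* same racing access, before start */
	old = atomic_exchange(ptep, 0);		/* start: picks up Dirty, clears the entry */
	updated = sim_hw_write(ptep);		/* access between start and commit */
	assert(!updated);			/* a clear entry cannot be updated */
	(void)updated;
	atomic_store(ptep, old | SIM_UFFD_WP);	/* commit */
	return atomic_load(ptep);
}
```

Starting from an entry that has only SIM_PRESENT set,
sim_wp_from_snapshot() ends without SIM_DIRTY, while
sim_wp_start_commit() keeps it.  In this model the access that arrives
between start and commit is refused rather than silently dropped, which
is the property the start helper is described as providing.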