From: Baolin Wang Date: Mon, 9 Feb 2026 14:07:27 +0000 (+0800) Subject: arm64: mm: implement the architecture-specific clear_flush_young_ptes() X-Git-Tag: v7.0-rc1~47^2~5 X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=07f440c23a197616f7b7607b98b9427da23f9f6f;p=thirdparty%2Fkernel%2Flinux.git arm64: mm: implement the architecture-specific clear_flush_young_ptes() Implement the Arm64 architecture-specific clear_flush_young_ptes() to enable batched checking of young flags and TLB flushing, improving performance during large folio reclamation. Performance testing: Allocate 10G clean file-backed folios by mmap() in a memory cgroup, and try to reclaim 8G file-backed folios via the memory.reclaim interface. I can observe 33% performance improvement on my Arm64 32-core server (and 10%+ improvement on my X86 machine). Meanwhile, the hotspot folio_check_references() dropped from approximately 35% to around 5%. W/o patchset: real 0m1.518s user 0m0.000s sys 0m1.518s W/ patchset: real 0m1.018s user 0m0.000s sys 0m1.018s Link: https://lkml.kernel.org/r/ce749fbae3e900e733fa104a16fcb3ca9fe4f9bd.1770645603.git.baolin.wang@linux.alibaba.com Signed-off-by: Baolin Wang Reviewed-by: Ryan Roberts Reviewed-by: David Hildenbrand (Arm) Cc: Barry Song Cc: Catalin Marinas Cc: Harry Yoo Cc: Jann Horn Cc: Liam Howlett Cc: Lorenzo Stoakes Cc: Matthew Wilcox (Oracle) Cc: Michal Hocko Cc: Mike Rapoport Cc: Rik van Riel Cc: Suren Baghdasaryan Cc: Vlastimil Babka Cc: Will Deacon Signed-off-by: Andrew Morton --- diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index 3dabf5ea17faf..a17eb8a767884 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -1838,6 +1838,17 @@ static inline int ptep_clear_flush_young(struct vm_area_struct *vma, return contpte_clear_flush_young_ptes(vma, addr, ptep, 1); } +#define clear_flush_young_ptes clear_flush_young_ptes +static inline int clear_flush_young_ptes(struct vm_area_struct *vma, + unsigned long addr, pte_t *ptep, + unsigned int nr) +{ + if (likely(nr == 1 && !pte_cont(__ptep_get(ptep)))) + return __ptep_clear_flush_young(vma, addr, ptep); + + return contpte_clear_flush_young_ptes(vma, addr, ptep, nr); +} + #define wrprotect_ptes wrprotect_ptes static __always_inline void wrprotect_ptes(struct mm_struct *mm, unsigned long addr, pte_t *ptep, unsigned int nr)