mm: fault in complete folios instead of individual pages for tmpfs
Author:     Baolin Wang <baolin.wang@linux.alibaba.com>
AuthorDate: Fri, 4 Jul 2025 03:19:26 +0000 (11:19 +0800)
Commit:     Andrew Morton <akpm@linux-foundation.org>
CommitDate: Sun, 20 Jul 2025 01:59:43 +0000 (18:59 -0700)
After commit acd7ccb284b8 ("mm: shmem: add large folio support for
tmpfs"), tmpfs can also support large folio allocation (not just PMD-sized
large folios).

However, when accessing tmpfs via mmap(), we still establish mappings at
base-page granularity even though tmpfs supports large folios, which is
suboptimal.

We can instead map multiple consecutive pages of a tmpfs folio at once,
according to the size of the large folio.  On one hand, this reduces the
overhead of page faults; on the other hand, it can leverage hardware
optimizations to reduce TLB misses, such as contiguous PTEs on the ARM
architecture.
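
This effect can be observed from userspace: once a single fault maps the
whole folio, touching the remaining base pages of that folio no longer
minor-faults.  The following is a minimal sketch that counts minor faults
while touching each base page of a tmpfs-backed mapping; the /mnt/tmpfs
mount point and file name are assumptions, and the mount must be
configured via 'huge=' so that large folios are actually allocated:

/* fault_count.c - count minor faults while touching each base page of a
 * tmpfs-backed mapping.  Build: cc -O2 fault_count.c -o fault_count
 * Assumption: /mnt/tmpfs is a tmpfs mount, e.g.
 *   mount -t tmpfs -o huge=always tmpfs /mnt/tmpfs
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

int main(void)
{
	const size_t len = 64UL << 20;	/* 64M is enough to see the trend */
	long pagesz = sysconf(_SC_PAGESIZE);
	int fd = open("/mnt/tmpfs/testfile", O_RDWR | O_CREAT, 0600);

	if (fd < 0 || ftruncate(fd, len) < 0) {
		perror("open/ftruncate");
		return 1;
	}

	/* Populate the page cache in 64K chunks; on a tmpfs mount with
	 * 'huge=' enabled this is intended to result in 64K folios
	 * (an assumption about the mount configuration). */
	char buf[65536];
	memset(buf, 0xaa, sizeof(buf));
	for (size_t off = 0; off < len; off += sizeof(buf))
		if (pwrite(fd, buf, sizeof(buf), off) != sizeof(buf)) {
			perror("pwrite");
			return 1;
		}

	char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	struct rusage before, after;
	getrusage(RUSAGE_SELF, &before);

	/* Touch one byte per base page; each touch that finds no PTE
	 * present takes a minor fault. */
	volatile char sink;
	for (size_t off = 0; off < len; off += pagesz)
		sink = p[off];
	(void)sink;

	getrusage(RUSAGE_SELF, &after);
	printf("minor faults: %ld\n", after.ru_minflt - before.ru_minflt);

	munmap(p, len);
	close(fd);
	return 0;
}

With per-page mapping one would expect roughly one minor fault per base
page; if 64K folios are mapped as a whole, roughly 16x fewer on a
4K-page system.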

Moreover, tmpfs mounts control large folio allocation explicitly via the
'huge=' mount option, so the resulting increase in a process's RSS
statistics is expected and should not have any noticeable effect on users.

Performance test:
I created a 1G tmpfs file, populated it with 64K large folios, and
write-accessed it sequentially via mmap(); a sketch of a comparable test
is included after the numbers below.  I observed a significant performance
improvement:

Before the patch:
real 0m0.158s
user 0m0.008s
sys 0m0.150s

After the patch:
real 0m0.021s
user 0m0.004s
sys 0m0.017s
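
For reference, a minimal userspace sketch of a comparable timing test;
the /mnt/tmpfs path, file name, and use of clock_gettime() instead of the
author's exact 'time' invocation are assumptions:

/* seq_write.c - sequentially write a 1G tmpfs file via mmap() and time it.
 * Build: cc -O2 seq_write.c -o seq_write
 * Assumption: /mnt/tmpfs is a tmpfs mount whose 'huge=' setting results
 * in 64K folios backing the file. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
	const size_t len = 1UL << 30;	/* 1G, matching the commit's test */
	int fd = open("/mnt/tmpfs/testfile", O_RDWR | O_CREAT, 0600);

	if (fd < 0 || ftruncate(fd, len) < 0) {
		perror("open/ftruncate");
		return 1;
	}

	/* Populate the page cache in 64K chunks first; on a suitably
	 * configured tmpfs mount this is intended to result in 64K
	 * folios (an assumption). */
	char buf[65536];
	memset(buf, 0, sizeof(buf));
	for (size_t off = 0; off < len; off += sizeof(buf))
		if (pwrite(fd, buf, sizeof(buf), off) != sizeof(buf)) {
			perror("pwrite");
			return 1;
		}

	char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	struct timespec t0, t1;
	clock_gettime(CLOCK_MONOTONIC, &t0);
	memset(p, 0xaa, len);	/* sequential write access via the mapping */
	clock_gettime(CLOCK_MONOTONIC, &t1);

	printf("sequential write: %.3f s\n",
	       (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9);

	munmap(p, len);
	close(fd);
	return 0;
}

Running this once on a pre-patch and once on a post-patch kernel should
reproduce the comparison above.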

Link: https://lkml.kernel.org/r/440940e78aeb7430c5cc8b6d2088ae98265b9809.1751599072.git.baolin.wang@linux.alibaba.com
Fixes: acd7ccb284b8 ("mm: shmem: add large folio support for tmpfs")
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Reviewed-by: Barry Song <baohua@kernel.org>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mariano Pache <npache@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm/memory.c

index 0f9b32a20e5b788645496ff69e9d6428d9140c28..9944380e947d47702851ecaedcc23354caba8db7 100644 (file)
@@ -5383,10 +5383,10 @@ fallback:
 
        /*
         * Using per-page fault to maintain the uffd semantics, and same
-        * approach also applies to non-anonymous-shmem faults to avoid
+        * approach also applies to non shmem/tmpfs faults to avoid
         * inflating the RSS of the process.
         */
-       if (!vma_is_anon_shmem(vma) || unlikely(userfaultfd_armed(vma)) ||
+       if (!vma_is_shmem(vma) || unlikely(userfaultfd_armed(vma)) ||
            unlikely(needs_fallback)) {
                nr_pages = 1;
        } else if (nr_pages > 1) {