From 2f406263e3e954aa24c1248edcfa9be0c1bb30fa Mon Sep 17 00:00:00 2001
From: Yin Fengwei <fengwei.yin@intel.com>
Date: Tue, 8 Aug 2023 10:09:15 +0800
Subject: madvise:madvise_cold_or_pageout_pte_range(): don't use mapcount() against large folio for sharing check

From: Yin Fengwei <fengwei.yin@intel.com>

commit 2f406263e3e954aa24c1248edcfa9be0c1bb30fa upstream.

Patch series "don't use mapcount() to check large folio sharing", v2.

In madvise_cold_or_pageout_pte_range() and madvise_free_pte_range(),
folio_mapcount() is used to check whether the folio is shared. That is
not correct, because folio_mapcount() returns the total mapcount of a
large folio.

Use folio_estimated_sharers() here instead, as an estimate is sufficient.

This patchset fixes the following case:
A user space application calls madvise() with MADV_FREE, MADV_COLD or
MADV_PAGEOUT on a specific address range that has THPs mapped to it.
Without the patchset, the THP is skipped. With the patchset, the THP is
split and handled accordingly.

David reported that the cow selftest skips some cases because
MADV_PAGEOUT skips THPs:
https://lore.kernel.org/linux-mm/9e92e42d-488f-47db-ac9d-75b24cd0d037@intel.com/T/#mbf0f2ec7fbe45da47526de1d7036183981691e81
and I confirmed that this patchset makes it work again.


This patch (of 3):

Commit 07e8c82b5eff ("madvise: convert madvise_cold_or_pageout_pte_range()
to use folios") replaced page_mapcount() with folio_mapcount() to check
whether the folio is shared by other mappings.

This is not correct for a large folio. folio_mapcount() returns the total
mapcount of the large folio, which is not suitable for detecting whether
the folio is shared.

Use folio_estimated_sharers(), which returns an estimated number of
sharers. That means it is not 100% correct, but it should be OK for the
madvise case here.

The user-visible effect is that the THP is skipped when the user calls
madvise(). The correct behavior is that the THP should be split and then
processed.

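The difference between the two checks can be illustrated with a small
user-space model. This is only a sketch, not kernel code: struct
toy_folio and every toy_* function are hypothetical names invented for
the illustration, and the model assumes that the estimate is derived
from the mapcount of the first page only, while the total mapcount adds
up the mappings of every page.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_PAGES 512	/* e.g. a 2 MiB THP made of 4 KiB pages */

/* Hypothetical user-space stand-in for a large folio (not a kernel type). */
struct toy_folio {
	size_t nr_pages;
	int entire_mapcount;			/* mappings of the whole folio (PMD) */
	int page_mapcount[TOY_MAX_PAGES];	/* per-page mappings (PTE) */
};

void toy_folio_init(struct toy_folio *f, size_t nr_pages)
{
	assert(nr_pages >= 1 && nr_pages <= TOY_MAX_PAGES);
	f->nr_pages = nr_pages;
	f->entire_mapcount = 0;
	for (size_t i = 0; i < nr_pages; i++)
		f->page_mapcount[i] = 0;
}

/* One process maps the whole folio with a single PMD entry. */
void toy_map_pmd(struct toy_folio *f)
{
	f->entire_mapcount++;
}

/* One process maps every page of the folio with its own PTE. */
void toy_map_all_ptes(struct toy_folio *f)
{
	for (size_t i = 0; i < f->nr_pages; i++)
		f->page_mapcount[i]++;
}

/* Model of the total mapcount: whole-folio plus all per-page mappings. */
int toy_total_mapcount(const struct toy_folio *f)
{
	int total = f->entire_mapcount;

	for (size_t i = 0; i < f->nr_pages; i++)
		total += f->page_mapcount[i];
	return total;
}

/* Model of the estimate (assumption): look at the first page only. */
int toy_estimated_sharers(const struct toy_folio *f)
{
	return f->entire_mapcount + f->page_mapcount[0];
}
```

In this model, a 512-page folio that a single process maps with PTEs has
a total mapcount of 512 but an estimate of 1, so a "!= 1" check on the
total skips a folio that has only one owner. A second process mapping the
same folio raises the estimate to 2.
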
NOTE: this change is a temporary fix to reduce the user-visible effects
before the long-term fix from David is ready.

Link: https://lkml.kernel.org/r/20230808020917.2230692-1-fengwei.yin@intel.com
Link: https://lkml.kernel.org/r/20230808020917.2230692-2-fengwei.yin@intel.com
Fixes: 07e8c82b5eff ("madvise: convert madvise_cold_or_pageout_pte_range() to use folios")
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
Reviewed-by: Yu Zhao <yuzhao@google.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 mm/madvise.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -376,7 +376,7 @@ static int madvise_cold_or_pageout_pte_r
 		folio = pfn_folio(pmd_pfn(orig_pmd));
 
 		/* Do not interfere with other mappings of this folio */
-		if (folio_mapcount(folio) != 1)
+		if (folio_estimated_sharers(folio) != 1)
 			goto huge_unlock;
 
 		if (pageout_anon_only_filter && !folio_test_anon(folio))
@@ -448,7 +448,7 @@ regular_folio:
 		 * are sure it's worth. Split it if we are only owner.
 		 */
 		if (folio_test_large(folio)) {
-			if (folio_mapcount(folio) != 1)
+			if (folio_estimated_sharers(folio) != 1)
 				break;
 			if (pageout_anon_only_filter && !folio_test_anon(folio))
 				break;