From 9625456cc76391b7f3f2809579126542a8ed4d39 Mon Sep 17 00:00:00 2001
From: Shaohua Li <shli@fb.com>
Date: Tue, 3 Oct 2017 16:15:32 -0700
Subject: mm: fix data corruption caused by lazyfree page

From: Shaohua Li <shli@fb.com>

commit 9625456cc76391b7f3f2809579126542a8ed4d39 upstream.

MADV_FREE clears the pte dirty bit and then marks the page lazyfree
(clears SwapBacked). There is no lock to prevent the page from being
added to the swap cache by page reclaim between these two steps. If
page reclaim finds such a page, it simply adds the page to the swap
cache without paging the page out to swap, because the page is marked
clean. A later page fault will then read data from a swap slot that
does not hold the original data, so we have data corruption. To fix
the issue, we mark the page dirty and page it out.

However, we shouldn't dirty all pages which are clean and in the swap
cache: a swapped-in page is in the swap cache and clean too. So we
only dirty pages that are added to the swap cache during page reclaim,
which cannot be swapped-in pages. As Minchan suggested, simply
dirtying the page in add_to_swap() does the job.
24 | ||
25 | Fixes: 802a3a92ad7a ("mm: reclaim MADV_FREE pages") | |
26 | Link: http://lkml.kernel.org/r/08c84256b007bf3f63c91d94383bd9eb6fee2daa.1506446061.git.shli@fb.com | |
27 | Signed-off-by: Shaohua Li <shli@fb.com> | |
28 | Reported-by: Artem Savkov <asavkov@redhat.com> | |
29 | Acked-by: Michal Hocko <mhocko@suse.com> | |
30 | Acked-by: Minchan Kim <minchan@kernel.org> | |
31 | Cc: Johannes Weiner <hannes@cmpxchg.org> | |
32 | Cc: Hillf Danton <hdanton@sina.com> | |
33 | Cc: Hugh Dickins <hughd@google.com> | |
34 | Cc: Rik van Riel <riel@redhat.com> | |
35 | Cc: Mel Gorman <mgorman@techsingularity.net> | |
36 | Signed-off-by: Andrew Morton <akpm@linux-foundation.org> | |
37 | Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> | |
38 | Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> | |
39 | ||
40 | --- | |
41 | mm/swap_state.c | 11 +++++++++++ | |
42 | 1 file changed, 11 insertions(+) | |
43 | ||
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -219,6 +219,17 @@ int add_to_swap(struct page *page)
 		 * clear SWAP_HAS_CACHE flag.
 		 */
 		goto fail;
+	/*
+	 * Normally the page will be dirtied in unmap because its pte should be
+	 * dirty. A special case is a MADV_FREE page: its pte could have the
+	 * dirty bit cleared while its SwapBacked bit is still set, because
+	 * clearing the dirty bit and the SwapBacked bit is not lock protected.
+	 * For such a page, unmap will not set the dirty bit, so page reclaim
+	 * will not write the page out. This can cause data corruption when the
+	 * page is swapped in later. Always setting the dirty bit for the page
+	 * solves the problem.
+	 */
+	set_page_dirty(page);
 
 	return 1;