From a3e0f9e47d5ef7858a26cc12d90ad5146e802d47 Mon Sep 17 00:00:00 2001
From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Date: Thu, 2 Jan 2014 12:58:51 -0800
Subject: mm/memory-failure.c: transfer page count from head page to tail page after split thp

From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>

commit a3e0f9e47d5ef7858a26cc12d90ad5146e802d47 upstream.
Memory failures on thp tail pages cause a kernel panic like the one below:
  mce: [Hardware Error]: Machine check events logged
  MCE exception done on CPU 7
  BUG: unable to handle kernel NULL pointer dereference at 0000000000000058
  IP: [<ffffffff811b7cd1>] dequeue_hwpoisoned_huge_page+0x131/0x1e0
  PGD bae42067 PUD ba47d067 PMD 0
  CPU: 7 PID: 128 Comm: kworker/7:2 Tainted: G M O 3.13.0-rc4-131217-1558-00003-g83b7df08e462 #25
  me_huge_page+0x3e/0x50
  memory_failure+0x4bb/0xc20
  mce_process_work+0x3e/0x70
  process_one_work+0x171/0x420
  worker_thread+0x11b/0x3a0
  ? manage_workers.isra.25+0x2b0/0x2b0
  ? kthread_create_on_node+0x190/0x190
  ret_from_fork+0x7c/0xb0
  ? kthread_create_on_node+0x190/0x190
  RIP dequeue_hwpoisoned_huge_page+0x131/0x1e0
The reason for this problem is as follows:
 - When we have a memory error on a thp tail page, the memory error
   handler grabs a refcount of the head page to keep the thp under us.
 - Before unmapping the error page from processes, we split the thp,
   and the split does not change the refcounts of the head or tail pages.
 - Then we call try_to_unmap() over the error page (which was a tail
   page before). Because we didn't pin the error page when handling the
   memory error, the error page is freed and removed from the LRU list.
 - The error page is no longer on the LRU list, so the first page state
   check returns "unknown page," and we move on to the second check
   with the saved page flag.
 - The saved page flag has PG_tail set, so the second page state check
   treats the page as a huge page.
 - We call me_huge_page() for the freed error page, and then we hit the
   above panic.
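
The two-stage page state check described above can be modelled roughly
as follows. This is a minimal userspace sketch, not the code in
mm/memory-failure.c: the flag bits, the table contents and the names
toy_lookup() and toy_classify() are all assumptions made for
illustration.

```c
/* Hypothetical flag bits for this sketch; not the kernel's PG_* values. */
enum { TOY_LRU = 1u << 0, TOY_TAIL = 1u << 1 };

enum toy_result { TOY_RES_HUGE, TOY_RES_LRU, TOY_RES_UNKNOWN };

struct toy_state {
	unsigned int mask;	/* which flag bits this entry inspects */
	unsigned int res;	/* value those bits must have to match */
	enum toy_result result;
};

/* Scanned in order; the entry with mask 0 matches anything (catch-all). */
static const struct toy_state toy_states[] = {
	{ TOY_TAIL, TOY_TAIL, TOY_RES_HUGE },
	{ TOY_LRU,  TOY_LRU,  TOY_RES_LRU },
	{ 0,        0,        TOY_RES_UNKNOWN },
};

static const struct toy_state *toy_lookup(unsigned int flags)
{
	const struct toy_state *ps = toy_states;

	/* Terminates at the catch-all entry at the latest. */
	while ((flags & ps->mask) != ps->res)
		ps++;
	return ps;
}

/*
 * First classify by the page's current flags. Only if that yields the
 * catch-all ("unknown page") entry, retry with the flags saved before
 * the page was unmapped.
 */
static enum toy_result toy_classify(unsigned int current_flags,
				    unsigned int saved_flags)
{
	const struct toy_state *ps = toy_lookup(current_flags);

	if (!ps->mask)
		ps = toy_lookup(saved_flags);
	return ps->result;
}
```

In this model, a freed page whose current flags are empty but whose
saved flags still carry the tail bit is classified as a huge page by
the second lookup, which mirrors the sequence leading to
me_huge_page().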
The root cause is that we didn't move the refcount from the head page to the
tail page after splitting the thp. This patch does so.
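
As an illustration of the refcount transfer, here is a minimal userspace
sketch. It is a toy model under stated assumptions, not kernel code:
struct toy_page and the toy_get(), toy_put() and toy_transfer_pin()
helpers are hypothetical stand-ins, and the put-then-get ordering is
assumed for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct page: a reference count and a "freed" marker. */
struct toy_page {
	int refcount;
	bool freed;
};

static void toy_get(struct toy_page *p)
{
	assert(!p->freed);
	p->refcount++;
}

static void toy_put(struct toy_page *p)
{
	assert(p->refcount > 0);
	if (--p->refcount == 0)
		p->freed = true;	/* last reference dropped */
}

/*
 * After the split, the handler's pin still sits on the former head page.
 * Move it to the page that actually has the error, so that dropping the
 * mapping reference later cannot free the error page under the handler.
 */
static void toy_transfer_pin(struct toy_page *head, struct toy_page *err)
{
	if (head != err) {
		toy_put(head);
		toy_get(err);
	}
}
```

With a former head page holding two references (one mapping, one
handler pin) and the error page holding one (its mapping), calling
toy_transfer_pin() leaves the error page with two references, so a
later toy_put() that models the unmap no longer frees it.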

This panic was introduced by commit 524fca1e73 ("HWPOISON: fix
misjudgement of page_action() for errors on mlocked pages"). Note that we
did have the same refcount problem before this commit, but it was just
ignored because we had only the first page state check, which returned
"unknown page." The commit changed the refcount problem from "doesn't work"
to "kernel panic."

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 mm/memory-failure.c | 10 ++++++++++
 1 file changed, 10 insertions(+)
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -938,6 +938,16 @@ static int hwpoison_user_mappings(struct
 			BUG_ON(!PageHWPoison(p));
+		 * We pinned the head page for hwpoison handling,
+		 * now we split the thp and we are interested in
+		 * the hwpoisoned raw page, so move the refcount
 		/* THP is split, so ppage should be the real poisoned page. */