1 From 2cd1c8d4dc7ecca9e9431e2dabe41ae9c7d89e51 Mon Sep 17 00:00:00 2001
2 From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
3 Date: Tue, 15 Nov 2011 14:49:09 -0800
4 Subject: x86/paravirt: PTE updates in k(un)map_atomic need to be synchronous, regardless of lazy_mmu mode
5
6 From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
7
8 commit 2cd1c8d4dc7ecca9e9431e2dabe41ae9c7d89e51 upstream.
9
10 Fix an outstanding issue that has been reported since 2.6.37.
11 On a heavily loaded machine, processing "fork()" calls could
12 crash with:
13
14 BUG: unable to handle kernel paging request at f573fc8c
15 IP: [<c01abc54>] swap_count_continued+0x104/0x180
16 *pdpt = 000000002a3b9027 *pde = 0000000001bed067 *pte = 0000000000000000 Oops: 0000 [#1] SMP
17 Modules linked in:
18 Pid: 1638, comm: apache2 Not tainted 3.0.4-linode37 #1
19 EIP: 0061:[<c01abc54>] EFLAGS: 00210246 CPU: 3
20 EIP is at swap_count_continued+0x104/0x180
21 .. snip..
22 Call Trace:
23 [<c01ac222>] ? __swap_duplicate+0xc2/0x160
24 [<c01040f7>] ? pte_mfn_to_pfn+0x87/0xe0
25 [<c01ac2e4>] ? swap_duplicate+0x14/0x40
26 [<c01a0a6b>] ? copy_pte_range+0x45b/0x500
27 [<c01a0ca5>] ? copy_page_range+0x195/0x200
28 [<c01328c6>] ? dup_mmap+0x1c6/0x2c0
29 [<c0132cf8>] ? dup_mm+0xa8/0x130
30 [<c013376a>] ? copy_process+0x98a/0xb30
31 [<c013395f>] ? do_fork+0x4f/0x280
32 [<c01573b3>] ? getnstimeofday+0x43/0x100
33 [<c010f770>] ? sys_clone+0x30/0x40
34 [<c06c048d>] ? ptregs_clone+0x15/0x48
35 [<c06bfb71>] ? syscall_call+0x7/0xb
36
37 The problem is that in copy_page_range() we turn lazy mode on,
38 and then in swap_entry_free() we call swap_count_continued()
39 which ends up in:
40
41 map = kmap_atomic(page, KM_USER0) + offset;
42
43 and then later we touch *map.
44
45 Since we are running in batched (lazy) mode, the PTE update is
46 queued rather than applied immediately, so the kmap_atomic is not
47 done synchronously and we end up dereferencing an address whose
48 mapping has not been set up yet.
49
50 Looking at kmap_atomic_prot_pfn(), it uses
51 'arch_flush_lazy_mmu_mode' and doing the same in
52 kmap_atomic_prot() and __kunmap_atomic() makes the problem go
53 away.
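
The failure mode and the fix can be illustrated with a small user-space
toy model. This is only a sketch and not kernel code: struct toy_mmu and
all toy_* names are hypothetical and invented for illustration. "live"
stands in for what the hardware page tables contain, "pending" for PTE
updates queued while lazy mode is on, and toy_flush() for the role that
arch_flush_lazy_mmu_mode() plays in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NSLOTS 4

/* Toy model (hypothetical): live[] is what a dereference would see,
 * pending[] holds updates queued while lazy mode is on. */
struct toy_mmu {
    int  live[NSLOTS];
    int  pending[NSLOTS];
    bool queued[NSLOTS];
    bool lazy;
};

/* Stand-in for arch_flush_lazy_mmu_mode(): apply every queued update. */
static void toy_flush(struct toy_mmu *m)
{
    for (size_t i = 0; i < NSLOTS; i++) {
        if (m->queued[i]) {
            m->live[i] = m->pending[i];
            m->queued[i] = false;
        }
    }
}

/* Stand-in for set_pte(): queued in lazy mode, immediate otherwise. */
static void toy_set_pte(struct toy_mmu *m, size_t idx, int val)
{
    assert(idx < NSLOTS);
    if (m->lazy) {
        m->pending[idx] = val;
        m->queued[idx] = true;
    } else {
        m->live[idx] = val;
    }
}

/* Map without flushing: in lazy mode the caller still sees the old
 * (empty) entry, which models the faulting dereference. */
static int toy_kmap_unfixed(struct toy_mmu *m, size_t idx, int page)
{
    toy_set_pte(m, idx, page);
    return m->live[idx];
}

/* Map and flush, as the patch does: the entry is visible on return. */
static int toy_kmap_fixed(struct toy_mmu *m, size_t idx, int page)
{
    toy_set_pte(m, idx, page);
    toy_flush(m);
    return m->live[idx];
}
```

With a zero-initialized struct toy_mmu and lazy set to true,
toy_kmap_unfixed() returns the stale empty entry, while toy_kmap_fixed()
returns the page that was just mapped, because the flush makes the
queued update visible before the caller touches the mapping.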
54
55 Interestingly, commit b8bcfe997e4615 ("x86/paravirt: remove lazy
56 mode in interrupts") removed part of this to fix an interrupt
57 issue - but it went too far and did not consider this scenario.
58
59 Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
60 Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
61 Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
62 Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
63 Signed-off-by: Ingo Molnar <mingo@elte.hu>
64 Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
65
66 ---
67 arch/x86/mm/highmem_32.c | 2 ++
68 1 file changed, 2 insertions(+)
69
70 --- a/arch/x86/mm/highmem_32.c
71 +++ b/arch/x86/mm/highmem_32.c
72 @@ -45,6 +45,7 @@ void *kmap_atomic_prot(struct page *page
73 vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx);
74 BUG_ON(!pte_none(*(kmap_pte-idx)));
75 set_pte(kmap_pte-idx, mk_pte(page, prot));
76 + arch_flush_lazy_mmu_mode();
77
78 return (void *)vaddr;
79 }
80 @@ -88,6 +89,7 @@ void __kunmap_atomic(void *kvaddr)
81 */
82 kpte_clear_flush(kmap_pte-idx, vaddr);
83 kmap_atomic_idx_pop();
84 + arch_flush_lazy_mmu_mode();
85 }
86 #ifdef CONFIG_DEBUG_HIGHMEM
87 else {