git.ipfire.org Git - thirdparty/kernel/stable-queue.git/blob - releases/3.10.94/x86-setup-extend-low-identity-map-to-cover-whole-kernel-range.patch
From f5f3497cad8c8416a74b9aaceb127908755d020a Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini@redhat.com>
Date: Wed, 14 Oct 2015 13:30:45 +0200
Subject: x86/setup: Extend low identity map to cover whole kernel range

From: Paolo Bonzini <pbonzini@redhat.com>

commit f5f3497cad8c8416a74b9aaceb127908755d020a upstream.

On 32-bit systems, the initial_page_table is reused by
efi_call_phys_prolog as an identity map to call
SetVirtualAddressMap. efi_call_phys_prolog takes care of
converting the current CPU's GDT to a physical address too.

For PAE kernels the identity mapping is achieved by aliasing the
first PDPE for the kernel memory mapping into the first PDPE
of initial_page_table. This makes the EFI stub's trick "just work".

However, for non-PAE kernels there is no guarantee that the identity
mapping in the initial_page_table extends as far as the GDT; in this
case, accesses to the GDT will cause a page fault (which quickly becomes
a triple fault). Fix this by copying the kernel mappings from
swapper_pg_dir to initial_page_table twice, both at PAGE_OFFSET and at
identity mapping.

For some reason, this is only reproducible with QEMU's dynamic translation
mode, and not for example with KVM. However, even under KVM one can clearly
see that the page table is bogus:

    $ qemu-system-i386 -pflash OVMF.fd -M q35 vmlinuz0 -s -S -daemonize
    $ gdb
    (gdb) target remote localhost:1234
    (gdb) hb *0x02858f6f
    Hardware assisted breakpoint 1 at 0x2858f6f
    (gdb) c
    Continuing.

    Breakpoint 1, 0x02858f6f in ?? ()
    (gdb) monitor info registers
    ...
    GDT= 0724e000 000000ff
    IDT= fffbb000 000007ff
    CR0=0005003b CR2=ff896000 CR3=032b7000 CR4=00000690
    ...

The page directory is sane:

    (gdb) x/4wx 0x32b7000
    0x32b7000: 0x03398063 0x03399063 0x0339a063 0x0339b063
    (gdb) x/4wx 0x3398000
    0x3398000: 0x00000163 0x00001163 0x00002163 0x00003163
    (gdb) x/4wx 0x3399000
    0x3399000: 0x00400003 0x00401003 0x00402003 0x00403003

but our particular page directory entry is empty:

    (gdb) x/1wx 0x32b7000 + (0x724e000 >> 22) * 4
    0x32b7070: 0x00000000
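
The address arithmetic in the last gdb command can be reproduced outside the debugger. The following is a minimal sketch, assuming plain non-PAE 32-bit paging (a 1024-entry page directory of 4-byte entries, indexed by bits 31..22 of the linear address). The helper names pde_index and pde_address are hypothetical and exist only for this illustration; they are not kernel functions.

```c
#include <stdint.h>

/*
 * Non-PAE 32-bit paging: bits 31..22 of a linear address select
 * one of 1024 page directory entries (PDEs), each 4 bytes wide.
 * Both helpers are hypothetical, for illustration only.
 */
static uint32_t pde_index(uint32_t linear)
{
	return linear >> 22;
}

/* Physical address of the PDE covering 'linear', given a CR3 value. */
static uint32_t pde_address(uint32_t cr3, uint32_t linear)
{
	/* The low 12 bits of CR3 hold flags, not address bits. */
	return (cr3 & 0xfffff000u) + pde_index(linear) * 4u;
}
```

With CR3=0x032b7000 and the GDT base 0x0724e000 from the register dump, pde_index() gives 28 and pde_address() gives 0x032b7070, the location gdb reports as zero.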

[ It appears that you can skate past this issue if you don't receive
any interrupts while the bogus GDT pointer is loaded, or if you avoid
reloading the segment registers in general.

Andy Lutomirski provides some additional insight:

"AFAICT it's entirely permissible for the GDTR and/or LDT
descriptor to point to unmapped memory. Any attempt to use them
(segment loads, interrupts, IRET, etc) will try to access that memory
as if the access came from CPL 0 and, if the access fails, will
generate a valid page fault with CR2 pointing into the GDT or
LDT."

Up until commit 23a0d4e8fa6d ("efi: Disable interrupts around EFI
calls, not in the epilog/prolog calls") interrupts were disabled
around the prolog and epilog calls, and the functional GDT was
re-installed before interrupts were re-enabled.

Which explains why no one has hit this issue until now. ]

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reported-by: Laszlo Ersek <lersek@redhat.com>
Cc: <stable@vger.kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
[ Updated changelog. ]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kernel/setup.c | 8 ++++++++
 1 file changed, 8 insertions(+)

--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1156,6 +1156,14 @@ void __init setup_arch(char **cmdline_p)
 	clone_pgd_range(initial_page_table + KERNEL_PGD_BOUNDARY,
 			swapper_pg_dir + KERNEL_PGD_BOUNDARY,
 			KERNEL_PGD_PTRS);
+
+	/*
+	 * sync back low identity map too. It is used for example
+	 * in the 32-bit EFI stub.
+	 */
+	clone_pgd_range(initial_page_table,
+			swapper_pg_dir + KERNEL_PGD_BOUNDARY,
+			KERNEL_PGD_PTRS);
 #endif
 
 	tboot_probe();
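
The effect of the two clone_pgd_range() calls in the hunk above can be modelled in user space. This is a rough sketch, not kernel code. It assumes a non-PAE layout with the default 3G/1G split (PAGE_OFFSET = 0xC0000000), reduces pgd_t to a bare 32-bit word, and treats clone_pgd_range() as a plain memcpy of page directory entries. The wrapper sync_initial_page_table() is a hypothetical name introduced only for this illustration.

```c
#include <stdint.h>
#include <string.h>

/* Assumed non-PAE layout with the default 3G/1G split. */
#define PTRS_PER_PGD		1024u
#define PGDIR_SHIFT		22
#define PAGE_OFFSET		0xC0000000u
#define KERNEL_PGD_BOUNDARY	(PAGE_OFFSET >> PGDIR_SHIFT)		/* 768 */
#define KERNEL_PGD_PTRS		(PTRS_PER_PGD - KERNEL_PGD_BOUNDARY)	/* 256 */

typedef uint32_t pgd_t;	/* simplified: the kernel wraps this in a struct */

/* Simplified model: copy 'count' page directory entries. */
static void clone_pgd_range(pgd_t *dst, const pgd_t *src, unsigned int count)
{
	memcpy(dst, src, count * sizeof(pgd_t));
}

/* Hypothetical wrapper mirroring the two calls in the patched hunk. */
static void sync_initial_page_table(pgd_t *initial, const pgd_t *swapper)
{
	/* Kernel mapping at PAGE_OFFSET (already done before the patch). */
	clone_pgd_range(initial + KERNEL_PGD_BOUNDARY,
			swapper + KERNEL_PGD_BOUNDARY,
			KERNEL_PGD_PTRS);

	/* Same entries again at index 0: the low identity map. */
	clone_pgd_range(initial,
			swapper + KERNEL_PGD_BOUNDARY,
			KERNEL_PGD_PTRS);
}
```

Under these assumptions the entry at index KERNEL_PGD_BOUNDARY + 28 of swapper_pg_dir, which covers the physical range holding the GDT at 0x0724e000, also lands at index 28 of initial_page_table, so the physical GDT address resolves through the identity map.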