From a5caa209ba9c29c6421292e7879d2387a2ef39c9 Mon Sep 17 00:00:00 2001
From: Matt Fleming <matt.fleming@intel.com>
Date: Fri, 25 Sep 2015 23:02:18 +0100
Subject: x86/efi: Fix boot crash by mapping EFI memmap entries bottom-up at runtime, instead of top-down

From: Matt Fleming <matt.fleming@intel.com>

commit a5caa209ba9c29c6421292e7879d2387a2ef39c9 upstream.

UEFI v2.5 introduced EFI_PROPERTIES_TABLE, which signals that
the firmware PE/COFF loader supports splitting code and data
sections of PE/COFF images into separate EFI memory map entries.
This allows the kernel to map those regions with strict memory
protections, e.g. EFI_MEMORY_RO for code, EFI_MEMORY_XP for
data, etc.

Unfortunately, an unwritten requirement of this new feature is
that the regions need to be mapped with the same offsets
relative to each other as observed in the EFI memory map. If
this is not done, crashes like this may occur:

  BUG: unable to handle kernel paging request at fffffffefe6086dd
  IP: [<fffffffefe6086dd>] 0xfffffffefe6086dd
  [<ffffffff8104c90e>] efi_call+0x7e/0x100
  [<ffffffff81602091>] ? virt_efi_set_variable+0x61/0x90
  [<ffffffff8104c583>] efi_delete_dummy_variable+0x63/0x70
  [<ffffffff81f4e4aa>] efi_enter_virtual_mode+0x383/0x392
  [<ffffffff81f37e1b>] start_kernel+0x38a/0x417
  [<ffffffff81f37495>] x86_64_start_reservations+0x2a/0x2c
  [<ffffffff81f37582>] x86_64_start_kernel+0xeb/0xef

Here 0xfffffffefe6086dd refers to an address the firmware
expects to be mapped but which the OS never claimed was mapped.
The issue is that included in these regions are relative
addresses to other regions which were emitted by the firmware
toolchain before the "splitting" of sections occurred at

Needless to say, we don't satisfy this unwritten requirement on
x86_64 and instead map the EFI memory map entries in reverse
order. The above crash is almost certainly triggerable with any
kernel newer than v3.13 because that's when we rewrote the EFI
runtime region mapping code, in commit d2f7cbe7b26a ("x86/efi:
Runtime services virtual mapping"). For kernel versions before
v3.13 things may work by pure luck depending on the
fragmentation of the kernel virtual address space at the time we

Instead of mapping the EFI memory map entries in reverse order,
where entry N has a higher virtual address than entry N+1, map
them in the same order as they appear in the EFI memory map to
preserve this relative offset between regions.

This patch has been kept as small as possible with the intention
that it should be applied aggressively to stable and
distribution kernels. It is very much a bugfix rather than
support for a new feature, since when EFI_PROPERTIES_TABLE is
enabled we must map things as outlined above to even boot - we
have no way of asking the firmware not to split the code/data

In fact, this patch doesn't even make use of the more strict
memory protections available in UEFI v2.5. That will come later.

Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Chun-Yi <jlee@suse.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: James Bottomley <JBottomley@Odin.com>
Cc: Lee, Chun-Yi <jlee@suse.com>
Cc: Leif Lindholm <leif.lindholm@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Jones <pjones@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/1443218539-7610-2-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/platform/efi/efi.c | 67 +++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 66 insertions(+), 1 deletion(-)

--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -679,6 +679,70 @@ out:
+ * Iterate the EFI memory map in reverse order because the regions
+ * will be mapped top-down. The end result is the same as if we had
+ * mapped things forward, but doesn't require us to change the
+ * existing implementation of efi_map_region().
+static inline void *efi_map_next_entry_reverse(void *entry)
+		return memmap.map_end - memmap.desc_size;
+	entry -= memmap.desc_size;
+	if (entry < memmap.map)
+ * efi_map_next_entry - Return the next EFI memory map descriptor
+ * @entry: Previous EFI memory map descriptor
+ * This is a helper function to iterate over the EFI memory map, which
+ * we do in different orders depending on the current configuration.
+ * To begin traversing the memory map @entry must be %NULL.
+ * Returns %NULL when we reach the end of the memory map.
+static void *efi_map_next_entry(void *entry)
+	if (!efi_enabled(EFI_OLD_MEMMAP) && efi_enabled(EFI_64BIT)) {
+		 * Starting in UEFI v2.5 the EFI_PROPERTIES_TABLE
+		 * config table feature requires us to map all entries
+		 * in the same order as they appear in the EFI memory
+		 * map. That is to say, entry N must have a lower
+		 * virtual address than entry N+1. This is because the
+		 * firmware toolchain leaves relative references in
+		 * the code/data sections, which are split and become
+		 * separate EFI memory regions. Mapping things
+		 * out-of-order leads to the firmware accessing
+		 * unmapped addresses.
+		 * Since we need to map things this way whether or not
+		 * the kernel actually makes use of
+		 * EFI_PROPERTIES_TABLE, let's just switch to this
+		 * scheme by default for 64-bit.
+		return efi_map_next_entry_reverse(entry);
+	entry += memmap.desc_size;
+	if (entry >= memmap.map_end)
  * Map the efi memory ranges of the runtime services and update new_mmap with
@@ -688,7 +752,8 @@ static void * __init efi_map_regions(int
 	unsigned long left = 0;
 	efi_memory_desc_t *md;
-	for (p = memmap.map; p < memmap.map_end; p += memmap.desc_size) {
+	while ((p = efi_map_next_entry(p))) {
 		if (!(md->attribute & EFI_MEMORY_RUNTIME)) {