Commit | Line | Data |
---|---|---|
1 | L1TF - L1 Terminal Fault |
2 | ======================== | |
3 | ||
4 | L1 Terminal Fault is a hardware vulnerability which allows unprivileged | |
5 | speculative access to data which is available in the Level 1 Data Cache | |
6 | when the page table entry controlling the virtual address, which is used | |
7 | for the access, has the Present bit cleared or other reserved bits set. | |
8 | ||
9 | Affected processors | |
10 | ------------------- | |
11 | ||
12 | This vulnerability affects a wide range of Intel processors. The | |
13 | vulnerability is not present on: | |
14 | ||
15 | - Processors from AMD, Centaur and other non-Intel vendors | |
16 | ||
17 | - Older processor models, where the CPU family is < 6 | |
18 | ||
19 | - A range of Intel ATOM processors (Cedarview, Cloverview, Lincroft, | |
20 | Penwell, Pineview, Silvermont, Airmont, Merrifield) | |
21 | ||
22 | - The Intel Core Duo Yonah variants (2006 - 2008) | |
23 | ||
24 | - The Intel XEON PHI family | |
25 | ||
26 | - Intel processors which have the ARCH_CAP_RDCL_NO bit set in the | |
27 | IA32_ARCH_CAPABILITIES MSR. If the bit is set the CPU is not affected | |
28 | by the Meltdown vulnerability either. These CPUs should become | |
29 | available by end of 2018. | |
30 | ||
31 | Whether a processor is affected or not can be read out from the L1TF | |
32 | vulnerability file in sysfs. See :ref:`l1tf_sys_info`. | |
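
The ARCH_CAP_RDCL_NO check mentioned in the list above can be illustrated with a small sketch. It is not part of the kernel interface described here: the MSR index 0x10A and the bit position 0 are taken from Intel's public documentation rather than from this text, the function names are hypothetical, and reading the MSR assumes the Linux ``msr`` driver is loaded and the caller is root. The sysfs file remains the authoritative way to check whether a system is affected.

```python
import os
import struct

# Assumptions (not stated in this document): MSR index and bit position
# as published in Intel's documentation.
IA32_ARCH_CAPABILITIES = 0x10A
ARCH_CAP_RDCL_NO = 1 << 0


def rdcl_no_set(arch_capabilities: int) -> bool:
    """Return True if the RDCL_NO bit is set in the given MSR value."""
    return bool(arch_capabilities & ARCH_CAP_RDCL_NO)


def read_arch_capabilities(cpu: int = 0) -> int:
    """Read IA32_ARCH_CAPABILITIES through /dev/cpu/<cpu>/msr.

    Needs root and the 'msr' kernel module. Raises OSError if the
    device node is missing or the CPU does not implement the MSR.
    """
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        # The msr driver uses the file offset as the MSR index.
        raw = os.pread(fd, 8, IA32_ARCH_CAPABILITIES)
    finally:
        os.close(fd)
    return struct.unpack("<Q", raw)[0]
```

A caller would typically wrap ``read_arch_capabilities()`` in a ``try``/``except OSError`` and treat a failure as "bit not known", not as "bit clear".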
33 | ||
34 | Related CVEs | |
35 | ------------ | |
36 | ||
37 | The following CVE entries are related to the L1TF vulnerability: | |
38 | ||
39 | ============= ================= ============================== | |
40 | CVE-2018-3615 L1 Terminal Fault SGX related aspects | |
41 | CVE-2018-3620 L1 Terminal Fault OS, SMM related aspects | |
42 | CVE-2018-3646 L1 Terminal Fault Virtualization related aspects | |
43 | ============= ================= ============================== | |
44 | ||
45 | Problem | |
46 | ------- | |
47 | ||
48 | If an instruction accesses a virtual address for which the relevant page | |
49 | table entry (PTE) has the Present bit cleared or other reserved bits set, | |
50 | then speculative execution ignores the invalid PTE and loads the referenced | |
51 | data if it is present in the Level 1 Data Cache, as if the page referenced | |
52 | by the address bits in the PTE was still present and accessible. | |
53 | ||
54 | While this is a purely speculative mechanism and the instruction will raise | |
55 | a page fault when it is retired eventually, the pure act of loading the | |
56 | data and making it available to other speculative instructions opens up the | |
57 | opportunity for side channel attacks to unprivileged malicious code, | |
58 | similar to the Meltdown attack. | |
59 | ||
60 | While Meltdown breaks the user space to kernel space protection, L1TF | |
61 | allows attacks on any physical memory address in the system, and the attack | |
62 | works across all protection domains. It allows attacks on SGX and also | |
63 | works from inside virtual machines because the speculation bypasses the | |
64 | extended page table (EPT) protection mechanism. | |
65 | ||
66 | ||
67 | Attack scenarios | |
68 | ---------------- | |
69 | ||
70 | 1. Malicious user space | |
71 | ^^^^^^^^^^^^^^^^^^^^^^^ | |
72 | ||
73 | Operating Systems store arbitrary information in the address bits of a | |
74 | PTE which is marked non present. This allows a malicious user space | |
75 | application to attack the physical memory to which these PTEs resolve. | |
76 | In some cases user-space can maliciously influence the information | |
77 | encoded in the address bits of the PTE, thus making attacks more | |
78 | deterministic and more practical. | |
79 | ||
80 | The Linux kernel contains a mitigation for this attack vector, PTE | |
81 | inversion, which is permanently enabled and has no performance | |
82 | impact. The kernel ensures that the address bits of PTEs, which are not | |
83 | marked present, never point to cacheable physical memory space. | |
84 | ||
85 | A system with an up to date kernel is protected against attacks from | |
86 | malicious user space applications. | |
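
The idea behind PTE inversion can be modelled in a few lines. This is a deliberately simplified sketch and not the kernel's implementation: the 52-bit physical address width, the ``PROTNONE`` flag bit and all function names are assumptions chosen for illustration. It only shows the principle that the address bits of a non-present entry are stored inverted, so that a speculative walk resolves to a very high address instead of the real frame.

```python
PAGE_SHIFT = 12
PRESENT = 1 << 0
PROTNONE = 1 << 8  # hypothetical software flag marking a non-present entry

# Assumption: bits 12..51 of a 64-bit PTE carry the physical frame number.
PFN_MASK = ((1 << 52) - 1) & ~((1 << PAGE_SHIFT) - 1)


def needs_invert(pte: int) -> bool:
    """Non-zero entries without the Present bit store inverted address bits."""
    return pte != 0 and not (pte & PRESENT)


def make_pte(pfn: int, flags: int) -> int:
    """Build a PTE value; invert the address bits if it is non-present."""
    pte = ((pfn << PAGE_SHIFT) & PFN_MASK) | flags
    if needs_invert(pte):
        pte ^= PFN_MASK
    return pte


def pte_pfn(pte: int) -> int:
    """Recover the frame number the OS meant, undoing the inversion."""
    if needs_invert(pte):
        pte ^= PFN_MASK
    return (pte & PFN_MASK) >> PAGE_SHIFT
```

With this model, ``make_pte(0x1234, PROTNONE)`` stores address bits that refer to a frame near the top of the physical address space, while ``pte_pfn()`` still returns ``0x1234`` to the OS.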
87 | ||
88 | 2. Malicious guest in a virtual machine | |
89 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
90 | ||
91 | The fact that L1TF breaks all domain protections allows malicious guest | |
92 | OSes, which can control the PTEs directly, and malicious guest user | |
93 | space applications, which run on an unprotected guest kernel lacking the | |
94 | PTE inversion mitigation for L1TF, to attack physical host memory. | |
95 | ||
96 | A special aspect of L1TF in the context of virtualization is symmetric | |
97 | multi threading (SMT). The Intel implementation of SMT is called | |
98 | HyperThreading. The fact that Hyperthreads on the affected processors | |
99 | share the L1 Data Cache (L1D) is important for this. As the flaw only | |
100 | allows attacks on data which is present in L1D, a malicious guest running | |
101 | on one Hyperthread can attack the data which is brought into the L1D by | |
102 | the context which runs on the sibling Hyperthread of the same physical | |
103 | core. This context can be host OS, host user space or a different guest. | |
104 | ||
105 | If the processor does not support Extended Page Tables, the attack is | |
106 | only possible when the hypervisor does not sanitize the content of the | |
107 | effective (shadow) page tables. | |
108 | ||
109 | While solutions exist to mitigate these attack vectors fully, these | |
110 | mitigations are not enabled by default in the Linux kernel because they | |
111 | can affect performance significantly. The kernel provides several | |
112 | mechanisms which can be utilized to address the problem depending on the | |
113 | deployment scenario. The mitigations, their protection scope and impact | |
114 | are described in the next sections. | |
115 | ||
116 | The default mitigations and the rationale for choosing them are explained | |
117 | at the end of this document. See :ref:`default_mitigations`. | |
118 | ||
119 | .. _l1tf_sys_info: | |
120 | ||
121 | L1TF system information | |
122 | ----------------------- | |
123 | ||
124 | The Linux kernel provides a sysfs interface to enumerate the current L1TF | |
125 | status of the system: whether the system is vulnerable, and which | |
126 | mitigations are active. The relevant sysfs file is: | |
127 | ||
128 | /sys/devices/system/cpu/vulnerabilities/l1tf | |
129 | ||
130 | The possible values in this file are: | |
131 | ||
132 | =========================== =============================== | |
133 | 'Not affected' The processor is not vulnerable | |
134 | 'Mitigation: PTE Inversion' The host protection is active | |
135 | =========================== =============================== | |
136 | ||
137 | If KVM/VMX is enabled and the processor is vulnerable then the following | |
138 | information is appended to the 'Mitigation: PTE Inversion' part: | |
139 | ||
140 | - SMT status: | |
141 | ||
142 | ===================== ================ | |
143 | 'VMX: SMT vulnerable' SMT is enabled | |
144 | 'VMX: SMT disabled' SMT is disabled | |
145 | ===================== ================ | |
146 | ||
147 | - L1D Flush mode: | |
148 | ||
149 | ================================ ==================================== | |
150 | 'L1D vulnerable' L1D flushing is disabled | |
151 | ||
152 | 'L1D conditional cache flushes' L1D flush is conditionally enabled | |
153 | ||
154 | 'L1D cache flushes' L1D flush is unconditionally enabled | |
155 | ================================ ==================================== | |
156 | ||
157 | The resulting grade of protection is discussed in the following sections. | |
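
As a rough illustration of how the status file could be consumed by a script, the sketch below matches the fragments listed in the tables above. The exact way the kernel concatenates these fragments is not specified in this text and may differ between kernel versions, so the parser only looks for substrings; the function names and the dictionary layout are hypothetical.

```python
L1TF_PATH = "/sys/devices/system/cpu/vulnerabilities/l1tf"


def parse_l1tf_status(text: str) -> dict:
    """Map the status string to a small dictionary.

    'smt' and 'l1d_flush' stay None when the corresponding fragment
    is absent, e.g. when KVM/VMX is not enabled.
    """
    text = text.strip()
    status = {
        "affected": "Not affected" not in text,
        "pte_inversion": "PTE Inversion" in text,
        "smt": None,
        "l1d_flush": None,
    }
    if "SMT vulnerable" in text:
        status["smt"] = "enabled"
    elif "SMT disabled" in text:
        status["smt"] = "disabled"
    # Test the longer phrase first: it contains the shorter one.
    if "conditional cache flushes" in text:
        status["l1d_flush"] = "conditional"
    elif "cache flushes" in text:
        status["l1d_flush"] = "unconditional"
    elif "L1D vulnerable" in text:
        status["l1d_flush"] = "disabled"
    return status


def read_l1tf_status(path: str = L1TF_PATH) -> dict:
    """Read and parse the sysfs file (raises OSError if it is missing)."""
    with open(path) as f:
        return parse_l1tf_status(f.read())
```

Matching on substrings keeps the sketch tolerant of formatting differences, at the cost of not validating the overall string.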
158 | ||
159 | ||
160 | Host mitigation mechanism | |
161 | ------------------------- | |
162 | ||
163 | The kernel is unconditionally protected against L1TF attacks from malicious | |
164 | user space running on the host. | |
165 | ||
166 | ||
167 | Guest mitigation mechanisms | |
168 | --------------------------- | |
169 | ||
170 | .. _l1d_flush: | |
171 | ||
172 | 1. L1D flush on VMENTER | |
173 | ^^^^^^^^^^^^^^^^^^^^^^^ | |
174 | ||
175 | To make sure that a guest cannot attack data which is present in the L1D | |
176 | the hypervisor flushes the L1D before entering the guest. | |
177 | ||
178 | Flushing the L1D evicts not only the data which should not be accessed | |
179 | by a potentially malicious guest, it also flushes the guest | |
180 | data. Flushing the L1D has a performance impact as the processor has to | |
181 | bring the flushed guest data back into the L1D. Depending on the | |
182 | frequency of VMEXIT/VMENTER and the type of computations in the guest | |
183 | performance degradation in the range of 1% to 50% has been observed. For | |
184 | scenarios where guest VMEXIT/VMENTER are rare the performance impact is | |
185 | minimal. Virtio and mechanisms like posted interrupts are designed to | |
186 | confine the VMEXITs to a bare minimum, but specific configurations and | |
187 | application scenarios might still suffer from a high VMEXIT rate. | |
188 | ||
189 | The kernel provides two L1D flush modes: | |
190 | - conditional ('cond') | |
191 | - unconditional ('always') | |
192 | ||
193 | The conditional mode avoids L1D flushing after VMEXITs which execute | |
194 | only audited code paths before the corresponding VMENTER. These code | |
195 | paths have been verified not to expose secrets or other | |
196 | interesting data to an attacker, but they can leak information about the | |
197 | address space layout of the hypervisor. | |
198 | ||
199 | Unconditional mode flushes L1D on all VMENTER invocations and provides | |
200 | maximum protection. It has a higher overhead than the conditional | |
201 | mode. The overhead cannot be quantified correctly as it depends on the | |
202 | workload scenario and the resulting number of VMEXITs. | |
203 | ||
204 | The general recommendation is to enable L1D flush on VMENTER. The kernel | |
205 | defaults to conditional mode on affected processors. | |
206 | ||
207 | **Note** that L1D flush does not prevent the SMT problem because the | |
208 | sibling thread will also bring back its data into the L1D which makes it | |
209 | attackable again. | |
210 | ||
211 | L1D flush can be controlled by the administrator via the kernel command | |
212 | line and sysfs control files. See :ref:`mitigation_control_command_line` | |
213 | and :ref:`mitigation_control_kvm`. | |
214 | ||
215 | .. _guest_confinement: | |
216 | ||
217 | 2. Guest VCPU confinement to dedicated physical cores | |
218 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
219 | ||
220 | To address the SMT problem, it is possible to make a guest or a group of | |
221 | guests affine to one or more physical cores. The proper mechanism for | |
222 | that is to utilize exclusive cpusets to ensure that no other guest or | |
223 | host tasks can run on these cores. | |
224 | ||
225 | If only a single guest or related guests run on sibling SMT threads on | |
226 | the same physical core then they can only attack their own memory and | |
227 | restricted parts of the host memory. | |
228 | ||
229 | Host memory is attackable when one of the sibling SMT threads runs in | |
230 | host OS (hypervisor) context and the other in guest context. The amount | |
231 | of valuable information from the host OS context depends on the context | |
232 | in which the host OS executes, i.e. interrupts, soft interrupts and kernel | |
233 | threads. The amount of valuable data from these contexts cannot be | |
234 | declared as non-interesting for an attacker without deep inspection of | |
235 | the code. | |
236 | ||
237 | **Note** that assigning guests to a fixed set of physical cores affects | |
238 | the ability of the scheduler to do load balancing and might have | |
239 | negative effects on CPU utilization depending on the hosting | |
240 | scenario. Disabling SMT might be a viable alternative for particular | |
241 | scenarios. | |
242 | ||
243 | For further information about confining guests to a single or to a group | |
244 | of cores consult the cpusets documentation: | |
245 | ||
246 | https://www.kernel.org/doc/Documentation/cgroup-v1/cpusets.txt | |
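
When picking CPUs for an exclusive cpuset, it helps to verify that the selection covers whole physical cores, so that no sibling thread is left to other tasks. The following sketch assumes the sysfs topology file ``thread_siblings_list`` under each CPU directory; the helper names are hypothetical and the cpuset itself still has to be configured as described in the cpusets documentation.

```python
from pathlib import Path


def parse_cpu_list(text: str) -> list:
    """Parse a kernel cpulist string such as '0-2,8' into sorted CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def thread_siblings(cpu: int) -> list:
    """Return the logical CPUs sharing a physical core with 'cpu'."""
    path = Path(f"/sys/devices/system/cpu/cpu{cpu}/topology/thread_siblings_list")
    return parse_cpu_list(path.read_text())


def covers_whole_cores(selected, siblings_of=thread_siblings) -> bool:
    """True if every selected CPU has all of its siblings selected too."""
    chosen = set(selected)
    return all(set(siblings_of(cpu)) <= chosen for cpu in chosen)
```

The ``siblings_of`` parameter exists so that the check can be exercised with a fixed topology instead of the live system.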
247 | ||
248 | .. _interrupt_isolation: | |
249 | ||
250 | 3. Interrupt affinity | |
251 | ^^^^^^^^^^^^^^^^^^^^^ | |
252 | ||
253 | Interrupts can be made affine to logical CPUs. This is not universally | |
254 | true because there are types of interrupts which are truly per CPU | |
255 | interrupts, e.g. the local timer interrupt. Aside from that, multi-queue | |
256 | devices affine their interrupts to single CPUs or groups of CPUs per | |
257 | queue without allowing the administrator to control the affinities. | |
258 | ||
259 | Moving the interrupts, which can be affinity controlled, away from CPUs | |
260 | which run untrusted guests, reduces the attack vector space. | |
261 | ||
262 | Whether the interrupts which are affine to CPUs that run untrusted | |
263 | guests provide interesting data for an attacker depends on the system | |
264 | configuration and the scenarios which run on the system. While for some | |
265 | of the interrupts it can be assumed that they won't expose interesting | |
266 | information beyond exposing hints about the host OS memory layout, there | |
267 | is no way to make general assumptions. | |
268 | ||
269 | Interrupt affinity can be controlled by the administrator via the | |
270 | /proc/irq/$NR/smp_affinity[_list] files. Limited documentation is | |
271 | available at: | |
272 | ||
273 | https://www.kernel.org/doc/Documentation/IRQ-affinity.txt | |
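
Moving controllable interrupts away from guest CPUs might be scripted along the following lines. This is a sketch under assumptions: the helper names are hypothetical, the set of guest CPUs has to come from the administrator's own configuration, and writing the file needs root. Interrupts whose affinity cannot be changed will make the write fail.

```python
def format_cpu_list(cpus) -> str:
    """Format CPU numbers as a kernel cpulist string, e.g. '0-2,8'."""
    ranges = []
    for cpu in sorted(set(cpus)):
        if ranges and cpu == ranges[-1][1] + 1:
            ranges[-1][1] = cpu  # extend the current range
        else:
            ranges.append([cpu, cpu])  # start a new range
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


def housekeeping_cpus(all_cpus, guest_cpus) -> list:
    """CPUs that do not run untrusted guests and may take interrupts."""
    return sorted(set(all_cpus) - set(guest_cpus))


def set_irq_affinity(irq: int, cpus) -> None:
    """Write the CPU list to /proc/irq/<irq>/smp_affinity_list.

    Raises OSError if the kernel rejects the change, for example for
    per CPU interrupts or managed multi queue interrupts.
    """
    with open(f"/proc/irq/{irq}/smp_affinity_list", "w") as f:
        f.write(format_cpu_list(cpus))
```

A caller would loop over the IRQ numbers found in ``/proc/irq`` and skip those for which ``set_irq_affinity()`` raises ``OSError``.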
274 | ||
275 | .. _smt_control: | |
276 | ||
277 | 4. SMT control | |
278 | ^^^^^^^^^^^^^^ | |
279 | ||
280 | To prevent the SMT issues of L1TF it might be necessary to disable SMT | |
281 | completely. Disabling SMT can have a significant performance impact, but | |
282 | the impact depends on the hosting scenario and the type of workloads. | |
283 | The impact of disabling SMT also needs to be weighed against the impact | |
284 | of other mitigation solutions like confining guests to dedicated cores. | |
285 | ||
286 | The kernel provides a sysfs interface to retrieve the status of SMT and | |
287 | to control it. It also provides a kernel command line interface to | |
288 | control SMT. | |
289 | ||
290 | The kernel command line interface consists of the following options: | |
291 | ||
292 | =========== ========================================================== | |
293 | nosmt Affects the bring up of the secondary CPUs during boot. The | |
294 | kernel tries to bring all present CPUs online during the | |
295 | boot process. "nosmt" makes sure that from each physical | |
296 | core only one - the so called primary (hyper) thread is | |
297 | activated. Due to a design flaw of Intel processors related | |
298 | to Machine Check Exceptions the non primary siblings have | |
299 | to be brought up at least partially and are then shut down | |
300 | again. "nosmt" can be undone via the sysfs interface. | |
301 | ||
302 | nosmt=force Has the same effect as "nosmt" but does not allow undoing | |
303 | the SMT disable via the sysfs interface. | |
304 | =========== ========================================================== | |
305 | ||
306 | The sysfs interface provides two files: | |
307 | ||
308 | - /sys/devices/system/cpu/smt/control | |
309 | - /sys/devices/system/cpu/smt/active | |
310 | ||
311 | /sys/devices/system/cpu/smt/control: | |
312 | ||
313 | This file allows reading the SMT control state and provides the | |
314 | ability to disable or (re)enable SMT. The possible states are: | |
315 | ||
316 | ============== =================================================== | |
317 | on SMT is supported by the CPU and enabled. All | |
318 | logical CPUs can be onlined and offlined without | |
319 | restrictions. | |
320 | ||
321 | off SMT is supported by the CPU and disabled. Only | |
322 | the so called primary SMT threads can be onlined | |
323 | and offlined without restrictions. An attempt to | |
324 | online a non-primary sibling is rejected. | |
325 | ||
326 | forceoff Same as 'off' but the state cannot be controlled. | |
327 | Attempts to write to the control file are rejected. | |
328 | ||
329 | notsupported The processor does not support SMT. It's therefore | |
330 | not affected by the SMT implications of L1TF. | |
331 | Attempts to write to the control file are rejected. | |
332 | ============== =================================================== | |
333 | ||
334 | The possible states which can be written into this file to control SMT | |
335 | state are: | |
336 | ||
337 | - on | |
338 | - off | |
339 | - forceoff | |
340 | ||
341 | /sys/devices/system/cpu/smt/active: | |
342 | ||
343 | This file reports whether SMT is enabled and active, i.e. if on any | |
344 | physical core two or more sibling threads are online. | |
345 | ||
346 | SMT control is also possible at boot time via the l1tf kernel command | |
347 | line parameter in combination with L1D flush control. See | |
348 | :ref:`mitigation_control_command_line`. | |
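
The two sysfs files described above could be used from a script roughly as follows. The helper names are hypothetical, and the sketch assumes that the ``active`` file contains ``1`` when SMT is active, which this text does not spell out. Writing the control file needs root.

```python
from pathlib import Path

SMT_DIR = Path("/sys/devices/system/cpu/smt")

# Per the state table above, writes are rejected in the 'forceoff'
# and 'notsupported' states.
WRITABLE_STATES = {"on", "off"}


def smt_control_writable(state: str) -> bool:
    """True if the control file accepts writes in the given state."""
    return state.strip() in WRITABLE_STATES


def read_smt(smt_dir: Path = SMT_DIR):
    """Return (control_state, active) as read from sysfs."""
    control = (smt_dir / "control").read_text().strip()
    # Assumption: '1' means at least one core has two sibling threads online.
    active = (smt_dir / "active").read_text().strip() == "1"
    return control, active


def disable_smt(smt_dir: Path = SMT_DIR) -> bool:
    """Write 'off' to the control file if the current state allows it."""
    control, _ = read_smt(smt_dir)
    if not smt_control_writable(control):
        return False
    (smt_dir / "control").write_text("off")
    return True
```

Checking the state before writing avoids an error in the ``forceoff`` and ``notsupported`` cases, where the kernel rejects the write anyway.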
349 | ||
350 | 5. Disabling EPT | |
351 | ^^^^^^^^^^^^^^^^ | |
352 | ||
353 | Disabling EPT for virtual machines provides full mitigation for L1TF even | |
354 | with SMT enabled, because the effective page tables for guests are | |
355 | managed and sanitized by the hypervisor. However, disabling EPT has a | |
356 | significant performance impact, especially when the Meltdown mitigation | |
357 | KPTI is enabled. | |
358 | ||
359 | EPT can be disabled in the hypervisor via the 'kvm-intel.ept' parameter. | |
360 | ||
361 | There is ongoing research and development for new mitigation mechanisms to | |
362 | address the performance impact of disabling SMT or EPT. | |
363 | ||
364 | .. _mitigation_control_command_line: | |
365 | ||
366 | Mitigation control on the kernel command line | |
367 | --------------------------------------------- | |
368 | ||
369 | The kernel command line allows control of the L1TF mitigations at boot | |
370 | time with the option "l1tf=". The valid arguments for this option are: | |
371 | ||
372 | ============ ============================================================= | |
373 | full Provides all available mitigations for the L1TF | |
374 | vulnerability. Disables SMT and enables all mitigations in | |
375 | the hypervisors, i.e. unconditional L1D flushing. | |
376 | ||
377 | SMT control and L1D flush control via the sysfs interface | |
378 | is still possible after boot. Hypervisors will issue a | |
379 | warning when the first VM is started in a potentially | |
380 | insecure configuration, i.e. SMT enabled or L1D flush | |
381 | disabled. | |
382 | ||
383 | full,force Same as 'full', but disables SMT and L1D flush runtime | |
384 | control. Implies the 'nosmt=force' command line option. | |
385 | (i.e. sysfs control of SMT is disabled.) | |
386 | ||
387 | flush Leaves SMT enabled and enables the default hypervisor | |
388 | mitigation, i.e. conditional L1D flushing. | |
389 | ||
390 | SMT control and L1D flush control via the sysfs interface | |
391 | is still possible after boot. Hypervisors will issue a | |
392 | warning when the first VM is started in a potentially | |
393 | insecure configuration, i.e. SMT enabled or L1D flush | |
394 | disabled. | |
395 | ||
396 | flush,nosmt Disables SMT and enables the default hypervisor mitigation, | |
397 | i.e. conditional L1D flushing. | |
398 | ||
399 | SMT control and L1D flush control via the sysfs interface | |
400 | is still possible after boot. Hypervisors will issue a | |
401 | warning when the first VM is started in a potentially | |
402 | insecure configuration, i.e. SMT enabled or L1D flush | |
403 | disabled. | |
404 | ||
405 | flush,nowarn Same as 'flush', but hypervisors will not warn when a VM is | |
406 | started in a potentially insecure configuration. | |
407 | ||
408 | off Disables hypervisor mitigations and doesn't emit any | |
409 | warnings. | |
410 | ============ ============================================================= | |
411 | ||
412 | The default is 'flush'. For details about L1D flushing see :ref:`l1d_flush`. | |
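
The option table above can be condensed into a lookup, for instance to audit the boot configuration of a fleet. The sketch below is an illustration only: the type and function names are hypothetical, the "last occurrence wins" rule for repeated options is an assumption, and for 'off' the SMT and runtime control fields are assumptions because the table does not state them.

```python
from typing import NamedTuple


class L1tfPolicy(NamedTuple):
    smt_disabled: bool     # SMT is disabled at boot
    l1d_flush: str         # 'always', 'cond' or 'never'
    runtime_control: bool  # sysfs control still possible after boot
    warn: bool             # hypervisor warns on insecure VM start


L1TF_OPTIONS = {
    "full":         L1tfPolicy(True,  "always", True,  True),
    "full,force":   L1tfPolicy(True,  "always", False, True),
    "flush":        L1tfPolicy(False, "cond",   True,  True),
    "flush,nosmt":  L1tfPolicy(True,  "cond",   True,  True),
    "flush,nowarn": L1tfPolicy(False, "cond",   True,  False),
    # Assumption: 'off' leaves SMT and runtime control untouched.
    "off":          L1tfPolicy(False, "never",  True,  False),
}


def l1tf_option(cmdline: str, default: str = "flush") -> str:
    """Extract the l1tf= value from a kernel command line string."""
    value = default
    for token in cmdline.split():
        if token.startswith("l1tf="):
            value = token[len("l1tf="):]  # assumption: last occurrence wins
    return value


def l1tf_policy(cmdline: str) -> L1tfPolicy:
    """Look up the policy; raises KeyError for an unknown argument."""
    return L1TF_OPTIONS[l1tf_option(cmdline)]
```

On a running system the command line can be obtained from ``/proc/cmdline`` and passed to ``l1tf_policy()``.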
413 | ||
414 | ||
415 | .. _mitigation_control_kvm: | |
416 | ||
417 | Mitigation control for KVM - module parameter | |
418 | ------------------------------------------------------------- | |
419 | ||
420 | The KVM hypervisor mitigation mechanism, flushing the L1D cache when | |
421 | entering a guest, can be controlled with a module parameter. | |
422 | ||
423 | The option/parameter is "kvm-intel.vmentry_l1d_flush=". It takes the | |
424 | following arguments: | |
425 | ||
426 | ============ ============================================================== | |
427 | always L1D cache flush on every VMENTER. | |
428 | ||
429 | cond Flush L1D on VMENTER only when the code between VMEXIT and | |
430 | VMENTER can leak host memory which is considered | |
431 | interesting for an attacker. This can still leak host memory, | |
432 | which allows e.g. determining the host's address space layout. | |
433 | ||
434 | never Disables the mitigation. | |
435 | ============ ============================================================== | |
436 | ||
437 | The parameter can be provided on the kernel command line or as a module | |
438 | parameter when loading the modules, and can be modified at runtime via the sysfs | |
439 | file: | |
440 | ||
441 | /sys/module/kvm_intel/parameters/vmentry_l1d_flush | |
442 | ||
443 | The default is 'cond'. If 'l1tf=full,force' is given on the kernel command | |
444 | line, then 'always' is enforced and the kvm-intel.vmentry_l1d_flush | |
445 | module parameter is ignored and writes to the sysfs file are rejected. | |
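
The interaction between the 'l1tf=' option and the module parameter might be modelled as below. Only two rules come from this text: 'l1tf=full,force' enforces 'always', and the default is 'cond'. The remaining precedence, namely that an explicitly given module parameter overrides 'l1tf=full' and 'l1tf=off', is an assumption of this sketch, as are the function names.

```python
KVM_PARAM = "/sys/module/kvm_intel/parameters/vmentry_l1d_flush"
VALID_MODES = ("always", "cond", "never")


def effective_flush_mode(l1tf: str = "flush", module_param=None) -> str:
    """Return the L1D flush mode resulting from both settings."""
    if l1tf == "full,force":
        return "always"  # stated above: the module parameter is ignored
    if module_param is not None:
        if module_param not in VALID_MODES:
            raise ValueError(f"invalid vmentry_l1d_flush value: {module_param!r}")
        return module_param  # assumption: explicit parameter takes precedence
    if l1tf == "full":
        return "always"
    if l1tf == "off":
        return "never"
    return "cond"


def current_flush_mode(path: str = KVM_PARAM) -> str:
    """Read the mode from sysfs (raises OSError if kvm_intel is not loaded)."""
    with open(path) as f:
        return f.read().strip()
```

Comparing ``effective_flush_mode()`` with ``current_flush_mode()`` is one way to spot hosts whose runtime setting has drifted from the intended boot configuration.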
446 | ||
447 | ||
448 | Mitigation selection guide | |
449 | -------------------------- | |
450 | ||
451 | 1. No virtualization in use | |
452 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
453 | ||
454 | The system is protected by the kernel unconditionally and no further | |
455 | action is required. | |
456 | ||
457 | 2. Virtualization with trusted guests | |
458 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
459 | ||
460 | If the guest comes from a trusted source and the guest OS kernel is | |
461 | guaranteed to have the L1TF mitigations in place the system is fully | |
462 | protected against L1TF and no further action is required. | |
463 | ||
464 | To avoid the overhead of the default L1D flushing on VMENTER the | |
465 | administrator can disable the flushing via the kernel command line and | |
466 | sysfs control files. See :ref:`mitigation_control_command_line` and | |
467 | :ref:`mitigation_control_kvm`. | |
468 | ||
469 | ||
470 | 3. Virtualization with untrusted guests | |
471 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
472 | ||
473 | 3.1. SMT not supported or disabled | |
474 | """""""""""""""""""""""""""""""""" | |
475 | ||
476 | If SMT is not supported by the processor or disabled in the BIOS or by | |
477 | the kernel, it's only required to enforce L1D flushing on VMENTER. | |
478 | ||
479 | Conditional L1D flushing is the default behaviour and can be tuned. See | |
480 | :ref:`mitigation_control_command_line` and :ref:`mitigation_control_kvm`. | |
481 | ||
482 | 3.2. EPT not supported or disabled | |
483 | """""""""""""""""""""""""""""""""" | |
484 | ||
485 | If EPT is not supported by the processor or disabled in the hypervisor, | |
486 | the system is fully protected. SMT can stay enabled and L1D flushing on | |
487 | VMENTER is not required. | |
488 | ||
489 | EPT can be disabled in the hypervisor via the 'kvm-intel.ept' parameter. | |
490 | ||
491 | 3.3. SMT and EPT supported and active | |
492 | """"""""""""""""""""""""""""""""""""" | |
493 | ||
494 | If SMT and EPT are supported and active then various degrees of | |
495 | mitigations can be employed: | |
496 | ||
497 | - L1D flushing on VMENTER: | |
498 | ||
499 | L1D flushing on VMENTER is the minimal protection requirement, but it | |
500 | is only potent in combination with other mitigation methods. | |
501 | ||
502 | Conditional L1D flushing is the default behaviour and can be tuned. See | |
503 | :ref:`mitigation_control_command_line` and :ref:`mitigation_control_kvm`. | |
504 | ||
505 | - Guest confinement: | |
506 | ||
507 | Confinement of guests to a single or a group of physical cores which | |
508 | are not running any other processes, can reduce the attack surface | |
509 | significantly, but interrupts, soft interrupts and kernel threads can | |
510 | still expose valuable data to a potential attacker. See | |
511 | :ref:`guest_confinement`. | |
512 | ||
513 | - Interrupt isolation: | |
514 | ||
515 | Isolating the guest CPUs from interrupts can reduce the attack surface | |
516 | further, but still allows a malicious guest to explore a limited amount | |
517 | of host physical memory. This can at least be used to gain knowledge | |
518 | about the host address space layout. The interrupts which have a fixed | |
519 | affinity to the CPUs which run the untrusted guests can, depending on | |
520 | the scenario, still trigger soft interrupts and schedule kernel threads | |
521 | which might expose valuable information. See | |
522 | :ref:`interrupt_isolation`. | |
523 | ||
524 | The above three mitigation methods combined can provide protection to a | |
525 | certain degree, but the risk of the remaining attack surface has to be | |
526 | carefully analyzed. For full protection the following methods are | |
527 | available: | |
528 | ||
529 | - Disabling SMT: | |
530 | ||
531 | Disabling SMT and enforcing the L1D flushing provides the maximum | |
532 | amount of protection. This mitigation does not depend on any of the | |
533 | above mitigation methods. | |
534 | ||
535 | SMT control and L1D flushing can be tuned by the command line | |
536 | parameters 'nosmt', 'l1tf', 'kvm-intel.vmentry_l1d_flush' and at run | |
537 | time with the matching sysfs control files. See :ref:`smt_control`, | |
538 | :ref:`mitigation_control_command_line` and | |
539 | :ref:`mitigation_control_kvm`. | |
540 | ||
541 | - Disabling EPT: | |
542 | ||
543 | Disabling EPT provides the maximum amount of protection as well. It does | |
544 | not depend on any of the above mitigation methods. SMT can stay | |
545 | enabled and L1D flushing is not required, but the performance impact is | |
546 | significant. | |
547 | ||
548 | EPT can be disabled in the hypervisor via the 'kvm-intel.ept' | |
549 | parameter. | |
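
The selection guide above can be summarized as a small decision function. It is a condensed restatement for illustration, not an official tool: the function name, parameter names and the wording of the returned strings are hypothetical, and "trusted" here means what section 2 requires, i.e. a trusted source and a guest kernel with the L1TF mitigations in place.

```python
def recommend(virtualization: bool, guests_trusted: bool = False,
              smt_active: bool = True, ept_enabled: bool = True) -> list:
    """Return the mitigation options of the matching guide section."""
    if not virtualization:
        return ["no action required"]
    if guests_trusted:
        return ["no action required",
                "optional: disable L1D flush to avoid its overhead"]
    # Untrusted guests from here on. Without EPT the system is fully
    # protected regardless of SMT, so check EPT first.
    if not ept_enabled:
        return ["fully protected: SMT can stay enabled, no L1D flush needed"]
    if not smt_active:
        return ["enforce L1D flush on VMENTER"]
    return [
        "partial: L1D flush + guest confinement + interrupt isolation",
        "full: disable SMT and enforce L1D flush",
        "full: disable EPT",
    ]
```

For the last case the function lists the alternatives without choosing one, because the trade-off between them depends on the deployment.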
550 | ||
551 | ||
552 | .. _default_mitigations: | |
553 | ||
554 | Default mitigations | |
555 | ------------------- | |
556 | ||
557 | The kernel default mitigations for vulnerable processors are: | |
558 | ||
559 | - PTE inversion to protect against malicious user space. This is done | |
560 | unconditionally and cannot be controlled. | |
561 | ||
562 | - L1D conditional flushing on VMENTER when EPT is enabled for | |
563 | a guest. | |
564 | ||
565 | The kernel does not by default enforce the disabling of SMT, which leaves | |
566 | SMT systems vulnerable when running untrusted guests with EPT enabled. | |
567 | ||
568 | The rationale for this choice is: | |
569 | ||
570 | - Force disabling SMT can break existing setups, especially with | |
571 | unattended updates. | |
572 | ||
573 | - If regular users run untrusted guests on their machine, then L1TF is | |
574 | just an add on to other malware which might be embedded in an untrusted | |
575 | guest, e.g. spam-bots or attacks on the local network. | |
576 | ||
577 | There is no technical way to prevent a user from running untrusted code | |
578 | on their machines blindly. | |
579 | ||
580 | - It's technically extremely unlikely and from today's knowledge even | |
581 | impossible that L1TF can be exploited via the most popular attack | |
582 | mechanisms like JavaScript because these mechanisms have no way to | |
583 | control PTEs. If this were possible and no other mitigation were | |
584 | available, then the default might be different. | |
585 | ||
586 | - The administrators of cloud and hosting setups have to carefully | |
587 | analyze the risk for their scenarios and make the appropriate | |
588 | mitigation choices, which might even vary across their deployed | |
589 | machines and also result in other changes of their overall setup. | |
590 | There is no way for the kernel to provide a sensible default for these | |
591 | kinds of scenarios.