Commit | Line | Data |
---|---|---|
1897907c FY |
1 | .. SPDX-License-Identifier: GPL-2.0 |
2 | ||
3 | .. include:: <isonum.txt> | |
4 | ||
5 | =============================== | |
6 | Bus lock detection and handling | |
7 | =============================== | |
8 | ||
9 | :Copyright: |copy| 2021 Intel Corporation | |
10 | :Authors: - Fenghua Yu <fenghua.yu@intel.com> | |
11 | - Tony Luck <tony.luck@intel.com> | |
12 | ||
13 | Problem | |
14 | ======= | |
15 | ||
16 | A split lock is any atomic operation whose operand crosses two cache lines. | |
17 | Since the operand spans two cache lines and the operation must be atomic, | |
18 | the system locks the bus while the CPU accesses the two cache lines. | |
19 | ||
20 | A bus lock is acquired through either split locked access to writeback (WB) | |
21 | memory or any locked access to non-WB memory. This is typically thousands of | |
22 | cycles slower than an atomic operation within a cache line. It also disrupts | |
23 | performance on other cores and can severely degrade the whole system. | |
24 | ||
25 | Detection | |
26 | ========= | |
27 | ||
28 | Intel processors may support either or both of the following hardware | |
29 | mechanisms to detect split locks and bus locks. | |
30 | ||
31 | #AC exception for split lock detection | |
32 | -------------------------------------- | |
33 | ||
34 | Beginning with the Tremont Atom CPU, a split lock operation may raise an | |
35 | Alignment Check (#AC) exception when it is attempted. | |
36 | ||
37 | #DB exception for bus lock detection | |
38 | ------------------------------------ | |
39 | ||
40 | Some CPUs have the ability to notify the kernel by a #DB trap after a user | |
41 | instruction acquires a bus lock and is executed. This allows the kernel to | |
42 | terminate the application or to enforce throttling. | |
43 | ||
44 | Software handling | |
45 | ================= | |
46 | ||
47 | The kernel #AC and #DB handlers handle bus locks according to the kernel | |
48 | parameter "split_lock_detect". Here is a summary of the different options: | |
49 | ||
50 | +------------------+----------------------------+-----------------------+ | |
51 | |split_lock_detect=|#AC for split lock          |#DB for bus lock       | | |
52 | +------------------+----------------------------+-----------------------+ | |
53 | |off               |Do nothing                  |Do nothing             | | |
54 | +------------------+----------------------------+-----------------------+ | |
55 | |warn              |Kernel OOPs                 |Warn once per task and | | |
56 | |(default)         |Warn once per task and      |continue to run.       | | |
57 | |                  |disable future checking     |                       | | |
58 | |                  |When both features are      |                       | | |
59 | |                  |supported, warn in #AC      |                       | | |
60 | +------------------+----------------------------+-----------------------+ | |
61 | |fatal             |Kernel OOPs                 |Send SIGBUS to user.   | | |
62 | |                  |Send SIGBUS to user         |                       | | |
63 | |                  |When both features are      |                       | | |
64 | |                  |supported, fatal in #AC     |                       | | |
65 | +------------------+----------------------------+-----------------------+ | |
d28397ea FY |
66 | |ratelimit:N       |Do nothing                  |Limit bus lock rate to |
67 | |(0 < N <= 1000)   |                            |N bus locks per second | | |
68 | |                  |                            |system wide and warn on| | |
69 | |                  |                            |bus locks.             | | |
70 | +------------------+----------------------------+-----------------------+ | |
1897907c FY |
71 | |
72 | Usages | |
73 | ====== | |
74 | ||
75 | Detecting and handling bus locks is useful in several areas: | |
76 | ||
77 | It is critical for real time system designers who build consolidated real | |
78 | time systems. These systems run hard real time code on some cores and run | |
79 | "untrusted" user processes on other cores. The hard real time code cannot | |
80 | afford to have any bus lock from the untrusted processes hurt real time | |
81 | performance. To date the designers have been unable to deploy these | |
82 | solutions as they have no way to prevent the "untrusted" user code from | |
83 | generating split locks and bus locks that block the hard real time code | |
84 | from accessing memory during bus locking. | |
85 | ||
86 | It's also useful for general computing to prevent guests or user | |
87 | applications from slowing down the overall system by executing instructions | |
88 | that acquire bus locks. | |
89 | ||
90 | ||
91 | Guidance | |
92 | ======== | |
93 | off | |
94 | --- | |
95 | ||
96 | Disable checking for split lock and bus lock. This option can be useful if | |
97 | there are legacy applications that trigger these events at a low rate so | |
98 | that mitigation is not needed. | |
99 | ||
100 | warn | |
101 | ---- | |
102 | ||
103 | A warning is emitted when a bus lock is detected, which allows the | |
104 | offending application to be identified. This is the default behavior. | |
105 | ||
106 | fatal | |
107 | ----- | |
108 | ||
109 | In this case, the bus lock is not tolerated and the process is killed. | |
d28397ea FY |
110 | |
111 | ratelimit | |
112 | --------- | |
113 | ||
114 | A system wide bus lock rate limit N is specified where 0 < N <= 1000. This | |
115 | allows a bus lock rate up to N bus locks per second. When the bus lock rate | |
116 | is exceeded then any task which is caught via the buslock #DB exception is | |
117 | throttled by enforced sleeps until the rate goes under the limit again. | |
118 | ||
119 | This is an effective mitigation in cases where a minimal impact can be | |
120 | tolerated, but an eventual Denial of Service attack has to be prevented. It | |
121 | allows the offending processes to be identified and analyzed to determine | |
122 | whether they are malicious or just badly written. | |
123 | ||
124 | Selecting a rate limit of 1000 allows the bus to be locked for up to about | |
125 | seven million cycles each second (assuming 7000 cycles for each bus | |
126 | lock). On a 2 GHz processor that would be about 0.35% system slowdown. |
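
The arithmetic in the last paragraph can be checked with a short sketch. The function name `bus_lock_overhead` and its parameters are hypothetical and exist only for this illustration; the 7000 cycles per bus lock and the 2 GHz clock are the assumptions stated in the text above, not measured values.

```python
def bus_lock_overhead(rate_per_sec, cycles_per_lock=7000, cpu_hz=2_000_000_000):
    """Estimate the worst-case cost of a ratelimit:N setting.

    Returns a tuple (locked_cycles_per_second, fraction_of_cpu_cycles).
    cycles_per_lock and cpu_hz are illustrative assumptions taken from
    the text (7000 cycles per bus lock, 2 GHz processor).
    """
    # The documented range for ratelimit:N is 0 < N <= 1000.
    if not 0 < rate_per_sec <= 1000:
        raise ValueError("ratelimit N must satisfy 0 < N <= 1000")
    locked_cycles = rate_per_sec * cycles_per_lock
    return locked_cycles, locked_cycles / cpu_hz


if __name__ == "__main__":
    cycles, fraction = bus_lock_overhead(1000)
    # 1000 locks/s * 7000 cycles = 7,000,000 cycles; 7e6 / 2e9 = 0.35%
    print(f"{cycles} cycles locked per second, {fraction:.2%} slowdown")
```

Lower values of N scale the estimate linearly under the same assumptions, so `ratelimit:100` would correspond to roughly a tenth of that overhead.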