From 206b92353c839c0b27a0b9bec24195f93fd6cf7a Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx@linutronix.de>
Date: Tue, 26 Mar 2019 17:36:05 +0100
Subject: cpu/hotplug: Prevent crash when CPU bringup fails on CONFIG_HOTPLUG_CPU=n

From: Thomas Gleixner <tglx@linutronix.de>

commit 206b92353c839c0b27a0b9bec24195f93fd6cf7a upstream.

Tianyu reported a crash in a CPU hotplug teardown callback when booting a
kernel which has CONFIG_HOTPLUG_CPU disabled with the 'nosmt' boot
parameter.

It turns out that the SMP=y CONFIG_HOTPLUG_CPU=n case has been broken
forever when a bringup callback fails. Unfortunately this issue was not
recognized when the CPU hotplug code was reworked, so the shortcoming just
stayed in place.

When a bringup callback fails, the CPU hotplug code rolls back the
operation and takes the CPU offline.

The 'nosmt' command line argument uses a bringup failure to abort the
bringup of SMT sibling CPUs. This partial bringup is required due to the
MCE misdesign on Intel CPUs.

With CONFIG_HOTPLUG_CPU=y the rollback works perfectly fine, but
CONFIG_HOTPLUG_CPU=n lacks essential mechanisms to exercise the low level
teardown of a CPU, including the synchronizations in various facilities
like RCU, NOHZ and others.

As a consequence the teardown callbacks, which must be executed on the
outgoing CPU within stop machine with interrupts disabled, are executed on
the control CPU in interrupt-enabled and preemptible context, causing the
kernel to crash and burn. The pre-state-machine code has a different
failure mode which is more subtle, resulting in a less obvious
use-after-free crash because the control side frees resources which are
still in use by the undead CPU.

But this is not an x86-only problem. Any architecture which supports the
SMP=y HOTPLUG_CPU=n combination suffers from the same issue. It's just
less likely to be triggered because in 99.99999% of the cases all bringup
callbacks succeed.

The easy solution of making HOTPLUG_CPU mandatory for SMP does not work on
all architectures, as the following architectures either have no hotplug
support at all or not all of their subarchitectures support it:

alpha, arc, hexagon, openrisc, riscv, sparc (32bit), mips (partial).

Crashing the kernel in such a situation is not an acceptable state
either.

Implement a minimal rollback variant by limiting the teardown to the point
where all regular teardown callbacks have been invoked and leave the CPU in
the 'dead' idle state. This has the following consequences:

 - The CPU is brought down to the point where the stop_machine takedown
   would happen.

 - The CPU stays there forever and is idle.

 - The CPU is cleared in the CPU active mask, but not in the CPU online
   mask, which is a legit state.

 - Interrupts are not forced away from the CPU.

 - All facilities which only look at the online mask would still see it,
   but that is the case during normal hotplug/unplug operations as well.
   It's just a (way) longer time frame.

This will expose issues which haven't been exposed before, or only seldom,
because now the normally transient state of being non-active but online is
a permanent state. In testing this already exposed an issue vs. work
queues, where the vmstat code schedules work on the almost dead CPU, which
ends up in an unbound workqueue and triggers 'preemptible context'
warnings. This is not a problem of this change; it merely exposes an
already existing issue. Still this is better than crashing fully without a
chance to debug it.

This is mainly meant as a workaround for those architectures which do not
support HOTPLUG_CPU. All others should enforce HOTPLUG_CPU for SMP.
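
For architectures where the hotplug machinery is fully supported, such
enforcement could look roughly like the following Kconfig sketch
(hypothetical, not part of this patch):

```
# Hypothetical arch Kconfig fragment: force HOTPLUG_CPU on whenever SMP
# is enabled, so the broken SMP=y HOTPLUG_CPU=n combination cannot be
# configured.
config HOTPLUG_CPU
	def_bool y
	depends on SMP
```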

Fixes: 2e1a3483ce74 ("cpu/hotplug: Split out the state walk into functions")
Reported-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Konrad Wilk <konrad.wilk@oracle.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Mukesh Ojha <mojha@codeaurora.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Rik van Riel <riel@surriel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Michael Kelley <michael.h.kelley@microsoft.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190326163811.503390616@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 kernel/cpu.c | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -533,6 +533,20 @@ static void undo_cpu_up(unsigned int cpu
 		cpuhp_invoke_callback(cpu, st->state, false, NULL, NULL);
 }
 
+static inline bool can_rollback_cpu(struct cpuhp_cpu_state *st)
+{
+	if (IS_ENABLED(CONFIG_HOTPLUG_CPU))
+		return true;
+	/*
+	 * When CPU hotplug is disabled, then taking the CPU down is not
+	 * possible because takedown_cpu() and the architecture and
+	 * subsystem specific mechanisms are not available. So the CPU
+	 * which would be completely unplugged again needs to stay around
+	 * in the current state.
+	 */
+	return st->state <= CPUHP_BRINGUP_CPU;
+}
+
 static int cpuhp_up_callbacks(unsigned int cpu, struct cpuhp_cpu_state *st,
 			      enum cpuhp_state target)
 {
@@ -543,8 +557,10 @@ static int cpuhp_up_callbacks(unsigned i
 		st->state++;
 		ret = cpuhp_invoke_callback(cpu, st->state, true, NULL, NULL);
 		if (ret) {
-			st->target = prev_state;
-			undo_cpu_up(cpu, st);
+			if (can_rollback_cpu(st)) {
+				st->target = prev_state;
+				undo_cpu_up(cpu, st);
+			}
 			break;
 		}
 	}