git.ipfire.org Git - thirdparty/kernel/stable-queue.git/blame - releases/2.6.36.2/nohz-s390-fix-arch_needs_cpu-return-value-on-offline-cpus.patch
From 398812159e328478ae49b4bd01f0d71efea96c39 Mon Sep 17 00:00:00 2001
From: Heiko Carstens <heiko.carstens@de.ibm.com>
Date: Wed, 1 Dec 2010 10:08:01 +0100
Subject: [S390] nohz/s390: fix arch_needs_cpu() return value on offline cpus

From: Heiko Carstens <heiko.carstens@de.ibm.com>

commit 398812159e328478ae49b4bd01f0d71efea96c39 upstream.

This fixes the same problem as described in the patch "nohz: fix
printk_needs_cpu() return value on offline cpus" for the arch_needs_cpu()
primitive:

arch_needs_cpu() may return 1 if called on offline cpus. When a cpu gets
offlined it schedules the idle process which, before killing its own cpu,
will call tick_nohz_stop_sched_tick().
That function in turn will call arch_needs_cpu() in order to check if the
local tick can be disabled. On offline cpus this function should naturally
return 0 since, regardless of whether the tick gets disabled or not, the cpu
will be dead shortly after. That is besides the fact that __cpu_disable()
should already have made sure that no interrupts on the offlined cpu will be
delivered anyway.

In this case it prevents tick_nohz_stop_sched_tick() from calling
select_nohz_load_balancer(). No idea if that really is a problem. However what
made me debug this is that on 2.6.32 the function get_nohz_load_balancer() is
used within __mod_timer() to select a cpu on which a timer gets enqueued.
If arch_needs_cpu() returns 1 then the nohz_load_balancer cpu doesn't get
updated when a cpu gets offlined. It may contain the cpu number of an offline
cpu. In turn timers get enqueued on an offline cpu and not very surprisingly
they never expire and cause system hangs.
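
The failure sequence described above can be sketched as a small user-space
model. This is only an illustration under simplifying assumptions, not kernel
code: every name prefixed with model_ (and the cpus[] and load_balancer
variables) is hypothetical, and the real select_nohz_load_balancer() and
__mod_timer() do considerably more than shown here.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS     4
#define NO_BALANCER (-1)

/* Hypothetical stand-ins for kernel state; these are not the kernel's types. */
struct model_cpu {
	bool online;
	bool nohz_delay;     /* the condition that makes arch_needs_cpu() return 1 */
	int  pending_timers; /* timers enqueued on this cpu */
};

static struct model_cpu cpus[NR_CPUS];
static int load_balancer = NO_BALANCER;

static void model_reset(void)
{
	for (int i = 0; i < NR_CPUS; i++) {
		cpus[i].online = true;
		cpus[i].nohz_delay = false;
		cpus[i].pending_timers = 0;
	}
	load_balancer = NO_BALANCER;
}

/* Models arch_needs_cpu(): note that it does not look at the online state. */
static int model_arch_needs_cpu(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return cpus[cpu].nohz_delay;
}

/* Models the early exit in tick_nohz_stop_sched_tick(). */
static void model_stop_sched_tick(int cpu)
{
	if (model_arch_needs_cpu(cpu))
		return; /* early exit: the balancer update below is skipped */

	/* Models select_nohz_load_balancer(): an offline cpu resigns. */
	if (!cpus[cpu].online && load_balancer == cpu)
		load_balancer = NO_BALANCER;
}

/* Models offlining; clear_delay stands for the fix (clear the condition). */
static void model_cpu_offline(int cpu, bool clear_delay)
{
	cpus[cpu].online = false;
	if (clear_delay)
		cpus[cpu].nohz_delay = false;
	model_stop_sched_tick(cpu);
}

/* Models the 2.6.32 behaviour: prefer the balancer cpu for a new timer. */
static int model_mod_timer(int this_cpu)
{
	int target = (load_balancer != NO_BALANCER) ? load_balancer : this_cpu;

	cpus[target].pending_timers++;
	return target;
}

/* Returns true if a timer ends up on an offline cpu and so never expires. */
static bool model_timer_lost(bool clear_delay)
{
	model_reset();
	load_balancer = 1;
	cpus[1].nohz_delay = true;
	model_cpu_offline(1, clear_delay);
	return !cpus[model_mod_timer(0)].online;
}
```

Calling model_timer_lost(false) follows the buggy path, where the stale
balancer cpu receives the timer. Calling model_timer_lost(true) clears the
condition first, so the balancer is reset and the timer stays on an online cpu.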

This has been observed on 2.6.32 kernels. On current kernels __mod_timer() uses
get_nohz_timer_target() which doesn't have that problem. However there might
be other problems because of the too early exit from
tick_nohz_stop_sched_tick() in case a cpu goes offline.

This specific bug was introduced with 3c5d92a0 "nohz: Introduce
arch_needs_cpu".

In this case a cpu hotplug notifier is used to fix the issue in order to keep
the normal/fast path small. All we need to do is to clear the condition that
makes arch_needs_cpu() return 1 since it is just a performance improvement
which is supposed to keep the local tick running for a short period if a cpu
goes idle. Nothing special needs to be done except for clearing the condition.
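
The design choice above (leave the fast path untouched, clear the flag from a
hotplug callback) can be sketched in user space as follows. This is a minimal
model under stated assumptions: the MODEL_ and model_ names, the fixed-size
callback table and the event enum are all hypothetical and only mimic the
shape of a notifier chain, not the kernel's notifier API.

```c
#include <assert.h>

#define MODEL_NR_CPUS   4
#define MODEL_MAX_CHAIN 4

/* Hypothetical hotplug events, loosely mirroring CPU_DYING and friends. */
enum model_action {
	MODEL_CPU_ONLINE,
	MODEL_CPU_DYING,
	MODEL_CPU_DYING_FROZEN
};

typedef void (*model_notifier_fn)(enum model_action action, int cpu);

/* Per-cpu flag standing in for the nohz_delay condition. */
static int model_nohz_delay[MODEL_NR_CPUS];

static model_notifier_fn model_chain[MODEL_MAX_CHAIN];
static int model_chain_len;

/* Fast path: a plain flag read, with no online check added to it. */
static int model_needs_cpu(int cpu)
{
	return model_nohz_delay[cpu];
}

/* Slow path: runs only on hotplug events and clears the condition. */
static void model_nohz_notify(enum model_action action, int cpu)
{
	switch (action) {
	case MODEL_CPU_DYING:
	case MODEL_CPU_DYING_FROZEN:
		model_nohz_delay[cpu] = 0;
		break;
	default:
		break;
	}
}

/* Adds a callback to the (hypothetical) chain. */
static void model_register(model_notifier_fn fn)
{
	assert(model_chain_len < MODEL_MAX_CHAIN);
	model_chain[model_chain_len++] = fn;
}

/* Delivers one hotplug event to every registered callback. */
static void model_hotplug_event(enum model_action action, int cpu)
{
	for (int i = 0; i < model_chain_len; i++)
		model_chain[i](action, cpu);
}
```

After model_register(model_nohz_notify), a MODEL_CPU_DYING event for a cpu
makes model_needs_cpu() return 0 for it, while unrelated events leave the flag
alone. The cost of the fix is paid only when a cpu goes offline.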

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 arch/s390/kernel/vtime.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

--- a/arch/s390/kernel/vtime.c
+++ b/arch/s390/kernel/vtime.c
@@ -19,6 +19,7 @@
 #include <linux/kernel_stat.h>
 #include <linux/rcupdate.h>
 #include <linux/posix-timers.h>
+#include <linux/cpu.h>
 
 #include <asm/s390_ext.h>
 #include <asm/timer.h>
@@ -565,6 +566,23 @@ void init_cpu_vtimer(void)
 	__ctl_set_bit(0,10);
 }
 
+static int __cpuinit s390_nohz_notify(struct notifier_block *self,
+				      unsigned long action, void *hcpu)
+{
+	struct s390_idle_data *idle;
+	long cpu = (long) hcpu;
+
+	idle = &per_cpu(s390_idle, cpu);
+	switch (action) {
+	case CPU_DYING:
+	case CPU_DYING_FROZEN:
+		idle->nohz_delay = 0;
+	default:
+		break;
+	}
+	return NOTIFY_OK;
+}
+
 void __init vtime_init(void)
 {
 	/* request the cpu timer external interrupt */
@@ -573,5 +591,6 @@ void __init vtime_init(void)
 
 	/* Enable cpu timer interrupts on the boot cpu. */
 	init_cpu_vtimer();
+	cpu_notifier(s390_nohz_notify, 0);
 }
 