From: Andrea Righi
Date: Fri, 22 May 2026 09:25:23 +0000 (+0200)
Subject: sched/fair: Fix RCU usage in NOHZ exit path on CPU offline
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=25139c11693afed894db46d1a44e2b6e015b804d;p=thirdparty%2Fkernel%2Flinux.git

sched/fair: Fix RCU usage in NOHZ exit path on CPU offline

Commit c9d93a73ce87 ("sched/fair: Drop redundant RCU read lock in NOHZ
kick path") removed the rcu_read_lock()/unlock() pair from
set_cpu_sd_state_busy() and set_cpu_sd_state_idle() on the assumption
that all callers run in a safe context for rcu_dereference_all(): IRQs
disabled or cpus_write_lock() held.

That assumption is wrong for the CPU hotplug teardown path. When CPUs
are taken offline, set_cpu_sd_state_busy() is invoked via:

  cpuhp/N kthread
    cpuhp_thread_fun()
      cpuhp_invoke_callback()
        sched_cpu_deactivate()
          nohz_balance_exit_idle()
            set_cpu_sd_state_busy()
              rcu_dereference_all(per_cpu(sd_llc, cpu))

The cpuhp kthread holds cpu_hotplug_lock (percpu-rwsem) but runs with
preemption and IRQs enabled. As a result, lockdep correctly reports a
suspicious RCU usage on CPU offline, e.g.:

  # echo 0 > /sys/devices/system/cpu/cpu1/online

  =============================
  WARNING: suspicious RCU usage
  -----------------------------
  kernel/sched/fair.c:12793 suspicious rcu_dereference_check() usage!
  ...
  2 locks held by cpuhp/1/20:
   #0: (cpu_hotplug_lock){++++}-{0:0}, at: cpuhp_thread_fun+0x42/0x1ae
   #1: (cpuhp_state-down){+.+.}-{0:0}, at: cpuhp_thread_fun+0x72/0x1ae

  Call Trace:
   lockdep_rcu_suspicious
   nohz_balance_exit_idle
   sched_cpu_deactivate
   cpuhp_invoke_callback
   cpuhp_thread_fun
   smpboot_thread_fn

Fix this by adding RCU read lock coverage to the one caller that lacks
it: nohz_balance_exit_idle() in the CPU hotplug teardown.

The other callers (nohz_balancer_kick() and nohz_balance_enter_idle())
genuinely run with IRQs disabled, so they remain unchanged.

Fixes: c9d93a73ce87 ("sched/fair: Drop redundant RCU read lock in NOHZ kick path")
Closes: https://lore.kernel.org/all/38fe0a1d-1a48-435a-910a-c278024d9ac9@samsung.com/
Reported-by: Marek Szyprowski
Suggested-by: Peter Zijlstra (Intel)
Signed-off-by: Andrea Righi
Signed-off-by: Peter Zijlstra (Intel)
Link: https://patch.msgid.link/20260522092523.2046095-1-arighi@nvidia.com
---

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 7fb3f5f2d48c0..b3a416b1c2510 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -8681,7 +8681,8 @@ int sched_cpu_deactivate(unsigned int cpu)
 	 * Remove CPU from nohz.idle_cpus_mask to prevent participating in
 	 * load balancing when not active
 	 */
-	nohz_balance_exit_idle(rq);
+	scoped_guard (rcu)
+		nohz_balance_exit_idle(rq);
 
 	set_cpu_active(cpu, false);
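
The effect of the one-line change above can be illustrated with a small
user-space model. This is only a sketch under stated assumptions, not kernel
code: every name prefixed with mock_ or MOCK_ is hypothetical, the "check" is a
deliberately simplified stand-in for the lockdep condition behind
rcu_dereference_all() (it accepts either a read-side section or disabled IRQs),
and the guard macro only imitates the general shape of a scope-based guard
using the GCC/Clang cleanup attribute.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; none of this is real kernel API. */
int mock_rcu_depth;       /* nesting depth of mock read-side sections */
bool mock_irqs_disabled;  /* pretend IRQ state of the calling context */
int mock_warnings;        /* count of "suspicious RCU usage" reports */

void mock_rcu_read_lock(void)   { mock_rcu_depth++; }
void mock_rcu_read_unlock(void) { mock_rcu_depth--; }

/* Cleanup handler: runs automatically when the guard variable leaves scope. */
void mock_guard_exit(int *unused) { (void)unused; mock_rcu_read_unlock(); }

/*
 * Scope-based guard (assumes GCC/Clang): take the mock read lock, run the
 * following statement exactly once, then drop the lock at scope exit.
 */
#define MOCK_SCOPED_GUARD_RCU()                                           \
	for (int _g __attribute__((cleanup(mock_guard_exit))) =           \
		     (mock_rcu_read_lock(), 1), _once = 1;                \
	     _once; _once = 0)

int mock_sd_llc = 42;  /* pretend RCU-protected data */

/* Simplified check: warn unless in a read-side section or IRQs are off. */
int *mock_rcu_dereference(int *p)
{
	if (mock_rcu_depth == 0 && !mock_irqs_disabled)
		mock_warnings++;
	return p;
}

/* Stand-in for nohz_balance_exit_idle() reaching the dereference. */
void mock_exit_idle(void)
{
	(void)mock_rcu_dereference(&mock_sd_llc);
}

/* Teardown before the fix: IRQs enabled, no read-side section. */
int mock_teardown_unguarded(void)
{
	mock_warnings = 0;
	mock_irqs_disabled = false;
	mock_exit_idle();
	return mock_warnings;
}

/* Teardown after the fix: the call is wrapped in the scoped guard. */
int mock_teardown_guarded(void)
{
	mock_warnings = 0;
	mock_irqs_disabled = false;
	MOCK_SCOPED_GUARD_RCU()
		mock_exit_idle();
	return mock_warnings;
}
```

In this model the unguarded teardown records one warning and the guarded one
records none, with the nesting depth back at zero once the guard's scope ends.
The scoped form is attractive here because the unlock cannot be forgotten on
any exit from the guarded statement.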