git.ipfire.org Git - thirdparty/kernel/linux.git/commitdiff
sched/fair: Fix RCU usage in NOHZ exit path on CPU offline
author Andrea Righi <arighi@nvidia.com>
Fri, 22 May 2026 09:25:23 +0000 (11:25 +0200)
committer Peter Zijlstra <peterz@infradead.org>
Tue, 26 May 2026 11:53:12 +0000 (13:53 +0200)
Commit c9d93a73ce87 ("sched/fair: Drop redundant RCU read lock in NOHZ
kick path") removed the rcu_read_lock()/unlock() pair from
set_cpu_sd_state_busy() and set_cpu_sd_state_idle() on the assumption
that all callers run in a safe context for rcu_dereference_all(): IRQs
disabled or cpus_write_lock() held.
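
As a rough illustration of what "safe context" means here, the following
userspace sketch models the condition such a lockdep check looks for. It
assumes the check is satisfied by any RCU read-side flavor (an explicit
rcu_read_lock(), or preemption or IRQs disabled). struct exec_ctx and the
model_*() helpers are hypothetical names, not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the execution state that lockdep inspects. */
struct exec_ctx {
	int  rcu_read_depth;    /* nesting level of rcu_read_lock() */
	bool irqs_disabled;
	bool preempt_disabled;
};

static void model_rcu_read_lock(struct exec_ctx *c)
{
	c->rcu_read_depth++;
}

static void model_rcu_read_unlock(struct exec_ctx *c)
{
	assert(c->rcu_read_depth > 0);  /* unbalanced unlock is a bug */
	c->rcu_read_depth--;
}

/*
 * Assumed approximation of the "any RCU reader" condition: an explicit
 * read-side section, or a context that cannot be preempted or interrupted.
 * Holding a sleeping lock is deliberately not part of this condition.
 */
static bool model_reader_ok(const struct exec_ctx *c)
{
	return c->rcu_read_depth > 0 || c->irqs_disabled || c->preempt_disabled;
}
```

In this model a zero-initialized context (preemptible, IRQs enabled, no
rcu_read_lock()) fails the check until model_rcu_read_lock() is called,
while a context with irqs_disabled set passes without it.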

That assumption is wrong for the CPU hotplug teardown path. When CPUs
are taken offline, set_cpu_sd_state_busy() is invoked via:

 cpuhp/N kthread
   cpuhp_thread_fun()
     cpuhp_invoke_callback()
       sched_cpu_deactivate()
         nohz_balance_exit_idle()
           set_cpu_sd_state_busy()
             rcu_dereference_all(per_cpu(sd_llc, cpu))

The cpuhp kthread holds cpu_hotplug_lock (percpu-rwsem) but runs with
preemption and IRQs enabled. As a result, lockdep correctly reports a
suspicious RCU usage on CPU offline, e.g.:

  # echo 0 > /sys/devices/system/cpu/cpu1/online

  =============================
  WARNING: suspicious RCU usage
  -----------------------------
  kernel/sched/fair.c:12793 suspicious rcu_dereference_check() usage!
  ...
  2 locks held by cpuhp/1/20:
   #0: (cpu_hotplug_lock){++++}-{0:0}, at: cpuhp_thread_fun+0x42/0x1ae
   #1: (cpuhp_state-down){+.+.}-{0:0}, at: cpuhp_thread_fun+0x72/0x1ae

  Call Trace:
    lockdep_rcu_suspicious
    nohz_balance_exit_idle
    sched_cpu_deactivate
    cpuhp_invoke_callback
    cpuhp_thread_fun
    smpboot_thread_fn

Fix this by taking the RCU read lock around the one call that lacks
coverage: the nohz_balance_exit_idle() call in sched_cpu_deactivate(),
on the CPU hotplug teardown path.

The other callers (nohz_balancer_kick() and nohz_balance_enter_idle())
genuinely run with IRQs disabled, so they remain unchanged.
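
For readers unfamiliar with scope-based guards, here is a minimal userspace
sketch of the pattern: a read-side section is entered before a single
statement and left automatically afterwards. It relies on the GCC/Clang
cleanup attribute; every model_*() name and the depth counter are
hypothetical and only imitate the shape of the kernel construct, not its
actual implementation.

```c
#include <assert.h>

static int rcu_depth;            /* hypothetical read-side nesting counter */
static int observed_depth = -1;  /* depth seen by the guarded call */

static void model_rcu_read_lock(void)
{
	rcu_depth++;
}

static void model_rcu_read_unlock(void)
{
	assert(rcu_depth > 0);
	rcu_depth--;
}

/* Cleanup handler: runs when the guard variable goes out of scope. */
static void model_guard_exit(int *guard)
{
	(void)guard;
	model_rcu_read_unlock();
}

/* Run the statement that follows exactly once, inside a read-side section. */
#define model_scoped_guard_rcu()                                          \
	for (int _guard __attribute__((cleanup(model_guard_exit))) =      \
		     (model_rcu_read_lock(), 1), _once = 1;               \
	     _once; _once = 0)

/* Stand-in for the guarded callee: records whether a reader is active. */
static void model_exit_idle(void)
{
	observed_depth = rcu_depth;
}

/* Stand-in for the hotplug teardown callback that wraps the call. */
static void model_deactivate(void)
{
	model_scoped_guard_rcu()
		model_exit_idle();
	/* The guard has been released here; rcu_depth is back to 0. */
}
```

Calling model_deactivate() leaves observed_depth at 1 and rcu_depth at 0,
so the callee ran inside the read-side section and the section was closed
without an explicit unlock at the call site.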

Fixes: c9d93a73ce87 ("sched/fair: Drop redundant RCU read lock in NOHZ kick path")
Closes: https://lore.kernel.org/all/38fe0a1d-1a48-435a-910a-c278024d9ac9@samsung.com/
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260522092523.2046095-1-arighi@nvidia.com
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 7fb3f5f2d48c0c4dc78b68883f0aa71878921238..b3a416b1c251001335aa77b3afde7b52fe21431c 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -8681,7 +8681,8 @@ int sched_cpu_deactivate(unsigned int cpu)
         * Remove CPU from nohz.idle_cpus_mask to prevent participating in
         * load balancing when not active
         */
-       nohz_balance_exit_idle(rq);
+       scoped_guard (rcu)
+               nohz_balance_exit_idle(rq);
 
        set_cpu_active(cpu, false);