From: Tejun Heo
Date: Tue, 10 Mar 2026 17:12:21 +0000 (-1000)
Subject: sched_ext: Fix scx_sched_lock / rq lock ordering
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=6b36c4c2935c54d6a103389fad2a2a9d25591501;p=thirdparty%2Fkernel%2Flinux.git

sched_ext: Fix scx_sched_lock / rq lock ordering

There are two sites that nest rq lock inside scx_sched_lock:

- scx_bypass() takes scx_sched_lock and then rq lock per CPU to propagate
  per-cpu bypass flags and re-enqueue tasks.

- sysrq_handle_sched_ext_dump() takes scx_sched_lock to iterate all scheds,
  and scx_dump_state() then takes rq lock per CPU for the dump.

These establish the ordering scx_sched_lock -> rq lock. However,
scx_claim_exit() takes scx_sched_lock to propagate exits to descendants and
can be reached from scx_tick(), BPF kfuncs, and many other paths with rq
lock already held, creating the reverse ordering:

  rq lock -> scx_sched_lock  vs.  scx_sched_lock -> rq lock

Fix by flipping scx_bypass() to take rq lock first, and by dropping
scx_sched_lock from sysrq_handle_sched_ext_dump(), as scx_sched_all is
already RCU-traversable and scx_dump_lock now prevents dumping a dead
sched. This establishes the consistent ordering rq lock -> scx_sched_lock.
Reported-by: Cheng-Yang Chou
Link: https://lore.kernel.org/r/20260309163025.2240221-1-yphbchou0911@gmail.com
Fixes: ebeca1f930ea ("sched_ext: Introduce cgroup sub-sched support")
Signed-off-by: Tejun Heo
Reviewed-by: Andrea Righi
---

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index bc6ce05bb98e6..efba05725139a 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -5097,8 +5097,8 @@ static void scx_bypass(struct scx_sched *sch, bool bypass)
 		struct rq *rq = cpu_rq(cpu);
 		struct task_struct *p, *n;
 
-		raw_spin_lock(&scx_sched_lock);
 		raw_spin_rq_lock(rq);
+		raw_spin_lock(&scx_sched_lock);
 
 		scx_for_each_descendant_pre(pos, sch) {
 			struct scx_sched_pcpu *pcpu = per_cpu_ptr(pos->pcpu, cpu);
@@ -7240,8 +7240,6 @@ static void sysrq_handle_sched_ext_dump(u8 key)
 	struct scx_exit_info ei = { .kind = SCX_EXIT_NONE, .reason = "SysRq-D" };
 	struct scx_sched *sch;
 
-	guard(raw_spinlock_irqsave)(&scx_sched_lock);
-
 	list_for_each_entry_rcu(sch, &scx_sched_all, all)
 		scx_dump_state(sch, &ei, 0, false);
 }