From: Cheng-Yang Chou
Date: Tue, 28 Apr 2026 17:36:12 +0000 (+0800)
Subject: sched_ext: Fix scx_flush_disable_work() UAF race
X-Git-Tag: v7.1-rc2~27^2
X-Git-Url: http://git.ipfire.org/gitweb.cgi?a=commitdiff_plain;h=d99f7a32f09dccbe396187370ec1a74a31b73d7e;p=thirdparty%2Fkernel%2Flinux.git

sched_ext: Fix scx_flush_disable_work() UAF race

scx_flush_disable_work() calls irq_work_sync() followed by
kthread_flush_work() to ensure that the disable kthread work has fully
completed before bpf_scx_unreg() frees the SCX scheduler. However, a
concurrent scx_vexit() (e.g., triggered by a watchdog stall) creates a
race window between scx_claim_exit() and irq_work_queue():

  CPU A (scx_vexit (watchdog))          CPU B (bpf_scx_unreg)
  ----                                  ----
  scx_claim_exit()
    atomic_try_cmpxchg(NONE->kind)
  stack_trace_save()
  vscnprintf()
                                        scx_disable()
                                          scx_claim_exit() -> FAIL
                                        scx_flush_disable_work()
                                          irq_work_sync()      // no-op: not queued yet
                                          kthread_flush_work() // no-op: not queued yet
                                        kobject_put(&sch->kobj) -> free %sch
  irq_work_queue() -> UAF on %sch
  scx_disable_irq_workfn()
    kthread_queue_work() -> UAF

The root cause is that CPU B's scx_flush_disable_work() returns after
syncing an irq_work that has not yet been queued, while CPU A is still
executing the code between scx_claim_exit() and irq_work_queue().

Loop until exit_kind reaches SCX_EXIT_DONE or SCX_EXIT_NONE, draining
disable_irq_work and disable_work in each pass. This ensures that any
work queued after the previous check is caught, while also correctly
handling cases where no disable was triggered (e.g., the
scx_sub_enable_workfn() abort path).
Fixes: 510a27055446 ("sched_ext: sync disable_irq_work in bpf_scx_unreg()")
Reported-by: https://sashiko.dev/#/patchset/20260424100221.32407-1-icheng%40nvidia.com
Suggested-by: Tejun Heo
Signed-off-by: Cheng-Yang Chou
Signed-off-by: Tejun Heo
---

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index cac0b18239fe..9483be03a4ca 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -6039,8 +6039,13 @@ static void scx_disable(struct scx_sched *sch, enum scx_exit_kind kind)
  */
 static void scx_flush_disable_work(struct scx_sched *sch)
 {
-	irq_work_sync(&sch->disable_irq_work);
-	kthread_flush_work(&sch->disable_work);
+	int kind;
+
+	do {
+		irq_work_sync(&sch->disable_irq_work);
+		kthread_flush_work(&sch->disable_work);
+		kind = atomic_read(&sch->exit_kind);
+	} while (kind != SCX_EXIT_NONE && kind != SCX_EXIT_DONE);
 }
 
 static void dump_newline(struct seq_buf *s)