git.ipfire.org Git - thirdparty/kernel/linux.git/commit
sched_ext: Hook up hardlockup detector
author Tejun Heo <tj@kernel.org>
Tue, 11 Nov 2025 19:18:12 +0000 (09:18 -1000)
committer Tejun Heo <tj@kernel.org>
Wed, 12 Nov 2025 16:43:44 +0000 (06:43 -1000)
commit 582f700e1bdc5978f41e3d8d65d3e16e34e9be8a
tree 36c1c48775642fc37e0a52d81edf021dfc41bdaa
parent 7ed8df0d15022fcc092e7c7f0bd82359476cff3c

A poorly behaving BPF scheduler can trigger a hard lockup. For example, on a
large system with many tasks pinned to different subsets of CPUs, if the BPF
scheduler puts all tasks in a single DSQ and lets all CPUs at it, the DSQ lock
can be contended to the point where the hardlockup detector triggers.
Unfortunately, a hardlockup can be the first signal out of such situations, so
hardlockup handling is required.

Hook scx_hardlockup() into the hardlockup detector to try kicking out the
current scheduler in an attempt to recover the system to a good state. This
handling strategy can delay the watchdog taking its own action by one polling
period; however, given that the only remediation for a hardlockup is a crash,
this is likely an acceptable trade-off.
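
The control flow described above can be illustrated with a small userspace
model. This is only a sketch, not the kernel code: every name in it
(model_state, model_hook, model_watchdog_check, the wd_action values) is
hypothetical, and it assumes that the hook reports success at most once per
loaded scheduler, so that a lockup persisting into the next poll falls through
to the watchdog's own action.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum wd_action {
	WD_NONE,	/* no lockup seen on this poll */
	WD_DEFERRED,	/* hook handled it; watchdog waits one more poll */
	WD_OWN_ACTION,	/* watchdog takes its own action */
};

struct model_state {
	bool sched_loaded;	/* a BPF scheduler is currently loaded */
	bool abort_requested;	/* the hook already tried to kick it out */
};

/*
 * Model of the hook: try to kick out the current scheduler.
 * Assumption: returns true only the first time it is called while a
 * scheduler is loaded, and false otherwise.
 */
static bool model_hook(struct model_state *s)
{
	assert(s != NULL);
	if (!s->sched_loaded || s->abort_requested)
		return false;
	s->abort_requested = true;
	return true;
}

/* Model of one watchdog poll on a CPU that may be locked up. */
static enum wd_action model_watchdog_check(struct model_state *s, bool locked_up)
{
	assert(s != NULL);
	if (!locked_up)
		return WD_NONE;
	if (model_hook(s))
		return WD_DEFERRED;	/* costs at most one polling period */
	return WD_OWN_ACTION;
}
```

In this model, the first poll that sees a lockup with a scheduler loaded is
deferred, and a second consecutive poll that still sees the lockup is not,
which is where the one-polling-period delay comes from.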

v2: Add missing dummy scx_hardlockup() definition for
    !CONFIG_SCHED_CLASS_EXT (kernel test bot).
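
The general stub pattern that the v2 note refers to might look like the
following sketch. The names CONFIG_DEMO_EXT, demo_hardlockup() and
demo_watchdog_should_defer() are hypothetical stand-ins, and the bool return
type is an assumption; the actual declaration lives in
include/linux/sched/ext.h and is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * CONFIG_DEMO_EXT is a hypothetical stand-in for a Kconfig option. It is
 * left undefined here, so the stub branch below is the one compiled.
 */
#ifdef CONFIG_DEMO_EXT
/* Feature built in: the real implementation lives elsewhere. */
bool demo_hardlockup(void);
#else
/* Feature compiled out: a stub that never claims to handle the lockup. */
static inline bool demo_hardlockup(void)
{
	return false;
}
#endif

/* A caller can then use the hook without its own #ifdef. */
static bool demo_watchdog_should_defer(void)
{
	return demo_hardlockup();
}
```

With the stub in place, the caller builds in both configurations, and in the
compiled-out case it never defers.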

Reported-by: Dan Schatzberg <schatzberg.dan@gmail.com>
Cc: Emil Tsalapatis <etsal@meta.com>
Cc: Douglas Anderson <dianders@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
include/linux/sched/ext.h
kernel/sched/ext.c
kernel/watchdog.c