git.ipfire.org Git - thirdparty/kernel/linux.git/commit
cgroup: Fix sleeping from invalid context warning on PREEMPT_RT
author Tejun Heo <tj@kernel.org>
Thu, 6 Nov 2025 18:12:36 +0000 (08:12 -1000)
committer Tejun Heo <tj@kernel.org>
Thu, 6 Nov 2025 22:52:26 +0000 (12:52 -1000)
commit 9311e6c29b348b005e79228ef6facd38ebcc73f9
tree 31cdf288b7857629bd4fa42ac216af01f2a87a31
parent be04e96ba911fac1dc4c7f89ebb42018d167043f

cgroup_task_dead() is called from finish_task_switch() which runs with
preemption disabled and doesn't allow scheduling even on PREEMPT_RT. The
function needs to acquire css_set_lock which is a regular spinlock that can
sleep on RT kernels, leading to "sleeping function called from invalid
context" warnings.

css_set_lock is too large in scope to convert to a raw_spinlock. However,
the unlinking operations don't need to run synchronously - they just need
to complete after the task is done running.

On PREEMPT_RT, defer the work through irq_work. While the work doesn't need
to happen immediately, it can't be delayed indefinitely either: the dead
task pins the cgroup, and the task_struct itself can be pinned indefinitely.
Use the lazy version of irq_work to allow batching and lower impact while
ensuring timely completion.
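
The deferral idea can be illustrated with a small userspace sketch. This is
an analogy only, not the code from this commit: the diff is not reproduced
here, and every name below (struct task, dead_tasks, set_lock,
task_dead_defer(), task_dead_drain()) is hypothetical. It assumes a lock-free
list in the style of the kernel's llist for the non-sleeping side, and a
pthread mutex as a stand-in for a lock that may sleep, such as css_set_lock
on RT.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for task_struct: only what this sketch needs. */
struct task {
	int pid;
	int unlinked;           /* set once the deferred unlink has run */
	struct task *dead_next; /* link in the lock-free dead list */
};

/* Lock-free LIFO list head, modeled loosely on the kernel's llist. */
static _Atomic(struct task *) dead_tasks = NULL;

/* Stand-in for a lock that may sleep when contended. */
static pthread_mutex_t set_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Called from the context that must not sleep. It performs only an
 * atomic push and takes no lock. Returns 1 if the list was empty before
 * the push, which is the point where a caller would kick the deferred
 * work (in the kernel, by queueing an irq_work).
 */
static int task_dead_defer(struct task *t)
{
	struct task *old = atomic_load(&dead_tasks);

	do {
		t->dead_next = old;
	} while (!atomic_compare_exchange_weak(&dead_tasks, &old, t));

	return old == NULL;
}

/*
 * Deferred worker: runs later, in a context where sleeping is allowed.
 * Detaches the whole batch in one atomic exchange, then handles each
 * task under the lock. Returns the number of tasks processed.
 */
static int task_dead_drain(void)
{
	struct task *t = atomic_exchange(&dead_tasks, NULL);
	int n = 0;

	while (t) {
		struct task *next = t->dead_next;

		pthread_mutex_lock(&set_lock);
		t->unlinked = 1; /* placeholder for the real unlink work */
		pthread_mutex_unlock(&set_lock);

		n++;
		t = next;
	}
	return n;
}
```

Only the first push onto an empty list reports that the worker needs to be
scheduled, so several dead tasks can be handled by a single drain pass. That
is the batching a lazy irq_work is meant to allow, while the bounded delay
keeps dead tasks from pinning their cgroup for long.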

v2: Use IRQ_WORK_INIT_LAZY instead of immediate irq_work and add explanation
    for why the work can't be delayed indefinitely (Sebastian Andrzej Siewior).

Fixes: d245698d727a ("cgroup: Defer task cgroup unlink until after the task is done switching out")
Reported-by: Calvin Owens <calvin@wbinvd.org>
Link: https://lore.kernel.org/r/20251104181114.489391-1-calvin@wbinvd.org
Signed-off-by: Tejun Heo <tj@kernel.org>
include/linux/sched.h
kernel/cgroup/cgroup.c