When ops.sub_attach is set, scx_alloc_and_add_sched() creates sub_kset as a
child of &sch->kobj, which pins the parent with its own reference. The
disable paths never call kset_unregister(), so the final kobject_put() in
bpf_scx_unreg() leaves a stale reference and scx_kobj_release() never runs,
leaking the whole struct scx_sched on every load/unload cycle.
Unregister sub_kset in scx_root_disable() and scx_sub_disable() before
kobject_del(&sch->kobj).
Fixes: ebeca1f930ea ("sched_ext: Introduce cgroup sub-sched support")
Reported-by: Chris Mason <clm@meta.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Andrea Righi <arighi@nvidia.com>
if (sch->ops.exit)
SCX_CALL_OP(sch, exit, NULL, sch->exit_info);
+ if (sch->sub_kset)
+ kset_unregister(sch->sub_kset);
kobject_del(&sch->kobj);
}
#else /* CONFIG_EXT_SUB_SCHED */
* could observe an object of the same name still in the hierarchy when
* the next scheduler is loaded.
*/
+#ifdef CONFIG_EXT_SUB_SCHED
+ if (sch->sub_kset)
+ kset_unregister(sch->sub_kset);
+#endif
kobject_del(&sch->kobj);
free_kick_syncs();