From: Kuba Piecuch
Date: Thu, 9 Apr 2026 16:57:44 +0000 (+0000)
Subject: sched_ext: Documentation: improve accuracy of task lifecycle pseudo-code
X-Git-Url: http://git.ipfire.org/index.cgi?a=commitdiff_plain;h=71ba9a5cb125998a875e3f008cbb28b028b609aa;p=thirdparty%2Fkernel%2Flinux.git

sched_ext: Documentation: improve accuracy of task lifecycle pseudo-code

* Add ops.quiescent() and ops.runnable() to the sched_change path.
  When a queued task has one of its scheduling properties changed
  (e.g. nice, affinity), it goes through dequeue() -> quiescent() ->
  (property change callback, e.g. ops.set_weight()) -> runnable() ->
  enqueue().

* Change && to || in ops.enqueue() condition. We want to enqueue tasks
  that have a non-zero slice and are not in any DSQ.

* Call ops.dispatch() and ops.dequeue() only for tasks that have had
  ops.enqueue() called. This is to account for tasks direct-dispatched
  from ops.select_cpu().

* Add a note explaining that the pseudo-code provides a simplified view
  of the task lifecycle and list some examples of cases that the
  pseudo-code does not account for.

Fixes: a4f61f0a1afd ("sched_ext: Documentation: Add ops.dequeue() to task lifecycle")
Signed-off-by: Kuba Piecuch
Reviewed-by: Andrea Righi
Signed-off-by: Tejun Heo
---

diff --git a/Documentation/scheduler/sched-ext.rst b/Documentation/scheduler/sched-ext.rst
index ec594ae8086de..03d595d178ea4 100644
--- a/Documentation/scheduler/sched-ext.rst
+++ b/Documentation/scheduler/sched-ext.rst
@@ -408,8 +408,8 @@ for more information.
 Task Lifecycle
 --------------
 
-The following pseudo-code summarizes the entire lifecycle of a task managed
-by a sched_ext scheduler:
+The following pseudo-code presents a rough overview of the entire lifecycle
+of a task managed by a sched_ext scheduler:
 
 .. code-block:: c
 
@@ -423,20 +423,25 @@ by a sched_ext scheduler:
         ops.runnable();         /* Task becomes ready to run */
 
         while (task_is_runnable(task)) {
-            if (task is not in a DSQ && task->scx.slice == 0) {
+            if (task is not in a DSQ || task->scx.slice == 0) {
                 ops.enqueue();  /* Task can be added to a DSQ */
 
                 /* Task property change (i.e., affinity, nice, etc.)? */
                 if (sched_change(task)) {
                     ops.dequeue(); /* Exiting BPF scheduler custody */
+                    ops.quiescent();
+
+                    /* Property change callback, e.g. ops.set_weight() */
+
+                    ops.runnable();
                     continue;
                 }
-            }
 
-            /* Any usable CPU becomes available */
+                /* Any usable CPU becomes available */
 
-            ops.dispatch();     /* Task is moved to a local DSQ */
-            ops.dequeue();      /* Exiting BPF scheduler custody */
+                ops.dispatch();     /* Task is moved to a local DSQ */
+                ops.dequeue();      /* Exiting BPF scheduler custody */
+            }
 
             ops.running();      /* Task starts running on its assigned CPU */
 
@@ -456,6 +461,30 @@ by a sched_ext scheduler:
     ops.disable();              /* Disable BPF scheduling for the task */
     ops.exit_task();            /* Task is destroyed */
 
+Note that the above pseudo-code does not cover all possible state transitions
+and edge cases, to name a few examples:
+
+* ``ops.dispatch()`` may fail to move the task to a local DSQ due to a racing
+  property change on that task, in which case ``ops.dispatch()`` will be
+  retried.
+
+* The task may be direct-dispatched to a local DSQ from ``ops.enqueue()``,
+  in which case ``ops.dispatch()`` and ``ops.dequeue()`` are skipped and we go
+  straight to ``ops.running()``.
+
+* Property changes may occur at virtually any point during the task's lifecycle,
+  not just when the task is queued and waiting to be dispatched. For example,
+  changing a property of a running task will lead to the callback sequence
+  ``ops.stopping()`` -> ``ops.quiescent()`` -> (property change callback) ->
+  ``ops.runnable()`` -> ``ops.running()``.
+
+* A sched_ext task can be preempted by a task from a higher-priority scheduling
+  class, in which case it will exit the tick-dispatch loop even though it is runnable
+  and has a non-zero slice.
+
+See the "Scheduling Cycle" section for a more detailed description of how
+a freshly woken up task gets on a CPU.
+
 Where to Look
 =============
 
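
The patch describes two property-change callback sequences: one for a task that is queued and waiting to be dispatched, and one for a task that is already running. The following is a minimal userspace sketch of those two orderings. It is not kernel code and not part of the patch: `struct cb_trace`, `trace_cb()`, `trace_equals()`, `change_queued_task()` and `change_running_task()` are hypothetical names introduced purely for illustration, and `set_weight` stands in for whichever property change callback applies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical trace buffer; nothing here is a sched_ext or kernel API. */
struct cb_trace {
	char buf[128];
	size_t len;
};

/* Append one callback name, separated from the previous one by " -> ".
 * A name that does not fit in the buffer is dropped. */
void trace_cb(struct cb_trace *t, const char *name)
{
	size_t room;
	int n;

	assert(t != NULL && name != NULL);
	room = sizeof(t->buf) - t->len;
	n = snprintf(t->buf + t->len, room, "%s%s", t->len ? " -> " : "", name);
	if (n > 0 && (size_t)n < room)
		t->len += (size_t)n;
	else
		t->buf[t->len] = '\0';	/* undo a truncated write */
}

/* Return non-zero if the recorded sequence matches the expected string. */
int trace_equals(const struct cb_trace *t, const char *expected)
{
	return strcmp(t->buf, expected) == 0;
}

/* Property change on a task that is queued and waiting to be dispatched,
 * following the order given in the commit message. */
void change_queued_task(struct cb_trace *t)
{
	trace_cb(t, "dequeue");		/* leaves BPF scheduler custody */
	trace_cb(t, "quiescent");
	trace_cb(t, "set_weight");	/* example property change callback */
	trace_cb(t, "runnable");
	trace_cb(t, "enqueue");		/* next pass of the lifecycle loop */
}

/* Property change on a task that is currently running, following the
 * order given in the note added by the patch. */
void change_running_task(struct cb_trace *t)
{
	trace_cb(t, "stopping");
	trace_cb(t, "quiescent");
	trace_cb(t, "set_weight");	/* example property change callback */
	trace_cb(t, "runnable");
	trace_cb(t, "running");
}
```

Calling `change_queued_task()` on a zero-initialized `struct cb_trace` records `dequeue -> quiescent -> set_weight -> runnable -> enqueue`, while `change_running_task()` records `stopping -> quiescent -> set_weight -> runnable -> running`. The only difference is the pair of callbacks that brackets the quiescent/runnable transition, which depends on whether the task was queued or on a CPU when the property changed.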