From 79104becf42baeeb4a3f2b106f954b9fc7c10a3c Mon Sep 17 00:00:00 2001 From: Fernand Sieber Date: Thu, 18 Sep 2025 17:05:28 +0200 Subject: [PATCH] sched/fair: Forfeit vruntime on yield If a task yields, the scheduler may decide to pick it again. The task in turn may decide to yield immediately or shortly after, leading to a tight loop of yields. If there's another runnable task as this point, the deadline will be increased by the slice at each loop. This can cause the deadline to runaway pretty quickly, and subsequent elevated run delays later on as the task doesn't get picked again. The reason the scheduler can pick the same task again and again despite its deadline increasing is because it may be the only eligible task at that point. Fix this by making the task forfeiting its remaining vruntime and pushing the deadline one slice ahead. This implements yield behavior more authentically. We limit the forfeiting to eligible tasks. This is because core scheduling prefers running ineligible tasks rather than force idling. As such, without the condition, we can end up on a yield loop which makes the vruntime increase rapidly, leading to anomalous run delays later down the line. Fixes: 147f3efaa24182 ("sched/fair: Implement an EEVDF-like scheduling policy") Signed-off-by: Fernand Sieber Signed-off-by: Peter Zijlstra (Intel) Link: https://lore.kernel.org/r/20250401123622.584018-1-sieberf@amazon.com Link: https://lore.kernel.org/r/20250911095113.203439-1-sieberf@amazon.com Link: https://lore.kernel.org/r/20250916140228.452231-1-sieberf@amazon.com --- kernel/sched/fair.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index bc0b7ce8a65d6..00f9d6c05d4cf 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -9007,7 +9007,19 @@ static void yield_task_fair(struct rq *rq) */ rq_clock_skip_update(rq); - se->deadline += calc_delta_fair(se->slice, se); + /* + * Forfeit the remaining vruntime, only if the entity is eligible. This + * condition is necessary because in core scheduling we prefer to run + * ineligible tasks rather than force idling. If this happens we may + * end up in a loop where the core scheduler picks the yielding task, + * which yields immediately again; without the condition the vruntime + * ends up quickly running away. + */ + if (entity_eligible(cfs_rq, se)) { + se->vruntime = se->deadline; + se->deadline += calc_delta_fair(se->slice, se); + update_min_vruntime(cfs_rq); + } } static bool yield_to_task_fair(struct rq *rq, struct task_struct *p) -- 2.47.3