From: Vincent Guittot Date: Thu, 27 May 2021 12:29:16 +0000 (+0200) Subject: sched/fair: Make sure to update tg contrib for blocked load X-Git-Tag: v5.13-rc6~14^2~3 X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=02da26ad5ed6ea8680e5d01f20661439611ed776;p=thirdparty%2Fkernel%2Flinux.git sched/fair: Make sure to update tg contrib for blocked load During the update of fair blocked load (__update_blocked_fair()), we update the contribution of the cfs in tg->load_avg if cfs_rq's pelt has decayed. Nevertheless, the pelt values of a cfs_rq could have been recently updated while propagating the change of a child. In this case, cfs_rq's pelt will not decayed because it has already been updated and we don't update tg->load_avg. __update_blocked_fair ... for_each_leaf_cfs_rq_safe: child cfs_rq update cfs_rq_load_avg() for child cfs_rq ... update_load_avg(cfs_rq_of(se), se, 0) ... update cfs_rq_load_avg() for parent cfs_rq -propagation of child's load makes parent cfs_rq->load_sum becoming null -UPDATE_TG is not set so it doesn't update parent cfs_rq->tg_load_avg_contrib .. for_each_leaf_cfs_rq_safe: parent cfs_rq update cfs_rq_load_avg() for parent cfs_rq - nothing to do because parent cfs_rq has already been updated recently so cfs_rq->tg_load_avg_contrib is not updated ... parent cfs_rq is decayed list_del_leaf_cfs_rq parent cfs_rq - but it still contibutes to tg->load_avg we must set UPDATE_TG flags when propagting pending load to the parent Fixes: 039ae8bcf7a5 ("sched/fair: Fix O(nr_cgroups) in the load balancing path") Reported-by: Odin Ugedal Signed-off-by: Vincent Guittot Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Odin Ugedal Link: https://lkml.kernel.org/r/20210527122916.27683-3-vincent.guittot@linaro.org --- diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index f4795b8008415..e7c8277e3d54a 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -8029,7 +8029,7 @@ static bool __update_blocked_fair(struct rq *rq, bool *done) /* Propagate pending load changes to the parent, if any: */ se = cfs_rq->tg->se[cpu]; if (se && !skip_blocked_update(se)) - update_load_avg(cfs_rq_of(se), se, 0); + update_load_avg(cfs_rq_of(se), se, UPDATE_TG); /* * There can be a lot of idle CPU cgroups. Don't let fully