From b5a9b340789b2b24c6896bcf7a065c31a4db671c Mon Sep 17 00:00:00 2001
From: Vincent Guittot <vincent.guittot@linaro.org>
Date: Wed, 19 Oct 2016 14:45:23 +0200
Subject: sched/fair: Fix incorrect task group ->load_avg

From: Vincent Guittot <vincent.guittot@linaro.org>

commit b5a9b340789b2b24c6896bcf7a065c31a4db671c upstream.

A scheduler performance regression has been reported by Joseph Salisbury,
which he bisected back to:

  3d30544f0212 ("sched/fair: Apply more PELT fixes")

The regression triggers when several levels of task groups are involved
(read: SystemD) and cpu_possible_mask != cpu_present_mask.
The root cause is that a group entity's load (tg_child->se[i]->avg.load_avg)
is initialized to scale_load_down(se->load.weight). During the creation of
a child task group, its group entities on possible CPUs are attached to the
parent's cfs_rq (tg_parent) and their loads are added to the parent's load
(tg_parent->load_avg) with update_tg_load_avg().

But only the load on online CPUs will then be updated to reflect the real
load, whereas the load on other CPUs will stay at the initial value.

The result is a tg_parent->load_avg that is higher than the real load, the
weight of group entities (tg_parent->se[i]->load.weight) on online CPUs is
smaller than it should be, and the task group gets less running time than
it could expect.

( This situation can be detected with /proc/sched_debug. The ".tg_load_avg"
  of the task group will be much higher than the sum of ".tg_load_avg_contrib"
  of the online cfs_rqs of the task group. )

The load of group entities doesn't have to be initialized to anything other
than 0, because their load will increase when entities are attached.

Reported-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: joonwoop@codeaurora.org
Fixes: 3d30544f0212 ("sched/fair: Apply more PELT fixes")
Link: http://lkml.kernel.org/r/1476881123-10159-1-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 kernel/sched/fair.c |    9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -680,7 +680,14 @@ void init_entity_runnable_average(struct
 	 * will definitely be update (after enqueue).
 	 */
 	sa->period_contrib = 1023;
-	sa->load_avg = scale_load_down(se->load.weight);
+	/*
+	 * Tasks are intialized with full load to be seen as heavy tasks until
+	 * they get a chance to stabilize to their real load level.
+	 * Group entities are intialized with zero load to reflect the fact that
+	 * nothing has been attached to the task group yet.
+	 */
+	if (entity_is_task(se))
+		sa->load_avg = scale_load_down(se->load.weight);
 	sa->load_sum = sa->load_avg * LOAD_AVG_MAX;
 	/*
 	 * At this point, util_avg won't be used in select_task_rq_fair anyway