From: Greg Kroah-Hartman
Date: Sun, 10 Apr 2016 18:18:13 +0000 (-0700)
Subject: 4.4-stable patches
X-Git-Tag: v4.5.1~10
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=cce4eadaf27a6234381866862527e63bc9af61be;p=thirdparty%2Fkernel%2Fstable-queue.git

4.4-stable patches

added patches:
	sched-cputime-fix-steal-time-accounting-vs.-cpu-hotplug.patch
---

diff --git a/queue-4.4/sched-cputime-fix-steal-time-accounting-vs.-cpu-hotplug.patch b/queue-4.4/sched-cputime-fix-steal-time-accounting-vs.-cpu-hotplug.patch
new file mode 100644
index 00000000000..9ba58704ad6
--- /dev/null
+++ b/queue-4.4/sched-cputime-fix-steal-time-accounting-vs.-cpu-hotplug.patch
@@ -0,0 +1,81 @@
+From e9532e69b8d1d1284e8ecf8d2586de34aec61244 Mon Sep 17 00:00:00 2001
+From: Thomas Gleixner
+Date: Fri, 4 Mar 2016 15:59:42 +0100
+Subject: sched/cputime: Fix steal time accounting vs. CPU hotplug
+
+From: Thomas Gleixner
+
+commit e9532e69b8d1d1284e8ecf8d2586de34aec61244 upstream.
+
+On CPU hotplug the steal time accounting can keep a stale rq->prev_steal_time
+value over CPU down and up. So after the CPU comes up again the delta
+calculation in steal_account_process_tick() wreckages itself due to the
+unsigned math:
+
+	u64 steal = paravirt_steal_clock(smp_processor_id());
+
+	steal -= this_rq()->prev_steal_time;
+
+So if steal is smaller than rq->prev_steal_time we end up with an insane large
+value which then gets added to rq->prev_steal_time, resulting in a permanent
+wreckage of the accounting. As a consequence the per CPU stats in /proc/stat
+become stale.
+
+Nice trick to tell the world how idle the system is (100%) while the CPU is
+100% busy running tasks. Though we prefer realistic numbers.
+
+None of the accounting values which use a previous value to account for
+fractions is reset at CPU hotplug time. update_rq_clock_task() has a sanity
+check for prev_irq_time and prev_steal_time_rq, but that sanity check solely
+deals with clock warps and limits the /proc/stat visible wreckage. The
+prev_time values are still wrong.
+
+Solution is simple: Reset rq->prev_*_time when the CPU is plugged in again.
+
+Signed-off-by: Thomas Gleixner
+Acked-by: Rik van Riel
+Cc: Frederic Weisbecker
+Cc: Glauber Costa
+Cc: Linus Torvalds
+Cc: Peter Zijlstra
+Fixes: commit 095c0aa83e52 "sched: adjust scheduler cpu power for stolen time"
+Fixes: commit aa483808516c "sched: Remove irq time from available CPU power"
+Fixes: commit e6e6685accfa "KVM guest: Steal time accounting"
+Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1603041539490.3686@nanos
+Signed-off-by: Ingo Molnar
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ kernel/sched/core.c  |    1 +
+ kernel/sched/sched.h |   13 +++++++++++++
+ 2 files changed, 14 insertions(+)
+
+--- a/kernel/sched/core.c
++++ b/kernel/sched/core.c
+@@ -5525,6 +5525,7 @@ migration_call(struct notifier_block *nf
+
+ 	case CPU_UP_PREPARE:
+ 		rq->calc_load_update = calc_load_update;
++		account_reset_rq(rq);
+ 		break;
+
+ 	case CPU_ONLINE:
+--- a/kernel/sched/sched.h
++++ b/kernel/sched/sched.h
+@@ -1770,3 +1770,16 @@ static inline u64 irq_time_read(int cpu)
+ }
+ #endif /* CONFIG_64BIT */
+ #endif /* CONFIG_IRQ_TIME_ACCOUNTING */
++
++static inline void account_reset_rq(struct rq *rq)
++{
++#ifdef CONFIG_IRQ_TIME_ACCOUNTING
++	rq->prev_irq_time = 0;
++#endif
++#ifdef CONFIG_PARAVIRT
++	rq->prev_steal_time = 0;
++#endif
++#ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
++	rq->prev_steal_time_rq = 0;
++#endif
++}
diff --git a/queue-4.4/series b/queue-4.4/series
index 41223d1b1d6..098200a3c45 100644
--- a/queue-4.4/series
+++ b/queue-4.4/series
@@ -203,3 +203,4 @@ mtd-onenand-fix-deadlock-in-onenand_block_markbad.patch
 intel_idle-prevent-skl-h-boot-failure-when-c8-c9-c10-enabled.patch
 pm-sleep-clear-pm_suspend_global_flags-upon-hibernate.patch
 scsi_common-do-not-clobber-fixed-sense-information.patch
+sched-cputime-fix-steal-time-accounting-vs.-cpu-hotplug.patch