From edaf420fdc122e7a42326fe39274c8b8c9b19d41 Mon Sep 17 00:00:00 2001
From: Dave Johnson <djohnson@sw.starentnetworks.com>
Date: Tue, 23 Oct 2007 22:37:22 +0200
Subject: [PATCH] x86: fix TSC clock source calibration error
Message-ID: <20071018085713.GA11022@elte.hu>

From: Dave Johnson <djohnson@sw.starentnetworks.com>

patch edaf420fdc122e7a42326fe39274c8b8c9b19d41 in mainline.

I ran into this problem on a system that was unable to obtain NTP sync
because the clock was running very slow (over 10000 ppm slow). ntpd had
declared all of its peers 'reject' with a 'peer_dist' reason.

On investigation, the tsc_khz variable was significantly incorrect,
causing xtime to run slow. After a reboot tsc_khz was correct, so I
did a reboot test to see how often the problem occurred:

Test was done on a 2000 MHz Xeon system. Of 689 reboots, 8 of them
had unacceptable tsc_khz values (>500 ppm off):
21
22 range of tsc_khz # of boots % of boots
23 ---------------- ---------- ----------
24 < 1999750 0 0.000%
25 1999750 - 1999800 21 3.048%
26 1999800 - 1999850 166 24.128%
27 1999850 - 1999900 241 35.029%
28 1999900 - 1999950 211 30.669%
29 1999950 - 2000000 42 6.105%
30 2000000 - 2000000 0 0.000%
31 2000050 - 2000100 0 0.000%
32 [...]
33 2000100 - 2015000 1 0.145% << BAD
34 2015000 - 2030000 6 0.872% << BAD
35 2030000 - 2045000 1 0.145% << BAD
36 2045000 < 0 0.000%

The worst boot was 2032.577 MHz, over 1.5% off!
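(For scale: 2032577 kHz against the nominal 2000000 kHz is an error of
(2032577 - 2000000) / 2000000, about 1.6% or roughly 16000 ppm; a tsc_khz
overestimate of that size makes xtime advance correspondingly slower than
real time.)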

It appears that on rare occasions, mach_countup() takes longer to
complete than necessary.

I suspect that this is caused by the CPU taking a periodic SMI
interrupt right at the end of the 30ms calibration loop. This would
cause the loop to delay while the SMI BIOS handler runs. The resulting
TSC value is then higher than it actually should be, which results in a
higher tsc_khz.

The patch below makes native_calculate_cpu_khz() take the best
(shortest duration, lowest khz) run of its 3 calibration loops. If an
SMI goes off and causes a bad result (long duration, higher khz), that
run will be discarded.
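
As a rough sketch (the actual 2.6.22 diff follows below), the calibration
loop with this change boils down to keeping the smallest TSC delta seen
across the three runs, so a run stretched by an SMI is simply ignored
(the mach_prepare_counter() call is assumed from the surrounding,
unpatched 2.6.22 loop):

	unsigned long long start, end;
	unsigned long count;
	u64 delta64 = (u64)ULLONG_MAX;	/* start from the worst case */
	int i;

	/* run 3 times to ensure the cache is warm */
	for (i = 0; i < 3; i++) {
		mach_prepare_counter();
		rdtscll(start);
		mach_countup(&count);	/* fixed-length PIT countdown (~30ms) */
		rdtscll(end);
		/* keep only the shortest run; an SMI stretches end - start */
		delta64 = min(delta64, (end - start));
	}
	/* tsc_khz is then derived from delta64, the cleanest of the 3 runs */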

With the patch applied, 300 boots of the same system produce good
results:

range of tsc_khz    # of boots   % of boots
----------------    ----------   ----------
      < 1999750              0       0.000%
1999750 - 1999800           30      10.000%
1999800 - 1999850          166      55.333%
1999850 - 1999900           89      29.667%
1999900 - 1999950           15       5.000%
1999950 <                    0       0.000%

Problem was found and tested against 2.6.18. Patch is against 2.6.22.

Signed-off-by: Dave Johnson <djohnson@sw.starentnetworks.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 arch/i386/kernel/tsc.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

--- a/arch/i386/kernel/tsc.c
+++ b/arch/i386/kernel/tsc.c
@@ -122,7 +122,7 @@ unsigned long native_calculate_cpu_khz(v
 {
 	unsigned long long start, end;
 	unsigned long count;
-	u64 delta64;
+	u64 delta64 = (u64)ULLONG_MAX;
 	int i;
 	unsigned long flags;
 
@@ -134,6 +134,7 @@ unsigned long native_calculate_cpu_khz(v
 		rdtscll(start);
 		mach_countup(&count);
 		rdtscll(end);
+		delta64 = min(delta64, (end - start));
 	}
 	/*
 	 * Error: ECTCNEVERSET
@@ -144,8 +145,6 @@ unsigned long native_calculate_cpu_khz(v
 	if (count <= 1)
 		goto err;
 
-	delta64 = end - start;
-
 	/* cpu freq too fast: */
 	if (delta64 > (1ULL<<32))
 		goto err;