From: Miroslav Lichvar
Date: Wed, 8 Oct 2025 11:09:10 +0000 (+0200)
Subject: local: improve measurement of clock precision
X-Git-Url: http://git.ipfire.org/gitweb.cgi?a=commitdiff_plain;h=2e29935c548fe842e700ffef3ba79efc22a94a51;p=thirdparty%2Fchrony.git

local: improve measurement of clock precision

By default, the clock precision is set to the minimum measured time
needed to read the clock. This value is typically larger than the
actual resolution, which causes the NTP server to add more noise to
NTP timestamps than necessary. With HW timestamping and PTP
corrections enabled by the NTP-over-PTP transport, this can be the
limiting factor in the stability of NTP measurements.

Try to determine the actual resolution of the clock. On non-Linux
systems use the clock_getres() function. On FreeBSD and NetBSD it
seems to provide the expected values. On illumos it returns a large
value (kernel tick length?). On Linux it seems to return the internal
timer resolution, which is 1 ns with hrtimers, even when using a
lower-resolution clocksource like hpet or acpi_pm.

On Linux, try to measure the resolution as the minimum observed change
in differences between consecutive readings of the CLOCK_MONOTONIC_RAW
clock with a varying amount of busy work. Ignore 1 ns changes due to
the kernel converting readings to timespec. This seems to work
reliably. In a test with the acpi_pm clocksource, differences of 3073,
3352, and 3631 ns were measured, which gives a resolution of 279 ns,
matching the clocksource frequency of ~3.58 MHz. With a tsc
clocksource it gives the minimum accepted resolution of 2 ns, and with
kvm-clock 10 ns.

As the final value of the precision, use the minimum of the measured
or clock_getres() resolution and the original minimum time needed to
read the clock.
---

diff --git a/doc/chrony.conf.adoc b/doc/chrony.conf.adoc
index fee500e6..39e47a6f 100644
--- a/doc/chrony.conf.adoc
+++ b/doc/chrony.conf.adoc
@@ -1133,23 +1133,29 @@ distances are in milliseconds. 
[[clockprecision]]*clockprecision* _precision_:: The *clockprecision* directive specifies the precision of the system clock (in -seconds). It is used by *chronyd* to estimate the minimum noise in NTP -measurements and randomise low-order bits of timestamps in NTP responses. By -default, the precision is measured on start-up as the minimum time to read the -clock. -+ -The measured value works well in most cases. It generally overestimates the -precision and it can be sensitive to the CPU speed, however, which can -change over time to save power. In some cases with a high-precision clocksource -(e.g. the Time Stamp Counter of the CPU) and hardware timestamping, setting the -precision on the server to a smaller value can improve stability of clients' -NTP measurements. The server's precision is reported on clients by the +seconds). This value is used by *chronyd* as the minimum expected error and +amount of noise in NTP and refclock measurements, and to randomise low-order +bits of timestamps in NTP responses to make them less predictable. The minimum +value is 1 nanosecond and the maximum value is 1 second. ++ +By default, *chronyd* tries to determine the precision on start-up as the +resolution of the clock. On Linux, it tries to measure the resolution by +observing the minimum change in differences between consecutive readings of the +clock. On other systems it relies on the *clock_getres(2)* system function. ++ +If the measurement fails, or the value provided by the system is too large, the +minimum measured time needed to read the clock will be used instead. This value +is typically larger than the resolution, and it is sensitive to the CPU speed, +however, which can change over time to save power. ++ +The server's precision is reported on clients by the <> command. + -An example setting the precision to 8 nanoseconds is: +An example setting the precision to 1 nanosecond (e.g. 
when the system clock is +using a Time Stamp Counter (TSC) updated at a rate of at least 1 GHz) is: + ---- -clockprecision 8e-9 +clockprecision 1e-9 ---- [[corrtimeratio]]*corrtimeratio* _ratio_:: diff --git a/local.c b/local.c index 059d8c02..9745f4fb 100644 --- a/local.c +++ b/local.c @@ -97,8 +97,142 @@ static double precision_quantum; static double max_clock_error; +#define NSEC_PER_SEC 1000000000 + +/* ================================================== */ + +/* Ask the system for the resolution of the system clock. The Linux + clock_getres() is not usable, because it reports the internal timer + resolution, which is 1 ns when high-resolution timers are enabled, + even when using a lower-resolution clocksource. */ + +static int +get_clock_resolution(void) +{ +#if defined(HAVE_CLOCK_GETTIME) && !defined(LINUX) + struct timespec res; + + if (clock_getres(CLOCK_REALTIME, &res) < 0) + return 0; + + return NSEC_PER_SEC * res.tv_sec + res.tv_nsec; +#else + return 0; +#endif +} + +/* ================================================== */ + +#if defined(LINUX) && defined(HAVE_CLOCK_GETTIME) && defined(CLOCK_MONOTONIC_RAW) + +static int +compare_ints(const void *a, const void *b) +{ + return *(const int *)a - *(const int *)b; +} + +#define READINGS 64 + +/* On Linux, try to measure the actual resolution of the system + clock by performing a varying amount of busy work between clock + readings and finding the minimum change in the measured interval. + Require a change of at least two nanoseconds to ignore errors + caused by conversion to timespec. Use the raw monotonic clock + to avoid the impact of potential frequency changes due to NTP + adjustments made by other processes, and the kernel dithering of + the 32-bit multiplier. 
*/ + +static int +measure_clock_resolution(void) +{ + int i, j, b, busy, diffs[READINGS - 1], diff2, min; + struct timespec start_ts, ts[READINGS]; + uint32_t acc; + + if (clock_gettime(CLOCK_MONOTONIC_RAW, &start_ts) < 0) + return 0; + + for (acc = 0, busy = 1; busy < 100000; busy = busy * 3 / 2 + 1) { + for (i = 0, b = busy * READINGS; i < READINGS; i++, b -= busy) { + if (clock_gettime(CLOCK_MONOTONIC_RAW, &ts[i]) < 0) + return 0; + + for (j = b; j > 0; j--) + acc += (acc & 1) + (uint32_t)ts[i].tv_nsec; + } + + /* Give up after 0.1 seconds */ + if (UTI_DiffTimespecsToDouble(&ts[READINGS - 1], &start_ts) > 0.1) { + DEBUG_LOG("Measurement too slow"); + return 0; + } + + for (i = 0; i < READINGS - 1; i++) { + diffs[i] = NSEC_PER_SEC * (ts[i + 1].tv_sec - ts[i].tv_sec) + + (ts[i + 1].tv_nsec - ts[i].tv_nsec); + + /* Make sure the differences are sane. A resolution larger than the + reading time will be measured in measure_clock_read_delay(). */ + if (diffs[i] <= 0 || diffs[i] > NSEC_PER_SEC) + return 0; + } + + /* Sort the differences and keep values unique within 1 ns from the + first half of the array, which are less likely to be impacted by CPU + interruptions */ + qsort(diffs, READINGS - 1, sizeof (diffs[0]), compare_ints); + for (i = 1, j = 0; i < READINGS / 2; i++) { + if (diffs[j] + 1 < diffs[i]) + diffs[++j] = diffs[i]; + } + j++; + +#if 0 + for (i = 0; i < j; i++) + DEBUG_LOG("busy %d diff %d %d", busy, i, diffs[i]); +#endif + + /* Require at least three unique differences to be more confident + with the result */ + if (j < 3) + continue; + + /* Find the smallest difference between the unique differences */ + for (i = 1, min = 0; i < j; i++) { + diff2 = diffs[i] - diffs[i - 1]; + if (min == 0 || min > diff2) + min = diff2; + } + + if (min == 0) + continue; + + /* Prevent the compiler from optimising the busy work out */ + if (acc == 0) + min += 1; + + return min; + } + + return 0; +} + +#else +static int +measure_clock_resolution(void) +{ + return 0; +} 
+#endif + /* ================================================== */ +/* As a fallback, measure how long it takes to read the clock. It + typically takes longer than the resolution of the clock (and it + depends on the CPU speed), i.e. every reading gives a different + value, but handle also low-resolution clocks that might give + the same reading multiple times. */ + /* Define the number of increments of the system clock that we want to see to be fairly sure that we've got something approaching the minimum increment. Even on a crummy implementation that can't @@ -106,10 +240,8 @@ static double max_clock_error; under 1s of busy waiting. */ #define NITERS 100 -#define NSEC_PER_SEC 1000000000 - -static double -measure_clock_precision(void) +static int +measure_clock_read_delay(void) { struct timespec ts, old_ts; int iters, diff, best; @@ -135,7 +267,28 @@ measure_clock_precision(void) assert(best > 0); - return 1.0e-9 * best; + return best; +} + +/* ================================================== */ + +static double +measure_clock_precision(void) +{ + int res, delay, prec; + + res = get_clock_resolution(); + if (res <= 0) + res = measure_clock_resolution(); + + delay = measure_clock_read_delay(); + + if (res > 0) + prec = MIN(res, delay); + else + prec = delay; + + return prec / 1.0e9; } /* ================================================== */
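
The selection step at the heart of the new measurement — sort the interval differences, collapse values that are within 1 ns of each other (timespec conversion jitter), and take the smallest gap between the survivors — can be tried standalone on synthetic data. This is a simplified sketch, not the patch's code: `resolution_from_diffs` is an illustrative name, and unlike the patch it considers the whole sorted array rather than only its first half (which the patch keeps to reduce the impact of CPU interruptions):

```c
#include <assert.h>
#include <stdlib.h>

static int
compare_ints(const void *a, const void *b)
{
  return *(const int *)a - *(const int *)b;
}

/* Estimate the clock resolution from measured interval differences
   (in ns): sort them, keep only values unique within 1 ns, and
   return the smallest gap between the remaining values, or 0 if
   fewer than three unique values remain (too little confidence). */
static int
resolution_from_diffs(int *diffs, int n)
{
  int i, j, gap, min = 0;

  if (n < 3)
    return 0;

  qsort(diffs, n, sizeof diffs[0], compare_ints);

  /* Collapse runs of values that differ by at most 1 ns */
  for (i = 1, j = 0; i < n; i++) {
    if (diffs[j] + 1 < diffs[i])
      diffs[++j] = diffs[i];
  }
  j++;

  if (j < 3)
    return 0;

  /* Smallest difference between the unique differences */
  for (i = 1; i < j; i++) {
    gap = diffs[i] - diffs[i - 1];
    if (min == 0 || min > gap)
      min = gap;
  }

  return min;
}
```

Fed the acpi_pm values from the commit message (3073, 3352, 3631 ns, plus 1 ns jitter duplicates), this returns 279, matching the ~279 ns period of the ~3.58 MHz clocksource; with fewer than three unique values it returns 0, mirroring the patch's `continue` path.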