From: Harlan Stenn
Date: Wed, 21 Dec 2011 20:05:08 +0000 (-0500)
Subject: Documentation updates
X-Git-Tag: NTP_4_2_7P242~1
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=e2dfda9f915af64d1701bd250cb7933b50a11ad9;p=thirdparty%2Fntp.git

Documentation updates

bk: 4ef23bf4imuA-y980ogmU8gDZqVL0A
---

diff --git a/ChangeLog b/ChangeLog
index 84b0b2d34..6d39bfb67 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,5 @@
+* Include missing html/icons/sitemap.png, reported by Michael Tatarinov.
+* Documentation updates from Dave Mills.
 (4.2.7p241) 2011/12/18 Released by Harlan Stenn
 * [Bug 2015] Overriding sys_tick should recalculate sys_precision.
 * [Bug 2037] Fuzzed non-interpolated clock may decrease.
diff --git a/html/icons/sitemap.png b/html/icons/sitemap.png
new file mode 100644
index 000000000..17c7c5517
Binary files /dev/null and b/html/icons/sitemap.png differ
diff --git a/html/prefer.html b/html/prefer.html
index f402a2dbc..cf2b10830 100644
--- a/html/prefer.html
+++ b/html/prefer.html
@@ -9,7 +9,7 @@

gif from Alice's Adventures in Wonderland, Lewis Carroll

Listen carefully to what I say; it is very complicated.

Last update:
- 05-Dec-2011 7:21
+ 16-Dec-2011 20:58
UTC


Related Links

@@ -29,13 +29,13 @@

This page summarizes the criteria for choosing from among the survivors of the clock cluster algorithm a set of contributors to the clock discipline algorithm. The criteria are very meticulous, since they have to handle many different scenarios that may be optimized for special circumstances, including some scenarios designed to support planetary and deep space missions.

Recall the suite of NTP data acquisition and grooming algorithms. These algorithms proceed in five phases. Phase one discovers the available sources and mobilizes an association for each source found. These sources can result from explicit configuration, broadcast discovery or the pool and manycast autonomous configuration schemes. See the Automatic Server Discovery Schemes page for further information.

Phase two selects the candidates from among the sources by excluding those sources showing one or more of the errors summarized on the Clock Select Algorithm page. Phase three uses the algorithm described on that page to determine the truechimers from among the candidates, leaving behind the falsetickers. A server or peer configured with the true option is declared a truechimer independent of this algorithm. Phase four uses the algorithm described on the Clock Cluster Algorithm page to trim the statistical outliers from the truechimers, leaving the survivor list as the result.

-

Phase five uses a set of algorithms and mitigation rules to combine the survivor statistics and discipline the system clock. The mitigation rules select from among the survivors a system peer from which a set of system statistics can be inherited and passed along to dependent clients, if any. The algorithms and rules are the main topic of this page. The clock offset developed from these algorithms can discipline the system clock either using the clock discipline algorithm or enable the kernel to discipline the system clock directly, as described on the A Kernel Model for Precision Timekeeping page.

+

Phase five uses a set of algorithms and mitigation rules to combine the survivor statistics and discipline the system clock. The mitigation rules select from among the survivors a system peer from which a set of system statistics can be inherited and passed along to dependent clients, if any. The algorithms and rules are the main topic of this page. The clock offset developed from these algorithms can discipline the system clock, either using the clock discipline algorithm or using the kernel to discipline the system clock directly, as described on the A Kernel Model for Precision Timekeeping page.

Combine Algorithm

The clock combine algorithm uses the survivor list to produce a weighted average of both offset and jitter. Absent other considerations discussed later, the combined offset is used to discipline the system clock, while the combined jitter is augmented with other components to produce the system jitter statistic inherited by dependent clients, if any.

The clock combine algorithm uses a weight factor for each survivor equal to the reciprocal of the root distance. The weights are normalized so that their sum is equal to unity. This design favors the survivors at the smallest root distance and thus the smallest maximum error.
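The weighting described above can be sketched as follows. This is a minimal illustration and not the ntpd implementation; the function names and the (offset, root distance) tuple layout are assumptions made for the example, and only the offset is combined here.

```python
from typing import List, Tuple


def normalized_weights(root_distances: List[float]) -> List[float]:
    """Reciprocal-of-root-distance weights, scaled so they sum to 1."""
    reciprocals = [1.0 / d for d in root_distances]  # distances must be > 0
    total = sum(reciprocals)
    return [r / total for r in reciprocals]


def combine_offsets(survivors: List[Tuple[float, float]]) -> float:
    """Weighted average offset.

    Each survivor is a hypothetical (offset, root_distance) pair in seconds.
    """
    weights = normalized_weights([dist for _, dist in survivors])
    return sum(w * offset for w, (offset, _) in zip(weights, survivors))
```

With two survivors, one at half the root distance of the other, the closer survivor receives twice the weight, so its offset dominates the combined value.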

Anti-Clockhop Algorithm

The anti-clockhop algorithm is intended for cases where multiple servers are available on a fast LAN with modern computers. Typical offset differences between servers in such cases are less than 0.5 ms. However, changes between servers can result in unnecessary system jitter. The object of the anti-clockhop algorithm is to avoid changing the current server unless it becomes stale or the offset differences between it and the others on the survivor list become substantial.

-

To help compact this discussion, we will call the last selected server as the old peer, and the server at the head of the survivor list the candidate peer. The anti-clockhop algorithm is called immediately after the combine algorithm. First, the survivor list produced by the clock cluster algorithm is sorted by increasing root distance. The algorithm then initializes the anti-clockhop threshold with the value of mindist, by default 1 ms.

+

To help compact this discussion, we will call the last selected server the old peer, and the currently selected server the candidate peer. The anti-clockhop algorithm is called immediately after the combine algorithm. The metric used to select the candidate peer is formed as the root distance plus the product of the stratum and the mindist variable. The survivor with the minimum metric on the list produced by the clock cluster algorithm becomes the candidate peer. The algorithm then initializes the anti-clockhop threshold with the value of mindist, by default 1 ms.

If there was no old peer or the old and candidate peers are the same, the candidate peer becomes the system peer. If not, the algorithm measures the difference between the offset of the old peer and the candidate peer. If the difference exceeds the anti-clockhop threshold, the candidate peer becomes the system peer and the anti-clockhop threshold is restored to its original value. If not, the old peer continues as the system peer. However, at each subsequent update, the algorithm reduces the anti-clockhop threshold by half. Should operation continue in this way, the candidate peer will eventually become the system peer.
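The decision rule in the paragraph above can be sketched as a small state machine. This is an illustration under stated assumptions, not the ntpd code: the class name, the dictionary of current offsets keyed by peer name, and the treatment of an old peer that has dropped off the survivor list as "no old peer" are all hypothetical choices.

```python
from typing import Dict, Optional


class AntiClockhop:
    """Tracks the old peer and the anti-clockhop threshold (seconds)."""

    def __init__(self, mindist: float = 0.001) -> None:
        self.mindist = mindist          # default 1 ms, per the text
        self.threshold = mindist        # current anti-clockhop threshold
        self.old_peer: Optional[str] = None

    def update(self, offsets: Dict[str, float], candidate: str) -> str:
        """Return the system peer for this update.

        offsets maps survivor name to its current offset; candidate is
        the survivor already chosen as the candidate peer.
        """
        old = self.old_peer
        if old is None or old == candidate or old not in offsets:
            system_peer = candidate
        elif abs(offsets[old] - offsets[candidate]) > self.threshold:
            # Difference is substantial: switch and restore the threshold.
            system_peer = candidate
            self.threshold = self.mindist
        else:
            # Keep the old peer, but halve the threshold for next time.
            system_peer = old
            self.threshold /= 2.0
        self.old_peer = system_peer
        return system_peer
```

Because the threshold halves on every update in which the old peer is retained, any nonzero offset difference eventually exceeds it, which is why the candidate peer eventually becomes the system peer.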

Peer Classification

The behavior of the various algorithms and mitigation rules involved depends on how the various synchronization sources are classified. This depends on whether the source is local or remote and if local, the type of source. The following classes are defined:

diff --git a/html/select.html b/html/select.html
index 4754fefdd..8a227da6f 100644
--- a/html/select.html
+++ b/html/select.html
@@ -10,14 +10,14 @@

Clock Select Algorithm

Last update:
- 04-Dec-2011 14:27
+ 16-Dec-2011 13:54
UTC


The clock select algorithm determines which of a set of sources are correct (truechimers) and which are not (falsetickers) according to a set of formal correctness assertions. The principles are based on the observation that the maximum error in determining the offset of a candidate cannot exceed one-half the roundtrip delay to the primary reference clock at the time of measurement. This must be increased by the maximum error that can accumulate since then. The selection metric, called the root distance, is one-half the roundtrip root delay plus the root dispersion plus minor error contributions not considered here.

First, a number of sanity checks are performed to sift the selectable candidates from among the source population. The sanity checks are summarized as follows:

  1. A stratum error occurs if (1) the source has never been synchronized or (2) the stratum of the source is below the floor option or not below the ceiling option of the tos command. The default values for these options are 0 and 15, respectively. Note that 15 is a valid stratum, but a server operating at that stratum cannot synchronize clients.
-
  2. A distance error occurs for a remote source if the root distance (also known as synchronization distance) of the source is not below the distance threshold maxdist option of the tos command. The default value for this option is 1.5 s for networks including only the Earth, but this should be increased to 2.5 s for networks including the Moon.
+
  2. A distance error occurs for a source if the root distance (also known as synchronization distance) of the source is not below the distance threshold maxdist option of the tos command. The default value for this option is 1.5 s for networks including only the Earth, but this should be increased to 2.5 s for networks including the Moon.
  3. A loop error occurs if the source is synchronized to the client. This can occur if two peers are configured with each other in symmetric modes.
  4. An unreachable error occurs if the source is unreachable or if the server or peer command for the source includes the noselect option.
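
The four sanity checks listed above can be sketched as a single predicate. This is an illustration only; the Source record and its field names are hypothetical and do not correspond to ntpd data structures. The defaults for floor, ceiling and maxdist are the values stated in the list.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Source:
    """Hypothetical summary of one source, for illustration only."""
    stratum: int
    synchronized: bool           # has the source ever been synchronized?
    root_distance: float         # seconds
    synced_to_client: bool = False
    reachable: bool = True
    noselect: bool = False


def sanity_errors(src: Source, floor: int = 0, ceiling: int = 15,
                  maxdist: float = 1.5) -> List[str]:
    """Return the names of the sanity checks the source fails."""
    errors = []
    if (not src.synchronized or src.stratum < floor
            or src.stratum >= ceiling):
        errors.append("stratum")
    if src.root_distance >= maxdist:      # "not below" the threshold
        errors.append("distance")
    if src.synced_to_client:
        errors.append("loop")
    if not src.reachable or src.noselect:
        errors.append("unreachable")
    return errors
```

A source that yields an empty list would count as a selectable candidate in this sketch.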
diff --git a/html/warp.html b/html/warp.html
index 85b030482..08ca1e04a 100644
--- a/html/warp.html
+++ b/html/warp.html
@@ -9,7 +9,7 @@

How NTP Works

Last update:
- 05-Dec-2011 16:26
+ 15-Dec-2011 16:30
UTC

Related Links

@@ -43,21 +43,19 @@

The algorithm described on the Clock Filter Algorithm page selects the offset and delay samples most likely to produce accurate results. Those servers that have passed the sanity tests are declared selectable. From the selectable population the statistics are used by the algorithm described on the Clock Select Algorithm page to determine a number of truechimers according to correctness principles. From the truechimer population the algorithm described on the Clock Cluster Algorithm page determines a number of survivors on the basis of statistical clustering principles. The algorithms described on the Mitigation Rules and the prefer Keyword page combine the survivor offsets, designate one of them as the system peer and produce the final offset used by the algorithm described on the Clock Discipline Algorithm page to adjust the system clock time and frequency. The clock offset and frequency are recorded by the loopstats option of the filegen command. For additional details about these algorithms, see the Architecture Briefing on the Network Time Synchronization Research Project page.

NTP Timescale and Data Formats

NTP clients and servers synchronize to the Coordinated Universal Time (UTC) timescale used by national laboratories and disseminated by radio, satellite and telephone modem. This is a global timescale independent of geographic position. There are no provisions for local time zone or daylight savings time; however, these functions can be performed by the operating system on a per-user basis.

-

The UT1 timescale, upon which UTC is based, is determined by the rotation of the Earth about its axis, which is gradually slowing down. In order to rationalize UTC with respect to UT1, a leap second is inserted at intervals of about 18 months, as determined by the International Earth Rotation Service (IERS). The historic insertions are documented in the leap-seconds.list file, which can be downloaded from the NIST FTP server. This file is updated at intervals not exceeding six months. Leap second warnings are disseminated by the national laboratories in the broadcast timecode format. These warnings are propagated from the NTP primary servers via other servers to the clients by the NTP on-wire protocol. The leap second is implemented by the operating system kernel, as described in the white paper The NTP Timescale and Leap Seconds.

+

The UT1 timescale, upon which UTC is based, is determined by the rotation of the Earth about its axis, which is gradually slowing down relative to International Atomic Time (TAI). In order to rationalize UTC with respect to TAI, a leap second is inserted at intervals of about 18 months, as determined by the International Earth Rotation Service (IERS). The historic insertions are documented in the leap-seconds.list file, which can be downloaded from the NIST FTP server. This file is updated at intervals not exceeding six months. Leap second warnings are disseminated by the national laboratories in the broadcast timecode format. These warnings are propagated from the NTP primary servers via other servers to the clients by the NTP on-wire protocol. The leap second is implemented by the operating system kernel, as described in the white paper The NTP Timescale and Leap Seconds.

There are two NTP time formats, a 64-bit timestamp format and a 128-bit date format. The date format is used internally, while the timestamp format is used in packet headers exchanged between clients and servers. The timestamp format spans 136 years, called an era. The current era began on 1 January 1900, while the next one begins in 2036. Details on these formats and conversions between them are in the white paper The NTP Era and Era Numbering. However, the NTP protocol will synchronize correctly, regardless of era, as long as the system clock is set initially within 68 years of the correct time. Further discussion on this issue is in the white paper NTP Timestamp Calculations. Ordinarily, these formats are not seen by application programs, which convert these NTP formats to native Unix or Windows formats.
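As an illustration of the 64-bit timestamp format and the 68-year rule, the sketch below converts a packet timestamp to Unix time. It assumes the usual layout of 32 bits of seconds followed by 32 bits of fraction in network byte order, and the well-known 2,208,988,800 s offset between 1 January 1900 and 1 January 1970. The function name and the pivot parameter (an approximate current time used to pick the era) are assumptions made for the example.

```python
import struct

NTP_UNIX_OFFSET = 2_208_988_800   # seconds from 1900-01-01 to 1970-01-01
ERA_SECONDS = 2 ** 32             # one era, about 136 years


def ntp_to_unix(timestamp: bytes, pivot_unix: float) -> float:
    """Convert an 8-byte NTP timestamp to Unix seconds.

    The era is chosen so the result lies within half an era (about
    68 years) of pivot_unix, the caller's rough idea of the time.
    """
    seconds, fraction = struct.unpack("!II", timestamp)
    era0_unix = seconds + fraction / 2 ** 32 - NTP_UNIX_OFFSET
    era = round((pivot_unix - era0_unix) / ERA_SECONDS)
    return era0_unix + era * ERA_SECONDS
```

For example, a timestamp whose seconds field has wrapped to 0 is read as the start of the next era, in 2036, when the pivot is near that date, rather than as 1900.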

Statistics Budget

Each NTP synchronization source is characterized by the offset and delay samples measured by the on-wire protocol using the equations above. The dispersion sample is initialized with the sum of the server precision and the client precision as each update is received. The dispersion increases at a rate of 15 μs/s after that. For this purpose, the precision is equal to the latency to read the system clock. The offset, delay and dispersion are called the sample statistics.

-

In a window of eight (offset, delay, dispersion) samples, the algorithm described on the Clock Filter Algorithm page selects the sample with minimum delay, which generally represents the most accurate offset statistic. The selected sample becomes the peer offset and peer delay statistics. The peer dispersion is a weighted average of the dispersion samples in the window. These quantities are recalculated as each update is received from the server. Between updates, both the sample dispersion and peer dispersion continue to grow at the same rate, 15 μs/s. Finally, the peer jitter is determined as the root mean square (RMS) of the offset samples in the window relative to the selected offset sample. The peer statistics are recorded by the peerstats option of the filegen command. Peer variables are displayed by the rv command of the ntpq program.

-

The clock filter algorithm continues to process packets in this way until the source is no longer reachable. Reachability is determined by an eight-bit shift register, which is shifted left by one bit as each poll packet is sent, with 0 replacing the vacated rightmost bit. Each time an update is received, the rightmost bit is set to 1. The source is considered reachable if any bit is set to 1 in the register; otherwise, it is considered unreachable.

-

A server is considered nonselectable if it is unreachable, or the peer synchronization distance abbreviated to peer distance (see below) is above the select threshold, or if a timing loop is present. If none of these conditions exist, the server is considered selectable. The select threshold is by default 1.5 s, but can be changed by the maxdist option of the tos command. A timing loop is present if the server is synchronized to the client, which can occur, for example, if they are configured in symmetric modes with each other. When a source becomes unreachable, a dummy sample with "infinite" dispersion is inserted in the shift register at each poll, thus displacing old samples. This causes the peer dispersion, and thus the peer distance, to increase and eventually to exceed the select threshold.

+

In a window of eight (offset, delay, dispersion) samples, the algorithm described on the Clock Filter Algorithm page selects the sample with minimum delay, which generally represents the most accurate offset statistic. The selected sample becomes the peer offset and peer delay statistics. The peer dispersion is a weighted average of the dispersion samples in the window. These quantities are recalculated as each update is received from the source. Between updates, both the sample dispersion and peer dispersion continue to grow at the same rate, 15 μs/s. Finally, the peer jitter is determined as the root mean square (RMS) of the offset samples in the window relative to the selected offset sample. The peer statistics are recorded by the peerstats option of the filegen command. Peer variables are displayed by the rv command of the ntpq program.
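The minimum-delay selection and the jitter calculation can be sketched as below. This is a simplified illustration: the Sample type and function name are hypothetical, the peer dispersion weighting is omitted because the text does not give the weights, and averaging the squared differences over the samples other than the selected one is an assumption about how the RMS is normalized.

```python
import math
from typing import List, NamedTuple, Tuple


class Sample(NamedTuple):
    offset: float      # seconds
    delay: float       # seconds
    dispersion: float  # seconds


def clock_filter(window: List[Sample]) -> Tuple[float, float, float]:
    """Return (peer_offset, peer_delay, peer_jitter) for a sample window."""
    best_index = min(range(len(window)), key=lambda i: window[i].delay)
    best = window[best_index]
    others = [s for i, s in enumerate(window) if i != best_index]
    if others:
        # RMS of offset differences relative to the selected sample.
        mean_square = sum((s.offset - best.offset) ** 2
                          for s in others) / len(others)
        jitter = math.sqrt(mean_square)
    else:
        jitter = 0.0
    return best.offset, best.delay, jitter
```

The window in the text holds eight samples; the sketch accepts any non-empty list so that it can be exercised with shorter examples.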

+

The clock filter algorithm continues to process packets in this way until the source is no longer reachable. Reachability is determined by an eight-bit shift register, which is shifted left by one bit as each poll packet is sent, with 0 replacing the vacated rightmost bit. Each time a valid update is received, the rightmost bit is set to 1. The source is considered reachable if any bit is set to 1 in the register; otherwise, it is considered unreachable. When a source becomes unreachable, a dummy sample with "infinite" dispersion is inserted in the shift register at each poll, thus displacing old samples. This causes the peer dispersion to increase eventually to infinity.
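The reachability register described above behaves like the following sketch. The class and method names are hypothetical; only the shift, set and test operations come from the text.

```python
class ReachRegister:
    """Eight-bit reachability shift register."""

    def __init__(self) -> None:
        self.register = 0

    def poll_sent(self) -> None:
        # Shift left by one bit; a 0 enters on the right and the
        # leftmost bit falls off the eight-bit register.
        self.register = (self.register << 1) & 0xFF

    def update_received(self) -> None:
        # A valid update sets the rightmost bit.
        self.register |= 0x01

    @property
    def reachable(self) -> bool:
        return self.register != 0
```

After one valid update, the source stays reachable for the next seven polls even if nothing further arrives; the eighth unanswered poll shifts the last 1 bit out and the source becomes unreachable.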

The composition of the survivor population and the system peer selection is redetermined as each update from each source is received. The system peer and system variables are determined as described on the Mitigation Rules and the prefer Keyword page. The system variables are copied from the system peer variables of the same name and the system stratum set one greater than the system peer stratum. The system statistics are recorded by the loopstats option of the filegen command. System variables are displayed by the rv command of the ntpq program.

-

The system synchronization distance, usually called the root distance, is defined as half the system peer delay plus the system peer dispersion. Between updates it increases at the same rate as the system peer dispersion, even if all sources have become unselectable. If the server root distance exceeds the client select threshold, as apparent to dependent clients, the server is considered nonselectable. It is important to understand that a server in this condition remains a reliable source of synchronization within its error bounds, as described in the next section.

Quality of Service

-

The algorithms described on the Mitigation Rules and the prefer Keyword page deliver several important statistics, including system offset and system jitter. These statistics are determined by the mitigation algorithms from the survivor statistics produced by the clock cluster algorithm. System offset is best interpreted as the maximum-likelihood estimate of the system clock offset, while system jitter is best interpreted as the expected error of this estimate.

+

The algorithms described on the Mitigation Rules and the prefer Keyword page deliver several important statistics, including system offset and system jitter. These statistics are determined from the survivor statistics produced by the clock cluster algorithm. System offset is best interpreted as the maximum-likelihood estimate of the system clock offset, while system jitter is best interpreted as the expected error of this estimate.

Of interest in the following discussion is how the client determines these statistics from a survivor population including reference clocks and remote servers. This is determined from two statistics, expected error and maximum error. Expected error, also called system jitter, is determined from various jitter components; it represents the nominal error in determining the clock offset.

Maximum error is determined from delay and dispersion contributions and represents the worst-case error due to all causes. In order to simplify discussion, certain minor contributions to the maximum error statistic are ignored. Elsewhere in the documentation the maximum error is called system synchronization distance or root distance. If the precision time kernel support is available, both the estimated error and maximum error are reported to user programs via the ntp_gettime() kernel system call. See the Kernel Model for Precision Timekeeping page for further information.

The maximum error statistic is computed as one-half the root delay to the primary source of time; i.e., the primary reference clock, plus the root dispersion. The root variables are included in the NTP packet header received from each server. When calculating maximum error, the root delay is the sum of the root delay in the packet and the peer delay, while the root dispersion is the sum of the root dispersion in the packet and the peer dispersion.
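The arithmetic in the paragraph above can be written out as a short sketch. The function and parameter names are hypothetical, and the minor contributions that the text ignores are ignored here as well.

```python
def maximum_error(pkt_root_delay: float, pkt_root_dispersion: float,
                  peer_delay: float, peer_dispersion: float) -> float:
    """Maximum error (root distance) in seconds.

    The packet fields come from the NTP header received from the
    server; the peer fields are measured by the client.
    """
    root_delay = pkt_root_delay + peer_delay
    root_dispersion = pkt_root_dispersion + peer_dispersion
    return root_delay / 2.0 + root_dispersion
```

For instance, a packet root delay of 40 ms and root dispersion of 10 ms, combined with a peer delay of 20 ms and peer dispersion of 5 ms, gives 30 ms + 15 ms = 45 ms of maximum error.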

-

A source is considered selectable only if its maximum error is less than the select threshold, by default 1.5 s, but can be changed according to client preference using the maxdist option of the tos command. A common consequence is when an upstream server loses all sources and its maximum error apparent to dependent clients begins to increase. The clients are not aware of this condition and continue to accept synchronization as long as the maximum error is less than the select threshold.

+

A source is considered selectable only if its maximum error is less than the select threshold, by default 1.5 s, but can be changed according to client preference using the maxdist option of the tos command. A common consequence is when an upstream server loses all sources and its maximum error apparent to dependent clients continues to increase. The clients are not aware of this condition and continue to accept synchronization as long as the maximum error is less than the select threshold.

Although it might seem counterintuitive, a cardinal rule in the selection process is, once a sample has been selected by the clock filter algorithm, older samples are no longer selectable. This applies also to the clock select algorithm. Once the peer variables for a source have been selected, older variables of the same or other sources are no longer selectable. The reason for these rules is to limit the time delay in the clock discipline algorithm. This is necessary to preserve the optimum impulse response and thus the risetime and overshoot.

This means that not every sample can be used to update the peer variables, and up to seven samples can be ignored between selected samples. This fact has been carefully considered in the discipline algorithm design with due consideration for feedback loop delay and minimum sampling rate. In engineering terms, even if only one sample in eight survives, the resulting sample rate is twice the Nyquist rate at any time constant and poll interval.

Clock Initialization and Management