Poll Program

About every eighteen months the International Earth Rotation Service (IERS) issues a bulletin announcing the insertion of a leap second in the Universal Coordinated Time (UTC) timescale. Ordinarily, this happens at the end of the last day of June or December; but, in principal, it could happen at the end of any month. While these bulletins are available on the Internet at www.iers.org, advance notice of leap seconds is also available in signals broadcast from national time and frequency stations, in GPS signals and in telephone modem services. Many, but not all, reference clocks recognize these signals and many, but not all, drivers for them can decode the signals and set the leap bits in the timecode accordingly. This means that many, but not all, primary servers can pass on these bits in the NTP packet heard to dependent secondary servers and clients. Secondary servers can pass these bits to their dependents and so on throughout the NTP subnet.

A leap second is inserted following second 59 of the last minute of the day and becomes second 60 of that day. A leap second is deleted by omitting second 59 of the last minute of the day, although this has never happened and is highly unlikely to happen in future. So far as is known, there are no provisions in the Unix or Windows libraries to account for this occasion other than to to affect the conversion of an NTP datestamp or timestamp to conventional civil time.

When an update is received from a reference clock or downstratum server, the leap bits are inspected for all survivors of the cluster algorithm. If the number of survivors showing a leap bit is greater than half the total number of survivors, a pending leap condition exists until the end of the current month.

When no means are available to determine the leap bits from a reference clock or downstratum server, a leapseconds file can be downloaded from time.nist.gov and installed using the leapfile command. The file includes a list of historic leap second and the NTP time of insertion. It is parsed by the ntpd daemon at startup and the latest leap time saved for future reference. Each time the clock is set, the current time is compared with the last leap time. If the current time is later than the last leap time, nothing further is done. If earlier, the leap timer is initialized with the time in seconds until the leap time and counts down from there. When the leap timer is less than one month, a pending leap condition exists until the end of the current month. If the leapseconds file is present, the leap bits for reference clocks and downstratum servers are ignored.

If the precision time kernel support is installed and enabled, at the begriming of the day of the leap event the leap bits are set in the Unix ntp_adjtime() system call to arm the kernel for the leap at the end of the day. The kernel automatically inserts one second exactly at the time of the leap, after which the leap bits are turned off. If the kernel support is not availed or disabled, the leap is implemented as a crude hack by setting the clock back one second using the Unix settimeofday() system call, which effectively repeats the last second. Note however that in any case setting the time backwards by one second does not actually set the system clock backwards, but effectively stalls the clock for one second. These points are expanded in the white paper The NTP Timescale and Leap Seconds. If the leap timer is less than one day, the leap bits are set for dependent servers and clients.

If the precision time kernel support is available and enabled, at the beginning of the day of the leap event, the leap bits are set by the Unix ntp_adjtime() system call to arm the kernel for the leap at the end of the day. The kernel automatically inserts one second exactly at the time of the leap, after which the leap bits are turned off. If the kernel support is not availed or disabled, the leap is implemented as a crude hack by setting the clock back one second using the Unix settimeofday() system call, which effectively repeats the last second. Note however that in any case setting the time backwards by one second does not actually set the system clock backwards, but effectively stalls the clock for one second. These points are expanded in the white paper The NTP Timescale and Leap Seconds. If the leap timer is less than one day, the leap bits are set for dependent servers and clients.

As an additional feature when the NIST leap seconds file is installed, it is possible to determine the number of leap seconds inserted in UTC since UTC began on 1 January 1972. This represents the offset between International Atomic Time (TAI) and UTC. If the precision time kernel modifications are available and enabled, the TAI offset is available to application programs using the antipasti() system call. If the Auto key public-key cryptography feature is installed, the TAI offset is automatically propagated along with other cryptographic media to dependent servers and clients.

Poll Program

Poll Process

Each poll process sends NTP packets at intervals determined by the poll program. The program is designed to provide a sufficient update rate for thc clock discipline algorithm, while minimizing network overhead. The program is designed to operate over a poll exponent range between 3 (8 s) and 17 (36 hr). The minimum and maximum poll exponent within this range can be set using the minpoll and maxpoll options of the server command, with default 6 (64 s) and 10 (1024 s), respectively.

The poll interval is managed by a heuristic algorithm developed over several years of experimentation. It depends on an exponentially weighted average of clock offset differences, called clock jitter, and a jiggle counter, which is initially set to zero. When a clock update is received and the offset exceeds the clock jitter by a factor of 4, the jiggle counter is increased by the poll exponent; otherwise, it is decreased by twice the poll exponent. If the jiggle counter is greater than an arbitrary threshold of 30, it is reset to 0 and the the poll exponent is increased by 1. If the jiggle counter is less than -30, it is set to 0 and the poll exponent decreased by 1. In effect, the algorithm has a relatively slow reaction to good news, but a relatively fast reaction to bad news.

As an option of the server command, instead of a single packet, the poll program can send a burst of six packets at 2-s intervals. This is designed to reduce the time to synchronize the clock at initial startup (iburst) and/or to reduce the phase noise at the longer poll intervals (burst). The iburst option is effective only when the server is unreachable, while the burst option is effective only when the server is reachable.

For the iburst option the number of packets in the burst is six, which is the number normally needed to synchronize the clock; for the burst option, the number of packets in the burst is determined by the difference between the current poll exponent and the minimum poll exponent as a power of 2. For instance, with the default minimum poll exponent of 6 (64 s), only a single packet is sent for every poll, while the full number of eight packets is sent at poll exponents of 9 (512 s) or more. This insures that the average headway will never exceed the minimum headway.

The poll process sends NTP packets at intervals determined by the clock discipline algorithm. The process is designed to provide a sufficient update rate to maximize accuracy while minimizing network overhead. The process is designed to operate over a poll exponent range between 3 (8 s) and 17 (36 hr). The minimum and maximum poll exponent within this range can be set using the minpoll and maxpoll options of the server command, with default 6 (64 s) and 10 (1024 s), respectively.

As an option of the server command, instead of a single packet, the poll process can send a burst of several packets at 2-s intervals. This is designed to reduce the time to synchronize the clock at initial startup (iburst) and/or to reduce the phase noise at the longer poll intervals (burst). The iburst option is effective only when the server is unreachable, while the burst option is effective only when the server is reachable. The two options are independent of each other and both can be enabled at the same time.

For the iburst option the number of packets in the burst is six, which is the number normally needed to synchronize the clock; for the burst option, the number of packets in the burst is determined by the difference between the current poll exponent and the minimum poll exponent as a power of 2. For instance, with the default minimum poll exponent of 6 (64 s), only one packet is sent for every poll, while the full number of eight packets is sent at poll exponents of 9 (512 s) or more. This insures that the average headway will never exceed the minimum headway.

The burst options can result in increased load on the network if not carefully designed. Both options are affected by the provisions described on the Rate Management and the Kiss-o'-Death Packet page. In addition, when iburst or burst are enabled, the first packet of the burst is sent, but the remaining packets sent only when the reply to the fist packet is received. If no reply has been received after a timeout set by the minpoll option, the first packet is sent again. This means that, even if a server is unreachable, the network load is no more than at the minimum poll interval.

To further reduce the network load when a server is unreachable, an unreach timer is incremented by 1 at each poll interval, but is set to 0 as each packet is received. If the timer exceeds the unreach threshold set at 10, the poll exponent is incremented by 1 and the unreach timer set to 0. This continues until the poll exponent reaches the maximum set by the maxpoll option.

Introduction

Most radio clocks are connected using a serial port operating at speeds of 9600 bps. The accuracy using typical timecode formats, where the on-time epoch is indicated by a designated ASCII character such as carriage-return <cr>, is normally limited to 100 ms. Using carefully crafted averaging techniques, the NTP algorithms can whittle this down to a few tens of microseconds. However, some radios produce a pulse-per-second (PPS) signal which can be used to improve the accuracy to a few microseconds. This page describes the hardware and software necessary for NTP to use the PPS signal.

Most radio clocks are connected using a serial port operating at speeds of 9600 bps. The accuracy using typical timecode formats, where the on-time epoch is indicated by a designated ASCII character such as carriage-return <cr>, is normally limited to 100 μs. Using carefully crafted averaging techniques, the NTP algorithms can whittle this down to a few tens of microseconds. However, some radios produce a pulse-per-second (PPS) signal which can be used to improve the accuracy to a few microseconds. This page describes the hardware and software necessary for NTP to use the PPS signal.

The PPS signal can be connected in either of two ways. On FreeBSD systems (with the PPS_SYNC and pps kernel options) it can be connected directly to the ACK pin of a parallel port. This is the preferred way, as it requires no additional hardware. Alternatively, it can be connected via the DCD pin of a serial port. However, the PPS signal levels are usually incompatible with the serial port interface signals. Note that NTP no longer supports connection via the RD pin of a serial port.

gif

diff --git a/html/prefer.html b/html/prefer.html index cf2b10830..3cf127d18 100644 --- a/html/prefer.html +++ b/html/prefer.html @@ -3,61 +3,62 @@ Mitigation Rules and the prefer Keyword - + +

Mitigation Rules and the `prefer` Keyword

from Alice's Adventures in Wonderland, Lewis Carroll

Listen carefully to what I say; it is very complicated.

Last update: - 16-Dec-2011 20:58 + 18-Jul-2012 16:41 UTC

General Overview

This page summarizes the criteria for choosing from among the survivors of the clock cluster algorithm a set of contributors to the clock discipline algorithm. The criteria are very meticulous, since they have to handle many different scenarios that may be optimized for special circumstances, including some scenarios designed to support planetary and deep space missions.

1. Introduction and Overview

Recall the suite of NTP data acquisition and grooming algorithms. These algorithms proceed in five phases. Phase one discovers the available sources and mobilizes an association for each source found. These sources can result from explicit configuration, broadcast discovery or the pool and manycast autonomous configuration schemes. See the Automatic Server Discovery Schemes page for further information.

Phase two selects the candidates from among the sources by excluding those sources showing one or more of the errors summarized on the Clock Select Algorithmm page and to determine the truechimers from among the candidates, leaving behind the falsetickers. A server or peer configured with the true option is declared a truechimer independent of this algorithm. Phase four uses the algorithm described on the Clock Cluster Algorithm page to trim the statistical outliers from the truechimers, leaving the survivor list as result.

Phase five uses a set of algorithms and mitigation rules to combined the survivor statistics antdiscipline the systen clock. The mitigation rules select from among the survivors a system peer from which a set of system statistics can be inherited and passed along to dependent clients, if any. The algorithms and rules are the main topic of this page. The clock offset developed from these algorithms can discipline the system clock, either using the clock discipline algorithm or using the kernel to discipline the system clock directly, as described on the A Kernel Model for Precision Timekeeping page.

Combine Algorithm

Phase two selects the candidates from among the sources by excluding those sources showing one or more of the errors summarized on the Clock Select Algorithm page and to determine the truechimers from among the candidates, leaving behind the falsetickers. A server or peer configured with the true option is declared a truechimer independent of this algorithm. Phase four uses the algorithm described on the Clock Cluster Algorithm page to prune the statistical outliers from the truechimers, leaving the survivor list as result.

Phase five uses a set of algorithms and mitigation rules to combined the survivor statistics and discipline the system clock. The mitigation rules select from among the survivors a system peer from which a set of system statistics can be inherited and passed along to dependent clients, if any. The mitigation algorithms and rules are the main topic of this page. The clock offset developed from these algorithms can discipline the system clock, either using the clock discipline algorithm or using the kernel to discipline the system clock directly, as described on the A Kernel Model for Precision Timekeeping page.

2. Combine Algorithm

The clock combine algorithm uses the survivor list to produce a weighted average of both offset and jitter. Absent other considerations discussed later, the combined offset is used to discipline the system clock, while the combined jitter is augmented with other components to produce the system jitter statistic inherited by dependent clients, if any.

The clock combine algorithm uses a weight factor for each survivor equal to the reciprocal of the root distance. This is normalized so that the sum of the reciprocals is equal to unity. This design favors the survivors at the smallest root distance and thus the smallest maximum error.

Anti-Clockhop Algorithm

The anti-clockhop algorithm is intended for cases where multiple servers are available on a fast LAN with modern computers. Typical offset differences between servers in such cases are less than 0.5 ms. However, changes between servers can result in unnecessary system jitter. The object of the anti-clockhop algorithm is to avoid changing the current server unless it becomes stale or the offset differences between it and the others on the survivor list becomes substantial.

To help compact this discussion, we will call the last selected server the old peer, and thecurrently selected server the candidate peer. The anti-clockhop algorithm is called immediately after the combine algorithm. The metric used to select the candidate peer is formed as the root distance plus product of the stratum times the mindist variable. It is used to select the minimum survivor on the list produced by the clock cluster algorithm. The algorithm then initializes the anti-clockhop threshold with the value of mindist, by default 1 ms.

If there was no old peer or the old and candidate peers are the same, the candidate peer becomes the system peer. If not, the algorithm measures the difference between the offset of the old peer and the candidate peer. If the difference exceeds the anti-clockhop threshold, the candidate peer becomes the system peer and the anti-clockhop threshold is restored to its original value. If not, the old peer continues as the system peer. However, at each subsequent update, the algorithm reduces the anti-clockhop threshold by half. Should operation continue in this way, the candidate peer will eventually become the system peer.

Peer Classification

3. Anti-Clockhop Algorithm

The anti-clockhop algorithm is intended for cases where multiple servers are available on a fast LAN with modern computers. Typical offset differences between servers in such cases are less than 0.5 ms. However, changes between servers can result in unnecessary system jitter. The object of the anti-clockhop algorithm is to avoid changing the current system peer, unless it becomes stale or has significant offset relative to other candidates on the survivor list.

For the purposes of the following description, call the last selected system peer the old peer, and the currently selected source the candidate peer. At each update, the candidate peer is selected as the first peer on the survivor list sorted by increasing root distance. The algorithm initializes the -clockhop threshold with the value of mindist, by default 1 ms.

The anti-clockhop algorithm is called immediately after the combine algorithm. If there was no old peer or the old and candidate peers are the same, the candidate peer becomes the system peer. If the old peer and the candidate peer are different, the algorithm measures the difference between the offset of the old peer and the candidate peer. If the difference exceeds the clockhop threshold, the candidate peer becomes the system peer and the clockhop threshold is restored to its original value. If the difference is less than the clockhop threshold, the old peer continues as the system peer. However, at each subsequent update, the algorithm reduces the clockhop threshold by half. Should operation continue in this way, the candidate peer will eventually become the system peer.

4. Peer Classification

The behavior of the various algorithms and mitigation rules involved depends on how the various synchronization sources are classified. This depends on whether the source is local or remote and if local, the type of source. The following classes are defined:

An association configured for a remote server or peer is classified as a server. All other associations are classified as device drivers of one kind or another. In general, one or more sources of either type will be configured in each installation.
If all sources have been lost and one or more hosts on a common DMZ network have specified the orphan stratum in the orphan option of the tos command, each of them becomes an orphan parent. Dependent orphan children on the same DMZ network will see the orphan parents as if synchronized to a server at the orphan stratum. Note that, as described below, all the orphan children having the same set of orphan parents will select the same parent.
A selectable association configured for a remote server or peer is classified as a client association. All other selectable associations are classified as device driver associations of one kind or another. In general, one or more sources of either type will be available in each installation.
If all sources have been lost and one or more hosts on a common DMZ network have specified the orphan stratum in the orphan option of the tos command, each of them can become an orphan parent. Dependent orphan children on the same DMZ network will see the orphan parents as if synchronized to a server at the orphan stratum. Note that, as described on the Orphan Mode page, all orphan children will select the same orphan parent for synchronization.
When a device driver has been configured for pulse-per-second (PPS) signals and PPS signals are being received, it is designated the PPS driver. Note that the Pulse-per-Second driver (type 22) is often used as a PPS driver, but any driver can be configure as a PPS driver if the hardware facilities are available. The PPS driver provides precision clock discipline only within ±0.4 s, so it is always associated with another source or sources that provide the seconds numbering function.
When the Undisciplined Local Clock driver (type 1) is configured, it is designated the local driver. It can be used either as a backup source (stratum greater than zero) should all sources fail, or as the primary source (stratum zero) whether or not other sources are available if the prefer option is present. The local driver can be used when the kernel time is disciplined by some other means of synchronization, such as the NIST lock clock scheme, or another synchronization protocol such as the IEEE 1588 Precision Time Protocol (PTP) or Digital Time Synchronization Service (DTSS).
When the Automated Computer Time Service driver (type 18) is configured, it is designated the modem driver. It is used either as a backup source, should all other sources fail, or as the primary source if the prefer option is present.

The `prefer` Peer

5. The `prefer` Peer

The mitigation rules are designed to provide an intelligent selection of the system peer from among the selectable sources of different types. When used with the server or peer commands, the prefer option designates one or more sources as preferred over all others. While the rules do not forbid it, it is usually not useful to designate more than one source as preferred; however, if more than one source is so designated, they are used in the order specified in the configuration file. If the first one becomes un selectable, the second one is considered and so forth. This order of priority is also applicable to multiple PPS drivers, multiple modem drivers and even multiple local drivers, although that would not normally be useful.

The cluster algorithm works on the set of truechimers produced by the select algorithm. At each round the algorithm casts off the survivor least likely to influence the choice of system peer. If selectable, the prefer peer is never discarded; on the contrary, its potential removal becomes a termination condition. However, the prefer peer can still be discarded by the select algorithm as a falseticker; otherwise, the prefer peer becomes the system peer.

Ordinarily, the combine algorithm computes a weighted average of the survivor - offsets to produce the final offset. However, if a prefer + offset and jitter to produce the final values. However, if a prefer peer is among the survivors, the combine algorithm is not used. Instead, - the offset of the prefer peer is used exclusively as the final offset. In the common case involving a radio clock and a flock of remote backup - servers, and with the radio clock designated a prefer peer, the result is that + the offset and jitter of the prefer peer are used exclusively as the final values. In the common case involving a radio clock and a flock of remote backup + servers, and with the radio clock designated a prefer peer, the the radio clock disciplines the system clock as long as the radio itself remains operational. However, if the radio fails or becomes a falseticker, - the averaged backup sources continue to discipline the system clock.

Mitigation Rules

+ the combined backup sources continue to discipline the system clock.

6. Mitigation Rules

As the select algorithm scans the associations for selectable candidates, the modem driver and local driver are segregated for later, but only if not designated a prefer peer. If so designated, the driver is included among the candidate population. In addition, if orphan parents are found, the parent with the lowest metric is segregated for later; the others are discarded. For this purpose the metric is defined as the four-octet IPv4 address or the first four octets of the hashed IPv6 address. The resulting candidates, including any prefer peers found, are processed by the select algorithm to produce a possibly empty set of truechimers.

As previously noted, the cluster algorithm casts out outliers, leaving the survivor list for later processing. The survivor list is then sorted by increasing root distance and the first entry temporarily designated the system peer. At this point the following contributors to the system clock discipline may be available:

If there is a PPS driver and the system clock offset at this point is less than 0.4 s, and if there is a prefer peer among the survivors or if the PPS peer is designated as a prefer peer, the PPS driver becomes the system peer and its offset and jitter are inherited by the system variables, thus overriding any variables already computed. Note that a PPS driver is present only if PPS signals are actually being received and enabled by the associated driver.

Microsoft Windows Authentication

Microsoft Windows Authentication

Public Key Cryptography

Clock Cluster Algorithm