From: Harlan Stenn
Date: Tue, 21 Sep 2010 05:23:50 +0000 (-0400)
Subject: Documentation updates from Dave Mills
X-Git-Url: http://git.ipfire.org/gitweb.cgi?a=commitdiff_plain;h=8ea3354bd09a3043d398d044b6d863a99ce3bda4;p=thirdparty%2Fntp.git

Documentation updates from Dave Mills

bk: 4c9841662l4l7krZ-kUYshzosAIA6g
---

diff --git a/ChangeLog b/ChangeLog
index 3548839a7b..68b4716671 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,4 @@
+* Documentation updates from Dave Mills.
 * [Bug 1635] from 4.2.6p3-RC2: "filegen ... enable" is not default.
 (4.2.7p53) 2010/09/20 Released by Harlan Stenn
 * Documentation updates from Dave Mills.
diff --git a/html/clock.html b/html/clock.html
index 2b70f973fc..d715fcd60e 100644
--- a/html/clock.html
+++ b/html/clock.html
@@ -9,23 +9,56 @@

Clock State Machine

Last update: - 08-Sep-2010 21:38 + 21-Sep-2010 4:15 UTC

+

Table of Contents

+
-

Introduction

-

In the NTPv4 specification and reference implementation a state machine is used to manage the system clock under exceptional conditions, as when the daemon is first started or when encountering severe network congesiton, for example. The state machine uses three thresolds: panic, step and stepout, and a watchdog timer. The thresholds default to 1000 s, 128 ms and 900 s, respectively, but can be changed by command options.

-

The Panic Threhold

-

Most compters today incorporate a time-of-year (TOY) chip to maintain the time when the power is off. When the machine is restarted, the chip is used to initialize the operating system time. In case there is no TOY chip or the TOY time is different from NTP time by more than the panic threshold, the daemon assumes something must be terribly wrong, so exits with a message to the system operator to set the time manually. With the -g option, the daemon will set the clock to NTP time the first time, but exit if the offset exceed the any time after that.

-

The Step and Stepout Thresholds

-

Under ordinary conditions, the clock discipline slews the clock so that the time is effectively continuous and never runs backwards. If due to extreme network congestion, or an offset spike exceeds the step threshold, by default 128 ms, the spike is discarded. However, if offsets greater than the step threshold persist for more than the stepout threshold, by default 900 s, the system clock is stepped to the correct value. In practice the need for a step has been extremely rare and almost always the result of a hardware failure. Both the step threshold and stepout threshold can be set as options to the tinker command.

-

Historically, the most important appliccation of the step function was when a leap second was inserted in the Coordinated Univesal Time (UTC) timescale and kernel precision time support was not available. Further details are on the Leap Second Processing page.

-

In some applications the clock can never be set backward, even it accidentlly set forward a week by some other means. There are several ways to alter the daemon behavior to insure time is always monotone-increasing. If the step threhold is set to zero, there will never be a step. With the -x command line option the daemon will set will set the step threshold to 600 s, which is about the limit of eyeball and wristwatch. However, in any of these cases, the precision time kernel support is disabled, as it cannot handle offsets greater than ±0.5 s.

-

The issues should be carefully considered before using these options. The slew rate is fixed at 500 parts-per-million (PPM) by the Unix kernel. As a result, the clock can take 33 minutes to amortize each second the clock is outside the acceptable range. During this interval the clock will not be consistent with any other network clock and the system cannot be used for distributed applications that require correctly synchronized network time.

-

Frequency Training

-

The frequency file, usually called ntp.drift, contains the latest estimate of clock frequency. If this file does not exist when the daemon is started, the clock state machine enters a special mode designed to measure the particular frequency directly. The measurement takes an interval equal to the stepout threshold, after which the frequency is set and the daemon esumes normal mode where the time and frequency are continuously adjusted. The frequency file is updated at intervals of an hour or more depending on the measured clock stability.

+

Introduction

+

In the NTPv4 specification and reference implementation a state machine is used to manage the system clock under exceptional conditions, as when the daemon is first started or when encountering severe network congestion. This page describes the design and operation of the state machine in detail.

+

The state machine is activated upon receipt of an update by the clock discipline algorithm. Its primary purpose is to determine whether the clock is slewed or stepped and how the initial time and frequency are determined, using three thresholds (panic, step and stepout) and one timer (hold).

+

Panic Threshold

+

Most computers today incorporate a time-of-year (TOY) chip to maintain the time when the power is off. When the computer is restarted, the chip is used to initialize the operating system time. In case there is no TOY chip or the TOY time is different from NTP time by more than the panic threshold, the daemon assumes something must be terribly wrong, so exits with a message to the system operator to set the time manually. With the -g option on the command line, the daemon sets the clock to NTP time at the first update, but exits if the offset exceeds the panic threshold at subsequent updates. The panic threshold default is 1000 s, but it can be changed with the panic option of the tinker command.

+

Step and Stepout Thresholds

+

Under ordinary conditions, the clock discipline gradually slews the clock to the correct time, so that the time is effectively continuous and never stepped forward or backward. If, due to extreme network congestion, an offset spike exceeds the step threshold, the spike is discarded. However, if offset spikes greater than the step threshold persist for an interval longer than the stepout threshold, the system clock is stepped to the correct time. In practice, the need for a step has been extremely rare and almost always the result of a hardware failure or operator error. The step and stepout thresholds default to 128 ms and 300 s, respectively, but can be changed with the step and stepout options of the tinker command. If the step threshold is set to zero, the step function is entirely disabled and the clock is always slewed. With the -x option on the command line, the daemon sets the step threshold to 600 s. If the -g option is used or the step threshold is set greater than 0.5 s, the precision time kernel support is disabled.

+

Historically, the most important application of the step function was when a leap second was inserted in the Coordinated Universal Time (UTC) timescale and the kernel precision time support was not available. This also happened with older reference clocks that indicated an impending leap second, but the radio itself did not respond until it resynchronized some minutes later. Further details are on the Leap Second Processing page.

+

In some applications the clock can never be set backward, even if it is accidentally set forward a week by some evil means. The issues should be carefully considered before using these options. The slew rate is fixed at 500 parts-per-million (PPM) by the Unix kernel. As a result, the clock can take 33 minutes to amortize each second the clock is outside the acceptable range. During this interval the clock will not be consistent with any other network clock and the system cannot be used for distributed applications that require correctly synchronized network time.
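The 33-minute figure follows directly from the 500 PPM slew rate. The sketch below is only an illustration of that arithmetic; the function name `slew_seconds` is hypothetical and is not part of the NTP distribution.

```python
SLEW_RATE_PPM = 500.0  # slew rate quoted in the text above


def slew_seconds(offset_s: float, rate_ppm: float = SLEW_RATE_PPM) -> float:
    """Return the wall-clock seconds needed to slew away offset_s.

    At rate_ppm parts-per-million the clock gains or loses
    rate_ppm microseconds for every second that passes.
    """
    if rate_ppm <= 0:
        raise ValueError("rate_ppm must be positive")
    return abs(offset_s) * 1e6 / rate_ppm


if __name__ == "__main__":
    per_second = slew_seconds(1.0)
    # One second of offset takes 2000 s, roughly 33 minutes, to amortize.
    print(f"{per_second:.0f} s, about {per_second / 60:.1f} min per second of offset")
```

By the same arithmetic, an offset near the 600 s limit set by the -x option would take on the order of two weeks to slew away, which is why the options deserve careful consideration.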

+

Hold Timer

+

When the daemon is started after a considerable downtime, it could be the TOY chip clock has drifted significantly from NTP time. This can cause a transient at system startup. In the past, this has produced a phase transient and resulted in a frequency surge that could take some time, even hours, to subside. When the highest accuracy is required, some means is necessary to manage the startup process so that the clock is quickly set correctly and the frequency is undisturbed. The hold timer is used to suppress frequency adjustments during the training and startup intervals described below. At the beginning of the interval the hold timer is set to the stepout threshold and decrements at one second intervals until reaching zero. However, the hold timer is forced to zero if the residual clock offset is less than 0.5 ms. When nonzero, the discipline algorithm uses a small time constant (equivalent to a poll exponent of 2), but does not adjust the frequency. Assuming that the frequency has been set to within 1 PPM, either from the frequency file or by the training interval described later, the clock is set to within 0.5 ms in less than 300 s.

+

Operating Intervals

+

The state machine operates in one of four nonoverlapping intervals.

+
+
Training interval
+
This interval is used at startup when the frequency file is not present. It begins when the first update is received by the discipline algorithm and ends when an update is received following the stepout threshold. The clock phase is steered to the offset presented at the beginning of the interval, but without affecting the frequency. During the interval further updates are ignored. At the end of the interval the frequency is calculated as the phase change during the interval divided by the length of the interval. This generally results in a frequency error less than 0.5 PPM. Note that, if the intrinsic oscillator frequency error is large, the offset will in general have significant error. This is corrected during the subsequent startup interval.
+
Startup interval
+
This interval is used at startup to amortize the residual offset while not affecting the frequency. If the frequency file is present, it begins when the first update is received by the discipline. If not, it begins after the training interval. It ends when the hold timer decrements to zero or when the residual offset falls below 0.5 ms.
+
Step interval
+
This interval is used as a spike blanker during periods when the offsets exceed the step threshold. The interval continues as long as offsets greater than the step threshold are received. It ends when either an offset less than the step threshold is received or the time since the last valid update exceeds the stepout threshold.
+
Sync Interval
+
This interval is implicit; that is, it is used when none of the above intervals are used.
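The frequency calculation at the end of the training interval can be sketched as follows. This is only an illustration of "phase change divided by interval length"; the function name and the sample numbers are hypothetical, not taken from the daemon.

```python
def training_frequency_ppm(offset_start_s: float,
                           offset_end_s: float,
                           interval_s: float) -> float:
    """Estimate the frequency error in PPM from two offset measurements.

    offset_start_s and offset_end_s are the clock offsets (seconds)
    observed at the beginning and end of the training interval;
    interval_s is the length of the interval in seconds.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    phase_change = offset_end_s - offset_start_s
    return phase_change / interval_s * 1e6


if __name__ == "__main__":
    # Hypothetical numbers: the offset grows by 15 ms over a 300 s interval.
    print(round(training_frequency_ppm(0.010, 0.025, 300.0), 3), "PPM")
```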
+
+

State Transition Function

+

The state machine consists of five states. An event is created when an update is received by the discipline algorithm. Depending on the state and the offset magnitude, the machine performs some actions and transitions to the same or another state. Following is a short description of the states.

+
+
FSET - The frequency file is present
+
Load the frequency file, initialize the hold timer and continue in SYNC state.
+
NSET - The frequency file is not present
+
Initialize the hold timer and continue in FREQ state.
+
FREQ - Frequency training state
+
Disable the clock discipline until the time since the last update exceeds the stepout threshold. When this happens, calculate the frequency, initialize the hold timer and transition to SYNC state.
+
SPIK - Spike state
+
An update greater than the step threshold has occurred. Ignore the update and continue in this state as long as updates greater than the step threshold occur. If a valid update is received, continue in SYNC state. When the time since the last valid update was received exceeds the stepout threshold, step the system clock and continue in SYNC state.
+
SYNC - Ordinary clock discipline state
+
Discipline the system clock time and frequency using the hybrid phase/frequency feedback loop. However, do not discipline the frequency if the hold timer is nonzero.
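The state descriptions above can be sketched as a small transition function. This is a simplified illustration under stated assumptions, not the daemon's code: the class name, method name and returned action labels are hypothetical, the panic threshold is omitted, and the hold timer is modelled as an expiry time rather than a one-second countdown.

```python
from enum import Enum


class State(Enum):
    NSET = "NSET"  # frequency file not present
    FSET = "FSET"  # frequency file present
    FREQ = "FREQ"  # frequency training
    SPIK = "SPIK"  # spike blanking
    SYNC = "SYNC"  # ordinary clock discipline


class ClockStateMachine:
    """Hypothetical model of the five states described in the text."""

    def __init__(self, freq_file_present, step=0.128, stepout=300.0):
        self.state = State.FSET if freq_file_present else State.NSET
        self.step = step          # step threshold, seconds
        self.stepout = stepout    # stepout threshold, seconds
        self.last_valid = 0.0     # time of the last accepted update
        self.hold_until = 0.0     # hold timer expiry (assumed model)

    def update(self, now, offset):
        """Process one update and return a label for the action taken."""
        if self.state in (State.FSET, State.NSET):
            loaded = self.state is State.FSET
            self.hold_until = now + self.stepout
            self.last_valid = now
            self.state = State.SYNC if loaded else State.FREQ
            return "load-frequency" if loaded else "start-training"
        if self.state is State.FREQ:
            if now - self.last_valid <= self.stepout:
                return "ignore"  # discipline disabled while training
            self.hold_until = now + self.stepout
            self.last_valid = now
            self.state = State.SYNC
            return "set-frequency"
        # SYNC or SPIK: a step threshold of zero disables stepping.
        if self.step > 0 and abs(offset) > self.step:
            if now - self.last_valid > self.stepout:
                self.last_valid = now
                self.state = State.SYNC
                return "step"
            self.state = State.SPIK
            return "ignore"
        self.last_valid = now
        self.state = State.SYNC
        if abs(offset) < 0.0005:
            self.hold_until = now  # residual offset small: release the hold
        if now < self.hold_until:
            return "discipline-phase"
        return "discipline-phase-and-frequency"
```

A caller would feed it one (time, offset) pair per update, for example `ClockStateMachine(False).update(0.0, 0.05)`, and act on the returned label.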
+

-

- -

+
diff --git a/html/miscopt.html b/html/miscopt.html
index b4e6c82e6d..476b63d7d7 100644
--- a/html/miscopt.html
+++ b/html/miscopt.html
@@ -10,7 +10,7 @@
gif from Pogo, Walt Kelly

We have three, now looking for more.

Last update: - 13-Sep-2010 3:59 + 21-Sep-2010 3:37 UTC


Related Links

@@ -97,7 +97,7 @@
setvar variable [default]
This command adds an additional system variable. These variables can be used to distribute additional information such as the access policy. If the variable of the form name = value is followed by the default keyword, the variable will be listed as part of the default system variables (ntpq rv command). These additional variables serve informational purposes only. They are not related to the protocol other than that they can be listed. The known protocol variables will always override any variables defined via the setvar mechanism. There are three special variables that contain the names of all variables of the same group. The sys_var_list holds the names of all system variables. The peer_var_list holds the names of all peer variables and the clock_var_list holds the names of the reference clock variables.
tinker [ allan allan | dispersion dispersion | freq freq | huffpuff huffpuff | panic panic | step step | stepout stepout ]
-
This command alters certain system variables used by the clock discipline algorithm. The default values of these variables have been carefully optimized for a wide range of network speeds and reliability expectations. Very rarely is it necessary to change the default values; but, some folks can't resist twisting the knobs. The options are as follows:
+
This command alters certain system variables used by the clock discipline algorithm. The default values of these variables have been carefully optimized for a wide range of network speeds and reliability expectations. Very rarely is it necessary to change the default values; but, some folks can't resist twisting the knobs. The options are as follows:
allan allan
@@ -111,10 +111,10 @@
panic panic
Specifies the panic threshold in seconds with default 1000 s. If set to zero, the panic sanity check is disabled and a clock offset of any value will be accepted.
step step
-
Sp edifies the step threshold in seconds. The default without this command is 0.128 s. If set to zero, step adjustments will never occur. Note: The kernel time discipline is disabled if the step threshold is set to zero or greater than 0.5 - s.
+
Specifies the step threshold in seconds. The default without this command is 0.128 s. If set to zero, step adjustments will never occur. Note: The kernel time discipline is disabled if the step threshold is set to zero or greater than 0.5 s. Further details are on the Clock State Machine page.
stepout stepout
-
Specifies the stepout threshold in seconds. The default without this command is 900 s. If set to zero, popcorn spikes will not be suppressed.
+
Specifies the stepout threshold in seconds. The default without this command is 300 s. Since this option also affects the training and startup intervals, it should not be set less than the default. Further details are on the Clock State Machine page.
tos [ beacon beacon | ceiling ceiling | cohort {0 | 1} | floor floor | maxclock maxclock | maxdist maxdist | minclock minclock | mindist mindist | minsane minsane | orphan stratum | orphanwait delay ]
diff --git a/html/monopt.html b/html/monopt.html
index 3c67da2352..632cc36f84 100644
--- a/html/monopt.html
+++ b/html/monopt.html
@@ -11,7 +11,7 @@
gif from Pogo, Walt Kelly

Pig was hired to watch the logs.

Last update: - 11-Sep-2010 16:32 + 21-Sep-2010 4:40 UTC


Related Links

@@ -46,14 +46,14 @@ automatically summarized and archived for retrospective analysis.

Monitoring Commands and Options

Unless noted otherwise, further information about these commands is on the Event Messages and Status Codes page.

-
filegen name file filename [type type] +
filegen name [file filename] [type type] [link | nolink] [enable | disable]
name
Specifies the file set type from the list in the next section.
file filename
-
Specfies the file set name.
+
Specifies the filename prefix. The default is the file set type, such as "loopstats".
type typename
Specifies the file set interval. The following intervals are supported with default day:
diff --git a/html/warp.html b/html/warp.html index c0ee022d56..042a83df08 100644 --- a/html/warp.html +++ b/html/warp.html @@ -9,7 +9,7 @@

How NTP Works

Last update: - 16-Sep-2010 19:12 + 21-Sep-2010 5:15 UTC

Table of Contents

    @@ -22,22 +22,22 @@

    Introduction

    NTP time synchronization services are widely available in the public Internet. The public NTP subnet in late 2010 includes several thousand servers in most countries and on every continent of the globe, including Antarctica, and sometimes in space and on the sea floor. These servers support a total population estimated at over 25 million computers in the global Internet.

    -

    The NTP subnet operates with a hierarchy of levels, where each level is assigned a number called the stratum. Stratum 1 (primary) servers at the lowest level are directly synchronized to national time services. Stratum 2 (secondary) servers at the next higher level are synchronize to stratum 1 servers and so on. Normally, NTP clients and servers with a relatively small number of clients do not synchronize to public primary servers. There are several hundred public secondary servers operating at higher strata and are the preferred choice.

    -

    This page preetns an overview of the NTP daemon included in this distribution. We refer to this as the reference implementation only because it was the one used to test and validate the NTPv4 specificatioin RFC-5905. It is best read in conjunction with the briefings on the Network Time Synchronization Research Project page.

    +

    The NTP subnet operates with a hierarchy of levels, where each level is assigned a number called the stratum. Stratum 1 (primary) servers at the lowest level are directly synchronized to national time services via satellite, radio and telephone modem services. Stratum 2 (secondary) servers at the next higher level are synchronized to stratum 1 servers and so on. Normally, NTP clients and servers with a relatively small number of clients do not synchronize to public primary servers. Several hundred public secondary servers operate at higher strata and are the preferred choice.

    +

    This page presents an overview of the NTP daemon included in this distribution. We refer to this as the reference implementation only because it was used to test and validate the NTPv4 specification, RFC-5905. It is best read in conjunction with the briefings on the Network Time Synchronization Research Project page.

    gif

    Figure 1. NTP Daemon Processes and Algorithms

    NTP Daemon Architecture and Basic Operation

    -

    The overall organization of the NTP daemon is shown in Figure 1. It is useful in this context to consider the daemon as both a client of downstatum servers and as a server for upstratum clients. It includes a pair of peer/poll processes for each reference clock or remote server used as a synchronization source. The poll process sends NTP packets at intervals ranging from 8 s to 36 h. The peer process receives NTP packets and runs the on-wire protocol that collects four timestamps: the origin timestamp T1 upon departure of the client request and the receive timestamp T2 upon arrival at the server, the transmit timestamp T3 upon departure of the server reply and the destination timestamp T4 upon arrival at the client. These timestamps are used to calculate the clock offset and roundtrip delay:

    +

    The overall organization of the NTP daemon is shown in Figure 1. It is useful in this context to consider the daemon as both a client of upstream servers and as a server for downstream clients. It includes a pair of peer/poll processes for each reference clock or remote server used as a synchronization source. The poll process sends NTP packets at intervals ranging from 8 s to 36 hr. The peer process receives NTP packets and runs the on-wire protocol that collects four timestamps: the origin timestamp T1 upon departure of the client request, the receive timestamp T2 upon arrival at the server, the transmit timestamp T3 upon departure of the server reply and the destination timestamp T4 upon arrival at the client. These timestamps are used to calculate the clock offset and roundtrip delay:

    -

    offset = [(T2 -T1) + (T3 - T4)] / 2
    +

    offset = [(T2 - T1) + (T3 - T4)] / 2
    delay = (T4 - T1) - (T3 - T2).
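The two formulas above can be checked with a short sketch. The function name and the sample timestamps are hypothetical; any time unit works as long as all four timestamps use the same one.

```python
def offset_and_delay(t1, t2, t3, t4):
    """Compute clock offset and roundtrip delay from four timestamps.

    t1: origin timestamp (client request departs)
    t2: receive timestamp (request arrives at server)
    t3: transmit timestamp (server reply departs)
    t4: destination timestamp (reply arrives at client)
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


if __name__ == "__main__":
    # Hypothetical example in milliseconds: the server clock is 500 ms
    # ahead, each network direction takes 100 ms, and the server spends
    # 10 ms processing the request.
    offset, delay = offset_and_delay(t1=0, t2=600, t3=610, t4=210)
    print(offset, delay)
```

Note that the offset is exact only when the outbound and return paths take equal time; any asymmetry shows up as an offset error of up to half the roundtrip delay.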

    Those sources that have passed a number of sanity checks are declared selectable. From the selectable population the statistics are used by the select algorithm to determine a number of truechimers according to correctness principles. From the truechimer population a number of survivors are determined on the basis of statistical principles. One of the survivors is declared the system peer and the system statistics inherited from it. The combine algorithm computes a weighted average of the peer offset and jitter to produce the final values used by the clock discipline algorithm to adjust the system clock time and frequency.

    -

    When started, the program requires several measurements sufficient data fro these a algorithms to work properly before setting the clock. As the default poll interval is 64 s, it can take several minutes to set the clock. The time can be reduced using the iburst option on the Server Options page. For additional details about the clock filter, select, cluster and combine algorithms see the Architecture Briefing on the NTP Project Page.

    +

    When started, the program requires several measurements for these algorithms to work reliably before setting the clock. As the default poll interval is 64 s, it can take several minutes to set the clock. The time can be reduced using the iburst option on the Server Options page. For additional details about the clock filter, select, cluster and combine algorithms see the Architecture Briefing on the NTP Project Page.

    How Statistics are Determined

    -

    Each source is characterized by the offset and delay measured by the on-wire protocol and the dispersion and jitter calculated by the clock filter algorithm of the peer process. Each time an NTP packet is received from a source, the dispersion is initialized by the sum of the precisions of the server and client.

    +

    Each source is characterized by the offset and delay measured by the on-wire protocol and the dispersion and jitter calculated by the clock filter algorithm of the peer process. Each time an NTP packet is received from a source, the dispersion is initialized by the sum of the precisions of the server and client. Table 1 shows the precisions measured for typical modern systems. The values are in log2 units.

    @@ -101,7 +101,8 @@
    Name

    Table 1. Typical Precision for Various Machines

    -

    The offset, delay and dispersion values are inserted as the youngest stage of an 8-stage shift register, thus discarding the oldest stage. Subsequently, the dispersion in each stage is increased at a fixed rate of 15 ms/s, representing the worst case error due to skew between the server and client clocks. The clock filter algorithm in each peer process selects the stage with the lowest delay, which generally represents the most accurate values, and the associated offset and delay values become the peer variables of the same name. The peer dispersion continues to grow at the same rate as the register dispersion. The peer dispersion is determined as a weighted average of the dispersion samples in the shift register. Finally, the peer jitter is determined as the root-mean-square (RMS) average of all the offset samples in the shift register relative to the selected sample.

    +

    The offset, delay and dispersion values are inserted as the youngest stage of an 8-stage shift register, thus discarding the oldest stage. Subsequently, the dispersion in each stage is increased at a fixed rate of 15 μs/s, representing the worst case error due to skew between the server and client clocks.

    +

    The clock filter algorithm in each peer process selects the stage with the lowest delay, which generally represents the most accurate values, and the associated offset and delay values become the peer variables of the same name. The dispersion continues to grow at the same rate as the register dispersion. The peer dispersion is determined as a weighted average of the dispersion samples in the shift register. Finally, the peer jitter is determined as the root-mean-square (RMS) average of all the offset samples in the shift register relative to the selected sample.
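The selection and jitter steps described above can be sketched as follows. This is a minimal illustration, not the daemon's implementation: the names `Sample` and `clock_filter` are hypothetical, the dispersion weighting is omitted because the text does not give the weights, and averaging the squared differences over the other samples in the register is an assumption about how the RMS is normalized.

```python
import math
from collections import deque
from typing import NamedTuple


class Sample(NamedTuple):
    offset: float  # seconds
    delay: float   # seconds


def clock_filter(register):
    """Pick the lowest-delay sample and compute jitter relative to it.

    Returns (best_sample, jitter), where jitter is the RMS of the
    offset differences between the other samples and the selected one.
    """
    if not register:
        raise ValueError("register is empty")
    best = min(register, key=lambda s: s.delay)
    others = [s for s in register if s is not best]
    if not others:
        return best, 0.0
    mean_square = sum((s.offset - best.offset) ** 2 for s in others) / len(others)
    return best, math.sqrt(mean_square)


if __name__ == "__main__":
    # An 8-stage shift register: appending a ninth sample drops the oldest.
    register = deque(maxlen=8)
    for offset, delay in [(0.003, 0.050), (0.001, 0.020), (0.005, 0.040)]:
        register.append(Sample(offset, delay))
    best, jitter = clock_filter(register)
    print(best, jitter)
```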

    The clock filter algorithm continues to process packets in this way until the source is no longer reachable. In this case the algorithm inserts dummy samples with "infinite" dispersion in the shift register, thus displacing old samples.

    The composition of the survivor population and the system peer selection is redetermined as each update from each server is received. The system variables are copied from the peer variables of the same name and the system stratum set one greater than the system peer stratum. Like peer dispersion, the system dispersion increases at the same rate so, even if all sources have become unreachable, the daemon appears to upstratum clients at ever increasing dispersion.

    Reachability and Selection Criteria

    @@ -117,16 +118,8 @@

    The poll interval is managed by a heuristic algorithm developed over several years of experimentation. It depends on an exponentially weighted average of clock offset differences, called the clock jitter, and a jiggle counter, which is initially set to zero. When a clock update is received and the offset exceeds the clock jitter by a factor of 4, the jiggle counter is increased by the poll exponent; otherwise, it is decreased by twice the poll exponent. If the jiggle counter exceeds an arbitrary threshold of 30, it is reset to zero and the poll exponent is increased by 1. If the jiggle counter is less than -30, it is set to zero and the poll exponent is decreased by 1. In effect, the algorithm has a relatively slow reaction to good news, but a relatively fast reaction to bad news.

    The optimum time constant, and thus the poll exponent, depends on the network time jitter and the clock oscillator frequency wander. Errors due to jitter decrease as the time constant increases, while errors due to wander decrease as the time constant decreases. The two error characteristics intersect at a point called the Allan intercept, which represents the ideal time constant. With a compromise Allan intercept of 2000 s, the optimum poll interval is about 64 s, which corresponds to a poll exponent of 6.

    Clock State Machine

    -

    In the NTPv4 specification and reference implementation a state machine is used to manage the system clock under exceptional conditions, as when the daemon is first started or when encountering severe network congesiton, for example. The state machine uses three thresolds: panic, step and stepout, and a watchdog timer. The thresholds default to 1000 s, 128 ms and 900 s, respectively, but can be changed by command options.

    -

    Most compters today incorporate a time-of-year (TOY) chip to maintain the time when the power is off. When the machine is restarted, the chip is used to initialize the operating system time. In case there is no TOY chip or the TOY time is different from NTP time by more than the panic threshold, the daemon assumes something must be terribly wrong, so exits with a message to the system operator to set the time manually. With the -g option, the daemon will set the clock to NTP time the first time, but exit if the offset exceed the any time after that.

    -

    Under ordinary conditions, the clock discipline slews the clock so that the time is effectively continuous and never runs backwards. If due to extreme network congestion, or an offset spike exceeds the step threshold, by default 128 ms, the spike is discarded. However, if offsets greater than the step threshold persist for more than the stepout threshold, by default 900 s, the system clock is stepped to the correct value. In practice the need for a step has been extremely rare and almost always the result of a hardware failure. Both the step threshold and stepout threshold can be set as options to the tinker command.

    -

    Historically, the most important appliccation of the step function was when a leap second was inserted in the Coordinated Univesal Time (UTC) timescale and kernel precision time support was not available. Further details are on the Leap Second Processing page.

    -

    In some applications the clock can never be set backward, even it accidentlly set forward a week by some other means. There are several ways to alter the daemon behavior to insure time is always monotone-increasing. If the step threhold is set to zero, there will never be a step. With the -x command line option the daemon will set will set the step threshold to 600 s, which is about the limit of eyeball and wristwatch. However, in any of these cases, the precision time kernel support is disabled, as it cannot handle offsets greater than ±0.5 s.

    -

    The issues should be carefully considered before using these options. The slew rate is fixed at 500 parts-per-million (PPM) by the Unix kernel. As a result, the clock can take 33 minutes to amortize each second the clock is outside the acceptable range. During this interval the clock will not be consistent with any other network clock and the system cannot be used for distributed applications that require correctly synchronized network time.

    -

    The frequency file, usually called ntp.drift, contains the latest estimate of clock frequency. If this file does not exist when the daemon is started, the clock state machine enters a special mode designed to measure the particular frequency directly. The measurement takes an interval equal to the stepout threshold, after which the frequency is set and the daemon esumes normal mode where the time and frequency are continuously adjusted. The frequency file is updated at intervals of an hour or more depending on the measured clock stability.

    +

    In the NTPv4 specification and reference implementation a state machine is used to manage the system clock under exceptional conditions, as when the daemon is first started or when encountering severe network congestion. When the frequency file is present at startup, the result is that the residual offset error is less than 0.5 ms within 300 s. When the frequency file is not present, this result is achieved within 600 s. Further details are on the Clock State Machine page.


    -

    - -

    +