Miroslav Lichvar [Wed, 14 Feb 2018 09:11:19 +0000 (10:11 +0100)]
ntp: keep kernel RX timestamping permanently enabled on Linux
The Linux kernel has a counter for sockets using kernel RX timestamping
and timestamps (all) received packets only when it is not zero. However,
this counter is updated asynchronously from setsockopt(). If there are
currently no other sockets using the timestamping, it is possible that a
fast server response is received before the kernel timestamping is
actually enabled after setting the socket option and sending a request.
Open a dummy socket on start to make sure there is always at least one
timestamping socket to avoid the race condition.
When dropping the root privileges, don't try to keep the CAP_SYS_TIME
capability if the -x option was enabled. This allows chronyd to be
started without the capability (e.g. in containers) and also drop the
root privileges.
When sending client requests to a close and fast server, it is possible
that a response will be received before the HW transmit timestamp of
the request itself. To avoid processing of the response without the HW
timestamp, monitor events returned by select() and suspend reading of
packets from the receive queue for up to 200 microseconds. As the
requests are normally separated by at least 200 milliseconds, it is
sufficient to monitor and suspend one socket at a time.
Miroslav Lichvar [Tue, 12 Dec 2017 10:03:04 +0000 (11:03 +0100)]
client: avoid reading clock after sending request
If chronyc sent a request which caused chronyd to step the clock (e.g.
makestep, settime) and the second reading of the clock before calling
select() to wait for a response happened after the clock was stepped, a
new request could be sent immediately and chronyd would process the same
command twice. If the second request failed (e.g. a settime request too
close to the first request), chronyc would report an error.
Change the submit_request() function to read the clock only once per
select() to wait for the first response even when the clock was stepped.
If the system clock was stepped forward after chronyc sent a request and
before it read the clock in order to calculate the receive timeout,
select() could be called with a negative timeout, which resulted in an
infinite loop waiting for select() to succeed.
Fix the submit_request() function to not call select() with a negative
timeout. Also, return immediately on any error of select().
Miroslav Lichvar [Wed, 11 Oct 2017 14:57:10 +0000 (16:57 +0200)]
refclock: improve TAI-UTC conversion
Instead of using the TAI-UTC offset which corresponds to the current
system time, get the offset for the reference time. This allows the
clock to be accurately stepped from a time with different TAI-UTC
offset.
Chris Perl [Tue, 10 Oct 2017 17:23:21 +0000 (13:23 -0400)]
refclock: add tai option
This option is for indicating to chronyd that the reference clock is
kept in TAI and that chrony should attempt to convert from TAI to UTC by
using the timezone configured by the "leapsectz" directive.
in order to make builds reproducible.
See https://reproducible-builds.org/ for why this is good
and https://reproducible-builds.org/specs/source-date-epoch/
for the definition of this variable.
Miroslav Lichvar [Thu, 14 Sep 2017 13:28:37 +0000 (15:28 +0200)]
reference: check for gmtime() error
Although gmtime() is expected to convert any time of the system clock at
least in the next few NTP eras, a correct code should always check the
returned value and this shouldn't be a fatal error in handling of leap
seconds.
Miroslav Lichvar [Fri, 25 Aug 2017 12:57:25 +0000 (14:57 +0200)]
ntp: improve maxdelayratio test
Similarly to the maxdelaydevratio test, include in the maximum delay
dispersion which accumulated in the interval since the last sample.
Also, enable the test for symmetric associations.
Miroslav Lichvar [Fri, 25 Aug 2017 10:29:13 +0000 (12:29 +0200)]
sourcestats: move maxdelaydevratio test to ntp_core
Instead of giving NTP-specific data to sourcestats in order to perform
the test, provide a function to get all data needed for the test in
ntp_core. While at it, improve the naming of variables.
Miroslav Lichvar [Thu, 24 Aug 2017 10:10:46 +0000 (12:10 +0200)]
memory: check for overflow when (re)allocating array
When (re)allocating an array with very large number of elements using
the MallocArray or ReallocArray macros, the calculated size of the array
could overflow size_t and less memory would be allocated than requested.
Add new functions for (re)allocating arrays that check the size and use
them in the MallocArray and ReallocArray macros.
This couldn't be exploited, because all arrays that can grow with cmdmon
or NTP requests already have their size checked before allocation, or
they are much smaller than memory allocated for structures to which they
are related (i.e. ntp_core and sourcestats instances), so a memory
allocation would fail before their size could overflow.
This issue was found in an audit performed by Cure53 and sponsored by
Mozilla.
Miroslav Lichvar [Thu, 24 Aug 2017 09:12:14 +0000 (11:12 +0200)]
util: check for gmtime() error
Fix the UTI_TimeToLogForm() function to check if gmtime() didn't fail.
This caused chronyc to crash due to dereferencing a NULL pointer when
a response to the "manual list" request contained time which gmtime()
could not convert to broken-down representation.
This issue was found in an audit performed by Cure53 and sponsored by
Mozilla.
Miroslav Lichvar [Wed, 23 Aug 2017 09:33:37 +0000 (11:33 +0200)]
ntp: allow TX-only HW timestamping by default
If no rxfilter is specified in the hwtimestamp directive and the NIC
doesn't support the all or ntp filter, enable TX-only HW timestamping
with the none filter.
Miroslav Lichvar [Tue, 22 Aug 2017 11:13:45 +0000 (13:13 +0200)]
hwclock: drop all samples on reset
On some HW it seems it's possible to get an occasional bad reading of
the PHC (with normal delay), or in a worse case the clock can step due
to a HW/driver bug, which triggers reset of the HW clock instance. To
avoid having a bad estimate of the frequency when the next (good) sample
is accumulated, drop also the last sample which triggered the reset.
Miroslav Lichvar [Thu, 17 Aug 2017 14:44:18 +0000 (16:44 +0200)]
sourcestats: add fixed minimum delay
If the minimum delay is known (in a static network configuration), it
can replace the measured minimum from the register. This should improve
the stability of corrections for asymmetric jitter, sample weighting and
maxdelay* tests.
Miroslav Lichvar [Tue, 15 Aug 2017 11:39:39 +0000 (13:39 +0200)]
sys_linux: fix building with older kernel headers
Programming pins for external PHC timestamping was added in Linux 3.15,
but the PHC subsystem is older than that. Compile the programming code
only when the ioctl is defined.
On some systems, passing NULL as the first argument to adjtime, will
result in returning the amount of adjustment outstanding from a previous
call to adjtime().
On macOS this is not allowed and the adjtime call will fault. We can
simulate the behaviour of the other systems by cancelling the current
adjustment then restarting the adjustment using the outstanding time
that was returned. On macOS 10.13 and later, the netbsd driver is now
used and must use these semantics when making/measuring corrections.
The sources and sourcestats commands accept -v as an option, but the
glibc implementation of getopt() reorders the arguments and parses the
option as a command-line option of chronyc.
Add '+' to the getopt string to disable this feature. Other getopt()
implementations should consider it a new command-line option, which will
be handled as an error if present.
sched: add new timeout class for peer transmissions
This allows transmissions in symmetric mode to be scheduled
independently from client transmissions. This reduces maximum delay
in scheduling when chronyd is configured with a larger number of
servers.
Fix a sign error in conversion of HW time to local time, which caused
the jitter to be amplified instead of reduced. NTP with HW timestamping
should now be more stable and able to ignore occasionally delayed
readings of PHC.
In basic client mode, set the origin and receive timestamp to zero.
This reduces the amount of information useful for fingerprinting and
improves privacy as the origin timestamp allows a passive observer to
track individual NTP clients as they move across networks. (With chrony
clients that assumes the timestamp wasn't reset by the chronyc offline
and online commands.)
This follows recommendations from the current version of IETF draft on
NTP data minimization [1].
The timestamp could be theoretically useful for enhanced rate limiting
which can limit individual clients behind NAT and better deal with DoS
attacks, but no server implementation is known to do that.
When no default route is configured, check each source if it has a
route. If the system has multiple network interfaces, this prevents
setting local NTP servers to offline when they can still be reached over
one of the interfaces.
ntp: don't send useless requests in interleaved client mode
In interleaved client mode, when so many consecutive requests were lost
that the first valid (interleaved) response would be dropped for being
too old, switch to basic mode so the response can be accepted if it
doesn't fail in the other tests.
ntp: limit number of interleaved responses in symmetric mode
In symmetric mode, don't send a packet in interleaved mode unless it is
the first response to the last valid request received from the peer and
there was just one response to the previous valid request. This prevents
the peer from matching the transmit timestamp with an older response if
it can't detect missed responses.
ntp: improve detection of missed packets in interleaved mode
In interleaved symmetric mode, check if the remote TX timestamp is
before RX timestamp. Only the first response from the peer after
receiving a request should pass this test. Check also the interval
between last two remote transmit timestamps when we know the remote poll
can't be constrained by minpoll. Use the minimum of previous remote and
local poll as a lower bound of the actual interval between peer's
transmissions.
Miroslav Lichvar [Tue, 25 Jul 2017 09:27:24 +0000 (11:27 +0200)]
reference: don't report zero stratum when synchronised
If synchronised to a stratum 15 source, return stratum of 16 instead of
0 in the tracking report. It will not match the value in server mode
packets, but it should be less confusing.
Miroslav Lichvar [Fri, 21 Jul 2017 10:16:21 +0000 (12:16 +0200)]
ntp: don't accumulate old samples in interleaved client mode
Check how many responses were missing before accumulating a sample using
old timestamps to avoid correcting the clock with an offset extrapolated
over a long interval.
This should be eventually done in sourcestats for all sources.
Miroslav Lichvar [Fri, 21 Jul 2017 08:55:06 +0000 (10:55 +0200)]
ntp: revert reversed poll tracking in interleaved mode
With the new selection of timestamps in the interleaved mode it's no
longer necessary to reverse the poll tracking in order to reduce the
local and remote intervals of measurements that makes the peer with
higher stratum.
Miroslav Lichvar [Fri, 21 Jul 2017 08:45:46 +0000 (10:45 +0200)]
ntp: select timestamps in interleaved mode
Use previous local TX and remote RX timestamps for the new sample in the
interleaved mode if it will make the local and remote intervals
significantly shorter in order to improve the accuracy of the measured
delay.
Miroslav Lichvar [Thu, 13 Jul 2017 12:13:01 +0000 (14:13 +0200)]
configure: check for hardening compiler options
If no CFLAGS are specified, check if common security hardening options
are supported and add them to the CFLAGS/LDFLAGS. These are typically
enabled in downstream packages, but users compiling chrony from sources
with default CFLAGS should get hardened binaries too.
sys_macosx: add support for ntp_adjtime() on macOS 10.13+
macOS 10.13 will implement the ntp_adjtime() system call, allowing
better control over the system clock than is possible with the existing
adjtime() system call. chronyd will support both the older and newer
calls, enabling binary code to run without recompilation on macOS 10.9
through macOS 10.13.
Early releases of macOS 10.13 have a very buggy adjtime() call. The
macOS driver tests adjtime() to see if the bug has been fixed. If the
bug persists then the timex driver is invoked otherwise the netbsd
driver.
Miroslav Lichvar [Wed, 12 Jul 2017 16:38:44 +0000 (18:38 +0200)]
main: don't require root privileges with -Q option
If the -Q option is specified, disable by default pidfile, ntpport,
cmdport, Unix domain command socket, and clock control, in order to
allow starting chronyd without root privileges and/or when another
chronyd instance is already running.