Miroslav Lichvar [Mon, 16 Nov 2015 14:21:32 +0000 (15:21 +0100)]
ntp: ignore poll in KoD RATE packets
The meaning of the poll value in KoD RATE packets is not currently
defined in the NTP specification (RFC 5905). In the reference NTP
implementation it signals the minimum acceptable polling interval to the
clients. In chrony the minimum poll is set to the KoD RATE poll if it's
larger, but not to a larger value than 10.
The problem is that ntpd as a server sets the KoD RATE poll to the
maximum of the client's poll and the configured rate limiting interval.
An attacker can send a burst of spoofed packets to the server to trigger
the client's request rate limit. When the client sends its next request
and the server responds with a KoD RATE packet, the client will set its
minimum poll to the current poll and it will no longer be able to switch
to a shorter poll when needed.
ntpd could be fixed to always set the KoD RATE poll to the rate limiting
interval. Unfortunately, ntpd as a client seems to depend on the current
behavior. It tries to follow the server poll and if the KoD RATE poll
was shorter than the current poll, the polling interval would be
reduced, defeating the purpose of KoD RATE. The server fix will probably
need to wait until clients are fixed and that could take a very long
time.
For now, ignore the poll value in KoD RATE packets. Just add an extra
delay based on the current poll to the next transmit timeout and stop an
ongoing burst.
Miroslav Lichvar [Mon, 16 Nov 2015 11:28:42 +0000 (12:28 +0100)]
ntp: adjust initial delay for polling interval
First packet after setting a source to online was sent with constant
delay (0.2s). If the period in which the source was offline was shorter
than the current polling interval, the new packet was sent sooner than
it would be if the source wasn't switched to offline and back.
Don't reset the local tx timestamp when mode is changed. When starting
the initial transmit timeout, adjust the delay to make the interval
between the two packets at least as long as the current polling
interval.
Miroslav Lichvar [Fri, 13 Nov 2015 15:08:02 +0000 (16:08 +0100)]
sched: update timeout randomization
Use UTI_GetRandomBytes() instead of random() to calculate the random
part of the timeout. This was the only remaining use of random() in the
code and the srandom() call can be removed.
Miroslav Lichvar [Tue, 10 Nov 2015 16:59:49 +0000 (17:59 +0100)]
ntp: don't reveal local clock in client packets
In client packets set the leap, stratum, reference ID, reference time,
root delay and root dispersion to constant values to not reveal the
state of the synchronization. Use precision 32 to make the receive and
transmit timestamps completely random and not reveal the local time.
Miroslav Lichvar [Tue, 10 Nov 2015 16:26:59 +0000 (17:26 +0100)]
util: rework timestamp fuzzing
Use UTI_GetRandomBytes() instead of random() to generate random bits
below precision. Save the result in NTP_int64 in the network order and
allow precision in the full range from -32 to 32. With precision 32
the fuzzing now makes the timestamp completely random and can be used to
hide the time.
Miroslav Lichvar [Tue, 10 Nov 2015 15:46:40 +0000 (16:46 +0100)]
util: add function to generate random bytes
Add a function to fill a buffer with random bytes which uses a better
PRNG than random(). Use arc4random() if it's available on the system.
Fall back to reading from /dev/urandom, which should be available on
all currently supported systems.
ntp: don't keep client sockets open for longer than necessary
After sending a client packet, schedule a timeout to close the socket
at the time when all server replies would fail the delay test, so the
socket is not open for longer than necessary (e.g. when the server is
unreachable). With the default maxdelay of 3 seconds the timeout is 7
seconds.
For testA in the client mode require also that the time the server
needed to process the client request is not longer than 4 seconds.
With maximum peer delay this limits the interval in which the client can
accept a server reply.
Miroslav Lichvar [Tue, 10 Nov 2015 13:41:19 +0000 (14:41 +0100)]
sched: don't return currently used timeout ID
To avoid problems in the very unlikely case where a timeout is so long
and new IDs are allocated so frequently that they would have a chance
to overflow and catch up with it, make sure before returning new ID that
it's currently not in use.
Timeout ID of zero can be now safely used to indicate that the timer is
not running. Remove the extra timer_running variables that were
necessary to track that.
rtc: restore time from driftfile if later than RTC time
This is useful on computers that have an RTC, but there is no battery to
keep the time when they are turned off and start with the same time on
each boot.
Instead of calling the handler directly schedule a timeout with zero
delay for resolving to make the function behave similarly to the real
asynchronous resolver. This should prevent problems with code that
inadvertently depends on this behavior and which would break only when
compiled without support for asynchronous resolving.
Miroslav Lichvar [Tue, 22 Sep 2015 15:12:15 +0000 (17:12 +0200)]
sys_generic: allow fast slewing with system driver
The system drivers may implement their own slewing which the generic
driver can use to slew faster than the maximum frequency the driver is
allowed to set directly.
Miroslav Lichvar [Fri, 18 Sep 2015 08:29:47 +0000 (10:29 +0200)]
sys_solaris: use timex driver
Remove driver functions based on adjtime() and switch to the new timex
driver. The kernel allows the timex frequency to be set in the full
range of int32_t, which gives a maximum frequency of 32768 ppm. Round
the limit to 32500 ppm.
Miroslav Lichvar [Fri, 18 Sep 2015 08:10:50 +0000 (10:10 +0200)]
fix building on Solaris
- a feature test macro is needed to get msg_control in struct msghdr
- variables must not be named sun to avoid conflict with a macro
- res_init() needs -lresolv
- configure tests for IPv6 and getaddrinfo need -lsocket -lnsl
- pid_t is defined as long and needs to be cast for %d format
Miroslav Lichvar [Thu, 17 Sep 2015 11:51:18 +0000 (13:51 +0200)]
configure: check if C compiler works
Check if the C compiler works to get a useful error message when it
doesn't or it's missing. If the CC environment variable is not set, try
gcc and then cc.
Miroslav Lichvar [Tue, 15 Sep 2015 16:43:43 +0000 (18:43 +0200)]
sys: use timex driver on FreeBSD
Switch from the SunOS adjtime() based driver to the timex driver.
There is no FreeBSD-specific code, so call SYS_Timex_Initialise()
and SYS_Timex_Finalise() directly from sys.c.
Miroslav Lichvar [Tue, 15 Sep 2015 15:38:58 +0000 (17:38 +0200)]
sys: move DRIFT_REMOVAL_INTERVAL definition
In the SunOS and Solaris drivers DRIFT_REMOVAL_INTERVAL needs to be
defined before it's used. This was broken in commit b6a27df5b9be0f07f151c8fba311cb7eadb2b13e.
Miroslav Lichvar [Tue, 15 Sep 2015 13:44:34 +0000 (15:44 +0200)]
sys_netbsd: use timex driver
Remove the driver functions based on adjtime() and switch to the new
timex driver, which is based on ntp_adjtime(). This allows chronyd to
control the kernel frequency, adjust the offset with sub-microsecond
accuracy, and set the kernel leap and sync status. A drawback is that
the maximum slew rate is now limited by the 500 ppm maximum frequency
offset, while adjtime() on NetBSD slewed by up to 5000 ppm.
Miroslav Lichvar [Tue, 15 Sep 2015 13:24:28 +0000 (15:24 +0200)]
sys_linux: use timex driver
Remove functions that are included in the new timex driver. Keep only
functions that have extended functionality, i.e. read and set the
frequency using the timex tick field and apply step offset with
ADJ_SETOFFSET.
Merge the code from wrap_adjtimex.c that is still needed with
sys_linux.c and remove the file.
Miroslav Lichvar [Tue, 15 Sep 2015 13:03:37 +0000 (15:03 +0200)]
sys: add generic timex driver
This is based on sys_linux.c and wrap_adjtimex.c. It's intended for all
systems that support the adjtimex() or ntp_adjtime() system call. The
driver functions can be replaced with extended system-specific versions
(e.g. to control the frequency with the tick field on Linux).
Miroslav Lichvar [Thu, 10 Sep 2015 13:34:56 +0000 (15:34 +0200)]
test: add tests for system adjtime() and ntp_adjtime()
Include a test program to determine how the adjtime() implementation
behaves. Check the range of supported offset, support for readonly
operation, and slew rate with different update intervals and offsets.
Also, add a test for ntp_adjtime() to check what frequency range it
supports.
Since the NTPv4 update, the detection of synchronization loops based on
the refid prevents a server to initialize its clock from its clients
after restart. Remove that part from the recommended configuration.
Also, mention the time smoothing feature.
Miroslav Lichvar [Mon, 16 Jun 2014 14:21:25 +0000 (16:21 +0200)]
sys_linux: add support for seccomp filters
The Linux secure computing (seccomp) facility allows a process to
install a filter in the kernel that will allow only specific system
calls to be made. The process is killed when trying to make other system
calls. This is useful to reduce the kernel attack surface and possibly
prevent kernel exploits when the process is compromised.
Use the libseccomp library to add rules and load the filter into the
kernel. Keep a list of system calls that are always allowed after
chronyd is initialized. Restrict arguments that may be passed to the
socket(), setsockopt(), fcntl(), and ioctl() system calls. Arguments
to socketcall(), which is used on some architectures as a multiplexer
instead of separate socket system calls, are not restricted for now.
The mailonchange directive is not allowed as it calls sendmail.
Calls made by the libraries that chronyd is using have to be covered
too. It's difficult to determine which system calls they need as it may
change after an upgrade and it may depend on their configuration (e.g.
resolver in libc). There are also differences between architectures. It
can all break very easily and is therefore disabled by default. It can
be enabled with the new -F option.
This is based on a patch from Andrew Griffiths <agriffit@redhat.com>.
rtc: fix setting time from driftfile when RTC reading fails
Fix RTC_Linux_TimePreInit() to return 0 when the RTC device can be
opened, but reading its time fails to at least have the time restored
from the driftfile.
sys_macosx: reset drift removal timer after spike in offset_sd
When a large spike occurs in offset_sd the drift removal interval can be
set to an excessively long time, although what ever event caused the
perturbation has passed. At the next set_sync_status() we now compare
the expected drift removal interval with that currently in effect. If
they are significantly different, the current timer is cancelled and new
cycle started using the new drift removal interval.
Miroslav Lichvar [Wed, 26 Aug 2015 12:45:36 +0000 (14:45 +0200)]
sys_linux: always call TMX_SetLeap() in set_leap()
The optimization avoiding unnecessary setting of the kernel leap status
can cause a problem when something outside chronyd sets the status to
the new expected value. There will be no TMX_SetLeap() call which would
update the saved status and the kernel status will be overwritten with
the old (incorrect) value in a later TMX_*() call.
Always call TMX_SetLeap() to save the new value and for the log message
selection just check if a leap second has been applied.
sys_macosx: add option to run chronyd as real-time process
Adds option -P to chronyd on MacOS X which can be used to enable the
thread time constraint scheduling policy. This near real-time scheduling
policy removes a 1usec bias from the 'System time' offset.
Miroslav Lichvar [Tue, 25 Aug 2015 14:27:36 +0000 (16:27 +0200)]
sources: add option to limit selection by root distance
Add maxdistance directive to set the maximum root distance the sources
are allowed to have to be selected. This is useful to reject NTPv4
sources that are no longer synchronized and report large dispersion.
The default value is 3 seconds.