Tony Finch [Wed, 4 May 2022 08:38:54 +0000 (09:38 +0100)]
Remove obsolete notes on name compression
These notes describe the initial compression design for BIND 9 in
1998/1999, when the IETF had some over-optimistic plans for using EDNS
to change the wire format of domain names. (Another example was
bitstring labels for IPv6 reverse DNS.) By the end of 2000 the EDNS
name compression schemes had been abandoned, and BIND 9's compression
code was rewritten to use a hash table.
There is nothing left of the implementation described here, and the
API functions are better described in `compress.h`, so these notes are
more misleading than helpful. Those who are interested in the past can
look at the version control history.
Mark Andrews [Wed, 11 May 2022 04:32:11 +0000 (14:32 +1000)]
Make modifications to keyless.example deterministic
The perl modifation code for keyless.example was not deterministic
(/NXT/ matched part of signature) resulting in different error
strings being returned. Replaced /NXT/ with /A RRSIG NSEC/ and
updated expected error string,
Evan Hunt [Sat, 14 May 2022 03:55:23 +0000 (20:55 -0700)]
don't create managed-keys zone unless dnssec-validation is "auto"
previously, a managed-keys zone was created for every view
regardless of whether rfc5011 was in use; when it was not in
use, the zone would be left empty. this made for some confusing
log messages.
we now only set up the managed-keys zone if dnssec-validation is
set to the default value of "auto".
certain system test servers have had their dnssec-validation settings
changed to auto because the tests depended on the existence of the
zone.
The key lifetime should not be shorter than the time it costs to
introduce the successor key, otherwise keys will be created faster than
they are removed, resulting in a large key set.
The time it takes to replace a key is determined by the publication
interval (Ipub) of the successor key and the retire interval of the
predecessor key (Iret).
For the ZSK, Ipub is the sum of the DNSKEY TTL and zone propagation
delay (and publish safety). Iret is the sum of Dsgn, the maximum zone
TTL and zone propagation delay (and retire safety). The sign delay is
the signature validity period minus the refresh interval: The time to
ensure that all existing RRsets have been re-signed with the new key.
The ZSK lifetime should be larger than both values.
For the KSK, Ipub is the sum of the DNSKEY TTL and zone propagation
delay (and publish safety). Iret is the sum of the DS TTL and parent
zone propagation delay (and retire safety). The KSK lifetime should be
larger than both values.
The signatures-refresh should not near the signatures-validity value,
to prevent operational instability. Same is true when checking against
signatures-validity-dnskey.
Evan Hunt [Wed, 4 May 2022 02:43:23 +0000 (19:43 -0700)]
Stop the unit tests from running twice
Move the libtest code into a 'libtest' subdirectory and make it
one of the SUBDIRS in the tests Makefile. having it at the top level
required having "." as one of the subdirs, and that caused the
unit tests to be executed twice.
Ondřej Surý [Tue, 3 May 2022 09:37:31 +0000 (11:37 +0200)]
Move all the unit tests to /tests/<libname>/
The unit tests are now using a common base, which means that
lib/dns/tests/ code now has to include lib/isc/include/isc/test.h and
link with lib/isc/test.c and lib/ns/tests has to include both libisc and
libdns parts.
Instead of cross-linking code between the directories, move the
/lib/<foo>/test.c to /tests/<foo>.c and /lib/<foo>/include/<foo>test.h
to /tests/include/tests/<foo>.h and create a single libtest.la
convenience library in /tests/.
At the same time, move the /lib/<foo>/tests/ to /tests/<foo>/ (but keep
it symlinked to the old location) and adjust paths accordingly. In few
places, we are now using absolute paths instead of relative paths,
because the directory level has changed. By moving the directories
under the /tests/ directory, the test-related code is kept in a single
place and we can avoid referencing files between libns->libdns->libisc
which is unhealthy because they live in a separate Makefile-space.
In the future, the /bin/tests/ should be merged to /tests/ and symlink
kept, and the /fuzz/ directory moved to /tests/fuzz/.
Ondřej Surý [Mon, 2 May 2022 08:56:42 +0000 (10:56 +0200)]
Give the unit tests a big overhaul
The unit tests contain a lot of duplicated code and here's an attempt
to reduce code duplication.
This commit does several things:
1. Remove #ifdef HAVE_CMOCKA - we already solve this with automake
conditionals.
2. Create a set of ISC_TEST_* and ISC_*_TEST_ macros to wrap the test
implementations, test lists, and the main test routine, so we don't
have to repeat this all over again. The macros were modeled after
libuv test suite but adapted to cmocka as the test driver.
ISC_TEST_MAIN (Discussion: Should this be ISC_TEST_RUN ?)
For more complicated examples including group setup and teardown
functions, and per-test setup and teardown functions.
3. The macros prefix the test functions and cmocka entries, so the name
of the test can now match the tested function name, and we don't have
to append `_test` because `run_test_` is automatically prepended to
the main test function, and `setup_test_` and `teardown_test_` is
prepended to setup and teardown function.
4. Update all the unit tests to use the new syntax and fix a few bits
here and there.
5. In the future, we can separate the test declarations and test
implementations which are going to greatly help with uncluttering the
bigger unit tests like doh_test and netmgr_test, because the test
implementations are not declared static (see `ISC_RUN_TEST_DECLARE`
and `ISC_RUN_TEST_IMPL` for more details.
NOTE: This heavily relies on preprocessor macros, but the result greatly
outweighs all the negatives of using the macros. There's less
duplicated code, the tests are more uniform and the implementation can
be more flexible.
Ondřej Surý [Thu, 19 May 2022 09:20:21 +0000 (11:20 +0200)]
Make all tasks to be bound to a thread
Previously, tasks could be created either unbound or bound to a specific
thread (worker loop). The unbound tasks would be assigned to a random
thread every time isc_task_send() was called. Because there's no logic
that would assign the task to the least busy worker, this just creates
unpredictability. Instead of random assignment, bind all the previously
unbound tasks to worker 0, which is guaranteed to exist.
Artem Boldariev [Tue, 24 May 2022 08:39:26 +0000 (11:39 +0300)]
CID 352848: split xfrin_start() and remove dead code
This commit separates TLS context creation code from xfrin_start() as
it has become too large and hard to follow into a new
function (similarly how it is done in dighost.c)
The dead code has been removed from the cleanup section of the TLS
creation code:
* there is no way 'tlsctx' can equal 'found';
* there is no way 'sess_cache' can be non-NULL in the cleanup section.
Also, it fixes a bug in the older version of the code, where TLS
client session context fetched from the cache would not get passed to
isc_nm_tlsdnsconnect().
Artem Boldariev [Tue, 24 May 2022 08:25:30 +0000 (11:25 +0300)]
CID 352849: refactor get_create_tls_context() within dighost.c
This commit removes dead code from cleanup handling part of the
get_create_tls_context().
In particular, currently:
* there is no way 'found_ctx' might equal 'ctx';
* there is no way 'session_cache' might equal a non-NULL value while
cleaning up after a TLS initialisation error.
Petr Menšík [Tue, 24 May 2022 17:42:41 +0000 (19:42 +0200)]
Fix failures in isc netmgr_test on big endian machines
Typing from libuv structure to isc_region_t is not possible, because
their sizes differ on 64 bit architectures. Little endian machines seems
to be lucky and still result in test passed. But big endian machine such
as s390x fails the test reliably.
Fix by directly creating the buffer as isc_region_t and skipping the
type conversion. More readable and still more correct.
Disable periodic interface re-scans on modern platforms
This commit disables periodic interface re-scans timer on Linux where
a kernel-based dynamic interface mechanisms make it a thing of the
past in most cases.
Artem Boldariev [Mon, 23 May 2022 10:41:06 +0000 (13:41 +0300)]
Do not provide a shim for SSL_SESSION_is_resumable()
The recently added TLS client session cache used
SSL_SESSION_is_resumable() to avoid polluting the cache with
non-resumable sessions. However, it turned out that we cannot provide
a shim for this function across the whole range of OpenSSL versions
due to the fact that OpenSSL 1.1.0 does uses opaque pointers for
SSL_SESSION objects.
The commit replaces the shim for SSL_SESSION_is_resumable() with a non
public approximation of it on systems shipped with OpenSSL 1.1.0. It
is not turned into a proper shim because it does not fully emulate the
behaviour of SSL_SESSION_is_resumable(), but in our case it is good
enough, as it still helps to protect the cache from pollution.
For systems shipped with OpenSSL 1.0.X and derivatives (e.g. older
versions of LibreSSL), the provided replacement perfectly mimics the
function it is intended to replace.
Matthijs Mekking [Tue, 10 May 2022 13:09:29 +0000 (15:09 +0200)]
Tweak timings in serve-stale system test
Give a little bit more time if we wait on a time out from the
authoritative (aka resolver failure), and give up after one try
(because the second attempt will likely result in a different EDE).
Matthijs Mekking [Tue, 17 May 2022 10:02:43 +0000 (12:02 +0200)]
Require valid key for dst_key functions
Make sure that the key structure is valid when calling the following
functions:
- dst_key_setexternal
- dst_key_isexternal
- dst_key_setmodified
- dst_key_ismodified
Artem Boldariev [Tue, 10 May 2022 20:09:59 +0000 (23:09 +0300)]
Dig: Do not call isc_nm_cancelread() for HTTP sockets
This commit ensures that isc_nm_cancelread() is not called from within
dig code for HTTP sockets, as these lack its implementation.
It does not have much sense to have it due to transactional nature of
HTTP.
Every HTTP request-response pair is represented by a virtual socket,
where read callback is called only when full DNS message is received
or when an error code is being passed there. That is, there is nothing
to cancel at the time of the call.
Extend TLS context cache with TLS client session cache
This commit extends TLS context cache with TLS client session cache so
that an associated session cache can be stored alongside the TLS
context within the context cache.
This commit adds an implementation of a client TLS session cache. TLS
client session cache is an object which allows efficient storing and
retrieval of previously saved TLS sessions so that they can be
resumed. This object is supposed to be a foundation for implementing
TLS session resumption - a standard technique to reduce the cost of
re-establishing a connection to the remote server endpoint.
OpenSSL does server-side TLS session caching transparently by
default. However, on the client-side, a TLS session to resume must be
manually specified when establishing the TLS connection. The TLS
client session cache is precisely the foundation for that.
Ondřej Surý [Tue, 17 May 2022 19:31:37 +0000 (21:31 +0200)]
Move setting the sock->write_timeout to the async_*send
Setting the sock->write_timeout from the TCP, TCPDNS, and TLSDNS send
functions could lead to (harmless) data race when setting the value for
the first time when the isc_nm_send() function would be called from
thread not-matching the socket we are sending to. Move the setting the
sock->write_timeout to the matching async function which is always
called from the matching thread.
Ondřej Surý [Thu, 19 May 2022 19:40:24 +0000 (21:40 +0200)]
Use C2x [[fallthrough]] when supported by LLVM/clang
Clang added support for the gcc-style fallthrough
attribute (i.e. __attribute__((fallthrough))) in version 10. However,
__has_attribute(fallthrough) will return 1 in C mode in older versions,
even though they only support the C++11 fallthrough attribute. At best,
the unsupported attribute is simply ignored; at worst, it causes errors.
The C2x fallthrough attribute has the advantages of being supported in
the broadest range of clang versions (added in version 9) and being easy
to check for support. Use C2x [[fallthrough]] attribute if possible, and
fall back to not using an attribute for clang versions that don't have
it.
Evan Hunt [Sun, 8 May 2022 03:58:19 +0000 (20:58 -0700)]
Always use the number of CPUS for resolver->ntasks
Since the fctx hash table is now self-resizing, and resolver tasks are
selected to match the thread that created the fetch context, there
shouldn't be any significant advantage to having multiple tasks per CPU;
a single task per thread should be sufficient.
Additionally, the fetch context is always pinned to the calling netmgr
thread to minimize the contention just to coalesced fetches - if two
threads starts the same fetch, it will be pinned to the first one to get
the bucket.
Tony Finch [Tue, 17 May 2022 12:13:57 +0000 (14:13 +0200)]
Teach dnssec-settime to read unset times that it writes
When there is no time in a key file, `dnssec-settime` will print
"UNSET", but to unset a time the user must specify "none" or "never".
This change allows "unset" or "UNSET" as well as "none" or "never".
The "UNSET" output remains the same to avoid compatibility problems
with wrapper scripts.
I have also re-synchronized the "Timing Options" sections of the man
pages.
Ondřej Surý [Mon, 16 May 2022 11:28:13 +0000 (13:28 +0200)]
Cleanup dns_message_gettemp*() functions - they cannot fail
The dns_message_gettempname(), dns_message_gettemprdata(),
dns_message_gettemprdataset(), and dns_message_gettemprdatalist() always
succeeds because the memory allocation cannot fail now. Change the API
to return void and cleanup all the use of aforementioned functions.
Matthijs Mekking [Mon, 16 May 2022 09:23:15 +0000 (11:23 +0200)]
Replace stat with PERL stat in kasp system test
7249bad7 introduced the -c option to stat(1) command, but BSD systems
do not know about it. Replace the stat(1) command with a PERL script
that achieves the same.
Why PERL? For consistency purposes, there are more places in the
system test where we use the same method.
Evan Hunt [Mon, 9 May 2022 00:17:29 +0000 (17:17 -0700)]
Add lower bound checks to fetchlimit test
Check that the recursing client count is above a reasonable
minimum, as well as below a maximum, so that we can detect
bugs that cause recursion to fail too early or too often.
The ns_statscounter_recursclients counter was previously only
incremented or decremented if client->recursionquota was non-NULL.
This was harmless, because that value should always be non-NULL if
recursion is enabled, but it made the code slightly confusing.
Evan Hunt [Thu, 5 May 2022 21:52:15 +0000 (14:52 -0700)]
Disable EDNS for the fetchlimit test server
The fetchlimit test depends on a resolver continuing to try UDP
and timing out while the client waits for resolution to succeed.
but since commit bb990030 (flag day 2020), a fetch will always
switch to TCP after two timeouts, unless EDNS was disabled for
the query.
This commit adds "edns no;" to server statements in the fetchlimit
resolver, to restore the behavior expected by the test.
Evan Hunt [Thu, 5 May 2022 00:27:56 +0000 (17:27 -0700)]
Fix the fetches-per-server quota calculation
Since commit bad5a523c2e, when the fetches-per-server quota
was increased or decreased, instead of the value being set to
the newly calculated quota, it was set to the *minimum* of
the new quota or 1 - which effectively meant it was always set to 1.
it should instead have been the maximum, to prevent the value from
ever dropping to zero.
Evan Hunt [Thu, 12 May 2022 22:51:10 +0000 (15:51 -0700)]
attach the resolver to the ADB
the ADB depends on the resolver, but previously only accessed it
via the view. as view->resolver may now be detached before the ADB
finishes, a shutdown race was possible. attaching to the resolver
directly prevents this.
Evan Hunt [Wed, 11 May 2022 20:55:01 +0000 (13:55 -0700)]
remove requestmgr whenshutdown events
the request manager has no direct dependency on the
view, so there's no need for a weak reference. remove the
dns_requestmgr_whenshutdown() mechanism since it is no longer
used.
Evan Hunt [Wed, 11 May 2022 20:55:01 +0000 (13:55 -0700)]
remove resolver whenshutdown events
weakly attaching and detaching when creating and destroying the
resolver obviates the need to have a callback event to do the weak
detach. remove the dns_resolver_whenshutdown() mechanism, as it is
now unused.
Evan Hunt [Wed, 11 May 2022 20:55:01 +0000 (13:55 -0700)]
remove ADB whenshutdown events
weakly attaching and detaching the view when creating or destroying
the ADB obviates the need for a whenshutdown callback event to do
the detaching. remove the dns_adb_whenshutdown() mechanism, since
it is no longer needed.
Evan Hunt [Wed, 11 May 2022 22:38:54 +0000 (15:38 -0700)]
move ADB and resolver stats out of the view object
for better object separation, ADB and resolver statistics counters
are now stored in the ADB and resolver objects themsevles, rather than
in the associated view.
Evan Hunt [Wed, 11 May 2022 21:18:45 +0000 (14:18 -0700)]
remove view->attributes field
it's not necessary to use view attributes to determine whether
the ADB, resolver and requestmgr need to be shut down; checking
whether they're NULL is sufficient.
Evan Hunt [Wed, 11 May 2022 19:15:46 +0000 (12:15 -0700)]
the validator can attach to the view normally
dns_view_weakattach() and _weakdetach() are used by objects that
are created by the view and need it to persist until they are
destroyed, but don't need to prevent it from being shut down. the
validator can use normal view references instead of weak references.
Evan Hunt [Tue, 10 May 2022 19:48:31 +0000 (12:48 -0700)]
minor view refactoring
- eliminate dns_view_flushanddetach(), which was only called from
one place; instead, we now call a function dns_view_flushonshutdown()
which sets the view up to flush zones when it is detached normally
with dns_view_detach().
- cleaned up code in dns_view_create().
Evan Hunt [Mon, 9 May 2022 19:55:07 +0000 (12:55 -0700)]
maybe_cancel_validators() is always called locked
there's no longer any need for a parameter to specify whether the
function is called while holding the bucket lock, because all
unlocked uses have been removed.
Add a new parameter to the dst_key structure, mark a key modified if
dst_key_(un)set[bool,num,state,time] is called. Only write out key
files during a keymgr run if the metadata has changed.
Add a test case that triggers a keymgr run that will not trigger any
metadata changes. Ensure that the last status change of the key files
is unmodified.