These options concentrate zone maintenance actions into
bursts for the benefit of servers with intermittent connections.
That's no longer something we really need to optimize.
Mark Andrews [Thu, 11 May 2023 02:09:26 +0000 (12:09 +1000)]
Use a subshell to isolate environment changes
'HOME=value command' should only change HOME for command but on
some platforms this occasionally sets HOME for the rest of the
test. Explicitly isolate the environment change using a subshell.
Allow larger TTL values in zones that go insecure. This is necessary
because otherwise the zone will not be loaded due to the max-zone-ttl
of P1D that is part of the current insecure policy.
In the keymgr.c code, default back to P1D if the max-zone-ttl is set
to zero.
When using automated DNSSEC management, it is required that the zone
is dynamic, or that inline-signing is enabled (or both). Update the
checkconf code to also allow inline-signing to be enabled within
dnssec-policy.
Add an option to enable/disable inline-signing inside the
dnssec-policy clause. The existing inline-signing option that is
set in the zone clause takes priority, but if it is omitted, then the
value that is set in dnssec-policy is taken.
The built-in policies use inline-signing.
This means that if you want to use the default policy without
inline-signing you either have to set it explicitly in the zone
clause:
    zone "example" {
        ...
        dnssec-policy default;
        inline-signing no;
    };
Or create a new policy, only overriding the inline-signing option:
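A sketch of what such a policy could look like. The policy name
"default-dynamic" is a hypothetical choice made for illustration, and the
fragment assumes that options left out of a dnssec-policy keep their
built-in default values:

```
dnssec-policy "default-dynamic" {
    inline-signing no;
};

zone "example" {
    ...
    dnssec-policy "default-dynamic";
};
```

With inline-signing disabled this way, the zone still has to be dynamic for
automated DNSSEC management to work.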
Use cds_lfht for updatenotify mechanism in dns_db unit
The updatenotify mechanism in dns_db relied on an unlocked ISC_LIST for
adding and removing the "listeners". The mechanism relied on the
exclusive mode - it should have been updated only during reconfiguration
of the server. This turned out not to be true anymore in dns_catz - the
updatenotify list could have been updated during offloaded work as the
offloaded threads are not subject to the exclusive mode.
Change the update_listeners to be cds_lfht (lock-free hash-table), and
slightly refactor how the callbacks are registered and unregistered - the
calls are now idempotent (the register call already was and the return
value of the unregister function was mostly ignored by the callers).
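The idempotent register/unregister contract can be illustrated with a small
sketch. This is not the dns_db code: it swaps the cds_lfht hash table for a
fixed-size array, is single-threaded, and every name in it (listener_set,
listener_register, listener_unregister, noop_update) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_LISTENERS 8

typedef void (*update_fn)(void *arg);

struct listener {
	update_fn fn;
	void *arg;
	bool used;
};

struct listener_set {
	struct listener slots[MAX_LISTENERS];
	size_t count;
};

/* Return the slot holding (fn, arg), or NULL if it is not registered. */
static struct listener *
find_listener(struct listener_set *set, update_fn fn, void *arg) {
	for (size_t i = 0; i < MAX_LISTENERS; i++) {
		struct listener *l = &set->slots[i];
		if (l->used && l->fn == fn && l->arg == arg) {
			return l;
		}
	}
	return NULL;
}

/* Registering the same (fn, arg) pair twice leaves a single entry. */
void
listener_register(struct listener_set *set, update_fn fn, void *arg) {
	if (find_listener(set, fn, arg) != NULL) {
		return;
	}
	for (size_t i = 0; i < MAX_LISTENERS; i++) {
		struct listener *l = &set->slots[i];
		if (!l->used) {
			l->fn = fn;
			l->arg = arg;
			l->used = true;
			set->count++;
			return;
		}
	}
	assert(false && "listener set is full");
}

/* Unregistering an absent pair is a no-op, so there is nothing to return. */
void
listener_unregister(struct listener_set *set, update_fn fn, void *arg) {
	struct listener *l = find_listener(set, fn, arg);
	if (l != NULL) {
		l->used = false;
		set->count--;
	}
}

/* Example callback used only to have something to register. */
void
noop_update(void *arg) {
	(void)arg;
}
```

Because both calls are safe to repeat, callers do not need to track whether
a listener is currently registered.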
Ondřej Surý [Thu, 30 Mar 2023 08:08:52 +0000 (10:08 +0200)]
Add rwlock unit test
Add a simple rwlock unit test and an rwlock benchmark. The benchmark
compares the pthread rwlock with the isc rwlock implementation, so it's
mainly useful when developing a new isc rwlock implementation.
Ondřej Surý [Tue, 27 Jun 2023 06:26:12 +0000 (08:26 +0200)]
Call rcu_barrier() five times in the isc__mem_destroy()
Because rcu_barrier() needs to be called as many times as the number of
nested call_rcu() calls (call_rcu() calls made from a call_rcu thread),
and currently there's no mechanism to detect whether there are more
call_rcu callbacks scheduled, we simply call rcu_barrier() multiple
times. The overhead is negligible and it prevents rare assertion
failures caused by the check for memory leaks in isc__mem_destroy().
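Why one barrier per nesting level is needed can be shown with a toy model.
The sketch below does not use liburcu: toy_call_rcu() and toy_rcu_barrier()
are hypothetical stand-ins that only mimic the behavior described above,
namely that a barrier waits for the callbacks queued when it starts and not
for the ones those callbacks queue in turn.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PENDING 16

typedef void (*deferred_fn)(void *arg);

struct deferred {
	deferred_fn fn;
	void *arg;
};

static struct deferred queue[MAX_PENDING];
static size_t queued = 0;

/* Toy stand-in for call_rcu(): queue a callback to run later. */
void
toy_call_rcu(deferred_fn fn, void *arg) {
	assert(queued < MAX_PENDING);
	queue[queued].fn = fn;
	queue[queued].arg = arg;
	queued++;
}

/*
 * Toy stand-in for rcu_barrier(): run only the callbacks that were
 * queued when the barrier started. Anything they queue stays pending.
 */
void
toy_rcu_barrier(void) {
	struct deferred batch[MAX_PENDING];
	size_t n = queued;

	for (size_t i = 0; i < n; i++) {
		batch[i] = queue[i];
	}
	queued = 0;
	for (size_t i = 0; i < n; i++) {
		batch[i].fn(batch[i].arg);
	}
}

/* Number of callbacks still waiting to run. */
size_t
toy_pending(void) {
	return queued;
}

/* Callback that queues itself again until its depth counter hits zero. */
void
nested_free(void *arg) {
	int *depth = arg;

	if (--(*depth) > 0) {
		toy_call_rcu(nested_free, depth);
	}
}
```

In this model a chain of three nested callbacks is only fully drained by the
third barrier, which is the situation the fixed count of five is meant to
cover without tracking the actual nesting depth.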
Ondřej Surý [Thu, 22 Jun 2023 13:43:04 +0000 (15:43 +0200)]
Don't clean up the dns_message_checksig fuzzer in atexit handler
After the dns_badcache refactoring, the dns_badcache_destroy() would
call call_rcu(). The dns_message_checksig cleanup which calls
dns_view_detach() happens in the atexit handler, so there might be
call_rcu threads started very late in the process. liburcu
registers a library destructor that destroys the data structures internal
to liburcu, and this clashes with the call_rcu thread that just got
started in the atexit() handler, causing one of (depending on timing):
- a normal run
- a straight segfault
- an assertion failure from liburcu
Instead of trying to clean up the dns_message_checksig unit, ignore the
leaked memory as we do with all the other fuzzing tests.
Ondřej Surý [Wed, 21 Jun 2023 12:10:28 +0000 (14:10 +0200)]
Make the load-names benchmark multithreaded
The load-names benchmark originally measured only the single-threaded
performance of the data structures. As this is not how they are used
in real life, it was refactored to be multi-threaded with proper
protections in place (rwlock for ht, hashmap and rbt; transactions for
qp).
The qp test has been extended to show the effect of dns_qp_compact() and
rcu_barrier() on the overall speed and memory consumption.
Ondřej Surý [Mon, 19 Jun 2023 13:43:02 +0000 (15:43 +0200)]
Refactor dns_badcache to use cds_lfht lock-free hashtable
The dns_badcache unit had its own (yet another) locked hashtable
implementation. Replace the hashtable used by dns_badcache with the
lock-free cds_lfht implementation from liburcu.
When a dns_request was canceled via dns_requestmgr_shutdown(), the cancel
event would be propagated on a different loop (loop 0) than the loop where
the request was created. In turn this would propagate down to isc_netmgr,
where we require all the events to be called from the matching isc_loop.
Pin the dns_requests to the loops and ensure that all the events are
called on the associated loop. This in turn allows us to remove the
hashed locks on the requests and change the single .requests list to be
a per-loop list for the request accounting.
Additionally, do some extra cleanup because some race conditions are
no longer possible now that all events on the dns_request are serialized.
With ThreadSanitizer support added to the Userspace RCU, we no longer
need to wrap the call_rcu and caa_container_of with
__tsan_{acquire,release} hints. Remove the direct calls to
__tsan_{acquire,release} and the isc_urcu_{container,cleanup} macros.
Ondřej Surý [Thu, 22 Jun 2023 10:25:45 +0000 (12:25 +0200)]
Workaround AddressSanitizer overzealous check
The cds_lfht_for_each_entry and cds_lfht_for_each_entry_duplicate macros
contained code that operated on a NULL pointer: at the end of the list
they called caa_container_of() on the NULL pointer in the init-clause
and iteration-expression, but the result wasn't actually used anywhere
because the cond-expression in the for loop prevented the loop-statement
from executing. This made AddressSanitizer notice the invalid operation
and rightfully complain.
This was reported upstream and fixed there. Pull the upstream
fix into our <isc/urcu.h> header, so our CI checks pass.
Free struct stub_glue_request in stub_glue_response() callback
When stub_glue_response() is called, the associated data is stored in
a newly allocated struct stub_glue_request. The allocated structure is
never freed in the callback, thus we leak a little bit of memory.
The stub_request_nameserver_address() used 'request' as the name for
struct stub_glue_request, leading to confusion between 'request'
(stub_glue_request) and 'request->request' (dns_request_t).
Unify the name to 'sgr', already used in stub_glue_response().
Ondřej Surý [Mon, 26 Jun 2023 08:58:30 +0000 (10:58 +0200)]
Refactor isc_stats_create() and its downstream users to return void
The isc_stats_create() can no longer return anything other than
ISC_R_SUCCESS. Refactor isc_stats_create() and its variants in libdns,
libns and named to just return void.
Tom Krizek [Mon, 24 Jul 2023 14:29:31 +0000 (16:29 +0200)]
Reproducer for CVE-2023-2911
The conditions that trigger the crash:
- a stale record is in cache
- stale-answer-client-timeout is 0
- multiple clients query for the stale record, enough of them to exceed
the recursive-clients quota
- the response from the authoritative is sufficiently delayed so that
recursive-clients quota is exceeded first
The reproducer attempts to simulate this situation. However, it hasn't
proven to be 100% reproducible, especially in CI. When reproducing
locally, the priming query also seems to sometimes interfere and prevent
the crash. When the reproducer is run twice, it appears to be more
reliable in reproducing the issue.
Tom Krizek [Mon, 24 Jul 2023 16:35:13 +0000 (18:35 +0200)]
Clean up keys directory in checkconf test
The keys directory should be cleaned up in clean.sh. Doing that in the
test itself isn't reliable and may lead to a failing mkdir, which causes
the test to fail with set -e.
After commit f4eb3ba4, which is part of removing 'auto-dnssec', the
inline system test started to fail in FIPS CI jobs. This is because
the 'nsec3-loop' zone started to use an RSASHA256 key size of 1024 and
this is not FIPS compliant.
This commit changes the key size from 1024 to 4096, in order to
become FIPS compliant again.
1. Change the _new, _add and _copy functions to return the new object
instead of returning 'void' (or always ISC_R_SUCCESS)
2. Cleanup the isc_ht_find() + isc_ht_add() usage - the code is always
locked with catzs->lock (mutex), so when isc_ht_find() returns
ISC_R_NOTFOUND, the isc_ht_add() must always succeed.
3. Instead of returning direct iterator for the catalog zone entries,
add dns_catz_zone_for_each_entry2() function that calls callback
for each catalog zone entry and passes two extra arguments to the
callback. This will allow changing the internal storage for the
catalog zone entries.
4. Cleanup the naming - dns_catz_<fn>_<obj> -> dns_catz_<obj>_<fn>, as an
example dns_catz_new_zone() gets renamed to dns_catz_zone_new().
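The callback-based iteration from item 3 can be sketched as follows. This is
a self-contained illustration, not the dns_catz API: the toy_catz_zone type,
the toy_catz_zone_for_each_entry2() signature and the count_and_measure()
callback are assumptions, and entries are reduced to plain strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_ENTRIES 8

/* Hypothetical container; callers never see how entries are stored. */
struct toy_catz_zone {
	const char *entries[TOY_MAX_ENTRIES];
	size_t nentries;
};

typedef void (*toy_entry_cb)(const char *entry, void *arg1, void *arg2);

/* Call cb once per entry, passing the two extra arguments through. */
void
toy_catz_zone_for_each_entry2(struct toy_catz_zone *zone, toy_entry_cb cb,
			      void *arg1, void *arg2) {
	for (size_t i = 0; i < zone->nentries; i++) {
		cb(zone->entries[i], arg1, arg2);
	}
}

/* Example callback: arg1 counts entries, arg2 sums the name lengths. */
void
count_and_measure(const char *entry, void *arg1, void *arg2) {
	size_t *count = arg1;
	size_t *total = arg2;

	(*count)++;
	*total += strlen(entry);
}
```

Since callers only supply a callback, the array inside the container could
later be replaced by another structure without touching them.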
Mark Andrews [Wed, 19 Jul 2023 23:16:03 +0000 (09:16 +1000)]
Mark a primary as unreachable on timeout in xfrin
When a primary server is not responding, mark it as temporarily
unreachable. This will prevent too many zones queuing up on an
unreachable server and allow the refresh process to move on to
the next primary sooner once it has been so marked.
Restore the IS_STUB() condition in zone_zonecut_callback
After the refactoring, the condition that decides whether to use DNAME or
NS for the zonecut was incorrectly simplified and the !IS_STUB() condition
was removed. This was flagged by Coverity as:
/lib/dns/rbt-zonedb.c: 192 in zone_zonecut_callback()
186 found = ns_header;
187 search->zonecut_sigheader = NULL;
188 } else if (dname_header != NULL) {
189 found = dname_header;
190 search->zonecut_sigheader = sigdname_header;
191 } else if (ns_header != NULL) {
>>> CID 462773: Control flow issues (DEADCODE)
>>> Execution cannot reach this statement: "found = ns_header;".
192 found = ns_header;
193 search->zonecut_sigheader = NULL;
194 }
195
196 if (found != NULL) {
197 /*
Instead of removing the extra block, restore the !IS_STUB() condition
for the first if block.
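The restored branch order can be sketched in isolation. The function below
is a simplified illustration based only on the Coverity excerpt above: the
choose_zonecut() name, the enum and the boolean parameters are hypothetical
and stand in for the header pointers and the IS_STUB() check.

```c
#include <assert.h>
#include <stdbool.h>

enum zonecut { ZONECUT_NONE, ZONECUT_NS, ZONECUT_DNAME };

/*
 * Pick the record type that defines the zonecut. The !is_stub test on
 * the first branch is what keeps the last branch reachable: without it,
 * any zone with an NS record would always be caught by the first branch.
 */
enum zonecut
choose_zonecut(bool is_stub, bool have_ns, bool have_dname) {
	if (!is_stub && have_ns) {
		return ZONECUT_NS;
	} else if (have_dname) {
		return ZONECUT_DNAME;
	} else if (have_ns) {
		return ZONECUT_NS;
	}
	return ZONECUT_NONE;
}
```

With the condition in place, a stub zone holding both record types selects
DNAME, while a stub zone with only NS falls through to the final branch.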
These two configuration options worked in conjunction with 'auto-dnssec'
to determine KSK usage, and thus are now obsoleted.
However, in the code we keep KSK processing so that when a zone is
reconfigured from using 'dnssec-policy' immediately to 'none' (without
going through 'insecure'), the zone is not immediately made bogus.
Add one more test case for going straight to none, now with a dynamic
zone (no inline-signing).
Matthijs Mekking [Thu, 29 Jun 2023 09:23:34 +0000 (11:23 +0200)]
Update views system test
Change test configuration to make use of 'dnssec-policy' instead of
'auto-dnssec'.
Because we now use 'dnssec-policy', there is no need to create an
explicit key in the final test that adds multiple inline zones
followed by a reconfig.
Matthijs Mekking [Thu, 29 Jun 2023 08:57:01 +0000 (10:57 +0200)]
Update statschannel system test
Change test configuration to make use of 'dnssec-policy' instead of
'auto-dnssec'.
Because we now add a DNSKEY with dynamic update, the sign statistics
change. When adding signatures triggered by dynamic update, the
dnssec-refresh stats are not incremented (this is only incremented
when signing is triggered by resign in lib/dns/zone.c).
Matthijs Mekking [Wed, 28 Jun 2023 13:38:42 +0000 (15:38 +0200)]
Alter mkeys system test
The mkeys system test configured 'auto-dnssec' on the root zone to do
smart signing and simulate root key changes that should be picked up
by the automated trust anchor management of BIND.
This does not require 'auto-dnssec' or 'dnssec-policy', so change the
tests to use manual smart signing with 'dnssec-signzone'.
Matthijs Mekking [Tue, 27 Jun 2023 14:25:30 +0000 (16:25 +0200)]
Remove dupsigs system test
This test uses key timing metadata to do rollovers, which is no longer
applicable with 'dnssec-policy'. Note that with 'dnssec-policy' key
timing metadata is still written, but it is not used for determining
what key rollovers to do and when.
Matthijs Mekking [Tue, 20 Jun 2023 08:08:29 +0000 (10:08 +0200)]
Copy DNSKEY record from unsigned zone db
Since external DNSKEY records may exist in the unsigned version of the
zone (for example DNSKEY records from other providers), handle these
RRsets also when copying non-DNSSEC records from the unsigned zone
database to the signed version.
Matthijs Mekking [Tue, 20 Jun 2023 08:06:01 +0000 (10:06 +0200)]
Allow rndc signing commands with dnssec-policy
Some 'rndc signing' commands can still be used in conjunction with
'dnssec-policy' because they show the progress of signing and allow
private type records to be cleaned up. Allow these commands to be
executed.
However, setting NSEC3 parameters is incompatible with dnssec-policy.
Matthijs Mekking [Mon, 19 Jun 2023 14:21:11 +0000 (16:21 +0200)]
Change inline system test
The inline system test tests 'auto-dnssec' in conjunction with
'inline-signing'. Change the tests to make use of 'dnssec-policy'.
Remove some tests that no longer make sense:
- The 'retransfer3.' zone tests changing the parameters with
'rndc signing -nsec3param'. This command is going away and NSEC3
parameters now need to be configured with nsec3param within
'dnssec-policy'.
- The 'inactivezsk.' and 'inactiveksk.' zones test whether the ZSK takes
over signing if the KSK is inactive, or vice versa. This fallback
mode no longer makes sense when using a DNSSEC policy.
Some tests need to be adapted more than just changing 'auto-dnssec'
to 'dnssec-policy':
- The 'delayedkeys.' zone first needs to be configured as insecure,
then we can change it to start signing. Previously, no existing
keys meant that the zone could not be signed; with 'dnssec-policy',
new keys will be created.
- The 'updated.' zone needs to have key states in a specific state
so that the minimal journal check still works (otherwise CDS/
CDNSKEY and related records will be in the journal too).
- External keys are now added to the unsigned zone and are no longer
maintained with key files. Adjust the 'externalkey.' zone
accordingly.
- The 'nsec3-loop.' zone requires three signing keys. Since
'dnssec-policy' will ignore duplicates in the 'keys' section,
create RSASHA256 keys with different roles and/or key lengths.
Finally, the 'externalkey.' zone checks for an expected number of
DNSKEY and RRSIG records in the response. This used to be 3 DNSKEY
and 2 RRSIG records. Due to behavior changes (key timing
metadata is no longer authoritative), these expected values are
changed to 4 DNSKEY records (two signing keys and two external keys
per algorithm) and 1 RRSIG record (one active KSK per signing
algorithm).
Matthijs Mekking [Fri, 16 Jun 2023 15:06:28 +0000 (17:06 +0200)]
Update dnssec system test
The dnssec system test has some tests that use auto-dnssec. Update
these tests to make use of dnssec-policy.
Remove any 'rndc signing -nsec3param' commands because with
dnssec-policy you set the NSEC3 parameters in the configuration.
Remove now-duplicate tests that checked if CDS and CDNSKEY RRsets
are signed with KSK only (the dnssec-dnskey-kskonly option worked
in combination with auto-dnssec).
Also remove the publish-inactive.example test case because such
use cases are no longer supported (only with manual signing).
The auto-nsec and auto-nsec3 zones need to use an alternative
algorithm because duplicate lines in dnssec-policy/keys are ignored.