Mark Andrews [Thu, 14 Dec 2023 04:02:22 +0000 (15:02 +1100)]
sync_secure_db failed to handle some TTL changes
If the DNSKEY, CDNSKEY or CDS RRset had different TTLs then the
filtering of these RRset resulted in dns_diff_apply failing with
"not exact". Identify tuple pairs that are just TTL changes and
allow them through the filter.
Mark Andrews [Tue, 12 Dec 2023 02:47:30 +0000 (13:47 +1100)]
Test dnssec-policy dnskey-ttl behaviour
If the dnskey-ttl in the dnssec-policy doesn't match the DNSKEY's
ttl then the DNSKEY, CDNSKEY and CDS rrset should be updated by
named to reflect the expressed policy. Check that named does this
by creating a zone with a TTL that does not match the policy's TTL
and check that it is correctly updated.
Mark Andrews [Tue, 2 Jan 2024 04:39:58 +0000 (15:39 +1100)]
Support Net::DNS::Nameserver 1.42
In Net::DNS 1.42 $ns->main_loop no longer loops. Use current methods
for starting the server, wait for SIGTERM then cleanup child processes
using $ns->stop_server(), then remove the pid file.
Michał Kępień [Fri, 22 Dec 2023 18:27:37 +0000 (19:27 +0100)]
Silence a scan-build warning in dns_rbt_addname()
Clang Static Analyzer is unable to grasp that when dns_rbt_addnode()
returns ISC_R_EXISTS, it always sets the pointer passed to it via its
'nodep' parameter to a non-NULL value. Add an extra safety check in the
conditional expression used in dns_rbt_addname() to silence that
warning.
Michał Kępień [Tue, 25 Jul 2023 12:37:05 +0000 (14:37 +0200)]
Add reconfiguration support to NamedInstance
Reconfiguring named using RNDC is a common action in BIND 9 system
tests. It involves sending the "reconfig" RNDC command to a named
instance and waiting until it is fully processed. Add a reconfigure()
method to the NamedInstance class in order to simplify and standardize
named reconfiguration using RNDC in Python-based system tests.
TODO:
- full reconfiguration support (w/templating *.in files)
- add an "rndc null" before every reconfiguration to show which file
is used (NamedInstance.add_mark_to_log() as it may be generically
useful?)
Michał Kępień [Tue, 25 Jul 2023 12:37:05 +0000 (14:37 +0200)]
Run mypy checks on Python helpers in GitLab CI
Ensure the type hints provided in helper code for Python-based system
tests are correct by continuously checking them using mypy in GitLab CI.
Check bin/tests/system/isctest.py exclusively for the time being because
it is the only Python file in the source tree which uses static typing
at the moment and working around the issues reported by mypy for other
(non-statically-typed) Python files present in the source tree would be
cumbersome.
Michał Kępień [Tue, 25 Jul 2023 12:37:05 +0000 (14:37 +0200)]
Clean up the "checkds" system test
The "checkds" system test contains a lot of duplicated code despite
carrying out the same set of actions for every tested scenario
(zone_check() → wait for logs to appear → keystate_check()). Extract
the parts of the code shared between all tests into a new function,
test_checkds(), and use pytest's test parametrization capabilities to
pass distinct sets of test parameters to this new function, in an
attempt to cleanly separate the fixed parts of this system test from the
variable ones. Replace format() calls with f-strings.
Michał Kępień [Tue, 25 Jul 2023 12:37:05 +0000 (14:37 +0200)]
Drop use of dns.resolver.Resolver from "checkds"
The "checkds" system test only uses dns.resolver.Resolver objects to
access their 'nameservers' and 'port' attributes. Instances of the
NamedInstance class also expose that information via their attributes,
so only pass NamedInstance objects around instead of needlessly
depending on dns.resolver.Resolver.
Michał Kępień [Tue, 25 Jul 2023 12:37:05 +0000 (14:37 +0200)]
Use helper Python classes for watching log files
Make log file watching in Python-based system tests consistent by
employing the helper Python classes designed for that purpose. Drop the
custom code currently used.
Michał Kępień [Tue, 25 Jul 2023 12:37:05 +0000 (14:37 +0200)]
Add helper Python classes for watching log files
Waiting for a specific log line to appear in a named.run file is a
common action in BIND 9 system tests. Implement a set of Python classes
which intend to simplify and standardize this task in Python-based
system tests.
Michał Kępień [Tue, 25 Jul 2023 12:37:05 +0000 (14:37 +0200)]
Simplify use of RNDC in Python-based tests
The "addzone" and "shutdown" system tests currently invoke rndc using
test-specific helper code. Rework the relevant bits of those tests so
that they use the helper classes from bin/tests/system/isctest.py.
Michał Kępień [Tue, 25 Jul 2023 12:37:05 +0000 (14:37 +0200)]
Implement Python helpers for using RNDC in tests
Controlling named instances using RNDC is a common action in BIND 9
system tests. However, there is currently no standardized way of doing
that from Python-based system tests, which leads to code duplication.
Add a set of Python classes and pytest fixtures which intend to simplify
and standardize use of RNDC in Python-based system tests.
For now, RNDC commands are sent to servers by invoking the rndc binary.
However, a switch to a native Python module able to send RNDC commands
without executing external binaries is expected to happen soon. Even
when that happens, though, having the capability to invoke the rndc
binary (in order to test it) will remain useful. Define a common Python
interface that such "RNDC executors" should implement (RNDCExecutor), in
order to make switching between them convenient.
Evan Hunt [Wed, 20 Dec 2023 08:32:57 +0000 (00:32 -0800)]
prevent an infinite loop in fix_iterator()
it was possible for fix_iterator() to get stuck in a loop while
trying to find the predecessor of a missing node. this has been
fixed and a regression test has been added.
Evan Hunt [Wed, 20 Dec 2023 20:38:12 +0000 (12:38 -0800)]
fix_iterator() could produce incoherent iterator stacks
the fix_iterator() function moves an iterator so that it points
to the predecessor of the searched-for name when that name doesn't
exist in the database. the tests only checked the correctness of
the top of the stack, however, and missed some cases where interior
branches in the stack could be missing or duplicated. in these
cases, the iterator would produce inconsistent results when walked.
the predecessors test case in qp_test has been updated to walk
each iterator to the end and ensure that the expected number of
nodes are found.
Matthijs Mekking [Tue, 19 Dec 2023 12:23:44 +0000 (13:23 +0100)]
Regression check for NSEC3 to NSEC3 conversion
When changing the NSEC3 chain, the new NSEC3 chain must be built before
the old NSEC3PARAM is removed. Check each delta in the conversion to
ensure this ordering is met.
Mark Andrews [Mon, 18 Dec 2023 00:23:21 +0000 (11:23 +1100)]
Regression check for NSEC3 to NSEC conversion
When transitioning from NSEC3 to NSEC the NSEC3 must be built before
the NSEC3PARAM is removed. Check each delta in the conversion to
ensure this ordering is met.
Mark Andrews [Wed, 20 Dec 2023 02:07:51 +0000 (13:07 +1100)]
Update the NSEC3PARAM TTL to match the SOA minimum
When building NSEC3 chains update the NSEC3PARAM TTL to match
the SOA minimum. Delete all records using the old TTL then
re-add them using the new TTL.
Evan Hunt [Thu, 16 Nov 2023 02:42:43 +0000 (18:42 -0800)]
disable checks by default in named-compilezone
Zone content integrity checks can significantly slow the conversion
of zones from raw to text. As this is more properly a job for
named-checkzone anyway, we now disable all zone checks by
default in named-compilezone.
Users relying on named-compilezone for integrity checks as
well as format conversion can run named-checkzone separately,
or re-enable the checks in named-compilezone by using:
"named-compilezone -n fail -k fail -r warn -T warn -W warn".
Mark Andrews [Wed, 6 Dec 2023 00:34:52 +0000 (11:34 +1100)]
dns_request_cancel needs to be callable from any thread
Check the tid and cancel the request immediately or pass it to the
appropriate loop for processing. Call request->cb directly from
req_sendevent as it is now always called with the correct tid.
Michał Kępień [Wed, 20 Dec 2023 16:21:14 +0000 (17:21 +0100)]
Do not destroy IXFR journal in xfrin_end()
The xfrin_end() function is run when a zone transfer is finished or
canceled. One of the actions it takes for incremental transfers (IXFR)
is calling dns_journal_destroy() on the zone journal structure that is
stored in the relevant zone transfer context (xfr->ixfr.journal). That
immediately invalidates that structure as it is not reference-counted.
However, since the changes present in the IXFR stream are applied to the
journal asynchronously (via isc_work_enqueue()), it is possible that
some zone changes may still be in the process of being written to the
journal by the time xfrin_end() destroys the relevant structure. Such a
scenario leads to crashes.
Fix by not destroying the zone journal structure until the entire zone
transfer context is destroyed. xfrin_destroy() already conditionally
calls dns_journal_destroy() and when the former is called, all
asynchronous work for a given zone transfer process is guaranteed to be
complete.
Matthijs Mekking [Wed, 13 Dec 2023 08:38:17 +0000 (09:38 +0100)]
Remove kasp mutex lock
Multiple zones should be able to read the same key and signing policy
at the same time. Since writing the kasp lock only happens during
reconfiguration, and the complete kasp list is being replaced, there
is actually no need for a lock. Reference counting ensures that a kasp
structure is not destroyed when still being attached to one or more
zones.
This significantly improves the load configuration time.
Mark Andrews [Mon, 18 Dec 2023 00:23:21 +0000 (11:23 +1100)]
Regression check for missing RRSIGs
When transitioning from NSEC3 to NSEC the added records where not
being signed because the wrong time was being used to determine if
a key should be used or not. Check that these records are actually
signed.
Mark Andrews [Thu, 14 Dec 2023 22:42:10 +0000 (09:42 +1100)]
Use 'now' rather than 'inception' in 'add_sigs'
When kasp support was added 'inception' was used as a proxy for
'now' and resulted in signatures not being generated or the wrong
signatures being generated. 'inception' is the time to be set
in the signatures being generated and is usually in the past to
allow for clock skew. 'now' determines what keys are to be used
for signing.
Michał Kępień [Mon, 18 Dec 2023 14:11:39 +0000 (15:11 +0100)]
"trust-anchor-telemetry" is no longer experimental
Remove the CFG_CLAUSEFLAG_EXPERIMENTAL flag from the
"trust-anchor-telemetry" statement as the behavior of the latter has not
been changed since its initial implementation and there are currently no
plans to do so. This silences a relevant log message that was emitted
even when the feature was explicitly disabled.
Michał Kępień [Mon, 18 Dec 2023 10:33:43 +0000 (11:33 +0100)]
Fix reference counting in do_nsfetch()
Each function queuing a do_nsfetch() call using isc_async_run() is
expected to increase the given zone's internal reference count
(zone->irefs), which is then correspondingly decreased in either
do_nsfetch() itself (when the dns_resolver_createfetch() fails) or in
nsfetch_done() (when recursion is finished).
However, do_nsfetch() can also return early if either the zone itself or
the relevant view's resolver object is being shut down. In that case,
do_nsfetch() simply returns without decreasing the internal reference
count for the zone. This leaves a dangling zone reference around, which
leads to hangs during named shutdown.
Fix by executing the same cleanup code for early returns from
do_nsfetch() as for a failed dns_resolver_createfetch() call in that
function as the reference count will not be decreased in nsfetch_done()
in any of these cases.
Michał Kępień [Mon, 18 Dec 2023 10:07:04 +0000 (11:07 +0100)]
Prevent an infinite loop in shutdown_listener()
The loop in shutdown_listener() assumes that the reference count for
every controlconnection_t object on the listener->connections linked
list will drop down to zero after the conn_shutdown() call in the loop's
body. However, when the timing is just right, some netmgr callbacks for
a given control connection may still be awaiting processing by the same
event loop that executes shutdown_listener() when the latter is run.
Since these netmgr callbacks must be run in order for the reference
count for the relevant controlconnection_t objects to drop to zero, when
the scenario described above happens, shutdown_listener() runs into an
infinite loop due to one of the controlconnection_t objects on the
listener->connections linked list never going away from the head of that
list.
Fix by safely iterating through the listener->connections list and
initiating shutdown for all controlconnection_t objects found. This
allows any pending netmgr callbacks to be run by the same event loop in
due course, i.e. after shutdown_listener() returns.
Aram Sargsyan [Tue, 12 Dec 2023 14:54:40 +0000 (14:54 +0000)]
Fix a statschannel system test zone loadtime issue
The check_loaded() function compares the zone's loadtime value and
an expected loadtime value, which is based on the zone file's mtime
extracted from the filesystem.
For the secondary zones there may be cases, when the zone file isn't
ready yet before the zone transfer is complete and the zone file is
dumped to the disk, so a so zero value mtime is retrieved.
In such cases wait one second and retry until timeout. Also modify
the affected check to allow a possible difference of the same amount
of seconds as the chosen timeout value.
Aram Sargsyan [Fri, 15 Dec 2023 09:43:36 +0000 (09:43 +0000)]
Use atomic store operations instead of atomic initialize
The atomic_init() function makes sense to use with structure's
members when creating a new instance of a strucutre. In other
places, use atomic store operations instead, in order to avoid
data races.