Nicki Křížek [Mon, 14 Oct 2024 12:44:06 +0000 (14:44 +0200)]
Disable too-many/too-few pylint checks
Enforcing pylint standards and default for our test code seems
counter-productive. Since most of the newly added code are tests or is
test-related, encountering these checks rarely make us refactor the code
in other ways and we just disable these checks individually. Code that
is too complex or convoluted will be pointed out in reviews anyways.
Move all test cases from tests.sh to tests_ksr.py. The only test that
is not moved is the check that key id's match expected keys. The
shell-based system test checks two earlier set environment variables
against each other that has become redundant in the pytest variant,
because we now check the signed key response against a list of keys
and for each key we take into account the timing metadata. So we
already ensure that each published key is in the correct key bundle.
Write initial pytest kasp library. This contains everything that is
required for testing Offline KSK functionality with pytest.
This includes:
- addtime: adding a value to a timing metadata
- get_timing_metdata: retrieve timing metadata from keyfile
- get_metadata/get_keystate: retrieve metadata from statefile
- get_keytag: retrieve keytag from base keyfile string
- get_keyrole: get key role from statefile
- dnskey_equals: compare DNSKEY record from file against a string
- cds_equals: compare CDS derived from file against a string
- zone_is_signed: wait until a zone is completely signed
- dnssec_verify: verify a DNSSEC signed zone with dnssec-verify
- check_dnssecstatus: check rndc dnssec -status output
- check_signatures: check that signatures for a given RRset are correct
- check_dnskeys: check that the published DNSKEY RRset is correct
- check_cds: check that the published CDS RRset is correct
- check_apex: check SOA, DNSKEY, CDNSKEY, and CDS RRset
- check_subdomain: check an RRset below the apex
Aydın Mercan [Wed, 16 Oct 2024 12:53:31 +0000 (12:53 +0000)]
chg: dev: unify explicit fetching and libcrypto handling
Unify libcrypto initialization and explicit digest fetching in a single place.
It will remove the remaining implicit fetching and deduplicate explicit
fetching inside the codebase. Initialization has been moved in to ensure
OpenSSL cleanup is done only after fetched contextes are destroyed.
Merge branch 'aydin/libdns-explicit-fetch' into 'main'
Nicki Křížek [Tue, 15 Oct 2024 11:14:02 +0000 (11:14 +0000)]
chg: ci: Allow re-try of unit tests on FreeBSD 14
The unit test doh_test tends do fail quite often due to exceeding run
time limit in the unit:clang:freebsd14:amd64 job. Use a retry on gitlab
level to alleviate the issue until a better fix is available.
Related #4924
Merge branch '4924-retry-doh_test-freebsd14' into 'main'
Nicki Křížek [Tue, 1 Oct 2024 15:10:23 +0000 (17:10 +0200)]
Allow re-try of unit tests on FreeBSD 14
The unit test doh_test tends do fail quite often due to exceeding run
time limit in the unit:clang:freebsd14:amd64 job. Use a retry on gitlab
level to alleviate the issue until a better fix is available.
Nicki Křížek [Tue, 15 Oct 2024 08:03:25 +0000 (10:03 +0200)]
Support dnspython 2.7.0
CookieOption with new .server/.client attributes (rather than .data) was
added to dnspython. Adjust the code to use the new attributes if
available and fall back to the old code for dnspython<2.7.0
compatibility.
Mark Andrews [Mon, 14 Oct 2024 23:54:27 +0000 (23:54 +0000)]
fix: usr: Fix NSEC3 closest encloser lookup for names with empty non-terminals
The performance improvement for finding the NSEC3 closest encloser when generating authoritative responses could cause servers to return incorrect NSEC3 records in some cases. This has been fixed.
Closes #4950
Merge branch '4950-bind-logs-expected-covering-nsec3-got-an-exact-match' into 'main'
Mark Andrews [Thu, 10 Oct 2024 05:39:10 +0000 (16:39 +1100)]
Use a binary search to find the NSEC3 closest encloser
maxlabels is the suffix length that corresponds to the latest
NXDOMAIN response. minlabels is the suffix length that corresponds
to longest found existing name.
Evan Hunt [Sat, 12 Oct 2024 03:19:02 +0000 (20:19 -0700)]
check 'rndc recursing'
there was no system test that exercised 'rndc recursing'; a
simple one has now been added; it confirms that the number of
recursing clients reported by 'rndc stats' is in agreement with
the list returned by 'rndc recursing'.
Michal Nowak [Wed, 9 Oct 2024 10:53:50 +0000 (12:53 +0200)]
Add libjson-c-dev before #4960 is addressed
Otherwise the "statistics-channels" option in doc/misc/options and
doc/man/named.conf.5in is marked as "not configured" (contrary to what
we have in release tarballs as they were build on a different image that
has libjson-c and libxml2 in it).
Caused by #4895 that made the option dependant on libjson-c or libxml2
presence in the build image.
Matthijs Mekking [Mon, 14 Oct 2024 07:15:41 +0000 (07:15 +0000)]
chg: usr: Harden key management when key files have become unavailabe
Prior to doing key management, BIND 9 will check if the key files on disk match the expected keys. If key files for previously observed keys have become unavailable, this will prevent the internal key manager from running.
Merge branch '4763-do-not-roll-if-key-files-are-missing' into 'main'
Matthijs Mekking [Tue, 20 Aug 2024 12:38:24 +0000 (14:38 +0200)]
Test removing DNSKEYs from other providers
In a multi-signer setup, removing DNSKEY records from the zone should
not be treated as a key that previously exists in the keyring, thus
blocking the keymgr. Add a test case to make sure.
Matthijs Mekking [Mon, 19 Aug 2024 07:46:56 +0000 (09:46 +0200)]
Add additional test case with purged key
Test that if a key to be purged is in the keyring, it does not
prevent the keymgr from running. Normally a key that is in the keyring
should be available again on the next run, but that is not true for
a key that can be purged.
In addition, fix some wait_for_log calls, by adding the missing
'|| ret=1' parts.
Matthijs Mekking [Wed, 14 Aug 2024 12:38:22 +0000 (14:38 +0200)]
Fix some system test cases
Some test cases were working but for the wrong reasons. These started
to fail when I implemented the first approach for #4763, where the
existence of a DNSKEY together with an empty keyring is suspicious and
would prevent the keymgr from running.
These are:
1. kasp: The multisigner-model2.kasp zone has ZSKs from other providers
in the zone, but not yet its own keys. Pregenerate signing keys and
add them to the unsigned zone as well.
2. kasp: The dynamic-signed-inline-signing.kasp zone has a key generated
and added in the raw version of the zone. But the key file is stored
outside the key-directory for the given zone. Add '-K keys' to the
dnssec-keygen command.
Matthijs Mekking [Thu, 15 Aug 2024 08:36:15 +0000 (10:36 +0200)]
Verify new key files before running keymgr
Prior to running the keymgr, first make sure that existing keys
are present in the new keylist. If not, treat this as an operational
error where the keys are made offline (temporarily), possibly unwanted.
In this specific case the key files are temporary unavailable, for
example because of an operator error, or a mount failure). In such
cases, BIND should not try to roll over these keys.
Artem Boldariev [Thu, 10 Oct 2024 19:05:54 +0000 (19:05 +0000)]
fix: dig - always set the default port when doing a UDP query
This commit ensures that the port is set before attempting a UDP
query. Before that a situation could appear when previous query have
completed over a different transport (that uses a dedicated port) and
then a UDP query will be attempted over the port of the previous
transport.
Closes: #4984.
Merge branch 'artem-debian-bug-1059582' into 'main'
Artem Boldariev [Thu, 10 Oct 2024 10:18:05 +0000 (13:18 +0300)]
dig: always set the default port when doing a UDP query
This commit ensures that the port is set before attempting a UDP
query. Before that a situation could appear when previous query have
completed over a different transport (that uses a dedicated port) and
then a UDP query will be attempted over the port of the previous
transport.
Arаm Sаrgsyаn [Wed, 9 Oct 2024 11:38:35 +0000 (11:38 +0000)]
fix: dev: Fix error path bugs in the manager's "recursing-clients" list management
In two places, after linking the client to the manager's
"recursing-clients" list using the check_recursionquota()
function, the query.c module fails to unlink it on error
paths. Fix the bugs by unlinking the client from the list.
Merge branch 'aram/unlink-recursing-clients-on-error-paths' into 'main'
Aram Sargsyan [Wed, 2 Oct 2024 16:39:13 +0000 (16:39 +0000)]
Refactor the way check_recursionquota() is used
Rename check_recursionquota() to acquire_recursionquota(), and
implement a new function called release_recursionquota() to
reverse the action. It helps with decreasing code duplication.
Aram Sargsyan [Wed, 2 Oct 2024 16:22:38 +0000 (16:22 +0000)]
Fix error path bugs in the "recursing-clients" list management
In two places, after linking the client to the manager's
"recursing-clients" list using the check_recursionquota()
function, the query.c module fails to unlink it on error
paths. Fix the bugs by unlinking the client from the list.
Also make sure that unlinking happens before detaching the
client's handle, as it is the logically correct order, e.g.
in case if it's the last handle and ns__client_reset_cb()
can be called because of the detachment.
Arаm Sаrgsyаn [Wed, 9 Oct 2024 10:31:09 +0000 (10:31 +0000)]
fix: dev: Fix a data race in dns_zone_getxfrintime()
The dns_zone_getxfrintime() function fails to lock the zone before
accessing its 'xfrintime' structure member, which can cause a data
race between soa_query() and the statistics channel. Add the missing
locking/unlocking pair, like it's done in numerous other similar
functions.
Closes #4976
Merge branch '4976-zone-xfrintime-data-race-fix' into 'main'
Aram Sargsyan [Mon, 7 Oct 2024 12:26:59 +0000 (12:26 +0000)]
Fix a data race in dns_zone_getxfrintime()
The dns_zone_getxfrintime() function fails to lock the zone before
accessing its 'xfrintime' structure member, which can cause a data
race between soa_query() and the statistics channel. Add the missing
locking/unlocking pair, like it's done in numerous other similar
functions.
Arаm Sаrgsyаn [Wed, 9 Oct 2024 09:12:45 +0000 (09:12 +0000)]
fix: dev: Clean up 'nodetach' in ns_client
The 'nodetach' member is a leftover from the times when non-zero
'stale-answer-client-timeout' values were supported, and currently
is always 'false'. Clean up the member and its usage.
Merge branch 'aram/cleanup-ns-client-nodetach' into 'main'
Aram Sargsyan [Mon, 7 Oct 2024 15:29:14 +0000 (15:29 +0000)]
Clean up 'nodetach' in ns_client
The 'nodetach' member is a leftover from the times when non-zero
'stale-answer-client-timeout' values were supported, and currently
is always 'false'. Clean up the member and its usage.
Ondřej Surý [Wed, 2 Oct 2024 12:16:03 +0000 (12:16 +0000)]
fix: dev: Don't enable REUSEADDR on outgoing UDP sockets
The outgoing UDP sockets enabled `SO_REUSEADDR` that allows sharing of the UDP sockets, but with one big caveat - the socket that was opened the last would get all traffic. The dispatch code would ignore the invalid responses in the dns_dispatch, but this could lead to unexpected results.
Merge branch 'ondrej/fix-outgoing-UDP-port-selection' into 'main'
Currently, the outgoing UDP sockets have enabled
SO_REUSEADDR (SO_REUSEPORT on BSDs) which allows multiple UDP sockets to
bind to the same address+port. There's one caveat though - only a
single (the last one) socket is going to receive all the incoming
traffic. This in turn could lead to incoming DNS message matching to
invalid dns_dispatch and getting dropped.
Disable setting the SO_REUSEADDR on the outgoing UDP sockets. This
needs to be done explicitly because `uv_udp_open()` silently enables the
option on the socket.
Ondřej Surý [Wed, 2 Oct 2024 06:37:48 +0000 (08:37 +0200)]
Skip TCP dispatch responses that are not ours
When matching the TCP dispatch responses, we should skip the responses
that do not belong to our TCP connection. This can happen with faulty
upstream server that sends invalid QID back to us.
Arаm Sаrgsyаn [Wed, 2 Oct 2024 09:51:40 +0000 (09:51 +0000)]
fix: dev: Don't ignore the local port number in dns_dispatch_add() for TCP
The dns_dispatch_add() function registers the 'resp' entry in
'disp->mgr->qids' hash table with 'resp->port' being 0, but in
tcp_recv_success(), when looking up an entry in the hash table
after a successfully received data the port is used, so if the
local port was set (i.e. it was not 0) it fails to find the
entry and results in an unexpected error.
Set the 'resp->port' to the given local port value extracted from
'disp->local'.
Closes #4969
Merge branch '4969-dispatch-tcp-source-port-bug-fix' into 'main'
Aram Sargsyan [Tue, 1 Oct 2024 14:16:47 +0000 (14:16 +0000)]
Don't ignore the local port number in dns_dispatch_add() for TCP
The dns_dispatch_add() function registers the 'resp' entry in
'disp->mgr->qids' hash table with 'resp->port' being 0, but in
tcp_recv_success(), when looking up an entry in the hash table
after a successfully received data the port is used, so if the
local port was set (i.e. it was not 0) it fails to find the
entry and results in an unexpected error.
Set the 'resp->port' to the given local port value extracted from
'disp->local'.
Alessio Podda [Wed, 2 Oct 2024 08:16:17 +0000 (08:16 +0000)]
new: usr: Support ISO timestamps with timezone information
The configuration option `print-time` can now be set to `iso8601-tzinfo` in order to use the ISO 8601 timestamp with timezone information when logging. This is used as a default for `named -g`.
Closes #4963
Merge branch '4963-provide-timezone-information-in-log-timestamps' into 'main'
This commit adds support for timestamps in iso8601 format with timezone
when logging. This is exposed through the iso8601-tzinfo printtime
suboption.
It also makes the new logging format the default for -g output,
hopefully removing the need for custom timestamp parsing in scripts.
Michal Nowak [Tue, 1 Oct 2024 12:05:39 +0000 (12:05 +0000)]
chg: test: Replace dns.query module with isctest.query
The `dns.query.udp` and `dns.query.tcp` methods are [prone to timeouts](https://gitlab.isc.org/isc-projects/bind9/-/jobs/4785053); their `isctest.query` equivalents should be used in system tests instead.
Merge branch 'mnowak/convert-dns-query-udp-and-tcp-to-isctest-query' into 'main'
Alessio Podda [Tue, 1 Oct 2024 10:33:56 +0000 (10:33 +0000)]
fix: dev: Null clausedefs for ancient options
This commit nulls all type fields for the clausedef lists that are
declared ancient, and removes the corresponding cfg_type_t and parsing
functions when they are found to be unused after the change.
Among others, it removes some leftovers from #1913.
Closes #4962
Merge branch '4962-null-clausedef-types-for-ancient-options' into 'main'
This commit nulls all type fields for the clausedef lists that are
declared ancient, and removes the corresponding cfg_type_t and parsing
functions when they are found to be unused after the change.
fix: doc: Restore text about sig validity and SOA expire
When `sig-validity-interval` was obsoleted, the text that the signature validity interval should be multiples of the SOA expire interval was removed. Restore this text to the description of the `signatures-validity` option.
Closes #4951
Merge branch '4951-document-signatures-validity-soa-expire' into 'main'
The example.com zone file given in the "Configurations and Zone Files"
chapter has an SOA expire of 3 weeks, which is not a multiple of
the default signatures-validity value. Adjust the SOA expire so that
it is much lower than the signatures-validity default.
When `sig-validity-interval` was obsoleted, the text that the signature
validity interval should be multiples of the SOA expire interval was
removed. Restore this text to the description of the
`signatures-validity` option.
Mark Andrews [Tue, 1 Oct 2024 01:26:56 +0000 (01:26 +0000)]
fix: usr: Fix a bug in the static-stub implementation
Static-stub addresses and addresses from other sources were being
mixed together, resulting in static-stub queries going to addresses
not specified in the configuration, or alternatively, static-stub
addresses being used instead of the correct server addresses.
Closes #4850
Merge branch '4850-add-an-additional-class-of-names-to-adb' into 'main'
Mark Andrews [Wed, 14 Aug 2024 13:46:44 +0000 (23:46 +1000)]
Store static-stub addresses seperately in the adb
Static-stub address and addresses from other sources where being
mixed together resulting in static-stub queries going to addresses
not specified in the configuration or alternatively static-stub
addresses being used instead of the real addresses.
chg: dev: Use release memory ordering when incrementing reference counter
As the relaxed memory ordering doesn't ensure any memory
synchronization, it is possible that the increment will succeed even
in the case when it should not - there is a race between
atomic_fetch_sub(..., acq_rel) and atomic_fetch_add(..., relaxed).
Only the result is consistent, but the previous value for both calls
could be same when both calls are executed at the same time.
Merge branch 'ondrej/use-release-memory-ordering-for-reference-counting' into 'main'
Use release memory ordering when incrementing reference counter
As the relaxed memory ordering doesn't ensure any memory
synchronization, it is possible that the increment will succeed even
in the case when it should not - there is a race between
atomic_fetch_sub(..., acq_rel) and atomic_fetch_add(..., relaxed).
Only the result is consistent, but the previous value for both calls
could be same when both calls are executed at the same time.
fix: dev: Add a missing rcu_read_unlock() call on exit path
An exit path in the dns_dispatch_add() function fails to get out of
the RCU critical section when returning early. Add the missing
rcu_read_unlock() call.
Merge branch 'aram/add-missing-rcu_read_unlock-in-dns_dispatch_add' into 'main'
An exit path in the dns_dispatch_add() function fails to get out of
the RCU critical section when returning early. Add the missing
rcu_read_unlock() call.