Michał Kępień [Fri, 30 May 2025 17:37:53 +0000 (17:37 +0000)]
chg: test: Use isctest.asyncserver in the "chain" test
Replace the custom DNS servers used in the "chain" system test with
new code based on the isctest.asyncserver module.
For ans3, replace the sequence of logical conditions present in Perl
code with zone files and a limited amount of custom logic applied on top
of them where necessary.
For ans4, replace the ctl_channel() and create_response() functions with
a custom control command handler coupled with a dynamically instantiated
response handler, making the code more robust and readable.
Migrate sendcmd() and its uses to the new way of sending control queries
to custom servers used in system tests.
Depends on !10409
Merge branch 'michal/chain-asyncserver' into 'main'
Michał Kępień [Fri, 30 May 2025 16:23:21 +0000 (18:23 +0200)]
Use isctest.asyncserver in the "chain" test
Replace the custom DNS servers used in the "chain" system test with
new code based on the isctest.asyncserver module.
For ans3, replace the sequence of logical conditions present in Perl
code with zone files and a limited amount of custom logic applied on top
of them where necessary.
For ans4, replace the ctl_channel() and create_response() functions with
a custom control command handler coupled with a dynamically instantiated
response handler, making the code more robust and readable.
Migrate sendcmd() and its uses to the new way of sending control queries
to custom servers used in system tests.
Michał Kępień [Fri, 30 May 2025 16:23:21 +0000 (18:23 +0200)]
Improve readability of sendcmd() calls
To improve readability of sendcmd() calls used for controlling
isctest.asyncserver-based custom DNS servers, pass the command's name
and arguments as separate parameters.
Michał Kępień [Fri, 30 May 2025 16:18:07 +0000 (16:18 +0000)]
new: test: Handle alias records in zone files loaded by AsyncDnsServer
dnspython does not treat CNAME records in zone files in any special way;
they are just RRsets belonging to zone nodes. Process CNAMEs when
preparing zone-based responses just like a normal authoritative DNS
server would.
Adding proper DNAME support to AsyncDnsServer would add complexity to
its code for little gain: DNAME use in custom system test servers is
limited to crafting responses that attempt to trigger bugs in named.
This fact will not be obvious to AsyncDnsServer users as it
automatically loads all zone files it finds and handles CNAME records
like a normal authoritative DNS server would.
Therefore, to prevent surprises:
- raise an exception whenever DNAME records are found in any of the
zone files loaded by AsyncDnsServer,
- add a new optional argument to the AsyncDnsServer constructor that
suppresses this new behavior, allowing zones with DNAME records to be
loaded anyway.
This enables response handlers to use the DNAME records present in zone
files in arbitrary ways without complicating the "base" code.
Merge branch 'michal/asyncserver-alias-records' into 'main'
Michał Kępień [Fri, 30 May 2025 16:08:54 +0000 (18:08 +0200)]
Force manual DNAME handling to be acknowledged
Adding proper DNAME support to AsyncDnsServer would add complexity to
its code for little gain: DNAME use in custom system test servers is
limited to crafting responses that attempt to trigger bugs in named.
This fact will not be obvious to AsyncDnsServer users as it
automatically loads all zone files it finds and handles CNAME records
like a normal authoritative DNS server would.
Therefore, to prevent surprises:
- raise an exception whenever DNAME records are found in any of the
zone files loaded by AsyncDnsServer,
- add a new optional argument to the AsyncDnsServer constructor that
suppresses this new behavior, allowing zones with DNAME records to be
loaded anyway.
This enables response handlers to use the DNAME records present in zone
files in arbitrary ways without complicating the "base" code.
Michał Kępień [Fri, 30 May 2025 16:08:54 +0000 (18:08 +0200)]
Drop unused AsyncDnsServer constructor argument
The constructor for the AsyncDnsServer class takes a 'load_zones'
argument that is not used anywhere and is not expected to be useful in
the future: zone files are not required for an AsyncDnsServer instance
to start and, if necessary, zone-based answers can be suppressed or
modified by installing a custom response handler.
Michał Kępień [Fri, 30 May 2025 16:08:54 +0000 (18:08 +0200)]
Properly handle CNAMEs when preparing responses
dnspython does not treat CNAME records in zone files in any special way;
they are just RRsets belonging to zone nodes. Process CNAMEs when
preparing zone-based responses just like a normal authoritative DNS
server would.
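As a rough illustration of the CNAME processing described above, the sketch below follows an alias chain in a toy in-memory zone. It does not use dnspython or AsyncDnsServer; the ZONE dictionary, the resolve() helper, and the hop limit are hypothetical names and values invented for this example.

```python
# Toy zone: owner name -> {record type: [rdata, ...]}. Hypothetical data.
ZONE = {
    "www.example.": {"CNAME": ["web.example."]},
    "web.example.": {"CNAME": ["host.example."]},
    "host.example.": {"A": ["192.0.2.1"]},
}


def resolve(qname, qtype, zone=ZONE, max_hops=8):
    """Return the answer section as a list of (owner, type, rdata) tuples.

    A CNAME found at the current name is added to the answer and then
    followed, the way an authoritative server handles in-zone aliases.
    """
    answer = []
    name = qname
    for _ in range(max_hops):
        node = zone.get(name, {})
        if qtype in node:
            # The requested type exists at this name: the chain ends here.
            answer.extend((name, qtype, rdata) for rdata in node[qtype])
            return answer
        if "CNAME" in node:
            # Record the alias and restart the lookup at its target.
            target = node["CNAME"][0]
            answer.append((name, "CNAME", target))
            name = target
            continue
        return answer  # no data, or the name is not in the zone
    return answer  # hop limit reached, e.g. on a CNAME loop
```

Querying the alias for an A record returns both CNAMEs followed by the address record, while querying for the CNAME type itself returns only the first alias.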
Nicki Křížek [Thu, 29 May 2025 12:34:15 +0000 (12:34 +0000)]
fix: test: Fix intermittent kasp pytest failures
The `pytest` cases check whether a zone is signed by looking at the `NSEC` record at the apex. If that record has an RRSIG, the zone is considered signed. But `named` signs zones incrementally (in batches), so the zone may still lack some signatures. In other words, the tests may consider a zone signed while signing is in fact not yet complete, and then perform additional checks, such as whether a subdomain is signed with the right key. If such a check happens before the zone is fully signed, it will fail.
Fix this by using `check_dnssec_verify` instead of `check_is_zone_signed`. We were already doing this check, but we now move it up. This will transfer the zone and then run `dnssec-verify` on the response. If the zone is only partially signed, the check will fail and be retried up to ten times.
Closes #5303
Merge branch '5303-kasp-pytest-intermittent-test-failures' into 'main'
The pytest cases check whether a zone is signed by looking at the NSEC
record at the apex. If that record has an RRSIG, the zone is considered
signed. But 'named' signs zones incrementally (in batches), so the
zone may still lack some signatures. In other words, the tests may
consider a zone signed while signing is in fact not yet complete, and
then perform additional checks, such as whether a subdomain is signed
with the right key. If such a check happens before the zone is fully
signed, it will fail.
Fix this by using 'check_dnssec_verify' instead of
'check_is_zone_signed'. We were already doing this check, but we now
move it up. This will transfer the zone and then run 'dnssec-verify'
on the response. If the zone is only partially signed, the check will
fail and be retried up to ten times.
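The retry behaviour can be pictured with the sketch below. It is not the test suite's code: retry_until() and the check callable are hypothetical stand-ins for the retried verification step, and the delay between attempts is an assumed value.

```python
import time


def retry_until(check, attempts=10, delay=1.0, sleep=time.sleep):
    """Call check() until it returns True or the attempts run out.

    Returns the number of the attempt that succeeded; raises
    AssertionError if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        if check():
            return attempt
        if attempt < attempts:
            sleep(delay)  # give the server time to finish signing
    raise AssertionError(f"check still failing after {attempts} attempts")
```

A check that only passes once the zone is fully signed then tolerates the window in which signing is still in progress, instead of failing on the first partially signed transfer.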
Nicki Křížek [Thu, 29 May 2025 09:04:04 +0000 (09:04 +0000)]
chg: test: Add utility module to import correct version of hypothesis
On FIPS-enabled platforms, we need to ensure a minimal version of
hypothesis which no longer uses MD5. This doesn't need to be enforced
for other platforms.
Move the import magic to a utility module to avoid copy-pasting the
boilerplate code around.
Merge branch 'nicki/pytest-import-hypothesis' into 'main'
Nicki Křížek [Mon, 5 May 2025 16:00:07 +0000 (18:00 +0200)]
Ensure supported version of hypothesis is available
On FIPS-enabled platforms, we need to ensure a minimal version of
hypothesis which no longer uses MD5. This doesn't need to be enforced
for other platforms.
Move the import magic to a utility module to avoid copy-pasting the
boilerplate code around.
Mark Andrews [Thu, 29 May 2025 06:59:00 +0000 (06:59 +0000)]
fix: nil: silence tainted scalar in client.c
Coverity detected that 'optlen' was not being checked in 'process_opt'.
This is actually already done when the OPT record was initially
parsed. Add an INSIST to silence Coverity as is done in message.c.
Closes #5330
Merge branch '5330-tainted-scalar-in-client-c' into 'main'
Mark Andrews [Wed, 28 May 2025 23:42:08 +0000 (09:42 +1000)]
Silence tainted scalar in client.c
Coverity detected that 'optlen' was not being checked in 'process_opt'.
This is actually already done when the OPT record was initially
parsed. Add an INSIST to silence Coverity as is done in message.c.
Ondřej Surý [Thu, 29 May 2025 04:24:26 +0000 (04:24 +0000)]
chg: dev: Unify handling of the program name in all the utilities
The utilities handled 'argv[0]' in several different ways: some programs
used a static value, some used isc_file_progname(), some stripped 'lt-'
from the beginning of the name, and some used argv[0] directly.
Unify the handling and replace all the variables with
isc_commandline_progname, which is populated by the new
isc_commandline_init(argc, argv) call.
Merge branch 'ondrej/unify-handling-of-the-program-name' into 'main'
Ondřej Surý [Wed, 28 May 2025 20:43:38 +0000 (22:43 +0200)]
Unify handling of the program name in all the utilities
The utilities handled 'argv[0]' in several different ways: some programs
used a static value, some used isc_file_progname(), some stripped 'lt-'
from the beginning of the name, and some used argv[0] directly.
Unify the handling and replace all the variables with
isc_commandline_progname, which is populated by the new
isc_commandline_init(argc, argv) call.
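The kind of normalisation being unified can be sketched in Python as follows. This is an illustration only, not the C implementation of isc_commandline_init(); program_name() is a hypothetical helper, and whether the unified code performs exactly these two steps is an assumption.

```python
import os


def program_name(argv0):
    """Derive a display name from argv[0]."""
    # Keep only the final path component.
    name = os.path.basename(argv0)
    # Drop the 'lt-' prefix mentioned in the commit message.
    if name.startswith("lt-"):
        name = name[len("lt-"):]
    return name
```

Computing the name once at startup and storing it in a single variable means that every later message uses the same spelling, regardless of how the program was invoked.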
Ondřej Surý [Thu, 29 May 2025 03:50:44 +0000 (03:50 +0000)]
chg: dev: Set name for all the isc_mem context from isc_mem_create()
Instead of naming memory contexts with an explicit call to
isc_mem_setname(), pass the name to the isc_mem_create() call so that
every memory context is unconditionally named.
Merge branch 'ondrej/ondrej-isc_mem_create-with-name' into 'main'
Ondřej Surý [Wed, 28 May 2025 21:00:24 +0000 (23:00 +0200)]
Give every memory pool a name
Instead of naming memory pools with an explicit call to
isc_mempool_setname(), pass the name to the isc_mempool_create() call so
that every memory pool is unconditionally named.
Ondřej Surý [Fri, 21 Feb 2025 11:45:08 +0000 (12:45 +0100)]
Give every memory context a name
Instead of naming memory contexts with an explicit call to
isc_mem_setname(), pass the name to the isc_mem_create() call so that
every memory context is unconditionally named.
Colin Vidal [Wed, 28 May 2025 20:55:52 +0000 (22:55 +0200)]
coccinelle patch for isc_mem_free()/isc_mem_put()
add a Coccinelle patch to ensure the pointer being used by
isc_mem_free() and isc_mem_put() is not explicitly set to NULL (those
macros are taking care of it).
The memory contexts for the isc_managers and dst_api units had no name,
and that was causing trouble with the statistics channel output. Set the
name for the two memory contexts that were missing a proper name.
Ondřej Surý [Wed, 28 May 2025 17:48:57 +0000 (17:48 +0000)]
fix: usr: Fix zone deletion issue
A secondary zone could initiate a new zone transfer from the
primary server after it had been already deleted from the
secondary server, and before the internal garbage collection
was activated to clean it up completely. This has been fixed.
Aram Sargsyan [Mon, 12 May 2025 13:58:38 +0000 (13:58 +0000)]
Prepare a zone for shutting down when deleting it from a view
After b171cacf4f0123ba96bef6eedfc92dfb608db6b7, a zone object can
remain in memory for a while, until garbage collection is run.
Setting the DNS_ZONEFLG_EXITING flag should prevent the zone
maintenance function from running while it's in that state.
Otherwise, a secondary zone could initiate a zone transfer after
it had been deleted.
Ondřej Surý [Wed, 28 May 2025 16:51:33 +0000 (16:51 +0000)]
fix: usr: Fix a zone refresh bug
A secondary zone could fail to further refresh with new
versions of the zone from a primary server if named was
reconfigured during the SOA request step of an ongoing
zone transfer. This has been fixed.
Closes #5307
Merge branch '5307-zone-refresh-stuck-after-reconfiguration-fix' into 'main'
Aram Sargsyan [Wed, 21 May 2025 15:27:53 +0000 (15:27 +0000)]
Emit a ISC_R_CANCELED result instead of ISC_R_SHUTTINGDOWN
When a request manager shuts down, it also shuts down all its ongoing
requests. Currently it calls their callback functions with an
ISC_R_SHUTTINGDOWN result code for the request. Since a request
manager can shut down not only during named shutdown but also during
named reconfiguration, send an ISC_R_CANCELED result code instead of
ISC_R_SHUTTINGDOWN to avoid confusion and errors stemming from the
expectation that an ISC_R_SHUTTINGDOWN result code can only be
received during actual shutdown of named.
All the callback functions which are passed to either the
dns_request_create() or the dns_request_createraw() functions have
been analyzed to confirm that they can process both the
ISC_R_SHUTTINGDOWN and ISC_R_CANCELED result codes. Changes were
made where it was necessary.
Aram Sargsyan [Wed, 21 May 2025 14:44:50 +0000 (14:44 +0000)]
Fix a zone refresh bug in zone.c:refresh_callback()
When the zone.c:refresh_callback() callback function is called during
a SOA request before a zone transfer, it can receive an
ISC_R_SHUTTINGDOWN result for the sent request when named is shutting
down, and in that case it just destroys the request and finishes the
ongoing transfer, without clearing the DNS_ZONEFLG_REFRESH flag of the
zone. This is fine when named is going to shut down, but currently
the callback can also get an ISC_R_SHUTTINGDOWN result when named is
reconfigured during the ongoing SOA request. In that case, leaving the
DNS_ZONEFLG_REFRESH flag set results in the zone never being able
to refresh again, because any new attempts will be canceled while
the flag is set. Clear the DNS_ZONEFLG_REFRESH flag on the 'exiting'
error path of the callback function.
Colin Vidal [Wed, 28 May 2025 15:44:21 +0000 (15:44 +0000)]
fix: test: enable shell-based rndc system tests
Enable existing rndc system tests (the python test function calling the
shell file was missing). Also update the extra artifacts list to remove
one generated file which was left behind.
Colin Vidal [Wed, 28 May 2025 13:15:56 +0000 (15:15 +0200)]
enable shell-based rndc system tests
Enable existing rndc system tests (the python test function calling the
shell file was missing). Also update the extra artifacts list to remove
one generated file which was left behind.
Evan Hunt [Sat, 22 Mar 2025 06:32:27 +0000 (23:32 -0700)]
add DNS_RDATASET_FOREACH macro
replace the pattern `for (result = dns_rdataset_first(x); result ==
ISC_R_SUCCESS; result = dns_rdataset_next(x))` with a new
`DNS_RDATASET_FOREACH` macro throughout BIND.
Evan Hunt [Sat, 22 Mar 2025 06:48:01 +0000 (23:48 -0700)]
import_rdataset() can't fail
the import_rdataset() function can't return any value other
than ISC_R_SUCCESS, so it's been changed to void and its callers
don't rely on its return value any longer.
Evan Hunt [Tue, 27 May 2025 23:11:27 +0000 (23:11 +0000)]
fix: nil: correct the DbC assertions in message.c
the comments for some calls in the dns_message API specified
requirements which were not actually enforced in the functions.
in most cases, this has now been corrected by adding the missing
REQUIREs. in one case, the comment was incorrect and has been
revised.
Merge branch 'each-fix-message-requires' into 'main'
Evan Hunt [Tue, 27 May 2025 23:08:35 +0000 (23:08 +0000)]
fix: dev: Make all ISC_LIST_FOREACH calls safe
Previously, `ISC_LIST_FOREACH` and `ISC_LIST_FOREACH_SAFE` were
two separate macros, with the _SAFE version allowing entries
to be unlinked during the loop. `ISC_LIST_FOREACH` is now also
safe, and the separate `_SAFE` macro has been removed.
Similarly, the `ISC_LIST_FOREACH_REV` macro is now safe, and
`ISC_LIST_FOREACH_REV_SAFE` has also been removed.
Evan Hunt [Fri, 23 May 2025 20:02:22 +0000 (13:02 -0700)]
make all ISC_LIST_FOREACH calls safe
previously, ISC_LIST_FOREACH and ISC_LIST_FOREACH_SAFE were
two separate macros, with the _SAFE version allowing entries
to be unlinked during the loop. ISC_LIST_FOREACH is now also
safe, and the separate _SAFE macro has been removed.
similarly, the ISC_LIST_FOREACH_REV macro is now safe, and
ISC_LIST_FOREACH_REV_SAFE has also been removed.
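What makes a list walk "safe" in this sense can be sketched in Python: the successor is saved before the loop body runs, so the body may unlink the current entry. The Node and LinkedList classes below are hypothetical and only mirror the idea; they are not a translation of the ISC_LIST macros.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.prev = None
        self.next = None


class LinkedList:
    def __init__(self, values=()):
        self.head = None
        self.tail = None
        for value in values:
            self.append(Node(value))

    def append(self, node):
        node.prev, node.next = self.tail, None
        if self.tail is None:
            self.head = node
        else:
            self.tail.next = node
        self.tail = node

    def unlink(self, node):
        if node.prev is None:
            self.head = node.next
        else:
            node.prev.next = node.next
        if node.next is None:
            self.tail = node.prev
        else:
            node.next.prev = node.prev
        node.prev = node.next = None

    def foreach(self):
        """Yield each node. The successor is saved before yielding, so
        the caller may unlink the yielded node without breaking the walk."""
        node = self.head
        while node is not None:
            following = node.next
            yield node
            node = following


def drop_even(lst):
    """Unlink every even-valued node while iterating over the list."""
    for node in lst.foreach():
        if node.value % 2 == 0:
            lst.unlink(node)
    return [node.value for node in lst.foreach()]
```

Without the saved successor, unlinking the current node would clear its next pointer and end the walk early.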
Alessio Podda [Thu, 22 May 2025 22:53:48 +0000 (22:53 +0000)]
chg: dev: Adaptive memory allocation strategy for qp-tries
qp-tries allocate their nodes (twigs) in chunks to reduce allocator
pressure and improve memory locality. The choice of chunk size presents
a tradeoff: larger chunks benefit qp-tries with many values (as seen
in large zones and resolvers) but waste memory in smaller use cases.
Previously, our fixed chunk size of 2^10 twigs meant that even an
empty qp-trie would consume 12KB of memory, while reducing this size
would negatively impact resolver performance.
This commit implements an adaptive chunking strategy that:
- Tracks the size of the most recently allocated chunk.
- Doubles the chunk size for each new allocation until reaching a
predefined maximum.
This approach effectively balances memory efficiency for small tries
while maintaining the performance benefits of larger chunk sizes for
bigger data structures.
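The doubling strategy described above can be sketched as follows. The function name and the minimum and maximum sizes used in the test values are hypothetical; the actual limits chosen for qp-tries are not stated here.

```python
def chunk_sizes(min_size, max_size, count):
    """Return the sizes (in twigs) of the first `count` chunks.

    The first chunk uses min_size; each later chunk doubles the size of
    the most recently allocated one, capped at max_size.
    """
    sizes = []
    last = 0  # size of the most recently allocated chunk; 0 means none yet
    for _ in range(count):
        last = min_size if last == 0 else min(last * 2, max_size)
        sizes.append(last)
    return sizes
```

A trie that never grows past its first chunk pays only for the minimum size, while a large trie reaches the maximum after a logarithmic number of allocations.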
Alessio Podda [Mon, 5 May 2025 09:43:44 +0000 (11:43 +0200)]
Tune min and max chunk size
Before implementing adaptive chunk sizing, it was necessary to ensure
that a chunk could hold up to 48 twigs, but the new logic will size-up
new chunks to ensure that the current allocation can succeed.
We exploit the new logic in two ways:
- We make the minimum chunk size smaller than the old limit of 2^6,
reducing memory consumption.
- We make the maximum chunk size larger, as it has been observed that
it improves resolver performance.
alessio [Sun, 9 Mar 2025 08:13:16 +0000 (09:13 +0100)]
Adaptive memory allocation strategy for qp-tries
qp-tries allocate their nodes (twigs) in chunks to reduce allocator
pressure and improve memory locality. The choice of chunk size presents
a tradeoff: larger chunks benefit qp-tries with many values (as seen
in large zones and resolvers) but waste memory in smaller use cases.
Previously, our fixed chunk size of 2^10 twigs meant that even an
empty qp-trie would consume 12KB of memory, while reducing this size
would negatively impact resolver performance.
This commit implements an adaptive chunking strategy that:
- Tracks the size of the most recently allocated chunk.
- Doubles the chunk size for each new allocation until reaching a
predefined maximum.
This approach effectively balances memory efficiency for small tries
while maintaining the performance benefits of larger chunk sizes for
bigger data structures.
This commit also splits the callback freeing qpmultis into two
phases, one that frees the underlying qptree, and one that reclaims
the qpmulti memory. In order to prevent races between the qpmulti
destructor and chunk garbage collection jobs, the second phase is
protected by reference counting.
Michał Kępień [Thu, 22 May 2025 12:21:04 +0000 (14:21 +0200)]
Send pre-announcement emails for all ISC projects
There is no reason for the public pre-announcements of security issues
to only be sent for BIND 9. Remove the "BIND 9 only" annotation from
the relevant checklist step as it caused confusion in practice.
Ondřej Surý [Tue, 20 May 2025 23:39:34 +0000 (23:39 +0000)]
rem: dev: Clean up the DST cryptographic API
The DST API has been cleaned up: duplicate functions have been squashed
into a single call (the verify and verify2 functions), and a couple of
unused functions have been completely removed (createctx2, computesecret,
paramcompare, and cleanup).
Deprecate max-rsa-exponent-size, always use 4096 instead
The `max-rsa-exponent-size` could limit the exponents of the RSA
public keys during the DNSSEC verification. Instead of providing
a cryptic (not cryptographic) knob, hardcode the max exponent to
be 4096 (the theoretical maximum for DNSSEC).
The DST API has been cleaned up: duplicate functions have been squashed
into a single call (the verify and verify2 functions), and a couple of
unused functions have been completely removed (createctx2, computesecret,
paramcompare, and cleanup).
Arаm Sаrgsyаn [Thu, 15 May 2025 13:26:44 +0000 (13:26 +0000)]
new: usr: Implement a new 'notify-defer' configuration option
This new option sets a delay (in seconds) to wait before sending
a set of NOTIFY messages for a zone. Whenever a NOTIFY message is
ready to be sent, sending will be deferred for this duration. This
option is not to be confused with the :any:`notify-delay` option.
The default is 0 seconds.
Closes #5259
Merge branch '5259-implement-zone-notify-defer' into 'main'
Implement a new 'notify-defer' configuration option
This new option sets the delay, in seconds, to wait before sending
a set of NOTIFY messages for a zone. Whenever a NOTIFY message is
ready to be sent, sending will be deferred for this duration.
Michał Kępień [Wed, 14 May 2025 17:17:11 +0000 (17:17 +0000)]
chg: test: Mark test_idle_timeout as flaky on FreeBSD 13
The test_idle_timeout check in the "timeouts" system test has been
failing often on FreeBSD 13 AWS hosts. Adding timestamped debug logging
shows that the time.sleep() calls used in that check are returning
significantly later than asked to on that platform (e.g. after 4 seconds
when just 1 second is requested), breaking the test's timing assumptions
and triggering false positives. These failures are not an indication of
a bug in named and have not been observed on any other platform. Mark
the problematic check as flaky, but only on FreeBSD 13, so that other
failure modes are caught appropriately.
Merge branch 'michal/mark-test_idle_timeout-as-flaky-on-freebsd-13' into 'main'
Michał Kępień [Wed, 14 May 2025 07:50:33 +0000 (09:50 +0200)]
Mark test_idle_timeout as flaky on FreeBSD 13
The test_idle_timeout check in the "timeouts" system test has been
failing often on FreeBSD 13 AWS hosts. Adding timestamped debug logging
shows that the time.sleep() calls used in that check are returning
significantly later than asked to on that platform (e.g. after 4 seconds
when just 1 second is requested), breaking the test's timing assumptions
and triggering false positives. These failures are not an indication of
a bug in named and have not been observed on any other platform. Mark
the problematic check as flaky, but only on FreeBSD 13, so that other
failure modes are caught appropriately.
Evan Hunt [Tue, 13 May 2025 07:56:08 +0000 (00:56 -0700)]
debug level was ignored when logging to stderr
In commit cc167266aa, the -g option was changed so it sets both
named_g_logstderr and also named_g_logflags to use ISO style timestamps
with tzinfo. Together with an error in named_log_setsafechannels(), that
change could cause the debugging level to be ignored.
Michał Kępień [Thu, 8 May 2025 20:45:48 +0000 (22:45 +0200)]
[CVE-2025-40775] sec: usr: Prevent assertion when processing TSIG algorithm
DNS messages that included a Transaction Signature (TSIG) containing an
invalid value in the algorithm field caused :iscman:`named` to crash
with an assertion failure. This has been fixed. :cve:`2025-40775`
See isc-projects/bind9#5300
Merge branch '5300-confidential-tsig-unknown-alg' into 'v9.21.8-release'
In a previous change, the "algorithm" value passed to
dns_tsigkey_create() was changed from a DNS name to an integer;
the name was then chosen from a table of known algorithms. A
side effect of this change was that a query using an unknown TSIG
algorithm was no longer handled correctly, and could trigger an
assertion failure. This has been corrected.
The dns_tsigkey struct now stores the signing algorithm
as a dst_algorithm_t value 'alg' instead of as a dns_name,
but retains an 'algname' field, which is used only when the
algorithm is DST_ALG_UNKNOWN. This allows the name of the
unrecognized algorithm to be returned in a BADKEY
response.
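The idea of storing an enumerated algorithm plus a fallback name can be sketched in Python. Everything below (the Alg enum, the KNOWN table, the TsigKey class and its helpers) is hypothetical and only mirrors the design described above; it is not the dns_tsigkey API.

```python
from enum import Enum


class Alg(Enum):
    UNKNOWN = 0
    HMAC_SHA256 = 1
    HMAC_SHA512 = 2


# Table of known algorithm names (an illustrative subset).
KNOWN = {"hmac-sha256.": Alg.HMAC_SHA256, "hmac-sha512.": Alg.HMAC_SHA512}


class TsigKey:
    def __init__(self, alg, algname=None):
        self.alg = alg
        self.algname = algname  # only set when alg is Alg.UNKNOWN


def make_key(algname):
    """Map a name to an enum value, keeping the name if it is unknown."""
    alg = KNOWN.get(algname.lower(), Alg.UNKNOWN)
    return TsigKey(alg, algname if alg is Alg.UNKNOWN else None)


def algorithm_name(key):
    """Return the name to echo back, e.g. in an error response."""
    if key.alg is Alg.UNKNOWN:
        return key.algname
    for name, alg in KNOWN.items():
        if alg is key.alg:
            return name
    return None
```

An unrecognized name therefore produces a usable key object rather than a failed lookup, and the original name remains available for the error response.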
Mark Andrews [Tue, 22 Apr 2025 08:39:59 +0000 (18:39 +1000)]
Wrong NSEC3 chosen for NO QNAME proof
When we optimised the closest encloser NSEC3 discovery the maxlabels
variable was used in the binary search. The updated value was later
used to add the NO QNAME NSEC3 but that block of code needed the
original value. This resulted in the wrong NSEC3 sometimes being
chosen to perform this role.