git.ipfire.org Git - thirdparty/bind9.git/log

Port dnstap test to use isctest utilities

fix: dev: Add more iteration macros

Add more macros for iteration: `DNS_RDATASET_FOREACH`, `CFG_LIST_FOREACH`, `DNS_DBITERATOR_FOREACH`, and `DNS_RDATASETITER_FOREACH`.

Merge branch 'each-rdataset-foreach' into 'main'

See merge request isc-projects/bind9!10350

add DNS_DBITERATOR_FOREACH and DNS_RDATASETITER_FOREACH

when iterating databases, use DNS_DBITERATOR_FOREACH and
DNS_RDATASETITER_FOREACH macros where possible.

add CFG_LIST_FOREACH macro

replace the pattern `for (elt = cfg_list_first(x); elt != NULL;
elt = cfg_list_next(elt))` with a new `CFG_LIST_FOREACH` macro.

add DNS_RDATASET_FOREACH macro

replace the pattern `for (result = dns_rdataset_first(x); result ==
ISC_R_SUCCESS; result = dns_rdataset_next(x))` with a new
`DNS_RDATASET_FOREACH` macro throughout BIND.
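
The loop shape that such a macro hides can be modeled outside C. The Python sketch below is an illustration only: `FakeRdataset`, its `first()`/`next()`/`current()` methods, and the numeric values of the result constants are hypothetical stand-ins, and the generator plays the role the macro plays in C.

```python
# Hypothetical stand-ins for the C result codes; the values are arbitrary.
ISC_R_SUCCESS = 0
ISC_R_NOMORE = 1


class FakeRdataset:
    """Toy cursor-style container mimicking a first()/next() iteration API."""

    def __init__(self, rdatas):
        self._rdatas = list(rdatas)
        self._pos = 0

    def first(self):
        self._pos = 0
        return ISC_R_SUCCESS if self._rdatas else ISC_R_NOMORE

    def next(self):
        self._pos += 1
        return ISC_R_SUCCESS if self._pos < len(self._rdatas) else ISC_R_NOMORE

    def current(self):
        return self._rdatas[self._pos]


def rdataset_foreach(rdataset):
    """Plays the role of the FOREACH macro: hides the result-code bookkeeping."""
    result = rdataset.first()
    while result == ISC_R_SUCCESS:
        yield rdataset.current()
        result = rdataset.next()


def collect_manual(rdataset):
    """The open-coded pattern that every call site previously repeated."""
    out = []
    result = rdataset.first()
    while result == ISC_R_SUCCESS:
        out.append(rdataset.current())
        result = rdataset.next()
    return out


rds = FakeRdataset(["192.0.2.1", "192.0.2.2"])
# Both forms visit the same elements; only the call-site noise differs.
assert list(rdataset_foreach(rds)) == collect_manual(rds)
```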

import_rdataset() can't fail

the import_rdataset() function can't return any value other
than ISC_R_SUCCESS, so it's been changed to void and its callers
don't rely on its return value any longer.

fix: nil: correct the DbC assertions in message.c

the comments for some calls in the dns_message API specified
requirements which were not actually enforced in the functions.
in most cases, this has now been corrected by adding the missing
REQUIREs. in one case, the comment was incorrect and has been
revised.

Merge branch 'each-fix-message-requires' into 'main'

See merge request isc-projects/bind9!10466

correct the DbC assertions in message.c

the comments for some calls in the dns_message API specified
requirements which were not actually enforced in the functions.

in most cases, this has now been corrected by adding the missing
REQUIREs. in one case, the comment was incorrect and has been
revised.

fix: dev: Make all ISC_LIST_FOREACH calls safe

Previously, `ISC_LIST_FOREACH` and `ISC_LIST_FOREACH_SAFE` were
two separate macros, with the _SAFE version allowing entries
to be unlinked during the loop. `ISC_LIST_FOREACH` is now also
safe, and the separate `_SAFE` macro has been removed.

Similarly, the `ISC_LIST_FOREACH_REV` macro is now safe, and
`ISC_LIST_FOREACH_REV_SAFE` has also been removed.
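
What "safe" means here can be sketched with a toy doubly linked list. This Python model is an assumption-laden illustration, not the macro's implementation: `Node`, `LinkedList`, and `foreach_safe()` are hypothetical names. The point is that the iterator captures the next element before the loop body runs, so the body may unlink the current element.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.prev = None
        self.next = None


class LinkedList:
    def __init__(self, values=()):
        self.head = None
        self.tail = None
        for value in values:
            self.append(Node(value))

    def append(self, node):
        node.prev = self.tail
        node.next = None
        if self.tail is None:
            self.head = node
        else:
            self.tail.next = node
        self.tail = node

    def unlink(self, node):
        if node.prev is None:
            self.head = node.next
        else:
            node.prev.next = node.next
        if node.next is None:
            self.tail = node.prev
        else:
            node.next.prev = node.prev
        # An unlinked element no longer knows its neighbours.
        node.prev = node.next = None

    def foreach_safe(self):
        node = self.head
        while node is not None:
            following = node.next  # captured before the loop body runs
            yield node
            node = following

    def values(self):
        return [node.value for node in self.foreach_safe()]


numbers = LinkedList(range(1, 7))
for node in numbers.foreach_safe():
    if node.value % 2 == 0:
        numbers.unlink(node)  # fine: the iterator already holds the next node

print(numbers.values())  # [1, 3, 5]
```

This sketch only covers unlinking the element currently being visited; unlinking the element that follows it would still invalidate the saved pointer.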

Merge branch 'each-isc-list-foreach' into 'main'

See merge request isc-projects/bind9!10479

make all ISC_LIST_FOREACH calls safe

previously, ISC_LIST_FOREACH and ISC_LIST_FOREACH_SAFE were
two separate macros, with the _SAFE version allowing entries
to be unlinked during the loop. ISC_LIST_FOREACH is now also
safe, and the separate _SAFE macro has been removed.

similarly, the ISC_LIST_FOREACH_REV macro is now safe, and
ISC_LIST_FOREACH_REV_SAFE has also been removed.

[CVE-2025-40775] sec: test: Add a bad TSIG algorithm hypothesis python test

Closes #5300

Merge branch '5300-tsig-unknown-alg-test' into 'main'

See merge request isc-projects/bind9!10475

Add a bad TSIG algorithm hypothesis python test

Co-authored-by: Petr Špaček <pspacek@isc.org>

chg: dev: Adaptive memory allocation strategy for qp-tries

qp-tries allocate their nodes (twigs) in chunks to reduce allocator
pressure and improve memory locality. The choice of chunk size presents
a tradeoff: larger chunks benefit qp-tries with many values (as seen
in large zones and resolvers) but waste memory in smaller use cases.

Previously, our fixed chunk size of 2^10 twigs meant that even an
empty qp-trie would consume 12KB of memory, while reducing this size
would negatively impact resolver performance.

This commit implements an adaptive chunking strategy that:
- Tracks the size of the most recently allocated chunk.
- Doubles the chunk size for each new allocation until reaching a
predefined maximum.

This approach effectively balances memory efficiency for small tries
while maintaining the performance benefits of larger chunk sizes for
bigger data structures.
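
The doubling rule can be sketched as follows. This is a minimal Python model, not the qp-trie code: `chunk_sizes()` is a hypothetical helper, the minimum and maximum are placeholder values rather than BIND's actual limits, and the 12-byte twig size is inferred from the figures above (2^10 twigs occupying 12KB).

```python
TWIG_BYTES = 12  # inferred from the text: 2**10 twigs occupy 12 KB

# Placeholder bounds chosen only for this illustration.
MIN_CHUNK_TWIGS = 2**5
MAX_CHUNK_TWIGS = 2**10


def chunk_sizes(count, minimum=MIN_CHUNK_TWIGS, maximum=MAX_CHUNK_TWIGS):
    """Return the sizes (in twigs) of the first `count` chunks allocated."""
    sizes = []
    last = 0  # size of the most recently allocated chunk; 0 means none yet
    for _ in range(count):
        # Start small, then double until the predefined maximum is reached.
        last = minimum if last == 0 else min(last * 2, maximum)
        sizes.append(last)
    return sizes


sizes = chunk_sizes(7)
print(sizes)                   # [32, 64, 128, 256, 512, 1024, 1024]
print(sizes[0] * TWIG_BYTES)   # 384 bytes for the first chunk in this model
print(2**10 * TWIG_BYTES)      # 12288 bytes for the old fixed-size chunk
```

Under these assumed bounds, a trie that never outgrows its first chunk pays a few hundred bytes instead of 12KB, while a busy trie reaches the maximum chunk size after a handful of allocations.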

Merge branch 'alessio/qp-small-alloc' into 'main'

See merge request isc-projects/bind9!10245

Tune min and max chunk size

Before implementing adaptive chunk sizing, it was necessary to ensure
that a chunk could hold up to 48 twigs, but the new logic will size-up
new chunks to ensure that the current allocation can succeed.

We exploit the new logic in two ways:
- We make the minimum chunk size smaller than the old limit of 2^6,
reducing memory consumption.
- We make the maximum chunk size larger, as it has been observed that
it improves resolver performance.

Adaptive memory allocation strategy for qp-tries

qp-tries allocate their nodes (twigs) in chunks to reduce allocator
pressure and improve memory locality. The choice of chunk size presents
a tradeoff: larger chunks benefit qp-tries with many values (as seen
in large zones and resolvers) but waste memory in smaller use cases.

Previously, our fixed chunk size of 2^10 twigs meant that even an
empty qp-trie would consume 12KB of memory, while reducing this size
would negatively impact resolver performance.

This commit implements an adaptive chunking strategy that:
- Tracks the size of the most recently allocated chunk.
- Doubles the chunk size for each new allocation until reaching a
predefined maximum.

This approach effectively balances memory efficiency for small tries
while maintaining the performance benefits of larger chunk sizes for
bigger data structures.

This commit also splits the callback freeing qpmultis into two
phases, one that frees the underlying qptree, and one that reclaims
the qpmulti memory. In order to prevent races between the qpmulti
destructor and chunk garbage collection jobs, the second phase is
protected by reference counting.

fix: doc: Update CVE checklist

Merge branch 'michal/update-cve-checklist' into 'main'

See merge request isc-projects/bind9!10473

Fix duplicate Markdown reference

Commit 7e429463f527ab80d17ddf8c6c3418de7b5fc11e added a second
definition of the "step_asn_send" reference. Make the relevant links
distinct.

Send pre-announcement emails for all ISC projects

There is no reason for the public pre-announcements of security issues
to only be sent for BIND 9. Remove the "BIND 9 only" annotation from
the relevant checklist step as it caused confusion in practice.

Update CVE checklist template

Clarify a confusing step in the CVE checklist.

Merge tag 'v9.21.8'

rem: dev: Clean up the DST cryptographic API

The DST API has been cleaned up: duplicate functions have been squashed
into a single call (the verify and verify2 functions), and a couple of
unused functions have been completely removed (createctx2, computesecret,
paramcompare, and cleanup).

Merge branch 'ondrej/dst_api-cleanup' into 'main'

See merge request isc-projects/bind9!10345

Deprecate max-rsa-exponent-size, always use 4096 instead

The `max-rsa-exponent-size` option could limit the exponents of the RSA
public keys during the DNSSEC verification. Instead of providing
a cryptic (not cryptographic) knob, hardcode the max exponent to
be 4096 bits (the theoretical maximum for DNSSEC).

Cleanup the DST cryptographic API

The DST API has been cleaned up: duplicate functions have been squashed
into a single call (the verify and verify2 functions), and a couple of
unused functions have been completely removed (createctx2, computesecret,
paramcompare, and cleanup).

new: usr: Implement a new 'notify-defer' configuration option

This new option sets a delay (in seconds) to wait before sending
a set of NOTIFY messages for a zone. Whenever a NOTIFY message is
ready to be sent, sending will be deferred for this duration. This
option is not to be confused with the :any:`notify-delay` option.
The default is 0 seconds.

Closes #5259

Merge branch '5259-implement-zone-notify-defer' into 'main'

See merge request isc-projects/bind9!10419

Implement a new 'notify-defer' configuration option

This new option sets the delay, in seconds, to wait before sending
a set of NOTIFY messages for a zone. Whenever a NOTIFY message is
ready to be sent, sending will be deferred for this duration.

Update the dns_zone_setnotifydelay() function's documentation

Add a note that the delay is in seconds.

Delete the unused dns_zone_getnotifydelay() function

The function is unused, delete it.

fix: test: Fix catz system test errors

Merge branch 'aram/catz-system-test-errors-fix' into 'main'

See merge request isc-projects/bind9!10444

Fix more catz system test errors

A quick grep check discovered a couple more errors similar to the
one fixed in the previous commit. Fix them too.

Fix catz system test error

The '|| ret=1' was omitted from the check. This error was introduced in
commit b171cacf4f0123ba96bef6eedfc92dfb608db6b7. Fix it.

chg: test: Mark test_idle_timeout as flaky on FreeBSD 13

The test_idle_timeout check in the "timeouts" system test has been
failing often on FreeBSD 13 AWS hosts.  Adding timestamped debug logging
shows that the time.sleep() calls used in that check are returning
significantly later than asked to on that platform (e.g. after 4 seconds
when just 1 second is requested), breaking the test's timing assumptions
and triggering false positives.  These failures are not an indication of
a bug in named and have not been observed on any other platform.  Mark
the problematic check as flaky, but only on FreeBSD 13, so that other
failure modes are caught appropriately.
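
The kind of measurement that exposes this behavior can be sketched in a few lines. This is not the debug logging that was added to the test; `measure_oversleep()` is a hypothetical helper that only shows how a late return from `time.sleep()` can be quantified.

```python
import time


def measure_oversleep(requested):
    """Return how many seconds time.sleep() overshot the requested duration."""
    start = time.monotonic()
    time.sleep(requested)
    elapsed = time.monotonic() - start
    return elapsed - requested


overshoot = measure_oversleep(0.05)
print(f"asked for 0.050s, overslept by {overshoot:.3f}s")
```

A check that assumes the sleep returns close to the requested time becomes unreliable as soon as the overshoot approaches the timeout being tested, which matches the failure described above (4 seconds elapsed when 1 second was requested).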

Merge branch 'michal/mark-test_idle_timeout-as-flaky-on-freebsd-13' into 'main'

See merge request isc-projects/bind9!10459

Mark test_idle_timeout as flaky on FreeBSD 13

The test_idle_timeout check in the "timeouts" system test has been
failing often on FreeBSD 13 AWS hosts.  Adding timestamped debug logging
shows that the time.sleep() calls used in that check are returning
significantly later than asked to on that platform (e.g. after 4 seconds
when just 1 second is requested), breaking the test's timing assumptions
and triggering false positives.  These failures are not an indication of
a bug in named and have not been observed on any other platform.  Mark
the problematic check as flaky, but only on FreeBSD 13, so that other
failure modes are caught appropriately.

fix: dev: Debug level was ignored when logging to stderr

The debug level (set with the `-d` option) was ignored when running `named` with the `-g` and `-u` options.

Merge branch 'each-fix-debug-level' into 'main'

See merge request isc-projects/bind9!10453

debug level was ignored when logging to stderr

In commit cc167266aa, the -g option was changed so it sets both
named_g_logstderr and also named_g_logflags to use ISO style timestamps
with tzinfo. Together with an error in named_log_setsafechannels(), that
change could cause the debugging level to be ignored.

rem: ci: Drop Ubuntu 20.04 Focal Fossa

Focal-specific ./configure options were moved to Jammy.

Merge branch 'mnowak/drop-ubuntu-focal' into 'main'

See merge request isc-projects/bind9!9899

Revert "Ignore .hypothesis files created by system tests"

This reverts commit f413ddbe5f2edfdeedc41603dcd2afe105ed2844.

Make FreeBSD 12.x part of Community-Maintained platforms

Drop Ubuntu 20.04 Focal Fossa

Focal-specific ./configure options were moved to Jammy.

chg: doc: Set up version for BIND 9.21.9

Merge branch 'michal/set-up-version-for-bind-9.21.9' into 'main'

See merge request isc-projects/bind9!10450

Update BIND version to 9.21.9-dev

Update BIND version for release

new: doc: Prepare documentation for BIND 9.21.8

Merge branch 'michal/prepare-documentation-for-bind-9.21.8' into 'v9.21.8-release'

See merge request isc-private/bind9!796

Reorder release notes

Tweak and reword release notes

Prepare release notes for BIND 9.21.8

Generate changelog for BIND 9.21.8

[CVE-2025-40775] sec: usr: Prevent assertion when processing TSIG algorithm

DNS messages that included a Transaction Signature (TSIG) containing an
invalid value in the algorithm field caused :iscman:`named` to crash
with an assertion failure. This has been fixed. :cve:`2025-40775`

See isc-projects/bind9#5300

Merge branch '5300-confidential-tsig-unknown-alg' into 'v9.21.8-release'

See merge request isc-private/bind9!793

Prevent assertion when processing TSIG algorithm

In a previous change, the "algorithm" value passed to
dns_tsigkey_create() was changed from a DNS name to an integer;
the name was then chosen from a table of known algorithms. A
side effect of this change was that a query using an unknown TSIG
algorithm was no longer handled correctly, and could trigger an
assertion failure. This has been corrected.

The dns_tsigkey struct now stores the signing algorithm
as dst_algorithm_t value 'alg' instead of as a dns_name,
but retains an 'algname' field, which is used only when the
algorithm is DST_ALG_UNKNOWN. This allows the name of the
unrecognized algorithm to be returned in a BADKEY
response.
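
The data layout described above can be modeled in Python. This is a sketch under stated assumptions, not the dns_tsigkey code: `Alg`, `TsigKey`, `make_key()`, and `response_algname()` are hypothetical names, the enum values are arbitrary, and only three well-known HMAC algorithm names are listed.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Alg(Enum):
    """Stand-in for dst_algorithm_t; the numeric values are arbitrary."""
    UNKNOWN = 0
    HMAC_SHA1 = 1
    HMAC_SHA256 = 2
    HMAC_SHA512 = 3


# Table of known algorithms, keyed by lowercase name without the root dot.
KNOWN_ALGORITHMS = {
    "hmac-sha1": Alg.HMAC_SHA1,
    "hmac-sha256": Alg.HMAC_SHA256,
    "hmac-sha512": Alg.HMAC_SHA512,
}


@dataclass
class TsigKey:
    alg: Alg
    algname: Optional[str] = None  # populated only when alg is UNKNOWN


def make_key(wire_algname: str) -> TsigKey:
    """Build a key from the algorithm name seen in a message."""
    name = wire_algname.rstrip(".").lower()
    alg = KNOWN_ALGORITHMS.get(name, Alg.UNKNOWN)
    if alg is Alg.UNKNOWN:
        # Keep the unrecognized name so it can be echoed in the response.
        return TsigKey(alg=alg, algname=name)
    return TsigKey(alg=alg)


def response_algname(key: TsigKey) -> str:
    """Name to put in a reply: stored name if unknown, table lookup otherwise."""
    if key.alg is Alg.UNKNOWN:
        return key.algname
    for name, alg in KNOWN_ALGORITHMS.items():
        if alg is key.alg:
            return name
    raise ValueError("algorithm missing from table")


print(response_algname(make_key("HMAC-SHA256.")))  # hmac-sha256
print(response_algname(make_key("bogus-alg.")))    # bogus-alg
```

The unknown case is handled as ordinary data rather than as an impossible state, which is the property the fix restores.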

fix: usr: Return the correct NSEC3 records for NXDOMAIN responses

The wrong NSEC3 records were sometimes returned as proof that the QNAME
did not exist. This has been fixed.

Closes #5292

Merge branch '5292-wrong-nsec3-chosen-for-no-qname-proof' into 'main'

See merge request isc-projects/bind9!10447

Wrong NSEC3 chosen for NO QNAME proof

When we optimised the closest encloser NSEC3 discovery, the maxlabels
variable was used in the binary search. The updated value was later
used to add the NO QNAME NSEC3, but that block of code needed the
original value. This resulted in the wrong NSEC3 sometimes being
chosen to perform this role.
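
This class of bug, where a search bound overwrites a value that later code still needs, can be shown generically. The Python sketch below is not BIND's NSEC3 code: `search_buggy()`, `search_fixed()`, and the `exists` predicate are hypothetical, and the later NO QNAME step is represented only by the value it would receive.

```python
def search_buggy(exists, maxlabels):
    """Binary search that narrows `maxlabels` itself while searching."""
    minlabels = 1
    best = 0
    while minlabels <= maxlabels:
        mid = (minlabels + maxlabels) // 2
        if exists(mid):
            best = mid
            minlabels = mid + 1
        else:
            maxlabels = mid - 1  # the input value is overwritten here
    # Later code receives the narrowed bound, not the original label count.
    return best, maxlabels


def search_fixed(exists, maxlabels):
    """Same search, but on local bounds so `maxlabels` stays intact."""
    low, high = 1, maxlabels
    best = 0
    while low <= high:
        mid = (low + high) // 2
        if exists(mid):
            best = mid
            low = mid + 1
        else:
            high = mid - 1
    return best, maxlabels


# Assume names with up to 2 labels exist and the query name has 5 labels.
exists = lambda labels: labels <= 2

print(search_buggy(exists, 5))  # (2, 2): the original 5 has been lost
print(search_fixed(exists, 5))  # (2, 5): later code still sees 5
```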

chg: ci: Run linkchecker only on Wednesdays

Some domains tested by linkchecker may think that we connect to them too
often and will refuse connection or reply with an error code, which makes
this job fail. Let's check links only on Wednesdays.

Merge branch 'mnowak/run-linkchecker-only-sometimes' into 'main'

See merge request isc-projects/bind9!10439

Run linkchecker only on Wednesdays

Some domains tested by linkchecker may think that we connect to them too
often and will refuse connection or reply with an error code, which
makes this job fail. Let's check links only on Wednesdays.

chg: ci: Disable linkcheck on www.gnu.org

The check has been failing with the following error for some time:

broken https://www.gnu.org/software/libidn/#libidn2 - HTTPSConnectionPool(host='www.gnu.org', port=443): Max retries exceeded with url: /software/libidn/ (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f5bd4c14590>: Failed to establish a new connection: [Errno 111] Connection refused'))

Merge branch 'mnowak/linkcheck-disable-www-gnu-org' into 'main'

See merge request isc-projects/bind9!10436

Disable linkcheck on www.gnu.org

The check has been failing with the following error for some time:

broken https://www.gnu.org/software/libidn/#libidn2 - HTTPSConnectionPool(host='www.gnu.org', port=443): Max retries exceeded with url: /software/libidn/ (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f5bd4c14590>: Failed to establish a new connection: [Errno 111] Connection refused'))

fix: dev: fix the ksr two-tone test

The two-tone ksr subtest (test_ksr_twotone) depended on the algorithm values of the dnssec-policy keys in named.conf being entered in numerical order. As the algorithms used in the test can be selected randomly, this does not always happen. Sort the dnssec-policy keys by algorithm when adding them to the key list from named.conf.
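
The effect of sorting by algorithm can be sketched in Python. This models the idea only, not the actual change: `PolicyKey` and `sorted_by_algorithm()` are hypothetical names, and the algorithm numbers 8 (RSASHA256) and 13 (ECDSAP256SHA256) are example values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyKey:
    role: str
    algorithm: int


def sorted_by_algorithm(keys):
    """Return keys ordered by algorithm number; ties keep their input order."""
    return sorted(keys, key=lambda key: key.algorithm)


# Keys as they might appear when the algorithms were picked at random.
configured = [
    PolicyKey("ksk", 13),
    PolicyKey("zsk", 13),
    PolicyKey("ksk", 8),
    PolicyKey("zsk", 8),
]

for key in sorted_by_algorithm(configured):
    print(key.algorithm, key.role)
```

Because Python's `sorted()` is stable, the ksk/zsk order within each algorithm is preserved, so the result no longer depends on which algorithm happened to be listed first.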

Closes #5286

Merge branch '5286-ksr-two-tone-test-only-work-by-luck' into 'main'

See merge request isc-projects/bind9!10395

Don't depend on keys being sorted

Extract each section of the bundle and check that the expected
records are there. The old code assumed that the records in
each section were in a particular order, which didn't happen in
practice.
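
One way to check a section without assuming record order is to compare multisets. This is a hedged sketch, not the test's code: `same_records()` is a hypothetical helper and the tuples are placeholder records.

```python
from collections import Counter


def same_records(actual, expected):
    """Compare two record lists as multisets, ignoring their order."""
    return Counter(actual) == Counter(expected)


expected = [("DNSKEY", 257, 8), ("DNSKEY", 257, 13), ("DNSKEY", 256, 13)]
received = [("DNSKEY", 256, 13), ("DNSKEY", 257, 8), ("DNSKEY", 257, 13)]

print(received == expected)              # False: a list comparison is order-sensitive
print(same_records(received, expected))  # True: same records, different order
```

Using `Counter` rather than `set` keeps duplicates significant, so a record that appears twice is not treated as equal to one that appears once.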

fix: dev: fix the error handling of put_yamlstr calls

The return value was sometimes being ignored when it shouldn't
have been.

Closes #5301

Merge branch '5301-cid-550216-remove-dead-code' into 'main'

See merge request isc-projects/bind9!10432

Fix the error handling of put_yamlstr calls

The return value was sometimes being ignored when it shouldn't
have been.

chg: ci: Revise merge request pipeline job triggering rules

Over the past few years, some of the initial decisions made about which
GitLab CI jobs to run for all merge requests and which of them to run
just for scheduled/web-triggered pipelines turned out to be less than
ideal in practice: test coverage was found to be too lax in some areas
and on the other hand unnecessarily repetitive in others.  For example,
compilation failures for certain build types that are not exercised for
every merge request (e.g. FIPS-enabled builds) turned out to be much
more common in practice than e.g. test failures happening only on a
subset of releases of a given Linux distribution.

To limit excessive resource use while retaining broad test coverage,
adjust GitLab CI job triggering rules for merge request pipelines as
follows:

- run all possible build jobs for every merge request; compilation
failures triggered for build flavors that were only tested in
scheduled pipelines turned out to be surprisingly commonplace and
became a nuisance over time, particularly given that the run times
of build jobs are much lower than those of test jobs,

- for every merge request, run at least one system & unit test job for
each build flavor (e.g. sanitizer-enabled, FIPS-enabled,
out-of-tree, tarball-based, etc.),

- limit the amount of test jobs run for each distinct operating
system; for example, only run system & unit test jobs for Ubuntu
24.04 Noble Numbat in merge request pipelines, skipping those for
Ubuntu 22.04 Jammy Jellyfish and Ubuntu 20.04 Focal Fossa (while
still running them in other pipeline types, e.g. in scheduled
pipelines),

- ensure every merge request is tested on Oracle Linux 8, which is the
operating system with the oldest package versions out of the systems
that are still supported by this BIND 9 branch,

- decrease the number of test jobs run with sanitizers enabled while
still testing with both ASAN and TSAN and both GCC and Clang for
every merge request.

These changes do not affect the set of jobs created for any other
pipeline type (triggered by a schedule, by a GitLab API call, by the web
interface, etc.); only merge request pipelines are affected.

---

Since understanding the impact of this MR just by looking at the diff is
arguably challenging, I prepared some tables showing which jobs are
currently triggered for every merge request and what the new state of
things will be after this MR gets merged.

**Legend:**

  - :chart_with_upwards_trend: - job was *not* run for every merge
    request before, but will be

  - :chart_with_downwards_trend: - job was run for every merge request
    before, but will *not* be any longer

| Change | Job | Stage | Before | After | cff39d32455 | 2f1995c7136 / 4ad8c86cf2b |
| ------ | --- | ----- | ------ | ----- | ----------- | ----------- |
| | `docs` |  `docs` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | `docs:tarball` |  `docs` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | `clang:asan` |  `build` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | `clang:bookworm:amd64` |  `build` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | `clang:freebsd13:amd64` |  `build` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | `clang:freebsd14:amd64` |  `build` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | `clang:openbsd:amd64` |  `build` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | `clang:tsan` |  `build` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| :chart_with_upwards_trend:| `gcc:8fips:amd64` |  `build` | :x: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| :chart_with_upwards_trend:| `gcc:9fips:amd64` |  `build` | :x: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | `gcc:alpine3.21:amd64` |  `build` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | `gcc:asan` |  `build` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | `gcc:bookworm:amd64` |  `build` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | `gcc:bookworm:amd64cross32` |  `build` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | `gcc:bookworm:rbt:amd64` |  `build` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | `gcc:focal:amd64` |  `build` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | `gcc:jammy:amd64` |  `build` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | `gcc:noble:amd64` |  `build` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | `gcc:oraclelinux8:amd64` |  `build` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | `gcc:oraclelinux9:amd64` |  `build` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | `gcc:ossl3:sid:amd64` |  `build` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | `gcc:out-of-tree` |  `build` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | `gcc:sid:amd64` |  `build` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | `gcc:tarball` |  `build` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | `gcc:tarball:nosphinx` |  `build` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | `gcc:tsan` |  `build` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | `gcc:tumbleweed:amd64` |  `build` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | `cross-version-config-tests` |  `system` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | `respdiff` |  `system` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | `respdiff-third-party` |  `system` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | `respdiff:asan` |  `system` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | `respdiff:tsan` |  `system` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| :chart_with_downwards_trend:| `system:clang:asan` |  `system` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :x: |
| :chart_with_downwards_trend:| `system:clang:bookworm:amd64` |  `system` | :white_check_mark: | :x: | :x: | :x: |
| :chart_with_downwards_trend:| `system:clang:freebsd13:amd64` |  `system` | :white_check_mark: | :x: | :x: | :x: |
| | `system:clang:freebsd14:amd64` |  `system` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | `system:clang:tsan` |  `system` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| :chart_with_upwards_trend:| `system:gcc:8fips:amd64` |  `system` | :x: | :x: | :white_check_mark: | :white_check_mark: |
| | `system:gcc:9fips:amd64` |  `system` | :x: | :white_check_mark: | :x: | :x: |
| | `system:gcc:alpine3.21:amd64` |  `system` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | `system:gcc:asan` |  `system` | :white_check_mark: | :x: | :x: | :white_check_mark: |
| | `system:gcc:bookworm:amd64` |  `system` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | `system:gcc:bookworm:rbt:amd64` |  `system` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| :chart_with_downwards_trend:| `system:gcc:focal:amd64` |  `system` | :white_check_mark: | :x: | :x: | :x: |
| :chart_with_downwards_trend:| `system:gcc:jammy:amd64` |  `system` | :white_check_mark: | :x: | :x: | :x: |
| | `system:gcc:noble:amd64` |  `system` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| :chart_with_downwards_trend:| `system:gcc:oraclelinux8:amd64` |  `system` | :white_check_mark: | :x: | :x: | :x: |
| | `system:gcc:oraclelinux9:amd64` |  `system` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | `system:gcc:ossl3:sid:amd64` |  `system` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| :chart_with_upwards_trend:| `system:gcc:out-of-tree` |  `system` | :x: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| :chart_with_downwards_trend:| `system:gcc:sid:amd64` |  `system` | :white_check_mark: | :x: | :x: | :x: |
| :chart_with_upwards_trend:| `system:gcc:tarball` |  `system` | :x: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| :chart_with_downwards_trend:| `system:gcc:tsan` |  `system` | :white_check_mark: | :x: | :x: | :x: |
| | `system:gcc:tumbleweed:amd64` |  `system` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| :chart_with_downwards_trend:| `unit:clang:asan` |  `unit` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :x: |
| :chart_with_downwards_trend:| `unit:clang:bookworm:amd64` |  `unit` | :white_check_mark: | :x: | :x: | :x: |
| :chart_with_downwards_trend:| `unit:clang:freebsd13:amd64` |  `unit` | :white_check_mark: | :x: | :x: | :x: |
| | `unit:clang:freebsd14:amd64` |  `unit` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | `unit:clang:openbsd:amd64` |  `unit` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | `unit:clang:tsan` |  `unit` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| :chart_with_upwards_trend:| `unit:gcc:8fips:amd64` |  `unit` | :x: | :x: | :white_check_mark: | :white_check_mark: |
| | `unit:gcc:9fips:amd64` |  `unit` | :x: | :white_check_mark: | :x: | :x: |
| | `unit:gcc:alpine3.21:amd64` |  `unit` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | `unit:gcc:asan` |  `unit` | :white_check_mark: | :x: | :x: | :white_check_mark: |
| | `unit:gcc:bookworm:amd64` |  `unit` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | `unit:gcc:bookworm:rbt:amd64` |  `unit` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| :chart_with_downwards_trend:| `unit:gcc:focal:amd64` |  `unit` | :white_check_mark: | :x: | :x: | :x: |
| :chart_with_downwards_trend:| `unit:gcc:jammy:amd64` |  `unit` | :white_check_mark: | :x: | :x: | :x: |
| | `unit:gcc:noble:amd64` |  `unit` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| :chart_with_downwards_trend:| `unit:gcc:oraclelinux8:amd64` |  `unit` | :white_check_mark: | :x: | :x: | :x: |
| | `unit:gcc:oraclelinux9:amd64` |  `unit` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | `unit:gcc:ossl3:amd64` |  `unit` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| :chart_with_upwards_trend:| `unit:gcc:out-of-tree` |  `unit` | :x: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| :chart_with_downwards_trend:| `unit:gcc:sid:amd64` |  `unit` | :white_check_mark: | :x: | :x: | :x: |
| :chart_with_upwards_trend:| `unit:gcc:tarball` |  `unit` | :x: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| :chart_with_downwards_trend:| `unit:gcc:tsan` |  `unit` | :white_check_mark: | :x: | :x: | :x: |
| | `unit:gcc:tumbleweed:amd64` |  `unit` | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |

And a short statistical summary of the changes proposed:

| Stage | Before | After | Diff |
| ----- | ------ | ----- | ---- |
| `docs` | 2 | 2 | **0** |
| `build` | 23 | 25 | **+2** |
| `system` | 23 | 18 | **-5** |
| `unit` | 19 | 14 | **-5** |
| **TOTAL** | **67** | **59** | **-8** |

Mattermost thread (sparked by @pspacek):
https://mattermost.isc.org/isc/pl/z6nymnu4m3dhzr3rxtjkzzgk7a

Merge branch 'michal/revise-ci-job-triggering-rules' into 'main'

See merge request isc-projects/bind9!10349

Revise merge request pipeline job triggering rules

Over the past few years, some of the initial decisions made about which
GitLab CI jobs to run for all merge requests and which of them to run
just for scheduled/web-triggered pipelines turned out to be less than
ideal in practice: test coverage was found to be too lax in some areas
and on the other hand unnecessarily repetitive in others.  For example,
compilation failures for certain build types that are not exercised for
every merge request (e.g. FIPS-enabled builds) turned out to be much
more common in practice than e.g. test failures happening only on a
subset of releases of a given Linux distribution.

To limit excessive resource use while retaining broad test coverage,
adjust GitLab CI job triggering rules for merge request pipelines as
follows:

  - run all possible build jobs for every merge request; compilation
    failures triggered for build flavors that were only tested in
    scheduled pipelines turned out to be surprisingly commonplace and
    became a nuisance over time, particularly given that the run times
    of build jobs are much lower than those of test jobs,

  - for every merge request, run at least one system & unit test job for
    each build flavor (e.g. sanitizer-enabled, FIPS-enabled,
    out-of-tree, tarball-based, etc.),

  - limit the amount of test jobs run for each distinct operating
    system; for example, only run system & unit test jobs for Ubuntu
    24.04 Noble Numbat in merge request pipelines, skipping those for
    Ubuntu 22.04 Jammy Jellyfish and Ubuntu 20.04 Focal Fossa (while
    still running them in other pipeline types, e.g. in scheduled
    pipelines),

  - ensure every merge request is tested on Oracle Linux 8, which is the
    operating system with the oldest package versions out of the systems
    that are still supported by this BIND 9 branch,

  - decrease the number of test jobs run with sanitizers enabled while
    still testing with both ASAN and TSAN and both GCC and Clang for
    every merge request.

These changes do not affect the set of jobs created for any other
pipeline type (triggered by a schedule, by a GitLab API call, by the web
interface, etc.); only merge request pipelines are affected.

rem: ci: Drop OpenBSD from the CI

With the ongoing process of moving CI workloads to AWS, OpenBSD poses a
challenge, as there is no OpenBSD AMI image in the AWS catalog. Building
our image from scratch is disproportionately complicated, given that
OpenBSD is not a common deployment platform for BIND 9. Otherwise,
OpenBSD stays at the "Best-Effort" level of support.

Merge branch 'mnowak/drop-openbsd-from-ci' into 'main'

See merge request isc-projects/bind9!10375

Drop OpenBSD from the CI

With the ongoing process of moving CI workloads to AWS, OpenBSD poses a
challenge, as there is no OpenBSD AMI image in the AWS catalog. Building
our image from scratch is disproportionately complicated, given that
OpenBSD is not a common deployment platform for BIND 9. Otherwise,
OpenBSD stays at the "Best-Effort" level of support.

fix: dev: Call rcu_barrier earlier in the destructor

If a call_rcu thread is running, there is a possible race condition
where the destructors run before all call_rcu callbacks have finished
running. This can happen, for example, if the call_rcu callback tries to
log something after the logging context has been torn down.

In !10394, we tried to counter this by explicitly creating a call_rcu
thread and shutting it down before running the destructors, but it is
possible for things to "slip" and end up on the default call_rcu thread.

As a quickfix, this commit moves an rcu_barrier() that was in the mem
context destructor earlier, so that it "protects" all libisc
destructors.

Closes #5296

Merge branch '5296-join-rcu-thread-on-shutdown' into 'main'

See merge request isc-projects/bind9!10423

Call rcu_barrier earlier in the destructor

If a call_rcu thread is running, there is a possible race condition
where the destructors run before all call_rcu callbacks have finished
running. This can happen, for example, if the call_rcu callback tries to
log something after the logging context has been torn down.

In !10394, we tried to counter this by explicitly creating a call_rcu
thread and shutting it down before running the destructors, but it is
possible for things to "slip" and end up on the default call_rcu thread.

As a quickfix, this commit moves an rcu_barrier() that was in the mem
context destructor earlier, so that it "protects" all libisc
destructors.

chg: test: Rewrite kasp system test to pytest (4)

These tests do not easily fit in the standard test case framework, so they go into their own suite.
- zsk retired case
- checkds cases
- reload/restart
- inheritance tests

Merge branch 'matthijs-pytest-rewrite-kasp-system-test-4' into 'main'

See merge request isc-projects/bind9!10278

Convert kasp inheritance tests

These tests ensure that if dnssec-policy is set on a higher level, the
zone is still signed (or unsigned) as expected. Or if a higher level
has an override, the new policy is honored as expected.

Convert reload/restart kasp test case

This test checks that the SOA SERIAL and TTL are adjusted correctly
after a reload/restart.

Convert kasp checkds test cases to pytest

This converts the checkds test cases that deal with the 'rndc checkds'
command and setting the 'DSPublish' and 'DSRemoved' metadata.

Convert kasp zsk retired test case

This test case does not easily fit in the standard test case framework,
so it goes into its own suite.

new: usr: Implement tcp-primaries-timeout

The new `tcp-primaries-timeout` configuration option works the same way
as the older `tcp-initial-timeout` option, but applies only to the TCP
connections made to the primary servers, so that the timeout value can
be set separately for them. By default, it's set to 150, which is 15
seconds.
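
As a rough illustration of the units involved, the sketch below converts an option value to seconds. It assumes, as the default above implies, that the value is expressed in units of 100 milliseconds. The helper name is hypothetical and is not part of BIND.

```python
# Hypothetical helper, not part of BIND: convert a timeout option value
# expressed in units of 100 milliseconds into seconds.
UNITS_PER_SECOND = 10  # assumption: one unit is 100 ms, so 150 -> 15 s


def timeout_option_to_seconds(value: int) -> float:
    """Return the timeout in seconds for a value given in 100 ms units."""
    if value < 0:
        raise ValueError("timeout must not be negative")
    return value / UNITS_PER_SECOND


# The default mentioned above.
print(timeout_option_to_seconds(150))
```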

Closes #3649

Merge branch '3649-configurable-xfr-tcp-timeouts' into 'main'

See merge request isc-projects/bind9!9376

Fix delv default timeout value

The isc_nm_getinitialtimeout() function (and also the previously used
isc_nm_gettimeouts() function) returns timeout value(s) in milliseconds,
while the dns_request_create() function expects timeout values in
seconds. Fix the bug by dividing the timeout value by MS_PER_SEC.
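
The unit mismatch can be sketched as follows. This is a Python illustration of the conversion only, not the C code in delv; the variable names and the sample value are assumptions made for the example.

```python
MS_PER_SEC = 1000  # milliseconds per second


def ms_to_seconds(timeout_ms: int) -> int:
    """Convert a millisecond timeout to whole seconds (integer division)."""
    return timeout_ms // MS_PER_SEC


# A getter that reports milliseconds (hypothetical value for illustration).
initial_timeout_ms = 30_000

# Passing initial_timeout_ms unchanged to an API that expects seconds would
# request a 30000-second timeout; converting first gives the intended value.
timeout_seconds = ms_to_seconds(initial_timeout_ms)
print(timeout_seconds)
```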

There is no added test, because it turns out delv doesn't support
setting custom timeout values (as opposed to what is suggested in
its man page). Tests should be added later when the '+timeout=T'
option is implemented.

Separate the single setter/getter functions for TCP timeouts

Previously, all kinds of TCP timeouts shared a single getter function
and a single setter function. Give each timeout its own getter/setter
functions, because in the majority of cases only one is required at a
time, and it's not optimal to expand those functions every time a new
timeout value is implemented.

Fix the notify system test after the newly applied timeout value

Since notify messages now use the configured 'tcp-initial-timeout'
connect timeout value, the existing "checking notify retries expire
within 30 seconds" check in the "notify" system test is failing. Set
the 'tcp-initial-timeout' option for ns3 to the previously hardcoded
value of 15 seconds for the test to pass successfully.

Use the configured TCP connect timeout in checkds_send_toaddr()

The checkds_send_toaddr() function uses hardcoded timeout values
for both UDP and TCP, however, with TCP named has configurable
timeout values. Slightly refactor the timeouts calculation part
and use the configured 'tcp-initial-timeout' value as the connect
timeout.

Use the configured TCP connect timeout in notify_send_toaddr()

The notify_send_toaddr() function uses hardcoded timeout values
for both UDP and TCP, however, with TCP named has configurable
timeout values. Slightly refactor the timeouts calculation part
and use the configured 'tcp-initial-timeout' value as the connect
timeout.

Implement tcp-primaries-timeout

The new 'tcp-primaries-timeout' configuration option works the same way
as the existing 'tcp-initial-timeout' option, but applies only to the
TCP connections made to the primary servers, so that the timeout value
can be set separately for them. The default is 15 seconds.

Also, while adapting zone.c's code to support the new option, lightly
refactor the way UDP timeouts are calculated by using definitions
instead of hardcoded values.

chg: test: Rewrite kasp system test to pytest (3)

Write python-based tests for the many test cases from the kasp system test with the same pattern.

Merge branch 'matthijs-pytest-rewrite-kasp-system-test-3' into 'main'

See merge request isc-projects/bind9!10268

Parametrize the default kasp test cases

Make use of pytest.mark.parametrize to split up the many default kasp
test cases into separate tests.

Convert keystore and rumoured kasp test cases

For 'keystore.kasp', a setting 'key-directories' is used. If set, this
will expect a list of two directories, the first one is where the KSKs
will be stored, the second in the list is the ZSK key directory. This
may be expanded in the future to test more complex key storage cases.

The 'rumoured.kasp' zone is unusual: the key timings can never match
those key states. But it is a regression test for an early bug,
so we convert it, but skip the expected key times check.

Convert more kasp test cases to pytest

These test cases follow the same pattern as many others, but all require
some additional checks. These are set in "additional-tests".

The "zsk-missing.autosign" zone is handled specially, as it expects the
KSK to sign the SOA RRset (because the ZSK is unavailable).

The kasp/ns3/setup.sh script is updated so the SyncPublish is not set
(named will initialize it correctly). For the test zones that have
missing private key files we do need to set the expected key timing
metadata.

Remove the counterparts for the newly added test from the kasp shell
tests script.

Update kasp check_signatures for dnssec-policy

The check_signatures code was initially created to be suitable for
the ksr system test, to test the Offline KSK feature. For that, a
key is expected to be signing if the current time is between
the timing metadata Active and Retired.

With dnssec-policy, the key timing metadata is indicative, the key
states determine the actual signing behavior.

Update the check_signatures function so that by default the signing
is derived from the key states (ksigning and zsigning). Add an
argument 'offline_ksk'; if set, make sure that zsigning is set
if the current time is between the Active and Retired timing metadata,
and for ksigning just use the timing metadata (as the key is offline,
we cannot check the key states).

Another (upcoming) test case is where key files are missing. When the
ZSK private key file is missing, the KSK takes over. Add an argument
'zsk_missing', when set to True the expected zone signing (zsigning)
is reversed.
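
The decision logic described above might look roughly like the sketch below. It is not the actual check_signatures code: the `Key` class, its fields, and the `expected_signing` function are hypothetical, and using the timing window for both roles in the offline case is one possible reading of the description.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple


@dataclass
class Key:
    """Hypothetical stand-in for a key with roles, states and timings."""

    is_ksk: bool
    is_zsk: bool
    ksigning: bool = False  # key state: signs the DNSKEY RRset
    zsigning: bool = False  # key state: signs zone data
    active: Optional[datetime] = None
    retired: Optional[datetime] = None


def in_active_window(key: Key, now: datetime) -> bool:
    """True if 'now' falls between the Active and Retired timings."""
    if key.active is None or now < key.active:
        return False
    return key.retired is None or now < key.retired


def expected_signing(
    key: Key, now: datetime, offline_ksk: bool = False, zsk_missing: bool = False
) -> Tuple[bool, bool]:
    """Return the expected (ksigning, zsigning) pair for one key."""
    if offline_ksk:
        # The key states cannot be checked when the KSK is offline,
        # so rely on the timing metadata instead.
        window = in_active_window(key, now)
        ksigning = key.is_ksk and window
        zsigning = key.is_zsk and window
    else:
        # With dnssec-policy, the key states determine signing behavior.
        ksigning = key.ksigning
        zsigning = key.zsigning
    if zsk_missing:
        # The KSK takes over zone signing, so the expectation is reversed.
        zsigning = not zsigning
    return ksigning, zsigning


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
ksk = Key(is_ksk=True, is_zsk=False, ksigning=True)
print(expected_signing(ksk, now, zsk_missing=True))
```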

Two more kasp test cases converted to pytest

The zone 'pregenerated.kasp' is a case where there already exist more
keys than required. For this we set the 'pregenerated' setting. This
will change the 'keydir_to_keylist' function behavior: Only keys in use
are considered. A key is in use if all of the states are either
undefined, or set to 'hidden'.

The 'some-keys.kasp' zone is similar to 'pregenerated.kasp', except
only some keys have been pregenerated.

Convert many kasp test cases to pytest

Write python-based tests for the many test cases from the kasp system
test. These test cases all follow the same pattern:

- Wait until the zone is signed.
- Check the keys from the key-directory against expected properties.
- Set the expected key timings derived from when the key was created.
- Check the key timing metadata against expected timings.
- Check the 'rndc dnssec -status' output.
- Check the apex is signed correctly.
- Check a subdomain is signed correctly.
- Verify that the zone is DNSSEC correct.

Remove the counterparts for the newly added test from the kasp shell
tests script.

fix: dev: Fix a data race in qpcache_addrdataset()

The 'qpnode->nsec' structure member isn't protected by a lock and
there's a data race between the reading and writing parts in the
qpcache_addrdataset() function. Use a node read lock for accessing
'qpnode->nsec' in qpcache_addrdataset(). Add an additional
'qpnode->nsec != DNS_DB_NSEC_HAS_NSEC' check under a write lock
to be sure that no other competing thread changed it in the time
when the read lock is unlocked and a write lock is not acquired
yet.

Closes #5285

Merge branch '5285-data-race-in-qpcache_addrdataset' into 'main'

See merge request isc-projects/bind9!10397

Fix a data race in qpcache_addrdataset()

The 'qpnode->nsec' structure member isn't protected by a lock and
there's a data race between the reading and writing parts in the
qpcache_addrdataset() function. Use a node read lock for accessing
'qpnode->nsec' in qpcache_addrdataset(). Add an additional
'qpnode->nsec != DNS_DB_NSEC_HAS_NSEC' check under a write lock
to be sure that no other competing thread changed it in the time
when the read lock is unlocked and a write lock is not acquired
yet.
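
The check-then-recheck pattern described here can be illustrated with a small sketch. It is a Python analogy rather than the C code in qpcache: a single `threading.Lock` stands in for the node read/write lock, and the `Node` class, the `mark_has_nsec` function, and the constants are hypothetical.

```python
import threading

HAS_NSEC = "has-nsec"  # stand-in for DNS_DB_NSEC_HAS_NSEC
NORMAL = "normal"


class Node:
    """Hypothetical node; one plain lock stands in for the node rwlock."""

    def __init__(self) -> None:
        self.lock = threading.Lock()
        self.nsec = NORMAL
        self.updates = 0  # counts how often the state was actually changed


def mark_has_nsec(node: Node) -> None:
    # First look at the field under the lock (the "read lock" phase).
    with node.lock:
        needs_update = node.nsec != HAS_NSEC
    if not needs_update:
        return
    # The lock was dropped above, so another thread may have updated the
    # node in the meantime. Re-check under the lock (the "write lock"
    # phase) before modifying anything.
    with node.lock:
        if node.nsec != HAS_NSEC:
            node.nsec = HAS_NSEC
            node.updates += 1


node = Node()
threads = [threading.Thread(target=mark_has_nsec, args=(node,)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(node.nsec, node.updates)
```

Without the second check, two threads that both saw the old value during the first phase would each apply the update.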

fix: usr: Fix a serve-stale issue with a delegated zone

When the ``stale-answer-client-timeout 0`` option was enabled, it could be ignored
when resolving a zone which is a delegation of an authoritative zone belonging
to the resolver. This has been fixed.

Closes #5275

Merge branch '5275-stale-answer-client-timeout-0-and-delegation-fix' into 'main'

See merge request isc-projects/bind9!10381

Test 'stale-answer-client-timeout 0' with a delegation

Add a new test which gets an answer for a delegated zone, then
checks whether the 'stale-answer-client-timeout 0' mode (i.e. the
'stalefirst' mode) works for it.

Fix a serve-stale issue with a delegated zone

When 'stale-answer-client-timeout' is 0, named is allowed to return
a stale answer immediately, while also initiating a new query to get
the real answer. This mode is activated in ns__query_start() by setting
the 'qctx->options.stalefirst' option to 'true' before calling the
query_lookup() function, but not when the zone is known to be
authoritative to the server. When the zone is authoritative, and
query_lookup() finds out that the requested name is a delegation,
then before proceeding with the query, named tries to look it up
in the cache first. Here comes the issue that it doesn't consider
enabling 'qctx->options.stalefirst' in this case, and so the
'stale-answer-client-timeout 0' setting doesn't work for those
delegated zones - instead of immediately returning the stale answer
(if it exists), named tries to resolve it.

Fix this issue by enabling 'qctx->options.stalefirst' in the
query_zone_delegation() function just before named looks up the name
in the cache using a new query_lookup() call. Also, if nothing was
found in the cache, don't initiate another query_lookup() from inside
query_notfound(), and let query_notfound() do its work, i.e. it will
call query_delegation() for further processing.

fix: usr: Fix EDNS yaml output

`dig` was producing invalid YAML when displaying some EDNS options. This has been corrected.

Several other improvements have been made to the display of EDNS option data:
- We now use the correct name for the UPDATE-LEASE option, which was previously displayed as "UL", and split it into separate LEASE and LEASE-KEY components in YAML mode.
- Human-readable durations are now displayed as comments in YAML mode so as not to interfere with machine parsing.
- KEY-TAG options are now displayed as an array of integers in YAML mode.
- EDNS COOKIE options are displayed as separate CLIENT and SERVER components, and cookie STATUS is a retrievable variable in YAML mode.

Closes #5014

Merge branch '5014-improve-edns-yaml-processing' into 'main'

See merge request isc-projects/bind9!9695

Fix a typo in a test description

The test description "checking delv -c CH is ignored, and
treated like IN" in digdelv was garbled.

Check EDNS CLIENT-TAG and SERVER-TAG are emitted using valid YAML

Check that when an EDNS CLIENT-TAG or EDNS SERVER-TAG option is
present in the message, the emitted YAML is valid.

Check EDNS EXPIRE option is emitted using valid YAML

Check that when an EDNS EXPIRE option is present in the message,
the emitted YAML is valid.

Check EDNS CLIENT-SUBNET option is emitted using valid YAML

Check that when there is an EDNS CLIENT-SUBNET option in the
message, the emitted YAML is valid.

Split EDNS COOKIE YAML into separate parts

Split the YAML display of the EDNS COOKIE option into CLIENT and SERVER
parts. The STATUS of the EDNS COOKIE in the reply is now a YAML element
rather than a comment.

Fix EDNS TCP-KEEPALIVE option YAML output

There was missing white space between the option name and its value.

Fix EDNS LLQ option YAML output

The EDNS LLQ option was not being emitted as valid YAML. Correct
the output to be valid YAML with each field of the LLQ being
individually selectable.

Change the EDNS KEY-TAG YAML output format

When using YAML, print the EDNS KEY-TAG as an array of integers
for easier machine parsing. Check the validity of the YAML output.
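
A minimal sketch of the idea, assuming the option data is a sequence of 16-bit key tags in network byte order (as specified in RFC 8145). The function names and the exact output layout are illustrative and are not taken from dig.

```python
import struct
from typing import List


def parse_key_tags(option_data: bytes) -> List[int]:
    """Decode EDNS KEY-TAG option data into a list of key tags."""
    if len(option_data) % 2 != 0:
        raise ValueError("KEY-TAG option data must have an even length")
    count = len(option_data) // 2
    # "!" selects network byte order, "H" an unsigned 16-bit integer.
    return list(struct.unpack(f"!{count}H", option_data))


def key_tags_to_yaml(option_data: bytes) -> str:
    """Render the key tags as a YAML flow sequence of integers."""
    tags = ", ".join(str(tag) for tag in parse_key_tags(option_data))
    return f"KEY-TAG: [{tags}]"


print(key_tags_to_yaml(bytes([0x4F, 0x66, 0x00, 0x01])))
```

Emitting a flow sequence lets a YAML parser return the tags as a list of integers instead of a single string that needs further splitting.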

Use YAML comments for durations rather than parentheses

This will allow the values to be parsed using standard YAML processing
tools, and still provide the value in a human-friendly form.

Change the name and YAML format of EDNS UL

The official EDNS option name for "UL" is "UPDATE-LEASE".  We now
emit "UPDATE-LEASE" instead of "UL", when printing messages, but
"UL" has been retained as an alias on the command line.

Update leases consist of 1 or 2 values, LEASE and KEY-LEASE.  These
components are now emitted separately so they can be easily extracted
from YAML output.  Tests have been added to check YAML correctness.

Add YAML escaping where needed

When rendering text, such as domain names or the EXTRA-TEXT
field of the EDE option, backslashes and quotation marks must
be escaped to ensure that the emitted message is valid YAML.
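
The escaping rule can be sketched as below, assuming the text is emitted as a double-quoted YAML scalar. The function name is hypothetical and this is not the code used by BIND; it handles only the two characters named above.

```python
def yaml_double_quote(text: str) -> str:
    """Return 'text' as a double-quoted YAML scalar.

    Only backslashes and double quotes are handled here; control
    characters would need further escaping in a complete implementation.
    """
    # Escape backslashes first so the backslashes added for the quotes
    # are not escaped a second time.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


print(yaml_double_quote('say "hi"'))
print(yaml_double_quote(r"a\.b.example."))
```

The order of the two replacements matters: escaping quotes first would cause the newly added backslashes to be doubled by the second step.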