[CVE-2025-40777] sec: usr: Fix a possible assertion failure when using the 'stale-answer-client-timeout 0' option
In specific circumstances the :iscman:`named` resolver process could
terminate unexpectedly when stale answers were enabled and the
``stale-answer-client-timeout 0`` configuration option was used.
This has been fixed.
See isc-projects/bind9#5372
Merge branch '5372-security-serve-stale-crash-on-insist-unreachable' into 'v9.21.10-release'
Aram Sargsyan [Wed, 18 Jun 2025 13:32:03 +0000 (13:32 +0000)]
Reset DNS_DBFIND_STALETIMEOUT in query_lookup()
If ns__query_start() is called because of a chained query (e.g.
after encountering a CNAME), a previously set DNS_DBFIND_STALETIMEOUT
flag on the query's 'dboptions' field can cause an assertion
failure if the new query's 'stalefirst' value is not true (e.g. if the
target qname is an authoritative zone for the server). Reset the
DNS_DBFIND_STALETIMEOUT flag in the query_lookup() function before
evaluating the 'stalefirst' value, and make sure to assign a fresh
value to the 'stalefirst' flag instead of conditionally assigning it
only if the value is 'true'.
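The pattern described above can be pictured with a minimal Python sketch. The flag value, the `Query` class, and the `query_lookup()` signature are hypothetical stand-ins, not BIND's C code. The sketch also assumes the flag is set again only when 'stalefirst' is true. Only the idea (clear the stale flag first, then assign 'stalefirst' unconditionally) comes from the commit message.

```python
from dataclasses import dataclass

# Hypothetical bit value; the real DNS_DBFIND_STALETIMEOUT is defined in BIND's C headers.
DNS_DBFIND_STALETIMEOUT = 0x1000


@dataclass
class Query:
    dboptions: int = 0
    stalefirst: bool = False


def query_lookup(query: Query, want_stale_first: bool) -> None:
    # Drop any flag left over from a previous (chained) lookup.
    query.dboptions &= ~DNS_DBFIND_STALETIMEOUT
    # Always assign a fresh value instead of only setting it when true.
    query.stalefirst = want_stale_first
    # Assumption of this sketch: the flag is re-set only for stale-first lookups.
    if query.stalefirst:
        query.dboptions |= DNS_DBFIND_STALETIMEOUT


# First lookup (e.g. the CNAME owner name) is answered stale-first.
q = Query()
query_lookup(q, want_stale_first=True)
# Chained lookup for a CNAME target that lives in an authoritative zone.
query_lookup(q, want_stale_first=False)
```

After the second call, neither the flag nor 'stalefirst' carries over from the first lookup, which is the state the assertion in the C code expects.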
dangerfile.py checked for new configure switches in `configure.ac`;
these were annotated with "# [pairwise:..." in a leading line. Meson
reads those from `meson_options.txt` instead.
fix: usr: Fix the default interface-interval from 60s to 60m
When the interface-interval parser was changed from a uint32 parser to
a duration parser, the default value stayed at the plain number `60`, which
now means 60 seconds instead of 60 minutes. The documentation also
incorrectly states that the value is in minutes. That has been fixed.
Closes #5246
Merge branch '5246-fix-default-interface-interval' into 'main'
Ondřej Surý [Tue, 18 Mar 2025 13:05:39 +0000 (14:05 +0100)]
Fix the default interface-interval docs and default value
When the interface-interval parser was changed from a uint32 parser to
a duration parser, the default value stayed at plain 60, which now means 60
seconds instead of 60 minutes. Fix the default value and the
documentation to match the reality.
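A rough Python sketch of why the bare number changed meaning. The `parse_duration()` helper and its unit table are hypothetical and much simpler than BIND's real duration parser, and writing the corrected default as `60m` is an assumption made for illustration.

```python
# Hypothetical, simplified duration parser; BIND's real parser is written in C
# and accepts more formats than this sketch.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


def parse_duration(text: str) -> int:
    """Return the duration in seconds; a bare number means seconds."""
    text = text.strip().lower()
    if text and text[-1] in UNIT_SECONDS:
        return int(text[:-1]) * UNIT_SECONDS[text[-1]]
    return int(text)


old_default = parse_duration("60")   # bare number: 60 seconds
new_default = parse_duration("60m")  # explicit unit: 60 minutes
```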
Colin Vidal [Mon, 30 Jun 2025 12:51:20 +0000 (14:51 +0200)]
new: test: add startup root DNSKEY refresh system test
Root trust anchors are automatically updated as described in RFC5011.
Add a system test which ensures the root DNSKEYs are always queried by
named during startup.
Because this test uses real internet DNS root servers, it is enabled
only when `CI_ENABLE_LIVE_INTERNET_TESTS` is set.
Colin Vidal [Tue, 24 Jun 2025 09:55:42 +0000 (11:55 +0200)]
add startup root DNSKEY refresh system test
Root trust anchors are automatically updated as described in RFC5011.
Add a system test which ensures the root DNSKEYs are always queried by
named during startup.
Because this test uses real internet DNS root servers, it is enabled
only when `CI_ENABLE_LIVE_INTERNET_TESTS` is set.
Ondřej Surý [Mon, 30 Jun 2025 11:23:38 +0000 (13:23 +0200)]
fix: dev: Prevent false sharing for the .inuse member of isc_mem_t
Change the .inuse member of memory context to have a loop-local
variable, so there's no contention even when the same memory
context is shared among multiple threads.
Closes #5354
Merge branch '5354-prevent-false-sharing-in-isc_mem' into 'main'
Ondřej Surý [Wed, 4 Jun 2025 16:14:23 +0000 (18:14 +0200)]
Change the .inuse member of isc_mem to be per-thread/per-loop
The .inuse member was causing a lot of contention between threads using
the same memory context. Scatter the .inuse and .overmem members of
the isc_mem_t structure into a per-tid array of variables to reduce the
contention as the writes are now independent of each other.
The array uses one slightly nasty trick: as ISC_TID_UNKNOWN is now -1,
the array has been sized to fit the unknown tid with a [-1] index into the
array, accomplished with `ctx->stat = &ctx->stat_s[1];`. It will not win
a beauty contest, but it works seamlessly by just passing `isc_tid()` as
an index into the array.
The caveat here is that gathering the real inuse value requires walking
the whole array for all registered tid values (isc_tid_count()). The
gather part happens only when statistics are being gathered or when
isc_mem_isovermem() is called. As the isc_mem_isovermem() call happens
only when new data is being added to cache or ADB, it doesn't happen on
the hottest (read-only) path and, according to the measurements, it
slows down neither the cold cache nor the hot cache latency.
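The scatter/gather idea and the shifted index can be sketched in Python. The `ScatteredCounter` class is hypothetical and is not the isc_mem implementation. Python cannot show the cache-line contention itself, so the sketch only illustrates the indexing (tid -1 mapped to slot 0) and the fact that reading the total walks every slot.

```python
class ScatteredCounter:
    """Per-tid counters with one extra slot for the 'unknown' tid (-1)."""

    TID_UNKNOWN = -1  # mirrors the ISC_TID_UNKNOWN value quoted above

    def __init__(self, tid_count: int) -> None:
        # Slot 0 holds the unknown tid; slots 1..tid_count hold tids 0..tid_count-1.
        self._slots = [0] * (tid_count + 1)

    def add(self, tid: int, delta: int) -> None:
        # Scatter: each thread only ever writes to its own slot.
        self._slots[tid + 1] += delta

    def total(self) -> int:
        # Gather: walk the whole array; meant for the slow path only.
        return sum(self._slots)


inuse = ScatteredCounter(tid_count=4)
inuse.add(0, 1024)                            # allocation on tid 0
inuse.add(3, 512)                             # allocation on tid 3
inuse.add(ScatteredCounter.TID_UNKNOWN, 64)   # allocation on a thread without a tid
inuse.add(0, -24)                             # partial free on tid 0
```

In C the `+ 1` shift is hidden by pointing `ctx->stat` at the second element, so callers can index with `isc_tid()` directly.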
Ondřej Surý [Thu, 5 Jun 2025 10:19:43 +0000 (12:19 +0200)]
Don't use ssize_t for storing difference between sizes
As POSIX guarantees only that the type ssize_t shall be capable of
storing values at least in the range [-1, {SSIZE_MAX}], it can't be used
to calculate the difference between two memory sizes. Change the logic
for junk filling to test whether the new size is larger than old size
and then use size_t as the result will be always positive.
Ondřej Surý [Wed, 4 Jun 2025 15:43:34 +0000 (17:43 +0200)]
Remove .hi_called member of isc_mem_t structure
The .hi_called member was a dead structure member and it hasn't been used
since the overmem callback has been removed in commit 14bdd21e0a7ad5f115bb2427d4f88fe7a84e9324.
Ondřej Surý [Wed, 4 Jun 2025 08:35:57 +0000 (10:35 +0200)]
Delete jemalloc arena support from isc_mem
The jemalloc arena in isc_mem was added to solve runaway memory problem
for outgoing TCP connections. In the end, this was a red herring and
the jemalloc arena code is now unused (via e28266bf). Remove the
support for jemalloc memory arenas, as we can restore this at any time if
we ever need it again, but right now it's just dead code.
Ondřej Surý [Wed, 25 Jun 2025 06:25:41 +0000 (08:25 +0200)]
Fix implicit headers when using isc/overflow.h header
In jemalloc_shim.h, we relied on including <isc/overflow.h> implicitly
instead of explicitly, and the same was happening inside isc/overflow.h:
stdbool.h (for the bool type) was being included implicitly instead of
explicitly.
Aydın Mercan [Tue, 24 Jun 2025 13:30:15 +0000 (16:30 +0300)]
do not install manpages for unbuilt binaries
Building and installing from a git release installed all manpages
unconditionally even if binaries like dnstap-read were disabled and not
built.
Now the manpage configuration checks for such cases and also cleans up
remaining artifacts and unnecessary pages if the build directory is
reconfigured.
Ondřej Surý [Sat, 28 Jun 2025 12:06:05 +0000 (14:06 +0200)]
chg: dev: Change isc_tid to be isc_tid_t type (a signed integer type)
Change the internal type used for isc_tid unit to isc_tid_t to hide the
specific integer type being used for the 'tid'. Internally, the isc_tid
unit is now using a signed integer type. This allows us to have negatively
indexed arrays that work both for threads with an assigned tid and for
threads with an unassigned tid. Additionally, limit the number of threads
(loops) to 512 (compile time default).
Ondřej Surý [Wed, 4 Jun 2025 15:54:20 +0000 (17:54 +0200)]
Convert the isc/tid.h to use own signed integer isc_tid_t type
Change the internal type used for isc_tid unit to isc_tid_t to hide the
specific integer type being used for the 'tid'. Internally, a signed
integer type is being used. This allows us to have negatively indexed
arrays that work both for threads with an assigned tid and for threads
with an unassigned tid. This should be used only in specific situations.
Štěpán Balážik [Sat, 28 Jun 2025 10:51:59 +0000 (10:51 +0000)]
fix: nil: Only run ci-orphaned-anchors on MR events
Now, it is also run in schedules and, most annoyingly, on push, which means
that it is run twice on a push to a branch where an MR exists and `.gitlab-ci.yml` is changed.
This was an oversight in https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/10654
Merge branch 'stepan/remove-additional-pipeline' into 'main'
Štěpán Balážik [Fri, 27 Jun 2025 15:20:29 +0000 (15:20 +0000)]
fix: nil: Move root zone mirror system test to a separate directory
This test doesn't require artifact checking but when bundled in the same
directory with the shell based tests, the `system:clang:tsan` job was
failing non-deterministically.
An example of the job failing and succeeding on the same commit:
- https://gitlab.isc.org/isc-projects/bind9/-/jobs/5809299
- https://gitlab.isc.org/isc-projects/bind9/-/jobs/5809447
Merge branch 'stepan/move-root-zone-mirror-test-to-a-separate-directory' into 'main'
Štěpán Balážik [Fri, 27 Jun 2025 13:51:05 +0000 (15:51 +0200)]
Move root zone mirror system test to a separate directory
This test doesn't require artifact checking but when bundled in the same
directory with the shell based tests, the `system:clang:tsan` job was
failing non-deterministically.
Nicki Křížek [Fri, 27 Jun 2025 15:03:54 +0000 (17:03 +0200)]
chg: test: Improve pytest log output
- increase clarity of multiline messages
- support `isctest.query.*()` query&response logging
- replace use of `print()` statement with proper logging
- omit empty lines from test result output
Merge branch 'nicki/improve-pytest-logging' into 'main'
Nicki Křížek [Thu, 26 Jun 2025 16:20:06 +0000 (18:20 +0200)]
Log assertion failures right after test result
The extra messages are typically traceback from assertion failures.
Previously, they'd be printed only after all individual test case
results have been printed. That made it difficult to pair the traceback
to the failing test in some cases, as the node information (aka test
name) might not always be present.
Instead, log any extra messages related to a particular test failure
directly after reporting its result, making the failure details more
readily available and easy to connect with a particular test case.
Nicki Křížek [Thu, 26 Jun 2025 14:14:50 +0000 (16:14 +0200)]
Log query and response when using isctest.query.*
Make sure the queries and responses are logged at the DEBUG level, which
may provide useful information in case of failing tests.
This doesn't seem to significantly increase the overall artifacts size.
Previously, pytest.log.txt files from all system tests would take around
3 MB; with this change, it's around 8 MB.
Nicki Křížek [Tue, 17 Jun 2025 15:40:07 +0000 (17:40 +0200)]
Add options for query&response logging to pytest
In some cases, it's useful to log the sent and received DNS messages.
Add options to enable this on demand. Query is only logged the first
time it's sent, since it doesn't change. If response logging is turned
on, then each response is logged, since it might be different every
time.
Nicki Křížek [Tue, 17 Jun 2025 15:33:22 +0000 (17:33 +0200)]
Indent multiline output in pytest logging
When a multiline message is logged, indent all but the first line (which
will be preceded by the LOG_FORMAT). This improves the clarity of logs,
as it's immediately clear which lines are regular log output, and which
ones are multiline debug output.
Adjust the isctest.run.cmd() stdout/stderr logging to this new format.
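One way to get this effect with the standard `logging` module is a formatter subclass, sketched below. The `IndentFormatter` name, the indent width, and the format string are illustrative assumptions. The isctest code may implement the indentation differently.

```python
import logging


class IndentFormatter(logging.Formatter):
    """Indent every line of a formatted log message except the first."""

    def __init__(self, fmt: str, indent: str = "    ") -> None:
        super().__init__(fmt)
        self._indent = indent

    def format(self, record: logging.LogRecord) -> str:
        text = super().format(record)
        first, *rest = text.split("\n")
        # Only continuation lines get the indent; the first line keeps the prefix.
        return "\n".join([first] + [self._indent + line for line in rest])


formatter = IndentFormatter("%(levelname)s %(message)s")
record = logging.LogRecord(
    name="demo", level=logging.DEBUG, pathname="demo.py", lineno=0,
    msg="stdout:\nline one\nline two", args=(), exc_info=None,
)
formatted = formatter.format(record)
```

With this formatter, regular log lines start with the format prefix and multiline debug output is visibly set apart underneath.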
Nicki Křížek [Tue, 17 Jun 2025 15:21:33 +0000 (17:21 +0200)]
Don't log empty test result messages
The messages obtained from test results may contain stuff like detailed
failure/error information, tracebacks etc. In many cases, the message
will be empty, in which case it doesn't need to be logged.
For an example, run a test with many test cases, e.g.
verify/test_verify.py, and inspect the tail of the pytest.log.txt before
and after this commit.
Nicki Křížek [Tue, 17 Jun 2025 13:47:48 +0000 (15:47 +0200)]
Replace print statements in checkds test
Use isctest.log logging facility for consistent and predictable logging
output rather than using print(). Remove writes of stderr, as that
output will be logged in the debug log in case the commands called with
isctest.run.cmd() fail.
Petr Špaček [Wed, 25 Jun 2025 15:53:25 +0000 (17:53 +0200)]
Simplify maintenance of NO_BUILD_TEST_PREREQ CI hack
Our split between build and test phases in CI triggers an odd corner case
in Meson:
- Newer Meson versions (1.7.0+) do not build test targets as part of
"all" target.
- We copy build artifacts from build phase into test container.
- meson test --no-rebuild does not build test artifacts even if they are
missing.
- To build these test binaries, Meson has a special target
"meson-test-prereq". This target exists only in Meson >= 0.63.
- Ubuntu 22.04 has only Meson 0.61.2 so this target does not exist.
To counter this problem, we introduced BUILD_TEST_PREREQ variable in CI
to explicitly build "meson-test-prereq" target in the "build" phase only
inside images with new-enough Meson versions. This worked, but it forced
us to keep track of Meson versions on various
distros and update the variable accordingly.
This commit inverts the logic so we build the special target by default
(in the build phase) and skip building it only if Meson version is too
old. So once we drop the old image, the variable (or rather its usage)
will be gone and we don't need to touch it for newer images.
We have also considered installing newer Meson into the test image, but
decided to keep the old version around so we can test the minimal Meson
version specified in the meson.build file.
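The inverted logic amounts to a version comparison, sketched here in Python. The helper names are hypothetical and the real check lives in the CI configuration, not in Python. The 0.63 threshold is the one stated above.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn '0.61.2' into (0, 61, 2) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


# Version that introduced the "meson-test-prereq" target, per the text above.
MIN_PREREQ_VERSION = (0, 63)


def should_build_test_prereq(meson_version: str) -> bool:
    # Build the target by default; skip it only when Meson is too old.
    return parse_version(meson_version) >= MIN_PREREQ_VERSION
```

Comparing integer tuples avoids the string-comparison trap where "0.7" would sort above "0.63".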
Michal Nowak [Thu, 26 Jun 2025 10:56:08 +0000 (12:56 +0200)]
chg: ci: Disable Kerberos in tumbleweed
In the tumbleweed image, we utilize LibreSSL. Several BIND 9 libraries
are linked against LibreSSL's libcrypto.so.55, and when Kerberos is
enabled, we link against libk5crypto.so.3, which in turn links against
OpenSSL's libcrypto.so.3. This might theoretically lead to a symbol
conflict.
Closes #5394
Merge branch '5394-disable-kerberos-in-tumbleweed' into 'main'
Michal Nowak [Wed, 25 Jun 2025 13:35:23 +0000 (15:35 +0200)]
Disable Kerberos in tumbleweed
In the tumbleweed image, we utilize LibreSSL. Several BIND 9 libraries
are linked against LibreSSL's libcrypto.so.55, and when Kerberos is
enabled, we link against libk5crypto.so.3, which in turn links against
OpenSSL's libcrypto.so.3. This might theoretically lead to a symbol
conflict.
Michał Kępień [Thu, 26 Jun 2025 10:06:35 +0000 (12:06 +0200)]
fix: nil: Fix version description in a startup log message
Commit 5cd6c173ff74309ae7fb73b3e4c754f1589eaddc changed the contents of
the PACKAGE_DESCRIPTION preprocessor macro from " (<description>)" to
just "<description>" and missed a spot while adjusting all uses of this
macro in the code base. Fix formatting for that malformed log message,
emitted upon named startup.
See #5379
Merge branch '5379-fix-version-description-in-a-startup-log-message' into 'main'
Michał Kępień [Thu, 26 Jun 2025 10:05:53 +0000 (12:05 +0200)]
Fix version description in a startup log message
Commit 5cd6c173ff74309ae7fb73b3e4c754f1589eaddc changed the contents of
the PACKAGE_DESCRIPTION preprocessor macro from " (<description>)" to
just "<description>" and missed a spot while adjusting all uses of this
macro in the code base. Fix formatting for that malformed log message,
emitted upon named startup.
Štěpán Balážik [Thu, 26 Jun 2025 10:04:02 +0000 (10:04 +0000)]
fix: ci: Ensure that junit.xml is present and non-empty after each system/unit test job
Previously, JUnit files were not generated or were generated empty for various reasons for some system/unit test runs.
Now, the number of tests collected for an MR is up from about 4k to 5.8k in the "Tests" tab of a pipeline.
Additionally, there is a check that ensures that [a somewhat sane](https://gitlab.isc.org/isc-projects/bind9/-/commit/c5a271eb8beb9912501ec564de3bb669ba02507d) `junit.xml` file is generated after every system/unit test job and fails the job otherwise.
Michal Nowak [Wed, 25 Jun 2025 12:06:36 +0000 (14:06 +0200)]
chg: doc: Make empty changelog fatal error
The prep_doc_mr.py script of the bind9-qa repo needs a way to know that
gitchangelog.py did not produce entries. In the case of release notes,
it dies with "No commits matching given revlist". For changelog entries
it used to warn about "Empty changelog", but did not return a non-zero
exit code.
Merge branch 'mnowak/make-empty-changelog-fatal' into 'main'
Michal Nowak [Wed, 18 Jun 2025 07:52:37 +0000 (09:52 +0200)]
Make empty changelog fatal error
The prep_doc_mr.py script of the bind9-qa repo needs a way to know that
gitchangelog.py did not produce entries. In the case of release notes,
it dies with "No commits matching given revlist". For changelog entries
it used to warn about "Empty changelog", but did not return a non-zero
exit code.
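The behavioural change can be sketched as follows. The `changelog_exit_code()` function is a hypothetical illustration and not code from gitchangelog.py. It only shows a warning being turned into a failure that a calling script can detect.

```python
import sys


def changelog_exit_code(entries: list[str]) -> int:
    """Return a process exit code: non-zero when there is nothing to report."""
    if not entries:
        print("Empty changelog", file=sys.stderr)
        return 1
    return 0
```

A script would end with `sys.exit(changelog_exit_code(entries))`, so a caller such as prep_doc_mr.py can check the exit status instead of parsing the output.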
Petr Menšík [Tue, 24 Jun 2025 15:12:35 +0000 (17:12 +0200)]
Do not expect fail in cpu test default configuration
The previous CPU test relied on either a missing default named.conf or
missing permissions to write into its default directory. In short, it
assumed that the default configuration would be unusable by the current
user. The test would hang indefinitely if the named user could write into
the directory specified in the default configuration.
Change it to explicitly try a non-existent configuration file instead.
It will still fail immediately, but will not rely on the running user or
the presence of a file at the default configuration file path.
Alessio Podda [Wed, 25 Jun 2025 08:30:28 +0000 (08:30 +0000)]
chg: dev: Use RCU for rad name
The RAD/agent domain is a functionality from RFC 9567 that provides
a suffix for reporting error messages. On every query context reset,
we need to check if a RAD is configured and, if so, copy it.
Since we allow the RAD to be changed by reconfiguring the zone,
this access is currently protected by a mutex, which causes contention.
This commit replaces the mutex with RCU to reduce contention. The
change results in a 3% performance improvement in the 1M delegation
test.
Alessio Podda [Mon, 23 Jun 2025 09:13:44 +0000 (11:13 +0200)]
Use RCU for rad name
The RAD/agent domain is a functionality from RFC 9567 that provides
a suffix for reporting error messages. On every query context reset,
we need to check if a RAD is configured and, if so, copy it.
Since we allow the RAD to be changed by reconfiguring the zone,
this access is currently protected by a mutex, which causes contention.
This commit replaces the mutex with RCU to reduce contention. The
change results in a 3% performance improvement in the 1M delegation
test.
Mark Andrews [Wed, 18 Jun 2025 02:49:04 +0000 (12:49 +1000)]
Preserve brackets around string concatenation
We need to disable clang-format here to preserve the brackets around
the string concatenation to prevent -Wstring-concatenation -Werror
breaking the build.
Nicki Křížek [Tue, 24 Jun 2025 15:30:12 +0000 (17:30 +0200)]
chg: ci: Add newline for changelog CI job
In case the changelog file doesn't have an empty line at the end of the
file, the job may fail with the following error:
WARNING: Bullet list ends without a blank line; unexpected unindent.
This typically happens in MRs targeting the -S edition, as those
changelogs usually don't have an empty newline. This change ensures the
changelog job can pass and verify the title/desc contents even in those
cases.
Merge branch 'nicki/ci-changelog-add-missing-newline' into 'main'
Nicki Křížek [Tue, 24 Jun 2025 14:35:56 +0000 (16:35 +0200)]
Add newline for changelog CI job
In case the changelog file doesn't have an empty line at the end of the
file, the job may fail with the following error:
WARNING: Bullet list ends without a blank line; unexpected unindent.
This typically happens in MRs targeting the -S edition, as those
changelogs usually don't have an empty newline. This change ensures the
changelog job can pass and verify the title/desc contents even in those
cases.
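A minimal sketch of the normalization, assuming the job only needs the changelog text to end with a single empty line before it is checked. The helper name is hypothetical and the actual CI job may simply append a newline in the shell.

```python
def with_trailing_blank_line(text: str) -> str:
    """Return text that ends with exactly one empty line."""
    return text.rstrip("\n") + "\n\n"
```

The function is idempotent, so applying it to a changelog that already ends with a blank line leaves it unchanged.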
Nicki Křížek [Tue, 24 Jun 2025 14:57:27 +0000 (16:57 +0200)]
chg: test: Make extra_artifacts check optional
There is an ongoing debate about the usefulness of the extra artifacts
check. While it might be useful to detect unexpected behaviour in some
tests, it feels extraneous in many cases. This change provides a middle
ground by making the artifact checking optional. This might be
especially useful for writing new tests, since the author gets to decide
whether the check is useful -- and can utilize it, or can skip it for
the sake of brevity.
Merge branch 'nicki/make-extra-artifacts-check-optional' into 'main'
Nicki Křížek [Tue, 24 Jun 2025 11:16:33 +0000 (13:16 +0200)]
Make extra_artifacts check optional
There is an ongoing debate about the usefulness of the extra artifacts
check. While it might be useful to detect unexpected behaviour in some
tests, it feels extraneous in many cases. This change provides a middle
ground by making the artifact checking optional. This might be
especially useful for writing new tests, since the author gets to decide
whether the check is useful -- and can utilize it, or can skip it for
the sake of brevity.