Nicki Křížek [Thu, 26 Jun 2025 16:20:06 +0000 (18:20 +0200)]
Log assertion failures right after test result
The extra messages are typically traceback from assertion failures.
Previously, they'd be printed only after all individual test case
results have been printed. That made it difficult to pair the traceback
to the failing test in some cases, as the node information (aka test
name) might not always be present.
Instead, log any extra messages related to a particular test failure
directly after reporting its result, making the failure details more
readily available and easy to connect with a particular test case.
Nicki Křížek [Thu, 26 Jun 2025 14:14:50 +0000 (16:14 +0200)]
Log query and response when using isctest.query.*
Make sure the queries and responses are logged at the DEBUG level, which
may provide useful information in case of failing tests.
This doesn't seem to significantly increase the overall artifacts size:
previously, pytest.log.txt files from all system tests would take around
3 MB; with this change, it's around 8 MB.
Nicki Křížek [Tue, 17 Jun 2025 15:40:07 +0000 (17:40 +0200)]
Add options for query&response logging to pytest
In some cases, it's useful to log the sent and received DNS messages.
Add options to enable this on demand. Query is only logged the first
time it's sent, since it doesn't change. If response logging is turned
on, then each response is logged, since it might be different every
time.
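A minimal sketch of the "log the query once, log every response" rule described above. The class name, method names and flags are hypothetical and are not the actual isctest API; only the standard logging module is used.

```python
import logging

log = logging.getLogger("query-sketch")  # hypothetical logger name


class MessageLogger:
    """Log each distinct query once, but every response (illustrative only)."""

    def __init__(self, log_query=False, log_response=False):
        self.log_query = log_query
        self.log_response = log_response
        self._seen = set()  # text of queries that were already logged

    def on_send(self, query_text):
        """Return True if the query was logged on this call."""
        if self.log_query and query_text not in self._seen:
            self._seen.add(query_text)
            log.debug("query:\n%s", query_text)
            return True
        return False

    def on_receive(self, response_text):
        """Return True if the response was logged on this call."""
        if self.log_response:
            log.debug("response:\n%s", response_text)
            return True
        return False
```

Resending the same query to a server that is still starting up then produces one query entry followed by one response entry per attempt.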
Nicki Křížek [Tue, 17 Jun 2025 15:33:22 +0000 (17:33 +0200)]
Indent multiline output in pytest logging
When a multiline message is logged, indent all but the first line (which
will be preceded by the LOG_FORMAT). This improves the clarity of logs,
as it's immediately clear which lines are regular log output, and which
ones are multiline debug output.
Adjust the isctest.run.cmd() stdout/stderr logging to this new format.
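One way to get this effect with the standard logging module is a Formatter subclass that indents continuation lines. This is a sketch under assumptions: the class name and the four-space indent are made up here, and the real change may be implemented differently.

```python
import logging


class IndentFormatter(logging.Formatter):
    """Indent every line of a formatted record except the first one."""

    def __init__(self, fmt=None, indent="    "):
        super().__init__(fmt)
        self.indent = indent  # assumed indent width, not taken from the commit

    def format(self, record):
        text = super().format(record)
        first, *rest = text.split("\n")
        return "\n".join([first] + [self.indent + line for line in rest])
```

Attaching it with `handler.setFormatter(IndentFormatter("%(levelname)s %(message)s"))` leaves single-line messages unchanged, so only multiline debug output is visually set apart.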
Nicki Křížek [Tue, 17 Jun 2025 15:21:33 +0000 (17:21 +0200)]
Don't log empty test result messages
The messages obtained from test results may contain stuff like detailed
failure/error information, tracebacks etc. In many cases, the message
will be empty, in which case it doesn't need to be logged.
For an example, run test with many test cases, e.g.
verify/test_verify.py, and inspect the tail of the pytest.log.txt before
and after this commit.
Nicki Křížek [Tue, 17 Jun 2025 13:47:48 +0000 (15:47 +0200)]
Replace print statements in checkds test
Use isctest.log logging facility for consistent and predictable logging
output rather than using print(). Remove writes of stderr, as that
output will be logged in the debug log in case the commands called with
isctest.run.cmd() fail.
Petr Špaček [Wed, 25 Jun 2025 15:53:25 +0000 (17:53 +0200)]
Simplify maintenance of NO_BUILD_TEST_PREREQ CI hack
Our split between build and test phases in CI triggers an odd corner case
in Meson:
- Newer Meson versions (1.7.0+) do not build test targets as part of
"all" target.
- We copy build artifacts from build phase into test container.
- meson test --no-rebuild does not build test artifacts even if they are
missing.
- To build these test binaries Meson has special target
"meson-test-prereq". This target exists only in Meson >= 0.63.
- Ubuntu 22.04 has only Meson 0.61.2 so this target does not exist.
To counter this problem, we introduced BUILD_TEST_PREREQ variable in CI
to explicitly build "meson-test-prereq" target in the "build" phase only
inside images with new-enough Meson versions. This worked, but it forced
us to keep track of Meson versions on various
distros and update the variable accordingly.
This commit inverts the logic so we build the special target by default
(in the build phase) and skip building it only if Meson version is too
old. So once we drop the old image, the variable (or rather its usage)
will be gone and we don't need to touch it for newer images.
We have also considered installing newer Meson into the test image, but
decided to keep the old version around so we can test minimal Meson
version specified in meson.build file.
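The inverted logic can be sketched as a version comparison: build the target unless Meson is too old to know it. The function names are hypothetical, the 0.63 threshold comes from the list above, and the sketch assumes plain numeric version strings (no "rc" suffixes). The actual CI job is not shown here and may well use shell instead.

```python
MIN_PREREQ_VERSION = (0, 63, 0)  # "meson-test-prereq" exists only in Meson >= 0.63


def parse_version(text):
    """Turn '1.7.0' or '0.63' into a three-part tuple of integers."""
    parts = [int(part) for part in text.split(".")]
    parts += [0] * (3 - len(parts))  # pad so '0.63' compares equal to '0.63.0'
    return tuple(parts)


def should_build_test_prereq(meson_version):
    """Build the target by default; skip it only for too-old Meson."""
    return parse_version(meson_version) >= MIN_PREREQ_VERSION
```

With this shape, a new image with any recent Meson needs no configuration; only the old image takes the skip branch.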
Michal Nowak [Thu, 26 Jun 2025 10:56:08 +0000 (12:56 +0200)]
chg: ci: Disable Kerberos in tumbleweed
In the tumbleweed image, we utilize LibreSSL. Several BIND 9 libraries
are linked against LibreSSL's libcrypto.so.55, and when Kerberos is
enabled, we link against libk5crypto.so.3, which in turn links against
OpenSSL's libcrypto.so.3. This might theoretically lead to a symbol
conflict.
Closes #5394
Merge branch '5394-disable-kerberos-in-tumbleweed' into 'main'
Michal Nowak [Wed, 25 Jun 2025 13:35:23 +0000 (15:35 +0200)]
Disable Kerberos in tumbleweed
In the tumbleweed image, we utilize LibreSSL. Several BIND 9 libraries
are linked against LibreSSL's libcrypto.so.55, and when Kerberos is
enabled, we link against libk5crypto.so.3, which in turn links against
OpenSSL's libcrypto.so.3. This might theoretically lead to a symbol
conflict.
Michał Kępień [Thu, 26 Jun 2025 10:06:35 +0000 (12:06 +0200)]
fix: nil: Fix version description in a startup log message
Commit 5cd6c173ff74309ae7fb73b3e4c754f1589eaddc changed the contents of
the PACKAGE_DESCRIPTION preprocessor macro from " (<description>)" to
just "<description>" and missed a spot while adjusting all uses of this
macro in the code base. Fix formatting for that malformed log message,
emitted upon named startup.
See #5379
Merge branch '5379-fix-version-description-in-a-startup-log-message' into 'main'
Michał Kępień [Thu, 26 Jun 2025 10:05:53 +0000 (12:05 +0200)]
Fix version description in a startup log message
Commit 5cd6c173ff74309ae7fb73b3e4c754f1589eaddc changed the contents of
the PACKAGE_DESCRIPTION preprocessor macro from " (<description>)" to
just "<description>" and missed a spot while adjusting all uses of this
macro in the code base. Fix formatting for that malformed log message,
emitted upon named startup.
Štěpán Balážik [Thu, 26 Jun 2025 10:04:02 +0000 (10:04 +0000)]
fix: ci: Ensure that junit.xml is present and non-empty after each system/unit test job
Previously, JUnit files were not generated or were generated empty for various reasons for some system/unit test runs.
Now, the number of tests collected for a MR is up from about 4k to 5.8k in the "Tests" tab of a pipeline.
Additionally, there is a check that ensures that [a somewhat sane](https://gitlab.isc.org/isc-projects/bind9/-/commit/c5a271eb8beb9912501ec564de3bb669ba02507d) `junit.xml` file is generated after every system/unit test job and fails the job otherwise.
Michal Nowak [Wed, 25 Jun 2025 12:06:36 +0000 (14:06 +0200)]
chg: doc: Make empty changelog fatal error
The prep_doc_mr.py script of the bind9-qa repo needs a way to know that
gitchangelog.py did not produce entries. In the case of release notes,
it dies with "No commits matching given revlist". For changelog entries
it used to warn about "Empty changelog", but did not return non-zero
exit code.
Merge branch 'mnowak/make-empty-changelog-fatal' into 'main'
Michal Nowak [Wed, 18 Jun 2025 07:52:37 +0000 (09:52 +0200)]
Make empty changelog fatal error
The prep_doc_mr.py script of the bind9-qa repo needs a way to know that
gitchangelog.py did not produce entries. In the case of release notes,
it dies with "No commits matching given revlist". For changelog entries
it used to warn about "Empty changelog", but did not return non-zero
exit code.
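The behavioural change amounts to turning a warning into a non-zero exit. A minimal sketch, assuming a hypothetical `render_changelog` helper; gitchangelog.py's real structure is not shown in this log:

```python
import sys


def render_changelog(entries):
    """Render entries as a bullet list; treat an empty list as fatal."""
    if not entries:
        print("Empty changelog", file=sys.stderr)
        raise SystemExit(1)  # non-zero exit code lets a calling script detect it
    return "\n".join(f"- {entry}" for entry in entries) + "\n"
```

A caller such as a wrapper script can then check the process exit status instead of parsing warning text.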
Petr Menšík [Tue, 24 Jun 2025 15:12:35 +0000 (17:12 +0200)]
Do not expect fail in cpu test default configuration
The previous CPU test relied on either a missing default named.conf or
missing permissions to write into its default directory. In short, it
assumed the default configuration would be unusable by the current user.
The test would hang indefinitely if the user running named could write
into the directory specified in the default configuration.
Change it to explicitly try a non-existent configuration file instead.
It will still fail immediately, but will not rely on the running user or
on the presence of a file at the default configuration file path.
Alessio Podda [Wed, 25 Jun 2025 08:30:28 +0000 (08:30 +0000)]
chg: dev: Use RCU for rad name
The RAD/agent domain is a functionality from RFC 9567 that provides
a suffix for reporting error messages. On every query context reset,
we need to check if a RAD is configured and, if so, copy it.
Since we allow the RAD to be changed by reconfiguring the zone,
this access is currently protected by a mutex, which causes contention.
This commit replaces the mutex with RCU to reduce contention. The
change results in a 3% performance improvement in the 1M delegation
test.
Alessio Podda [Mon, 23 Jun 2025 09:13:44 +0000 (11:13 +0200)]
Use RCU for rad name
The RAD/agent domain is a functionality from RFC 9567 that provides
a suffix for reporting error messages. On every query context reset,
we need to check if a RAD is configured and, if so, copy it.
Since we allow the RAD to be changed by reconfiguring the zone,
this access is currently protected by a mutex, which causes contention.
This commit replaces the mutex with RCU to reduce contention. The
change results in a 3% performance improvement in the 1M delegation
test.
Mark Andrews [Wed, 18 Jun 2025 02:49:04 +0000 (12:49 +1000)]
Preserve brackets around string concatenation
We need to disable clang-format here to preserve the brackets around
the string concatenation to prevent -Wstring-concatenation -Werror
breaking the build.
Nicki Křížek [Tue, 24 Jun 2025 15:30:12 +0000 (17:30 +0200)]
chg: ci: Add newline for changelog CI job
In case the changelog file doesn't have an empty line at the end of the
file, the job may fail with the following error:
WARNING: Bullet list ends without a blank line; unexpected unindent.
This typically happens in MRs targeting the -S edition, as those
changelogs usually don't have an empty newline. This change ensures the
changelog job can pass and verify the title/desc contents even in those
cases.
Merge branch 'nicki/ci-changelog-add-missing-newline' into 'main'
Nicki Křížek [Tue, 24 Jun 2025 14:35:56 +0000 (16:35 +0200)]
Add newline for changelog CI job
In case the changelog file doesn't have an empty line at the end of the
file, the job may fail with the following error:
WARNING: Bullet list ends without a blank line; unexpected unindent.
This typically happens in MRs targeting the -S edition, as those
changelogs usually don't have an empty newline. This change ensures the
changelog job can pass and verify the title/desc contents even in those
cases.
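A sketch of the normalization idea, assuming the fix is to make sure the changelog text ends with a blank line before it is checked. The helper name is hypothetical and the real CI job may simply append a newline to the file.

```python
def ensure_trailing_blank_line(text):
    """Return text that ends with exactly one blank line."""
    return text.rstrip("\n") + "\n\n"
```

The function is idempotent, so applying it to a changelog that already ends correctly changes nothing.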
Nicki Křížek [Tue, 24 Jun 2025 14:57:27 +0000 (16:57 +0200)]
chg: test: Make extra_artifacts check optional
There is an ongoing debate about the usefulness of the extra artifacts
check. While it might be useful to detect unexpected behaviour in some
tests, it feels extraneous in many cases. This change provides a middle
ground by making the artifact checking optional. This might be
especially useful for writing new tests, since the author gets to decide
whether the check is useful -- and can utilize it, or can skip it for
the sake of brevity.
Merge branch 'nicki/make-extra-artifacts-check-optional' into 'main'
Nicki Křížek [Tue, 24 Jun 2025 11:16:33 +0000 (13:16 +0200)]
Make extra_artifacts check optional
There is an ongoing debate about the usefulness of the extra artifacts
check. While it might be useful to detect unexpected behaviour in some
tests, it feels extraneous in many cases. This change provides a middle
ground by making the artifact checking optional. This might be
especially useful for writing new tests, since the author gets to decide
whether the check is useful -- and can utilize it, or can skip it for
the sake of brevity.
Colin Vidal [Tue, 24 Jun 2025 08:52:11 +0000 (10:52 +0200)]
chg: dev: parse user configuration before exclusive mode
Previously, `named.conf` was parsed while the server was in exclusive (i.e., single-threaded) mode and unable to answer queries. This could cause an unnecessary delay in query processing when the file was large. We now delay entry into exclusive mode until after the configuration has been parsed, but before it is applied.
Merge branch 'colin/configparse-before-exclusive' into 'main'
Colin Vidal [Mon, 23 Jun 2025 19:54:43 +0000 (21:54 +0200)]
wait for reload completed in emptyzones system test
The emptyzones system test ran two consecutive "rndc reload" commands
without waiting for the first one to complete. It used to work because
the commands were serialized, but now an rndc reconfig/reload command is
ignored if another one is already running, so the emptyzones test is
more likely to fail.
Fix this problem by waiting for the log message indicating that all the
zones are loaded before attempting the next reload.
Colin Vidal [Thu, 5 Jun 2025 16:28:22 +0000 (18:28 +0200)]
log-based test for load/apply config
Add a new system test which checks named output when starting,
reconfiguring and reloading the server. It checks that the steps where
configuration is loaded, when named enters exclusive mode, and when the
configuration is applied are all logged, and that they occur in the
correct order. This adds a guard/warning to keep the parsing of the
named.conf outside of the exclusive mode.
Colin Vidal [Tue, 22 Apr 2025 11:46:47 +0000 (13:46 +0200)]
parse user configuration before exclusive mode
The configuration file was parsed when named was in exclusive
(i.e. single-threaded) mode and unable to answer queries. Because
the parsing is a self-contained operation, it is now done before
named enters exclusive mode.
This reduces the amount of time named can't answer queries when
reloading the configuration when the configuration file is large.
Note that exclusive mode is still used for applying the
configuration changes to the server.
Also, simplify the configuration logic by parsing the built-in
configuration only once at server start time.
Impossible: How can diff be null and have not Correct in compare_c? Tag1 ("diff token: ( VS (\nFile \"./lib/dns/include/dns/rdatasetiter.h\", line 109, column 32, charpos = 3103\n around = '(',\n whole content = #define DNS_RDATASETITER_FOREACH(rds) \\\nFile \"/tmp/cocci-output-110376-c54da3-rdatasetiter.h\", line 109, column 32, charpos = 3103\n around = '(',\n whole content = #define DNS_RDATASETITER_FOREACH(rds) \\\n")
Impossible: How can diff be null and have not Correct in compare_c? Tag1 ("diff token: ( VS (\nFile \"./lib/dns/include/dns/dbiterator.h\", line 114, column 30, charpos = 3413\n around = '(',\n whole content = #define DNS_DBITERATOR_FOREACH(rds) \\\nFile \"/tmp/cocci-output-110387-883f2f-dbiterator.h\", line 114, column 30, charpos = 3413\n around = '(',\n whole content = #define DNS_DBITERATOR_FOREACH(rds) \\\n")
See https://github.com/coccinelle/coccinelle/issues/398.
Merge branch 'mnowak/coccinelle-fix-impossible-warning' into 'main'
Impossible: How can diff be null and have not Correct in compare_c? Tag1 ("diff token: ( VS (\nFile \"./lib/dns/include/dns/rdatasetiter.h\", line 109, column 32, charpos = 3103\n around = '(',\n whole content = #define DNS_RDATASETITER_FOREACH(rds) \\\nFile \"/tmp/cocci-output-110376-c54da3-rdatasetiter.h\", line 109, column 32, charpos = 3103\n around = '(',\n whole content = #define DNS_RDATASETITER_FOREACH(rds) \\\n")
Impossible: How can diff be null and have not Correct in compare_c? Tag1 ("diff token: ( VS (\nFile \"./lib/dns/include/dns/dbiterator.h\", line 114, column 30, charpos = 3413\n around = '(',\n whole content = #define DNS_DBITERATOR_FOREACH(rds) \\\nFile \"/tmp/cocci-output-110387-883f2f-dbiterator.h\", line 114, column 30, charpos = 3413\n around = '(',\n whole content = #define DNS_DBITERATOR_FOREACH(rds) \\\n")
See https://github.com/coccinelle/coccinelle/issues/398.
Petr Špaček [Mon, 23 Jun 2025 11:22:52 +0000 (13:22 +0200)]
Restore DNSSEC validation by default
Meson generated 'dnssec-validation yes' into the built-in config, but
this config without an explicit trust anchor does not enable validation.
Change default to 'dnssec-validation auto' to use built-in key, as in
the autotools days.
Aydın Mercan [Mon, 23 Jun 2025 11:23:53 +0000 (14:23 +0300)]
fix: dev: Fix RTD builds and minor documentation issues
Fix some leftover artifacts and information while transitioning BIND to Meson.
Add CI job to verify that pre-generated config grammar files are up-to-date with code.
Aydın Mercan [Fri, 13 Jun 2025 15:30:34 +0000 (18:30 +0300)]
Remove build requirements from building arm
The meson build switched to generating the file grammars and using meson
to build the manpages/ARM. This is because meson doesn't work well when
writing files outside the build directory.
However, this has been suboptimal when someone only wants to build the
documentation (like RTD). Sphinx can now be used outside meson like it
was with autoconf.
Grammars are now updated by the developer with CI checking if one is
needed or not, like clang-format.
Michał Kępień [Mon, 23 Jun 2025 08:28:26 +0000 (10:28 +0200)]
fix: nil: Use links() for checking if -latomic is necessary
Use the links() method instead of compiles() for checking whether
-latomic needs to be added to linker invocations as compiles() does not
perform the linking step and is therefore not appropriate for carrying
out this kind of check.
See #5379
Merge branch '5379-use-links-for-checking-if-latomic-is-necessary' into 'main'
Michał Kępień [Mon, 23 Jun 2025 08:23:17 +0000 (10:23 +0200)]
Retain Meson >= 0.61 version requirement
The add_project_dependencies() method was only added in Meson 0.63.
Replace its only use in meson.build with a corresponding call to the
add_project_link_arguments() method to avoid bumping the minimum
required Meson version beyond the one available in stock Ubuntu 22.04
LTS repositories.
Michał Kępień [Mon, 23 Jun 2025 08:23:17 +0000 (10:23 +0200)]
Use links() for checking if -latomic is necessary
Use the links() method instead of compiles() for checking whether
-latomic needs to be added to linker invocations as compiles() does not
perform the linking step and is therefore not appropriate for carrying
out this kind of check.
Michał Kępień [Sat, 21 Jun 2025 04:47:19 +0000 (04:47 +0000)]
chg: ci: move "stress" test generation script to QA repo
Move the util/generate-stress-test-configs.py script from the BIND 9
source repository to the BIND 9 QA repository. This simplifies the
maintenance of that script by eliminating the need to backport every
change applied to it to multiple branches.
Merge branch 'michal/move-stress-test-generation-script-to-qa-repo' into 'main'
Michał Kępień [Sat, 21 Jun 2025 04:43:36 +0000 (06:43 +0200)]
Move "stress" test generation script to QA repo
Move the util/generate-stress-test-configs.py script from the BIND 9
source repository to the BIND 9 QA repository. This simplifies the
maintenance of that script by eliminating the need to backport every
change applied to it to multiple branches.
Michał Kępień [Sat, 21 Jun 2025 04:06:45 +0000 (04:06 +0000)]
fix: nil: Install named-compilezone
named-compilezone is an alias for named-checkzone: the two tools are
built from the same set of source files, but they behave differently
depending on which executable gets invoked. With Automake,
named-compilezone was installed as a hard link to named-checkzone using
a custom installation hook; try to keep things simple with Meson by
using the install_symlink() method, which makes named-compilezone a
symbolic link to named-checkzone and is the same thing that is already
used for ddns-confgen/tsig-keygen.
See #5379
Merge branch '5379-install-named-compilezone' into 'main'
Michał Kępień [Sat, 21 Jun 2025 03:59:51 +0000 (05:59 +0200)]
Install named-compilezone
named-compilezone is an alias for named-checkzone: the two tools are
built from the same set of source files, but they behave differently
depending on which executable gets invoked. With Automake,
named-compilezone was installed as a hard link to named-checkzone using
a custom installation hook; try to keep things simple with Meson by
using the install_symlink() method, which makes named-compilezone a
symbolic link to named-checkzone and is the same thing that is already
used for ddns-confgen/tsig-keygen.
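The "one binary, two behaviours" pattern relies on the program inspecting the name it was invoked under, which is why a symbolic link is enough. The real tools are written in C; the following is only an illustration of the pattern, and the mode names are made up.

```python
import os


def tool_mode(argv0):
    """Pick a behaviour from the name the program was invoked as."""
    name = os.path.basename(argv0)
    # Hypothetical mode labels; only the dispatch-on-name idea is the point.
    return "compile" if name == "named-compilezone" else "check"
```

A script using this would call `tool_mode(sys.argv[0])` at startup; invoking it through a symlink named named-compilezone selects the other mode without a second executable.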
Nicki Křížek [Thu, 19 Jun 2025 13:49:52 +0000 (13:49 +0000)]
fix: test: Ignore softhsm2 errors when deleting token in keyfromlabel test
In some rare cases, the softhsm2 utility reports failure to delete the
token directory, despite the token being found. Subsequent attempts to
delete the token again indicate that the token was deleted.
Ignore this cleanup error, as it doesn't prevent our tests from working
properly. There is also an attempt to delete the token before the test
starts which ensures a clean state before the test is executed, in case
there's actually a leftover token.
Closes #5244
Merge branch '5244-ignore-softhsm2util-delete-token-error' into 'main'
Nicki Křížek [Thu, 19 Jun 2025 13:09:39 +0000 (15:09 +0200)]
Ignore softhsm2 errors when deleting token in keyfromlabel test
In some rare cases, the softhsm2 utility reports failure to delete the
token directory, despite the token being found. Subsequent attempts to
delete the token again indicate that the token was deleted.
Ignore this cleanup error, as it doesn't prevent our tests from working
properly. There is also an attempt to delete the token before the test
starts which ensures a clean state before the test is executed, in case
there's actually a leftover token.
Nicki Křížek [Thu, 19 Jun 2025 13:05:56 +0000 (13:05 +0000)]
chg: test: Improve logging from isctest.run.retry_with_timeout
Allow use of exception (and by extension, assert statements) in the
called function in order to extract essential debug information about
the type of failure that was encountered.
In case the called function fails to succeed on the last retry and
raises an exception, log it as error and set it as the assert message to
propagate it through the pytest framework.
Closes #5324
Merge branch '5324-pytest-isctest-run-logging' into 'main'
Nicki Křížek [Thu, 19 Jun 2025 12:09:57 +0000 (14:09 +0200)]
Use time.monotonic() for time measurements in pytest
For duration measurements, i.e. deadlines and timeouts, it's more
suitable to use monotonic time as it's guaranteed to only go forward,
unlike time.time() which can be affected by local clock settings.
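A small sketch of a deadline built on time.monotonic(). The `Deadline` class is a hypothetical helper, not part of isctest.

```python
import time


class Deadline:
    """Track a timeout using the monotonic clock."""

    def __init__(self, timeout):
        # time.monotonic() never goes backwards, so adjusting the wall
        # clock cannot shorten or extend the timeout.
        self._end = time.monotonic() + timeout

    def remaining(self):
        """Seconds left before the deadline, never negative."""
        return max(0.0, self._end - time.monotonic())

    def expired(self):
        return time.monotonic() >= self._end
```

Had the end point been computed with time.time(), an NTP step or a manual clock change during the test could make the deadline fire early or late.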
Nicki Křížek [Fri, 6 Jun 2025 13:11:44 +0000 (15:11 +0200)]
Improve logging from isctest.run.retry_with_timeout
Allow use of exception (and by extension, assert statements) in the
called function in order to extract essential debug information about
the type of failure that was encountered.
In case the called function fails to succeed on the last retry and
raises an exception, log it as error and set it as the assert message to
propagate it through the pytest framework.
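The behaviour described above could look roughly like this. It is a sketch, not the actual isctest.run.retry_with_timeout: the function name, parameters and log wording are assumptions.

```python
import logging
import time

log = logging.getLogger("retry-sketch")  # hypothetical logger name


def retry_sketch(func, timeout, delay=0.01):
    """Call func until it returns a truthy value or the timeout expires.

    Exceptions (including AssertionError from assert statements) count as
    failed attempts; the one from the last attempt becomes the failure message.
    """
    deadline = time.monotonic() + timeout
    last_exc = None
    while True:
        try:
            if func():
                return
            last_exc = None  # last attempt failed without raising
        except Exception as exc:
            last_exc = exc
        if time.monotonic() >= deadline:
            break
        time.sleep(delay)
    if last_exc is not None:
        log.error("last attempt raised: %r", last_exc)
        raise AssertionError(str(last_exc)) from last_exc
    name = getattr(func, "__name__", repr(func))
    raise AssertionError(f"{name} timed out after {timeout} s")
```

A polled function can then use plain assert statements with descriptive messages, and the message from the final attempt is what pytest reports.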
Matthijs Mekking [Thu, 22 May 2025 09:23:48 +0000 (11:23 +0200)]
Fix spurious missing key files log messages
This happens because an old key is purged by one zone view, and the
other view then complains about the missing key file.
Keys that are unused or being purged should not be taken into account
when verifying key files are available.
The keyring is maintained per zone. So in one zone, a key in the
keyring is being purged. The corresponding key file is removed.
The key maintenance is done for the other zone view. The key in that
keyring is not yet set to purge, but its corresponding key file is
removed. This leads to "some keys are missing" log errors.
We should not check the purge variable at this point, but the
current time and purge-keys duration.
Create a test scenario where a signed zone is in multiple views and
then a key may be purged. This is a bug case where the key files are
removed by one view and then the other view starts complaining.
Mark Andrews [Thu, 19 Jun 2025 01:01:12 +0000 (01:01 +0000)]
new: usr: "Add code paths to fully support PRIVATEDNS and PRIVATEOID keys"
Added support for PRIVATEDNS and PRIVATEOID key usage. Added PRIVATEOID
test algorithms using the assigned OIDs for RSASHA256 and RSASHA512.
Added code to support proposed DS digest types that encode the PRIVATEDNS
and PRIVATEOID identifiers at the start of the digest field of the DS record.
This code is disabled by default.
Closes #3240
Merge branch '3240-add-privatedns-and-privateoid-support' into 'main'
Mark Andrews [Wed, 28 May 2025 10:02:48 +0000 (20:02 +1000)]
Test extended DS digest type support
Add a zone using DS records that embed the private algorithm
identifier in the digest field. There are 2 DS records for an
unsupported DNSSEC algorithm, one of which doesn't have a
matching DNSKEY. This zone should validate as insecure as the
validator can establish that both DS records are for unsupported
DNSSEC algorithms.
Mark Andrews [Fri, 16 May 2025 05:50:53 +0000 (15:50 +1000)]
Add tests using PRIVATEOID algorithms
There are 4 tests:
1) a zone using a known private OID. Validations should succeed
and return AD=1.
2) a zone using an unknown private OID. Validation should succeed
and return AD=0 as the DS to DNSKEY has a provably unsupported
algorithm.
3) a zone using a known private OID and an extra DS record. Validation
should succeed as there is DS to DNSKEY with a known algorithm
linkage.
4) a zone using an unknown private OID and an extra DS record.
Validation should fail as only one of the DS records can be matched
to a provably unknown algorithm. The algorithm of the second DS
is indeterminate.
Mark Andrews [Mon, 31 Mar 2025 13:12:52 +0000 (00:12 +1100)]
Add PRIVATEOIDs for RSASHA256 and RSASHA512
Use the existing RSASHA256 and RSASHA512 implementation to provide
working PRIVATEOID example implementations. We are using the OID
values normally associated with RSASHA256 (1.2.840.113549.1.1.11)
and RSASHA512 (1.2.840.113549.1.1.13).
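For reference, RFC 4034 describes the key data for the private OID algorithm as starting with a length byte followed by a BER-encoded OID. The sketch below shows how the two dotted OIDs named above encode in DER. The `encode_oid` helper is hypothetical, assumes an encoded body shorter than 128 bytes, and is not BIND's implementation.

```python
def encode_oid(dotted):
    """DER-encode a dotted OID string (tag 0x06, short-form length)."""
    arcs = [int(arc) for arc in dotted.split(".")]
    # The first two arcs share a single byte: 40 * first + second.
    body = bytearray([arcs[0] * 40 + arcs[1]])
    for arc in arcs[2:]:
        # Base-128 encoding, most significant group first; every byte
        # except the last one has its high bit set.
        chunk = [arc & 0x7F]
        arc >>= 7
        while arc:
            chunk.append((arc & 0x7F) | 0x80)
            arc >>= 7
        body.extend(reversed(chunk))
    return bytes([0x06, len(body)]) + bytes(body)


print(encode_oid("1.2.840.113549.1.1.11").hex())  # 06092a864886f70d01010b
print(encode_oid("1.2.840.113549.1.1.13").hex())  # 06092a864886f70d01010d
```

The two encodings differ only in the final byte, which is the last arc of the OID.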