Michal Nowak [Mon, 24 Jul 2023 15:30:35 +0000 (17:30 +0200)]
Drop unnecessary gcovr workarounds
Many problems of the Debian 11 gcovr version were fixed in the Debian 12
one. Replace workarounds we accumulated over the years with two new,
simple ones.
Ondřej Surý [Wed, 16 Aug 2023 14:30:53 +0000 (16:30 +0200)]
Limit the number of inactive handles kept for reuse
Instead of growing and never shrinking the list of the inactive
handles (to be reused mostly on the UDP connections), limit the maximum
number of inactive handles kept to 64. Instead of caching
the inactive handles for all listening sockets, enable the caching only on
UDP listening sockets. For TCP, the handles were cached for each
accepted socket thus reusing the handles only for long-standing TCP
connections, but not reusing the handles across different TCP streams.
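The capping behaviour described above can be sketched as follows. This is
an illustrative Python model only, not the C code from the network manager;
the HandlePool class and its method names are hypothetical, and only the
limit of 64 comes from the commit message.

```python
MAX_INACTIVE_HANDLES = 64  # limit stated in the commit message


class HandlePool:
    """Hypothetical model of a bounded cache of inactive handles."""

    def __init__(self, max_inactive=MAX_INACTIVE_HANDLES):
        self.max_inactive = max_inactive
        self.inactive = []

    def get(self):
        # Reuse a cached handle when one is available, else create a new one.
        if self.inactive:
            return self.inactive.pop()
        return object()

    def put(self, handle):
        # Keep the handle for reuse only while the cache is below the cap;
        # otherwise drop it so the list cannot grow without bound.
        if len(self.inactive) < self.max_inactive:
            self.inactive.append(handle)
            return True
        return False
```

Returning a handle to a full pool simply discards it, which is what keeps
the list from growing and never shrinking.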
Tom Krizek [Wed, 16 Aug 2023 08:38:09 +0000 (10:38 +0200)]
Add clean-local target to clean pytest runner artifacts
The command finds all directories in bin/tests/system which contain an
underscore. Underscore indicates either a temporary directory (_tmp_), a
symlink to test artifacts (TESTNAME_MODULENAME), or a python-related
cache. Using underscore for a system test name is invalid and a hyphen
must be used instead.
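The selection rule above can be sketched in Python. This is not the
command used by the clean-local target; the function name is hypothetical,
and limiting the search to direct children of the directory is an
assumption.

```python
from pathlib import Path


def runner_artifacts(system_dir):
    """Return direct children of system_dir that are directories (or
    symlinks) and have an underscore in their name."""
    return sorted(
        entry
        for entry in Path(system_dir).iterdir()
        if "_" in entry.name and (entry.is_dir() or entry.is_symlink())
    )
```

Because valid system test names use hyphens, a directory such as
"rpz-passthru" would never be matched by this rule.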
Tom Krizek [Thu, 10 Aug 2023 14:53:10 +0000 (16:53 +0200)]
Silence pylint's refactoring suggestions for system_test_dir()
While it'd be fairly easy to split the function up into smaller ones,
the readability wouldn't be improved in this case. Silence the
suggestions instead.
Tom Krizek [Thu, 10 Aug 2023 14:14:08 +0000 (16:14 +0200)]
Create symlinks to test artifacts for pytest runner
While temporary directories are useful for test execution to keep
everything clean, they are difficult to work with manually. Create a
symlink for each test artifact directory with a stable and predictable
path. The symlink always either points to the latest artifacts, or is
missing in case the last run succeeded.
Ensure these symlinked directories aren't detected as test suites by the
pytest runner.
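A minimal sketch of the symlink handling, assuming the TESTNAME_MODULENAME
naming mentioned in the clean-local commit. The function name and its
parameters are hypothetical and do not reflect the runner's actual code.

```python
from pathlib import Path


def update_artifact_link(system_dir, test_name, module_name, artifact_dir=None):
    """Refresh the stable symlink for one test module.

    artifact_dir is the temporary directory kept from the latest run, or
    None when the run succeeded and no artifacts were kept.
    """
    link = Path(system_dir) / f"{test_name}_{module_name}"
    if link.is_symlink():
        link.unlink()  # drop the link left over from the previous run
    if artifact_dir is not None:
        link.symlink_to(artifact_dir)  # point at the latest artifacts
    return link
```

Removing the old link first is what guarantees that the path either points
to the latest artifacts or does not exist at all.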
Tom Krizek [Tue, 8 Aug 2023 11:23:20 +0000 (13:23 +0200)]
ci: run out-of-tree system tests with pytest runner
Out-of-tree builds are built in a directory that is different from
source directory. The build directory doesn't contain the non-compiled
test files from bin/tests/system which are the test cases required by
the pytest runner.
In order to run the system tests for out-of-tree build, copy over the
contents (tests) of bin/tests/system/ from the source directory into the
build directory. Then, it is possible to invoke the pytest runner inside
the build directory.
With the v9.19.16 release tag merged, the "cross-version-config-tests"
GitLab CI job will no longer fail due to the two relevant system tests
being absent from the development branch. This makes the pytest
filtering expression added to work around that issue unnecessary, so
remove it.
Michal Nowak [Tue, 15 Aug 2023 15:23:30 +0000 (17:23 +0200)]
Mark test_send_timeout as flaky
In some cases, BIND is not fast enough to fill the send buffer and
manages to answer all queries, contrary to what the test expects.
Repeat the check up to 3 times to limit this test instability.
Tom Krizek [Thu, 17 Aug 2023 08:30:46 +0000 (10:30 +0200)]
Add custom flaky decorator to handle unstable tests
If the flaky plugin for pytest is available, use its decorator to
support re-running unstable tests. In case the package is missing,
execute the test as usual without attempts to re-run it in case of
failure.
This is mostly intended to increase the test stability in CI. Using a
custom decorator enables us to keep the flaky package as an optional
dependency.
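The optional-dependency pattern described above might look like the
following sketch. It assumes the decorator is always called with arguments,
for example @flaky(max_runs=3); the module layout is an assumption and not
necessarily how the test suite organizes it.

```python
try:
    import flaky as flaky_pkg  # optional dependency
except ImportError:
    flaky_pkg = None


def flaky(*args, **kwargs):
    """Mark a test as flaky if the flaky plugin is installed."""
    if flaky_pkg is not None:
        # Delegate to the real decorator so failed tests get re-run.
        return flaky_pkg.flaky(*args, **kwargs)

    # Plugin missing: return the test function unchanged, so it runs once.
    def passthrough(func):
        return func

    return passthrough
```

A test would then be marked with @flaky(max_runs=3) and behaves like an
ordinary test on systems where the package is not installed.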
Michal Nowak [Thu, 3 Aug 2023 08:44:09 +0000 (10:44 +0200)]
Clean leftover files in autosign and masterformat
The following files were reported in CI by the legacy system test runner
and prevented the job from passing. They should be removed.
$ if git rev-parse > /dev/null 2>&1; then ( ! grep "^I:.*:file.*not removed$" *.log ); fi
autosign.log:I:autosign:file autosign/ns3/kskonly.example.db.jbk not removed
autosign.log:I:autosign:file autosign/ns3/optout.example.db.jbk not removed
autosign.log:I:autosign:file autosign/ns3/reconf.example.db.jbk not removed
masterformat.log:I:masterformat:file masterformat/ns1/signed.db.raw.jbk not removed
masterformat.log:I:masterformat:file masterformat/ns1/signed.db.raw.signed not removed
masterformat.log:I:masterformat:file masterformat/ns1/signed.db.raw.signed.jnl not removed
Don't print an error when the ns*/inactive directory is not
present:
rmdir: ns*/inactive: No such file or directory
Remove nsupdate.out.test file instead of nsupdate.out, as the latter
does not exist.
Ondřej Surý [Tue, 15 Aug 2023 15:29:27 +0000 (17:29 +0200)]
Attach to the dns_dispatchmgr in the dns_view object
The dns_dispatchmgr object was only set in the dns_view object making it
prone to use-after-free in the dns_xfrin unit when shutting down named.
Remove dns_view_setdispatchmgr() and optionally pass the dispatchmgr
directly to dns_view_create() where it is attached and not just assigned,
so the dns_dispatchmgr doesn't cease to exist too early.
The dns_view_getdnsdispatchmgr() is now protected by the RCU lock, the
dispatchmgr reference is incremented, so the caller needs to detach from
it, and the function can return NULL in case the dns_view has been
already shut down.
Instead of an RBT for the forwarders table, use a QP trie.
We now use reference counting for dns_forwarders_t. When a forwarders
object is retrieved by dns_fwdtable_find(), it must now be explicitly
detached by the caller afterward.
QP tries require stored objects to include their names, so
the forwarders object now has that. This obviates the need to
pass back a separate 'foundname' value from dns_fwdtable_find().
replace the red-black tree used by the negative trust anchor table
with a QP trie.
because of this change, dns_ntatable_init() can no longer fail, and
neither can dns_view_initntatable(). these functions have both been
changed to type void.
rename dns_qp_findname_parent() to _findname_ancestor()
this function finds the closest matching ancestor, but the function
name could be read to imply that it returns the direct parent node;
this commit suggests a slightly less misleading name.
Tony Finch [Thu, 6 Apr 2023 10:30:00 +0000 (11:30 +0100)]
A SET_IF_NOT_NULL() macro for optional return values
The SET_IF_NOT_NULL() macro avoids a fair amount of tedious boilerplate,
checking pointer parameters to see if they're non-NULL and updating
them if they are. The macro was already in the dns_zone unit, and this
commit moves it to the <isc/util.h> header.
I have included a Coccinelle semantic patch to use SET_IF_NOT_NULL()
where appropriate. The patch needs an #include in `openssl_shim.c`
in order to work.
Mark Andrews [Wed, 2 Aug 2023 06:16:30 +0000 (16:16 +1000)]
Add sleeps so that the modification time changes
The mkeys system test could fail because the root zone was re-signed
within the same second as it was previously signed, causing reloads
to fail. Add delays to the test to prevent this.
Revert the commit that always uses the OpenSSL 3.0 API when available.
The new APIs should always work, but OpenSSL has non-obvious
omissions in the automatic mappings it provides.
Mark Andrews [Mon, 7 Aug 2023 08:22:29 +0000 (18:22 +1000)]
Fix 'addr', 'ckresult' and 'drop' functions
'addr', 'ckresult' and 'drop' should return 0 rather than 1 after
calling 'setret' as the error has been logged and these functions
are not expected to fail.
Michal Nowak [Mon, 7 Aug 2023 16:28:34 +0000 (18:28 +0200)]
Exclude dupsigs and keymgr2kasp from cross-version-config-tests
pytest should not schedule dupsigs and keymgr2kasp system tests removed
in BIND 9 mainline but still present in BIND 9 baseline version
(v9.19.15). (Can be dropped once the v9.19.16 tag is present.)
Michal Nowak [Wed, 25 Jan 2023 20:38:56 +0000 (21:38 +0100)]
Cross-version testing with named configurations
In #3381 (and #3385), we committed a backward-incompatible change to
BIND 9.19.5, 9.18.7, and 9.16.33, explicitly requiring "inline-signing"
for every "dnssec-policy".
We made this backward-incompatible change deliberately, knowing the
consequences for users and their configurations. But had we not, say,
because we were unaware that this was a backward-incompatible change and
fixed the failing system tests by "tweaking a knob to make the CI pass",
nothing would have given us a second look before the change hit user
configurations.
"cross-version-config-tests" CI job is such a second look. It will run
system tests from the latest release tag specific to the particular
branch (e.g., v9.19.12 for the "main" branch) with BIND 9 binaries from
the current "HEAD" (the future v9.19.13). This Frankenstein build gets
conceived by altering the "TOP_BUILDDIR" variable in
"bin/tests/system/conf.sh".
Caveats:
- Only system test configurations are tested; no actual test code is
run.
- Problems with namedN.conf configurations are not identified.
When a backward-incompatible change is introduced, the CI job is expected
to fail. If the change is deliberate, the job will keep failing until
the version with the backward-incompatible change is tagged, and the
minor version in configure.ac is bumped.
Timo Teräs [Fri, 28 Jul 2023 10:15:48 +0000 (13:15 +0300)]
Fix OpenSSL 3.0 API EC curve names
The OpenSSL man page examples used the NIST curve names which
are supported. But when querying the name, the native OpenSSL
name is returned. Use these names to pass curve type checks for
engine/provider objects.
Michał Kępień [Tue, 11 Jul 2023 13:56:31 +0000 (15:56 +0200)]
Convert setup.pl into static configurations
The setup.pl script has been replaced with static BIND configurations,
and in the course of this change, the unused ns1 server was removed.
This enhancement has greatly improved the overall test's readability.
Michal Nowak [Tue, 9 May 2023 17:11:00 +0000 (19:11 +0200)]
Rewrite stress test to pytest
The shell version of the test was completed only after all DNS zone
updates were sent, even if the BIND server crashed while processing
them, leading to prolonged execution and potential hang in the CI
environment. The Python rewrite of the test ensures that DNS update
tasks finish within five minutes of starting, regardless of whether BIND
crashes or the DNS zone updates fail to finish in time.
Michał Kępień [Mon, 7 Aug 2023 09:26:58 +0000 (11:26 +0200)]
Lower the minimum expected dnstap output file size
Lower the size requirement for the dnstap output file produced during
the "dnstap" system test from 454 to 450 bytes; while files of that size
are not generated in any GitLab CI job, they are in other environments
where the test passes.
Michał Kępień [Mon, 7 Aug 2023 09:26:58 +0000 (11:26 +0200)]
Wait until fstrm_capture is ready
The fstrm_capture utility is started in the background during the
"dnstap" system test. Consequently, "rndc dnstap-reopen" and similar
commands may be executed before fstrm_capture starts listening on the
Unix domain socket it is configured to receive dnstap data on. This
results in the dnstap data sent to that socket in the meantime being
lost; while the fstrm writer thread is able to recover from such a
scenario within a couple of seconds (by reopening the configured dnstap
destination itself), only one write attempt is made for data
successfully queued to the writer thread, so dnstap frames can still be
lost in the process. This may happen during the "dnstap" system test,
leading to the dnstap output file being empty, which in turn causes the
test to fail.
Fix by waiting until fstrm_capture starts listening on the Unix domain
socket it is configured to use before asking named to reopen the
configured dnstap destination. Since various fstrm_capture versions log
different messages when the listening socket is set up, wait for a
common string that works for all fstrm_capture versions released to
date. Add a few extra debug messages indicating test progress and make
the test fail if the expected fstrm_capture log message is not generated
within 10 seconds.
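The wait-with-timeout logic can be sketched as below. The test itself is a
shell script, so this Python version is only an illustration; the function
name, the polling interval, and the marker string passed by the caller are
assumptions, and only the 10-second limit comes from the commit message.

```python
import time
from pathlib import Path


def wait_for_log(path, marker, timeout=10.0, interval=0.1):
    """Return True once marker appears in the file at path, or False if it
    has not appeared by the time timeout seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            if marker in Path(path).read_text(errors="replace"):
                return True
        except FileNotFoundError:
            pass  # the process may not have created its log file yet
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

The caller would fail the test when the function returns False, and only
ask named to reopen the dnstap destination after it returns True.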
Michał Kępień [Mon, 7 Aug 2023 09:26:58 +0000 (11:26 +0200)]
Capture all fstrm_capture output
The fstrm_capture.out file is overwritten when the fstrm_capture utility
is restarted during the "dnstap" system test. Use a separate output
file for each fstrm_capture instance to ensure all output produced by
that tool during the "dnstap" system test is preserved for forensic
purposes.
Mark Andrews [Sun, 6 Aug 2023 23:38:56 +0000 (09:38 +1000)]
Set ret=1 if _wait_for_stats does not succeed
Errors getting transfer statistics from named.run were not detected
as ret was not set to one if there hadn't been a success after looping
for a while.
This means that if you use TTL values larger than 1 day in your zone,
your zone runs the risk of going bogus before it moves safely to
insecure.
Most resolvers (Unbound, Knot, PowerDNS) by default cap the TTL of the
RRsets they cache at one day, so that is fine. However, BIND 9's
default is one week.
Change the default TTLsig to one week, so that responses for zones that
are going insecure will not be evaluated as bogus by BIND 9 resolvers
running with default settings either.
This change does mean that when unsigning your zone, it will take six
days longer to safely go insecure, regardless of what TTL values you
use in the zone.