Ondřej Surý [Fri, 22 Nov 2024 14:10:26 +0000 (15:10 +0100)]
Remove dns_badcache usage in the resolver (lame-ttl)
The lame-ttl processing was overriden to be disabled in the config,
but the code related to the lame-ttl was still kept in the resolver
code. More importantly, the DNS_RESOLVER_BADCACHETTL() macro would
cause the entries in the resolver badcache to be always cached for at
least 30 seconds even if the lame-ttl would be set to 0.
Remove the dns_badcache code from the dns_resolver unit, so we save some
processing time and memory in the resolver code.
Ondřej Surý [Thu, 14 Nov 2024 18:51:29 +0000 (19:51 +0100)]
Improve the badcache cleaning by adding LRU and using RCU
Instead of cleaning the dns_badcache opportunistically, add per-loop
LRU, so each thread-loop can clean the expired entries. This also
allows removal of the atomic operations as the badcache entries are now
immutable, instead of updating the badcache entry in place, the old
entry is now deleted from the hashtable and the LRU list, and the new
entry is inserted in the LRU.
Arаm Sаrgsyаn [Sat, 16 Nov 2024 19:51:25 +0000 (19:51 +0000)]
[9.20] fix: dev: Fix a data race between dns_zone_getxfr() and dns_xfrin_create()
There is a data race between the statistics channel, which uses
`dns_zone_getxfr()` to get a reference to `zone->xfr`, and the creation
of `zone->xfr`, because the latter happens outside of a zone lock.
Split the `dns_xfrin_create()` function into two parts to separate the
zone transfer starting part from the zone transfer object creation part.
This allows us to attach the new object to a local variable first, then
attach it to `zone->xfr` under a lock, and only then start the transfer.
Closes #5011
Backport of MR !9716
Merge branch 'backport-5011-dns_zone_getxfr-race-fix-9.20' into 'bind-9.20'
Aram Sargsyan [Tue, 5 Nov 2024 10:28:37 +0000 (10:28 +0000)]
Fix a data race between dns_zone_getxfr() and dns_xfrin_create()
There is a data race between the statistics channel, which uses
`dns_zone_getxfr()` to get a reference to `zone->xfr`, and the creation
of `zone->xfr`, because the latter happens outside of a zone lock.
Split the `dns_xfrin_create()` function into two parts to separate the
zone tranfer startring part from the zone transfer object creation part.
This allows us to attach the new object to a local variable first, then
attach it to `zone->xfr` under a lock, and only then start the transfer.
Ondřej Surý [Fri, 15 Nov 2024 15:54:59 +0000 (15:54 +0000)]
[9.20] fix: nil: Add OpenSSL includes as needed
The isc/crypto.h now directly includes the OpenSSL headers (evp.h) and
any application that includes that header also needs to have
OPENSSL_CFLAGS in the Makefile.am. Adjust the required automake files
as needed.
Backport of MR !9713
Merge branch 'backport-ondrej/add-missing-OPENSSL_CFLAGS-9.20' into 'bind-9.20'
Ondřej Surý [Mon, 4 Nov 2024 15:13:45 +0000 (15:13 +0000)]
Add OpenSSL includes as needed
The isc/crypto.h now directly includes the OpenSSL headers (evp.h) and
any application that includes that header also needs to have
OPENSSL_CFLAGS in the Makefile.am. Adjust the required automake files
as needed.
Nicki Křížek [Thu, 7 Nov 2024 13:06:53 +0000 (14:06 +0100)]
Move Known Issues to BIND9 wiki
Keeping the Known Issues as part of the rendered docs has the issue that
the list can't be updated on the official docs website until the next
release. This is unpractical is a high-priority issue is discovered
shortly after a release. Keep the Known Issues in wiki and simply link
to the list from the rendered docs. The wiki article can be updated at
any time as needed.
Ondřej Surý [Wed, 13 Nov 2024 08:52:08 +0000 (08:52 +0000)]
[9.20] fix: usr: Fix race condition when canceling ADB find
When canceling the ADB find, the lock on the find gets released for
a brief period of time to be locked again inside adbname lock. During
the brief period that the ADB find is unlocked, it can get canceled by
other means removing it from the adbname list which in turn causes
assertion failure due to a double removal from the adbname list.
This has been fixed.
Closes #5024
Backport of MR !9722
Merge branch 'backport-5024-fix-crash-in-dns_adb_cancelfind-9.20' into 'bind-9.20'
Ondřej Surý [Tue, 5 Nov 2024 13:59:16 +0000 (14:59 +0100)]
Revalidate the adbname when canceling the ADB find
When canceling the ADB find, the lock on the find gets released for
a brief period of time to be locked again inside adbname lock. During
the brief period that the ADB find is unlocked, it can get canceled by
other means removing it from the adbname list which in turn causes
assertion failure due to a double removal from the adbname list.
Recheck if the find->adbname is still valid after acquiring the lock
again and if not just skip the double removal. Additionally, attach to
the adbname as in the worst case, the adbname might also cease to exist
if the scheduler would block this particular thread for a longer period
of time invalidating the lock we are going to acquire and release.
Nicki Křížek [Tue, 12 Nov 2024 09:07:02 +0000 (10:07 +0100)]
Ensure pytest runner get proper outcome from flaky reruns
When a test is re-run by the flaky plugin, the TestReport outcomes
collected in the pytest_runtest_makereport() hook should be overriden.
Each of the setup/call/teardown phases is reported again and since we
care about the overall outcome, their respective results should be
overriden so that only the outcome from the final test (re)run gets
reported.
Prior to this change, it lead to a situation where an extra_artifact
generated during the test might be ignored. This was caused because the
check was skipped, since the test was incorrectly considered as "failed"
in the case where the test would fail on the first run, but pass on a
subsequent flaky rerun.
Nicki Křížek [Fri, 8 Nov 2024 15:24:07 +0000 (15:24 +0000)]
[9.20] chg: dev: Use lists of expected artifacts in system tests
``clean.sh`` scripts have been replaced by lists of expected artifacts for each system test module. The list is defined using the custom ``pytest.mark.extra_artifacts`` mark, which can use both filenames and globs.
Closes #4261
Backport of MR !9426
Merge branch 'backport-4261-add-pytest-fixture-checking-test-artifacts-9.20' into 'bind-9.20'
Michał Kępień [Mon, 19 Aug 2024 16:49:08 +0000 (18:49 +0200)]
Add pytest fixture for checking test artifacts
Prior to introducing the pytest runner, clean.sh files were used as a
list of files that the test is expected to leave around as artifacts and
check that no extra files were created.
With the pytest runner, those scripts are no longer used, but the
ability to detect extraneous files is still useful. Add a new
"extra_artifacts" mark which can be used for the same purpose.
Mark Andrews [Mon, 4 Nov 2024 02:16:52 +0000 (02:16 +0000)]
[9.20] chg: usr: dnssec-ksr now supports KSK rollovers
The tool 'dnssec-ksr' now allows for KSK generation, as well as planned KSK rollovers. When signing a bundle from a Key Signing Request (KSR), only the key that is active in that time frame is being used for signing. Also, the CDS and CDNSKEY records are now added and removed at the correct time.
Closes #4697
Closes #4705
Backport of MR !9452
Merge branch 'backport-4705-dnssec-ksr-only-sign-with-active-ksks-9.20' into 'bind-9.20'
dnssec-ksr can now sign KSR files with multiple KSKs. A planned KSK
rollover is supported, meaning the KSR will first be signed with
one KSK and later with another. The timing metadata for CDS and
CDNSKEY records are also taken into account, so these records are
only published when the time is between "SyncPublish" and "SyncDelete".
Add a test case for Offline KSK where during the lifespan of the Signed
Key Response a KSK rollover happens. Ensure that the correct DNSKEY,
CDNSKEY, and CDS records are published at the right times.
When the zone is initially signed, the CDNSKEY/CDS RRset is not
immediately published. The DNSKEY and signatures must propagate first.
Adjust the test to allow for this case.
Add an option to dnssec-ksr keygen, -o, to create KSKs instead of ZSKs.
This way, we can create a set of KSKS for a given period too.
For KSKs we also need to set timing metadata, including "SyncPublish"
and "SyncDelete". This functionality already exists in keymgr.c so
let's make the function accessible.
Replace dnssec-keygen calls with dnssec-ksr keygen for KSK in the
ksr system test and check keys for created KSKs as well. This requires
a slight modification of the check_keys function to take into account
KSK timings and metadata.
[9.20] chg: test: Match algorithms when checking signatures
In the ksr system test, the 'test_ksr_twotone' case may fail if there are two keys with the same keytag (but different algorithms), because one key is expected to be signing and the other is not.
Switch to regular expression matching and include the algorithm in the search string.
Closes #5017
Backport of MR !9701
Merge branch 'backport-5017-unexpected-match-ksr-twotone-again-9.20' into 'bind-9.20'
Matthijs Mekking [Thu, 31 Oct 2024 10:25:23 +0000 (11:25 +0100)]
Match algorithms when checking signatures
In the ksr system test, the test_ksr_twotone case may fail if there
are two keys with the same keytag (but different algorithms), because
one key is expected to be signing and the other is not.
Switch to regular expression matching and include the algorithm in the
search string.
Michal Nowak [Thu, 31 Oct 2024 18:11:43 +0000 (18:11 +0000)]
[9.20] fix: doc: Remove the CHANGES file
With the introduction of the generated changelog, the CHANGES file
became a symlink to doc/arm/changelog.rst. After the changes made in
!9549, the changelog file transitioned from being a wholly generated
file to one that includes versioned changelog files, which are
themselves generated. However, while implementing !9549, we overlooked
that the CHANGES file is copied to a release directory on an FTP server
and contains just "include" directives, not the changelog itself.
Therefore, in the same fashion as the "RELEASE-NOTES*.html" file, create
a "CHANGELOG*.html" file that redirects to the Changelog appendix of the
ARM.
Closes #5000
Backport of MR !9690
Merge branch 'backport-5000-provide-correct-changelog-on-ftp-9.20' into 'bind-9.20'
Michal Nowak [Thu, 24 Oct 2024 15:01:57 +0000 (17:01 +0200)]
Remove the CHANGES file
With the introduction of the generated changelog, the CHANGES file
became a symlink to doc/arm/changelog.rst. After the changes made in
!9549, the changelog file transitioned from being a wholly generated
file to one that includes versioned changelog files, which are
themselves generated. However, while implementing !9549, we overlooked
that the CHANGES file is copied to a release directory on an FTP server
and contains just "include" directives, not the changelog itself.
Therefore, in the same fashion as the "RELEASE-NOTES*.html" file, create
a "CHANGELOG*.html" file that redirects to the Changelog appendix of the
ARM.
Nicki Křížek [Thu, 31 Oct 2024 12:13:09 +0000 (12:13 +0000)]
[9.20] new: dev: Support jinja2 templates in pytest runner
Configuration files in system tests which require some variables (e.g.
port numbers) filled in during test setup, can now use jinja2 templates
when `jinja2` python package is available.
Any `*.j2` file found within the system test directory will be
automatically rendered with the environment variables into a file
without the `.j2` extension by the pytest runner. E.g.
`ns1/named.conf.j2` will become `ns1/named.conf` during test setup. To
avoid automatic rendering, use `.j2.manual` extension and render the
files manually at test time.
New `templates` pytest fixture has been added. Its `render()` function
can be used to render a template with custom test variables. This can be
useful to fill in different config options during the test. With
advanced jinja2 template syntax, it can also be used to include/omit
entire sections of the config file rather than using `named1.conf.in`,
`named2.conf.in` etc.
Closes #4938
Backport of MR !9587
Merge branch 'backport-4938-use-jinja2-templates-in-system-tests-9.20' into 'bind-9.20'
Nicki Křížek [Tue, 1 Oct 2024 12:45:36 +0000 (14:45 +0200)]
Support jinja2 templates in pytest runner
Configuration files in system tests which require some variables (e.g.
port numbers) filled in during test setup, can now use jinja2 templates
when `jinja2` python package is available.
Any `*.j2` file found within the system test directory will be
automatically rendered with the environment variables into a file
without the `.j2` extension by the pytest runner. E.g.
`ns1/named.conf.j2` will become `ns1/named.conf` during test setup. To
avoid automatic rendering, use `.j2.manual` extension and render the
files manually at test time.
New `templates` pytest fixture has been added. Its `render()` function
can be used to render a template with custom test variables. This can be
useful to fill in different config options during the test. With
advanced jinja2 template syntax, it can also be used to include/omit
entire sections of the config file rather than using `named1.conf.in`,
`named2.conf.in` etc.
Nicki Křížek [Mon, 14 Oct 2024 17:32:39 +0000 (19:32 +0200)]
Make changelog audience mandatory
Use a stricter hazard check which ensures the audience tag is present in
the MR title and is one of the known values. This prevents siuations
where incorrect audience is accidentally used, resulting in a missing
changelog entry or a release note.
Coverity Scan reported a new issue for the ksr system test. There is allegedly a null pointer dereference (FORWARD_NULL) in check_keys().
This popped up because previously we set 'retired' to 0 in case of unlimited lifetime, but we changed it to None.
It is actually a false positive, because if lifetime is unlimited there will be only one key in 'keys'.
However, the code would be better if we always initialized 'active' and if it is not the first key and retired is set, set the successor key's active time to the retire time of the predecessor key.
Closes #5004
Backport of MR !9687
Merge branch 'backport-5004-cid-510858-ksr-check-keys-9.20' into 'bind-9.20'
Matthijs Mekking [Thu, 24 Oct 2024 12:03:58 +0000 (14:03 +0200)]
Fix CID 510858: Null ptr derefs in check_keys
Coverity Scan reported a new issue for the ksr system test. There
is allegedly a null pointer dereference (FORWARD_NULL) in check_keys().
This popped up because previously we set 'retired' to 0 in case of
unlimited lifetime, but we changed it to None.
It is actually a false positive, because if lifetime is unlimited
there will be only one key in 'keys'.
However, the code would be better if we always initialized 'active'
and if it is not the first key and retired is set, set the successor
key's active time to the retire time of the predecessor key.
Matthijs Mekking [Thu, 24 Oct 2024 12:30:51 +0000 (14:30 +0200)]
Fix intermittent ksr test failure
The test_ksr_twotwone may fail if the key id is shorter than 5 digits.
Add a leading space to the expected strings which start with the key
tag to avoid the issue.
Nicki Křížek [Mon, 27 May 2024 14:10:04 +0000 (16:10 +0200)]
Make system tests compatible with pytest 8.0.0+
The pytest collection mechanism has been overhauled in pytest 8.0.0,
resulting in a different node tree when collecting the tests. Ensure the
paths / names we're using that are derived from the node tree are
consistent across different pytest versions.
Particularly, this has affected the convenience symlink name (which is
supposed to be in the form of e.g. dns64_sh_dns64 for the dns64 module
and tests_sh_dns64.py module) and the test name that's logged at the
start of the test, which is supposed to include the system test
directory relative to the root system test directory as well as the
module name (e.g. dns64/tests_sh_dns64.py).
Related https://github.com/pytest-dev/pytest/issues/7777
Mark Andrews [Thu, 24 Oct 2024 06:01:40 +0000 (06:01 +0000)]
[9.20] fix: usr: Use TLS for notifies if configured to do so
Notifies configured to use TLS will now be sent over TLS, instead of plaintext UDP or TCP.
Also, failing to load the TLS configuration for notify now also results in an error.
Closes #4821
Backport of MR !9407
Merge branch 'backport-4821-notify-over-tls-9.20' into 'bind-9.20'
Mark Andrews [Tue, 15 Oct 2024 05:09:48 +0000 (16:09 +1100)]
Fix TCP dispatches and transport
Dispatch needs to know the transport that is being used over the
TCP connection to correctly allow for it to be reused. Add a
transport parameter to dns_dispatch_createtcp and dns_dispatch_gettcp
and use it when selecting a TCP socket for reuse.
Nicki Křížek [Tue, 22 Oct 2024 11:23:03 +0000 (11:23 +0000)]
[9.20] fix: test: Use UTC timezone when handling keys in kasp test library
When working with key timestamps, ensure we correctly set the UTC
timezone in order for the tests to work consistently regardless of the
local time setting.
Closes #4999
Backport of MR !9673
Merge branch 'backport-4999-pytest-kasp-use-utc-timezone-9.20' into 'bind-9.20'
Nicki Křížek [Mon, 21 Oct 2024 10:18:43 +0000 (12:18 +0200)]
Set TZ to Australia/Sydney for bookworm CI job
Use a different timezone via the TZ variable in at least one of the
system test jobs in order to detect possible issues with timezone
handling in python.
Nicki Křížek [Mon, 21 Oct 2024 10:08:52 +0000 (12:08 +0200)]
Use UTC timezone when handling keys in kasp test library
When working with key timestamps, ensure we correctly set the UTC
timezone in order for the tests to work consistently regardless of the
local time setting.
Nicki Křížek [Mon, 14 Oct 2024 12:44:06 +0000 (14:44 +0200)]
Disable too-many/too-few pylint checks
Enforcing pylint standards and default for our test code seems
counter-productive. Since most of the newly added code are tests or is
test-related, encountering these checks rarely make us refactor the code
in other ways and we just disable these checks individually. Code that
is too complex or convoluted will be pointed out in reviews anyways.
Move all test cases from tests.sh to tests_ksr.py. The only test that
is not moved is the check that key id's match expected keys. The
shell-based system test checks two earlier set environment variables
against each other that has become redundant in the pytest variant,
because we now check the signed key response against a list of keys
and for each key we take into account the timing metadata. So we
already ensure that each published key is in the correct key bundle.
Write initial pytest kasp library. This contains everything that is
required for testing Offline KSK functionality with pytest.
This includes:
- addtime: adding a value to a timing metadata
- get_timing_metdata: retrieve timing metadata from keyfile
- get_metadata/get_keystate: retrieve metadata from statefile
- get_keytag: retrieve keytag from base keyfile string
- get_keyrole: get key role from statefile
- dnskey_equals: compare DNSKEY record from file against a string
- cds_equals: compare CDS derived from file against a string
- zone_is_signed: wait until a zone is completely signed
- dnssec_verify: verify a DNSSEC signed zone with dnssec-verify
- check_dnssecstatus: check rndc dnssec -status output
- check_signatures: check that signatures for a given RRset are correct
- check_dnskeys: check that the published DNSKEY RRset is correct
- check_cds: check that the published CDS RRset is correct
- check_apex: check SOA, DNSKEY, CDNSKEY, and CDS RRset
- check_subdomain: check an RRset below the apex
Mark Andrews [Mon, 21 Oct 2024 00:34:32 +0000 (11:34 +1100)]
Fix parsing of hostnames in rndc.conf
When DSCP was removed the parsing of hostnames was accidentally
broken resulting in an assertion failure. Call cfg_parse_tuple
rather than using custom code in parse_sockaddrnameport.