Ondřej Surý [Mon, 23 Jan 2023 12:40:19 +0000 (13:40 +0100)]
Enforce receive_secure_serial() and setnsec3param() serialization
Both receive_secure_serial() and setnsec3param() run on the same zone
loop, therefore they are serialized. Remove the mechanism to enqueue
the nsec3param and secure serial updates in case one of them is
running (as they can not) and replace it with sanity check.
Ondřej Surý [Mon, 23 Jan 2023 11:13:43 +0000 (12:13 +0100)]
Replace the dns_io_t mechanism with offloaded threads
Previously, the zone loading and dumping was effectively serialized by
the dns_io_t mechanism. In theory, more IO operations could be run in
parallel, but the zone manager .iolimit was set to 1 and never increased
as dns_zonemgr_setiolimit() was never ever called.
As the dns_master asynchronous load and dump was already offloaded to
non-worker threads with isc_work mechanism, drop the whole dns_io_t
and just rely on the isc_work to do the load and dump scheduling.
Ondřej Surý [Thu, 12 Jan 2023 20:44:31 +0000 (21:44 +0100)]
Allow interrupting dnssec-signzone during signing
The signal handler in the isc_loop would wait for all the work to finish
before interrupting the signing. Add teardown handlers via
isc_loopmgr_teardown() to signal the assignwork() it should stop signing
and bail-out early.
NOTE: The dnssec-signzone binary still can't be interrupted during zone
loading, zone cleaning, nsec(3) chain generation or zone writing. This
might get addressed in the future if it becomes a problem.
Ondřej Surý [Thu, 12 Jan 2023 15:36:43 +0000 (16:36 +0100)]
Dump the signed zone in the text format at the end of dnssec-signzone
Instead of dumping the signed zone contents node by node during the
signing, dump the entire zone at the end. This was already done for the
raw zone format, but it shows that the IO is better utilized when the
zone dump is done in one single write rather than in small chunks.
A side effect of dumping node by node was that all names were printed
relative to the zone origin rather than being grouped under different
$ORIGINs as would normally be the case when dumping a zone. Also, state
was not maintained from one node to the next regarding whether the CLASS
has already been printed, so it was always included with the first
record of each node.
Since dnssec-signzone uses the dns_master_style_explicittl text format
style, and is the only application that does so, we can revise that
style and add a new DNS_STYLEFLAG_CLASS_PERNAME flag to get the output
back to what it was before this change.
Aram Sargsyan [Thu, 8 Dec 2022 10:29:15 +0000 (10:29 +0000)]
Test query forwarding to DoT-enabled upstream servers
Change the 'forward' system test to enable DoT on ns2 server,
and test that forwarding from ns4 to the DoT-enabled ns2 works.
In order to test different scenarios, create a test CA (based on
similar CAs for 'doth' and 'nsupdate' system tests), and test
both insecure (no certificate validation) and secure (also with
mutual TLS) TLS configurations, as well as a configuration with an
expired certificate.
Aram Sargsyan [Thu, 8 Dec 2022 10:57:37 +0000 (10:57 +0000)]
Add 'tls' configuration support for the 'forwarders' option
A 'tls' statement can be specified both for individual addresses
and for the whole list (as a default value when an individual
address doesn't have its own 'tls' set), just as it was done
before for the 'port' value.
Create a new function 'print_rawqstring()' to print a string residing
in a 'isc_textregion_t' type parameter.
Create a new function 'copy_string()' to copy a string from a
'cfg_obj_t' object into a 'isc_textregion_t'.
The old code didn't handle race conditions and errors on systems
with non load balancing sockets gracefully. Look for an error on
any child socket and if found close all the child sockets and return
an error.
The old code didn't handle race conditions and errors on systems
with non load balancing sockets gracefully. Look for an error on
any child socket and if found close all the child sockets and return
an error.
Artem Boldariev [Mon, 16 Jan 2023 16:31:08 +0000 (18:31 +0200)]
Fix building BIND on DragonFly BSD (on both older an newer versions)
This commit ensures that BIND and supplementary tools still can be
built on newer versions of DragonFly BSD. It used to be the case, but
somewhere between versions 6.2 and 6.4 the OS developers rearranged
headers and moved some function definitions around.
Before that the fact that it worked was more like a coincidence, this
time we, at least, looked at the related man pages included with the
OS.
No in depth testing has been done on this OS as we do not really
support this platform - so it is more like a goodwill act. We can,
however, use this platform for testing purposes, too. Also, we know
that the OS users do use BIND, as it is included in its ports
directory.
Building with './configure' and './configure --without-jemalloc' have
been fixed and are known to work at the time the commit is made.
Mark Andrews [Wed, 18 Jan 2023 04:54:42 +0000 (15:54 +1100)]
Add missing node lock when setting node->wild in rbtdb.c
The write node lock needs to be held when setting node->wild in
add_wildcard_magic except when being called from loading_addrdataset
which is used to load the zone without locking during its initial
load.
Aram Sargsyan [Wed, 18 Jan 2023 10:36:34 +0000 (10:36 +0000)]
Refactor isc_nm_xfr_allowed()
Return 'isc_result_t' type value instead of 'bool' to indicate
the actual failure. Rename the function to something not suggesting
a boolean type result. Make changes in the places where the API
function is being used to check for the result code instead of
a boolean value.
Matthijs Mekking [Fri, 13 Jan 2023 13:13:59 +0000 (14:13 +0100)]
Add checkds test case with resolver parental-agent
Add a test case for a server that uses a resolver as an parental-agent.
We need two root servers, ns1 and ns10, one that delegates to the
'checkds' tld with the DS published (ns2), and one that delegates to
the 'checkds' tld with the DS removed (ns5). Both root zones are
being setup in the 'ns1/setup.sh' script.
We also need two resolvers, ns3 and ns8, that use different root hints
(one uses ns1 address as a hint, the other uses ns10).
Then add the checks to test_checkds.py is similar to the existing tests.
Update 'types' because for zones that have the DS withdrawn (or to be
withdrawn), the CDS and CDNSKEY records should not be published and
thus should not be in the NSEC bitmap.
Ondřej Surý [Thu, 19 Jan 2023 08:14:53 +0000 (09:14 +0100)]
Detach the zone views outside of the zone lock
Detaching the views in the zone_shutdown() could lead to
lock-order-inversion between adb->namelocks[bucket], adb->lock,
view->lock and zone->lock. Detach the views outside of the section that
zone-locked.
Ondřej Surý [Thu, 19 Jan 2023 09:00:42 +0000 (10:00 +0100)]
Add python3-ply to GitHub CodeQL configuration
BIND 9.16 needs Python and PLY packages for configure to succeed.
Unless we want to tweak the build script to exclude python, we need to
add python3-ply package to the CodeQL configuration.
Ondřej Surý [Wed, 18 Jan 2023 21:38:27 +0000 (22:38 +0100)]
Use thread_local EVP_MD in isc_iterated_hash()
Cherry-pick small fixup commit from 9.18/9.16 branches needed for
thread-safety. This fixup commit is not needed for 9.19+ because of
reworked application setup, but it decouples isc_iterated_hash and
isc_md units and keeps all the branches in sync.
Ondřej Surý [Mon, 16 Jan 2023 10:12:06 +0000 (11:12 +0100)]
Use thread_local EVP_MD_CTX in isc_iterated_hash()
As this code is on hot path (NSEC3) this introduces an additional
optimization of the EVP_MD API - instead of calling EVP_MD_CTX_new() on
every call to isc_iterated_hash(), we create two thread_local objects
for each thread - a basectx and mdctx, initialize basectx once and then
use EVP_MD_CTX_copy_ex() to flip the initialized state into mdctx. This
saves us couple more valuable microseconds from the isc_iterated_hash()
call.
Ondřej Surý [Mon, 16 Jan 2023 11:56:53 +0000 (12:56 +0100)]
Use OpenSSL 1.x SHA_CTX API in isc_iterated_hash()
If the OpenSSL SHA1_{Init,Update,Final} API is still available, use it.
The API has been deprecated in OpenSSL 3.0, but it is significantly
faster than EVP_MD API, so make an exception here and keep using it
until we can't.
Ondřej Surý [Mon, 16 Jan 2023 10:12:06 +0000 (11:12 +0100)]
Use OpenSSL EVP_MD API directly in isc_iterated_hash()
Instead of going through another layer, use OpenSSL EVP_MD API directly
in the isc_iterated_hash() implementation. This shaves off couple of
microseconds in the microbenchmark.
Ondřej Surý [Mon, 16 Jan 2023 08:16:35 +0000 (09:16 +0100)]
Avoid implicit algorithm fetch for OpenSSL EVP_MD family
The implicit algorithm fetch causes a lock contention and significant
slowdown for small input buffers. For more details, see:
https://github.com/openssl/openssl/issues/19612
Instead of using EVP_DigestInit_ex() initialize empty MD_CTX objects for
each algorithm and use EVP_MD_CTX_copy_ex() to initialize MD_CTX from a
static copy. Additionally avoid implicit algorithm fetching by using
EVP_MD_fetch() for OpenSSL 3.0.
Evan Hunt [Wed, 11 Jan 2023 21:41:48 +0000 (13:41 -0800)]
remove dead code for reserved dispatches
named formerly reserved a set of dispatch objects for use when
sending requests from user-specified source ports. this objects
are no longer used and have been removed.
Evan Hunt [Sat, 7 Jan 2023 01:01:06 +0000 (17:01 -0800)]
mark "port" as deprecated for source address options
Deprecate the use of "port" when configuring query-source(-v6),
transfer-source(-v6), notify-source(-v6), parental-source(-v6),
etc. Also deprecate use-{v4,v6}-udp-ports and avoid-{v4,v6}udp-ports.
Ondřej Surý [Tue, 17 Jan 2023 06:21:34 +0000 (07:21 +0100)]
Commit the change of view for view->managed_keys
When we change the view in the view->managed_keys, we never commit the
change, keeping the previous view possibly attached forever.
Call the dns_zone_setviewcommit() immediately after changing the view as
we are detaching the previous view anyway and there's no way to recover
from that.
Ondřej Surý [Tue, 17 Jan 2023 06:18:16 +0000 (07:18 +0100)]
Detach the views in zone_shutdown(), not in zone_free()
The .view (and possibly .prev_view) would be kept attached to the
removed zone until the zone is fully removed from the memory in
zone_free(). If this process is delayed because server is busy
something else like doing constant `rndc reconfig`, it could take
seconds to detach the view, possibly keeping multiple dead views in the
memory. This could quickly lead to a massive memory bloat.
Release the views early in the zone_shutdown() call, and don't wait
until the zone is freed.
Artem Boldariev [Thu, 12 Jan 2023 18:09:51 +0000 (20:09 +0200)]
XoT: properly handle the case when checking for ALPN failed
During XoT it is important to check for "dot" ALPN tag to be
negotiated (according to the RFC 9103). We were doing that, however, the
situation was not handled properly, leading to non-cancelled zone
transfers that would crash (abort()) BIND on shutdown.
In this particular case 'result' might equal 'ISC_R_SUCCESS'. When
this is the case, the part of the code supposed to handle failures
will not cancel the zone transfer.
This situation cannot happen when BIND is a secondary of other BIND
instance. Only primaries following the RFC not closely enough could
trigger such a behaviour.
Tom Krizek [Tue, 17 Jan 2023 13:18:22 +0000 (14:18 +0100)]
Fix feature detection for pytest markers in tests
The condition was accidentally reversed during refactoring in 9730ac4c5691c36d58c06deec1762a4831b268c5 . It would result in skipped
tests on builds with proper support and false negatives on builds
without proper feature support.
Credit for reporting the issue and the fix goes to Stanislav Levin.
Tom Krizek [Fri, 6 Jan 2023 14:08:27 +0000 (15:08 +0100)]
Update the TEST_PARALLEL_JOBS value in CI
The authoritative source for this value is in the project's CI/CD
Variables Setting. The reason to keep it in .gitlab-ci.yaml as well is
to have functional testing in forks without the need to manually specify
this variable in Settings.
The tests have been executed with 4 jobs for some time now. This
"change" only brings .gitlab-ci.yaml file up to date, it doesn't
actually change the number of jobs we currently use to test.
Tom Krizek [Mon, 2 Jan 2023 16:54:58 +0000 (17:54 +0100)]
Look for ifconfig.sh.in in testsock.pl parent dir
Instead of using the current working directory to find the ifconfig.sh
script, look for the ifconfig.sh.in template in the directory where the
testsock.pl script is located. This enables the testsock.pl script to be
called from any working directory.
Using the ifconfig.sh.in template is sufficient, since it contains
the necessary information to be extracted: the max= value (which is
hard-coded in the template).
Tom Krizek [Tue, 20 Dec 2022 14:09:48 +0000 (15:09 +0100)]
Factor out script to handle system test core dumps
Move the core dump detection functionality for system test runs into a
separate script. This enables reuse by the pytest runner. The
functionality remains the same.
Tom Krizek [Mon, 19 Dec 2022 16:44:35 +0000 (17:44 +0100)]
testcrypto.sh: run in TMPDIR if possible
Avoid creating any temporary files in the current workdir.
Additional/changing files in the bin/tests/system directory are
problematic for pytest/xdist collection phase, which assumes the list of
files doesn't change between the collection phase of the main pytest
thread and the subsequent collection phase of the xdist worker threads.
Since the testcrypto.sh is also called during pytest initialization
through conf.sh.common (to detect feature support), this could
occasionally cause a race condition when the list of files would be
different for the main pytest thread and the xdist worker.
Aram Sargsyan [Mon, 14 Nov 2022 12:18:06 +0000 (12:18 +0000)]
Cancel all fetch events in dns_resolver_cancelfetch()
Although 'dns_fetch_t' fetch can have two associated events, one for
each of 'DNS_EVENT_FETCHDONE' and 'DNS_EVENT_TRYSTALE' types, the
dns_resolver_cancelfetch() function is designed in a way that it
expects only one existing event, which it must cancel, and when it
happens so that 'stale-answer-client-timeout' is enabled and there
are two events, only one of them is canceled, and it results in an
assertion in dns_resolver_destroyfetch(), when it finds a dangling
event.
Change the logic of dns_resolver_cancelfetch() function so that it
cancels both the events (if they exist), and in the right order.