Evan Hunt [Thu, 5 Oct 2023 01:14:55 +0000 (18:14 -0700)]
clean up dns_rbt
- create_node() in rbt.c cannot fail
- the dns_rbt_*name() functions, which are wrappers around
dns_rbt_[add|find|delete]node(), were never used except in tests.
this change isn't really necessary since RBT is likely to go away
eventually anyway. but keeping the API as simple as possible while it
persists is a good thing, and may reduce confusion while QPDB is being
developed from RBTDB code.
Evan Hunt [Thu, 5 Oct 2023 00:49:51 +0000 (17:49 -0700)]
move DNS_RBT_NSEC_* to db.h
these values pertain to whether a node is in the main, nsec, or nsec3
tree of an RBTDB. they need to be moved to a more generic location so
they can also be used by QPDB.
(this is in db.h rather than db_p.h because rbt.c needs access to it.
technically, that's a layer violation, but it's a long-existing one;
refactoring to get rid of it would be a large hassle, and eventually
we expect to remove rbt.c anyway.)
Evan Hunt [Sun, 1 Oct 2023 08:06:49 +0000 (01:06 -0700)]
separate generic DB helpers into db_p.h
when the QPDB is implemented, we will need to have both qpdb_p.h and
rbtdb_p.h. in order to prevent name collisions or code duplication,
this commit adds a generic private header file, db_p.h, containing
structures and macros that will be used by both databases.
some functions and structs have been renamed to more specifically refer
to the RBT database, in order to avoid namespace collision with similar
things that will be needed by the QPDB later.
Evan Hunt [Mon, 6 Nov 2023 16:02:49 +0000 (17:02 +0100)]
refactor wildcard matching
refactor the wildcard matching code to make it a bit easier to
understand, in hopes that it will reduce the difficulty of converting
from RBTDB to QPDB later.
there are also some minor optimizations: previously, after stepping
backward to find the predecessor, we stepped back forward *from* the
predecessor to find the successor. we now reset the rbtnode chain to
its original starting point before stepping forward; this eliminates
some unnecessary processing. and, if neither predecessor nor successor
is found, we return early rather than carrying on with an unnecessary
effort to match labels.
Coverity detected that address->type.sa was too small when copying
a struct sockaddr_in6; use the alternative union element
address->type.sin6 instead.
Ondřej Surý [Sun, 11 Feb 2024 08:13:43 +0000 (09:13 +0100)]
Add a system test for mixed-case data for the same owner
We were missing a test where a single owner name would have multiple
types with a different case. The generated RRSIGs and NSEC records will
then have a different case than the signed records, and the message parser
has to cope with that and treat everything as the same owner.
Ondřej Surý [Sat, 10 Feb 2024 23:49:32 +0000 (00:49 +0100)]
Fix case insensitive matching in isc_ht hash table implementation
The case insensitive matching in isc_ht was basically completely broken
as only the hashvalue computation was case insensitive, but the key
comparison was always case sensitive.
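The requirement described above (the hash and the key comparison must fold case in the same way) can be sketched as follows. This is a minimal illustration, not the isc_ht code: the `sketch_*` names are hypothetical, and FNV-1a is used here only as a stand-in hash function.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: FNV-1a over the key, folding case when requested. */
static uint32_t
sketch_hash(const unsigned char *key, size_t len, bool case_sensitive) {
	uint32_t h = 2166136261u;

	assert(key != NULL);
	for (size_t i = 0; i < len; i++) {
		unsigned char c = key[i];
		if (!case_sensitive) {
			c = (unsigned char)tolower(c);
		}
		h ^= c;
		h *= 16777619u;
	}
	return h;
}

/*
 * Hypothetical helper: the comparison has to fold case exactly like the
 * hash does, otherwise equal-hash keys never match.
 */
static bool
sketch_match(const unsigned char *a, size_t alen, const unsigned char *b,
	     size_t blen, bool case_sensitive) {
	assert(a != NULL && b != NULL);
	if (alen != blen) {
		return false;
	}
	for (size_t i = 0; i < alen; i++) {
		unsigned char x = a[i], y = b[i];
		if (!case_sensitive) {
			x = (unsigned char)tolower(x);
			y = (unsigned char)tolower(y);
		}
		if (x != y) {
			return false;
		}
	}
	return true;
}
```

With case folding enabled, "Example.COM" and "example.com" hash to the same value and compare equal; a case-sensitive comparison would reject the pair even though the hashes agree, which is the failure mode the commit describes.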
Aydın Mercan [Tue, 19 Dec 2023 07:41:15 +0000 (10:41 +0300)]
Convert rwlock in isc_log_t to RCU
The isc_log_t contains an isc_logconfig_t that is swapped, dereferenced,
or has its fields accessed under a lock. Instead of protecting it with a
rwlock, use RCU.
Ondřej Surý [Thu, 8 Feb 2024 11:31:09 +0000 (12:31 +0100)]
Fix UAF in ccmsg.c when reading stopped before sending
When shutting down the whole server, the reading could stop and detach
from the controlconnection before sending is done. If the send callback
then detaches from the last controlconnection handle, the ccmsg is
invalidated after the send callback, and thus we must not access ccmsg
after calling send_cb().
Ondřej Surý [Thu, 8 Feb 2024 11:31:09 +0000 (12:31 +0100)]
Add isc_nm_read_stop() and remove .reading member from ccmsg
We need to stop reading when calling isc_ccmsg_disconnect() as the
reading handle doesn't have to be last because sending might be in
progress. After that, we can safely remove .reading member because the
reading would not be called after the disconnect has been called.
The ccmsg_senddone() should also not call the recv callback if the
sending failed, that's the job of the caller's send callback - in fact
it already does that, so the code in ccmsg_senddone() was superfluous.
To reduce memory pressure, we can add light per-loop (netmgr worker)
memory pools for isc_nmsocket_t structures. This will help in
situations where there's a lot of churn creating and destroying the
nmsockets.
Reduce the isc_nmsocket_t size from 1840 to 1208 bytes
Embedding isc_nmsocket_h2_t directly inside isc_nmsocket_t had increased
the size of isc_nmsocket_t to 1840 bytes. Turning the isc_nmsocket_h2_t
member into a pointer to a structure that is allocated on demand allows
us to reduce the size to 1208 bytes. While there are still some possible
reductions in the isc_nmsocket_t (embedded tlsstream, streamdns
structures), this was by far the biggest drop in memory usage.
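The embed-versus-pointer trade-off above can be sketched as follows. All `sketch_*` types and the 600-byte payload are hypothetical stand-ins, not the real isc_nmsocket_t layout.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a large, rarely used sub-structure. */
typedef struct sketch_h2 {
	char buffers[600];
} sketch_h2_t;

/* Before: every socket pays for the sub-structure. */
typedef struct sketch_sock_embedded {
	int fd;
	sketch_h2_t h2;
} sketch_sock_embedded_t;

/* After: only a pointer is stored; the payload is allocated on demand. */
typedef struct sketch_sock_lazy {
	int fd;
	sketch_h2_t *h2; /* NULL until first use */
} sketch_sock_lazy_t;

/* Allocate the sub-structure the first time it is needed. */
static sketch_h2_t *
sketch_get_h2(sketch_sock_lazy_t *sock) {
	assert(sock != NULL);
	if (sock->h2 == NULL) {
		sock->h2 = calloc(1, sizeof(*sock->h2));
	}
	return sock->h2;
}
```

The cost of this design is one extra allocation and one pointer dereference for sockets that do use the sub-structure; sockets that never use it carry only the pointer.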
Reduce struct isc__nm_uvreq size from 1560 to 560 bytes
The uv_req union member of struct isc__nm_uvreq contained libuv request
types that we don't use. Turns out that uv_getnameinfo_t is 1000 bytes
big and unnecessarily enlarged the whole structure. Remove all the
unused members from the uv_req union.
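A union is as large as its largest member, so one oversized, unused member inflates every instance. The sketch below shows the effect with hypothetical `sketch_*` types and made-up sizes; it does not use the real libuv request types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request types; the sizes are illustrative only. */
typedef struct { char data[64]; } sketch_small_req_t;
typedef struct { char data[1000]; } sketch_big_req_t;

/* Before: the unused large member dictates the size of the union. */
typedef union {
	sketch_small_req_t connect;
	sketch_small_req_t send;
	sketch_big_req_t   unused;
} sketch_req_before_t;

/* After: only the members that are actually used remain. */
typedef union {
	sketch_small_req_t connect;
	sketch_small_req_t send;
} sketch_req_after_t;

/* Bytes saved per instance by dropping the unused member. */
static size_t
sketch_saved(void) {
	assert(sizeof(sketch_req_after_t) <= sizeof(sketch_req_before_t));
	return sizeof(sketch_req_before_t) - sizeof(sketch_req_after_t);
}
```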
After removing sockaddr_unix from isc_sockaddr, we can also remove
sockaddr_storage and reduce the isc_sockaddr size from 152 bytes to just
48 bytes needed to hold IPv6 addresses.
Tom Krizek [Tue, 6 Feb 2024 09:21:45 +0000 (10:21 +0100)]
Support older junit XML format in test result processing
When running `make check` on a platform which has older (but still
supported) pytest, e.g. 3.4.2 on EL8, the junit to trs conversion would
fail because the junit format has different structure. Make the junit
XML processing more lenient to support both the older and newer junit
XML formats.
Tom Krizek [Tue, 6 Feb 2024 14:35:49 +0000 (15:35 +0100)]
Use a single local port for ditch.pl
The ditch.pl script is used to generate burst traffic without waiting
for the responses. When running other tests in parallel, this can result
in an ephemeral port clash, since the ditch.pl process closes the socket
immediately. On rare occasions when the message ID also clashes with
other tests' queries, it might result in an UnexpectedSource error from
dnspython.
Use a dedicated port EXTRAPORT8 which is reserved for each test as a
source port for the burst traffic.
Ondřej Surý [Mon, 4 Dec 2023 11:21:33 +0000 (12:21 +0100)]
Use proper padding instead of using alignas()
As was pointed out, alignas() can't be used on objects larger than
`max_align_t`; otherwise the compiler might miscompile the code to
use auto-vectorization on unaligned memory.
As we were only using alignas() as a way to prevent false
sharing, we can use manual padding in the affected structures.
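Manual padding of this kind can be sketched as follows. The 64-byte cache line size and the `sketch_*` names are assumptions made for the illustration; the affected BIND structures are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CACHELINE 64 /* assumed cache line size */

/* A counter padded by hand so that it fills a whole cache line. */
typedef struct sketch_counter {
	uint64_t value;
	uint8_t	 padding[SKETCH_CACHELINE - sizeof(uint64_t)];
} sketch_counter_t;

/* Two hot counters that are updated by different threads. */
typedef struct sketch_stats {
	sketch_counter_t hits;
	sketch_counter_t misses;
} sketch_stats_t;

/* Distance in bytes between the two counters inside the structure. */
static size_t
sketch_distance(void) {
	assert(sizeof(sketch_counter_t) == SKETCH_CACHELINE);
	return offsetof(sketch_stats_t, misses) -
	       offsetof(sketch_stats_t, hits);
}
```

Because the two values are a full cache line apart, they cannot land in the same line regardless of where the enclosing structure starts, and no over-aligned type is declared.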
Ondřej Surý [Thu, 8 Feb 2024 07:30:38 +0000 (08:30 +0100)]
Use DNS_QPGC_MAYBE instead of DNS_QPGC_ALL for more realistic load
In the benchmarks, DNS_QPGC_ALL was trying too hard to clean up QP,
and this was slowing down QP too much. Use DNS_QPGC_MAYBE instead,
which is what we are going to use anyway, for a more realistic load;
this also shows memory usage matching the real loads.
Ondřej Surý [Mon, 29 Jan 2024 15:36:30 +0000 (16:36 +0100)]
Optimize cname_and_other_data to stop as early as possible
Stop the cname_and_other_data processing if we already know that the
result is true. Also, we know that CNAME will be placed in the priority
headers, so we can stop looking for CNAME if we haven't found CNAME and
we are past the priority headers.
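The two early exits described above can be sketched as follows. This is a simplified model: the `sketch_*` names are hypothetical, every type other than CNAME is counted as "other data" (the real check presumably exempts some types), and which types are flagged as priority is an assumption of the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_TYPE_CNAME 5

/* Hypothetical header: headers are ordered with priority types first. */
typedef struct sketch_header {
	unsigned int type;
	bool	     priority;
} sketch_header_t;

/*
 * Returns true if the node has a CNAME and some other type.
 * '*visited' (optional) receives the number of headers examined.
 */
static bool
sketch_cname_and_other(const sketch_header_t *h, size_t n, size_t *visited) {
	bool cname = false, other = false;
	size_t i;

	assert(h != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (!h[i].priority && !cname) {
			/* Past the priority headers without a CNAME: stop. */
			break;
		}
		if (h[i].type == SKETCH_TYPE_CNAME) {
			cname = true;
		} else {
			other = true;
		}
		if (cname && other) {
			/* The result is already known to be true: stop. */
			i++;
			break;
		}
	}
	if (visited != NULL) {
		*visited = i;
	}
	return cname && other;
}
```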
Ondřej Surý [Mon, 29 Jan 2024 15:36:30 +0000 (16:36 +0100)]
Optimize the slabheader placement for certain RRTypes
Mark the infrastructure RRTypes as "priority" types and place them at
the beginning of the rdataslab header data graph. The non-priority
types go right after the priority types (if any).
Ondřej Surý [Tue, 6 Feb 2024 13:05:08 +0000 (14:05 +0100)]
Fix missing RRSIG for CNAME with different slabheader order
The cachedb was missing a piece of code (already present in zonedb);
as a result, lookups in the slabheaders would miss the RRSIGs for CNAME
if the order of CNAME and RRSIG(CNAME) was reversed in the node->data.
Ondřej Surý [Wed, 7 Feb 2024 14:25:13 +0000 (15:25 +0100)]
Remove isc__tls_setfatalmode() function and the calls
With _exit() instead of exit() in place, we don't need the
isc__tls_setfatalmode() mechanism, as the atexit() calls, including
the OpenSSL atexit hooks, will not be executed.
Ondřej Surý [Wed, 7 Feb 2024 08:23:50 +0000 (09:23 +0100)]
Improve the rcu_barrier() call when destroying the mem context
Instead of a crude 5x rcu_barrier() call in isc__mem_destroy(), change
the mechanism to call rcu_barrier() until the memory use and references
stop decreasing. This should deal with any number of nested call_rcu()
levels.
Additionally, don't destroy the contextslock if the list of the contexts
isn't empty. Destroying the lock could make the late threads crash.
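The "repeat the barrier until nothing more is released" loop can be sketched as follows. The sketch does not call liburcu: `sketch_barrier()` is a simulated stand-in in which each call completes one level of nested deferred frees, and only memory use is tracked, not references.

```c
#include <assert.h>
#include <stddef.h>

/* Simulated memory context state (hypothetical). */
typedef struct sketch_sim {
	size_t	     inuse;  /* bytes still allocated */
	unsigned int levels; /* nested deferred-free levels still pending */
} sketch_sim_t;

/*
 * Stand-in for rcu_barrier(): completes one level of pending callbacks;
 * in this simulation each level frees 10 bytes.
 */
static void
sketch_barrier(sketch_sim_t *sim) {
	if (sim->levels > 0) {
		sim->levels--;
		sim->inuse -= 10;
	}
}

/*
 * Call the barrier until memory use stops decreasing.
 * Returns the number of barrier calls that were made.
 */
static unsigned int
sketch_drain(sketch_sim_t *sim) {
	unsigned int calls = 0;
	size_t before;

	assert(sim != NULL);
	do {
		before = sim->inuse;
		sketch_barrier(sim);
		calls++;
	} while (sim->inuse < before);
	return calls;
}
```

With three nested levels the loop makes four calls: three that release memory and a final one that observes no further progress. A fixed count would be wrong as soon as the nesting depth exceeded it.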
Ondřej Surý [Wed, 7 Feb 2024 13:44:39 +0000 (14:44 +0100)]
Use _exit() in the fatal() function
Since fatal() is not a clean but rather an abrupt termination of the
program, we want to skip the various atexit() calls, because not all
memory might be freed during the fatal() call, etc. Using _exit() instead
of exit() has this effect: the program will end, but no destructors or
atexit routines will be called.
Aram Sargsyan [Fri, 10 Nov 2023 11:14:58 +0000 (11:14 +0000)]
Add a check for the 'first refresh' data in the stats channel
Currently we test the incoming zone transfers data in the statistics
channel by retransferring the zones in slow mode and capturing the XML
and JSON outputs in the meantime to check their validity. Add a new
transfer to the test, and check that the XML and JSON files correctly
indicate that we have 3 retransfers and 1 new (first time) transfer.
Aram Sargsyan [Fri, 10 Nov 2023 11:10:32 +0000 (11:10 +0000)]
Expose the 'first refresh' zone flag in rndc status
Expose the newly added 'first refresh' flag in the information
provided by the 'rndc status' command, by showing the number of
zones that are not yet fully ready because their first refresh
is pending or in progress.
Aram Sargsyan [Thu, 14 Dec 2023 10:44:21 +0000 (10:44 +0000)]
Test trust anchors configurations for 'dnssec-validation yes'
Add checks into the 'checkconf' system test to make sure that the
'dnssec-validation yes' option fails without configured trust
anchors, and succeeds with configured non-empty, as well as empty,
trust anchors.
Aram Sargsyan [Thu, 14 Dec 2023 10:42:56 +0000 (10:42 +0000)]
Document new requirements for 'dnssec-validation yes'
Using the 'dnssec-validation yes' option now requires an explicitly
configured 'trust-anchors' statement (or 'managed-keys' or
'trusted-keys', both deprecated).
Aram Sargsyan [Thu, 14 Dec 2023 10:40:05 +0000 (10:40 +0000)]
Require trust anchors for 'dnssec-validation yes'
Using the 'dnssec-validation yes' option now requires an explicitly
configured 'trust-anchors' statement (or 'managed-keys' or
'trusted-keys', both deprecated).
Matthijs Mekking [Mon, 15 Jan 2024 08:17:01 +0000 (09:17 +0100)]
Improve parental-agents definition in ARM
"A parental agent is the entity that is allowed to change a zone's
delegation information" is untrue, because it is possible to use some
hidden server or a validating resolver.
Also the new text makes it more clear that named sends DS queries to
these servers.
Aram Sargsyan [Wed, 31 Jan 2024 13:01:13 +0000 (13:01 +0000)]
Fix the DNS_GETDB_STALEFIRST flag
The DNS_GETDB_STALEFIRST flag is defined as 0x0C, which is the
combination of the DNS_GETDB_PARTIAL (0x04) and the
DNS_GETDB_IGNOREACL (0x08) flags (0x04 | 0x08 == 0x0C), which is
an obvious error.
All the flags should be powers of two, so they don't interfere with
each other. Fix the DNS_GETDB_STALEFIRST flag by setting it to 0x10.
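The overlap can be checked mechanically. In the sketch below the flag values are taken from the text above; the `SKETCH_*` macro names and helper functions are hypothetical, and testing a flag with `!= 0` is an assumption about how callers check it.

```c
#include <assert.h>
#include <stdbool.h>

/* Values from the commit message; names are local to this sketch. */
#define SKETCH_GETDB_PARTIAL	    0x04u
#define SKETCH_GETDB_IGNOREACL	    0x08u
#define SKETCH_GETDB_STALEFIRST_OLD 0x0Cu /* == PARTIAL | IGNOREACL */
#define SKETCH_GETDB_STALEFIRST_NEW 0x10u

/* True if exactly one bit is set, i.e. the value is a power of two. */
static bool
sketch_is_single_bit(unsigned int v) {
	return v != 0 && (v & (v - 1)) == 0;
}

/* Assumed style of flag test: any overlapping bit counts as "set". */
static bool
sketch_isset(unsigned int options, unsigned int flag) {
	return (options & flag) != 0;
}

/* The fixed flag set must consist of distinct single bits. */
static void
sketch_selfcheck(void) {
	assert(sketch_is_single_bit(SKETCH_GETDB_PARTIAL));
	assert(sketch_is_single_bit(SKETCH_GETDB_IGNOREACL));
	assert(sketch_is_single_bit(SKETCH_GETDB_STALEFIRST_NEW));
}
```

With the old value, a caller that passes only the PARTIAL flag also appears to have requested STALEFIRST; with 0x10 the three flags are independent.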
Ondřej Surý [Mon, 11 Dec 2023 15:50:12 +0000 (16:50 +0100)]
Make the dns_validator validations asynchronous and limit it
Instead of running all the cryptographic validation in a tight loop,
spread it out into multiple event loop "ticks" by moving every single
validation into its own isc_async_run() asynchronous event. Move the
cryptographic operations - both verification and DNSKEY selection - to
the offloaded threads (isc_work_enqueue); this further limits the time
we spend doing expensive operations on the event loops that should be
fast.
Limit the impact of invalid or malicious RRSets that contain crafted
records causing the dns_validator to do many validations per single
fetch by adding a cap on the maximum number of validations and maximum
number of validation failures that can happen before the resolving
fails.
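The two caps can be sketched as a pair of per-fetch counters. The `sketch_*` names, the result codes, and the limits used in any caller are hypothetical; the commit message does not state the actual limits or where the counters live.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { SKETCH_OK, SKETCH_QUOTA } sketch_result_t;

/* Hypothetical per-fetch budget for validation work. */
typedef struct sketch_quota {
	unsigned int validations_left; /* total validations still allowed */
	unsigned int fails_left;       /* validation failures still allowed */
} sketch_quota_t;

/* Account for one validation attempt; refuse once the budget is spent. */
static sketch_result_t
sketch_consume_validation(sketch_quota_t *q) {
	assert(q != NULL);
	if (q->validations_left == 0) {
		return SKETCH_QUOTA;
	}
	q->validations_left--;
	return SKETCH_OK;
}

/* Account for one failed validation; refuse once the budget is spent. */
static sketch_result_t
sketch_consume_fail(sketch_quota_t *q) {
	assert(q != NULL);
	if (q->fails_left == 0) {
		return SKETCH_QUOTA;
	}
	q->fails_left--;
	return SKETCH_OK;
}
```

A caller would check the result before each validation and abort the fetch on SKETCH_QUOTA, so a crafted RRset can trigger only a bounded amount of cryptographic work.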
Matthijs Mekking [Wed, 31 Jan 2024 10:44:07 +0000 (11:44 +0100)]
Don't also skip keymgr run if checkds is skipped
Checking the DS at the parent only happens if dns_zone_getdnsseckeys()
returns success. However, if this function somehow fails, it can also
prevent the keymgr from running.
Before the checkds functionality was added, the keymgr would run only
if 'dns_dnssec_findmatchingkeys()' did not return an error (either
ISC_R_SUCCESS or ISC_R_NOTFOUND). After this change the correct
result code is used again.