Matthijs Mekking [Thu, 25 Jan 2024 09:19:00 +0000 (10:19 +0100)]
Improve node reference counting
QP database node data is not reference counted the same way RBT nodes
were: in the RBT, node->references could be zero if the node was in the
tree but was not in use by any caller, whereas in the QP trie, the
database itself uses reference counting of nodes internally.
this caused some subtle errors. in RBTDB, when the newref() function is
called and the node reference count was zero, the node lock reference
counter would also be incremented. in the QP trie, this can never
happen - because as long as the node is in the database its reference
count cannot be zero - and so the node lock reference counter was never
incremented.
reference counting will probably need to be refactored in more detail
later; the node lock reference count may not be needed at all. but
for now, as a temporary measure, we add a third reference counter,
'erefs' (external references), to the dns_qpdata structure. this is
counted separately from the main reference counter, and should match
the node reference count as it would have been in RBTDB.
this change revealed a number of places where the node reference counter
was being incremented on behalf of a caller without newref() being
called; those were cleaned up as well.
Evan Hunt [Mon, 4 Dec 2023 03:35:08 +0000 (19:35 -0800)]
revise test for ENT NSEC3 cleanup
as a side effect of the switch from RBT to QBDB, NSEC3 records
are no longer created for empty non-terminal nodes when the
node only contains insecure delegations in an opt-out range.
such NSEC3 records are optional according to RFC 5155 (and,
for example, they are not created by dnssec-signzone), but they were
previously created by named, as a harmless side effect of the RBT
structure, which contains empty internal nodes that can be reached
by a DB iterator. these nodes are not present in the QPDB, so
NSEC3 records are not created unless they're actually required.
the autosign system test contained a test case (added in commit ad91a70d as part of GL #4027) that checked whether ENT NSEC3
records were deleted when the delegations under the ENT removed.
this test no longer passes, because the NSEC3's are not created
in the first place, and therefore cannot be removed.
rather than "fix" the QPDB to add unnecessary NSEC3 records, this
commit instead revises the test to check for removal of ENT NSEC3
records when *not* using opt-out.
Matthijs Mekking [Wed, 17 Jan 2024 15:53:27 +0000 (16:53 +0100)]
Add proper qp cleanup
Fix reference counting: unreference nodes that are succesfully inserted
in the tree, detach created nodes, and cleanup the interior data in
dns_qpdata_destroy().
Matthijs Mekking [Tue, 16 Jan 2024 10:26:20 +0000 (11:26 +0100)]
Replace rbtnodechain with qpchain and qpiter
The qp approach pulled apart the chain and iterator into two separate
things. Replace the rbtnodechain with qpchain and qpiter. Most of the
times we are interested in the iterator only, the rbtnodechain was
mainly used as an an iterator to get the previous and next name in the
DNS canonical order.
Since dns_qpiter_prev() and dns_qpiter_next() store the name, origin,
and node in the provided parameters, often there is no need to call
a current() function anymore.
Getting the first or last item from the iterator is done by
re-initializing the iterator and then call dns_qpiter_next() or
dns_qpiter_prev() respectively.
The dbiterator no longer needs to maintain a chain, only an iterator.
Matthijs Mekking [Fri, 12 Jan 2024 13:11:45 +0000 (14:11 +0100)]
Replace rbt_findnode with qp_lookup
All dns_qp_lookup() calls assume it is okay to find empty data, so
we don't need to do anything special for the DNS_RBTFIND_EMPTYDATA.
You can pass a callback function to dns_rbt_findnode(), something that
qp does not support. Instead, call the function afterwards. This has
the drawback that we do more lookup work if there was a zonecut.
With dns_qp_lookup() we also don't pass any options. In this case,
when DNS_RBTFIND_NOEXACT was set, we adapt the result after the lookup.
Matthijs Mekking [Thu, 11 Jan 2024 11:33:45 +0000 (12:33 +0100)]
Replace rbt_deletenode with qp_deletename
Replace dns_rbt_deletenode calls with dns_qp_deletename. For removing
the name from the nsec tree, we no longer first have to find it: we can
just remove the key (retrieved by name).
Matthijs Mekking [Wed, 10 Jan 2024 15:29:57 +0000 (16:29 +0100)]
Replace rbt_addnode with qp_insert
Replace dns_rbt_addnode calls with dns_qp_insert. With QP, it sometimes
makes more sense to first lookup the name and see if there is an
existing node (rather than create new data, insert, find out a node
already exists, and destroy the data again). This is done with
dns_qp_getname(), which is more lightweight than dns_qp_lookup(),
and we are only interested in if there is already a leaf node for this
name or not.
Evan Hunt [Tue, 5 Mar 2024 23:43:11 +0000 (15:43 -0800)]
switch database defaults from "rbt" to "qp"
replace the string "rbt" throughout BIND with "qp" so that
qpdb databases will be used by default instead of rbtdb.
rbtdb databases can still be used by specifying "database rbt;"
in a zone statement.
- Copy rbtdb.c, rbt-zonedb.c and rbt-cachedb.c to qp-*.
- Added qpmethods.
- Added a new structure dns_qpdata that will replace dns_rbtnode.
- Replaced normal, nsec, and nsec3 dns_rbt trees with dns_qp tries.
- Replaced dns_rbt_create() calls with dns_qp_create().
- Replaced the dns_rbt_destroy() call with dns_qp_destroy().
- Create a dns_qpdata struct and create/destroy methods.
Mark Andrews [Tue, 5 Mar 2024 04:51:05 +0000 (15:51 +1100)]
test: DS query against broken NODATA responses
This is a regresssion test for GL #4621 where the NODATA responses
are SOA records that match the QNAME rather than the zone name. In
particular for NS queries.
Mark Andrews [Thu, 29 Feb 2024 03:00:58 +0000 (14:00 +1100)]
Restore the disassociate call to before the fetch
[GL #3709] reordered the dns_rdataset_disassociate call to after
the dns_resolver_createfetch call resulting in qctx->nsrrset still
being associated when dns_resolver_createfetch is called in
resume_dslookup (7e4e125e). Revert that part of the change and add
comments as to why the multiple dns_rdataset_disassociate calls are
where they are.
Ondřej Surý [Mon, 4 Mar 2024 12:21:35 +0000 (13:21 +0100)]
Pin the xfr to a specific loop
Instead of getting the loop from the zone every time, attach the xfrin
directly to the loop. This also allows to remove the extra safety tid
checks from the dns_xfrin unit.
Evan Hunt [Tue, 6 Feb 2024 21:33:21 +0000 (13:33 -0800)]
move RRL broken-config check to checkconf
the RRL test included a test case that tried to start named with
a broken configuration. the same error could be found with
named-checkconf, so it should have been tested in the checkconf
system test.
Ondřej Surý [Tue, 20 Feb 2024 07:50:58 +0000 (08:50 +0100)]
Make the TTL-based cleaning more aggressive
It was discovered that the TTL-based cleaning could build up
a significant backlog of the rdataset headers during the periods where
the top of the TTL heap isn't expired yet. Make the TTL-based cleaning
more aggressive by cleaning more headers from the heap when we are
adding new header into the RBTDB.
Ondřej Surý [Tue, 20 Feb 2024 07:50:58 +0000 (08:50 +0100)]
Remove expired rdataset headers from the heap
It was discovered that an expired header could sit on top of the heap
a little longer than desireable. Remove expired headers (headers with
rdh_ttl set to 0) from the heap completely, so they don't block the next
TTL-based cleaning.
Ondřej Surý [Wed, 21 Feb 2024 12:32:09 +0000 (13:32 +0100)]
Simplify the parent cleaning in the prune_tree() mechanism
Instead of juggling with node locks in a cycle, cleanup the node we are
just pruning and send any the parent that's also subject to the pruning
to the prune tree via normal way (e.g. enqueue pruning on the parent).
This simplifies the code and also spreads the pruning load across more
event loop ticks which is better for lock contention as less things run
in a tight loop.
In some older BIND 9 branches, the extra queuing overhead eliminated by
this change could be remotely exploited to cause excessive memory use.
Due to architectural shift, this branch is not vulnerable to that issue,
but applying the fix to the latter is nevertheless deemed prudent for
consistency and to make the code future-proof.
However, it turned out that having a single queue for the nodes to be
pruned increased lock contention to a level where cleaning up nodes from
the RBTDB took too long, causing the amount of memory used by the cache
to grow indefinitely over time.
This commit reverts the change to the pruning mechanism introduced by
commit 24381cc36d8528f5a4046fb2614451aeac4cdfc1 as BIND branches newer
than 9.16 were not affected by the excessive event queueing overhead
issue mentioned in the log message for the above commit.
Artem Boldariev [Thu, 22 Feb 2024 17:42:04 +0000 (19:42 +0200)]
Improve documentation on ephemeral TLS configuration
This commit improves the documentation on the ephemeral TLS
configuration and describes in more detail what is happening with TLS
configurations on reconfiguration in general.
Michal Nowak [Fri, 23 Feb 2024 13:51:23 +0000 (14:51 +0100)]
Watch logs from start in dialup system test
When the first parametrized test takes a bit longer than usual, the zone
transfer in ns3 may succeed before the second parametrized test is even
started, and then watch_log_from_here() won't find the "Transfer status:
success" message in the named log. Using watch_log_from_start() instead
makes sure the test is more stable.
Mark Andrews [Thu, 22 Feb 2024 23:12:47 +0000 (10:12 +1100)]
Do not use header_prev in expire_lru_headers
dns__cacherbt_expireheader can unlink / free header_prev underneath
it. Use ISC_LIST_TAIL after calling dns__cacherbt_expireheader
instead to get the next pointer to be processed.
Artem Boldariev [Mon, 19 Feb 2024 18:02:38 +0000 (20:02 +0200)]
Do not lock workers when using -T transferslowly/transferstuck
This commit ensures that worker threads are not sleeping (by using
select()) when '-T transferslowly/transferstuck' test options are
used. This commit converts synchronous implementation of the code into
an asynchronous one based on timers.
Artem Boldariev [Mon, 12 Feb 2024 20:51:39 +0000 (22:51 +0200)]
DoT: do not crash resolver on TLS context creation failure
The resolver's code was not ready to failures when trying to establish
a connection via TCP-based transports (e.g. when creating TLS contexts
before establishing a TLS connection).
Tom Krizek [Thu, 15 Feb 2024 13:55:56 +0000 (14:55 +0100)]
Don't include temp testdir on each log line
This was mostly an artifact to tell which log lines belong to which test
from the time when the test output could be all mingled together. Now
this info is reduntant, because the pytest logger already includes both
the system test name, and the specific test.
Tom Krizek [Thu, 15 Feb 2024 13:47:13 +0000 (14:47 +0100)]
Add utility logging functions to isctest.log
Unify the different loggers (conftest, module, test) into a single
interface. Remove the need to select the proper logger by automatically
selecting the most-specific logger currently available.
This also removes the need to use the logger/mlogger fixtures manually
and pass these around. This was especially annoying and unwieldy when
splitting the test cases into functions, because logger had to always be
passed around. Instead, it is now possible to use the
isctest.log.(debug,info,warning,error) functions.
Tom Krizek [Thu, 15 Feb 2024 12:57:42 +0000 (13:57 +0100)]
Move watchlog module into isctest.log package
Preparation for further logging improvements - keep the watchlog
contents in a separate module inside isctest.log. Export the names in
the log package so the imports don't change for the users of these
classes.
Tom Krizek [Tue, 13 Feb 2024 15:42:16 +0000 (16:42 +0100)]
Remove accidentally duplicated RNDCExecutor code
This code has probably been accidentally added during some rebase. The
actual RNDCExecutor and related classes are in isctest/rndc.py. Remove
the duplicated and unused code from isctest/log.py, as it doesn't belong
there.
Aram Sargsyan [Wed, 7 Feb 2024 09:22:55 +0000 (09:22 +0000)]
Address scan-build warnings
The warnings (see below) seem to be false-positives. Address them
by adding runtime checks.
resolver.c:1627:10: warning: Access to field 'tid' results in a dereference of a null pointer (loaded from variable 'fctx') [core.NullDereference]
1627 | REQUIRE(fctx->tid == isc_tid());
| ^~~~~~~~~
../../lib/isc/include/isc/util.h:332:34: note: expanded from macro 'REQUIRE'
332 | #define REQUIRE(e) ISC_REQUIRE(e)
| ^
../../lib/isc/include/isc/assertions.h:45:11: note: expanded from macro 'ISC_REQUIRE'
45 | ((void)((cond) || \
| ^~~~
resolver.c:10335:6: warning: Access to field 'depth' results in a dereference of a null pointer (loaded from variable 'fctx') [core.NullDereference]
10335 | if (fctx->depth > depth) {
| ^~~~~~~~~~~
2 warnings generated.
Evan Hunt [Wed, 14 Feb 2024 20:58:01 +0000 (12:58 -0800)]
fix several bugs in the RBTDB dbiterator implementation
- the DNS_DB_NSEC3ONLY and DNS_DB_NONSEC3 flags are mutually
exclusive; it never made sense to set both at the same time.
to enforce this, it is now a fatal error to do so. the
dbiterator implementation has been cleaned up to remove
code that treated the two as independent: if nonsec3 is
true, we can be certain nsec3only is false, and vice versa.
- previously, iterating a database backwards omitted
NSEC3 records even if DNS_DB_NONSEC3 had not been set. this
has been corrected.
- when an iterator reaches the origin node of the NSEC3 tree, we
need to skip over it and go to the next node in the sequence.
the NSEC3 origin node is there for housekeeping purposes and
never contains data.
- the dbiterator_test unit test has been expanded, several
incorrect expectations have been fixed. (for example, the
expected number of iterations has been reduced by one; we were
previously counting the NSEC3 origin node and we should not
have been doing so.)
Michał Kępień [Wed, 14 Feb 2024 13:49:49 +0000 (14:49 +0100)]
Swap CHANGES entries 6343 and 6344
Fix a CHANGES entries numbering issue that was inadvertently introduced
when change 6344 was backported. This makes the affected CHANGES
numbers consistent across all branches and releases again.
Michał Kępień [Wed, 14 Feb 2024 13:49:49 +0000 (14:49 +0100)]
Retroactively add release note for CVE-2023-50868
A release note for CVE-2023-50868 was not included in BIND 9.19.21, even
though that vulnerability was already addressed in that release (by the
fix for CVE-2023-50387). Retroactively add a relevant release note for
BIND 9.19.21.
Michał Kępień [Wed, 14 Feb 2024 13:49:49 +0000 (14:49 +0100)]
Mention CVE-2023-50868 in CHANGES entry 6322
Since CVE-2023-50868 does not have a dedicated fix in BIND 9, mention
its CVE identifier in the CHANGES entry for CVE-2023-50387 (KeyTrap),
which accompanied the code change that addresses both of these
vulnerabilities.
Evan Hunt [Thu, 5 Oct 2023 01:14:55 +0000 (18:14 -0700)]
clean up dns_rbt
- create_node() in rbt.c cannot fail
- the dns_rbt_*name() functions, which are wrappers around
dns_rbt_[add|find|delete]node(), were never used except in tests.
this change isn't really necessary since RBT is likely to go away
eventually anyway. but keeping the API as simple as possible while it
persists is a good thing, and may reduce confusion while QPDB is being
developed from RBTDB code.
Evan Hunt [Thu, 5 Oct 2023 00:49:51 +0000 (17:49 -0700)]
move DNS_RBT_NSEC_* to db.h
these values pertain to whether a node is in the main, nsec, or nsec3
tree of an RBTDB. they need to be moved to a more generic location so
they can also be used by QPDB.
(this is in db.h rather than db_p.h because rbt.c needs access to it.
technically, that's a layer violation, but it's a long-existing one;
refactoring to get rid of it would be a large hassle, and eventually
we expect to remove rbt.c anyway.)