Mark Andrews [Fri, 21 Apr 2023 02:11:15 +0000 (12:11 +1000)]
Cleanup orphaned empty-non-terminal NSEC3
When OPTOUT was in use we didn't ensure that NSEC3 records
for orphaned empty-non-terminals where removed. Check if
there are orphaned empty-non-terminal NSEC3 even if there
wasn't an NSEC3 RRset to be removed in dns_nsec3_delnsec3.
'-T transferinsecs' makes named interpret the max-transfer-time-out,
max-transfer-idle-out, max-transfer-time-in and max-transfer-idle-in
configuration options as seconds instead of minutes.
'-T transferslowly' makes named to sleep for one second for every
xfrout message.
'-T transferstuck' makes named to sleep for one minute for every
xfrout message.
Implement maximum global and idle time for incoming XFR
After the dns_xfrin was changed to use network manager, the maximum
global (max-transfer-time-in) and idle (max-transfer-idle-in) times for
incoming transfers were turned inoperational because of missing
implementation.
Restore this functionality by implementing the timers for the incoming
transfers.
Ondřej Surý [Fri, 3 Mar 2023 08:24:13 +0000 (09:24 +0100)]
Add isc_spinlock unit with shim pthread_spin implementation
The spinlock is small (atomic_uint_fast32_t at most), lightweight
synchronization primitive and should only be used for short-lived and
most of the time a isc_mutex should be used.
Add a isc_spinlock unit which is either (most of the time) a think
wrapper around pthread_spin API or an efficient shim implementation of
the simple spinlock.
Ondřej Surý [Thu, 30 Mar 2023 07:14:21 +0000 (09:14 +0200)]
Always initialize the workers in the libtest
The workers variable might be needed even to tests not using
loopmgr. Split the workers initialization into setup_workers() function
and always call it from the default main loop.
When shutting down TCP sockets, the read callback calling logic was
flawed, it would call either one less callback or one extra. Fix the
logic in the way:
1. When isc_nm_read() has been called but isc_nm_read_stop() hasn't on
the handle, the read callback will be called with ISC_R_CANCELED to
cancel active reading from the socket/handle.
2. When isc_nm_read() has been called and isc_nm_read_stop() has been
called on the on the handle, the read callback will be called with
ISC_R_SHUTTINGDOWN to signal that the dormant (not-reading) socket
is being shut down.
3. The .reading and .recv_read flags are little bit tricky. The
.reading flag indicates if the outer layer is reading the data (that
would be uv_tcp_t for TCP and isc_nmsocket_t (TCP) for TLSStream),
the .recv_read flag indicates whether somebody is interested in the
data read from the socket.
Usually, you would expect that the .reading should be false when
.recv_read is false, but it gets even more tricky with TLSStream as
the TLS protocol might need to read from the socket even when sending
data.
Fix the usage of the .recv_read and .reading flags in the TLSStream
to their true meaning - which mostly consist of using .recv_read
everywhere and then wrapping isc_nm_read() and isc_nm_read_stop()
with the .reading flag.
4. The TLS failed read helper has been modified to resemble the TCP code
as much as possible, clearing and re-setting the .recv_read flag in
the TCP timeout code has been fixed and .recv_read is now cleared
when isc_nm_read_stop() has been called on the streaming socket.
5. The use of Network Manager in the named_controlconf, isccc_ccmsg, and
isc_httpd units have been greatly simplified due to the improved design.
6. More unit tests for TCP and TLS testing the shutdown conditions have
been added.
Handle the failure to send notify more gracefully and with log
When dns_request_create() failed in notify_send_toaddr(), sending the
notify would silently fail. When notify_done() failed, the error would
be logged on the DEBUG(2) level.
This commit remedies the situation by:
* Promoting several messages related to notifies to INFO level and add
a "success" log message at the INFO level
* Adding a TCP fallback - when sending the notify over UDP fails, named
will retry sending notify over TCP and log the information on the
NOTICE level
* When sending the notify over TCP fails, it will be logged on the
WARNING level
There is no 'ret' in this test, and it is obvious that 'ret=1'
should be 'tmp=1' for the check to work correctly, if the string
is not found in the log file.
Mark Andrews [Tue, 11 Apr 2023 05:32:51 +0000 (15:32 +1000)]
isc_loopmgr_pause was called inappropriately
isc_loopmgr_pause can't be called before isc_loopmgr_run is
called as the thread ids are not yet valid. If there is a
fatal error before isc_loopmgr_run is run then don't call
isc_loopmgr_pause.
Add a test case to cover #3679 where a user migrates from a KSK/ZSK
split using auto-dnssec maintain, to the default dnssec-policy (CSK).
The test actually does not use the default dnssec-policy, but it does
use one that has the same keys clause. For testing convenience, we use
the same propagation time values as other test cases that migrate to
dnssec-policy with mismatching existing key set.
Run the forward_cancel on the appropriate zone->loop
If the zone forwards are canceled from dns_zonemgr_shutdown(), the
forward_cancel() would get called from the main loop, which is wrong.
It needs to be called from the matching zone->loop.
Run the dns_request_cancel() via isc_async_run() on the loop associated
with the zone instead of calling the dns_request_cancel() directly from
the main loop.
By inspecting the code, it was discovered that .sendbuf member of the
isc__nm_networker_t was unused and just consuming ~64k per worker.
Remove the member and the association allocation/deallocation.
This makes any debugging of the unit tests too hard. Futures attempts
to fix #3980 should add a custom automake test harness (log driver) that
would kill the unit test after configured timeout.
Refactor the isc_quota code and fix the quota in TCP accept code
In e18541287231b721c9cdb7e492697a2a80fd83fc, the TCP accept quota code
became broken in a subtle way - the quota would get initialized on the
first accept for the server socket and then deleted from the server
socket, so it would never get applied again.
Properly fixing this required a bigger refactoring of the isc_quota API
code to make it much simpler. The new code decouples the ownership of
the quota and acquiring/releasing the quota limit.
After (during) the refactoring it became more clear that we need to use
the callback from the child side of the accepted connection, and not the
server side.
Run closehandle_cb on run queue instead of async queue
Instead of using isc_async_run() when closing StreamDNS handle, add
isc_job_t member to the isc_nmhandle_t structure and use isc_job_run()
to avoid allocation/deallocation on the StreamDNS hot-path.
Accept overquota TCP connection on local thread if possible
If the quota callback is called on a thread matching the socket, call
the TCP accept function directly instead of using isc_async_run() which
allocates-deallocates memory.
The isc_tid() function is often called on the hot-path and it's the only
function is to return thread_local variable, make the isc_tid() function
a header-only to save several function calls during query-response
processing.
Michal Nowak [Wed, 5 Apr 2023 13:55:09 +0000 (15:55 +0200)]
Do not retry in resolution_fails() on timeout
At the time of test number (19), there were 10 "sending packet to
10.53.0.7" lines in the "legacy/ns1/named.run" file; usually, only seven
are present:
I:legacy:checking recursive lookup to edns 512 + no tcp server does not cause query loops (19)
I:legacy:ns1 sent 10 queries to ns7, expected less than 10
I:legacy:failed
Those three can be attributed to tests "8", "10", and "18", where the
dig of "resolution_fails()" retried after a timeout to succeed with
"status: SERVFAIL" subsequently, as seen in each of
dig.out.test{8,10,18} files.
;; communications error to 10.53.0.1#13093: timed out
Tony Finch [Wed, 5 Apr 2023 12:42:52 +0000 (13:42 +0100)]
Correct value of DNS_NAME_MAXLABELS
It should be floor(DNS_NAME_MAXWIRE / 2) + 1 == 128
The mistake was introduced in c6bf51492dbd because:
* I was refactoring an existing `DNS_MAX_LABELS` defined as 127
* There was a longstanding bug in `dns_name_isvalid()` which
checked the number of labels against 127U instead of 128
* I mistakenly thought `dns_name_isvalid()` was correct and
`dns_name_countlabels()` was incorrect, but the reverse was true.
After this commit, occurrances of `DNS_NAME_MAXLABELS` with value
128 are consistent with the use of 127 or 128 before commit c6bf51492dbd except for the mistake in `dns_name_isvalid()`.
This commit adds a test case that checks the MAXLABELS case
in `dns_name_fromtext()` and `dns_name_isvalid()`.
Tony Finch [Tue, 14 Feb 2023 16:13:16 +0000 (16:13 +0000)]
Use a qp-trie for the zone table
This change makes the zone table lock-free for reads. Previously, the
zone table used a red-black tree, which is not thread safe, so the hot
read path acquired both the per-view mutex and the per-zonetable
rwlock. (The double locking was to fix to cleanup races on shutdown.)
One visible difference is that zones are not necessarily shut down
promptly: it depends on when the qp-trie garbage collector cleans up
the zone table. The `catz` system test checks several times that zones
have been deleted; the test now checks for zones to be removed from
the server configuration, instead of being fully shut down. The catz
test does not churn through enough zones to trigger a gc, so the zones
are not fully detached until the server exits.
After this change, it is still possible to improve the way we handle
changes to the zone table, for instance, batching changes, or better
compaction heuristics.
Tony Finch [Fri, 3 Mar 2023 12:05:51 +0000 (12:05 +0000)]
Compact more in dns_qp_compact(DNS_QPGC_ALL)
Commit 0858514ae8 enriched dns_qp_compact() to give callers more
control over how thoroughly the trie should be compacted.
In the DNS_QPGC_ALL case, if the trie is small it might be compacted
to a new position in the same memory chunk. In this situation it will
still be holding references to old leaf objects which have been
removed from the trie but will not be completely detached until the
chunk containing the references is freed.
This change resets the qp-trie allocator to a fresh chunk before a
DNS_QPGC_ALL compaction, so all the old memory chunks will be
evacuated and old leaf objects can be detached sooner.
Tony Finch [Thu, 2 Mar 2023 13:30:24 +0000 (13:30 +0000)]
Support for off-loop read-ony qp-trie transactions
It is sometimes necessary to access a qp-trie outside an isc_loop,
such as in tests or an isc_work callback. The best option was to use
a `dns_qpmulti_write()` transaction, but that has overheads that are
not necessary for read-only access, such as committing a new version
of the trie even when nothing changed.
So this commit adds a `dns_qpmulti_read()` transaction, which is
nearly as lightweight as a query transaction, but it takes the mutex
like a write transaction.
Tony Finch [Fri, 10 Feb 2023 16:53:31 +0000 (16:53 +0000)]
Support for finding the longest parent domain in a qp-trie
This is the first of the "fancy" searches that know how the DNS
namespace maps on to the structure of a qp-trie. For example, it will
find the closest enclosing zone in the zone tree.