Ondřej Surý [Thu, 15 Dec 2022 10:54:51 +0000 (11:54 +0100)]
Change the isc_buffer_reserve() to take just buffer pointer
The isc_buffer_reserve() would be passed a reference to the buffer
pointer, which was unnecessary as the pointer would never be changed
in the current implementation. Remove the extra dereference.
Ondřej Surý [Wed, 30 Nov 2022 16:58:35 +0000 (17:58 +0100)]
Fix the thread safety in the dns_dispatch unit
The dispatches are not thread-bound, and used freely between various
threads (see the dns_resolver and dns_request units for details).
This refactoring make sure that all non-const dns_dispatch_t and
dns_dispentry_t members are accessed under a lock, and both object now
track their internal state (NONE, CONNECTING, CONNECTED, CANCELED)
instead of guessing the state from the state of various struct members.
During the refactoring, the artificial limit DNS_DISPATCH_SOCKSQUOTA on
UDP sockets per dispatch was removed as the limiting needs to happen and
happens on in dns_resolver and limiting the number of UDP sockets
artificially in dispatch could lead to unpredictable behaviour in case
one dispatch has the limit exhausted by others are idle.
The TCP artificial limit of DNS_DISPATCH_MAXREQUESTS makes even less
sense as the TCP connections are only reused in the dns_request API
that's not a heavy user of the outgoing connections.
As a side note, the fact that UDP and TCP dispatch pretends to be same
thing, but in fact the connected UDP is handled from dns_dispentry_t and
dns_dispatch_t acts as a broker, but connected TCP is handled from
dns_dispatch_t and dns_dispatchmgr_t acts as a broker doesn't really
help the clarity of this unit.
This refactoring kept to API almost same - only dns_dispatch_cancel()
and dns_dispatch_done() were merged into dns_dispatch_done() as we need
to cancel active netmgr handles in any case to not leave dangling
connections around. The functions handling UDP and TCP have been mostly
split to their matching counterparts and the dns_dispatch_<function>
functions are now thing wrappers that call <udp|tcp>_dispatch_<function>
based on the socket type.
More debugging-level logging was added to the unit to accomodate for
this fact.
Ondřej Surý [Fri, 16 Dec 2022 20:46:50 +0000 (21:46 +0100)]
Fix reference counting in get_attached_entry (again)
When get_attached_entry() encounters entry that would be expired, it
needs to get reference to the entry before calling maybe_expire_entry(),
so the ADB entry doesn't get destroyed inside the its own lock.
This creeped into the code base again during review, so I am adding
an extra comment to prevent this.
Tom Krizek [Thu, 15 Dec 2022 16:52:52 +0000 (17:52 +0100)]
danger: check backport commits for original commit IDs
A full backport must have all the commit from the original MR and the
original commit IDs must be referenced in the backport commit messages.
If the criteria above is not met, the MR should be marked as a partial
backport. In that case, any discrepencies are only logged as informative
messages rather than failures.
Tom Krizek [Thu, 15 Dec 2022 16:51:24 +0000 (17:51 +0100)]
danger: check that original MR has been merged
When checking a backport MR, ensure that the original MR has been merged
already. This is vital for followup checks that verify commit IDs from
original commits are present in backport commit messages.
Tom Krizek [Thu, 15 Dec 2022 16:48:34 +0000 (17:48 +0100)]
danger: check backport links to the original MR
When doing archeology, it is much easier to find stuff if it's properly
linked. This check ensures that backport MR are linked to their original
MR via a "Backport of !XXXX" message.
The regular expression is fairly broad and has been tested to accept the
following variants of the message:
Backport of MR !XXXX
Backport of: !XXXX
backport of mr !XXXX
Backport of !XXXX
Backport of https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/XXXX
Ondřej Surý [Tue, 13 Dec 2022 13:32:19 +0000 (14:32 +0100)]
Don't expire fresh ADB names and entries
The overmem cleaning in ADB could become overzealous and clean fresh ADB
names and entries. Add a safety check to not clean any ADB names and
entries that are below ADB_CACHE_MINIMUM threshold.
Ondřej Surý [Tue, 13 Dec 2022 13:14:21 +0000 (14:14 +0100)]
Exclude the ADB hashmaps from ADB overmem accounting
The ADB overmem accounting would include the memory used by hashtables
thus vastly reducing the space that can be used for ADB names and
entries when the hashtables would grow. Create own memory context for
the ADB names and entries hash tables.
Ondřej Surý [Tue, 13 Dec 2022 12:48:55 +0000 (13:48 +0100)]
Lock the adbname and adbentry prior to unlocking hash locks
There was a datarace that could expire a freshly created ADB names and
entries between the return from get_attached_{name,entry} and locking it
again. Lock the ADB name and ADB entry inside the hash table lock, so
they won't get expired until the full initialization has been complete.
Aram Sargsyan [Wed, 14 Dec 2022 14:40:31 +0000 (14:40 +0000)]
Fix logging a uint32_t SOA serial value in dns_catz_update_from_db()
The dns_catz_update_from_db() function prints serial number as a signed
number (with "%d" in the format string), but the `vers` variable's type
is 'uint32_t'. This breaks serials bigger than 2^31.
Ondřej Surý [Wed, 7 Dec 2022 08:45:34 +0000 (09:45 +0100)]
Add internal logging functions to the netmgr
Add internal logging functions isc__netmgr_log, isc__nmsocket_log(), and
isc__nmhandle_log() that can be used to add logging messages to the
netmgr, and change all direct use of isc_log_write() to use those
logging functions to properly prefix them with netmgr, nmsocket and
nmsocket+nmhandle.
Ondřej Surý [Tue, 13 Dec 2022 14:20:10 +0000 (15:20 +0100)]
Revert the statistics system test change after we fixed the resolver
When the resolver was refactored, the statistics system test had to be
adjusted in c6b4d8255775a24a12b832a90a78cbf86e9faa8d. Unfortunately,
this change had to be done because of an error in the resolver
refactoring where timeout would not retry next server, but keep trying
the same server. As we have now fixed this bug, revert the change to
the test back to the previous state.
Ondřej Surý [Thu, 8 Dec 2022 09:46:09 +0000 (10:46 +0100)]
Allow zero length keys in isc_hashmap
In case, we are trying to hash the empty key into the hashmap, the key
is going to have zero length. This might happen in the unit test.
Allow this and add a unit test to ensure the empty zero-length key
doesn't hash to slot 0 as SipHash 2-4 (our hash function of choice) has
no problem with zero-length inputs.
Artem Boldariev [Fri, 9 Dec 2022 16:44:01 +0000 (18:44 +0200)]
Fix TLS session resumption via IDs when Mutual TLS is used
This commit fixes TLS session resumption via session IDs when
client certificates are used. To do so it makes sure that session ID
contexts are set within server TLS contexts. See OpenSSL documentation
for 'SSL_CTX_set_session_id_context()', the "Warnings" section.
Ondřej Surý [Tue, 13 Dec 2022 10:02:47 +0000 (11:02 +0100)]
Fix intermittent memory leak in dns_resolver unit
A rdataset could have been left unassociated on the error path in the
resume_dslookup() in the dns_resolver unit. Clone the rdataset after
the error check, so it's not cloned before we check whether we can make
further progress chasing DS records.
Mark Andrews [Tue, 13 Dec 2022 01:03:49 +0000 (12:03 +1100)]
Properly initialise local_ndata in isdotlocal in dig
Remove the trailing '\0' so that the length field of the dns_name_t
structure is correct. The old data just happens to work with
dns_name_issubdomain but would fail with dns_name_equal.
Ondřej Surý [Fri, 9 Dec 2022 07:53:20 +0000 (08:53 +0100)]
Implement proper reference counting for dns_keyfileio_t
Instead of relying on hash table search when using the keys, implement a
proper reference counting in dns_keyfileio_t objects, and attach/detach
the objects to the zone.
Ondřej Surý [Wed, 7 Dec 2022 15:45:33 +0000 (16:45 +0100)]
Release unused key file IO lock objects
Due to off-by-one error in zonemgr_keymgmt_delete, unused key file IO
lock objects were never freed and they were kept until the server
shutdown. Adjust the returned value by -1 to accomodate the fact that
the atomic_fetch_*() functions return the value before the operation and
not current value after the operation.
Mark Andrews [Mon, 21 Nov 2022 00:59:45 +0000 (11:59 +1100)]
Remove different zero TTL handling for rdataset iterator
Zero TTL handling does not need to be different for 'rdatasetiter_first'
and 'rdatasetiter_next' and it interacts badly with 'bind_rdatadataset'
which makes different determinations.
Ondřej Surý [Sun, 13 Nov 2022 10:04:30 +0000 (11:04 +0100)]
Remove isc_resource API and set limits directly in named_os unit
The only function left in the isc_resource API was setting the file
limit. Replace the whole unit with a simple getrlimit to check the
maximum value of RLIMIT_NOFILE and set the maximum back to rlimit_cur.
This is more compatible than trying to set RLIMIT_UNLIMITED on the
RLIMIT_NOFILE as it doesn't work on Linux (see man 5 proc on
/proc/sys/fs/nr_open), neither it does on Darwin kernel (see man 2
getrlimit).
The only place where the maximum value could be raised under privileged
user would be BSDs, but the `named_os_adjustnofile()` were not called
there before. We would apply the increased limits only on Linux and Sun
platforms.
Ondřej Surý [Sun, 13 Nov 2022 09:28:17 +0000 (10:28 +0100)]
Mark setting operating system limits from named.conf as ancient
After deprecating the operating system limits settings (coresize,
datasize, files and stacksize), mark them as ancient and remove the code
that sets the values from config.
Ondřej Surý [Thu, 3 Nov 2022 16:42:12 +0000 (17:42 +0100)]
Propagate the shutdown event to the recursing ns_client(s)
Send the ns_query_cancel() on the recursing clients when we initiate the
named shutdown for faster shutdown.
When we are shutting down the resolver, we cancel all the outstanding
fetches, and the ISC_R_CANCEL events doesn't propagate to the ns_client
callback.
In the future, the better solution how to fix this would be to look at
the shutdown paths and let them all propagate from bottom (loopmgr) to
top (f.e. ns_client).
Ondřej Surý [Tue, 6 Dec 2022 14:59:35 +0000 (15:59 +0100)]
Fix reference counting in get_attached_entry
When get_attached_entry() encounters entry that would be expired, it
needs to get reference to the entry before calling maybe_expire_entry(),
so the ADB entry doesn't get destroyed inside the its own lock.
Michal Nowak [Tue, 22 Nov 2022 09:27:17 +0000 (10:27 +0100)]
Add ASAN- and TSAN-enabled respdiff jobs
Neither of the new CI jobs can reliably pass at the moment; hence they
are defined with "allow_failure: true" until issues in the code base are
resolved.
Mark Andrews [Wed, 30 Nov 2022 08:32:11 +0000 (19:32 +1100)]
Check that restored catalog zone works
Using a restored catalog zone excercised a use-after-free bug.
The test checks that the use-after-free bug is gone and is just
a reasonable behaviour check in its own right.
Mark Andrews [Wed, 30 Nov 2022 07:44:37 +0000 (18:44 +1100)]
Call dns_db_updatenotify_unregister earlier
dns_db_updatenotify_unregister needed to be called earlier to ensure
that listener->onupdate_arg always points to a valid object. The
existing lazy cleanup in rbtdb_free did not ensure that.