Aram Sargsyan [Wed, 2 Oct 2024 16:22:38 +0000 (16:22 +0000)]
Fix error path bugs in the "recursing-clients" list management
In two places, after linking the client to the manager's
"recursing-clients" list using the check_recursionquota()
function, the query.c module fails to unlink it on error
paths. Fix the bugs by unlinking the client from the list.
Also make sure that unlinking happens before detaching the
client's handle, as it is the logically correct order, e.g.
in case if it's the last handle and ns__client_reset_cb()
can be called because of the detachment.
Arаm Sаrgsyаn [Wed, 9 Oct 2024 10:31:09 +0000 (10:31 +0000)]
fix: dev: Fix a data race in dns_zone_getxfrintime()
The dns_zone_getxfrintime() function fails to lock the zone before
accessing its 'xfrintime' structure member, which can cause a data
race between soa_query() and the statistics channel. Add the missing
locking/unlocking pair, like it's done in numerous other similar
functions.
Closes #4976
Merge branch '4976-zone-xfrintime-data-race-fix' into 'main'
Aram Sargsyan [Mon, 7 Oct 2024 12:26:59 +0000 (12:26 +0000)]
Fix a data race in dns_zone_getxfrintime()
The dns_zone_getxfrintime() function fails to lock the zone before
accessing its 'xfrintime' structure member, which can cause a data
race between soa_query() and the statistics channel. Add the missing
locking/unlocking pair, like it's done in numerous other similar
functions.
Arаm Sаrgsyаn [Wed, 9 Oct 2024 09:12:45 +0000 (09:12 +0000)]
fix: dev: Clean up 'nodetach' in ns_client
The 'nodetach' member is a leftover from the times when non-zero
'stale-answer-client-timeout' values were supported, and currently
is always 'false'. Clean up the member and its usage.
Merge branch 'aram/cleanup-ns-client-nodetach' into 'main'
Aram Sargsyan [Mon, 7 Oct 2024 15:29:14 +0000 (15:29 +0000)]
Clean up 'nodetach' in ns_client
The 'nodetach' member is a leftover from the times when non-zero
'stale-answer-client-timeout' values were supported, and currently
is always 'false'. Clean up the member and its usage.
Ondřej Surý [Wed, 2 Oct 2024 12:16:03 +0000 (12:16 +0000)]
fix: dev: Don't enable REUSEADDR on outgoing UDP sockets
The outgoing UDP sockets enabled `SO_REUSEADDR` that allows sharing of the UDP sockets, but with one big caveat - the socket that was opened the last would get all traffic. The dispatch code would ignore the invalid responses in the dns_dispatch, but this could lead to unexpected results.
Merge branch 'ondrej/fix-outgoing-UDP-port-selection' into 'main'
Currently, the outgoing UDP sockets have enabled
SO_REUSEADDR (SO_REUSEPORT on BSDs) which allows multiple UDP sockets to
bind to the same address+port. There's one caveat though - only a
single (the last one) socket is going to receive all the incoming
traffic. This in turn could lead to incoming DNS message matching to
invalid dns_dispatch and getting dropped.
Disable setting the SO_REUSEADDR on the outgoing UDP sockets. This
needs to be done explicitly because `uv_udp_open()` silently enables the
option on the socket.
Ondřej Surý [Wed, 2 Oct 2024 06:37:48 +0000 (08:37 +0200)]
Skip TCP dispatch responses that are not ours
When matching the TCP dispatch responses, we should skip the responses
that do not belong to our TCP connection. This can happen with faulty
upstream server that sends invalid QID back to us.
Arаm Sаrgsyаn [Wed, 2 Oct 2024 09:51:40 +0000 (09:51 +0000)]
fix: dev: Don't ignore the local port number in dns_dispatch_add() for TCP
The dns_dispatch_add() function registers the 'resp' entry in
'disp->mgr->qids' hash table with 'resp->port' being 0, but in
tcp_recv_success(), when looking up an entry in the hash table
after a successfully received data the port is used, so if the
local port was set (i.e. it was not 0) it fails to find the
entry and results in an unexpected error.
Set the 'resp->port' to the given local port value extracted from
'disp->local'.
Closes #4969
Merge branch '4969-dispatch-tcp-source-port-bug-fix' into 'main'
Aram Sargsyan [Tue, 1 Oct 2024 14:16:47 +0000 (14:16 +0000)]
Don't ignore the local port number in dns_dispatch_add() for TCP
The dns_dispatch_add() function registers the 'resp' entry in
'disp->mgr->qids' hash table with 'resp->port' being 0, but in
tcp_recv_success(), when looking up an entry in the hash table
after a successfully received data the port is used, so if the
local port was set (i.e. it was not 0) it fails to find the
entry and results in an unexpected error.
Set the 'resp->port' to the given local port value extracted from
'disp->local'.
Alessio Podda [Wed, 2 Oct 2024 08:16:17 +0000 (08:16 +0000)]
new: usr: Support ISO timestamps with timezone information
The configuration option `print-time` can now be set to `iso8601-tzinfo` in order to use the ISO 8601 timestamp with timezone information when logging. This is used as a default for `named -g`.
Closes #4963
Merge branch '4963-provide-timezone-information-in-log-timestamps' into 'main'
This commit adds support for timestamps in iso8601 format with timezone
when logging. This is exposed through the iso8601-tzinfo printtime
suboption.
It also makes the new logging format the default for -g output,
hopefully removing the need for custom timestamp parsing in scripts.
Michal Nowak [Tue, 1 Oct 2024 12:05:39 +0000 (12:05 +0000)]
chg: test: Replace dns.query module with isctest.query
The `dns.query.udp` and `dns.query.tcp` methods are [prone to timeouts](https://gitlab.isc.org/isc-projects/bind9/-/jobs/4785053); their `isctest.query` equivalents should be used in system tests instead.
Merge branch 'mnowak/convert-dns-query-udp-and-tcp-to-isctest-query' into 'main'
Alessio Podda [Tue, 1 Oct 2024 10:33:56 +0000 (10:33 +0000)]
fix: dev: Null clausedefs for ancient options
This commit nulls all type fields for the clausedef lists that are
declared ancient, and removes the corresponding cfg_type_t and parsing
functions when they are found to be unused after the change.
Among others, it removes some leftovers from #1913.
Closes #4962
Merge branch '4962-null-clausedef-types-for-ancient-options' into 'main'
This commit nulls all type fields for the clausedef lists that are
declared ancient, and removes the corresponding cfg_type_t and parsing
functions when they are found to be unused after the change.
fix: doc: Restore text about sig validity and SOA expire
When `sig-validity-interval` was obsoleted, the text that the signature validity interval should be multiples of the SOA expire interval was removed. Restore this text to the description of the `signatures-validity` option.
Closes #4951
Merge branch '4951-document-signatures-validity-soa-expire' into 'main'
The example.com zone file given in the "Configurations and Zone Files"
chapter has an SOA expire of 3 weeks, which is not a multiple of
the default signatures-validity value. Adjust the SOA expire so that
it is much lower than the signatures-validity default.
When `sig-validity-interval` was obsoleted, the text that the signature
validity interval should be multiples of the SOA expire interval was
removed. Restore this text to the description of the
`signatures-validity` option.
Mark Andrews [Tue, 1 Oct 2024 01:26:56 +0000 (01:26 +0000)]
fix: usr: Fix a bug in the static-stub implementation
Static-stub addresses and addresses from other sources were being
mixed together, resulting in static-stub queries going to addresses
not specified in the configuration, or alternatively, static-stub
addresses being used instead of the correct server addresses.
Closes #4850
Merge branch '4850-add-an-additional-class-of-names-to-adb' into 'main'
Mark Andrews [Wed, 14 Aug 2024 13:46:44 +0000 (23:46 +1000)]
Store static-stub addresses seperately in the adb
Static-stub address and addresses from other sources where being
mixed together resulting in static-stub queries going to addresses
not specified in the configuration or alternatively static-stub
addresses being used instead of the real addresses.
chg: dev: Use release memory ordering when incrementing reference counter
As the relaxed memory ordering doesn't ensure any memory
synchronization, it is possible that the increment will succeed even
in the case when it should not - there is a race between
atomic_fetch_sub(..., acq_rel) and atomic_fetch_add(..., relaxed).
Only the result is consistent, but the previous value for both calls
could be same when both calls are executed at the same time.
Merge branch 'ondrej/use-release-memory-ordering-for-reference-counting' into 'main'
Use release memory ordering when incrementing reference counter
As the relaxed memory ordering doesn't ensure any memory
synchronization, it is possible that the increment will succeed even
in the case when it should not - there is a race between
atomic_fetch_sub(..., acq_rel) and atomic_fetch_add(..., relaxed).
Only the result is consistent, but the previous value for both calls
could be same when both calls are executed at the same time.
fix: dev: Add a missing rcu_read_unlock() call on exit path
An exit path in the dns_dispatch_add() function fails to get out of
the RCU critical section when returning early. Add the missing
rcu_read_unlock() call.
Merge branch 'aram/add-missing-rcu_read_unlock-in-dns_dispatch_add' into 'main'
An exit path in the dns_dispatch_add() function fails to get out of
the RCU critical section when returning early. Add the missing
rcu_read_unlock() call.
chg: usr: Honour the Control Group memory contraints on Linux
On Linux, the system administrator can use Control Group ``cgroup``
mechanism to limit the amount of available memory to the process. This
limit will be honoured when calculating the percentage-based values.
Merge branch 'ondrej/use-uv_get_available_memory-doc' into 'main'
Document that we now honour the cgroup memory limit
On Linux, the system administrator can use Control Group ``cgroup``
mechanism to limit the amount of available memory to the process. This
limit will be honoured when calculating the percentage-based values.
Mark Andrews [Wed, 25 Sep 2024 12:03:45 +0000 (12:03 +0000)]
new: usr: Added WALLET type
Add the new record type WALLET (262). This provides a mapping from a domain name to a cryptographic currency wallet. Multiple mappings can exist if multiple records exist.
Closes #4947
Merge branch '4947-add-wallet-type-to-named' into 'main'
fix: usr: Fix the 'rndc dumpdb' command's error reporting
The 'rndc dumpdb' command wasn't reporting errors which
occurred when starting up the database dump process by named,
like, for example, a permission denied error for the
'dump-file' file. This has been fixed. Note, however, that
'rndc dumpdb' performs asynchronous writes, so errors can
also occur during the dumping process, which will not be
reported back to 'rndc', but which will still be logged by
named.
Closes #4944
Merge branch '4944-rndc-dumpdb-do-not-ignore-errors' into 'main'
The named_server_dumpdb() function, which is called when a 'rndc dumpdb'
command is issued, returns a 'isc_result_t' result code and it has been
always ignored since its introduction in eb8713ed947fdf22a41dad673d561896dd6fe4a2, where it was still called
ns_server_dumpdb(). The orignal reasoning is not preserved, but it could
have been also a simple copy-paste mistake, as there are commands, which
return 'void' and require manually setting 'result = ISC_R_SUCCESS;', as
it was done here. Anyway, named will now return the actual result, and
'rndc' will report an error, when the 'dumpdb' command fails.
Since the changes aren't tracked in the single changelog.rst file,
generate the changelog to stdout instead, so it can be easily redirected
to the proper file.
Use libuv functions to get memory available to BIND 9
This change uses uv_get_total_memory() to get the memory available to
BIND 9 with possible modification by uv_get_constrained_memory() if the
libuv version is recent enough to honour constraints created by
f.e. cgroups.
chg: ci: Increase the load TCP/DoT shotgun perf tests
Due to the recent improvements to the TCP processing, much higher loads
can be handled by BIND9 without causing client timeouts. The updated
parameters give us useful data for both cold and hot cache testing.
Merge branch 'nicki/increase-tcp-dot-shotgun-load' into 'main'
Due to the recent improvements to the TCP processing, much higher loads
can be handled by BIND9 without causing client timeouts. The updated
parameters give us useful data for both cold and hot cache testing.
Michal Nowak [Mon, 23 Sep 2024 15:39:16 +0000 (15:39 +0000)]
chg: test: Downgrade "timeout" and "attempts" arguments in shutdown
The shutdown system test sends queries when named is shutting down, not
in an attempt to get answers but to destabilize the server into a crash.
With isctest.query.udp() defaulting to try up to ten times with a
ten-second timeout to get a response we don't care about from a likely
terminated server, we make the test run much longer than needed because
of retries and long timeouts.
Also, see isc-projects/bind9#4943.
Merge branch 'mnowak/shutdown-downgrade-timeout-and-attempts-arguments' into 'main'
Michal Nowak [Mon, 16 Sep 2024 12:55:06 +0000 (14:55 +0200)]
Downgrade "timeout" and "attempts" arguments in shutdown
The shutdown system test sends queries when named is shutting down, not
in an attempt to get answers but to destabilize the server into a crash.
With isctest.query.udp() defaulting to try up to ten times with a
ten-second timeout to get a response we don't care about from a likely
terminated server, we make the test run much longer than needed because
of retries and long timeouts.
chg: dev: Use uv_available_parallelism() if available
Instead of cooking up our own code for getting the number of available
CPUs for named to use, make use of uv_available_parallelism() from
libuv >= 1.44.0.
Merge branch 'ondrej/use-uv_available_parallelism-if-available' into 'main'
Add support to read number of online CPUs on OpenBSD
The OpenBSD doesn't have sysctlbyname(), but sysctl() can be used to
read the number of online/available CPUs by reading following MIB(s):
[CTL_HW, HW_NCPUONLINE] with fallback to [CTL_HW, HW_NCPU].
Cleanup the sysctlbyname and friends configure checks and ifdefs
Cleanup various checks and cleanups that are available on the all
platforms like sysctlbyname() and various related <sys/*.h> headers
that are either defined in POSIX or available on Linux and all BSDs.
Instead of cooking up our own code for getting the number of available
CPUs for named to use, make use of uv_available_parallelism() from
libuv >= 1.44.0.
Incoming transfers that took longer than 30 seconds would stop reading from the TCP stream and the incoming transfer would be indefinitely stuck causing BIND 9 to hang during shutdown.
This has been fixed and the `max-transfer-time-in` and `max-transfer-idle-in` timeouts are now honoured.
Closes #4949
Merge branch '4949-fix-ignored-and-invalid-dispatch-timeout-in-dns_xfrin' into 'main'
Don't enable timeouts in dns_dispatch for incoming transfers
The dns_dispatch_add() call in the dns_xfrin unit had hardcoded 30
second limit. This meant that any incoming transfer would be stopped in
it didn't finish within 30 seconds limit. Additionally, dns_xfrin
callback was ignoring the return value from dns_dispatch_getnext() when
restarting the reading from the TCP stream; this could cause transfers
to get stuck waiting for a callback that would never come due to the
dns_dispatch having already been shut down.
Call the dns_dispatch_add() without a timeout and properly handle the
result code from the dns_dispatch_getnext().
The dns_dispatch_add() has timeout parameter that could not be 0 (for
not timeout). Modify the dns_dispatch implementation to accept a zero
timeout for cases where the timeouts are undesirable because they are
managed externally.
chg: dev: Restore the number of threadpool threads back to original value
The issue of long-running operations potentially blocking query resolution has been fixed. Revert this temporary workaround and restore the number of threadpool threads.
Related #4898
Merge branch '4898-remove-workaround-and-note' into 'main'
Petr Menšík [Wed, 6 Oct 2021 11:53:33 +0000 (13:53 +0200)]
Move common flags logging to shared functions
Query and response log shares the same flags. Move flags logging out of
log_query to share it with log_response. Use buffer instead of snprintf
to fill flags a bit faster.
Petr Menšík [Tue, 13 Jul 2021 18:12:11 +0000 (20:12 +0200)]
Make responselog flags similar to querylog
Remove answer flag from log, log instead count of records for each
message section. Include EDNS version and few flags of response. Add
also status of result.
when the QP cache was adapted from the RBT database, some names
weren't changed. this could be confusing, so let's change them now.
also, we no longer need to include rbt.h.
rem: usr: Remove DNSRPS implementation from the open-source version
DNSRPS was the API for a commercial implementation of Response-Policy
Zones that was supposedly better. However, it was never open-sourced
and has only ever been available from a single vendor. This goes against
the principle that the open-source edition of BIND 9 should contain only
features that are generally available and universal.
This commit removes the DNSRPS implementation from BIND 9. It may be
reinstated in the subscription edition if there's enough interest from
customers, but it would have to be rewritten as a plugin (hook) instead
of hard-wiring it again in so many places.
Merge branch 'ondrej/remove-DNSRPS-from-open-source-edition' into 'main'
Ondřej Surý [Mon, 19 Aug 2024 15:19:21 +0000 (17:19 +0200)]
Remove DNSRPS implementation
DNSRPS was the API for a commercial implementation of Response-Policy
Zones that was supposedly better. However, it was never open-sourced
and has only ever been available from a single vendor. This goes against
the principle that the open-source edition of BIND 9 should contain only
features that are generally available and universal.
This commit removes the DNSRPS implementation from BIND 9. It may be
reinstated in the subscription edition if there's enough interest from
customers, but it would have to be rewritten as a plugin (hook) instead
of hard-wiring it again in so many places.
Evan Hunt [Thu, 22 Aug 2024 23:01:50 +0000 (16:01 -0700)]
use uv_dlopen() instead of dlopen() when linking DNSRPZ
take advantage of libuv's shared library handling capability
when linking to a DNSRPS library. (see b396f555861 and 37b9511ce1d
for prior related work.)
If the operating system UDP queue gets full and the outgoing UDP sending
starts to be delayed, BIND 9 could exhibit memory spikes as it tries to
enqueue all the outgoing UDP messages. Try a bit harder to deliver the
outgoing UDP messages synchronously and if that fails, drop the outgoing
DNS message that would get queued up and then timeout on the client side.
Closes #4930
Merge branch '4930-limit-the-UDP-send-queue' into 'main'
If the operating system UDP queue gets full and the outgoing UDP sending
starts to be delayed, BIND 9 could exhibit memory spikes as it tries to
enqueue all the outgoing UDP messages. As those are not going to be
delivered anyway (as we argued when we stopped enlarging the operating
system send and receive buffers), try to send the UDP messages directly
using `uv_udp_try_send()` and if that fails, drop the outgoing UDP
message.