This commit adds support for timestamps in iso8601 format with timezone
when logging. This is exposed through the iso8601-tzinfo printtime
suboption.
It also makes the new logging format the default for -g output,
hopefully removing the need for custom timestamp parsing in scripts.
Michal Nowak [Tue, 1 Oct 2024 12:05:39 +0000 (12:05 +0000)]
chg: test: Replace dns.query module with isctest.query
The `dns.query.udp` and `dns.query.tcp` methods are [prone to timeouts](https://gitlab.isc.org/isc-projects/bind9/-/jobs/4785053); their `isctest.query` equivalents should be used in system tests instead.
Merge branch 'mnowak/convert-dns-query-udp-and-tcp-to-isctest-query' into 'main'
Alessio Podda [Tue, 1 Oct 2024 10:33:56 +0000 (10:33 +0000)]
fix: dev: Null clausedefs for ancient options
This commit nulls all type fields for the clausedef lists that are
declared ancient, and removes the corresponding cfg_type_t and parsing
functions when they are found to be unused after the change.
Among others, it removes some leftovers from #1913.
Closes #4962
Merge branch '4962-null-clausedef-types-for-ancient-options' into 'main'
This commit nulls all type fields for the clausedef lists that are
declared ancient, and removes the corresponding cfg_type_t and parsing
functions when they are found to be unused after the change.
fix: doc: Restore text about sig validity and SOA expire
When `sig-validity-interval` was obsoleted, the text that the signature validity interval should be multiples of the SOA expire interval was removed. Restore this text to the description of the `signatures-validity` option.
Closes #4951
Merge branch '4951-document-signatures-validity-soa-expire' into 'main'
The example.com zone file given in the "Configurations and Zone Files"
chapter has an SOA expire of 3 weeks, which is not a multiple of
the default signatures-validity value. Adjust the SOA expire so that
it is much lower than the signatures-validity default.
When `sig-validity-interval` was obsoleted, the text that the signature
validity interval should be multiples of the SOA expire interval was
removed. Restore this text to the description of the
`signatures-validity` option.
Mark Andrews [Tue, 1 Oct 2024 01:26:56 +0000 (01:26 +0000)]
fix: usr: Fix a bug in the static-stub implementation
Static-stub addresses and addresses from other sources were being
mixed together, resulting in static-stub queries going to addresses
not specified in the configuration, or alternatively, static-stub
addresses being used instead of the correct server addresses.
Closes #4850
Merge branch '4850-add-an-additional-class-of-names-to-adb' into 'main'
Mark Andrews [Wed, 14 Aug 2024 13:46:44 +0000 (23:46 +1000)]
Store static-stub addresses seperately in the adb
Static-stub address and addresses from other sources where being
mixed together resulting in static-stub queries going to addresses
not specified in the configuration or alternatively static-stub
addresses being used instead of the real addresses.
chg: dev: Use release memory ordering when incrementing reference counter
As the relaxed memory ordering doesn't ensure any memory
synchronization, it is possible that the increment will succeed even
in the case when it should not - there is a race between
atomic_fetch_sub(..., acq_rel) and atomic_fetch_add(..., relaxed).
Only the result is consistent, but the previous value for both calls
could be same when both calls are executed at the same time.
Merge branch 'ondrej/use-release-memory-ordering-for-reference-counting' into 'main'
Use release memory ordering when incrementing reference counter
As the relaxed memory ordering doesn't ensure any memory
synchronization, it is possible that the increment will succeed even
in the case when it should not - there is a race between
atomic_fetch_sub(..., acq_rel) and atomic_fetch_add(..., relaxed).
Only the result is consistent, but the previous value for both calls
could be same when both calls are executed at the same time.
fix: dev: Add a missing rcu_read_unlock() call on exit path
An exit path in the dns_dispatch_add() function fails to get out of
the RCU critical section when returning early. Add the missing
rcu_read_unlock() call.
Merge branch 'aram/add-missing-rcu_read_unlock-in-dns_dispatch_add' into 'main'
An exit path in the dns_dispatch_add() function fails to get out of
the RCU critical section when returning early. Add the missing
rcu_read_unlock() call.
chg: usr: Honour the Control Group memory contraints on Linux
On Linux, the system administrator can use Control Group ``cgroup``
mechanism to limit the amount of available memory to the process. This
limit will be honoured when calculating the percentage-based values.
Merge branch 'ondrej/use-uv_get_available_memory-doc' into 'main'
Document that we now honour the cgroup memory limit
On Linux, the system administrator can use Control Group ``cgroup``
mechanism to limit the amount of available memory to the process. This
limit will be honoured when calculating the percentage-based values.
Mark Andrews [Wed, 25 Sep 2024 12:03:45 +0000 (12:03 +0000)]
new: usr: Added WALLET type
Add the new record type WALLET (262). This provides a mapping from a domain name to a cryptographic currency wallet. Multiple mappings can exist if multiple records exist.
Closes #4947
Merge branch '4947-add-wallet-type-to-named' into 'main'
fix: usr: Fix the 'rndc dumpdb' command's error reporting
The 'rndc dumpdb' command wasn't reporting errors which
occurred when starting up the database dump process by named,
like, for example, a permission denied error for the
'dump-file' file. This has been fixed. Note, however, that
'rndc dumpdb' performs asynchronous writes, so errors can
also occur during the dumping process, which will not be
reported back to 'rndc', but which will still be logged by
named.
Closes #4944
Merge branch '4944-rndc-dumpdb-do-not-ignore-errors' into 'main'
The named_server_dumpdb() function, which is called when a 'rndc dumpdb'
command is issued, returns a 'isc_result_t' result code and it has been
always ignored since its introduction in eb8713ed947fdf22a41dad673d561896dd6fe4a2, where it was still called
ns_server_dumpdb(). The orignal reasoning is not preserved, but it could
have been also a simple copy-paste mistake, as there are commands, which
return 'void' and require manually setting 'result = ISC_R_SUCCESS;', as
it was done here. Anyway, named will now return the actual result, and
'rndc' will report an error, when the 'dumpdb' command fails.
Since the changes aren't tracked in the single changelog.rst file,
generate the changelog to stdout instead, so it can be easily redirected
to the proper file.
Use libuv functions to get memory available to BIND 9
This change uses uv_get_total_memory() to get the memory available to
BIND 9 with possible modification by uv_get_constrained_memory() if the
libuv version is recent enough to honour constraints created by
f.e. cgroups.
chg: ci: Increase the load TCP/DoT shotgun perf tests
Due to the recent improvements to the TCP processing, much higher loads
can be handled by BIND9 without causing client timeouts. The updated
parameters give us useful data for both cold and hot cache testing.
Merge branch 'nicki/increase-tcp-dot-shotgun-load' into 'main'
Due to the recent improvements to the TCP processing, much higher loads
can be handled by BIND9 without causing client timeouts. The updated
parameters give us useful data for both cold and hot cache testing.
Michal Nowak [Mon, 23 Sep 2024 15:39:16 +0000 (15:39 +0000)]
chg: test: Downgrade "timeout" and "attempts" arguments in shutdown
The shutdown system test sends queries when named is shutting down, not
in an attempt to get answers but to destabilize the server into a crash.
With isctest.query.udp() defaulting to try up to ten times with a
ten-second timeout to get a response we don't care about from a likely
terminated server, we make the test run much longer than needed because
of retries and long timeouts.
Also, see isc-projects/bind9#4943.
Merge branch 'mnowak/shutdown-downgrade-timeout-and-attempts-arguments' into 'main'
Michal Nowak [Mon, 16 Sep 2024 12:55:06 +0000 (14:55 +0200)]
Downgrade "timeout" and "attempts" arguments in shutdown
The shutdown system test sends queries when named is shutting down, not
in an attempt to get answers but to destabilize the server into a crash.
With isctest.query.udp() defaulting to try up to ten times with a
ten-second timeout to get a response we don't care about from a likely
terminated server, we make the test run much longer than needed because
of retries and long timeouts.
chg: dev: Use uv_available_parallelism() if available
Instead of cooking up our own code for getting the number of available
CPUs for named to use, make use of uv_available_parallelism() from
libuv >= 1.44.0.
Merge branch 'ondrej/use-uv_available_parallelism-if-available' into 'main'
Add support to read number of online CPUs on OpenBSD
The OpenBSD doesn't have sysctlbyname(), but sysctl() can be used to
read the number of online/available CPUs by reading following MIB(s):
[CTL_HW, HW_NCPUONLINE] with fallback to [CTL_HW, HW_NCPU].
Cleanup the sysctlbyname and friends configure checks and ifdefs
Cleanup various checks and cleanups that are available on the all
platforms like sysctlbyname() and various related <sys/*.h> headers
that are either defined in POSIX or available on Linux and all BSDs.
Instead of cooking up our own code for getting the number of available
CPUs for named to use, make use of uv_available_parallelism() from
libuv >= 1.44.0.
Incoming transfers that took longer than 30 seconds would stop reading from the TCP stream and the incoming transfer would be indefinitely stuck causing BIND 9 to hang during shutdown.
This has been fixed and the `max-transfer-time-in` and `max-transfer-idle-in` timeouts are now honoured.
Closes #4949
Merge branch '4949-fix-ignored-and-invalid-dispatch-timeout-in-dns_xfrin' into 'main'
Don't enable timeouts in dns_dispatch for incoming transfers
The dns_dispatch_add() call in the dns_xfrin unit had hardcoded 30
second limit. This meant that any incoming transfer would be stopped in
it didn't finish within 30 seconds limit. Additionally, dns_xfrin
callback was ignoring the return value from dns_dispatch_getnext() when
restarting the reading from the TCP stream; this could cause transfers
to get stuck waiting for a callback that would never come due to the
dns_dispatch having already been shut down.
Call the dns_dispatch_add() without a timeout and properly handle the
result code from the dns_dispatch_getnext().
The dns_dispatch_add() has timeout parameter that could not be 0 (for
not timeout). Modify the dns_dispatch implementation to accept a zero
timeout for cases where the timeouts are undesirable because they are
managed externally.
chg: dev: Restore the number of threadpool threads back to original value
The issue of long-running operations potentially blocking query resolution has been fixed. Revert this temporary workaround and restore the number of threadpool threads.
Related #4898
Merge branch '4898-remove-workaround-and-note' into 'main'
Petr Menšík [Wed, 6 Oct 2021 11:53:33 +0000 (13:53 +0200)]
Move common flags logging to shared functions
Query and response log shares the same flags. Move flags logging out of
log_query to share it with log_response. Use buffer instead of snprintf
to fill flags a bit faster.
Petr Menšík [Tue, 13 Jul 2021 18:12:11 +0000 (20:12 +0200)]
Make responselog flags similar to querylog
Remove answer flag from log, log instead count of records for each
message section. Include EDNS version and few flags of response. Add
also status of result.
when the QP cache was adapted from the RBT database, some names
weren't changed. this could be confusing, so let's change them now.
also, we no longer need to include rbt.h.
rem: usr: Remove DNSRPS implementation from the open-source version
DNSRPS was the API for a commercial implementation of Response-Policy
Zones that was supposedly better. However, it was never open-sourced
and has only ever been available from a single vendor. This goes against
the principle that the open-source edition of BIND 9 should contain only
features that are generally available and universal.
This commit removes the DNSRPS implementation from BIND 9. It may be
reinstated in the subscription edition if there's enough interest from
customers, but it would have to be rewritten as a plugin (hook) instead
of hard-wiring it again in so many places.
Merge branch 'ondrej/remove-DNSRPS-from-open-source-edition' into 'main'
Ondřej Surý [Mon, 19 Aug 2024 15:19:21 +0000 (17:19 +0200)]
Remove DNSRPS implementation
DNSRPS was the API for a commercial implementation of Response-Policy
Zones that was supposedly better. However, it was never open-sourced
and has only ever been available from a single vendor. This goes against
the principle that the open-source edition of BIND 9 should contain only
features that are generally available and universal.
This commit removes the DNSRPS implementation from BIND 9. It may be
reinstated in the subscription edition if there's enough interest from
customers, but it would have to be rewritten as a plugin (hook) instead
of hard-wiring it again in so many places.
Evan Hunt [Thu, 22 Aug 2024 23:01:50 +0000 (16:01 -0700)]
use uv_dlopen() instead of dlopen() when linking DNSRPZ
take advantage of libuv's shared library handling capability
when linking to a DNSRPS library. (see b396f555861 and 37b9511ce1d
for prior related work.)
If the operating system UDP queue gets full and the outgoing UDP sending
starts to be delayed, BIND 9 could exhibit memory spikes as it tries to
enqueue all the outgoing UDP messages. Try a bit harder to deliver the
outgoing UDP messages synchronously and if that fails, drop the outgoing
DNS message that would get queued up and then timeout on the client side.
Closes #4930
Merge branch '4930-limit-the-UDP-send-queue' into 'main'
If the operating system UDP queue gets full and the outgoing UDP sending
starts to be delayed, BIND 9 could exhibit memory spikes as it tries to
enqueue all the outgoing UDP messages. As those are not going to be
delivered anyway (as we argued when we stopped enlarging the operating
system send and receive buffers), try to send the UDP messages directly
using `uv_udp_try_send()` and if that fails, drop the outgoing UDP
message.
We currently set SO_INCOMING_CPU incorrectly, and testing by Ondrej
shows that fixing the issue by setting affinities is worse than letting
the kernel schedule threads without constraints. So we should not set
SO_INCOMING_CPU anymore.
Closes #4936
Merge branch '4936-remove-so-incoming-cpu' into 'main'
We currently set SO_INCOMING_CPU incorrectly, and testing by Ondrej
shows that fixing the issue and setting affinities is worse than letting
the kernel schedule threads without constraints. So we should not set
SO_INCOMING_CPU anymore.
fix: usr: Fix a statistics channel counter bug when 'forward only' zones are used
When resolving a zone with a 'forward only' policy, and
finding out that all the forwarders are marked as "bad",
the 'ServerQuota' counter of the statistics channel was
incorrectly increased. This has been fixed.
Closes #1793
Merge branch '1793-serverquota-counter-bug-with-forward-only' into 'main'
Add a statistics channel check in the forward system test
Check that the fix in the previous commit works and that the
'ServerQuota' counter in the statistics channel is still unset
after a SERVFAIL result in a 'forward only' zone.
The 'all_spilled' local variable in resolver.c:fctx_getaddresses()
is 'true' by default, and only becomes false when there is at least
one successfully found NS address. However, when a 'forward only;'
configuration is used, the code jumps over the part where it looks
for NS addresses and doesn't reset the 'all_spilled' to false, which
results in incorretly increased 'serverquota' statistics variable,
and also in invalid return error code from the function. The result
code error didn't make any differences, because all codes other than
'ISC_R_SUCCESS' or 'DNS_R_WAIT' were treated in the same way, and
the result code was never logged anywhere.
Set the default value of 'all_spilled' to 'false', and only make it
'true' before actually starting to look up NS addresses.
Mark Andrews [Mon, 16 Sep 2024 02:49:11 +0000 (02:49 +0000)]
chg: dev: Remove statslock from dnssec-signzone
Silence Coverity CID 468757 and 468767 (DATA RACE read not locked) by converting dnssec-signzone to use atomics for statistics counters rather than using a lock.
Closes #4939
Merge branch '4939-remove-stats-lock-from-dnssec-signzone' into 'main'
Mark Andrews [Fri, 13 Sep 2024 03:30:34 +0000 (13:30 +1000)]
Remove 'statslock' from dnssec-signzone
Silence Coverity CID 468757 and 468767 (DATA RACE read not locked)
by converting dnssec-signzone to use atomics for statistics counters
rather than using a lock. This should be marginally faster than
using the lock as well when statistics are requested.
fix: usr: Separate DNSSEC validation from the long-running tasks
As part of the KeyTrap \[CVE-2023-50387\] mitigation, the DNSSEC CPU-intensive operations were offloaded to a separate threadpool that we use to run other tasks that could affect the networking latency.
If that threadpool is running some long-running tasks like RPZ, catalog zone processing, or zone file operations, it would delay DNSSEC validations to a point where the resolving signed DNS records would fail.
Split the CPU-intensive and long-running tasks into separate threadpools in a way that the long-running tasks don't block the CPU-intensive operations.
Closes #4898
Merge branch '4898-move-offloaded-DNSSEC-to-own-threads' into 'main'
Move offloaded DNSSEC operations to different helper threads
Currently, the isc_work API is overloaded. It runs both the
CPU-intensive operations like DNSSEC validations and long-term tasks
like RPZ processing, CATZ processing, zone file loading/dumping and few
others.
Under specific circumstances, when many large zones are being loaded, or
RPZ zones processed, this stops the CPU-intensive tasks and the DNSSEC
validation is practically stopped until the long-running tasks are
finished.
As this is undesireable, this commit moves the CPU-intensive operations
from the isc_work API to the isc_helper API that only runs fast memory
cleanups now.
Add isc_helper API that adds 1:1 thread for each loop
Add an extra thread that can be used to offload operations that would
affect latency, but are not long-running tasks; those are handled by
isc_work API.
Each isc_loop now has matching isc_helper thread that also built on top
of uv_loop. In fact, it matches most of the isc_loop functionality, but
only the `isc_helper_run()` asynchronous call is exposed.