Mark Andrews [Thu, 12 Sep 2024 03:27:10 +0000 (03:27 +0000)]
fix: usr: Don't allow statistics-channel if libxml2 and libjson-c are unsupported
When the libxml2 and libjson-c libraries are not supported, the statistics channel can't return anything useful, so it is now disabled. Use of `statistics-channel` in `named.conf` is a fatal error.
Closes #4895
Merge branch '4895-link-style-sheet-to-libxml2-support' into 'main'
Mark Andrews [Wed, 11 Sep 2024 23:05:34 +0000 (23:05 +0000)]
fix: test: The statschannel test fails if one of libxml2 or json-c is configured
The `statschannel` system test failed if only one of `libxml2` or `json-c` was
available or configured, because checks were being run against the
unavailable statistics page.
Closes #4919
Merge branch '4919-fix-statschannel-system-test' into 'main'
chg: usr: allow IXFR-to-AXFR fallback on DNS_R_TOOMANYRECORDS
This change allows fallback from an IXFR failure to AXFR when the reason is `DNS_R_TOOMANYRECORDS`. This error condition may be temporary, occurring only in an intermediate version within the IXFR transactions, and the latest version of the zone may not have it. In such a case, without this fallback the secondary would never be able to update the zone, even though it otherwise could.
This fallback behavior is particularly useful with the recently introduced `max-records-per-type` and `max-types-per-name` options: the primary may not have these limitations and may temporarily introduce "too many" records, breaking IXFR. If the primary side subsequently deletes these records, this fallback will help recover from the zone transfer failure automatically; without it, the secondary side would first need to increase the limit, which requires more operational overhead and has its own adverse effect.
Closes #4928
Merge branch 'fallback-ixfr-to-axfr-on-toomanyrecords' into 'main'
JINMEI Tatuya [Fri, 16 Aug 2024 07:53:38 +0000 (16:53 +0900)]
allow IXFR-to-AXFR fallback on DNS_R_TOOMANYRECORDS
This change allows fallback from an IXFR failure to AXFR when the
reason is DNS_R_TOOMANYRECORDS. This error condition may be temporary,
occurring only in an intermediate version within the IXFR
transactions, and the latest version of the zone may not have it. In
such a case, without this fallback the secondary would never be able
to update the zone, even though it otherwise could.
This fallback behavior is particularly useful with the recently
introduced max-records-per-type and max-types-per-name options:
the primary may not have these limitations and may temporarily
introduce "too many" records, breaking IXFR. If the primary side
subsequently deletes these records, this fallback will help recover
from the zone transfer failure automatically; without it, the secondary
side would first need to increase the limit, which requires more
operational overhead and has its own adverse effect.
This change also fixes a minor glitch where DNS_R_TOOMANYRECORDS wasn't
logged in xfrin_fail.
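The fallback decision can be sketched roughly as follows. This is an illustrative Python model, not the code in named: the `XfrResult` enum, the `AXFR_FALLBACK_RESULTS` set, and `transfer_zone()` are hypothetical names, and only the TOOMANYRECORDS entry is taken from the description above.

```python
from enum import Enum, auto


class XfrResult(Enum):
    """Hypothetical stand-ins for zone transfer result codes."""
    SUCCESS = auto()
    TOOMANYRECORDS = auto()  # models DNS_R_TOOMANYRECORDS
    REFUSED = auto()


# Results after which a failed IXFR is retried as a full transfer (AXFR).
# Only TOOMANYRECORDS comes from the change described above; whatever
# other results the real code treats this way is not modelled here.
AXFR_FALLBACK_RESULTS = {XfrResult.TOOMANYRECORDS}


def transfer_zone(try_ixfr, try_axfr):
    """Attempt IXFR first; fall back to AXFR when the failure allows it.

    Returns a (method, result) pair naming the last method attempted.
    """
    result = try_ixfr()
    if result is XfrResult.SUCCESS:
        return "ixfr", result
    if result in AXFR_FALLBACK_RESULTS:
        return "axfr", try_axfr()
    return "ixfr", result
```

With this shape, an IXFR that hits the record limit in an intermediate version still ends with a full transfer of the latest zone version, while other failures are reported unchanged.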
The rcu_xchg_pointer() function can be used outside of a critical
section, and usually must be followed by a synchronize_rcu() or
call_rcu() call to detach from the resource, unless there are
some guarantees in place because of our own reference counting.
Mark Andrews [Tue, 10 Sep 2024 00:08:51 +0000 (00:08 +0000)]
new: usr: Add flag to named-checkconf to ignore "not configured" errors
`named-checkconf` now takes "-n" to ignore "not configured" errors. This allows named-checkconf to check the syntax of configurations from other builds which have support for more options.
Merge branch '4913-add-option-to-named-checkconf-to-override-notconfigured-flag' into 'main'
Mark Andrews [Mon, 2 Sep 2024 06:03:17 +0000 (16:03 +1000)]
Add flag to named-checkconf to ignore "not configured" errors
named-checkconf now takes "-n" to ignore "not configured" errors.
This allows named-checkconf to check the syntax of configurations
from other builds which have support for more options.
This file was initially created for unit testing, but later code was added to generate the file. The static file should have been removed from the git repo.
Closes #4916
Merge branch '4916-skr-unit-test-rm-test-file' into 'main'
This file was initially created for unit testing, but later code was
added to generate the file. The static file should have been removed
from the git repo.
Fix dnssec-policy options formatting and links in ARM
The statements that already exist in the grammar can't be created with
the namedconf:statement. Use a plain definition list for these
statements and add a manual anchor for each one so links to them can be
created.
Avoid using the :any: syntax in the definition lists, as that just
creates a link to the duplicate and completely unrelated statement,
which just makes the documentation more confusing.
fix: usr: Fix bug in Offline KSK that is using ZSK with unlimited lifetime
If the ZSK has an unlimited lifetime, the timing metadata "Inactive" and "Delete" cannot be found, which was treated as an error and prevented the zone from being signed. This has been fixed.
Closes #4914
Merge branch '4914-offline-ksk-zsk-lifetime-unlimited-bug' into 'main'
fix: usr: Fix an assertion failure in validate_dnskey_dsset_done()
Under rare circumstances, named could terminate unexpectedly
when validating a DNSKEY resource record if the validation
was canceled in the meantime. This has been fixed.
Closes isc-projects/bind9#4911
Merge branch '4911-assertion-failure-in-validate_dnskey_dsset_done' into 'v9.21.1-release'
If the ZSK has an unlimited lifetime, the timing metadata "Inactive" and
"Delete" cannot be found, which is treated as an error. Fix by allowing
these metadata to not exist.
Process canceled/shut down results in validate_dnskey_dsset_done()
When a validator is already shut down, val->name becomes NULL. We
need to process and keep the ISC_R_CANCELED or ISC_R_SHUTTINGDOWN
result code before calling validate_async_done(); otherwise, when it
is called with the hardcoded DNS_R_NOVALIDSIG result code, it can
cause an assertion failure when val->name (being NULL) is used in
proveunsecure().
Evan Hunt [Thu, 29 Aug 2024 18:11:15 +0000 (18:11 +0000)]
fix: usr: Delay release of root privileges until after configuring controls
Delay relinquishing root privileges until the control channel has been configured, for the benefit of systems that require root to use privileged port numbers. This mostly affects systems without fine-grained privilege systems (i.e., other than Linux).
Closes #4793
Merge branch '4793-bind-9-19-24-not-listening-to-rndc-port-953-on-localhost' into 'main'
Delay release of root privileges until after configuring controls
On systems where root access is needed to configure privileged
ports, we don't want to fully relinquish root privileges until
after the control channel (which typically runs on port 953) has
been established.
named_os_changeuser() now takes a boolean argument 'permanent'.
This allows us to switch the effective userid temporarily with
named_os_changeuser(false) and restore it with named_os_restoreuser(),
before permanently dropping privileges with named_os_changeuser(true).
Ondřej Surý [Thu, 29 Aug 2024 14:43:34 +0000 (14:43 +0000)]
chg: usr: Follow the number of CPU set by taskset/cpuset
Administrators may wish to constrain the set of cores that BIND 9 runs on via the 'taskset', 'cpuset' or 'numactl' programs (or equivalent on other O/S).
If the admin has used taskset, `named` will now automatically use the given number of CPUs rather than the system-wide count.
Closes #4884
Merge branch '4884-use-cpuset-to-get-number-of-cpus' into 'main'
Ondřej Surý [Thu, 22 Aug 2024 15:23:09 +0000 (17:23 +0200)]
Follow the number of CPU set by taskset/cpuset
Administrators may wish to constrain the set of cores that BIND 9 runs
on via the 'taskset', 'cpuset' or 'numactl' programs (or equivalent on
other O/S), for example to achieve higher (or more stable) performance
by more closely associating threads with individual NIC rx queues. If
the admin has used taskset, it follows that BIND ought to
automatically use the given number of CPUs rather than the system wide
count.
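The difference between the two counts can be illustrated in Python. This is only a sketch of the idea, not how named obtains the value: `usable_cpu_count()` is a hypothetical helper, and `os.sched_getaffinity()` is not available on every platform, so the sketch falls back to the system-wide count where it is missing.

```python
import os


def usable_cpu_count():
    """Return the number of CPUs this process is allowed to run on.

    os.sched_getaffinity() reflects restrictions such as those set by
    taskset; it exists only on some platforms (for example Linux), so
    fall back to the system-wide count elsewhere.
    """
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1
```

On a platform with affinity support, running the script under a restricted CPU set should make `usable_cpu_count()` report the restricted number, while `os.cpu_count()` keeps reporting the system-wide one.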
Michal Nowak [Thu, 29 Aug 2024 14:38:06 +0000 (14:38 +0000)]
chg: test: Bump max-recursion-queries to 100 in resolver system test
With max-recursion-queries set to 50 the resolver system test was
unstable in the "checking query resolution for a domain with a valid
glueless delegation chain" check as ns1 replied with SERVFAIL.
Closes #4897
Merge branch '4897-resolver-ns1-max-recursion-queries-100' into 'main'
Michal Nowak [Mon, 26 Aug 2024 15:56:56 +0000 (17:56 +0200)]
Bump max-recursion-queries to 100 in resolver system test
With max-recursion-queries set to 50 the resolver system test was
unstable in the "checking query resolution for a domain with a valid
glueless delegation chain" check as ns1 replied with SERVFAIL.
Mark Andrews [Thu, 29 Aug 2024 13:24:09 +0000 (13:24 +0000)]
fix: chg: Improve performance when looking for the closest encloser when returning NSEC3 proofs
Use the fact that the database returns the longest matching part of the requested name to find the required NSEC3 record. If there are multiple versions present in the database we may have to search further.
Closes #4460
Merge branch '4460-auth-nsec3-many-labels' into 'main'
Mark Andrews [Mon, 4 Dec 2023 06:15:41 +0000 (17:15 +1100)]
Return partial match when requested
Return partial match from dns_db_find/dns_db_find when requested
to short-circuit the closest encloser discovery process. Most of the
time this will be the actual closest encloser, but it may not be when
there are yet-to-be-committed or not-yet-cleaned-up versions of the
zone with names below the actual closest encloser.
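As a rough illustration of what "closest encloser" means here, the sketch below finds the longest existing ancestor of a query name by stripping labels one at a time, which is the kind of repeated lookup a partial match from the database makes unnecessary. The `closest_encloser()` helper and the flat set of names are assumptions of this sketch; a real zone database also has to account for empty non-terminals and multiple zone versions.

```python
def closest_encloser(existing_names, qname):
    """Return the longest ancestor of qname (or qname itself) that exists.

    existing_names is a set of absolute names such as "a.example.".
    Falls back to the root name "." when nothing else matches.
    """
    labels = qname.rstrip(".").split(".")
    for start in range(len(labels)):
        # Drop `start` leading labels and test what remains.
        candidate = ".".join(labels[start:]) + "."
        if candidate in existing_names:
            return candidate
    return "."
```

For a deep name such as `x.y.a.example.`, the loop performs one lookup per label until it reaches `a.example.`; a database that directly returns the longest matching part of the name avoids that walk.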
Mark Andrews [Thu, 29 Aug 2024 12:46:12 +0000 (12:46 +0000)]
fix: Accessing fctx->state without holding lock
Move lock earlier in the call sequence to address access without lock report.
```
1559 /*
1560 * Caller must be holding the fctx lock.
1561 */
CID 468796: (#1 of 1): Data race condition (MISSING_LOCK)
1. missing_lock: Accessing fctx->state without holding lock fetchctx.lock. Elsewhere, fetchctx.state is written to with fetchctx.lock held 2 out of 2 times.
1562 REQUIRE(fctx->state == fetchstate_done);
1563
1564 FCTXTRACE("sendevents");
1565
1566 LOCK(&fctx->lock);
1567
```
Closes #4902
Merge branch '4902-accessing-fctx-state-without-holding-lock' into 'main'
Mark Andrews [Wed, 28 Aug 2024 03:07:54 +0000 (13:07 +1000)]
Move lock earlier in the call sequence
fctx->state should be read with the lock held.
1559 /*
1560 * Caller must be holding the fctx lock.
1561 */
CID 468796: (#1 of 1): Data race condition (MISSING_LOCK)
1. missing_lock: Accessing fctx->state without holding lock fetchctx.lock.
Elsewhere, fetchctx.state is written to with fetchctx.lock held 2 out of 2 times.
1562 REQUIRE(fctx->state == fetchstate_done);
1563
1564 FCTXTRACE("sendevents");
1565
1566 LOCK(&fctx->lock);
1567
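The pattern behind the fix, taking the lock before checking the state rather than after, can be sketched in Python. The `FetchContext` class and its methods are hypothetical stand-ins and do not mirror the resolver's actual structures.

```python
import threading


class FetchContext:
    """Hypothetical stand-in for a fetch context guarded by a lock."""

    def __init__(self):
        self.lock = threading.Lock()
        self.state = "active"

    def mark_done(self):
        # Writers always update the state with the lock held.
        with self.lock:
            self.state = "done"

    def send_events(self):
        # Take the lock first, then check the state, so the read
        # cannot race with a concurrent writer.
        with self.lock:
            if self.state != "done":
                raise RuntimeError("send_events() called before completion")
            return "events sent"
```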
Arаm Sаrgsyаn [Mon, 26 Aug 2024 15:50:50 +0000 (15:50 +0000)]
chg: usr: Exempt prefetches from the fetches-per-zone and fetches-per-server quotas
Fetches generated automatically as a result of 'prefetch' are now
exempt from the 'fetches-per-zone' and 'fetches-per-server' quotas.
This should help in maintaining the cache from which query responses
can be given.
Closes #4219
Merge branch '4219-exempt-good-queries-from-fetch-limits' into 'main'
Aram Sargsyan [Fri, 7 Jun 2024 16:24:00 +0000 (16:24 +0000)]
Exempt prefetches from the fetches-per-server quota
Give prefetches a free pass through the quota so that the cache
entries for popular zones can be updated successfully even if the
quota is already reached.
Aram Sargsyan [Fri, 7 Jun 2024 16:19:40 +0000 (16:19 +0000)]
Exempt prefetches from the fetches-per-zone quota
Give prefetches a free pass through the quota so that the cache entry
for a popular zone could be updated successfully even if the quota for
it is already reached.
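A minimal model of such an exemption is shown below. The `FetchQuota` class is hypothetical, and whether exempted prefetches are still counted against the quota in named is not stated above; in this sketch they are simply not counted.

```python
class FetchQuota:
    """Hypothetical counter-based quota, e.g. per zone or per server."""

    def __init__(self, limit):
        self.limit = limit
        self.active = 0

    def try_acquire(self, is_prefetch=False):
        """Return True if the fetch may proceed."""
        if is_prefetch:
            # Prefetches get a free pass and are not counted here
            # (an assumption of this sketch).
            return True
        if self.active >= self.limit:
            return False
        self.active += 1
        return True
```

With `FetchQuota(1)`, a second ordinary fetch is refused once the first is active, while a prefetch for the same zone or server is still let through.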
Ondřej Surý [Mon, 26 Aug 2024 15:01:03 +0000 (15:01 +0000)]
fix: dev: Stop using malloc_usable_size and malloc_size
`malloc_usable_size()` can return a size larger than originally allocated; when these sizes disagree, the fortifier enabled by `_FORTIFY_SOURCE=3` detects an overflow and abruptly stops `named`. Stop using these convenience functions, as they are primarily intended for introspection only.
Closes #4880
Merge branch '4880-dont-use-malloc_usable_size' into 'main'
Ondřej Surý [Fri, 23 Aug 2024 04:02:00 +0000 (06:02 +0200)]
Stop using malloc_usable_size and malloc_size
Although the manual page of malloc_usable_size says:
Although the excess bytes can be over‐written by the application
without ill effects, this is not good programming practice: the
number of excess bytes in an allocation depends on the underlying
implementation.
it looks like the premise is broken with _FORTIFY_SOURCE=3 on newer
systems: the function might return a value that causes the program to
stop with a "buffer overflow" detected by _FORTIFY_SOURCE. As we have
our own implementation that tracks the allocation size, we can stop
relying on this introspection function.
Also the newer manual page for malloc_usable_size changed the NOTES to:
The value returned by malloc_usable_size() may be greater than the
requested size of the allocation because of various internal
implementation details, none of which the programmer should rely on.
This function is intended to only be used for diagnostics and
statistics; writing to the excess memory without first calling
realloc(3) to resize the allocation is not supported. The returned
value is only valid at the time of the call.
Remove usage of both malloc_usable_size() and malloc_size() to be on the
safe side and only use the internal size tracking mechanism when
jemalloc is not available.
Michal Nowak [Mon, 26 Aug 2024 14:28:47 +0000 (14:28 +0000)]
chg: ci: Drop removed system tests from cross-version-config-tests
The cross-version-config-tests job fails when a system test is removed
from the upcoming release. To avoid this, remove the system test also
from the $BIND_BASELINE_VERSION.
See the failure mode at https://gitlab.isc.org/isc-projects/bind9/-/jobs/4668947.
Merge branch 'mnowak/remove-dialup-from-cross-version-config-tests-job' into 'main'
Michal Nowak [Mon, 26 Aug 2024 11:41:47 +0000 (13:41 +0200)]
Drop removed system tests from $BIND_BASELINE_VERSION
The cross-version-config-tests job fails when a system test is removed
from the upcoming release. To avoid this, remove the system test also
from the $BIND_BASELINE_VERSION.
James Addison [Sun, 25 Feb 2024 21:10:36 +0000 (21:10 +0000)]
Preserve de-duplicated tag order in documentation
The 'set' datatype in Python does not provide iteration-order
guarantees related to insertion-order. That means that its
usage in the 'split_csv' helper function during documentation
build can produce nondeterministic results.
That is undesirable for two reasons: first, the
documentation output may appear to vary unnecessarily between
builds; second, information could be lost in cases
where tag order in the source documentation is significant.
This patch implements order-preserving de-duplication of tags,
allowing authors to specify tags using intentional priority
ordering, while also removing tags that appear more than once.
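One common way to get order-preserving de-duplication in Python is `dict.fromkeys()`, since dictionaries keep insertion order (guaranteed from Python 3.7). The sketch below assumes the helper takes a comma-separated string and returns a list; the actual signature of 'split_csv' in the documentation build is not shown here, so treat this as an illustration rather than the patch itself.

```python
def split_csv(text):
    """Split a comma-separated string into unique tags, first-seen order.

    Hypothetical signature; blank entries are dropped.
    """
    tags = [tag.strip() for tag in text.split(",") if tag.strip()]
    # dict preserves insertion order, so duplicates are removed while
    # the first occurrence of each tag keeps its position.
    return list(dict.fromkeys(tags))
```

Unlike `list(set(tags))`, the result here depends only on the order in which the author wrote the tags.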
Petr Špaček [Mon, 5 Aug 2024 08:48:34 +0000 (10:48 +0200)]
Automatically adjust MR metadata after merge
1. Set milestone to 'Not released yet' after merge
We will set milestone to actual version number when we actually tag a
particular version. This will get rid of mass MR reassignment when we
do last minute changes to a release plan etc.
2. Adjust No CHANGES and Release Notes MR labels to match gitchangelog
workflow.
Petr Špaček [Mon, 5 Aug 2024 08:21:46 +0000 (10:21 +0200)]
Mark backports CI job as non-interruptible
Previously CI job for the autobackport bot inherited "interruptible:
true" global configuration. This caused premature termination of the job
when another merge was finished before the autobackport job ran to
completion.
Arаm Sаrgsyаn [Thu, 22 Aug 2024 15:33:17 +0000 (15:33 +0000)]
new: usr: implement the 'request-ixfr-max-diffs' configuration option
The new 'request-ixfr-max-diffs' configuration option sets the
maximum number of incoming incremental zone transfer (IXFR) differences,
exceeding which triggers a full zone transfer (AXFR).
Closes #4389
Merge branch '4389-request-ixfr-max-diffs' into 'main'
Aram Sargsyan [Fri, 7 Jun 2024 14:49:59 +0000 (14:49 +0000)]
Test the 'request-ixfr-max-diffs' configuration option
Configure a maximum of 3 allowed differences and add 5 new records.
Check that named detected that the differences exceed the allowed
limit and successfully retries with AXFR.
Aram Sargsyan [Fri, 7 Jun 2024 14:47:55 +0000 (14:47 +0000)]
Implement the 'request-ixfr-max-diffs' configuration option
This limits the maximum number of received incremental zone
transfer differences for a secondary server. Upon reaching the
configured limit, the secondary aborts IXFR and initiates a full
zone transfer (AXFR).
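The limit check can be modelled as a counter over the incoming differences. This is a simplified sketch: `apply_ixfr()` is a hypothetical function, and it assumes the transfer is abandoned once the number of differences exceeds the limit, which matches the test scenario above (a limit of 3 with 5 new records) but may differ in detail from the exact boundary used by named.

```python
def apply_ixfr(diffs, max_diffs):
    """Consume IXFR differences; request AXFR once max_diffs is exceeded.

    Returns ("ixfr", applied_diffs) on success, or ("axfr", []) when the
    incremental transfer is abandoned in favour of a full transfer.
    """
    applied = []
    for count, diff in enumerate(diffs, start=1):
        if count > max_diffs:
            # Too many differences: discard the partial IXFR.
            return "axfr", []
        applied.append(diff)
    return "ixfr", applied
```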
Mark Andrews [Thu, 22 Aug 2024 12:55:46 +0000 (12:55 +0000)]
new: usr: Support restricted key tag range when generating new keys
When multiple signers are being used to sign a zone,
it is useful to be able to specify a restricted
range of key tags that will be used by an
operator to sign the zone. This adds controls to
named (dnssec-policy), dnssec-signzone, dnssec-keyfromlabel and
dnssec-ksr (dnssec-policy) to specify such ranges.
Closes #4830
Merge branch '4830-support-restricted-key-tag-range-when-generating-new-keys' into 'main'
Mark Andrews [Wed, 7 Aug 2024 05:47:05 +0000 (15:47 +1000)]
Document -M tag_min:tag_max
A new argument has been added to dnssec-keygen and dnssec-keyfromlabel
to restrict the tag value of the key generated or imported to a
particular range. This is intended to be used by multi-signers.
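The idea of restricting generated keys to a tag range can be sketched as a parse step plus a retry loop. Everything here is illustrative: `parse_tag_range()` and `generate_key_in_range()` are hypothetical helpers, the range is assumed to be inclusive with tag_min not greater than tag_max, and `generate_key` stands in for whatever actually produces a key. The only fixed fact relied on is that DNSSEC key tags are 16-bit values.

```python
def parse_tag_range(spec):
    """Parse 'tag_min:tag_max' into an inclusive (min, max) pair."""
    low_text, high_text = spec.split(":", 1)
    low, high = int(low_text), int(high_text)
    # Key tags are 16-bit, so both bounds must fit in 0..65535.
    if not (0 <= low <= high <= 0xFFFF):
        raise ValueError(f"invalid key tag range: {spec!r}")
    return low, high


def generate_key_in_range(generate_key, tag_range, max_attempts=1000):
    """Call generate_key() until it returns a key whose tag is in range.

    generate_key is a hypothetical callable returning (tag, key_material).
    """
    low, high = tag_range
    for _ in range(max_attempts):
        tag, key = generate_key()
        if low <= tag <= high:
            return tag, key
    raise RuntimeError("no key with a tag in the requested range")
```

If each signer is given a disjoint range, the tag alone is enough to tell which operator a key belongs to.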
Matthijs Mekking [Thu, 22 Aug 2024 10:11:29 +0000 (10:11 +0000)]
fix: usr: Fix algorithm rollover bug when there are two keys with the same keytag
If there is an algorithm rollover and two keys with different algorithms share the same key tag, it is possible that a check for whether a key matches a specific state is made against the wrong key. This has been fixed by checking not only for a matching key tag but also for a matching key algorithm.
Closes #4878
Merge branch '4878-fix-algorithm-rollover-keytag-conflict-bug' into 'main'
Matthijs Mekking [Wed, 21 Aug 2024 15:14:48 +0000 (17:14 +0200)]
Fix algorithm rollover bug wrt keytag conflicts
If there is an algorithm rollover and two keys with different
algorithms share the same key tag, it is possible that a check for
whether a key matches a specific state is made against the wrong
key.
Fix this by checking not only for a matching key id but also for a
matching key algorithm.
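The lookup described above amounts to matching on the (tag, algorithm) pair instead of the tag alone. The sketch below uses a hypothetical `Key` record and `find_key()` helper; the algorithm numbers 8 (RSASHA256) and 13 (ECDSAP256SHA256) are used only as example values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Key:
    """Hypothetical key record for illustration."""
    tag: int
    algorithm: int
    state: str


def find_key(keys, tag, algorithm):
    """Return the key matching both tag and algorithm, or None."""
    for key in keys:
        if key.tag == tag and key.algorithm == algorithm:
            return key
    return None
```

During a rollover from algorithm 8 to 13 with colliding tags, a tag-only lookup could return the retiring key when the new one was meant; matching on both fields removes that ambiguity.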