new: usr: Add a new option to configure the maximum number of outgoing queries per client request
The configuration option 'max-query-count' sets how many outgoing queries are allowed per client request. The existing 'max-recursion-queries' is the number of permissible queries for a single name and is reset on every CNAME redirection. This new option is a global limit on the client request. The default is 200.
With the global limit in place, a few more queries can be sent while looking up a single name. The default for 'max-recursion-queries' is changed from 32 to 50.
Closes #4980
Closes #4921
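The interaction of the two limits can be sketched with a small Python model. This is not BIND code: it assumes a simplified resolver in which every name in a CNAME chain costs a fixed number of outgoing queries, and the function name `resolve_chain` and its parameters are hypothetical.

```python
def resolve_chain(queries_per_name, max_recursion_queries=50, max_query_count=200):
    """Simplified model: return (names_resolved, total_queries, reason).

    queries_per_name lists the outgoing queries needed for each name in a
    CNAME chain. The per-name counter is reset on every CNAME redirection;
    the global counter is never reset during the client request.
    """
    total = 0
    for index, needed in enumerate(queries_per_name):
        per_name = 0  # reset on every CNAME redirection
        for _ in range(needed):
            if per_name >= max_recursion_queries:
                return index, total, "max-recursion-queries"
            if total >= max_query_count:
                return index, total, "max-query-count"
            per_name += 1
            total += 1
    return len(queries_per_name), total, "ok"
```

In this model, a chain of ten names that each need 40 queries never trips the per-name limit, but it is stopped by the global limit once 200 queries have been sent.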
Merge branch '4980-global-limit-outgoing-queries' into 'main'
Changing the default for max-recursion-queries from 100 to 32 was too
strict in some cases, especially lookups in reverse IPv6 trees started
to fail more frequently. From issue #4921 it looks like 50 is a better
default.
Now that we have 'max-query-count' as a global limit of outgoing queries
per client request, we can increase the default for
'max-recursion-queries' again, as the number of recursive queries is
no longer bounded by the product of 'max-recursion-queries' and
'max-query-restarts'.
Matthijs Mekking [Mon, 25 Nov 2024 15:27:21 +0000 (16:27 +0100)]
Add a CAMP test case
This adds a new test directory specifically for CAMP attacks. This first
test in this test directory follows multiple CNAME chains, restarting
the max-recursion-queries counter, but should bail when the global
maximum quota max-query-count is reached.
Add another option to configure how many outgoing queries are allowed
per client request. The existing 'max-recursion-queries' is
per restart, this one is a global limit.
Michal Nowak [Thu, 5 Dec 2024 10:07:46 +0000 (10:07 +0000)]
fix: ci: Add ns2/managed1.conf to mkeys extra_artifacts
The ns2/managed1.conf file is created by the setup.sh script. Then, in
the tests.sh script it is moved to ns2/managed.conf. The latter file
name is in mkeys extra_artifacts, but the former one is not. This is a
problem when pytest is started with the --setup-only option as it only
runs the setup.sh script (e.g., in the cross-version-config-tests CI
job), so the "Unexpected files found" assertion fails.
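The failure mode can be illustrated with a minimal Python sketch. It assumes the check is a plain set difference between the files found in the test directory and the declared artifacts; the real test framework may match differently (for example with glob patterns), and the function name `unexpected_files` is hypothetical.

```python
def unexpected_files(found, expected_artifacts):
    """Return files present in the test directory that were not declared."""
    return sorted(set(found) - set(expected_artifacts))


# With --setup-only, only setup.sh has run, so the file has not been
# renamed to ns2/managed.conf yet.
found_after_setup = ["ns2/managed1.conf"]

artifacts_before_fix = ["ns2/managed.conf"]
artifacts_after_fix = ["ns2/managed.conf", "ns2/managed1.conf"]
```

Declaring both names covers the run that stops after setup.sh as well as the full run in which tests.sh renames the file.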
Merge branch 'mnowak/mkeys-add-ns2-managed1-conf-to-extra-artifacts' into 'main'
Michal Nowak [Wed, 4 Dec 2024 17:17:40 +0000 (18:17 +0100)]
Add ns2/managed1.conf to mkeys extra_artifacts
The ns2/managed1.conf file is created by the setup.sh script. Then, in
the tests.sh script it is moved to ns2/managed.conf. The latter file
name is in mkeys extra_artifacts, but the former one is not. This is a
problem when pytest is started with the --setup-only option as it only
runs the setup.sh script (e.g., in the cross-version-config-tests CI
job), so the "Unexpected files found" assertion fails.
Mark Andrews [Tue, 19 Nov 2024 14:20:42 +0000 (01:20 +1100)]
Keep a local copy of the update rules to prevent UAF
Previously, the update policy rules check was moved earlier in the
sequence, and the keep rule match pointers were kept to maintain the
ability to verify maximum records by type.
However, these pointers can become invalid if server reloading
or reconfiguration occurs before update completion. To prevent
this issue, extract the maximum records by type value immediately
during processing and only keep the copy of the values instead of the
full ssurule.
Evan Hunt [Thu, 5 Dec 2024 02:36:47 +0000 (02:36 +0000)]
fix: doc: document optional statements the same, enabled or not
The automatically-generated grammar for named.conf clauses that may or may not be enabled at compile time will now include the same comment, regardless of whether or not they are. Previously, the grammar didn't include a comment if an option was enabled, but said "not configured" if it was disabled. Now, in both cases, it will say "optional (only available if configured)".
Evan Hunt [Wed, 2 Oct 2024 02:16:55 +0000 (19:16 -0700)]
document optional statements the same, enabled or not
the generated grammar for named.conf clauses that may or may not be
enabled at compile time will now print the same comment regardless of
whether or not they are.
previously, the grammar didn't print a comment if an option was enabled,
but printed "not configured" if it was disabled. now, in both cases,
it will say "optional (only available if configured)".
as an incidental fix, clarified the documentation for "named-checkconf -n".
Artem Boldariev [Wed, 4 Dec 2024 16:50:36 +0000 (16:50 +0000)]
fix: ci: tests: Use FIPS compatible DH-param files
When the tests were added, the files were generated without FIPS
compatibility in mind. That made the tests fail on recent OpenSSL
versions in FIPS mode.
So, the files were regenerated on a FIPS compliant system using the
following stanza:
```
$ openssl dhparam -out <file> 3072
```
Apparently, the old files are not valid for FIPS starting with the OpenSSL
3.1.X release series as "FIPS 140-3 compliance changes" are mentioned
in the [changelog](https://openssl-library.org/news/openssl-3.1-notes/).
Closes #5074.
Merge branch '5074-fips-compatible-dhparams' into 'main'
Artem Boldariev [Tue, 3 Dec 2024 10:38:34 +0000 (12:38 +0200)]
Use FIPS compatible DH-param files
When the tests were added, the files were generated without FIPS
compatibility in mind. That made the tests fail on recent OpenSSL
versions in FIPS mode.
So, the files were regenerated on a FIPS compliant system using the
following stanza:
$ openssl dhparam -out <file> 3072
Apparently, the old files are not valid for FIPS starting with the OpenSSL
3.1.X release series as "FIPS 140-3 compliance changes" are mentioned
in the changelog:
Colin Vidal [Wed, 4 Dec 2024 15:52:16 +0000 (15:52 +0000)]
new: usr: Add Extended DNS Error Code 22 - No Reachable Authority
When the resolver tries to query an authoritative server and eventually times out, a SERVFAIL answer is given to the client. Add the Extended DNS Error Code 22 - No Reachable Authority to that response.
Closes #2268
Merge branch '2268/ede-no-reachable-authority' into 'main'
Colin Vidal [Fri, 8 Nov 2024 17:18:30 +0000 (18:18 +0100)]
Add EDE 22 No reachable authority code
Add support for Extended DNS Errors (EDE) error 22: No reachable
authority. This error is returned when the resolver times out while
trying to query an authoritative server.
Petr Špaček [Mon, 2 Dec 2024 13:53:38 +0000 (13:53 +0000)]
chg: doc: gitchangelog: don't break lines on hyphens in relnotes
When release notes are generated, the text is wrapped and line breaks
are inserted into each paragraph (sourced from the commit message's
body). Prevent line breaks after hyphens, as these are often used for
option names. This makes it possible to easily find the options
afterwards.
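The effect can be reproduced with Python's standard `textwrap` module, assuming the release notes are wrapped with it; the sample sentence and the width of 30 are made up for illustration.

```python
import textwrap

text = "The default for max-recursion-queries is changed from 32 to 50."

# By default, textwrap may break a line after a hyphen inside a word.
default_wrap = textwrap.fill(text, width=30)

# With break_on_hyphens=False, hyphenated option names stay on one line.
no_hyphen_breaks = textwrap.fill(text, width=30, break_on_hyphens=False)

print(default_wrap)
print("---")
print(no_hyphen_breaks)
```

Keeping the option name on one line means a plain text search for `max-recursion-queries` still finds it in the wrapped output.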
Merge branch 'nicki/gitchangelog-dont-break-on-hyphens' into 'main'
Nicki Křížek [Mon, 2 Dec 2024 10:10:01 +0000 (11:10 +0100)]
gitchangelog: don't break lines on hyphens in relnotes
When release notes are generated, the text is wrapped and line breaks
are inserted into each paragraph (sourced from the commit message's
body). Prevent line breaks after hyphens, as these are often used for
option names. This makes it possible to easily find the options
afterwards.
Ondřej Surý [Wed, 27 Nov 2024 17:04:29 +0000 (17:04 +0000)]
fix: usr: Improve the memory cleaning in the SERVFAIL cache
The SERVFAIL cache doesn't have a memory bound and the
cleaning of the old SERVFAIL cache entries was implemented
only in an opportunistic manner. Improve the memory cleaning
of the SERVFAIL cache to be more aggressive, so it doesn't
consume a lot of memory in the case the server encounters
many SERVFAILs at once.
Closes #5025
Merge branch '5025-improve-badcache-cleaning' into 'main'
Ondřej Surý [Fri, 22 Nov 2024 14:10:26 +0000 (15:10 +0100)]
Remove dns_badcache usage in the resolver (lame-ttl)
The lame-ttl processing was overridden to be disabled in the config,
but the code related to the lame-ttl was still kept in the resolver
code. More importantly, the DNS_RESOLVER_BADCACHETTL() macro would
cause the entries in the resolver badcache to be always cached for at
least 30 seconds even if the lame-ttl would be set to 0.
Remove the dns_badcache code from the dns_resolver unit, so we save some
processing time and memory in the resolver code.
Ondřej Surý [Thu, 14 Nov 2024 18:51:29 +0000 (19:51 +0100)]
Improve the badcache cleaning by adding LRU and using RCU
Instead of cleaning the dns_badcache opportunistically, add per-loop
LRU, so each thread-loop can clean the expired entries. This also
allows removal of the atomic operations as the badcache entries are now
immutable: instead of updating the badcache entry in place, the old
entry is now deleted from the hashtable and the LRU list, and the new
entry is inserted in the LRU.
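The replace-instead-of-update idea can be sketched in Python. This is a single-threaded model only: it does not attempt to show RCU or the per-loop setup, and the class and method names (`BadEntry`, `LoopBadCache`, `purge_expired`) are hypothetical, not the dns_badcache API.

```python
from collections import OrderedDict
from dataclasses import dataclass


@dataclass(frozen=True)
class BadEntry:
    """Immutable cache entry; it is replaced, never modified in place."""
    name: str
    expire: float


class LoopBadCache:
    """One instance per loop in this model, so no locking is shown."""

    def __init__(self):
        self._lru = OrderedDict()  # insertion order doubles as the LRU list

    def add(self, name, expire):
        # Delete the old entry (if any) and insert a fresh one at the tail.
        self._lru.pop(name, None)
        self._lru[name] = BadEntry(name, expire)

    def find(self, name, now):
        entry = self._lru.get(name)
        if entry is None or entry.expire <= now:
            return None
        return entry

    def purge_expired(self, now):
        """Remove every expired entry and return how many were removed."""
        expired = [n for n, e in self._lru.items() if e.expire <= now]
        for n in expired:
            del self._lru[n]
        return len(expired)

    def __len__(self):
        return len(self._lru)
```

Because an entry is never changed after it is created, a reader holding a reference to it cannot observe a half-updated state, which is the property that makes the atomic operations unnecessary.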
alessio [Tue, 5 Nov 2024 08:36:24 +0000 (09:36 +0100)]
Optimize memory layout of core structs
Reduce memory footprint by:
- Reordering struct fields to minimize padding.
- Using exact-sized atomic types instead of *_least/*_fast variants
- Downsizing integer fields where possible
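The effect of field reordering on padding can be observed with Python's `ctypes`, which lays out structures using the platform's C alignment rules. The two structures below are made up for illustration and are not taken from the BIND sources.

```python
import ctypes


class Padded(ctypes.Structure):
    # A small field on each side of a 64-bit field forces padding twice.
    _fields_ = [
        ("flag", ctypes.c_uint8),
        ("count", ctypes.c_uint64),
        ("kind", ctypes.c_uint8),
    ]


class Reordered(ctypes.Structure):
    # Largest field first; the two small fields share the trailing bytes.
    _fields_ = [
        ("count", ctypes.c_uint64),
        ("flag", ctypes.c_uint8),
        ("kind", ctypes.c_uint8),
    ]


print(ctypes.sizeof(Padded), ctypes.sizeof(Reordered))
```

The exact sizes depend on the platform ABI, but the reordered layout should never be larger than the padded one.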
Ondřej Surý [Wed, 27 Nov 2024 14:23:11 +0000 (14:23 +0000)]
chg: dev: Assume IPv6 is universally available (on the kernel level)
Instead of various probing, just assume that IPv6 is universally available
and clean up the various checks and defines that we have accumulated over
the years.
Merge branch 'ondrej/cleanup-IPv6-networking-support' into 'main'
Ondřej Surý [Tue, 20 Aug 2024 10:12:47 +0000 (12:12 +0200)]
Remove the incomplete code for IPv6 pktinfo
The code that listens on individual interfaces is now stable and doesn't
require any changes. The code that would bind to IPv6 wildcard address
and then use IPv6 pktinfo structure to get the source address is not
going to be completed, so it's better to just remove the dead cruft.
Ondřej Surý [Tue, 20 Aug 2024 09:53:07 +0000 (11:53 +0200)]
Assume that IPv4 and IPv6 are always available
In 2024, it is reasonable to assume that IPv4 and IPv6 are always
available on a socket() level. We still keep the option to enable or
disable each IP version individually, as the routing might be broken or
undesirable for one of the versions.
Arаm Sаrgsyаn [Wed, 27 Nov 2024 13:34:29 +0000 (13:34 +0000)]
fix: test: Fix the nslookup system test
The nslookup system test checks the count of resolved addresses in
the CNAME tests using a 'grep' match on the hostname, and ignoring
lines containing the 'canonical name' string. In order to protect
the check from intermittent failures like the 'address in use' warning
message, which then automatically resolves after a retry, edit the
'grep' matching string to also ignore the comments (as the mentioned
warning message is a comment which contains the hostname).
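The counting logic can be sketched in Python rather than with 'grep'. The sample output below is invented for illustration (including the wording of the warning), and the sketch assumes that comment lines start with ';'; the function name `count_addresses` is hypothetical.

```python
def count_addresses(output, hostname):
    """Count lines naming the host, skipping CNAME lines and comments."""
    count = 0
    for line in output.splitlines():
        if hostname not in line:
            continue
        if "canonical name" in line:
            continue
        if line.lstrip().startswith(";"):
            continue  # warnings are comments and may mention the hostname
        count += 1
    return count


# Invented sample; real nslookup output may differ.
sample = "\n".join(
    [
        ";; hypothetical warning for a.example.net: address in use",
        "cname.example.net\tcanonical name = a.example.net.",
        "Name:\ta.example.net",
        "Address: 10.0.0.1",
    ]
)
```

Without the comment check, the warning line would be counted as a second match and the test would fail intermittently.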
The nslookup system test checks the count of resolved addresses in
the CNAME tests using a 'grep' match on the hostname, and ignoring
lines containing the 'canonical name' string. In order to protect
the check from intermittent failures like the 'address in use' warning
message, which then automatically resolves after a retry, edit the
'grep' matching string to also ignore the comments (as the mentioned
warning message is a comment which contains the hostname).
Ondřej Surý [Wed, 27 Nov 2024 13:00:33 +0000 (13:00 +0000)]
fix: dev: Make dns_validator_cancel() respect the data ownership
There was a data race when dns_validator_cancel() was called while the
offloaded operations were in progress. Make dns_validator_cancel()
respect the data ownership and only set the new .canceling variable when
the offloaded operations are in progress. The cancel operation would
then finish when the offloaded work passes the ownership back to the
respective thread.
Closes #4926
Merge branch '4926-fix-data-race-in-dns_validator' into 'main'
Make dns_validator_cancel() respect the data ownership
There was a data race when dns_validator_cancel() was called while the
offloaded operations were in progress. Make dns_validator_cancel()
respect the data ownership and only set the new .shuttingdown variable when
the offloaded operations are in progress. The cancel operation would
then finish when the offloaded work passes the ownership back to the
respective thread.
Arаm Sаrgsyаn [Wed, 27 Nov 2024 11:46:09 +0000 (11:46 +0000)]
fix: usr: Fix trying the next primary server when the previous one was marked as unreachable
In some cases (there is evidence only when XoT was used) `named` failed
to try the next primary server in the list when the previous one was
marked as unreachable. This has been fixed.
Closes #5038
Merge branch '5038-xfr-primary-next-fix' into 'main'
Aram Sargsyan [Tue, 26 Nov 2024 12:09:57 +0000 (12:09 +0000)]
Test trying of the next primary server
Add test cases which check that when an XoT primary server is
unreachable or is already marked as unreachable then the next
primary server in the list is used.
Aram Sargsyan [Tue, 26 Nov 2024 12:06:03 +0000 (12:06 +0000)]
xfrin: refactor and fix the ISC_R_CANCELED case handling
Previously, an ISC_R_CANCELED result code switch-case was added to
the zone.c:zone_xfrdone() function, which did two things:
1. Schedule a new zone transfer if there's a scheduled force reload of
the zone.
2. Reset the primaries list.
This proved to be a poorly thought-out change that causes problems,
because the ISC_R_CANCELED code is used not only when the whole transfer
is canceled, but also when, for example, a particular primary server is
unreachable, and named still needs to continue the transfer process by
trying the next server, which it now no longer does in some cases. To
solve this issue, three changes are made:
1. Make sure dns_zone_refresh() runs on the zone's loop, so that the
sequential calls of dns_zone_stopxfr() and dns_zone_forcexfr()
functions (like done in 'rndc retransfer -force') run in intended
order and don't race with each other.
2. Since starting the new transfer is now guaranteed to run after the
previous transfer is shut down (see the previous change), remove the
special handling of the ISC_R_CANCELED case, and let the default
handler handle it as before. This will bring back the ability to
try the next primary if the current one was interrupted with an
ISC_R_CANCELED result code.
3. Change the xfrin.c:xfrin_shutdown() function to pass the
ISC_R_SHUTTINGDOWN result code instead of ISC_R_CANCELED, as it makes
more sense.
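The intended fallback behaviour can be sketched as a simple loop in Python. This is a toy model, not the zone.c logic: the result strings, the `attempt` callback and the function name `transfer_from_primaries` are all hypothetical.

```python
SUCCESS = "success"
CANCELED = "canceled"  # stands in for ISC_R_CANCELED in this model


def transfer_from_primaries(primaries, attempt, unreachable):
    """Try each primary in order; return the one that worked, or None.

    A primary already marked unreachable is skipped, and a canceled
    attempt moves on to the next primary instead of ending the transfer.
    """
    for primary in primaries:
        if primary in unreachable:
            continue
        if attempt(primary) == SUCCESS:
            return primary
        unreachable.add(primary)
    return None
```

In this model, a canceled attempt against one server is just one more reason to advance through the list, which matches the behaviour the default handler restores.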
Evan Hunt [Wed, 27 Nov 2024 00:08:28 +0000 (00:08 +0000)]
chg: dev: Use default listening rules from config.c string
Remove special code which creates default listeners, and use the normal named.conf configuration parser instead. This removes unneeded code and makes the built-in configuration text provide a true primary source of defaults. This change should be transparent to end-users and should not cause any visible change.
Closes #1424
Merge branch '1424-listen-builtin-config' into 'main'
Petr Menšík [Mon, 6 Dec 2021 12:42:53 +0000 (13:42 +0100)]
Load default listen-on[-v6] values from config.c
Stop using ns_listenlist_default() to set the default listen-on
and listen-on-v6 configuration. Instead, configure these options
using the default values in config.c.
Ondřej Surý [Tue, 26 Nov 2024 11:30:12 +0000 (11:30 +0000)]
rem: usr: Move contributed DLZ modules into a separate repository
The DLZ modules are poorly maintained, as we only ensure they can still
be compiled. The DLZ interface is blocking, so anything that blocks the
query to the database blocks the whole server; the modules should not be
used except in testing. The DLZ interface itself is going to be scheduled
for removal.
The DLZ modules now live in https://gitlab.isc.org/isc-projects/dlz-modules
repository.
Closes #4865
Merge branch '4865-remove-contributed-DLZ-modules' into 'main'
Ondřej Surý [Mon, 19 Aug 2024 12:39:11 +0000 (14:39 +0200)]
Move contributed DLZ modules into a separate repository
The DLZ modules are poorly maintained, as we only ensure they can still
be compiled. The DLZ interface is blocking, so anything that blocks the
query to the database blocks the whole server; the modules should not be
used except in testing. The DLZ interface itself should be scheduled
for removal.
Ondřej Surý [Tue, 26 Nov 2024 10:23:11 +0000 (10:23 +0000)]
chg: usr: Add new logging module for logging crypto errors in libisc
Add a new 'crypto' log module that will be used for low-level
cryptographic operations. The DNS related cryptography logs
are still logged in the 'dns/crypto' module.
Merge branch 'ondrej/add-ISC_LOGMODULE_CRYPTO' into 'main'
Ondřej Surý [Thu, 8 Aug 2024 09:26:27 +0000 (11:26 +0200)]
Add new logging category for logging crypto errors in libisc
The libisc now includes sizeable chunks of cryptography, but the crypto
log module was missing. Add the new ISC_LOGMODULE_CRYPTO to libisc and
use it in the isc_tls error logging.
Colin Vidal [Tue, 26 Nov 2024 08:46:58 +0000 (08:46 +0000)]
chg: usr: Add none parameter to query-source and query-source-v6 to disable IPv4 or IPv6 upstream queries
Add a none parameter to the named configuration option `query-source` (respectively `query-source-v6`), which forbids the use of IPv4 (respectively IPv6) addresses when named is doing an upstream query.
Closes #4981 Turning-off upstream IPv6 queries while still listening to downstream queries on IPv6.
Colin Vidal [Tue, 5 Nov 2024 12:40:55 +0000 (13:40 +0100)]
Add a none parameter to query-source[-v6]
This change adds a "none" parameter to the query-source[-v6]
options in named.conf, which forbids the use of IPv4 or IPv6
addresses when doing upstream queries.
Mark Andrews [Tue, 26 Nov 2024 07:15:25 +0000 (07:15 +0000)]
chg: usr: emit more helpful log for exceeding max-records-per-type
The new log message is emitted when adding or updating an RRset
fails due to exceeding the max-records-per-type limit. The log includes
the owner name and type, corresponding zone name, and the limit value.
It will be emitted on loading a zone file, inbound zone transfer
(both AXFR and IXFR), handling a DDNS update, or updating a cache DB.
It's especially helpful in the case of zone transfer, since the
secondary side doesn't have direct access to the offending zone data.
It could also be used for max-types-per-name, but this change
doesn't implement it yet as it's much less likely to happen
in practice.
Merge branch 'helpful-log-on-toomanyrecords' into 'main'
use more generic log module name for 'logtoomanyrecords'
DNS_LOGMODULE_RBTDB was simply inappropriate, and this
log message is actually dependent on db implementation
details, so DNS_LOGMODULE_DB would be the best choice.
JINMEI Tatuya [Thu, 29 Aug 2024 07:24:48 +0000 (16:24 +0900)]
emit more helpful log for exceeding max-records-per-type
The new log message is emitted when adding or updating an RRset
fails due to exceeding the max-records-per-type limit. The log includes
the owner name and type, corresponding zone name, and the limit value.
It will be emitted on loading a zone file, inbound zone transfer
(both AXFR and IXFR), handling a DDNS update, or updating a cache DB.
It's especially helpful in the case of zone transfer, since the
secondary side doesn't have direct access to the offending zone data.
It could also be used for max-types-per-name, but this change
doesn't implement it yet as it's much less likely to happen
in practice.
Mark Andrews [Tue, 26 Nov 2024 03:40:57 +0000 (03:40 +0000)]
fix: usr: '{&dns}' is as valid as '{?dns}' in a SVCB's dohpath
`dig` fails to parse a valid (as far as I can tell, and accepted by `kdig` and `Wireshark`) `SVCB` record with a `dohpath` URI template containing a `{&dns}`, like `dohpath=/some/path?key=value{&dns}`. If the URI template contains a `{?dns}` instead, `dig` is happy, but my understanding of RFC 9461 and section 1.2. "Levels and Expression Types" of RFC 6570 is that `{&dns}` is valid.
See for example section 1.2. "Levels and Expression Types" of rfc6570.
Note that Peter van Dijk suggested that `{dns}` and `{dns,someothervar}` might be valid forms as well, so my patch might be too restrictive, although it's anyone's guess how DoH clients would handle complex templates.
Mark Andrews [Mon, 9 Sep 2024 05:59:30 +0000 (15:59 +1000)]
Parse the URI template and check for a dns variable
The 'dns' variable in dohpath can be in various forms ({?dns},
{dns}, {&dns} etc.). To check for a valid dohpath it ends up
being simpler to just parse the URI template rather than looking
for all the various forms of substring.
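The parsing approach can be sketched in Python. This is a simplified reading of the RFC 6570 expression syntax (operator character, comma-separated variables, `:N` prefix and `*` explode modifiers), not a full template validator and not the C code used in BIND; the function name `template_has_variable` is hypothetical.

```python
import re

# An expression is the text between a pair of braces.
_EXPRESSION = re.compile(r"\{([^{}]*)\}")
# Operator characters defined or reserved by RFC 6570.
_OPERATORS = "+#./;?&=,!@|"


def template_has_variable(template, variable="dns"):
    """Return True if any expression in the template names the variable."""
    for expression in _EXPRESSION.findall(template):
        if expression and expression[0] in _OPERATORS:
            expression = expression[1:]  # drop the leading operator
        for varspec in expression.split(","):
            # Strip a ':N' prefix modifier or a trailing '*' explode modifier.
            name = varspec.split(":", 1)[0].rstrip("*")
            if name == variable:
                return True
    return False
```

With this approach `{?dns}`, `{&dns}`, `{dns}` and `{dns,someothervar}` are all accepted by one code path, instead of each form needing its own substring check.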
Nicki Křížek [Mon, 25 Nov 2024 14:35:17 +0000 (14:35 +0000)]
fix: test: Allow re-run of mkeys system test
On some slow systems, the test might intermittently fail due to inherent
timing issues. In our CI, this most often happens in the
system:gcc:8fips:amd64 jobs.
Closes #3098
Merge branch '3098-allow-re-run-of-mkeys-test' into 'main'
Nicki Křížek [Thu, 7 Nov 2024 15:15:54 +0000 (16:15 +0100)]
Allow re-run of mkeys system test
On some slow systems, the test might intermittently fail due to inherent
timing issues. In our CI, this most often happens in the
system:gcc:8fips:amd64 jobs.
Michal Nowak [Mon, 25 Nov 2024 12:11:02 +0000 (12:11 +0000)]
fix: ci: Fix paths to binaries in cross-version-config-tests job
The cross-version-config-tests job has never functioned in CI because
the testing framework changed after the testing was completed. To run
the new "named" binary using the old configurations, paths in the test
framework must be updated to point to the location of the new binaries.
Closes #4977
Merge branch '4977-fix-cross-version-config-tests' into 'main'
Michal Nowak [Wed, 30 Oct 2024 20:15:44 +0000 (21:15 +0100)]
Fix paths to binaries in cross-version-config-tests job
The cross-version-config-tests job has never functioned in CI because
the testing framework changed after the testing was completed. To run
the new "named" binary using the old configurations, paths in the test
framework must be updated to point to the location of the new binaries.
Aydın Mercan [Mon, 25 Nov 2024 10:09:26 +0000 (10:09 +0000)]
new: usr: add separate query counters for new protocols
Add query counters for DoT, DoH, unencrypted DoH and their proxied
counterparts. The new protocols no longer update the TCP/UDP transport
counters, which now count only plain DNS over port 53.
Closes #598
Merge branch '598-wishlist-statistics-for-dns-over-tcp-and-tls' into 'main'
Aydın Mercan [Fri, 4 Oct 2024 10:14:52 +0000 (13:14 +0300)]
add separate query counters for new protocols
Add query counters for DoT, DoH, unencrypted DoH and their proxied
counterparts. The protocols don't increment TCP/UDP counters anymore
since they aren't the same as plain DNS-over-53.
Colin Vidal [Fri, 22 Nov 2024 18:34:51 +0000 (18:34 +0000)]
rem: dev: Remove namedconf port/tls deprecated check on *-source[-v6] options
The usage of port and tls arguments in *-source and *-source-v6 named configuration options has been previously removed. Remove the various configuration checks deprecating usage of those arguments.
Merge branch 'colin/querysource-check-cleanup' into 'main'
Colin Vidal [Tue, 12 Nov 2024 09:10:12 +0000 (10:10 +0100)]
Remove namedconf port/tls deprecated check on *-source[-v6] options
The usage of port and tls arguments in *-source and *-source-v6 named
configuration options has been previously removed. Remove the
configuration checks deprecating usage of those arguments.
Alessio Podda [Fri, 22 Nov 2024 17:35:48 +0000 (17:35 +0000)]
chg: dev: Incrementally apply AXFR transfer
Reintroduce logic to apply diffs when the number of pending tuples is
above 128. The previous strategy of accumulating all the tuples and
pushing them at the end leads to excessive memory consumption during
transfer.
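The batching strategy can be sketched in Python. The threshold of 128 comes from the description above; everything else (the function name `receive_transfer`, the `apply_batch` callback, and returning the peak number of pending tuples) is a hypothetical model rather than the xfrin code.

```python
APPLY_THRESHOLD = 128


def receive_transfer(tuples, apply_batch, threshold=APPLY_THRESHOLD):
    """Apply incoming tuples in batches; return the peak pending count."""
    pending = []
    peak = 0
    for item in tuples:
        pending.append(item)
        peak = max(peak, len(pending))
        if len(pending) > threshold:
            # Flush as soon as the pending count goes above the threshold.
            apply_batch(pending)
            pending = []
    if pending:
        apply_batch(pending)  # flush whatever is left at the end
    return peak
```

In this model the number of tuples held in memory is capped just above the threshold, no matter how large the transfer is, whereas accumulating everything makes it grow with the zone size.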
alessio [Sun, 3 Nov 2024 20:25:15 +0000 (21:25 +0100)]
Incrementally apply AXFR transfer
Reintroduce logic to apply diffs when the number of pending tuples is
above 128. The previous strategy of accumulating all the tuples and
pushing them at the end leads to excessive memory consumption during
transfer.
alessio [Fri, 22 Nov 2024 07:32:19 +0000 (08:32 +0100)]
Fix alpine build by removing LargestIntegralType in time_test
Avoids using functions that require LargestIntegralType arguments in
time_test to resolve import issues on Alpine Linux. Using size_t instead
wasn't an option due to compatibility issues with 32-bit architectures.