Matthijs Mekking [Wed, 19 May 2021 13:32:56 +0000 (15:32 +0200)]
Add helpful function 'dns_zone_getdnsseckeys'
This code gathers DNSSEC keys from key files and from the DNSKEY RRset.
It is used for the 'rndc dnssec -status' command, but will also be
needed for "checkds". Turn it into a function.
Matthijs Mekking [Wed, 12 May 2021 09:09:33 +0000 (11:09 +0200)]
Add dst_key_role function
Change the static function 'get_ksk_zsk' to a library function that
can be used to determine the role of a dst_key. Add checks if the
boolean parameters to store the role are not NULL. Rename to
'dst_key_role'.
Matthijs Mekking [Tue, 11 May 2021 12:40:04 +0000 (14:40 +0200)]
Add checkds system test
Add a Pytest based system test for the 'checkds' feature. There is
one nameserver (ns9, because it should be started the latest) that
has configured several zones with dnssec-policy. The zones are set
in such a state that they are waiting for DS publication or DS
withdrawal.
Then several other name servers act as parent servers that either have
the DS for these published, or not. Also one server in the mix is
to test a badly configured parental-agent.
There are tests for DS publication, DS publication error handling,
DS withdrawal and DS withdrawal error handling.
The tests ensures that the zone is DNSSEC valid, and that the
DSPublish/DSRemoved key metadata is set (or not in case of the error
handling).
It does not test if the rollover continues, this is already tested in
the kasp system test (that uses 'rndc -dnssec checkds' to set the
DSPublish/DSRemoved key metadata).
Add checks for "parental-agents" configuration, checking for the option
being at wrong type of zone (only allowed for primaries and
secondaries), duplicate definitions, duplicate references, and
undefined parental clauses (the name referenced in the zone clause
does not have a matching "parental-agent" clause).
Matthijs Mekking [Wed, 23 Jun 2021 09:20:43 +0000 (11:20 +0200)]
Fix setnsec3param hang on shutdown
When performing the 'setnsec3param' task, zones that are not loaded will have
their task rescheduled. We should do this only if the zone load is still
pending, this prevents zones that failed to load get stuck in a busy wait and
causing a hang on shutdown.
Matthijs Mekking [Wed, 23 Jun 2021 09:17:02 +0000 (11:17 +0200)]
Add configuration that causes setnsec3param hang
Add a zone to the configuration file that uses NSEC3 with dnssec-policy
and fails to load. This will cause setnsec3param to go into a busy wait
and will cause a hang on shutdown.
Move the include Makefile.tests to the bottom of Makefile.am(s)
The Makefile.tests was modifying global AM_CFLAGS and LDADD and could
accidentally pull /usr/include to be listed before the internal
libraries, which is known to cause problems if the headers from the
previous version of BIND 9 has been installed on the build machine.
We should drop the HISTORY file because it's confusing and the same
information is covered by the release notes for .0 releases (or at
least they should be).
Remove references to the HISTORY file, update the README to tell
people go look somewhere else.
This was written down in the outdated doc/dev/release documentation.
Since the rest of that file can go, add these steps to a separate file
and update it to current standards (e.g. use git commands).
Ondřej Surý [Wed, 2 Jun 2021 10:48:18 +0000 (12:48 +0200)]
Remove unused or outdated utils, developer and design documentation
The util/, doc/design/, and doc/dev/ directories included couple of
tools or documents there were completely outdated because they either
refered the the VCS we no longer use (cvs) or described processes that
have been redesigned and they are documented elsewhere.
Ondřej Surý [Wed, 23 Jun 2021 18:54:20 +0000 (20:54 +0200)]
Change the safe edns-udp-size from 1400 to 1432
When backporting the Don't Fragment UDP socket option, it was noticed
that the edns-udp-size probing uses 1432 as one of the values to be
probed and the documentation would be recommending 1400 as the safe
value. As the safe value can be from the 1400-1500 interval, the
documentation has been changed to match the probed value, so we do not
skip it.
Evan Hunt [Wed, 9 Jun 2021 21:19:46 +0000 (14:19 -0700)]
add test for server failover on REFUSED
- add an 'nsupdate -C' option to override resolv.conf file for nsupdate
- set resolv.conf to use two test servers, the first one of which will
return REFUSED for a query for 'example'.
Evan Hunt [Wed, 9 Jun 2021 20:37:20 +0000 (13:37 -0700)]
nsupdate: try next server on REFUSED
when nsupdate sends an SOA query to a resolver, if it fails
with REFUSED, nsupdate will now try the next server rather than
aborting the update completely.
Ondřej Surý [Tue, 22 Jun 2021 14:12:44 +0000 (16:12 +0200)]
Disable IP fragmentation on the UDP sockets
In DNS Flag Day 2020, we started setting the DF (Don't Fragment socket
option on the UDP sockets. It turned out, that this code was incomplete
leading to dropping the outgoing UDP packets.
This has been now remedied, so it is possible to disable the
fragmentation on the UDP sockets again as the sending error is now
handled by sending back an empty response with TC (truncated) bit set.
Evan Hunt [Tue, 22 Jun 2021 15:01:35 +0000 (17:01 +0200)]
Handle UDP send errors when sending DNS message larger than MTU
When the fragmentation is disabled on UDP sockets, the uv_udp_send()
call can fail with UV_EMSGSIZE for messages larger than path MTU.
Previously, this error would end with just discarding the response. In
this commit, a proper handling of such case is added and on such error,
a new DNS response with truncated bit set is generated and sent to the
client.
This change allows us to disable the fragmentation on the UDP
sockets again.
Matthijs Mekking [Fri, 18 Jun 2021 08:30:56 +0000 (10:30 +0200)]
Add more test cases for #2778
Add three more test cases that detect a configuration error if the
key-directory is inherited but has the same value for a zone in a
different view with a deviating DNSSEC policy.
Ondřej Surý [Wed, 23 Jun 2021 13:29:22 +0000 (15:29 +0200)]
Add rbtdb setownercase/getownercase unit test
This commit adds a unittest that tests private rdataset_getownercase()
and rdataset_setownercase() methods from rbtdb.c. The test setups
minimal mock dns_rbtdb_t and dns_rbtdbnode_t data structures.
As the rbtdb methods are generally hidden behind layers and layers, we
include the "rbtdb.c" directly from rbtdb_test.c, and thus we can use
the private methods and data structures directly. This also opens up
opportunity to add more unittest for the rbtdb private functions without
going through all the layers.
Looking at the dig output for a failed test, the query actually got a
response from the authoritative server (in one specific example the
query time was 2991 msec, close to 3 seconds).
After doing the query for the test, we enable the authoritative
server after a sleep of three seconds. If we bump this sleep to 4
seconds, the race will be more in favor of the query timing out,
making it unlikely that this test will fail intermittently.
Bump the subsequent wait_for_log checks also with one second.
Ondřej Surý [Tue, 22 Jun 2021 11:08:17 +0000 (13:08 +0200)]
Use POSIX tolower(), toupper() and isupper() functions
In the code that rdataset_setownercase() and rdataset_getownercase() we
now use tolower()/toupper()/isupper() functions appropriately instead of
rolling our own code.
Ondřej Surý [Tue, 22 Jun 2021 11:05:15 +0000 (13:05 +0200)]
Don't set locale globally, just use it when needed
Previously, we would set the locale on a global level and that could
possibly lead to different behaviour in underlying functions. In this
commit, we change to code to use the system locale only when calling the
libidn2 functions and reset the locale back to "POSIX" when exiting the
libidn2 code.
Michał Kępień [Tue, 22 Jun 2021 20:45:39 +0000 (22:45 +0200)]
Improve description of mirror zone validation
Expand the description of mirror zones in the ARM by adding a brief
discussion of how the validation process works for AXFR and IXFR. Move
the paragraph mentioning the "file" option higher up. Apply minor
stylistic and whitespace-related tweaks to the relevant section of the
ARM.
Michał Kępień [Tue, 22 Jun 2021 20:26:46 +0000 (22:26 +0200)]
Tweak descriptions of buffering-related options
Apply minor stylistical and whitespace-related tweaks to the
descriptions of the "tcp-receive-buffer", "udp-receive-buffer",
"tcp-send-buffer", and "udp-send-buffer" options in the ARM.
Petr Špaček [Tue, 15 Jun 2021 08:01:59 +0000 (10:01 +0200)]
Rework description of the "max-cache-size" option
Improve the description of the "max-cache-size" option in the ARM by
focusing on its meaning for multiple views and default values.
Add mention of a hash table preallocation.
Artem Boldariev [Tue, 22 Jun 2021 10:32:24 +0000 (13:32 +0300)]
System tests to check named behaviour for unexpected opcodes
This commit adds a set of tests to verify that BIND will not crash
when some opcodes are sent over DoT or DoH, leading to marking network
handle in question as sequential.
Ondřej Surý [Tue, 22 Jun 2021 10:24:44 +0000 (12:24 +0200)]
Replace netmgr per-protocol sequential function with a common one
Previously, each protocol (TCPDNS, TLSDNS) has specified own function to
disable pipelining on the connection. An oversight would lead to
assertion failure when opcode is not query over non-TCPDNS protocol
because the isc_nm_tcpdns_sequential() function would be called over
non-TCPDNS socket. This commit removes the per-protocol functions and
refactors the code to have and use common isc_nm_sequential() function
that would either disable the pipelining on the socket or would handle
the request in per specific manner. Currently it ignores the call for
HTTP sockets and causes assertion failure for protocols where it doesn't
make sense to call the function at all.
Michał Kępień [Tue, 22 Jun 2021 13:28:31 +0000 (15:28 +0200)]
Hardcode "max-cache-size" for the "_bind" view
The built-in "_bind" view does not allow recursion and therefore does
not need a large cache database. However, as "max-cache-size" is not
explicitly set for that view in the default configuration, it inherits
that setting from global options. Set "max-cache-size" for the built-in
"_bind" view to a fixed value (2 MB, i.e. the smallest allowed value) to
prevent needlessly preallocating memory for its cache RBT hash table.
Michał Kępień [Tue, 22 Jun 2021 13:28:31 +0000 (15:28 +0200)]
Use minimal-sized caches for non-recursive views
Currently the implicit default for the "max-cache-size" option is "90%".
As this option is inherited by all configured views, using multiple
views can lead to memory exhaustion over time due to overcommitment.
The "max-cache-size 90%;" default also causes cache RBT hash tables to
be preallocated for every configured view, which does not really make
sense for views which do not allow recursion.
To limit this problem's potential for causing operational issues, use a
minimal-sized cache for views which do not allow recursion and do not
have "max-cache-size" explicitly set (either in global configuration or
in view configuration).
For configurations which include multiple views allowing recursion,
adjusting "max-cache-size" appropriately is still left to the operator.
Matthijs Mekking [Mon, 21 Jun 2021 09:36:50 +0000 (11:36 +0200)]
Fix deadlock issue with key-directory and in-view
When locking key files for a zone, we iterate over all the views and
lock a mutex inside the zone structure. However, if we envounter an
in-view zone, we will try to lock the key files twice, one time for
the home view and one time for the in-view view. This will lead to
a deadlock because one thread is trying to get the same lock twice.
Michał Kępień [Thu, 17 Jun 2021 15:09:37 +0000 (17:09 +0200)]
Allow resetting hash table size limits for DNS DBs
When "max-cache-size" is changed to "unlimited" (or "0") for a running
named instance (using "rndc reconfig"), the hash table size limit for
each affected cache DB is not reset to the maximum possible value,
preventing those hash tables from being allowed to grow as a result of
new nodes being added.
Extend dns_rbt_adjusthashsize() to interpret "size" set to 0 as a signal
to remove any previously imposed limits on the hash table size. Adjust
API documentation for dns_db_adjusthashsize() accordingly. Move the
call to dns_db_adjusthashsize() from dns_cache_setcachesize() so that it
also happens when "size" is set to 0.
Michał Kępień [Thu, 17 Jun 2021 15:09:37 +0000 (17:09 +0200)]
Allow hash tables for cache RBTs to be grown
Upon creation, each dns_rbt_t structure has its "maxhashbits" field
initialized to the value of the RBT_HASH_MAX_BITS preprocessor macro,
i.e. 32. When the dns_rbt_adjusthashsize() function is called for the
first time for a given RBT (for cache RBTs, this happens when they are
first created, i.e. upon named startup), it lowers the value of the
"maxhashbits" field to the number of bits required to index the
requested number of hash table slots. When a larger hash table size is
subsequently requested, the value of the "maxhashbits" field should be
increased accordingly, up to RBT_HASH_MAX_BITS. However, the loop in
the rehash_bits() function currently ensures that the number of bits
necessary to index the resized hash table will not be larger than
rbt->maxhashbits instead of RBT_HASH_MAX_BITS, preventing the hash table
from being grown once the "maxhashbits" field of a given dns_rbt_t
structure is set to any value lower than RBT_HASH_MAX_BITS.
Fix by tweaking the loop guard condition in the rehash_bits() function
so that it compares the new number of bits used for indexing the hash
table against RBT_HASH_MAX_BITS rather than rbt->maxhashbits.
Michał Kępień [Thu, 17 Jun 2021 10:39:32 +0000 (12:39 +0200)]
Increase timeout in the rndc deadlock test
The timeout originally picked for "rndc status" invocations (2 seconds)
in the test attempting to reproduce a deadlock caused by running
multiple "rndc addzone", "rndc modzone", and "rndc delzone" commands
concurrently causes intermittent failures of the "addzone" system test
in GitLab CI. Increase the timeout to 10 seconds to make such failures
less probable. Adjust code comments accordingly.
Ondřej Surý [Thu, 3 Jun 2021 06:00:22 +0000 (08:00 +0200)]
Drop support for clang atomic and gcc __sync builtins
The requirements for BIND 9.17+ now requires C11 support from the
compiler, so we can safely drop most of the stdatomic.h shims from
lib/isc/unix/include/stdatomic.h.
This commit removes support for clang atomic builtins (clang >= 3.6.0
includes stdatomic.h header) and for Gcc __sync builtins.
The only compatibility shim that remains is support for __atomic
builtins for Gcc >= 4.7.0 since CentOS 7 still includes only Gcc 4.8.1
and the proper stdatomic.h header was only introduced in Gcc >= 4.9.