This is done to suppress messages like "lt-rndc: 'reconfig' failed:
failure" appearing in the message log of the test, because failure
is actually expected, and the appearance of that message can be
confusing.
The port value used in this case is not correct, making the
`rndc reload` command to fail. This error was not detected earlier
only because the failure of the command is actually expected, but
the failure happens for a "wrong" reason, and the test still passes.
Fix the error by using the existing variable instead of the fixed
number.
Aram Sargsyan [Wed, 29 Dec 2021 09:07:03 +0000 (09:07 +0000)]
Add a system test for view reverting after a failed reconfiguration
Test the view reverting code by introducing a faulty dlz configuration
in named.conf and using `rndc reconfig` to check if named handles the
situation correctly.
We use "dlz" because the dlz processing code is located in an ideal
place in the view configuration function for the test to cover the
view reverting code.
This test is specifically added to the catz system test to additionally
cover the catz reconfiguration during the mentioned failed
reconfiguration attempt.
Aram Sargsyan [Tue, 28 Dec 2021 12:08:48 +0000 (12:08 +0000)]
Improve the zones' view reverting logic when a zone is a catalog zone
When a zone is being configured with a new view, the catalog zones
structure will also be linked to that view. Later on, in case of some
error, should the zone be reverted to the previous view, the link
between the catalog zones structure and the view won't be reverted.
Change the dns_zone_setviewrevert() function so it calls
dns_zone_catz_enable() during a zone revert, which will reset the
link between `catzs` and view.
Aram Sargsyan [Wed, 5 Jan 2022 09:38:36 +0000 (09:38 +0000)]
Separate the locked parts of dns_zone_catz_enable/disable functions
Separate the locked parts of dns_zone_catz_enable() and
dns_zone_catz_disable() functions into static functions. This will
let us perform those tasks from the other parts of the module while
the zone is locked, avoiding one pair of additional unlocking and
locking operations.
Aram Sargsyan [Tue, 28 Dec 2021 11:51:01 +0000 (11:51 +0000)]
Improve the view configuration error handling and reverting logic
If a view configuration error occurs during a named reconfiguration
procedure, BIND can end up having twin views (old and new), with some
zones and internal structures attached to the old one, and others
attached to the new one, which essentially creates chaos.
Implement some additional view reverting mechanisms to avoid the
situation described above:
Petr Špaček [Thu, 20 Jan 2022 16:18:06 +0000 (17:18 +0100)]
Remove RFCs not implemented in BIND from list in the ARM
This commit partially removes extra RFCs which are not listed in
file doc/misc/rfc-compliance.
Most of the removed RFCs are either outright obsolete, irrelevant,
or not implemented. Rationale:
- 974 - obsolete
- 1033 - ops info, hardly followed today
- 1464 - ops info
- 1591 - policy
- 1537 - obsolete
- 1713 - obsolete
- 1794 - notimp
- 2010 - ops info
- 2052 - obsolete
- 2065 - obsolete
- 2137 - obsolete
- 2168 - obsolete
- 2240 - obsolete
- 2345 - not dns
- 2352 - not dns
- 2540 - notimp
- 2825 - notimp, info, obsolete
- 2826 - notimp
- 2929 - obsolete
- 3071 - policy
- 3090 - obsolete
- 3258 - notimp
- 6594 - iana, SSHFP
- 7216 - not dns
- 8482 - notimp
- 8490 - notimp
Probably most notable RFCs removed are:
- 8482 for special ANY handling
- 8490 for Stateful Operations
As far as I can tell BIND does not implement those.
Petr Špaček [Wed, 19 Jan 2022 15:30:18 +0000 (16:30 +0100)]
Replace all occurences of PLATFORMS file with reference to the ARM
The conf.py exclude_patterns now includes platforms.rst to avoid
problems with redefining labels:
https://github.com/sphinx-doc/sphinx/issues/1668#issuecomment-71376208
Ondřej Surý [Sat, 22 Jan 2022 15:59:50 +0000 (16:59 +0100)]
Ignore the invalid L1 cache line size returned by sysconf()
On some systems, the glibc can return 0 instead of cache-line size to
indicate the cache line sizes cannot be determined. This is comment
from glibc source code:
/* In general we cannot determine these values. Therefore we
return zero which indicates that no information is
available. */
As the goal of the check is to determine whether the L1 cache line size
is still 64 and we would use this value in case the sysconf() call is
not available, we can also ignore the invalid values returned by the
sysconf() call.
Petr Špaček [Mon, 17 Jan 2022 13:53:59 +0000 (14:53 +0100)]
Remove duplicate named.conf.rst file
As far as I can tell, it is some leftover from the times when Sphinx
docs were introduced (commit 9fb6d11abbdb10ded128f0ee5c004f24b4030ede).
It seems like it is not referenced from anywhere.
Michał Kępień [Thu, 20 Jan 2022 14:40:37 +0000 (15:40 +0100)]
Suggest --disable-doh when libnghttp2 is not found
Extend the error message displayed when support for DNS over HTTPS is
requested but libnghttp2 is unavailable at build time, in order to help
the user find a way out of such a situation.
Michał Kępień [Thu, 20 Jan 2022 14:40:37 +0000 (15:40 +0100)]
Fix spelling of "DNS over HTTPS" & "DNS over TLS"
The terms "DNS over HTTPS" and "DNS over TLS" should be hyphenated when
they are used as adjectives and non-hyphenated otherwise. Ensure all
occurrences of these terms in the source tree follow the above rule.
(CHANGES and release notes are intentionally left intact.)
Tweak a related ARM snippet, fixing a typo in the process.
Michał Kępień [Wed, 19 Jan 2022 13:30:17 +0000 (14:30 +0100)]
rndc: prevent crashing after receiving a signal
If isc_app_run() gets interrupted by a signal, the global 'rndc_task'
variable may already be detached from (set to NULL) by the time the
outstanding netmgr callbacks are run. This triggers an assertion
failure in isc_task_shutdown(). However, explicitly calling
isc_task_shutdown() from rndc code is redundant because it does not use
isc_task_onshutdown() and the task_shutdown() function gets
automatically called anyway when the task manager gets destroyed (after
isc_app_run() returns). Remove the redundant isc_task_shutdown() calls
to prevent crashes after receiving a signal.
Evan Hunt [Thu, 13 Jan 2022 19:18:27 +0000 (11:18 -0800)]
rndc: sync ISC_R_CANCELED handling in callbacks
rndc_recvdone() is not treating the ISC_R_CANCELED result code as a
request to stop data processing, which may cause a crash when trying to
dereference ccmsg->buffer. Fix by ensuring ISC_R_CANCELED results in an
early exit from rndc_recvdone().
Make sure the logic for handling ISC_R_CANCELED in rndc_recvnonce()
matches the one present in rndc_recvdone() to ensure consistent behavior
between these two sibling functions.
Artem Boldariev [Fri, 14 Jan 2022 10:25:04 +0000 (12:25 +0200)]
doth test: fix failure after reconfig
Sometimes the serving a query or two might fail in the test due to the
listeners not being reinitialised on time. This commit makes the test
suite to wait for reconfiguration message in the log file to detect
the time when the reconfiguration request completed.
Michał Kępień [Tue, 18 Jan 2022 10:00:46 +0000 (11:00 +0100)]
Reimplement the gnutls-cli check in Python
gnutls-cli is tricky to script around as it immediately closes the
server connection when its standard input is closed. This prevents
simple shell-based I/O redirection from being used for capturing the DNS
response sent over a TLS connection and the workarounds for this issue
employ non-standard utilities like "timeout".
Instead of resorting to clever shell hacks, reimplement the relevant
check in Python. Exit immediately upon receiving a valid DNS response
or when gnutls-cli exits in order to decrease the test's run time.
Employ dnspython to avoid the need for storing DNS queries in binary
files and to improve test readability. Capture more diagnostic output
to facilitate troubleshooting. Use a pytest fixture instead of an
Autoconf macro to keep test requirements localized.
Ondřej Surý [Thu, 13 Jan 2022 12:24:55 +0000 (13:24 +0100)]
Explicitly enable IPV6_V6ONLY on the netmgr sockets
Some operating systems (OpenBSD and DragonFly BSD) don't restrict the
IPv6 sockets to sending and receiving IPv6 packets only. Explicitly
enable the IPV6_V6ONLY socket option on the IPv6 sockets to prevent
failures from using the IPv4-mapped IPv6 address.
Artem Boldariev [Fri, 14 Jan 2022 12:14:53 +0000 (14:14 +0200)]
DoH: ensure that server_send_error_response() is used properly
The server_send_error_response() function is supposed to be used only
in case of failures and never in case of legitimate requests. Ensure
that ISC_HTTP_ERROR_SUCCESS is never passed there by mistake.
Ondřej Surý [Fri, 14 Jan 2022 11:27:19 +0000 (12:27 +0100)]
Increase the timeout to 15 seconds for the resolver test
1. 10 seconds is an unfortunate pick because that reintroduces the
problem described in commit 5307bf64 (for an earlier check).
Change the +tries=3 +timeout=10 to +tries=2 +time=15, so that we
minimize the risk of dig missing any responses sent by the server in
the first 15 seconds while also increasing our chances of the
response arriving in time on machines under heavy load and allowing
it a single retry in case things go awry.
2. The comment about TCP above was misleading: as painfully proven by
GitLab CI, using TCP is no guarantee of receiving a response in a
timely manner. It may help a bit, but it is certainly not a 100%
reliable solution.
Change the dig invocation to just use UDP like in the two prior
tests for consistency (and revise that comment accordingly).
Ondřej Surý [Fri, 14 Jan 2022 10:01:36 +0000 (11:01 +0100)]
Increase the dig timeout in resolver test to 10 seconds
The resolver system tests was exhibiting often intermitten failures,
increase the timeout from default 5 second to 10 seconds to give the dig
more leeway for providing an answer.
Ondrej Sury [Wed, 12 Jan 2022 23:04:57 +0000 (00:04 +0100)]
Disable udp recvmmsg support on systems with MUSL libc
The Linux kernel diverts from the POSIX specification for two members of
struct msghdr making them size_t sized (instead of int and socklen_t).
In glibc, the developers have decided to use that. However, the MUSL
developers used padding for the struct and kept the members defined
according to the POSIX.
This creates a problem, because libuv doesn't use recvmmsg() library
call where the padding members are correctly zeroed and instead calls
the syscall directly, the struct msghdr is passed to the kernel with
enormous values in those two members (because of the random junk in the
padding members) and the syscall thus fail with EMSGSIZE.
Disable udp recvmmsg support on systems with MUSL libc until the libuv
starts zeroing the struct msghdr before passing it to the syscall.
Ondřej Surý [Tue, 11 Jan 2022 11:14:23 +0000 (12:14 +0100)]
Fix the UDP recvmmsg support
Previously, the netmgr/udp.c tried to detect the recvmmsg detection in
libuv with #ifdef UV_UDP_<foo> preprocessor macros. However, because
the UV_UDP_<foo> are not preprocessor macros, but enum members, the
detection didn't work. Because the detection didn't work, the code
didn't have access to the information when we received the final chunk
of the recvmmsg and tried to free the uvbuf every time. Fortunately,
the isc__nm_free_uvbuf() had a kludge that detected attempt to free in
the middle of the receive buffer, so the code worked.
However, libuv 1.37.0 changed the way the recvmmsg was enabled from
implicit to explicit, and we checked for yet another enum member
presence with preprocessor macro, so in fact libuv recvmmsg support was
never enabled with libuv >= 1.37.0.
This commit changes to the preprocessor macros to autoconf checks for
declaration, so the detection now works again. On top of that, it's now
possible to cleanup the alloc_cb and free_uvbuf functions because now,
the information whether we can or cannot free the buffer is available to
us.
Ondřej Surý [Tue, 11 Jan 2022 13:27:28 +0000 (14:27 +0100)]
Use ISC_R_SHUTTINGDOWN to detect netmgr shutting down
When the dispatch code was refactored in libdns, the netmgr was changed
to return ISC_R_SHUTTINGDOWN when the netmgr is shutting down, and the
ISC_R_CANCELED is now reserved only for situation where the callback was
canceled by the caller.
This change wasn't reflected in the controlconf.c channel which was
still looking for ISC_R_CANCELED as the shutdown event.