Don't attempt to resolve DNS responses for intermediate results. This
may create multiple refreshes and can cause a crash.
One scenario is where for the query there is a CNAME and canonical
answer in cache that are both stale. This will trigger a refresh of
the RRsets because we encountered stale data and we prioritized it over
the lookup. It will trigger a refresh of both RRsets. When we start
recursing, it will detect a recursion loop because the recursion
parameters will eventually be the same. In 'dns_resolver_destroyfetch'
the sanity check fails, one of the callers did not get its event back
before trying to destroy the fetch.
Move the call to 'query_refresh_rrset' to 'ns_query_done', so that it
is only called once per client request.
Another scenario is where for the query there is a stale CNAME in the
cache that points to a record that is also in cache but not stale. This
will trigger a refresh of the RRset (because we encountered stale data
and we prioritized it over the lookup).
We mark RRsets that we add to the message with
DNS_RDATASETATTR_STALE_ADDED to prevent adding a duplicate RRset when
a stale lookup and a normal lookup conflict with each other. However,
the other non-stale RRset when following a CNAME chain will be added to
the message without setting that attribute, because it is not stale.
This is a variant of the bug in #2594. The fix covered the same crash
but for stale-answer-client-timeout > 0.
Fix this by clearing all RRsets from the message before refreshing.
This requires the refresh to happen after the query is send back to
the client.
Aram Sargsyan [Thu, 18 Aug 2022 08:59:09 +0000 (08:59 +0000)]
Fix memory leaks in DH code
When used with OpenSSL v3.0.0+, the `openssldh_compare()`,
`openssldh_paramcompare()`, and `openssldh_todns()` functions
fail to cleanup the used memory on some error paths.
Use `DST_RET` instead of `return`, when there is memory to be
released before returning from the functions.
Evan Hunt [Tue, 16 Aug 2022 23:26:02 +0000 (16:26 -0700)]
compression buffer was not reused correctly
when the compression buffer was reused for multiple statistics
requests, responses could grow beyond the correct size. this was
because the buffer was not cleared before reuse; compressed data
was still written to the beginning of the buffer, but then the size
of used region was increased by the amount written, rather than set
to the amount written. this caused responses to grow larger and
larger, potentially reading past the end of the allocated buffer.
Michał Kępień [Thu, 8 Sep 2022 09:11:30 +0000 (11:11 +0200)]
Bound the amount of work performed for delegations
Limit the amount of database lookups that can be triggered in
fctx_getaddresses() (i.e. when determining the name server addresses to
query next) by setting a hard limit on the number of NS RRs processed
for any delegation encountered. Without any limit in place, named can
be forced to perform large amounts of database lookups per each query
received, which severely impacts resolver performance.
The limit used (20) is an arbitrary value that is considered to be big
enough for any sane DNS delegation.
Fix RRL responses-per-second bypass using wildcard names
It is possible to bypass Response Rate Limiting (RRL)
`responses-per-second` limitation using specially crafted wildcard
names, because the current implementation, when encountering a found
DNS name generated from a wildcard record, just strips the leftmost
label of the name before making a key for the bucket.
While that technique helps with limiting random requests like
<random>.example.com (because all those requests will be accounted
as belonging to a bucket constructed from "example.com" name), it does
not help with random names like subdomain.<random>.example.com.
The best solution would have been to strip not just the leftmost
label, but as many labels as necessary until reaching the suffix part
of the wildcard record from which the found name is generated, however,
we do not have that information readily available in the context of RRL
processing code.
Fix the issue by interpreting all valid wildcard domain names as
the zone's origin name concatenated to the "*" name, so they all will
be put into the same bucket.
Matthijs Mekking [Tue, 30 Aug 2022 08:04:16 +0000 (10:04 +0200)]
Update inline system test, zone 'retransfer3.'
The zone 'retransfer3.' tests whether zones that 'rndc signing
-nsec3param' requests are queued even if the zone is not loaded.
The test assumes that if 'rndc signing -list' shows that the zone is
done signing with two keys, and there are no NSEC3 chains pending, the
zone is done handling the '-nsec3param' queued requests. However, it
is possible that the 'rndc signing -list' command is received before
the corresponding privatetype records are added to the zone (the records
that are used to retrieve the signing status with 'rndc signing').
This is what happens in test failure
https://gitlab.isc.org/isc-projects/bind9/-/jobs/2722752.
The 'rndc signing -list retransfer3' is thus an unreliable check.
It is simpler to just remove the check and wait for a certain amount
of time and check whether ns3 has re-signed the zone using NSEC3.
Michał Kępień [Wed, 7 Sep 2022 10:50:08 +0000 (12:50 +0200)]
Fix building with --disable-doh
Commit b69e783164cd50e3306364668558e460617ee8fc inadvertently caused
builds using the --disable-doh switch to fail, by putting the
declaration of the isc__nm_async_settlsctx() function inside an #ifdef
block that is only evaluated when DNS-over-HTTPS support is enabled.
This results in the following compilation errors being triggered:
netmgr/netmgr.c:2657:1: error: no previous prototype for 'isc__nm_async_settlsctx' [-Werror=missing-prototypes]
2657 | isc__nm_async_settlsctx(isc__networker_t *worker, isc__netievent_t *ev0) {
| ^~~~~~~~~~~~~~~~~~~~~~~
Fix by making the declaration of the isc__nm_async_settlsctx() function
in lib/isc/netmgr/netmgr-int.h visible regardless of whether
DNS-over-HTTPS support is enabled or not.
Mark Andrews [Mon, 18 Jul 2022 07:21:25 +0000 (17:21 +1000)]
Silence REVERSE_INULL
Remove unnecessary != NULL checks
*** CID 352809: Null pointer dereferences (REVERSE_INULL) /lib/dns/message.c: 4654 in dns_message_buildopt()
4648 if (rdata != NULL) {
4649 dns_message_puttemprdata(message, &rdata);
4650 }
4651 if (rdataset != NULL) {
4652 dns_message_puttemprdataset(message, &rdataset);
4653 }
>>> CID 352809: Null pointer dereferences (REVERSE_INULL)
>>> Null-checking "rdatalist" suggests that it may be null, but it has already been dereferenced on all paths leading to the check.
4654 if (rdatalist != NULL) {
4655 dns_message_puttemprdatalist(message, &rdatalist);
4656 }
4657 return (result);
4658 }
4659
The usage of xmlInitThreads() and xmlCleanupThreads() functions in
libxml2 is now marked as deprecated, and these functions will be made
private in the future.
Use xmlInitParser() and xmlCleanupParser() instead of them.
The isc_nm_listentlsdns() function erroneously calls
isc__nm_tcpdns_stoplistening() instead of isc__nm_tlsdns_stoplistening()
when something goes wrong, which can cause an assertion failure.
Ondřej Surý [Fri, 26 Aug 2022 10:24:07 +0000 (12:24 +0200)]
Allow fallback to IDNA2003 processing
In several cases where IDNA2008 mappings do not exist whereas IDNA2003
mappings do, dig was failing to process the suplied domain name. Take a
backwards compatible approach, and convert the domain to IDNA2008 form,
and if that fails try the IDNA2003 conversion.
Evan Hunt [Sat, 27 Aug 2022 00:58:55 +0000 (17:58 -0700)]
quote addresses in YAML output
YAML strings should be quoted if they contain colon characters.
Since IPv6 addresses do, we now quote the query_address and
response_address strings in all YAML output.
Evan Hunt [Fri, 26 Aug 2022 22:38:34 +0000 (15:38 -0700)]
dnstap query_message field was erroneously set with responses
The dnstap query_message field was in some cases being filled in
with response messages, along with the response_message field.
The query_message field should only be used when logging requests,
and the response_message field only when logging responses.
Ondřej Surý [Wed, 24 Aug 2022 12:59:50 +0000 (14:59 +0200)]
Clear the callbacks when isc_nm_stoplistening() is called
When we are closing the listening sockets, there's a time window in
which the TCP connection could be accepted although the respective
stoplistening function has already returned to control to the caller.
Clear the accept callback function early, so it doesn't get called when
we are not interested in the incoming connections anymore.
Split netmgr_test into separate per-transport unit tests
The netmgr_test unit test has been subdivided into tcp_test,
tcpdns_test, tls_test, tlsdns_test, and udp_test components.
These have been updated to use the new loopmgr.
* isc_timer was rewritten using the uv_timer, and isc_timermgr_t was
completely removed; isc_timer objects are now directly created on the
isc_loop event loops.
* the isc_timer API has been simplified. the "inactive" timer type has
been removed; timers are now stopped by calling isc_timer_stop()
instead of resetting to inactive.
* isc_manager now creates a loop manager rather than a timer manager.
* modules and applications using isc_timer have been updated to use the
new API.
This commit introduces new APIs for applications and signal handling,
intended to replace isc_app for applications built on top of libisc.
* isc_app will be replaced with isc_loopmgr, which handles the
starting and stopping of applications. In isc_loopmgr, the main
thread is not blocked, but is part of the working thread set.
The loop manager will start a number of threads, each with a
uv_loop event loop running. Setup and teardown functions can be
assigned which will run when the loop starts and stops, and
jobs can be scheduled to run in the meantime. When
isc_loopmgr_shutdown() is run from any the loops, all loops
will shut down and the application can terminate.
* signal handling will now be handled with a separate isc_signal unit.
isc_loopmgr only handles SIGTERM and SIGINT for application
termination, but the application may install additional signal
handlers, such as SIGHUP as a signal to reload configuration.
* new job running primitives, isc_job and isc_async, have been added.
Both units schedule callbacks (specifying a callback function and
argument) on an event loop. The difference is that isc_job unit is
unlocked and not thread-safe, so it can be used to efficiently
run jobs in the same thread, while isc_async is thread-safe and
uses locking, so it can be used to pass jobs from one thread to
another.
* isc_tid will be used to track the thread ID in isc_loop worker
threads.
Matthijs Mekking [Tue, 23 Aug 2022 08:54:42 +0000 (10:54 +0200)]
nsec3.c: Add a missing dns_db_detachnode() call
There is one case in 'dns_nsec3_activex()' where it returns but forgets
to detach the db node. Add the missing 'dns_db_detachnode()' call.
This case only triggers if 'sig-signing-type' (privatetype) is set to 0
(which by default is not), or if the function is called with 'complete'
is set to 'true' (which at this moment do not exist).
Matthijs Mekking [Fri, 19 Aug 2022 12:42:47 +0000 (14:42 +0200)]
Fix nsec3 system test issues
The wait_for_zone_is_signed function was never called, which could lead
to test failures due to timing issues (where a zone was not fully signed
yet, but the test was trying to verify the zone).
Also add two missing set_nsec3param calls to ensure the ITERATIONS
value is set for these test cases.
Matthijs Mekking [Wed, 10 Aug 2022 14:41:30 +0000 (16:41 +0200)]
Add test case for #3486
Add two scenarios where we change the dnssec-policy from using RSASHA1
to something with NSEC3.
The first case should work, as the DS is still in hidden state and we
can basically do anything with DNSSEC.
The second case should fail, because the DS of the predecessor is
published and we can't immediately remove the predecessor DNSKEY. So
in this case we should keep the NSEC chain for a bit longer.
Add two more scenarios where we change the dnssec-policy from using
NSEC3 to something NSEC only. Both should work because there are no
restrictions on using NSEC when it comes to algorithms, but in the
cases where the DS is published we can't bluntly remove the predecessor.
Extend the nsec3 system test by also checking the DNSKEY RRset for the
expected DNSKEY records. This requires some "kasp system"-style setup
for each test (setting key properties and key states). Also move the
dnssec-verify check inside the check_nsec/check_nsec3 functions because
we will have to do that every time.
Matthijs Mekking [Wed, 10 Aug 2022 13:29:59 +0000 (15:29 +0200)]
Wait with NSEC3 during a DNSSEC policy change
When doing a dnssec-policy reconfiguration from a zone with NSEC only
keys to a zone that uses NSEC3, figure out to wait with building the
NSEC3 chain.
Previously, BIND 9 would attempt to sign such a zone, but failed to
do so because the NSEC3 chain conflicted with existing DNSKEY records
in the zone that were not compatible with NSEC3.
There exists logic for detecting such a case in the functions
dnskey_sane() (in lib/dns/zone.c) and check_dnssec() (in
lib/ns/update.c). Both functions look very similar so refactor them
to use the same code and call the new function (called
dns_zone_check_dnskey_nsec3()).
Also update the dns_nsec_nseconly() function to take an additional
parameter 'diff' that, if provided, will be checked whether an
offending NSEC only DNSKEY will be deleted from the zone. If so,
this key will not be considered when checking the zone for NSEC only
DNSKEYs. This is needed to allow a transition from an NSEC zone with
NSEC only DNSKEYs to an NSEC3 zone.
Fix statistics channel multiple request processing with non-empty bodies
When the HTTP request has a body part after the HTTP headers, it is
not getting processed and is being prepended to the next request's data,
which results in an error when trying to parse it.
Improve the httpd.c:process_request() function with the following
additions:
1. Require that HTTP POST requests must have Content-Length header.
2. When Content-Length header is set, extract its value, and make sure
that it is valid and that the whole request's body is received before
processing the request.
3. Discard the request's body by consuming Content-Length worth of data
in the buffer.