Artem Boldariev [Thu, 11 Nov 2021 14:17:02 +0000 (16:17 +0200)]
Fix a crash on unexpected incoming DNS message during XoT xfer
This commit fixes a peculiar corner case in the client-side DoT code
because of which a crash could occur during a zone transfer. A junk
DNS message should be sent at the end of a zone transfer via TLS to
trigger the crash (abort).
This commit, hopefully, fixes that.
Also, this commit adds similar changes to the TCP DNS code, as it
shares the same origin and most of the logic.
Michał Kępień [Tue, 23 Nov 2021 14:35:39 +0000 (15:35 +0100)]
Fix handling of mismatched responses past timeout
When a UDP dispatch receives a mismatched response, it checks whether
there is still enough time to wait for the correct one to arrive before
the timeout fires. If there is not, the result code is set to
ISC_R_TIMEDOUT, but it is not subsequently used anywhere as 'response'
is set to NULL a few lines earlier. This results in the higher-level
read callback (resquery_response() in case of resolver code) not being
called. However, shortly afterwards, a few levels up the call chain,
isc__nm_udp_read_cb() calls isc__nmsocket_timer_stop() on the dispatch
socket, effectively disabling read timeout handling for that socket.
Combined with the fact that reading is not restarted in such a case
(e.g. by calling dispatch_getnext() from udp_recv()), this leads to the
higher-level query structure remaining referenced indefinitely because
the dispatch socket it uses will neither be read from nor closed due to
a timeout. This in turn causes fetch contexts to linger around
indefinitely, which in turn i.a. prevents certain cache nodes (those
containing rdatasets used by fetch contexts, like fctx->nameservers)
from being cleaned.
Fix by making sure the higher-level callback does get invoked with the
ISC_R_TIMEDOUT result code when udp_recv() determines there is no more
time left to receive the correct UDP response before the timeout fires.
This allows the higher-level callback to clean things up, preventing the
reference leak described above.
Aram Sargsyan [Mon, 11 Oct 2021 18:13:39 +0000 (18:13 +0000)]
Fix catalog zone reconfiguration crash
The following scenario triggers a "named" crash:
1. Configure a catalog zone.
2. Start "named".
3. Comment out the "catalog-zone" clause.
4. Run `rndc reconfig`.
5. Uncomment the "catalog-zone" clause.
6. Run `rndc reconfig` again.
Implement the required cleanup of the in-memory catalog zone during
the first `rndc reconfig`, so that the second `rndc reconfig` could
find it in an expected state.
Evan Hunt [Tue, 16 Nov 2021 05:59:37 +0000 (21:59 -0800)]
fix intermittent resolver test error
the resolver test checks that the correct number of fetches have
been sent NS rrsets of a given size, but it formerly did so by
counting queries received by the authoritative server, which could
result in an off-by-one count if one of the queries had been resent
due to a timeout or a port number collision.
this commit changes the test to count fetches initiated by the
resolver, which should prevent the intermittent test failure, and
is the actual datum we were interested in anyway.
Mark Andrews [Fri, 19 Nov 2021 09:54:05 +0000 (10:54 +0100)]
Reject too long ECDSA public keys
opensslecdsa_fromdns() already rejects too short ECDSA public keys.
Make it also reject too long ones. Remove an assignment made redundant
by this change.
Michał Kępień [Fri, 19 Nov 2021 09:32:21 +0000 (10:32 +0100)]
Fix parsing ECDSA keys
raw_key_to_ossl() assumes fixed ECDSA private key sizes (32 bytes for
ECDSAP256SHA256, 48 bytes for ECDSAP384SHA384). Meanwhile, in rare
cases, ECDSAP256SHA256 private keys are representable in 31 bytes or
less (similarly for ECDSAP384SHA384) and that is how they are then
stored in the "PrivateKey" field of the key file. Nevertheless,
raw_key_to_ossl() always calls BN_bin2bn() with a fixed length argument,
which in the cases mentioned above leads to erroneously interpreting
uninitialized memory as a part of the private key. This results in the
latter being malformed and broken signatures being generated. Address
by using the key length provided by the caller rather than a fixed one.
Apply the same change to public key parsing code for consistency, adding
an INSIST() to prevent buffer overruns.
Mark Andrews [Thu, 18 Nov 2021 03:31:52 +0000 (14:31 +1100)]
Don't use 'dnssec-signzone -P' unless necessary
Most of the test zones in the dnssec system test can be verified.
Use -z when only a single key is being used so that the verifier
knows that only a single key is in use.
Mark Andrews [Thu, 18 Nov 2021 03:22:04 +0000 (14:22 +1100)]
Generate test zone with multiple NSEC and NSEC3 chains
The method used to generate a test zone with multiple NSEC and
NSEC3 chains was incorrect. Multiple calls to dnssec-signzone
with multiple parameters is not additive. Extract the chain on
each run then add them to the final signed zone instance.
Evan Hunt [Fri, 19 Nov 2021 03:29:07 +0000 (19:29 -0800)]
fix a use-after-free in resolver
when processing a mismatched response, we call dns_dispatch_getnext().
If that fails, for example because of a timeout, fctx_done() is called,
which cancels all queries. This triggers a crash afterward when
fctx_cancelquery() is called, and is unnecessary since fctx_done()
would have been called later anyway.
Ondřej Surý [Mon, 15 Nov 2021 11:13:22 +0000 (12:13 +0100)]
Fix the data race when shutting down dns_adb
When dns_adb is shutting down, first the adb->shutting_down flag is set
and then task is created that runs shutdown_stage2() that sets the
shutdown flag on names and entries. However, when dns_adb_createfind()
is called, only the individual shutdown flags are being checked, and the
global adb->shutting_down flag was not checked. Because of that it was
possible for a different thread to slip in and create new find between
the dns_adb_shutdown() and dns_adb_detach(), but before the
shutdown_stage2() task is complete. This was detected by
ThreadSanitizer as data race because the zonetable might have been
already detached by dns_view shutdown process and simultaneously
accessed by dns_adb_createfind().
This commit converts the adb->shutting_down to atomic_bool to prevent
the global adb lock when creating the find.
Evan Hunt [Mon, 8 Nov 2021 20:44:55 +0000 (12:44 -0800)]
address '--disable-doh' failures
Change 5756 (GL #2854) introduced build errors when using
'configure --disable-doh'. To fix this, isc_nm_is_http_handle() is
now defined in all builds, not just builds that have DoH enabled.
Missing code comments were added both for that function and for
isc_nm_is_tlsdns_handle().
Petr Špaček [Fri, 5 Nov 2021 10:39:07 +0000 (11:39 +0100)]
Automatically cancel CI jobs on outdated branches
Gitlab feature
https://docs.gitlab.com/ee/ci/pipelines/settings.html#auto-cancel-redundant-pipelines
can automatically cancel jobs which operate on an outdated code, i.e. on
branches which received new commits while jobs with an older set of
commits are still running. For this feature to work jobs have to be
configured with boolean interruptible: true.
I think practically all of our current CI jobs can be cancelled,
so the option is now on by default for all jobs.
Petr Špaček [Mon, 21 Jun 2021 12:51:43 +0000 (14:51 +0200)]
Add new system test for wildcard expansion
This is almost minimal prototype to show how to use python-hypothesis
library in a system test. It does not fully replace existing shell-based
system test for wildcards.
Artem Boldariev [Tue, 12 Oct 2021 13:58:45 +0000 (16:58 +0300)]
DoH: Set the "max-age" "Cache-Control" HTTP header value
This commit makes BIND set the "max-age" value of the "Cache-Control"
HTTP header to the minimal TTL from the Answer section for positive
answers, as RFC 8484 advises in section 5.1.
We calculate the minimal TTL as a side effect of rendering the
response DNS message, so it does not change the code flow much, nor
should it have any measurable negative impact on the performance.
For negative answers, the "max-age" value is set using the TTL and
SOA-minimum values from an SOA record in the Authority section.
Artem Boldariev [Wed, 6 Oct 2021 11:09:53 +0000 (14:09 +0300)]
DoH: Add isc_nm_set_min_answer_ttl()
This commit adds an isc_nm_set_min_answer_ttl() function which is
intended to to be used to give a hint to the underlying transport
regarding the answer TTL.
The interface is intentionally kept generic because over time more
transports might benefit from this functionality, but currently it is
intended for DoH to set "max-age" value within "Cache-Control" HTTP
header (as recommended in the RFC8484, section 5.1 "Cache
Interaction").
It is no-op for other DNS transports for the time being.
Petr Špaček [Wed, 3 Nov 2021 13:50:08 +0000 (14:50 +0100)]
Fix incorrect version bump in statistics channels
The version number for the XML statistics channel was not incremented
correctly after removal of isc_socket code in a55589f881bc4e4c1099e50b6d4ce84ffc7b5ba3, and the JSON version number
was not incremented at all.
Mark Andrews [Tue, 26 Oct 2021 04:28:36 +0000 (15:28 +1100)]
Handle HTTP/1.1 pipelined requests
Check to see whether there are outstanding requests in the
httpd receive buffer after sending the response, and if so,
process them.
Test that pipelined requests are handled by sending multiple
minimal HTTP/1.1 using netcat (nc) and checking that we get
back the same number of responses.
Mark Andrews [Tue, 26 Oct 2021 00:54:31 +0000 (11:54 +1100)]
Consume the HTTP headers after processing a request
Remember the amount of space consumed by the HTTP headers, then
move any trailing data to the start of the httpd->recvbuf once
we have finished processing the request.
if an incoming HTTP request is incomplete, but nothing else is clearly
wrong with it, the stats channel continues reading to see if there's
more coming. the buffer length was not being processed correctly in
this case. also, the server state was not reset correctly when the
request was complete, so that subsequent requests could be appended to
the first buffer instead of being treated as new.
in addition fixing the above problems, this commit also increases the
size of the httpd request buffer from 1024 to 4096, because some
browsers send a lot of headers.
Michal Nowak [Mon, 1 Nov 2021 14:40:08 +0000 (15:40 +0100)]
Fix typo in dns_name_copy-with-result.spatch
A typo introduced in f3f1cab05e05c9bdd5da91f3ab159ec6658ec7f4 prevents
execution of the dns_name_copy-with-result.spatch. The replacement
should end with semicolon not a colon:
Mark Andrews [Mon, 1 Nov 2021 00:40:26 +0000 (11:40 +1100)]
Address bugs in opensslrsa_tofile
1) if 'key->external' is set we just need to call
dst__privstruct_writefile
2) the cleanup of 'bufs' was incorrect as 'i' doesn't reflect the
the current index into 'bufs'. Use a simple for loop.
This review was triggered by Coverity reporting a buffer overrun
on 'bufs'.
Mark Andrews [Mon, 1 Nov 2021 00:13:43 +0000 (11:13 +1100)]
Address potential memory leak in openssldh_parse()
'dh' was being assigned to key->keydata.dh too soon which could
result in a memory leak on error. Moved the assignement of
key->keydata.dh until after dh was correct.
Coverity was reporting dead code on the error path cleaning up 'dh'
which triggered this review.
Michal Nowak [Mon, 1 Nov 2021 13:40:00 +0000 (14:40 +0100)]
Add comparekeys to release tarball
'make dist' omits lib/dns/tests/comparekeys/ (added in 7101afa23cfc7cd005aeeb00802481094a0b9cf5) from release tarball it
creates which makes the unit:gcc:tarball CI job permanently fail in the
dst unit test.
Artem Boldariev [Fri, 29 Oct 2021 15:43:40 +0000 (18:43 +0300)]
Be less strict regarding "tls" statements in the configuration file
In the 9.17.19 release "tls" statements verification code was
added. The code was too strict and assumed that every such a statement
should have both "cert-file" and "key-file" specified. This turned out
to be a regression, as in some cases we plan to use the "tls"
statement to specify TLS connection parameters.
This commit fixes this behaviour; now a "tls" statement should either
have both "cert-file" and "key-file" specified, or both should be
omitted.
Petr Špaček [Thu, 28 Oct 2021 12:26:09 +0000 (14:26 +0200)]
remove last remaining reference to _REENTRANT macro and fix DLZ example
It was used only as guard against unused variable declaration, but the
surrounding code depends on strtok_r being defined unconditionally, so
there is no point in guarding a variable.
Glibc documentation suggests it is obsolete anyway and e.g. Meson build
system decided to ignore it. It seems to be required only by old
Solaris compiler and OpenIndiana uses gcc.
Petr Špaček [Thu, 28 Oct 2021 14:39:20 +0000 (16:39 +0200)]
retain diff output if clang-format changes something
It's major PITA trying to guess what exactly clang-format has changed,
so how CI stores patch file with changes which can be applied locally if
needed.