Fix crash in pk11_numbits() when native-pkcs11 is used
When pk11_numbits() is passed a user provided input that contains all
zeroes (via crafted DNS message), it would crash with assertion
failure. Fix that by properly handling such input.
permanently disable QNAME minimization in a fetch when forwarding
QNAME minimization is normally disabled when forwarding. if, in the
course of processing a fetch, we switch back to normal recursion at
some point, we can't safely start minimizing because we may have
been left in an inconsistent state.
Each worker has a receive buffer with space for 20 DNS messages of up
to 2^16 bytes each, and the allocator function passed to uv_read_start()
or uv_udp_recv_start() will reserve a portion of it for use by sockets.
UDP can use recvmmsg() and so it needs that entire space, but TCP reads
one message at a time.
This commit introduces separate allocator functions for TCP and UDP
setting different buffer size limits, so that libuv will provide the
correct buffer sizes to each of them.
Michał Kępień [Wed, 5 Aug 2020 10:04:59 +0000 (12:04 +0200)]
Remove arm64 jobs from GitLab CI
The only arm64 runner we have at our disposal is suffering from
intermittent connectivity issues which make it unusable for extended
periods of time. Remove arm64 jobs from GitLab CI until we manage to
set up an arm64 runner with more reliable connectivity.
Michał Kępień [Wed, 5 Aug 2020 07:04:53 +0000 (09:04 +0200)]
Set "max-cache-size" in the "geoip2" system test
The named configuration files used in the "geoip2" system test cause a
rather large number of views (6-8) to be set up in each tested named
instance. Each view has its own cache.
Commit e24bc324b455d9cad7b51acd3d5c7b4e40c66187 caused the RBT hash
table to be pre-allocated to a size derived from "max-cache-size", so
that it never needs to be rehashed. The size of that hash table is not
expected to be significant enough to cause memory use issues in typical
conditions even for large "max-cache-size" settings.
However, these two factors combined can cause memory exhaustion issues
in GitLab CI, where we run multiple "instances" of the test suite in
parallel on the same runner, each test suite executes multiple system
tests concurrently, and each system test may potentially start multiple
named instances at the same time. In practice, this problem currently
only seems to be affecting the "geoip2" system test, which is failing
intermittently due to named instances used by that test getting killed
by oom-killer.
Prevent the "geoip2" system test from failing intermittently by setting
"max-cache-size" in named configuration files used in that test to a low
value in order to keep memory usage at bay even with a large number of
views configured.
Mark Andrews [Wed, 22 Jul 2020 23:47:49 +0000 (09:47 +1000)]
Map DNS_R_BADTSIG to FORMERR
Now that the log message has been printed set the result code to
DNS_R_FORMERR. We don't do this via dns_result_torcode() as we
don't want upstream errors to produce FORMERR if that processing
end with DNS_R_BADTSIG.
Expire the 0 TTL RRSet quickly rather using them for serve-stale
When a received RRSet has TTL 0, they would be preserved for
serve-stale (default `max-stale-cache` is 12 hours) rather than expiring
them quickly from the cache database.
This commit makes sure the RRSet didn't have TTL 0 before marking the
entry in the database as "stale".
Add stale-cache-enable option and disable serve-stable by default
The current serve-stale implementation in BIND 9 stores all received
records in the cache for a max-stale-ttl interval (default 12 hours).
This allows DNS operators to turn the serve-stale answers in an event of
large authoritative DNS outage. The caching of the stale answers needs
to be enabled before the outage happens or the feature would be
otherwise useless.
The negative consequence of the default setting is the inevitable
cache-bloat that happens for every and each DNS operator running named.
In this MR, a new configuration option `stale-cache-enable` is
introduced that allows the operators to selectively enable or disable
the serve-stale feature of BIND 9 based on their decision.
The newly introduced option has been disabled by default,
e.g. serve-stale is disabled in the default configuration and has to be
enabled if required.
Add fuzzing for the isc_lex (isc_lex_gettoken,isc_lex_getmastertoken) API
In this commit, the simple fuzzing tests for the isc_lex_gettoken() and
isc_lex_getmastertoken() functions have been added.
As part of this commit, the initialization has been moved from fuzz.h
constructor/destructor to LLVMFuzzerInitialize() in each fuzz test. The
main.c of no-fuzzing and AFL modes have been modified to run the
LLVMFuzzerInitialize() at the start of the main() function mimicking
the libfuzzer mode of operation.
The fuzzing tests were temporarily disabled when the build system has been
converted to automake. This commit restores the functionality to run the
fuzzing tests as part of the `make check`. When the afl or libfuzzer
is enabled via ./configure, it uses a custom LOG_DRIVER (fuzz/<fuzzer.sh>).
Currently only libfuzzer.sh has been implemented that runs each fuzz
test for 5 seconds each.
netmgr: retry binding with IP_FREEBIND when EADDRNOTAVAIL is returned.
When a new IPv6 interface/address appears it's first in a tentative
state - in which we cannot bind to it, yet it's already being reported
by the route socket. Because of that BIND9 is unable to listen on any
newly detected IPv6 addresses. Fix it by setting IP_FREEBIND option (or
equivalent option on other OSes) and then retrying bind() call.
Mark Andrews [Thu, 5 Dec 2019 02:29:45 +0000 (13:29 +1100)]
Always check the return from isc_refcount_decrement.
Created isc_refcount_decrement_expect macro to test conditionally
the return value to ensure it is in expected range. Converted
unchecked isc_refcount_decrement to use isc_refcount_decrement_expect.
Converted INSIST(isc_refcount_decrement()...) to isc_refcount_decrement_expect.
Mark Andrews [Mon, 20 Jul 2020 01:53:40 +0000 (11:53 +1000)]
Refactor the code that counts the last log version to keep
When silencing the Coverity warning in remove_old_tsversions(), the code
was refactored to reduce the indentation levels and break down the long
code into individual functions. This improve fix for [GL #1989].
Michal Nowak [Tue, 28 Jul 2020 12:42:55 +0000 (14:42 +0200)]
Make sure we don't introduce SYSTEMTESTTOP anymore
':!.gitlab-ci.yml' is a pathspec pattern used to limit paths in the "git
grep" command to all but the .gitlab-ci.yml file which includes the
checked word itself. This requires Git 2.13.
Michal Nowak [Tue, 21 Jul 2020 14:29:14 +0000 (16:29 +0200)]
Source config.guess from source root
It seems that config.guess gets always created in source root, so for
that sake of out-of-tree system test, we should expect the file there
instead of where configure was run.
Michal Nowak [Tue, 21 Jul 2020 10:12:59 +0000 (12:12 +0200)]
Drop $SYSTEMTESTTOP from bin/tests/system/
The $SYSTEMTESTTOP shell variable if often set to .. in various shell
scripts inside bin/tests/system/, but most of the time it is only
used one line later, while sourcing conf.sh. This hardly improves
code readability.
$SYSTEMTESTTOP is also used for the purpose of referencing
scripts/files living in bin/tests/system/, but given that the
variable is always set to a short, relative path, we can drop it and
replace all of its occurrences with the relative path without adversely
affecting code readability.
Michał Kępień [Thu, 30 Jul 2020 08:58:39 +0000 (10:58 +0200)]
Fix idle timeout for connected TCP sockets
When named acting as a resolver connects to an authoritative server over
TCP, it sets the idle timeout for that connection to 20 seconds. This
fixed timeout was picked back when the default processing timeout for
each client query was hardcoded to 30 seconds. Commit 000a8970f840a0c27c5cc404826853c4674362ac made this processing timeout
configurable through "resolver-query-timeout" and decreased its default
value to 10 seconds, but the idle TCP timeout was not adjusted to
reflect that change. As a result, with the current defaults in effect,
a single hung TCP connection will consistently cause the resolution
process for a given query to time out.
Set the idle timeout for connected TCP sockets to half of the client
query processing timeout configured for a resolver. This allows named
to handle hung TCP connections more robustly and prevents the timeout
mismatch issue from resurfacing in the future if the default is ever
changed again.
initialize, rather than invalidating, new http buffers
when building without ISC_BUFFER_USEINLINE (which is the default on
Windows) an assertion failure could occur when setting up a new
isc_httpd_t object for the statistics channel.
Diego Fronza [Tue, 9 Jun 2020 23:45:21 +0000 (20:45 -0300)]
Fix rpz wildcard name matching
Whenever an exact match is found by dns_rbt_findnode(),
the highest level node in the chain will not be put into
chain->levels[] array, but instead the chain->end
pointer will be adjusted to point to that node.
Suppose we have the following entries in a rpz zone:
example.com CNAME rpz-passthru.
*.example.com CNAME rpz-passthru.
A query for www.example.com would result in the
following chain object returned by dns_rbt_findnode():
Since exact matches only care for testing rpz set bits,
we need to test for rpz wild bits through iterating the nodechain, and
that includes testing the rpz wild bits in the highest level node found.
In the case of an exact match, chain->levels[chain->level_matches]
will be NULL, to address that we must use chain->end as the start point,
then iterate over the remaining levels in the chain.
Creation of EVP_MD_CTX and EVP_PKEY is quite expensive, so until we fix the code
to reuse the OpenSSL contexts and keys we'll use our own implementation of
siphash instead of trying to integrate with OpenSSL.
Fix the rbt hashtable and grow it when setting max-cache-size
There were several problems with rbt hashtable implementation:
1. Our internal hashing function returns uint64_t value, but it was
silently truncated to unsigned int in dns_name_hash() and
dns_name_fullhash() functions. As the SipHash 2-4 higher bits are
more random, we need to use the upper half of the return value.
2. The hashtable implementation in rbt.c was using modulo to pick the
slot number for the hash table. This has several problems because
modulo is: a) slow, b) oblivious to patterns in the input data. This
could lead to very uneven distribution of the hashed data in the
hashtable. Combined with the single-linked lists we use, it could
really hog-down the lookup and removal of the nodes from the rbt
tree[a]. The Fibonacci Hashing is much better fit for the hashtable
function here. For longer description, read "Fibonacci Hashing: The
Optimization that the World Forgot"[b] or just look at the Linux
kernel. Also this will make Diego very happy :).
3. The hashtable would rehash every time the number of nodes in the rbt
tree would exceed 3 * (hashtable size). The overcommit will make the
uneven distribution in the hashtable even worse, but the main problem
lies in the rehashing - every time the database grows beyond the
limit, each subsequent rehashing will be much slower. The mitigation
here is letting the rbt know how big the cache can grown and
pre-allocate the hashtable to be big enough to actually never need to
rehash. This will consume more memory at the start, but since the
size of the hashtable is capped to `1 << 32` (e.g. 4 mio entries), it
will only consume maximum of 32GB of memory for hashtable in the
worst case (and max-cache-size would need to be set to more than
4TB). Calling the dns_db_adjusthashsize() will also cap the maximum
size of the hashtable to the pre-computed number of bits, so it won't
try to consume more gigabytes of memory than available for the
database.
FIXME: What is the average size of the rbt node that gets hashed? I
chose the pagesize (4k) as initial value to precompute the size of
the hashtable, but the value is based on feeling and not any real
data.
For future work, there are more places where we use result of the hash
value modulo some small number and that would benefit from Fibonacci
Hashing to get better distribution.
Notes:
a. A doubly linked list should be used here to speedup the removal of
the entries from the hashtable.
b. https://probablydance.com/2018/06/16/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo/