netmgr: retry binding with IP_FREEBIND when EADDRNOTAVAIL is returned.
When a new IPv6 interface/address appears it's first in a tentative
state - in which we cannot bind to it, yet it's already being reported
by the route socket. Because of that BIND9 is unable to listen on any
newly detected IPv6 addresses. Fix it by setting IP_FREEBIND option (or
equivalent option on other OSes) and then retrying bind() call.
Mark Andrews [Thu, 5 Dec 2019 02:29:45 +0000 (13:29 +1100)]
Always check the return from isc_refcount_decrement.
Created isc_refcount_decrement_expect macro to test conditionally
the return value to ensure it is in expected range. Converted
unchecked isc_refcount_decrement to use isc_refcount_decrement_expect.
Converted INSIST(isc_refcount_decrement()...) to isc_refcount_decrement_expect.
Mark Andrews [Mon, 20 Jul 2020 01:53:40 +0000 (11:53 +1000)]
Refactor the code that counts the last log version to keep
When silencing the Coverity warning in remove_old_tsversions(), the code
was refactored to reduce the indentation levels and break down the long
code into individual functions. This improve fix for [GL #1989].
Michal Nowak [Tue, 28 Jul 2020 12:42:55 +0000 (14:42 +0200)]
Make sure we don't introduce SYSTEMTESTTOP anymore
':!.gitlab-ci.yml' is a pathspec pattern used to limit paths in the "git
grep" command to all but the .gitlab-ci.yml file which includes the
checked word itself. This requires Git 2.13.
Michal Nowak [Tue, 21 Jul 2020 14:29:14 +0000 (16:29 +0200)]
Source config.guess from source root
It seems that config.guess gets always created in source root, so for
that sake of out-of-tree system test, we should expect the file there
instead of where configure was run.
Michal Nowak [Tue, 21 Jul 2020 10:12:59 +0000 (12:12 +0200)]
Drop $SYSTEMTESTTOP from bin/tests/system/
The $SYSTEMTESTTOP shell variable if often set to .. in various shell
scripts inside bin/tests/system/, but most of the time it is only
used one line later, while sourcing conf.sh. This hardly improves
code readability.
$SYSTEMTESTTOP is also used for the purpose of referencing
scripts/files living in bin/tests/system/, but given that the
variable is always set to a short, relative path, we can drop it and
replace all of its occurrences with the relative path without adversely
affecting code readability.
Michał Kępień [Thu, 30 Jul 2020 08:58:39 +0000 (10:58 +0200)]
Fix idle timeout for connected TCP sockets
When named acting as a resolver connects to an authoritative server over
TCP, it sets the idle timeout for that connection to 20 seconds. This
fixed timeout was picked back when the default processing timeout for
each client query was hardcoded to 30 seconds. Commit 000a8970f840a0c27c5cc404826853c4674362ac made this processing timeout
configurable through "resolver-query-timeout" and decreased its default
value to 10 seconds, but the idle TCP timeout was not adjusted to
reflect that change. As a result, with the current defaults in effect,
a single hung TCP connection will consistently cause the resolution
process for a given query to time out.
Set the idle timeout for connected TCP sockets to half of the client
query processing timeout configured for a resolver. This allows named
to handle hung TCP connections more robustly and prevents the timeout
mismatch issue from resurfacing in the future if the default is ever
changed again.
initialize, rather than invalidating, new http buffers
when building without ISC_BUFFER_USEINLINE (which is the default on
Windows) an assertion failure could occur when setting up a new
isc_httpd_t object for the statistics channel.
Diego Fronza [Tue, 9 Jun 2020 23:45:21 +0000 (20:45 -0300)]
Fix rpz wildcard name matching
Whenever an exact match is found by dns_rbt_findnode(),
the highest level node in the chain will not be put into
chain->levels[] array, but instead the chain->end
pointer will be adjusted to point to that node.
Suppose we have the following entries in a rpz zone:
example.com CNAME rpz-passthru.
*.example.com CNAME rpz-passthru.
A query for www.example.com would result in the
following chain object returned by dns_rbt_findnode():
Since exact matches only care for testing rpz set bits,
we need to test for rpz wild bits through iterating the nodechain, and
that includes testing the rpz wild bits in the highest level node found.
In the case of an exact match, chain->levels[chain->level_matches]
will be NULL, to address that we must use chain->end as the start point,
then iterate over the remaining levels in the chain.
Creation of EVP_MD_CTX and EVP_PKEY is quite expensive, so until we fix the code
to reuse the OpenSSL contexts and keys we'll use our own implementation of
siphash instead of trying to integrate with OpenSSL.
Fix the rbt hashtable and grow it when setting max-cache-size
There were several problems with rbt hashtable implementation:
1. Our internal hashing function returns uint64_t value, but it was
silently truncated to unsigned int in dns_name_hash() and
dns_name_fullhash() functions. As the SipHash 2-4 higher bits are
more random, we need to use the upper half of the return value.
2. The hashtable implementation in rbt.c was using modulo to pick the
slot number for the hash table. This has several problems because
modulo is: a) slow, b) oblivious to patterns in the input data. This
could lead to very uneven distribution of the hashed data in the
hashtable. Combined with the single-linked lists we use, it could
really hog-down the lookup and removal of the nodes from the rbt
tree[a]. The Fibonacci Hashing is much better fit for the hashtable
function here. For longer description, read "Fibonacci Hashing: The
Optimization that the World Forgot"[b] or just look at the Linux
kernel. Also this will make Diego very happy :).
3. The hashtable would rehash every time the number of nodes in the rbt
tree would exceed 3 * (hashtable size). The overcommit will make the
uneven distribution in the hashtable even worse, but the main problem
lies in the rehashing - every time the database grows beyond the
limit, each subsequent rehashing will be much slower. The mitigation
here is letting the rbt know how big the cache can grown and
pre-allocate the hashtable to be big enough to actually never need to
rehash. This will consume more memory at the start, but since the
size of the hashtable is capped to `1 << 32` (e.g. 4 mio entries), it
will only consume maximum of 32GB of memory for hashtable in the
worst case (and max-cache-size would need to be set to more than
4TB). Calling the dns_db_adjusthashsize() will also cap the maximum
size of the hashtable to the pre-computed number of bits, so it won't
try to consume more gigabytes of memory than available for the
database.
FIXME: What is the average size of the rbt node that gets hashed? I
chose the pagesize (4k) as initial value to precompute the size of
the hashtable, but the value is based on feeling and not any real
data.
For future work, there are more places where we use result of the hash
value modulo some small number and that would benefit from Fibonacci
Hashing to get better distribution.
Notes:
a. A doubly linked list should be used here to speedup the removal of
the entries from the hashtable.
b. https://probablydance.com/2018/06/16/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo/
Michal Nowak [Mon, 22 Jun 2020 12:13:46 +0000 (14:13 +0200)]
Ensure various test issues are treated as failures
Make sure bin/tests/system/run.sh returns a non-zero exit code if any of
the following happens:
- the test being run produces a core dump,
- assertion failures are found in the test's logs,
- ThreadSanitizer reports are found after the test completes,
- the servers started by the test fail to shut down cleanly.
This change is necessary to always fail a test in such cases (before the
migration to Automake, test failures were determined based on the
presence of "R:<test-name>:FAIL" lines in the test suite output and thus
it was not necessary for bin/tests/system/run.sh to return a non-zero
exit code).
Michał Kępień [Thu, 16 Jul 2020 09:28:09 +0000 (11:28 +0200)]
Update release checklist
Add an item to the release checklist to make sure confidential issues
assigned to the relevant milestone are made public after the BIND
versions addressing them are released.
Michał Kępień [Tue, 14 Jul 2020 07:58:04 +0000 (09:58 +0200)]
Use "image" key in QEMU-based CI job templates
Our GitLab Runner Custom executor scripts now use the "image" key
instead of the job name for determining the QCOW2 image to use for a
given CI job. Update .gitlab-ci.yml to reflect that change.
Tony Finch [Mon, 22 Jun 2020 19:23:29 +0000 (20:23 +0100)]
Fix re-signing when `sig-validity-interval` has two arguments
Since October 2019 I have had complaints from `dnssec-cds` reporting
that the signatures on some of my test zones had expired. These were
zones signed by BIND 9.15 or 9.17, with a DNSKEY TTL of 24h and
`sig-validity-interval 10 8`.
This is the same setup we have used for our production zones since
2015, which is intended to re-sign the zones every 2 days, keeping
at least 8 days signature validity. The SOA expire interval is 7
days, so even in the presence of zone transfer problems, no-one
should ever see expired signatures. (These timers are a bit too
tight to be completely correct, because I should have increased
the expiry timers when I increased the DNSKEY TTLs from 1h to 24h.
But that should only matter when zone transfers are broken, which
was not the case for the error reports that led to this patch.)
This TTL was captured at 20200622105807 so the resolver cached the
RRset 64976 seconds previously (18h02m56s), at 20200621165511
only about 12h before expiry.
The other symptom of this error was incorrect `resign` times in
the output from `rndc zonestatus`.
For example, I have configured a test zone
zone fast.dotat.at {
file "../u/z/fast.dotat.at";
type primary;
auto-dnssec maintain;
sig-validity-interval 500 499;
};
The zone is reset to a minimal zone containing only SOA and NS
records, and when `named` starts it loads and signs the zone. After
that, `rndc zonestatus` reports:
next resign node: fast.dotat.at/NS
next resign time: Fri, 28 May 2021 12:48:47 GMT
The resign time should be within the next 24h, but instead it is
near the signature expiry time, which the RRSIG(NS) says is 20210618074847. (Note 499 hours is a bit more than 20 days.)
May/June 2021 is less than 500 days from now because expiry time
jitter is applied to the NS records.
Using this test I bisected this bug to 09990672d which contained a
mistake leading to the resigning interval always being calculated in
hours, when days are expected.
This bug only occurs for configurations that use the two-argument form
of `sig-validity-interval`.