git.ipfire.org Git - thirdparty/bind9.git/log

Merge branch '2357-cleanup-public-headers' into 'main'

Resolve "Cannot compile current versions on macOS "Catalina""

Closes #2357

See merge request isc-projects/bind9!4670

Stop including gssapi.h from dst/gssapi.h header

The only reason for including the gssapi.h from the dst/gssapi.h header
was to get the typedefs of gss_cred_id_t and gss_ctx_id_t. Instead of
using those types directly this commit introduces dns_gss_cred_id_t and
dns_gss_ctx_id_t types that are being used in the public API and
privately retyped to their counterparts when we actually call the gss
api.

This also conceals the gssapi headers, so users of the libdns library
doesn't have to add GSSAPI_CFLAGS to the Makefile when including libdns
dst API.

Stop including dnstap headers from <dns/dnstap.h>

The <fstrm.h> and <protobuf-c/protobuf-c.h> headers are only directly
included where used and we stopped exposing those headers from libdns
headers.

Stop including lmdb.h from <dns/view.h>

The lmdb.h doesn't have to be included from the dns/view.h header as it
is separately included where used. This stops exposing the inclusion of
lmdb.h from the libdns headers.

Move the <isc/readline.h> header to bin/dig/readline.h

The <isc/readline.h> header provided a compatibility shim to use when
other non-GNU readline libraries are in use. The two places where
readline library is being used is nslookup and nsupdate, so the header
file has been moved to bin/dig directory and it's directly included from
bin/nsupdate.

This also conceals any readline headers exposed from the libisc headers.

Remove the extra CFLAGS from libisc_CFLAGS and libdns_CFLAGS

The extra library CFLAGS were causing the headers to be included in
wrong order possibly pulling header files from previously installed
BIND 9 version.

This commit cleans up the extra <foo>_CFLAGS from the includes in favor
of not exposing 3rd party headers in our own header files.

Merge branch '2041-bug-reconfig-auto-dnssec-high-thread-number-leak-resources-and-crash-named' into 'main'

Resolve "BUG reconfig+auto-dnssec+high thread number leak resources and crash named"

Closes #2041

See merge request isc-projects/bind9!4669

Add CHANGES note for [GL #2041]

Test reconfig after adding inline signed zones won't crash named

This test ensures that named won't crash after many inline-signed zones
are added to configurarion, followed by a rndc reconfig.

Fix dangling references to outdated views after reconfig

This commit fix a leak which was happening every time an inline-signed
zone was added to the configuration, followed by a rndc reconfig.

During the reconfig process, the secure version of every inline-signed
zone was "moved" to a new view upon a reconfig and it "took the raw
version along", but only once the secure version was freed (at shutdown)
was prev_view for the raw version detached from, causing the old view to
be released as well.

This caused dangling references to be kept for the previous view, thus
keeping all resources used by that view in memory.

Merge branch 'mnowak/merge-skipped-and-untested-system-test-results' into 'main'

Merge UNTESTED and SKIPPED system test results

See merge request isc-projects/bind9!4517

Add CHANGES note for [GL !4517]

Do not build geoip_test when GeoIP is not available

Record skipped unit test as skipped in Automake framework

Merge UNTESTED and SKIPPED system test results

Descriptions of UNTESTED and SKIPPED system test results are very
similar to one another and it may be confusing when to pick one and
when the other. Merging these two system test results removes the
confusion and also makes system test more aligned with Automake,
which does not know about UNTESTED test result.

Record skipped test as skipped in testsuite summary

When system test execution was ported to Automake, SKIPPED and UNTESTED
system test result were not made to match Automake expectations,
therefore a skipped test is recorded by Automake as "PASS":

    $ make check TESTS=cpu V=1
    I:cpu:cpu test only runs on Linux, skipping test
    I:cpu:Prerequisites missing, skipping test.
    R:cpu:SKIPPED
    E:cpu:2020-12-16T11:36:58+0000
    PASS: cpu
    ====================================================================
    Testsuite summary for BIND 9.17.7
    ====================================================================
    # TOTAL: 1
    # PASS:  1

For a test to be recorded by Automake as skipped, the test, or it's test
driver, needs to exit with code 77:

    $ make check TESTS=cpu V=1
    I:cpu:cpu test only runs on Linux, skipping test
    I:cpu:Prerequisites missing, skipping test.
    R:cpu:SKIPPED
    E:cpu:2020-12-16T11:39:10+0000
    SKIP: cpu
    ====================================================================
    Testsuite summary for BIND 9.17.7
    ====================================================================
    # TOTAL: 1
    # PASS:  0
    # SKIP:  1

Merge branch '2443-cid-316608-memory-corruptions-overrun' into 'main'

Resolve "CID 316608: Memory - corruptions (OVERRUN)"

Closes #2443

See merge request isc-projects/bind9!4623

Address theoretical buffer overrun in recent change

The strlcat() call was wrong.

    *** CID 316608:  Memory - corruptions  (OVERRUN)
    /lib/dns/resolver.c: 5017 in fctx_create()
    5011      * Make fctx->info point to a copy of a formatted string
    5012      * "name/type".
    5013      */
    5014      dns_name_format(name, buf, sizeof(buf));
    5015      dns_rdatatype_format(type, typebuf, sizeof(typebuf));
    5016      p = strlcat(buf, "/", sizeof(buf));
    >>>     CID 316608:  Memory - corruptions  (OVERRUN)
    >>>     Calling "strlcat" with "buf + p" and "1036UL" is suspicious because "buf" points into a buffer of 1036 bytes and the function call may access "(char *)(buf + p) + 1035UL". [Note: The source code implementation of the function has been overridden by a builtin model.]
    5017      strlcat(buf + p, typebuf, sizeof(buf));
    5018      fctx->info = isc_mem_strdup(mctx, buf);
    5019
    5020      FCTXTRACE("create");
    5021      dns_name_init(&fctx->name, NULL);
    5022      dns_name_dup(name, mctx, &fctx->name);

Merge branch 'pspacek/ci-python-allthetime' into 'main'

Run Python linters in CI even outside of merge requests

See merge request isc-projects/bind9!4540

Run Python linters in CI even outside of merge requests

Previously it did not get run on scheduled CI pipelines.

Merge branch 'mnowak/check-for-unrecognized-options' into 'main'

Check for unrecognized configure options

See merge request isc-projects/bind9!4567

Add --enable-option-checking=fatal to ./configure in CI

The --enable-option-checking=fatal option prevents ./configure from
proceeding when an unknown option is used in the ./configure step in CI.
This change will avoid adding unsupported ./configure options or options
with typo or typo in pairwise testing "# [pairwise: ...]" marker.

Merge branch '2312-lint-generated-manual-pages' into 'main'

Lint manual pages

Closes #2312

See merge request isc-projects/bind9!4475

Lint manual pages

As we generate manual pages from reStructuredText sources, we don't have
absolute control on manual page output and therefore 'mandoc -Tlint' may
always report warnings we can't eliminate. In light of this some mandoc
warnings need to be ignored.

Build man pages when "make doc" is run

Man pages are currently only generated from reStructuredText sources
when "make man" is run in the doc/man/ directory. Tweak
doc/man/Makefile.am so that running "make doc" in the top-level
directory also causes man pages to be generated, so that all potential
documentation building problems can be detected by a single make
invocation.

Merge branch '2421-cid-316509-untrusted-value-as-argument-tainted_scalar' into 'main'

Resolve "CID 316509: Untrusted value as argument (TAINTED_SCALAR)"

Closes #2423 and #2421

See merge request isc-projects/bind9!4606

Silence Insecure data handling (TAINTED_SCALAR)

Coverity assumes that the memory holding any value read using byte
swapping is tainted.  As we store the NSEC3PARAM records in wire
form and iterations is byte swapped the memory holding the record
is marked as tainted.  nsec3->salt_length is marked as tainted
transitively. To remove the taint the value need to be range checked.
For a correctly formatted record region.length should match
nsec3->salt_length and provides a convenient value to check the field
against.

    *** CID 316507:  Insecure data handling  (TAINTED_SCALAR)
    /lib/dns/rdata/generic/nsec3param_51.c: 241 in tostruct_nsec3param()
    235      region.length = rdata->length;
    236      nsec3param->hash = uint8_consume_fromregion(&region);
    237      nsec3param->flags = uint8_consume_fromregion(&region);
    238      nsec3param->iterations = uint16_consume_fromregion(&region);
    239
    240      nsec3param->salt_length = uint8_consume_fromregion(&region);
    >>>     CID 316507:  Insecure data handling  (TAINTED_SCALAR)
    >>>     Passing tainted expression "nsec3param->salt_length" to "mem_maybedup", which uses it as an offset.
    241      nsec3param->salt = mem_maybedup(mctx, region.base,
    242      nsec3param->salt_length);
    243      if (nsec3param->salt == NULL) {
    244      return (ISC_R_NOMEMORY);
    245      }
    246      isc_region_consume(&region, nsec3param->salt_length);

Silence Untrusted value as argument (TAINTED_SCALAR)

Coverity assumes that the memory holding any value read using byte
swapping is tainted.  As we store the NSEC3 records in wire form
and iterations is byte swapped the memory holding the record is
marked as tainted.  nsec3->salt_length and nsec3->next_length are
marked as tainted transitively. To remove the taint the values need
to be range checked.  Valid values for these should never exceed
region.length so that is becomes a reasonable value to check against.

    *** CID 316509:    (TAINTED_SCALAR)
    /lib/dns/rdata/generic/nsec3_50.c: 312 in tostruct_nsec3()
    306      if (nsec3->salt == NULL) {
    307      return (ISC_R_NOMEMORY);
    308      }
    309      isc_region_consume(&region, nsec3->salt_length);
    310
    311      nsec3->next_length = uint8_consume_fromregion(&region);
    >>>     CID 316509:    (TAINTED_SCALAR)
    >>>     Passing tainted expression "nsec3->next_length" to "mem_maybedup", which uses it as an offset.
    312      nsec3->next = mem_maybedup(mctx, region.base, nsec3->next_length);
    313      if (nsec3->next == NULL) {
    314      goto cleanup;
    315      }
    316      isc_region_consume(&region, nsec3->next_length);
    317
    /lib/dns/rdata/generic/nsec3_50.c: 305 in tostruct_nsec3()
    299      region.length = rdata->length;
    300      nsec3->hash = uint8_consume_fromregion(&region);
    301      nsec3->flags = uint8_consume_fromregion(&region);
    302      nsec3->iterations = uint16_consume_fromregion(&region);
    303
    304      nsec3->salt_length = uint8_consume_fromregion(&region);
    >>>     CID 316509:    (TAINTED_SCALAR)
    >>>     Passing tainted expression "nsec3->salt_length" to "mem_maybedup", which uses it as an offset.
    305      nsec3->salt = mem_maybedup(mctx, region.base, nsec3->salt_length);
    306      if (nsec3->salt == NULL) {
    307      return (ISC_R_NOMEMORY);
    308      }
    309      isc_region_consume(&region, nsec3->salt_length);
    310

Merge branch 'mnowak/enable-libns-tests-to-run-under-asan' into 'main'

Drop AddressSanitizer constraint from libns unit tests

See merge request isc-projects/bind9!4622

Drop AddressSanitizer constraint from libns unit tests

The AddressSanitizer constraint in some libns unit tests does not seem
to be necessary anymore, these tests run fine under AddressSanitizer.

Merge branch '2460-incorrect-size-passed-to-isc_mem_put' into 'main'

Resolve "Incorrect size passed to isc_mem_put"

Closes #2460

See merge request isc-projects/bind9!4633

Add release note for [GL #2460]

Add CHANGES note for [GL #2460]

Fix wrong length passed to isc_mem_put

If an invalid key name (e.g. "a..b") in a primaries list in named.conf
is specified the wrong size is passed to isc_mem_put resulting in the
returned memory being put on the wrong freed list.

    *** CID 316784:  Incorrect expression  (SIZEOF_MISMATCH)
    /bin/named/config.c: 636 in named_config_getname()
    630      isc_buffer_constinit(&b, objstr, strlen(objstr));
    631      isc_buffer_add(&b, strlen(objstr));
    632      dns_fixedname_init(&fname);
    633      result = dns_name_fromtext(dns_fixedname_name(&fname), &b, dns_rootname,
    634         0, NULL);
    635      if (result != ISC_R_SUCCESS) {
       CID 316784:  Incorrect expression  (SIZEOF_MISMATCH)
       Passing argument "*namep" of type "dns_name_t *" and argument "8UL /* sizeof (*namep) */" to function "isc__mem_put" is suspicious.
    636      isc_mem_put(mctx, *namep, sizeof(*namep));
    637      *namep = NULL;
    638      return (result);
    639      }
    640      dns_name_dup(dns_fixedname_name(&fname), mctx, *namep);
    641

Merge branch '1810-refactor-ecdsa-eddsa-system-tests' into 'main'

Resolve "Refactor ecdsa and eddsa tests after testcrypto.sh changes"

Closes #1810

See merge request isc-projects/bind9!4645

Update copyrights for [#1810]

Refactor ecdsa system test

Similar to eddsa system test.

Enable eddsa test

It should be fixed now.

Refactor eddsa system test

Test for Ed25519 and Ed448. If both algorithms are not supported, skip
test. If only one algorithm is supported, run test, skip the
unsupported algorithm. If both are supported, run test normally.

Create new ns3. This will test Ed448 specifically, while now ns2 only
tests Ed25519. This moves some files from ns2/ to ns3/.

Fix testcrypto.sh

Testing Ed448 was actually testing Ed25519.

Merge branch 'mnowak/drop-kyua-references-in-.gitlab-ci.yml' into 'main'

Remove remnant Kyua references

See merge request isc-projects/bind9!4638

Remove remnant Kyua references

Unit tests were ported from Kyua to Automake. All references to Kyua
thus should be removed from the main branch.

Merge branch 'mnowak/check-asan-errors-in-configure' into 'main'

Check config.log for ASAN errors

See merge request isc-projects/bind9!4655

Check config.log for ASAN errors

./configure checks might produce a false negative error due to ASAN
errors and thus disable some options.

Merge branch '2434-fetch-limit-serve-stale-follow-up' into 'main'

Resolve "Serve stale when fetch limits are hit" (follow-up)

Closes #2434

See merge request isc-projects/bind9!4654

Adjust serve-stale test

The number of queries to use in the burst can be reduced, as we have
a very low fetch limit of 1.

The dig command in 'wait_for_fetchlimits()' should time out sooner as
we expect a SERVFAIL to be returned promptly.

Enabling serve-stale can be done before hitting fetch-limits. This
reduces the chance that the resolver queries time out and fetch count
is reset. The chance of that happening is already slim because
'resolver-query-timeout' is 10 seconds, but better to first let the
data become stale rather than doing that while attempting to resolve.

Use stale on error also when unable to recurse

The 'query_usestale()' function was only called when in
'query_gotanswer()' and an unexpected error occurred. This may have
been "quota reached", and thus we were in some cases returning
stale data on fetch-limits (and if serve-stale enabled of course).

But we can also hit fetch-limits when recursing because we are
following a referral (in 'query_notfound()' and
'query_delegation_recurse()'). Here we should also check for using
stale data in case an error occurred.

Specifically don't check for using stale data when refetching a
zero TTL RRset from cache.

Move the setting of DNS_DBFIND_STALESTART into the 'query_usestale()'
function to avoid code duplication.

Merge branch '2469-cid-281461-untrusted-loop-bound' into 'main'

Resolve "CID 281461: untrusted loop bound"

Closes #2469

See merge request isc-projects/bind9!4642

Attempt to silence untrusted loop bound

Assign hit_len + key_len to len and test the result
rather than recomputing and letting the compiler simplify.

    213        isc_region_consume(&region, 2); /* hit length + algorithm */
        9. tainted_return_value: Function uint16_fromregion returns tainted data. [show details]
        10. tainted_data_transitive: Call to function uint16_fromregion with tainted argument *region.base returns tainted data.
        11. tainted_return_value: Function uint16_fromregion returns tainted data.
        12. tainted_data_transitive: Call to function uint16_fromregion with tainted argument *region.base returns tainted data.
        13. var_assign: Assigning: key_len = uint16_fromregion(&region), which taints key_len.
    214        key_len = uint16_fromregion(&region);
        14. lower_bounds: Casting narrower unsigned key_len to wider signed type int effectively tests its lower bound.
        15. Condition key_len == 0, taking false branch.
    215        if (key_len == 0) {
    216                RETERR(DNS_R_FORMERR);
    217        }
        16. Condition !!(_r->length >= _l), taking true branch.
        17. Condition !!(_r->length >= _l), taking true branch.
    218        isc_region_consume(&region, 2);
        18. lower_bounds: Casting narrower unsigned key_len to wider signed type int effectively tests its lower bound.
        19. Condition region.length < (unsigned int)(hit_len + key_len), taking false branch.
    219        if (region.length < (unsigned)(hit_len + key_len)) {
    220                RETERR(DNS_R_FORMERR);
    221        }
    222
        20. lower_bounds: Casting narrower unsigned key_len to wider signed type int effectively tests its lower bound.
        21. Condition _r != 0, taking false branch.
    223        RETERR(mem_tobuffer(target, rr.base, 4 + hit_len + key_len));
        22. lower_bounds: Casting narrower unsigned key_len to wider signed type int effectively tests its lower bound.
        23. var_assign_var: Compound assignment involving tainted variable 4 + hit_len + key_len to variable source->current taints source->current.
    224        isc_buffer_forward(source, 4 + hit_len + key_len);
    225
    226        dns_decompress_setmethods(dctx, DNS_COMPRESS_NONE);

    CID 281461 (#1 of 1): Untrusted loop bound (TAINTED_SCALAR)
        24. tainted_data: Using tainted variable source->active - source->current as a loop boundary.
    Ensure that tainted values are properly sanitized, by checking that their values are within a permissible range.
    227        while (isc_buffer_activelength(source) > 0) {
    228                dns_name_init(&name, NULL);
    229                RETERR(dns_name_fromwire(&name, source, dctx, options, target));
    230        }

Merge branch 'mnowak/check-arm-pdf-validity' into 'main'

Check PDF file structure with QPDF

See merge request isc-projects/bind9!4620

Check PDF file structure with QPDF

"qpdf --check" checks file structure of generated ARM PDF.

Merge branch '2377-allow-a-records-below-an-_spf-label-as-a-check-names-exception' into 'main'

Resolve "Allow A records below an '_spf' label as a check-names exception"

Closes #2377

See merge request isc-projects/bind9!4529

Add release note entry

Add CHANGES

Check that A record is accepted with _spf label present

Allow A records below '_spf' labels as recommend by RFC7208

Merge branch '2375-dnssec-policy-three-is-a-crowd-rollover-bug' into 'main'

Resolve "three is a crowd" dnssec-policy key rollover bug

Closes #2375

See merge request isc-projects/bind9!4541

Add kasp test todo for [#2375]

This bugfix has been manually verified but is missing a unit test.
Created GL #2471 to track this.

Use NUM_KEYSTATES constant where appropriate

We use the number 4 a lot when working on key states. Better to use
the NUM_KEYSTATES constant instead.

Add change and release note for [#2375]

News worthy.

Cleanup keymgr.c

Three small cleanups:

1. Remove an unused keystr/dst_key_format.
2. Initialize a dst_key_state_t state with NA.
3. Update false comment about local policy (local policy only adds
barrier on transitions to the RUMOURED state, not the UNRETENTIVE
state).

Fix DS/DNSKEY hidden or chained functions

There was a bug in function 'keymgr_ds_hidden_or_chained()'.

The funcion 'keymgr_ds_hidden_or_chained()' implements (3e) of rule2
as defined in the "Flexible and Robust Key Rollover" paper. The rules
says: All DS records need to be in the HIDDEN state, or if it is not
there must be a key with its DNSKEY and KRRSIG in OMNIPRESENT, and
its DS in the same state as the key in question. In human langauge,
if all keys have their DS in HIDDEN state you can do what you want,
but if a DS record is available to some validators, there must be
a chain of trust for it.

Note that the barriers on transitions first check if the current
state is valid, and then if the next state is valid too. But
here we falsely updated the 'dnskey_omnipresent' (now 'dnskey_chained')
with the next state. The next state applies to 'key' not to the state
to be checked. Updating the state here leads to (true) always, because
the key that will move its state will match the falsely updated
expected state. This could lead to the assumption that Key 2 would be
a valid chain of trust for Key 1, while clearly the presence of any
DS is uncertain.

The fix here is to check if the DNSKEY and KRRSIG are in OMNIPRESENT
state for the key that does not have its DS in the HIDDEN state, and
only if that is not the case, ensure that there is a key with the same
algorithm, that provides a valid chain of trust, that is, has its
DNSKEY, KRRSIG, and DS in OMNIPRESENT state.

The changes in 'keymgr_dnskey_hidden_or_chained()' are only cosmetical,
renaming 'rrsig_omnipresent' to 'rrsig_chained' and removing the
redundant initialization of the DST_KEY_DNSKEY expected state to NA.

Update keymgr_key_is_successor() calls

The previous commit changed the function definition of
'keymgr_key_is_successor()', this commit updates the code where
this function is called.

In 'keymgr_key_exists_with_state()' the logic is also updated slightly
to become more readable. First handle the easy cases:
- If the key does not match the state, continue with the next key.
- If we found a key with matching state, and there is no need to
check the successor relationship, return (true).
- Otherwise check the successor relationship.

In 'keymgr_key_has_successor()' it is enough to check if a key has
a direct successor, so instead of calling 'keymgr_key_is_successor()',
we can just check 'keymgr_direct_dep()'.

In 'dns_keymgr_run()', we want to make sure that there is no
dependency on the keys before retiring excess keys, so replace
'keymgr_key_is_successor()' with 'keymgr_dep()'.

Implement Equation(2) of "Flexible Key Rollover"

So far the key manager could only deal with two keys in a rollover,
because it used a simplified version of the successor relationship
equation from "Flexible and Robust Key Rollover" paper. The simplified
version assumes only two keys take part in the key rollover and it
for that it is enough to check the direct relationship between two
keys (is key x the direct predecessor of key z and is key z the direct
successor of key x?).

But when a third key (or more keys) comes into the equation, the key
manager would assume that one key (or more) is redundant and removed
it from the zone prematurely.

Fix by implementing Equation(2) correctly, where we check for
dependencies on keys:

z ->T x: Dep(x, T) = ∅ ∧
         (x ∈ Dep(z, T) ∨
          ∃ y ∈ Dep(z, T)(y != z ∧ y ->T x ∧ DyKyRySy = DzKzRzSz))

This says: key z is a successor of key x if:
- key x depends on key z if z is a direct successor of x,
- or if there is another key y that depends on key z that has identical
  key states as key z and key y is a successor of key x.
- Also, key x may not have any other keys depending on it.

This is still a simplified version of Equation(2) (but at least much
better), because the paper allows for a set of keys to depend on a
key. This is defined as the set Dep(x, T). Keys in the set Dep(x, T)
have a dependency on key x for record type T. The BIND implementation
can only have one key in the set Dep(x, T). The function
'keymgr_dep()' stores this key in 'uint32_t *dep' if there is a
dependency.

There are two scenarios where multiple keys can depend on a single key:

1. Rolling keys is faster than the time required to finish the
   rollover procedure. This scenario is covered by the recursive
   implementation, and checking for a chain of direct dependencies
   will suffice.

2. Changing the policy, when a zone is requested to be signed with
   a different key length for example. BIND 9 will not mark successor
   relationships in this case, but tries to move towards the new
   policy. Since there is no successor relationship, the rules are
   even more strict, and the DNSSEC reconfiguration is actually slower
   than required.

Note: this commit breaks the build, because the function definition
of 'keymgr_key_is_successor' changed. This will be fixed in the
following commit.

Merge branch '2468-cid-318094-null-pointer-dereferences-reverse_inull' into 'main'

Resolve "CID 318094: Null pointer dereferences (REVERSE_INULL)"

Closes #2468

See merge request isc-projects/bind9!4641

Remove redundant 'version == NULL' check

    *** CID 318094:  Null pointer dereferences  (REVERSE_INULL)
    /lib/dns/rbtdb.c: 1389 in newversion()
    1383      version->xfrsize = rbtdb->current_version->xfrsize;
    1384      RWUNLOCK(&rbtdb->current_version->rwlock, isc_rwlocktype_read);
    1385      rbtdb->next_serial++;
    1386      rbtdb->future_version = version;
    1387      RBTDB_UNLOCK(&rbtdb->lock, isc_rwlocktype_write);
    1388
       CID 318094:  Null pointer dereferences  (REVERSE_INULL)
       Null-checking "version" suggests that it may be null, but it has already been dereferenced on all paths leading to the check.
    1389      if (version == NULL) {
    1390      return (result);
    1391      }
    1392
    1393      *versionp = version;
    1394

Merge branch '1144-dns-over-https-server' into 'main'

Resolve "Encrypted DNS - RFC 8484, DNS over HTTPS, DOH (also DoT comments)"

Closes #1144

See merge request isc-projects/bind9!4644

CHANGES, release notes

Drop gcc:sid:i386 from GitLab CI

Building sid-i386 in Docker no longer works and we don't have a viable
alternative now, so dropping gcc:sid:i386 is our only option in this
very moment.

support "tls ephemeral" with https

tls and http configuration code was unnecessarily complex

removed the isc_cfg_http_t and isc_cfg_tls_t structures
and the functions that loaded and accessed them; this can
be done using normal config parser functions.

Unit-test fixes and manual page updates for DoH configuration

This commit contains fixes to unit tests to make them work well on
various platforms (in particular ones shipping old versions of
OpenSSL) and for different configurations.

It also updates the generated manpage to include DoH configuration
options.

Initial support for DNS-over-HTTP(S)

This commit completes the support for DNS-over-HTTP(S) built on top of
nghttp2 and plugs it into the BIND. Support for both GET and POST
requests is present, as required by RFC8484.

Both encrypted (via TLS) and unencrypted HTTP/2 connections are
supported. The latter are mostly there for debugging/troubleshooting
purposes and for the means of encryption offloading to third-party
software (as might be desirable in some environments to simplify TLS
certificates management).

nghttp2-based HTTP layer in netmgr

This commit includes work-in-progress implementation of
DNS-over-HTTP(S).

Server-side code remains mostly untested, and there is only support
for POST requests.

Add isc_mem_strndup() function

This commit adds an implementation of strndup() function which
allocates memory from the supplied isc_mem_t memory context.

update ARM with "http" grammar

add a link to the http statement grammar and explanations and examples
for configuring DoH listeners.

Add parser support for DoH configuration options

This commit adds stub parser support and tests for:
- an "http" global option for HTTP/2 endpoint configuration.
- command line options to set http or https port numbers by
  specifying -p http=PORT or -p https=PORT.  (NOTE: this change
  only affects syntax; specifying HTTP and HTTPS ports on the
  command line currently has no effect.)
- named.conf options "http-port" and "https-port"
- HTTPSPORT environment variable for use when running tests.

Resurrect old TLS code

This commit resurrects the old TLS code from
8f73c70d23e26954165fd44ce5617a95f112bcff.

It also includes numerous stability fixes and support for
isc_nm_cancelread() for the TLS layer.

The code was resurrected to be used for DoH.

Merge branch '2448-tweak-sphinx-build-invocations' into 'main'

Tweak sphinx-build invocations

Closes #2448

See merge request isc-projects/bind9!4640

Make sphinx-build warnings fatal

In order to prevent documentation building issues from being glossed
over, pass the -W command line switch to all sphinx-build invocations.
This causes the latter to return with a non-zero exit code whenever any
Sphinx warnings are triggered.

Address a Sphinx duplicate label warning

Both doc/man/ddns-confgen.rst and doc/man/tsig-keygen.rst include
bin/confgen/tsig-keygen.rst, which defines a "man_tsig-keygen" label.
This triggers the following warning when running sphinx-build with the
-W command line switch in the doc/man/ directory:

    ../../bin/confgen/tsig-keygen.rst:27: WARNING: duplicate label man_tsig-keygen, other instance in /tmp/bind9/doc/man/ddns-confgen.rst

Move the offending label from bin/confgen/tsig-keygen.rst to the proper
spot in doc/arm/manpages.rst to avoid effectively defining it twice in
different source documents while still allowing the relevant man page to
be referenced in the ARM.  Also rename that label so that it more
closely matches the content it points to.  As the label no longer
immediately precedes a section title in its new location, use
:ref:`Title <label>` syntax for the only reference to the
tsig-keygen/ddns-confgen man page in the ARM.

Use separate sphinx-build cache directories

Simultaneously starting multiple sphinx-build instances with the -d
command line switch set to a common value (which is what happens when
e.g. "make -j6 doc" is run) causes intermittent problems which we failed
to notice before because they only trigger Sphinx warnings, not errors,
e.g.:

WARNING: toctree contains ref to nonexisting file 'reference'

The message above is not triggered because doc/arm/reference.rst is
actually missing from disk at any point, but rather because a temporary
file created by one sphinx-build instance gets truncated by another one
working in parallel (the confusing message quoted above is logged
because of an overly broad "except" statement in Sphinx code).

Prevent this problem from being triggered by making each sphinx-build
process use its own dedicated cache directory.

Merge branch '2406-kasp-init-inactive-delete-metadata' into 'main'

Resolve "kasp: look at Inactive/Delete when initializing state files"

Closes #2406

See merge request isc-projects/bind9!4599

Remove initialize goal code

Since keys now have their goals initialized in 'keymgr_key_init()',
remove this redundant piece of code in 'keymgr_key_run()'.

Correctly initialize old key with state file

The 'key_init()' function is used to initialize a state file for keys
that don't have one yet. This can happen if you are migrating from a
'auto-dnssec' or 'inline-signing' to a 'dnssec-policy' configuration.

It did not look at the "Inactive" and "Delete" timing metadata and so
old keys left behind in the key directory would also be considered as
a possible active key. This commit fixes this and now explicitly sets
the key goal to OMNIPRESENT for keys that have their "Active/Publish"
timing metadata in the past, but their "Inactive/Delete" timing
metadata in the future. If the "Inactive/Delete" timing metadata is
also in the past, the key goal is set to HIDDEN.

If the "Inactive/Delete" timing metadata is in the past, also the
key states are adjusted to either UNRETENTIVE or HIDDEN, depending on
how far in the past the metadata is set.

Update legacy-keys kasp test

The 'legacy-keys.kasp' test checks that a zone with key files but not
yet state files is signed correctly. This test is expanded to cover
the case where old key files still exist in the key directory. This
covers bug #2406 where keys with the "Delete" timing metadata are
picked up by the keymgr as active keys.

Fix the 'legacy-keys.kasp' test, by creating the right key files
(for zone 'legacy-keys.kasp', not 'legacy,kasp').

Use a unique policy for this zone, using shorter lifetimes.

Create two more keys for the zone, and use 'dnssec-settime' to set
the timing metadata in the past, long enough ago so that the keys
should not be considered by the keymgr.

Update the 'key_unused()' test function, and consider keys with
their "Delete" timing metadata in the past as unused.

Extend the test to ensure that the keys to be used are not the old
predecessor keys (with their "Delete" timing metadata in the past).

Update the test so that the checks performed are consistent with the
newly configured policy.

Merge branch '1697-isc_rwlock_init-can-no-longer-fail-in-master-clean-up-calls' into 'main'

Resolve "isc_rwlock_init can no longer fail in master, clean up calls."

Closes #2462 and #1697

See merge request isc-projects/bind9!4635

Cleanup redundant isc_rwlock_init() result checks

Merge branch '2444-call-freeaddrinfo-in-test_client' into 'main'

Fix addrinfo leak in test_client.c

Closes #2444

See merge request isc-projects/bind9!4629

Fix addrinfo leak in test_client.c

The addrinfo we got from getaddrinfo() was never freed.

Merge branch '2392-xot-xfrin' into 'main'

Add support for incoming tranfers via XoT

Closes #2392

See merge request isc-projects/bind9!4571

CHANGES and release notes

implement xfrin via XoT

Add support for a "tls" key/value pair for zone primaries, referencing
either a "tls" configuration statement or "ephemeral". If set to use
TLS, zones will send SOA and AXFR/IXFR queries over a TLS channel.

Merge branch '2442-tsan-error-lib-dns-rbtdb-c' into 'main'

Resolve "TSAN error: lib/dns/rbtdb.c"

Closes #2442

See merge request isc-projects/bind9!4609

Fix race condition on check_stale_header

This commit fix a race that could happen when two or more threads have
failed to refresh the same RRset, the threads could simultaneously
attempt to update the header->last_refresh_fail_ts field in
check_stale_header, a field used to implement stale-refresh-time.

By making this field atomic we avoid such race.

Merge branch '2434-fetch-limit-serve-stale' into 'main'

Resolve "Serve stale when fetch limits are hit"

Closes #2434

See merge request isc-projects/bind9!4607

Add notes and change entry for [#2434]

This concludes the serve-stale improvements.

Add test for serve-stale /w fetch-limits

Add a test case when fetch-limits are reached and we have stale data
in cache.

This test starts with a positive answer for 'data.example/TXT' in
cache.

1. Reload named.conf to set fetch limits.
2. Disable responses from the authoritative server.
3. Now send a batch of queries to the resolver, until hitting the
   fetch limits. We can detect this by looking at the response RCODE,
   at some point we will see SERVFAIL responses.
4. At that point we will turn on serve-stale.
5. Clients should see stale answers now.
6. An incoming query should not set the stale-refresh-time window,
   so a following query should still get a stale answer because of a
   resolver failure (and not because it was in the stale-refresh-time
   window).

Only start stale refresh window when resuming

If we did not attempt a fetch due to fetch-limits, we should not start
the stale-refresh-time window.

Introduce a new flag DNS_DBFIND_STALESTART to differentiate between
a resolver failure and unexpected error. If we are resuming, this
indicates a resolver failure, then start the stale-refresh-time window,
otherwise don't start the stale-refresh-time window, but still fall
back to using stale data.

(This commit also wraps some docstrings to 80 characters width)

Use stale data also if we are not resuming

Before this change, BIND will only fallback to using stale data if
there was an actual attempt to resolve the query. Then on a timeout,
the stale data from cache becomes eligible.

This commit changes this so that on any unexpected error stale data
becomes eligble (you would still have to have 'stale-answer-enable'
enabled of course).

If there is no stale data, this may return in an error again, so don't
loop on stale data lookup attempts. If the DNS_DBFIND_STALEOK flag is
set, this means we already tried to lookup stale data, so if that is
the case, don't use stale again.