git.ipfire.org Git - thirdparty/bind9.git/log

Use Debian "sid" for pylint and mypy jobs to get recent dnspython

The base image tends to have a rather old dnspython version and when
used with pylint and mypy it produces errors about newer dnspython
features the old version does not know about.

    $ mypy "bin/tests/system/isctest/"
    bin/tests/system/isctest/query.py:55: error: Unexpected keyword argument "verify" for "tls"  [call-arg]
    /usr/lib/python3/dist-packages/dns/query.py:958: note: "tls" defined here

    $ pylint --rcfile $CI_PROJECT_DIR/.pylintrc --disable=wrong-import-position $(git ls-files 'bin/tests/system/*.py' | grep -vE 'ans\.py')
    ************* Module isctest.query
    bin/tests/system/isctest/query.py:55:11: E1123: Unexpected keyword argument 'verify' in function call (unexpected-keyword-arg)

Add isctest.query.tls() function

When explicitly set to True, the "verify" argument lets dnspython verify
certificates used for the connection. As most certificates in the system
test will inevitably be self-signed, the "verify" argument defaults to
False.

The "verify" argument has been available in dnspython since version 2.5.0.
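A helper that must also run against an older dnspython could guard the
argument with a version comparison. The sketch below is illustrative only:
parse_version() and supports_verify() are hypothetical names, and the only
fact taken from this commit message is the 2.5.0 threshold.

```python
def parse_version(version: str) -> tuple:
    """Turn '2.6.1' into (2, 6, 1); suffixes such as 'rc1' are dropped."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad so that '2.5' compares equal to '2.5.0'.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def supports_verify(dnspython_version: str) -> bool:
    # Hypothetical helper: True when the given dnspython version is new
    # enough to accept the "verify" keyword (2.5.0 per the commit message).
    return parse_version(dnspython_version) >= (2, 5, 0)
```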

Add "without_fips" mark

The "without_fips" mark disables a test function when BIND 9 was built
with FIPS mode enabled, as not everything works in FIPS-enabled
builds.

rem: dev: Clean up unused result codes

A number of result codes are obsolete and can be removed. Others, including `ISC_R_NOMEMORY`, are still checked in various places even though they can't occur any longer. These have been cleaned up.

Merge branch 'each-cleanup-results' into 'main'

See merge request isc-projects/bind9!9942

deduplicate result codes

ISCCC_R_SYNTAX, ISCCC_R_EXPIRED, and ISCCC_R_CLOCKSKEW have the
same usage and text formats as DNS_R_SYNTAX, DNS_R_EXPIRED and
DNS_R_CLOCKSKEW respectively. this was originally done because
result codes were defined in separate libraries, and some tool
might be linked with libisccc but not libdns. as the result codes
are now defined in only one place, there's no need to retain the
duplicates.

clean up result codes that are never used

the following result codes are obsolete and have been removed
from result.h and result.c:

        - ISC_R_NOTHREADS
        - ISC_R_BOUND
        - ISC_R_NOTBOUND
        - ISC_R_NOTDIRECTORY
        - ISC_R_EMPTY
        - ISC_R_NOTBLOCKING
        - ISC_R_INPROGRESS
        - ISC_R_WOULDBLOCK

        - DNS_R_TOOMANYHOPS
        - DNS_R_NOREDATA
        - DNS_R_BADCKSUM
        - DNS_R_MOREDATA
        - DNS_R_NOVALIDDS
        - DNS_R_UNKNOWNOPT
        - DNS_R_NOVALIDKEY
        - DNS_R_NTACOVERED

        - DST_R_COMPUTESECRETFAILURE
        - DST_R_NORANDOMNESS
        - DST_R_NOCRYPTO

clean up uses of DST_R_NOCRYPTO

building BIND without crypto support is no longer possible.
consequently this result code is never sent, and therefore we
don't need code in calling functions to handle it.

clean up uses of ISC_R_NOMEMORY

the isc_mem allocation functions can no longer fail; as a result,
ISC_R_NOMEMORY is now rarely used: only when an external library
such as libjson-c or libfstrm could return NULL. (even in
these cases, arguably we should assert rather than returning
ISC_R_NOMEMORY.)

code and comments that mentioned ISC_R_NOMEMORY have been
cleaned up, and the following functions have been changed to
type void, since (in most cases) the only value they could
return was ISC_R_SUCCESS:

- dns_dns64_create()
- dns_dyndb_create()
- dns_ipkeylist_resize()
- dns_kasp_create()
- dns_kasp_key_create()
- dns_keystore_create()
- dns_order_create()
- dns_order_add()
- dns_peerlist_new()
- dns_tkeyctx_create()
- dns_view_create()
- dns_zone_setorigin()
- dns_zone_setfile()
- dns_zone_setstream()
- dns_zone_getdbtype()
- dns_zone_setjournal()
- dns_zone_setkeydirectory()
- isc_lex_openstream()
- isc_portset_create()
- isc_symtab_create()

(the exception is dns_view_create(), which could have returned
other error codes in the event of a crypto library failure when
calling isc_file_sanitize(), but that should be a RUNTIME_CHECK
anyway.)

chg: ci: Set stricter limits for respdiff testing

Adjust the limit of maximum disagreements in respdiff results based on
recent pipeline results.

The respdiff and respdiff:asan seem to have almost identical results,
typically around 0.07 % of differences with occasional spikes up to
around 0.11 %. Similar results are for respdiff:tsan, perhaps with more
common spikes with values up to around 0.12 %. Set the limit to 0.15 %
to allow for some tolerance due to network conditions, time of day etc.

The respdiff:third-party has a slightly higher disagreements average,
with typical values being around 0.12 %. Set the limit to 0.2 %.

Exceeding either of those values should be a fairly clear indication that
some resolution behaviour has changed, since the values appear to be
very stable within the newly configured limits.
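The check these limits imply can be sketched as follows. The percentages
come from this commit message; the table, the function names, and the
assumption that respdiff, respdiff:asan and respdiff:tsan share the 0.15 %
limit are illustrative and not taken from the CI configuration.

```python
# Maximum allowed disagreements, in percent of all compared answers.
LIMITS_PERCENT = {
    "respdiff": 0.15,
    "respdiff:asan": 0.15,
    "respdiff:tsan": 0.15,          # assumed to share the 0.15 % limit
    "respdiff:third-party": 0.2,
}


def disagreement_percent(disagreements: int, total: int) -> float:
    """Share of answers that differ, expressed in percent."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * disagreements / total


def within_limit(job: str, disagreements: int, total: int) -> bool:
    """True when the job's disagreement rate does not exceed its limit."""
    return disagreement_percent(disagreements, total) <= LIMITS_PERCENT[job]
```

With 10000 compared answers, 7 disagreements (0.07 %) pass the respdiff
limit, while 16 (0.16 %) would fail it.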

Merge branch 'nicki/ci-respdiff-limits' into 'main'

See merge request isc-projects/bind9!9950

Set stricter limits for respdiff testing

Adjust the limit of maximum disagreements in respdiff results based on
recent pipeline results.

The respdiff and respdiff:asan seem to have almost identical results,
typically around 0.07 % of differences with occasional spikes up to
around 0.11 %. Similar results are for respdiff:tsan, perhaps with more
common spikes with values up to around 0.12 %. Set the limit to 0.15 %
to allow for some tolerance due to network conditions, time of day etc.

The respdiff:third-party has a slightly higher disagreements average,
with typical values being around 0.12 %. Set the limit to 0.2 %.

Exceeding either of those values should be a fairly clear indication that
some resolution behaviour has changed, since the values appear to be
very stable within the newly configured limits.

chg: doc: Document how secondaries refresh a zone in the ARM

Closes #5123

Merge branch '5123-document-refreshing-a-secondary' into 'main'

See merge request isc-projects/bind9!9966

Document how secondaries refresh a zone in the ARM

We have a KB article that describes this; put a condensed version into
the ARM.

fix: doc: Clarify dnssec-signzone interval option

There was confusion about whether the interval was calculated from
the validity period provided on the command line (with -s and -e),
or from the signature being replaced.

Add text to clarify that the interval is calculated from the new
validity period.

Closes #5128

Merge branch '5128-clarify-dnssec-signzone-interval' into 'main'

See merge request isc-projects/bind9!9955

Clarify dnssec-signzone interval option

There was confusion about whether the interval was calculated from
the validity period provided on the command line (with -s and -e),
or from the signature being replaced.

Add text to clarify that the interval is calculated from the new
validity period.

fix: usr: Fix a bug in dnssec-signzone related to keys being offline

In the case when `dnssec-signzone` is called on an already signed zone, and the private key file is unavailable, a signature that needs to be refreshed may be dropped without being able to generate a replacement. This has been fixed.

Closes #5126

Merge branch '5126-dnssec-signzone-retain-rrsig-if-key-is-offline' into 'main'

See merge request isc-projects/bind9!9951

dnssec-signzone retain signature if key is offline

Track inside the dns_dnsseckey structure whether we have seen the
private key, or if this key only has a public key file.

If the key only has a public key file, or a DNSKEY reference in the
zone, mark the key 'pubkey'. In dnssec-signzone, if the key only
has a public key available, consider the key to be offline. Retain any
signature that should be refreshed but whose key is not available.

So in the code, 'expired' becomes 'refresh', and the new 'expired'
is only used to determine whether we need to keep the signature if
the corresponding key is not available (retaining the signature if
it is not expired).
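The resulting decision can be sketched as a small truth table. This is an
illustration and not the dnssec-signzone code: the Action names and the
decide() helper are hypothetical, and dropping a signature that is both
expired and lacks its key is inferred from "retaining the signature if it
is not expired".

```python
from enum import Enum


class Action(Enum):
    KEEP = "keep"        # signature is still fresh, nothing to do
    RESIGN = "resign"    # key is available, generate a replacement
    RETAIN = "retain"    # key is offline, keep the old signature
    DROP = "drop"        # key is offline and the signature has expired


def decide(refresh: bool, expired: bool, key_offline: bool) -> Action:
    """Pick what to do with one existing signature."""
    if not refresh:
        return Action.KEEP
    if not key_offline:
        return Action.RESIGN
    # Key is offline: only a signature that has not expired is worth keeping.
    if not expired:
        return Action.RETAIN
    return Action.DROP
```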

In the 'keysthatsigned' function, we can remove:
- key->force_publish = false;
- key->force_sign = false;

because they are redundant ('dns_dnsseckey_create' already sets these
values to false).

Test dnssec-signzone with private key file missing

Add a test case for the scenario below.

There is a case when signing a zone with dnssec-signzone where the
private key file is moved outside the key directory (for offline
ksk purposes), and then the zone is resigned. The signature of the
DNSKEY needs refreshing, but is not expired.

Rather than removing the signature without having a valid replacement,
leave the signature in the zone (even though it needs to be refreshed).

fix: dev: Fix possible truncation in dns_keymgr_status()

If the generated status output exceeds 4096, it was silently truncated. Now we report that the status was truncated.

Closes #4180

Merge branch '4180-possible-truncation-in-dns_keymgr_status' into 'main'

See merge request isc-projects/bind9!9905

Fix possible truncation in dns_keymgr_status()

If the generated status output exceeds 4096, it was silently truncated.
Now we report that the status was truncated.

fix: usr: Yaml string not terminated in negative response in delv

Closes #5098

Merge branch '5098-missing-yaml-string-termination-delv' into 'main'

See merge request isc-projects/bind9!9922

Check delv +yaml negative response output

Terminate yaml string after negative comment

new: usr: Add support for multiple extended DNS errors

The Extended DNS Error (EDE) mechanism may raise several errors during a DNS resolution. `named` is now able to add up to three EDE codes in a DNS response. In the case of duplicate error codes, only the first one will be part of the DNS response.

Closes #5085

Merge branch '5085-multiple-ede' into 'main'

See merge request isc-projects/bind9!9952

add unit tests covering multiple EDE support

add support for multiple EDE

The Extended DNS Error (EDE) mechanism allows several EDEs to be raised
during a DNS resolution (typically, a DNSSEC query will do multiple
fetches, each of which can produce an error). Add support for up to 3
EDEs in a DNS response. If duplicates occur (two EDEs with the
same code; the extra text is not compared), only the first one will be
part of the DNS answer.

Because the maximum number of EDEs is statically fixed, the `ns_client_t`
object owns a static vector of `DNS_EDE_MAX_ERRORS` slots (instead of a
linked list, for instance). The array can be fully filled (all slots point
to an allocated `dns_ednsopt_t` object), partially filled, or empty. In
the latter two cases, the first NULL slot marks the end of the EDE
objects.
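The slot behaviour described here can be modelled in a few lines. This is
a sketch under stated assumptions and not the named implementation: the
EdeSlots class and its methods are hypothetical, with None standing in for
a NULL slot.

```python
MAX_ERRORS = 3  # "up to 3" EDEs per response, as stated in the commit message


class EdeSlots:
    """Fixed-size vector of EDE entries; the first None marks the end."""

    def __init__(self):
        self.slots = [None] * MAX_ERRORS

    def add(self, code: int, text: str = "") -> bool:
        """Store an EDE; return False for a duplicate code or a full vector."""
        for i, slot in enumerate(self.slots):
            if slot is None:
                self.slots[i] = (code, text)
                return True
            if slot[0] == code:
                # Duplicate: only the code is compared, not the extra text.
                return False
        return False  # every slot is already in use

    def entries(self) -> list:
        """Return the stored EDEs, stopping at the first empty slot."""
        out = []
        for slot in self.slots:
            if slot is None:
                break
            out.append(slot)
        return out
```

A fixed vector keeps insertion and duplicate detection to one short loop
and needs no list bookkeeping, which fits a statically known upper bound.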

chg: dev: Use a suitable response in tcp_connected() when initiating a read

When 'ISC_R_TIMEDOUT' is received in 'tcp_recv()', it times out the
oldest response in the active responses queue, and only after that it
checks whether other active responses have also timed out. So when
setting a timeout value for a read operation after a successful
connection, it makes sense to take the timeout value from the oldest
response in the active queue too, because, theoretically, the responses
can have different timeout values, e.g. when the TCP dispatch is shared.
Currently 'resp' is always NULL. Previously when connect and read timeouts
were not separated in dispatch this affected only logging, but now since
we are setting a new timeout after a successful connection, we need to
choose a suitable response from the active queue.

Merge branch 'aram/dispatch-tcp_connected-fix' into 'main'

See merge request isc-projects/bind9!9927

Clean up fctx->next_timeout

Since the support for non-zero values of stale-answer-client-timeout
was removed in bd7463914fe6375e3e9157f305c60d0172f2b312, 'next_timeout'
is unused. Clean it up.

Adjust the resolver-query-timeout test

Since the read timeout now works, the resolver times out at the
dispatch level instead of via the "hung fetch" timer, and so the
EDE value in 'fctx_expired()' is not being set. Remove the expected
EDE value from the test.

Fix rtt calculation bug for TCP in the resolver

When TCP is used, 'fctx_query()' adds one second to the rtt
(round-trip time) value, but there's a bug: the decision
about using TCP is made only after the calculation. Move the
block of code which looks up the peers list to decide
whether to use TCP to a place before the rtt calculation
is performed. This commit doesn't add or remove any code, it just
moves the code and adds a comment block.

Use a suitable response in tcp_connected() when initiating a read

When 'ISC_R_TIMEDOUT' is received in 'tcp_recv()', it times out the
oldest response in the active responses queue, and only after that it
checks whether other active responses have also timed out. So when
setting a timeout value for a read operation after a successful
connection, it makes sense to take the timeout value from the oldest
response in the active queue too, because, theoretically, the responses
can have different timeout values, e.g. when the TCP dispatch is shared.
Currently 'resp' is always NULL. Previously when connect and read
timeouts were not separated in dispatch this affected only logging, but
now since we are setting a new timeout after a successful connection,
we need to choose a suitable response from the active queue.

fix: usr: Avoid unnecessary locking in the zone/cache database

Prevent lock contention among many worker threads referring to the same database node at the same time. This improves zone and cache database performance for heavily contended database nodes.

Closes #5130

Merge branch '5130-reduce-lock-contention-in-decrement-reference' into 'main'

See merge request isc-projects/bind9!9963

Optimize database decref by avoiding locking with refs > 1

Previously, this function always acquired a node write lock, because it
might need to clean up the node if the reference count dropped to 0. In
fact, the lock is unnecessary if the reference count is larger than 1, and
that can be optimized as an "easy" case. This optimization could even be
"necessary". In some extreme cases, many worker threads could repeatedly
acquire and release the reference on the same node, resulting in
severe lock contention for nothing (as the ref wouldn't decrement to 0
in most cases). This change prevents noticeable performance
drops, such as query timeouts, in such cases.
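The "easy" case can be sketched as a compare-and-swap loop that only falls
back to the node lock when the count might reach 0. This is a model and not
the database code: AtomicInt emulates an atomic integer with a private
lock, and Node, decref() and the counters are hypothetical names.

```python
import threading


class AtomicInt:
    """Emulated atomic integer; the guard stands in for hardware atomics."""

    def __init__(self, value: int):
        self._value = value
        self._guard = threading.Lock()

    def load(self) -> int:
        with self._guard:
            return self._value

    def compare_exchange(self, expected: int, new: int) -> bool:
        with self._guard:
            if self._value == expected:
                self._value = new
                return True
            return False

    def fetch_sub(self, amount: int) -> int:
        with self._guard:
            old = self._value
            self._value -= amount
            return old


class Node:
    def __init__(self, refs: int):
        self.refs = AtomicInt(refs)
        self.lock = threading.Lock()  # stands in for the node write lock
        self.lock_acquisitions = 0
        self.cleaned = False


def decref(node: Node) -> None:
    # Easy case: while more than one reference remains, decrement without
    # touching the node lock at all.
    while True:
        refs = node.refs.load()
        if refs <= 1:
            break
        if node.refs.compare_exchange(refs, refs - 1):
            return
    # Slow case: this may be the last reference, so take the lock and
    # clean up if the count really drops to 0.
    with node.lock:
        node.lock_acquisitions += 1
        if node.refs.fetch_sub(1) == 1:
            node.cleaned = True
```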

Co-authored-by: JINMEI Tatuya <jtatuya@infoblox.com>
Co-authored-by: Ondřej Surý <ondrej@isc.org>

chg: dev: Shutdown the fetch context after canceling the last fetch

Shutdown the fetch context immediately after the last fetch has been canceled from
that particular fetch context.

Merge branch 'ondrej/shutdown-the-fetch-context-early' into 'main'

See merge request isc-projects/bind9!9958

Shutdown the fetch context after canceling the last fetch

Currently, the fetch context will continue running even when the last
fetch (response) has been removed from the context, so named can process
and cache the answer. This can lead to a situation where the number of
outgoing recursing clients exceeds the configured number for
recursive-clients.

Be more stringent about the recursive-clients limit and shutdown the
fetch context immediately after the last fetch has been canceled from
that particular fetch context.

fix: usr: Apply the memory limit only to ADB database items

A resolver under heavy load could exhaust the memory available for storing
information in the Address Database (ADB), effectively evicting information
already stored in the ADB. The memory used to retrieve and provide
information from the ADB is no longer subject to the same memory limits
that are applied to storing information in the Address Database.

Closes #5127

Merge branch '5127-change-ADB-memory-split' into 'main'

See merge request isc-projects/bind9!9954

Remove memory limit on ADB finds and fetches

The Address Database (ADB) shares memory between the short-lived ADB
objects (finds, fetches, addrinfo) and the long-lived ADB
objects (names, entries, namehooks). This could lead to a situation
where heavy resolver load would force-evict ADB objects from the
database to the point where the ADB is completely empty, leading to even
heavier resolver load.

Make the short-lived ADB objects use the other memory context that we
already created for the hashmaps. This prevents the ADB overmem condition
from being triggered by ongoing resolver fetches.

chg: dev: Separate the connect and the read TCP timeouts in dispatch

The network manager layer has two different timers with their
own timeout values for TCP connections: connect timeout and read
timeout. Separate the connect and the read TCP timeouts in the
dispatch module too.

Closes #5009

Merge branch '5009-dispatch-separate-connect-and-read-timeouts' into 'main'

See merge request isc-projects/bind9!9698

Remove dispatch timeout INT16_MAX limitation

In some places there was a limitation of the maximum timeout
value of INT16_MAX, which is only about 32 seconds. Refactor
the code to remove the limitation.
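The "about 32 seconds" figure follows directly from the integer range,
assuming the timeout is expressed in milliseconds (an assumption; the
commit message does not state the unit):

```python
INT16_MAX = 2**15 - 1  # 32767, the largest signed 16-bit value


def int16_max_timeout_seconds() -> float:
    """Largest timeout a signed 16-bit millisecond value can hold, in seconds."""
    return INT16_MAX / 1000
```

Under that assumption, no timeout longer than 32.767 seconds could be
represented.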

Separate the connect and the read timeouts in dispatch

The network manager layer has two different timers with their
own timeout values for TCP connections: connect timeout and read
timeout. Separate the connect and the read TCP timeouts in the
dispatch module too.

dispatch_test: make client timeouts shorter

Use shorter timeouts for the client to ensure that the clients
time out before the server.

Update the dns_dispatch_add() function's documentation

The 'timedout' callback no longer exists. Remove the mention of
the 'timedout' callback.

new: nil: ignore TAGS files

Merge branch 'colin/ignoreTAGS' into 'main'

See merge request isc-projects/bind9!9956

ignore TAGS files

TAGS files are generated by `make tags` using etags. Other tag index
files are already ignored (GTAGS, GPATH, etc.). Also ignore `TAGS`.

rem: dev: remove fields from struct fetchctx

struct fetchctx has several fields which are now unused or confusing; remove them.

Merge branch 'colin/remove-fctx-validator' into 'main'

See merge request isc-projects/bind9!9945

remove ISC_LINK(link) property from fetchctx

Likely for historical reasons, struct fetchctx has a list link
property but is never used as a list. Remove this link property.

remove validator link from fetchctx

struct fetchctx has a list of pending validators as well as a
pointer to the HEAD validator. Remove the validator pointer to avoid
confusion, as there is no particular reason to have it directly
accessible outside of the list.

chg: doc: Set up version for BIND 9.21.5

Merge branch 'andoni/set-up-version-for-bind-9.21.5' into 'main'

See merge request isc-projects/bind9!9968

Update BIND version to 9.21.5-dev

Update BIND version for release

new: doc: Prepare documentation for BIND 9.21.4

Merge branch 'andoni/prepare-documentation-for-bind-9.21.4' into 'v9.21.4-release'

See merge request isc-private/bind9!772

Reorder release notes

Add release note for GL #5099

Tweak and reword release notes

Fix broken option reference in the ARM

Prepare release notes for BIND 9.21.4

Generate changelog for BIND 9.21.4

[CVE-2024-12705] sec: usr: DNS-over-HTTP(s) flooding fixes

Fix DNS-over-HTTP(S) implementation issues that arise under heavy
query load. Optimize resource usage for :iscman:`named` instances
that accept queries over DNS-over-HTTP(S).

Previously, :iscman:`named` would process all incoming HTTP/2 data
at once, which could overwhelm the server, especially when dealing
with clients that send requests but don't wait for responses. That
has been fixed. Now, :iscman:`named` handles HTTP/2 data in smaller
chunks and throttles reading until the remote side reads the
response data. It also throttles clients that send too many requests
at once.

Additionally, :iscman:`named` now carefully processes data sent by
some clients, which can be considered "flooding." It logs these
clients and drops connections from them.
:gl:`#4795`

In some cases, :iscman:`named` could leave DNS-over-HTTP(S)
connections in the `CLOSE_WAIT` state indefinitely. That also has
been fixed. ISC would like to thank JF Billaud for thoroughly
investigating the issue and verifying the fix.
:gl:`#5083`

See https://gitlab.isc.org/isc-projects/bind9/-/issues/4795

Closes https://gitlab.isc.org/isc-projects/bind9/-/issues/5083

Merge branch 'artem-improve-doh-resource-usage' into 'v9.21.4-release'

See merge request isc-private/bind9!732

DoH: reduce excessive bad request logging

We started using isc_nm_bad_request() more actively throughout the
codebase. In the case of HTTP/2, it can lead to a large number of
useless "Bad Request" messages in the BIND log, as we often attempt to
send such requests over effectively finished HTTP/2 sessions.

This commit fixes that.

Do not stop timer in isc_nm_read_stop() in manual timer mode

A call to isc_nm_read_stop() would always stop the read timer, even in
manual timer control mode, which was added with StreamDNS in mind. That
looks like an omission that happened due to how timers are controlled
in StreamDNS, where we always stop the timer before pausing reading
anyway (see streamdns_on_complete_dnsmessage()). That would not work
well for HTTP, though, where we might want to pause reading without
stopping the timer in the case we want to split incoming data into
multiple chunks to be processed independently.

I suppose that it happened due to NM refactoring in the middle of
StreamDNS development (at the time isc_nm_cancelread() and
isc_nm_pauseread() were removed), as the StreamDNS code seems to be
written as if timers are not stopped during a call to
isc_nm_read_stop().

DoH: introduce manual read timer control

This commit introduces manual read timer control as used by StreamDNS
and its underlying transports. Before that, DoH code would rely on the
timer control provided by TCP, which would reset the timer any time
some data arrived. Now, the timer is restarted only when a full DNS
message is processed in line with other DNS transports.

That change is required because we should not stop the timer when
reading from the network is paused due to throttling. We need a way to
drop timed-out clients, particularly those who refuse to read the data
we send.

DoH: flooding clients detection

This commit adds logic to make code better protected against clients
that send valid HTTP/2 data that is useless from a DNS server
perspective.

Firstly, it adds logic that protects against clients who send too
little useful (=DNS) data. We achieve that by adding a check that
eventually detects such clients with an unfavorable useful-to-processed
data ratio after the initial grace period. The grace period
is limited to processing 128 KiB of data, which should be enough for
sending the largest possible DNS message in a GET request and then
some. This is the main safety belt that would detect even flooding
clients that initially behave well in order to fool the server's checks.

Secondly, in addition to the above, we introduce additional checks to
detect outright misbehaving clients earlier:

- The code will treat clients that open too many streams (50) without
  sending any data for processing as flooding ones;
- The clients that managed to send 1.5 KiB of data without opening a
  single stream or submitting at least some DNS data will be treated as
  flooding ones.

Of course, the behaviour described above is nothing but
heuristic checks, so they can never be perfect. At the same time,
they should be reasonable enough not to drop any valid clients,
relatively easy to implement, and have negligible computational
overhead.
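The three checks can be sketched as one predicate. This is one possible
reading of the description and not the DoH code: SessionStats,
is_flooding() and the MIN_USEFUL_RATIO value are hypothetical (the commit
message gives no ratio), and treating 50 streams as "at least 50" is an
assumption.

```python
from dataclasses import dataclass

GRACE_BYTES = 128 * 1024            # grace period from the commit message
MAX_IDLE_STREAMS = 50               # streams opened without any DNS data
MAX_BYTES_WITHOUT_PROGRESS = 1536   # 1.5 KiB
MIN_USEFUL_RATIO = 0.1              # hypothetical threshold, not from the source


@dataclass
class SessionStats:
    processed_bytes: int = 0  # all HTTP/2 data processed so far
    useful_bytes: int = 0     # the part that was DNS data
    streams_opened: int = 0


def is_flooding(stats: SessionStats) -> bool:
    no_dns_data = stats.useful_bytes == 0
    # Early check 1: many streams opened, but nothing sent for processing.
    if no_dns_data and stats.streams_opened >= MAX_IDLE_STREAMS:
        return True
    # Early check 2: 1.5 KiB received without a stream or any DNS data.
    if stats.processed_bytes >= MAX_BYTES_WITHOUT_PROGRESS and (
        stats.streams_opened == 0 or no_dns_data
    ):
        return True
    # Main check: after the grace period, require a minimum useful ratio.
    if stats.processed_bytes > GRACE_BYTES:
        return stats.useful_bytes / stats.processed_bytes < MIN_USEFUL_RATIO
    return False
```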

DoH: process data chunk by chunk instead of all at once

Initially, our DNS-over-HTTP(S) implementation would try to process as
much incoming data from the network as possible. However, that might
be undesirable as we might create too many streams (each effectively
backed by a ns_client_t object). That is too forgiving as it might
overwhelm the server and thrash its memory allocator, causing high CPU
and memory usage.

Instead of doing that, we resort to processing incoming data using a
chunk-by-chunk processing strategy. That is, we split data into small
chunks (currently 256 bytes) and process each of them
asynchronously. However, we can process more than one chunk at
once (up to 4 currently), given that the number of HTTP/2 streams has
not increased while processing a chunk.

That alone is not enough, though. In addition to the above, we should
limit the number of active streams: those streams for which we have
received a request and started processing it (the ones for which a
read callback was called), as it is perfectly fine to have more opened
streams than active ones. In the case we have reached or surpassed the
limit of active streams, we stop reading AND processing the data from
the remote peer. The number of active streams is effectively decreased
only when responses associated with the active streams are sent to the
remote peer.
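The chunking part of this strategy can be sketched as follows. The chunk
size (256 bytes) and batch size (4) come from the description above;
split_chunks(), process_batch() and the callback parameters are
hypothetical, and the active-stream limit is left out for brevity.

```python
CHUNK_SIZE = 256         # bytes per chunk
MAX_CHUNKS_AT_ONCE = 4   # chunks that may be processed in one go


def split_chunks(data: bytes, size: int = CHUNK_SIZE) -> list:
    """Split incoming data into consecutive chunks of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def process_batch(chunks: list, handle_chunk, stream_count) -> int:
    """Process up to MAX_CHUNKS_AT_ONCE chunks and return how many were used.

    Processing stops early as soon as handling a chunk has increased the
    number of HTTP/2 streams reported by the stream_count() callback.
    """
    consumed = 0
    streams_before = stream_count()
    for chunk in chunks[:MAX_CHUNKS_AT_ONCE]:
        handle_chunk(chunk)
        consumed += 1
        if stream_count() > streams_before:
            break
    return consumed
```

The caller would schedule the remaining chunks for a later, asynchronous
pass, which gives other connections a chance to run in between.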

Overall, this strategy is very similar to the one used for other
stream-based DNS transports like TCP and TLS.

[CVE-2024-11187] sec: usr: Limit the additional processing for large RDATA sets

When answering queries, don't add data to the additional section if the answer has more than 13 names in the RDATA. This limits the number of lookups into the database(s) during a single client query, reducing query processing load.

See isc-projects/bind9#5034

Merge branch '5034-security-limit-additional' into 'v9.21.4-release'

See merge request isc-private/bind9!750

Limit the additional processing for large RDATA sets

Limit the records appended to the ADDITIONAL section to names
that have fewer than 14 records in the RDATA. This limits the number
of lookups into the database(s) during a single client query.

Also don't append any additional data to ANY queries. The answer to ANY
is already big enough.
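The rule can be sketched as a single predicate. The limit of 13 and the
ANY exclusion come from this commit message; should_add_additional() and
its parameters are hypothetical names, with the query type passed as a
string for simplicity.

```python
MAX_RDATA_NAMES = 13  # "fewer than 14", i.e. at most 13


def should_add_additional(qtype: str, rdata_name_count: int) -> bool:
    """Decide whether additional-section processing runs for an answer."""
    if qtype.upper() == "ANY":
        # ANY answers are already large; never append additional data.
        return False
    return rdata_name_count <= MAX_RDATA_NAMES
```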

Isolate using the -T noaa flag only for part of the resolver test

Instead of running the whole resolver/ns4 server with -T noaa flag,
use it only for the part where it is actually needed. The -T noaa
could interfere with other parts of the test because the answers don't
have the authoritative-answer bit set, and we could have false
positives (or false negatives) in the test because the authoritative
server doesn't follow the DNS protocol for all the tests in the resolver
system test.

Rename the qpzone and qpcache methods that implement DB api

All the database implementations share the same names for the methods
implementing the database. That has some advantages like knowing what
to expect, but it turns out that any time such method shows up in any
kind of tracing - be it perf record, backtrace or anything else that
uses symbol names, it is very hard to distinguish whether the find()
belongs to qpcache, qpzone, builtin or sdlz implementation.

Make at least the names for qpzone and qpcache unique.

fix: usr: querying an NSEC3-signed zone for an empty record could trigger an assertion

A bug in the qpzone database could trigger a crash when querying for a deleted name, or a newly-added empty non-terminal name, in an NSEC3-signed zone. This has been fixed.

Closes #5108

Merge branch '5108-nsec3-empty-node' into 'main'

See merge request isc-projects/bind9!9928

detect when closest-encloser name is too long

there was a database bug in which dns_db_find() could get a partial
match for the query name, but still set foundname to match the full
query name. this triggered an assertion when query_addwildcardproof()
assumed that foundname would be shorter.

the database bug has been fixed, but in case it happens again, we
can just copy the name instead of splitting it. we will also log a
warning that the closest-encloser name was invalid.

dns_nsec3_addnsec3() can fail when iterating back

when adding a new NSEC3 record, dns_nsec3_addnsec3() uses a
dbiterator to seek to the newly created node and then find its
predecessor. dbiterators in the qpzone use snapshots, so changes
to the database are not reflected in an already-existing iterator.
consequently, when we add a new node, we have to create a new iterator
before we can seek to it.

add a regression test for a new ENT node

this test adds a record with empty non-terminal nodes above it. this
has also been observed to trigger the crash in NSEC3 zones.

NOTE: the test currently fails, because while there is no crash, the
query results are not as expected. when we add a node below an ENT,
receive_secure_serial() gets DNS_R_PARTIALMATCH, and the signed
zone is never updated. this is not a regression from fixing the
crash bug; it's a separate inline-signing bug.

add a regression test for record deletion

test that there's no crash when querying for a newly-deleted node.

(incidentally also renamed ns3/named.conf.in to ns3/named1.conf.in,
because named2.conf.in does exist, and they should match.)

qpzone find() function could set foundname incorrectly

when a requested name is found in the QP trie during a lookup, but its
records have been marked as nonexistent by a previous deletion, then
it's treated as a partial match, but the foundname could be left
pointing to the original qname rather than the parent. this could
lead to an assertion failure in query_findclosestnsec3().

fix: nil: Fix default IANA root zone mirror configuration

Closes #5115

Merge branch '5115-fix-default-iana-root-zone-mirror-configuration' into 'main'

See merge request isc-projects/bind9!9934

Fix default IANA root zone mirror configuration

Commit b121f02eac342ee285b6ab1292a0136448a91ee0 renamed the top-level
"primaries" block in bin/named/config.c to "remote-servers".  This
configuration block lists the primary servers used for an IANA root zone
mirror when no primary servers are explicitly specified for it in the
configuration.  However, the relevant part of the named_zone_configure()
function only looks for a top-level "primaries" block and not for any of
its synonyms.  As a result, configuring an IANA root zone mirror with
just:

    zone "." {
        type mirror;
    };

now results in a cryptic fatal error on startup:

    loading configuration: not found
    exiting (due to fatal error)

Fix by using the correct top-level block name in named_zone_configure().

fix: usr: Fix response policy zones and catalog zones with an $INCLUDE statement defined

Response policy zones (RPZ) and catalog zones were not working correctly if they had an $INCLUDE statement defined. This has been fixed.

Closes #5111

Merge branch '5111-includes-disable-rpz-and-catz-fix' into 'main'

See merge request isc-projects/bind9!9930

Fix a typo in dns/master.h

The ISC_R_SEENINCLUDE definition does not exist, the correct one
is DNS_R_SEENINCLUDE.

Don't disable RPZ and CATZ for zones with an $INCLUDE statement

The code in zone_startload() disables RPZ and CATZ for a zone if
dns_master_loadfile() returns anything other than ISC_R_SUCCESS,
which makes sense, but it's an error because dns_master_loadfile() can
also return DNS_R_SEENINCLUDE upon success when the zone had an
$INCLUDE statement.
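The corrected check amounts to treating both result codes as success. In
this sketch the result codes are plain strings standing in for the C
constants, and load_succeeded() is a hypothetical helper name.

```python
ISC_R_SUCCESS = "ISC_R_SUCCESS"
DNS_R_SEENINCLUDE = "DNS_R_SEENINCLUDE"
ISC_R_FAILURE = "ISC_R_FAILURE"  # stand-in for any real load error


def load_succeeded(result: str) -> bool:
    """True when the zone loaded, with or without an $INCLUDE statement."""
    return result in (ISC_R_SUCCESS, DNS_R_SEENINCLUDE)


def keep_rpz_and_catz(result: str) -> bool:
    # RPZ and CATZ stay enabled unless the load really failed.
    return load_succeeded(result)
```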

new: ci: Add shotgun perf test of DoH GET to CI

Add performance tests of DoH using the GET protocol to nightly pipelines.

Merge branch 'nicki/ci-shotgun-doh-get' into 'main'

See merge request isc-projects/bind9!9926

Add shotgun perf test of DoH GET to CI

fix: test: Fix "checking startup notify rate limit" fails on OL 8 FIPS

Adjust number of zones down to 23 to match those present when testing in FIPS mode.

Closes #5097

Merge branch '5097-checking-startup-notify-rate-limit-fails-on-ol-8-fips' into 'main'

See merge request isc-projects/bind9!9919

Adjust number of zones to those in FIPS mode

new: dev: Log both "from" and "to" socket in debug messages

Debug messages logging network traffic now include information about both sides of each communication channel rather than just one of them.

Closes #4345

Merge branch '4345-log-both-from-and-to-socket-in-debug-messages' into 'main'

See merge request isc-projects/bind9!8349

Account for revised log messages in test code

Adjust test code so that it expects the extended output that the
dns_message_logpacketfromto() function now emits.

Adjust dns_message_logpacketfrom() log prefixes

Ensure the log prefixes passed to the dns_message_logpacketfrom()
function by its callers do not include the word "from" as the latter is
now emitted by the logfmtpacket() helper function.

Adjust dns_message_logpacketfromto() log prefixes

Ensure the log prefixes passed to the dns_message_logpacketfromto()
function by its callers do not include the words "from" or "to" as those
are now emitted by the logfmtpacket() helper function.

Log both "from" and "to" socket in debug messages

Move dns_dispentry_getlocaladdress() calls around so that they are not
only invoked when dnstap support is compiled in. This function calls
isc_nmhandle_localaddr(), which may issue a system call, but only if the
ISC_SOCKET_DETAILS preprocessor macro is set at compile time.

Pass the value extracted by dns_dispentry_getlocaladdress() to
dns_message_logpacketfromto() so that it gets logged, adding useful
information to the relevant debug messages.

Rename dns_message_logpacket()

Since dns_message_logpacket() only takes a single socket address as a
parameter (and it is always the sending socket's address), rename it to
dns_message_logpacketfrom() so that its name better conveys its purpose
and so that the difference in purpose between this function and
dns_message_logpacketfromto() becomes more apparent.

Rename dns_message_logfmtpacket()

Since dns_message_logfmtpacket() needs to be provided with both "from"
and "to" socket addresses, rename it to dns_message_logpacketfromto() so
that its name better conveys its purpose. Clean up the code comments
for that function.

Enable logging both "from" and "to" socket

Change the function prototype for dns_message_logfmtpacket() so that it
takes two isc_sockaddr_t parameters: one for the sending side and
another one for the receiving side.  This enables debug messages to be
more precise.

Also adjust the function prototype for logfmtpacket() accordingly.
Unlike dns_message_logfmtpacket(), this function must not require both
'from' and 'to' parameters to be non-NULL as it is still going to be
used by dns_message_logpacket(), which only provides a single socket
address.  Adjust its log format to handle both of these cases properly.

Adjust both dns_message_logfmtpacket() call sites accordingly, without
actually providing the second socket address yet.  (This causes the
revised REQUIRE() assertion in dns_message_logfmtpacket() to fail; the
issue will be addressed in a separate commit.)
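
The split between a strict public function and a lenient helper can be
sketched as below. Everything here is an assumption made for
illustration: the sketch_* names, the use of plain strings instead of
isc_sockaddr_t, and the exact output format are not taken from the BIND 9
sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Helper: either address may be NULL, and the corresponding " from ..."
 * or " to ..." part is then left out.  The format is hypothetical.
 */
static int
sketch_logfmtpacket(char *buf, size_t len, const char *prefix,
		    const char *from, const char *to) {
	return snprintf(buf, len, "%s%s%s%s%s", prefix,
			from != NULL ? " from " : "",
			from != NULL ? from : "",
			to != NULL ? " to " : "",
			to != NULL ? to : "");
}

/* Two-address variant: both sides are required (mirrors a REQUIRE()). */
static int
sketch_logpacketfromto(char *buf, size_t len, const char *prefix,
		       const char *from, const char *to) {
	assert(from != NULL && to != NULL);
	return sketch_logfmtpacket(buf, len, prefix, from, to);
}

/* Single-address variant: only the sending side is known. */
static int
sketch_logpacketfrom(char *buf, size_t len, const char *prefix,
		     const char *from) {
	assert(from != NULL);
	return sketch_logfmtpacket(buf, len, prefix, from, NULL);
}
```

Because the helper emits the words "from" and "to" itself, callers pass
prefixes that do not contain them.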

dns_message_logfmtpacket(): drop 'style' parameter

Both existing callers of the dns_message_logfmtpacket() function set the
argument passed as 'style' to &dns_master_style_comment. To simplify
these call sites, drop the 'style' parameter from the prototype for
dns_message_logfmtpacket() and use a fixed value of
&dns_master_style_comment in the function's body instead.

logfmtpacket(): drop useless local variables

All callers of the logfmtpacket() helper function require the argument
passed as 'address' to be non-NULL. Meanwhile, the 'newline' and
'space' local variables in logfmtpacket() are only set to values
different than their initial values if the 'address' parameter is NULL.
Replace the 'newline' and 'space' local variables in logfmtpacket() with
fixed strings to improve code readability.

new: dev: Enable extraction of exact local socket addresses

Enable extracting the exact address/port that a local wildcard/TCP socket is bound to, improving the accuracy of dnstap logging and providing more information in debug logs produced by system tests. Since this requires issuing an extra system call on some hot paths, this new feature is only enabled when the ``ISC_SOCKET_DETAILS`` preprocessor macro is set at compile time.

Closes #4344

Merge branch '4344-enable-extraction-of-exact-local-socket-addresses' into 'main'

See merge request isc-projects/bind9!8348

Enable extraction of exact local socket addresses

Extracting the exact address that each wildcard/TCP socket is bound to
locally requires issuing the getsockname() system call, which libuv
exposes via its uv_*_getsockname() functions.  This is only required for
detailed logging and comes at a noticeable performance cost, so it
should not happen by default.  However, it is useful for debugging
certain problems (e.g. cryptic system test failures), so a convenient
way of enabling that behavior should exist.

Update isc_nmhandle_localaddr() so that it calls uv_*_getsockname() when
the ISC_SOCKET_DETAILS preprocessor macro is set at compile time.
Ensure proper handling of sockets that wrap other sockets.

Set the new ISC_SOCKET_DETAILS macro by default when --enable-developer
is passed to ./configure.  This enables detailed logging in the system
tests run in GitLab CI without affecting performance in non-development
BIND 9 builds.

Note that setting the ISC_SOCKET_DETAILS preprocessor macro at compile
time enables all callers of isc_nmhandle_localaddr() to extract the
exact address of a given local socket, which results e.g. in dnstap
captures containing more accurate information.

Mention the new preprocessor macro in the section of the ARM that
discusses why exact socket addresses may not be logged by default.
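
The compile-time switch described above can be sketched with plain POSIX
calls. This is a simplified, IPv4-only illustration:
SKETCH_SOCKET_DETAILS is a hypothetical macro standing in for
ISC_SOCKET_DETAILS, sketch_localaddr() is not a BIND 9 function, and
getsockname() is called directly rather than through libuv.

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Return the local address for socket 'fd'.  'cached' is the address the
 * socket was asked to bind to, which may be a wildcard address or port 0.
 * The extra system call is made only when SKETCH_SOCKET_DETAILS is
 * defined at compile time.
 */
static struct sockaddr_in
sketch_localaddr(int fd, const struct sockaddr_in *cached) {
	assert(cached != NULL);

	struct sockaddr_in result = *cached;
#ifdef SKETCH_SOCKET_DETAILS
	struct sockaddr_in actual;
	socklen_t len = sizeof(actual);

	memset(&actual, 0, sizeof(actual));
	if (getsockname(fd, (struct sockaddr *)&actual, &len) == 0) {
		/* Use the address the operating system actually picked. */
		result = actual;
	}
#else
	(void)fd; /* default build: no system call, return the cached value */
#endif
	return result;
}
```

Building with -DSKETCH_SOCKET_DETAILS enables the getsockname() path;
without it, the function costs no more than a structure copy.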

chg: nil: Improve reuse of outgoing TCP connections

This MR is a prerequisite for !8348.

It intentionally does not have a changelog entry associated with it, to
avoid giving the false impression that it improves connection reuse **for
existing code**. It will only make a difference once !8348 gets merged
(and even then, only if the new `ISC_SOCKET_DETAILS` macro is set
during the build). That's because `isc_nmhandle_localaddr()` currently
simply returns `handle->local` and its return value will only be set to
the actual address the socket is bound to with !8348 in place.

Note that `dns_dispatch_gettcp()` is currently only used by the
`dns_request` API, so this MR's potential for introducing new breakage
is relatively low.

Closes #4693

Merge branch '4693-improve-reuse-of-outgoing-tcp-connections' into 'main'

See merge request isc-projects/bind9!8972

Improve reuse of outgoing TCP connections

The dns_dispatch_gettcp() function is used for finding an existing TCP
connection that can be reused for sending a query from a specified local
address to a specified remote address.  The logic for matching the
provided <local address, remote address> tuple to one of the existing
TCP connections is implemented in the dispatch_match() function:

  - if the examined TCP connection already has a libuv handle assigned,
    it means the connection has already been established; therefore,
    compare the provided <local address, remote address> tuple against
    the corresponding address tuple for the libuv handle associated with
    the connection,

  - if the examined TCP connection does not yet have a libuv handle
    assigned, it means the connection has not yet been established;
    therefore, compare the provided <local address, remote address>
    tuple against the corresponding address tuple that the TCP
    connection was originally created for.

This logic limits TCP connection reuse potential as the libuv handle
assigned to an existing dispatch object may have a more specific local
<address, port> tuple associated with it than the local <address, port>
tuple that the dispatch object was originally created for.  That's
because the local address for outgoing connections can be set to a
wildcard <address, port> tuple (indicating that the caller does not care
what source <address, port> tuple will be used for establishing the
connection, thereby delegating the task of picking it to the operating
system) and then get "upgraded" to a specific <address, port> tuple when
the socket is bound (and a libuv handle gets associated with it).  When
another dns_dispatch_gettcp() caller then tries to look for an existing
TCP connection to the same peer and passes a wildcard address in the
local part of the tuple, the function will not match that request to a
previously-established TCP connection (unless isc_nmhandle_localaddr()
returns a wildcard address as well).

Simplify dispatch_match() so that the libuv handle associated with an
existing dispatch object is not examined for the purpose of matching it
to the provided <local address, remote address> tuple; instead, always
examine the <local address, remote address> tuple that the dispatch
object was originally created for.  This enables reuse of TCP
connections created without providing a specific local socket address
while still preventing other connections (created for a specific local
socket address) from being inadvertently shared.
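
The difference between the two matching strategies can be sketched as
follows. The structures and function names are hypothetical and much
simpler than the real dispatch code; addresses are reduced to an integer
pair in which zero stands for a wildcard.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_addr {
	uint32_t ip;   /* 0 stands for the wildcard address */
	uint16_t port; /* 0 stands for "let the operating system pick" */
};

struct sketch_dispatch {
	struct sketch_addr local; /* local address requested at creation */
	struct sketch_addr peer;  /* remote address */
	bool connected;           /* true once a handle is attached */
	struct sketch_addr bound; /* actual local address, if connected */
};

static bool
sketch_addr_equal(const struct sketch_addr *a, const struct sketch_addr *b) {
	return a->ip == b->ip && a->port == b->port;
}

/* Older logic: once connected, compare against the bound address. */
static bool
sketch_match_old(const struct sketch_dispatch *d,
		 const struct sketch_addr *local,
		 const struct sketch_addr *peer) {
	const struct sketch_addr *have = d->connected ? &d->bound : &d->local;
	return sketch_addr_equal(have, local) &&
	       sketch_addr_equal(&d->peer, peer);
}

/* Simplified logic: always compare against the requested address. */
static bool
sketch_match_new(const struct sketch_dispatch *d,
		 const struct sketch_addr *local,
		 const struct sketch_addr *peer) {
	assert(d != NULL && local != NULL && peer != NULL);
	return sketch_addr_equal(&d->local, local) &&
	       sketch_addr_equal(&d->peer, peer);
}
```

For a connection created with a wildcard local address and later bound to
a specific port, a second wildcard request matches only under the
simplified logic, while a connection created for a specific local address
is still not handed to a wildcard caller.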

chg: dev: Add TLS SNI extension to all outgoing TLS connections

This change ensures that the SNI extension is used in outgoing connections over TLS (e.g. for DoT and DoH) when applicable.

Closes #5099

Merge branch 'artem-outgoing-tls-sni-support' into 'main'

See merge request isc-projects/bind9!9923

BIND - enable TLS SNI support for outgoing TLS connections

This commit ensures that BIND enables TLS SNI support for outgoing DoT
connections (when possible) in order to improve compatibility with
other DNS server software.

Dig - enable TLS SNI support

This commit ensures that dig enables TLS SNI support for outgoing
connections in order to improve compatibility with other DNS server
software.

TLS SNI - add low level support for SNI to the networking code

This commit adds support for setting SNI hostnames in outgoing
connections over TLS.

Most of the changes either adapt the code to accept an extra argument
in *connect() functions or modify the TLS Stream so that it actually
makes use of the new SNI hostname information.
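
One condition under which an SNI hostname cannot be sent is when the
configured name is a literal IP address, since RFC 6066 does not permit
those in the server name extension. The check below is a sketch of that
single rule only; sketch_sni_usable() is a hypothetical helper, and
whether BIND 9 applies exactly this test is an assumption.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if 'hostname' may be sent as an SNI name: it must be
 * non-empty and must not parse as a literal IPv4 or IPv6 address.
 */
static bool
sketch_sni_usable(const char *hostname) {
	unsigned char buf[sizeof(struct in6_addr)];

	assert(hostname != NULL);

	if (hostname[0] == '\0') {
		return false;
	}
	if (inet_pton(AF_INET, hostname, buf) == 1 ||
	    inet_pton(AF_INET6, hostname, buf) == 1)
	{
		return false; /* literal addresses are not valid SNI names */
	}
	return true;
}
```

A caller would consult such a check before handing the name to the TLS
library, for example through OpenSSL's SSL_set_tlsext_host_name().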

fix: dev: Use CMM_{STORE,LOAD}_SHARED to store/load glue in gluelist

ThreadSanitizer has trouble understanding that gluelist->glue is
constant after it is assigned to the slabheader with cmpxchg. Help
ThreadSanitizer to understand the code by using CMM_STORE_SHARED and
CMM_LOAD_SHARED on gluelist->glue.
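
The write-once-then-publish pattern can be illustrated with C11 atomics.
This is an analogy only: the actual change uses liburcu's
CMM_STORE_SHARED and CMM_LOAD_SHARED macros, not <stdatomic.h>, and the
sketch_* types below are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct sketch_gluelist {
	_Atomic(const char *) glue; /* written once, before publication */
};

struct sketch_header {
	_Atomic(struct sketch_gluelist *) gluelist; /* NULL until published */
};

/*
 * Fill in 'gl' and try to attach it to 'h' with a compare-and-exchange.
 * Returns the list that ended up attached, which is 'gl' only if no
 * other list had been attached earlier.
 */
static struct sketch_gluelist *
sketch_publish(struct sketch_header *h, struct sketch_gluelist *gl,
	       const char *glue) {
	struct sketch_gluelist *expected = NULL;

	/* Marked store: makes the shared access explicit to race detectors. */
	atomic_store_explicit(&gl->glue, glue, memory_order_relaxed);

	if (atomic_compare_exchange_strong_explicit(
		    &h->gluelist, &expected, gl, memory_order_acq_rel,
		    memory_order_acquire))
	{
		return gl;
	}
	return expected; /* an earlier publication won */
}

/* Marked load of the glue through the published list, if any. */
static const char *
sketch_read_glue(struct sketch_header *h) {
	struct sketch_gluelist *gl =
		atomic_load_explicit(&h->gluelist, memory_order_acquire);
	if (gl == NULL) {
		return NULL;
	}
	return atomic_load_explicit(&gl->glue, memory_order_relaxed);
}
```

The field does not change after publication; marking the store and the
load simply tells the tooling that the concurrent access is intentional.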

Merge branch 'ondrej/hint-tsan-in-addglue' into 'main'

See merge request isc-projects/bind9!9929