Michał Kępień [Fri, 13 Feb 2026 13:27:10 +0000 (14:27 +0100)]
Implement a response handler that forwards queries
Add a new response handler, ForwarderHandler, which enables forwarding
all queries to another DNS server. To simplify implementation, always
forward queries to the target server via UDP, even if they are
originally received using a different transport protocol.
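The UDP-only forwarding described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the actual ForwarderHandler code; the helper name `forward_over_udp` and its parameters are assumptions made for this example.

```python
import socket


def forward_over_udp(
    wire_query: bytes, target: tuple[str, int], timeout: float = 2.0
) -> bytes:
    """Send an already-encoded DNS query to `target` over UDP.

    Hypothetical helper: the query is always forwarded via UDP, no matter
    which transport it originally arrived on.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(wire_query, target)
        # 65535 bytes is the largest possible DNS message.
        reply, _ = sock.recvfrom(65535)
        return reply
```

A real handler would also have to cope with truncated replies, since the original client may have used TCP; that is left out of this sketch.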
Michał Kępień [Fri, 13 Feb 2026 13:27:10 +0000 (14:27 +0100)]
Log the server socket receiving each query
Extend AsyncDnsServer._log_query() and AsyncDnsServer._log_response() so
that they also log the <address, port> tuple for the socket on which a
given query was received. Minimize the signatures of those methods
by taking advantage of all the information contained in the QueryContext
instances passed to them.
Michał Kępień [Fri, 13 Feb 2026 13:27:10 +0000 (14:27 +0100)]
Store server socket information in QueryContext
Extend the QueryContext class with a field holding the <address, port>
tuple for the socket on which a given query was received. This will
enable query handlers to act upon that information in arbitrary ways.
```
dnshost.example. 300 NS ns.dnshost.example.
ns.dnshost.example. 300 A 1.2.3.4
```
And then the child-side of `foo.example.`:
```
foo.example 3600 NS ns.dnshost.example.
a.foo.example 300 A 5.6.7.8
```
While there is a zone misconfiguration (the TTLs of the delegation and glue do not match between the parent and the child), it is possible to resolve `a.foo.example` on a cold-cache resolver. However, after the `ns.dnshost.example.` glue expired, resolution used to fail with a "fetch loop detected" error. This is now fixed.
Colin Vidal [Fri, 30 Jan 2026 16:09:18 +0000 (17:09 +0100)]
fetch loop detection improvements
The fetch loop detection occurred in two places: when
`dns_resolver_createfetch()` is invoked (walking up the chain of parent
fetches and stopping the fetch if a parent fetch has the same qname and
qtype) and right after calling `dns_adb_findname()` in the resolver
(stopping the fetch if the current fetch is for the same name as the ADB
lookup, and the ADB lookup needs to fetch it).
Regarding fetch loop detection at the `dns_resolver_createfetch()`
entry, there are cases where both qname and qtype are the same but the
zonecut is different. The fetch will then query different name servers
and get different responses. For instance, consider the following
parent-side delegation (both for `foo.example.` and `dnshost.example.`):
dnshost.example. 300 NS ns.dnshost.example.
ns.dnshost.example. 300 A 1.2.3.4
Then the child-side of `foo.example.`:
foo.example 3600 NS ns.dnshost.example.
a.foo.example 300 A 5.6.7.8
Obviously, there is a misconfiguration between the parent-side and the
child-side of `dnshost.example` (the TTLs do not match), but this
happens in practice.
Because the resolver is currently child-centric, the parent-side
delegation's glue of `dnshost.example.` will be overridden by the
child-side of the delegation. Once both A records expire, the
resolver will attempt to look up the A RRs but will start from the
`foo.example.` zonecut, as the delegation itself is still valid.
Then the resolver will attempt to resolve `ns.dnshost.example.`, still
using the `foo.example.` zonecut, which will immediately trigger another
attempt to resolve `ns.foo.example.` (because the A RR is expired). This
is, however, _not_ a loop, because the second attempt will have the
`dnshost.example.` zonecut. And this changes everything, because the
resolver detects that the A name is in-domain, and passes a flag to ADB so
`dns_view_find()` won't use the cache. As a result, the zonecut will be
`.`, and the hints (root servers) will be queried instead.
From that point, they'll return the parent-side delegation, which
includes the glue for `ns.dnshost.example/A`, and the resolution can
continue. Previously, this wouldn't be possible because a loop would be
detected on the second attempt to look up `ns.foo.example/A` and would
result in a SERVFAIL.
Now, the loop detection is relaxed: a loop is detected only if the qname,
qtype _and_ zonecut are equal.
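The relaxed check can be modeled as a walk up the chain of parent fetches that compares all three values. The sketch below is only an illustration of that rule; the `Fetch` class and `is_fetch_loop()` are hypothetical names, and names are compared as plain strings rather than as case-insensitive DNS names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Fetch:
    """Hypothetical stand-in for a resolver fetch and its parent link."""

    qname: str
    qtype: str
    zonecut: str
    parent: Optional["Fetch"] = None


def is_fetch_loop(new: Fetch) -> bool:
    """Return True if an ancestor has the same qname, qtype and zonecut."""
    key = (new.qname, new.qtype, new.zonecut)
    ancestor = new.parent
    while ancestor is not None:
        if (ancestor.qname, ancestor.qtype, ancestor.zonecut) == key:
            return True
        ancestor = ancestor.parent
    return False
```

With this rule, a fetch for `ns.dnshost.example./A` at the `dnshost.example.` zonecut whose parent is a fetch for the same name and type at the `foo.example.` zonecut is not reported as a loop, while a third fetch that repeats the first zonecut is.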
This commit also changes the way the loop detection post
`dns_adb_createfind()` works. From the same example above, there would
be two ADB fetches with the same name, but with two different ADB flags
(the first one without DNS_ADB_STARTATZONE, the second one with that
flag). This means that there will be two fetches out of those two ADB
lookups, both legitimate, and not a loop (i.e. resolution won't get
stuck). To distinguish a find that started a fetch itself from a find
that merely has a pending fetch (which could come from another find the
current find has been attached to), a new find option
`DNS_ADBFIND_STARTEDFETCH` is introduced, which indicates that the
current find did start a fetch.
That way, if a find doesn't have the `DNS_ADBFIND_STARTEDFETCH` option
but has pending fetches, we know this is a find attached to a similar
find, so this is a loop. Otherwise, with `DNS_ADBFIND_STARTEDFETCH`, we
know that even if there is a pending fetch, this is not a loop as the
fetch has just been started.
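The decision described above boils down to a small predicate. The following sketch only models that rule; `FindOption` and `find_indicates_loop()` are hypothetical names and the flag value is arbitrary, not the one used in the C code.

```python
from enum import Flag


class FindOption(Flag):
    """Hypothetical model of ADB find options; values are arbitrary."""

    NONE = 0
    STARTEDFETCH = 1


def find_indicates_loop(options: FindOption, has_pending_fetch: bool) -> bool:
    """A pending fetch that this find did not start itself means a loop."""
    started_here = bool(options & FindOption.STARTEDFETCH)
    return has_pending_fetch and not started_here
```

A find with a pending fetch and the flag set is treated as a freshly started, legitimate fetch; the same find without the flag is treated as attached to an earlier, similar find.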
Colin Vidal [Mon, 2 Feb 2026 12:50:38 +0000 (13:50 +0100)]
extends named -T so ADB settings can be tweaked
ADB entry window and ADB min cache time can be tweaked using `named -T
adbentrywindow=<unsigned int>` and `named -T adbmincache=<unsigned
int>`.
While those values don't need to be exposed to the operator, tweaking
them makes it possible to system test ADB behavior without having to
wait as long as the default values require.
Colin Vidal [Tue, 10 Feb 2026 08:25:09 +0000 (09:25 +0100)]
chg: dev: resolver: refactoring of the dns_fetchresponse_t handling
Instead of cloning fetch responses immediately after inserting them at the head of the `fetch_response` list, defer cloning until the events are actually sent.
This makes it possible to:
- Remove the `fctx->cloned` state;
- Simplify the code by eliminating explicit calls to `clone_result()`;
- Remove the logic that enforced having a fetch response with a `sigrdataset` at the head of the list;
- Remove (just a bit of) locking in some places.
The fetch result is stored directly in new `fctx` properties, but memory usage does not increase: those properties are grouped in an anonymous struct that shares a union with another (bigger) anonymous struct wrapping the properties used only by qmin fetches, and a qmin fetch does not need the fetch result properties.
Merge branch 'colin/resolver-cloneresults' into 'main'
Evan Hunt [Sat, 17 Jan 2026 04:59:08 +0000 (20:59 -0800)]
use a union for resp and qmin data
It's potentially confusing to use "resp_rdataset" for QNAME
minimization, but we can make it a union and have resp.rdataset
and qmin.rdataset using the same memory.
We can save even more space by using the same union to combine
qminname and resp_foundname and access them as qmin.name and
resp.foundname.
Colin Vidal [Fri, 16 Jan 2026 11:14:58 +0000 (12:14 +0100)]
resolver: remove `qminrrset`, `qminsigrrset` from fctx
Two rdataset properties, `qminrrset` and `qminsigrrset`, are removed from
the fetch context. They are only used as temporary storage for the query
result of the qmin query, and are immediately detached from
`resume_qmin` once the query is over.
As an alternative, use `resp_rdataset` and `resp_sigrdataset`
instead; those are not needed for storing the response data until
after qmin_resume() is over.
Colin Vidal [Thu, 15 Jan 2026 16:36:50 +0000 (17:36 +0100)]
resolver: copy fetch responses and send events in one go
Instead of first copying query response data into each fetch response
and then iterating again to send the response to the caller, perform
both operations in one go.
Colin Vidal [Thu, 15 Jan 2026 15:30:59 +0000 (16:30 +0100)]
resolver: simplify fetch response handling
There is no longer a need to decide whether a fetch response should be
prepended or appended to the fetch response list. As query response data
is stored directly in the fetch context object, responses containing a
sigrdataset no longer need to be ordered first. Remove the code
implementing this logic.
Additionally, the distinction between `fetchstate_done` and
`fetchstate_sendevents` is no longer needed. New client
`dns_fetchresponse_t` objects can be attached to the fetch context at
any time until `fctx__done()` is called, since there is no dependency on
the first fetch response in the list. This simplifies the code and
reduces locking (just a bit).
Colin Vidal [Thu, 15 Jan 2026 13:47:46 +0000 (14:47 +0100)]
resolver: temporarily store query answer in fetch context
Query answers are now stored in dedicated fetch context properties,
instead of using `ISC_LIST_HEAD(fctx->resps)`.
This reduces lock critical section usage in some places, and enables
further simplifications. (In particular, it removes the need for special
logic to prepend a fetch response to the list when it contains a
sigrdataset.)
Colin Vidal [Thu, 15 Jan 2026 09:42:13 +0000 (10:42 +0100)]
resolver: Defer cloning of fetch responses until events are sent
Instead of cloning fetch responses immediately after writing to the
head of the fetch response list, defer cloning until the events are
actually sent.
This removes the need for the `fctx->cloned` state. However, a new
fetch state value, fetchstate_sentevents, is introduced and occurs
after fetchstate_done. To prevent new fetch responses from being
prepended after the head is written but before cloning occurs,
fetchstate_done is now set at all call sites that previously invoked
`clone_results()`.
Ondřej Surý [Mon, 9 Feb 2026 10:05:20 +0000 (11:05 +0100)]
fix: usr: Fix NULL Pointer Dereference in QP-trie Cache add()
When RRSIG(rdtype) was independently cached before the RDATA for the
rdtype itself, named would crash on the subsequent query for the RDATA
itself. This has been fixed.
ISC would like to thank Vitaly Simonovich for bringing this
vulnerability to our attention.
Closes #5738
Merge branch '5738-null-pointer-dereference-in-qpcache-add' into 'main'
Ondřej Surý [Sat, 7 Feb 2026 04:19:48 +0000 (05:19 +0100)]
Fix NULL Pointer Dereference in QP-trie Cache add()
When RRSIG(rdtype) was independently cached before the RDATA for the
rdtype itself, named would crash on the subsequent query for the RDATA
itself. This has been fixed.
ISC would like to thank Vitaly Simonovich for bringing this
vulnerability to our attention.
Ondřej Surý [Fri, 6 Feb 2026 17:34:00 +0000 (18:34 +0100)]
fix: nil: Release gnamebuf also on the error path
In dst_gssapi_acceptctx(), the gnamebuf could leak a little bit of
memory if dns_name_fromtext() failed. This would require a Kerberos
principal with an invalid DNS name.
Closes #5737
Merge branch '5737-memory-leak-in-dst_gssapi_acceptctx-on-dns_name_fromtext-failure' into 'main'
Ondřej Surý [Fri, 6 Feb 2026 16:50:55 +0000 (17:50 +0100)]
Release gnamebuf also on the error path
In dst_gssapi_acceptctx(), the gnamebuf could leak a little bit of
memory if dns_name_fromtext() failed. This would require a Kerberos
principal with an invalid DNS name.
Mark Andrews [Fri, 6 Feb 2026 01:52:55 +0000 (12:52 +1100)]
Record query time for all dnstap responses
The description in the protobuf specification is not a list of request
types to process but rather a list of examples to qualify the
description of whether the time indicates when the message is received
or sent.
Nicki Křížek [Thu, 29 Jan 2026 10:42:37 +0000 (11:42 +0100)]
Allow re-run of kasp test case on all FreeBSDs
Previously, the issue where kasp.test_kasp_case[secondary.kasp] fails
due to a timeout was only occasionally observed on FreeBSD 13
in our CI. It seems to have come back on FreeBSD 15.
A lingering `sizeof` from the prototype era of !11094 caused the
key-wipe in `isc_hmac_key_destroy` to use `sizeof(key->len)` instead of
`key->len` for the length argument of `isc_safe_memwipe`.
This results in a buffer overflow of zero bytes in HMAC keys that are
shorter than 4 bytes. As such, the overflow can only be visible in keys
that are shorter than 32 bits, which is beyond broken, and creating such
keys is only possible in testing.
Therefore, this change is *not* a security fix since the conditions are
never reachable in any imaginable deployment scenario.
Builds that use OpenSSL >=3.0 are unaffected as the `sizeof` only
remained in pre-3.0 builds.
Closes #5732
Merge branch '5732-invalid-params-to-isc_safe_memwipe' into 'main'
Aydın Mercan [Thu, 5 Feb 2026 12:01:52 +0000 (15:01 +0300)]
wipe hmac keys correctly pre-3.0 libcrypto
A lingering `sizeof` from the prototype era of !11094 caused the
key-wipe in `isc_hmac_key_destroy` to use `sizeof(key->len)` instead of
`key->len` for the length argument of `isc_safe_memwipe`.
This results in a buffer overflow of zero bytes in HMAC keys that are
shorter than 4 bytes. As such, the overflow can only be visible in keys
that are shorter than 32 bits, which is beyond broken, and creating such
keys is only possible in testing.
Therefore, this change is *not* a security fix since the conditions are
never reachable in any imaginable deployment scenario.
Builds that use OpenSSL >=3.0 are unaffected as the `sizeof` only
remained in pre-3.0 builds.
Aydın Mercan [Mon, 2 Feb 2026 09:43:48 +0000 (12:43 +0300)]
chg: dev: initial openssl version splitting
Dealing with OpenSSL has been rapidly turning into an unwieldy situation
as post-3.0 changes turn the library into a different beast.
Start treating pre and post-3.0 versions differently for easier
maintenance.
To help with this Sisyphean task, this MR had to shift things around.
`OPENSSL_NO_DEPRECATED` is now declared in BIND alongside an appropriate
`OPENSSL_API_COMPAT` value. The latter value will be set to declare
either OpenSSL 1.1.0 or 3.0 as the bare minimum version.
Instead of splitting `md.c` and `hmac.c` into separate version-specific
files, they now live inside `crypto/ossl1_1.c` and `crypto/ossl3.c`.
This way, these functions will be able to utilize the same static
`OSSL_PARAM` tables, removing redundant reconstruction for HMAC.
For pre-3.0, `isc_hmac` has been reverted back to using the `HMAC_`
interface. Using `EVP_MD_CTX`-based functions for HMAC results in
libcrypto calling the same `HMAC_` functions in the end, giving no
advantage while confusingly using the digest functions.
A new API, `isc_ossl_wrap`, has been added. This family of functions
aims to provide a common interface for libcrypto version-specific code
while not abstracting away OpenSSL's structures such as `EVP_PKEY`.
Currently the main user of this API is the `dst` family of functions,
where some ECDSA and RSA operations need to use the new `OSSL_PARAM`
functionality by requirement or to avoid speed penalties.
Furthermore, OpenSSL-based logging has been moved from `isc_tls` to
`isc_ossl_wrap` as it is a more appropriate place for such functionality.
Merge branch 'aydin/openssl-version-split' into 'main'
Aydın Mercan [Mon, 1 Dec 2025 14:07:54 +0000 (17:07 +0300)]
remove libcrypto version specific code in opensslecdsa_link
Explicit `EVP_SIGNATURE` algorithms for signatures were added in
OpenSSL 3.4, so their use is skipped for the initial OpenSSL
version-specific code splitting.
Aydın Mercan [Mon, 1 Dec 2025 13:23:37 +0000 (16:23 +0300)]
remove libcrypto version specific code in opensslrsa_link
Explicit `EVP_SIGNATURE` algorithms for signatures were added in
OpenSSL 3.4, so their use is skipped for the initial OpenSSL
version-specific code splitting.
Aydın Mercan [Mon, 1 Dec 2025 10:49:46 +0000 (13:49 +0300)]
move openssl error reporting to isc/ossl_wrap
While it was the best place at the time, tlserr2result doesn't belong
inside TLS code since it is generic to OpenSSL and mostly used in the
dst interface. The newly created ossl_wrap interface is the ideal place
for flushing the OpenSSL thread error queue.
Aydın Mercan [Wed, 17 Sep 2025 12:52:35 +0000 (14:52 +0200)]
Separate isc_hmac between pre and post OpenSSL 3.0
Instead of the `EVP_MD_CTX` based functions, use either the new
`EVP_MAC` or the old `HMAC_CTX` based functions.
`EVP_MAC` is the recommended way of using MAC functions in post-3.0
while `HMAC_CTX` is used internally by `EVP_MD_CTX`, making the latter
redundant.
Aydın Mercan [Tue, 16 Sep 2025 13:11:37 +0000 (15:11 +0200)]
switch isc_md_type_t to a proper enum
Get rid of the OpenSSL-isms that plague the codebase where the hash type
is `EVP_MD *`.
By using a proper enum, alongside the cleanup, we also get the ability
to use constants for known hash sizes instead of having a function call
every time.
`EVP_MD_CTX_get0_md` has been removed instead of being adapted since it
wasn't used anymore.
Ondřej Surý [Thu, 29 Jan 2026 03:29:45 +0000 (04:29 +0100)]
chg: usr: Enable minimal ANY answers by default
ANY queries are widely abused by attackers doing reflection attacks as
they return the largest answers. Enable minimal ANY answers by default
to reduce the attack surface of the DNS servers.
Closes #5723
Merge branch '5723-change-minimal_any-default-to-yes' into 'main'
Ondřej Surý [Wed, 28 Jan 2026 14:04:58 +0000 (15:04 +0100)]
Enable minimal ANY answers by default
ANY queries are widely abused by attackers doing reflection attacks as
they return the largest answers. Enable minimal ANY answers by default
to reduce the attack surface of the DNS servers.
Mark Andrews [Wed, 28 Jan 2026 10:23:48 +0000 (21:23 +1100)]
fix: test: ISC_RUN_TEST_IMPL should use a static declaration
These functions don't need to be called from multiple places and
by making them static we will detect when they are not added to the
list of functions to be tested.
Closes #5715
Merge branch '5715-isc_run_test_impl-should-use-a-static-declaration' into 'main'
Mark Andrews [Fri, 23 Jan 2026 04:57:42 +0000 (15:57 +1100)]
ISC_RUN_TEST_IMPL should use a static declaration
These functions don't need to be called from multiple places and
by making them static we will detect when they are not added to the
list of functions to be tested.
Mark Andrews [Tue, 27 Jan 2026 20:22:59 +0000 (07:22 +1100)]
chg: dev: Use enum rather than numbers for isc_base64_tobuffer and isc_hex_tobuffer
Use isc_one_or_more and isc_zero_or_more rather than (-2) and
(-1) when calling isc_base64_tobuffer. Similarly for
isc_hex_tobuffer. This should help reduce the probability
that the wrong number is used and it makes the intent clearer.
Closes #5713
Merge branch '5713-use-macros-with-isc_base64_tobuffer-and-isc_hex_tobuffer' into 'main'
Mark Andrews [Fri, 23 Jan 2026 03:53:18 +0000 (14:53 +1100)]
Add enum for use with isc_base64_tobuffer and isc_hex_tobuffer
This adds the enum values isc_one_or_more and isc_zero_or_more,
which specify whether one or more or zero or more bytes are required
when reading the unbounded base64 / hex encoded data.
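The benefit of named values over bare numbers can be illustrated with a small analogue. This is a sketch of the idea only, not a port of `isc_hex_tobuffer()`; the `Requirement` enum and `hex_to_bytes()` helper are hypothetical.

```python
import binascii
from enum import Enum, auto


class Requirement(Enum):
    """Hypothetical analogue of isc_zero_or_more / isc_one_or_more."""

    ZERO_OR_MORE = auto()
    ONE_OR_MORE = auto()


def hex_to_bytes(text: str, requirement: Requirement) -> bytes:
    """Decode whitespace-separated hex, enforcing the length requirement."""
    data = binascii.unhexlify("".join(text.split()))
    if requirement is Requirement.ONE_OR_MORE and not data:
        raise ValueError("expected at least one byte of hex data")
    return data
```

A call such as `hex_to_bytes(text, Requirement.ONE_OR_MORE)` states its intent at the call site, which is harder to get wrong than picking between two negative magic numbers.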
Arаm Sаrgsyаn [Tue, 27 Jan 2026 11:32:07 +0000 (11:32 +0000)]
fix: usr: Fix a possible issue with response policy zones and catalog zones
If a response policy zone (RPZ) or a catalog zone contained an
`$INCLUDE` directive, then manually reloading that zone could
fail to process the changes in the response policy or in the
catalog, respectively. This has been fixed.
Closes #5714
Merge branch '5714-zone_loaddone-rpz-and-catz-bugfix' into 'main'
Aram Sargsyan [Mon, 26 Jan 2026 15:34:00 +0000 (15:34 +0000)]
Fix a bug in zone_loaddone()
The zone_loaddone() function disables database notifications for
catalog zones and response policy zones (RPZ) when loading has
failed. However, the 'result != ISC_R_SUCCESS' check is insufficient,
because the DNS_R_SEENINCLUDE result also indicates success.
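The difference between the two checks can be shown with a tiny model. The enum below is a stand-in for the C result codes, and both function names are hypothetical; this is not the actual zone_loaddone() logic.

```python
from enum import Enum, auto


class Result(Enum):
    """Stand-ins for the C result codes mentioned above."""

    ISC_R_SUCCESS = auto()
    DNS_R_SEENINCLUDE = auto()
    ISC_R_FAILURE = auto()


# Both of these results mean the zone was loaded successfully.
SUCCESSFUL_LOADS = frozenset({Result.ISC_R_SUCCESS, Result.DNS_R_SEENINCLUDE})


def load_failed_naive(result: Result) -> bool:
    """Insufficient check: treats DNS_R_SEENINCLUDE as a failure."""
    return result is not Result.ISC_R_SUCCESS


def load_failed(result: Result) -> bool:
    """Check that accepts every result that indicates success."""
    return result not in SUCCESSFUL_LOADS
```

For a zone containing `$INCLUDE`, the naive check reports a failure, which matches the symptom of notifications being disabled even though loading succeeded.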
Nicki Křížek [Tue, 27 Jan 2026 10:46:55 +0000 (11:46 +0100)]
fix: test: Resolve the system_test_dir in pytest
If the system_test_dir contains a symlink, then it might cause issues
further down when using relative_to(), unless it is resolved first. This
has been observed on FreeBSD13 in CI where /home is a symlink to
/usr/home.
Merge branch 'nicki/pytest-freebsd13-artifacts-path' into 'main'
Nicki Křížek [Mon, 26 Jan 2026 17:37:00 +0000 (18:37 +0100)]
Resolve the system_test_dir in pytest
If the system_test_dir contains a symlink, then it might cause issues
further down when using relative_to(), unless it is resolved first. This
has been observed on FreeBSD13 in CI where /home is a symlink to
/usr/home.
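The symlink problem can be reproduced with `pathlib` alone. The sketch below assumes a hypothetical helper name, `relative_artifact_path()`, and is not the code from the test framework; it only shows why resolving both paths first makes `relative_to()` safe.

```python
from pathlib import Path


def relative_artifact_path(artifact: Path, system_test_dir: Path) -> Path:
    """Return `artifact` relative to `system_test_dir`.

    Both paths are resolved first, so a symlink in either one (such as
    /home pointing to /usr/home) cannot make relative_to() raise
    ValueError.
    """
    return artifact.resolve().relative_to(system_test_dir.resolve())
```

Without the `resolve()` calls, a file reached through the real directory is not considered to be inside the symlinked directory, because `relative_to()` compares path components lexically.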
Nicki Křížek [Mon, 26 Jan 2026 12:10:25 +0000 (13:10 +0100)]
fix: test: Fix a race condition in dnssec test
When the dumpdb command is executed, it might take a while until the file is
written. Rather than checking the file once, use the WatchLog mechanism
to allow the desired line to appear before a timeout happens.
This affected test_validation_recovery and test_cache tests which have
been intermittently failing on EL8 in our CI.
Merge branch 'nicki/fix-dnssec-test-dumpdb-race' into 'main'
Nicki Křížek [Mon, 26 Jan 2026 09:45:34 +0000 (10:45 +0100)]
Fix a race condition in dnssec test
When the dumpdb command is executed, it might take a while until the file is
written. Rather than checking the file once, use the WatchLog mechanism
to allow the desired line to appear before a timeout happens.
This affected test_validation_recovery and test_cache tests which have
been intermittently failing on EL8 in our CI.
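The general idea of waiting for a line instead of checking the file once can be sketched as a polling loop. This is not the WatchLog implementation or its API; `wait_for_line()` and its parameters are hypothetical and only illustrate the approach.

```python
import time
from pathlib import Path


def wait_for_line(
    path: Path, needle: str, timeout: float = 10.0, interval: float = 0.1
) -> bool:
    """Poll `path` until it contains `needle` or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        # The file may not exist yet if the dump has not started.
        if path.exists() and needle in path.read_text(errors="replace"):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A test would call this right after triggering the dump and fail only if the function returns `False`, which tolerates slow writers without adding a fixed sleep.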