Nicki Křížek [Fri, 19 Jun 2026 11:50:58 +0000 (13:50 +0200)]
chg: ci: Raise respdiff third-party and recent-named disagreement thresholds
The third-party comparison is right at the 0.4 % threshold, making it
fail quite often. Observed values currently range from 0.3 % to
0.5 %. Raise the threshold to 0.5 %.
The recent-named comparison produces values ranging from 0.05 to 0.12 %,
but appears to be more sensitive to time of the day when the test runs.
Raise the threshold to 0.15 %.
Both results are stable within the specified ranges across the last
three releases in both main and bind-9.20 series.
Merge branch 'nicki/respdiff-threshold-bump' into 'main'
Nicki Křížek [Fri, 19 Jun 2026 11:09:42 +0000 (11:09 +0000)]
Raise respdiff third-party and recent-named disagreement thresholds
The third-party comparison is right at the 0.4 % threshold, making it
fail quite often. Observed values currently range from 0.3 % to
0.5 %. Raise the threshold to 0.5 %.
The recent-named comparison produces values ranging from 0.05 to 0.12 %,
but appears to be more sensitive to time of the day when the test runs.
Raise the threshold to 0.15 %.
Both results are stable within the specified ranges across the last
three releases in both main and bind-9.20 series.
Matthijs Mekking [Fri, 19 Jun 2026 09:07:10 +0000 (09:07 +0000)]
fix: usr: CDS/CDNSKEY records were not removed when re-configuring the server
When the DNSSEC policy changes on an ``rndc reconfig`` in a way that changes the expected ``CDNSKEY`` and/or ``CDS`` records in the zone, the RRset should be updated accordingly. This did not happen when removing digests from the configuration, or when setting ``cdnskey no;``. This has been fixed.
Closes #6166
Merge branch '6166-reconfig-delete-cds' into 'main'
Matthijs Mekking [Wed, 17 Jun 2026 15:34:42 +0000 (17:34 +0200)]
Remove CDS/CDNSKEY records on reconfig
When adding to dnssec-policy:
cdnskey no;
cds-digest-types { };
and then reconfiguring the server, named must remove existing CDS and CDNSKEY
records. Note this already worked when adding a CDS digest, or setting
'cdnskey yes;', but not when digests were removed from the list, or
when setting 'cdnskey no;'.
Matthijs Mekking [Wed, 17 Jun 2026 15:29:51 +0000 (17:29 +0200)]
Add system test for reconfiguring CDS/CDNSKEY
When on an 'rndc reconfig' the DNSSEC policy changes such that it
changes the expected CDNSKEY/CDS records in the zone, the RRset should
be updated accordingly.
Add a test case where we reconfigure a zone with a policy such that
these records should be removed, and on a second reconfigure add
them back again.
Note the test deliberately adds a different CDS digest on the
second reconfigure.
Evan Hunt [Thu, 18 Jun 2026 18:40:03 +0000 (18:40 +0000)]
fix: dev: Update dnssec validation test to match new behavior
Some of the tests in `dnssec/tests_validation.py` worked by iterating through the response message looking for failure conditions, such as excessively high TTL values. In some cases, previous changes caused additional data not to be returned. Since there was nothing to iterate, the tests still "passed".
Tests that don't make sense anymore have been removed. Other tests that iterate through responses have been updated with checks to ensure that the responses actually do contain data.
Merge branch 'each-cleanup-validation-test' into 'main'
Evan Hunt [Wed, 17 Jun 2026 18:48:42 +0000 (11:48 -0700)]
update tests_validation.py test for new behavior
Some of the tests in dnssec/tests_validation.py worked by iterating
through the response message looking for failure conditions, such as
excessively high TTL values. In some cases, previous changes caused
additional data not to be returned. Since there was nothing to
iterate, the tests still "passed".
Tests that don't make sense anymore have been removed. Other tests that
iterate through responses have been updated with checks to ensure that
the responses actually do contain data.
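The failure mode described above can be sketched in a few lines. This is a minimal illustration with plain Python lists standing in for response sections; `check_ttls`, `check_ttls_strict` and the `(name, ttl)` tuples are hypothetical and are not the helpers used in the test suite.

```python
def check_ttls(rrsets, max_ttl):
    """Loop-only check: passes vacuously when rrsets is empty."""
    for _name, ttl in rrsets:
        assert ttl <= max_ttl, f"TTL {ttl} exceeds {max_ttl}"


def check_ttls_strict(rrsets, max_ttl):
    """Same check, but first require that there is something to inspect."""
    assert rrsets, "expected a non-empty section"
    check_ttls(rrsets, max_ttl)


# An empty section slips through the loop-only check: no assertion ever runs.
check_ttls([], max_ttl=300)

# The strict variant rejects the same input.
try:
    check_ttls_strict([], max_ttl=300)
    strict_rejected_empty = False
except AssertionError:
    strict_rejected_empty = True
```

The point of the guard is that a response which unexpectedly lost its data turns into a test failure instead of a silent pass.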
Evan Hunt [Wed, 17 Jun 2026 18:47:45 +0000 (11:47 -0700)]
add isctest.check functions for section empty or non-empty
expand on the isctest.check.empty_answer() function, adding
empty_authority(), empty_additional(), has_answer(), has_authority(),
and has_additional().
Evan Hunt [Thu, 18 Jun 2026 17:47:51 +0000 (17:47 +0000)]
fix: usr: Check wildcard signer and NOQNAME signer match
A positive wildcard answer, and the NSEC3 proof that the requested
name doesn't exist in the zone, must both be from the same zone.
Otherwise, an NSEC3 from an ancestor zone could be used to interfere
with validation.
We now retrieve the signer name from a wildcard response's signature.
An NSEC3 record cannot be used as a NOQNAME proof for the
wildcard unless it exactly matches the name one level above the NSEC3.
Closes #5971
Merge branch '5971-wildcard-noqname-mismatch' into 'main'
Evan Hunt [Tue, 16 Jun 2026 19:06:21 +0000 (12:06 -0700)]
Check wildcard signer and NOQNAME signer match
A positive wildcard answer, and the NSEC3 proof that the requested
name doesn't exist in the zone, must both be from the same zone.
Otherwise, an NSEC3 from an ancestor zone could be used to interfere
with validation.
We now retrieve the signer name from a wildcard response's signature.
An NSEC3 record cannot be used as a NOQNAME proof for the wildcard
unless it exactly matches the name one level above the NSEC3.
Alessio Podda [Tue, 16 Jun 2026 10:07:49 +0000 (12:07 +0200)]
Reject external referrals for forward zones
Apply the existing name_external() bailiwick check to NS RRsets
processed as referrals in rctx_authority_negative(), and enforce the
same check again in rctx_referral() before caching or following the
delegation.
This prevents a forward-first forwarder from installing a parent
zone-cut above the configured forward zone via an authority-section
NS RRset.
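The idea behind such a bailiwick check can be sketched as follows. This is an illustrative model only: `labels`, `is_subdomain` and `referral_allowed` are hypothetical helpers working on plain strings, and they do not reproduce the logic of name_external().

```python
def labels(name):
    """Split a domain name into lowercase labels, ignoring the trailing dot.

    Simplification: escaped dots inside labels are not handled.
    """
    name = name.rstrip(".").lower()
    return name.split(".") if name else []


def is_subdomain(name, zone):
    """True if name is equal to or below zone."""
    n, z = labels(name), labels(zone)
    return len(n) >= len(z) and n[len(n) - len(z):] == z


def referral_allowed(ns_owner, forward_zone):
    """Accept a referral only if the NS owner name is within the forward zone."""
    return is_subdomain(ns_owner, forward_zone)
```

With a forward zone of `fwd.example.`, a referral for `sub.fwd.example.` passes this check, while one for the parent `example.` does not.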
Andoni Duarte [Thu, 18 Jun 2026 11:29:23 +0000 (11:29 +0000)]
chg: ci: Migrate Mattermost notifications to Zulip in CI
Since internal communications are now Zulip based, CI jobs now target
Zulip instead of Mattermost. The `MATTERMOST_WEBHOOK_URL` environment
variable is no longer needed, scripts now use `ZULIP_SERVER_URL` and
`ZULIP_API_KEY`.
In order to harmonize Zulip messaging, `message_zulip.py` is used where
curl calls to the webhook were previously used.
Merge branch 'andoni/mattermost-to-zulip-migration' into 'main'
Migrate Mattermost notifications to Zulip in .gitlab-ci.yml
Since internal communications are now Zulip based, CI jobs now target
Zulip instead of Mattermost. The MATTERMOST_WEBHOOK_URL environment
variable is no longer needed, scripts now use ZULIP_SERVER_URL and
ZULIP_API_KEY.
In order to harmonize Zulip messaging, message_zulip.py is used where
curl calls to the webhook were previously used.
Colin Vidal [Thu, 18 Jun 2026 07:21:26 +0000 (09:21 +0200)]
fix: usr: Fix recursion loop in case of badly behaving forwarders
When forwarding DNS queries, the CD bit is cleared on the first query, and the CD bit is only used as a fallback if the first query fails. However, due to a logic bug this could lead to an unbounded loop re-sending the same message, until the maximum query count is hit. This has been fixed.
Closes #5804
Merge branch '5804-resend-loop-forwarder-cd' into 'main'
Colin Vidal [Wed, 8 Apr 2026 10:04:57 +0000 (12:04 +0200)]
Avoid resend loop when forwarder SERVFAILs with both CD=0 and CD=1
Commit `36cf1c6a5bf943ad718ddba9fbe6ea97810e3bc2` introduces the
`DNS_FETCHOPT_TRYCD` flag which enables, when sending a query to a
forwarder, the forwarder to validate the answer (CD=0). The crux is
that if for some reason the forwarder returns SERVFAIL, we can retry the
same query and disable the forwarder validation (CD=1) so the resolver
can attempt validation itself (or detect it's bogus).
The logic was to first set `DNS_FETCHOPT_TRYCD` to the query options but
not on the message (so CD=0), and, when getting a SERVFAIL answer, if
the option `DNS_FETCHOPT_TRYCD` was set, to also set it into the
message. However, there was no way to know if this was the first (or
second) query because the original message is discarded when getting the
answer. This can lead to an unbounded loop re-sending the same message
again and again (until the global query count stops it).
This is fixed by using two separate flags. `DNS_FETCHOPT_TRYNOCD` is set on
the query options for the very first query. Then, if it returns SERVFAIL, we
check if `DNS_FETCHOPT_TRYNOCD` is set but `DNS_FETCHOPT_TRYCD` is not.
In this case, we know we're about to send the second query. If it also
fails, `DNS_FETCHOPT_TRYCD` will be set anyway, so there is no point
retrying. This breaks the unbounded loop.
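The two-flag retry logic can be modeled roughly as below. This is a sketch under stated assumptions, not the resolver code: the bit values, `resolve_via_forwarder` and the `send` callable are hypothetical, and rcodes are plain strings.

```python
TRYNOCD = 0x1  # hypothetical bit values, for illustration only
TRYCD = 0x2


def resolve_via_forwarder(send, options=TRYNOCD):
    """Send at most two queries: first with CD=0, then once with CD=1.

    `send(cd)` is a caller-supplied callable returning an rcode string.
    Returns (rcode, number_of_queries_sent).
    """
    sent = 0
    while True:
        cd = bool(options & TRYCD)
        rcode = send(cd)
        sent += 1
        if rcode != "SERVFAIL":
            return rcode, sent
        if options & TRYNOCD and not options & TRYCD:
            # The first attempt (CD=0) failed: retry once with CD=1.
            options |= TRYCD
            continue
        # TRYCD is already set, so this was the second attempt: give up.
        return rcode, sent
```

Because the decision depends only on the two flags, a forwarder that answers SERVFAIL to both variants stops the exchange after the second query.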
OBSERVED BEHAVIOR
The malicious server receives tens of thousands of resend packets
within seconds. CPU usage of the named worker thread remains elevated
(50–100% of one core) until the default fetch timeout (~10 seconds)
terminates the request. Instrumentation during testing confirmed that
isc_counter_used(fctx->qc) remains constant (value 1) throughout the
entire resend loop.
Colin Vidal [Thu, 18 Jun 2026 06:48:29 +0000 (08:48 +0200)]
fix: usr: Cache glue only for enabled address families
When caching delegation NS data, only use A/AAAA glue records if the resolver has the corresponding IPv4/IPv6 dispatcher configured. If IPv4 or IPv6 is disabled, ignore glue for that family and fall back to caching the nameserver name if there is no glue from the other supported family.
Merge branch 'colin/glues-supported-stack' into 'main'
Colin Vidal [Wed, 22 Apr 2026 14:54:24 +0000 (16:54 +0200)]
Cache glue only for enabled address families
When caching delegation NS data, only use A/AAAA glue records if the
resolver has the corresponding IPv4/IPv6 dispatcher configured. If IPv4
or IPv6 is disabled, ignore glue for that family and fall back to
caching the nameserver name if there is no glue from the other supported
family.
The new `cache_delegns` system test covers delegation NS caching
with dual-stack, IPv4-only, and IPv6-only resolver configurations. It also
sets up an authoritative server with zones with A-only, AAAA-only, and
dual-stack glue, which are all queried, and checks the delegation
database dump to confirm that the cached delegation data correspond to
the resolver configuration.
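The selection rule can be sketched as a small function. This is a simplified model, assuming one nameserver per call; `delegation_entry` and its dictionary result are hypothetical and do not mirror the delegation database format.

```python
def delegation_entry(ns_name, glue_a, glue_aaaa, have_ipv4, have_ipv6):
    """Return what would be cached for one nameserver of a delegation.

    Glue is kept only for address families the resolver can use; if none
    remains, fall back to the nameserver name.
    """
    addresses = []
    if have_ipv4:
        addresses.extend(glue_a)
    if have_ipv6:
        addresses.extend(glue_aaaa)
    if addresses:
        return {"addresses": addresses}
    return {"name": ns_name}
```

For example, an IPv4-only resolver that receives AAAA-only glue ends up caching just the nameserver name.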
Ondřej Surý [Thu, 18 Jun 2026 05:13:42 +0000 (07:13 +0200)]
fix: dev: Fix invalid pointer release in JSON statistics-channel response
Each response served on a JSON statistics endpoint released the wrong
pointer to the JSON library after the response was sent: the response
body string instead of the JSON document. With the current responses
this does not crash named in practice, but the call is incorrect and
can in principle corrupt memory. XML responses are not affected.
Closes #6024
Merge branch '6024-statschannel-json-response-invalid-free' into 'main'
Ondřej Surý [Thu, 21 May 2026 09:07:32 +0000 (11:07 +0200)]
Fix invalid free in statistics-channel JSON renderer
wrap_jsonfree() called json_object_put() on the response-body buffer
base, which is the JSON string returned by
json_object_to_json_string_ext(), not a struct json_object. The root
object is already passed in as the callback argument; release only
that.
Ondřej Surý [Thu, 21 May 2026 08:47:15 +0000 (10:47 +0200)]
Add regression test for statistics-channel JSON free bug
Hit every JSON statistics endpoint several times. The current code
calls json_object_put() on the response-body string pointer, which
doesn't crash just by accident - the memory position contains a large
value from a static string.
Ondřej Surý [Fri, 12 Jun 2026 13:43:16 +0000 (15:43 +0200)]
Re-apply the picohttpparser.c patch
This:
- makes sure all variables are initialized
- adds missing curly braces for single line statements
- uses a proper comment for fallthrough case statements
Michal Nowak [Mon, 1 Jun 2026 14:37:12 +0000 (16:37 +0200)]
Sync picohttpparser.c with upstream commit a875a01
Pull in upstream commit a875a01 from h2o/picohttpparser: enforce use of
CRLF in chunk headers, by rejecting bare CR / LF. Replaces the lenient
CHUNKED_IN_CHUNK_CRLF state with strict CHUNKED_IN_CHUNK_HEADER_EXPECT_LF,
CHUNKED_IN_CHUNK_DATA_EXPECT_CR and CHUNKED_IN_CHUNK_DATA_EXPECT_LF states.
Ondřej Surý [Wed, 17 Jun 2026 20:30:13 +0000 (22:30 +0200)]
fix: nil: Allocate work threads from their owning loop's memory context
The per-loop worker threads allocated their state from the loop manager's
memory context instead of the per-loop context that owns them. Allocate
from the owning loop's context and hold a reference to that loop for the
thread's lifetime, matching the context handling already used on the
work-enqueue and completion paths.
Already changed as part of 9.20 backport.
Merge branch 'ondrej/rewrite-threadpool-fixups' into 'main'
Ondřej Surý [Wed, 17 Jun 2026 19:16:50 +0000 (21:16 +0200)]
Run the work asynchronously when shutting down
Instead of running the work directly, run it asynchronously to prevent
dead-locks when the isc_work is scheduled from inside a lock and the
job itself is using locking.
Ondřej Surý [Wed, 17 Jun 2026 18:37:55 +0000 (20:37 +0200)]
Allocate work threads from their owning loop's memory context
A per-loop work thread referenced the loop manager's memory context,
which is not the context that backs the loop the thread serves. Pass
the owning loop instead and allocate from loop->mctx, keeping a loop
reference for the thread's lifetime. This matches how isc_work_enqueue
and work_done already obtain the context from the loop, and the
teardown uses loop->mctx before dropping the reference.
Ondřej Surý [Wed, 17 Jun 2026 17:07:21 +0000 (19:07 +0200)]
chg: dev: Rework isc_work as per-loop, per-lane cancelable worker threads
Fold the libuv thread pool and the per-loop isc_helper threads into a single
isc_work pool. Each (loop, lane) gets its own SPSC queue and worker, which drops
the shared-queue contention, and the FAST/SLOW lanes keep short crypto tasks off
the long blocking ones (zone dump/load, xfrin). isc_work jobs are now cancelable:
isc_work_cancel() tombstones a still-queued job and its after_cb fires with
ISC_R_CANCELED, so abandoned work can be dropped instead of run to completion.
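The tombstone idea can be illustrated with a single-threaded toy model. It is only a sketch: `Job`, `Lane`, and the string result codes are hypothetical stand-ins, and nothing here models the real threads, the SPSC queue, or the isc_work API.

```python
from collections import deque

SUCCESS, CANCELED = "success", "canceled"  # stand-ins for result codes


class Job:
    def __init__(self, work_cb, after_cb):
        self.work_cb = work_cb
        self.after_cb = after_cb
        self.canceled = False


class Lane:
    """One private FIFO queue, drained by a single worker."""

    def __init__(self):
        self.queue = deque()

    def enqueue(self, work_cb, after_cb):
        job = Job(work_cb, after_cb)
        self.queue.append(job)
        return job

    def cancel(self, job):
        # Tombstone: the job stays queued but its work will not run.
        job.canceled = True

    def drain(self):
        while self.queue:
            job = self.queue.popleft()
            if job.canceled:
                job.after_cb(CANCELED)
            else:
                job.work_cb()
                job.after_cb(SUCCESS)
```

Leaving the canceled job in the queue keeps cancellation cheap, and the completion callback still fires exactly once so the caller can release its resources.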
Merge branch 'ondrej/rewrite-threadpool' into 'main'
Ondřej Surý [Tue, 9 Jun 2026 04:12:58 +0000 (06:12 +0200)]
Replace the shared work pool with per-loop, per-lane worker threads
Offloaded work used two different mechanisms: a per-loop isc_helper
thread for CPU-bound crypto (DNSSEC validation, message signature
checks) and the process-global libuv thread pool for blocking I/O (zone
load and dump, inbound transfer apply). Neither could cancel a queued
task, and the two disagreed about exclusive mode — the helper paused
with its loop under isc_loopmgr_pause() but the libuv pool did not, so
blocking offloaded work kept running while a loop held the exclusive
lock.
Unify both behind isc_work: each loop gets its own worker thread per
lane — FAST for short, bounded tasks and SLOW for long, blocking ones —
fed by a private queue. Separate lanes keep a short crypto task off the
path of a multi-second zone dump once both run on per-loop workers;
every lane parks with isc_loopmgr_pause() so exclusive mode now quiesces
offloaded work too; and a still-queued task can be canceled before it
starts (isc_work_cancel). isc_helper is removed and its callers select a
lane.
Nicki Křížek [Wed, 17 Jun 2026 15:42:35 +0000 (17:42 +0200)]
chg: ci: Adjust allow_failure in cross-version-config tests
The failures caused by #6007 are no longer happening.
However, in the meantime, other tests have begun to fail due to https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/12163/diffs?commit_id=9771df0aca5ce399ad0535f656c403a58d674606
Merge branch 'nicki/revert-allow-failure-after-9.21.23' into 'main'
Colin Vidal [Wed, 17 Jun 2026 09:24:32 +0000 (11:24 +0200)]
fix: usr: Fix CNAME resolution failure caused by a cached SERVFAIL response
Under certain circumstances, a cached SERVFAIL response could incorrectly prevent successful resolution of a CNAME target. This could cause resolution failures to persist until the cached SERVFAIL entry expired, even when the CNAME target itself was otherwise resolvable. This issue has been fixed.
Closes #5983
Merge branch '5983-servfail-cache-cname' into 'main'
Colin Vidal [Mon, 1 Jun 2026 14:58:45 +0000 (16:58 +0200)]
Use original query name when caching SERVFAIL
Instead of using `client->query.qname` when caching a SERVFAIL answer,
use `client->query.origqname` when available.
This avoids caching a SERVFAIL against a CNAME target when the failure
occurs while the resolver is following the CNAME chain. This is
problematic, for instance, when the SERVFAIL is triggered by the
`max-query-count` threshold being reached, which would incorrectly
prevent legitimate resolution of the CNAME target while in the SERVFAIL
cache.
Note that if the SERVFAIL genuinely originated from resolving the CNAME
target, that specific failure will no longer be cached, and a direct
query for the CNAME target will trigger a fresh (likely failing)
resolution attempt. However, this is still preferable to the previous
behaviour, which would wrongly prevent resolving the CNAME target if it
was cached for other reasons (like the example above).
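The effect of the key choice can be shown with a toy cache. This is a minimal sketch: `servfail_cache_key` and the set-based cache are hypothetical, and only the preference for the original query name comes from the text above.

```python
def servfail_cache_key(qname, origqname=None):
    """Pick the name under which a SERVFAIL is cached.

    origqname is the name the client asked for; qname may have been
    rewritten to a CNAME target while following the chain.
    """
    return origqname if origqname is not None else qname


servfail_cache = set()

# The client asked for alias.example., a CNAME to target.example.;
# the failure happened while resolving the target.
servfail_cache.add(servfail_cache_key("target.example.", "alias.example."))
```

After this, a direct query for `target.example.` is not blocked by the cached failure, since only `alias.example.` is in the cache.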
Evan Hunt [Wed, 17 Jun 2026 02:42:40 +0000 (02:42 +0000)]
rem: usr: remove the secondary validator in query.c
Previously, when the additional section of a response was being
populated, if cached data was found with pending trust, it would be
opportunistically validated. The code implementing this validation was
not quite formally correct. Rather than fixing it, the code has been
removed: RRsets with pending trust are now omitted from responses.
Closes #5966
Closes #5968
Closes #5972
Merge branch 'each-remove-lightweight-validator' into 'main'
Evan Hunt [Mon, 15 Jun 2026 23:28:25 +0000 (16:28 -0700)]
Merge DNSSEC wildcard tests
Merge the tests for #5966 (F-043) and #5972 (F-045), previously called
dnssec_wildcard_additional and dnssec_replayed_parent_wildcard, into a
single directory with two modules.
Evan Hunt [Thu, 11 Jun 2026 19:04:40 +0000 (12:04 -0700)]
Remove the secondary validator in query.c
Previously, when the additional section of a response was being
populated, if cached data was found with pending trust, it would be
opportunistically validated. The code implementing this validation was
not quite formally correct; rather than fixing it, the code has been
removed; RRsets with pending trust are now omitted from responses.
Vicky Risk [Tue, 16 Jun 2026 15:26:01 +0000 (11:26 -0400)]
fix: doc: Edit SECURITY.md to remove references to bind-security@isc.org
We no longer want to encourage people to open new issues via email because we are getting too many spammy reports generated by LLMs. By requiring people to actually log in to GitLab to make their report, they will (hopefully) see our reporting template, and think at least a little bit about whether they are making a well-considered, valid report.
Ondřej Surý [Mon, 18 May 2026 05:53:33 +0000 (07:53 +0200)]
Use SIEVE for TSIG generated-key LRU
The generated-key cache used a strict LRU: every lookup hit promoted
the key to the tail of the list under the write lock, even when the
caller arrived holding only the read lock. That promotion was both
a lock upgrade and a list reshuffle on the hot path.
Replace the LRU with the SIEVE eviction policy. Lookups now do a
lock-free atomic mark of the visited bit; eviction work moves to
insertion time, which already holds the write lock.
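For readers unfamiliar with SIEVE, a compact single-threaded sketch of the policy follows. The class and method names are hypothetical and the code does not model the read/write locking or the atomic visited bit; it only shows that a hit marks the entry while all list manipulation happens on insertion.

```python
class _Node:
    __slots__ = ("key", "value", "visited", "prev", "next")

    def __init__(self, key, value):
        self.key, self.value = key, value
        self.visited = False
        self.prev = None  # towards the head (newer entries)
        self.next = None  # towards the tail (older entries)


class SieveCache:
    def __init__(self, capacity):
        assert capacity > 0
        self.capacity = capacity
        self.nodes = {}
        self.head = self.tail = self.hand = None

    def get(self, key):
        node = self.nodes.get(key)
        if node is None:
            return None
        node.visited = True  # a hit only marks the entry; the list is untouched
        return node.value

    def put(self, key, value):
        node = self.nodes.get(key)
        if node is not None:
            node.value = value
            node.visited = True
            return
        if len(self.nodes) >= self.capacity:
            self._evict()
        node = _Node(key, value)
        node.next = self.head
        if self.head is not None:
            self.head.prev = node
        self.head = node
        if self.tail is None:
            self.tail = node
        self.nodes[key] = node

    def _evict(self):
        node = self.hand or self.tail
        while node.visited:
            node.visited = False  # give a second chance
            node = node.prev or self.tail  # wrap around from head to tail
        self.hand = node.prev  # None means the next eviction starts at the tail
        if node.prev is not None:
            node.prev.next = node.next
        else:
            self.head = node.next
        if node.next is not None:
            node.next.prev = node.prev
        else:
            self.tail = node.prev
        del self.nodes[node.key]
```

Unlike a strict LRU, a lookup never reorders the list, which is what allows the hit path to avoid the write lock.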
Ondřej Surý [Tue, 16 Jun 2026 12:01:33 +0000 (14:01 +0200)]
fix: test: Fix the rpz servfail-until-ready test when zone updates run serially
The rpz servfail-until-ready test assumed a particular policy zone always
finished loading last, which only holds when zone updates run in parallel;
on a single CPU (or with serialized offload) it could fail spuriously. It now
polls until RPZ reports ready instead.
Merge branch 'ondrej/fix-rpz-system-test-on-single-cpu' into 'main'
Ondřej Surý [Tue, 16 Jun 2026 10:39:56 +0000 (12:39 +0200)]
Poll for RPZ readiness in the servfail-until-ready test
RPZ is ready only once every policy zone has completed its first update,
and the zones do not finish in a fixed order. Whenever the updates
run serially (per-loop offload, or any single-CPU run), the 'slow-rpz' zone
can finish before the others and the query still gets SERVFAIL. Poll
the query until it returns NOERROR instead.
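The polling approach can be sketched generically. This is a minimal illustration: `poll_until` and its parameters are hypothetical, and `query` stands for any callable that sends the query and returns the rcode as a string.

```python
import time


def poll_until(query, expected="NOERROR", attempts=10, delay=1.0):
    """Call query() until it returns `expected` or attempts run out.

    Returns the number of calls made on success; raises AssertionError
    otherwise.
    """
    for attempt in range(1, attempts + 1):
        if query() == expected:
            return attempt
        time.sleep(delay)
    raise AssertionError(f"no {expected} after {attempts} attempts")
```

Polling on the observable result removes the dependency on which zone happens to finish loading last.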
Nicki Křížek [Tue, 16 Jun 2026 08:28:36 +0000 (10:28 +0200)]
new: test: Add AnsInstance as the ans counterpart of NamedInstance
Tests interacting with mock ans servers had to hardcode their IP
addresses and open ans.run directly, while named instances already
had the NamedInstance abstraction with `.ip`, `.log` and the
watch_log_*() helpers. Factor the parts of NamedInstance that are
not named-specific into a ServerInstance base class and add an
AnsInstance subclass for ans servers, exposed through the `servers`
fixture and new ans1-ans11 convenience fixtures.
Assisted-by: Claude:claude-fable-5
Merge branch 'nicki/pytest-ans-instance' into 'main'
Nicki Křížek [Thu, 11 Jun 2026 14:57:58 +0000 (14:57 +0000)]
Add AnsInstance as the ans counterpart of NamedInstance
Tests interacting with mock ans servers had to hardcode their IP
addresses and open ans.run directly, while named instances already
had the NamedInstance abstraction with `.ip`, `.log` and the
watch_log_*() helpers. Factor the parts of NamedInstance that are
not named-specific into a ServerInstance base class and add an
AnsInstance subclass for ans servers, exposed through the `servers`
fixture and new ans1-ans11 convenience fixtures.
Ondřej Surý [Mon, 15 Jun 2026 18:05:17 +0000 (20:05 +0200)]
fix: nil: Fix a latent NULL dereference in the DoH client request helper
isc__nm_http_request()'s error path reloaded sock->h2->connect.cstream after client_send() had already detached and freed it on a submit failure, dereferencing NULL. The helper is only used by the DoH unit tests. Guard the cleanup path against the detached stream.
Closes #6160
Merge branch '6160-fix-latent-NULL-dereference-in-http2' into 'main'
Ondřej Surý [Mon, 15 Jun 2026 16:09:37 +0000 (18:09 +0200)]
Guard against a detached stream in isc__nm_http_request() error path
On a submit failure, client_send() nullifies sock->h2->connect.cstream
and frees the stream before returning the error. The error: label in
isc__nm_http_request() reloaded that pointer and dereferenced it
unconditionally, reading through a NULL stream. The function is only
used by the DoH unit tests -- production DoH client send goes through
isc__nm_http_send()/client_httpsend(), whose submit failure is reported
via the NULL-safe send callback -- so this is a latent defect in the
test helper rather than a reachable named crash.
Skip the read callback when the stream has already been detached and
let the caller report the failure from the error result it receives.
Arаm Sаrgsyаn [Mon, 15 Jun 2026 11:35:12 +0000 (11:35 +0000)]
fix: usr: Fix a 'deny-answer-aliases' configuration bypass issue
It was possible to use a maliciously crafted authoritative
zone to make the :iscman:`named` resolver synthesize a ``DNAME``
"alias" that should have been rejected by the configured
:any:`deny-answer-aliases` option. This has been fixed.
Closes #5930
Merge branch '5930-deny-answer-aliases-and-cached-dname-buf-fix' into 'main'
Aram Sargsyan [Mon, 18 May 2026 09:17:51 +0000 (09:17 +0000)]
Fix a 'deny-answer-aliases' bug when using a cached DNAME
When using a cached DNAME to resolve a name, make sure to consult
the denied answers lists, otherwise it is possible to construct
a restricted alias by caching a DNAME that is a parent of the
denied alias. See the comments in the test case from the previous
commit for an example.
Aram Sargsyan [Mon, 18 May 2026 09:14:27 +0000 (09:14 +0000)]
Add a new 'deny-answer-aliases' check in 'resolver' system test
This new check exercises an attack against guarantees given by the
'deny-answer-aliases' configuration option by caching a DNAME
that is a parent of the restricted alias, and then "constructing"
the restricted alias from the cache.
Ondřej Surý [Fri, 12 Jun 2026 15:40:20 +0000 (17:40 +0200)]
fix: nil: Remove isc_mem_strndup()
The isc_mem_strndup() function had a single caller, the HTTP/2
request-path handling, which now uses isc_mem_allocate() and strlcpy()
directly. Remove the function from the libisc API.
Closes #6087
Merge branch '6087-remove-isc_mem_strndup' into 'main'
Ondřej Surý [Fri, 12 Jun 2026 14:17:46 +0000 (16:17 +0200)]
Remove isc_mem_strndup()
The function had a single caller, the HTTP/2 request-path handling in
the network manager, and its semantics (strlen() of the source clamped
to the requested size) amounted to an obscured bounded string copy.
Replace the only use with a plain allocation and strlcpy(), and drop the
function.
Arаm Sаrgsyаn [Fri, 12 Jun 2026 15:36:57 +0000 (15:36 +0000)]
fix: usr: Fix a zone transfer over TLS (XoT) issue when using the opportunistic TLS mode
The :iscman:`named` process, running as a secondary DNS server
configured to transfer a zone from a primary server using an
encrypted XoT transport in opportunistic TLS mode (i.e. without
peer certificate/hostname validation), could terminate unexpectedly
when the TLS ALPN negotiation with the primary server was unsuccessful.
This has been fixed.
Closes #5957
Merge branch '5957-xot-xfrin_connect_done-bug-fix' into 'main'
Aram Sargsyan [Fri, 22 May 2026 11:31:37 +0000 (11:31 +0000)]
Fix a bug in xfrin.c:xfrin_connect_done()
When the connect callback's result is ISC_R_SUCCESS and the callback
changes the result because of some condition, the 'xfr' should not
be detached, because it now belongs to the receive callback.
Detach the reference only if the callback's result is non-success.
Aram Sargsyan [Fri, 22 May 2026 11:27:54 +0000 (11:27 +0000)]
Add a check for the "doth" system test
Configure a zone transfer using XoT (with opportunistic TLS) from
a non-DoT port, which does not provide ALPN "dot" (in this case
it will try to connect to a DoH port). This is expected to fail,
but the client should handle the error gracefully and not crash.
Colin Vidal [Fri, 12 Jun 2026 14:50:23 +0000 (16:50 +0200)]
fix: dev: Fix delegdb dump buffer overflow
A buffer used to dump a DNS name in the delegdb dump flow was using the
wrong size: it was using `DNS_NAME_MAXWIRE` which is the actual max
length of a DNS name on the wire instead of using `DNS_NAME_FORMATSIZE`
which is the maximum length of a textual representation of a DNS name
(which can be way longer than `DNS_NAME_MAXWIRE` if using the master
file escape sequence format) plus 1 (end of string byte). This could
lead to a buffer overflow. This is now fixed.
Closes #6132
Merge branch '6132-delegdb-dump-overflow' into 'main'
Colin Vidal [Fri, 5 Jun 2026 09:58:02 +0000 (11:58 +0200)]
Fix delegdb dump buffer overflow
A buffer used to dump a DNS name in the delegdb dump flow was using the
wrong size: it was using `DNS_NAME_MAXWIRE` which is the actual max
length of a DNS name on the wire instead of using `DNS_NAME_FORMATSIZE`
which is the maximum length of a textual representation of a DNS name
(which can be way longer than `DNS_NAME_MAXWIRE` if using the master
file escape sequence format) plus 1 (end of string byte). This could
lead to a buffer overflow. This is now fixed.
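The size gap between the two representations can be checked with a little arithmetic. This sketch assumes the worst case where every octet of the name is written as a four-character `\DDD` escape; the helper names are hypothetical, and the 63-octet label and 255-octet name limits are those of RFC 1035.

```python
def escaped_text_length(label_lengths):
    """Text length of a name whose every octet needs a \\DDD escape.

    Each octet becomes four characters, and each label is followed by a dot.
    """
    return sum(4 * n + 1 for n in label_lengths)


def wire_length(label_lengths):
    """Wire length: one length octet per label, the data, and the root label."""
    return sum(1 + n for n in label_lengths) + 1


# A name that exactly fills the 255-octet wire limit.
LABELS = [63, 63, 63, 61]
```

For these labels the wire form is 255 octets while the fully escaped text form is 1004 characters, before counting the terminating byte, so a buffer sized for the wire form is far too small for the text form.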
Colin Vidal [Fri, 5 Jun 2026 09:55:07 +0000 (11:55 +0200)]
Add test for delegdb dump with very long name
Add a delegdb test which dumps a database that contains a very long name
(using DNS master file format with escape sequences as defined in RFC
1035). This ensures that the delegdb uses large enough internal buffers
to load the names into the DB and generate the dump. If this is not the case,
the test crashes on a build with AddressSanitizer enabled.
Nicki Křížek [Fri, 12 Jun 2026 12:37:21 +0000 (14:37 +0200)]
new: test: Add a system test cookbook
The README documents what the framework is; the cookbook documents how
to get common tasks done with it: iterating on a single test, adding a
new test directory, writing a regression reproducer, mocking a
misbehaving server with isctest.asyncserver, signing zones in
bootstrap(), and driving named via the NamedInstance fixtures. All
recipes are distilled from existing tests (cyclic_glue, dnssec_py,
dispatch) so they reflect the current canonical patterns.
Assisted-by: Claude:claude-fable-5
Merge branch 'nicki/systest-cookbook' into 'main'
Nicki Křížek [Thu, 11 Jun 2026 09:11:18 +0000 (09:11 +0000)]
Add a system test cookbook
The README documents what the framework is; the cookbook documents how
to get common tasks done with it: iterating on a single test, adding a
new test directory, writing a regression reproducer, mocking a
misbehaving server with isctest.asyncserver, signing zones in
bootstrap(), and driving named via the NamedInstance fixtures. All
recipes are distilled from existing tests (cyclic_glue, dnssec_py,
dispatch) so they reflect the current canonical patterns.
Michal Nowak [Fri, 12 Jun 2026 10:02:45 +0000 (12:02 +0200)]
new: ci: Enforce AI commit-trailer rules in danger checks
`CONTRIBUTING.md` documents several rules around how AI coding assistants should (and should not) be attributed in commit messages. Teach `dangerfile.py` to enforce them so that violations are caught at MR time.
Merge branch 'mnowak/danger-ai-trailer-checks' into 'main'
Michal Nowak [Tue, 5 May 2026 18:50:10 +0000 (20:50 +0200)]
Validate Assisted-by trailer format and tool list
CONTRIBUTING.md documents the Assisted-by trailer format as
Assisted-by: AGENT_NAME:MODEL_VERSION [TOOL1] [TOOL2]
and excludes basic development tools (git, compilers, meson,
ninja, editors, clang-format, black, ruff) from the optional
tool list.
Walk every `Assisted-by:` line in each commit message and emit a
`warn()` when:
- the line does not match the documented `AGENT:VERSION` shape;
- the optional tool list contains basic-tool names.
The basic-tool list extends the CONTRIBUTING.md examples with
other formatters, generic linters, and build/test runners
commonly invoked from `.gitlab-ci.yml`. Specialized analysis
tools (coccinelle, clang-tidy, AFL, Coverity, cppcheck,
valgrind, sanitizers) are intentionally absent so they remain
allowed in the trailer.
Use `warn()` rather than `fail()` because the format is
human-written and overly strict matching would produce false
positives on edge cases.
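A check along these lines might look like the sketch below. The regular expression, the `BASIC_TOOLS` subset and `check_assisted_by` are hypothetical and are not taken from `dangerfile.py`; the sketch also assumes the square brackets in the documented format mark optional items and are not literal characters.

```python
import re

# Hypothetical pattern for "Assisted-by: AGENT_NAME:MODEL_VERSION [TOOL1] [TOOL2]".
ASSISTED_BY_RE = re.compile(
    r"^Assisted-by:\s+(?P<agent>[^\s:]+):(?P<version>\S+)(?P<tools>(?:\s+\S+)*)\s*$"
)

# A subset of the basic tools named in the commit message.
BASIC_TOOLS = {"git", "meson", "ninja", "clang-format", "black", "ruff"}


def check_assisted_by(message):
    """Return a list of warning strings for the Assisted-by lines in message."""
    warnings = []
    for line in message.splitlines():
        if not line.startswith("Assisted-by:"):
            continue
        match = ASSISTED_BY_RE.match(line)
        if match is None:
            warnings.append(f"malformed trailer: {line}")
            continue
        tools = match.group("tools").split()
        basic = [tool for tool in tools if tool.lower() in BASIC_TOOLS]
        if basic:
            warnings.append(f"basic tools listed: {', '.join(basic)}")
    return warnings
```

Returning warnings instead of raising mirrors the choice of `warn()` over `fail()`: a human reviewer makes the final call on edge cases.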
Michal Nowak [Tue, 5 May 2026 17:43:35 +0000 (19:43 +0200)]
Reject Signed-off-by trailers from AI tools in danger check
CONTRIBUTING.md states that AI agents must not add Signed-off-by
tags, since only humans can legally certify the Developer
Certificate of Origin. Mirror the existing LLM Co-Authored-By
check against the Signed-off-by trailer line so danger fails on
commits that violate the rule.
The shared alternation of known LLM agent names is factored out
into LLM_AGENT_NAMES_RE so adding a new tool only requires one
edit.
Michal Nowak [Tue, 5 May 2026 17:05:49 +0000 (19:05 +0200)]
Detect Co-Authored-By trailers and reject AI co-authors
CONTRIBUTING.md states that AI agents must not be listed as
co-authors and that contributors should use the `Assisted-by:`
trailer instead. Teach `dangerfile.py` to fail merge requests
whose commit messages include a `Co-Authored-By:` trailer naming
a known LLM (Claude, Codex, Mistral, Copilot, Gemini, Cursor,
Devin, Aider, Sourcegraph, CodeWhisperer).
For any other `Co-Authored-By:` trailer, emit an info-level
`message()` that includes the full trailer line so reviewers can
confirm the named co-author is a human contributor and not an
unrecognised AI tool.
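The split between rejected and merely reported co-authors can be sketched as follows. The agent names come from the list above; the regular expressions and `classify_co_authors` are hypothetical and do not reproduce `dangerfile.py`.

```python
import re

# Agent names as listed in the commit message; the patterns are illustrative.
LLM_AGENT_NAMES_RE = (
    r"(?:Claude|Codex|Mistral|Copilot|Gemini|Cursor|Devin|Aider"
    r"|Sourcegraph|CodeWhisperer)"
)
CO_AUTHOR_RE = re.compile(r"^Co-Authored-By:\s*(?P<who>.+)$", re.I | re.M)
LLM_RE = re.compile(rf"\b{LLM_AGENT_NAMES_RE}\b", re.I)


def classify_co_authors(message):
    """Return (failures, infos): AI co-authors fail, others are reported."""
    failures, infos = [], []
    for match in CO_AUTHOR_RE.finditer(message):
        line = match.group(0).strip()
        if LLM_RE.search(match.group("who")):
            failures.append(line)
        else:
            infos.append(line)
    return failures, infos
```

Keeping the full trailer line in the info list lets a reviewer judge whether an unrecognised name is a person or a tool.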
Arаm Sаrgsyаn [Thu, 11 Jun 2026 14:38:23 +0000 (14:38 +0000)]
fix: usr: Fix a bug in GeoIP2 string matching
When using GeoIP2 ACLs (see :any:`acl`), :iscman:`named` could
incorrectly match a name on a substring instead of requiring a
full-name match. This has been fixed.
Closes #6019
Merge branch '6019-geoip2-string-match-buf-fix' into 'main'
Aram Sargsyan [Mon, 25 May 2026 14:19:53 +0000 (14:19 +0000)]
Fix 'geoip' ACL matching bug
The geoip2.c:match_string() function can incorrectly return 'true'
when matching strings of different lengths (i.e. it matches a
substring). Return 'false' when the lengths of the matched strings
are different.
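One way such a substring match can arise is a comparison bounded by the length of only one of the strings. The sketch below illustrates that pattern and the length check that closes it; both functions are hypothetical and are not a transcription of match_string().

```python
def match_prefix_only(value, wanted):
    """Buggy pattern: compare only the first len(wanted) characters."""
    return value[: len(wanted)] == wanted


def match_full(value, wanted):
    """Fixed pattern: strings of different lengths never match."""
    if len(value) != len(wanted):
        return False
    return value[: len(wanted)] == wanted
```

With the bounded comparison, a configured name such as `New` would also match a database value of `Newcastle`; the length check rejects it.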
Nicki Křížek [Thu, 11 Jun 2026 13:22:49 +0000 (15:22 +0200)]
chg: nil: Update the system test README for the pytest-native workflow
The README predated the meson migration and most of the pytest runner
features. Document building the test dependencies with meson and drop
the make-based instructions, the Makefile.am registration step, and the
stale -T flag list. Describe the jinja2 templating, bootstrap(), the
conftest fixtures, the pytest marks, and recommend node IDs and
parametrization over -k matching. Fix the directory naming rule, which
switched from hyphens to underscores.
Also declare pytest and pytest-xdist as required dependencies: the
runner's pytest.ini uses --dist=loadscope unconditionally, so pytest
without pytest-xdist cannot even start.
Related #3810
Assisted-by: Claude:claude-fable-5
Merge branch 'nicki/systest-readme-refresh' into 'main'
Nicki Křížek [Thu, 11 Jun 2026 09:07:08 +0000 (09:07 +0000)]
Update the system test README for the pytest-native workflow
The README predated the meson migration and most of the pytest runner
features. Document building the test dependencies with meson and drop
the make-based instructions, the Makefile.am registration step, and the
stale -T flag list. Describe the jinja2 templating, bootstrap(), the
conftest fixtures, the pytest marks, and recommend node IDs and
parametrization over -k matching. Fix the directory naming rule, which
switched from hyphens to underscores.
Also declare pytest and pytest-xdist as required dependencies: the
runner's pytest.ini uses --dist=loadscope unconditionally, so pytest
without pytest-xdist cannot even start.
The :any:`http-listener-clients` and :any:`http-streams-per-connection`
configuration options could be truncated to smaller values (or to ``0``,
which means unlimited) when very big configuration values were used, which
exceeded ``65535``. Note that it is very unlikely that such big values
are used in production; the default values for the affected options
are ``300`` and ``100``, respectively. This has been fixed.
Closes #6021
Merge branch '6021-doh-quota-type-truncation-fix' into 'main'
Aram Sargsyan [Mon, 25 May 2026 12:11:30 +0000 (12:11 +0000)]
Fix DoH quota global variables type
The 'named_g_http_listener_clients' and 'named_g_http_streams_per_conn'
global variables are defined as 'in_port_t', which is usually 16 bits,
but both the readers and the writers of those variables use 'uint32_t'
as the target/source, which can result in truncation.
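The truncation can be reproduced with simple arithmetic. This sketch assumes a 16-bit unsigned `in_port_t`, as the text says is usual; `store_as_uint16` is a hypothetical helper that models the narrowing assignment.

```python
def store_as_uint16(value):
    """Model assigning a 32-bit unsigned value to a 16-bit unsigned variable."""
    return value & 0xFFFF
```

A configured value of 65536 is stored as 0, which means unlimited, and 70000 is stored as 4464; values up to 65535 are unaffected.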
Matthijs Mekking [Thu, 11 Jun 2026 11:27:41 +0000 (11:27 +0000)]
fix: usr: Ignore updates removing DNSKEY RRset with class ANY
When a Dynamic Update is received that removes the ``DNSKEY`` (or ``CDNSKEY``,
or ``CDS``) RRset, remove all records except the ones that are in use
for signing for the zone.
Closes #6045
Merge branch '6045-dns-update-delete-in-use-dnskey-any' into 'main'
When a Dynamic Update is received that removes the DNSKEY (or CDNSKEY,
or CDS) RRset, remove all records except the ones that are in use
for signing for the zone (with dnssec-policy).