git.ipfire.org Git - thirdparty/bind9.git/log

chg: ci: Raise respdiff third-party and recent-named disagreement thresholds

The third-party comparison is right at the 0.4 % threshold, making it
fail quite often. Observed values currently range from 0.3 to 0.5 %.
Raise the threshold to 0.5 %.

The recent-named comparison produces values ranging from 0.05 to 0.12 %,
but appears to be more sensitive to time of the day when the test runs.
Raise the threshold to 0.15 %.

Both results are stable within the specified ranges across the last
three releases in both main and bind-9.20 series.

Merge branch 'nicki/respdiff-threshold-bump' into 'main'

See merge request isc-projects/bind9!12286

Raise respdiff third-party and recent-named disagreement thresholds

The third-party comparison is right at the 0.4 % threshold, making it
fail quite often. Observed values currently range from 0.3 to 0.5 %.
Raise the threshold to 0.5 %.

The recent-named comparison produces values ranging from 0.05 to 0.12 %,
but appears to be more sensitive to time of the day when the test runs.
Raise the threshold to 0.15 %.

Both results are stable within the specified ranges across the last
three releases in both main and bind-9.20 series.

fix: usr: CDS/CDNSKEY records were not removed when re-configuring the server

When an ``rndc reconfig`` changes the DNSSEC policy in a way that alters the expected ``CDNSKEY`` and/or ``CDS`` records in the zone, the RRset should
be updated accordingly. This did not happen when removing digests from the configuration, or when setting ``cdnskey no;``. This has been fixed.

Closes #6166

Merge branch '6166-reconfig-delete-cds' into 'main'

See merge request isc-projects/bind9!12265

Don't rely on smart signing in cds system test

With dnssec-signzone smart-signing (-S), the CDS and CDNSKEY are
derived from the key timing metadata and the configuration options (-G).

The test has specific test cases that smart signing (with the fix)
interferes with. Therefore, disable smart-signing in the cds system test.

Remove CDS/CDNSKEY records on reconfig

When adding to dnssec-policy:

cdnskey no;
cds-digest-types { };

and then reconfiguring the server, named must remove existing CDS and
CDNSKEY records. Note this already worked when adding a CDS digest, or
setting 'cdnskey yes;', but not when digests were removed from the list,
or when setting 'cdnskey no;'.

Add system test for reconfiguring CDS/CDNSKEY

When an 'rndc reconfig' changes the DNSSEC policy in a way that alters
the expected CDNSKEY/CDS records in the zone, the RRset should
be updated accordingly.

Add a test case where we reconfigure a zone with a policy such that
these records should be removed, and on a second reconfigure add
them back again.

Note the test deliberately adds a different CDS digest on the
second reconfigure.

fix: dev: Update dnssec validation test to match new behavior

Some of the tests in `dnssec/tests_validation.py` worked by iterating through the response message looking for failure conditions, such as excessively high TTL values. In some cases, previous changes caused additional data not to be returned. Since there was nothing to iterate, the tests still "passed".

Tests that don't make sense anymore have been removed. Other tests that iterate through responses have been updated with checks to ensure that the responses actually do contain data.

Merge branch 'each-cleanup-validation-test' into 'main'

See merge request isc-projects/bind9!12269

update tests_validation.py test for new behavior

Some of the tests in dnssec/tests_validation.py worked by iterating
through the response message looking for failure conditions, such as
excessively high TTL values. In some cases, previous changes caused
additional data not to be returned. Since there was nothing to
iterate, the tests still "passed".

Tests that don't make sense anymore have been removed. Other tests that
iterate through responses have been updated with checks to ensure that
the responses actually do contain data.
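
The pitfall described above can be illustrated with a minimal sketch. It
does not use the real test code: the record tuples, `MAX_TTL`, and both
function names are hypothetical and only model a response section as a
plain list.

```python
MAX_TTL = 3600  # hypothetical limit chosen for this sketch


def check_ttls_vacuous(answer):
    """Pass unless a record has an oversized TTL; an empty list also passes."""
    for name, ttl in answer:
        assert ttl <= MAX_TTL, f"{name} has excessive TTL {ttl}"


def check_ttls_strict(answer):
    """Require at least one record before checking each TTL."""
    assert answer, "expected at least one record in the section"
    check_ttls_vacuous(answer)


# The loop body never runs, so nothing is actually verified here.
check_ttls_vacuous([])

try:
    check_ttls_strict([])
except AssertionError as exc:
    print("strict check caught the empty section:", exc)
```

The strict variant fails as soon as the section is empty, which is the
kind of guard the updated tests add before iterating.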

add isctest.check functions for section empty or non-empty

expand on the isctest.check.empty_answer() function, adding
empty_authority(), empty_additional(), has_answer(), has_authority(),
and has_additional().

fix: usr: Check wildcard signer and NOQNAME signer match

A positive wildcard answer, and the NSEC3 proof that the requested
name doesn't exist in the zone, must both be from the same zone.
Otherwise, an NSEC3 from an ancestor zone could be used to interfere
with validation.

We now retrieve the signer name from a wildcard response's signature.
An NSEC3 record cannot be used as a NOQNAME proof for the
wildcard unless it exactly matches the name one level above the NSEC3.

Closes #5971

Merge branch '5971-wildcard-noqname-mismatch' into 'main'

See merge request isc-projects/bind9!12256

Reproducer for #5971 NSEC3 from ancestor zone

Create a new nsec3_wrong_zone system test as a regression test.

Co-Authored-By: Evan Hunt <each@isc.org>

Check wildcard signer and NOQNAME signer match

A positive wildcard answer, and the NSEC3 proof that the requested
name doesn't exist in the zone, must both be from the same zone.
Otherwise, an NSEC3 from an ancestor zone could be used to interfere
with validation.

We now retrieve the signer name from a wildcard response's signature.
An NSEC3 record cannot be used as a NOQNAME proof for the wildcard
unless it exactly matches the name one level above the NSEC3.
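
As a rough illustration of the rule above (not the validator's actual
code), the comparison can be sketched on textual names. The helper names
and the example owner names are made up for this sketch; real code
compares wire-format names.

```python
def parent(name):
    """Return the name one level above `name` (drop the leftmost label)."""
    labels = name.rstrip(".").lower().split(".")
    return ".".join(labels[1:]) + "."


def nsec3_usable_as_noqname(wildcard_signer, nsec3_owner):
    """Accept the NSEC3 only if its parent equals the wildcard's signer."""
    signer = wildcard_signer.rstrip(".").lower() + "."
    return parent(nsec3_owner) == signer


# An NSEC3 owner is <hash>.<zone>, so its parent is the zone it belongs to.
same_zone = nsec3_usable_as_noqname("child.example.", "abc123.child.example.")
ancestor = nsec3_usable_as_noqname("child.example.", "abc123.example.")
print(same_zone, ancestor)
```

An NSEC3 taken from the ancestor zone has a different parent name, so it
is not accepted as a NOQNAME proof for the child zone's wildcard.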

Fixes: isc-projects/bind9#5971

fix: usr: Reject external referrals from forwarders

Under `forward-first` policy in a forwarding zone BIND could accept NS above the forward zone apex from negative responses. This has been fixed.

ISC would like to thank Qifan Zhang, of Palo Alto Networks, for the report.

Closes #5937

Merge branch '5937-fix-forward-first-referral-bailiwick-v2' into 'main'

See merge request isc-projects/bind9!12154

Reject referrals from global forwarders

Reject referrals from root/global forwarders, where there is no narrower
forward-zone apex for name_external() to enforce.

Reject external referrals for forward zones

Apply the existing name_external() bailiwick check to NS RRsets
processed as referrals in rctx_authority_negative(), and enforce the
same check again in rctx_referral() before caching or following the
delegation.

This prevents a forward-first forwarder from installing a parent
zone-cut above the configured forward zone via an authority-section
NS RRset.
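
The bailiwick idea can be sketched as a subdomain test on textual names.
This is a simplified model, assuming an NS owner name is "external" when
it is not at or below the forward zone apex; `name_is_external` here is a
hypothetical helper, not the resolver's name_external().

```python
def labels(name):
    """Split a textual name into lowercase labels; the root has none."""
    stripped = name.rstrip(".").lower()
    return stripped.split(".") if stripped else []


def is_subdomain(name, apex):
    """True if `name` equals `apex` or lies below it."""
    n, a = labels(name), labels(apex)
    return len(n) >= len(a) and n[len(n) - len(a):] == a


def name_is_external(ns_owner, forward_apex):
    """An NS RRset owned above or beside the forward zone is out of bailiwick."""
    return not is_subdomain(ns_owner, forward_apex)


# A forwarder for fwd.hack tries to hand out a zone cut at the parent, hack.
print(name_is_external("hack.", "fwd.hack."))          # rejected referral
print(name_is_external("sub.fwd.hack.", "fwd.hack."))  # acceptable referral
```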

Add forward-first referral poisoning reproducer

Add a system test covering authority-section NS referrals returned by
configured forwarders under forward first.

The test verifies that a forwarder for fwd.hack cannot install the
parent hack zone cut and redirect resolution for the sibling zone
sibling.hack.

chg: ci: Migrate Mattermost notifications to Zulip in CI

Since internal communications are now Zulip based, CI jobs now target
Zulip instead of Mattermost. The `MATTERMOST_WEBHOOK_URL` environment
variable is no longer needed, scripts now use `ZULIP_SERVER_URL` and
`ZULIP_API_KEY`.

In order to harmonize Zulip messaging, `message_zulip.py` is used where
curl calls to the webhook were previously used.

Merge branch 'andoni/mattermost-to-zulip-migration' into 'main'

See merge request isc-projects/bind9!12199

Migrate Mattermost notifications to Zulip in .gitlab-ci.yml

Since internal communications are now Zulip based, CI jobs now target
Zulip instead of Mattermost. The MATTERMOST_WEBHOOK_URL environment
variable is no longer needed, scripts now use ZULIP_SERVER_URL and
ZULIP_API_KEY.

In order to harmonize Zulip messaging, message_zulip.py is used where
curl calls to the webhook were previously used.

fix: usr: Fix recursion loop in case of badly behaving forwarders

When forwarding DNS queries, the CD bit is cleared on the first query, and the CD bit is only used as a fallback if the first query fails. However, due to a logic bug this could lead to an unbounded loop re-sending the same message, until the maximum query count is hit. This has been fixed.

Closes #5804

Merge branch '5804-resend-loop-forwarder-cd' into 'main'

See merge request isc-projects/bind9!12133

Avoid resend loop when forwarder SERVFAILs with both CD=0 and CD=1

Commit `36cf1c6a5bf943ad718ddba9fbe6ea97810e3bc2` introduces the
`DNS_FETCHOPT_TRYCD` flag which enables, when sending a query to a
forwarder, the forwarder to validate the answer (CD=0). The crux is
that if for some reason the forwarder returns SERVFAIL, we can retry the
same query and disable the forwarder validation (CD=1) so the resolver
can attempt validation itself (or detect it's bogus).

The logic was to first set `DNS_FETCHOPT_TRYCD` to the query options but
not on the message (so CD=0), and, when getting a SERVFAIL answer, if
the option `DNS_FETCHOPT_TRYCD` was set, to also set it into the
message. However, there was no way to know if this was the first (or
second) query because the original message is discarded when getting the
answer. This can lead to an unbounded loop re-sending the same message
again and again (until the global query count stops it).

This is fixed by using two separate flags. `DNS_FETCHOPT_TRYNOCD` is set
on the query options for the very first query. If that query gets a
SERVFAIL, check whether `DNS_FETCHOPT_TRYNOCD` is set but
`DNS_FETCHOPT_TRYCD` is not. In this case, we know we're about to send
the second query. If it also fails, `DNS_FETCHOPT_TRYCD` will already be
set, so there is no point retrying. This breaks the unbounded loop.
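
The two-flag logic can be modelled with a small state machine. This is a
sketch only: the bit values, `resolve_via_forwarder`, and the `send`
callback are illustrative assumptions, not the resolver's data
structures.

```python
TRYNOCD = 0x1  # illustrative bit values, not BIND's
TRYCD = 0x2


def resolve_via_forwarder(send, max_queries=100):
    """Send with CD=0 first and retry once with CD=1 after a SERVFAIL."""
    options = TRYNOCD
    sent = 0
    while sent < max_queries:
        cd = bool(options & TRYCD)
        rcode = send(cd)
        sent += 1
        if rcode != "SERVFAIL":
            return rcode, sent
        if (options & TRYNOCD) and not (options & TRYCD):
            options |= TRYCD  # first failure: allow exactly one CD=1 retry
            continue
        return "SERVFAIL", sent  # both attempts failed, stop resending
    return "SERVFAIL", sent


seen = []


def always_servfail(cd):
    seen.append(cd)
    return "SERVFAIL"


print(resolve_via_forwarder(always_servfail), seen)
```

Because the state lives in the options rather than in the discarded
message, a forwarder that always answers SERVFAIL is queried twice and
no more.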

Reproducer forwarder resend loop

Run malicious server: resend_loop/ans2/ans.py

Start BIND: ns1

Send single query to test.com

OBSERVED BEHAVIOR
The malicious server receives tens of thousands of resend packets
within seconds. CPU usage of the named worker thread remains elevated
(50–100% of one core) until the default fetch timeout (~10 seconds)
terminates the request. Instrumentation during testing confirmed that
isc_counter_used(fctx->qc) remains constant (value 1) throughout the
entire resend loop.

fix: usr: Cache glue only for enabled address families

When caching delegation NS data, only use A/AAAA glue records if the resolver has the corresponding IPv4/IPv6 dispatcher configured. If IPv4 or IPv6 is disabled, ignore glue for that family and fall back to caching the nameserver name if there is no glue from the other supported family.

Merge branch 'colin/glues-supported-stack' into 'main'

See merge request isc-projects/bind9!11889

Cache glue only for enabled address families

When caching delegation NS data, only use A/AAAA glue records if the
resolver has the corresponding IPv4/IPv6 dispatcher configured. If IPv4
or IPv6 is disabled, ignore glue for that family and fall back to
caching the nameserver name if there is no glue from the other supported
family.

The new `cache_delegns` system test covers delegation NS caching
with dual-stack, IPv4-only, and IPv6-only resolver configurations. It
also sets up an authoritative server with zones with A-only, AAAA-only,
and dual-stack glue, which are all queried, and checks the delegation
database dump to confirm that the cached delegation data corresponds to
the resolver configuration.
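
The selection rule can be sketched as a filter over the glue records.
The function name, the dictionary layout of the cached entry, and the
example addresses are hypothetical; they only illustrate the
family-aware fallback described above.

```python
def delegation_cache_entry(ns_name, glue, ipv4_enabled, ipv6_enabled):
    """Keep glue for enabled families; otherwise fall back to the NS name."""
    usable = []
    for rrtype, address in glue:
        if rrtype == "A" and ipv4_enabled:
            usable.append(address)
        elif rrtype == "AAAA" and ipv6_enabled:
            usable.append(address)
    if usable:
        return {"addresses": usable}
    # No glue from a supported family: cache the name and resolve it later.
    return {"nsname": ns_name}


glue = [("A", "192.0.2.1"), ("AAAA", "2001:db8::1")]
print(delegation_cache_entry("ns.example.", glue, True, False))
print(delegation_cache_entry("ns.example.", glue[1:], True, False))
```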

rem: usr: Remove GeoIP2 `metro` and `metrocode`

The `geoip metro` and `geoip metrocode` configuration options have been
removed, as metro codes are deprecated in the MaxMind library.

Merge branch 'colin/remove-geoip-metro' into 'main'

See merge request isc-projects/bind9!12217

remove GeoIP2 `metro` and `metrocode`

The `geoip metro` and `geoip metrocode` configuration options have been
removed, as metro codes are deprecated in the MaxMind library.

fix: dev: Fix invalid pointer release in JSON statistics-channel response

Each response served on a JSON statistics endpoint released the wrong
pointer to the JSON library after the response was sent: the response
body string instead of the JSON document. With the current responses
this does not crash named in practice, but the call is incorrect and
can in principle corrupt memory. XML responses are not affected.

Closes #6024

Merge branch '6024-statschannel-json-response-invalid-free' into 'main'

See merge request isc-projects/bind9!12068

Fix invalid free in statistics-channel JSON renderer

wrap_jsonfree() called json_object_put() on the response-body buffer
base, which is the JSON string returned by
json_object_to_json_string_ext(), not a struct json_object. The root
object is already passed in as the callback argument; release only
that.

Add regression test for statistics-channel JSON free bug

Hit every JSON statistics endpoint several times. The current code
calls json_object_put() on the response-body string pointer, which
avoids a crash only by accident: the memory at that position contains
a large value from a static string.

chg: dev: Sync picohttpparser.c with upstream commit a875a01

Synced with the h2o/picohttpparser upstream repository up to commit
f4d94b48b31e0abae029ebeafcfd9ca0680ede58.

This commit is just hygiene and consistency by keeping the vendored copy
current.

Merge branch 'mnowak/picohttpparser-sync-a875a01' into 'main'

See merge request isc-projects/bind9!12159

Re-apply the picohttpparser.c patch

This:
- makes sure all variables are initialized
- adds missing curly braces for single line statements
- uses a proper comment for fallthrough case statements

Sync picohttpparser.c with upstream commit a875a01

Pull in upstream commit a875a01 from h2o/picohttpparser: enforce use of
CRLF in chunk headers, by rejecting bare CR / LF. Replaces the lenient
CHUNKED_IN_CHUNK_CRLF state with strict CHUNKED_IN_CHUNK_HEADER_EXPECT_LF,
CHUNKED_IN_CHUNK_DATA_EXPECT_CR and CHUNKED_IN_CHUNK_DATA_EXPECT_LF states.

Synced with the h2o/picohttpparser upstream repository up to commit
f4d94b48b31e0abae029ebeafcfd9ca0680ede58.

This commit is just hygiene and consistency by keeping the vendored copy
current.

Assisted-by: Claude:claude-opus-4-7

fix: nil: Allocate work threads from their owning loop's memory context

The per-loop worker threads allocated their state from the loop manager's
memory context instead of the per-loop context that owns them. Allocate
from the owning loop's context and hold a reference to that loop for the
thread's lifetime, matching the context handling already used on the
work-enqueue and completion paths.

Already changed as part of 9.20 backport.

Merge branch 'ondrej/rewrite-threadpool-fixups' into 'main'

See merge request isc-projects/bind9!12268

Run the work asynchronously when shutting down

Instead of running the work directly, run it asynchronously to prevent
deadlocks when the isc_work is scheduled from inside a lock and the
job itself is using locking.

Allocate work threads from their owning loop's memory context

A per-loop work thread referenced the loop manager's memory context,
which is not the context that backs the loop the thread serves. Pass
the owning loop instead and allocate from loop->mctx, keeping a loop
reference for the thread's lifetime. This matches how isc_work_enqueue
and work_done already obtain the context from the loop, and the
teardown uses loop->mctx before dropping the reference.

chg: test: Reimplement the `formerr` system test in Python

Replace the shell and Perl based FORMERR system test with a Python
test that constructs the malformed DNS packets directly and checks
the responses.

Remove the legacy shell and Perl test script and the intermediate
packet files in hex, leaving the packet construction inline in
tests_formerr.py.

Merge branch 'stepan/formerr-system-test-python' into 'main'

See merge request isc-projects/bind9!11898

Simplify the packets in the formerr system test

Normalize the message ID to 0 and the TTL of records to 1 unless
required (OPT and UPDATE records require TTL=0).

Rename the questionclass test case to twoquestionclasses for
consistency.

Reimplement the FORMERR system test in Python

Replace the shell and Perl based FORMERR system test with a Python
test that constructs the malformed DNS packets directly and checks
the responses.

Remove the legacy shell and Perl test script and the intermediate
packet files in hex, leaving the packet construction inline in
tests_formerr.py.

Preserve the same wire for all packets sent to the server, but
construct them in a more explicit and readable way.

chg: dev: Rework isc_work as per-loop, per-lane cancelable worker threads

Fold the libuv thread pool and the per-loop isc_helper threads into a single
isc_work pool. Each (loop, lane) gets its own SPSC queue and worker, which drops
the shared-queue contention, and the FAST/SLOW lanes keep short crypto tasks off
the long blocking ones (zone dump/load, xfrin). isc_work jobs are now cancelable:
isc_work_cancel() tombstones a still-queued job and its after_cb fires with
ISC_R_CANCELED, so abandoned work can be dropped instead of run to completion.

Merge branch 'ondrej/rewrite-threadpool' into 'main'

See merge request isc-projects/bind9!12226

Replace the shared work pool with per-loop, per-lane worker threads

Offloaded work used two different mechanisms: a per-loop isc_helper
thread for CPU-bound crypto (DNSSEC validation, message signature
checks) and the process-global libuv thread pool for blocking I/O (zone
load and dump, inbound transfer apply). Neither could cancel a queued
task, and the two disagreed about exclusive mode — the helper paused
with its loop under isc_loopmgr_pause() but the libuv pool did not, so
blocking offloaded work kept running while a loop held the exclusive
lock.

Unify both behind isc_work: each loop gets its own worker thread per
lane — FAST for short, bounded tasks and SLOW for long, blocking ones —
fed by a private queue. Separate lanes keep a short crypto task off the
path of a multi-second zone dump once both run on per-loop workers;
every lane parks with isc_loopmgr_pause() so exclusive mode now quiesces
offloaded work too; and a still-queued task can be canceled before it
starts (isc_work_cancel). isc_helper is removed and its callers select a
lane.
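
A single-threaded model can show the queueing and cancellation semantics
described above. It is only a sketch: the `Lane` and `Job` classes and
the `drain()` method are invented for illustration, a real worker thread
would drain each queue, and the success result name is assumed.

```python
from collections import deque

SUCCESS, CANCELED = "ISC_R_SUCCESS", "ISC_R_CANCELED"


class Job:
    def __init__(self, work_cb, after_cb):
        self.work_cb = work_cb
        self.after_cb = after_cb
        self.canceled = False


class Lane:
    """One private FIFO queue; stands in for a (loop, lane) worker."""

    def __init__(self):
        self.queue = deque()

    def enqueue(self, work_cb, after_cb):
        job = Job(work_cb, after_cb)
        self.queue.append(job)
        return job

    def cancel(self, job):
        job.canceled = True  # tombstone: the job stays queued but will not run

    def drain(self):
        while self.queue:
            job = self.queue.popleft()
            if job.canceled:
                job.after_cb(CANCELED)  # work_cb is skipped entirely
            else:
                job.work_cb()
                job.after_cb(SUCCESS)


# One queue per (loop, lane) pair, so lanes never share a queue.
lanes = {(loop, lane): Lane() for loop in range(2) for lane in ("FAST", "SLOW")}
results = []
slow = lanes[(0, "SLOW")]
dump = slow.enqueue(lambda: results.append("dumped"), results.append)
slow.cancel(dump)
slow.drain()
print(results)
```

The canceled job still gets its completion callback, so the caller can
release resources, but the work itself never runs.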

chg: ci: Adjust allow_failure in cross-version-config tests

The failures caused by #6007 are no longer happening.

However, in the meantime, other tests have begun to fail due to https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/12163/diffs?commit_id=9771df0aca5ce399ad0535f656c403a58d674606

Merge branch 'nicki/revert-allow-failure-after-9.21.23' into 'main'

See merge request isc-projects/bind9!12264

Add allow_failure to cross-version-config tests

Due to a recent change which dropped the non-IN-class in views, the
following tests are failing in cross-version-config-tests:

auth
catz
class
resolver
unknown

This can be reverted once the July releases are public.

Revert "Make cross-version-config-tests allow_failure: true"

This reverts commit 86b0615aea37289ada4f77db581f7202b677cb0e.

Merge tag 'v9.21.23'

fix: usr: Only print per zone glue stats when query statistics is full

The code printing query statistics was ignoring the zone-statistics
option. This has been fixed.

Closes #6164

Merge branch '6164-no-per-zone-glue-stats' into 'main'

See merge request isc-projects/bind9!12262

Add test for query statistics

Ensure that no per-zone glue statistic is printed unless
zone-statistics is set to full.

Only print per zone glue cache stats when zone-statistics is full

The code printing glue cache statistics was ignoring the
zone-statistics option. This has been fixed.

fix: usr: Fix CNAME resolution failure caused by a cached SERVFAIL response

Under certain circumstances, a cached SERVFAIL response could incorrectly prevent successful resolution of a CNAME target. This could cause resolution failures to persist until the cached SERVFAIL entry expired, even when the CNAME target itself was otherwise resolvable. This issue has been fixed.

Closes #5983

Merge branch '5983-servfail-cache-cname' into 'main'

See merge request isc-projects/bind9!12158

Use original query name when caching SERVFAIL

Instead of using `client->query.qname` when caching a SERVFAIL answer,
use `client->query.origqname` when available.

This avoids caching a SERVFAIL against a CNAME target when the failure
occurs while the resolver is following the CNAME chain. This is
problematic, for instance, when the SERVFAIL is triggered by the
`max-query-count` threshold being reached, which would incorrectly
prevent legitimate resolution of the CNAME target while in the SERVFAIL
cache.

Note that if the SERVFAIL genuinely originated from resolving the CNAME
target, that specific failure will no longer be cached, and a direct
query for the CNAME target will trigger a fresh (likely failing)
resolution attempt. However, this is still preferable to the previous
behaviour, which would wrongly prevent resolving the CNAME target if it
was cached for other reasons (like the example above).
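
The effect of the key change can be shown with a toy cache. The set-based
cache and both helper functions are hypothetical stand-ins for this
sketch; only the choice between the current and the original query name
mirrors the description above.

```python
servfail_cache = set()  # toy stand-in for the SERVFAIL cache


def servfail_cache_key(qname, origqname=None):
    """Prefer the original query name when a CNAME chain was followed."""
    return origqname if origqname is not None else qname


def record_servfail(qname, origqname=None):
    servfail_cache.add(servfail_cache_key(qname, origqname))


# The client asked for alias.example., which is a CNAME to target.example.;
# the failure happened while the resolver was working on the target.
record_servfail("target.example.", origqname="alias.example.")

print("alias.example." in servfail_cache)
print("target.example." in servfail_cache)
```

Only the name the client actually asked for is cached, so a later direct
query for the target starts a fresh resolution.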

System test covering SERVFAIL cache and CNAME

Add a system test for the case where resolution SERVFAILs because the
fetch context reaches the `max-query-count` threshold while following a
CNAME.

Resolving the CNAME target independently should still work, because the
SERVFAIL cache stores the original query name rather than the target.

rem: usr: remove the secondary validator in query.c

Previously, when the additional section of a response was being
populated, if cached data was found with pending trust, it would be
opportunistically validated. The code implementing this validation was
not quite formally correct. Rather than fixing it, the code has been
removed: RRsets with pending trust are now omitted from responses.

Closes #5966

Closes #5968

Closes #5972

Merge branch 'each-remove-lightweight-validator' into 'main'

See merge request isc-projects/bind9!12236

Don't run vulture on ansX system test dirs

Exclude ansX directories from vulture, as splitting up the handlers into
multiple files gets flagged as unused code.

Merge DNSSEC wildcard tests

Merge the tests for #5966 (F-043) and #5972 (F-045), previously called
dnssec_wildcard_additional and dnssec_replayed_parent_wildcard, into a
single directory with two modules.

Reproducer for #5968 parent RRSIG in additional

Add a new "dnssec_parent_rrsig" system test.

Co-Authored-By: Matthijs Mekking <matthijs@isc.org>

Reproducer for #5972 DNS_R_FROMWILDCARD accepted

Add a new "dnssec_wildcard_additional" system test.

Co-Authored-By: Evan Hunt <each@isc.org>

Reproducer for #5966 replay parent wildcard

Add a new "dnssec_replayed_parent_wildcard" system test.

Co-Authored-By: Matthijs Mekking <matthijs@isc.org>

add isctest.mark method for ecdsa_deterinistic

This checks support for ECDSA deterministic mode in the cryptography
library.

Add isctest.check methods for AA flag

Similar to checking AD flag, add methods to see if the AA flag is
present or absent.

Remove the secondary validator in query.c

Previously, when the additional section of a response was being
populated, if cached data was found with pending trust, it would be
opportunistically validated. The code implementing this validation was
not quite formally correct; rather than fixing it, the code has been
removed; RRsets with pending trust are now omitted from responses.

Fixes: isc-projects/bind9#5966
Fixes: isc-projects/bind9#5968
Fixes: isc-projects/bind9#5972

fix: doc: Edit SECURITY.md to remove references to bind-security@isc.org

We no longer want to encourage people to open new issues via email because we are getting too many spammy reports generated by LLMs. By requiring people to actually log in to GitLab to make their report, they will (hopefully) see our reporting template, and think at least a little bit about whether they are making a well-considered, valid report.

Merge branch 'vicky-main-patch-01884' into 'main'

See merge request isc-projects/bind9!12248

Edit SECURITY.md to remove references to bind-security@isc.org

chg: dev: Use SIEVE for TSIG generated-key LRU

Replace the list-based LRU for TSIG KEYs with SIEVE-based LRU.

Merge branch 'ondrej/use-sieve-for-tsigkey-lru' into 'main'

See merge request isc-projects/bind9!12043

Use SIEVE for TSIG generated-key LRU

The generated-key cache used a strict LRU: every lookup hit promoted
the key to the tail of the list under the write lock, even when the
caller arrived holding only the read lock. That promotion was both
a lock upgrade and a list reshuffle on the hot path.

Replace the LRU with the SIEVE eviction policy. Lookups now do a
lock-free atomic mark of the visited bit; eviction work moves to
insertion time, which already holds the write lock.
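
For readers unfamiliar with SIEVE, the policy can be sketched in a few
lines. This is a generic, single-threaded illustration that assumes a
capacity of at least 1; the class and its list-based layout are invented
for this sketch and are unrelated to the TSIG key cache's actual data
structures or locking.

```python
class SieveCache:
    def __init__(self, capacity):
        self.capacity = capacity  # assumed to be >= 1
        self.order = []           # keys, oldest first; new keys are appended
        self.values = {}
        self.visited = {}
        self.hand = 0             # index where the next eviction scan starts

    def get(self, key):
        if key not in self.values:
            return None
        self.visited[key] = True  # a hit only marks the entry, no reordering
        return self.values[key]

    def put(self, key, value):
        if key in self.values:
            self.values[key] = value
            self.visited[key] = True
            return
        if len(self.order) >= self.capacity:
            self._evict()
        self.order.append(key)
        self.values[key] = value
        self.visited[key] = False

    def _evict(self):
        i = self.hand if self.hand < len(self.order) else 0
        # Visited entries get a second chance: clear the bit and move on.
        while self.visited[self.order[i]]:
            self.visited[self.order[i]] = False
            i = (i + 1) % len(self.order)
        victim = self.order.pop(i)
        del self.values[victim], self.visited[victim]
        self.hand = i  # the entry after the victim now sits at index i


cache = SieveCache(3)
for k in ("a", "b", "c"):
    cache.put(k, k.upper())
cache.get("a")       # marks "a" as visited
cache.put("d", "D")  # "a" is spared, the unvisited "b" is evicted
print(cache.order)
```

All list manipulation happens in `put()`, which matches the point made
above: lookups only set a bit, and eviction work is paid at insertion
time.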

fix: test: Fix the rpz servfail-until-ready test when zone updates run serially

The rpz servfail-until-ready test assumed a particular policy zone always
finished loading last, which only holds when zone updates run in parallel;
on a single CPU (or with serialized offload) it could fail spuriously. It now
polls until RPZ reports ready instead.

Merge branch 'ondrej/fix-rpz-system-test-on-single-cpu' into 'main'

See merge request isc-projects/bind9!12251

Poll for RPZ readiness in the servfail-until-ready test

RPZ is ready only once every policy zone has completed its first update,
and the zones do not finish in a fixed order. Whenever the updates
run serially (per-loop offload, or any single-CPU run), the 'slow-rpz'
zone can finish before the others and the query still gets SERVFAIL.
Poll the query until it returns NOERROR instead.
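
The polling approach can be sketched independently of the test
framework. `poll_until` and the simulated responses are hypothetical
names for this sketch; the real test uses its own query helpers.

```python
import time


def poll_until(query, expected="NOERROR", attempts=10, delay=0.01):
    """Repeat `query` until it returns `expected` or attempts run out."""
    for _ in range(attempts):
        if query() == expected:
            return True
        time.sleep(delay)
    return False


# Simulated server: SERVFAIL until the last policy zone has loaded.
responses = iter(["SERVFAIL", "SERVFAIL", "NOERROR"])
print(poll_until(lambda: next(responses)))
```

Polling removes the assumption about which policy zone finishes last:
the test only waits for the observable outcome.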

Assisted-by: Claude:claude-opus-4-8

fix: dev: Keep RRL ncache fixed name alive

Move the fixed name storage out of the NCACHE branch so the name passed to dns_rrl() remains valid for cached NXDOMAIN responses.

Closes #6029

Merge branch '6029-fixedname-fix' into 'main'

See merge request isc-projects/bind9!12096

Keep RRL ncache fixed name alive

Move the fixed name storage out of the NCACHE branch so the name passed to
dns_rrl() remains valid for cached NXDOMAIN responses.

Add RRL cached NXDOMAIN system test

Add a Python system test that primes ns2's negative cache, queries the
cached NXDOMAIN through RRL, and verifies named remains responsive.

new: test: Add AnsInstance as the ans counterpart of NamedInstance

Tests interacting with mock ans servers had to hardcode their IP
addresses and open ans.run directly, while named instances already
had the NamedInstance abstraction with `.ip`, `.log` and the
watch_log_*() helpers. Factor the parts of NamedInstance that are
not named-specific into a ServerInstance base class and add an
AnsInstance subclass for ans servers, exposed through the `servers`
fixture and new ans1-ans11 convenience fixtures.

Assisted-by: Claude:claude-fable-5

Merge branch 'nicki/pytest-ans-instance' into 'main'

See merge request isc-projects/bind9!12241

Use AnsInstance fixture in dispatch test

Replace manual ans.run parsing with the ans4 fixture.

Assisted-by: Claude:claude-fable-5

Add AnsInstance as the ans counterpart of NamedInstance

Tests interacting with mock ans servers had to hardcode their IP
addresses and open ans.run directly, while named instances already
had the NamedInstance abstraction with `.ip`, `.log` and the
watch_log_*() helpers. Factor the parts of NamedInstance that are
not named-specific into a ServerInstance base class and add an
AnsInstance subclass for ans servers, exposed through the `servers`
fixture and new ans1-ans11 convenience fixtures.

Assisted-by: Claude:claude-fable-5

fix: nil: Fix a latent NULL dereference in the DoH client request helper

isc__nm_http_request()'s error path reloaded sock->h2->connect.cstream after client_send() had already detached and freed it on a submit failure, dereferencing NULL. The helper is only used by the DoH unit tests. Guard the cleanup path against the detached stream.

Closes #6160

Merge branch '6160-fix-latent-NULL-dereference-in-http2' into 'main'

See merge request isc-projects/bind9!12247

Guard against a detached stream in isc__nm_http_request() error path

On a submit failure, client_send() nullifies sock->h2->connect.cstream
and frees the stream before returning the error. The error: label in
isc__nm_http_request() reloaded that pointer and dereferenced it
unconditionally, reading through a NULL stream. The function is only
used by the DoH unit tests -- production DoH client send goes through
isc__nm_http_send()/client_httpsend(), whose submit failure is reported
via the NULL-safe send callback -- so this is a latent defect in the
test helper rather than a reachable named crash.

Skip the read callback when the stream has already been detached and
let the caller report the failure from the error result it receives.

Assisted-by: Claude:claude-opus-4-8

fix: usr: Fix a 'deny-answer-aliases' configuration bypass issue

It was possible to use a maliciously crafted authoritative
zone to make :iscman:`named` resolver synthesize a ``DNAME``
"alias" that should have been rejected by the configured
:any:`deny-answer-aliases` option. This has been fixed.

Closes #5930

Merge branch '5930-deny-answer-aliases-and-cached-dname-buf-fix' into 'main'

See merge request isc-projects/bind9!12044

Fix a 'deny-answer-aliases' bug when using a cached DNAME

When using a cached DNAME to resolve a name, make sure to consult
the denied answers lists, otherwise it is possible to construct
a restricted alias by caching a DNAME that is a parent of the
denied alias. See the comments in the test case from the previous
commit for an example.

Add a new 'deny-answer-aliases' check in 'resolver' system test

This new check exercises an attack against guarantees given by the
'deny-answer-aliases' configuration option by caching a DNAME
that is a parent of the restricted alias, and then "constructing"
the restricted alias from the cache.

fix: nil: Remove isc_mem_strndup()

The isc_mem_strndup() function had a single caller, the HTTP/2
request-path handling, which now uses isc_mem_allocate() and strlcpy()
directly. Remove the function from the libisc API.

Closes #6087

Merge branch '6087-remove-isc_mem_strndup' into 'main'

See merge request isc-projects/bind9!12240

Remove isc_mem_strndup()

The function had a single caller, the HTTP/2 request-path handling in
the network manager, and its semantics (strlen() of the source clamped
to the requested size) amounted to an obscured bounded string copy.
Replace the only use with a plain allocation and strlcpy(), and drop the
function.

fix: usr: Fix a zone transfer over TLS (XoT) issue when using the opportunistic TLS mode

The :iscman:`named` process, running as secondary DNS server,
configured to transfer a zone from a primary server using an
encrypted XoT transport in opportunistic TLS mode (i.e. without
peer certificate/hostname validation) could terminate unexpectedly
when the TLS ALPN negotiation with primary server was unsuccessful.
This has been fixed.

Closes #5957

Merge branch '5957-xot-xfrin_connect_done-bug-fix' into 'main'

See merge request isc-projects/bind9!12081

Fix a bug in xfrin.c:xfrin_connect_done()

When the connect callback's result is ISC_R_SUCCESS and the callback
changes the result because of some condition, the 'xfr' should not
be detached, because it now belongs to the receive callback.

Detach the reference only if the callback's result is non-success.

Add a check for the "doth" system test

Configure a zone transfer using XoT (with opportunistic TLS) from
a non-DoT port, which does not provide ALPN "dot" (in this case
it will try to connect to a DoH port). This is expected to fail,
but the client should handle the error gracefully and not crash.

fix: dev: Fix delegdb dump buffer overflow

A buffer used to dump a DNS name in the delegdb dump flow was using the
wrong size: it was using `DNS_NAME_MAXWIRE` which is the actual max
length of a DNS name on the wire instead of using `DNS_NAME_FORMATSIZE`
which is the maximum length of a textual representation of a DNS name
(which can be way longer than `DNS_NAME_MAXWIRE` if using the master
file escape sequence format) plus 1 (end of string byte). This could
lead to a buffer overflow. This is now fixed.

Closes #6132

Merge branch '6132-delegdb-dump-overflow' into 'main'

See merge request isc-projects/bind9!12195

Fix delegdb dump buffer overflow

A buffer used to dump a DNS name in the delegdb dump flow was using the
wrong size: it was using `DNS_NAME_MAXWIRE` which is the actual max
length of a DNS name on the wire instead of using `DNS_NAME_FORMATSIZE`
which is the maximum length of a textual representation of a DNS name
(which can be way longer than `DNS_NAME_MAXWIRE` if using the master
file escape sequence format) plus 1 (end of string byte). This could
lead to a buffer overflow. This is now fixed.
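
The size gap described above can be illustrated with a short Python sketch. It is not the delegdb code: `escape_label()`, `wire_length()` and `text_form()` are hypothetical helpers, and the escaping rule is a simplified reading of the RFC 1035 master file format (non-printable octets become `\DDD`, dots and backslashes get a backslash prefix).

```python
def escape_label(label: bytes) -> str:
    """Render one label as text (simplified RFC 1035 master-file escaping)."""
    out = []
    for byte in label:
        ch = chr(byte)
        if ch in ".\\":
            out.append("\\" + ch)          # special character: backslash prefix
        elif 0x21 <= byte <= 0x7E:
            out.append(ch)                 # printable ASCII: kept as-is
        else:
            out.append(f"\\{byte:03d}")    # anything else: \DDD, four characters
    return "".join(out)


def wire_length(labels: list) -> int:
    """Length on the wire: one length octet per label, plus the root label."""
    return sum(1 + len(label) for label in labels) + 1


def text_form(labels: list) -> str:
    """Absolute name in text form, with a trailing dot for the root."""
    return ".".join(escape_label(label) for label in labels) + "."


# A name that exactly fills the 255-octet wire limit, made of NUL octets
# so that every octet needs a four-character escape sequence.
labels = [b"\x00" * 63, b"\x00" * 63, b"\x00" * 63, b"\x00" * 61]
wire_len = wire_length(labels)
text = text_form(labels)
```

With these labels the name occupies 255 octets on the wire but 1004 characters as text, so a buffer sized for the wire form is roughly four times too small for the escaped textual form.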

Add test for delegdb dump with very long name

Add a delegdb test which dumps a database containing a very long name
(using the DNS master file format with escape sequences, as defined in
RFC 1035). This ensures that delegdb uses large enough internal buffers
to load the names into the DB and generate the dump. If this is not the
case, the test crashes on a build with AddressSanitizer enabled.

new: test: Add a system test cookbook

The README documents what the framework is; the cookbook documents how
to get common tasks done with it: iterating on a single test, adding a
new test directory, writing a regression reproducer, mocking a
misbehaving server with isctest.asyncserver, signing zones in
bootstrap(), and driving named via the NamedInstance fixtures. All
recipes are distilled from existing tests (cyclic_glue, dnssec_py,
dispatch) so they reflect the current canonical patterns.

Assisted-by: Claude:claude-fable-5

Merge branch 'nicki/systest-cookbook' into 'main'

See merge request isc-projects/bind9!12234

Add a system test cookbook

The README documents what the framework is; the cookbook documents how
to get common tasks done with it: iterating on a single test, adding a
new test directory, writing a regression reproducer, mocking a
misbehaving server with isctest.asyncserver, signing zones in
bootstrap(), and driving named via the NamedInstance fixtures. All
recipes are distilled from existing tests (cyclic_glue, dnssec_py,
dispatch) so they reflect the current canonical patterns.

Assisted-by: Claude:claude-fable-5

new: ci: Enforce AI commit-trailer rules in danger checks

`CONTRIBUTING.md` documents several rules around how AI coding assistants should (and should not) be attributed in commit messages. Teach `dangerfile.py` to enforce them so that violations are caught at MR time.

Merge branch 'mnowak/danger-ai-trailer-checks' into 'main'

See merge request isc-projects/bind9!11969

Validate Assisted-by trailer format and tool list

CONTRIBUTING.md documents the Assisted-by trailer format as

  Assisted-by: AGENT_NAME:MODEL_VERSION [TOOL1] [TOOL2]

and excludes basic development tools (git, compilers, meson,
ninja, editors, clang-format, black, ruff) from the optional
tool list.

Walk every `Assisted-by:` line in each commit message and emit a
`warn()` when:

  - the line does not match the documented `AGENT:VERSION` shape;
  - the optional tool list contains basic-tool names.

The basic-tool list extends the CONTRIBUTING.md examples with
other formatters, generic linters, and build/test runners
commonly invoked from `.gitlab-ci.yml`.  Specialized analysis
tools (coccinelle, clang-tidy, AFL, Coverity, cppcheck,
valgrind, sanitizers) are intentionally absent so they remain
allowed in the trailer.

Use `warn()` rather than `fail()` because the format is
human-written and overly strict matching would produce false
positives on edge cases.

Assisted-by: Claude:claude-opus-4-7
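
A minimal sketch of the two warnings described above, not the actual `dangerfile.py` code. It assumes the square brackets in the documented format mark optional tool names rather than literal characters. `ASSISTED_BY_RE`, `BASIC_TOOLS` and `check_assisted_by()` are hypothetical names, and the tool set is illustrative (gcc, clang, vim and emacs stand in for "compilers" and "editors").

```python
import re

# Assumed shape: "Assisted-by: AGENT_NAME:MODEL_VERSION" plus optional tools.
ASSISTED_BY_RE = re.compile(
    r"^Assisted-by:\s+(?P<agent>[^\s:]+):(?P<version>\S+)(?P<tools>(?:\s+\S+)*)\s*$"
)

# Illustrative subset of the "basic development tools" that should not be listed.
BASIC_TOOLS = {
    "git", "gcc", "clang", "meson", "ninja",
    "vim", "emacs", "clang-format", "black", "ruff",
}


def check_assisted_by(commit_message: str) -> list:
    """Return one warning string per problematic Assisted-by line."""
    warnings = []
    for line in commit_message.splitlines():
        if not line.lower().startswith("assisted-by:"):
            continue
        match = ASSISTED_BY_RE.match(line)
        if match is None:
            warnings.append(f"malformed trailer: {line!r}")
            continue
        tools = match.group("tools").split()
        basic = sorted(tool for tool in tools if tool.lower() in BASIC_TOOLS)
        if basic:
            warnings.append(f"basic tools listed: {', '.join(basic)}")
    return warnings
```

Returning warnings rather than raising mirrors the `warn()` choice in the commit message: a human-written trailer that narrowly misses the pattern is reported, not rejected.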

Reject Signed-off-by trailers from AI tools in danger check

CONTRIBUTING.md states that AI agents must not add Signed-off-by
tags, since only humans can legally certify the Developer
Certificate of Origin. Mirror the existing LLM Co-Authored-By
check against the Signed-off-by trailer line so danger fails on
commits that violate the rule.

The shared alternation of known LLM agent names is factored out
into LLM_AGENT_NAMES_RE so adding a new tool only requires one
edit.

Assisted-by: Claude:claude-opus-4-7
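
The factoring described above might look roughly like the following sketch. Only the name `LLM_AGENT_NAMES_RE` comes from the commit message; the agent list is taken from the Co-Authored-By commit in this log, while the two compiled patterns and `ai_trailer_violations()` are assumptions made for illustration.

```python
import re

# Single place to add a new tool name; both trailer checks reuse it.
LLM_AGENT_NAMES_RE = (
    r"(?:Claude|Codex|Mistral|Copilot|Gemini|Cursor|Devin|Aider"
    r"|Sourcegraph|CodeWhisperer)"
)

FLAGS = re.IGNORECASE | re.MULTILINE
LLM_SIGNED_OFF_RE = re.compile(rf"^Signed-off-by:.*\b{LLM_AGENT_NAMES_RE}\b", FLAGS)
LLM_CO_AUTHORED_RE = re.compile(rf"^Co-Authored-By:.*\b{LLM_AGENT_NAMES_RE}\b", FLAGS)


def ai_trailer_violations(commit_message: str) -> list:
    """Name the trailers in which a known LLM agent appears."""
    problems = []
    if LLM_SIGNED_OFF_RE.search(commit_message):
        problems.append("Signed-off-by")
    if LLM_CO_AUTHORED_RE.search(commit_message):
        problems.append("Co-Authored-By")
    return problems
```

A name-based match like this can also flag a human contributor who happens to share a name with a tool, so a real check would presumably need a reviewer to confirm the result.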

Detect Co-Authored-By trailers and reject AI co-authors

CONTRIBUTING.md states that AI agents must not be listed as
co-authors and that contributors should use the `Assisted-by:`
trailer instead. Teach `dangerfile.py` to fail merge requests
whose commit messages include a `Co-Authored-By:` trailer naming
a known LLM (Claude, Codex, Mistral, Copilot, Gemini, Cursor,
Devin, Aider, Sourcegraph, CodeWhisperer).

For any other `Co-Authored-By:` trailer, emit an info-level
`message()` that includes the full trailer line so reviewers can
confirm the named co-author is a human contributor and not an
unrecognised AI tool.

Assisted-by: Claude:claude-opus-4-7
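
The split between failing and merely informing could be sketched as below. This is an illustration under assumptions, not the `dangerfile.py` implementation: `classify_co_authors()` is a hypothetical helper that returns two lists, where the real check would call danger's `fail()` and `message()`.

```python
import re

KNOWN_LLMS = (
    "Claude", "Codex", "Mistral", "Copilot", "Gemini",
    "Cursor", "Devin", "Aider", "Sourcegraph", "CodeWhisperer",
)
CO_AUTHOR_RE = re.compile(
    r"^Co-Authored-By:\s*(?P<who>.+)$", re.IGNORECASE | re.MULTILINE
)
LLM_RE = re.compile(r"\b(?:" + "|".join(KNOWN_LLMS) + r")\b", re.IGNORECASE)


def classify_co_authors(commit_message: str) -> tuple:
    """Split Co-Authored-By lines into (failures, infos)."""
    failures, infos = [], []
    for match in CO_AUTHOR_RE.finditer(commit_message):
        line = match.group(0).strip()
        if LLM_RE.search(match.group("who")):
            failures.append(line)   # known LLM listed as co-author
        else:
            infos.append(line)      # shown in full so a reviewer can confirm
    return failures, infos
```

Keeping the full trailer line in the informational list matches the stated intent: the reviewer sees exactly who was named and can judge whether it is a human contributor.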

fix: usr: Fix a bug in GeoIP2 string matching

When using GeoIP2 ACLs (see :any:`acl`), :iscman:`named` could
incorrectly match a name on a substring instead of requiring a
full-name match. This has been fixed.

Closes #6019

Merge branch '6019-geoip2-string-match-buf-fix' into 'main'

See merge request isc-projects/bind9!12092

Fix 'geoip' ACL matching bug

The geoip2.c:match_string() function can incorrectly return 'true'
when matching strings of different lengths (i.e. it matches a
substring). Return 'false' when the lengths of the matched strings
are different.
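
The effect of the missing length check can be shown with a small Python sketch. It does not reproduce `geoip2.c`; it assumes the faulty comparison was bounded by the length of the database string, and both function names are hypothetical.

```python
def match_bounded(db_value: bytes, acl_value: bytes) -> bool:
    """Assumed buggy shape: compare only the first len(db_value) octets."""
    n = len(db_value)
    return db_value == acl_value[:n]


def match_full(db_value: bytes, acl_value: bytes) -> bool:
    """Fixed shape: strings of different lengths never match."""
    if len(db_value) != len(acl_value):
        return False
    return db_value == acl_value


# Database entry is a prefix of the ACL string.
bounded = match_bounded(b"New York", b"New York City")
strict = match_full(b"New York", b"New York City")
```

Here `bounded` is true although the strings differ, while `strict` is false; identical strings still match under both functions.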

Add a new check for the 'geoip2' system test

Check that an ACL can't be matched by a substring in the
GeoIP database; only a full string comparison should match.

chg: nil: Update the system test README for the pytest-native workflow

The README predated the meson migration and most of the pytest runner
features. Document building the test dependencies with meson and drop
the make-based instructions, the Makefile.am registration step, and the
stale -T flag list. Describe the jinja2 templating, bootstrap(), the
conftest fixtures, the pytest marks, and recommend node IDs and
parametrization over -k matching. Fix the directory naming rule, which
switched from hyphens to underscores.

Also declare pytest and pytest-xdist as required dependencies: the
runner's pytest.ini uses --dist=loadscope unconditionally, so pytest
without pytest-xdist cannot even start.

Related #3810

Assisted-by: Claude:claude-fable-5

Merge branch 'nicki/systest-readme-refresh' into 'main'

See merge request isc-projects/bind9!12232

Update the system test README for the pytest-native workflow

The README predated the meson migration and most of the pytest runner
features. Document building the test dependencies with meson and drop
the make-based instructions, the Makefile.am registration step, and the
stale -T flag list. Describe the jinja2 templating, bootstrap(), the
conftest fixtures, the pytest marks, and recommend node IDs and
parametrization over -k matching. Fix the directory naming rule, which
switched from hyphens to underscores.

Also declare pytest and pytest-xdist as required dependencies: the
runner's pytest.ini uses --dist=loadscope unconditionally, so pytest
without pytest-xdist cannot even start.

Related #3810

Assisted-by: Claude:claude-fable-5

fix: usr: Fix DNS-over-HTTPS (DoH) quota configuration issue

The :any:`http-listener-clients` and :any:`http-streams-per-connection`
configuration options could be truncated to smaller values (or to ``0``,
which means unlimited) when configured with values exceeding ``65535``.
Such large values are very unlikely to be used in production; the default
values for the affected options are ``300`` and ``100``, respectively.
This has been fixed.

Closes #6021

Merge branch '6021-doh-quota-type-truncation-fix' into 'main'

See merge request isc-projects/bind9!12085

Fix DoH quota global variables type

The 'named_g_http_listener_clients' and 'named_g_http_streams_per_conn'
global variables are defined as 'in_port_t', which is usually 16 bits,
but both the readers and the writers of those variables use 'uint32_t'
as the target/source, which can result in truncation.

Use correct types.
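
The truncation can be reproduced in Python with `ctypes` fixed-width integers. This is only an illustration of the arithmetic, assuming `in_port_t` is 16 bits wide as the commit message says it usually is; the helper names are hypothetical.

```python
import ctypes


def store_as_16_bit(value: int) -> int:
    """Store a configured value in a 16-bit variable (the old, too-narrow type)."""
    return ctypes.c_uint16(value).value


def store_as_32_bit(value: int) -> int:
    """Store the same value in a 32-bit variable (matching readers and writers)."""
    return ctypes.c_uint32(value).value


default_ok = store_as_16_bit(300)      # small values survive unchanged
wrapped = store_as_16_bit(65536)       # wraps around to 0, i.e. "unlimited"
shrunk = store_as_16_bit(70000)        # keeps only the low 16 bits
preserved = store_as_32_bit(70000)
```

With a 16-bit store, 65536 becomes 0 and 70000 becomes 4464, while the 32-bit store keeps 70000 intact.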

fix: usr: Ignore updates removing DNSKEY RRset with class ANY

When a Dynamic Update is received that removes the ``DNSKEY`` (or ``CDNSKEY``,
or ``CDS``) RRset, remove all records except the ones that are in use
for signing the zone.

Closes #6045

Merge branch '6045-dns-update-delete-in-use-dnskey-any' into 'main'

See merge request isc-projects/bind9!12166

ISC_ATTR_UNUSED in favor of UNUSED()

Keep our key on update removing DNSKEY RRset

When a Dynamic Update is received that removes the DNSKEY (or CDNSKEY,
or CDS) RRset, remove all records except the ones that are in use
for signing the zone (with dnssec-policy).
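
The intended outcome can be sketched as a simple filter. This is a deliberately simplified model, not the update code in :iscman:`named`: `KeyRecord` and `apply_rrset_delete()` are hypothetical, and records are tied to signing keys by key tag only, which a real implementation would not rely on alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyRecord:
    rrtype: str    # "DNSKEY", "CDNSKEY" or "CDS"
    key_tag: int   # simplified link between a record and its key


def apply_rrset_delete(rrset: list, in_use_tags: set) -> list:
    """Delete the whole RRset, except records for keys still used for signing."""
    return [record for record in rrset if record.key_tag in in_use_tags]


rrset = [KeyRecord("DNSKEY", 11111), KeyRecord("DNSKEY", 22222)]
remaining = apply_rrset_delete(rrset, in_use_tags={11111})
```

After the update only the record for the in-use key (tag 11111 in this made-up example) is left; if no key is in use, the RRset is removed entirely.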

Test removing DNSKEY records with class ANY

The update should ignore DNSKEY, CDNSKEY and CDS records
for keys that are used for signing.