Aram Sargsyan [Wed, 6 May 2026 19:36:35 +0000 (19:36 +0000)]
fix: usr: Fix a bug in allow-query/allow-transfer catalog zone custom properties
The :iscman:`named` process could terminate unexpectedly when
processing a catalog zone with an invalid ``allow-query`` or
``allow-transfer`` custom property (i.e. having a non-APL type)
coexisting with the valid property. This has been fixed.
Closes #5941
Merge branch '5941-catz-catz_process_apl-bug-fix' into 'main'
Aram Sargsyan [Mon, 4 May 2026 22:34:01 +0000 (22:34 +0000)]
Fix a bug in catz_process_apl()
The allow-transfer/allow-query catalog zone custom properties support
only APL RRtypes. All other types are correctly rejected by the
catz_process_apl() function. However, when that function has already
processed an APL RRtype and is then asked to process another (non-APL)
RRtype, an assertion fails in the function's prologue because
`*aclbp != NULL` (i.e. an APL has already been processed). Move the
type checking before the affected REQUIRE assertion.
Aram Sargsyan [Wed, 6 May 2026 18:18:58 +0000 (18:18 +0000)]
fix: usr: Fix a memory leak issue in the catalog zones
The :iscman:`named` process could leak small amounts of memory
when processing a catalog zone entry which had defined custom
primary servers with TSIG keys using both the regular ``primaries``
custom property syntax and the legacy alternative syntax (``masters``)
at the same time. This has been fixed.
Closes #5943
Merge branch '5943-catz-primaries-tsig-key-name-leak-fix' into 'main'
Ondřej Surý [Wed, 6 May 2026 04:46:42 +0000 (06:46 +0200)]
fix: usr: Prevent a crash when using both dns64 and filter-aaaa
An assertion failure could be triggered if both `dns64` and the `filter-aaaa` plugin were in use simultaneously. This happened if the plugin triggered a second recursion process, which then attempted to store DNS64 state information in a pointer that had already been set by the original recursion process. This has been fixed.
Evan Hunt [Mon, 4 May 2026 05:00:39 +0000 (22:00 -0700)]
Clear dns64_aaaaok immediately after use
The DNS64 state information stored in client->query.dns64_aaaaok
could cause an assertion failure in query_respond() if the server
was configured in such a way as to trigger a new recursion before
the query had been reset - for example, by using the filter-aaaa
plugin, which may need to recurse to find out whether an A record
exists.
This has been addressed by clearing DNS64 state information
immediately after the call to query_filter64().
Evan Hunt [Tue, 5 May 2026 23:19:59 +0000 (23:19 +0000)]
fix: dev: Fix a stack use-after-free in qpzone
In previous_closest_nsec(), a new qpreader was opened to search the NSEC
tree. It was possible for that to be used to update a QP iterator object
owned by the caller, and then be destroyed when the function returned.
This qpreader object isn't necessary anymore; since namespaces were
added to the QP trie in commit 15653c54a0, we can now just reuse the
existing reader for the main tree.
Evan Hunt [Mon, 4 May 2026 23:10:49 +0000 (16:10 -0700)]
Fix a stack use-after-free in qpzone
In previous_closest_nsec(), a new qpreader was opened to search the NSEC
tree. It was possible for that to be used to update a QP iterator object
owned by the caller, and then be destroyed when the function returned.
This qpreader object isn't necessary anymore; since namespaces were
added to the QP trie in commit 15653c54a0, we can now just reuse the
existing reader for the main tree.
Ondřej Surý [Tue, 5 May 2026 20:27:46 +0000 (22:27 +0200)]
fix: usr: Fix a crash when reconfiguring while an NTA is being rechecked
When named was reconfigured or shut down while a negative trust anchor
was being rechecked against authoritative servers, the in-flight recheck
could outlive the view that owned it and cause `named` to crash. This
has been fixed.
Evan Hunt [Mon, 4 May 2026 07:05:27 +0000 (00:05 -0700)]
Hold a reference to the NTA table for the lifetime of each NTA
Each dns__nta_t now references its parent ntatable in nta_create() and
releases it in dns__nta_destroy(). This avoids a use-after-free in
fetch_done() and other callbacks that dereference nta->ntatable: the
ntatable could otherwise be released by view destruction while an
in-flight resolver fetch still holds a reference to the NTA.
Ondřej Surý [Tue, 5 May 2026 19:06:43 +0000 (21:06 +0200)]
fix: dev: handle KSR files with DNSKEY records before any header
A DNSKEY record appearing before the first ';; KeySigningRequest'
header in a KSR file made dnssec-ksr abort on an internal assertion
instead of producing a structured error, killing pipelines that
fed it crafted or corrupted input. The tool now exits with a
fatal error naming the file and line.
Closes #5914
Merge branch '5914-dnssec-ksr-rdatalist-null-insist' into 'main'
Replace INSIST in KSR DNSKEY parser with a structured error
A DNSKEY record appearing before any ';; KeySigningRequest' header
in a KSR file made dnssec-ksr abort on INSIST(rdatalist != NULL),
which is the wrong tool for a malformed-input case. Issue a fatal()
naming the file and line instead so pipelines see a clean exit
status and an actionable message; the now-unreachable NULL check on
the rdatalist->ttl update goes away too.
Ondřej Surý [Tue, 5 May 2026 16:15:19 +0000 (18:15 +0200)]
fix: usr: Reject record sets too large to serve in DNS
When BIND was asked to store a record set whose total size exceeds
what fits in a DNS message, it would allocate memory and build the
structure, then fail later at response time. Such oversized record
sets are now rejected at the time of storage with an error, avoiding
wasted work on data that can never be served.
Merge branch 'ondrej/harden-buflen-overflow' into 'main'
makeslab(), makevec(), dns_rdatavec_merge() and dns_rdatavec_subtract()
summed per-record storage into an unsigned int with no upper-bound
check. An RRset whose total encoded size exceeds DNS_RDATA_MAXLENGTH
cannot fit in a DNS message and is unservable; building its in-memory
representation only burns memory on data that will fail at response
time, and at the upper bound the running sum could in theory wrap.
Cap the running total at DNS_RDATA_MAXLENGTH and return ISC_R_NOSPACE
when exceeded. Update the qpdb cache memory-purge test to use a
record size that fits within the new limit.
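The capped running total described above can be sketched as follows. This is a minimal illustration, not the code in makeslab() or makevec(): the helper name and the `cap` parameter (standing in for DNS_RDATA_MAXLENGTH) are assumptions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: add up per-record sizes, failing as soon as the
 * running total would exceed 'cap'. The comparison is arranged so that
 * the sum itself can never wrap.
 */
static bool
sum_with_cap_example(const size_t *sizes, size_t n, size_t cap,
		     size_t *totalp) {
	size_t total = 0;

	for (size_t i = 0; i < n; i++) {
		/* Invariant: total <= cap, so 'cap - total' cannot underflow. */
		if (sizes[i] > cap - total) {
			return false; /* would exceed the cap */
		}
		total += sizes[i];
	}
	*totalp = total;
	return true;
}
```

Comparing against `cap - total` instead of computing `total + sizes[i]` first is what keeps the check itself safe from wraparound.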
Ondřej Surý [Tue, 5 May 2026 08:49:37 +0000 (10:49 +0200)]
rem: dev: Remove obsolete KEY record flags deprecated by RFC 3445
KEY resource records originally defined NOAUTH, NOCONF, EXTENDED, and
ENTITY flags that were removed by RFC 3445 back in 2002. BIND still
carried code to parse and emit them, including the additional two-octet
flags field that followed when the EXTENDED bit was set. That handling
has been removed and the affected bit positions are now reserved.
Dropping the extended-flags handling also eliminates a possible crash
that could be reached when signing a zone containing an invalid key.
Closes #5900
Merge branch '5900-remove-keyflag-extended' into 'main'
Mark Andrews [Thu, 30 Apr 2026 23:06:36 +0000 (09:06 +1000)]
Remove remaining RFC 3445 KEY flags
RFC 3445 also eliminated the DNS_KEYTYPE_NOAUTH, DNS_KEYTYPE_NOCONF,
and DNS_KEYOWNER_ENTITY flags. With NOAUTH and NOCONF gone, the
concept of NOKEY can no longer be expressed in KEY records.
DNS_KEYOWNER_ENTITY was already unused as of 22d688f656 but still
defined; that is now also removed.
The DNS_KEYFLAG_EXTENDED flag was only legitimate for type KEY
and was eliminated by RFC 3445. Dropping the extended-flags
handling in pub_compare() also fixes a possible crash when
signing a zone whose journal contains a crafted DNSKEY: a
6-byte record with the EXTENDED bit set produced a memmove()
length that underflowed and ran off a stack buffer.
Ondřej Surý [Mon, 4 May 2026 12:58:42 +0000 (14:58 +0200)]
fix: usr: Prevent crafted queries from degrading RRL performance
With response rate limiting enabled, an attacker sending queries from many
spoofed source addresses could steer entries into the same slot of the
internal rate-limit table and slow down query processing on the affected
server. The table now uses a per-process keyed hash so the placement of
entries cannot be predicted or influenced from the network.
Closes #5906
Merge branch '5906-rrl-hash-collision-dos' into 'main'
The previous hash_key() was a deterministic, unkeyed (<<1) + add over the
key words. An off-path attacker could invert it offline and submit
queries whose source /24, qname hash, and qtype map to a single bucket;
under chaining this turns every lookup into an O(N) walk under
rrl->lock and starves legitimate query processing on the very feature
deployed to mitigate DoS.
Replace it with isc_hash32(), which is HalfSipHash-2-4 keyed by a
per-process random seed, so collision sets cannot be precomputed.
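To illustrate why an unkeyed shift-and-add hash is easy to collide offline, here is a simplified stand-in. It is not the removed hash_key() verbatim; the function name and the two-word inputs are made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified stand-in for an unkeyed shift-and-add hash. There is no
 * secret input, so anyone can evaluate it and search for colliding
 * inputs ahead of time.
 */
static uint32_t
shift_add_hash_example(const uint32_t *words, size_t n) {
	uint32_t h = 0;

	for (size_t i = 0; i < n; i++) {
		h = (h << 1) + words[i];
	}
	return h;
}
```

With this sketch the inputs {1, 0} and {0, 2} both hash to 2. A keyed hash mixes in a per-process random seed, so such pairs cannot be worked out without knowing the seed.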
Ondřej Surý [Fri, 1 May 2026 06:18:44 +0000 (08:18 +0200)]
fix: dev: Avoid named assertion failure during parent-NS lookups when none exist
Configuring the root zone as a signed primary with parental agents (or with
notify-on-cds-changes) caused named to exit on an internal assertion as soon
as the DS-publication machinery tried to look up the parent NS RRset — the root
has no parent. The lookup is now short-circuited cleanly.
Similarly, a zone with no NS records in the parent caused named to exit in the same way.
Closes #5910
Merge branch '5910-nsfetch-start-root-domain-assertion' into 'main'
Once the walk reaches the root, splitting one more label off would
trip an internal assertion and abort named. Stop cleanly with
ISC_R_NOTFOUND so the dispatcher cancels the fetch. Only reachable
through misconfiguration (root configured as a primary with parental
agents, or a parent zone that NODATAs its own NS).
This is required to AXFR and verify the root zone and it makes no
difference for non-root zones (dnssec-verify takes FQDN or makes the
provided name absolute).
Add a test case where the root zone has dnssec-policy configured, with
checkds enabled. This is a silly case because the root does not have
any parent NS records, but it should not crash the server.
The same is true for zones that do not have parent NS records, but
eventually they will hit the same code path.
Ondřej Surý [Fri, 1 May 2026 05:50:38 +0000 (07:50 +0200)]
chg: dev: Catch rare named crash in recursive resolution earlier for diagnosis
A rare crash has been observed in named while it is resolving upstream nameserver
addresses for a recursive query, surfacing as a segmentation fault with no immediate
clue as to the cause. This change adds internal consistency checks so that a future
occurrence of the same condition aborts named with a diagnostic message at the point
the inconsistency arises, rather than corrupting state and crashing later in
an unrelated location.
Closes #5602
Merge branch '5602-adb-find-sanity-checks' into 'main'
Ondřej Surý [Fri, 1 May 2026 04:44:06 +0000 (06:44 +0200)]
Assert adb find loop-affinity invariant at lifetime entry points
The dns_adbfind_t lifetime model has no reference counting; storage
liveness is held together by find->lock and the FIND_EVENT_SENT
idempotency flag, plus an unwritten cross-module rule that all
non-trivial operations on a find run on find->loop. If a caller
violates that rule, the unlock-relock window in dns_adb_cancelfind
(and similar paths) becomes a use-after-free and we crash later
inside libpthread on a corrupted mutex.
Add REQUIREs at dns_adb_cancelfind, dns_adb_destroyfind and
find_sendevent so a violation aborts at the offending call site
rather than silently freeing storage another loop is still touching.
Also poison find->magic with ~DNS_ADBFIND_MAGIC in free_adbfind so
DNS_ADBFIND_VALID catches reuse-after-free at the next public entry
point instead of letting the dangling pointer reach the mutex code.
Ondřej Surý [Fri, 1 May 2026 05:19:57 +0000 (07:19 +0200)]
fix: dev: Harden dig's EDNS option parsing against malformed replies
dig's parser for EDNS options in a DNS reply now stops cleanly when an
option declares a length that runs past the end of the option data,
rather than trusting the upstream OPT-record validator to reject the
reply first. This is a defensive change; behavior is unchanged in
practice.
Merge branch 'ondrej/dig-process-opt-edns-optlen-oob' into 'main'
Bound EDNS option length in dig's process_opt() walk
process_opt() reads the per-option (optcode, optlen) header from the
OPT rdata and then advances the buffer by optlen, both for the COOKIE
branch (via process_cookie()) and for any other optcode. The walk
itself never compared optlen to the buffer remainder; the only reason
it cannot trip the isc_buffer_forward() REQUIRE today is that
fromwire_opt() (lib/dns/rdata/generic/opt_41.c) already validates each
option's length against the rdata bounds before the rdataset is
handed back, so process_opt() never sees a self-inconsistent rdata.
That upstream guarantee is fine, but it leaves the local walker
trusting an invariant established elsewhere. Add a defensive check
that just stops the walk when a future caller (a cached message, an
alternate parser, a refactor of the OPT validator) hands process_opt()
a buffer where optlen would run past the end.
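A bounded walk of this kind might look like the sketch below. It is not dig's process_opt(): the function name is hypothetical and it only counts options instead of handling them. The 2-octet code plus 2-octet length header is the standard EDNS option layout.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical walker: count the options in an OPT rdata buffer,
 * stopping at the first one whose declared length would run past the
 * end of the buffer.
 */
static size_t
count_edns_options_example(const uint8_t *buf, size_t len) {
	size_t off = 0;
	size_t count = 0;

	/* Each option starts with a 2-octet code and a 2-octet length. */
	while (len - off >= 4) {
		uint16_t optlen =
			(uint16_t)((buf[off + 2] << 8) | buf[off + 3]);

		off += 4;
		if (optlen > len - off) {
			break; /* declared length exceeds what is left */
		}
		off += optlen; /* skip the option payload */
		count++;
	}
	return count;
}
```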
Michał Kępień [Thu, 30 Apr 2026 20:34:55 +0000 (22:34 +0200)]
fix: ci: Use "git push --force-with-lease" for autorebases
If a merge request is merged to an autorebased branch while it is
getting rebased, the "git push -f" command at the end of the autorebase
job will cause the contents of that merge request to be silently deleted
from Git history even though the merge request will still be (correctly)
shown as "merged" by GitLab.
Use "git push --force-with-lease" instead to prevent force-pushing the
rebased version of the branch if it is pushed to after its pre-rebase
version is fetched by the autorebase job. Report such an event
accordingly. For simplicity, no retries are attempted as the problem is
expected to be resolved by the next autorebase and the chances of this
scenario happening in practice are already low to begin with.
Merge branch 'michal/use-git-push-force-with-lease-for-autorebases' into 'main'
Michał Kępień [Thu, 30 Apr 2026 20:19:59 +0000 (22:19 +0200)]
Use "git push --force-with-lease" for autorebases
If a merge request is merged to an autorebased branch while it is
getting rebased, the "git push -f" command at the end of the autorebase
job will cause the contents of that merge request to be silently deleted
from Git history even though the merge request will still be (correctly)
shown as "merged" by GitLab.
Use "git push --force-with-lease" instead to prevent force-pushing the
rebased version of the branch if it is pushed to after its pre-rebase
version is fetched by the autorebase job. Report such an event
accordingly. For simplicity, no retries are attempted as the problem is
expected to be resolved by the next autorebase and the chances of this
scenario happening in practice are already low to begin with.
fix: usr: Reject negative and out-of-range TTLs in dnssec-* tools
The dnssec-* tools accepted negative and out-of-range values for TTL
flags such as dnssec-keygen -L, dnssec-signzone -t and
dnssec-settime -L, silently turning them into TTLs of around 136 years
in the resulting key or zone files. The flag values are now validated
and rejected with a clear "TTL must be non-negative" or "TTL out of
range" error.
Closes #5923
Merge branch '5923-dnssectool-strtottl-negative-ttl-accepted' into 'main'
Reject negative and out-of-range TTLs in dnssec-* tools
strtottl() parsed the operator's TTL string with strtol() and assigned
the long directly to dns_ttl_t (uint32_t) with no sign or ERANGE
check. The only validation was the "no digits parsed" branch, so a
fully-consumed "-1" became UINT32_MAX (~136 years) and was silently
written into DNSKEY/key files by dnssec-keygen -L, dnssec-signzone -t,
dnssec-settime -L, etc. Any signing pipeline interpolating the TTL
from a variable could mint a key with a multi-decade TTL and never see
an error.
Switch to strtoul(), reject a leading '-' explicitly (strtoul silently
negates), check errno == ERANGE, and reject values exceeding
UINT32_MAX before handing the result to time_units(). The pre-existing
multiplication wrap inside time_units() is tracked separately.
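The checks listed above can be sketched like this. The helper is hypothetical and is not BIND's strtottl(): it handles only plain decimal input and leaves unit suffixes to the caller.

```c
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a plain decimal TTL into a uint32_t,
 * refusing negative and out-of-range input.
 */
static bool
parse_ttl_example(const char *str, uint32_t *ttlp) {
	const char *p = str;
	char *end = NULL;
	unsigned long val;

	/* strtoul() skips leading whitespace, so skip it before the sign check. */
	while (isspace((unsigned char)*p)) {
		p++;
	}
	/* strtoul() accepts "-1" and negates it; reject the sign up front. */
	if (*p == '-') {
		return false;
	}

	errno = 0;
	val = strtoul(p, &end, 10);
	if (end == p || *end != '\0') {
		return false; /* no digits, or trailing garbage */
	}
	if (errno == ERANGE || val > UINT32_MAX) {
		return false; /* does not fit in a 32-bit TTL */
	}

	*ttlp = (uint32_t)val;
	return true;
}
```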
Colin Vidal [Thu, 30 Apr 2026 14:31:21 +0000 (16:31 +0200)]
fix: test: Fix `cyclic_glue` system test
The `cyclic_glue` system test was not explicitly waiting for the dump to
complete. As a result, the test could read an outdated dump file and
perform assertions on database state. Fix this by waiting for the `dumpdb`
command to finish before reading `named_dump.db`.
Merge branch 'colin/fix-cyclic_glue-test' into 'main'
Colin Vidal [Thu, 30 Apr 2026 13:18:03 +0000 (14:18 +0100)]
Fix `cyclic_glue` system test
The `cyclic_glue` system test was not explicitly waiting for the dump to
complete. As a result, the test could read an outdated dump file and
perform assertions on database state. Fix this by waiting for the `dumpdb`
command to finish before reading `named_dump.db`.
fix: dev: Reject RSA DNSKEYs with degenerate modulus
A crafted DNSKEY rdata whose declared exponent length consumed the
whole buffer produced an RSA key with no modulus, which dnssec-importkey
accepted as valid and wrote to a .private file with no key material.
The wire-format parser now rejects RSA public keys with a modulus
smaller than 512 bits, the lowest legitimate size across the RSA
DNSSEC algorithms.
Closes #5920
Merge branch '5920-opensslrsa-fromdns-zero-modulus-accepted' into 'main'
Reject RSA DNSKEYs with degenerate modulus at parse time
The wire-format RSA DNSKEY parser used the residual rdata length after
the exponent as the modulus length, with no positive lower bound. A
crafted DNSKEY whose declared exponent length consumed the whole buffer
produced n = 0; BN_bin2bn(_, 0, _) then returned a non-NULL BIGNUM, the
NULL-check passed, and dnssec-importkey -f wrote out a "valid" key with
no key material. RSASHA1 also bypassed the algorithm-specific lower
bound in opensslrsa_createctx (which only checks an upper bound for the
SHA1 algorithms), so the degenerate key reached the verify path with
whatever behaviour the linked OpenSSL exhibits for n = 0.
Add OPENSSLRSA_MIN_MODULUS_BITS = 512 (the lowest legitimate modulus
across the RSA DNSSEC algorithms per RFC 5702) and reject smaller
moduli at parse time in opensslrsa_fromdns, opensslrsa_parse, and
opensslrsa_fromlabel — the same three load paths where the existing
exponent upper-bound check lives.
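The lower bound can be illustrated on the RSA public key field as laid out in RFC 3110: an exponent length, the exponent, and then the modulus in whatever remains. The sketch below is not opensslrsa_fromdns(). The names are hypothetical, and it approximates the modulus size by its byte length where the real check works on the bit count.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_MIN_MODULUS_BITS 512 /* mirrors the limit named above */

/*
 * Hypothetical check on an RSA public key field: a one-octet exponent
 * length (or 0 followed by a two-octet length), the exponent, then the
 * modulus in the remaining octets.
 */
static bool
rsa_modulus_ok_example(const uint8_t *key, size_t len) {
	size_t e_len, off, modulus_len;

	if (len < 1) {
		return false;
	}
	e_len = key[0];
	off = 1;
	if (e_len == 0) {
		if (len < 3) {
			return false;
		}
		e_len = ((size_t)key[1] << 8) | key[2];
		off = 3;
	}
	if (e_len == 0 || e_len > len - off) {
		return false; /* exponent missing or overruns the buffer */
	}

	/* Whatever follows the exponent is the modulus; it may be empty. */
	modulus_len = len - off - e_len;
	return modulus_len * 8 >= EXAMPLE_MIN_MODULUS_BITS;
}
```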
fix: usr: Fix dig -x crash on excessively long arguments
dig -x crashed with a segmentation fault rather than printing an
error when given an argument with thousands of dot-separated
components. dig -x now rejects such inputs cleanly with "Invalid IP
address".
Closes #5917
Merge branch '5917-dig-reverse-octets-stack-overflow' into 'main'
reverse_octets() recursed once per dot, with depth bounded only by
ARG_MAX (~2 MiB on Linux), so feeding dig -x a deep input like
'1.1.1.…1' busted the call stack and crashed the tool with SIGSEGV
instead of a structured error. The transformation it performs is
purely textual (split on '.', emit components in reverse), so the
recursion was never load-bearing.
Walk the input once into a fixed-size array of label slices, capped at
DNS_NAME_MAXLABELS (which is the most we could ever fit into the
result buffer anyway), then iterate the array in reverse to write the
output. Inputs with more than DNS_NAME_MAXLABELS labels now return
DNS_R_NAMETOOLONG, which dig.c surfaces as 'Invalid IP address' and
exit 1. Drop the unnecessary (int) casts on ptrdiff_t/size_t lengths
while at it.
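A non-recursive version along these lines could look like the following sketch. It is not the actual reverse_octets(): the function name, the return codes, and the EXAMPLE_MAXLABELS value are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

#define EXAMPLE_MAXLABELS 128 /* assumed cap; stands in for DNS_NAME_MAXLABELS */

struct slice {
	const char *start;
	size_t len;
};

/*
 * Hypothetical iterative rewrite: split 'in' on '.', then emit the
 * components in reverse order, each followed by '.'.
 * Returns 0 on success, -1 if there are too many components or the
 * output buffer is too small.
 */
static int
reverse_labels_example(const char *in, char *out, size_t outlen) {
	struct slice labels[EXAMPLE_MAXLABELS];
	size_t count = 0;
	size_t used = 0;
	const char *p = in;

	for (;;) {
		const char *dot = strchr(p, '.');
		size_t len = (dot != NULL) ? (size_t)(dot - p) : strlen(p);

		if (count == EXAMPLE_MAXLABELS) {
			return -1; /* bounded by the array, not by the stack */
		}
		labels[count].start = p;
		labels[count].len = len;
		count++;

		if (dot == NULL) {
			break;
		}
		p = dot + 1;
	}

	for (size_t i = count; i-- > 0;) {
		/* Need room for the label, its trailing '.', and the final NUL. */
		if (labels[i].len + 2 > outlen - used) {
			return -1;
		}
		memcpy(out + used, labels[i].start, labels[i].len);
		used += labels[i].len;
		out[used++] = '.';
	}
	out[used] = '\0';
	return 0;
}
```

Stack usage is fixed by the size of the `labels` array no matter how long the input is, which is the property the recursive version lacked.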
Michał Kępień [Thu, 30 Apr 2026 10:17:32 +0000 (12:17 +0200)]
new: ci: Set up automatic rebasing for security-* branches
Introduce a set of private branches containing only security fixes that
are automatically rebased onto the corresponding open source branches
whenever new changes are merged. Each rebase triggers a basic build,
failing the CI job if the build breaks.
When a security-* branch is rebased, create a CI pipeline for its new
revision and rebase its corresponding bind-9.x-sub branch (if it exists)
on top of it, creating a rebase chain.
Report any failures in the process via Mattermost.
These changes enable treating security fixes similarly to other code
changes, without deferring merges all the way until release prep.
Merge branch 'michal/autorebase-chain' into 'main'
Michał Kępień [Thu, 30 Apr 2026 09:58:55 +0000 (11:58 +0200)]
Set up automatic rebasing for security-* branches
Introduce a set of private branches containing only security fixes that
are automatically rebased onto the corresponding open source branches
whenever new changes are merged. Each rebase triggers a basic build,
failing the CI job if the build breaks.
When a security-* branch is rebased, create a CI pipeline for its new
revision and rebase its corresponding bind-9.x-sub branch (if it exists)
on top of it, creating a rebase chain.
Report any failures in the process via Mattermost.
These changes enable treating security fixes similarly to other code
changes, without deferring merges all the way until release prep.
fix: usr: Stop delv from aborting on a malformed query name
delv aborted with SIGABRT instead of exiting cleanly when given a query
name that failed wire-format conversion (e.g. a label longer than 63
octets). delv now prints the parse error and exits with a normal
failure code.
Closes #5916
Merge branch '5916-delv-run-resolve-null-detach-abort' into 'main'
run_resolve allocates dns_client_t late, but the cleanup epilogue
called dns_client_detach() unconditionally. When convert_name() or
dns_client_create() failed first, the detach hit a NULL client and
the REQUIRE(DNS_CLIENT_VALID) inside it aborted the process with
SIGABRT instead of a clean error exit.
Guard the detach with a NULL check. Add a digdelv test that runs
delv on a query name whose first label exceeds 63 octets and
asserts the process does not exit 134.
fix: usr: prevent malicious DNSSEC zones from exhausting validator CPU
A DNSSEC-signed zone could publish a DNSKEY with an unusually large
RSA public exponent and force any validator resolving names in that
zone to spend disproportionate CPU verifying signatures. The
validator now rejects such DNSKEYs, matching the limit already
applied to keys read from files or HSMs.
Closes #5881
Merge branch '5881-rsa-exponent-keytrap-cpu-amplification' into 'main'
Reject RSA DNSKEYs with oversize public exponents at parse time
The wire-format RSA DNSKEY parser was the only key path with no upper
bound on the public exponent — opensslrsa_parse and opensslrsa_fromlabel
already cap at RSA_MAX_PUBEXP_BITS. An attacker-controlled DNSKEY could
therefore force a validator to compute s^e mod n with e up to ~|n| bits,
amplifying every verify by ~120x for typical 2048-bit moduli (OpenSSL
itself only caps the exponent for moduli above 3072 bits). Apply the
same bit-count cap to wire-format keys.
fix: usr: prevent rare named crash when notifies are cancelled
Under heavy load, named could occasionally crash when a queued
outbound notify or zone refresh was cancelled at the moment it
was being sent — for example, while a zone was being reloaded or
removed. The race that caused the crash is now prevented.
Closes #5915
Merge branch '5915-ratelimiter-dequeue-tick-uaf' into 'main'
isc__ratelimiter_tick() and isc_ratelimiter_shutdown() each pulled
events out of rl->pending into a function-local list, dropped the
mutex, and then iterated. ISC_LIST_APPEND leaves the link in the
LINKED state, so a concurrent isc_ratelimiter_dequeue() saw an
event as still queued, called ISC_LIST_UNLINK against rl->pending —
which patched the prev/next of the local list — and freed the
event before dispatch finished, producing either an INSIST in the
unlink macro or a use-after-free in the dispatch loop.
isc_async_run() is a non-blocking wfcq enqueue, so there is no
benefit to dropping the mutex around it. Unlink each event and
hand it to isc_async_run() while still holding rl->lock; the
existing ISC_LINK_LINKED check in dequeue then correctly
distinguishes "still queued and cancellable" from "already taken".
fix: dev: free per-command rndc state when response serialisation fails
When isccc_cc_towire failed while building an rndc reply,
control_respond returned without releasing the per-command request,
response, HMAC secret copy, and text buffer. They were eventually
freed when the connection closed, but until then the HMAC key copy
stayed in named's memory. The failure path now goes through the
same cleanup label as every other error.
Closes #5913
Merge branch '5913-controlconf-control-respond-cleanup-leak' into 'main'
Run conn_cleanup on isccc_cc_towire failure in control_respond
The bare return left conn->secret, conn->response, conn->request, and
conn->text pinned until the connection itself was torn down — every
other error in the function reaches conn_cleanup via goto, and the
success path falls into the same label, so the towire-failure return
was the lone outlier. Send it through the existing cleanup path.
testgen existed only to let the rndc system test generate large response payloads.
It accepted an unbounded count and was reachable from read-only control channels,
so any read-only rndc client could drive named into memory exhaustion. The command
and its supporting test helper are gone; remaining rndc commands already produce
non-trivial responses, so transport coverage is preserved.
Closes #5911
Merge branch '5911-rndc-testgen-32bit-truncation-memory-exhaustion' into 'main'
testgen existed solely to let the rndc system test exercise large
response payloads — it has no operator value, accepts an unbounded
count, and could be invoked by any read-only rndc client to drive
named into memory exhaustion. Drop the command, the gencheck helper
that validated its output, and the buffer-size loop in the rndc
system test; the remaining rndc subcommands already produce
non-trivial responses, so the framing path stays exercised.
fix: dev: Fix swapped arguments in redirect2() single-label branch
On a recursive resolver with nxdomain-redirect configured, an
NXDOMAIN result for a query whose qname is the root could corrupt
the view's nxdomain-redirect target, after which the redirect
feature stopped working for every subsequent query in that view
until named was restarted.
Closes #5908
Merge branch '5908-query-redirect2-name-copy-arg-swap' into 'main'
Fix swapped arguments in redirect2() single-label branch
For a query whose qname is the root, the labels==1 branch in
redirect2() called dns_name_copy(redirectname, view->redirectzone)
with arguments reversed, overwriting the view-global
nxdomain-redirect target with the empty redirectname rather than
copying the configured target into the per-query lookup name. After
the corruption, view->redirectzone names the root, so
dns_name_issubdomain() makes redirect2() short-circuit for every
subsequent query and the nxdomain-redirect feature stops working
until named is restarted.
Triggering this needs the resolver to receive an NXDOMAIN for the
root from upstream, which does not happen in normal DNS operation.
Swap the arguments to match the dns_name_copy(source, dest)
signature. Add a system test that issues a root query through the
nxdomain-redirect resolver and verifies the redirect feature still
works for a normal NXDOMAIN-producing query afterwards.
`rndc-confgen -A hmac-sha384` and `-A hmac-sha512` documented a `-b`
range of 1..1024, but any value above 512 aborted on hardened builds
instead of producing a key. The full advertised range now works.
Closes #5903
Merge branch '5903-hmac-generate-stack-overflow' into 'main'
Size HMAC key generation buffers to the maximum block size
hmac_generate() declared its on-stack nonce buffer as
unsigned char data[ISC_MAX_MD_SIZE], i.e. 64 bytes. That is the maximum
digest size, but the buffer is filled up to the algorithm's HMAC block
size, which is 128 bytes for SHA-384 and SHA-512. Asking rndc-confgen
for an HMAC-SHA-384 or HMAC-SHA-512 key with -b > 512 (the documented
range allows up to 1024) wrote past the end of the stack buffer; on
hardened builds this aborted with a stack-smash detector firing
instead of producing a key.
Use the existing ISC_MAX_BLOCK_SIZE (128) for the buffer so the full
1..1024 range advertised by -A hmac-sha{384,512} works as documented.
The matching key_rawsecret[64] in confgen's generate_key() is enlarged
the same way so the generated key fits when dumped to the buffer.
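The 512-bit threshold follows from simple arithmetic, sketched below. The macro and function names are hypothetical; the two sizes are the ones quoted above for ISC_MAX_MD_SIZE and ISC_MAX_BLOCK_SIZE.

```c
#include <stddef.h>

#define EXAMPLE_MAX_MD_SIZE    64  /* largest digest, in bytes */
#define EXAMPLE_MAX_BLOCK_SIZE 128 /* largest HMAC block, in bytes */

/* Bytes needed to hold a key of 'bits' bits, rounded up. */
static size_t
key_bytes_example(unsigned int bits) {
	return ((size_t)bits + 7) / 8;
}
```

A 512-bit key needs exactly 64 bytes and still fits the old buffer. A 513-bit key needs 65 bytes and does not, while the 1024-bit maximum needs 128 bytes and fits the enlarged one.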
Add a system test that exercises rndc-confgen across the previously
overflowing keysizes; with -Db_sanitize=address it caught the abort
before the fix.
fix: dev: Do not follow symlinks when chowning the NZD database
When `named` runs as root, the per-view NZD database file is chowned
to the user `named` drops to. The chown call followed symlinks, so a
symlink at the database path could redirect the ownership change to an
unrelated file. The chown now refuses non-regular files and never
follows symlinks.
Closes #5905
Merge branch '5905-nzd-env-close-symlink-chown' into 'main'
When named is running as root, nzd_env_close() chowns the per-view
NZD database file to the unprivileged user that named will drop to.
The call used chown(), which follows symlinks, so a symlink at the
NZD path would silently transfer ownership of whatever the link
pointed at instead of the database file itself.
Switch to lstat() + S_ISREG() + lchown() so the chown only fires when
the path is a regular file and never traverses a symlink even if one
is planted between the lstat and the lchown.
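The lstat() + S_ISREG() + lchown() sequence can be sketched as follows. The helper name and its return convention are assumptions; this is not the code in nzd_env_close().

```c
#define _POSIX_C_SOURCE 200809L

#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: change ownership of 'path' only if it is a
 * regular file, and never through a symlink.
 * Returns 0 on success, -1 otherwise.
 */
static int
chown_regular_nofollow_example(const char *path, uid_t uid, gid_t gid) {
	struct stat st;

	/* lstat() describes the link itself rather than its target. */
	if (lstat(path, &st) != 0) {
		return -1;
	}
	if (!S_ISREG(st.st_mode)) {
		return -1; /* symlink, directory, device, ... */
	}
	/*
	 * lchown() does not follow symlinks, so a link swapped in after the
	 * lstat() changes the link's owner, not that of its target.
	 */
	if (lchown(path, uid, gid) != 0) {
		return -1;
	}
	return 0;
}
```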
fix: usr: Validate key names in rndc-confgen, tsig-keygen, ddns-confgen
The three tools embedded the key-name argument verbatim into the
generated `named.conf` block, so a name containing characters like
`"`, `{`, or `;` produced output that did not match the intended
`key` clause. Key names are now restricted to letters, digits, dots,
hyphens, and underscores.
Closes #5904
Merge branch '5904-confgen-keyname-config-injection' into 'main'
Reject unsafe key names in rndc-confgen, tsig-keygen, ddns-confgen
The three tools interpolated their key-name argument verbatim into the
generated 'key "..." { ... };' clause. A name containing '"', '{', '}',
or ';' could close the clause and append additional named.conf
statements — for example, a second key block with an attacker-chosen
secret. The injected output passes named-checkconf and is loaded by
named as a valid configuration. The risk shows up when an automation
wrapper feeds tenant or zone names from a less-trusted source through
-k / -y / -s / -z (or the tsig-keygen positional argument).
Validate the final key name (after the optional -s / -z suffix is
concatenated in tsig-keygen) against [A-Za-z0-9._-]+ and exit with an
error otherwise. The allowlist covers the documented usage; every
character used in the injection vectors is excluded.
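An allowlist check equivalent to the pattern above can be sketched in a few lines. The function name is illustrative only and is not taken from the tools.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical allowlist check equivalent to the pattern
 * [A-Za-z0-9._-]+ (one or more allowed characters, nothing else).
 */
static bool
keyname_is_safe_example(const char *name) {
	static const char allowed[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
				      "abcdefghijklmnopqrstuvwxyz"
				      "0123456789._-";
	size_t len = strlen(name);

	/* Non-empty, and every character must come from the allowlist. */
	return len > 0 && strspn(name, allowed) == len;
}
```

The check rejects everything outside the allowlist, so new injection characters do not need to be listed one by one.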
Add a system test that runs the documented PoC payloads through each
tool and asserts a non-zero exit, plus sanity coverage for the default
key names and dotted DNS-style names.
fix: usr: Fix suppressed missing-glue check in named-checkzone
named-checkzone and named-checkconf -z silently skipped the
missing-glue check for any NS name that had already triggered an
extra-AAAA-glue warning, so zones missing required A glue could pass
validation and be deployed with broken delegations.
Merge branch 'ondrej/check-tool-err-glue-code-collision' into 'main'
Resolve ERR_MISSING_GLUE / ERR_EXTRA_AAAA value collision
Both constants were defined as 5. The symbol table used by checkns() to
deduplicate log messages keys on (name, error_code), so logging an
extra-AAAA error caused logged() to also return true for the
missing-glue check, silently skipping the entire missing-glue block for
the same name in named-checkzone and named-checkconf -z.
Convert the ERR_* defines to an auto-numbered enum so the compiler
guarantees the values stay pairwise distinct.
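The effect of the conversion can be shown with a small sketch. Apart from ERR_MISSING_GLUE and ERR_EXTRA_AAAA, the member names below are placeholders, and the real enum's order and starting value may differ.

```c
/*
 * Illustrative only: with an auto-numbered enum each member gets the
 * previous value plus one, so two members can never share a value
 * unless one is assigned explicitly.
 */
enum check_err_example {
	ERR_PLACEHOLDER_A = 1,
	ERR_PLACEHOLDER_B,
	ERR_MISSING_GLUE,
	ERR_EXTRA_AAAA,
};

/* Compile-time guard: the two deduplication keys must differ. */
_Static_assert(ERR_MISSING_GLUE != ERR_EXTRA_AAAA,
	       "dedup keys must be distinct");
```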
fix: dev: Validate -l and -L numeric arguments in named-checkzone
named-checkzone and named-compilezone parsed the -l (max TTL) and -L
(source serial) arguments with strtol(), so a negative value such as
-1 silently became UINT32_MAX and out-of-range values were truncated
to 32 bits without warning; -l in particular appeared to cap TTLs but
no longer enforced anything. Both flags now go through isc_parse_uint32()
and reject any value that is not a valid 32-bit unsigned integer.
Merge branch 'ondrej/named-checkzone-strtol-truncation' into 'main'
The -l (max TTL) and -L (source serial) flags parsed their arguments
with strtol() and assigned the result directly to uint32_t with no
range check. A negative value such as -1 became UINT32_MAX, which made
-l silently disable the TTL cap it claimed to enforce, and out-of-range
values truncated to 32 bits without warning.
Switch both flags to isc_parse_uint32(), which rejects leading non-
alphanumeric input (catching '-'), checks ERANGE, and validates the
32-bit range, so an invalid argument now exits with an error instead
of being silently coerced.
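
The difference between the two behaviours can be modelled in a few lines of Python. This is a sketch, not the isc_parse_uint32() implementation: `wrap_like_uint32_assignment` imitates assigning a signed value to a uint32_t, and `parse_uint32` is a hypothetical strict parser that accepts decimal digits only (the decimal-only restriction is an assumption of this sketch).

```python
UINT32_MAX = 0xFFFFFFFF


def wrap_like_uint32_assignment(value):
    """Model assigning a signed long to uint32_t: keep only the low 32 bits."""
    return value & UINT32_MAX


def parse_uint32(text):
    """Strict parser in the spirit of the fix: reject a leading '-', any
    non-digit input, and any value that does not fit in 32 bits."""
    if not (text.isascii() and text.isdigit()):
        raise ValueError("bad number: %r" % (text,))
    value = int(text)
    if value > UINT32_MAX:
        raise ValueError("out of range: %r" % (text,))
    return value
```

Under the old behaviour `-1` wraps to `UINT32_MAX`, which is why a TTL cap of `-1` capped nothing; the strict parser raises an error instead.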
fix: usr: Stop rndc-confgen from following symlinks when writing the keyfile
When rndc-confgen -a (re)created the rndc control key, it followed a
symbolic link if one happened to exist at the keyfile path: the
existence check looked through the link, then the file was truncated,
its ownership changed, and the key contents written into whatever file
the link pointed at. rndc-confgen now refuses to follow symbolic links
at the keyfile path and fails with an error instead, so the wrong file
can no longer be overwritten by accident.
Merge branch '5901-rndc-confgen-symlink-attack' into 'main'
The function existence-checked the target with stat() and then opened
the same path without O_NOFOLLOW, so a symlink at the target path
passed the regular-file test against the link's destination and the
open() that followed truncated and wrote through the link.
rndc-confgen -a is typically run as root and writes the keyfile under
a directory that service accounts may have write access to, so a stray
symlink there would silently redirect the truncate, fchown, and
overwrite to whatever file the link pointed at.
Switch the existence check to lstat() and use S_ISREG() so a symlink's
S_IFLNK mode is detected directly (a plain bitmask of S_IFREG matches
both, since S_IFLNK shares its high bit). Add O_NOFOLLOW to both
open() flag sets to close the lstat/open TOCTOU window. Hardening
against unexpected symlinks on intermediate path components is out of
scope.
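
The same lstat() plus O_NOFOLLOW pattern can be sketched with Python's `os` and `stat` modules on a POSIX system. The function name `open_keyfile_nofollow` and the 0o600 creation mode are assumptions made for this illustration; the real change is in the C source of rndc-confgen.

```python
import os
import stat


def open_keyfile_nofollow(path):
    """Open `path` for writing without following a symlink at the final
    path component. Returns a file descriptor; raises OSError otherwise."""
    try:
        st = os.lstat(path)  # lstat() describes the link itself, not its target
    except FileNotFoundError:
        st = None
    if st is not None and not stat.S_ISREG(st.st_mode):
        raise OSError("%s: exists but is not a regular file" % path)
    # O_NOFOLLOW makes open() fail if a symlink appears at `path` between
    # the lstat() above and this call.
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_NOFOLLOW
    return os.open(path, flags, 0o600)


# Why a plain bitmask is not enough: S_IFLNK (0o120000) shares its high bit
# with S_IFREG (0o100000), so `mode & S_IFREG` is non-zero for a symlink.
symlink_mode = stat.S_IFLNK | 0o777
bitmask_says_regular = bool(symlink_mode & stat.S_IFREG)
s_isreg_says_regular = stat.S_ISREG(symlink_mode)
```

Here `bitmask_says_regular` is true while `s_isreg_says_regular` is false, which is the distinction the commit message draws between the bitmask test and S_ISREG().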
chg: usr: Document that named-checkzone must not run on untrusted input
The zone-file parser implements $INCLUDE by opening whatever local
path the zone text names, and fragments of the included file leak
through parser error messages. There is no safe way to validate
untrusted zone text with named-checkzone or named-compilezone, so
the manual pages for both tools now warn against doing so.
Merge branch 'ondrej/named-checkzone-include-path-traversal' into 'main'
Drop unused DNS_MASTER_NOINCLUDE and warn about untrusted zone text
DNS_MASTER_NOINCLUDE was defined to suppress $INCLUDE processing, but
no caller ever set it, so the guarded code path was dead and the flag
gave the false impression that named-checkzone could be hardened
against untrusted input. The zone-file parser cannot safely read text
from a less-trusted source than the user running the tool: $INCLUDE
opens any local file readable by that user, and fragments of its
contents leak through tokenizer error messages.
Rather than wire up an opt-in flag that suggests this is a supported
mode, remove the dead flag and the dead guard, and document in the
named-checkzone and named-compilezone manual pages that these tools
must not be run on zone text from an untrusted source.
Colin Vidal [Wed, 29 Apr 2026 09:29:36 +0000 (11:29 +0200)]
fix: usr: Glues from different parent are rejected
The changes that made BIND 9 parent-centric (!11621) introduced an issue: when processing a referral, it was possible to use glue for a nameserver whose parent differs from that of the zonecut. For instance:
ADDITIONAL
ns.bar. A 1.2.3.4
ns.foo.example. A 5.6.7.8
ns.test.example. A 9.8.7.6
In this situation, only the glue for `ns.foo.example.` and `ns.test.example.` should be used. The glue for `ns.bar.` must be ignored because it is neither a sub-domain nor a sibling domain: its parent is different (`bar.` instead of `example.`). This is now fixed.
Sibling glue and cyclic sibling glue are defined in RFC 9471, sections 2.2 and 2.3.
Merge branch 'colin/cyclic-glues-test' into 'main'
```
ADDITIONAL
ns.bar. A 1.2.3.4
ns.foo.example. A 5.6.7.8
ns.test.example. A 9.8.7.6
```
In this situation, only the glue for `ns.foo.example.` and
`ns.test.example.` should be used. The glue for `ns.bar.` should be
ignored because it is neither a sub-domain nor a sibling domain: its
parent is different (`bar.` instead of `example.`). This is now fixed.
Sibling glue and cyclic sibling glue are defined in RFC 9471,
sections 2.2 and 2.3.
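
The filtering rule from the example can be sketched as follows. This is a simplified Python model, not the resolver code: it assumes the only criterion is that the glue owner name lies under the zonecut's parent (`example.` here), and the helper names are hypothetical.

```python
def labels(name):
    """Split an absolute DNS name into lowercase labels ('a.b.' -> ['a', 'b'])."""
    return [label.lower() for label in name.rstrip(".").split(".") if label]


def is_subdomain(name, ancestor):
    """True if `name` equals `ancestor` or lies below it (label-wise)."""
    n, a = labels(name), labels(ancestor)
    return len(n) >= len(a) and n[len(n) - len(a):] == a


def usable_glue(additional, zonecut_parent):
    """Keep only glue whose owner name lies under the zonecut's parent."""
    return {owner: addr for owner, addr in additional.items()
            if is_subdomain(owner, zonecut_parent)}


additional = {
    "ns.bar.": "1.2.3.4",
    "ns.foo.example.": "5.6.7.8",
    "ns.test.example.": "9.8.7.6",
}
kept = usable_glue(additional, "example.")
```

For the ADDITIONAL section above, `kept` contains the two `example.` nameservers and omits `ns.bar.`. The comparison is done per label, so a name such as `ns.notexample.` would not be mistaken for a name under `example.`.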
OPENSSL_cleanup() in OpenSSL 4 doesn't free the memory, and that is
not compatible with BIND 9's memory leak detection code. Don't use
custom allocation/deallocation functions for OpenSSL's internal memory
management.
See https://github.com/openssl/openssl/pull/29721
Closes #5808
Merge branch '5808-openssl4-compat-fix' into 'main'
Remove OpenSSL memory tracking support from the ossl3.c module
OPENSSL_cleanup() in OpenSSL 4 doesn't free the memory, and that is
not compatible with BIND 9's memory leak detection code. Don't use
custom allocation/deallocation functions for OpenSSL's internal memory
management in the ossl3.c module.
Aydın Mercan [Fri, 6 Feb 2026 12:31:40 +0000 (15:31 +0300)]
don't set named curves explicitly in pre-3.0 libcrypto
The function `EC_KEY_set_asn1_flag` is deprecated in AWS-LC. Fortunately,
calling it to make sure we use named curve keys is entirely unnecessary.
Details for pre-3.0 libcrypto and significant forks are as follows:
OpenSSL: Named curves were the default between 1.1.0 and 3.6.1 [1],[2]
AWS-LC: Library only supports named curves in the first place [3]
BoringSSL: Likewise with AWS-LC [4]
LibreSSL: `EC_GROUP`s are named by default [5]
Compute qpzone_get_lock(elem->node) into a local variable while the
heap lock is still held, rather than dereferencing the stale elem
pointer after releasing the lock. A concurrent thread running
setsigningtime() (e.g. via IXFR apply on a worker thread) could free
the top-of-heap element between the heap lock release and the
dereference, causing a use-after-free.
Closes #5883
Merge branch '5883-getsigningtime-race-fix' into 'main'
new: doc: Add AI coding assistants guidance to CONTRIBUTING.md
Adapted from the Linux kernel's Documentation/process/coding-assistants.rst
to the BIND 9 context. Adds three subsections under the existing
"Guidelines for Tool-Generated Content" section:
- Licensing and legal requirements (MPL-2.0, SPDX identifiers).
- Signed-off-by and Developer Certificate of Origin: AI agents must
not add Signed-off-by trailers; only the human submitter may
certify the DCO.
- Attribution: the Assisted-by: AGENT_NAME:MODEL_VERSION trailer
for recording AI involvement, with an explicit prohibition on
AI-added Co-Authored-By trailers (Co-Authored-By designates a
human co-author who shares responsibility).
Merge branch 'ondrej/coding-assistants-doc' into 'main'
Add AI coding assistants guidance to CONTRIBUTING.md
fix: dev: Remove unneeded options in dns_zonefetch
In the `dns_zonefetch` mechanism, some option flags for
`dns_resolver_createfetch()` were used for all fetches, but
were actually only needed by the `DNSKEY` refresh fetches.
(Specifically, these options were `DNS_FETCHOPT_UNSHARED`
and `DNS_FETCHOPT_NOCACHED`, which were used along with
`DNS_FETCHOPT_NOVALIDATE` to ensure we get a new copy of
the DNSKEY as it is currently published by the authority,
without prior validation. Those conditions are needed
for RFC 5011 trust anchor maintenance, but not when looking
up parent-`NS` or `DSYNC` RRsets.)
In the dns_zonefetch mechanism, some option flags for
dns_resolver_createfetch() were used for all fetches, but
were actually only needed by the DNSKEY refresh fetches.
(Specifically, these options were DNS_FETCHOPT_UNSHARED
and DNS_FETCHOPT_NOCACHED, which were used along with
DNS_FETCHOPT_NOVALIDATE to ensure we get a new copy of
the DNSKEY as it is currently published by the authority,
without prior validation. Those conditions are needed
for RFC 5011 trust anchor maintenance, but not when looking
up parent-NS or DSYNC RRsets.)
new: dev: Add DTRACE probes to the delegation cache
The new delegation cache, which stores NS-based and DELEG-based delegations per view, is now instrumented
with static user-space tracing probes so that cache hit rate, insertion and lookup latency, eviction pressure
under memory limits, and removals triggered by rndc flush-delegation can be observed on a running named.
Merge branch 'ondrej/delegdb-dtrace-probes' into 'main'
Introduces a top-level dtrace/ directory for user-contributed trace
scripts that consume the USDT probes exported by libdns, libns, and
libisc. Ships with delegdb-trace.stp, which streams every insertion,
eviction, and rndc flush-delegation removal in the delegation cache,
and a README pointing at the provider files and explaining how to list
and run the probes on Linux (SystemTap) and on FreeBSD/macOS (DTrace).
Instrument the delegation cache (introduced to back both NS-based and
DELEG-based delegations) with 11 USDT probes in the libdns provider so
that hit rate, eviction pressure, and lookup latency can be measured
without recompiling or enabling logging.
The probes are:
- delegdb_lookup_start / delegdb_lookup_done wrap dns_delegdb_lookup()
and pass the query name plus the result code.
- delegdb_insert_start / delegdb_insert_done wrap dns_delegset_insert().
The early SHUTTINGDOWN return is funneled through the cleanup label
so the done probe fires on every path.
- delegdb_cleanup_start / delegdb_cleanup_done bracket the SIEVE-based
eviction triggered when the cache goes overmem, reporting the number
of bytes requested and actually reclaimed. An additional per-node
delegdb_evict probe (guarded by _ENABLED() because it fires inside
the loop) exposes which zones are being evicted.
- delegdb_create, delegdb_reuse, and delegdb_shutdown trace the per-view
lifecycle across server reloads.
- delegdb_delete traces rndc flush-delegation paths, reporting whether
a subtree or single name was removed.
Name arguments are stringified with dns_name_format() behind
LIBDNS_*_ENABLED() guards so that the hot lookup and insert paths remain
zero-cost when no consumer is attached.
fix: dev: Fix inverted gethostname() check in rndc status
The replacement of named_os_gethostname() with raw gethostname()
inverted the success check: the "localhost" fallback runs on success,
and on failure the uninitialized hostname buffer is read by snprintf(),
leaking stack memory via the rndc status reply.
Closes #5889
Merge branch '5889-fix-gethostname-inverted-check' into 'main'
When named_os_gethostname() was replaced with raw gethostname(), the
success/failure polarity was flipped: the fallback to "localhost" now
runs on success and the hostname buffer is left uninitialized on
failure. In the failure path, snprintf() then reads the uninitialized
stack buffer, disclosing stack contents via the rndc status reply.
Replace the less obvious and less explicit `struct.unpack()` and
`struct.pack()` calls with calls to `int.from_bytes()` and
`int.to_bytes()`, respectively.
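
A small side-by-side illustration of the replacement, assuming big-endian (network order) 16-bit fields as is usual for DNS wire data. The sample bytes and variable names are made up for this sketch.

```python
import struct

wire = bytes.fromhex("0001002a")  # two big-endian 16-bit fields: 1 and 42

# Before: format strings the reader has to decode ("!H" = network-order u16).
(old_first,) = struct.unpack("!H", wire[:2])
old_packed = struct.pack("!H", 42)

# After: the width comes from the slice or argument, and the byte order
# is spelled out in words.
new_first = int.from_bytes(wire[:2], "big")
new_packed = (42).to_bytes(2, "big")
```

Both forms produce the same values; the `int` methods avoid the tuple unpacking and the format-string mini-language.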
Štěpán Balážik [Mon, 19 Jan 2026 20:49:36 +0000 (21:49 +0100)]
Fix FallbackTooManyRecordsAxfrHandler to follow convention
All the other subclasses of AxfrHandler send three messages.
This oversight was inherited from the original Perl implementation of
the server and was not fixed in 46ecbbe, where it was rewritten.
This change allows refactoring and sharing of the superclass.
fix: usr: Fix named crash when processing SIG records in dynamic updates
Previously, :iscman:`named` could abort if a client sent a dynamic update containing a SIG record (the legacy signature type) to a zone configured with an update-policy. The function `dns_db_findrdataset` had an incorrect precondition that prevented SIG records from being looked up. Because that lookup is part of processing an UPDATE request, the abort could be triggered remotely by any client permitted to send updates. This has been fixed by ensuring that SIG records are handled consistently with RRSIG records during update processing.
Make sure the nameserver correctly handles SIG records in the
prerequisites of the dynamic update. The first check is to ensure that
the prerequisites are not examined prior to checking the credentials.
The second test case checks that the SIG present prerequisite is
examined and that the update is therefore refused. This should also not
trigger an assertion failure in dns__db_findrdataset() (previously the
REQUIRE() only accepted dns_rdatatype_rrsig when the covers parameter
was set).
Add AXFR regression test for SIG covers preservation
diff.c rdata_covers() runs on both dns_diff_apply (IXFR, ns/update.c
dynamic updates) and dns_diff_load (AXFR). After the previous commit
refused SIG and NXT in dynamic updates, the AXFR path remains the
most natural way to drive legacy SIG records into a secondary's zone
DB and regression-gate the rdata_covers() fix.
The test adds ans11 as an AsyncDnsServer primary for a small zone
whose AXFR carries two SIG rdatas at the same owner with different
covered types (A, MX) and different TTLs (600, 1200), and declares
ns6 a secondary of that zone. With the bug present, dns_diff_load
groups both tuples at typepair (SIG, 0) and the MX-covering record
inherits the first-seen TTL (600); the fix keeps them at (SIG, A)
and (SIG, MX) with their original TTLs.
rndc dumpdb -zones on the secondary is used to inspect stored state
directly, because the wire-level SIG query response merges
same-(owner,type,class) RRs and masks the per-rdataset TTLs.
SIG (24) and NXT (30) are obsolete DNSSEC record types, superseded by
RRSIG and NSEC in RFC 3755. Allowing them through dynamic update
exposes two distinct bugs that the surrounding GL#5818 work already
fixes as defense-in-depth:
- dns__db_findrdataset() used to REQUIRE that (covers == 0 ||
type == RRSIG), which aborts named when a SIG update reaches the
prescan foreach_rr() call. Fixed to accept dns_rdatatype_issig().
- diff.c rdata_covers() used to test only RRSIG, dropping the
covered-type field for SIG rdatas; the zone DB then filed every
SIG rdataset under typepair (SIG, 0) instead of
(SIG, covered_type) and follow-up adds collided at that bucket.
Fixed to use dns_rdatatype_issig().
Both underlying bugs are still reachable via inbound zone transfer
(diff.c rdata_covers() runs from both dns_diff_apply on the IXFR path
and dns_diff_load on the AXFR path), so the type-helper fixes above
remain necessary. For the dynamic-update path, the simplest and
safest posture is to refuse SIG and NXT outright at the front door in
ns/update.c, alongside the existing NSEC/NSEC3/non-apex-RRSIG
refusals. KEY remains permitted because it is still used to carry
public keys for SIG(0) transaction authentication.
The existing tcp-self SIG regression test is repointed to assert
REFUSED on the SIG add, a symmetric NXT test is added, and the
SIG-via-dyn-update covers-bucket test is removed because it is no
longer reachable through this entry point; AXFR-based coverage of
diff.c rdata_covers() follows in a separate commit.
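
The front-door policy can be summarised with a short Python sketch. The numeric values are the IANA RR type codes; the function name `update_type_refused` is hypothetical, and treating the NSEC and NSEC3 refusals as unconditional is an assumption of this sketch rather than a statement about ns/update.c.

```python
# IANA RR type codes (the SIG and NXT values are also given in the text above).
SIG, KEY, NXT, RRSIG, NSEC, NSEC3 = 24, 25, 30, 46, 47, 50

# Types this sketch refuses regardless of where the record sits in the zone.
ALWAYS_REFUSED = frozenset({SIG, NXT, NSEC, NSEC3})


def update_type_refused(rrtype, at_apex):
    """Return True if a dynamic update carrying this type should be REFUSED."""
    if rrtype in ALWAYS_REFUSED:
        return True
    # RRSIG is refused only away from the zone apex.
    return rrtype == RRSIG and not at_apex
```

KEY is deliberately absent from the refused set, matching the note that it still carries public keys for SIG(0).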
Add regression test for SIG covers being dropped in dns_diff_apply
rdata_covers() in lib/dns/diff.c tests `type == dns_rdatatype_rrsig`
instead of dns_rdatatype_issig(), so for a legacy SIG (24) rdata it
returns 0 and the covered type is discarded on the dynamic-update /
IXFR path. The zone DB then files every SIG rdataset under typepair
(SIG, 0) instead of (SIG, covered_type), and a follow-up add with a
different covers field and a different TTL collides at that bucket,
trips DNS_DBADD_EXACTTTL in qpzone, returns DNS_R_NOTEXACT, and comes
back to the client as SERVFAIL.
The new test adds a PTR to establish the node (tcp-self requires the
client IP's reverse form to equal the owner), then two SIG updates
with different covers and different TTLs; on a buggy build the second
update is SERVFAIL and named logs `dns_diff_apply: .../SIG/IN: add
not exact`. The test is expected to pass once rdata_covers() is
switched to dns_rdatatype_issig(), matching the fix already adopted
for dns__db_findrdataset() on this branch and the helper pattern used
in master.c, xfrout.c, and qpcache.c.
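
The bucket collision can be modelled with a dictionary keyed on the typepair. This Python sketch is a toy, not the qpzone code: `add_exact_ttl` imitates an add that insists on an exact TTL match, `NotExact` stands in for DNS_R_NOTEXACT, and `covers_fixed` assumes that dns_rdatatype_issig() is true for both SIG and RRSIG.

```python
SIG, RRSIG = 24, 46  # IANA RR type codes
A, MX = 1, 15


class NotExact(Exception):
    """Stand-in for DNS_R_NOTEXACT."""


def covers_buggy(rrtype, covered):
    """Old behaviour: only RRSIG keeps its covered type."""
    return covered if rrtype == RRSIG else 0


def covers_fixed(rrtype, covered):
    """New behaviour: SIG and RRSIG both keep their covered type."""
    return covered if rrtype in (SIG, RRSIG) else 0


def add_exact_ttl(db, rrtype, covered, ttl, covers_fn):
    """Add an rdata to a toy zone DB keyed on (type, covers); refuse an add
    whose TTL differs from the TTL already stored in that bucket."""
    key = (rrtype, covers_fn(rrtype, covered))
    if key in db and db[key] != ttl:
        raise NotExact(key)
    db[key] = ttl
    return key
```

With `covers_buggy`, a SIG covering A at TTL 600 followed by a SIG covering MX at TTL 1200 both land in `(SIG, 0)` and the second add raises `NotExact`. With `covers_fixed` they occupy `(SIG, A)` and `(SIG, MX)` and keep their own TTLs.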