fix: usr: Prevent excessive priming queries to the root servers
BIND was sending a priming query to the root servers on nearly every
recursive lookup instead of only when the cached root information
expired. Priming now rearms only after the TTL of the fetched records
elapses, and the refreshed root NS set is used for query routing until
the next cycle.
Merge branch 'ondrej/fix-delegdb-priming' into 'main'
Rename view->hints to view->rootdb and rearm priming
With the parent-centric resolver, dns_view_bestzonecut() consults the
delegation DB (view->deleg) rather than the main cache for the closest
zonecut. Root is never the target of a referral, so it never lands in
delegdb; bestzonecut therefore falls through to the hints lookup on
every query whose closest ancestor is root. prime_done() only called
dns_root_checkhints(), which logs discrepancies but does not update
any store bestzonecut looks at, so the fresh root NS records obtained
by priming were never used and priming kept re-firing.
Rename view->hints to view->rootdb and refresh it when a priming
fetch completes: the '.' NS rdataset is replaced with the fetched
one, and for each listed nameserver the matching A/AAAA glue is
copied from the response's ADDITIONAL section. Only glue for names
that actually appear as NS targets is accepted, so a hostile response
cannot inject unrelated records. Glue the response did not carry is
left untouched, so the hints-file records loaded at startup remain as
a fallback.
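
The glue filtering described above can be sketched as follows. This is a minimal illustration, not the BIND implementation: the `Record` type, the dictionary layout of `rootdb`, and the function name `refresh_rootdb` are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    name: str   # owner name, lower case, fully qualified (hypothetical model)
    rtype: str  # "NS", "A" or "AAAA"
    rdata: str  # NS target or address in text form


def refresh_rootdb(rootdb, answer, additional):
    """Return a copy of rootdb updated from a priming response.

    rootdb maps (name, rtype) -> set of rdata strings (assumed layout).
    """
    ns_targets = {r.rdata.lower() for r in answer
                  if r.name == "." and r.rtype == "NS"}
    if not ns_targets:
        return dict(rootdb)  # nothing usable, keep what we have

    updated = dict(rootdb)
    updated[(".", "NS")] = set(ns_targets)

    # Collect address records, ignoring anything that is not owned by
    # one of the nameservers listed in the fetched NS set.
    fresh = {}
    for r in additional:
        owner = r.name.lower()
        if r.rtype in ("A", "AAAA") and owner in ns_targets:
            fresh.setdefault((owner, r.rtype), set()).add(r.rdata)

    # Only glue carried by the response is replaced; other entries
    # (for example hints-file addresses) are left untouched.
    updated.update(fresh)
    return updated
```

With this shape, an address record for an unrelated owner name in the ADDITIONAL section is simply never copied, and a nameserver whose glue was missing keeps its previous addresses.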
Each view gets its own rootdb: the previous shared
named_g_server->in_roothints is gone, and configure_view() calls
dns_rootns_create() per view when the class-IN defaults are needed.
That keeps the priming writer one-per-DB, so concurrent priming in
different views cannot race on the same zone-DB version.
The rootdb refresh runs synchronously from the resolver response path,
so records go straight from the wire into rootdb with no cache round
trip and no dependency on DNSSEC validation state. A new
DNS_FETCHOPT_PRIMING option marks the priming fetch; prime_done()
itself is now pure cleanup.
Track the rootdb freshness window in view->rootdb_expires and trigger
re-priming lazily from dns_view_find() and bestzonecut_rootdb() only
when the window has elapsed. Stale records are still served while the
fresh priming fetch is in flight.
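
A minimal sketch of the lazy rearm logic, assuming a single expiry timestamp and a flag for the in-flight fetch. The class and method names are hypothetical, and the `ttl` argument stands in for whatever TTL the fetched records carry.

```python
class RootDB:
    """Toy model of a freshness window with lazy re-priming."""

    def __init__(self, now):
        self.expires = now       # window already elapsed: prime on first use
        self.priming = False     # True while a priming fetch is in flight
        self.primes_started = 0  # counter standing in for real fetches

    def lookup(self, now):
        """Called whenever a lookup falls through to the root data."""
        if now >= self.expires and not self.priming:
            self.priming = True
            self.primes_started += 1
        # Records are served either way, stale or fresh.

    def prime_finished(self, now, ttl):
        """Called when the priming fetch completes."""
        self.expires = now + ttl
        self.priming = False
```

The point of the `priming` flag is that any number of lookups arriving while the fetch is outstanding start no additional fetches.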
Drop dns_root_checkhints() and its helpers; the rootdb is now the
authoritative source the resolver consults.
The :iscman:`named` process could terminate unexpectedly when
processing a catalog member zone containing special characters
like '%' or '$' which could be interpreted as zone filename tokens
and trigger a case-sensitivity bug in the token-parsing code. This
has been fixed.
Closes #5849
Merge branch '5849-catz-filename-and-token-parsing-fix' into 'main'
Treat '%' and '$' as special characters for catalog member zone names
The filenames of catalog member zones are generated dynamically
based on the zone's name. If the zone's name is too long or if it
contains special characters, the name's digest is used instead.
Since '%' and '$' are now treated as special characters in the zone
names (see !10779), add these characters to the list of the special
characters.
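
The fallback to a digest can be sketched like this. The character set, the length limit, the `.db` suffix, and the choice of SHA-256 are assumptions for illustration only; the commit does not state which values or digest the catalog zone code uses.

```python
import hashlib

# Hypothetical values, chosen only for the example.
SPECIAL_CHARS = set("/\\:%$")
MAX_NAME_LEN = 64


def member_zone_filename(zone_name, suffix=".db"):
    """Derive a filename for a catalog member zone."""
    if len(zone_name) > MAX_NAME_LEN or SPECIAL_CHARS & set(zone_name):
        # Unsafe or overly long names are replaced by a digest so that
        # no character of the zone name reaches the filename parser.
        stem = hashlib.sha256(zone_name.encode("utf-8")).hexdigest()
    else:
        stem = zone_name
    return stem + suffix
```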
Add a test to check zone filename case-insensitivity
The test adds a catalog member zone which has '%X' in its name and
it ends up in the zone filename parser's code because the filename
is currently generated (by the catalog zone code) based on the zone's
name.
Zones which have a name with the '%' special character should be
filtered and their name's digest should be used instead for filename
generation (as is already done for other special characters), and
that fix is coming next.
Fix case-sensitivity bug in zone filename token-parsing
The setfilename() function uses case-insensitive strcasestr() when
matching the possible tokens, but then one of the token parsers
uses case-sensitive INSIST checks which can assert when, for example,
matching '%X' and INSIST only accepts '%x'.
The case-insensitivity is documented, which means it's the parser
that needs to be fixed, not the matcher.
Convert the character to lowercase before checking the token's
validity.
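
The shape of the fix, normalising the character before validating it, can be illustrated with a small expander. The token syntax (`%` followed by one letter) and the `values` mapping are hypothetical and do not reflect the actual token set handled by setfilename().

```python
def expand_tokens(template, values):
    """Expand '%<letter>' tokens, treating the letter case-insensitively.

    values maps lower-case letters to replacement strings (hypothetical).
    """
    out = []
    i = 0
    while i < len(template):
        ch = template[i]
        if ch == "%" and i + 1 < len(template):
            key = template[i + 1].lower()  # normalise before validating
            if key in values:
                out.append(values[key])
                i += 2
                continue
        out.append(ch)
        i += 1
    return "".join(out)
```

Because the matcher and the validator now agree on case, '%X' and '%x' take the same path instead of one of them reaching a check that only accepts the lower-case form.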
fix: usr: Avoid extra round trips for DS lookups when the parent delegation is already cached
DS queries could take two unnecessary extra round trips when the resolver sent them to the child zone instead of the parent. The child responds with NODATA, forcing a recovery path to rediscover the parent delegation even though it was already cached. The resolver now consults its delegation cache before starting DS fetches, sending queries directly to the correct parent nameservers and eliminating the extra latency.
Colin Vidal [Wed, 15 Apr 2026 12:00:06 +0000 (14:00 +0200)]
Add system test for the chase DS fix
Add a system test which ensures that, whenever the DS record can't be
found in the local cache, the resolver first tries to get the parent NS
from the delegation cache and asks those servers for the DS record
directly, rather than running the fallback flow in which the resolver
queries the nameservers of the validating name for the DS record (which
fails, so the resolver removes one label and fetches again, which fails
again, and so on until it reaches the closest zonecut).
The test relies on the fact that when the fallback flow is run, the
`rctx_chaseds()` function is run, adding the "chase DS servers ..." and
"suspending DS lookup to find parent's..." logs.
Replace FIXME with rationale for not cleaning expired delegdb nodes
Expired delegation nodes are naturally replaced when the resolver
fetches fresh data, and any remaining stale nodes are reclaimed by
SIEVE eviction under memory pressure.
delegdb_cleanup() was overwriting the caller-supplied 'requested'
value with (hiwater - lowater), so every overmem cleanup tried to
free the full watermark band regardless of how much memory the new
delegation actually needed. Drop the override so the caller's size
is used: we now walk the SIEVE only until we have reclaimed enough
room for the new node, leaving unrelated entries in place.
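
To make the "walk only until enough is reclaimed" behaviour concrete, here is a simplified list-based SIEVE. It is a sketch under stated assumptions, not the delegdb code: entries are `[key, size, visited]` lists, and the `reclaim()` name and signature are invented for the example.

```python
class Sieve:
    """Simplified SIEVE eviction over a list ordered oldest first."""

    def __init__(self):
        self.entries = []  # each entry is [key, size, visited]
        self.hand = 0      # index of the next eviction candidate

    def insert(self, key, size):
        self.entries.append([key, size, False])

    def touch(self, key):
        for entry in self.entries:
            if entry[0] == key:
                entry[2] = True

    def reclaim(self, requested):
        """Evict entries until at least 'requested' bytes are freed."""
        freed = 0
        evicted = []
        while freed < requested and self.entries:
            if self.hand >= len(self.entries):
                self.hand = 0  # wrap around to the oldest entry
            entry = self.entries[self.hand]
            if entry[2]:
                entry[2] = False  # recently used: give a second chance
                self.hand += 1
            else:
                freed += entry[1]
                evicted.append(entry[0])
                del self.entries[self.hand]  # hand now points at the next
        return evicted
```

Passing the caller's size as `requested` stops the walk as soon as that much has been freed, which is the difference from always asking for the full watermark band.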
Account transient delegsets against the caller's memory context
dns_delegset_fromnsrdataset() used isc_g_mctx for the transient
delegset it builds from a DNS NS rdataset. That hides delegation
data in the global default context instead of accounting it against
the subsystem that owns it: a resolver fctx, a view, or a query
context.
Take an explicit mctx parameter so callers can direct the allocation
to the right place, and update the three call sites:
- lib/dns/view.c:1189 (dns_view_bestzonecut fallback) uses view->mctx
- lib/dns/resolver.c:7071 (resume_dslookup) uses fctx->mctx
- lib/ns/query.c:8672 (query_delegation_recurse) uses the client
manager's mctx
Also tighten delegdb cleanup to run inside the same write transaction
as the insert: delegdb_node_prepare() now returns the size of the new
node, and delegdb_cleanup() takes the caller's open qp so that the
overmem reclamation and the insert share one commit instead of doing
two nested write transactions.
Fix delegation database NOEXACT lookup for top-level names
dns__deleg_lookup() with DNS_DBFIND_NOEXACT is supposed to return
the deepest proper ancestor of the lookup name. It called
getparentnode() to step up from an exact match, but getparentnode()
only iterated while the chain length was >= 2. When the chain
contained a single entry (the exact match itself with no ancestor
stored in the trie), the loop did not execute and left the caller
looking at the exact match. The subsequent isactive() check then
returned success and the function reported the exact match as the
"deepest ancestor", violating NOEXACT semantics.
This was observable as the resolver picking the child-side
delegation for an at-parent type (e.g. a DS query for a TLD), then
sending the query to the child's own nameservers and recovering via
the "chase DS servers" path.
Have getparentnode() set '*node' to NULL when it cannot find an
active proper ancestor, and make dns__deleg_lookup() NULL-check
before returning, matching the canonical NOEXACT implementation in
dns_zt_find(). Update the deleg unit test to expect NOTFOUND for
the top-level-no-parent case.
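
The intended NOEXACT semantics can be modelled with a flat dictionary in place of the trie. The function names and the `db` layout (absolute name mapped to an "active" flag) are assumptions for this sketch.

```python
def ancestors(name):
    """Yield the proper ancestors of an absolute name, deepest first."""
    if name == ".":
        return  # the root has no proper ancestor
    labels = name.rstrip(".").split(".")
    for i in range(1, len(labels)):
        yield ".".join(labels[i:]) + "."
    yield "."


def lookup(db, name, noexact=False):
    """Return the deepest active match for name, or None.

    With noexact=True the name itself is never returned, even if it
    is present and active.
    """
    if not noexact and db.get(name):
        return name
    for parent in ancestors(name):
        if db.get(parent):
            return parent
    return None
```

For a top-level name with no stored ancestor, the NOEXACT lookup yields `None` (the NOTFOUND case) rather than falling back to the exact match.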
When the validator needs a DS RRset and the cache does not have it,
get_dsset() falls back to creating a fresh fetch. Without a hint, the
resolver picks the closest known zone cut for the DS query, and in the
parent-centric resolver that can land on a delegation at the DS owner
name itself (the child side). This can happen when the parent
delegation is expired, or if the zonecut of the parent doesn't match the
labels in the name.
Querying the child for its own DS records yields NODATA from the apex of
the zone, which sends the resolver into the "chase DS servers" recovery
path and costs two extra round trips for a parent delegation we already
had cached in the delegation database.
Look up the parent zone in the delegation database before kicking
off the fetch, and pass any usable delegation to the resolver as a
hint. When the hint is present, the resolver sends the DS query
straight to the parent's nameservers and the chase path is avoided
entirely.
To support this, create_fetch() now takes optional 'domain' and
'delegset' parameters that are forwarded to dns_resolver_createfetch().
All other call sites pass NULL.
rem: nil: Continue removal of license headers from test zones
Copyright license headers were removed from system test zone files in
commit f144db6b686, but this change only applied to files named '*.db',
'*.db.in', etc. There were some zone files called '*.zone' which were
left unchanged; these have been updated now as well.
Continue removal of license headers from test zones
Copyright license headers were removed from system test zone files in
commit f144db6b686, but this change only applied to files named '*.db',
'*.db.in', etc. There were some zone files called '*.zone' which were
left unchanged; these have been updated now as well.
Use virtualenv's Python interpreter when running tests from a venv
Meson bakes the absolute path of the detected Python binary (e.g.
/usr/bin/python3.12) into the PYTHON build variable. When tests are run
from a virtualenv, that stored path might point to the system Python
which lacks the virtualenv's installed packages, causing test failures.
Fix this by checking whether the current process is running inside a
virtualenv (sys.prefix != sys.base_prefix) and, if so, replacing the
stored PYTHON build var with sys.executable — the interpreter that is
already running pytest and has all required dependencies available.
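
A minimal sketch of the check, assuming a helper named `pick_python()` (hypothetical) that receives the path Meson stored. The optional arguments exist only so the logic can be exercised without an actual virtualenv.

```python
import sys


def pick_python(build_python, prefix=None, base_prefix=None, executable=None):
    """Return the interpreter path the tests should use."""
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    executable = sys.executable if executable is None else executable
    if prefix != base_prefix:
        # Inside a virtualenv: prefer the interpreter running pytest.
        return executable
    return build_python
```

Outside a virtualenv the two prefixes are equal, so the stored build-time path is returned unchanged.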
The behaviour on EL8/EL9 (where meson prefers python3.12 over the older
platform default) and on FreeBSD (python3.11) is unchanged, since those
workflows run pytest without an active virtualenv in our CI.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When a zone filename is defined in `named.conf` which will be
written to by the server - i.e., for secondary or dynamically updated
zones - there is a test at configuration time to ensure that the
filename is unique.
This test is run before the zone is actually created, so a zone
configured using a template may not have had its filename expanded
yet. This can cause a configuration to fail because, for example,
multiple zones appear to be using the filename `$name.db`.
This has been fixed by adding a new function `dns_zone_expandzonefile()`
and calling it during the uniqueness check.
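
Why expansion has to happen before the comparison can be shown with a toy check. Only the `$name` token from the example above is handled, matching is case-sensitive for brevity, and both function names are hypothetical.

```python
def expand_zonefile(template, zone_name):
    """Expand the '$name' token in a zone filename template."""
    return template.replace("$name", zone_name)


def find_duplicate_files(zones):
    """Return (first_zone, second_zone, path) for every filename clash.

    zones is an iterable of (zone_name, file_template) pairs.
    """
    seen = {}
    duplicates = []
    for name, template in zones:
        path = expand_zonefile(template, name)
        if path in seen:
            duplicates.append((seen[path], name, path))
        else:
            seen[path] = name
    return duplicates
```

Two zones sharing the template `$name.db` compare as distinct files once expanded, while two zones that really name the same literal file are still reported.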
Mark Andrews [Thu, 2 Apr 2026 04:25:09 +0000 (15:25 +1100)]
Fix a bug with template filename reuse
When a zone filename is defined in named.conf which will be
written to by the server - i.e., secondary or dynamically updated
zones - there is a test at configuration time to ensure that the
filename is unique.
This test is run before the zone is actually created, so a zone
configured using a template may not have had its filename expanded
yet. This can cause a configuration to fail because, for example,
multiple zones appear to be using the filename "$name.db".
This has been fixed by calling dns_zone_expandzonefile() from
isccfg_check_zoneconf(), to expand the names when checking for
uniqueness.
Mark Andrews [Thu, 2 Apr 2026 04:25:09 +0000 (15:25 +1100)]
Make zone filename expansion accessible from outside dns_zone
This adds a new API call dns_zone_expandzonefile(), which will enable
named-checkconf to expand filenames the same way the server does in
dns_zone_setfile().
Mark Andrews [Wed, 15 Apr 2026 01:36:50 +0000 (11:36 +1000)]
fix: usr: Remove unnecessary dns_name_free call
When processing a catalog zone member's primaries definition, if
a TXT record contained an invalid TSIG key name,
dns_name_free was incorrectly called, triggering an assertion.
This has been fixed.
Closes #5858
Merge branch '5858-remove-unnecessary-dns-name-free-call' into 'main'
Mark Andrews [Fri, 10 Apr 2026 03:07:26 +0000 (13:07 +1000)]
Remove unnecessary dns_name_free call
When processing a catalog zone member's primaries definition, if
a TXT record contained an invalid TSIG key name,
dns_name_free was incorrectly called, triggering an assertion.
This has been fixed.
The resolver can and will reuse outgoing TCP connections to the same host, as recommended by RFC 7766. This prevents a whole class of attacks that abuse the fact that establishing a TCP connection is expensive and it is fairly easy to deplete the outgoing TCP ports by putting them into TIME_WAIT state.
The number of pipelined queries per connection is capped at 256 to limit the impact of a connection drop.
Merge branch '3741-reuse-tcp-connections' into 'main'
Include disptype and transport in dispatch hash key
Move disptype and transport into dispatch_hash() and dispatch_match()
so that the match function is the single source of truth for whether
two TCP dispatches are interchangeable. This replaces the post-loop
disptype filter in dispatch_gettcp() and makes the disptype field in
struct dispatch_key actually used.
Ondřej Surý [Sun, 15 Mar 2026 06:52:34 +0000 (07:52 +0100)]
Use sequential per-dispatch message IDs for TCP
TCP dispentries no longer use the global QID hash table at all.
Responses are matched by scanning disp->active, and sequential
per-dispatch IDs (bounded by the pipelining limit) are unique
within a single dispatch by construction. Since TCP delivers
only data we asked for on a specific connection, the per-peer
uniqueness that the global table enforced was never actually
needed for TCP.
DNS_DISPATCHOPT_FIXEDID is plumbed through dns_request_createraw
-> get_dispatch -> dns_dispatch_createtcp so FIXEDID TCP requests
always get a fresh isolated dispatch — the caller-supplied ID
then cannot collide with any other in-flight query either.
Ondřej Surý [Sun, 15 Mar 2026 06:23:33 +0000 (07:23 +0100)]
Limit TCP pipelining per shared dispatch
Cap the number of in-flight queries on a single shared TCP dispatch.
When the limit is reached, the dispatch is removed from the hash
table so subsequent queries get a fresh connection. The existing
dispatch continues serving its queries until they complete.
This bounds the blast radius of a connection drop: at most N queries
fail simultaneously instead of all queries to that server.
The default limit is 256. It can be overridden for testing via
'named -T tcppipelining=N'.
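
The cap can be modelled as a table that stops handing out a dispatch once it is full. This is a sketch only: the class names and the `get()` method are hypothetical, and completion of queries is not modelled.

```python
class Dispatch:
    def __init__(self):
        self.inflight = 0  # queries currently outstanding


class DispatchTable:
    """Toy model of shared TCP dispatches with a pipelining limit."""

    def __init__(self, limit=256):
        self.limit = limit
        self.shared = {}  # peer key -> Dispatch still accepting queries
        self.created = 0

    def get(self, key):
        disp = self.shared.get(key)
        if disp is None:
            disp = Dispatch()
            self.created += 1
            self.shared[key] = disp
        disp.inflight += 1
        if disp.inflight >= self.limit:
            # Full: unlink it so the next query opens a new connection.
            # The dispatch itself keeps serving its in-flight queries.
            del self.shared[key]
        return disp
```

With a limit of 1 every query gets its own dispatch, which corresponds to the one-query-per-connection behaviour used in some system tests.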
Ondřej Surý [Sun, 15 Mar 2026 07:57:26 +0000 (08:57 +0100)]
Disable TCP pipelining in tcp and masterformat system test
Set tcppipelining=1 on recursive servers in the system tests to
restore one-query-per-connection behavior. The tests rely on
specific connection and query counting that breaks with TCP
connection sharing.
Ondřej Surý [Tue, 17 Feb 2026 10:05:33 +0000 (11:05 +0100)]
Implement seamless TCP connection reuse in dns_dispatch
Previously, users of the dns_dispatch API had to first call
dns_dispatch_gettcp() and, if that failed, create a new TCP dispatch with
dns_dispatch_createtcp(). This has been changed and the TCP connection
reuse happens transparently inside dns_dispatch_createtcp(). There are
separate buckets for dns_resolver, dns_request and dns_xfrin units, so
these don't get mixed together.
fix: usr: Fix 'rndc modzone' issue with non-existing zones
The :iscman:`named` process could terminate unexpectedly or become
subject to undefined behavior when issued an :option:`rndc modzone`
operation for a non-existing zone. This has been fixed.
Closes #5848
Merge branch '5848-do_modzone-unlock-bug-fix' into 'main'
The cleanup path always unlocks the 'view->newzone.lock' lock, but
there are 'goto cleanup;' operations even before the lock is locked,
which causes an assertion failure.
Don't use the cleanup path before the lock is locked.
Recently, a broken version of libuv was released breaking BIND on
several platforms. The offending [commit](https://github.com/libuv/libuv/issues/5030) was on the development branch
for months, but we didn't notice.
In nightly pipelines, build the current 'main' (actually 'v1.x') branch
of libuv and run the unit and system tests against it.
Merge branch 'stepan/prelease-testing-for-libuv' into 'main'
Štěpán Balážik [Mon, 9 Mar 2026 16:26:13 +0000 (17:26 +0100)]
Test development version of libuv in CI
Recently, a broken version of libuv was released breaking BIND on
several platforms. The offending commit [1] was on the development
branch for months, but we didn't notice.
In nightly pipelines, build the current 'main' (actually 'v1.x') branch
of libuv and run the unit and system tests against it.
Mark Andrews [Fri, 10 Apr 2026 06:23:27 +0000 (16:23 +1000)]
fix: usr: Fix zone verification of NSEC3 signed zones
Previously, when computing the compressed bitmap during verification of an NSEC3-signed zone, an undersized buffer was used that resulted in an out-of-bounds write if there were too many active windows in the bitmap. This impacted mirror zones which are NSEC3-signed, `dnssec-signzone` and `dnssec-verify`. This has been fixed.
Michał Kępień [Thu, 9 Apr 2026 11:25:14 +0000 (13:25 +0200)]
fix: ci: Purge distros token in a separate CI job
The "publish" job runs on a dedicated, locked-down runner that lacks the
Python modules necessary to execute the manage_distros_token.py script.
Instead of deleting the token within the "publish" job, purge it in a
separate job that automatically runs on the "base" image after the
"publish" job succeeds. Define "rules" for the new job so that the
token is only deleted for security releases, as it should have been
initially.
Merge branch 'michal/purge-distros-token-in-a-separate-ci-job' into 'main'
Michał Kępień [Thu, 9 Apr 2026 11:23:57 +0000 (13:23 +0200)]
Purge distros token in a separate CI job
The "publish" job runs on a dedicated, locked-down runner that lacks the
Python modules necessary to execute the manage_distros_token.py script.
Instead of deleting the token within the "publish" job, purge it in a
separate job that automatically runs on the "base" image after the
"publish" job succeeds. Define "rules" for the new job so that the
token is only deleted for security releases, as it should have been
initially.
Michał Kępień [Thu, 9 Apr 2026 04:02:34 +0000 (06:02 +0200)]
Handle CVE reproducers along with fixes
With AI agents widely available, delaying CVE reproducer publication no
longer provides any benefit, as feeding a patch with a fix to a large
language model can produce a usable exploit. Revise the CVE checklist
to ensure the reproducer and the fix are pushed to the same merge
request (as separate commits) and remove the post-disclosure step for
regression test publishing.
Mark Andrews [Thu, 9 Apr 2026 00:33:41 +0000 (10:33 +1000)]
fix: doc: nsupdate does not handle zero length RDATA well
Nsupdate does not distinguish between a non-existing RDATA field
and an empty RDATA field when determining which action is desired
when the RDATA field is empty. This only affects a few data types,
like APL, which allow an empty RDATA field. Document a workaround
of using the '\# 0' form for entering these specific records. e.g.
# delete the APL RRset
update delete IN APL
# delete the APL record with a zero length rdata
update delete IN APL \# 0
Closes #5835
Merge branch '5835-nsupdate-doc-zero-length-rdata-how-to' into 'main'
Mark Andrews [Tue, 31 Mar 2026 01:26:42 +0000 (12:26 +1100)]
nsupdate does not handle zero length RDATA well
Nsupdate does not distinguish between a non-existing RDATA field
and an empty RDATA field when determining which action is desired
when the RDATA field is empty. This only affects a few data types,
like APL, which allow an empty RDATA field. Document a workaround
of using the '\# 0' form for entering these specific records. e.g.
# delete the APL RRset
update delete IN APL
# delete the APL record with a zero length rdata
update delete IN APL \# 0
chg: usr: Reduce memory footprint by actively returning unused memory to the OS
Previously, :iscman:`named` relied on the default allocator settings for
releasing unused memory back to the operating system, which could result in
unnecessarily high resident memory usage. :iscman:`named` now actively
manages memory page purging. On systems using jemalloc, background cleanup
threads are enabled and the dirty page decay time is reduced from 10 seconds
to 1 second. Additionally, a volume-based decay pass is triggered after
every 16 MiB of freed memory. On glibc-based systems, a similar
volume-based mechanism using malloc_trim() is used instead.
Merge branch 'ondrej/enable-background-cleaning-of-unused-memory' into 'main'
Ondřej Surý [Mon, 30 Mar 2026 06:50:07 +0000 (08:50 +0200)]
Reduce memory footprint by enabling background page purging
Enable jemalloc background threads and reduce dirty page decay time from
10s to 1s so that unused memory is returned to the OS sooner. As an
additional safety net, trigger a decay pass after every 16 MiB of frees
(rate-limited to once per second) to handle bursts that the background
thread might not catch in time. On glibc, fall back to malloc_trim(0)
with the same volume-based trigger.
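
The volume-based trigger with its rate limit can be sketched as a small counter. The class and method names are hypothetical, and the caller is assumed to supply a monotonic timestamp in seconds; the actual purge call (a jemalloc decay pass or malloc_trim(0)) is represented only by the return value.

```python
class DecayTrigger:
    """Decide when a purge pass should run, based on freed volume."""

    THRESHOLD = 16 * 1024 * 1024  # bytes freed before a pass is due
    MIN_INTERVAL = 1.0            # seconds between passes

    def __init__(self):
        self.freed = 0
        self.last_pass = None
        self.passes = 0

    def on_free(self, size, now):
        """Record a free of 'size' bytes; return True if a pass runs."""
        self.freed += size
        if self.freed < self.THRESHOLD:
            return False
        if self.last_pass is not None and now - self.last_pass < self.MIN_INTERVAL:
            return False  # rate-limited: keep the counter and retry later
        self.freed = 0
        self.last_pass = now
        self.passes += 1
        return True
```

In this sketch a rate-limited trigger does not reset the counter, so the next free after the interval has elapsed runs the deferred pass.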
Matthijs Mekking [Thu, 19 Mar 2026 16:58:30 +0000 (17:58 +0100)]
Move three more functions to zoneproperties.c
Move the following functions to the zoneproperties source files, as
they are simple get functions:
- dns_zone_getgluecachestats
- dns_zone_getkeystores
- dns_zone_getrequesttransporttype
Matthijs Mekking [Thu, 19 Mar 2026 16:10:18 +0000 (17:10 +0100)]
Move zonemgr to own source file
In order to make zone.c more readable, we are splitting it up into
separate source files. This moves the zonemgr to its own file
("zonemgr.c").
Since this code accesses the zone structure directly, move the
'struct dns_zonemgr' and its prerequisites to "zone_p.h".
The helper functions 'forward_cancel()', 'zone_xfrdone()',
'zmgr_start_xfrin_ifquota()', and 'zmgr_resume_xfrs()' need to be
internally accessible to both source files.
fix: test: Check exit status of dig and nsupdate in nsupdate system test
Add missing failure checks to six dig and nsupdate invocations in nsupdate system test so that command failures are properly caught instead of silently ignored.
Merge branch 'marka/check-return-codes-in-nsupdate-test' into 'main'
The name 'isdelegation()' was confusing. This function is not checking
whether this message is a delegation, but whether the denial of
existence proofs in this message is a proof of a referral to an
unsigned zone.
The name 'is_unsecure_referral()' is more appropriate.
Revert isdelegation() to return boolean value again
The isdelegation() was changed to return an isc_result_t because the
idea was to have a separate return value DNS_R_NSEC3ITERRANGE to signal
to the caller we could not verify the proof because of too many
iterations in the NSEC3 record, or perhaps ISC_R_UNEXPECTED for a more
generic cause that verification was not done.
But this would make error handling more fragile and all we care about
is whether we can reliably say the NS bit was not set.
If we cannot reliably say so, we have to treat it as an insecure
referral.
Since the answer is either yes or no, we can revert back to returning
a boolean value.
Aydın Mercan [Sun, 5 Apr 2026 10:25:31 +0000 (13:25 +0300)]
chg: dev: embed default sanitizer flags in executables
Replicating CI failures requires the developer to piece together the
sanitizer flags by hand, reducing ergonomics.
Fix this problem by embedding the relevant settings to the executables.
Symbol resolution still needs manual intervention by setting the env
variable `*SAN_SYMBOLIZER_PATH`. However, this doesn't affect any behavior.
The flags are passed through a meson-configured `sanitize.c.in` template file
to toggle which flags are included for the executable. Using the built-in
`__SANITIZE_XXX__` or `__has_feature` for this task is more trouble than it's
worth because only one of the two is available in most GCC/clang versions,
alongside the lack of `__SANITIZE_UNDEFINED__` from GCC.
Meson's own unit test execution sets its own `ASAN_OPTIONS` etc. To prevent it
from overriding the default options, we also pass the same options to unit tests
environment variables.
A new script `ci/sanitizer-default-check.py` is used in CI to detect if
a build directory with sanitizers enabled has a meson `executable` definition
that doesn't include the sanitizer flag source file.
Closes #5469
Merge branch '5469-embed-default-sanitizer-flags-in-the-executable' into 'main'
Aydın Mercan [Mon, 1 Sep 2025 07:33:22 +0000 (10:33 +0300)]
embed default sanitizer flags in executables
Replicating CI failures requires the developer to piece together the
sanitizer flags by hand, reducing ergonomics.
Fix this problem by embedding the relevant settings to the executables.
Symbol resolution still needs manual intervention by setting the env
variable `*SAN_SYMBOLIZER_PATH`. However, this doesn't affect any behavior.
Replace the hand-rolled threaded socket server with the standard
AsyncDnsServer framework used by other ans.py servers in the test suite.
The DNS wire-format message builders (IXFR diff, AXFR, SOA, SERVFAIL)
are retained unchanged since they produce carefully crafted messages
needed to trigger the IXFR->AXFR race condition. The server
infrastructure is replaced:
- Manual TCP/UDP socket management and threading replaced by
AsyncDnsServer, which handles both protocols, pidfile lifecycle,
and signal handling.
- Query parsing replaced by the framework's dns.message-based parser;
query dispatch moved into IxfrRaceHandler.get_responses().
- The axfr_done_event threading.Event replaced by a boolean instance
variable on IxfrRaceHandler, safe within the single asyncio event
loop.
- For IXFR over TCP, the handler yields two BytesResponseSend actions
(msg1 then msg2) so the framework sends both with TCP length prefixes,
preserving the race-triggering sequence.
- For IXFR over UDP, the TC flag is set on the response to force TCP
retry.
- Unused encode_name_compressed() and parse_dns_query() removed.
Also fix a timing issue that might result in the initial transfer not
being done by the time the test is executed -- since ns11 is started
after ns6. Ensure the initial transfer has happened before running the
ixfr_race test.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Aram Sargsyan [Wed, 4 Mar 2026 16:25:33 +0000 (16:25 +0000)]
Fix a race condition in xfrin_recv_done() when calling xfrin_reset()
When the xfrin_recv_done() function decides to retry the transfer
using AXFR because of a previous error, it calls the xfrin_reset()
function which calls dns_db_closeversion() on 'xfr->ver'. The problem
is that the IXFR processing of a previous message could still be
in progress in a worker thread, which can then use the freed 'xfr->ver'.
If a worker thread is still running, delay the AXFR retry until after
it has finished its work.
Aram Sargsyan [Thu, 5 Mar 2026 11:15:38 +0000 (11:15 +0000)]
Add a test to check for IXFR->AXFR race-condition
The test initiates a zone transfer with IXFR, which produces
a large number of differences and then generates an error. The
secondary should be able to gracefully shut down the ongoing
IXFR transfer and retry with AXFR without race conditions
between them.
This test checks for an issue (GL#5767) but since a race
condition is usually time-sensitive it might require several
attempts before it reproduces the issue.
fix: usr: Fix wrong NSEC proof for empty non-terminals after IXFR
When a secondary received an IXFR that transitioned a zone from unsigned to NSEC-signed, queries for empty non-terminal names returned the zone apex NSEC record instead of the NSEC that actually covers the queried name. The issue only occurred with incremental transfers; a full AXFR or a server restart resolved it.
Fix wrong NSEC proof for empty non-terminals after IXFR
When receiving NSEC records via IXFR, the node was not marked with
havensec because the condition checked the uninitialized output
rdataset type instead of the input rdataset type. This caused
queries for empty non-terminal names in NSEC-signed zones received
via IXFR to return the zone apex NSEC instead of the correct
covering NSEC record.
Add regression test for NSEC proof after unsigned-to-signed IXFR
Test that a secondary receiving an IXFR transitioning a zone from
unsigned to NSEC-signed returns the correct covering NSEC record
for empty non-terminal names.
Add isctest.query.wait_for_serial() shared helper for waiting until
a server has a specific SOA serial.
chg: dev: Change NSEC3 and NSEC3PARAM rdata struct fields to use isc_region_t
Replace the separate pointer+length field pairs in the NSEC3 and NSEC3PARAM rdata structures (salt/salt_length, next/next_length, typebits/len) with isc_region_t, making the fields self-describing and eliminating a class of length-mismatch bugs.
Merge branch 'ondrej/change-nsec3-and-nsec3param-to-use-isc_region_t' into 'main'
Ondřej Surý [Tue, 24 Feb 2026 12:30:56 +0000 (13:30 +0100)]
Change NSEC3 and NSEC3PARAM struct fields to use isc_region_t
Replace the separate pointer+length field pairs in dns_rdata_nsec3_t
(salt/salt_length, next/next_length, typebits/len) and
dns_rdata_nsec3param_t (salt/salt_length) with isc_region_t. This
makes the structs self-describing and eliminates a class of
length-mismatch bugs.
The dns_zone_setnsec3param() signature is updated to take
isc_region_t *salt instead of separate saltlen and salt arguments.
Function signatures for dns_nsec3_addnsec3, dns_db_getnsec3parameters,
and related internal functions still use separate pointer+length pairs
and should be updated in a follow-up.
The system test was subject to the same off-by-one bug that
existed in the code. That is: if the inception time of the signature
is exactly equal to the inactive time of the key, we still have to
expect the signature.
This specific test case triggered a bug where the SKR included bundles
with unsigned DNSKEY RRsets (signatures were omitted because the
inception time was equal to the inactive time of the key).
If the inception time of the signature is exactly equal to the
inactive time of the key, still include the signature. Otherwise there
may be corner cases where signatures are omitted erroneously.
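
The boundary condition comes down to an inclusive comparison. A minimal sketch, assuming times are plain integers and that a key without an inactive time never stops signing; the function name is hypothetical.

```python
def signature_expected(inception, inactive=None):
    """Return True if a key should still produce this signature.

    The comparison is inclusive: a signature whose inception time
    equals the key's inactive time is still expected.
    """
    return inactive is None or inception <= inactive
```

Using a strict `<` here would be the off-by-one: the bundle whose inception time lands exactly on the inactive time would be left without a signature.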
Matthijs Mekking [Thu, 19 Mar 2026 15:23:59 +0000 (16:23 +0100)]
Return void in functions that cannot fail
dns_zone_getloadtime(), dns_zone_getexpiretime(),
dns_zone_getrefreshtime(), and dns_zone_getrefreshkeytime()
cannot fail, so return void instead of ISC_R_SUCCESS.
Matthijs Mekking [Thu, 19 Mar 2026 14:59:59 +0000 (15:59 +0100)]
Lock zone when checking for inline raw/secure
The caller is supposed to hold the zone lock for 'inline_raw()' and
'inline_secure()', but when adding 'REQUIRE(LOCKED_ZONE(zone));' to
these functions it turned out to be not always the case.
Matthijs Mekking [Thu, 19 Mar 2026 13:56:43 +0000 (14:56 +0100)]
Move zone set/get properties to own source file
In order to make zone.c more readable, we are splitting it up into
separate source files. This moves the set and get functions to their
own file ("zoneproperties.c").
Since this code accesses the zone structure directly, move the
'struct dns_zone' and its prerequisites to "zone_p.h".
The helper functions 'inline_raw()', 'inline_secure()',
'dns_zone_setview_helper()', 'zone_settimer()', 'set_resigntime()', and
'zone_freedbargs()' need to be internally accessible to both source
files.
fix: usr: Fix rndc modzone behavior for a zone in named.conf
If a zone was present in the configuration file and not originally added by `rndc addzone`, `rndc modzone` for that zone would succeed once but subsequent `modzone` attempts would fail. This has been fixed.
Closes #5826
Merge branch '5826-fix-subsequenrt-rndc-modzone' into 'main'
In the rare case where a catalog zone member is being modified with
'rndc modzone', also mark the zone as modded, so when the zone is
deleted again with 'rndc delzone', the configuration is also removed
from the NZD.