Recently, a broken version of libuv was released breaking BIND on
several platforms. The offending [commit](https://github.com/libuv/libuv/issues/5030) was on the development branch
for months, but we didn't notice.
In nightly pipelines, build the current 'main' (actually 'v1.x') branch
of libuv and run the unit and system tests against it.
Merge branch 'stepan/prelease-testing-for-libuv' into 'main'
Štěpán Balážik [Mon, 9 Mar 2026 16:26:13 +0000 (17:26 +0100)]
Test development version of libuv in CI
Recently, a broken version of libuv was released breaking BIND on
several platforms. The offending commit [1] was on the development
branch for months, but we didn't notice.
In nightly pipelines, build the current 'main' (actually 'v1.x') branch
of libuv and run the unit and system tests against it.
Mark Andrews [Fri, 10 Apr 2026 06:23:27 +0000 (16:23 +1000)]
fix: usr: Fix zone verification of NSEC3 signed zones
Previously, when computing the compressed bitmap during verification of an NSEC3-signed zone, an undersized buffer was used that resulted in an out-of-bounds write if there were too many active windows in the bitmap. This impacted mirror zones which are NSEC3-signed, `dnssec-signzone` and `dnssec-verify`. This has been fixed.
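To illustrate why the number of active windows drives the buffer size, here is a minimal sketch of the windowed type bitmap encoding from RFC 4034 Section 4.1.2, which NSEC3 reuses. Each active window costs up to 34 octets (window number, bitmap length, and up to 32 bitmap octets), so the worst case over 256 windows is 8704 octets. The function and constant names are illustrative and are not taken from the BIND sources.

```python
def encode_type_bitmap(rrtypes):
    """Encode RR type numbers as RFC 4034 window blocks (illustrative only)."""
    windows = {}
    for rrtype in rrtypes:
        if not 0 <= rrtype <= 0xFFFF:
            raise ValueError("RR type out of range")
        window, offset = divmod(rrtype, 256)
        bitmap = windows.setdefault(window, bytearray(32))
        # Bits are numbered from the most significant bit of each octet.
        bitmap[offset // 8] |= 0x80 >> (offset % 8)
    out = bytearray()
    for window in sorted(windows):
        # Trailing zero octets are omitted; at least one octet always remains.
        bitmap = bytes(windows[window]).rstrip(b"\x00")
        out += bytes([window, len(bitmap)]) + bitmap
    return bytes(out)


# Worst case: one block per window, each carrying a full 32-octet bitmap.
MAX_BITMAP_SIZE = 256 * (2 + 32)
```

A buffer sized for only a few windows overflows as soon as a record sets types in more windows than it was sized for, which matches the kind of failure described above.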
Michał Kępień [Thu, 9 Apr 2026 11:25:14 +0000 (13:25 +0200)]
fix: ci: Purge distros token in a separate CI job
The "publish" job runs on a dedicated, locked-down runner that lacks the
Python modules necessary to execute the manage_distros_token.py script.
Instead of deleting the token within the "publish" job, purge it in a
separate job that automatically runs on the "base" image after the
"publish" job succeeds. Define "rules" for the new job so that the
token is only deleted for security releases, as it should have been
initially.
Merge branch 'michal/purge-distros-token-in-a-separate-ci-job' into 'main'
Michał Kępień [Thu, 9 Apr 2026 11:23:57 +0000 (13:23 +0200)]
Purge distros token in a separate CI job
The "publish" job runs on a dedicated, locked-down runner that lacks the
Python modules necessary to execute the manage_distros_token.py script.
Instead of deleting the token within the "publish" job, purge it in a
separate job that automatically runs on the "base" image after the
"publish" job succeeds. Define "rules" for the new job so that the
token is only deleted for security releases, as it should have been
initially.
Michał Kępień [Thu, 9 Apr 2026 04:02:34 +0000 (06:02 +0200)]
Handle CVE reproducers along with fixes
With AI agents widely available, delaying CVE reproducer publication no
longer provides any benefit, as feeding a patch with a fix to a large
language model can produce a usable exploit. Revise the CVE checklist
to ensure the reproducer and the fix are pushed to the same merge
request (as separate commits) and remove the post-disclosure step for
regression test publishing.
Mark Andrews [Thu, 9 Apr 2026 00:33:41 +0000 (10:33 +1000)]
fix: doc: nsupdate does not handle zero length RDATA well
Nsupdate does not distinguish between a non-existing RDATA field
and an empty RDATA field when determining which action is desired
when the RDATA field is empty. This only affects a few data types,
like APL, which allow an empty RDATA field. Document a workaround
of using the '\# 0' form for entering these specific records, e.g.:
# delete the APL RRset
update delete IN APL
# delete the APL record with a zero length rdata
update delete IN APL \# 0
Closes #5835
Merge branch '5835-nsupdate-doc-zero-length-rdata-how-to' into 'main'
Mark Andrews [Tue, 31 Mar 2026 01:26:42 +0000 (12:26 +1100)]
nsupdate does not handle zero length RDATA well
Nsupdate does not distinguish between a non-existing RDATA field
and an empty RDATA field when determining which action is desired
when the RDATA field is empty. This only affects a few data types,
like APL, which allow an empty RDATA field. Document a workaround
of using the '\# 0' form for entering these specific records, e.g.:
# delete the APL RRset
update delete IN APL
# delete the APL record with a zero length rdata
update delete IN APL \# 0
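The '\# 0' form is the RFC 3597 generic RDATA notation with a length of zero. As a small sketch of how that notation is built (the helper name is hypothetical and is not part of nsupdate or its test suite):

```python
def generic_rdata(rdata: bytes) -> str:
    r"""Render RDATA in the RFC 3597 generic form: \# <length> <hex>."""
    if not rdata:
        # Zero-length RDATA has no hex part at all.
        return r"\# 0"
    return rf"\# {len(rdata)} {rdata.hex()}"
```

An empty byte string yields the two-token form used in the workaround, which makes the empty RDATA explicit instead of leaving the field out.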
chg: usr: Reduce memory footprint by actively returning unused memory to the OS
Previously, :iscman:`named` relied on the default allocator settings for
releasing unused memory back to the operating system, which could result in
unnecessarily high resident memory usage. :iscman:`named` now actively
manages memory page purging. On systems using jemalloc, background cleanup
threads are enabled and the dirty page decay time is reduced from 10 seconds
to 5 seconds. Additionally, a volume-based decay pass is triggered after
every 16 MiB of freed memory. On glibc-based systems, a similar
volume-based mechanism using malloc_trim() is used instead.
Merge branch 'ondrej/enable-background-cleaning-of-unused-memory' into 'main'
Ondřej Surý [Mon, 30 Mar 2026 06:50:07 +0000 (08:50 +0200)]
Reduce memory footprint by enabling background page purging
Enable jemalloc background threads and reduce dirty page decay time from
10s to 1s so that unused memory is returned to the OS sooner. As an
additional safety net, trigger a decay pass after every 16 MiB of frees
(rate-limited to once per second) to handle bursts that the background
thread might not catch in time. On glibc, fall back to malloc_trim(0)
with the same volume-based trigger.
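The volume-based trigger with rate limiting can be sketched as follows. This is only a model of the logic described above, not the code in named: the class name, the callback, and the injectable clock are assumptions made for illustration.

```python
import time

PURGE_THRESHOLD = 16 * 1024 * 1024  # 16 MiB of frees, as stated in the text
MIN_INTERVAL = 1.0                  # at most one purge pass per second


class PurgeTrigger:
    """Call purge() after enough memory was freed, at most once per interval."""

    def __init__(self, purge, clock=time.monotonic):
        self._purge = purge
        self._clock = clock
        self._freed = 0
        self._last = None

    def on_free(self, size):
        self._freed += size
        if self._freed < PURGE_THRESHOLD:
            return False
        now = self._clock()
        if self._last is not None and now - self._last < MIN_INTERVAL:
            # Rate-limited: keep the counter so a later free can retry.
            return False
        self._freed = 0
        self._last = now
        self._purge()
        return True
```

In this model the purge callback stands in for a jemalloc decay pass or malloc_trim(0); keeping the counter while rate-limited means a burst is still handled once the interval has elapsed.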
Matthijs Mekking [Thu, 19 Mar 2026 16:58:30 +0000 (17:58 +0100)]
Move three more functions to zoneproperties.c
Move the following functions to the zoneproperties source files, as
they are simple get functions:
- dns_zone_getgluecachestats
- dns_zone_getkeystores
- dns_zone_getrequesttransporttype
Matthijs Mekking [Thu, 19 Mar 2026 16:10:18 +0000 (17:10 +0100)]
Move zonemgr to own source file
In order to make zone.c more readable, we are splitting it up into
separate source files. This moves the zonemgr to its own file
("zonemgr.c").
Since this code accesses the zone structure directly, move the
'struct dns_zonemgr' and its prerequisites to "zone_p.h".
The helper functions 'forward_cancel()', 'zone_xfrdone()',
'zmgr_start_xfrin_ifquota()', and 'zmgr_resume_xfrs()' need to be
internally accessible to both source files.
fix: test: Check exit status of dig and nsupdate in nsupdate system test
Add missing failure checks to six dig and nsupdate invocations in nsupdate system test so that command failures are properly caught instead of silently ignored.
Merge branch 'marka/check-return-codes-in-nsupdate-test' into 'main'
The name 'isdelegation()' was confusing. This function is not checking
whether this message is a delegation, but whether the denial of
existence proofs in this message is a proof of a referral to an
unsigned zone.
The name 'is_unsecure_referral()' is more appropriate.
Revert isdelegation() to return boolean value again
The isdelegation() was changed to return an isc_result_t because the
idea was to have a separate return value DNS_R_NSEC3ITERRANGE to signal
to the caller we could not verify the proof because of too many
iterations in the NSEC3 record, or perhaps ISC_R_UNEXPECTED for a more
generic cause that verification was not done.
But this would make error handling more fragile and all we care about
is whether we can reliably say the NS bit was not set.
If we cannot reliably say so, we have to treat it as an insecure
referral.
Since the answer is either yes or no, we can revert back to returning
a boolean value.
Aydın Mercan [Sun, 5 Apr 2026 10:25:31 +0000 (13:25 +0300)]
chg: dev: embed default sanitizer flags in executables
Replicating CI failures requires the developer to piece together the
sanitizer flags by hand, reducing ergonomics.
Fix this problem by embedding the relevant settings in the executables.
Symbol resolution still needs manual intervention by setting the env
variable `*SAN_SYMBOLIZER_PATH`. However, this doesn't affect any behavior.
The flags are passed through a meson-configured `sanitize.c.in` template file
to toggle which flags are included for the executable. Using the built-in
`__SANITIZE_XXX__` or `__has_feature` for this task is more trouble than it's
worth because only one of the two is available in most GCC/clang versions,
alongside the lack of `__SANITIZE_UNDEFINED__` from GCC.
Meson's own unit test execution sets its own `ASAN_OPTIONS` etc. To prevent it
from overriding the default options, we also pass the same options to the unit
tests through environment variables.
A new script `ci/sanitizer-default-check.py` is used in CI to detect if
a build directory with sanitizers enabled has a meson `executable` definition
that doesn't include the sanitizer flag source file.
Closes #5469
Merge branch '5469-embed-default-sanitizer-flags-in-the-executable' into 'main'
Aydın Mercan [Mon, 1 Sep 2025 07:33:22 +0000 (10:33 +0300)]
embed default sanitizer flags in executables
Replicating CI failures requires the developer to piece together the
sanitizer flags by hand, reducing ergonomics.
Fix this problem by embedding the relevant settings in the executables.
Symbol resolution still needs manual intervention by setting the env
variable `*SAN_SYMBOLIZER_PATH`. However, this doesn't affect any behavior.
Replace the hand-rolled threaded socket server with the standard
AsyncDnsServer framework used by other ans.py servers in the test suite.
The DNS wire-format message builders (IXFR diff, AXFR, SOA, SERVFAIL)
are retained unchanged since they produce carefully crafted messages
needed to trigger the IXFR->AXFR race condition. The server
infrastructure is replaced:
- Manual TCP/UDP socket management and threading replaced by
AsyncDnsServer, which handles both protocols, pidfile lifecycle,
and signal handling.
- Query parsing replaced by the framework's dns.message-based parser;
query dispatch moved into IxfrRaceHandler.get_responses().
- The axfr_done_event threading.Event replaced by a boolean instance
variable on IxfrRaceHandler, safe within the single asyncio event
loop.
- For IXFR over TCP, the handler yields two BytesResponseSend actions
(msg1 then msg2) so the framework sends both with TCP length prefixes,
preserving the race-triggering sequence.
- For IXFR over UDP, the TC flag is set on the response to force TCP
retry.
- Unused encode_name_compressed() and parse_dns_query() removed.
Also fix a timing issue that might result in the initial transfer not
being done by the time the test is executed -- since ns11 is started
after ns6. Ensure the initial transfer has happened before running the
ixfr_race test.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
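For reference, the two wire-level details mentioned above (the TCP length prefix and the TC flag) can be sketched with the standard library alone. These helpers are illustrative assumptions and are not part of AsyncDnsServer or the ans.py in question; they only show what the framework has to do on the wire per RFC 1035.

```python
import struct

TC_FLAG = 0x0200  # truncation bit in the DNS header flags word


def frame_for_tcp(message: bytes) -> bytes:
    """Prefix a DNS message with its two-octet length (RFC 1035, 4.2.2)."""
    if len(message) > 0xFFFF:
        raise ValueError("DNS message too large for TCP framing")
    return struct.pack("!H", len(message)) + message


def set_tc(message: bytes) -> bytes:
    """Return a copy of a wire-format message with the TC bit set."""
    if len(message) < 12:
        raise ValueError("short DNS header")
    msg_id, flags = struct.unpack("!HH", message[:4])
    return struct.pack("!HH", msg_id, flags | TC_FLAG) + message[4:]
```

Sending two messages back to back over TCP is then just the concatenation of two framed messages, which is the sequence the race test relies on.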
Aram Sargsyan [Wed, 4 Mar 2026 16:25:33 +0000 (16:25 +0000)]
Fix a race condition in xfrin_recv_done() when calling xfrin_reset()
When the xfrin_recv_done() function decides to retry the transfer
using AXFR because of a previous error, it calls the xfrin_reset()
function which calls dns_db_closeversion() on 'xfr->ver'. The problem
is that the IXFR processing of a previous message may still be
in progress in a worker thread, which can then use the freed 'xfr->ver'.
If a worker thread is still running, delay the AXFR retry until after
the worker thread has finished its work.
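The idea of deferring the retry can be modelled with a small state machine. This is a sketch only, assuming a single in-flight worker; the class and method names are hypothetical and do not correspond to the functions in xfrin.c.

```python
class TransferSketch:
    """Model of deferring an AXFR retry while a worker still uses the version."""

    def __init__(self):
        self.worker_busy = False
        self.retry_pending = False
        self.log = []

    def start_worker(self):
        self.worker_busy = True

    def request_axfr_retry(self):
        if self.worker_busy:
            # Resetting now would free state the worker is still using.
            self.retry_pending = True
            self.log.append("deferred")
        else:
            self._reset_and_retry()

    def worker_done(self):
        self.worker_busy = False
        if self.retry_pending:
            self.retry_pending = False
            self._reset_and_retry()

    def _reset_and_retry(self):
        assert not self.worker_busy
        self.log.append("reset+axfr")
```

The reset runs immediately only when no worker is active; otherwise it is replayed from the worker's completion path.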
Aram Sargsyan [Thu, 5 Mar 2026 11:15:38 +0000 (11:15 +0000)]
Add a test to check for IXFR->AXFR race-condition
The test initiates a zone transfer with IXFR, which produces
a large number of differences and then generates an error. The
secondary should be able to gracefully shut down the ongoing
IXFR transfer and retry with AXFR without race conditions
between them.
This test checks for an issue (GL#5767) but since a race
condition is usually time-sensitive it might require several
attempts before it reproduces the issue.
fix: usr: Fix wrong NSEC proof for empty non-terminals after IXFR
When a secondary received an IXFR that transitioned a zone from unsigned to NSEC-signed, queries for empty non-terminal names returned the zone apex NSEC record instead of the NSEC that actually covers the queried name. The issue only occurred with incremental transfers; a full AXFR or a server restart resolved it.
Fix wrong NSEC proof for empty non-terminals after IXFR
When receiving NSEC records via IXFR, the node was not marked with
havensec because the condition checked the uninitialized output
rdataset type instead of the input rdataset type. This caused
queries for empty non-terminal names in NSEC-signed zones received
via IXFR to return the zone apex NSEC instead of the correct
covering NSEC record.
Add regression test for NSEC proof after unsigned-to-signed IXFR
Test that a secondary receiving an IXFR transitioning a zone from
unsigned to NSEC-signed returns the correct covering NSEC record
for empty non-terminal names.
Add isctest.query.wait_for_serial() shared helper for waiting until
a server has a specific SOA serial.
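A polling helper of this kind might look roughly like the sketch below. It is not the actual isctest.query.wait_for_serial() implementation; the name, the parameters, and the injected get_serial callable are assumptions chosen so the example stays self-contained.

```python
import time


def wait_until_serial(get_serial, expected, timeout=10.0, interval=0.5,
                      clock=time.monotonic, sleep=time.sleep):
    """Poll get_serial() until it returns expected or the timeout expires."""
    deadline = clock() + timeout
    while True:
        if get_serial() == expected:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

In a real test, get_serial would send an SOA query to the server under test and return the serial from the answer.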
chg: dev: Change NSEC3 and NSEC3PARAM rdata struct fields to use isc_region_t
Replace the separate pointer+length field pairs in the NSEC3 and NSEC3PARAM rdata structures (salt/salt_length, next/next_length, typebits/len) with isc_region_t, making the fields self-describing and eliminating a class of length-mismatch bugs.
Merge branch 'ondrej/change-nsec3-and-nsec3param-to-use-isc_region_t' into 'main'
Ondřej Surý [Tue, 24 Feb 2026 12:30:56 +0000 (13:30 +0100)]
Change NSEC3 and NSEC3PARAM struct fields to use isc_region_t
Replace the separate pointer+length field pairs in dns_rdata_nsec3_t
(salt/salt_length, next/next_length, typebits/len) and
dns_rdata_nsec3param_t (salt/salt_length) with isc_region_t. This
makes the structs self-describing and eliminates a class of
length-mismatch bugs.
The dns_zone_setnsec3param() signature is updated to take
isc_region_t *salt instead of separate saltlen and salt arguments.
Function signatures for dns_nsec3_addnsec3, dns_db_getnsec3parameters,
and related internal functions still use separate pointer+length pairs
and should be updated in a follow-up.
The system test was also subject to the same off-by-one bug that
existed in the code. That is: if the inception time of the signature
is exactly equal to the inactive time of the key, we still have to
expect the signature.
This specific test case triggered a bug where the SKR included bundles
with unsigned DNSKEY RRsets (signatures were omitted because the
inception time was equal to the inactive time of the key).
If the inception time of the signature is exactly equal to the
inactive time of the key, still include the signature. Otherwise there
may be corner cases where signatures are omitted erroneously.
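The boundary condition comes down to an inclusive comparison. A minimal sketch, with a hypothetical function name and the assumption that a key without an inactive time never retires:

```python
def signature_expected(inception, inactive=None):
    """Return True if a key should still sign at the given inception time."""
    if inactive is None:
        # Assumption for this sketch: no inactive time means no retirement.
        return True
    # Inclusive: a signature starting exactly at the inactive time is kept.
    return inception <= inactive
```

Using a strict comparison instead would drop the signature in the equal case, which is the corner case described above.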
Matthijs Mekking [Thu, 19 Mar 2026 15:23:59 +0000 (16:23 +0100)]
Return void in functions that cannot fail
dns_zone_getloadtime(), dns_zone_getexpiretime(),
dns_zone_getrefreshtime(), and dns_zone_getrefreshkeytime()
cannot fail, so return void instead of ISC_R_SUCCESS.
Matthijs Mekking [Thu, 19 Mar 2026 14:59:59 +0000 (15:59 +0100)]
Lock zone when checking for inline raw/secure
The caller is supposed to hold the zone lock for 'inline_raw()' and
'inline_secure()', but when adding 'REQUIRE(LOCKED_ZONE(zone));' to
these functions, this turned out not always to be the case.
Matthijs Mekking [Thu, 19 Mar 2026 13:56:43 +0000 (14:56 +0100)]
Move zone set/get properties to own source file
In order to make zone.c more readable, we are splitting it up into
separate source files. This moves the set and get functions to their
own file ("zoneproperties.c").
Since this code accesses the zone structure directly, move the
'struct dns_zone' and its prerequisites to "zone_p.h".
The helper functions 'inline_raw()', 'inline_secure()',
'dns_zone_setview_helper()', 'zone_settimer()', 'set_resigntime()', and
'zone_freedbargs()' need to be internally accessible to both source
files.
fix: usr: Fix rndc modzone behavior for a zone in named.conf
If a zone was present in the configuration file and not originally added by `rndc addzone`, `rndc modzone` for that zone would succeed once but subsequent `modzone` attempts would fail. This has been fixed.
Closes #5826
Merge branch '5826-fix-subsequenrt-rndc-modzone' into 'main'
In the rare case where a catalog zone member is being modified with
'rndc modzone', also mark the zone as modded, so when the zone is
deleted again with 'rndc delzone', the configuration is also removed
from the NZD.
Matthijs Mekking [Tue, 31 Mar 2026 12:46:19 +0000 (14:46 +0200)]
Test restart works after rndc modzone
When a zone that is configured in named.conf is modified with
'rndc modzone', the new zone configuration is now also stored in the
NZD. Add a test to ensure that after a restart, the old zone
configuration is used.
Matthijs Mekking [Tue, 24 Mar 2026 16:02:36 +0000 (17:02 +0100)]
Add the modified zone configuration to NZD
When a zone that is configured in named.conf is modified with
'rndc modzone', the zone configuration is deleted from the effective
config. Store the new configuration in the NZD. Mark the zone
as 'modified by rndc modzone'. Otherwise, subsequent calls to
'rndc modzone' would fail because the zone configuration cannot be
found.
JINMEI Tatuya [Tue, 24 Mar 2026 15:57:49 +0000 (16:57 +0100)]
Test rndc modzone succeeds twice for a zone in named.conf
If a zone is in named.conf, not originally added by rndc addzone,
rndc modzone for that zone succeeds once, but subsequent modzone
attempts fail. This is because do_modzone removes the zone config
from global or view options, but it would fail due to 'not found'
once the config is removed.
Colin Vidal [Thu, 2 Apr 2026 07:17:17 +0000 (09:17 +0200)]
fix NULL dereference in dns_view_bestzonecut()
When `dns_view_bestzonecut()` is called with a NULL `delegsetp`, it
calls `bestzonecut_zone()` with a NULL `rdataset` pointer but there is a
non-guarded de-reference of the `rdataset` pointer in
`bestzonecut_zone()`.
In practice, the only current situation where `dns_view_bestzonecut()`
is called with NULL `delegsetp` is from a case of `seek_ds()` _and_ the
non-guarded dereference occurs only if there is a static-stub local
zone matching the zonecut `seek_ds()` is looking for. It's unclear if
such flow is actually possible.
The `rdataset` is now always valid inside `dns_view_bestzonecut()`. (It
was initially set only if `delegsetp` was set, to avoid extra work in
the qpzone that can be skipped when `rdataset` is NULL. This does not
really make a difference, though: the result was not found in this
case, so we are already on a slow path.)
Colin Vidal [Thu, 2 Apr 2026 09:51:25 +0000 (11:51 +0200)]
fix: dev: remove dead code in `query_addbestns()`
The local variable `zfname` was released in the cleanup part of the
function if not NULL, but it turns out it is now always NULL at that
point.
The flow can get to that part only in two cases: either `zfname` is not
NULL, and then its ownership is moved to a different variable (thus, it
is now NULL), or `zfname` is already NULL.
Remove the dead code that releases it.
Merge branch 'colin/fix-getbestns-deadcode' into 'main'
Colin Vidal [Thu, 2 Apr 2026 08:48:41 +0000 (10:48 +0200)]
remove dead code in `query_addbestns()`
The local variable `zfname` was released in the cleanup part of the
function if not NULL, but it turns out it is now always NULL at that
point.
The flow can get to that part only in two cases: either `zfname` is not
NULL, and then its ownership is moved to a different variable (thus, it
is now NULL), or `zfname` is already NULL.
python -m pip install -r https://gitlab.isc.org/isc-projects/bind9/-/raw/main/doc/arm/requirements.txt
ERROR: Ignored the following yanked versions: 8.3.0
ERROR: Ignored the following versions that require a different python version: 9.1.0 Requires-Python >=3.12; 9.1.0rc1 Requires-Python >=3.12; 9.1.0rc2 Requires-Python >=3.12
ERROR: Could not find a version that satisfies the requirement Sphinx==9.1.0 (from versions: 0.1.61611, ..., 9.0.4)
ERROR: No matching distribution found for Sphinx==9.1.0
Merge branch 'mnowak/revert-sphinx-9.1.0' into 'main'
python -m pip install -r https://gitlab.isc.org/isc-projects/bind9/-/raw/main/doc/arm/requirements.txt
ERROR: Ignored the following yanked versions: 8.3.0
ERROR: Ignored the following versions that require a different python version: 9.1.0 Requires-Python >=3.12; 9.1.0rc1 Requires-Python >=3.12; 9.1.0rc2 Requires-Python >=3.12
ERROR: Could not find a version that satisfies the requirement Sphinx==9.1.0 (from versions: 0.1.61611, ..., 9.0.4)
ERROR: No matching distribution found for Sphinx==9.1.0
fix: usr: Use the zone file's basename as origin in DNSSEC tools
In `dnssec-signzone` and `dnssec-verify`, when the zone origin is not specified using the `-o` parameter, the default behavior is to try to sign using the zone's file name as the origin. So, for example, `dnssec-signzone -S example.com` will work, so long as the file name matches the zone name.
This now also works if the zone is in a different directory. For example, `dnssec-signzone -S zones/example.com` will set the origin value to `example.com`.
Evan Hunt [Wed, 10 Dec 2025 00:52:44 +0000 (16:52 -0800)]
use the zone file's basename as origin in dnssec tools
In dnssec-signzone and dnssec-verify, if the zone origin is not
specified using the `-o` parameter, the default behavior is to try
to use the zone's file name as the origin. So, for example,
`dnssec-signzone -S example.com` or 'dnssec-verify example.com'
will work, so long as the file name matches the zone name.
This now also works if the zone is in a different directory.
For example, `dnssec-signzone -S zones/example.com` or
'dnssec-verify zones/example.com' will set the origin value
to `example.com`.
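The default-origin rule can be illustrated in a few lines. This is a sketch of the described behavior, not the tools' C implementation, and the function name is hypothetical:

```python
import os.path


def default_origin(zone_file):
    """Derive the default zone origin from the zone file's basename."""
    return os.path.basename(zone_file)
```

Both `example.com` and `zones/example.com` map to the origin `example.com`; an explicit `-o` still overrides the default.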
Check for existing TSIG keys before accepting a new
GSS-API negotiation and delete the key if it has expired.
Previously, an expired GSS key would permanently block
re-negotiation for that name until the server was restarted.
Merge branch 'ondrej/cleanup-gssapi-and-tkey-api' into 'main'
Ondřej Surý [Fri, 20 Mar 2026 07:43:28 +0000 (08:43 +0100)]
Add regression test for RFC 3645 Section 4.1.1 duplicate TKEY name
Add 'tkeyname' command to nsupdate to allow specifying a fixed TKEY
name instead of the default random one. This is used by the test to
send two GSS-API TKEY negotiations with the same name.
After a successful GSS-API TKEY negotiation via nsupdate -g, a second
attempt with the same TKEY name must be rejected with BADKEY
(error=17), not BADNAME (error=20).
Ondřej Surý [Wed, 18 Mar 2026 00:01:34 +0000 (01:01 +0100)]
Fix GSS context leak on error paths in process_gsstkey()
After gss_accept_sec_context() succeeds, the GSS context is passed
to dst_key_fromgssapi() which transfers ownership to the dst_key.
If a subsequent operation fails (dst_key_fromgssapi itself,
dns_tsigkey_createfromkey, or dns_tsigkeyring_add), the cleanup
label frees the dst_key but only if it was created. If the failure
happened before dst_key_fromgssapi, the GSS context was orphaned.
Delete the GSS context in the cleanup path when it was not
transferred to a dst_key.
Ondřej Surý [Wed, 18 Mar 2026 00:00:39 +0000 (01:00 +0100)]
Fix GSS context leak when principal name is empty
When gss_accept_sec_context() completes successfully but
gss_display_name() returns an empty principal, the GSS context
was leaked — it was neither stored in a key nor deleted.
Delete the context and reject with BADKEY in this case. This
should only occur due to a GSS library bug, since a completed
context should always have a valid principal.
Ondřej Surý [Tue, 17 Mar 2026 23:28:04 +0000 (00:28 +0100)]
Fix off-by-one in TSIG generated key eviction
Use pre-increment (++ring->generated) instead of post-increment
(ring->generated++) so the comparison against DNS_TSIG_MAXGENERATEDKEYS
happens after counting the new key. With post-increment, one extra key
beyond the limit was allowed before eviction kicked in.
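Python has no increment operators, but the difference can still be simulated by choosing whether the counter is bumped before or after the comparison. The sketch below assumes the check has the shape `counter > limit` and that eviction removes the oldest key; the limit of 3 in the usage is arbitrary and is not the value of DNS_TSIG_MAXGENERATEDKEYS.

```python
from collections import deque


def peak_keys(limit, additions, pre_increment):
    """Return the largest number of keys held while adding keys one by one."""
    ring = deque()
    generated = 0
    peak = 0
    for key in range(additions):
        ring.append(key)
        if pre_increment:
            generated += 1             # like ++ring->generated
            over = generated > limit
        else:
            over = generated > limit   # like ring->generated++
            generated += 1
        if over:
            ring.popleft()             # evict the oldest generated key
            generated -= 1
        peak = max(peak, len(ring))
    return peak
```

With a limit of 3, the pre-increment variant never holds more than 3 keys, while the post-increment variant settles at 4, one more than the limit.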
Michal Nowak [Tue, 31 Mar 2026 15:36:38 +0000 (17:36 +0200)]
new: doc: Prepare documentation for BIND 9.21.21
I asked Claude to prepare the "Tweak and reword release notes" (663dba18f3015aefe178ba8b4790c7180f943c74) commit with the following guidance:
> add RST markup to @doc/notes/notes-9.21.21.rst. possible RST markups are to be found in @doc/arm/. if in doubt look at previous release notes in @doc/notes/. while at it, fix grammar and make sure the text is aligned to max 72 characters.
It did better than I would have.
Merge branch 'mnowak/prepare-documentation-for-bind-9.21.21' into 'v9.21.21-release'
Alessio Podda [Tue, 31 Mar 2026 15:17:43 +0000 (15:17 +0000)]
chg: dev: Add a refcount to the vecheaders
This MR changes the way the ownership of the vecheaders is tracked. Before this MR, the ownership of the vecheader was implicitly tracked through a mix of the refcount on the node owning the header, the external refcount of the same node and the version. This has some adverse consequences in terms of contention, such as that querying A and AAAA glue hits the same refcount.
This MR adds a refcount to the vecheader itself, allowing it to exist independently of the node it is contained in. On its own, this would create a cycle, where the node has a reference to the header, which has a reference to the heap, which in turn has a reference to the node.
To break this cycle, this MR also moves from an "intrusive" heap, to a more traditional one where pointers to the node and vecheader in the heap are stored in a hashmap.
Alessio Podda [Thu, 5 Mar 2026 14:24:01 +0000 (15:24 +0100)]
Fix benign race condition
The dns_rdatavec_subtractrdataset function would copy the old header
using memmove but the old header includes fields such as trust and
reference counts that are atomic.
While the values of those fields were never used, it did cause a benign
race condition. This commit refactors dns_rdatavec_subtractrdataset and
dns_rdatavec_merge not to use memmove.