Colin Vidal [Fri, 5 Dec 2025 08:35:58 +0000 (09:35 +0100)]
chg: dev: Shrunk cfgobj down from 72 bytes to 48 bytes
Make all non-scalar properties of `cfg_obj_t` allocated values, which
ensures the union size is the width of one pointer. Also reorder the
fields inside `cfg_obj_t` to avoid alignment padding that would increase
the size. As a result, a `cfg_obj_t` instance is now 48 bytes on a
64-bit platform.
Add a static assertion to avoid increasing the size of the struct by
mistake.
The function `parse_sockaddrsub` was taking advantage of the fact that
both sockaddr and sockaddrtls were in the same position, and used to
initialize the sockaddr field independently if this was a -tls one or
not. This doesn't work anymore now that all fields are allocated,
so it has been slightly rewritten to take both cases into account
separately.
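The size guard described above can be sketched as follows. This is a minimal illustration, not the actual `cfg_obj_t` layout: `demo_obj_t`, its fields, and the 32-byte figure are hypothetical stand-ins chosen only to show the technique (pointer-sized union members, fields ordered to avoid padding, and a static assertion on the result).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical payloads, each far larger than a pointer. */
typedef struct demo_sockaddr {
	unsigned char bytes[128];
} demo_sockaddr_t;

typedef struct demo_sockaddrtls {
	demo_sockaddr_t addr;
	const char *tlsname;
} demo_sockaddrtls_t;

/*
 * Hypothetical node (NOT the real cfg_obj_t). Scalars live in the
 * union directly; everything else is reached through a pointer, so
 * the union is only as wide as its largest scalar or pointer.
 * Fields are ordered from widest to narrowest to avoid padding.
 */
typedef struct demo_obj {
	const void *type;
	union {
		uint64_t uint64;
		bool boolean;
		demo_sockaddr_t *sockaddr;
		demo_sockaddrtls_t *sockaddrtls;
	} value;
	const char *file;
	uint32_t line;
	uint32_t references;
} demo_obj_t;

/* Only checked where pointers are 8 bytes wide. */
static_assert(sizeof(void *) != 8 || sizeof(demo_obj_t) == 32,
	      "demo_obj_t grew; check the union members and field order");

/* Helper for inspecting the union width at run time. */
size_t
demo_value_size(void) {
	return sizeof(((demo_obj_t *)0)->value);
}
```

Embedding `demo_sockaddr_t` directly in the union would make every node at least 128 bytes wide, whichever member is in use, and the static assertion would then stop the build.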
Colin Vidal [Fri, 7 Nov 2025 13:58:36 +0000 (14:58 +0100)]
shrunk cfgobj down to 48 bytes
Make all non-scalar properties of `cfg_obj_t` allocated values, which
ensures the union size is the width of one pointer. Also reorder the
fields inside `cfg_obj_t` to avoid alignment padding that would increase
the size. As a result, a `cfg_obj_t` instance is now 48 bytes on a
64-bit platform.
Add a static assertion to avoid increasing the size of the struct by
mistake.
The function `parse_sockaddrsub` was taking advantage of the fact that
both sockaddr and sockaddrtls were in the same position, and used to
initialize the sockaddr field independently if this was a -tls one or
not. This doesn't work anymore now that all fields are allocated,
so it has been slightly rewritten to take both cases into account
separately.
Colin Vidal [Thu, 4 Dec 2025 15:42:42 +0000 (16:42 +0100)]
chg: dev: Remove memory context from `cfg_obj_t`
Removes the `cfg_obj_t` memory context pointer, as the parser always uses `isc_g_mctx`. This simplifies the parser API/configuration tree API (no need to pass the memory context); and the `cfg_obj_t` size goes down from 80 bytes to 72 bytes.
While not directly related to the changes, also remove the `cfg_parser_t` `references` field as it is not used anymore (since the `cfg_obj_t` type doesn't reference it anymore).
Merge branch 'colin/remove-memctx-cfgobj' into 'main'
Colin Vidal [Thu, 4 Dec 2025 13:02:01 +0000 (14:02 +0100)]
document usage of BIND9 constructors/destructors
Document the way `__attribute__((__constructor__))` and
`__attribute__((__destructor__))` must be used in BIND9 libraries in
order to avoid unexpected behaviors with other third-party libraries.
Colin Vidal [Fri, 7 Nov 2025 17:56:39 +0000 (18:56 +0100)]
remove --memstats from cfg_test
The `--memstats` option of cfg_test is unused and, even if used, does
nothing, because `--memstats` relies on `isc_mem_stats`, which dumps
memory pool statistics, and memory pools are not used at all for
configuration.
Also, dropping the option avoids adding a parser API to get the memory
stats (as the parser now uses the global memory context).
Colin Vidal [Tue, 4 Nov 2025 09:49:40 +0000 (10:49 +0100)]
remove memory context from parser context
As the isccfg library now uses the global memory context, it is now
used directly instead of passing the parser context around to grab its
memory context.
Also remove the memory context from the parser, as well as from
`cfg_obj_t`, as it's now useless.
Colin Vidal [Mon, 3 Nov 2025 16:24:22 +0000 (17:24 +0100)]
parser: add cfg_string_create() API
The parser has a static function `create_string()` used
internally. But there was duplicate code to create a string node
in `namedconf.c`. Instead of implementing the same logic twice,
`create_string()` is now publicly exposed as `cfg_string_create()`.
Evan Hunt [Thu, 4 Dec 2025 03:15:12 +0000 (03:15 +0000)]
fix: dev: Standardize CHECK and RETERR macros
Previously, there were over 40 separate definitions of `CHECK` macros, of
which most used `goto cleanup`, and the rest `goto failure` or `goto out`.
There were another 10 definitions of `RETERR`, of which most were identical
to `CHECK`, but some simply returned a result code instead
of jumping to a cleanup label.
This has now been standardized throughout the code base: `RETERR` is for
returning an error code in the case of an error, and `CHECK` is for jumping
to a cleanup tag, which is now always called `cleanup`. Both macros are
defined in `isc/util.h`.
Evan Hunt [Thu, 16 Oct 2025 22:05:01 +0000 (15:05 -0700)]
use a standard CLEANUP macro
CLEANUP is a macro similar to CHECK but unconditional, jumping
to cleanup even if the result is ISC_R_SUCCESS. It is now used
in place of DST_RET, CLEANUP_WITH, and CHECK(<non-success constant>).
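The division of labour between the three macros can be sketched like this. The `DEMO_*` macros and the `demo_*` functions below are hypothetical and only mirror the behaviour described in the commit messages; the real definitions live in `isc/util.h` and may differ in detail.

```c
#include <assert.h>

typedef enum {
	DEMO_R_SUCCESS = 0,
	DEMO_R_FAILURE,
	DEMO_R_NOTFOUND
} demo_result_t;

/* Return immediately on error; for functions with nothing to clean up. */
#define DEMO_RETERR(op)                      \
	do {                                 \
		demo_result_t r_ = (op);     \
		if (r_ != DEMO_R_SUCCESS) {  \
			return r_;           \
		}                            \
	} while (0)

/* Jump to the `cleanup` label on error; expects a local `result`. */
#define DEMO_CHECK(op)                          \
	do {                                    \
		result = (op);                  \
		if (result != DEMO_R_SUCCESS) { \
			goto cleanup;           \
		}                               \
	} while (0)

/* Set `result` and jump to `cleanup` unconditionally. */
#define DEMO_CLEANUP(r)       \
	do {                  \
		result = (r); \
		goto cleanup; \
	} while (0)

int demo_cleanups = 0; /* counts how often a cleanup block ran */

static demo_result_t
demo_step(int ok) {
	assert(ok == 0 || ok == 1);
	return ok ? DEMO_R_SUCCESS : DEMO_R_FAILURE;
}

demo_result_t
demo_reterr(int ok) {
	DEMO_RETERR(demo_step(ok));
	return DEMO_R_SUCCESS;
}

demo_result_t
demo_check(int ok) {
	demo_result_t result;

	DEMO_CHECK(demo_step(ok));
	DEMO_CHECK(demo_step(1));
cleanup:
	demo_cleanups++;
	return result;
}

demo_result_t
demo_lookup(int found) {
	demo_result_t result = DEMO_R_SUCCESS;

	if (!found) {
		DEMO_CLEANUP(DEMO_R_NOTFOUND);
	}
cleanup:
	demo_cleanups++;
	return result;
}
```

Because every function uses the same label name, a single macro definition can serve the whole code base.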
Evan Hunt [Wed, 21 May 2025 20:22:58 +0000 (13:22 -0700)]
standardize CHECK and RETERR macros
previously, there were over 40 separate definitions of CHECK macros, of
which most used "goto cleanup", and the rest "goto failure" or "goto
out". there were another 10 definitions of RETERR, of which most were
identical to CHECK, but some simply returned a result code instead of
jumping to a cleanup label.
this has now been standardized throughout the code base: RETERR is for
returning an error code in the case of an error, and CHECK is for jumping
to a cleanup tag, which is now always called "cleanup". both macros are
defined in isc/util.h.
Colin Vidal [Wed, 3 Dec 2025 15:26:22 +0000 (16:26 +0100)]
chg: dev: Add RRSIG if required as soon as they are found
When the EDNS DO flag is set (`dig +dnssec`), an rdataset is allocated
to hold the RRSIG of an RR, if present in the DB. However, this
allocation is not done if the zone DB is not considered secure
(`dns_db_issecure() == false`). Change this behaviour by allocating the
rdataset anyway, so the RRSIG can be attached to the answer section of
the response as soon as it is found in the DB.
Attaching the RRSIG potentially more often (though this probably only
occurs in edge cases) doesn't seem to affect performance in any way.
Merge branch 'colin/rrsig-nonsecure-db' into 'main'
Colin Vidal [Tue, 2 Dec 2025 18:00:55 +0000 (19:00 +0100)]
test for RRSIG provided as soon as they are found
Add a system test which checks that a server authoritative for a zone
which is not fully signed (here, it is missing the DNSKEY records as well
as the RRSIG on the RR `b`) still returns the RRSIG associated with an RR
if provided in the zone.
Colin Vidal [Tue, 2 Dec 2025 15:53:40 +0000 (16:53 +0100)]
add RRSIG if required as soon as they are found
When the EDNS DO flag is set (`dig +dnssec`), an rdataset is allocated
to hold the RRSIG of an RR, if present in the DB. However, this
allocation is not done if the zone DB is not considered secure
(`dns_db_issecure() == false`). Change this behaviour by allocating the
rdataset anyway, so the RRSIG can be attached to the answer section of
the response as soon as it is found in the DB.
Arаm Sаrgsyаn [Wed, 3 Dec 2025 10:16:08 +0000 (10:16 +0000)]
fix: test: Fix an issue with unreachable cache's unit test
The isc_stdtime_now() function used by dns_unreachcache_find() to
check if the entry needs to be expired has a one-second resolution,
and the test sleeps for 1 second and then for the amount of the
expiration interval, which in a worst-case scenario can cause the
test to fail, because the entry was expected to be expired but it
wasn't. Sleep for 2 seconds instead of 1 to avoid the timing
resolution issue.
Closes #5601
Merge branch '5601-unreachable-cache-expire-test-fix' into 'main'
Aram Sargsyan [Thu, 6 Nov 2025 11:33:41 +0000 (11:33 +0000)]
Fix an issue with unreachable cache's unit test
The isc_stdtime_now() function used by dns_unreachcache_find() to
check if the entry needs to be expired has a one-second resolution,
and the test sleeps for 1 second and then for the amount of the
expiration interval, which in a worst-case scenario can cause the
test to fail, because the entry was expected to be expired but it
wasn't. Sleep for 2 seconds instead of 1 to avoid the timing
resolution issue.
Matthijs Mekking [Thu, 20 Nov 2025 13:10:45 +0000 (14:10 +0100)]
Wait for log zone_needdump is more reliable
In some cases we wait for the log message "sending notifies" before
proceeding with the test case. Notifies are rate limited. They are not
sent on every change to the zone. The "zone_needdump" messages happen on
every change.
Evan Hunt [Fri, 28 Nov 2025 19:07:48 +0000 (19:07 +0000)]
fix: dev: Pass isc_buffer_t pointers when applicable
In commit aea251f3bce7, `isc_buffer_reserve()` was changed to
take a simple `isc_buffer_t *` instead of `isc_buffer_t **`.
A number of functions calling it have now been similarly
modified.
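As an illustration of the calling convention only, here is a minimal sketch of a growable buffer whose reserve function takes a plain pointer. `demo_buffer_t` and the `demo_buffer_*` functions are hypothetical and are not the `isc_buffer` API. The point is that callers of a single-pointer reserve function need nothing more than a single pointer themselves.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable buffer, NOT the real isc_buffer_t. */
typedef struct demo_buffer {
	unsigned char *base;
	size_t length;
	size_t used;
} demo_buffer_t;

/* Grow the storage in place; the buffer object itself never moves. */
int
demo_buffer_reserve(demo_buffer_t *b, size_t size) {
	assert(b != NULL);
	if (b->length - b->used >= size) {
		return 0;
	}
	size_t newlen = b->used + size;
	unsigned char *p = realloc(b->base, newlen);
	if (p == NULL) {
		return -1;
	}
	b->base = p;
	b->length = newlen;
	return 0;
}

/* A caller of reserve needs only `demo_buffer_t *` as well. */
int
demo_buffer_putuint16(demo_buffer_t *b, uint16_t v) {
	if (demo_buffer_reserve(b, 2) != 0) {
		return -1;
	}
	b->base[b->used++] = (unsigned char)(v >> 8);
	b->base[b->used++] = (unsigned char)(v & 0xff);
	return 0;
}

/* Write two 16-bit values and report how many bytes were used. */
size_t
demo_buffer_example(void) {
	demo_buffer_t b = { NULL, 0, 0 };
	size_t used = 0;

	if (demo_buffer_putuint16(&b, 53) == 0 &&
	    demo_buffer_putuint16(&b, 853) == 0)
	{
		used = b.used;
	}
	free(b.base);
	return used;
}
```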
Evan Hunt [Wed, 26 Nov 2025 07:23:19 +0000 (23:23 -0800)]
pass isc_buffer_t pointers when applicable
In commit aea251f3bce7, `isc_buffer_reserve()` was changed to
take a simple `isc_buffer_t *` instead of `isc_buffer_t **`.
A number of functions calling it have now been similarly
modified.
Matthijs Mekking [Fri, 28 Nov 2025 15:15:39 +0000 (15:15 +0000)]
chg: usr: Improve output of 'rndc dnssec -status'
Add a new parameter ``-v`` to the ``rndc dnssec -status`` command for more verbose output. Previously, key states were printed, and keys that can be purged were listed. This made the output hard to read. This information is now only shown in the verbose output.
Add more meaningful messages to the status output, making it clearer what the state of a rollover is.
This makes the output more condensed, improving its readability.
Closes #3938
Merge branch '3938-improve-rndc-dnssec-status-output' into 'main'
Matthijs Mekking [Wed, 15 Oct 2025 14:04:28 +0000 (16:04 +0200)]
Change output of rndc dnssec -status
Wrap 'dns_keymgr_status()' in 'dns_zone_dnssecstatus()' so we can easily
retrieve the zone string name and refresh key time value.
In addition to the current time, output when the next key event is
expected.
Don't log keys that are completely hidden unless verbose is set.
Don't log key state values unless verbose is set, or they are in a
weird state.
For expected key states, log a more useful message of the stage of
the rollover. If we are in the middle of a key rollover, don't log
when the next key rollover is scheduled.
Matthijs Mekking [Wed, 26 Nov 2025 07:31:45 +0000 (08:31 +0100)]
Update misleading comments in multisigner test
We are not actually retrieving these records from the other provider,
they are available as key files to us and we are using those files
to send a dynamic update to the server.
Matthijs Mekking [Sun, 12 Oct 2025 09:49:00 +0000 (11:49 +0200)]
Convert model2.secondary test to pytest
This test is similar to model2.multisigner, but now the two providers
are both secondary, both using the same hidden primary. The DNSKEY,
CDNSKEY, and CDS records need to be published at the hidden primary,
ns5, the zone is transferred to both secondaries, ns3 and ns4.
To avoid intermittent test failures, we wait for the line
"zone {zone}/IN (signed): serial {serial2} (unsigned {serial1})" in
the secondary server logs. This is a signal that the unsigned zone
with serial <serial1> has a signed version ready with serial <serial2>.
Matthijs Mekking [Fri, 10 Oct 2025 16:02:58 +0000 (18:02 +0200)]
Update multisigner system test to set primary
When testing multi-signer as bump-in-the-wire (upcoming test), we want
to be able to do dynamic updates to a hidden primary. Update the
test functions such that we can set a specific primary server.
Matthijs Mekking [Fri, 10 Oct 2025 08:57:50 +0000 (10:57 +0200)]
Convert model2.multisigner test to pytest
This converts the model2.multisigner tests from the multisigner system
test to pytest based code. Crappy shell test functions such as
'zsks_are_published', 'records_published' and others are replaced with
the standard test code from isctest.kasp and by setting 'private=False'
and 'legacy=True' on the keys from the other providers so we don't do
any key file testing.
Ondřej Surý [Mon, 24 Nov 2025 09:06:51 +0000 (10:06 +0100)]
Provide more information when the memory allocation fails
Instead of just crashing when memory allocation fails, also print a
message saying "Out of memory!", the size of the allocation that failed,
total allocated memory from all memory contexts and value of errno.
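A minimal sketch of such a diagnostic follows. The message layout, `demo_oom_message()`, and `demo_malloc()` are assumptions made for illustration; only the four pieces of information (the "Out of memory!" text, the failed size, the total allocated, and errno) come from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Format the diagnostic; the exact wording here is hypothetical. */
const char *
demo_oom_message(size_t size, size_t total, int err) {
	static char buf[256];

	snprintf(buf, sizeof(buf),
		 "Out of memory! failed to allocate %zu bytes "
		 "(%zu bytes allocated in total, errno %d: %s)",
		 size, total, err, strerror(err));
	return buf;
}

/* Allocate or report and abort; *total is the caller's running sum. */
void *
demo_malloc(size_t size, size_t *total) {
	assert(size > 0 && total != NULL);

	void *p = malloc(size);
	if (p == NULL) {
		fprintf(stderr, "%s\n", demo_oom_message(size, *total, errno));
		abort();
	}
	*total += size;
	return p;
}
```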
Ondřej Surý [Fri, 28 Nov 2025 13:34:04 +0000 (14:34 +0100)]
fix: nil: Fix missing field 'merge' initializer for the new cfg_clausedef_t
In !11121, a .merge member was added to cfg_clausedef_t. This caused
a build failure with -Werror,-Wmissing-field-initializers enabled.
Add the missing initializer and set them all to NULL to match the
intent.
Merge branch 'ondrej/fix-compilation-on-macos' into 'main'
Ondřej Surý [Fri, 28 Nov 2025 10:38:41 +0000 (11:38 +0100)]
Fix missing field 'merge' initializer for the new cfg_clausedef_t
In !11121, a .merge member was added to cfg_clausedef_t. This caused
a build failure with -Werror,-Wmissing-field-initializers enabled.
Add the missing initializer and set them all to NULL to match the
intent.
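The warning can be reproduced in miniature. `demo_clausedef_t` below is a hypothetical, cut-down stand-in for `cfg_clausedef_t`; it only shows why adding a trailing member forces every positional initializer to be updated.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*demo_mergefunc_t)(void);

/* Hypothetical, reduced clause definition (NOT cfg_clausedef_t). */
typedef struct demo_clausedef {
	const char *name;
	unsigned int flags;
	demo_mergefunc_t merge; /* newly added trailing member */
} demo_clausedef_t;

/*
 * Positional initializers that stop at `flags`, e.g. { "port", 0 },
 * still compile, but trigger -Wmissing-field-initializers (fatal with
 * -Werror). Spelling out the new member keeps the build warning-free.
 */
static const demo_clausedef_t demo_clauses[] = {
	{ "port", 0, NULL },
	{ "tls-port", 0, NULL },
	{ NULL, 0, NULL }, /* terminator */
};

/* Count the clauses that have no merge callback. */
size_t
demo_count_unmerged(void) {
	size_t n = 0;

	for (const demo_clausedef_t *c = demo_clauses; c->name != NULL; c++) {
		if (c->merge == NULL) {
			n++;
		}
	}
	assert(n <= sizeof(demo_clauses) / sizeof(demo_clauses[0]));
	return n;
}
```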
Colin Vidal [Fri, 28 Nov 2025 12:45:06 +0000 (13:45 +0100)]
fix: dev: Fix uninitialized pointer check on getipandkeylist
Function `named_config_getipandkeylist` could, in case of error in the early code attempting to get the `port` or `tls-port`, make a pointer check on a non-initialized value. This is now fixed.
Merge branch 'colin/getipandkeylist-uinitstate' into 'main'
Colin Vidal [Fri, 28 Nov 2025 10:55:32 +0000 (11:55 +0100)]
fix uninitialized pointer check on getipandkeylist
Function `named_config_getipandkeylist` could, in case of error in the
early code attempting to get the `port` or `tls-port`, make a pointer
check on a non-initialized value. This is now fixed.
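The class of bug can be sketched as follows. `demo_getlist()` is a hypothetical function, not the real `named_config_getipandkeylist()`. It shows the general rule that any pointer tested in the cleanup block must be initialized before the first jump to it.

```c
#include <assert.h>
#include <stdlib.h>

int demo_frees = 0; /* counts how often the cleanup block freed memory */

int
demo_getlist(int port_lookup_fails) {
	int result = 0;
	char *addrs = NULL; /* set before the first possible `goto cleanup` */

	assert(port_lookup_fails == 0 || port_lookup_fails == 1);

	if (port_lookup_fails) {
		/* Early error: addrs has not been allocated yet. */
		result = -1;
		goto cleanup;
	}

	addrs = malloc(64);
	if (addrs == NULL) {
		result = -1;
		goto cleanup;
	}
	/* ... a real function would fill in addrs here ... */

cleanup:
	/* Safe only because addrs is NULL on the early-error path. */
	if (addrs != NULL) {
		free(addrs);
		demo_frees++;
	}
	return result;
}
```

Without the `= NULL` initializer, the early-error path would test an indeterminate pointer and might pass it to `free()`.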
Ondřej Surý [Fri, 28 Nov 2025 09:51:38 +0000 (10:51 +0100)]
fix: usr: Fix caching RRSIG covering cache NODATA record
When an RRSIG was cached for a type for which a NODATA record was already cached, due to a mismatch between the records on the upstream nameservers, an assertion failure could be triggered. This has been fixed.
Closes #5633
Merge branch '5633-evict-related-rrsig-when-adding-negative-header' into 'main'
Ondřej Surý [Sat, 8 Nov 2025 11:10:51 +0000 (12:10 +0100)]
Fix not caching RRSIG covering cache NODATA record
During refactoring, a condition that prevented caching RRSIGs for
types for which we already have a cached NODATA record was changed in
an invalid way. This was caught later when a cached NODATA(type) +
RRSIG(type) was found in the cache and caused an assertion failure.
Fix and simplify the condition that prevents adding such RRSIGs.
Ondřej Surý [Sat, 8 Nov 2025 11:06:20 +0000 (12:06 +0100)]
Evict the RRSIG when adding negative header
Formerly, we evicted the RRSIG(type) only when changing an existing
header from positive to negative. Move the eviction routine for the
RRSIG to a common path, so the RRSIG also gets evicted when we are
adding a new negative header for a specific type.
Colin Vidal [Tue, 25 Nov 2025 14:45:22 +0000 (15:45 +0100)]
check validity of key and tls in a server-list
If a `key` or `tls` is associated with an IP address inside a
server-list, only the existence of the `tls` in the configuration was
checked. Also, if a `key` or `tls` is associated with a named
server-list inside a server-list, there was no check at all.
Add the check for making sure a `key` is defined in the configuration,
as well as the check for `key` and `tls` when used on a named
server-list.
Colin Vidal [Tue, 25 Nov 2025 14:34:26 +0000 (15:34 +0100)]
check remote-servers list correctness
`check.c` only checks if `remote-servers`, `primaries`, etc. are not
duplicated inside the configuration file, but does not check the
correctness of its definition. This commit fixes this by calling
`validate_remotes()` for each `remote-servers` (and other aliases),
which validates the correctness of the definition itself (this is the
same call done to validate other cases like `also-notify`, etc.).
Colin Vidal [Tue, 25 Nov 2025 12:56:37 +0000 (13:56 +0100)]
refactoring of `named_config_getipandkeylist`
Function `named_config_getipandkeylist()` processes the nested lists by
overriding the current local variable of the function, jumping back to
the beginning of the list processing. Of course, in order to go back to
the previous state and process the remaining items of the current list,
a "stack" array is used in order to put and get back the next list
element and associated values.
This makes the logic quite complex and error-prone. Instead, this commit
changes the logic by recursing into the nested list (while sharing a
state between all the invocations). The processing is fundamentally
identical, but instead of "manually" handling the stack to go back to
the previous state (and process the remaining elements of the current
list), it takes advantage of recursion.
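The recursive shape can be sketched as below. All `demo_*` types and names are hypothetical and much simpler than the real configuration objects. The sketch only shows nested lists being flattened with one state structure shared by every level, so that the call stack, rather than a hand-managed array, remembers where to resume.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX 16

typedef struct demo_list demo_list_t;

typedef struct demo_item {
	const char *addr;	   /* literal address, or NULL ... */
	const demo_list_t *nested; /* ... when the item names another list */
} demo_item_t;

struct demo_list {
	const char *name;
	const demo_item_t *items;
	size_t nitems;
};

/* State shared by every level of the recursion. */
typedef struct demo_state {
	const char *addrs[DEMO_MAX];
	size_t naddrs;
	const demo_list_t *seen[DEMO_MAX];
	size_t nseen;
} demo_state_t;

static void
demo_walk(const demo_list_t *list, demo_state_t *st) {
	for (size_t i = 0; i < st->nseen; i++) {
		if (st->seen[i] == list) {
			return; /* already processed; also stops cycles */
		}
	}
	assert(st->nseen < DEMO_MAX);
	st->seen[st->nseen++] = list;

	for (size_t i = 0; i < list->nitems; i++) {
		const demo_item_t *item = &list->items[i];
		if (item->nested != NULL) {
			/* The call stack remembers where to resume. */
			demo_walk(item->nested, st);
		} else {
			assert(st->naddrs < DEMO_MAX);
			st->addrs[st->naddrs++] = item->addr;
		}
	}
}

/* Example data: list `bar` refers to list `foo` twice. */
static const demo_item_t foo_items[] = { { "10.53.0.5", NULL } };
static const demo_list_t foo = { "foo", foo_items, 1 };
static const demo_item_t bar_items[] = {
	{ "10.53.0.1", NULL },
	{ NULL, &foo },
	{ "10.53.0.2", NULL },
	{ NULL, &foo },
};
static const demo_list_t bar = { "bar", bar_items, 4 };

/* Flatten `bar` and return the address at position idx, or NULL. */
const char *
demo_flatten_bar(size_t idx) {
	demo_state_t st;

	memset(&st, 0, sizeof(st));
	demo_walk(&bar, &st);
	return idx < st.naddrs ? st.addrs[idx] : NULL;
}
```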
did not work: the `fookey` was silently ignored. No matter how `bar` was
used, the server `10.53.0.5` wouldn't be contacted using the TSIG key
`fookey`. The problem is the same for the `tls` property.
The reason of the problem was that when `named_config_getipandkeylist()`
reached a named server-list (here, `foo`), it modified the current
context in order to immediately process what is inside `foo`, but forgot
to look at the fields `key` and `tls`, to associate those with `foo`
addresses.
Fix the problem by wrapping the `key` and `tls` from the "caller" list
inside the existing `lists` struct which is used to figure out if a
list is already processed or not. That way, the `key` and `tls` values
can be read when adding the addresses of the nested list.
Colin Vidal [Fri, 21 Nov 2025 16:05:15 +0000 (17:05 +0100)]
test named remote-servers `key` usage
Even though `remote-servers` now allows using named server-list with `key`
(or `tls`), the `key` or `tls` is not used, in the context of a named
server-list, when configuring the server.
Colin Vidal [Wed, 19 Nov 2025 16:34:16 +0000 (17:34 +0100)]
allow named remote-servers list with key or tls
The remote-servers clause enables the following pattern:
remote-servers a { 1.2.3.4; ... };
remote-servers b { a key foo; };
However, `check.c` was explicitly throwing an error if a `key` or `tls`
was provided after a named server-list. Remove this check, as this is a
valid use case.
Arаm Sаrgsyаn [Thu, 27 Nov 2025 17:41:17 +0000 (17:41 +0000)]
fix: usr: Fix TLS contexts cache object usage bug in the resolver
:iscman:`named` could terminate unexpectedly when reconfiguring or
reloading, and if client-side TLS transport was in use (for example,
when forwarding queries to a DoT server). This has been fixed.
Closes #5653
Merge branch '5653-tlsctx_cache-reference-bug-fix' into 'main'
Aram Sargsyan [Thu, 27 Nov 2025 15:00:26 +0000 (15:00 +0000)]
Fix a bug where tlsctx_cache could be destroyed while still in use
When named is being reconfigured, it detaches from the old
'isc_tlsctx_cache_t' TLS context cache object and creates a
new one. This can cause an assertion failure within the
resolver when the object is destroyed while still in use,
because the resolver is using the object without getting
attached to it.
Add an attach/detach so that the 'isc_tlsctx_cache_t' doesn't
get destroyed while still being in use.
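The attach/detach pattern can be sketched with a plain reference counter. `demo_cache_t` and its functions are hypothetical and single-threaded; they are not the `isc_tlsctx_cache_t` API, which is only assumed here to follow the same reference-counting idea.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted object (not thread-safe). */
typedef struct demo_cache {
	unsigned int references;
} demo_cache_t;

int demo_destroyed = 0; /* counts destroyed caches */

demo_cache_t *
demo_cache_create(void) {
	demo_cache_t *c = malloc(sizeof(*c));
	if (c == NULL) {
		abort();
	}
	c->references = 1; /* the creator holds the first reference */
	return c;
}

void
demo_cache_attach(demo_cache_t *src, demo_cache_t **dstp) {
	assert(src != NULL && dstp != NULL && *dstp == NULL);
	src->references++;
	*dstp = src;
}

void
demo_cache_detach(demo_cache_t **cp) {
	assert(cp != NULL && *cp != NULL);
	demo_cache_t *c = *cp;
	*cp = NULL;
	if (--c->references == 0) {
		free(c);
		demo_destroyed++;
	}
}

/* Returns 1 if the cache survived until its last user let go. */
int
demo_reconfigure(void) {
	int before = demo_destroyed;
	demo_cache_t *server = demo_cache_create();
	demo_cache_t *resolver = NULL;

	demo_cache_attach(server, &resolver); /* resolver takes a reference */
	demo_cache_detach(&server);	      /* reconfiguration drops the old cache */
	int alive_while_used = (demo_destroyed == before);

	demo_cache_detach(&resolver); /* last user is done */
	return alive_while_used && demo_destroyed == before + 1;
}
```

If the resolver copied the pointer without attaching, the `demo_cache_detach(&server)` call would free the object while the resolver still held a pointer to it.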
Ondřej Surý [Thu, 27 Nov 2025 16:34:42 +0000 (17:34 +0100)]
fix: usr: Fix the spurious timeouts while resolving names
Sometimes loops in the resolution process (e.g. to resolve or validate ns1.example.com we need to resolve ns1.example.com) were not properly detected, leading to a spurious 10-second delay. This has been fixed and such loops are now properly detected.
Closes #3033, #5578
Merge branch '5578-tracker-parent-fetch' into 'main'
Ondřej Surý [Wed, 22 Oct 2025 17:25:55 +0000 (19:25 +0200)]
Detect resolution loops between fetches
Maintain the relationship between the parent and child fetches and, when
creating a new child fetch, check for resolution loops in which the new
fetch would join one of its parents' fetch contexts.
Ondřej Surý [Thu, 27 Nov 2025 16:34:07 +0000 (17:34 +0100)]
chg: usr: Change the QNAME minimization algorithm to follow the standard
In !9155, the QNAME minimization was changed to not leak the query type
to the parent name server. This violates RFC 9156 Section 3, step (3)
and it is not necessary. It also breaks some (weird) authoritative DNS
setups, especially when CNAMEs are involved. Also there is really no
privacy leak with query type.
Closes #5661
Merge branch '5661-dont-minimize-when-QNAME-matches-original-QNAME' into 'main'
Ondřej Surý [Thu, 27 Nov 2025 13:07:35 +0000 (14:07 +0100)]
Change the QNAME minimization algorithm to follow the standard
In !9155, the QNAME minimization was changed to not leak the query type
to the parent name server. This violates RFC 9156 Section 3, step (3)
and it is not necessary. It also breaks some (weird) authoritative DNS
setups, especially when CNAMEs are involved. Also there is really no
privacy leak with query type.
Nicki Křížek [Thu, 27 Nov 2025 13:49:01 +0000 (14:49 +0100)]
new: test: Create trust anchors from isctest.kasp.Key
Add the isctest.kasp.Key.into_ta() method, which converts the key into
a DS / DNSKEY trust anchor for BIND config. Add a shared template
trusted.conf.j2 which can be linked to in tests to create the trust
anchor configuration from trust anchor data returned from the
bootstrap() function.
This is basically a Python replacement for the keyfile_to_static_ds (and
friends) from the conf.sh shell framework.
Merge branch 'nicki/pytest-add-trust-anchor-template' into 'main'
Nicki Křížek [Mon, 3 Nov 2025 13:59:00 +0000 (14:59 +0100)]
Add a template for TA and generate it from isctest.kasp.Key
Add the isctest.kasp.Key.into_ta() method, which converts the key into
a DS / DNSKEY trust anchor for BIND config. Add a shared template
trusted.conf.j2 which can be linked to in tests to create the trust
anchor configuration from trust anchor data returned from the
bootstrap() function.
This is basically a Python replacement for the keyfile_to_static_ds (and
friends) from the conf.sh shell framework.
Nicki Křížek [Fri, 24 Oct 2025 14:47:59 +0000 (16:47 +0200)]
Parse DNSKEY into a dnspython type in isctest.kasp.Key.dnskey
Previously, a DNSKEY string from keyfile was returned. This made the
function brittle for further processing, as the string would have to be
split up, concatenated, and TTL could be missing, making string indices
context-dependent.
Parse the DNSKEY rrset into a proper dnspython object and return it.
This makes the output more predictable and reliable, as all the
necessary parsing is done by dnspython.
Meson boolean options are usually configured with enabled/disabled
instead of on/off. Make things more consistent with other meson options
by renaming -Dnamed-lto=off to -Dnamed-lto=disabled.
Ondřej Surý [Thu, 27 Nov 2025 11:42:09 +0000 (12:42 +0100)]
chg: dev: Use malloc_usable_size()/malloc_size() for memory accounting
Restore usage of malloc_usable_size()/malloc_size(), but this time only
for memory accounting and statistics purposes. This should reduce the
memory footprint in case of compilation without jemalloc as we don't
have to keep track of the allocated memory size ourselves.
Merge branch 'ondrej/use-malloc_usable_size-when-available' into 'main'
Ondřej Surý [Mon, 24 Nov 2025 08:41:31 +0000 (09:41 +0100)]
Use malloc_usable_size()/malloc_size() for memory accounting
Restore usage of malloc_usable_size()/malloc_size(), but this time only
for memory accounting and statistics purposes. This should reduce the
memory footprint in case of compilation without jemalloc as we don't
have to keep track of the allocated memory size ourselves.
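A minimal sketch of allocator-assisted accounting follows. The `demo_*` names are hypothetical, the platform detection is simplified to glibc and macOS, and the fallback (trusting the size passed by the caller) is an assumption made only to keep the example portable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#if defined(__GLIBC__)
#include <malloc.h>
#define DEMO_USABLE_SIZE(p) malloc_usable_size(p)
#elif defined(__APPLE__)
#include <malloc/malloc.h>
#define DEMO_USABLE_SIZE(p) malloc_size(p)
#endif

static size_t demo_inuse = 0; /* bytes currently accounted for */

/* Ask the allocator how big the block is, when it can tell us. */
static size_t
demo_size_of(void *p, size_t requested) {
#ifdef DEMO_USABLE_SIZE
	(void)requested;
	return DEMO_USABLE_SIZE(p);
#else
	(void)p;
	return requested; /* fallback: trust the caller's size */
#endif
}

void *
demo_get(size_t size) {
	void *p = malloc(size);
	if (p == NULL) {
		abort();
	}
	demo_inuse += demo_size_of(p, size);
	return p;
}

void
demo_put(void *p, size_t size) {
	assert(p != NULL);
	demo_inuse -= demo_size_of(p, size);
	free(p);
}

/* Returns the bytes accounted while allocated, or 0 on imbalance. */
size_t
demo_roundtrip(size_t size) {
	void *p = demo_get(size);
	size_t during = demo_inuse;

	demo_put(p, size);
	return demo_inuse == 0 ? during : 0;
}
```

The accounted figure may exceed the requested size, because the allocator reports the usable size of the block it actually handed out.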
Ondřej Surý [Wed, 26 Nov 2025 10:45:03 +0000 (11:45 +0100)]
Enable junk filling via jemalloc option in the CI
Since filling memory with junk patterns has been removed from the ISC
memory context in favor of the jemalloc opt.junk option, enable that
jemalloc behaviour by default in the GitLab CI.
As the fetch context reference counting was converted to userspace RCU
reference counting, the ability to debug the reference counting was
lost. Restore the debugging by adding the optional compile-time enabled
debugging output again.
Merge branch 'ondrej/add-tracing-to-fctx-reference-counting' into 'main'
Ondřej Surý [Thu, 23 Oct 2025 11:11:45 +0000 (13:11 +0200)]
Add optional debugging output for fetch context reference counting
As the fetch context reference counting was converted to userspace RCU
reference counting, the ability to debug the reference counting was
lost. Restore the debugging by adding the optional compile-time enabled
debugging output again.
Ondřej Surý [Thu, 27 Nov 2025 09:38:58 +0000 (10:38 +0100)]
chg: nil: Split qctx_destroy() into qctx_deinit() and qctx_destroy()
The qctx_destroy() only needs to be called on allocated memory and
qctx_deinit() needs to be called always. Also remove .allocated member
from the query_ctx_t structure.
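The split can be sketched as follows. `demo_qctx_t` and its functions are hypothetical, and the exact responsibilities of the real `qctx_deinit()` and `qctx_destroy()` are assumed rather than taken from the source: here deinit releases what init acquired, while destroy only frees a heap-allocated structure.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical context (NOT the real query_ctx_t). */
typedef struct demo_qctx {
	char *scratch;
} demo_qctx_t;

int demo_live_resources = 0; /* resources acquired but not yet released */

void
demo_qctx_init(demo_qctx_t *q) {
	q->scratch = malloc(32);
	if (q->scratch == NULL) {
		abort();
	}
	demo_live_resources++;
}

/* Always required: releases what init acquired, wherever q lives. */
void
demo_qctx_deinit(demo_qctx_t *q) {
	free(q->scratch);
	q->scratch = NULL;
	demo_live_resources--;
}

/* Only for heap-allocated contexts: frees the structure itself. */
void
demo_qctx_destroy(demo_qctx_t **qp) {
	assert(qp != NULL && *qp != NULL);
	free(*qp);
	*qp = NULL;
}

/* Stack context: deinit only. Returns the number of leaked resources. */
int
demo_stack_case(void) {
	demo_qctx_t q;

	demo_qctx_init(&q);
	demo_qctx_deinit(&q); /* no destroy: q itself was never allocated */
	return demo_live_resources;
}

/* Heap context: deinit, then destroy. Returns 1 when all is released. */
int
demo_heap_case(void) {
	demo_qctx_t *q = malloc(sizeof(*q));
	if (q == NULL) {
		abort();
	}
	demo_qctx_init(q);
	demo_qctx_deinit(q);
	demo_qctx_destroy(&q);
	return demo_live_resources == 0 && q == NULL;
}
```

Since the caller knows whether the context lives on the stack or the heap, the structure no longer needs a flag recording how it was allocated.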