Colin Vidal [Thu, 2 Oct 2025 05:32:50 +0000 (07:32 +0200)]
chg: dev: rename ns_pluginregister_ctx_t into ns_pluginctx_t
The type `ns_pluginregister_ctx_t` was initially added to pass plugin
contextual data when the plugin is registered, but this is also now
passed into `plugin_check`. Furthermore, those various data are not
specific to the registration in particular. Rename the type into
`ns_pluginctx_t` for clarity.
Merge branch 'colin/rename-pluginctx-rename' into 'main'
Colin Vidal [Wed, 1 Oct 2025 11:32:59 +0000 (13:32 +0200)]
rename ns_pluginregister_ctx_t into ns_pluginctx_t
The type `ns_pluginregister_ctx_t` was initially added to pass plugin
contextual data when the plugin is registered, but this is also now
passed into `plugin_check`. Furthermore, those various data are not
specific to the registration in particular. Rename the type into
`ns_pluginctx_t` for clarity.
Andoni Duarte [Wed, 1 Oct 2025 14:20:26 +0000 (14:20 +0000)]
new: ci: Prepare release announcement MR
In the 'release' stage, create an MR automatically with the
corresponding release announcement. The input for this is taken from
metadata.json in bind9-qa.
Merge branch 'andoni/release-announcement-preparation' into 'main'
Colin Vidal [Wed, 1 Oct 2025 10:52:39 +0000 (12:52 +0200)]
new: usr: add support for synthetic records
Add a query plugin which, in "reverse" mode, enables the server to build a synthesized response to a PTR query when the PTR record requested is not found in the zone.
The dynamically-built name is constructed from a static prefix (passed as a plugin parameter), the IP address (extracted from the query name) and a suffix (also passed as a plugin parameter). An `allow-synth` address-match list can be used to limit the network addresses for which the plugin may generate responses.
The plugin can also be used in "forward" mode, to build synthesized A/AAAA records from names using the same format as the dynamically-built PTR names. The same parameters are used: the plugin will react and answer a query if the name matches the configured prefix and origin, and encodes an IP address that is within `allow-synth`.
The "origin" parameter for synthrecord is now mandatory for reverse
zones, but when configured in a non-reverse zone, it will default to
the zone name.
The plugin's operating mode is now determined automatically
from the zone name: if the name ends in "ip6.arpa" or "in-addr.arpa",
then the plugin is in reverse mode, otherwise forward.
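The mode selection described above can be sketched as a small suffix test. This is only an illustration: the names `name_has_suffix`, `synth_mode_t` and `synth_mode_for_zone` are hypothetical, zone names are plain C strings rather than `dns_name_t`, and matching on a label boundary is an assumption not stated in the log.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <strings.h>

/* Hypothetical helper: true if `name` equals `suffix` or ends in
 * "." + suffix, compared case-insensitively. */
static bool
name_has_suffix(const char *name, const char *suffix) {
    size_t nlen = strlen(name);
    size_t slen = strlen(suffix);

    /* Ignore a single trailing dot on the zone name. */
    if (nlen > 0 && name[nlen - 1] == '.') {
        nlen--;
    }
    if (nlen < slen) {
        return false;
    }
    if (strncasecmp(name + nlen - slen, suffix, slen) != 0) {
        return false;
    }
    /* Assumption: the match must start on a label boundary. */
    return nlen == slen || name[nlen - slen - 1] == '.';
}

typedef enum { SYNTH_FORWARD, SYNTH_REVERSE } synth_mode_t;

/* Reverse mode for zones under in-addr.arpa or ip6.arpa, else forward. */
static synth_mode_t
synth_mode_for_zone(const char *zone) {
    assert(zone != NULL);
    if (name_has_suffix(zone, "in-addr.arpa") ||
        name_has_suffix(zone, "ip6.arpa"))
    {
        return SYNTH_REVERSE;
    }
    return SYNTH_FORWARD;
}
```

With this sketch, `2.0.192.in-addr.arpa` selects reverse mode and `example.com` selects forward mode.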
Colin Vidal [Mon, 31 Mar 2025 14:00:32 +0000 (16:00 +0200)]
add synthesized record system tests
Add system tests for the dynamically synthesized record plugin. This
covers the various cases the plugin should handle: generating a PTR
record only when (1) no answer is found locally and (2) the IP address
extracted from the query name is part of an allowed network. This also
covers the case of forward synthesized records: answering an A/AAAA/ANY
query from a PTR address when it matches the prefix, ACL and origin.
Colin Vidal [Mon, 31 Mar 2025 13:57:24 +0000 (15:57 +0200)]
add support for synthesized PTR answers
Add a BIND9 plugin which, in "reverse" mode, enables the server to build
a synthesized response to a PTR query when the PTR record requested is
not found in the zone. (The plugin won't be called for names below a
delegation point, because it couldn't know whether a name actually
exists within the delegation.)
The dynamically-built name is constructed from a static prefix (passed
as a plugin parameter), the IP address (extracted from the query name)
and a suffix (also passed as a plugin parameter). An "allow-synth"
address-match list is used to limit the network addresses for which
the plugin may generate responses.
The plugin can also be used in "forward" mode, to build synthesized
A/AAAA records from names using the same format as the dynamically-built
PTR names, if the query name and type are not found in the zone.
The same parameters are used when the plugin is in forward mode:
the plugin will react and answer a query if the name matches the
configured prefix and origin, and encodes an IP address that is
within "allow-synth".
Colin Vidal [Mon, 31 Mar 2025 13:50:32 +0000 (15:50 +0200)]
add API to parse and extract IP from PTR name
Add an API to parse and extract either an IPv4 or IPv6 address from
a name using the reverse format. It takes care of family detection,
and returns a generic error in case of syntax error.
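As an illustration of what such a parser has to do, here is a standalone sketch that works on plain C strings. It is not the API added by this commit: `rev_addr_t` and `reverse_name_to_addr` are hypothetical names, and the real code operates on DNS names rather than text.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <strings.h>

/* Hypothetical result type: family is 4 or 6; IPv4 uses addr[0..3]. */
typedef struct {
    int family;
    uint8_t addr[16];
} rev_addr_t;

/* Four decimal labels, least significant octet first. */
static int
parse_v4(const char *s, size_t len, rev_addr_t *out) {
    int idx = 3;
    size_t i = 0;
    while (i < len) {
        unsigned v = 0;
        size_t digits = 0;
        while (i < len && s[i] >= '0' && s[i] <= '9') {
            v = v * 10 + (unsigned)(s[i] - '0');
            i++;
            if (++digits > 3) {
                return -1;
            }
        }
        if (digits == 0 || v > 255 || idx < 0) {
            return -1;
        }
        out->addr[idx--] = (uint8_t)v;
        if (i < len) {
            /* A separator must be followed by another label. */
            if (s[i] != '.' || ++i == len) {
                return -1;
            }
        }
    }
    return idx == -1 ? 0 : -1;
}

static int
hexval(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Thirty-two hex nibbles separated by dots, least significant first. */
static int
parse_v6(const char *s, size_t len, rev_addr_t *out) {
    if (len != 63) {
        return -1;
    }
    for (size_t i = 0; i < 32; i++) {
        int v = hexval(s[2 * i]);
        if (v < 0 || (i < 31 && s[2 * i + 1] != '.')) {
            return -1;
        }
        size_t pos = 31 - i; /* nibble position in the address */
        out->addr[pos / 2] |= (uint8_t)((pos % 2 == 1) ? v : (v << 4));
    }
    return 0;
}

static int
ends_with(const char *name, size_t nlen, const char *suffix) {
    size_t slen = strlen(suffix);
    return nlen > slen &&
           strncasecmp(name + nlen - slen, suffix, slen) == 0;
}

/* Detect the family from the suffix; return 0 on success and a
 * generic -1 on any syntax error. */
static int
reverse_name_to_addr(const char *name, rev_addr_t *out) {
    size_t nlen = strlen(name);
    memset(out, 0, sizeof(*out));
    if (nlen > 0 && name[nlen - 1] == '.') {
        nlen--;
    }
    if (ends_with(name, nlen, ".in-addr.arpa")) {
        out->family = 4;
        return parse_v4(name, nlen - strlen(".in-addr.arpa"), out);
    }
    if (ends_with(name, nlen, ".ip6.arpa")) {
        out->family = 6;
        return parse_v6(name, nlen - strlen(".ip6.arpa"), out);
    }
    return -1;
}
```

For example, `4.3.2.1.in-addr.arpa` yields the IPv4 address 1.2.3.4, while a name with only three octet labels is rejected.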
Colin Vidal [Fri, 16 May 2025 16:33:13 +0000 (18:33 +0200)]
expose hex_decode APIs
Functions hex_decode_init(), hex_decode_char() and hex_decode_finish()
are now exposed, as well as the context hex_decode_ctx_t. They now are
respectively called isc_hex_decodeinit(), isc_hex_decodechar(),
isc_hex_decodefinish() and isc_hex_decodectx_t.
This makes it possible to re-implement the functionality of
isc_hex_decodestring() in contexts where the input is not a
NUL-terminated string but, for example, individual characters extracted
one at a time (avoiding an intermediate buffer to store them). It also
makes it possible to decode a stream in which only hex characters are
expected (i.e. no white space).
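The init/char/finish pattern can be sketched as follows. This is a standalone illustration, not the libisc code: `hexstream_t` and the `hexstream_*` functions are hypothetical and do not reflect the signatures of the isc_hex_decode* functions or the layout of isc_hex_decodectx_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical streaming context. */
typedef struct {
    uint8_t *out;   /* destination buffer */
    size_t cap;     /* capacity of `out` in bytes */
    size_t len;     /* bytes written so far */
    int have_high;  /* 1 if a high nibble is pending */
    uint8_t high;   /* pending high nibble */
} hexstream_t;

static void
hexstream_init(hexstream_t *ctx, uint8_t *out, size_t cap) {
    ctx->out = out;
    ctx->cap = cap;
    ctx->len = 0;
    ctx->have_high = 0;
    ctx->high = 0;
}

/* Feed one character. Returns 0 on success, -1 if the character is
 * not a hex digit (white space included) or the buffer is full. */
static int
hexstream_char(hexstream_t *ctx, char c) {
    int v;
    if (c >= '0' && c <= '9') {
        v = c - '0';
    } else if (c >= 'a' && c <= 'f') {
        v = c - 'a' + 10;
    } else if (c >= 'A' && c <= 'F') {
        v = c - 'A' + 10;
    } else {
        return -1;
    }
    if (!ctx->have_high) {
        ctx->high = (uint8_t)v;
        ctx->have_high = 1;
        return 0;
    }
    if (ctx->len == ctx->cap) {
        return -1;
    }
    ctx->out[ctx->len++] = (uint8_t)((ctx->high << 4) | v);
    ctx->have_high = 0;
    return 0;
}

/* Finish decoding: an odd number of digits is an error. */
static int
hexstream_finish(const hexstream_t *ctx) {
    return ctx->have_high ? -1 : 0;
}
```

The caller feeds characters as it extracts them, so no NUL-terminated copy of the input is needed.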
Colin Vidal [Wed, 1 Oct 2025 09:52:14 +0000 (11:52 +0200)]
chg: dev: provide more context when registering plugins
Add a new type, `ns_pluginregister_ctx_t`, which is passed to `plugin_register()` in place of the `source` parameter. The source value is now just part of the structure, which also holds a pointer to the zone origin if the plugin is loaded at a zone level.
This provides more contextual information, enabling the plugin to make specific configuration decisions based on the name of the zone for which it is loaded.
It's also flexible if more contextual data are needed in the future: add a new field to `ns_pluginregister_ctx_t`, and new plugins can use it without affecting compatibility with existing plugins.
Closes #5533
Merge branch '5533-plugin-register-ctx' into 'main'
Colin Vidal [Sun, 28 Sep 2025 21:07:11 +0000 (23:07 +0200)]
unload zone plugin before freeing the zone
Make sure all of a zone's plugins are unloaded before the zone gets freed.
This makes it safe to pass zone metadata, such as its origin, to the
plugin registration function, as it guarantees that the origin remains
valid for the whole plugin lifecycle.
Colin Vidal [Fri, 26 Sep 2025 13:54:51 +0000 (15:54 +0200)]
provide a context structure for plugin_register()
This commit introduces a new type, ns_pluginregister_ctx_t,
which is passed to plugin_check() and plugin_register() in place of the
'source' parameter. The source value is now just part of the structure,
which also holds a pointer to the zone origin if the plugin is loaded at
a zone level.
This provides more contextual information, enabling the plugin to make
specific configuration decisions based on the name of the zone for which
it is loaded.
It's also flexible if more contextual data are needed in the future:
add a new field to ns_pluginregister_ctx_t, and new plugins can use
it without affecting compatibility with existing plugins.
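A rough sketch of the idea, with hypothetical names throughout (`demo_pluginctx_t`, its fields and `demo_effective_origin` are illustrative, and the origin is a plain string here rather than a DNS name object):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the hook source and the context type. */
typedef enum { DEMO_SOURCE_VIEW, DEMO_SOURCE_ZONE } demo_source_t;

typedef struct {
    demo_source_t source; /* where the plugin is loaded from */
    const char *origin;   /* zone origin, NULL at view level */
    /* New fields can be appended here later; plugins that do not
     * read them keep working unchanged. */
} demo_pluginctx_t;

/* Example of a configuration decision based on the context: use the
 * explicitly configured origin if any, else fall back to the zone
 * origin when the plugin is loaded at zone level. */
static const char *
demo_effective_origin(const demo_pluginctx_t *ctx, const char *configured) {
    if (configured != NULL) {
        return configured;
    }
    if (ctx->source == DEMO_SOURCE_ZONE) {
        return ctx->origin;
    }
    return NULL;
}
```

Passing one structure instead of individual parameters is what keeps the function signature stable when more context is added.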
Colin Vidal [Wed, 1 Oct 2025 09:10:40 +0000 (11:10 +0200)]
fix: dev: check plugin config before registering
In `named_config_parsefile()`, when checking the validity of
`named.conf`, the checking of plugin correctness was deliberately
postponed until the plugin is loaded and registered. However,
the checking was never actually done: the `plugin_register()`
implementation was called, but `plugin_check()` was not.
`ns_plugin_register()` (used by `named`) now calls the check function
before the register function, and aborts if either one fails.
`ns_plugin_check()` (used by `named-checkconf`) calls only
the check function.
Merge branch 'each-check-plugin-named' into 'main'
In named_config_parsefile(), when checking the validity of
named.conf, the checking of plugin correctness was deliberately
postponed until the plugin is loaded and registered. However,
when the plugin was registered, the checking was never actually
done: the plugin_register() implementation was called, but
plugin_check() was not.
This made it necessary to duplicate the correctness checking in both
functions, so that both named-checkconf and named could catch errors.
That should not be required.
ns_plugin_register() now calls the check function before the register
function, and aborts if either one fails. ns_plugin_check() calls only
the check function. ns_plugin_check() is used by named-checkconf, and
ns_plugin_register() is used by named. (Note: this design has a
side effect that a call to ns_plugin_register() will result in the
plugin parameters being parsed twice at registration time.)
ns_plugin_check() now takes an additional argument for the hook
source: zone or view.
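The resulting control flow can be sketched like this. All names are hypothetical (`demo_plugin_t`, `demo_plugin_check`, `demo_plugin_register`); the real functions take more arguments and return isc_result_t values.

```c
#include <assert.h>
#include <string.h>

typedef int (*demo_hook_fn)(const char *params);

/* Hypothetical plugin: one check entry point, one register entry point. */
typedef struct {
    demo_hook_fn check;
    demo_hook_fn reg;
} demo_plugin_t;

static int demo_check_calls;
static int demo_register_calls;

/* Toy check: parameters must not be empty. */
static int
demo_check(const char *params) {
    demo_check_calls++;
    return strlen(params) > 0 ? 0 : -1;
}

static int
demo_reg(const char *params) {
    (void)params;
    demo_register_calls++;
    return 0;
}

/* Path used by a configuration checker: only the check runs. */
static int
demo_plugin_check(const demo_plugin_t *p, const char *params) {
    return p->check(params);
}

/* Path used by the server: check first, register only if it passes. */
static int
demo_plugin_register(const demo_plugin_t *p, const char *params) {
    int r = p->check(params);
    if (r != 0) {
        return r;
    }
    return p->reg(params);
}
```

A plugin author then only needs to validate parameters in the check function; the register function can assume they are valid.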
Colin Vidal [Tue, 30 Sep 2025 10:42:04 +0000 (12:42 +0200)]
rem: dev: remove CHECK_FOR_GLUE_IN_ANSWER
Macro CHECK_FOR_GLUE_IN_ANSWER is defined in `lib/dns/resolver.c` only,
documented nowhere and not exposed as build configuration. This is valid
at least for 9.21+, 9.20 and 9.18. Furthermore, it doesn't compile
anymore on 9.21+ with -DCHECK_FOR_GLUE_IN_ANSWER=1.
Considering it is very unlikely that anyone builds named with this,
remove the code rather than fixing it.
Closes #5538
Merge branch '5538-remove-check-for-glue-in-answer' into 'main'
Colin Vidal [Tue, 30 Sep 2025 08:02:34 +0000 (10:02 +0200)]
remove CHECK_FOR_GLUE_IN_ANSWER
Macro CHECK_FOR_GLUE_IN_ANSWER is defined in `lib/dns/resolver.c` only,
documented nowhere and not exposed as build configuration. This is valid
at least for 9.21+, 9.20 and 9.18. Furthermore, it doesn't compile
anymore on 9.21+ with -DCHECK_FOR_GLUE_IN_ANSWER=1.
Considering it is very unlikely that anyone builds named with this,
remove the code rather than fixing it.
chg: dev: Add option to always build fuzz binaries
Currently the fuzzer binaries are only built when someone requests a
fuzzer. This might cause us to inadvertently break fuzzing when changing
function signatures. It also deviates from the behaviour we had with
autotools, where the fuzz binaries were built with make test.
This commit splits the -Dfuzzing option into two: fuzzing, and
fuzzing-backend. The fuzzing option controls whether the fuzzing
binaries are built. The fuzzing-backend option controls which backend to
use, and defaults to none. If the value none is used the binaries are
built, but no backend is used or guaranteed, which means that the
binaries might be non-functional.
Closes #5526
Merge branch '5526-add-meson-option-to-always-build-fuzz-binaries' into 'main'
Colin Vidal [Mon, 29 Sep 2025 08:15:38 +0000 (10:15 +0200)]
fix: dev: hookasyncctx renaming
The `ns_hookasync_t` field was initially named `hook_actx` and was wrongly
renamed `hook_aclctx` during a mass renaming of the various names for the
config ACL context into a consistent `aclctx` name (see !11003). This
is wrong, as `ns_hookasync_t` has nothing to do with ACLs; it is an
_async_ context. This commit fixes the mistake by renaming the field to
`hookasyncctx`.
Merge branch 'colin/fix-hookasyncctx-rename' into 'main'
Colin Vidal [Sun, 28 Sep 2025 20:37:33 +0000 (22:37 +0200)]
fix hookasyncctx renaming
The `ns_hookasync_t` field was initially named `hook_actx` and was wrongly
renamed `hook_aclctx` during a mass renaming of the various names for the
config ACL context into a consistent `aclctx` name (see !11003). This
is wrong, as `ns_hookasync_t` has nothing to do with ACLs; it is an
_async_ context. This commit fixes the mistake by renaming the field to
`hookasyncctx`.
Colin Vidal [Fri, 26 Sep 2025 13:31:33 +0000 (15:31 +0200)]
fix: dev: apply_configuration: leave exclusive mode after viewlist cleanup
When a reconfiguration fails, the `apply_configuration` flow jumps to a
cleanup label and, at some point, leaves exclusive mode and cleans up
the viewlist. This looks fine, as the viewlist is at this point only
known locally (on a configuration failure, it is the new view list;
on success, it is the old list, which was swapped out of the
production list during exclusive mode).
However, the view and zone initialization code enqueues job callbacks,
for instance from `dns_zone_setsigninginterval` (but there are other
cases), which will be called for the new views and zones once
exclusive mode is over.
Depending on where the configuration fails, those views and zones can be
half-configured; for instance, a view might have an unfrozen resolver.
Hence, leaving exclusive mode before cleaning up those views and
zones immediately calls the previously enqueued callbacks and leads
to a reconfiguration-failure crash.
To avoid the problem, the views are now cleaned up before leaving
exclusive mode (which also cleans up the zones and enqueued callbacks).
As context, the bug was introduced by !10910, which moved the creation
(not the configuration) of the views outside of exclusive mode. This is
a safe move (at this point, the new views are only known locally by
`apply_configuration`), but the reordering was wrong with regard to the
point where exclusive mode ends (before the change, exclusive
mode always ended before the new views were detached).
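The ordering issue can be illustrated with a toy model. Everything here is hypothetical (`demo_view_t`, the job array, `demo_leave_exclusive`) and single-threaded; it only shows why detaching the views first keeps queued callbacks from running on half-configured views.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_JOBS 8

typedef struct {
    bool configured; /* false means half-configured */
    bool detached;
} demo_view_t;

typedef struct {
    demo_view_t *view;
    bool cancelled;
} demo_job_t;

static demo_job_t demo_jobs[DEMO_MAX_JOBS];
static size_t demo_njobs;
static int demo_bad_runs; /* jobs that ran on a half-configured view */

/* View/zone initialization queues a callback for later. */
static void
demo_enqueue(demo_view_t *v) {
    assert(demo_njobs < DEMO_MAX_JOBS);
    demo_jobs[demo_njobs++] = (demo_job_t){ v, false };
}

/* Detaching a view also cancels the callbacks queued for it. */
static void
demo_detach_view(demo_view_t *v) {
    for (size_t i = 0; i < demo_njobs; i++) {
        if (demo_jobs[i].view == v) {
            demo_jobs[i].cancelled = true;
        }
    }
    v->detached = true;
}

/* Leaving exclusive mode lets every pending callback run. */
static void
demo_leave_exclusive(void) {
    for (size_t i = 0; i < demo_njobs; i++) {
        if (!demo_jobs[i].cancelled && !demo_jobs[i].view->configured) {
            demo_bad_runs++;
        }
    }
    demo_njobs = 0;
}
```

Calling `demo_detach_view()` before `demo_leave_exclusive()` leaves `demo_bad_runs` at zero; the reverse order lets a callback run against a half-configured view.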
Merge branch 'colin/leave-exclusive-mode-after-view-cleanup' into 'main'
Colin Vidal [Fri, 26 Sep 2025 12:54:42 +0000 (14:54 +0200)]
test views are detached before leaving exclusive mode
Add a log-based test ensuring that when a reconfiguration fails inside
the view configuration, the newly created views are always detached
before exclusive mode ends.
Colin Vidal [Fri, 26 Sep 2025 09:12:53 +0000 (11:12 +0200)]
apply_configuration: leave exclusive mode after viewlist cleanup
When a reconfiguration fails, the `apply_configuration` flow jumps to a
cleanup label and, at some point, leaves exclusive mode and cleans up
the viewlist. This looks fine, as the viewlist is at this point only
known locally (on a configuration failure, it is the new view list;
on success, it is the old list, which was swapped out of the
production list during exclusive mode).
However, the view and zone initialization code enqueues job callbacks,
for instance from `dns_zone_setsigninginterval` (but there are other
cases), which will be called for the new views and zones once
exclusive mode is over.
Depending on where the configuration fails, those views and zones can be
half-configured; for instance, a view might have an unfrozen resolver.
Hence, leaving exclusive mode before cleaning up those views and
zones immediately calls the previously enqueued callbacks and leads
to a reconfiguration-failure crash.
To avoid the problem, the views are now cleaned up before leaving
exclusive mode (which also cleans up the zones and enqueued callbacks).
As context, the bug was introduced by !10910, which moved the creation
(not the configuration) of the views outside of exclusive mode. This is
a safe move (at this point, the new views are only known locally by
`apply_configuration`), but the reordering was wrong with regard to the
point where exclusive mode ends (before the change, exclusive
mode always ended before the new views were detached).
fix: usr: rndc sign during ZSK rollover will now replace signatures
When performing a ZSK rollover, if the new DNSKEY is omnipresent, the :option:`rndc sign` command now signs the zone completely with the successor key, replacing all zone signatures from the predecessor key with new ones.
Closes #5483
Merge branch '5483-smooth-operator-bug' into 'main'
Matthijs Mekking [Tue, 19 Aug 2025 13:16:39 +0000 (15:16 +0200)]
Update the retire interval after full sign
After a full sign we no longer need to take the sign delay into
account. Update the timing checks in keymgr_transition_time to determine
the start of the interval: either the last change, or if SigPublish/
SigDelete is set. The latter case indicates a full sign was done and
so we no longer have to take the sign delay into account.
Matthijs Mekking [Tue, 19 Aug 2025 10:37:29 +0000 (12:37 +0200)]
Force full sign to generate new signatures
When introducing the kasp logic, a full sign of the zone did not
generate new signatures for the new active keys during a ZSK rollover.
The introduced kasp logic ensured that the rollover is performed
smoothly; that is, signatures are only replaced if the old signature
is close to expiring (depending on the signatures-refresh option).
Fix by maintaining a fullsign boolean value in the signing structure,
that will ensure the RRsets are signed with the correct key, rather
than a similar good key.
In case of a fullsign, we can also remove signatures from inactive
keys.
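The difference between the smooth path and a full sign can be sketched as a single predicate. The function name and parameters below are hypothetical and greatly simplified compared with the real signing code; `refresh` stands in for the window derived from the signatures-refresh option.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Decide whether an existing signature should be replaced now. */
static bool
demo_should_resign(bool fullsign, uint32_t now, uint32_t sig_expire,
                   uint32_t refresh) {
    /* A full sign replaces every signature unconditionally. */
    if (fullsign) {
        return true;
    }
    /* Smooth rollover: replace only signatures that have expired or
     * will expire within the refresh window. */
    return sig_expire <= now || sig_expire - now <= refresh;
}
```

Without the `fullsign` flag, a signature far from expiry is left alone, which is the smooth behaviour the rollover normally wants.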
Mark Andrews [Thu, 25 Sep 2025 14:13:38 +0000 (00:13 +1000)]
fix: usr: Use signer name when disabling DNSSEC algorithms
``disable-algorithms`` could cause DNSSEC validation failures when the parent zone was
signed with the algorithms that were being disabled for the child zone.
This has been fixed; ``disable-algorithms`` now works
on a whole-of-zone basis.
If the zone's name is at or below the ``disable-algorithms`` name the algorithm
is disabled for that zone, using deepest match when there are multiple
``disable-algorithms`` clauses.
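A deepest-match lookup of this kind can be sketched on plain strings. The types and names are hypothetical (`demo_disable_t`, `demo_alg_disabled`), algorithms are modelled as a bitmask, and the assumption that only the deepest matching clause applies (rather than a union of all matching clauses) is an interpretation of the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Hypothetical clause: a name (no trailing dot) and the algorithms
 * disabled at and below it. */
typedef struct {
    const char *name;
    unsigned algorithms;
} demo_disable_t;

/* True if `name` is at or below `ancestor` in the DNS tree. */
static bool
demo_at_or_below(const char *name, const char *ancestor) {
    size_t nlen = strlen(name);
    size_t alen = strlen(ancestor);
    if (nlen < alen) {
        return false;
    }
    if (strcasecmp(name + nlen - alen, ancestor) != 0) {
        return false;
    }
    return nlen == alen || name[nlen - alen - 1] == '.';
}

/* Look up by signer (zone) name and use the deepest matching clause. */
static bool
demo_alg_disabled(const demo_disable_t *clauses, size_t n,
                  const char *signer, unsigned alg) {
    const demo_disable_t *best = NULL;
    for (size_t i = 0; i < n; i++) {
        if (!demo_at_or_below(signer, clauses[i].name)) {
            continue;
        }
        /* Among ancestors of one name, the longer one is deeper. */
        if (best == NULL || strlen(clauses[i].name) > strlen(best->name)) {
            best = &clauses[i];
        }
    }
    return best != NULL && (best->algorithms & alg) != 0;
}
```

Because the lookup key is the signer name, records in a child zone are judged against the child's clause even when the parent zone uses a different set of algorithms.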
Closes #5165
Merge branch '5165-use-signer-name-when-disabling-dnssec-algorithms' into 'main'
Mark Andrews [Thu, 7 Aug 2025 04:37:33 +0000 (14:37 +1000)]
Use signer name when disabling DNSSEC algorithms
When disabling algorithms, use the signer name to determine if the
algorithm is disabled or not. This allows for algorithms to be
cleanly disabled on a zone-level basis. Previously, when just using the
record's owner name, "disable-algorithms" could impact resolution of
names that were not disabled. This does now mean that
"disable-algorithms" cannot be used to disable part of a zone anymore.
Colin Vidal [Wed, 24 Sep 2025 18:57:38 +0000 (20:57 +0200)]
chg: dev: rename cfg_aclconfctx_t variables to aclctx
ACL configuration context variables are inconsistently named as `actx`,
`ac`, or `aclconfctx`, which caused confusion during code reviews. This
commit renames all `cfg_aclconfctx_t` variables to `aclctx`, which is
short, consistent, and unambiguous.
Colin Vidal [Wed, 24 Sep 2025 09:10:29 +0000 (11:10 +0200)]
rename cfg_aclconfctx_t variables to aclctx
ACL configuration context variables are inconsistently named as `actx`,
`ac`, or `aclconfctx`, which caused confusion during code reviews. This
commit renames all `cfg_aclconfctx_t` variables to `aclctx`, which is
short, consistent, and unambiguous.
new: usr: Add dnssec-policy keys configuration check to named-checkconf
A new option `-k` is added to `named-checkconf` that allows checking the `dnssec-policy` `keys` configuration against the configured key stores. If the found key files are not in sync with the given `dnssec-policy`, the check will fail.
This is useful to run before migrating to `dnssec-policy`.
Closes #5486
Merge branch '5486-named-checkconf-dnssec-policy-key-directory' into 'main'
The DST_ALGORITHM_FORMATSIZE constant is unused. It could be used in
dst_kasp_key_format, but instead we will use DNS_NAME_FORMATSIZE
because it is used in other places too. Clean up the unused constant.
Matthijs Mekking [Thu, 28 Aug 2025 12:48:07 +0000 (14:48 +0200)]
Change checkconf to include built-in dnssec-policy
The configuration should also take into account the built-in
DNSSEC policies when verifying the keys in the key-directory match the
given policy. Update the code accordingly and add some good and
failure test cases.
Matthijs Mekking [Thu, 28 Aug 2025 08:28:02 +0000 (10:28 +0200)]
Implement named-checkconf -k (check keys)
With named-checkconf -k you can check your configuration including
checking the dnssec-policy keys against the configured keystores. If
there is a mismatch in the key files versus the policy, named-checkconf
will fail. This is useful for running before migrating to dnssec-policy.
For logging purposes, introduce a function that writes the identifying
information about a policy key into a string.
Allow a dnssec key to be initialized outside the keymgr code.
Add 'log_errors' to 'cfg_kasp_fromconfig' to avoid duplicate error
logs.
There's currently an issue with the shotgun workflow that's being
investigated. Until it's resolved, there's no point in creating the
shotgun jobs as they'll just fail.
Merge branch 'nicki/ci-temporarily-disable-shotgun-jobs' into 'main'
chg: dev: Add option to compile named with static linking and LTO
Statically linking lib{isc,dns,ns,cfg,isccc} and enabling LTO shows improvements of over 10% on almost all measurements in perflab. That said, we can't use Meson's option for LTO since it would result in every binary being compiled with LTO and a great increase in compile time.
To work around it, we add a configuration option that enables LTO and static linking only for the `named` binary.
Add named-lto option to meson to build named with LTO
Enabling LTO yields substantial performance gains on both authoritative
and resolver benchmarks.
But since LTO defers many optimization passes to link time, enabling LTO
across the board would cause an increase in compilation time, as passes
that would be run only once would need to be run for each executable.
As a compromise, this commit adds a named-lto build option that
compiles the individual object files with the -ffat-lto-objects option
and then enables LTO only for the named executable. Object files are
reused between lib*.so and the named executable.
Enabling LTO in the subsequent commit requires the file names to be
unique, and having the same probes.d in each of the libraries breaks this
requirement. Rename probes.d to probes-{isc,dns,ns}.d and adjust
the includes.
Colin Vidal [Wed, 24 Sep 2025 09:46:38 +0000 (11:46 +0200)]
chg: dev: refactor view creation/configuration loops in dedicated functions
Refactor part of `apply_configuration` by extracting, into dedicated functions, the logic that builds the keystores list and the KASP list, and that creates and configures the views/zones. This is the next step after MR !10895 and !10901.
While the code was extracted, some global variables were changed into function parameters, which gives a clear view of each function's dependencies, typically whether it depends on a local configuration object or on a runtime "production" object. The end goal (not in this MR, but later on) is to move as much initialization logic as possible outside of exclusive mode.
As a first step, the latest commits move the keystores list, KASP list and view/zone creation outside of exclusive mode. (The view/zone configuration remains in exclusive mode for now, because of a dependency on the runtime "cachelist". This is the target of a future MR.)
For the record: while moving the keystores list, KASP list and view/zone creation doesn't have a significant impact on the time spent in exclusive mode (from my experiment on an instance with 1M small zones), moving `configure_views` did have a _massive_ impact (basically, the time spent in exclusive mode then becomes too small to measure). Configuring views outside exclusive mode needs more work, which will be done in future MRs.
See #4673
Merge branch 'colin/refactor-applyconfig' into 'main'
Colin Vidal [Wed, 10 Sep 2025 13:17:11 +0000 (15:17 +0200)]
apply_configuration: log subroutines for tests
In order to have a (minimal) test ensuring that `apply_configuration`
subroutines which can run before the exclusive lock is taken are not
moved back under it, the `APPLY_CONFIGURATION_SUBROUTINE_LOG` macro is
added and used for the few subroutines already extracted from exclusive
mode. The expected logs are added to the `configloading` system test checks.
Colin Vidal [Tue, 9 Sep 2025 13:41:17 +0000 (15:41 +0200)]
creation of client TLS ctx before exclusive mode
When the server is configured (inside `apply_configuration`) a client
TLS context cache is created and attached to the global server object.
It is then used by the `configure_view` flow (and also at runtime through
the zone manager).
It is now created before the exclusive mode, and the swap of the
previous TLS cache ctx is done at the end of the exclusive mode, if
everything went well.
This allows us (among other follow-up changes) to move the
`configure_views` function outside of the exclusive mode.
Colin Vidal [Mon, 8 Sep 2025 13:57:47 +0000 (15:57 +0200)]
move creation of keystores, kasp list and view outside of exclusive mode
The keystores initialization, the KASP list initialization and
the initialization of the views no longer depend on any data shared with
running "production" objects during reconfiguration of the server. This
allows us to move them before the point where exclusive mode is taken.
Colin Vidal [Mon, 8 Sep 2025 12:58:47 +0000 (14:58 +0200)]
cfg_aclconfctx_t object is part of named_server
`named_g_actconfctx` is a global variable keeping the ACL configuration
context alive (in particular, to dynamically load zones). However, this
object is built once per configuration (early) and is used only inside
the server.c `apply_configuration` flow. (Two exceptions: the shutdown
flow, still in server.c, and the plugin check flow, which doesn't need
it, so it's NULL in that case.)
Instead of leaving this global publicly exposed, it is now part of the
`named_server_t` object. This makes it clear that, when
reconfiguring the server, the new instance of the ACL context is known
only by the newly built objects and is not currently used by "production"
objects; it will also help to move more logic before exclusive mode is
taken.
The other advantage is that the ACL configuration context can now be
built before the exclusive lock as well.
Colin Vidal [Thu, 28 Aug 2025 15:29:23 +0000 (17:29 +0200)]
apply_configuration: add configure_kasplist
The kasplist (dnssec-policy defined in the builtin and global
configuration options) was built inside apply_configuration. This
commit extracts this logic into its separate function.
In order to make the view configuration independent of the global
`server` object, the newly built kasplist is now passed as parameter.
(This eventually will help to be able to configure the views outside of
the exclusive mode by limiting its dependency to the global
`server`/`named_g_server`).
Colin Vidal [Tue, 26 Aug 2025 11:19:10 +0000 (13:19 +0200)]
apply_configuration: remove builtin_viewlist
When creating/configuring the views, the user-defined views are built and
put into the viewlist, then the builtin views into the builtin_viewlist.
But there is no separate logic applied to those two lists, and they are
immediately merged into viewlist right after. This commit removes this
intermediate list and adds builtin views directly to the main viewlist
instead.
Colin Vidal [Tue, 26 Aug 2025 10:35:56 +0000 (12:35 +0200)]
refactor view creation/config in apply_configuration
In order to help split apply_configuration, the inline loops for view
creation and configuration, and the bits of logic around them, are each
moved into a dedicated function.
chg: dev: Use lock-free hashtable for storing resolver fetch contexts
Replace the locked hashmap with the lock-free hashtable from the RCU
library and protect the fetch contexts against reuse by replacing the
libisc reference counting with urcu_ref that can soft-fail in situations
where the reference count is already zero. This allows us to easily
skip re-using the fetch context if it is already in the process of being
destroyed.
Merge branch 'ondrej/use-urcu-lfht-for-resolver-tables' into 'main'
Use lock-free hashtable for storing resolver fetch contexts
Previously, the fetch contexts were stored inside an rwlocked hashmap.
This was one of the most contended places in the resolver,
especially in the cold-cache situation.
Replace the locked hashmap with the lock-free hashtable from the RCU
library and protect the fetch contexts against reuse by replacing the
libisc reference counting with urcu_ref that can soft-fail in situations
where the reference count is already zero. This allows us to easily
skip re-using the fetch context if it is already in the process of being
destroyed.
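The "soft-fail" acquire can be illustrated with C11 atomics. This is a stand-in for the idea, not the urcu_ref implementation or the resolver code; `demo_ref_t`, `demo_ref_get_unless_zero` and `demo_ref_put` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_long refs;
} demo_ref_t;

/* Take a reference only if the count is still non-zero. Returns false
 * when the object is already on its way to destruction, so the caller
 * can skip it and create a fresh one instead. */
static bool
demo_ref_get_unless_zero(demo_ref_t *r) {
    long cur = atomic_load(&r->refs);
    while (cur != 0) {
        /* On failure, `cur` is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&r->refs, &cur, cur + 1)) {
            return true;
        }
    }
    return false;
}

/* Drop a reference. Returns true if this was the last one. */
static bool
demo_ref_put(demo_ref_t *r) {
    return atomic_fetch_sub(&r->refs, 1) == 1;
}
```

A lookup in a lock-free table can find an entry whose last reference was just dropped; a plain increment would resurrect it, whereas the conditional acquire fails cleanly.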
chg: dev: Add a circular reference between slabtops for type and RRSIG(type)
Previously, the slabtops for "type" and its signature were only loosely
coupled and the headers could expire at different times (both TTL-based
and LRU-based expiry). Add a .related member to the slabtop that allows
us to expire the headers in both related slabtops together and also
optimize the lookups, because now both slabtops are looked up at the
same time.
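A minimal sketch of the circular link, using a hypothetical `demo_slabtop_t` that keeps only the fields needed for the illustration (the real slabtop carries much more state, and expiry acts on headers rather than on a flag):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct demo_slabtop demo_slabtop_t;
struct demo_slabtop {
    unsigned type;           /* covered RR type */
    bool is_sig;             /* true for the RRSIG(type) slabtop */
    bool expired;
    demo_slabtop_t *related; /* the other half of the pair, or NULL */
};

/* Link the data slabtop and its signature slabtop to each other. */
static void
demo_slabtop_link(demo_slabtop_t *data, demo_slabtop_t *sig) {
    assert(!data->is_sig && sig->is_sig && data->type == sig->type);
    data->related = sig;
    sig->related = data;
}

/* Expiring either half also expires the other one. */
static void
demo_slabtop_expire(demo_slabtop_t *top) {
    top->expired = true;
    if (top->related != NULL) {
        top->related->expired = true;
    }
}
```

Finding one slabtop now gives direct access to the other, so a lookup for "type" does not need a second search for RRSIG(type).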
Closes #3396
Merge branch '3396-bind-rrsigs-to-records' into 'main'
Previously, the slabtops for "type" and its signature were only loosely
coupled and the headers could expire at different times (both TTL-based
and LRU-based expiry). This commit expires the headers in both related
slabtops together.
Add a circular reference between slabtops for type and RRSIG(type)
Previously, the slabtops for "type" and its signature were only loosely
coupled. Add a .related member to the slabtop that allows us to
optimize the lookups, because now both slabtops are looked up at the
same time.
There was a pattern where first the header was checked for NULL
and then for being stale. In both cases the code path is the same
so it makes sense to put them in a separate function.
chg: dev: Convert slabtop and slabheader to use the cds list
This is the first MR in a series that aims to reduce node locking
by replacing the singly-linked lists of slabtop(s) and slabheader(s)
with CDS linked lists. This commit doesn't do anything else beyond
replacing the .next and .down links with cds_list_head. The RCU
semantics will be added later.
Merge branch 'ondrej/use-rcu-list-for-slabtop' into 'main'
This is the second commit in a series that aims to reduce node locking
by replacing the singly-linked list of slabheader(s) with a CDS linked list.
This commit doesn't do anything else beyond replacing the .next link with
cds_list_head. RCU semantics will be added in subsequent
commits.
This is the first commit in a series that aims to reduce node locking
by replacing the singly-linked list of slabtop(s) with a CDS linked list.
This commit doesn't do anything else beyond replacing the .next link with
cds_list_head. RCU semantics will be added in subsequent
commits.
fix: dev: Fix datarace between unlocking fctx lock and shuttingdown fctx
There was a data race where a new fetch response could be added to the
fetch context after we unlock the fetch context and before we shut it
down. This could cause an assertion failure when fctx__done() was called
with ISC_R_SUCCESS because there was originally no fetch response, but a
new fetch response without an associated dataset was added before we had
a chance to shut down the fetch context. This manifested in the
validated() callback, where cache_rrset() now returns ISC_R_SUCCESS
instead of DNS_R_UNCHANGED when the cache was not changed. However, the
data race was wrong on a general level.
Add a new argument to fctx__done() that allows it to be called with
fctx->lock already acquired, to prevent these data races.
Closes #5507
Merge branch '5507-dont-release-fctx-lock-on-done' into 'main'
Fix datarace between unlocking fctx lock and shuttingdown fctx
There was a data race where a new fetch response could be added to the
fetch context after we unlock the fetch context and before we shut it
down. This could cause an assertion failure when fctx__done() was called
with ISC_R_SUCCESS because there was originally no fetch response, but a
new fetch response without an associated dataset was added before we had
a chance to shut down the fetch context. This manifested in the
validated() callback, where cache_rrset() now returns ISC_R_SUCCESS
instead of DNS_R_UNCHANGED when the cache was not changed. However, the
data race was wrong on a general level.
When fctx__done() is called with ISC_R_SUCCESS as the result, it expects
fctx->lock to be already acquired, to prevent these data races.
Split the fctx_done() into success and failure variants
The split will allow us to call fctx__done() with fctx->lock acquired
when it is called with ISC_R_SUCCESS, to prevent data races when finishing
the fetch context.
chg: ci: Only run relevant CI jobs based on the changes
Trigger selected CI jobs on MR automatically only if there are related
code changes. Otherwise, offer an option to run the jobs manually in
MRs. For other sources, like schedules, tags etc., execute the jobs as
usual.
Merge branch 'nicki/ci-restrict-rules-changes' into 'main'
Colin Vidal [Wed, 17 Sep 2025 15:38:54 +0000 (17:38 +0200)]
fix: usr: preserve cache when reload fails and reload the server again
Fixes an issue where a failed server reconfiguration/reload would prevent the view caches from being preserved on the subsequent server reconfiguration/reload.
Colin Vidal [Tue, 16 Sep 2025 15:14:33 +0000 (17:14 +0200)]
preserve cache when reload fails
If the server is reloaded, new views are created and the preexisting
cache is attached to them, _but_ something goes wrong later, the previous
views are restored but the previous cache list is destroyed. This makes
the subsequent reload drop the existing cache. Fix it by
avoiding mutation of the old cache list.
Colin Vidal [Tue, 16 Sep 2025 15:14:46 +0000 (17:14 +0200)]
test that cache is preserved on reconfig failure
A bug in named scrapped the cache on a second reload after an initial
reload failure. Add a test checking that the cache is preserved between
server reconfigurations/reloads even if one fails at some point (after
attempting to re-use the cache) and the server is reloaded later.
The dns_qpcache already had all the namespace changes needed to put the
normal data and auxiliary NSEC data into a single tree. Remove the
extra nsec QP trie and use the single QP trie for all the cache data.
Merge branch 'ondrej/use-qp-namespace-in-cache' into 'main'
As we removed the ability to count nodes in the auxiliary trees (because
there are no auxiliary trees), we can also clean up the API and
the associated enum type (dns_dbtree_t).
Remove the dbiterator_{last,prev} from the qpcache
The dbiterator_{last,prev} functions are not used in the cache, and the
implementation would get quite complicated when we squash the main and
nsec trees together. It's easier to just not implement these.