Michał Kępień [Tue, 21 Nov 2023 09:18:52 +0000 (10:18 +0100)]
Move job definitions to the proper place
The definitions of the "ci-variables" and "cross-version-config-tests"
GitLab CI jobs were accidentally added in the .gitlab-ci.yml section
that claims to only contain job templates. Move the definitions of
these two jobs to a more appropriate location in .gitlab-ci.yml, without
changing the job definitions themselves.
Michał Kępień [Tue, 21 Nov 2023 09:18:52 +0000 (10:18 +0100)]
Drop the TARBALL_EXTENSION variable
All currently supported BIND 9 branches use xz-packed tarballs for
source code distribution. Having a variable with a lengthy name that
only holds two characters does not improve readability - it was only
useful for maintaining .gitlab-ci.yml consistency between BIND 9.11 and
all the newer branches, but that era came to an end a while ago.
Replace all occurrences of the TARBALL_EXTENSION variable in
.gitlab-ci.yml with a fixed string ("xz") to simplify the contents of
that file.
Evan Hunt [Sat, 11 Nov 2023 21:15:27 +0000 (13:15 -0800)]
set loadtime during initial transfer of a secondary zone
when transferring in a non-inline-signing secondary for the first time,
we previously never set the value of zone->loadtime, so it remained
zero. this caused a test failure in the statschannel system test,
and that test case was temporarily disabled. the value is now set
correctly and the test case has been reinstated.
Mark Andrews [Thu, 16 Nov 2023 00:15:49 +0000 (11:15 +1100)]
Check the buffer length in dns_message_renderbegin
The maximum DNS message size is 65535 octets. Check that the buffer
being passed to dns_message_renderbegin does not exceed this as the
compression code assumes that all offsets are no bigger than this.
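The check described above can be sketched as follows. This is an illustration in Python, not the C code in dns_message_renderbegin; the helper name and the use of ValueError are assumptions of the sketch.

```python
# Illustrative sketch only; not BIND 9's C implementation.
DNS_MAX_MESSAGE_SIZE = 65535  # maximum DNS message size in octets


def check_render_buffer(buffer_length: int) -> None:
    """Reject buffers larger than the maximum DNS message size.

    Hypothetical helper: raising ValueError is an assumption of this
    sketch, chosen only to make the failure visible.
    """
    if buffer_length > DNS_MAX_MESSAGE_SIZE:
        raise ValueError(
            f"buffer of {buffer_length} octets exceeds {DNS_MAX_MESSAGE_SIZE}"
        )
```

Rejecting oversized buffers up front means code that records offsets into the message never has to handle a value above 65535.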
Ondřej Surý [Mon, 21 Aug 2023 15:53:15 +0000 (17:53 +0200)]
Split the CPU architectures into more categories
Move i386 and other less common or ancient CPU architectures to
Community-Maintained category. Move armhf and arm64 to the Best-Effort
category: we do test them as part of development work (new MacBooks
are all arm64), but we don't run the full set of tests in the CI.
Mark Andrews [Tue, 17 Oct 2023 23:45:41 +0000 (10:45 +1100)]
Suppress reporting upcoming changes in root hints
To reduce the amount of log spam when root servers change their
addresses keep a table of upcoming changes by expected date and time
and suppress reporting differences for them until then.
Add initial entry for B.ROOT-SERVERS.NET, Nov 27, 2023.
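A minimal sketch of such a suppression table, in Python rather than the C used by named. The table layout, the function name, the placeholder addresses (taken from the documentation ranges, not real root server addresses) and the midnight UTC cutoff are all assumptions made for illustration.

```python
from datetime import datetime, timezone

# Hypothetical table: (server name, address) -> time the change is expected.
# Addresses are documentation placeholders, not real root server addresses;
# the midnight UTC cutoff is also an assumption of this sketch.
UPCOMING_CHANGES = {
    ("B.ROOT-SERVERS.NET", "192.0.2.1"): datetime(2023, 11, 27, tzinfo=timezone.utc),
    ("B.ROOT-SERVERS.NET", "2001:db8::1"): datetime(2023, 11, 27, tzinfo=timezone.utc),
}


def should_report(name: str, address: str, now: datetime) -> bool:
    """Return True if a root hints difference should be logged."""
    expected = UPCOMING_CHANGES.get((name.upper(), address))
    if expected is None:
        return True  # not a known upcoming change: always report
    return now >= expected  # known change: stay quiet until the expected time
```

Differences that are not in the table are still reported immediately, so only the announced changes are silenced, and only until their expected date.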
Michał Kępień [Thu, 2 Nov 2023 06:23:38 +0000 (07:23 +0100)]
Add a release signing job to GitLab CI
Add a GitLab CI job that is only run for tags and makes signing BIND 9
releases more convenient by utilizing a signing VM that is registered as
a GitLab CI runner. This pulls the signing process into the release
pipelines in GitLab CI, resulting in job artifacts containing the
signatures for BIND 9 releases, which in turn simplifies the subsequent
release publication steps.
Michał Kępień [Wed, 1 Nov 2023 17:04:07 +0000 (18:04 +0100)]
Improve stability of the jemalloc workaround
When jemalloc is linked into BIND 9 binaries (rather than preloaded or
used as the system allocator), depending on the decisions made by the
linker, the malloc() symbol may be resolved to a non-jemalloc
implementation at runtime. Such a scenario foils the workaround added
in commit 2da371d005c472dea349110e3ef9a6ed7b18b824 as it relies on the
jemalloc implementation of malloc() to be executed.
Handle the above scenario properly by calling mallocx() explicitly
instead of relying on the runtime resolution of the malloc() symbol.
Use trivial wrapper functions to avoid the need to copy multiple #ifdef
lines from lib/isc/mem.c to lib/isc/trampoline.c. Using a simpler
alternative, e.g. calling isc_mem_create() & isc_mem_destroy(), was
already considered and rejected, as described in the log message
for commit 2da371d005c472dea349110e3ef9a6ed7b18b824.
ADJUST_ZERO_ALLOCATION_SIZE() is only used in isc__mem_free_noctx() to
concisely avoid compilation warnings about its 'size' parameter not
being used when building against jemalloc < 4.0.0 (as sdallocx() is then
redefined to dallocx(), which has a different signature).
Tom Krizek [Wed, 27 Sep 2023 13:48:31 +0000 (15:48 +0200)]
ci: trigger a DNS Shotgun performance test
Run comparative performance tests against the latest released version of
the same branch. This is done for different protocols with an
appropriate load the server is expected to be able to handle.
Currently, the results need to be inspected manually, since a success of
the job doesn't indicate there is no issue. Instead, the job provides a
URL to an overview with latency, memory and CPU charts which display the
test results with the current code against the reference version. There
should be no major unexplained and reproducible differences in the
charts.
Tom Krizek [Wed, 27 Sep 2023 15:41:26 +0000 (17:41 +0200)]
util: script to get DNS Shotgun pipeline results
The shotgun performance tests are executed in a different repository, in
a couple of different pipelines. To hide away the complexity, this
script takes the pipeline ID of the triggered pipeline and then takes
care of the rest - waits for the pipeline to finish, locates the child
pipeline and the relevant results. The output from this script is a
convenient link to the charts with the results once they're available.
GitLab also has a mechanism which can wait for another pipeline.
However, it can't be utilized here, since there are variables which
need to be passed in when the pipeline is triggered (like protocol to be
tested, load, runtime etc.). This isn't currently supported by the
GitLab feature.
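The waiting step described above can be sketched as a simple polling loop. This is not the script added by this commit: get_status is a hypothetical callable that returns the pipeline status string, how it queries GitLab is left out, and the set of final states and the default intervals are assumptions.

```python
import time

# Pipeline states treated as final in this sketch (assumed set).
FINISHED = {"success", "failed", "canceled", "skipped"}


def wait_for_pipeline(get_status, interval=30.0, timeout=3600.0, sleep=time.sleep):
    """Poll get_status() until it returns a finished state.

    get_status is a hypothetical callable returning the pipeline status
    string; the actual GitLab API query is deliberately not shown.
    """
    waited = 0.0
    while True:
        status = get_status()
        if status in FINISHED:
            return status
        if waited >= timeout:
            raise TimeoutError(f"pipeline still {status!r} after {waited:.0f}s")
        sleep(interval)
        waited += interval
```

Passing the sleep function in as a parameter keeps the loop easy to exercise without real delays.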
Tom Krizek [Wed, 27 Sep 2023 13:26:10 +0000 (15:26 +0200)]
ci: move baseline version detection into separate job
Multiple CI jobs may utilize a baseline version, i.e. the version that
the current code should be tested against when doing comparative
testing. To avoid repeating the non-trivial detection of the baseline
version, move it into a separate job which creates an environment file
that subsequent jobs may require via the `needs` option. It is then possible
to use the variable(s) defined in the script section of the new job.
Matthijs Mekking [Mon, 30 Oct 2023 18:33:19 +0000 (19:33 +0100)]
Don't ignore auth zones when in serve-stale mode
When serve-stale is enabled and recursive resolution fails, the fallback
to look up stale data always happens in the cache database. Any
authoritative data is ignored, and only information learned through
recursive resolution is examined.
If the cache holds data that could lead to an answer, even if that is
just the root delegation, the resolver will iterate further, recursing
down from the root to get closer to the answer, and eventually puts the
final response in the cache.
Change the fallback to serve-stale to use 'query_getdb()', which finds
the best matching database for the given query.
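The idea of picking the best matching database can be sketched as below. This is a simplified Python illustration, not query_getdb() itself, which handles more cases; the function name, the auth_zones set and the "cache" placeholder are hypothetical.

```python
def find_best_database(qname, auth_zones, cache="cache"):
    """Pick the database to search for qname.

    auth_zones is a hypothetical set of locally served zone names. The
    longest zone name that is a suffix of qname on a label boundary wins;
    the cache is used only when no local zone matches.
    """
    labels = qname.rstrip(".").lower().split(".")
    zones = {zone.rstrip(".").lower() for zone in auth_zones}
    # Strip one leading label at a time, so the longest match is tried first.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in zones:
            return candidate
    return cache
```

With this selection, a query for a name under a locally served zone is answered from that zone instead of falling through to stale cache data.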
Matthijs Mekking [Mon, 23 Oct 2023 11:52:12 +0000 (13:52 +0200)]
Test case for issue #4355
Add a test case where serve-stale is enabled on a server that also
serves a local authoritative zone.
The particular case tests a lame delegation and checks that falling
back to serving stale data does not attempt to resolve the query
by recursing down from the root.
Aram Sargsyan [Thu, 26 Oct 2023 12:28:25 +0000 (12:28 +0000)]
Do not warn about lock-file option change when -X is used
When -X is used the 'lock-file' option change detection condition
is invalid, because it compares the 'lock-file' option's value to
the '-X' argument's value instead of the older 'lock-file' option
value (which was ignored because of '-X').
Don't warn about changing 'lock-file' option if '-X' is used.
Aram Sargsyan [Thu, 26 Oct 2023 12:24:17 +0000 (12:24 +0000)]
Fix an invalid condition check when detecting a lock-file change
It is obvious that the '!cfg_obj_asstring(obj)' check should be
'cfg_obj_asstring(obj)' instead, because it is an AND logic chain
which further uses 'obj' as a string.
Aram Sargsyan [Thu, 26 Oct 2023 12:21:57 +0000 (12:21 +0000)]
Fix assertion failure when using -X none and lock-file in configuration
When 'lock-file <lockfile>' is used in configuration at the same time
as using '-X none' in 'named' invocation, there is an invalid
logic that would lead to an isc_mem_strdup() call on a NULL value.
Also, contrary to the ARM, 'lock-file none' is overriding the '-X'
argument.
Fix the overall logic, and make sure that '-X' takes precedence over
'lock-file'.
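The intended precedence can be sketched as a small decision function. This is a Python illustration rather than the C logic in named; the function name and the 'default' parameter, which stands in for whatever built-in default applies and is not modelled here, are assumptions.

```python
def effective_lock_file(x_arg, config_value, default=None):
    """Return the lock file path to use, or None if locking is disabled.

    x_arg: value given with -X on the command line, or None if -X is absent.
    config_value: value of 'lock-file' in the configuration, or None if unset.
    default: hypothetical stand-in for the built-in default (not modelled).
    """
    if x_arg is not None:
        chosen = x_arg  # -X always wins over the configuration
    elif config_value is not None:
        chosen = config_value
    else:
        chosen = default
    # 'none' disables the lock file regardless of where it came from.
    return None if chosen == "none" else chosen
```

In this sketch, '-X none' combined with a configured 'lock-file' yields no lock file, and 'lock-file none' does not override a path given with '-X'.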
Ondřej Surý [Thu, 26 Oct 2023 08:54:28 +0000 (10:54 +0200)]
Fix assertion failure when using -X and lock-file in configuration
When 'lock-file <lockfile1>' was used in configuration at the same time
as using `-X <lockfile2>` in `named` invocation, there was an invalid
logic that would lead to a double isc_mem_strdup() call on the
<lockfile2> value.
Skip the second allocation if `lock-file` is being used in
configuration, so that <lockfile2> is used only once.
Mark Andrews [Thu, 26 Oct 2023 04:07:58 +0000 (15:07 +1100)]
Check that the lock file was not removed too early
When named fails to start because it cannot obtain a lock on
the lock file, that lock file should remain. Check that the
lock file exists before and after the attempt to start a
second instance of named.
Mark Andrews [Thu, 26 Oct 2023 03:50:43 +0000 (14:50 +1100)]
Only remove the lock file if we managed to lock it
The lock file was being removed when we hadn't successfully locked
it which defeated the purpose of the lockfile. Adjust cleanup_lockfile
such that it only unlinks the lockfile if we have successfully locked
the lockfile and it is still active (lockfile != NULL).
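The rule can be sketched as follows. This is a Python illustration using flock() on a POSIX system, not the cleanup_lockfile code in named; the LockFile class and its method names are hypothetical.

```python
import fcntl
import os


class LockFile:
    """Hypothetical helper that remembers whether it holds the lock."""

    def __init__(self, path):
        self.path = path
        self.fd = None  # set only after the lock was obtained

    def acquire(self):
        fd = os.open(self.path, os.O_RDWR | os.O_CREAT, 0o644)
        try:
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            os.close(fd)  # someone else holds the lock; leave the file alone
            return False
        self.fd = fd
        return True

    def cleanup(self):
        if self.fd is None:
            return  # never locked it, so the file is not ours to remove
        os.unlink(self.path)
        os.close(self.fd)
        self.fd = None
```

An instance that failed to lock the file leaves it in place on cleanup, so the instance that does hold the lock keeps its protection against duplicates.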
Aram Sargsyan [Fri, 20 Oct 2023 10:45:35 +0000 (10:45 +0000)]
Fix shutdown races in catzs
The dns__catz_update_cb() does not expect that 'catzs->zones'
can become NULL during shutdown.
Add similar checks in the dns__catz_update_cb() and dns_catz_zone_get()
functions to protect from such a case. Also add an INSIST in the
dns_catz_zone_add() function to explicitly state that such a case
is not expected there, because that function is called only during a
reconfiguration.
Mark Andrews [Wed, 16 Aug 2023 04:40:12 +0000 (14:40 +1000)]
Adjust UDP timeouts used in zone maintenance
Reduce the timeout before resending a UDP request from 15 seconds to 5
seconds and add 1 second to the total time to allow for the reply
to the third request to arrive. This will speed up the time it
takes for named to recover from a lost packet when refreshing a
zone and for it to determine that a primary is down.
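The timing can be worked through with a small sketch. It assumes that three requests are sent one resend timeout apart and that the overall deadline is three resend timeouts plus the extra second; both are assumptions made for illustration and are not taken from the BIND 9 source.

```python
def retry_schedule(udp_timeout=5, requests=3, grace=1):
    """Return (send_times, total_timeout) in seconds.

    Assumes requests are sent udp_timeout seconds apart and that the
    overall deadline is requests * udp_timeout + grace. Both are
    assumptions of this sketch, not taken from the BIND 9 source.
    """
    send_times = [i * udp_timeout for i in range(requests)]
    total_timeout = requests * udp_timeout + grace
    return send_times, total_timeout
```

Under these assumptions a lost first packet is retried after 5 seconds instead of 15, and the extra second leaves room for the reply to the last request before the overall deadline.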