Petr Špaček [Wed, 28 May 2025 13:46:14 +0000 (15:46 +0200)]
Run CI danger job even if user canceled it while it was running
Limitation: The after_script is not executed if the job did not start at
all, i.e. if the user canceled the job before it got onto a runner.
See https://gitlab.com/groups/gitlab-org/-/epics/10158
[9.16] [9.18] fix: dev: Unify the int32_t vs int_fast32_t when working with atomic types
There's a mismatch between the atomic and non-atomic types that could
potentially lead to a rwlock deadlock (after two billion 2^32) writes.
Use int_fast32_t when loading the atomic_int_fast32_t types in the
isc_rwlock unit.
Closes #5280
Backport of MR !10390
Merge branch 'backport-5280-match-the-types-in-isc_rwlock-9.18-9.16' into 'bind-9.16'
Unify the int32_t vs int_fast32_t when working with atomic types
There's a mismatch between the atomic and non-atomic types that could
potentially lead to a rwlock deadlock (after two billion 2^32) writes.
Use int_fast32_t when loading the atomic_int_fast32_t types in the
isc_rwlock unit.
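A minimal sketch of why the type mismatch matters, written in Python with ctypes rather than in the C of isc_rwlock. It assumes int_fast32_t is 64 bits wide on the platform (the width is implementation-defined), and the helper names are hypothetical, not part of BIND 9:

```python
import ctypes


def load_as_int32(value):
    """Mimic storing a wider atomic value in an int32_t variable (truncates)."""
    return ctypes.c_int32(value).value


def load_as_int_fast32(value):
    """Mimic storing it in an int_fast32_t, ASSUMED here to be 64 bits wide."""
    return ctypes.c_int64(value).value


counter = 2**31  # first value that no longer fits in int32_t

print(load_as_int32(counter))       # wrapped to a negative number
print(load_as_int_fast32(counter))  # value preserved
```

Once the counter passes the int32_t range, the narrow copy no longer compares equal to the atomic value it was loaded from, which is the kind of disagreement the commit removes.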
[9.16] new: ci: Allow pushing branches and tags to customer git repos
For pipelines in the private repository, add an optional manual job,
which allows the current branch to be pushed into the specified
customer's git repository. This can be useful to provide patch previews
for early testing.
For tags created in a private repository, add a manual job which pushes
the created tag to all entitled customers.
Backport of MR !10323
Merge branch 'backport-nicki/ci-customer-git-automation-9.16' into 'bind-9.16'
Nicki Křížek [Tue, 25 Mar 2025 15:51:24 +0000 (16:51 +0100)]
Allow pushing branches and tags to customer git repos
For pipelines in the private repository, add an optional manual job,
which allows the current branch to be pushed into the specified
customer's git repository. This can be useful to provide patch previews
for early testing.
For tags created in a private repository, add a manual job which pushes
the created tag to all entitled customers.
Michal Nowak [Tue, 11 Mar 2025 12:42:59 +0000 (12:42 +0000)]
Drop FreeBSD and OpenBSD from CI
Both FreeBSD and OpenBSD in the CI are tested on outdated images.
Current FreeBSD images can't even be rebuilt because in the maintained
branches they were ported from QCOW2 to the AWS autoscaler (also the
future of the OpenBSD image). This is something we don't want to
backport to EoL branches.
Merge branch 'mnowak/drop-bsd-images-from-ci' into 'bind-9.16'
Michal Nowak [Tue, 11 Mar 2025 09:56:18 +0000 (10:56 +0100)]
Drop FreeBSD and OpenBSD from CI
Both FreeBSD and OpenBSD in the CI are tested on outdated images.
Current FreeBSD images can't even be rebuilt because in the maintained
branches they were ported from QCOW2 to the AWS autoscaler (also the
future of the OpenBSD image). This is something we don't want to
backport to EoL branches.
Petr Špaček [Thu, 30 Jan 2025 10:24:59 +0000 (11:24 +0100)]
Do not trigger post-merge jobs for cross-project pushes
We need to avoid double-triggering of post-merge jobs in the following
scenario:
1. A private MR gets merged into the private BIND 9 repository.
2. This merge operation triggers a "push" pipeline in the private
repository, which correctly runs post-merge jobs, e.g. to set MR
metadata in the private project.
3. When a release is published, a script is run to change the
automatically assigned milestone value ("Not released yet") to
something else.
4. Shortly afterwards, the result of the merge from step 1 is merged
back into a maintenance branch in the public repository.
5. The push operation triggers another "push" pipeline, this time in
the public project.
At this point there are two problems:
- If the script is dumb (like it currently is), it will extract the
merge request ID from the merge commit description and change the
milestone for a merge request in the wrong project namespace.
- Even if the script was fixed to extract and use the correct GitLab
project reference, it would reset the milestone for the merge
request in the private repository back to "Not released yet" - while
the milestone set in step 3 should be retained.
An alternative would be to change the order of operations so that
post-release milestoning happens at a later stage, while also fixing the
script to correctly follow cross-project references, but that approach
seems more fragile than simply failing on all cross-project pushes. The
rule to enforce is: each project should only take care of its own
post-merge tasks.
Michał Kępień [Fri, 31 Jan 2025 09:37:54 +0000 (09:37 +0000)]
[9.16] chg: ci: Use default cloning depth for the Danger CI job
With shallow fetching working reliably in pygit2 1.17.0+, there is no
longer any need for GitLab CI runners to clone the BIND 9 repository
with a fixed depth of 1000 during every "danger" CI job as Hazard is now
able to fetch remote refs with an arbitrary depth, controlled by the
HAZARD_FETCH_DEPTH environment variable. The latter can be defined via
GitLab project's CI settings and adjusted as needed over time, without
the need to update .gitlab-ci.yml every time its value needs to be
changed.
Backport of MR !9946
Merge branch 'backport-michal/use-default-cloning-depth-for-the-danger-ci-job-9.16' into 'bind-9.16'
Michał Kępień [Fri, 31 Jan 2025 09:25:56 +0000 (10:25 +0100)]
Use default cloning depth for the Danger CI job
With shallow fetching working reliably in pygit2 1.17.0+, there is no
longer any need for GitLab CI runners to clone the BIND 9 repository
with a fixed depth of 1000 during every "danger" CI job as Hazard is now
able to fetch remote refs with an arbitrary depth, controlled by the
HAZARD_FETCH_DEPTH environment variable. The latter can be defined via
GitLab project's CI settings and adjusted as needed over time, without
the need to update .gitlab-ci.yml every time its value needs to be
changed.
Nicki Křížek [Mon, 20 Jan 2025 16:17:08 +0000 (16:17 +0000)]
[9.16] [CVE-2024-11187] sec: usr: Limit the additional processing for large RDATA sets
When answering queries, don't add data to the additional section if the answer has more than 13 names in the RDATA. This limits the number of lookups into the database(s) during a single client query, reducing query processing load.
Backport of MR !750
See isc-projects/bind9#5034
Merge branch '5034-security-limit-additional-9.16' into 'bind-9.16-release'
Ondřej Surý [Thu, 14 Nov 2024 09:37:29 +0000 (10:37 +0100)]
Limit the additional processing for large RDATA sets
When answering queries, don't add data to the additional section if
the answer has more than 13 names in the RDATA. This limits the
number of lookups into the database(s) during a single client query,
reducing query processing load.
Also, don't append any additional data to type=ANY queries. The
answer to ANY is already big enough.
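The rule described above can be sketched as follows. This is an illustration in Python, not the C code from the commit; the function name and the string query types are hypothetical, and only the threshold of 13 and the ANY exception come from the commit message:

```python
MAX_RDATA_NAMES = 13  # threshold quoted in the commit message


def wants_additional(qtype, rdata_names):
    """Decide whether additional-section processing should run.

    qtype       -- query type as a string, e.g. "MX" or "ANY"
    rdata_names -- names found in the RDATA of the answer
    """
    if qtype == "ANY":
        # ANY answers are already large; never append additional data.
        return False
    # More than 13 names in the RDATA: skip the extra database lookups.
    return len(rdata_names) <= MAX_RDATA_NAMES


print(wants_additional("MX", ["mx%d.example." % i for i in range(13)]))
print(wants_additional("MX", ["mx%d.example." % i for i in range(14)]))
```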
Ondřej Surý [Tue, 7 Jan 2025 14:22:40 +0000 (15:22 +0100)]
Isolate using the -T noaa flag only for part of the resolver test
Instead of running the whole resolver/ns4 server with -T noaa flag,
use it only for the part where it is actually needed. The -T noaa
could interfere with other parts of the test because the answers don't
have the authoritative-answer bit set, and we could have false
positives (or false negatives) in the test because the authoritative
server doesn't follow the DNS protocol for all the tests in the resolver
system test.
Arаm Sаrgsyаn [Wed, 8 Jan 2025 12:39:51 +0000 (12:39 +0000)]
[9.16] fix: dev: Fix a bug in isc_rwlock_trylock()
When isc_rwlock_trylock() fails to get a read lock because another
writer was faster, it should wake up other waiting writers in case
there are no other readers, but the current code forgets about
the currently active writer when evaluating 'cntflag'.
Unset the WRITER_ACTIVE bit in 'cntflag' before checking to see if
there are other readers, otherwise the waiting writers, if they exist,
might not wake up.
Closes #5121
Backport of MR !9937
Merge branch 'backport-aram/isc_rwlock_trylock-bugfix-9.18-9.16' into 'bind-9.16'
Aram Sargsyan [Tue, 7 Jan 2025 13:30:26 +0000 (13:30 +0000)]
Fix a bug in isc_rwlock_trylock()
When isc_rwlock_trylock() fails to get a read lock because another
writer was faster, it should wake up other waiting writers in case
there are no other readers, but the current code forgets about
the currently active writer when evaluating 'cntflag'.
Unset the WRITER_ACTIVE bit in 'cntflag' before checking to see if
there are other readers, otherwise the waiting writers, if they exist,
might not wake up.
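A small sketch of the check, in Python rather than C. The bit values of WRITER_ACTIVE and READER_INCR are assumptions made for illustration, and 'cntflag' is taken to be the counter value observed just before the failed read request is withdrawn:

```python
WRITER_ACTIVE = 0x1  # ASSUMED value: flag bit for an active writer
READER_INCR = 0x2    # ASSUMED value: amount added per reader


def no_other_readers_buggy(cntflag):
    # Forgets that the active writer's bit is still set in cntflag.
    return cntflag == READER_INCR


def no_other_readers_fixed(cntflag):
    # Unset the writer bit first, then look only at the reader count.
    cntflag &= ~WRITER_ACTIVE
    return cntflag == READER_INCR


# Our own (failed) read request plus the writer that beat us to the lock.
cntflag = READER_INCR | WRITER_ACTIVE

print(no_other_readers_buggy(cntflag))  # writers are not woken up
print(no_other_readers_fixed(cntflag))  # writers are woken up
```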
Nicki Křížek [Mon, 14 Oct 2024 12:44:06 +0000 (14:44 +0200)]
Disable too-many/too-few pylint checks
Enforcing pylint standards and defaults for our test code seems
counter-productive. Since most of the newly added code is tests or is
test-related, encountering these checks rarely makes us refactor the code
in other ways and we just disable these checks individually. Code that
is too complex or convoluted will be pointed out in reviews anyway.
Nicki Křížek [Tue, 15 Oct 2024 08:03:25 +0000 (10:03 +0200)]
Support dnspython 2.7.0
CookieOption with new .server/.client attributes (rather than .data) was
added to dnspython. Adjust the code to use the new attributes if
available and fall back to the old code for dnspython<2.7.0
compatibility.
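The fallback can be sketched with attribute detection. To stay self-contained, the example does not import dnspython; the two classes below are hypothetical stand-ins for the old and new option objects, and cookie_payload() is an illustrative helper, not the function used in the test code:

```python
class OldStyleCookie:
    """Stand-in for a COOKIE option as seen with dnspython < 2.7.0."""

    def __init__(self, data):
        self.data = data


class NewStyleCookie:
    """Stand-in for the CookieOption added in dnspython 2.7.0."""

    def __init__(self, client, server):
        self.client = client
        self.server = server


def cookie_payload(option):
    """Return the raw cookie bytes regardless of the dnspython version."""
    if hasattr(option, "client") and hasattr(option, "server"):
        # New attributes are available: rebuild the payload from them.
        return option.client + option.server
    # Older dnspython: the whole payload lives in .data.
    return option.data


print(cookie_payload(OldStyleCookie(b"12345678abcdefgh")))
print(cookie_payload(NewStyleCookie(b"12345678", b"abcdefgh")))
```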
Michal Nowak [Tue, 10 Sep 2024 12:44:51 +0000 (12:44 +0000)]
[9.16] chg: test: Be more patient when stopping servers in the system tests
When the TCP test is run on a busy server, the server might take a
while to wind down because it might still be processing all of the
300k invalid XFR requests.
Increase the rndc wait time to 120 seconds, the SIGTERM time to 300
seconds, and reduce the time to wait for ans servers from 1200 seconds
to just 120 seconds.
Be more patient when stopping servers in the system tests
When the TCP test is run on a busy server, the server might take a
while to wind down because it might still be processing all of the
300k invalid XFR requests.
Increase the rndc wait time to 120 seconds, the SIGTERM time to 300
seconds, and reduce the time to wait for ans servers from 1200 seconds
to just 120 seconds.
Michal Nowak [Wed, 28 Aug 2024 07:26:13 +0000 (07:26 +0000)]
chg: test: Bump max-recursion-queries to 100 in resolver system test
With max-recursion-queries set to 50 the resolver system test was
unstable in the "checking query resolution for a domain with a valid
glueless delegation chain" check as ns1 replied with SERVFAIL.
Closes #4897
Merge branch '4897-resolver-ns1-bump-max-recursion-queries-to-100' into 'bind-9.16'
Michal Nowak [Mon, 26 Aug 2024 15:56:56 +0000 (17:56 +0200)]
Bump max-recursion-queries to 100 in resolver system test
With max-recursion-queries set to 50 the resolver system test was
unstable in the "checking query resolution for a domain with a valid
glueless delegation chain" check as ns1 replied with SERVFAIL.
Michal Nowak [Mon, 26 Aug 2024 15:25:32 +0000 (15:25 +0000)]
[9.16] chg: ci: Drop removed system tests from cross-version-config-tests
The cross-version-config-tests job fails when a system test is removed
from the upcoming release. To avoid this, remove the system test also
from the $BIND_BASELINE_VERSION.
See the failure mode at https://gitlab.isc.org/isc-projects/bind9/-/jobs/4668947.
Backport of MR !9413
Merge branch 'backport-mnowak/remove-dialup-from-cross-version-config-tests-job-9.16' into 'bind-9.16'
Michal Nowak [Mon, 26 Aug 2024 11:41:47 +0000 (13:41 +0200)]
Drop removed system tests from $BIND_BASELINE_VERSION
The cross-version-config-tests job fails when a system test is removed
from the upcoming release. To avoid this, remove the system test also
from the $BIND_BASELINE_VERSION.
Petr Špaček [Mon, 5 Aug 2024 08:21:46 +0000 (10:21 +0200)]
Automatically adjust MR metadata after merge
1. Set milestone to 'Not released yet' after merge
We will set milestone to actual version number when we actually tag a
particular version. This will get rid of mass MR reassignment when we
do last minute changes to a release plan etc.
2. Adjust No CHANGES and Release Notes MR labels to match gitchangelog
workflow.
Ondřej Surý [Thu, 22 Aug 2024 09:31:53 +0000 (09:31 +0000)]
[9.16] new: usr: Tighten 'max-recursion-queries' and add 'max-query-restarts' option
There were cases in resolver.c when the `max-recursion-queries` quota was ineffective. It was possible to craft zones that would cause a resolver to waste resources by sending excessive queries while attempting to resolve a name. This has been addressed by correcting errors in the implementation of `max-recursion-queries`, and by reducing the default value from 100 to 32.
In addition, a new `max-query-restarts` option has been added which limits the number of times a recursive server will follow CNAME or DNAME records before terminating resolution. This was previously a hard-coded limit of 16, and now defaults to 11.
Closes #4741
Backport of MR !9281
Merge branch 'backport-4741-reclimit-restarts-9.16' into 'bind-9.16'
Evan Hunt [Wed, 26 Jun 2024 06:49:00 +0000 (23:49 -0700)]
implement 'max-query-restarts'
implement, document, and test the 'max-query-restarts' option
which specifies the query restart limit - the number of times
we can follow CNAMEs before terminating resolution.
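A rough sketch of a restart limit, in Python. The data structure, the function name, and the exact boundary behaviour (whether the 11th or the 12th step fails) are assumptions for illustration; only the default of 11 comes from these commits:

```python
def follow_cnames(cnames, name, max_restarts=11):
    """Follow a CNAME chain, giving up after max_restarts steps.

    cnames -- dict mapping an owner name to its CNAME target
    Returns the final name, or None if the limit was reached.
    """
    restarts = 0
    while name in cnames:
        if restarts >= max_restarts:
            return None  # terminate resolution
        name = cnames[name]
        restarts += 1
    return name


chain = {"n%d." % i: "n%d." % (i + 1) for i in range(11)}
print(follow_cnames(chain, "n0."))                     # chain fits the limit
print(follow_cnames({"a.": "b.", "b.": "a."}, "a."))   # loop is cut off
```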
Evan Hunt [Tue, 25 Jun 2024 21:30:20 +0000 (14:30 -0700)]
make "max_restarts" a configurable value
MAX_RESTARTS is no longer hard-coded; ns_server_setmaxrestarts()
and dns_client_setmaxrestarts() can now be used to modify the
max-restarts value at runtime. in both cases, the default is 11.
Evan Hunt [Tue, 25 Jun 2024 19:28:23 +0000 (12:28 -0700)]
reduce MAX_RESTARTS to 11
the number of steps that can be followed in a CNAME chain
before terminating the lookup has been reduced from 16 to 11.
(this is a hard-coded value, but will be made configurable later.)
Evan Hunt [Wed, 22 May 2024 20:02:16 +0000 (13:02 -0700)]
attach query counter to NS fetches
there were cases in resolver.c when queries for NS records were
started without passing a pointer to the parent fetch's query counter;
as a result, the max-recursion-queries quota for those queries started
counting from zero, instead of sharing the limit for the parent fetch,
making the quota ineffective in some cases.
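The effect of not passing the parent's counter can be sketched like this. The classes are hypothetical Python stand-ins, not the structures in resolver.c, and the limit of 32 is used only as an example value:

```python
class QueryCounter:
    """Counts outgoing queries against a fixed quota."""

    def __init__(self, limit):
        self.limit = limit
        self.used = 0

    def increment(self):
        if self.used >= self.limit:
            return False  # quota exhausted
        self.used += 1
        return True


class Fetch:
    def __init__(self, limit, counter=None):
        # Without a parent counter the fetch starts counting from zero.
        self.counter = counter if counter is not None else QueryCounter(limit)

    def send_queries(self, wanted):
        sent = 0
        for _ in range(wanted):
            if not self.counter.increment():
                break
            sent += 1
        return sent


parent = Fetch(limit=32)
print(parent.send_queries(30))

detached_ns_fetch = Fetch(limit=32)                       # the old behaviour
print(detached_ns_fetch.send_queries(30))

shared_ns_fetch = Fetch(limit=32, counter=parent.counter)  # the fix
print(shared_ns_fetch.send_queries(30))
```

With the shared counter, the NS fetch can only use what is left of the parent's quota.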
Ondřej Surý [Thu, 15 Aug 2024 17:54:58 +0000 (19:54 +0200)]
For TSAN builds, use libraries from /opt/tsan
The new TSAN-enabled images install libraries to /opt/tsan.
Synchronize the configure options and CFLAGS between the gcc:tsan
and clang:tsan images and set PKG_CONFIG_PATH to /opt/tsan/lib.
Ondřej Surý [Wed, 7 Aug 2024 16:02:51 +0000 (16:02 +0000)]
[9.16] chg: test: Use new images with TSAN-enabled libraries
The new Fedora 40 TSAN images use libuv, urcu and OpenSSL libraries compiled with ThreadSanitizer. This (in theory) should enable better detection of memory races in those (most important) libraries.
Backport of MR !9264
Merge branch 'backport-ondrej/test-new-tsan-images-9.16' into 'bind-9.16'
Michał Kępień [Wed, 7 Aug 2024 15:41:57 +0000 (15:41 +0000)]
[9.16] fix: test: refresh base image repos before installing from them
Stale repositories cause issues on installation in the `docs:pdf` CI job:
E: Failed to fetch http://deb.debian.org/debian/pool/main/s/systemd/libsystemd-shared_252.22-1%7edeb12u1_amd64.deb 404 Not Found [IP: 2a04:4e42:78::644 80]
E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?
Michal Nowak [Mon, 8 Jul 2024 12:20:02 +0000 (14:20 +0200)]
Refresh base image repos before installing from them
Stale repositories cause issues on installation in the docs:pdf CI job:
E: Failed to fetch http://deb.debian.org/debian/pool/main/s/systemd/libsystemd-shared_252.22-1%7edeb12u1_amd64.deb 404 Not Found [IP: 2a04:4e42:78::644 80]
E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?
Michał Kępień [Tue, 6 Aug 2024 11:22:38 +0000 (11:22 +0000)]
chg: test: use --without-python for Debian "sid" builds
Debian "sid" images used in GitLab CI no longer contain the distutils
Python module, which prevents the ./configure script from succeeding on
that operating system. Instead of explicitly installing optional
dependencies for a branch that is no longer actively maintained, add the
--without-python switch to the ./configure invocations on Debian "sid"
to work around the problem.
Merge branch 'michal/disable-python-for-debian-sid-builds' into 'bind-9.16'
Michał Kępień [Tue, 6 Aug 2024 07:34:53 +0000 (09:34 +0200)]
chg: test: use --without-python for Debian "sid" builds
Debian "sid" images used in GitLab CI no longer contain the distutils
Python module, which prevents the ./configure script from succeeding on
that operating system. Instead of explicitly installing optional
dependencies for a branch that is no longer actively maintained, add the
--without-python switch to the ./configure invocations on Debian "sid"
to work around the problem.
Nicki Křížek [Mon, 5 Aug 2024 15:55:46 +0000 (15:55 +0000)]
[9.16] chg: Remove danger checks for release notes and CHANGES
Since 9.21.0-dev, the release notes and changelog process has been
changed. Backports to the EoL branch are no longer expected to have
either CHANGES or release notes, as we aren't going to release any more
versions anyway.
Related #75
Merge branch '75-gitchangelog-9.16' into 'bind-9.16'
Remove danger checks for release notes and CHANGES
Since 9.21.0-dev, the release notes and changelog process has been
changed. Backports to the EoL branch are no longer expected to have
either CHANGES or release notes, as we aren't going to release any more
versions anyway.
Ondřej Surý [Mon, 5 Aug 2024 10:32:36 +0000 (10:32 +0000)]
[9.16] fix: usr: Add a compatibility shim for older libuv versions (< 1.19.0)
The uv_stream_get_write_queue_size() function is supported only in relatively new versions of libuv (1.19.0 or higher). Provide a compatibility shim for this function, so BIND 9 can be built in environments with older libuv versions.
Fixes: #4822
Backport of MR !9153
Merge branch 'backport-uv_stream_get_write_queue_size_wrapper-9.16' into 'bind-9.16'
Ondřej Surý [Mon, 5 Aug 2024 09:11:51 +0000 (09:11 +0000)]
fix: dev: Pull the doc/misc/options{,.active} from the CI
The doc/misc/options{,.active} were built on a system with a different
configuration than we have in the CI, so the docs job just keeps
failing. Pull the files from the CI, so they match what we have
in the CI images.
Merge branch 'ondrej/make-docs-job-happy-9.16' into 'bind-9.16'
Ondřej Surý [Mon, 5 Aug 2024 08:18:31 +0000 (10:18 +0200)]
Pull the doc/misc/options{,.active} from the CI
The doc/misc/options{,.active} were built on a system with a different
configuration than we have in the CI, so the docs job just keeps
failing. Pull the files from the CI, so they match what we have
in the CI images.
Ondřej Surý [Mon, 5 Aug 2024 08:38:57 +0000 (08:38 +0000)]
[9.16] fix: test: Use LC_ALL to override all system locales
The system tests were overriding the local locale by setting LANG to C.
This does not override the locale in case there are individual LC_<*>
variables like LC_CTYPE explicitly set.
Use LC_ALL=C instead which is the proper way of overriding all currently
set locales.
Backport of MR !9109
Merge branch 'backport-ondrej/use-LC_ALL-not-LANG-9.16' into 'bind-9.16'
Ondřej Surý [Tue, 18 Jun 2024 06:56:18 +0000 (08:56 +0200)]
Use LC_ALL to override all system locales
The system tests were overriding the local locale by setting LANG to C.
This does not override the locale in case there are individual LC_<*>
variables like LC_CTYPE explicitly set.
Use LC_ALL=C instead which is the proper way of overriding all currently
set locales.
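The precedence behind this change can be sketched in a few lines of Python. The helper is hypothetical and only models the POSIX lookup order (LC_ALL, then the category variable, then LANG); the final fallback to "C" is an assumption, since the default is implementation-defined:

```python
def effective_locale(env, category):
    """Return the locale that applies to one category, e.g. "LC_CTYPE".

    env -- mapping of environment variables
    """
    for var in ("LC_ALL", category, "LANG"):
        value = env.get(var)
        if value:  # unset or empty variables are skipped
            return value
    return "C"  # ASSUMED default when nothing is set


env = {"LANG": "C", "LC_CTYPE": "en_US.UTF-8"}
print(effective_locale(env, "LC_CTYPE"))  # LANG=C did not override LC_CTYPE

env["LC_ALL"] = "C"
print(effective_locale(env, "LC_CTYPE"))  # LC_ALL overrides everything
```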
Ondřej Surý [Wed, 2 Mar 2022 10:48:26 +0000 (11:48 +0100)]
Add the ability to specify the signing / verification time
When fuzzing it is useful for all signing operations to happen
at a specific time for reproducibility. Add two variables to
the message structure (fuzzing and fuzztime) to specify if a
fixed time should be used and the value of that time.
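A minimal sketch of the idea, in Python rather than C. The Message class and signing_time() are hypothetical stand-ins; only the two field names, fuzzing and fuzztime, come from the commit message:

```python
import time
from dataclasses import dataclass


@dataclass
class Message:
    fuzzing: bool = False  # use a fixed time instead of the clock?
    fuzztime: int = 0      # the fixed time, in seconds since the epoch


def signing_time(msg):
    """Pick the timestamp used for signing or verification."""
    if msg.fuzzing:
        return msg.fuzztime  # reproducible across fuzzer runs
    return int(time.time())  # normal operation: current time


print(signing_time(Message(fuzzing=True, fuzztime=1646218106)))
```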