Michal Nowak [Wed, 25 Mar 2026 12:31:49 +0000 (13:31 +0100)]
Set RUNNER_SCRIPT_TIMEOUTs
Sometimes jobs can get stuck and be terminated by GitLab, leaving us
without artefacts that could contain useful information about why the
job got stuck.
Michał Kępień [Thu, 7 May 2026 16:15:02 +0000 (18:15 +0200)]
[9.16] chg: ci: Mark merged security fixes as "Not released yet"
Adjust the triggering rules for the "merged-metadata" CI job so that
merge requests merged into security-* branches are automatically
assigned to the "Not released yet" milestone, just like merge requests
targeting public branches. This enables merge requests containing
security fixes to be correctly processed by release automation scripts.
Backport of MR !11984
Merge branch 'backport-pspacek/extend-not-released-yet-milestone-9.16' into 'bind-9.16'
Petr Špaček [Tue, 5 May 2026 13:04:36 +0000 (15:04 +0200)]
Mark merged security fixes as "Not released yet"
Adjust the triggering rules for the "merged-metadata" CI job so that
merge requests merged into security-* branches are automatically
assigned to the "Not released yet" milestone, just like merge requests
targeting public branches. This enables merge requests containing
security fixes to be correctly processed by release automation scripts.
When a catalog zone member's primaries definition was processed and
a TXT record contained an invalid TSIG key name, dns_name_free was
incorrectly called, triggering an assertion failure.
This has been fixed.
Closes #5858
Backport of MR !11832
Merge branch 'backport-5858-remove-unnecessary-dns-name-free-call-9.16' into 'bind-9.16'
Mark Andrews [Fri, 10 Apr 2026 03:07:26 +0000 (13:07 +1000)]
Remove unnecessary dns_name_free call
When a catalog zone member's primaries definition was processed and
a TXT record contained an invalid TSIG key name, dns_name_free was
incorrectly called, triggering an assertion failure.
This has been fixed.
Michał Kępień [Wed, 25 Mar 2026 09:15:37 +0000 (10:15 +0100)]
[9.16] [CVE-2026-1519] sec: usr: Fix unbounded NSEC3 iterations when validating referrals to unsigned delegations
DNSSEC-signed zones may contain high iteration-count NSEC3 records,
which prove that certain delegations are insecure. Previously, a
validating resolver encountering such a delegation processed these
iterations up to the number given, which could be a maximum of 65,535.
This has been addressed by introducing a processing limit, set at 150.
Now, if such an NSEC3 record is encountered, the delegation will be
treated as insecure.
ISC would like to thank Samy Medjahed/Ap4sh for bringing this
vulnerability to our attention.
Closes isc-projects/bind9#5708
Backport of MR !935
Merge branch '5708-confidential-nsec3-delegation-iteration-fix-fallback-to-insecure-9.16' into 'bind-9.16-release'
In many places we only create a validator if the RRset has too low
trust (the RRset is pending validation, or could not be validated
before). This check was missing prior to validating negative response
data.
When looking up an NSEC3 as part of an insecurity proof, check the
number of iterations. If this is too high, treat the answer as insecure
by marking the answer with trust level "answer", indicating that they
did not validate, but could be cached as insecure.
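The iteration check described above can be sketched roughly as follows. This is only an illustration: the function and constant names are hypothetical, and the assumption that a count of exactly 150 is still processed is not confirmed by the commit message. Only the limit of 150 and the 65,535 maximum come from the text.

```python
# Hypothetical names; only the values 150 and 65535 come from the text above.
NSEC3_MAX_ITERATIONS = 150
NSEC3_FIELD_MAX = 65535  # the iterations field is 16 bits wide


def classify_insecurity_proof(nsec3_iterations: int) -> str:
    """Decide how an NSEC3 record used in an insecurity proof is handled."""
    if not 0 <= nsec3_iterations <= NSEC3_FIELD_MAX:
        raise ValueError("NSEC3 iterations must fit in 16 bits")
    if nsec3_iterations > NSEC3_MAX_ITERATIONS:
        # Assumption: anything above the limit skips hashing entirely and
        # the delegation is treated as insecure (cacheable, not validated).
        return "insecure"
    return "validate"
```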
Nicki Křížek [Fri, 21 Nov 2025 14:05:36 +0000 (15:05 +0100)]
Increase the threshold for respdiff-third-party
There are multiple reasons for the increased number of differences we've
been seeing lately and for raising the threshold:
1. Recent hardening against cache poisoning (CVE-2025-40778) has
uncovered a few edge cases where the domain can't be properly
resolved with the new protections in place, but those are issues with
upstream configuration and DNS setup.
2. The same hardening magnified some behaviour differences between 9.21
and older versions. Some misconfigured domains, which can be resolved
with BIND 9.20 and older are no longer resolvable in 9.21+. This can
be again attributed to upstream DNS misconfiguration. See #5649.
3. A change in the respdiff CI job: the timeouts were increased so that
queries which previously timed out, and which are typically failures,
now get resolved. With the previous job configuration, those queries
were omitted from the comparison because they timed out. Now, there
should be no timeouts, but there is a slight increase in the number
of differences counted for the threshold evaluation.
Michał Kępień [Wed, 22 Oct 2025 16:45:05 +0000 (18:45 +0200)]
[9.16] [CVE-2025-40780] sec: usr: Cache-poisoning due to weak pseudo-random number generator
It was discovered during research for an upcoming academic paper that a
xoshiro128\*\* internal state can be recovered by an external 3rd party,
allowing the prediction of UDP ports and DNS IDs in outgoing queries.
This could lead to an attacker spoofing the DNS answers with great
efficiency and poisoning the DNS cache.
The internal random generator has been changed to a cryptographically
secure pseudo-random generator.
ISC would like to thank Prof. Amit Klein and Omer Ben Simhon from Hebrew
University of Jerusalem for bringing this vulnerability to our
attention.
Backport of !831
Closes isc-projects/bind9#5484
Merge branch '5484-security-make-isc_random-csprng-9.16' into 'bind-9.16-release'
Ondřej Surý [Tue, 19 Aug 2025 17:22:18 +0000 (19:22 +0200)]
Use cryptographically-secure pseudo-random generator everywhere
It was discovered in an upcoming academic paper that a xoshiro128**
internal state can be recovered by an external 3rd party, allowing the
prediction of UDP ports and DNS IDs in outgoing queries. This could lead
to an attacker spoofing the DNS answers with great efficiency and
poisoning the DNS cache.
Change the internal random generator to the system CSPRNG with buffering
to avoid excessive syscalls.
Thanks Omer Ben Simhon and Amit Klein of Hebrew University of Jerusalem
for responsibly reporting this to us. Very cool research!
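The "system CSPRNG with buffering" idea from this commit can be illustrated with a small sketch. It is not the isc_random implementation; the class name, method names, and buffer size are hypothetical. It only shows how one larger read from the operating system's generator can serve many small requests.

```python
import os
import struct


class BufferedRandom:
    """Serve small random values from a buffer refilled via os.urandom()."""

    def __init__(self, buffer_size: int = 4096) -> None:
        # buffer_size is an arbitrary illustrative value.
        self._buffer_size = buffer_size
        self._buffer = b""
        self._offset = 0

    def _take(self, count: int) -> bytes:
        if count > self._buffer_size:
            raise ValueError("request larger than the buffer")
        if self._offset + count > len(self._buffer):
            # One call to the system CSPRNG refills the whole buffer,
            # instead of one call per requested value.
            self._buffer = os.urandom(self._buffer_size)
            self._offset = 0
        chunk = self._buffer[self._offset:self._offset + count]
        self._offset += count
        return chunk

    def uint16(self) -> int:
        """Return a value suitable for a DNS ID or a port offset."""
        return struct.unpack("!H", self._take(2))[0]
```

Leftover bytes at the end of a buffer are simply discarded on refill, which keeps the logic simple at the cost of a few wasted bytes.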
Michał Kępień [Wed, 22 Oct 2025 16:41:51 +0000 (18:41 +0200)]
[9.16] [CVE-2025-40778] sec: usr: Address various spoofing attacks
Previously, several issues could be exploited to poison a DNS cache with
spoofed records for zones which were not DNSSEC-signed or if the
resolver was configured to not do DNSSEC validation. These issues were
assigned CVE-2025-40778 and have now been fixed.
As an additional layer of protection, :iscman:`named` no longer accepts
DNAME records or extraneous NS records in the AUTHORITY section unless
these are received via spoofing-resistant transport (TCP, UDP with DNS
cookies, TSIG, or SIG(0)).
ISC would like to thank Yuxiao Wu, Yunyi Zhang, Baojun Liu, and Haixin
Duan from Tsinghua University for bringing this vulnerability to our
attention.
Backport of !838
Closes isc-projects/bind9#5414
Merge branch '5414-security-check-name-vs-qname-again-9.16' into 'bind-9.16-release'
To prevent spoofed unsigned DNAME responses from being accepted, retry
responses with unsigned DNAMEs over TCP if the response is not TSIG
signed or there isn't a good DNS CLIENT COOKIE.
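The decision described above might be sketched like this. All names are hypothetical and the real resolver logic is considerably more involved. The list of spoofing-resistant transports (TCP, UDP with DNS cookies, TSIG, SIG(0)) is taken from the release note earlier in this log.

```python
from dataclasses import dataclass


@dataclass
class ResponseContext:
    """Hypothetical summary of how a response was received."""
    over_tcp: bool = False
    valid_cookie: bool = False
    tsig_signed: bool = False
    sig0_signed: bool = False


def is_spoofing_resistant(ctx: ResponseContext) -> bool:
    return (ctx.over_tcp or ctx.valid_cookie
            or ctx.tsig_signed or ctx.sig0_signed)


def action_for_unsigned_dname(ctx: ResponseContext) -> str:
    """Accept the DNAME only over a spoofing-resistant transport."""
    if is_spoofing_resistant(ctx):
        return "accept"
    # Plain UDP without a good cookie or signature: ask again over TCP.
    return "retry-over-tcp"
```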
To prevent test failures, this required adding TCP support to the
ans3 and ans4 servers in the chain system test.
Further restrict addresses that are cached when processing referrals
Use the owner name of the NS record as the bailiwick apex name
when determining which additional records to cache, rather than
the name of the delegating zone (or a parent thereof).
Tighten restrictions on caching NS RRsets in authority section
To prevent certain spoofing attacks, a new check has been added
to the existing rules for whether NS data can be cached: the owner
name of the NS RRset must be an ancestor of the name being queried.
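The ancestor rule can be sketched with a label-wise suffix comparison. The helper names are hypothetical, names are handled as plain text without escapes, and the sketch assumes that a name counts as an ancestor of itself, which the commit message does not spell out.

```python
def labels(name: str) -> list[str]:
    """Split a textual DNS name into lowercase labels (no escape handling)."""
    return [label.lower() for label in name.rstrip(".").split(".") if label]


def is_ancestor(owner: str, qname: str) -> bool:
    """True if owner is at or above qname in the DNS tree."""
    owner_labels, qname_labels = labels(owner), labels(qname)
    if len(owner_labels) > len(qname_labels):
        return False
    # Compare whole labels, not characters, so "ample.com" is not
    # mistaken for an ancestor of "example.com".
    start = len(qname_labels) - len(owner_labels)
    return qname_labels[start:] == owner_labels


def may_cache_ns(ns_owner: str, qname: str) -> bool:
    return is_ancestor(ns_owner, qname)
```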
Michal Nowak [Wed, 19 Mar 2025 13:02:32 +0000 (14:02 +0100)]
Set more lenient respdiff limits
After !9950, respdiff's maximal disagreement percentage needs to be
adjusted as target disagreements between the tested version of the
"main" branch and the reference one jumped for the respdiff,
respdiff:asan, and respdiff:tsan jobs from on average 0.07% to 0.16% and
from 0.12% to 0.17% for the respdiff-third-party job.
In !9950, we concluded setting MAX_DISAGREEMENTS_PERCENTAGE to double
the average disagreement percentage works fine in the CI.
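As a worked example of the "double the average" rule, the following sketch applies it to the averages quoted above. The function name is hypothetical, and the values actually committed to the CI configuration may differ.

```python
def max_disagreements_percentage(average_percent: float) -> float:
    """Apply the rule of thumb: threshold = 2 x observed average."""
    return round(2 * average_percent, 2)


# Averages quoted in the commit message after !9950.
respdiff_limit = max_disagreements_percentage(0.16)      # 0.32
third_party_limit = max_disagreements_percentage(0.17)   # 0.34
```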
Michal Nowak [Wed, 1 Oct 2025 08:36:49 +0000 (10:36 +0200)]
Drop gcc:sid:amd64 jobs
They fail to build with GCC 15.2.0:
rdata/in_1/wks_11.c: In function 'totext_in_wks':
rdata/in_1/wks_11.c:238:77: error: '%u' directive output may be truncated writing between 1 and 10 bytes into a region of size 6 [-Werror=format-truncation=]
238 | snprintf(buf, sizeof(buf), "%u",
| ^~
rdata/in_1/wks_11.c:238:76: note: directive argument in the range [1, 4294967295]
238 | snprintf(buf, sizeof(buf), "%u",
| ^~~~
rdata/in_1/wks_11.c:238:49: note: 'snprintf' output between 2 and 11 bytes into a destination of size 6
238 | snprintf(buf, sizeof(buf), "%u",
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
239 | i * 8 + j);
| ~~~~~~~~~~
nolibtool:sid:amd64 keeps building, so keep it.
Move -O3 and --without-lmdb to gcc:noble:amd64.
Move RUN_MAKE_INSTALL=1 to gcc:trixie:amd64 as Sphinx is needed to build
man pages that are checked for in util/check-make-install.
Petr Špaček [Wed, 28 May 2025 13:46:14 +0000 (15:46 +0200)]
Run CI danger job even if user canceled it while it was running
Limitation: The after_script is not executed if the job did not start at
all, i.e. if the user canceled the job before it got onto a runner.
See https://gitlab.com/groups/gitlab-org/-/epics/10158
[9.16] [9.18] fix: dev: Unify the int32_t vs int_fast32_t when working with atomic types
There's a mismatch between the atomic and non-atomic types that could
potentially lead to a rwlock deadlock after two billion (2^32) writes.
Use int_fast32_t when loading the atomic_int_fast32_t types in the
isc_rwlock unit.
Closes #5280
Backport of MR !10390
Merge branch 'backport-5280-match-the-types-in-isc_rwlock-9.18-9.16' into 'bind-9.16'
Unify the int32_t vs int_fast32_t when working with atomic types
There's a mismatch between the atomic and non-atomic types that could
potentially lead to a rwlock deadlock after two billion (2^32) writes.
Use int_fast32_t when loading the atomic_int_fast32_t types in the
isc_rwlock unit.
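The effect of such a type mismatch can be illustrated outside of C. The sketch below assumes a platform where int_fast32_t is 64 bits wide (as is common on 64-bit Linux) and uses ctypes to mimic storing a counter value into a plain int32_t. The helper names are hypothetical and this is not the isc_rwlock code.

```python
import ctypes


def load_as_int32(value: int) -> int:
    """Mimic assigning a wide counter to an int32_t variable."""
    return ctypes.c_int32(value).value


def load_as_int_fast32(value: int) -> int:
    """Mimic int_fast32_t, assumed here to be 64 bits wide."""
    return ctypes.c_int64(value).value


counter = 2**31  # first value that no longer fits in int32_t
narrow = load_as_int32(counter)
wide = load_as_int_fast32(counter)
```

Once the counter passes the 32-bit range, the two views disagree, so comparisons between an atomically held value and its truncated copy can stop matching.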
[9.16] new: ci: Allow pushing branches and tags to customer git repos
For pipelines in the private repository, add an optional manual job,
which allows the current branch to be pushed into the specified
customer's git repository. This can be useful to provide patch previews
for early testing.
For tags created in a private repository, add a manual job which pushes
the created tag to all entitled customers.
Backport of MR !10323
Merge branch 'backport-nicki/ci-customer-git-automation-9.16' into 'bind-9.16'
Nicki Křížek [Tue, 25 Mar 2025 15:51:24 +0000 (16:51 +0100)]
Allow pushing branches and tags to customer git repos
For pipelines in the private repository, add an optional manual job,
which allows the current branch to be pushed into the specified
customer's git repository. This can be useful to provide patch previews
for early testing.
For tags created in a private repository, add a manual job which pushes
the created tag to all entitled customers.
Michal Nowak [Tue, 11 Mar 2025 12:42:59 +0000 (12:42 +0000)]
Drop FreeBSD and OpenBSD from CI
Both FreeBSD and OpenBSD in the CI are tested on outdated images.
Current FreeBSD images can't even be rebuilt because in the maintained
branches they were ported from QCOW2 to the AWS autoscaler (which is
also the future of the OpenBSD image). This is something we don't want to
backport to EoL branches.
Merge branch 'mnowak/drop-bsd-images-from-ci' into 'bind-9.16'
Michal Nowak [Tue, 11 Mar 2025 09:56:18 +0000 (10:56 +0100)]
Drop FreeBSD and OpenBSD from CI
Both FreeBSD and OpenBSD in the CI are tested on outdated images.
Current FreeBSD images can't even be rebuilt because in the maintained
branches they were ported from QCOW2 to the AWS autoscaler (which is
also the future of the OpenBSD image). This is something we don't want to
backport to EoL branches.
Petr Špaček [Thu, 30 Jan 2025 10:24:59 +0000 (11:24 +0100)]
Do not trigger post-merge jobs for cross-project pushes
We need to avoid double-triggering of post-merge jobs in the following
scenario:
1. A private MR gets merged into the private BIND 9 repository.
2. This merge operation triggers a "push" pipeline in the private
repository, which correctly runs post-merge jobs, e.g. to set MR
metadata in the private project.
3. When a release is published, a script is run to change the
automatically assigned milestone value ("Not released yet") to
something else.
4. Shortly afterwards, the result of the merge from step 1 is merged
back into a maintenance branch in the public repository.
5. The push operation triggers another "push" pipeline, this time in
the public project.
At this point there are two problems:
- If the script is dumb (like it currently is), it will extract the
merge request ID from the merge commit description and change the
milestone for a merge request in the wrong project namespace.
- Even if the script was fixed to extract and use the correct GitLab
project reference, it would reset the milestone for the merge
request in the private repository back to "Not released yet" - while
the milestone set in step 3 should be retained.
An alternative would be to change the order of operations so that
post-release milestoning happens at a later stage, while also fixing the
script to correctly follow cross-project references, but that approach
seems more fragile than simply failing on all cross-project pushes. The
rule to enforce is: each project should only take care of its own
post-merge tasks.
Michał Kępień [Fri, 31 Jan 2025 09:37:54 +0000 (09:37 +0000)]
[9.16] chg: ci: Use default cloning depth for the Danger CI job
With shallow fetching working reliably in pygit2 1.17.0+, there is no
longer any need for GitLab CI runners to clone the BIND 9 repository
with a fixed depth of 1000 during every "danger" CI job as Hazard is now
able to fetch remote refs with an arbitrary depth, controlled by the
HAZARD_FETCH_DEPTH environment variable. The latter can be defined via
GitLab project's CI settings and adjusted as needed over time, without
the need to update .gitlab-ci.yml every time its value needs to be
changed.
Backport of MR !9946
Merge branch 'backport-michal/use-default-cloning-depth-for-the-danger-ci-job-9.16' into 'bind-9.16'
Michał Kępień [Fri, 31 Jan 2025 09:25:56 +0000 (10:25 +0100)]
Use default cloning depth for the Danger CI job
With shallow fetching working reliably in pygit2 1.17.0+, there is no
longer any need for GitLab CI runners to clone the BIND 9 repository
with a fixed depth of 1000 during every "danger" CI job as Hazard is now
able to fetch remote refs with an arbitrary depth, controlled by the
HAZARD_FETCH_DEPTH environment variable. The latter can be defined via
GitLab project's CI settings and adjusted as needed over time, without
the need to update .gitlab-ci.yml every time its value needs to be
changed.
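A tool consuming such a variable could read it along these lines. This is a guess at the general pattern, not Hazard's actual code: the function name and the fallback value are hypothetical, and only the variable name HAZARD_FETCH_DEPTH comes from the commit message.

```python
import os
from typing import Mapping

DEFAULT_FETCH_DEPTH = 1000  # hypothetical fallback, not taken from Hazard


def fetch_depth(environ: Mapping[str, str] = os.environ) -> int:
    """Return the fetch depth configured via HAZARD_FETCH_DEPTH."""
    raw = environ.get("HAZARD_FETCH_DEPTH", "").strip()
    if not raw:
        return DEFAULT_FETCH_DEPTH
    depth = int(raw)  # raises ValueError for non-numeric input
    if depth <= 0:
        raise ValueError("HAZARD_FETCH_DEPTH must be a positive integer")
    return depth
```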
Nicki Křížek [Mon, 20 Jan 2025 16:17:08 +0000 (16:17 +0000)]
[9.16] [CVE-2024-11187] sec: usr: Limit the additional processing for large RDATA sets
When answering queries, don't add data to the additional section if the answer has more than 13 names in the RDATA. This limits the number of lookups into the database(s) during a single client query, reducing query processing load.
Backport of MR !750
See isc-projects/bind9#5034
Merge branch '5034-security-limit-additional-9.16' into 'bind-9.16-release'
Ondřej Surý [Thu, 14 Nov 2024 09:37:29 +0000 (10:37 +0100)]
Limit the additional processing for large RDATA sets
When answering queries, don't add data to the additional section if
the answer has more than 13 names in the RDATA. This limits the
number of lookups into the database(s) during a single client query,
reducing query processing load.
Also, don't append any additional data to type=ANY queries. The
answer to ANY is already big enough.
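The two rules in this commit can be summarized in a short sketch. The names are hypothetical, and the real check lives in the server's query processing code. Only the limit of 13 names and the ANY exclusion come from the text.

```python
MAX_RDATA_NAMES = 13  # limit quoted in the commit message


def should_add_additional(qtype: str, rdata_name_count: int) -> bool:
    """Decide whether additional-section processing should run."""
    if qtype.upper() == "ANY":
        # ANY answers are already large; never append additional data.
        return False
    # More than 13 names would mean too many extra database lookups.
    return rdata_name_count <= MAX_RDATA_NAMES
```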
Ondřej Surý [Tue, 7 Jan 2025 14:22:40 +0000 (15:22 +0100)]
Isolate using the -T noaa flag only for part of the resolver test
Instead of running the whole resolver/ns4 server with -T noaa flag,
use it only for the part where it is actually needed. The -T noaa
could interfere with other parts of the test because the answers don't
have the authoritative-answer bit set, and we could have false
positives (or false negatives) in the test because the authoritative
server doesn't follow the DNS protocol for all the tests in the resolver
system test.
Arаm Sаrgsyаn [Wed, 8 Jan 2025 12:39:51 +0000 (12:39 +0000)]
[9.16] fix: dev: Fix a bug in isc_rwlock_trylock()
When isc_rwlock_trylock() fails to get a read lock because another
writer was faster, it should wake up other waiting writers in case
there are no other readers, but the current code forgets about
the currently active writer when evaluating 'cntflag'.
Unset the WRITER_ACTIVE bit in 'cntflag' before checking to see if
there are other readers, otherwise the waiting writers, if they exist,
might not wake up.
Closes #5121
Backport of MR !9937
Merge branch 'backport-aram/isc_rwlock_trylock-bugfix-9.18-9.16' into 'bind-9.16'
Aram Sargsyan [Tue, 7 Jan 2025 13:30:26 +0000 (13:30 +0000)]
Fix a bug in isc_rwlock_trylock()
When isc_rwlock_trylock() fails to get a read lock because another
writer was faster, it should wake up other waiting writers in case
there are no other readers, but the current code forgets about
the currently active writer when evaluating 'cntflag'.
Unset the WRITER_ACTIVE bit in 'cntflag' before checking to see if
there are other readers, otherwise the waiting writers, if they exist,
might not wake up.
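The bit manipulation at the heart of this fix can be sketched as below. The constant values and function names are assumptions made for illustration; the actual logic is C code operating on an atomic counter. Here 'cntflag' stands for the counter value observed when the failed read attempt is backed out, which still contains the caller's own reader increment.

```python
# Assumed bit layout: lowest bit flags an active writer, and each reader
# adds READER_INCR to the counter.
WRITER_ACTIVE = 0x1
READER_INCR = 0x2


def should_wake_writers_buggy(cntflag: int, writers_waiting: bool) -> bool:
    # The writer bit is still set, so this comparison never matches
    # while a writer is active.
    return writers_waiting and cntflag == READER_INCR


def should_wake_writers_fixed(cntflag: int, writers_waiting: bool) -> bool:
    # Clear the active-writer bit first, then check whether the caller
    # was the only reader counted.
    cntflag &= ~WRITER_ACTIVE
    return writers_waiting and cntflag == READER_INCR
```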
Enforcing pylint standards and defaults for our test code seems
counter-productive. Since most of the newly added code is tests or
test-related, encountering these checks rarely makes us refactor the code
in other ways and we just disable these checks individually. Code that
is too complex or convoluted will be pointed out in reviews anyway.
Nicki Křížek [Mon, 14 Oct 2024 12:44:06 +0000 (14:44 +0200)]
Disable too-many/too-few pylint checks
Enforcing pylint standards and defaults for our test code seems
counter-productive. Since most of the newly added code is tests or
test-related, encountering these checks rarely makes us refactor the code
in other ways and we just disable these checks individually. Code that
is too complex or convoluted will be pointed out in reviews anyway.
Nicki Křížek [Tue, 15 Oct 2024 08:03:25 +0000 (10:03 +0200)]
Support dnspython 2.7.0
CookieOption with new .server/.client attributes (rather than .data) was
added to dnspython. Adjust the code to use the new attributes if
available and fall back to the old code for dnspython<2.7.0
compatibility.
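The fallback pattern described here might look roughly like the following. The helper name is hypothetical, and the stand-in objects only imitate the attributes mentioned in the commit message so that the sketch runs without dnspython installed. The 8-byte client cookie length follows the DNS Cookies specification.

```python
from types import SimpleNamespace

CLIENT_COOKIE_LEN = 8  # client cookie length per the DNS Cookies spec


def split_cookie(option) -> tuple[bytes, bytes]:
    """Return (client_cookie, server_cookie) from a COOKIE option object."""
    if hasattr(option, "client") and hasattr(option, "server"):
        # Newer dnspython: dedicated attributes on the option.
        return bytes(option.client), bytes(option.server)
    # Older dnspython: raw option payload in .data.
    data = bytes(option.data)
    return data[:CLIENT_COOKIE_LEN], data[CLIENT_COOKIE_LEN:]


# Stand-ins for the two option shapes (not real dnspython classes).
new_style = SimpleNamespace(client=b"A" * 8, server=b"B" * 16)
old_style = SimpleNamespace(data=b"A" * 8 + b"B" * 16)
```

Checking for the attributes rather than the library version keeps the code working even if a distribution backports the new option class.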
Michal Nowak [Tue, 10 Sep 2024 12:44:51 +0000 (12:44 +0000)]
[9.16] chg: test: Be more patient when stopping servers in the system tests
When the TCP test is run on a busy server, the server might take a
while to wind down because it might still be processing all those
300k invalid XFR requests.
Increase the rndc wait time to 120 seconds, the SIGTERM time to 300
seconds, and reduce the time to wait for ans servers from 1200 seconds
to just 120 seconds.
Be more patient when stopping servers in the system tests
When the TCP test is run on a busy server, the server might take a
while to wind down because it might still be processing all those
300k invalid XFR requests.
Increase the rndc wait time to 120 seconds, the SIGTERM time to 300
seconds, and reduce the time to wait for ans servers from 1200 seconds
to just 120 seconds.
Michal Nowak [Wed, 28 Aug 2024 07:26:13 +0000 (07:26 +0000)]
chg: test: Bump max-recursion-queries to 100 in resolver system test
With max-recursion-queries set to 50 the resolver system test was
unstable in the "checking query resolution for a domain with a valid
glueless delegation chain" check as ns1 replied with SERVFAIL.
Closes #4897
Merge branch '4897-resolver-ns1-bump-max-recursion-queries-to-100' into 'bind-9.16'
Michal Nowak [Mon, 26 Aug 2024 15:56:56 +0000 (17:56 +0200)]
Bump max-recursion-queries to 100 in resolver system test
With max-recursion-queries set to 50 the resolver system test was
unstable in the "checking query resolution for a domain with a valid
glueless delegation chain" check as ns1 replied with SERVFAIL.
Michal Nowak [Mon, 26 Aug 2024 15:25:32 +0000 (15:25 +0000)]
[9.16] chg: ci: Drop removed system tests from cross-version-config-tests
The cross-version-config-tests job fails when a system test is removed
from the upcoming release. To avoid this, remove the system test also
from the $BIND_BASELINE_VERSION.
See the failure mode at https://gitlab.isc.org/isc-projects/bind9/-/jobs/4668947.
Backport of MR !9413
Merge branch 'backport-mnowak/remove-dialup-from-cross-version-config-tests-job-9.16' into 'bind-9.16'
Michal Nowak [Mon, 26 Aug 2024 11:41:47 +0000 (13:41 +0200)]
Drop removed system tests from $BIND_BASELINE_VERSION
The cross-version-config-tests job fails when a system test is removed
from the upcoming release. To avoid this, remove the system test also
from the $BIND_BASELINE_VERSION.