Ondřej Surý [Mon, 19 Aug 2024 11:23:13 +0000 (11:23 +0000)]
[9.18] fix: dev: Check the result of dirfd() before calling unlinkat()
Instead of directly using the result of dirfd() in the unlinkat() call,
check whether the returned file descriptor is actually valid. That
doesn't really change the logic as the unlinkat() would fail with
invalid descriptor anyway, but this is cleaner and will report the right
error returned directly by dirfd() instead of EBADF from unlinkat().
Closes #4853
Backport of MR !9316
Merge branch 'backport-4853-check-result-of-dirfd-in-isc_log-9.18' into 'bind-9.18'
Ondřej Surý [Thu, 15 Aug 2024 07:23:31 +0000 (09:23 +0200)]
Check the result of dirfd() before calling unlinkat()
Instead of directly using the result of dirfd() in the unlinkat() call,
check whether the returned file descriptor is actually valid. That
doesn't really change the logic as the unlinkat() would fail with
invalid descriptor anyway, but this is cleaner and will report the right
error returned directly by dirfd() instead of EBADF from unlinkat().
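A minimal sketch of the pattern described above, assuming a POSIX.1-2008 system. The helper name remove_in_dir() is hypothetical and is not the BIND function; it only shows how checking dirfd() first lets the caller see the errno from dirfd() rather than EBADF from unlinkat().

```c
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from BIND): remove the entry `name`
 * from the directory opened as `dir`.  Returns 0 on success, or the
 * errno value of the call that actually failed.
 */
static int
remove_in_dir(DIR *dir, const char *name) {
    int fd = dirfd(dir);
    if (fd < 0) {
        /* Report the error from dirfd() itself. */
        return errno;
    }
    if (unlinkat(fd, name, 0) < 0) {
        return errno;
    }
    return 0;
}
```

A caller would pass the DIR * obtained from opendir() and treat any non-zero return value as an errno code.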
Ondřej Surý [Thu, 15 Aug 2024 07:12:18 +0000 (09:12 +0200)]
Ignore errno returned from rewind() in the interface iterator
The clang-scan 19 has reported that we are ignoring errno after the call
to rewind(). As we don't really care about the result, just silence the
error; the whole code will be removed in the development version anyway
as it is not needed.
Ondřej Surý [Fri, 16 Aug 2024 08:40:48 +0000 (08:40 +0000)]
[9.18] chg: test: For TSAN builds, use libraries from /opt/tsan
The new TSAN-enabled images install libraries to /opt/tsan.
Synchronize the configure options and CFLAGS between the gcc:tsan
and clang:tsan images and set the PKG_CONFIG_PATH to /opt/tsan/lib.
Additionally, drop Debian bullseye that's EOL now.
Backport of MR !9324
Merge branch 'backport-ondrej/use-staging-tsan-images-9.18' into 'bind-9.18'
Ondřej Surý [Thu, 15 Aug 2024 17:54:58 +0000 (19:54 +0200)]
For TSAN builds, use libraries from /opt/tsan
The new TSAN-enabled images install libraries to /opt/tsan.
Synchronize the configure options and CFLAGS between the gcc:tsan
and clang:tsan images and set the PKG_CONFIG_PATH to /opt/tsan/lib.
Since changelog entries are now generated from MR title&description,
they aren't sanity checked during a regular docs build. If these contain
special sequences that will be interpreted by sphinx, it might result in
breakage that would have to be amended manually.
Add a CI check to test a doc build with changelog after the MR is merged
to ensure that the docs can be built when generating changelog from
pristine git contents.
Related #4847
Backport of MR !9294
Merge branch 'backport-nicki/add-changelog-entry-check-9.18' into 'bind-9.18'
Nicki Křížek [Tue, 13 Aug 2024 12:00:43 +0000 (14:00 +0200)]
Use python3 in shebang lines for util scripts
Some distributions (notably, debian bookworm) have deprecated the
`python` interpreter in favor of `python3`. Since our scripts are
python3 anyway, use the proper numbered version in shebang to make
scripts easily executable.
Nicki Křížek [Mon, 12 Aug 2024 12:51:31 +0000 (14:51 +0200)]
Check that generated changelog entry doesn't break docs build
Since changelog entries are now generated from MR title&description,
they aren't sanity checked during a regular docs build. If these contain
special sequences that will be interpreted by sphinx, it might result in
breakage that would have to be amended manually.
Add a CI check to test a doc build with changelog after the MR is merged
to ensure that the docs can be built when generating changelog from
pristine git contents.
Nicki Křížek [Tue, 13 Aug 2024 08:33:02 +0000 (10:33 +0200)]
Fix ordering of gitchangelog replacement regexes
Prior to this change, the issue number could be accidentally removed by
the `Backport of` text, depending on the order of the MR description
contents. Ensure all the removals for text in MR descriptions happen
first, and only then run the replacement regex for issue number, which
appends it to the end of the last non-empty line (which will no longer
be removed).
The only removals that happen after the replacement are guaranteed to
always happen after the end of MR description, since they're
auto-generated by gitlab when the merge commit is created, thus won't
affect the line with the issue number.
Also remove the needless isc-private/bind9 replacement. References
to private MRs are already removed by the very first regex.
Michal Nowak [Thu, 8 Aug 2024 16:01:59 +0000 (16:01 +0000)]
[9.18] chg: Make every changelog entry a separate code block
LaTeX in CI and on ReadTheDocs fails to render a PDF version of ARM if
the Changelog section is included. The running theory is that the
verbatim section of more than twenty thousand lines is too big to meet
LaTeX self-imposed constraints, and it fails with:
Make each BIND 9 release a separate code block to work around the issue.
Further split up the sections for some exceptionally large releases, for
the same reason.
Michal Nowak [Wed, 7 Aug 2024 10:39:23 +0000 (12:39 +0200)]
Split up changelog into per-release code blocks
LaTeX in CI and on ReadTheDocs fails to render a PDF version of ARM if
the Changelog section is included. The running theory is that the
verbatim section of more than twenty thousand lines is too big to meet
LaTeX self-imposed constraints, and it fails with:
Make each BIND 9 release a separate code block to work around the issue.
Further split up the sections for some exceptionally large releases, for
the same reason.
Evan Hunt [Wed, 7 Aug 2024 23:22:22 +0000 (23:22 +0000)]
[9.18] new: usr: Tighten 'max-recursion-queries' and add 'max-query-restarts' option
There were cases in resolver.c when the `max-recursion-queries` quota was ineffective. It was possible to craft zones that would cause a resolver to waste resources by sending excessive queries while attempting to resolve a name. This has been addressed by correcting errors in the implementation of `max-recursion-queries`, and by reducing the default value from 100 to 32.
In addition, a new `max-query-restarts` option has been added which limits the number of times a recursive server will follow CNAME or DNAME records before terminating resolution. This was previously a hard-coded limit of 16, and now defaults to 11.
Closes #4741
Backport of MR !9281
Merge branch 'backport-4741-reclimit-restarts-9.18' into 'bind-9.18'
Evan Hunt [Wed, 26 Jun 2024 06:49:00 +0000 (23:49 -0700)]
implement 'max-query-restarts'
implement, document, and test the 'max-query-restarts' option
which specifies the query restart limit - the number of times
we can follow CNAMEs before terminating resolution.
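A rough sketch of what a query restart limit means, under stated assumptions: the alias table, the follow_aliases() helper, and the exact point at which the limit trips are all hypothetical and are not BIND's implementation. Each alias followed counts as one restart, and the lookup gives up once the limit would be exceeded.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical alias record: `owner` is an alias for `target`. */
struct alias {
    const char *owner;
    const char *target;
};

/*
 * Follow aliases starting at `name`.  Returns the final name, or NULL
 * when more than `max_restarts` aliases would have to be followed.
 */
static const char *
follow_aliases(const struct alias *table, size_t n, const char *name,
               unsigned int max_restarts) {
    unsigned int restarts = 0;

    for (;;) {
        const char *target = NULL;
        for (size_t i = 0; i < n; i++) {
            if (strcmp(table[i].owner, name) == 0) {
                target = table[i].target;
                break;
            }
        }
        if (target == NULL) {
            return name; /* not an alias: resolution is done */
        }
        if (restarts >= max_restarts) {
            return NULL; /* limit reached: terminate resolution */
        }
        restarts++;
        name = target;
    }
}
```

With a three-step chain, a limit of 3 reaches the end of the chain while a limit of 2 terminates early.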
Evan Hunt [Tue, 25 Jun 2024 21:30:20 +0000 (14:30 -0700)]
make "max_restarts" a configurable value
MAX_RESTARTS is no longer hard-coded; ns_server_setmaxrestarts()
and dns_client_setmaxrestarts() can now be used to modify the
max-restarts value at runtime. in both cases, the default is 11.
Evan Hunt [Tue, 25 Jun 2024 19:28:23 +0000 (12:28 -0700)]
reduce MAX_RESTARTS to 11
the number of steps that can be followed in a CNAME chain
before terminating the lookup has been reduced from 16 to 11.
(this is a hard-coded value, but will be made configurable later.)
Evan Hunt [Wed, 22 May 2024 20:02:16 +0000 (13:02 -0700)]
attach query counter to NS fetches
there were cases in resolver.c when queries for NS records were
started without passing a pointer to the parent fetch's query counter;
as a result, the max-recursion-queries quota for those queries started
counting from zero, instead of sharing the limit for the parent fetch,
making the quota ineffective in some cases.
Ondřej Surý [Wed, 7 Aug 2024 16:02:10 +0000 (16:02 +0000)]
[9.18] chg: test: Use new images with TSAN-enabled libraries
The new Fedora 40 TSAN images use libuv, urcu and OpenSSL libraries compiled with ThreadSanitizer. This (in theory) should enable better detection of memory races in those (most important) libraries.
Backport of MR !9264
Merge branch 'backport-ondrej/test-new-tsan-images-9.18' into 'bind-9.18'
Ondřej Surý [Wed, 7 Aug 2024 16:01:11 +0000 (16:01 +0000)]
[9.18] fix: dev: Disassociate the SSL object from the cached SSL_SESSION
When the SSL object was destroyed, it would invalidate all SSL_SESSION
objects including the cached, but not yet used, TLS session objects.
Properly disassociate the SSL object from the SSL_SESSION before we
store it in the TLS session cache, so we can later destroy it without
invalidating the cached TLS sessions.
Closes #4834
Backport of MR !9271
Merge branch 'backport-4834-detach-SSL-from-cached-SSL_SESSION-9.18' into 'bind-9.18'
Ondřej Surý [Wed, 7 Aug 2024 12:58:02 +0000 (14:58 +0200)]
Disassociate the SSL object from the cached SSL_SESSION
When the SSL object was destroyed, it would invalidate all SSL_SESSION
objects including the cached, but not yet used, TLS session objects.
Properly disassociate the SSL object from the SSL_SESSION before we
store it in the TLS session cache, so we can later destroy it without
invalidating the cached TLS sessions.
Ondřej Surý [Wed, 7 Aug 2024 16:00:55 +0000 (16:00 +0000)]
[9.18] fix: dev: Attach/detach to the listening child socket when accepting TLS
When a TLS (TLSstream) connection was accepted, the child
listening socket was not attached to sock->server and thus it could have
been freed before all the accepted connections were actually closed.
In turn, this would cause us to call isc_tls_free() too soon - causing
cascading errors in pending SSL_read_ex() in the accepted connections.
Properly attach and detach the child listening socket when accepting
and closing the server connections.
Closes #4833
Backport of MR !9270
Merge branch 'backport-4833-tlssock-needs-to-attach-to-child-tlslistener-9.18' into 'bind-9.18'
Ondřej Surý [Wed, 7 Aug 2024 06:43:12 +0000 (08:43 +0200)]
Attach/detach to the listening child socket when accepting TLS
When a TLS (TLSstream) connection was accepted, the child
listening socket was not attached to sock->server and thus it could have
been freed before all the accepted connections were actually closed.
In turn, this would cause us to call isc_tls_free() too soon - causing
cascading errors in pending SSL_read_ex() in the accepted connections.
Properly attach and detach the child listening socket when accepting
and closing the server connections.
Ondřej Surý [Wed, 7 Aug 2024 06:32:42 +0000 (06:32 +0000)]
[9.18] fix: dev: Don't loop indefinitely when isc_task quantum is 'unlimited'
Don't run more events than already scheduled. If the quantum is set to
a high value, task_run() would execute the already scheduled events and
all new events that result from running event->ev_action().
Setting quantum to a number of scheduled events will postpone events
scheduled after we enter the loop here to the next task_run()
invocation.
Merge branch 'ondrej/dont-run-more-events-than-scheduled-9.18' into 'bind-9.18'
Ondřej Surý [Thu, 7 Mar 2024 12:39:46 +0000 (13:39 +0100)]
Don't loop indefinitely when isc_task quantum is 'unlimited'
Don't run more events than already scheduled. If the quantum is set to
a high value, task_run() would execute the already scheduled events and
all new events that result from running event->ev_action().
Setting quantum to a number of scheduled events will postpone events
scheduled after we enter the loop here to the next task_run()
invocation.
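The idea can be sketched with a toy event queue. Everything here (struct task, task_send(), task_run_once(), and an action that always schedules a follow-up event) is hypothetical and only illustrates capping the quantum at the number of events queued on entry; it is not the isc_task code.

```c
#include <stddef.h>

#define QUEUE_MAX 64

/* Toy task with a fixed-size ring buffer of integer "events". */
struct task {
    int queue[QUEUE_MAX];
    size_t head;  /* index of the next event to run */
    size_t count; /* number of queued events */
};

static int
task_send(struct task *t, int ev) {
    if (t->count == QUEUE_MAX) {
        return -1;
    }
    t->queue[(t->head + t->count) % QUEUE_MAX] = ev;
    t->count++;
    return 0;
}

/* Every event schedules a follow-up, so the queue never drains. */
static void
action(struct task *t, int ev) {
    (void)task_send(t, ev + 1);
}

/* Run at most `quantum` events, but never more than were queued on entry. */
static size_t
task_run_once(struct task *t, size_t quantum) {
    size_t scheduled = t->count;
    size_t ran = 0;

    if (quantum > scheduled) {
        quantum = scheduled;
    }
    while (ran < quantum) {
        int ev = t->queue[t->head];
        t->head = (t->head + 1) % QUEUE_MAX;
        t->count--;
        action(t, ev);
        ran++;
    }
    return ran;
}
```

Without the cap, an "unlimited" quantum would keep this loop running forever, because each action refills the queue; with it, the follow-up events wait for the next call.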
Ondřej Surý [Tue, 6 Aug 2024 14:11:15 +0000 (14:11 +0000)]
[9.18] chg: dev: Use _exit() in the fatal() function
Since fatal() is an abrupt rather than a clean termination of the
program, we want to skip the various atexit() calls, because not all
memory might have been freed when fatal() is called. Using _exit()
instead of exit() has this effect - the program will end, but no
destructors or atexit routines will be called.
Backport of MR !8703
Merge branch 'backport-ondrej/use-_exit-in-fatal-9.18' into 'bind-9.18'
Ondřej Surý [Wed, 7 Feb 2024 13:44:39 +0000 (14:44 +0100)]
Use _exit() in the fatal() function
Since fatal() is an abrupt rather than a clean termination of the
program, we want to skip the various atexit() calls, because not all
memory might have been freed when fatal() is called. Using _exit()
instead of exit() has this effect - the program will end, but no
destructors or atexit routines will be called.
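The difference between exit() and _exit() can be observed with a small POSIX experiment. This is only a sketch: the helper atexit_handler_ran() and the pipe-based signalling are made up for illustration and have nothing to do with BIND's fatal() itself.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int handler_fd = -1;

/* atexit handler: leaves a one-byte mark in the pipe when it runs. */
static void
on_exit_handler(void) {
    const char mark = 'x';
    if (write(handler_fd, &mark, 1) != 1) {
        /* nothing useful can be done here */
    }
}

/*
 * Fork a child that registers an atexit handler and then terminates
 * with exit() or _exit().  Returns 1 if the handler ran, 0 if it did
 * not, and -1 on error.
 */
static int
atexit_handler_ran(int use_underscore_exit) {
    int fds[2];
    char buf;
    ssize_t n;
    pid_t pid;

    if (pipe(fds) != 0) {
        return -1;
    }
    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        handler_fd = fds[1];
        if (atexit(on_exit_handler) != 0) {
            _exit(2);
        }
        if (use_underscore_exit) {
            _exit(1); /* skips atexit handlers */
        }
        exit(1); /* runs atexit handlers first */
    }
    close(fds[1]);
    n = read(fds[0], &buf, 1); /* 0 means EOF: nothing was written */
    close(fds[0]);
    (void)waitpid(pid, NULL, 0);
    if (n < 0) {
        return -1;
    }
    return (n == 1) ? 1 : 0;
}
```

Calling atexit_handler_ran(0) exercises the exit() path and atexit_handler_ran(1) the _exit() path.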
Ondřej Surý [Mon, 5 Aug 2024 14:05:31 +0000 (14:05 +0000)]
[9.18] fix: usr: Raise the log level of priming failures
When a priming query is complete, it's currently logged at level ISC_LOG_DEBUG(1), regardless of success or failure. We are now raising it to ISC_LOG_NOTICE in the case of failure. [GL #3516]
Closes #3516
Backport of MR !9121
Merge branch 'backport-3516-log-priming-errors-9.18' into 'bind-9.18'
when a priming query is complete, it's currently logged at
level ISC_LOG_INFO, regardless of success or failure. we
are now changing it to ISC_LOG_NOTICE in the case of failure
and ISC_LOG_DEBUG(1) in case of success.
Ondřej Surý [Mon, 5 Aug 2024 10:29:41 +0000 (10:29 +0000)]
fix: usr: Add a compatibility shim for older libuv versions (< 1.19.0)
The uv_stream_get_write_queue_size() function is supported only in relatively new versions of libuv (1.19.0 or higher). Provide a compatibility shim for this function, so BIND 9 can be built in environments with older libuv versions.
Fixes: #4822
Merge branch 'uv_stream_get_write_queue_size_wrapper' into 'bind-9.18'
Ondřej Surý [Mon, 5 Aug 2024 09:12:30 +0000 (09:12 +0000)]
[9.18] fix: test: Use LC_ALL to override all system locales
The system tests were overriding the local locale by setting LANG to C.
This does not override the locale in case there are individual LC_<*>
variables like LC_CTYPE explicitly set.
Use LC_ALL=C instead which is the proper way of overriding all currently
set locales.
Backport of MR !9109
Merge branch 'backport-ondrej/use-LC_ALL-not-LANG-9.18' into 'bind-9.18'
Ondřej Surý [Tue, 18 Jun 2024 06:56:18 +0000 (08:56 +0200)]
Use LC_ALL to override all system locales
The system tests were overriding the local locale by setting LANG to C.
This does not override the locale in case there are individual LC_<*>
variables like LC_CTYPE explicitly set.
Use LC_ALL=C instead which is the proper way of overriding all currently
set locales.
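The precedence that makes LANG=C insufficient can be sketched as follows. The helper effective_locale() is hypothetical; it mirrors the POSIX lookup order for a single category (LC_ALL, then the category variable, then LANG) and assumes "C" when nothing is set, which POSIX leaves implementation-defined.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Return the locale name that applies to one category, following the
 * POSIX lookup order.  Empty values are treated as unset.
 */
static const char *
effective_locale(const char *category_var) {
    const char *names[] = { "LC_ALL", category_var, "LANG" };

    for (size_t i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
        const char *value = getenv(names[i]);
        if (value != NULL && value[0] != '\0') {
            return value;
        }
    }
    return "C"; /* assumed default; implementation-defined in POSIX */
}
```

With LC_CTYPE set explicitly, setting only LANG=C leaves effective_locale("LC_CTYPE") unchanged, while setting LC_ALL=C overrides it.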
Mark Andrews [Mon, 5 Aug 2024 05:17:11 +0000 (05:17 +0000)]
[9.18] fix: usr: Remove extra newline from yaml output
I split this into two commits: one for the actual newline removal, and one for issues I found that ruined the YAML output when some errors were output.
Closes: #4772
Backport of MR !9112
Merge branch 'backport-yaml-indent-9.18' into 'bind-9.18'
Mark Andrews [Tue, 9 Jul 2024 01:55:46 +0000 (11:55 +1000)]
Prevent overflow of bufsize
If bufsize overflows we will have an infinite loop. In practice
this will not happen unless we have made a coding error. Add an
INSIST to detect this condition.
    retry:
        isc_buffer_allocate(mctx, &b, bufsize);
        result = dns_rdata_totext(rdata, NULL, b);
        if (result == ISC_R_NOSPACE) {
            isc_buffer_free(&b);
CID 498031: (#1 of 1): Overflowed constant (INTEGER_OVERFLOW)
overflow_const: Expression bufsize, which is equal to 0, overflows
the type that receives it, an unsigned integer 32 bits wide.
            bufsize *= 2;
            goto retry;
        }
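A self-contained sketch of a grow-and-retry loop with an overflow guard. The render() function is a hypothetical stand-in for dns_rdata_totext(), and assert() stands in for BIND's INSIST; where exactly the real check was placed is not shown in this log.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in formatter: returns 0 on success, -1 if `dst` is too small. */
static int
render(const char *src, char *dst, uint32_t dstlen) {
    size_t need = strlen(src) + 1;
    if (need > dstlen) {
        return -1;
    }
    memcpy(dst, src, need);
    return 0;
}

/* Retry with a doubled buffer until the text fits; caller frees. */
static char *
render_with_retry(const char *src) {
    uint32_t bufsize = 4;

    for (;;) {
        char *buf = malloc(bufsize);
        if (buf == NULL) {
            return NULL;
        }
        if (render(src, buf, bufsize) == 0) {
            return buf;
        }
        free(buf);
        /*
         * Doubling a 32-bit size can wrap to 0, which would make the
         * loop spin forever; fail loudly instead.
         */
        assert(bufsize <= UINT32_MAX / 2);
        bufsize *= 2;
    }
}
```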
Mark Andrews [Tue, 9 Jul 2024 01:59:39 +0000 (11:59 +1000)]
Prevent overflow of size
If size overflows we will have an infinite loop. In practice
this will not happen unless we have made a coding error. Add
an INSIST to detect this condition.
CID 498025: (#1 of 1): Overflowed constant (INTEGER_OVERFLOW)
overflow_const: Expression size, which is equal to 0, overflows the type that
receives it, an unsigned integer 32 bits wide.
        size *= 2;
    }
Disable deadlines for hypothesis tests when running in CI
The time it takes to run tests in CI varies significantly enough
that hypothesis tests reach their deadlines and fail randomly,
marking the tests as flaky.
This commit disables the deadlines when running in CI.
Štěpán Balážik [Thu, 21 Dec 2023 19:25:20 +0000 (20:25 +0100)]
Extend isctest package with more utility functions
Check for more rcodes and various properties needed in the wildcard
test. Add a `name` module for various dns.name.Name operations
(currently with only the `prepend_label` function).
Expose `timeout` as a parameter of `query.tcp`/`query.udp`.
Mark Andrews [Fri, 2 Aug 2024 08:13:49 +0000 (08:13 +0000)]
[9.18] fix: dev: Remove unnecessary operations
Decrementing optlen immediately before calling continue is unnecessary
and inconsistent with the rest of dns_message_pseudosectiontoyaml
and dns_message_pseudosectiontotext. Coverity was also reporting
an impossible false positive overflow of optlen (CID 499061).
Mark Andrews [Tue, 9 Jul 2024 00:29:30 +0000 (10:29 +1000)]
Remove unnecessary operations
Decrementing optlen immediately before calling continue is unnecessary
and inconsistent with the rest of dns_message_pseudosectiontoyaml
and dns_message_pseudosectiontotext. Coverity was also reporting
an impossible false positive overflow of optlen (CID 499061).
Mark Andrews [Fri, 2 Aug 2024 00:56:33 +0000 (00:56 +0000)]
[9.18] fix: usr: fix generation of 6to4-self name expansion from IPv4 address
The period between the most significant nibble of the encoded IPv4 address and the 2.0.0.2.IP6.ARPA suffix was missing resulting in the wrong name being checked. Add system test for 6to4-self implementation.
Closes #4766
Backport of MR !9099
Merge branch 'backport-4766-add-system-test-for-6to4-self-9.18' into 'bind-9.18'
Mark Andrews [Wed, 5 Jun 2024 03:59:39 +0000 (13:59 +1000)]
Add missing period to generated IPv4 6to4 name
The period between the most significant nibble of the IPv4 address
and the 2.0.0.2.IP6.ARPA suffix was missing resulting in the wrong
name being checked.
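For illustration, the name can be built from the four address octets as below. This is a sketch, not the BIND code: the function name sixtofour_name() and the lowercase hex digits are assumptions. Each nibble, including the most significant one, is followed by a period, so the suffix is always properly separated.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Write the 6to4 reverse name for an IPv4 address (network byte order
 * in `addr`) into `out`.  Returns 0 on success, -1 if `out` is too small.
 */
static int
sixtofour_name(const uint8_t addr[4], char *out, size_t outlen) {
    static const char hex[] = "0123456789abcdef";
    static const char suffix[] = "2.0.0.2.IP6.ARPA";
    size_t pos = 0;

    /* 8 nibbles, each with a trailing period, plus suffix and NUL. */
    if (outlen < 16 + sizeof(suffix)) {
        return -1;
    }
    for (int i = 3; i >= 0; i--) {
        out[pos++] = hex[addr[i] & 0x0f]; /* low nibble first */
        out[pos++] = '.';
        out[pos++] = hex[addr[i] >> 4];
        out[pos++] = '.'; /* also separates the last nibble from the suffix */
    }
    memcpy(out + pos, suffix, sizeof(suffix));
    return 0;
}
```

For 192.0.2.1 (c0 00 02 01) this yields 1.0.2.0.0.0.0.c.2.0.0.2.IP6.ARPA.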
Mark Andrews [Mon, 8 Jul 2024 04:00:14 +0000 (14:00 +1000)]
Fix yaml output
In yaml mode we emit a string for each question and record. Certain
names and data could result in invalid yaml being produced. Use single
quote string for all questions and records. This requires that single
quotes get converted to two quotes within the string.
Mark Andrews [Mon, 17 Jun 2024 13:16:28 +0000 (23:16 +1000)]
Properly reject zero length ALPN in commatxt_fromtext
ALPN IDs are defined as 1*255OCTET in RFC 9460. commatxt_fromtext was not
rejecting invalid inputs produced by missing a level of escaping,
which were later caught by dns_rdata_fromwire on reception.
These inputs should have been rejected:
svcb in svcb 1 1.svcb alpn=\,abc
svcb1 in svcb 1 1.svcb alpn=a\,\,abc
and generated 00 03 61 62 63 and 01 61 00 02 61 62 63 respectively.
The correct inputs to include commas in the alpn requires double
escaping.
svcb in svcb 1 1.svcb alpn=\\,abc
svcb1 in svcb 1 1.svcb alpn=a\\,\\,abc
and generate 04 2C 61 62 63 and 06 61 2C 2C 61 62 63 respectively.
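The wire-format rule being enforced can be sketched as a small validator. The function alpn_wire_valid() is hypothetical and is not commatxt_fromtext or dns_rdata_fromwire; it only checks that an alpn value is a sequence of length-prefixed IDs with no zero-length entry, and it assumes that an empty list is also invalid.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Check a wire-format alpn value: one or more <length><id> entries. */
static bool
alpn_wire_valid(const uint8_t *data, size_t len) {
    size_t pos = 0;

    if (len == 0) {
        return false; /* assumption: the list must not be empty */
    }
    while (pos < len) {
        uint8_t idlen = data[pos++];
        if (idlen == 0) {
            return false; /* 1*255OCTET: at least one octet */
        }
        if (idlen > len - pos) {
            return false; /* entry runs past the end of the data */
        }
        pos += idlen;
    }
    return true;
}
```

Applied to the byte sequences quoted above, 00 03 61 62 63 is rejected because of its leading zero-length ID, while 04 2C 61 62 63 and 06 61 2C 2C 61 62 63 are accepted.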
Aram Sargsyan [Thu, 6 Jun 2024 20:49:34 +0000 (20:49 +0000)]
Update the chain test
Update the CNAME chain test to correspond to the changed behavior,
because now named returns SERVFAIL when hitting the maximum query
restarts limit (e.g. happening when following a long CNAME chain).
In the current test auth will hit the limit and return partial data
with a SERVFAIL code, while the resolver will return no data with
a SERVFAIL code after auth returns SERVFAIL to it.
Mark Andrews [Tue, 9 Jul 2024 02:37:13 +0000 (12:37 +1000)]
Properly compute the physical memory size
On a 32 bit machine casting to size_t can still lead to an overflow.
Cast to uint64_t. Also detect all possible negative values for
pages and pagesize to silence warning about possible negative value.
#if defined(_SC_PHYS_PAGES) && defined(_SC_PAGESIZE)
1. tainted_data_return: Called function sysconf(_SC_PHYS_PAGES),
and a possible return value may be less than zero.
2. assign: Assigning: pages = sysconf(_SC_PHYS_PAGES).
    long pages = sysconf(_SC_PHYS_PAGES);
    long pagesize = sysconf(_SC_PAGESIZE);
3. Condition pages == -1, taking false branch.
4. Condition pagesize == -1, taking false branch.
    if (pages == -1 || pagesize == -1) {
        return (0);
    }
5. overflow: The expression (size_t)pages * pagesize might be negative,
but is used in a context that treats it as unsigned.
CID 498034: (#1 of 1): Overflowed return value (INTEGER_OVERFLOW)
6. return_overflow: (size_t)pages * pagesize, which might have underflowed,
is returned from the function.
    return ((size_t)pages * pagesize);
#endif /* if defined(_SC_PHYS_PAGES) && defined(_SC_PAGESIZE) */
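A sketch of the corrected computation as described in the commit message, with a hypothetical function name. Both sysconf() results are checked for any negative value, and the multiplication is done in uint64_t so that it cannot wrap on a machine with a 32-bit size_t.

```c
#include <stdint.h>
#include <unistd.h>

/* Return the physical memory size in bytes, or 0 if it is unknown. */
static uint64_t
physical_memory_bytes(void) {
#if defined(_SC_PHYS_PAGES) && defined(_SC_PAGESIZE)
    long pages = sysconf(_SC_PHYS_PAGES);
    long pagesize = sysconf(_SC_PAGESIZE);

    /* Reject every negative value, not just -1. */
    if (pages < 0 || pagesize < 0) {
        return 0;
    }
    /* Widen before multiplying to avoid 32-bit overflow. */
    return (uint64_t)pages * (uint64_t)pagesize;
#else
    return 0;
#endif
}
```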
In 9.18, 'inline-signing yes;' must also be configured explicitly for
zones using dnssec-policy without a configured 'allow-update' or
'update-policy'.
Matthijs Mekking [Mon, 24 Jun 2024 13:14:16 +0000 (15:14 +0200)]
Update key lifetime and metadata after reconfig
If dnssec-policy is reconfigured and the key lifetime has changed,
update existing keys with the new lifetime and adjust the retire
and removed timing metadata accordingly.
If the key has no lifetime yet, just initialize the lifetime. It
may be that the retire/removed timing metadata has already been set.
Skip keys whose goal is not set to omnipresent. These keys are already
in the process of retiring, or still unused.