Ondřej Surý [Mon, 13 Feb 2023 14:52:51 +0000 (15:52 +0100)]
Use C-RW-WP lock in the dns_adb unit
Replace the isc_mutex in the dns_adb unit with isc_rwlock for better
performance. Both ADB names and ADB entries hashtables and LRU are now
using isc_rwlock.
Ondřej Surý [Wed, 24 Mar 2021 16:52:56 +0000 (17:52 +0100)]
Add the reader-writer synchronization with modified C-RW-WP
This changes the internal isc_rwlock implementation to:
Irina Calciu, Dave Dice, Yossi Lev, Victor Luchangco, Virendra
J. Marathe, and Nir Shavit. 2013. NUMA-aware reader-writer locks.
SIGPLAN Not. 48, 8 (August 2013), 157–166.
DOI:https://doi.org/10.1145/2517327.24425
(The full article is available from:
http://mcg.cs.tau.ac.il/papers/ppopp2013-rwlocks.pdf)
The implementation is based on the Writer-Preference Lock (C-RW-WP)
variant (see the 3.4 section of the paper for the rationale).
The implemented algorithm has been modified for simplicity and for usage
patterns in rbtdb.c.
The changes compared to the original algorithm:
* We haven't implemented the cohort locks because that would require a
knowledge of NUMA nodes, instead a simple atomic_bool is used as
synchronization point for writer lock.
* The per-thread reader counters are not being used - this would
require the internal thread id (isc_tid_v) to be always initialized,
even in the utilities; the change has a slight performance penalty,
so we might revisit this change in the future. However, this change
also saves a lot of memory, because cache-line aligned counters were
used, so on 32-core machine, the rwlock would be 4096+ bytes big.
* The readers use a writer_barrier that is raised after a while when
the readers lock can't be acquired, to prevent reader starvation.
* Separate ingress and egress readers counters queues to reduce both
inter and intra-thread contention.
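The changes listed above can be sketched roughly as follows. This is a simplified illustration, not the isc_rwlock code: every name (sketch_rwlock_t, sketch_read_try() and so on) and the patience limit are assumptions, only a try-variant of the writer lock is shown, and default sequentially consistent atomics are used throughout.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limit: spins before a waiting reader raises the barrier. */
#define SKETCH_READER_PATIENCE 500

typedef struct {
	atomic_uint_fast32_t readers_ingress; /* readers that have arrived */
	atomic_uint_fast32_t readers_egress;  /* readers that have departed */
	atomic_int_fast32_t writers_barrier;  /* > 0: new writers must wait */
	atomic_bool writers_lock;             /* stands in for the cohort lock */
} sketch_rwlock_t;

static void
sketch_init(sketch_rwlock_t *l) {
	atomic_init(&l->readers_ingress, 0);
	atomic_init(&l->readers_egress, 0);
	atomic_init(&l->writers_barrier, 0);
	atomic_init(&l->writers_lock, false);
}

/* Arrive first, then check for a writer; back out if one holds the lock. */
static bool
sketch_read_try(sketch_rwlock_t *l) {
	atomic_fetch_add(&l->readers_ingress, 1);
	if (atomic_load(&l->writers_lock)) {
		atomic_fetch_add(&l->readers_egress, 1);
		return false;
	}
	return true;
}

static void
sketch_read_lock(sketch_rwlock_t *l) {
	uint32_t spins = 0;
	bool raised = false;

	while (!sketch_read_try(l)) {
		/* Wait for the writer; a real lock would pause/yield here. */
		while (atomic_load(&l->writers_lock)) {
			if (++spins >= SKETCH_READER_PATIENCE && !raised) {
				/* Out of patience: keep further writers out. */
				atomic_fetch_add(&l->writers_barrier, 1);
				raised = true;
			}
		}
	}
	if (raised) {
		atomic_fetch_sub(&l->writers_barrier, 1);
	}
}

static void
sketch_read_unlock(sketch_rwlock_t *l) {
	atomic_fetch_add(&l->readers_egress, 1);
}

static bool
sketch_write_try(sketch_rwlock_t *l) {
	bool expected = false;

	if (atomic_load(&l->writers_barrier) > 0) {
		return false; /* starving readers asked writers to wait */
	}
	if (!atomic_compare_exchange_strong(&l->writers_lock, &expected, true)) {
		return false; /* another writer holds the lock */
	}
	/* Readers are still inside while egress lags behind ingress. */
	if (atomic_load(&l->readers_egress) != atomic_load(&l->readers_ingress)) {
		atomic_store(&l->writers_lock, false);
		return false;
	}
	return true;
}

static void
sketch_write_unlock(sketch_rwlock_t *l) {
	atomic_store(&l->writers_lock, false);
}
```

Readers only ever add to one of two counters, so arriving and departing readers do not write to the same variable; the writer compares the two counters to decide whether the lock is free of readers.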
Ondřej Surý [Tue, 14 Feb 2023 12:40:45 +0000 (13:40 +0100)]
Add missing <isc/atomic.h> include to dns/badcache.c
The dns_badcache was pulling the <isc/atomic.h> header only indirectly
via <isc/rwlock.h>, add the direct include as the <isc/rwlock.h> no
longer pulls the header when pthread_rwlock is used.
Petr Menšík [Thu, 2 Aug 2018 21:46:45 +0000 (23:46 +0200)]
FIPS tests changes for RHEL
Include MD5 feature detection in featuretest tool and use it in some
places. When RHEL distribution or Fedora ELN is in FIPS mode, then MD5
algorithm is unavailable completely and even hmac-md5 algorithm usage
will always fail. Work around that by checking whether MD5 works and,
if not, skipping its usage.
These changes were carried as the downstream patch bind-9.11-fips-tests.patch
in Fedora and RHEL.
Tony Finch [Tue, 14 Feb 2023 12:26:28 +0000 (12:26 +0000)]
Fix change 6093 which broke rbtdb when it grew too large
I misunderstood the purpose of the `heap_index` rdataset header
member; I thought it identified which heap to use, and could therefore
be smaller, the same size as `locknum` indexes. But in fact it is a
position within a heap, so it needs to be able to count up to the
total number of rdatasets in the rbtdb.
So this changes `heap_index` from `uint16_t` back to `unsigned int`.
To avoid re-embiggening the rdatasetheader, shrink the `count` member
from `uint32` to `uint16`. The `count` is used to rotate RRsets in
`dns_rdataset_towiresorted()`, so 16 bits is more than large enough.
This change also means we no longer need to avoid colliding with
`DNS_RDATASET_COUNT_UNDEFINED` i.e. UINT32_MAX.
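The effect of the field widths can be illustrated with a small sketch. The two structs below are hypothetical cut-down stand-ins for the rdataset header (the real header has many more members); they only show why a 16-bit heap position breaks once a heap holds more than 65535 entries, and that swapping the widths need not grow the struct on common ABIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout after the broken change: 16-bit heap position. */
typedef struct {
	uint16_t heap_index;
	uint32_t count;
} sketch_header_narrow_t;

/* Hypothetical layout after the fix: full-width position, 16-bit count. */
typedef struct {
	unsigned int heap_index;
	uint16_t count;
} sketch_header_wide_t;

/* Does a heap position survive being stored in the narrow field? */
static bool
narrow_roundtrips(unsigned int position) {
	sketch_header_narrow_t h = { .heap_index = (uint16_t)position,
				     .count = 0 };
	return h.heap_index == position;
}

/* Does a heap position survive being stored in the wide field? */
static bool
wide_roundtrips(unsigned int position) {
	sketch_header_wide_t h = { .heap_index = position, .count = 0 };
	return h.heap_index == position;
}
```

With these definitions, position 65536 is silently truncated in the narrow layout and preserved in the wide one.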
Tony Finch [Wed, 24 Mar 2021 16:52:56 +0000 (17:52 +0100)]
Improve the spinloop pause / yield hint
Unfortunately, C still lacks a standard function for pause (x86,
sparc) or yield (arm) instructions, for use in spin lock or CAS loops.
BIND has its own based on vendor intrinsics or inline asm.
Previously, it was buried in the `isc_rwlock` implementation. This
commit renames `isc_rwlock_pause()` to `isc_pause()` and moves
it into <isc/pause.h>.
This commit also fixes the configure script so that it detects ARM
yield support on systems that identify as `aarch*` instead of `arm*`.
On 64-bit ARM systems we now use the ISB (instruction synchronization
barrier) instruction in preference to yield. The ISB instruction
pauses the CPU for longer, several nanoseconds, which is more like the
x86 pause instruction. There are more details in a Rust pull request,
which also refers to MySQL making the same change:
https://github.com/rust-lang/rust/pull/84725
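A per-architecture pause hint of the kind described above might look like the sketch below. It assumes GCC/Clang-style inline asm and is not the contents of <isc/pause.h>; sketch_pause(), spin_acquire() and spin_release() are hypothetical names, and unknown architectures fall back to a no-op.

```c
#include <assert.h>
#include <stdatomic.h>

#if defined(__x86_64__) || defined(__i386__)
#define sketch_pause() __asm__ __volatile__("pause")
#elif defined(__aarch64__)
/* ISB stalls for longer than yield, closer to the x86 pause. */
#define sketch_pause() __asm__ __volatile__("isb")
#elif defined(__arm__)
#define sketch_pause() __asm__ __volatile__("yield")
#else
#define sketch_pause() ((void)0)
#endif

/* Spin until the flag is acquired, hinting the CPU on each failed try. */
static void
spin_acquire(atomic_flag *flag) {
	while (atomic_flag_test_and_set_explicit(flag, memory_order_acquire)) {
		sketch_pause();
	}
}

static void
spin_release(atomic_flag *flag) {
	atomic_flag_clear_explicit(flag, memory_order_release);
}
```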
Tom Krizek [Mon, 13 Feb 2023 12:58:47 +0000 (13:58 +0100)]
Ignore dig errors in +short comparisons in tests
Tests using diff to compare outputs of dig +short shall ignore lines
starting with ";". In dig +short output, such lines should only be
present for errors such as network issues. Since we utilize dig's
default timeout/retry mechanisms, these transitory issues should be
ignored and only the final output should be considered during the diff
comparison.
Aram Sargsyan [Mon, 13 Feb 2023 14:47:09 +0000 (14:47 +0000)]
Fix RPZ reference counting error on shutdown
A dns_rpz_unref_rpzs() call is missing when taking the 'goto unlock;'
path on shutdown, in order to compensate for the earlier
dns_rpz_ref_rpzs() call.
Move the dns_rpz_ref_rpzs() call after the shutdown check.
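The ordering problem can be shown with a toy model. The struct and functions below are hypothetical and only mimic the pattern (a reference taken on behalf of a later callback); they are not the dns_rpz code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the reference-counted RPZ zones object. */
typedef struct {
	int references;
	bool shuttingdown;
} sketch_rpzs_t;

/*
 * Before: the reference is taken first, so the early exit on shutdown
 * leaves it behind with no callback to release it.
 */
static bool
schedule_update_leaky(sketch_rpzs_t *rpzs) {
	rpzs->references++;
	if (rpzs->shuttingdown) {
		return false; /* leaked: nothing will drop the reference */
	}
	return true; /* the scheduled callback now owns the reference */
}

/* After: check for shutdown first, take the reference only when needed. */
static bool
schedule_update_fixed(sketch_rpzs_t *rpzs) {
	if (rpzs->shuttingdown) {
		return false;
	}
	rpzs->references++;
	return true;
}

/* The callback releases the reference taken when it was scheduled. */
static void
update_done(sketch_rpzs_t *rpzs) {
	assert(rpzs->references > 0);
	rpzs->references--;
}
```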
Mark Andrews [Wed, 9 Nov 2022 12:12:07 +0000 (12:12 +0000)]
Report the key name that failed in retry_keyfetch
When there are multiple managed trust anchors we need to know the
name of the trust anchor that is failing. Extend the error message
to include the trust anchor name.
Evan Hunt [Fri, 10 Feb 2023 18:18:38 +0000 (10:18 -0800)]
remove some unused functions
removed some functions that are no longer used and unlikely to
be resurrected, and also some that were only used to support Windows
and can now be replaced with generic versions.
Tom Krizek [Mon, 6 Feb 2023 13:16:44 +0000 (14:16 +0100)]
Increase named startup wait time for runtime test
Occasionally, the allotted 10 seconds for the "running" line to appear
in the log after named is started proved insufficient in CI, especially
during increased load. Give named up to 60 seconds to start up to
mitigate this issue.
Michal Nowak [Wed, 18 Jan 2023 16:41:21 +0000 (17:41 +0100)]
Start named as auth and recursive server in pairwise
The script will start the named process configured as both an
authoritative and recursive server for each pairwise ./configure
configuration. The test is considered successful if the named process
runs until the 5-second timeout is triggered, and there is no named.lock
file present, indicating that named did not crash on shutdown.
Ondřej Surý [Thu, 9 Feb 2023 11:27:40 +0000 (12:27 +0100)]
Add magic to fctxcount and replace the atomics with integers
Add magic value to the fctxcount, to check for completely invalid
counters, or counters that have been already destroyed.
Improve the locking around the counters, and because of that we can drop
the atomics and use simple integers - the counters were already locked
and the tiny bits that used the atomics were not worth the extra effort.
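A magic-number check on a counter object, as described above, could be sketched like this. The macro, the magic value and the struct members are all hypothetical; the counter is a plain integer on the assumption that the caller already holds the relevant lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical four-character magic, packed into 32 bits. */
#define SKETCH_MAGIC(a, b, c, d)                                  \
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) |          \
	 ((uint32_t)(c) << 8) | (uint32_t)(d))
#define SKETCH_COUNTER_MAGIC SKETCH_MAGIC('F', 'c', 'n', 't')

typedef struct {
	uint32_t magic;
	uint_fast32_t count; /* plain integer: protected by the caller's lock */
} sketch_counter_t;

/* Rejects NULL, never-initialized and already-destroyed counters. */
static bool
counter_valid(const sketch_counter_t *counter) {
	return counter != NULL && counter->magic == SKETCH_COUNTER_MAGIC;
}

static void
counter_init(sketch_counter_t *counter) {
	counter->magic = SKETCH_COUNTER_MAGIC;
	counter->count = 0;
}

/* Caller must hold the lock that protects this counter. */
static uint_fast32_t
counter_increment(sketch_counter_t *counter) {
	assert(counter_valid(counter));
	return ++counter->count;
}

/* Clearing the magic makes any later use trip the validity check. */
static void
counter_destroy(sketch_counter_t *counter) {
	assert(counter_valid(counter));
	counter->magic = 0;
}
```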
Evan Hunt [Wed, 8 Feb 2023 18:33:06 +0000 (10:33 -0800)]
clean up some deprecated/obsolete options and doc
- removed documentation of -S option from named man page
- removed documentation of reserved-sockets from ARM
- simplified documentation of dnssec-secure-to-insecure - it
now just says it's obsolete rather than describing what it
doesn't do anymore
- marked three formerly obsolete options as ancient:
parent-registration-delay, reserved-sockets, and
suppress-initial-notify
Petr Špaček [Mon, 4 Jul 2022 15:25:11 +0000 (17:25 +0200)]
Remove pregenerated manpages from the repo
We don't need them in the repo, it's sufficient if we pregenerate them
while preparing the tarball. That way we don't have overhead while
modifying them but they are still available for installations without
Sphinx.
I assume that this will make rebases and cherry-picks across branches
easier, with less trial and error churn required in the CI.
It's implemented in the way that we build the manpages only when we
either have pregenerated pages available at the configure time or
sphinx-build is installed and working.
Evan Hunt [Thu, 9 Feb 2023 03:28:09 +0000 (19:28 -0800)]
remove isc_bind9 variable
isc_bind9 was a global bool used to indicate whether the library
was being used internally by BIND or by an external caller. external
use is no longer supported, but the variable was retained for use
by dyndb, which needed it only when being built without libtool.
building without libtool is *also* no longer supported, so the variable
can go away.
Ondřej Surý [Wed, 8 Feb 2023 08:29:54 +0000 (09:29 +0100)]
Enforce version drift limits for libuv
libuv support for receiving multiple UDP messages in a single system
call (recvmmsg()) has been tweaked several times between libuv versions
1.35.0 and 1.40.0. Mixing and matching libuv versions within that span
may lead to assertion failures and is therefore considered harmful, so
try to limit potential damage by preventing users from mixing libuv
versions with distinct sets of recvmmsg()-related flags.
Mark Andrews [Thu, 9 Feb 2023 04:11:24 +0000 (15:11 +1100)]
Make notify source port test reliable
Send the test message from ns3 to ns2 instead of ns2 to ns3 as ns2
is started first and therefore the test doesn't have to wait on the
resend of the NOTIFY message to be successful.
Mark Andrews [Tue, 31 Jan 2023 02:50:36 +0000 (13:50 +1100)]
dnssec-checkds: cleanup memory on error paths
Move and give unique names to the dns_db_t, dns_dbnode_t and
dns_dbversion_t pointers, so they have global scope and therefore
are visible to cleanup. Unique names are not strictly necessary,
as none of the functions involved call each other.
Change free_db to handle NULL pointers and also an optional
(dns_dbversion_t **).
In match_keyset_dsset and free_keytable, ki had to be handled
differently to prevent a false positive NULL pointer dereference
warning from scan.
In formatset moved dns_master_styledestroy earlier and freed
buf before calling check_result to prevent memory leak.
In append_new_ds_set freed ds on the default path before
calling check_result to prevent memory leak.
Ondřej Surý [Tue, 10 Jan 2023 10:47:44 +0000 (11:47 +0100)]
Drop RHEL / CentOS / Oracle Linux 7 support
RHEL 7 (and its clones) will reach EOL in June 2024, shortly after BIND
9.20 is released. Drop the support for building on those
platforms, so we can use features of modern operating systems - a newer
compiler that supports at least a subset of C23, and OpenSSL 1.1/3.0.
This will simplify some of the code that we are using in BIND 9.
Evan Hunt [Mon, 31 Jan 2022 20:10:29 +0000 (12:10 -0800)]
refactor dns_clientinfo_init(); use separate function to set ECS
Instead of using an extra rarely-used parameter to dns_clientinfo_init()
to set ECS information for a client, this commit adds a function
dns_clientinfo_setecs() which can be called only when ECS is needed.
Evan Hunt [Tue, 7 Feb 2023 19:05:13 +0000 (11:05 -0800)]
increase simultaneous updates for quota test
the nsupdate system test was intermittently failing due to the update
quota not being exceeded when it should have been. this is most likely
a timing issue: the client is sending updates too slowly, or the server
is processing them too quickly, for the quota to fill. this commit
attempts to make that failure less likely by increasing the number
of update transactions from 10 to 20.
Mark Andrews [Tue, 7 Feb 2023 01:08:31 +0000 (12:08 +1100)]
Allow some time for the root trust anchor to appear
After deleting the root trust anchor and reconfiguring the
server, it takes some time for the trust anchor to appear in 'rndc
managed-keys status' output. Retry several times.
Aram Sargsyan [Wed, 1 Feb 2023 14:41:58 +0000 (14:41 +0000)]
Fix a bug in resolver's resume_dslookup() function
A recent refactoring in 7e4e125e5ea5b29c946ce4646461d06a75cd8702
introduced a logical error which could result in calling the
dns_resolver_createfetch() function with 'nameservers' pointer set
to NULL, but with 'domain' not set to NULL, which is not allowed
by the function.
Make sure 'domain' is set only when 'nsrdataset' is valid.
Mark Andrews [Mon, 30 Jan 2023 07:06:57 +0000 (18:06 +1100)]
named-rrchecker: have fatal cleanup
It is trivial to fully clean up memory on all the error paths in
named-rrchecker, many of which are triggered by bad user input.
This involves freeing lex and mctx if they exist when fatal is
called.
Evan Hunt [Thu, 2 Feb 2023 21:35:32 +0000 (13:35 -0800)]
add source port configuration tests
check in the log files of receiving servers that the originating
ports for notify and SOA query messages were set correctly from
configured notify-source and transfer-source options.
Evan Hunt [Thu, 2 Feb 2023 20:16:49 +0000 (12:16 -0800)]
use configured source ports for UDP requests
the optional 'port' option, when used with notify-source,
transfer-source, etc, is used to set up UDP dispatches with a
particular source port, but when the actual UDP connection was
established the port would be overridden with a random one. this
has been fixed.
(configuring source ports is deprecated in 9.20 and slated for
removal in 9.22, but should still work correctly until then.)
Evan Hunt [Fri, 3 Feb 2023 22:57:17 +0000 (14:57 -0800)]
remove /etc/bind.keys
the built-in trust anchors in named and delv are sufficient for
validation. named still needs to be able to load trust anchors from
a bind.keys file for testing purposes, but it doesn't need to be
the default behavior.
we now only load trust anchors from a file if explicitly specified
via the "bindkeys-file" option in named or the "-a" command line
argument to delv. documentation has been cleaned up to remove references
to /etc/bind.keys.
Evan Hunt [Fri, 27 Jan 2023 22:43:11 +0000 (14:43 -0800)]
delay trust anchor management until zones are loaded
it was possible for a managed trust anchor needing to send a key
refresh query to be unable to do so because an authoritative zone
was not yet loaded. this has been corrected by delaying the
synchronization of managed-keys zones until after all zones are
loaded.
Tony Finch [Fri, 3 Feb 2023 12:29:00 +0000 (12:29 +0000)]
Fix ISC_MEM_ZERO on allocators with malloc_usable_size()
ISC_MEM_ZERO requires great care to use when the space returned by
the allocator is larger than the requested space, and when memory is
reallocated. You must ensure that _every_ call to allocate or
reallocate a particular block of memory uses ISC_MEM_ZERO, to ensure
that the extra space is zeroed as expected. (When ISC_MEMFLAG_FILL
is set, the extra space will definitely be non-zero.)
When BIND is built without jemalloc, ISC_MEM_ZERO is implemented in
`jemalloc_shim.h`. This had a bug on systems that have malloc_size()
or malloc_usable_size(): memory was only zeroed up to the requested
size, not the allocated size. When an oversized allocation was
returned, and subsequently reallocated larger, memory between the
original requested size and the original allocated size could
contain unexpected nonzero junk. The realloc call does not know the
original requested size and only zeroes from the original allocated
size onwards.
After this change, `jemalloc_shim.h` always zeroes up to the
allocated size, not the requested size.
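The fixed behaviour can be illustrated with a toy allocator. Nothing below is taken from `jemalloc_shim.h`: the toy_* and sketch_* names are hypothetical, the allocator rounds every request up to 16 bytes to imitate oversized allocations, and it fills fresh memory with junk in the way ISC_MEMFLAG_FILL would.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define TOY_HEADER 16 /* room for the stored usable size */

/* Like malloc_usable_size(): the allocated size, not the requested one. */
static size_t
toy_usable_size(const void *ptr) {
	size_t usable;
	memcpy(&usable, (const char *)ptr - TOY_HEADER, sizeof(usable));
	return usable;
}

/* Toy allocator: rounds up to 16 bytes and fills new memory with junk. */
static void *
toy_alloc(size_t size) {
	size_t usable = (size + 15) & ~(size_t)15;
	char *raw = malloc(TOY_HEADER + usable);
	if (raw == NULL) {
		return NULL;
	}
	memcpy(raw, &usable, sizeof(usable));
	memset(raw + TOY_HEADER, 0xbe, usable);
	return raw + TOY_HEADER;
}

static void
toy_free(void *ptr) {
	free((char *)ptr - TOY_HEADER);
}

/* Zeroing allocation: zero up to the allocated size, not the request. */
static void *
sketch_zalloc(size_t size) {
	void *ptr = toy_alloc(size);
	if (ptr != NULL) {
		memset(ptr, 0, toy_usable_size(ptr));
	}
	return ptr;
}

/* Zeroing reallocation: only the old allocated size is known here. */
static void *
sketch_zrealloc(void *old, size_t new_size) {
	size_t old_usable = toy_usable_size(old);
	char *ptr = toy_alloc(new_size);
	if (ptr == NULL) {
		return NULL;
	}
	size_t new_usable = toy_usable_size(ptr);
	memcpy(ptr, old, old_usable < new_usable ? old_usable : new_usable);
	if (new_usable > old_usable) {
		memset(ptr + old_usable, 0, new_usable - old_usable);
	}
	toy_free(old);
	return ptr;
}
```

Because sketch_zalloc() zeroes the whole allocated block, the bytes between the requested and allocated sizes are already zero when sketch_zrealloc() copies them; zeroing only up to the requested size would leave junk in that gap.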