Keeping the information about lame servers in the ADB was done in !322
to fix the following security issue:
[CVE-2021-25219] Disable "lame-ttl" cache
The handling of lame servers needs to be redesigned and is not going
to be enabled any time soon; the current code is just dead code that
takes up space and stands in the way of making the ADB work faster.
Remove all the internals needed for handling the lame servers in the ADB
for now. It might get reintroduced later if and when we redesign ADB.
now that we're using qpmulti for the summary database, we
no longer need to hold search_lock for it. we do still need
it for the radix tree and the trigger counts.
depending on how the QP trie is traversed during a lookup, it is
possible for a search to terminate on a leaf which is a partial
match, without that leaf being added to the chain. to ensure the
chain is correct in this case, when a partial match condition is
detected via qpkey_compare(), we will call add_link() again, just
in case. (add_link() will check for a duplicated node, so it will
be harmless if it was already done.)
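The duplicate check described above can be sketched as follows. This is a minimal Python model, not the C code in qp.c: the list-based chain and this add_link() signature are assumptions made purely for illustration.

```python
def add_link(chain, node):
    """Append node to the chain unless it is already the last entry.

    Hypothetical stand-in for the C helper: returns True if the node
    was added and False if it was a duplicate.
    """
    if chain and chain[-1] is node:
        return False  # already linked, so the repeated call is harmless
    chain.append(node)
    return True


chain = []
leaf = {"name": "example.com."}
first = add_link(chain, leaf)   # added during the normal traversal
second = add_link(chain, leaf)  # repeated after a partial match is detected
```

Because the second call is a no-op, calling it "just in case" on a partial match cannot corrupt the chain.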
Ondřej Surý [Fri, 6 Oct 2023 08:30:21 +0000 (10:30 +0200)]
Add base testing set of names for load-names benchmark
This was generated from a dnsperf queryfile with the following script:
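#!/usr/bin/env python3

import sys

# Names already seen, used to skip duplicates.
names = {}
i = 0
for line in iter(sys.stdin.readline, ''):
    name = line.rstrip('\n')
    if name not in names:
        names[name] = line
        # Emit "index,name" for each unique name.
        print(f"{i},{name}")
        i += 1
    # Stop after 1024*1024 unique names.
    if i >= 1024*1024:
        break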
Ondřej Surý [Fri, 6 Oct 2023 08:28:32 +0000 (10:28 +0200)]
Fix hashmap part of load-names benchmark
The name_match() function was erroneously converting struct item into a
dns_name pointer. Cast void *node to struct item * first and then use
item.fixed.name to pass the name to the dns_name_equal() function.
Ondřej Surý [Fri, 6 Oct 2023 08:06:36 +0000 (10:06 +0200)]
Use read number of items instead of raw array size in load_names
The load_names benchmark expected the input CSV with domains to fill
the whole item array, and it crashed when the file had fewer lines
than that.
Fix the expectations by using the real number of lines read to calculate
the array start and end position for each benchmark thread.
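The idea of deriving each thread's slice from the number of lines actually read can be sketched as below. This is an illustrative Python model only; the function name thread_ranges() and the even split are assumptions, not the benchmark's C code.

```python
def thread_ranges(nitems, nthreads):
    """Return (start, end) index pairs that together cover exactly
    nitems entries, split as evenly as possible across nthreads."""
    ranges = []
    for t in range(nthreads):
        start = nitems * t // nthreads
        end = nitems * (t + 1) // nthreads
        ranges.append((start, end))
    return ranges


# Only 10 lines were read, even though the array could hold far more.
lines_read = 10
ranges = thread_ranges(lines_read, 4)
```

No range extends past lines_read, so no thread touches array slots that were never filled.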
Michał Kępień [Fri, 6 Oct 2023 11:07:55 +0000 (13:07 +0200)]
Move Linux "stress" tests to autoscaled instances
The autoscaling GitLab CI runners currently used for most GitLab CI jobs
spin up AWS EC2 instances that are at least as powerful as the dedicated
instances used for running "stress" tests. Move all Linux-based
"stress" tests to autoscaling GitLab CI runners to enable deprovisioning
Linux AWS instances reserved for running "stress" tests. Leave FreeBSD
"stress" tests intact as there is currently no support for autoscaling
BSD instances.
Michal Nowak [Wed, 16 Aug 2023 14:19:30 +0000 (16:19 +0200)]
Report hung system tests
When a test stops responding, especially in the CI, determining which
test is responsible can be
difficult. Fortunately, when running tests with the pytest runner,
pytest sets the PYTEST_CURRENT_TEST environment variable to the current
test nodeid and stage. Afterward, the variable can be examined to
identify the test that has stopped responding.
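As a rough illustration, the value of PYTEST_CURRENT_TEST has the form "nodeid (stage)" and can be split as shown below. The helper name current_test() is hypothetical, and how the monitoring script obtains the variable from the running pytest process is not shown in this commit message, so the sketch simply takes a mapping.

```python
import os


def current_test(environ=os.environ):
    """Split PYTEST_CURRENT_TEST into (nodeid, stage).

    Returns None when the variable is unset or empty.
    """
    value = environ.get("PYTEST_CURRENT_TEST")
    if not value:
        return None
    # The stage follows the last space, e.g. "(setup)", "(call)".
    nodeid, _, stage = value.rpartition(" ")
    if not nodeid:
        # No space found: treat the whole value as the nodeid.
        return value, None
    return nodeid, stage.strip("()")
```

For example, passing {"PYTEST_CURRENT_TEST": "a/tests_x.py::test_y (call)"} yields the node ID and the stage "call".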
The monitoring script needs to be started in the background. However,
the shell executor used for BSD and FIPS testing can't handle the
background process cleanly: the script step waits until the background
process exits (currently after 3000 seconds). Therefore, run the
monitoring script only when the Docker executor is used, where this is
not a problem.
Move the block on the error path, where the link is checked, to a place
where it makes sense, to avoid accessing an uninitialized link when
jumping to the 'cleanup_query' label from 4 different places. The link
is initialized only after those jumps happen.
In addition, initialize the link when creating the object, to avoid
similar errors.
get predecessor name in dns_qp_findname_ancestor()
dns_qp_findname_ancestor() now takes an optional 'predecessor'
parameter, which if non-NULL is updated to contain the DNSSEC
predecessor of the name searched for. this is done by constructing
an iterator stack while carrying out the search, so it can be used
to step backward if needed.
since dns_qp_findname_ancestor() can now return a chain object, it is no
longer necessary to provide a _NOEXACT search option. if we want to look
up the closest ancestor of a name, we can just do a normal search, and
if successful, retrieve the second-to-last node from the QP chain.
this makes ancestor lookups slightly more complicated for the caller,
but allows us to simplify the code in dns_qp_findname_ancestor(), making
it easier to ensure correctness. this was a fairly rare use case:
outside of unit tests, DNS_QPFIND_NOEXACT was only used in the zone
table, which has now been updated to use the QP chain. the equivalent
RBT feature is only used by the resolver for cache lookups of 'atparent'
types (i.e., DS records).
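The caller-side ancestor lookup described above amounts to taking the second-to-last entry of the chain after a successful search. A minimal Python model follows; the plain list stands in for the QP chain object and closest_ancestor() is a hypothetical helper, not part of the library.

```python
def closest_ancestor(chain):
    """Return the second-to-last node of a chain whose last node is
    the exact match, or None if the match has no populated ancestor."""
    if len(chain) < 2:
        return None
    return chain[-2]


# Chain of populated nodes from the tree root down to the name found.
chain = ["com.", "example.com.", "www.example.com."]
ancestor = closest_ancestor(chain)
```

The extra length check is the "slightly more complicated" part that moves from the library to the caller.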
- make iterators reversible: refactor dns_qpiter_next() and add a new
dns_qpiter_prev() function to support iterating both forwards and
backwards through a QP trie.
- added a 'name' parameter to dns_qpiter_next() (as well as _prev())
to make it easier to retrieve the nodename while iterating, without
having to construct it from pointer value data.
- the helper functions for accessing twigs beneath a branch
(branch_twig_pos(), branch_twig_ptr(), etc) were somewhat confusing
to read, since several of them were implemented by calling other
helper functions. they now all show what they're really doing.
- branch_twigs_vector() has been renamed to simply branch_twigs().
- revised some unrelated comments in qp_p.h for clarity.
dns_qp_findname_ancestor() now takes an optional 'chain' parameter;
if set, the dns_qpchain object it points to will be updated with an
array of pointers to the populated nodes between the tree root and the
requested name. the number of nodes in the chain can then be accessed
using dns_qpchain_length() and the individual nodes using
dns_qpchain_node().
modify dns_qp_findname_ancestor() to return found name
add a 'foundname' parameter to dns_qp_findname_ancestor(),
and use it to set the found name in dns_nametree.
this required adding a dns_qpkey_toname() function; that was
done by moving qp_test_keytoname() from the test library to qp.c.
added some more test cases and fixed bugs with the handling of
relative and empty names.
this loads a file containing DNS names and measures the time it takes to:
1) iterate it,
2) look up each name with dns_qp_getname()
3) look up each name with dns_qp_findname_ancestor()
4) look up a modified name based on the name, to check performance
when the name is not found.
the refactoring of isc_job_run() and isc_async_run() in 9.19.12
interfered with the way the qpmulti benchmark uses uv_idle.
it has now been modified to use isc_job/isc_async instead.
When the given zone is not associated with a zone manager, the function
currently returns ISC_R_NOTFOUND, which is documented as the return
value for the case in which no incoming zone transfer is found. Make
the function return ISC_R_FAILURE in such a case instead.
Also update the description of the function as the value it returns is
not meant to indicate whether an ongoing incoming transfer for the given
zone exists. The boolean variables that the function sets via the
pointers provided as its parameters, combined with either keeping
'*xfrp' set to NULL or updating it to a valid pointer, can be used by
the caller to infer all the necessary information.
The TRY0 macro doesn't set the 'result' variable, so the error
log message is never printed. Remove the 'result' variable and
modify the function's control flow to be similar to the
zone_xmlrender() function, with a separate error returning path.
Work around compiler bug that optimizes away setting .free_pools
The .free_pools bitfield would not be set at some optimization
levels; work around the compiler bug by reordering the setting of
.free_pools in the initializer.
The structure member is populated only moments before its
destruction, and is not used anywhere, except for the
destructor. Use a local variable instead.
The 'end_serial' and some other members of the 'dns_xfrin_t'
structure can be accessed by the statistics channel, causing
a data race with the zone transfer process.
Use the existing 'statslock' mutex for protecting those members.
Mark Andrews [Tue, 19 Sep 2023 04:06:15 +0000 (14:06 +1000)]
Wait for the test zone to finish re-loading
'rndc thaw' initiates asynchronous loading of all the zones
similar to 'rndc load'. Wait for the test zone's load to
complete before testing that it is updatable again.
Change dns_message_create() function to accept memory pools
Instead of creating new memory pools for each new dns_message, change
dns_message_create() method to optionally accept externally created
dns_fixedname_t and dns_rdataset_t memory pools. This allows us to
preallocate the memory pools in ns_client and dns_resolver units for the
lifetime of dns_resolver_t and ns_clientmgr_t.
Fix the incoming transfers' "Needs Refresh" state in stats channel
The "Needs Refresh" flag is exposed in two places in the statistics
channel. First, there is a state called "Needs Refresh", shown when the
process hasn't started yet but the zone needs a refresh. Second, there
is a field called "Additional Refresh Queued", shown when the process
is ongoing but another refresh is queued for the same zone.
The DNS_ZONEFLG_NEEDREFRESH flag, however, is set only when there is
an ongoing zone transfer and a new notify is received. That is, the
flag is not set for the first case above.
In order to fix the issue, use the DNS_ZONEFLG_NEEDREFRESH flag only
when the zone transfer is running; otherwise, decide whether a zone
needs a refresh using its refresh and expire times.
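The decision described above can be modelled roughly as follows. This is a hedged Python sketch: the function and parameter names are hypothetical, and the exact time comparison is an assumption, since the commit message only says that the refresh and expire times are used.

```python
def needs_refresh(xfr_running, needrefresh_flag, now, refresh_time, expire_time):
    """Decide whether a zone should be reported as needing a refresh."""
    if xfr_running:
        # While a transfer is running, the flag is meaningful: it is
        # set when a new notify arrives during the transfer.
        return needrefresh_flag
    # Assumption: with no transfer running, the zone needs a refresh
    # once its refresh time or its expire time has passed.
    return now >= refresh_time or now >= expire_time
```

With no transfer running, the flag is ignored entirely, which matches the observation that it is never set in that case.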
xfrin: rename XFRST_INITIALSOA to XFRST_ZONEXFRREQUEST
The XFRST_INITIALSOA state in the xfrin module is so named because
the first RR in a zone transfer must be SOA. However, the name of
the state is a bit confusing (especially when exposed to users
through the statistics channel), because it can be mistaken for
the refresh SOA request step, which takes place before the zone
transfer starts.
Rename the state to XFRST_ZONEXFRREQUEST (i.e. Zone Transfer Request).
During that step the state machine performs several operations -
establishing a connection, sending a request, and receiving/parsing
the first RR in the answer.
Show the local and remote addresses for the "Refresh SOA" query
Currently in the statistics channel's incoming zone transfers list
the local and remote addresses are shown only when the zone transfer
is already running. Since we have now introduced the "Refresh SOA"
state, which shows the state of the SOA query before the zone transfer
is started, this commit implements a feature to show the local and
remote addresses for the SOA query, when the state is "Refresh SOA".
Improve the "Duration (s)" field of the incoming xfers in stats channel
Improve the "Duration (s)" field, so that it can show the duration of
all the major states of an incoming zone transfer process, while they
are taking place. In particular, it will now show the duration of the
"Pending", "Refresh SOA" and "Deferred" states too, before the actual
zone transfer starts.
Add the "Refresh SOA" state for the incoming zone transfers
With this state added to the statistics channel, a zone transfer is
now shown in this state instead of as "Pending" when the zone.c
module is performing a refresh SOA request, before actually
starting the transfer process. This helps to determine whether the
process is waiting because of the rate limiter (i.e. "Pending"), or
has passed the rate limiter and is now waiting for the refresh SOA
query to complete or time out.
Check zone transfer transports in the statistics channel
Add two more secondary zones to ns3 to be transferred from ns1,
using its IPv6 address, for which 'tcp-only' is set to 'yes'.
Check the statistics channel's incoming zone transfers information
to confirm that the expected transports were used for each of the
SOA query cases (UDP, TCP, TLS), and also for zone transfers (TCP,
TLS).
Aram Sargsyan [Wed, 23 Aug 2023 10:46:44 +0000 (10:46 +0000)]
Expose the SOA query transport type used before/during XFR
Add a new field in the incoming zone transfers section of the
statistics channel to show the transport used for the SOA request.
When the transfer is started beginning from the XFRST_SOAQUERY state,
it means that the SOA query will be performed by xfrin itself, using
the same transport. Otherwise, it means that the SOA query was already
performed by other means (e.g. by zone.c:soa_query()), and, in that
case, we use the SOA query transport type information passed by the
'soa_transport_type' argument, when the xfrin object was created.
Mark Andrews [Fri, 11 Aug 2023 03:28:05 +0000 (13:28 +1000)]
Wait for slow zone transfer to complete before ending test
This allows the statistics channel to be viewed in a browser while
the transfer is in progress. Also set the transfer format to
one-answer to extend the amount of time the re-transfer takes.
When running the statschannel test on its own, use
<http://10.53.0.3:5304/xml/v3/xfrins> to see the output.
Mark Andrews [Thu, 6 Jul 2023 04:00:48 +0000 (14:00 +1000)]
Provide thread safe access to dns_xfrin_t state
dns_xfrin_t state may be accessed from different threads when
reporting transfer state. Ensure access is thread safe by
using atomics and locks where appropriate.
Aram Sargsyan [Tue, 30 May 2023 15:00:33 +0000 (15:00 +0000)]
Add a test case for checking zone transfers in statschannel
Use the named -T transferslowly test option to slow down a zone
transfer from the primary server, and test that it's correctly
exposed in the statistics channel of the secondary server, while
it's in-progress.
Aram Sargsyan [Tue, 30 May 2023 14:32:02 +0000 (14:32 +0000)]
dns_transport: use const arguments in getters when possible
In some dns_transport getter functions it's possible to use a
const dns_transport_t as the first argument instead of just
dns_transport_t. Convert the function prototypes to use const.
Explicitly cast chars to unsigned chars for <ctype.h> functions
Apply the semantic patch to catch all the places where we pass 'char' to
the <ctype.h> family of functions (isalpha() and friends, toupper(),
tolower()).
Add semantic patch to explicitly cast chars to unsigned for ctype.h
Add a semantic patch to catch all the places where we pass 'char' to the
<ctype.h> family of functions (isalpha() and friends, toupper(),
tolower()). While it generally works because of the way these
functions are implemented in libc, it's safer to do the explicit
cast.
Michal Nowak [Thu, 31 Aug 2023 16:55:36 +0000 (18:55 +0200)]
Add a Sphinx role for linking CVEs to the ISC Knowledgebase
The new :cve: Sphinx role takes a CVE number as an argument and creates
a hyperlink to the relevant ISC Knowledgebase document that might have
more up-to-date or verbose information than the relevant release note.
This makes reaching ISC Knowledgebase pages directly from the release
notes easier.
Make all CVE references in the release notes use the new Sphinx role.
Use the new isc_sockaddr_hash_ex() to fix QID table hashing
The QID table hashing used a custom merging of the sockaddr, port and id
into a single hash value. Normalize the QID table hashing function to
use the isc_hash32 API for all the values.