Remove the .key from the beginning of the line in rst file
The handling of . (dot) characted at the beginning of the line has
changed between the sphinx-doc versions, and it was constantly giving us
trouble when generating man pages when using different sphinx-doc. This
commit just changes the source rst file, so there's no more . (dot) the
beginning of the line.
Mark Andrews [Mon, 28 Sep 2020 02:54:17 +0000 (12:54 +1000)]
Always clean sig0name in msgresetsigs() and dns_message_renderreset()
The fuzzing harness operates on dns_message_t in non-standard ways
and if 'sig0name' is non-NULL when msgresetsigs() and
dns_message_renderreset() are called it should be cleaned up.
The dns_message_create() cannot fail, change the return to void
The dns_message_create() function cannot soft fail (as all memory
allocations either succeed or cause abort), so we change the function to
return void and cleanup the calls.
Diego Fronza [Mon, 21 Sep 2020 21:19:49 +0000 (18:19 -0300)]
cocci: Add semantic patch to refactor dns_message_destroy()
dns_message_t objects are now being handled using reference counting
semantics, so now dns_message_destroy() is not called directly anymore,
dns_message_detach must be called instead.
Diego Fronza [Mon, 21 Sep 2020 20:44:29 +0000 (17:44 -0300)]
Properly handling dns_message_t shared references
This commit fix the problems that arose when moving the dns_message_t
object from fetchctx_t to the query structure.
Since the lifetime of query objects are different than that of a fetchctx
and the dns_message_t object held by the query may be being used by some
external module, e.g. validator, even after the query may have been destroyed,
propery handling of the references to the message were added in this commit to
avoid accessing an already destroyed object.
Specifically, in rctx_done(), a reference to the message is attached at
the beginning of the function and detached at the end, since a possible call
to fctx_cancelquery() would release the dns_message_t object, and in the next
lines of code a call to rctx_nextserver() or rctx_chaseds() would require
a valid pointer to the same object.
In valcreate() a new reference is attached to the message object, this
ensures that if the corresponding query object is destroyed before the
validator attempts to access it, no invalid pointer access occurs.
In validated() we have to attach a new reference to the message, since
we destroy the validator object at the beginning of the function,
and we need access to the message in the next lines of the same function.
rctx_nextserver() and rctx_chaseds() functions were adapted to receive
a new parameter of dns_message_t* type, this was so they could receive a
valid reference to a dns_message_t since using the response context respctx_t
to access the message through rctx->query->rmessage could lead to an already
released reference due to the query being canceled.
Diego Fronza [Mon, 21 Sep 2020 20:32:39 +0000 (17:32 -0300)]
Fix invalid dns message state in resolver's logic
The assertion failure REQUIRE(msg->state == DNS_SECTION_ANY),
caused by calling dns_message_setclass within function resquery_response()
in resolver.c, was happening due to wrong management of dns message_t
objects used to process responses to the queries issued by the resolver.
Before the fix, a resolver's fetch context (fetchctx_t) would hold
a pointer to the message, this same reference would then be used over all
the attempts to resolve the query, trying next server, etc... for this to work
the message object would have it's state reset between each iteration, marking
it as ready for a new processing.
The problem arose in a scenario with many different forwarders configured,
managing the state of the dns_message_t object was lacking better
synchronization, which have led it to a invalid dns_message_t state in
resquery_response().
Instead of adding unnecessarily complex code to synchronize the object,
the dns_message_t object was moved from fetchctx_t structure to the
query structure, where it better belongs to, since each query will produce
a response, this way whenever a new query is created an associated
dns_messate_t is also created.
This commit deals mainly with moving the dns_message_t object from fetchctx_t
to the query structure.
Michal Nowak [Fri, 31 Jul 2020 11:10:44 +0000 (13:10 +0200)]
Do not remove $systest for out-of-tree builds
Previously, the $systest directory was being removed for out-of-tree
builds at the end of each system test. Because of that, running tests
which depend on compiled objects was breaking subsequent "make check"
invocations:
make: Target 'check' not remade because of errors.
Making all in dyndb/driver
/bin/bash: line 20: cd: dyndb/driver: No such file or directory
Making all in dlzexternal/driver
/bin/bash: line 20: cd: dlzexternal/driver: No such file or directory
Address by first removing build/test artifacts for a given test and then
removing empty directories inside (and potentially including) $systest.
Michal Nowak [Tue, 21 Jul 2020 13:54:27 +0000 (15:54 +0200)]
Add an out-of-tree system test job to GitLab CI
Make sure the new job does not get run for every pipeline as it is not
expected to break often and it is similar enough to other system test
jobs. Change the name of the variable holding the path to the
out-of-tree build directory to a more generic one.
Refactor the pausing/unpausing and finishing the nm_thread
The isc_nm_pause(), isc_nm_resume() and finishing the nm_thread() from
nm_destroy() has been refactored, so all use the netievents instead of
directly touching the worker structure members. This allows us to
remove most of the locking as the .paused and .finished members are
always accessed from the matching nm_thread.
When shutting down the nm_thread(), instead of issuing uv_stop(), we
just shutdown the .async handler, so all uv_loop_t events are properly
finished first and uv_run() ends gracefully with no outstanding active
handles in the loop.
Michał Kępień [Mon, 28 Sep 2020 07:09:21 +0000 (09:09 +0200)]
Fix function overrides in unit tests on macOS
Since Mac OS X 10.1, Mach-O object files are by default built with a
so-called two-level namespace which prevents symbol lookups in BIND unit
tests that attempt to override the implementations of certain library
functions from working as intended. This feature can be disabled by
passing the "-flat_namespace" flag to the linker. Fix unit tests
affected by this issue on macOS by adding "-flat_namespace" to LDFLAGS
used for building all object files on that operating system (it is not
enough to only set that flag for the unit test executables).
Michał Kępień [Mon, 28 Sep 2020 07:09:21 +0000 (09:09 +0200)]
Drop function wrapping as it is redundant for now
As currently used in the BIND source tree, the --wrap linker option is
redundant because:
- static builds are no longer supported,
- there is no need to wrap around existing functions - what is
actually required (at least for now) is to replace them altogether
in unit tests,
- only functions exposed by shared libraries linked into unit test
binaries are currently being replaced.
Given the above, providing the alternative implementations of functions
to be overridden in lib/ns/tests/nstest.c is a much simpler alternative
to using the --wrap linker option. Drop the code detecting support for
the latter from configure.ac, simplify the relevant Makefile.am, and
remove lib/ns/tests/wrap.c, updating lib/ns/tests/nstest.c accordingly
(it is harmless for unit tests which are not calling the overridden
functions).
Mark Andrews [Fri, 25 Sep 2020 07:42:41 +0000 (17:42 +1000)]
Wait for 'rpz: policy: reload done' to signalled before proceeding.
RPZ rules cannot be fully relied upon until the summary RPZ database is
updated after an "rndc reload". Wait until the relevant message is
logged after an "rndc reload" to prevent false positives in the
"rpzrecurse" system test caused by the RPZ rules not yet being in effect
by the time ns3 is queried.
Evan Hunt [Wed, 22 May 2019 08:58:41 +0000 (10:58 +0200)]
Purge memory pool upon plugin destruction
The typical sequence of events for AAAA queries which trigger recursion
for an A RRset at the same name is as follows:
1. Original query context is created.
2. An AAAA RRset is found in cache.
3. Client-specific data is allocated from the filter-aaaa memory pool.
4. Recursion is triggered for an A RRset.
5. Original query context is torn down.
6. Recursion for an A RRset completes.
7. A second query context is created.
8. Client-specific data is retrieved from the filter-aaaa memory pool.
9. The response to be sent is processed according to configuration.
10. The response is sent.
11. Client-specific data is returned to the filter-aaaa memory pool.
12. The second query context is torn down.
However, steps 6-12 are not executed if recursion for an A RRset is
canceled. Thus, if named is in the process of recursing for A RRsets
when a shutdown is requested, the filter-aaaa memory pool will have
outstanding allocations which will never get released. This in turn
leads to a crash since every memory pool must not have any outstanding
allocations by the time isc_mempool_destroy() is called.
Fix by creating a stub query context whenever fetch_callback() is called,
including cancellation events. When the qctx is destroyed, it will ensure
the client is detached and the plugin memory is freed.
Matthijs Mekking [Thu, 13 Aug 2020 05:47:27 +0000 (07:47 +0200)]
Include expired rdatasets in iteration functions
By changing the check in 'rdatasetiter_first' and 'rdatasetiter_next'
from "now > header->rdh_ttl" to "now - RBDTB_VIRTUAL > header->rdh_ttl"
we include expired rdataset entries so that they can be used for
"rndc dumpdb -expired".
Matthijs Mekking [Thu, 13 Aug 2020 05:58:42 +0000 (07:58 +0200)]
Minor changes to serve-stale tests
Minor changes are:
- Replace the "$RNDCCMD dumpdb" logic with "rndc_dumpdb" from
conf.sh.common (it does the same thing).
- Update a comment to match the grep calls below it (comment said the
rest should be expired, while the grep calls indicate that they
are still in the cache, the comment now explains why).
Mark Andrews [Fri, 18 Sep 2020 05:00:35 +0000 (15:00 +1000)]
Clone the saved / query message buffers
The message buffer passed to ns__client_request is only valid for
the life of the the ns__client_request call. Save a copy of it
when we recurse or process a update as ns__client_request will
return before those operations complete.
Mark Andrews [Tue, 22 Sep 2020 05:22:34 +0000 (15:22 +1000)]
Break lock order loop by sending TAT in an event
The dotat() function has been changed to send the TAT
query asynchronously, so there's no lock order loop
because we initialize the data first and then we schedule
the TAT send to happen asynchronously.
This breaks following lock-order loops:
zone->lock (dns_zone_setviewcommit) while holding view->lock
(dns_view_setviewcommit)
keytable->lock (dns_keytable_find) while holding zone->lock
(zone_asyncload)
view->lock (dns_view_findzonecut) while holding keytable->lock
(dns_keytable_forall)
As the query_prefetch() or query_rpzfetch() could be called during
"regular" fetch, we need to introduce separate storage for attaching
the nmhandle during prefetching the records. The query_prefetch()
and query_rpzfetch() are guarded for re-entrance by .query.prefetch
member of ns_client_t, so we can reuse the same .prefetchhandle for
both.
Michał Kępień [Tue, 22 Sep 2020 06:40:04 +0000 (08:40 +0200)]
Minimize stdout logging in pairwise testing jobs
The size of the log generated by each GitLab CI job is limited to 4 MB
by default. While this limit is configurable, it makes little sense to
print build logs to standard output if they are being captured to files
anyway. Limit use of "tee" in util/pairwise-testing.sh to printing the
combination of configure switches used for a given build. This way the
job should never exceed the default 4 MB log size limit, yet it will
still indicate its progress in a concise way.
Michał Kępień [Mon, 21 Sep 2020 10:08:50 +0000 (12:08 +0200)]
Extend the artifact list for the "pairwise" CI job
Include the log file generated by the last failed pairwise build among
the artifacts produced by the "pairwise" GitLab CI job in order to
simplify troubleshooting pairwise build failures.
Michal Nowak [Wed, 1 Jul 2020 08:29:36 +0000 (10:29 +0200)]
Add pairwise testing
Pairwise testing is a test case generation technique based on the
observation that most faults are caused by interactions of at most two
factors. For BIND, its configure options can be thought of as such
factors.
Process BIND configure options into a model that is subsequently
processed by the PICT tool in order to find an effective test vector.
That test vector is then used for configuring and building BIND using
various combinations of configure options.
Mark Andrews [Mon, 21 Sep 2020 04:52:53 +0000 (14:52 +1000)]
Remove the memmove call on dns_rbtnode_t structure that contains atomics
Calling the plain memmove on the structure that contains atomic members
triggers following TSAN warning (even when we don't really use the
atomic members in the code):
WARNING: ThreadSanitizer: data race
Read of size 8 at 0x000000000001 by thread T1 (mutexes: write M1, write M2):
#0 memmove <null>
#1 memmove /usr/include/x86_64-linux-gnu/bits/string_fortified.h:40:10
#2 deletefromlevel lib/dns/rbt.c:2675:3
#3 dns_rbt_deletenode lib/dns/rbt.c:2143:2
#4 delete_node lib/dns/rbtdb.c
#5 decrement_reference lib/dns/rbtdb.c:2202:4
#6 prune_tree lib/dns/rbtdb.c:2259:3
#7 dispatch lib/isc/task.c:1152:7
#8 run lib/isc/task.c:1344:2
In the new netmgr code, the .listener member was mostly functionally
only duplicating the .exiting member and was unneeded. This also
resolves following ThreadSanitizer (harmless) warning:
WARNING: ThreadSanitizer: data race
Write of size 1 at 0x000000000001 by thread T1:
#0 control_senddone bin/named/controlconf.c:257:22
#1 tcp_send_cb lib/isc/netmgr/tcp.c:1027:2
#2 uv__write_callbacks /home/ondrej/Projects/tsan/libuv/src/unix/stream.c:953:7
#3 uv__stream_io /home/ondrej/Projects/tsan/libuv/src/unix/stream.c:1330:5
#4 uv__run_pending /home/ondrej/Projects/tsan/libuv/src/unix/core.c:812:5
#5 uv_run /home/ondrej/Projects/tsan/libuv/src/unix/core.c:377:19
#6 nm_thread lib/isc/netmgr/netmgr.c:500:11
Michał Kępień [Mon, 21 Sep 2020 07:28:36 +0000 (09:28 +0200)]
Fix updating summary RPZ DB for mixed-case RPZs
Each dns_rpz_zone_t structure keeps a hash table of the names this RPZ
database contains. Here is what happens when an RPZ is updated:
- a new hash table is prepared for the new version of the RPZ by
iterating over it; each name found is added to the summary RPZ
database,
- every name added to the new hash table is searched for in the old
hash table; if found, it is removed from the old hash table,
- the old hash table is iterated over; all names found in it are
removed from the summary RPZ database (because at that point the old
hash table should only contain names which are not present in the
new version of the RPZ),
- the new hash table replaces the old hash table.
When the new version of the RPZ is iterated over, if a given name is
spelled using a different letter case than in the old version of the
RPZ, the new variant will hash to a different value than the old
variant, which means it will not be removed from the old hash table.
When the old hash table is subsequently iterated over to remove
seemingly deleted names, the old variant of the name will still be
there, causing the name to be deleted from the summary RPZ database
(which effectively causes a given rule to be ignored).
The issue can be triggered not just by altering the case of existing
names in an RPZ, but also by adding sibling names spelled with a
different letter case. This is because RBT code preserves case when
node splitting occurs. The end result is that when the RPZ is iterated
over, a given name may be using a different case than in the zone file
(or XFR contents).
Fix by downcasing all names found in the RPZ database before adding them
to the summary RPZ database.
Michał Kępień [Thu, 17 Sep 2020 13:39:11 +0000 (15:39 +0200)]
Update release checklist
Add an item to the release checklist to make sure eligible customers get
notified about the location of the latest version of BIND Subscription
Edition upon its release.