Aram Sargsyan [Fri, 27 Jan 2023 08:47:52 +0000 (08:47 +0000)]
catz: unregister the db update-notify callback before detaching from db
When detaching from the previous version of the database, make sure
that the update-notify callback is unregistered, otherwise there is
an INSIST check which can generate an assertion failure in free_rbtdb(),
which checks that there are no outstanding update listeners in the list.
Aram Sargsyan [Thu, 26 Jan 2023 19:08:19 +0000 (19:08 +0000)]
Process db callbacks in zone_loaddone() after zone_postload()
The zone_postload() function can fail and unregister the callbacks.
Call dns_db_endload() only after calling zone_postload() to make
sure that the registered update-notify callbacks are not called
when the zone loading has failed during zone_postload().
Also, don't ignore the return value of zone_postload().
Aram Sargsyan [Fri, 27 Jan 2023 09:22:11 +0000 (09:22 +0000)]
Add a system test for [GL #3777]
Add the 'ixfr-from-differences yes;' option to trigger a failed
zone postload operation when a zone is updated but the serial
number is not updated, then issue two successive 'rndc reload'
commands to trigger the bug, which causes an assertion failure.
Ondřej Surý [Thu, 23 Feb 2023 10:10:39 +0000 (11:10 +0100)]
Pause the catz dbiterator while processing the zone
The dbiterator read-locks the whole zone and it stayed locked during
whole processing time when catz is being read. Pause the iterator, so
the updates to catz zone are not being blocked while processing the catz
update.
Ondřej Surý [Fri, 24 Feb 2023 09:46:00 +0000 (09:46 +0000)]
Unlock catzs during dns__catz_update_cb()
Instead of holding the catzs->lock the whole time we process the catz
update, only hold it for hash table lookup and then release it. This
should unblock any other threads that might be processing updates to
catzs triggered by extra incoming transfer.
Aram Sargsyan [Tue, 21 Feb 2023 14:23:38 +0000 (14:23 +0000)]
Offload catalog zone updates
Offload catalog zone processing so that the network manager threads
are not interrupted by a large catalog zone update.
Introduce a new 'updaterunning' state alongside with 'updatepending',
like it is done in the RPZ module.
Note that the dns__catz_update_cb() function currently holds the
catzs->lock during the whole process, which is far from being optimal,
but the issue is going to be addressed separately.
Aram Sargsyan [Tue, 21 Feb 2023 21:14:44 +0000 (21:14 +0000)]
Add shutdown signaling for catalog zones
This change should make sure that catalog zone update processing
doesn't happen when the catalog zone is being shut down. This
should help avoid races when offloading the catalog zone updates
in the follow-up commit.
Aram Sargsyan [Thu, 23 Feb 2023 14:43:41 +0000 (14:43 +0000)]
Call dns_catz_new_zones() only when it is needed
The configure_catz() function creates the catalog zones structure
for the view even when it is not needed, in which case it then
discards it (by detaching) later.
Instead, call dns_catz_new_zones() only when it is needed, i.e. when
there is no existing "previous" view with an existing 'catzs', that
is going to be reused.
Aram Sargsyan [Thu, 9 Feb 2023 10:32:33 +0000 (10:32 +0000)]
Light refactoring of catz.c
* Change 'dns_catz_new_zones()' function's prototype (the order of the
arguments) to synchronize it with the similar function in rpz.c.
* Rename 'refs' to 'references' in preparation of ISC_REFCOUNT_*
macros usage for reference tracking.
* Unify dns_catz_zone_t naming to catz, and dns_catz_zones_t naming to
catzs, following the logic of similar changes in rpz.c.
* Use C compound literals for structure initialization.
* Synchronize the "new zone version came too soon" log message with the
one in rpz.c.
* Use more of 'sizeof(*ptr)' style instead of the 'sizeof(type_t)' style
expressions when allocating or freeing memory for 'ptr'.
Tony Finch [Fri, 16 Dec 2022 12:19:08 +0000 (12:19 +0000)]
Move irs_resconf into libdns and remove libirs
`libirs` used to be a reference implementation of `getaddrinfo` and
related modern resolver APIs. It was stripped down in BIND 9.18
leaving only the `irs_resconf` module, which parses
`/etc/resolv.conf`. I have kept its include path and namespace prefix,
so it remains a little fragment of libirs now embedded in libdns.
Ondřej Surý [Fri, 24 Feb 2023 07:41:51 +0000 (08:41 +0100)]
Add SonarCloud GitHub Action
Add new SonarCloud GitHub Action and configuration; something (maybe
the way the builds were submitted) has apparently changed and the
project got deleted and the analysis wasn't working.
Evan Hunt [Wed, 22 Feb 2023 19:04:12 +0000 (11:04 -0800)]
fix a bug in dns_dispatch_getnext()
when a message arrives over a TCP connection matching an expected
QID, the dispatch is updated so it no longer expects that QID,
but continues reading. subsequent messages with the same QID are
ignored, unless the dispatch entry has called dns_dispatch_getnext()
or dns_dispatch_resume().
however, a coding error caused those functions to have no effect
when the dispatch was reading, so streams of messages with the same
QID could not be received over a single TCP connection, breaking *XFR.
this has been corrected by changing the order of operations in
tcp_dispatch_getnext() so that disp->reading isn't checked until
after the dispatch entry has been reactivated.
Evan Hunt [Wed, 22 Feb 2023 04:14:30 +0000 (20:14 -0800)]
refactor dns_xfrin to use dns_dispatch
the dns_xfrin module was still using the network manager directly to
manage TCP connections and send and receive messages. this commit
changes it to use the dispatch manager instead.
Evan Hunt [Thu, 23 Feb 2023 18:29:33 +0000 (10:29 -0800)]
implement refcount tracing in xfrin.c
use ISC_REFCOUNT_IMPL for dns_xfrin_ctx_t (which has been renamed
to dns_xfrin_t to keep the function names dns_xfrin_attach() and
dns_xfrin_detach() unchanged).
Evan Hunt [Wed, 22 Feb 2023 00:34:41 +0000 (16:34 -0800)]
move dispatchmgr from resolver to view
the 'dispatchmgr' member of the resolver object is used by both
the dns_resolver and dns_request modules, and may in the future
be used by others such as dns_xfrin. it doesn't make sense for it
to live in the resolver object; this commit moves it into dns_view.
Tony Finch [Thu, 29 Dec 2022 19:18:00 +0000 (19:18 +0000)]
QSBR: safe memory reclamation for lock-free data structures
This "quiescent state based reclamation" module provides support for
the qp-trie module in dns/qp. It is a replacement for liburcu, written
without reference to the urcu source code, and in fact it works in a
significantly different way.
A few specifics of BIND make this variant of QSBR somewhat simpler:
* We can require that wait-free access to a qp-trie only happens in
an isc_loop callback. The loop provides a natural quiescent state,
after the callbacks are done, when no qp-trie access occurs.
* We can dispense with any API like rcu_synchronize(). In practice,
it takes far too long to wait for a grace period to elapse for each
write to a data structure.
* We use the idea of "phases" (aka epochs or eras) from EBR to
reduce the amount of bookkeeping needed to track memory that is no
longer needed, knowing that the qp-trie does most of that work
already.
I considered hazard pointers for safe memory reclamation. They have
more read-side overhead (updating the hazard pointers) and it wasn't
clear to me how to nicely schedule the cleanup work. Another
alternative, epoch-based reclamation, is designed for fine-grained
lock-free updates, so it needs some rethinking to work well with the
heavily read-biased design of the qp-trie. QSBR has the fastest read
side of the basic SMR algorithms (with no barriers), and fits well
into a libuv loop. More recent hybrid SMR algorithms do not appear to
have enough benefits to justify the extra complexity.
Evan Hunt [Tue, 21 Feb 2023 22:39:43 +0000 (14:39 -0800)]
remove isc_glob
the isc_glob module was originally needed to support posix-style glob
processing on Windows, but is now just an unnecessary wrapper around
glob(3). this commit removes it.
Evan Hunt [Tue, 21 Feb 2023 22:31:58 +0000 (14:31 -0800)]
fix a crash from using an empty string for "include"
the parser could crash when "include" specified an empty string in place
of the filename. this has been fixed by returning ISC_R_FILENOTFOUND
when the string length is 0.
Ondřej Surý [Mon, 20 Feb 2023 15:16:07 +0000 (16:16 +0100)]
Use atomic stack for async job queue
Previously, the async job queue would use a locked-list (ISC_LIST).
With introduction of atomic stack (that has to be drained at once), we
could use it to remove some contention between the threads and simplify
the async queue.
Fortunately, the reverse order still works for us - instead of append
and tail/prev operation on the list, we are now using prepend and
head/next operation on the atomic stack.
Tony Finch [Mon, 2 Jan 2023 14:49:52 +0000 (14:49 +0000)]
Simple lock-free stack in <isc/stack.h>
Add a singly-linked stack that supports lock-free prepend and drain (to
empty the list and clean up its elements). Intended for use with QSBR
to collect objects that need safe memory reclamation, or any other user
that works with adding objects to the stack and then draining them in
one go like various work queues.
In <isc/atomic.h>, add an `atomic_ptr()` macro to make type
declarations a little less abominable, and clean up a duplicate
definition of `atomic_compare_exchange_strong_acq_rel()`
Aram Sargsyan [Fri, 11 Nov 2022 14:44:26 +0000 (14:44 +0000)]
Add tests for CVE-2022-3924
Reproduce the assertion by configuring a 'named' resolver with
'recursive-clients 10;' configuration option and running 20
queries is parallel.
Also tweak the 'ans2/ans.pl' to simulate a 50ms network latency
when qname starts with "latency". This makes sure that queries
running in parallel don't get served immediately, thus allowing
the configured recursive clients quota limitation to be activated.
Evan Hunt [Tue, 21 Feb 2023 20:21:32 +0000 (12:21 -0800)]
remove references to obsolete isc_task/timer functions
removed references in code comments, doc/dev documentation, etc, to
isc_task, isc_timer_reset(), and isc_timertype_inactive. also removed a
coccinelle patch related to isc_timer_reset() that was no longer needed.
Evan Hunt [Fri, 10 Feb 2023 07:21:57 +0000 (23:21 -0800)]
make builtin a standalone dns_db implementation
instead of using the SDB API as a wrapper to register and
unregister and provide a call framework for the builtin databases,
this commit flattens things so that the builtin databases implement
dns_db directly.
Evan Hunt [Fri, 17 Feb 2023 21:04:12 +0000 (13:04 -0800)]
enable detailed db tracing
move database attach/detach functions to db.c, instead of
requiring them to be implemented for every database type.
instead, they must implement a 'destroy' function that is
called when references go to zero.
this enables us to use ISC_REFCOUNT_IMPL for databases,
with detailed tracing enabled by setting DNS_DB_TRACE to 1.
Evan Hunt [Fri, 17 Feb 2023 20:04:32 +0000 (12:04 -0800)]
simplify dns_sdb API
SDB is currently (and foreseeably) only used by the named
builtin databases, so it only needs as much of its API as
those databases use.
- removed three flags defined for the SDB API that were always
set the same by builtin databases.
- there were two different types of lookup functions defined for
SDB, using slightly different function signatures. since backward
compatibility is no longer a concern, we can eliminate the 'lookup'
entry point and rename 'lookup2' to 'lookup'.
- removed the 'allnodes' entry point and all database iterator
implementation code
- removed dns_sdb_putnamedrr() and dns_sdb_putnamedrdata() since
they were never used.
Evan Hunt [Fri, 17 Feb 2023 19:46:58 +0000 (11:46 -0800)]
use member name initialization for methods
initialize dns_dbmethods, dns_sdbmethods and dns_rdatasetmethods
using explicit struct member names, so we don't have to keep track
of NULLs for unimplemented functions any longer.
Evan Hunt [Fri, 17 Feb 2023 20:05:25 +0000 (12:05 -0800)]
make fewer dns_db functions mandatory-to-implement
some dns_db functions would have crashed if the DB implementation failed
to implement them, requiring the implementations to add functions that
did nothing but return ISC_R_NOTIMPLEMENTED or some obvious default
value. we can just have the dns_db wrapper functions themselves return
those values, and clean up the implementations accordingly.
Evan Hunt [Fri, 17 Feb 2023 19:57:05 +0000 (11:57 -0800)]
remove rdatalist_p.h
make the private isc__rdatalist_* functions public dns_rdatalist
functions so that all the rdatalist primitives can be used by
callers to libdns. (this will be needed later for moving SDB and
SDLZ out of libdns.)
Mark Andrews [Tue, 21 Feb 2023 02:37:24 +0000 (13:37 +1100)]
Cleanup left over 'fctx != NULL' test following refactoring
This was causing 'CID 436299: Null pointer dereferences (REVERSE_INULL)'
in Coverity. Also removed an 'INSIST(fctx != NULL);' that should
no longer be needed.
Aram Sargsyan [Fri, 17 Feb 2023 12:41:29 +0000 (12:41 +0000)]
Detach rpzs and catzs from the previous view
When switching to a new view during a reconfiguration (or reverting
to the old view), detach the 'rpzs' and 'catzs' from the previuos view.
The 'catzs' case was earlier solved slightly differently, by detaching
from the new view when reverting to the old view, but we can not solve
this the same way for 'rpzs', because now in BIND 9.19 and BIND 9.18
a dns_rpz_shutdown_rpzs() call was added in view's destroy() function
before detaching the 'rpzs', so we can not leave the 'rpzs' attached to
the previous view and let it be shut down when we intend to continue
using it with the new view.
Instead, "re-fix" the issue for the 'catzs' pointer the same way as
for 'rpzs' for consistency, and also because a similar shutdown call
is likely to be implemented for 'catzs' in the near future.
Evan Hunt [Thu, 9 Feb 2023 20:48:07 +0000 (12:48 -0800)]
remove named_os_gethostname()
this function was just a front-end for gethostname(). it was
needed when we supported windows, which has a different function
for looking up the hostname; it's not needed any longer.
Evan Hunt [Thu, 16 Feb 2023 19:05:39 +0000 (11:05 -0800)]
remove validator lock
as every validator function is loop-synchronized, it should no longer be
necessary to use a validator lock.
calling dns_validator_send(), dns_validator_cancel() or
dns_validator_destroy() from a thread other than the one on which the
validator is running will now cause an assertion failure; this should be
fine since the validator and resolver are tightly coupled, and the fetch
contexts and validators run in the same loops.
Evan Hunt [Wed, 15 Feb 2023 19:32:00 +0000 (11:32 -0800)]
additional refactoring of dns_validator
refactor validator so that the validation status object (previously
called dns_valstatus_t, which was derived from dns_validatorevent_t), is
now part of the dns_validator object. when calling validator callbacks,
the validator itself is now sent as the argument.
(note: this necessitates caution in the callback functions that are
internal to validator.c validators spawn other validators, and it can be
confusing at times whether we need to be looking at val, val->subvalidator,
or val->parent.)
Ondřej Surý [Thu, 16 Feb 2023 11:26:01 +0000 (12:26 +0100)]
Don't remove ADB entry from LRU before trying to expire it
There was a code flow error that would remove the expired ADB entry from
the LRU list and then a check in the expire_entry() would cause
assertion error because it expect the ADB entry to be linked.
Additionally, the expire mechanism would loop for cases when we would
held only a read rwlock; in such case we need to upgrade the lock and
try again, not just try again.