Michał Kępień [Tue, 20 Feb 2018 12:59:28 +0000 (13:59 +0100)]
Wait until a cache dump completes instead of waiting for a fixed amount of time
Dumping the cache is an asynchronous operation, so sleeping for a fixed
amount of time after running "rndc dumpdb" is imperfect as dumping cache
contents may take longer than expected on slower machines. Instead of
always sleeping for 1 second, wait until the "; Dump complete" line
appears in the dump or 10 seconds pass, whichever comes first.
Michał Kępień [Tue, 20 Feb 2018 12:59:27 +0000 (13:59 +0100)]
Do not overwrite cache dumps
Unless configured otherwise in named.conf, "rndc dumpdb" causes a cache
dump to be written to a file called "named_dump.db" in the working
directory of the given named instance. Repeatedly using this command
throughout different checks in the cacheclean system test causes cache
dumps for older checks to be overwritten, which hinders failure
diagnosis. Prevent this by moving each cache dump to a check-specific
location after running "rndc dumpdb".
Furthermore, during the "check flushtree clears adb correctly" check,
dump_cache() is called twice without renaming the resulting files.
Prevent the first cache dump from being overwritten by moving it to a
different file before calling "rndc dumpdb" for the second time.
Michał Kępień [Mon, 5 Feb 2018 20:04:39 +0000 (21:04 +0100)]
Make dns_dt_send() call dns_dt_reopen() asynchronously
Instead of checking current dnstap output file size and potentially
synchronously calling dns_dt_reopen() upon every call to dns_dt_send():
- call dns_dt_reopen() asynchronously by queuing an event to the task
specified at dnstap environment creation time,
- ensure no roll event is outstanding before checking dnstap output
file size and potentially queuing another roll event.
This causes dnstap output files to exceed their configured size limits,
but prevents any two threads from performing the roll simultaneously
(which causes crashes).
Michał Kępień [Mon, 5 Feb 2018 20:22:53 +0000 (21:22 +0100)]
Make dns_dt_reopen() request task-exclusive mode on its own
Instead of relying on the caller to set up task-exclusive mode, make
dns_dt_reopen() enforce task-exclusive mode itself, using the task
specified at dnstap environment creation time.
Michał Kępień [Mon, 5 Feb 2018 20:19:44 +0000 (21:19 +0100)]
Add dns_dt_create2()
Implement a new variant of dns_dt_create() to enable a dnstap
environment structure to hold the task in the context of which
dns_dt_reopen() will be executed.
Michał Kępień [Thu, 15 Feb 2018 19:31:55 +0000 (20:31 +0100)]
Add CHANGES entry
4892. [bug] named could leak memory when "rndc reload" was invoked
before all zone loading actions triggered by a previous
"rndc reload" command were completed. [RT #47076]
Michał Kępień [Thu, 15 Feb 2018 19:31:54 +0000 (20:31 +0100)]
Do not recheck DNS_ZONEFLG_LOADPENDING in zone_asyncload()
Remove a block of code which dates back to commit 8a2ab2b9203, when
dns_zone_asyncload() did not yet check DNS_ZONEFLG_LOADPENDING.
Currently, no race in accessing DNS_ZONEFLG_LOADPENDING is possible any
more, because:
- dns_zone_asyncload() is still the only function which may queue
zone_asyncload(),
- dns_zone_asyncload() accesses DNS_ZONEFLG_LOADPENDING under a lock
(and potentially queues an event under the same lock),
- DNS_ZONEFLG_LOADPENDING is not cleared until the load actually
completes.
Thus, the rechecking code can be safely removed from zone_asyncload().
Note that this also brings zone_asyncload() to a state in which the
completion callback is always invoked. This is required to prevent
leaking memory in case something goes wrong in zone_asyncload() and a
zone table the zone belongs to is indefinitely left with a positive
reference count.
Michał Kępień [Thu, 15 Feb 2018 19:31:53 +0000 (20:31 +0100)]
Asynchronous zone load events have no way of getting canceled
Code handling cancellation of asynchronous zone load events was likely
copied over from other functions when asynchronous zone loading was
first implemented in commit 8a2ab2b9203. However, unlike those other
functions, asynchronous zone loading events currently have no way of
getting canceled once they get posted, which means the aforementioned
code is effectively dead. Remove it to prevent confusion.