Ondřej Surý [Thu, 25 Feb 2021 13:54:12 +0000 (14:54 +0100)]
Remove the fillcount from the isc_mempool API
Previously, the fillcount could be used to indicate how many elements
would be preallocated every time the memory would be empty. This would
result in bursty allocations when the mempool would be drained.
For more smooth performance, we allocate the new pool items only as
needed. In the future, we could consider changing the
isc_mempool_create() function to take an initial number of pre-allocated
items on the pool, so the bursty allocation happens only on the pool
creation.
Ondřej Surý [Wed, 24 Feb 2021 21:44:38 +0000 (22:44 +0100)]
Add semantic patch for never failing isc_mempool_get()
The isc_mempool_get() could never fail now, thus we add a semantic patch
for cleaning up the error paths from the calls where previously we had
to check if the return value was not NULL.
Ondřej Surý [Wed, 24 Feb 2021 21:38:37 +0000 (22:38 +0100)]
Remove the maximum allocation limit from the isc_mempool
The only place where the limits on the maximum number of allocated items
from the pool was the dns_dispatch where we already have different
limits in place.
As an example the maximum number of buffers is guarded by:
if (disp->mgr->buffers >= DNS_DISPATCH_MAXBUFFERS) {
UNLOCK(&disp->mgr->buffer_lock);
return (NULL);
}
but then at the same time we were limiting the maximum number of items
we can get from the disp->bpool.
By removing the maximum allocation limit from the isc_mempool API, we
can simplify the logic in many places as the isc_mempool_get() would
never fail now and it would always return a chunk of memory.
Ondřej Surý [Mon, 1 Mar 2021 13:21:05 +0000 (14:21 +0100)]
Call isc__initialize()/isc__shutdown() from win32 DllMain
Call the libisc isc__initialize() constructor and isc__shutdown()
destructor from DllMain instead of having duplicate code between
those and DllMain() code.
Ondřej Surý [Thu, 25 Feb 2021 10:08:34 +0000 (11:08 +0100)]
Add mempool get/put tracking with AddressSanitizer
When AddressSanitizer is in use, disable the internal mempool
implementation and redirect the isc_mempool_get to isc_mem_get
(and similarly for isc_mempool_put). This is the method recommended
by the AddressSanitizer authors for tracking allocations and
deallocations instead of custom poison/unpoison code (see
https://github.com/google/sanitizers/wiki/AddressSanitizerManualPoisoning).
Ondřej Surý [Tue, 16 Feb 2021 18:54:04 +0000 (19:54 +0100)]
Change the isc_thread_self() return type to uintptr_t
The pthread_self(), thrd_current() or GetCurrentThreadId() could
actually be a pointer, so we should rather convert the value into
uintptr_t instead of unsigned long.
Ondřej Surý [Tue, 16 Feb 2021 15:02:51 +0000 (16:02 +0100)]
Use globally assigned thread_id in the isc_hp API
Convert the isc_hp API to use the globally available isc_tid_v instead
of locally defined tid_v. This should solve most of the problems on
machines with many number of cores / CPUs.
Ondřej Surý [Tue, 16 Feb 2021 14:57:39 +0000 (15:57 +0100)]
Add isc_trampoline API to have simple accounting around threads
The current isc_hp API uses internal tid_v variable that gets
incremented for each new thread using hazard pointers. This tid_v
variable is then used as a index to global shared table with hazard
pointers state. Since the tid_v is only incremented and never
decremented the table could overflow very quickly if we create set of
threads for short period of time, they finish the work and cease to
exist. Then we create identical set of threads and so on and so on.
This is not a problem for a normal `named` operation as the set of
threads is stable, but the problematic place are the unit tests where we
test network manager or other APIs (task, timer) that create threads.
This commits adds a thin wrapper around any function called from
isc_thread_create() that adds unique-but-reusable small digit thread id
that can be used as index to f.e. hazard pointer tables. The trampoline
wrapper ensures that the thread ids will be reused, so the highest
thread_id number doesn't grow indefinitely when threads are created and
destroyed and then created again. This fixes the hazard pointer table
overflow on machines with many cores. [GL #2396]
Matthijs Mekking [Mon, 22 Feb 2021 11:08:49 +0000 (12:08 +0100)]
Don't servfail on staleonly lookups
When a staleonly lookup doesn't find a satisfying answer, it should
not try to respond to the client.
This is not true when the initial lookup is staleonly (that is when
'stale-answer-client-timeout' is set to 0), because no resolver fetch
has been created at this point. In this case continue with the lookup
normally.
Matthijs Mekking [Thu, 18 Feb 2021 15:09:41 +0000 (16:09 +0100)]
Don't allow recursion on staleonly lookups
Fix a crash that can happen in the following scenario:
A client request is received. There is no data for it in the cache,
(not even stale data). A resolver fetch is created as part of
recursion.
Some time later, the fetch still hasn't completed, and
stale-answer-client-timeout is triggered. A staleonly lookup is
started. It will also find no data in the cache.
So 'query_lookup()' will call 'query_gotanswer()' with ISC_R_NOTFOUND,
so this will call 'query_notfound()' and this will start recursion.
We will eventually end up in 'ns_query_recurse()' and that requires
the client query fetch to be NULL:
REQUIRE(client->query.fetch == NULL);
If the previously started fetch is still running this assertion
fails.
The crash is easily prevented by not requiring recursion for
staleonly lookups.
Also remove a redundant setting of the staleonly flag at the end of
'query_lookup_staleonly()' before destroying the query context.
Matthijs Mekking [Wed, 24 Feb 2021 08:35:06 +0000 (09:35 +0100)]
Fix dnssec-policy NSEC3 on dynamic zones
When applying dnssec-policy on a dynamic zone (e.g. that allows Dynamic
Updates), the NSEC3 parameters were put on the queue, but they were
not being processed (until a reload of the zone or reconfiguration).
Process the NSEC3PARAM queue on zone postload when handling a
dynamic zone.
Ondřej Surý [Wed, 24 Feb 2021 05:55:36 +0000 (06:55 +0100)]
Disable safe-guard assertion in DLL_THREAD_ATTACH/DLL_THREAD_DETACH
The BIND 9 libraries on Windows define DllMain() optional entry point
into a dynamic-link library (DLL). When the system starts or terminates
a process or thread, it calls the entry-point function for each loaded
DLL using the first thread of the process.
When the DLL is being loaded into the virtual address space of the
current process as a result of the process starting up, we make a call
to DisableThreadLibraryCalls() which should disable the
DLL_THREAD_ATTACH and DLL_THREAD_DETACH notifications for the specified
dynamic-link library (DLL).
This seems not be the case because we never check the return value of
the DisableThreadLibraryCalls() call, and it could in fact fail. The
DisableThreadLibraryCalls() function fails if the DLL specified by
hModule has active static thread local storage, or if hModule is an
invalid module handle.
In this commit, we remove the safe-guard assertion put in place for the
DLL_THREAD_ATTACH and DLL_THREAD_DETACH events and we just ignore them.
BIND 9 doesn't create/destroy enough threads for it actually to make any
difference, and in fact we do use static thread local storage in the
code.
Michal Nowak [Mon, 22 Feb 2021 12:50:11 +0000 (13:50 +0100)]
Initialize checknames field in dns_view_create()
The 'checknames' field wasn't initialized in dns_view_create(), but it
should otherwise AddressSanitizer identifies the following runtime error
in query_test.c.
runtime error: load of value 190, which is not a valid value for type '_Bool'
Mark Andrews [Sun, 21 Feb 2021 22:28:37 +0000 (09:28 +1100)]
Silence CID 320481: Null pointer dereferences
*** CID 320481: Null pointer dereferences (REVERSE_INULL)
/bin/tests/wire_test.c: 261 in main()
255 process_message(input);
256 }
257 } else {
258 process_message(input);
259 }
260
CID 320481: Null pointer dereferences (REVERSE_INULL)
Null-checking "input" suggests that it may be null, but it has already been dereferenced on all paths leading to the check.
261 if (input != NULL) {
262 isc_buffer_free(&input);
263 }
264
265 if (printmemstats) {
266 isc_mem_stats(mctx, stdout);
Mark Andrews [Tue, 16 Feb 2021 05:15:25 +0000 (16:15 +1100)]
Silence CID 281450: Dereference before null check
remove redundant 'inst != NULL' test
162cleanup:
CID 281450 (#1 of 1): Dereference before null check (REVERSE_INULL)
check_after_deref: Null-checking inst suggests that it may be null, but it has already been dereferenced on all paths leading to the check.
163 if (result != ISC_R_SUCCESS && inst != NULL) {
164 plugin_destroy((void **)&inst);
165 }
Mark Andrews [Tue, 16 Feb 2021 05:05:56 +0000 (16:05 +1100)]
Silence CID 304936 Dereference before null check
Removed redundant 'listener != NULL' check.
1191cleanup:
CID 304936 (#1 of 1): Dereference before null check (REVERSE_INULL)
check_after_deref: Null-checking listener suggests that it may be null, but it has already been dereferenced on all paths leading to the check.
1192 if (listener != NULL) {
1193 isc_refcount_decrement(&listener->refs);
1194 listener->exiting = true;
1195 free_listener(listener);
1196 }
1. A wrong comment in ns3/setup.sh (we are subtracting 2 hours, not
adding them).
2. 'get_keyids' used bad parameters "$1" "$2" when 'check_numkeys'
failed. Also, 'check_numkeys' can use $DIR, $ZONE, and $NUMKEYS
directly, no need to pass them.
Add some more zones to the kasp system test to test the 'purge-keys'
option. Three zones test that the predecessor key files are removed
after the purge keys interval, one test checks that the key files
are retained if 'purge-keys' is disabled. For that, we change the
times to 90 days in the past (the default value for 'purge-keys').
On each keymgr run, we now also check if key files can be removed.
The 'purge-keys' interval determines how long keys should be retained
after they have become completely hidden.
Key files should not be removed if it has a state that is set to
something else then HIDDEN, if purge-keys is 0 (disabled), if
the key goal is set to OMNIPRESENT, or if the key is unused (a key is
unused if no timing metadata set, and no states are set or if set,
they are set to HIDDEN).
If the last changed timing metadata plus the purge-keys interval is
in the past, the key files may be removed.
Add a dst_key_t variable 'purge' to signal that the key file should
not be written to file again.
Add a new option 'purge-keys' to 'dnssec-policy' that will purge key
files for deleted keys. The option determines how long key files
should be retained prior to removing the corresponding files from
disk.
If set to 0, the option is disabled and 'named' will not remove key
files from disk.
Ondřej Surý [Fri, 19 Feb 2021 11:53:36 +0000 (12:53 +0100)]
Include lib/isc/tls_p.h in release tarballs
The addition of lib/isc/tls_p.h to the source tree was not accounted for
in the relevant variable in lib/isc/Makefile.am and thus the former file
is not being included in release tarballs prepared using "make dist".
Fix by tweaking the libisc_la_SOURCES list in lib/isc/Makefile.am
accordingly.
Michał Kępień [Fri, 19 Feb 2021 10:52:56 +0000 (11:52 +0100)]
Do not require libtool in PATH at build time
The build-time requirement for libtool was introduced inadvertently:
1. Commit 1628f5865acb2d472ce4adf71fc78ac99094fa1c added a check to
configure.ac which claims to test whether the libtool script is
available. There are two problems with that check:
- it is effectively a no-op as the AC_PROG_LIBTOOL() macro always
sets the LIBTOOL variable [1],
- this check was intended to be performed before autoreconf is
run, not when ./configure is run; the libtool script is supposed
to be dynamically generated by ./configure on the build host and
thus there is no need for a standalone libtool script to be
installed system-wide on every host attempting to build BIND 9
e.g. from a tarball produced by "make dist".
2. Commit a7982d14dddb864420deb49e735f782022d1fa07 was based on the
incorrect assumption that the AC_PROG_LIBTOOL() macro looks for the
libtool binary in PATH and sets the LIBTOOL variable accordingly,
which is what other AC_PROG_*() macros do. Meanwhile, the
AC_PROG_LIBTOOL() macro only initializes libtool for use with
Automake. It is not necessary for a standalone libtool script to be
available in PATH on the build host when ./configure is run.
Do not look for libtool in PATH at build time as it prevents hosts
without a libtool script available system-wide from building BIND 9 from
source tarballs prepared using "make dist". Note that libtool m4
macros, utilities, etc. still need to be present on a given host if
autoreconf is to be run on it.
Ondřej Surý [Tue, 9 Feb 2021 16:44:40 +0000 (17:44 +0100)]
Use library constructor/destructor to initialize OpenSSL
Instead of calling isc_tls_initialize()/isc_tls_destroy() explicitly use
gcc/clang attributes on POSIX and DLLMain on Windows to initialize and
shutdown OpenSSL library.
This resolves the issue when isc_nm_create() / isc_nm_destroy() was
called multiple times and it would call OpenSSL library destructors from
isc_nm_destroy().
At the same time, since we now have introduced the ctor/dtor for libisc,
this commit moves the isc_mem API initialization (the list of the
contexts) and changes the isc_mem_checkdestroyed() to schedule the
checking of memory context on library unload instead of executing the
code immediately.
Disable calling DllMain() on thread creation/destruction
Disables the DLL_THREAD_ATTACH and DLL_THREAD_DETACH notifications for
the specified dynamic-link library (DLL). This can reduce the size of
the working set for some applications.
Ondřej Surý [Tue, 9 Feb 2021 12:25:52 +0000 (13:25 +0100)]
Fix the invalid condition variable
Although harmless, the memmove() in tlsdns and tcpdns was guarded by a
current message length variable that was always bigger than 0 instead of
correct current buffer length remainder variable.
Ondřej Surý [Tue, 9 Feb 2021 12:25:46 +0000 (13:25 +0100)]
Move most of the OpenSSL initialization to isc_tls
Since we now require both libcrypto and libssl to be initialized for
netmgr, we move all the OpenSSL initialization code except the engine
initialization to isc_tls API.
The isc_tls_initialize() and isc_tls_destroy() has been made idempotent,
so they could be called multiple time. However when isc_tls_destroy()
has been called, the isc_tls_initialize() could not be called again.
Ondřej Surý [Fri, 5 Feb 2021 16:18:28 +0000 (17:18 +0100)]
Remove overrun checking code from memory allocator
The ISC_MEM_CHECKOVERRUN would add canary byte at the end of every
allocations and check whether the canary byte hasn't been changed at the
free time. The AddressSanitizer and valgrind memory checks surpases
simple checks like this, so there's no need to actually keep the code
inside the allocator.
Ondřej Surý [Fri, 5 Feb 2021 09:25:07 +0000 (10:25 +0100)]
Modify the way we benchmark mem_{get,put}
Previously, the mem_{get,put} benchmark would pass the allocation size
as thread_create argument. This has been now changed, so the allocation
size is stored and decremented (divided) in atomic variable and the
thread create routing is given a memory context. This will allow to
write tests where each thread is given different memory context and do
the same for mempool benchmarking.
Ondřej Surý [Fri, 5 Feb 2021 09:25:07 +0000 (10:25 +0100)]
Disable memory debugging features in non-developer build
The two memory debugging features: ISC_MEM_DEFAULTFILL
(ISC_MEMFLAG_FILL) and ISC_MEM_TRACKLINES were always enabled in all
builds and the former was only disabled in `named`.
This commits disables those two features in non-developer build to make
the memory allocator significantly faster.
Ondřej Surý [Thu, 4 Feb 2021 20:56:49 +0000 (21:56 +0100)]
Make the memory and mempool counters to be stdatomic types
This is yet another step into unlocking some parts of the memory
contexts. All the regularly updated variables has been turned into
atomic types, so we can later remove the locks when updating various
counters.
Also unlock as much code as possible without breaking anything.
Bump the maximum number of hazard pointers in tests
On 24-core machine, the tests would crash because we would run out of
the hazard pointers. We now adjust the number of hazard pointers to be
in the <128,256> interval based on the number of available cores.
Note: This is just a band-aid and needs a proper fix.
Ondřej Surý [Thu, 4 Feb 2021 19:19:09 +0000 (20:19 +0100)]
Remove the extra level of indirection via isc_memmethods_t
Previously, the applications using libisc would be able to override the
internal memory methods with own implementation. This was no longer
possible, but the extra level of indirection was not removed. This
commit removes the extra level of indirection for the memory methods and
the default_memalloc() and default_memfree().
Ondřej Surý [Thu, 4 Feb 2021 19:11:20 +0000 (20:11 +0100)]
Remove the internal memory allocator
The internal memory allocator had an extra code to keep a list of blocks
for small size allocation. This would help to reduce the interactions
with the system malloc as the memory would be already allocated from the
system, but there's an extra cost associated with that - all the
allocations/deallocations must be locked, effectively eliminating any
optimizations in the system allocator targeted at multi-threaded
applications. While the isc_mem API is still using locks pretty heavily,
this is a first step into reducing the memory allocation/deallocation
contention.
Michal Nowak [Tue, 16 Feb 2021 10:33:58 +0000 (11:33 +0100)]
Prevent Git to expand $systest
CentOS 8 "git status" unexpectedly expands search directory "tsig" to
also search in the "tsiggss" directory, thus incorrectly identifying
files as "not removed" in the "tsig" directory:
$ git status -su --ignored tsig
$ touch tsiggss/ns1/{named.run,named.memstats}
$ git status -su --ignored tsig
!! tsiggss/ns1/named.memstats
!! tsiggss/ns1/named.run