Otto Moerbeek [Fri, 12 Jun 2020 10:24:26 +0000 (12:24 +0200)]
Fix three shared cache issues:
- Only prime the shared cache once on startup
- Cache pruning could go into an infinite loop if not enough expired
entries could be pruned.
- The handler thread isn't run very often, but record cache pruning
is now done by it, so increase the frequency of the housekeeping
call for the handler thread.
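
The infinite-loop case from the second item can be illustrated with a minimal Python sketch. This is not the recursor's C++ code: `prune_expired`, the dict-based cache layout and the `to_prune` target are hypothetical. The loop is bounded by a single pass over the entries, so it stops even when fewer expired entries exist than were requested.

```python
import time

def prune_expired(cache, to_prune, now=None):
    """Remove up to `to_prune` expired entries; return how many were removed.

    `cache` maps a key to its expiry timestamp (hypothetical layout).
    """
    now = time.time() if now is None else now
    pruned = 0
    # Iterate over a snapshot exactly once: the loop ends when the target is
    # reached OR when every entry has been looked at, whichever comes first.
    for key, expiry in list(cache.items()):
        if pruned >= to_prune:
            break
        if expiry <= now:
            del cache[key]
            pruned += 1
    return pruned
```

A loop of the form "repeat until `to_prune` entries are gone" would never finish when the cache holds fewer expired entries than that; bounding the scan avoids this.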
Kees Monshouwer [Mon, 15 Jun 2020 09:54:05 +0000 (11:54 +0200)]
auth: gsqlite3backend: add missing indexes
The Sqlite3 backend was performing terribly in environments with many updates.
On a slaved root zone the performance increase was huge, 71ms -> 1ms.
Since the lack of proper indexes is causing a lot of trouble in larger environments, I target this update at 4.3.1
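
The effect of a missing index can be seen with a small Python sketch using the standard `sqlite3` module. The table `demo_records` and the index `demo_records_name_idx` are hypothetical and are not the real gsqlite3 schema; the point is only that the query plan changes from a table scan to an index search.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; not the real gsqlite3 backend schema.
conn.execute(
    "CREATE TABLE demo_records (id INTEGER PRIMARY KEY, domain_id INTEGER, name TEXT)"
)
conn.executemany(
    "INSERT INTO demo_records (domain_id, name) VALUES (?, ?)",
    [(i % 10, f"host{i}.example.") for i in range(1000)],
)

def query_plan(connection, sql):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as one string."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

LOOKUP = "SELECT id FROM demo_records WHERE name = 'host42.example.'"

plan_before = query_plan(conn, LOOKUP)  # no usable index: table scan
conn.execute("CREATE INDEX demo_records_name_idx ON demo_records (name)")
plan_after = query_plan(conn, LOOKUP)   # the new index is used
print(plan_before)
print(plan_after)
```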
Remi Gacogne [Mon, 8 Jun 2020 14:28:42 +0000 (16:28 +0200)]
dnsdist: Use non-blocking pipes to pass DoH queries/responses around
This commit makes the internal sockets non-blocking so we don't freeze if
they ever fill up, and log errors/increment metrics instead.
It also replaces the socket pairs by pipes, since the default buffer
size for sockets seems to allow only ~278 pending queries which might
be reached given how libh2o batches events. On Linux, a pipe gives us
8192 pending queries by default due to the lower overhead, and it
can easily be increased to 131072 pending queries by setting the
pipe size to 1048576. This commit adds a new setting to do just
that.
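
The figures above are consistent with one 8-byte item per pending query (1048576 / 131072 = 8, and 8192 * 8 = 65536, the default Linux pipe capacity). A minimal Python sketch of the non-blocking behaviour follows. It is not the dnsdist code: `make_nonblocking_pipe`, `try_send`, `SLOT_BYTES` and the `pipe_full` counter are hypothetical names, and the 8-byte item size is an assumption derived from that arithmetic.

```python
import fcntl
import os

SLOT_BYTES = 8  # assumption: one pointer-sized item per query (1048576 / 131072)

def make_nonblocking_pipe(pipe_size=None):
    """Create a pipe with both ends non-blocking, optionally resized."""
    read_fd, write_fd = os.pipe()
    os.set_blocking(read_fd, False)
    os.set_blocking(write_fd, False)
    # F_SETPIPE_SZ is Linux-specific and exposed by Python 3.10+.
    if pipe_size is not None and hasattr(fcntl, "F_SETPIPE_SZ"):
        try:
            fcntl.fcntl(write_fd, fcntl.F_SETPIPE_SZ, pipe_size)
        except OSError:
            pass  # e.g. above the system limit: keep the default size
    return read_fd, write_fd

def try_send(write_fd, item, stats):
    """Write one fixed-size item; on a full pipe, count the drop instead of blocking."""
    try:
        os.write(write_fd, item)
        return True
    except BlockingIOError:
        stats["pipe_full"] += 1
        return False
```

With a blocking descriptor the `os.write()` call would stall the sending thread once the pipe is full; here the caller gets `False` back and a counter is incremented instead.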
Ring-buffer size metrics are affected in three ways:
* incremented and saturated as items are added
* set to zero when the ring-buffer is reset
* decremented when the ring-buffer is resized to a smaller capacity
that cannot hold the number of items currently stored
The last two cases mean the value can decrease, which qualifies
ring-buffer size metrics as gauges.
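
The three cases can be sketched in Python as follows. The `RingBuffer` class is a hypothetical stand-in built on `collections.deque`, not the actual implementation; its `size` property plays the role of the metric.

```python
from collections import deque

class RingBuffer:
    """Fixed-capacity buffer whose `size` behaves like a gauge."""

    def __init__(self, capacity):
        self._items = deque(maxlen=capacity)

    @property
    def size(self):
        """Current number of stored items; this is the gauge value."""
        return len(self._items)

    def push(self, item):
        # Once full, the oldest item is dropped, so size saturates at capacity.
        self._items.append(item)

    def reset(self):
        # Size goes back to zero.
        self._items.clear()

    def resize(self, capacity):
        # Keep only the newest items that still fit; size may go down.
        self._items = deque(self._items, maxlen=capacity)
```

Because `size` can go down as well as up, it cannot be exported as a monotonically increasing counter.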
Otto Moerbeek [Fri, 5 Jun 2020 08:37:28 +0000 (10:37 +0200)]
Add/modify tests. Also re-check for the cache case. It *is* a bit
unsettling that this case causes an ImmediateServFailException, but I
do not want to touch the general flow right now. That would be
required to make the CNAME cache case more similar to the non-cached
case.
Otto Moerbeek [Wed, 3 Jun 2020 07:07:56 +0000 (09:07 +0200)]
Correct depth increments.
With the introduction of qname minimization, a function
doResolveNoQNameMinimization() was introduced. This function is
called by doResolve() with depth incremented. Due to the recursive
nature of the recursor algorithm (Nomen est Omen) we end up
incrementing the depth too much. This prompted a review of the other
places depth was incremented, and I believe it should only be done
when calling doResolve(). Especially the case "+ 2" in the getAddrs()
call looks strange to me, as the doResolve() calls in getAddrs()
already call doResolve() with depth + 1.
This fixes #9184 and likely other cases of deep recursion caused
by long CNAME chains.
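
A toy Python model of the call pattern described above shows how the extra increment compounds. This is not the recursor code: `outer` stands in for doResolve(), `inner` for doResolveNoQNameMinimization(), and `max_depth_for_chain` and its parameters are hypothetical.

```python
def max_depth_for_chain(cnames, start, wrapper_increment):
    """Follow a CNAME chain and return the deepest `depth` value seen.

    `cnames` maps a name to its CNAME target. `wrapper_increment` is what the
    outer function adds to `depth` before delegating to the inner one.
    """
    deepest = 0

    def outer(name, depth):  # stands in for doResolve()
        return inner(name, depth + wrapper_increment)

    def inner(name, depth):  # stands in for doResolveNoQNameMinimization()
        nonlocal deepest
        deepest = max(deepest, depth)
        target = cnames.get(name)
        if target is not None:
            outer(target, depth + 1)  # the one intended increment per hop

    outer(start, 0)
    return deepest

chain = {"a.example.": "b.example.", "b.example.": "c.example.", "c.example.": "d.example."}
print(max_depth_for_chain(chain, "a.example.", wrapper_increment=1))
print(max_depth_for_chain(chain, "a.example.", wrapper_increment=0))
```

In this model, a wrapper that adds one makes the depth grow by two per CNAME hop instead of one, so a depth limit is reached with roughly half as many hops.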
Remi Gacogne [Tue, 2 Jun 2020 10:24:34 +0000 (12:24 +0200)]
Fix compilation on systems that do not define HOST_NAME_MAX
On FreeBSD at least, HOST_NAME_MAX is not defined and we need to
use sysconf() to get the value at runtime instead.
Based on work done by @RvdE to make the recursor compile on
FreeBSD (many thanks!).
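
The fix itself is in C++. As an illustration of the same run-time query, Python exposes sysconf() as `os.sysconf`. The helper name `host_name_max` and the fallback to 255 (the POSIX minimum, `_POSIX_HOST_NAME_MAX`) are assumptions of this sketch and are not taken from the commit.

```python
import os

POSIX_MINIMUM = 255  # _POSIX_HOST_NAME_MAX, used here as a fallback (assumption)

def host_name_max():
    """Return the host name length limit, queried at run time."""
    try:
        value = os.sysconf("SC_HOST_NAME_MAX")
    except (ValueError, OSError, AttributeError):
        # Name unknown on this platform, or sysconf() not available at all.
        return POSIX_MINIMUM
    # sysconf() reports -1 when the limit is indeterminate.
    return value if value > 0 else POSIX_MINIMUM

print(host_name_max())
```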
Avoid making new backends when we are going to either deny the XFR, or
fall back to AXFR anyway.
This cuts down the number of new backends from four (three for IXFR
pre-checks plus one for AXFR) to one (just the AXFR one).
When replying in IXFR mode, we keep making _one_ new backend, which is
also better than before.
While we now hold the s_plock for a while longer, we only take it once
in doIXFR; before we took it twice -- for TSIG retrieval, which now
re-uses the IXFR backend.
aerique [Mon, 25 May 2020 15:08:07 +0000 (17:08 +0200)]
Make sure we can install unsigned packages.
Sometimes we need to install unsigned packages from our own ad-hoc repo;
installing `apt-transport-https` makes sure we can do this (at least on
Debian Stretch).
Remi Gacogne [Mon, 25 May 2020 09:33:19 +0000 (11:33 +0200)]
rec: Defer the NOD lookup until after the response has been sent
If the NOD lookup is slow, for example because the destination
authoritative server is down, doing the NOD lookup before the response
has been sent increases the latency a lot.
This commit moves the actual NOD lookup after the response has been
sent, so we can still use the existing mthread (we might actually need
to do a proper DNS resolution to find the target authoritative server)
without keeping the client waiting.
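
The reordering can be sketched in a few lines of Python. This is not the recursor code; `handle_query` and its callback parameters are hypothetical and only show the order of operations.

```python
def handle_query(query, resolve, send_response, nod_lookup):
    """Answer the client first, then run the (possibly slow) NOD lookup."""
    response = resolve(query)
    send_response(response)  # the client stops waiting here
    nod_lookup(query)        # deferred: its latency no longer delays the client
    return response
```

The lookup still runs in the same flow of control, which matches the message's point about reusing the existing mthread; only its position relative to sending the response changes.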