Remi Gacogne [Mon, 30 Dec 2024 14:55:33 +0000 (15:55 +0100)]
dnsdist: Fix regression tests with Python 3.13
The CA certificates that we are generating as part of our regression tests
were lacking the X.509 `Key Usage` extension, causing TLS validation with
Python 3.13 to fail with:
> certificate verify failed: CA cert does not include key usage extension
It appears that Python 3.13 enables `VERIFY_X509_STRICT` by default, which makes OpenSSL stricter, and thus it chokes on our invalid CA.
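To check whether a given interpreter is affected, the flag can be inspected on a default context. This is only a diagnostic sketch using the standard `ssl` module. The helper names are made up for illustration, and the fix in this commit is to add the `Key Usage` extension to the generated CA, not to relax verification.

```python
import ssl


def strict_x509_enabled(ctx: ssl.SSLContext) -> bool:
    """Return True if OpenSSL's strict X.509 checks are enabled on ctx."""
    return bool(ctx.verify_flags & ssl.VERIFY_X509_STRICT)


def without_strict_x509(ctx: ssl.SSLContext) -> ssl.SSLContext:
    """Clear VERIFY_X509_STRICT on ctx (hypothetical helper, diagnosis only)."""
    ctx.verify_flags &= ~ssl.VERIFY_X509_STRICT
    return ctx


ctx = ssl.create_default_context()
# The result depends on the interpreter version: the flag is part of the
# defaults starting with Python 3.13.
print("strict by default:", strict_x509_enabled(ctx))
```

If a test only passes after `without_strict_x509()` has been applied, the certificate chain is most likely missing an extension that the strict mode requires.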
dnsdist: Gracefully handle timeout/response for a closed HTTP stream
The remote end might very well have already closed the HTTP stream
corresponding to the timeout or response we are processing. While
this means we need to discard the event we were processing, it is
not an unexpected event and we should thus not raise an exception
since the caller cannot do anything about it.
dnsdist: Fix a crash when processing timeouts for incoming DoH queries
This commit fixes a double-free triggered by an exception being raised
while we are processing a timeout for an incoming DoH query. The exception
bypasses the call releasing the smart pointer, and thus the destructor
is called when we reach the end of the function since we own the smart
pointer, but unfortunately it has already been destroyed by the function
that raised the exception. The fix is to release the pointer first,
then call the function, so even if an exception is raised we no longer
own the pointer, and it's clear that the function has taken ownership of it.
Remi Gacogne [Mon, 10 Feb 2025 10:24:28 +0000 (11:24 +0100)]
dnsdist-1.9.x: Fix compatibility with boost::lockfree >= 1.87.0
In https://github.com/boostorg/lockfree/pull/90 `boost::lockfree::spsc_queue`
introduced move semantics, which is great, but added restrictions
to the callback functor that did not exist before, breaking the API.
This PR fixes that by updating our callbacks to expect an object
instead of a reference.
Remi Gacogne [Fri, 13 Dec 2024 14:45:31 +0000 (15:45 +0100)]
dnsdist: Fix ECS zero-scope with incoming DoH queries
The zero-scope feature involves a first cache lookup before the ECS
information has been added to the query, then on a miss a second,
regular lookup is done. When we get a response from the backend that
contains an ECS scope set to 0, we can insert it into the cache in a
way that allows using it for all clients, but we must be careful to
use the key that was computed during the first lookup, and not the
second one.
Incoming DoH queries make that even more interesting because while
they are received over TCP, they are initially forwarded to the
backend over UDP but can be retried over TCP if a TC=1 answer is
received. In that case we must be very careful not to insert the
answer into the cache using the wrong protocol, as we don't want to
serve a TC=1 answer to a client contacting us over TCP, for example.
The computation of the cache key and protocol was unfortunately broken
for the case where the incoming query is received over DoH, forwarded
over UDP, and the response has a zero scope. This commit fixes it.
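The two-lookup logic can be modelled roughly as follows. This is a simplified sketch and not the DNSdist implementation: `cache_key`, `Query` and `Cache` are hypothetical names, the key layout is an assumption, and retrying over TCP (which would require recomputing the keys for the new transport) is left out.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Optional


def cache_key(qname: str, qtype: int, ecs: Optional[str], over_tcp: bool) -> str:
    """Hypothetical key: hash of the question, the ECS subnet and the transport."""
    material = f"{qname.lower()}|{qtype}|{ecs or ''}|{int(over_tcp)}"
    return hashlib.sha256(material.encode()).hexdigest()


@dataclass
class Query:
    qname: str
    qtype: int
    client_subnet: str
    over_tcp: bool  # transport used towards the backend for this attempt
    key_no_ecs: str = ""
    key_with_ecs: str = ""


class Cache:
    def __init__(self) -> None:
        self.entries: Dict[str, bytes] = {}

    def lookup(self, query: Query) -> Optional[bytes]:
        # First lookup, before ECS is added: remember the key for later.
        query.key_no_ecs = cache_key(query.qname, query.qtype, None, query.over_tcp)
        hit = self.entries.get(query.key_no_ecs)
        if hit is not None:
            return hit
        # Second, regular lookup: the ECS subnet is now part of the key.
        query.key_with_ecs = cache_key(
            query.qname, query.qtype, query.client_subnet, query.over_tcp
        )
        return self.entries.get(query.key_with_ecs)

    def insert(self, query: Query, response: bytes, ecs_scope: int) -> None:
        # A zero scope means the answer is valid for every client, so it
        # has to be stored under the key computed during the first lookup.
        key = query.key_no_ecs if ecs_scope == 0 else query.key_with_ecs
        self.entries[key] = response
```

Because the transport is part of the key in this model, an answer cached for a UDP exchange is never returned to a lookup done for TCP.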
Remi Gacogne [Wed, 4 Dec 2024 13:39:56 +0000 (14:39 +0100)]
dnsdist: Allow resetting `setWeightedBalancingFactor()` to zero
Zero is the initial value, but until now it was only possible to pass
a value greater than or equal to 1.0 to `setWeightedBalancingFactor()`
so it was not possible to reset it to the default value.
Remi Gacogne [Tue, 26 Nov 2024 08:42:47 +0000 (09:42 +0100)]
Merge pull request #14878 from rgacogne/ddist19-backport-14768
dnsdist-1.9.x: Backport of #14768 - setTicketsKeyAddedHook: pass a std::string to the hook to avoid luawrapper to truncate content at potential null chars
Remi Gacogne [Thu, 3 Oct 2024 07:10:09 +0000 (09:10 +0200)]
dnsdist: Disable eBPF filtering on QUIC (DoQ, DoH3) sockets
The current eBPF code tries to parse the beginning of the DNS payload
to extract the qname for all UDP datagrams, which of course does
not work correctly for QUIC packets. I don't immediately see a way
to identify QUIC packets from our eBPF code, so for now this commit
disables the eBPF filtering feature on QUIC sockets.
dnsdist: Add EDNS to responses generated from raw record data
My reasoning is that it makes sense to add EDNS to responses generated
from DNSdist provided that:
- the initial query had EDNS
- `setAddEDNSToSelfGeneratedResponses` has not been set to `false`
- we are only provided part of the response and not a full response
packet
See https://github.com/cloudflare/quiche/pull/1769
but it does not matter in our case since we install
the Quiche library in such a way (libdnsdist-quiche.so)
that we are the only user, and it will always be updated
with DNSdist. Keeping it makes our life significantly harder
since several packaging tools look at the `SONAME`.
Remi Gacogne [Tue, 20 Aug 2024 10:26:33 +0000 (12:26 +0200)]
dnsdist: Fix EDNS flags confusion when editing the OPT header
We used to wrongly reverse the byte-ordering of the existing EDNS
flags when editing the OPT header, for example when setting an
extended DNS error status.
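To illustrate why the byte order matters: per RFC 6891 the TTL field of the OPT record holds the extended RCODE (8 bits), the EDNS version (8 bits) and a 16-bit flags field in network byte order, with the `DO` bit as its top bit. The sketch below is not the DNSdist code, and the function names are hypothetical. It edits one field of the OPT header while leaving the flags untouched, and shows what a reversed flags field looks like.

```python
import struct

DO_BIT = 0x8000  # "DNSSEC OK", top bit of the 16-bit EDNS flags (RFC 6891)


def parse_opt_ttl(ttl: bytes):
    """Split the 4-byte OPT TTL into (extended rcode, version, flags)."""
    return struct.unpack("!BBH", ttl)


def set_extended_rcode(ttl: bytes, ext_rcode: int) -> bytes:
    """Rewrite the extended RCODE, keeping version and flags as they were."""
    _, version, flags = parse_opt_ttl(ttl)
    return struct.pack("!BBH", ext_rcode, version, flags)


def swapped(flags: int) -> int:
    """The value obtained if the two bytes of the flags are reversed by mistake."""
    return ((flags & 0xFF) << 8) | (flags >> 8)


original = struct.pack("!BBH", 0, 0, DO_BIT)
edited = set_extended_rcode(original, 1)
print(parse_opt_ttl(edited))   # (1, 0, 32768): the DO bit is preserved
print(hex(swapped(DO_BIT)))    # 0x80: the DO bit is lost after a byte swap
```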
Remi Gacogne [Tue, 20 Aug 2024 11:04:11 +0000 (13:04 +0200)]
dnsdist: Return a valid unix timestamp for Dynamic Block's `until`
We internally use a timestamp obtained via `CLOCK_MONOTONIC` which
is quite useless to an external observer, so convert it to a normal
unix timestamp in the Lua accessor.
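The conversion itself boils down to measuring how far in the future the monotonic deadline is and adding that offset to the wall clock. A minimal sketch, assuming `time.monotonic()` is backed by `CLOCK_MONOTONIC` on the platform at hand (the function name is hypothetical and this is not the code of the Lua accessor):

```python
import time


def monotonic_to_unix(deadline_monotonic: float) -> float:
    """Convert a monotonic-clock deadline into an approximate Unix timestamp."""
    # Offset between the deadline and "now", both on the monotonic clock.
    remaining = deadline_monotonic - time.monotonic()
    # Apply the same offset to the wall clock.
    return time.time() + remaining


# A block expiring 60 seconds from now, expressed on the monotonic clock.
until = monotonic_to_unix(time.monotonic() + 60)
print(time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(until)))
```

The result is only as accurate as the wall clock at the time of the call, so it can shift slightly if the system time is adjusted afterwards.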
Remi Gacogne [Tue, 20 Aug 2024 12:44:57 +0000 (14:44 +0200)]
dnsdist: Stop reporting timeouts in `topSlow()`, add `topTimeouts()`
Until this commit `topSlow()` returned queries that timed out, which
is not very helpful. This was happening because timeouts are internally
recorded with a very high response time.
With this change, `topSlow()` now ignores queries that timed out, and
a new command is added to look into these: `topTimeouts()`.
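The split can be pictured as two filters over the same list of recorded responses. This is a rough model only: the sentinel value, the entry layout and the function signatures are assumptions made for the example, and they do not mirror the parameters of the real `topSlow()` and `topTimeouts()` console commands.

```python
from collections import Counter

# Hypothetical sentinel: timeouts are assumed to be recorded with a
# response time far above anything a real answer could take.
TIMEOUT_USEC = 0xFFFFFFFF


def top_slow(entries, threshold_usec, limit=10):
    """Most frequent names among slow answers, ignoring timeouts."""
    counts = Counter(
        name
        for name, usec in entries
        if usec != TIMEOUT_USEC and usec >= threshold_usec
    )
    return counts.most_common(limit)


def top_timeouts(entries, limit=10):
    """Most frequent names among queries that timed out."""
    counts = Counter(name for name, usec in entries if usec == TIMEOUT_USEC)
    return counts.most_common(limit)


entries = [
    ("a.example.", 5_000),
    ("b.example.", TIMEOUT_USEC),
    ("b.example.", TIMEOUT_USEC),
    ("a.example.", 2_000_000),
]
print(top_slow(entries, 1_000_000))
print(top_timeouts(entries))
```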
Michael Cho [Fri, 16 Aug 2024 02:49:17 +0000 (22:49 -0400)]
Fix build with boost 1.86.0
Changes in Boost 1.86.0 seem to no longer indirectly include a header, which
causes the build to fail with:
```
uuid-utils.cc:38:58:
error: 'random' is not a class, namespace, or enumeration
```
boost/random/mersenne_twister.hpp has been available since Boost 1.21.2
dnsdist: Fix handling of proxy protocol payload outside of TLS for DoT
After reading the proxy protocol payload from the I/O buffer
we were clearing the buffer but failed to properly reset the
position, leading to an exception when trying to read the DNS
payload after processing the TLS handshake:
```
Got an exception while handling (reading) TCP query from 127.0.0.1:59426: Calling tryRead() with a too small buffer (2) for a read of 18446744073709551566 bytes starting at 52
```
The huge value comes from the fact that the position (52 here)
is larger than the size of the buffer (2 at this point to read
the size of the incoming DNS payload), leading to an unsigned
underflow. The code is properly detecting that the value makes
no sense in this context, but the connection is then dropped
because we cannot recover.
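The number in the log message can be reproduced with plain modular arithmetic, assuming the sizes are 64-bit unsigned integers (the helper name is hypothetical):

```python
def unsigned_sub(a: int, b: int, bits: int = 64) -> int:
    """Subtract b from a the way a fixed-width unsigned integer would."""
    return (a - b) % (1 << bits)


buffer_size = 2   # bytes needed to read the size of the DNS payload
position = 52     # stale position left over from the proxy protocol payload

# Bytes left to read when the position has not been reset.
print(unsigned_sub(buffer_size, position))
# Bytes left to read once the position is back at the start of the buffer.
print(unsigned_sub(buffer_size, 0))
```

The first value is 18446744073709551566, which matches the exception above, and the second is 2, which is what the code expects after the position has been reset.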
It turns out we had an end-to-end test for the "proxy protocol
outside of TLS" case but only over incoming DoH, and the DoH
case avoids this specific issue because the buffer is always
properly resized, and the position updated.
Remi Gacogne [Thu, 27 Jun 2024 14:07:20 +0000 (16:07 +0200)]
dnsdist: Handle Quiche >= 0.22.0
Quiche broke its existing API in 0.22.0: https://github.com/cloudflare/quiche/pull/1726
This pull request adds m4 code to detect whether the Quiche version
we are building against is >= 0.22.0 and, if it is, defines
`HAVE_QUICHE_STREAM_ERROR_CODES` which is later used by the code
using Quiche to know which version of the API to use.