* http embeds modified lua-http server code that
reuses a single cqueue for all h2 client sockets,
this is also because the upstream API is unstable
* http embeds rickshaw for real-time graphs over
websockets, it displays a latency heatmap by default
and can show several other metrics
* http shows a world map with pins for recently
contacted authoritatives, where the pin diameter
represents the number of queries sent and the colour
the average RTT, so you can see where the queries
are going
* http now exports several endpoints and websockets:
/stats for statistics in JSON, and /metrics for
metrics in Prometheus text format
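As a rough illustration of the /metrics output, the sketch below
renders a flat statistics mapping in the Prometheus text exposition
format. The function name, the "resolver_" prefix, the gauge type and
the sample metric names are assumptions made for the example, not the
module's actual code.

```python
def to_prometheus(stats, prefix="resolver_"):
    """Render a flat {name: number} mapping as Prometheus text format."""
    lines = []
    for name in sorted(stats):
        # Prometheus metric names cannot contain dots; map them to underscores.
        metric = prefix + name.replace(".", "_")
        lines.append(f"# TYPE {metric} gauge")
        lines.append(f"{metric} {stats[name]}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical counters, shaped like a flat JSON object from /stats.
    print(to_prometheus({"answer.total": 3, "query.dnssec": 1}), end="")
```

The same mapping could be served as JSON unchanged, which is why a
flat name-to-number structure is convenient for both endpoints.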
Marek Vavrusa [Wed, 1 Jun 2016 07:08:00 +0000 (00:08 -0700)]
modules/http: doc, auto-tls, cert renewal, ...
added documentation and many fixes in the H2 fallback
code and H2 stream handling. TLS is enabled by
default using an ephemeral key and certificate that
are automatically renewed, but custom certificates
are also supported
this also allows other modules to place code
snippets on the webpage
Marek Vavrusa [Sun, 29 May 2016 20:27:19 +0000 (13:27 -0700)]
daemon/io: freed handle could be touched in libuv
the daemon wrongly freed a handle that returned 0,
as in "no more data". this socket is going to be
closed, but it could still be touched by libuv,
so it must be freed in the uv_close() callback
Marek Vavrusa [Mon, 23 May 2016 00:56:50 +0000 (17:56 -0700)]
daemon: support event.socket(fd, cb) for I/O events
this allows embedding other event loops or just
asynchronous events triggered by socket activity.
this is required for things like a cooperative
HTTP server, a monitoring endpoint or a remote
configuration daemon/controller
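The idea of dispatching a callback on socket activity can be sketched
outside the daemon. The example uses Python's standard selectors
module as a stand-in for the daemon's event loop; run_once and demo
are hypothetical names, and none of this is the daemon's own API.

```python
import selectors
import socket


def run_once(sel, timeout=1.0):
    """Call the registered callback for each readable socket; return the count."""
    fired = 0
    for key, _events in sel.select(timeout):
        key.data(key.fileobj)  # key.data holds the callback for this socket
        fired += 1
    return fired


def demo():
    """Register a callback on one end of a socket pair and trigger it once."""
    sel = selectors.DefaultSelector()
    a, b = socket.socketpair()
    received = []
    sel.register(b, selectors.EVENT_READ,
                 lambda sock: received.append(sock.recv(16)))
    a.send(b"ping")          # activity on the socket makes `b` readable
    run_once(sel)
    sel.unregister(b)
    a.close()
    b.close()
    sel.close()
    return received
```

A cooperative server built this way never blocks in its own loop; it
only runs when the host loop reports that one of its sockets is ready.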
Marek Vavrusa [Sun, 22 May 2016 03:58:11 +0000 (20:58 -0700)]
worker: fixed corruption when follower times out, early free
* when an enqueued task terminated earlier than the
leader task because of a timeout, it wasn't dequeued
from the waitlist immediately, although it didn't
have any outstanding outbound queries. when the
leader task terminated, it removed this task and
updated its outbound query, which didn't exist. this
triggered a 16B write to an undefined location
* fixed timeout timer being scheduled for closing
without holding a reference to the parent task
Marek Vavrusa [Sun, 15 May 2016 21:14:53 +0000 (14:14 -0700)]
lib: cache api v2, removed dep on libknot db.h
this change introduces a new API for cache backends
that is a subset of knot_db_api_t from libknot
plus several cache-specific operations
major changes are:
* merged 'cachectl' module into 'cache' as it is
99% default-on and it simplifies things
* not transaction oriented, transactions may be
reused and cached for higher performance
* scatter/gather API, this is important for
latency and performance of non-local backends
like Redis
* faster and more reliable cache clearing
* cache-specific operations (prefix scan, ...) are
part of the API instead of being hacked in
* simpler code for both backends and caller
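To make the shape of such a backend interface concrete, here is a toy
in-memory backend with batch reads and writes, a prefix scan and a
clear operation. The class and method names are hypothetical and do
not mirror the actual C API; the sketch only illustrates why a
batch-oriented interface suits remote stores, where each call is a
network round trip.

```python
class MemoryBackend:
    """Toy in-memory backend with a batch-oriented (scatter/gather) interface."""

    def __init__(self):
        self._data = {}

    def write(self, items):
        """Store many key/value pairs in one call."""
        self._data.update(items)

    def read(self, keys):
        """Fetch many keys in one call; missing keys yield None."""
        return [self._data.get(key) for key in keys]

    def match(self, prefix):
        """Prefix scan: return all stored keys starting with `prefix`, sorted."""
        return sorted(key for key in self._data if key.startswith(prefix))

    def clear(self):
        """Drop everything and report how many entries were removed."""
        removed = len(self._data)
        self._data.clear()
        return removed
```

A caller that needs several records passes all keys to a single
read() call, so a networked backend could answer with one request
instead of one per key.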
Marek Vavrusa [Sun, 15 May 2016 21:08:45 +0000 (14:08 -0700)]
contrib/lmdb: imported LMDB 0.9.18, built-in
by default, the build system attempts to use LMDB
from the system. however, if it's not found or
the version is too old, it uses the built-in
snapshot in contrib
Marek Vavrusa [Wed, 11 May 2016 07:40:35 +0000 (00:40 -0700)]
daemon/worker: deduplicate inbound queries
many clients retransmit the query frequently
to cope with network losses and get better service,
but then fail to work properly when the resolver
answers some of the copies with SERVFAIL (because
of the time limit) and others with NOERROR.
it's also a good idea to avoid wasting time
tracking pending tasks to solve the same thing.
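A minimal sketch of the deduplication idea: retransmits of a query
that is still pending are recognised and not started as new tasks.
The key used here (client address, message ID, name, type) and all
names are assumptions for the example; the daemon may key on
different fields.

```python
class InboundDedup:
    """Track pending inbound queries so retransmits are not resolved twice."""

    def __init__(self):
        self._pending = set()

    @staticmethod
    def _key(client, msgid, qname, qtype):
        # DNS names compare case-insensitively, so normalise the name.
        return (client, msgid, qname.lower(), qtype)

    def submit(self, client, msgid, qname, qtype):
        """Return True for a new query, False for a duplicate of a pending one."""
        key = self._key(client, msgid, qname, qtype)
        if key in self._pending:
            return False
        self._pending.add(key)
        return True

    def finish(self, client, msgid, qname, qtype):
        """Forget the query once it has been answered."""
        self._pending.discard(self._key(client, msgid, qname, qtype))
```

Once finish() has been called, the same query is treated as new
again, so a later retry after the answer was sent is still resolved.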
Marek Vavrusa [Wed, 11 May 2016 00:45:12 +0000 (17:45 -0700)]
daemon: do not modify task for outgoing queries
if the upstream TCP query timed out or the connection
was severed, it would dissociate the handle from
the original query, so the query would be solved
but the requestor wouldn't see the answer unless
it re-queried
Marek Vavrusa [Fri, 6 May 2016 06:40:28 +0000 (23:40 -0700)]
lib: cleanup servfail soft-fails
* simplified soft-fail per-ns limit to per-query
limit, each query gets 4 tries at resolving
* instead of locking onto a single servfailing NS,
penalise it and run reelection. this may or may
not try other servers, but avoids the pathological
case when a single NS is servfailing while others
are good but never probed
* added new nsrep update mode (addition)
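The per-query limit combined with penalise-and-reelect can be
sketched as follows. The limit of 4 tries comes from the text above;
the penalty value, the score semantics (lower is better) and all
names are assumptions for illustration only.

```python
MAX_TRIES = 4           # per-query limit named in the commit message
SERVFAIL_PENALTY = 100  # assumed penalty added to a failing server's score


def resolve(scores, ask):
    """Try up to MAX_TRIES times; return the server that answered, else None.

    scores: dict mapping server name -> score (lower is preferred); mutated.
    ask:    callable(server) -> rcode string such as "NOERROR" or "SERVFAIL".
    """
    for _ in range(MAX_TRIES):
        server = min(scores, key=scores.get)  # (re)election: best score wins
        if ask(server) != "SERVFAIL":
            return server
        scores[server] += SERVFAIL_PENALTY    # additive update, then re-elect
    return None
```

With scores {"ns1": 10, "ns2": 50} and ns1 answering SERVFAIL, the
first failure pushes ns1 to 110, so the second try goes to ns2
instead of hammering ns1 again.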
Marek Vavrusa [Wed, 4 May 2016 00:17:53 +0000 (17:17 -0700)]
lib/validate: fixed memory bug
this code used the memory pool of the source packet
instead of the answer's, which could result in
reading invalidated memory if the memory occupied
by the source packet was rewritten
Marek Vavrusa [Tue, 3 May 2016 06:56:20 +0000 (23:56 -0700)]
daemon: out-of-order processing for TCP
* daemon now processes messages over TCP stream
out-of-order and concurrently
* support for TCP_DEFER_ACCEPT
* support for TCP Fast-Open
* there are now deadlines for TCP for idle/slow
streams (to prevent slowloris; pruning)
* there is now per-request limit on timeouts
(each request is allowed 4 timeouts before bailing)
* faster request closing, unified retry/timeout timers
* rare race condition in timer closing fixed
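Processing a TCP stream out of order first requires cutting it into
individual messages. DNS over TCP prefixes each message with a
two-byte big-endian length (RFC 1035, section 4.2.2); the sketch
below splits a receive buffer on that framing. split_messages is a
hypothetical helper, not code from the daemon.

```python
import struct


def split_messages(buf):
    """Split a DNS-over-TCP byte buffer into (complete messages, remainder)."""
    messages = []
    offset = 0
    while len(buf) - offset >= 2:
        (length,) = struct.unpack_from("!H", buf, offset)  # 2-byte length prefix
        if len(buf) - offset - 2 < length:
            break  # message not fully received yet; keep it in the remainder
        messages.append(buf[offset + 2:offset + 2 + length])
        offset += 2 + length
    return messages, buf[offset:]
```

Each complete message can then be handed to its own task, and the
answers sent back in whatever order they finish; the client matches
them to its queries by message ID.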
Marek Vavrusa [Mon, 18 Apr 2016 03:34:31 +0000 (20:34 -0700)]
daemon: mode(strict|normal|permissive)
the daemon now has three modes of strictness
checking, from strict to permissive.
they reflect the tradeoff between resolving the
query in as few steps as possible and security
for insecure zones
Marek Vavrusa [Mon, 18 Apr 2016 00:32:17 +0000 (17:32 -0700)]
engine: clear bad scorers from RTT every 5 minutes
an internal timer walks the RTT entries every
5 minutes and clears those with bad results.
this means that the penalty for a timed-out entry
is capped to that interval, making sure that a
bad reputation doesn't last forever
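The periodic sweep can be pictured as below. The threshold that marks
an entry as "bad" and the dictionary representation are assumptions
for the example; only the idea of dropping bad entries on a timer
comes from the text.

```python
BAD_RTT_MS = 2000  # assumed score at or above which an entry counts as bad


def prune_bad(rtt_ms_by_server, threshold=BAD_RTT_MS):
    """Remove entries with bad scores in place; return the removed servers."""
    bad = [server for server, rtt in rtt_ms_by_server.items() if rtt >= threshold]
    for server in bad:
        del rtt_ms_by_server[server]
    return bad
```

A server removed this way has no recorded score afterwards, so the
next query is free to probe it again.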
Marek Vavrusa [Mon, 18 Apr 2016 00:29:41 +0000 (17:29 -0700)]
engine: throttle outbound queries only when busy
the resolver now always attempts to contact upstreams
known to be bad when it is not busy.
this fixes a problem on low-volume resolvers
where a short connection outage could make the
resolver refuse to resolve queries even after the
connection is restored
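The decision described above reduces to a small predicate. The
function name, the notion of "busy" as a boolean and the threshold
are assumptions for the sketch, not the engine's actual logic.

```python
def may_contact(rtt_ms, busy, bad_rtt_ms=2000):
    """Decide whether to send a query to an upstream with the given RTT score."""
    if not busy:
        return True             # idle resolver: always worth a try
    return rtt_ms < bad_rtt_ms  # busy resolver: skip upstreams known to be bad
```

On a low-volume resolver `busy` is rarely true, so an upstream marked
bad during an outage is retried as soon as the next query arrives.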
Marek Vavrusa [Fri, 15 Apr 2016 07:03:13 +0000 (00:03 -0700)]
lib/iterate: QUERY_PERMISSIVE mode
in permissive mode, the resolver is free to use
(but not cache) non-mandatory glue records even
if they're not resolvable. this is great as a
workaround for broken child-side zones, but
not great for security of, well, insecure
delegations. it's off by default.