Vladimír Čunát [Wed, 19 Sep 2018 17:39:26 +0000 (19:39 +0200)]
worker: safer code around the mempool freelist
I did NOT remove this one, as a quick profile showed that removing it
would mean an increase of roughly 0.5% in time spent in malloc, so
keeping it is possibly justifiable.
(And this one obstructs splitting the worker code much less.)
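For readers unfamiliar with the pattern, a freelist of this kind can be sketched roughly as follows. This is a minimal illustration, not the worker's actual code: the names `freelist`, `pool_borrow`, `pool_release`, `freelist_clear` and both constants are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define POOL_SIZE    4096 /* hypothetical size of one pooled buffer */
#define FREELIST_MAX 8    /* hypothetical cap on cached buffers */

/* Bounded stack of released buffers kept around for reuse. */
struct freelist {
	void *items[FREELIST_MAX];
	size_t len;
};

/* Take a buffer from the freelist, or fall back to malloc() when it is empty. */
static void *pool_borrow(struct freelist *fl)
{
	assert(fl != NULL);
	if (fl->len > 0)
		return fl->items[--fl->len];
	return malloc(POOL_SIZE);
}

/* Give a buffer back; cache it if there is room, otherwise free it. */
static void pool_release(struct freelist *fl, void *buf)
{
	assert(fl != NULL);
	if (buf == NULL)
		return;
	if (fl->len < FREELIST_MAX)
		fl->items[fl->len++] = buf;
	else
		free(buf);
}

/* Free everything still cached, e.g. on shutdown. */
static void freelist_clear(struct freelist *fl)
{
	assert(fl != NULL);
	while (fl->len > 0)
		free(fl->items[--fl->len]);
}
```

The explicit bound and the NULL check are the kind of guards that keep such a list from growing without limit or caching an invalid pointer.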
Vladimír Čunát [Wed, 19 Sep 2018 16:39:17 +0000 (18:39 +0200)]
worker: remove freelists for iohandle and iorequest
A quick profiling showed no change in performance,
and in particular no change in time spent in malloc/free.
Some of the types in the union differed in size by a multiple.
If malloc/free performance turns out to be unsatisfactory, replacements
(e.g. jemalloc) should be considered first, before rolling our own stuff.
daemon: logic around struct session was relocated to a separate module; the input data buffering scheme was changed (libuv); an attempt was made to simplify stream processing
Vladimír Čunát [Fri, 14 Sep 2018 08:21:43 +0000 (10:21 +0200)]
misc nitpicks
- \param family, esp. don't rely on AF_UNSPEC being zero
- kres_gnutls_vec_push(): don't uv_write() if ENOMEM
- tls_client_params_clear(): remove unused function
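The AF_UNSPEC point can be illustrated with a small sketch. AF_UNSPEC is 0 on common platforms, but the code should not depend on that, so the constant is compared by name. The helper `family_matches` is hypothetical and not part of the tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/socket.h>

/* Hypothetical helper: does the caller's filter accept this address family? */
static bool family_matches(int wanted, int actual)
{
	/* Compare against AF_UNSPEC by name; writing `!wanted` would
	 * silently assume that the constant is zero. */
	if (wanted == AF_UNSPEC)
		return true;
	return wanted == actual;
}
```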
Marek Vavruša [Fri, 17 Aug 2018 07:43:36 +0000 (00:43 -0700)]
daemon/worker: fix error handling of TLS writes
The error handling loop for uncorking TLS data was wrong, as the
underlying push function is asynchronous and there's no relationship
between completed DNS packet writes and number of TLS message writes.
With an asynchronous function, the buffered data must stay valid
until the write completes. Currently this is not guaranteed, and
loading the resolver with pipelined requests results in memory errors:
```
$ getdns_query @127.0.0.1#853 -s -a -s -l L -B -F queries -q
...
==47111==ERROR: AddressSanitizer: heap-use-after-free on address 0x6290040a1253 at pc 0x00010da960d3 bp 0x7ffee2628b30 sp 0x7ffee26282e0
READ of size 499 at 0x6290040a1253 thread T0
#0 0x10da960d2 in wrap_write (libclang_rt.asan_osx_dynamic.dylib:x86_64h+0x1f0d2)
#1 0x10d855971 in uv__write (libuv.1.dylib:x86_64+0xf971)
#2 0x10d85422e in uv__stream_io (libuv.1.dylib:x86_64+0xe22e)
#3 0x10d85b35a in uv__io_poll (libuv.1.dylib:x86_64+0x1535a)
#4 0x10d84c644 in uv_run (libuv.1.dylib:x86_64+0x6644)
#5 0x10d602ddf in main main.c:422
#6 0x7fff6a28a014 in start (libdyld.dylib:x86_64+0x1014)
0x6290040a1253 is located 83 bytes inside of 16895-byte region [0x6290040a1200,0x6290040a53ff)
freed by thread T0 here:
#0 0x10dacdfdd in wrap_free (libclang_rt.asan_osx_dynamic.dylib:x86_64h+0x56fdd)
#1 0x10d913c2e in _mbuffer_head_remove_bytes (libgnutls.30.dylib:x86_64+0xbc2e)
#2 0x10d915080 in _gnutls_io_write_flush (libgnutls.30.dylib:x86_64+0xd080)
#3 0x10d90ca18 in _gnutls_send_tlen_int (libgnutls.30.dylib:x86_64+0x4a18)
#4 0x10d90edde in gnutls_record_send2 (libgnutls.30.dylib:x86_64+0x6dde)
#5 0x10d90f085 in gnutls_record_uncork (libgnutls.30.dylib:x86_64+0x7085)
#6 0x10d5f6569 in tls_push tls.c:238
#7 0x10d5e5b2a in qr_task_send worker.c:1002
#8 0x10d5e2ea6 in qr_task_finalize worker.c:1562
#9 0x10d5dab99 in qr_task_step worker.c
#10 0x10d5e12fe in worker_process_tcp worker.c:2410
```
The current implementation adds an opportunistic uv_try_write, which
either writes the requested data or returns UV_EAGAIN or another error,
in which case the code falls back to a slower asynchronous write that
copies the buffered data.
The function signature is changed from a simple write to a vectorized write.
This also enables TLS False Start to save 1 RTT when possible.
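The try-first, copy-on-fallback idea can be sketched without libuv. The sketch below uses POSIX writev() on a file descriptor assumed to be non-blocking as a stand-in for uv_try_write; `pending_write`, `copy_remaining` and `push_vec` are hypothetical names, and the real code queues an asynchronous libuv write instead of just returning the copy.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical record that owns a private copy of the unwritten data. */
struct pending_write {
	char *data;
	size_t len;
};

/* Copy iov[] from byte offset `skip` onwards into a fresh heap buffer. */
static int copy_remaining(const struct iovec *iov, int iovcnt, size_t skip,
                          struct pending_write *out)
{
	size_t total = 0;
	for (int i = 0; i < iovcnt; ++i)
		total += iov[i].iov_len;
	assert(skip <= total);
	out->len = total - skip;
	out->data = malloc(out->len ? out->len : 1);
	if (out->data == NULL)
		return -ENOMEM;
	size_t pos = 0;
	for (int i = 0; i < iovcnt; ++i) {
		const char *base = iov[i].iov_base;
		size_t len = iov[i].iov_len;
		if (skip >= len) { /* this chunk was already written in full */
			skip -= len;
			continue;
		}
		memcpy(out->data + pos, base + skip, len - skip);
		pos += len - skip;
		skip = 0;
	}
	return 0;
}

/* Returns 0 if everything was written immediately, 1 if the remainder was
 * copied into *out for a later asynchronous write, or a negative errno. */
static int push_vec(int fd, const struct iovec *iov, int iovcnt,
                    struct pending_write *out)
{
	size_t total = 0;
	for (int i = 0; i < iovcnt; ++i)
		total += iov[i].iov_len;
	out->data = NULL;
	out->len = 0;
	ssize_t n = writev(fd, iov, iovcnt); /* fast path, no copying */
	if (n < 0) {
		if (errno != EAGAIN && errno != EWOULDBLOCK)
			return -errno;
		n = 0; /* nothing written yet, copy everything */
	}
	if ((size_t)n == total)
		return 0;
	int ret = copy_remaining(iov, iovcnt, (size_t)n, out);
	return ret < 0 ? ret : 1;
}
```

The copy exists so that the data outlives the caller's buffer, which is exactly what the trace above shows being violated.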
Vladimír Čunát [Wed, 12 Sep 2018 12:59:46 +0000 (14:59 +0200)]
cache: improve out-of-disk condition
When a suspect SIGBUS happens, print a helpful error and try to remove
the cache, so that the service might work again if auto-restarted.
Theoretically we could longjmp() out of the SIGBUS handler,
but that would be rather messy, so let the process die.
Petr Špaček [Thu, 23 Aug 2018 08:16:50 +0000 (10:16 +0200)]
ci: update Deckard in attempt to make CI more reliable
Changes related to monotonic fake time and detection logic for overload
should make CI a little bit more reliable. It should be even better once
we combine overload-detection with some kind of auto-retry.
Petr Špaček [Fri, 17 Aug 2018 13:40:20 +0000 (15:40 +0200)]
cache.clear: clearing root clears everything, not only the root zone
The problem was caused by our lookup format, in which only the root zone
starts with \0 and all other zones start differently. As a result,
cache_match('.') matched only data from the root zone.
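The effect can be reproduced with a simplified model. The key layout below is an assumption made for illustration (labels in reverse order, each followed by \0, with the root as a single \0) and is not necessarily the exact format used by the cache; `lf_key`, `match_naive` and `match_subtree` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed, simplified lookup format: "example.com." -> "com\0example\0",
 * and the root name "." -> "\0". Keys carry an explicit length. */
struct lf_key {
	const char *data;
	size_t len;
};

static bool lf_is_root(struct lf_key k)
{
	return k.len == 1 && k.data[0] == '\0';
}

/* Naive prefix match: with the root prefix "\0" it matches only keys that
 * start with \0, i.e. only the root zone itself. */
static bool match_naive(struct lf_key prefix, struct lf_key key)
{
	return key.len >= prefix.len
		&& memcmp(key.data, prefix.data, prefix.len) == 0;
}

/* Root-aware match: the root is an ancestor of every name, so it matches
 * every key; other prefixes behave as before. */
static bool match_subtree(struct lf_key prefix, struct lf_key key)
{
	if (lf_is_root(prefix))
		return true;
	return match_naive(prefix, key);
}
```

In this model, special-casing the root prefix is what makes clearing '.' cover the whole cache.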
Petr Špaček [Fri, 17 Aug 2018 12:55:56 +0000 (14:55 +0200)]
remove memcached and redis modules from source tree
The source was kept for historical reasons but has not been in use since 2.0.0.
It is now clear that there are better approaches to implementing a
distributed cache, so it is pointless to keep the old code in the tree and
confuse users.