Brad Cowie [Fri, 30 May 2025 01:57:25 +0000 (13:57 +1200)]
datamodel/templates: fix kr_rule_local_* macros
Commit a782e9c3 broke the jinja2 generation of the
kr_rule_local_* macro functions: C.KR_RULE_OPTS_DEFAULT
was passed as an argument to the assert() call
instead of to the corresponding C.kr_rule_local_* function call.
Vladimír Čunát [Wed, 28 May 2025 12:35:46 +0000 (14:35 +0200)]
datamodel: hide /local-data/rpz/*/dry-run for now
While this can be practical, let's not promise this approach to
configuration until it's clearer how the more general score
will appear in the config.
Vladimír Čunát [Sun, 25 May 2025 08:17:44 +0000 (10:17 +0200)]
prefill: download through a temporary file
File rename is an atomic operation, so that's a plus.
We had a practical issue with the canary process,
as (for me) it exits somewhere during the download; example log:
kresd0[912938]: [prefil] downloading root zone to file root.zone ...
kresd0[912942]: [prefil] root zone file valid for 11 hours 59 minutes, reusing data from disk
kresd0[912942]: [prefil] empty zone file
kresd0[912942]: [prefil] error parsing zone file `root.zone`
kresd0[912942]: [prefil] root zone import failed, retry in 01 seconds
kresd0[912942]: [prefil] downloading root zone to file root.zone ...
kresd0[912942]: [prefil] import started for zone file `root.zone`
kresd0[912942]: [prefil] performance: parsing took 0.832 s, hashing took nan s
kresd0[912942]: [prefil] zone successfully parsed, import started
kresd0[912942]: [prefil] root zone refresh in 11 hours 59 minutes
kresd0[912942]: [prefil] performance: validating and caching took 0.736 s
Also avoid the unnecessary pcall+error. Why throw and immediately catch?
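The temporary-file approach can be sketched in C roughly as follows. This is only an illustration of the write-then-rename pattern, not the prefill module's code: the function name and the ".tmp" suffix are assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write `data` to `path` by first writing a temporary sibling file and then
 * renaming it over the destination.  Returns 0 on success, -1 on error.
 * The function name and the ".tmp" suffix are hypothetical. */
int write_file_atomically(const char *path, const char *data)
{
	assert(path && data);

	char tmp_path[4096];
	int len = snprintf(tmp_path, sizeof(tmp_path), "%s.tmp", path);
	if (len < 0 || (size_t)len >= sizeof(tmp_path))
		return -1; /* path too long for the buffer */

	FILE *f = fopen(tmp_path, "w");
	if (!f)
		return -1;

	size_t size = strlen(data);
	int ok = fwrite(data, 1, size, f) == size;
	/* fclose() flushes buffered data; a failure means the file may be incomplete. */
	if (fclose(f) != 0)
		ok = 0;

	/* Readers of `path` see either the old content or the complete new one. */
	if (!ok || rename(tmp_path, path) != 0) {
		remove(tmp_path); /* do not leave a partial file behind */
		return -1;
	}
	return 0;
}
```

A process that exits in the middle of the write leaves at most a stale temporary file, so another process never parses a half-written destination file.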
Vladimír Čunát [Sun, 25 May 2025 08:17:44 +0000 (10:17 +0200)]
prefill nit: better error messages
Avoid the ugly cdata<const char *>: 0x7fe6202c7f80
Moreover the return code is -1 in my test case,
but that does not imply EPERM: Operation not permitted.
It was all an unnecessary mess, including the pcall+error() pair.
Also avoid some double-wrapping by '[prefil]'.
Vladimír Čunát [Thu, 3 Apr 2025 12:13:28 +0000 (14:13 +0200)]
doc: better build parallelism
`auto` isn't perfect because of nested parallelism,
but I don't see another simple way here,
and I hope the potential slight overload will be OK for docs.
Vladimír Čunát [Sat, 10 May 2025 09:11:14 +0000 (11:11 +0200)]
daemon/session2_{inc,dec}_refs() nit: allow compiler to inline
The `inline + extern inline` combination is kind-of arcane,
but I find it nice to leave it to the compiler whether to inline or not.
(in particular, in debug builds it's probably better not to inline this)
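For readers unfamiliar with the combination, here is a minimal single-file sketch of the C99 pattern. The struct and function names are hypothetical stand-ins, not the daemon's actual definitions. In a real tree the inline definitions would live in a header and the `extern inline` declarations in exactly one .c file.

```c
#include <assert.h>

/* Hypothetical stand-in for a session object; only the counter matters here. */
struct session {
	int ref_count;
};

/* Inline definitions, as they would appear in a header (C99 semantics).
 * On their own they do not emit an out-of-line copy of the function. */
inline void session_inc_refs(struct session *s)
{
	assert(s);
	s->ref_count++;
}

inline void session_dec_refs(struct session *s)
{
	assert(s && s->ref_count > 0);
	s->ref_count--;
}

/* These declarations make this translation unit provide the external
 * definitions.  The compiler may then either inline a call or call the
 * out-of-line copy, e.g. in unoptimized debug builds. */
extern inline void session_inc_refs(struct session *s);
extern inline void session_dec_refs(struct session *s);
```

Without the `extern inline` declarations, a call that the compiler decides not to inline could fail at link time, because no external definition would exist anywhere.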
Vladimír Čunát [Thu, 24 Apr 2025 08:10:44 +0000 (10:10 +0200)]
NEWS: classify these issues as "security"
If an attacker can control a client's queries
(and register names with malicious auths),
with enough work they can probably trigger the conditions often.
Vladimír Čunát [Thu, 24 Apr 2025 08:10:44 +0000 (10:10 +0200)]
daemon/session2_tasklist_del(): be more defensive
I don't expect we still have a bug here, but even so -
if this assertion fails, I don't think we need to force a crash.
A recoverable assertion seems a better choice here.
The tasks on the waitinglist are not present in the tasklist,
so let's not incorrectly attempt removal in this case.
We didn't check the return value here, and the disconnection event
won't even happen in the typical cases, so this went unnoticed -
until the deletion actually did find a matching msgid (lucky!)
by a *different* task (of course) which triggered an assertion (crash).
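To illustrate the failure mode, here is a small sketch of a msgid-keyed list whose deletion refuses to remove an entry owned by a different task and reports that through its return value. All types and functions below are hypothetical and much simpler than the daemon's; this is not the fix applied in the commit, only a picture of why the return value matters.

```c
#include <assert.h>
#include <stddef.h>

enum { TASKLIST_CAP = 16 };

/* Hypothetical task type; the content is irrelevant for the example. */
struct task { int id; };

/* Hypothetical task list keyed by DNS message ID (assumed unique per list). */
struct tasklist {
	struct {
		int used;
		unsigned msgid;
		const struct task *task;
	} slots[TASKLIST_CAP];
};

/* Store `task` under `msgid`.  Returns 0 on success, -1 if the list is full. */
int tasklist_add(struct tasklist *list, unsigned msgid, const struct task *task)
{
	assert(list && task);
	for (size_t i = 0; i < TASKLIST_CAP; i++) {
		if (!list->slots[i].used) {
			list->slots[i].used = 1;
			list->slots[i].msgid = msgid;
			list->slots[i].task = task;
			return 0;
		}
	}
	return -1;
}

/* Remove the entry for `msgid` only if it belongs to `task`.
 * Returns 0 on removal, -1 if there is no such entry or another task owns it. */
int tasklist_del(struct tasklist *list, unsigned msgid, const struct task *task)
{
	assert(list && task);
	for (size_t i = 0; i < TASKLIST_CAP; i++) {
		if (list->slots[i].used && list->slots[i].msgid == msgid) {
			if (list->slots[i].task != task)
				return -1; /* same msgid, but owned by a different task */
			list->slots[i].used = 0;
			return 0;
		}
	}
	return -1;
}
```

A caller that checks the result can treat -1 as a recoverable condition instead of silently removing, or crashing on, someone else's entry.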
Vladimír Čunát [Thu, 24 Apr 2025 08:10:44 +0000 (10:10 +0200)]
daemon/worker send_waiting(): be more defensive
We encountered non-recoverable assertions due to popping
from an empty queue here, but I see no reason to block recovery here.
I'm still keeping it as a soft assertion until it's better understood.
I *suspect* what happened is that:
- multiple queries queued up before outgoing TCP handshake completed
- the session got into closing state for some reason
*before* processing this whole queue
- during that the queue got emptied
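The difference between a hard and a soft assertion in this situation can be sketched as below. The `SOFT_ASSERT` macro, the queue type and the function names are assumptions made for the illustration and do not mirror the daemon's own macros or data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical soft assertion: log the failed condition and evaluate to false
 * instead of aborting, so that the caller can take a recovery path. */
#define SOFT_ASSERT(cond) \
	((cond) ? true : (fprintf(stderr, "soft assertion failed: %s\n", #cond), false))

enum { QUEUE_CAP = 8 };

/* Minimal fixed-capacity FIFO of query IDs, standing in for the waiting queue. */
struct waiting_queue {
	int items[QUEUE_CAP];
	size_t head, len;
};

/* Append `id`; returns false if the queue is full. */
bool queue_push(struct waiting_queue *q, int id)
{
	assert(q); /* a NULL queue is a programming error, not a runtime state */
	if (q->len == QUEUE_CAP)
		return false;
	q->items[(q->head + q->len) % QUEUE_CAP] = id;
	q->len++;
	return true;
}

/* Pop the oldest entry into *out.  An empty queue is reported through the
 * soft assertion and a false return value rather than by aborting. */
bool queue_pop(struct waiting_queue *q, int *out)
{
	assert(q && out);
	if (!SOFT_ASSERT(q->len > 0))
		return false;
	*out = q->items[q->head];
	q->head = (q->head + 1) % QUEUE_CAP;
	q->len--;
	return true;
}

/* Try to process `expected` waiting entries; returns how many were processed.
 * If the queue was emptied in the meantime, the loop simply stops early. */
size_t process_waiting(struct waiting_queue *q, size_t expected)
{
	size_t done = 0;
	int id;
	while (done < expected && queue_pop(q, &id))
		done++; /* a real implementation would send the query here */
	return done;
}
```

With two queued entries and three expected, `process_waiting()` handles the two, logs one soft-assertion failure and returns, where a hard assertion would have terminated the process.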
Vladimír Čunát [Fri, 3 May 2024 07:40:49 +0000 (09:40 +0200)]
doc/user/gettingstarted-startup.rst: less strong formulation
Some distros do enable knot-resolver.service on installation.
For example, in a quick test in a CentOS 9 LXC it didn't start
immediately, but it did after restarting the container.
I believe that the customs of each distro should be followed here.