docs: config: performance subsection

author Aleš Mrázek <ales.mrazek@nic.cz>

Tue, 11 Jul 2023 14:05:38 +0000 (16:05 +0200)

committer Vladimír Čunát <vladimir.cunat@nic.cz>

Tue, 8 Aug 2023 06:50:52 +0000 (08:50 +0200)
diff --git a/doc/config-cache-predict.rst b/doc/config-cache-predict.rst

new file mode 100644 (file)

index 0000000..f25f95b
--- /dev/null
+++ b/doc/config-cache-predict.rst
@@ -0,0 +1,82 @@
+.. SPDX-License-Identifier: GPL-3.0-or-later
+
+.. _config-cache-predict:
+
+Prefetching records
+===================
+
+Prefetching records helps to keep the cache hot.
+It can utilize two independent mechanisms to select the records which should be refreshed:
+expiring records and prediction.
+
+Expiring records
+----------------
+
+This mechanism is always active when prefetching is enabled, and it is not configurable.
+
+Whenever the resolver answers with records that are about to expire,
+they get refreshed. A record counts as expiring if less than 1% of its TTL remains (or less than 5 s).
+This improves latency for records that are queried frequently relative to their TTL.
+
+Prediction
+----------
+
+The resolver can also learn usage patterns and repetitive queries,
+though this mechanism is a prototype and **not recommended** for use in production or with high traffic.
+
+For example, if the same query arrives every day at 18:00,
+the resolver expects the answer to be needed at that time and prefetches it in advance.
+This minimizes the perceived latency and keeps the cache hot.
+
+You can disable prediction by configuring :option:`period <period: <int>>` to ``0``.
+
+.. tip::
+
+   The tracking window and period length determine memory requirements.
+   If you have a server with relatively fast query turnover, keep the period short (an hour to start with) and the tracking window short (5 minutes).
+   For a slower personal resolver, keep the tracking window longer (e.g. 30 minutes) and the period longer (a day), as habitual queries occur daily.
+   Experiment to get the best results.
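+
+As a rough sketch of the two scenarios in the tip above, assuming ``period`` counts tracking windows
+(so one hour of 5-minute windows is 12, and one day of 30-minute windows is 48),
+the settings might look like this. The values are illustrative starting points, not tuned recommendations.
+
+.. code-block:: yaml
+
+   cache:
+     # busy server with fast query turnover
+     # this mode is NOT RECOMMENDED for use in production
+     prediction:
+       window: 5m   # 5 minutes sampling window
+       period: 12   # 12 x 5m = track last hour
+
+.. code-block:: yaml
+
+   cache:
+     # slower personal resolver
+     prediction:
+       window: 30m  # 30 minutes sampling window
+       period: 48   # 48 x 30m = track last day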
+
+
+Configuration
+-------------
+
+.. option:: cache/prediction: true|false|<options>
+
+   :default: false
+
+   .. option:: window: <time ms|s|m|h|d>
+
+      :default: 15m
+
+   .. option:: period: <int>
+
+      :default: 24
+
+These options set the predictor's tracking window and period length. Both parameters are optional.
+The window is a time value with a unit (e.g. ``15m``); the period is the number of windows that can be kept in memory.
+E.g. if the ``window`` is 15 minutes, a ``period`` of ``24`` means 6 hours (15 × 24 = 360 minutes).
+
+.. code-block:: yaml
+
+   cache:
+     # this mode is NOT RECOMMENDED for use in production
+     prediction:
+       window: 15m  # 15 minutes sampling window
+       period: 24   # track last 6 hours
+
+It is also possible to enable prediction with defaults for :option:`window <window: <time ms|s|m|h|d>>` and :option:`period <period: <int>>`.
+
+.. code-block:: yaml
+
+   cache:
+     prediction: true
+
+Exported metrics
+----------------
+
+To visualize the efficiency of the predictions, the following statistics are exported.
+
+* ``predict.epoch`` - current prediction epoch (based on time of day and sampling window)
+* ``predict.queue`` - number of queued queries in current window
+* ``predict.learned`` - number of learned queries in current window
diff --git a/doc/config-cache-prefill.rst b/doc/config-cache-prefill.rst

index d8a8678447759c581475b2736a181a89ce144fe7..b0a24578ec11e7ab6c363bf84432613a4fdf240c 100644 (file)
--- a/doc/config-cache-prefill.rst
+++ b/doc/config-cache-prefill.rst
@@ -9,8 +9,8 @@ This provides ability to periodically prefill the DNS cache by importing root zo
  
  Intended users of this module are big resolver operators which will benefit from decreased latencies and smaller amount of traffic towards DNS root servers.
  
-
  .. option:: cache/prefill: <list>
+
     .. option:: origin: <zone name>
  
        Name of the zone, only root zone import is supported at the moment.
@@ -20,6 +20,7 @@ Intended users of this module are big resolver operators which will benefit from
        URL of a file in :rfc:`1035` zone file format.
  
     .. option:: refresh-interval: <time ms|s|m|h|d>
+
        :default: 1d
  
        Time between zone data refresh attempts.
diff --git a/doc/config-cache.rst b/doc/config-cache.rst

index d6b2cd8cc1e8c35884d071c5517435125fad6402..11c85cbdd80dc39aa9d67498d966bfff8398baba 100644 (file)
--- a/doc/config-cache.rst
+++ b/doc/config-cache.rst
@@ -5,9 +5,8 @@
  Cache
  =====
  
-Cache in Knot Resolver is stored on disk and also shared between
-:ref:`systemd-multiple-instances` so resolver doesn't lose the cached data on
-restart or crash.
+The cache in Knot Resolver is shared between :ref:`multiple workers <config-multiple-workers>`
+and stored in a file, so the resolver doesn't lose the cached data on restart or crash.
  
  To improve performance even further the resolver implements so-called aggressive caching
  for DNSSEC-validated data (:rfc:`8198`), which improves performance and also protects
@@ -99,12 +98,12 @@ config file as well.
  Configuration reference
  -----------------------
  
-
  .. option:: cache/storage: <dir>
-   :default: /var/cache/knot-resolver
  
+   :default: /var/cache/knot-resolver
  
  .. option:: cache/size-max: <size B|K|M|G>
+
     :default: 100M
  
  .. note:: Use ``B, K, M, G`` bytes units prefixes.
@@ -118,14 +117,14 @@ Note that the maximum size cannot be lowered, only increased due to how cache is
        storage: /var/cache/knot-resolver
        size-max: 400M
  
-
  .. option:: cache/ttl-max: <time ms|s|m|h|d>
-   :default: 1d (1 day)
  
-   Higher TTL bound applied to all received records.
+   :default: 1d
  
+   Higher TTL bound applied to all received records.
  
  .. option:: cache/ttl-min: <time ms|s|m|h|d>
+
     :default: 5s
  
     Lower TTL bound applied to all received records.
@@ -139,10 +138,10 @@ Note that the maximum size cannot be lowered, only increased due to how cache is
        ttl-max: 2d
        ttl-min: 20s
  
-
  .. option:: cache/ns-timeout: <time ms|s|m|h|d>
+
     :default: 1000ms
  
     Time interval for which a nameserver address will be ignored after determining that it doesn't return (useful) answers.
-   The intention is to avoid waiting if there's little hope; instead, kresd can immediately SERVFAIL or immediately use stale records (with :ref:`serve_stale <mod-serve_stale>` module).
+   The intention is to avoid waiting if there's little hope; instead, kresd can immediately SERVFAIL or immediately use stale records (with :ref:`serve-stale <config-serve-stale>`).
  
diff --git a/doc/config-edns-keepalive.rst b/doc/config-edns-keepalive.rst

new file mode 100644 (file)

index 0000000..7823cc0
--- /dev/null
+++ b/doc/config-edns-keepalive.rst
@@ -0,0 +1,23 @@
+.. SPDX-License-Identifier: GPL-3.0-or-later
+
+.. _config-edns-keepalive:
+
+EDNS keepalive
+==============
+
+Implementation of :rfc:`7828` for *clients*
+connecting to Knot Resolver via TCP and TLS.
+It only allows clients to discover the connection timeout;
+client connections are always timed out the same way, *regardless*
+of whether clients send the EDNS option.
+
+When connecting to servers, Knot Resolver does not send this EDNS option.
+It still attempts to reuse established connections intelligently.
+
+It is enabled by default. For debugging purposes, it can be
+disabled in the configuration file.
+
+.. code-block:: yaml
+
+   options:
+     edns-tcp-keepalive: false
diff --git a/doc/config-multiple-workers.rst b/doc/config-multiple-workers.rst

new file mode 100644 (file)

index 0000000..b31e9e1
--- /dev/null
+++ b/doc/config-multiple-workers.rst
@@ -0,0 +1,30 @@
+.. SPDX-License-Identifier: GPL-3.0-or-later
+
+.. _config-multiple-workers:
+
+Multiple workers
+================
+
+Knot Resolver can utilize multiple CPUs by running multiple independent workers (processes), where each process uses at most a single CPU core on your machine.
+If your machine handles a lot of DNS traffic, configure multiple workers.
+
+All workers typically share the same configuration and cache, and incoming queries are automatically distributed by the operating system among all workers.
+
+An advantage of using multiple workers is that a problem in a single worker will not affect the others, so a single worker crash will not bring the whole resolver service down.
+
+.. tip::
+
+   For maximum performance, there should be as many worker processes as there are available CPU threads.
+
+To run multiple workers, set their number in the configuration file.
+
+.. code-block:: yaml
+
+   workers: 4
+
+You can also let the resolver detect the number of available CPU threads automatically.
+If there is a problem, the configuration should not pass the validation process.
+
+.. code-block:: yaml
+
+   workers: auto
diff --git a/doc/config-performance.rst b/doc/config-performance.rst

index 9df0f93e309c379661e427b01eb6cd468e7eccb3..dbd8d12d22c58b2dfd7d0e527ffc0ef7f1b3a685 100644 (file)
--- a/doc/config-performance.rst
+++ b/doc/config-performance.rst
@@ -1,6 +1,6 @@
  .. SPDX-License-Identifier: GPL-3.0-or-later
  
-.. _performance:
+.. _config-performance:
  
  **************************
  Performance and resiliency
@@ -10,12 +10,12 @@ For DNS resolvers, the most important parameter from performance perspective
  is cache hit rate, i.e. percentage of queries answered from resolver's cache.
  Generally the higher cache hit rate the better.
  
-Performance tunning should start with cache :ref:`cache_sizing`
-and :ref:`cache_persistence`.
+Performance tuning should start with cache :ref:`config-cache-sizing`
+and :ref:`config-cache-persistence`.
  
-It is also recommended to run :ref:`systemd-multiple-instances` (even on a
-single machine!) because it allows to utilize multiple CPU threads and
-increases overall resiliency.
+.. It is also recommended to run :ref:`systemd-multiple-instances` (even on a
+.. single machine!) because it allows to utilize multiple CPU threads and
+.. increases overall resiliency.
  
  Other features described in this section can be used for fine-tunning
  performance and resiliency of the resolver but generally have much smaller
@@ -24,13 +24,11 @@ impact than cache settings and number of instances.
  .. toctree::
     :maxdepth: 1
  
-   daemon-bindings-cache
-   systemd-multiinst
-   modules-predict
-   modules-prefill
-   modules-serve_stale
-   modules-rfc7706
-   modules-priming
-   modules-edns_keepalive
-   daemon-bindings-net_xdpsrv
-
+   config-cache
+   config-multiple-workers
+   config-cache-predict
+   config-cache-prefill
+   config-serve-stale
+   config-rfc7706
+   config-priming
+   config-edns-keepalive
diff --git a/doc/config-priming.rst b/doc/config-priming.rst

new file mode 100644 (file)

index 0000000..37a3c4a
--- /dev/null
+++ b/doc/config-priming.rst
@@ -0,0 +1,22 @@
+.. SPDX-License-Identifier: GPL-3.0-or-later
+
+.. _config-priming:
+
+Priming
+=======
+
+Initializing a DNS Resolver with Priming Queries, implemented
+according to :rfc:`8109`. Its purpose is to keep an up-to-date list of
+root DNS servers and associated IP addresses.
+
+The result of a successful priming query replaces the root hints distributed with
+the resolver software. Unlike other DNS resolvers, Knot Resolver caches
+the result of the priming query on disk and keeps the data between restarts until
+the TTL expires.
+
+Priming is enabled by default; you may disable it in the configuration file.
+
+.. code-block:: yaml
+
+   options:
+     priming: false
diff --git a/doc/config-rfc7706.rst b/doc/config-rfc7706.rst

new file mode 100644 (file)

index 0000000..ed6fd57
--- /dev/null
+++ b/doc/config-rfc7706.rst
@@ -0,0 +1,12 @@
+.. SPDX-License-Identifier: GPL-3.0-or-later
+
+Root on loopback (RFC 7706)
+---------------------------
+Knot Resolver developers think that a literal implementation of :rfc:`7706`
+("Decreasing Access Time to Root Servers by Running One on Loopback")
+is a bad idea, so it is not implemented in the form envisioned by the RFC.
+
+You can get a very similar effect without its downsides by combining the
+:ref:`config-cache-prefill` and :ref:`config-serve-stale` modules with the Aggressive Use
+of DNSSEC-Validated Cache (:rfc:`8198`) behavior, which is enabled
+automatically together with DNSSEC validation.
diff --git a/doc/config-serve-stale.rst b/doc/config-serve-stale.rst

new file mode 100644 (file)

index 0000000..1377e53
--- /dev/null
+++ b/doc/config-serve-stale.rst
@@ -0,0 +1,23 @@
+.. SPDX-License-Identifier: GPL-3.0-or-later
+
+.. _config-serve-stale:
+
+Serve stale
+===========
+
+This allows using timed-out records in case the resolver is unable to contact upstream servers.
+
+By default it allows staleness of up to one day,
+after roughly four seconds of trying to contact the servers.
+It is quite configurable; see the beginning of the module source for details.
+See also the RFC draft_ (not fully followed) and :option:`cache/ns-timeout <cache/ns-timeout: <time ms|s|m|h|d>>`.
+
+Running
+-------
+
+.. code-block:: yaml
+
+    options:
+      serve-stale: true
+
+.. _draft: https://tools.ietf.org/html/draft-ietf-dnsop-serve-stale-00
diff --git a/doc/index.rst b/doc/index.rst

index 5dd1ab52262549a827bc0c3d761e34ace4ef7ba8..73084c6d332a8b06c3d400026fb64df5ee0a2b8b 100644 (file)
--- a/doc/index.rst
+++ b/doc/index.rst
@@ -5,7 +5,7 @@ Knot Resolver
  #############
  
  Welcome to Knot Resolver's documentation!
-Knot Resolver is an opensource implementation of a caching validating DNS resolver.
+Knot Resolver is an open-source implementation of a caching validating DNS resolver.
  Modular architecture keeps the core tiny and efficient, and it also provides a state-machine like API for extensions.
  
  If you are a new user, please start with chapter for :ref:`getting started <gettingstarted>`.
@@ -27,6 +27,7 @@ If you are a new user, please start with chapter for :ref:`getting started <gett
     config-overview
     usecase-network-interfaces
     config-policy-new
+   config-performance
     config-logging-monitoring
     config-dnssec
     config-lua