Performance and Tuning
======================

In general, best performance is achieved on recent Linux 4.x kernels and
using MySQL, although many of the largest PowerDNS installations are
based on PostgreSQL. FreeBSD also performs very well.

Database servers can require configuration to achieve decent
performance. It is especially worth noting that several vendors ship
PostgreSQL with a slow default configuration.

.. warning::
  When deploying (large scale) IPv6, please be aware some
  Linux distributions leave IPv6 routing cache tables at very small
  default values. Please check and if necessary raise
  ``sysctl net.ipv6.route.max_size``.

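The check mentioned in the warning above can be scripted. The following is a
minimal sketch, assuming a Linux host where the sysctl is exposed under
``/proc/sys``; the helper names and any target value you compare against are
illustrative and not part of PowerDNS.

```python
from pathlib import Path

# Linux exposes the sysctl net.ipv6.route.max_size under this procfs path.
DEFAULT_SYSCTL_PATH = Path("/proc/sys/net/ipv6/route/max_size")


def read_route_max_size(path=DEFAULT_SYSCTL_PATH):
    """Return the current value as an int, or None if the file is absent."""
    try:
        return int(Path(path).read_text().strip())
    except FileNotFoundError:
        # IPv6 may be disabled, or this is not a Linux host.
        return None


def is_too_small(current, wanted):
    """True when the value is known and lower than the (hypothetical) target."""
    return current is not None and current < wanted
```

Calling ``is_too_small(read_route_max_size(), wanted)`` with a target sized
for your own deployment tells you whether the value should be raised.
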
Performance related settings
----------------------------

When PowerDNS starts up it creates a number of threads to listen for
packets. This is configurable with the
:ref:`setting-receiver-threads` setting, which
defines how many sockets will be opened by the PowerDNS process. On
Linux kernels before 3.9, having too many receiver threads
resulted in decreased performance due to socket contention between
multiple CPUs; the typical sweet spot was 3 or 4. For optimal
performance on kernel 3.9 and later with
:ref:`setting-reuseport` enabled, you'll typically want
a receiver thread for each core on your box, provided backend
latency/performance is not an issue.

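The rule of thumb above can be written down as a small helper. This is only a
sketch of the guidance in this section, not something PowerDNS computes
itself; the function names are hypothetical, and the cap of 4 is taken from
the "3 or 4" sweet spot mentioned for older kernels.

```python
import re


def parse_kernel_version(release):
    """Extract (major, minor) from a release string such as '4.19.0-25-amd64'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def suggest_receiver_threads(release, cores, reuseport=True):
    """Suggest a receiver-threads value following the rule of thumb above."""
    if reuseport and parse_kernel_version(release) >= (3, 9):
        # Kernel 3.9 and later with reuseport: one receiver thread per core.
        return cores
    # Older kernels or no reuseport: stay at the old sweet spot of at most 4.
    return min(cores, 4)
```

On a live system the inputs could come from ``platform.release()`` and
``os.cpu_count()``. Treat the result as a starting point for benchmarking,
since backend latency may call for a different value.
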
Different backends will have different characteristics: some will want
to have more parallel instances than others. In general, if your backend
is latency bound, like most relational databases are, it pays to open
more backends.

This is done with the
:ref:`setting-distributor-threads` setting,
which says how many distributors will be opened for each receiver
thread. Of special importance is the choice between a single distributor
thread and several. With only 1 thread, PowerDNS reverts to unthreaded
operation, which may be a lot faster, depending on your operating system and
architecture.

Another very important setting is
:ref:`setting-cache-ttl`. PowerDNS caches entire
packets it sends out so as to save the time to query backends to
assemble all data. The default setting of 20 seconds may be low for high
traffic sites; a value of 60 seconds rarely leads to problems. Please be
aware that if any TTL in the answer is shorter than this setting, the
packet cache will respect the answer's shortest TTL.

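The interaction between ``cache-ttl`` and the TTLs in an answer amounts to
taking a minimum. As a worked sketch (the function name is illustrative and
this is not PowerDNS's own code):

```python
def effective_packet_cache_ttl(cache_ttl, answer_ttls):
    """Seconds an answer may stay cached: cache-ttl, capped by the
    shortest TTL among the records in the answer."""
    return min([cache_ttl, *answer_ttls])
```

With ``cache-ttl`` at 60, an answer whose records carry TTLs of 3600 and 300
is cached for 60 seconds, while an answer containing a record with a TTL of
10 is cached for only 10 seconds.
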
Some PowerDNS operators set cache-ttl to many hours or even days, and
use :ref:`pdns_control purge <running-pdnscontrol>` to
selectively or globally notify PowerDNS of changes made in the backend.
Also look at the :ref:`query-cache` described in this
chapter. It may materially improve your performance.

To determine if PowerDNS is unable to keep up with packets, check
the value of the :ref:`stat-qsize-q` variable. This represents the number of
packets waiting for database attention. During normal operations the
queue should be small.

Logging truly kills performance, as answering a question from the cache
is an order of magnitude less work than logging a line about it. Busy
sites will prefer to turn :ref:`setting-log-dns-details` off.

.. _packet-cache:

Packet Cache
------------

PowerDNS by default uses the 'Packet Cache' to recognise identical
questions and supply them with identical answers, without any further
processing. The default time to live is 20 seconds and can be changed by
setting ``cache-ttl``. It has been observed that the utility of the
packet cache increases with the load on your nameserver.

Not all backends may benefit from the packet cache. If your backend is
memory based and does not lead to context switches, the packet cache may
actually hurt performance.

.. versionchanged:: 4.1.0
  The maximum size of the packet cache is controlled by the
  :ref:`setting-max-packet-cache-entries` setting. Before that both the
  query cache and the packet cache used the :ref:`setting-max-cache-entries` setting.

.. _query-cache:

Query Cache
-----------

Besides entire packets, PowerDNS can also cache individual backend
queries. Each DNS query leads to a number of backend queries; the most
obvious additional backend query is the check for a possible CNAME. So,
when a query comes in for the 'A' record for 'www.powerdns.com',
PowerDNS must first check for a CNAME for 'www.powerdns.com'.

The Query Cache caches these backend queries, many of which are quite
repetitive. The maximum number of entries in the cache is controlled by
the ``max-cache-entries`` setting. Before 4.1 this setting also controlled
the maximum number of entries in the packet cache.

Most gain is made from caching negative entries, i.e. queries that have
no answer. As these take little memory to store and are typically not a
real problem in terms of speed-of-propagation, the default TTL for
negative queries is a rather high 60 seconds.

This is only a problem when first doing a query for a record, adding it,
and immediately doing a query for that record again. It may then take up
to 60 seconds to appear. Changes to existing records, however, do not fall
under the negative query TTL
(:ref:`setting-negquery-cache-ttl`), but under
the generic :ref:`setting-query-cache-ttl`, which
defaults to 20 seconds.

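The effect of the two TTLs can be illustrated with a toy model. This is a
sketch for reasoning about propagation delays only: the class, its method
names and the dictionary standing in for the backend are all hypothetical,
and the real Query Cache is implemented differently.

```python
class QueryCacheModel:
    """Toy cache with separate TTLs for positive and negative answers."""

    def __init__(self, backend, query_cache_ttl=20, negquery_cache_ttl=60):
        self.backend = backend  # dict standing in for the database
        self.query_cache_ttl = query_cache_ttl
        self.negquery_cache_ttl = negquery_cache_ttl
        self._entries = {}  # name -> (value, expires_at)

    def lookup(self, name, now):
        """Return the cached value if still fresh, else ask the backend."""
        cached = self._entries.get(name)
        if cached is not None and now < cached[1]:
            return cached[0]
        value = self.backend.get(name)  # None models "no such record"
        ttl = self.negquery_cache_ttl if value is None else self.query_cache_ttl
        self._entries[name] = (value, now + ttl)
        return value
```

If a name is queried at time 0 before it exists and is added to the backend
right afterwards, lookups keep returning ``None`` until time 60. Once the
record is cached, a later change to it becomes visible within 20 seconds.
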
The default values should work fine for many sites. When tuning, keep in
mind that the Query Cache mostly saves database access but that the
Packet Cache also saves a lot of CPU because no internal processing is
done when answering a question from the Packet Cache.

Performance Monitoring
----------------------

A number of counters and variables are set during PowerDNS Authoritative
Server operation.

.. _counters:
.. _metricnames:

Counters
~~~~~~~~

All counters that show the "number of X" count since the last startup of the daemon.

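Because these counters accumulate since startup, a rate has to be derived
from two samples taken some time apart. A minimal sketch follows; the
function name is illustrative, and the handling of a restart between samples
is an approximation chosen for this example.

```python
def per_second_rate(previous, current, interval_seconds):
    """Average events per second between two samples of a cumulative counter."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    if current < previous:
        # The daemon restarted between samples, so the counter began again
        # at zero; approximate by using the new value on its own.
        return current / interval_seconds
    return (current - previous) / interval_seconds
```

For example, a counter that reads 1000 and then 1600 a minute later
corresponds to an average of 10 events per second.
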
.. _stat-corrupt-packets:

corrupt-packets
^^^^^^^^^^^^^^^
Number of corrupt packets received

.. _stat-deferred-cache-inserts:

deferred-cache-inserts
^^^^^^^^^^^^^^^^^^^^^^
Number of cache inserts that were deferred because of maintenance

.. _stat-deferred-cache-lookup:

deferred-cache-lookup
^^^^^^^^^^^^^^^^^^^^^
Number of cache lookups that were deferred because of maintenance

.. _stat-deferred-packetcache-inserts:

deferred-packetcache-inserts
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Number of packet cache inserts that were deferred because of maintenance

.. _stat-deferred-packetcache-lookup:

deferred-packetcache-lookup
^^^^^^^^^^^^^^^^^^^^^^^^^^^
Number of packet cache lookups that were deferred because of maintenance

.. _stat-dnsupdate-answers:

dnsupdate-answers
^^^^^^^^^^^^^^^^^
Number of DNS update packets successfully answered

.. _stat-dnsupdate-changes:

dnsupdate-changes
^^^^^^^^^^^^^^^^^
Total number of changes to records from DNS update

.. _stat-dnsupdate-queries:

dnsupdate-queries
^^^^^^^^^^^^^^^^^
Number of DNS update packets received

.. _stat-dnsupdate-refused:

dnsupdate-refused
^^^^^^^^^^^^^^^^^
Number of DNS update packets that were refused

.. _stat-incoming-notifications:

incoming-notifications
^^^^^^^^^^^^^^^^^^^^^^
Number of NOTIFY packets that were received

.. _stat-key-cache-size:

key-cache-size
^^^^^^^^^^^^^^
Number of entries in the key cache

.. _stat-latency:

latency
^^^^^^^
Average number of microseconds a packet spends within PowerDNS

.. _stat-meta-cache-size:

meta-cache-size
^^^^^^^^^^^^^^^
Number of entries in the metadata cache

.. _stat-open-tcp-connections:

open-tcp-connections
^^^^^^^^^^^^^^^^^^^^
Number of currently open TCP connections

.. _stat-overload-drops:

overload-drops
^^^^^^^^^^^^^^
Number of questions dropped because backends were overloaded

.. _stat-packetcache-hit:

packetcache-hit
^^^^^^^^^^^^^^^
Number of packets which were answered out of the cache

.. _stat-packetcache-miss:

packetcache-miss
^^^^^^^^^^^^^^^^
Number of times a packet could not be answered out of the cache

.. _stat-packetcache-size:

packetcache-size
^^^^^^^^^^^^^^^^
Number of packets in the packet cache

.. _stat-qsize-q:

qsize-q
^^^^^^^
Number of packets waiting for database attention

.. _stat-query-cache-hit:

query-cache-hit
^^^^^^^^^^^^^^^
Number of hits on the :ref:`query-cache`

.. _stat-query-cache-miss:

query-cache-miss
^^^^^^^^^^^^^^^^
Number of misses on the :ref:`query-cache`

.. _stat-query-cache-size:

query-cache-size
^^^^^^^^^^^^^^^^
Number of entries in the query cache

.. _stat-rd-queries:

rd-queries
^^^^^^^^^^
Number of packets sent by clients requesting recursion (regardless of whether we'll be providing them with recursion).

.. _stat-recursing-answers:

recursing-answers
^^^^^^^^^^^^^^^^^
Number of packets we supplied an answer to after recursive processing

.. _stat-recursing-questions:

recursing-questions
^^^^^^^^^^^^^^^^^^^
Number of packets we performed recursive processing for.

.. _stat-recursion-unanswered:

recursion-unanswered
^^^^^^^^^^^^^^^^^^^^
Number of packets we sent to our recursor, but did not get a timely answer for.

.. _stat-security-status:

security-status
^^^^^^^^^^^^^^^
Security status based on :ref:`securitypolling`.

.. _stat-servfail-packets:

servfail-packets
^^^^^^^^^^^^^^^^
Number of packets that could not be answered due to database problems

.. _stat-signature-cache-size:

signature-cache-size
^^^^^^^^^^^^^^^^^^^^
Number of entries in the signature cache

.. _stat-signatures:

signatures
^^^^^^^^^^
Number of DNSSEC signatures created

.. _stat-sys-msec:

sys-msec
^^^^^^^^
Number of CPU milliseconds spent in system time

.. _stat-tcp-answers-bytes:

tcp-answers-bytes
^^^^^^^^^^^^^^^^^
Total number of answer bytes sent over TCP

.. _stat-tcp-answers:

tcp-answers
^^^^^^^^^^^
Number of answers sent out over TCP

.. _stat-tcp-queries:

tcp-queries
^^^^^^^^^^^
Number of questions received over TCP

.. _stat-tcp4-answers-bytes:

tcp4-answers-bytes
^^^^^^^^^^^^^^^^^^
Total number of answer bytes sent over TCPv4

.. _stat-tcp4-answers:

tcp4-answers
^^^^^^^^^^^^
Number of answers sent out over TCPv4

.. _stat-tcp4-queries:

tcp4-queries
^^^^^^^^^^^^
Number of questions received over TCPv4

.. _stat-tcp6-answers-bytes:

tcp6-answers-bytes
^^^^^^^^^^^^^^^^^^
Total number of answer bytes sent over TCPv6

.. _stat-tcp6-answers:

tcp6-answers
^^^^^^^^^^^^
Number of answers sent out over TCPv6

.. _stat-tcp6-queries:

tcp6-queries
^^^^^^^^^^^^
Number of questions received over TCPv6

.. _stat-timedout-packets:

timedout-packets
^^^^^^^^^^^^^^^^
Number of packets that were dropped because they had to wait too long internally

.. _stat-udp-answers-bytes:

udp-answers-bytes
^^^^^^^^^^^^^^^^^
Total number of answer bytes sent over UDP

.. _stat-udp-answers:

udp-answers
^^^^^^^^^^^
Number of answers sent out over UDP

.. _stat-udp-do-queries:

udp-do-queries
^^^^^^^^^^^^^^
Number of queries received with the DO (DNSSEC OK) bit set

.. _stat-udp-in-errors:

udp-in-errors
^^^^^^^^^^^^^
Number of packets received faster than the OS could process them

.. _stat-udp-noport-errors:

udp-noport-errors
^^^^^^^^^^^^^^^^^
Number of UDP packets where an ICMP response was received that the remote port was not listening

.. _stat-udp-queries:

udp-queries
^^^^^^^^^^^
Number of questions received over UDP

.. _stat-udp-recvbuf-errors:

udp-recvbuf-errors
^^^^^^^^^^^^^^^^^^
Number of errors caused in the UDP receive buffer

.. _stat-udp-sndbuf-errors:

udp-sndbuf-errors
^^^^^^^^^^^^^^^^^
Number of errors caused in the UDP send buffer

.. _stat-udp4-answers-bytes:

udp4-answers-bytes
^^^^^^^^^^^^^^^^^^
Total number of answer bytes sent over UDPv4

.. _stat-udp4-answers:

udp4-answers
^^^^^^^^^^^^
Number of answers sent out over UDPv4

.. _stat-udp4-queries:

udp4-queries
^^^^^^^^^^^^
Number of questions received over UDPv4

.. _stat-udp6-answers-bytes:

udp6-answers-bytes
^^^^^^^^^^^^^^^^^^
Total number of answer bytes sent over UDPv6

.. _stat-udp6-answers:

udp6-answers
^^^^^^^^^^^^
Number of answers sent out over UDPv6

.. _stat-udp6-queries:

udp6-queries
^^^^^^^^^^^^
Number of questions received over UDPv6

.. _stat-uptime:

uptime
^^^^^^
Uptime in seconds of the daemon

.. _stat-user-msec:

user-msec
^^^^^^^^^
Number of milliseconds spent in CPU 'user' time

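Several of the counters above are most useful in combination, for instance
the packet cache hit ratio. The sketch below assumes the statistics have
already been obtained as a single comma-separated string of ``name=value``
pairs; that input format is an assumption made for this example, so adapt
the parser to whatever your collection method actually returns. The function
names are illustrative.

```python
def parse_statistics(text):
    """Parse 'name=value,name=value,' into a dict of integer counters.

    Items that are empty or whose value is not an integer are skipped.
    """
    stats = {}
    for item in text.strip().split(","):
        name, sep, value = item.partition("=")
        value = value.strip()
        if sep and value.lstrip("-").isdigit():
            stats[name.strip()] = int(value)
    return stats


def packet_cache_hit_ratio(stats):
    """Fraction of packets answered from the packet cache, or None if idle."""
    hits = stats.get("packetcache-hit", 0)
    misses = stats.get("packetcache-miss", 0)
    total = hits + misses
    return hits / total if total else None
```

A ratio that stays low under load may indicate that ``cache-ttl`` is too
short for the traffic pattern, or that queries are too diverse to benefit
from the packet cache.
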
Ring buffers
~~~~~~~~~~~~

Besides counters, PowerDNS also maintains ringbuffers. A ringbuffer
records events; each new event gets a place in the buffer until it is
full. When full, earlier entries get overwritten, hence the name 'ring'.

By counting the entries in the buffer, statistics can be generated.
These statistics can currently only be viewed using the webserver and
are in fact not even collected without the webserver running.

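The idea of a ringbuffer and of counting its entries can be sketched in a few
lines. This illustrates the concept only and is not how PowerDNS implements
it; the class and method names are hypothetical.

```python
from collections import Counter, deque


class RingBuffer:
    """Fixed-size event buffer; the oldest entries are overwritten when full."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest item on overflow.
        self._events = deque(maxlen=capacity)

    def record(self, event):
        self._events.append(event)

    def __len__(self):
        return len(self._events)

    def top(self, n=10):
        """Most frequent events currently in the buffer, as (event, count)."""
        return Counter(self._events).most_common(n)
```

Recording entries in the ``name/type`` format, such as
``pdns.powerdns.com/AAAA``, and calling ``top()`` yields the kind of "most
frequent queries" listing that such a buffer makes possible.
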
The following ringbuffers are available:

- **logmessages**: All messages logged
- **noerror-queries**: Queries for existing records but for a type we
  don't have. Queries for, say, the AAAA record of a domain, when only
  an A is available. Queries are listed in the following format:
  name/type. So an AAAA query for pdns.powerdns.com looks like
  pdns.powerdns.com/AAAA.
- **nxdomain-queries**: Queries for non-existing records within
  existing domains. If PowerDNS knows it is authoritative over a
  domain, and it sees a question for a record in that domain that does
  not exist, it is able to send out an authoritative 'no such domain'
  message. Indicates that hosts are trying to connect to services
  really not in your zone.
- **udp-queries**: All UDP queries seen.
- **remotes**: Remote server IP addresses. Number of hosts querying
  PowerDNS. Be aware that UDP is anonymous - person A can send queries
  that appear to be coming from person B.
- **remote-corrupts**: Remotes sending corrupt packets. Hosts sending
  PowerDNS broken packets, possibly meant to disrupt service. Be aware
  that UDP is anonymous - person A can send queries that appear to be
  coming from person B.
- **remote-unauth**: Remotes querying domains for which we are not
  authoritative. It may happen that there are misconfigured hosts on
  the internet which are configured to think that a PowerDNS
  installation is in fact a resolving nameserver. These hosts will not
  get useful answers from PowerDNS. This buffer lists hosts sending
  queries for domains which PowerDNS does not know about.
- **servfail-queries**: Queries that could not be answered due to
  backend errors. For one reason or another, a backend may be unable to
  extract answers for a certain domain from its storage. This may be
  due to a corrupt database or to inconsistent data. When this happens,
  PowerDNS sends out a 'servfail' packet indicating that it was unable
  to answer the question. This buffer shows which queries have been
  causing servfails.
- **unauth-queries**: Queries for domains that we are not authoritative
  for. If a domain is delegated to a PowerDNS instance, but the backend
  is not made aware of this fact, questions come in for which no answer
  is available, nor is the authority. Use this ringbuffer to spot such
  queries.

.. _metricscarbon:

Sending metrics to Graphite/Metronome over Carbon
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
For carbon/graphite/metronome, we use the following namespace.
Everything starts with 'pdns.', which is then followed by the local hostname.
Thirdly, we add 'auth' to signify the daemon generating the metrics.
This is then rounded off with the actual name of the metric. As an example: 'pdns.ns1.auth.questions'.

Care has been taken to make the sending of statistics as unobtrusive as possible: the daemons will not be hindered by an unreachable carbon server, timeouts or connection refused situations.

To benefit from our carbon/graphite support, either install Graphite, or use our own lightweight statistics daemon, Metronome, currently available on `GitHub <https://github.com/ahupowerdns/metronome/>`_.

To enable sending metrics, set :ref:`setting-carbon-server`, possibly :ref:`setting-carbon-interval` and possibly :ref:`setting-carbon-ourname` in the configuration.

.. warning::

  If your hostname includes dots, they will be replaced by underscores so as not to confuse the namespace.

  If you include dots in :ref:`setting-carbon-ourname`, they will **not** be replaced by underscores,
  as PowerDNS assumes you know what you are doing if you override your hostname.
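
The naming rules in this section can be summarised in a short sketch. The
helper names are hypothetical and this is not the code PowerDNS uses; the
line format shown is Carbon's plaintext protocol (``path value timestamp``),
and the sketch assumes the fixed 'pdns' prefix and 'auth' component
described above.

```python
import time


def carbon_path(metric, hostname, ourname=None):
    """Build 'pdns.<node>.auth.<metric>' following the rules described above."""
    if ourname is not None:
        # An explicit carbon-ourname is used verbatim, dots included.
        node = ourname
    else:
        # Dots in the local hostname become underscores.
        node = hostname.replace(".", "_")
    return f"pdns.{node}.auth.{metric}"


def carbon_line(path, value, timestamp=None):
    """Format one metric in Carbon's plaintext protocol."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return f"{path} {value} {ts}\n"
```

For a host named ``ns1.example.org`` the ``latency`` metric ends up as
``pdns.ns1_example_org.auth.latency``, whereas an override of ``dns.ams``
produces ``pdns.dns.ams.auth.latency`` and therefore an extra level in the
Graphite tree.
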