1 Performance and Tuning
2 ======================
3
4 In general, best performance is achieved on recent Linux kernels with
5 the bindbackend, or if something more database-like is preferred,
6 the LMDB backend. Meanwhile many of the largest PowerDNS installations are
7 based on PostgreSQL or MySQL.
8
9 Database servers can require configuration to achieve decent
10 performance. It is especially worth noting that several vendors ship
11 PostgreSQL with a slow default configuration.
12
13 .. warning::
14 When deploying (large scale) IPv6, please be aware some
15 Linux distributions leave IPv6 routing cache tables at very small
16 default values. Please check and if necessary raise
17 ``sysctl net.ipv6.route.max_size``.
18
19 Performance related settings
20 ----------------------------
21
22 When PowerDNS starts up it creates a number of threads to listen for
23 packets. This is configurable with the
24 :ref:`setting-receiver-threads` setting which
25 defines how many sockets will be opened by the PowerDNS process. In
26 versions of Linux before kernel 3.9, having too many receiver threads set
27 up resulted in decreased performance due to socket contention between
28 multiple CPUs; the typical sweet spot was 3 or 4. For optimal
29 performance on kernel 3.9 and later with
30 :ref:`setting-reuseport` enabled, you'll typically want
31 a receiver thread for each core on your box, provided backend
32 latency/performance is not an issue.
33
34 Different backends will have different characteristics - some will want
35 to have more parallel instances than others. In general, if your backend
36 is latency bound, like most relational databases are, it pays to open
37 more backends.
38
39 This is done with the
40 :ref:`setting-distributor-threads` setting
41 which says how many distributors will be opened for each receiver
42 thread. Of special importance is the choice between one distributor thread and several.
43 In case of only 1 thread, PowerDNS reverts to unthreaded operation which
44 may be a lot faster, depending on your operating system and
45 architecture.
46
47 Another very important setting is
48 :ref:`setting-cache-ttl`. PowerDNS caches entire
49 packets it sends out so as to save the time to query backends to
50 assemble all data. The default setting of 20 seconds may be low for high
51 traffic sites; a value of 60 seconds rarely leads to problems. Please be
52 aware that if any TTL in the answer is shorter than this setting, the
53 packet cache will respect the answer's shortest TTL.
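
The interaction between ``cache-ttl`` and the TTLs in an answer, as described above, can be sketched in a few lines of Python. This is only an illustration of the rule, not PowerDNS code; the function name and its parameters are made up for the example.

```python
def effective_cache_ttl(cache_ttl, answer_ttls):
    """Return how long (in seconds) an answer packet may stay cached.

    cache_ttl   -- the configured cache-ttl value
    answer_ttls -- TTLs of the records in the answer (hypothetical input)
    """
    if not answer_ttls:
        # Nothing in the answer to shorten the lifetime.
        return cache_ttl
    # The shortest TTL in the answer caps the configured value.
    return min(cache_ttl, min(answer_ttls))


print(effective_cache_ttl(60, [3600, 300]))  # 60
print(effective_cache_ttl(60, [3600, 5]))    # 5
```

With ``cache-ttl`` set to 60, an answer whose shortest record TTL is 5 seconds is only kept for 5 seconds, so raising ``cache-ttl`` has no effect on such records.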
54
55 Some PowerDNS operators set cache-ttl to many hours or even days, and
56 use :ref:`pdns_control purge <running-pdnscontrol>` to
57 selectively or globally notify PowerDNS of changes made in the backend.
58 Also look at the :ref:`query-cache` described in this
59 chapter. It may materially improve your performance.
60
61 To determine if PowerDNS is unable to keep up with packets, check
62 the value of the :ref:`stat-qsize-q` variable. This represents the number of
63 packets waiting for database attention. During normal operations the
64 queue should be small.
65
66 The value of :ref:`setting-queue-limit` should be set to only keep queries in
67 queue for as long as someone would be interested in knowing the answer. Many
68 resolvers will query other name servers for the zone quite aggressively.
69
70 Logging truly kills performance as answering a question from the cache
71 is an order of magnitude less work than logging a line about it. Busy
72 sites will prefer to turn :ref:`setting-log-dns-details` off.
73
74 .. _packet-cache:
75
76 Packet Cache
77 ------------
78
79 PowerDNS by default uses the 'Packet Cache' to recognise identical
80 questions and supply them with identical answers, without any further
81 processing. The default time to live is 20 seconds and can be changed by
82 setting ``cache-ttl``. It has been observed that the utility of the
83 packet cache increases with the load on your nameserver.
84
85 Not all backends may benefit from the packet cache. If your backend is
86 memory based and does not lead to context switches, the packet cache may
87 actually hurt performance.
88
89 .. _query-cache:
90
91 Query Cache
92 -----------
93
94 Besides entire packets, PowerDNS can also cache individual backend
95 queries. Each DNS query leads to a number of backend queries; the most
96 obvious additional backend query is the check for a possible CNAME. So,
97 when a query comes in for the 'A' record for 'www.powerdns.com',
98 PowerDNS must first check for a CNAME for 'www.powerdns.com'.
99
100 The Query Cache caches these backend queries, many of which are quite
101 repetitive. The maximum number of entries in the cache is controlled by
102 the ``max-cache-entries`` setting. Before 4.1 this setting also controlled
103 the maximum number of entries in the packet cache.
104
105 Most gain is made from caching negative entries, i.e., queries that have
106 no answer. As these take little memory to store and are typically not a
107 real problem in terms of speed-of-propagation, the default TTL for
108 negative queries is a rather high 60 seconds.
109
110 This is only a problem when first doing a query for a record, adding it,
111 and immediately doing a query for that record again. It may then take up
112 to 60 seconds to appear. Changes to existing records however do not fall
113 under the negative query TTL
114 (:ref:`setting-negquery-cache-ttl`), but under
115 the generic :ref:`setting-query-cache-ttl` which
116 defaults to 20 seconds.
117
118 The default values should work fine for many sites. When tuning, keep in
119 mind that the Query Cache mostly saves database access but that the
120 Packet Cache also saves a lot of CPU because no internal processing is
121 done when answering a question from the Packet Cache.
122
123 Caches & Memory Allocations & glibc
124 -----------------------------------
125
126 Managing the two caches described above involves a lot of memory management, which is handled by ``malloc`` in your libc.
127 To avoid contention between threads, the allocator in glibc separates memory into separate arenas, sometimes even hundreds of them.
128 This avoids locking, but it may cause massive memory fragmentation, which could make PowerDNS take `an order of magnitude more memory <https://sourceware.org/bugzilla/show_bug.cgi?id=11261>`_ in some situations.
129
130 If you suspect this is happening on your setup, you can consider lowering ``MALLOC_ARENA_MAX`` to a small number.
131 Several users have reported that ``4`` works well for them.
132 Via ``systemctl edit pdns`` you can put ``Environment=MALLOC_ARENA_MAX=4`` in your pdns unit file to enable this tweak.
133
134 Note that `newer glibc versions replace MALLOC_ARENA_MAX with a different setting syntax <https://www.gnu.org/software/libc/manual/html_node/Tunables.html#Tunables>`__.
135 The new syntax is ``GLIBC_TUNABLES=glibc.malloc.arena_max=4``. Please check which syntax is valid for your glibc version (it is quite likely that both syntaxes will work).
136
137 Performance Monitoring
138 ----------------------
139
140 A number of counters and variables are set during PowerDNS Authoritative
141 Server operation.
142
143 .. _counters:
144 .. _metricnames:
145
146 Counters
147 ~~~~~~~~
148
149 All counters that show the "number of X" count since the last startup of the daemon.
150
151 .. _stat-corrupt-packets:
152
153 corrupt-packets
154 ^^^^^^^^^^^^^^^
155 Number of corrupt packets received
156
157 .. _stat-deferred-cache-inserts:
158
159 deferred-cache-inserts
160 ^^^^^^^^^^^^^^^^^^^^^^
161 Number of cache inserts that were deferred because of maintenance
162
163 .. _stat-deferred-cache-lookup:
164
165 deferred-cache-lookup
166 ^^^^^^^^^^^^^^^^^^^^^
167 Number of cache lookups that were deferred because of maintenance
168
169 .. _stat-deferred-packetcache-inserts:
170
171 deferred-packetcache-inserts
172 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
173 Number of packet cache inserts that were deferred because of maintenance
174
175 .. _stat-deferred-packetcache-lookup:
176
177 deferred-packetcache-lookup
178 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
179 Number of packet cache lookups that were deferred because of maintenance
180
181 .. _stat-dnsupdate-answers:
182
183 dnsupdate-answers
184 ^^^^^^^^^^^^^^^^^
185 Number of DNS update packets successfully answered
186
187 .. _stat-dnsupdate-changes:
188
189 dnsupdate-changes
190 ^^^^^^^^^^^^^^^^^
191 Total number of changes to records from DNS update
192
193 .. _stat-dnsupdate-queries:
194
195 dnsupdate-queries
196 ^^^^^^^^^^^^^^^^^
197 Number of DNS update packets received
198
199 .. _stat-dnsupdate-refused:
200
201 dnsupdate-refused
202 ^^^^^^^^^^^^^^^^^
203 Number of DNS update packets that were refused
204
205 .. _stat-incoming-notifications:
206
207 incoming-notifications
208 ^^^^^^^^^^^^^^^^^^^^^^
209 Number of NOTIFY packets that were received
210
211 .. _stat-key-cache-size:
212
213 key-cache-size
214 ^^^^^^^^^^^^^^
215 Number of entries in the key cache
216
217 .. _stat-latency:
218
219 latency
220 ^^^^^^^
221 Average number of microseconds a packet spends within PowerDNS
222
223 .. _stat-meta-cache-size:
224
225 meta-cache-size
226 ^^^^^^^^^^^^^^^
227 Number of entries in the metadata cache
228
229 .. _stat-open-tcp-connections:
230
231 open-tcp-connections
232 ^^^^^^^^^^^^^^^^^^^^
233 Number of currently open TCP connections
234
235 .. _stat-overload-drops:
236
237 overload-drops
238 ^^^^^^^^^^^^^^
239 Number of questions dropped because backends overloaded (backends are overloaded if they have more outstanding queries than the value of :ref:`setting-overload-queue-length`)
240
241 .. _stat-packetcache-hit:
242
243 packetcache-hit
244 ^^^^^^^^^^^^^^^
245 Number of packets which were answered out of the cache
246
247 .. _stat-packetcache-miss:
248
249 packetcache-miss
250 ^^^^^^^^^^^^^^^^
251 Number of times a packet could not be answered out of the cache
252
253 .. _stat-packetcache-size:
254
255 packetcache-size
256 ^^^^^^^^^^^^^^^^
257 Number of packets in the packet cache
258
259 .. _stat-qsize-q:
260
261 qsize-q
262 ^^^^^^^
263 Number of packets waiting for database attention, only available if :ref:`setting-receiver-threads` > 1
264
265 .. _stat-query-cache-hit:
266
267 query-cache-hit
268 ^^^^^^^^^^^^^^^
269 Number of hits on the :ref:`query-cache`
270
271 .. _stat-query-cache-miss:
272
273 query-cache-miss
274 ^^^^^^^^^^^^^^^^
275 Number of misses on the :ref:`query-cache`
276
277 .. _stat-query-cache-size:
278
279 query-cache-size
280 ^^^^^^^^^^^^^^^^
281 Number of entries in the query cache
282
283 .. _stat-rd-queries:
284
285 rd-queries
286 ^^^^^^^^^^
287 Number of packets sent by clients requesting recursion (regardless of whether we'll be providing them with recursion).
288
289 .. _stat-receive-latency:
290
291 receive-latency
292 ^^^^^^^^^^^^^^^
293 Average number of microseconds needed to receive a query
294
295 .. _stat-recursing-answers:
296
297 recursing-answers
298 ^^^^^^^^^^^^^^^^^
299 Number of packets we supplied an answer to after recursive processing
300
301 .. _stat-recursing-questions:
302
303 recursing-questions
304 ^^^^^^^^^^^^^^^^^^^
305 Number of packets we performed recursive processing for.
306
307 .. _stat-recursion-unanswered:
308
309 recursion-unanswered
310 ^^^^^^^^^^^^^^^^^^^^
311 Number of packets we sent to our recursor, but did not get a timely answer for.
312
313 .. _stat-security-status:
314
315 security-status
316 ^^^^^^^^^^^^^^^
317 Security status based on :ref:`securitypolling`.
318
319 .. _stat-servfail-packets:
320
321 servfail-packets
322 ^^^^^^^^^^^^^^^^
323 Number of packets that could not be answered due to database problems
324
325 .. _stat-signature-cache-size:
326
327 signature-cache-size
328 ^^^^^^^^^^^^^^^^^^^^
329 Number of entries in the signature cache
330
331 .. _stat-signatures:
332
333 signatures
334 ^^^^^^^^^^
335 Number of DNSSEC signatures created
336
337 .. _stat-sys-msec:
338
339 sys-msec
340 ^^^^^^^^
341 Number of CPU milliseconds spent in system time
342
343 .. _stat-tcp-answers-bytes:
344
345 tcp-answers-bytes
346 ^^^^^^^^^^^^^^^^^
347 Total number of answer bytes sent over TCP
348
349 .. _stat-tcp-answers:
350
351 tcp-answers
352 ^^^^^^^^^^^
353 Number of answers sent out over TCP
354
355 .. _stat-tcp-queries:
356
357 tcp-queries
358 ^^^^^^^^^^^
359 Number of questions received over TCP
360
361 .. _stat-tcp4-answers-bytes:
362
363 tcp4-answers-bytes
364 ^^^^^^^^^^^^^^^^^^
365 Total number of answer bytes sent over TCPv4
366
367 .. _stat-tcp4-answers:
368
369 tcp4-answers
370 ^^^^^^^^^^^^^^^^
371 Number of answers sent out over TCPv4
372
373 .. _stat-tcp4-queries:
374
375 tcp4-queries
376 ^^^^^^^^^^^^
377 Number of questions received over TCPv4
378
379 .. _stat-tcp6-answers-bytes:
380
381 tcp6-answers-bytes
382 ^^^^^^^^^^^^^^^^^^
383 Total number of answer bytes sent over TCPv6
384
385 .. _stat-tcp6-answers:
386
387 tcp6-answers
388 ^^^^^^^^^^^^
389 Number of answers sent out over TCPv6
390
391 .. _stat-tcp6-queries:
392
393 tcp6-queries
394 ^^^^^^^^^^^^
395 Number of questions received over TCPv6
396
397 .. _stat-timedout-packets:
398
399 timedout-packets
400 ^^^^^^^^^^^^^^^^
401 Number of packets that were dropped because they had to wait too long internally
402
403 .. _stat-send-latency:
404
405 send-latency
406 ^^^^^^^^^^^^
407 Average number of microseconds needed to send the answer
408
409 .. _stat-udp-answers-bytes:
410
411 udp-answers-bytes
412 ^^^^^^^^^^^^^^^^^
413 Total number of answer bytes sent over UDP
414
415 .. _stat-udp-answers:
416
417 udp-answers
418 ^^^^^^^^^^^
419 Number of answers sent out over UDP
420
421 .. _stat-udp-do-queries:
422
423 udp-do-queries
424 ^^^^^^^^^^^^^^
425 Number of queries received with the DO (DNSSEC OK) bit set
426
427 .. _stat-udp-in-csum-errors:
428
429 udp-in-csum-errors
430 ^^^^^^^^^^^^^^^^^^
431 Number of UDP packets received with an invalid checksum
432
433 .. _stat-udp-in-errors:
434
435 udp-in-errors
436 ^^^^^^^^^^^^^
437 Number of packets received faster than the OS could process them
438
439 .. _stat-udp-noport-errors:
440
441 udp-noport-errors
442 ^^^^^^^^^^^^^^^^^
443 Number of UDP packets where an ICMP response was received that the remote port was not listening
444
445 .. _stat-udp-queries:
446
447 udp-queries
448 ^^^^^^^^^^^
449 Number of questions received over UDP
450
451 .. _stat-udp-recvbuf-errors:
452
453 udp-recvbuf-errors
454 ^^^^^^^^^^^^^^^^^^
455 Number of errors caused in the UDP receive buffer
456
457 .. _stat-udp-sndbuf-errors:
458
459 udp-sndbuf-errors
460 ^^^^^^^^^^^^^^^^^
461 Number of errors caused in the UDP send buffer
462
463 .. _stat-udp4-answers-bytes:
464
465 udp4-answers-bytes
466 ^^^^^^^^^^^^^^^^^^
467 Total number of answer bytes sent over UDPv4
468
469 .. _stat-udp4-answers:
470
471 udp4-answers
472 ^^^^^^^^^^^^
473 Number of answers sent out over UDPv4
474
475 .. _stat-udp4-queries:
476
477 udp4-queries
478 ^^^^^^^^^^^^
479 Number of questions received over UDPv4
480
481 .. _stat-udp6-answers-bytes:
482
483 udp6-answers-bytes
484 ^^^^^^^^^^^^^^^^^^
485 Total number of answer bytes sent over UDPv6
486
487 .. _stat-udp6-answers:
488
489 udp6-answers
490 ^^^^^^^^^^^^
491 Number of answers sent out over UDPv6
492
493 .. _stat-udp6-in-csum-errors:
494
495 udp6-in-csum-errors
496 ^^^^^^^^^^^^^^^^^^^
497 Number of IPv6 UDP packets received with an invalid checksum
498
499 .. _stat-udp6-in-errors:
500
501 udp6-in-errors
502 ^^^^^^^^^^^^^^
503 Number of IPv6 UDP packets received faster than the OS could process them
504
505 .. _stat-udp6-noport-errors:
506
507 udp6-noport-errors
508 ^^^^^^^^^^^^^^^^^^
509 Number of IPv6 UDP packets where an ICMP response was received that the remote port was not listening
510
511 .. _stat-udp6-queries:
512
513 udp6-queries
514 ^^^^^^^^^^^^
515 Number of questions received over UDPv6
516
517 .. _stat-udp6-recvbuf-errors:
518
519 udp6-recvbuf-errors
520 ^^^^^^^^^^^^^^^^^^^
521 Number of errors caused in the IPv6 UDP receive buffer
522
523 .. _stat-udp6-sndbuf-errors:
524
525 udp6-sndbuf-errors
526 ^^^^^^^^^^^^^^^^^^
527 Number of errors caused in the IPv6 UDP send buffer
528
529 .. _stat-uptime:
530
531 uptime
532 ^^^^^^
533 Uptime in seconds of the daemon
534
535 .. _stat-user-msec:
536
537 user-msec
538 ^^^^^^^^^
539 Number of milliseconds spent in CPU 'user' time
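
As an example of putting these counters to use, the sketch below derives cache hit ratios from a dump of all counters. It assumes the dump is a single comma-separated line of ``name=value`` pairs, which is what ``pdns_control list`` is expected to print; verify this against your version. The sample string and the helper names ``parse_counters`` and ``hit_ratio`` are made up for the illustration.

```python
def parse_counters(raw):
    """Parse 'name=value,name=value,' text into a dict of integers."""
    counters = {}
    for item in raw.strip().split(","):
        name, sep, value = item.partition("=")
        # Skip empty trailing items and anything that is not an integer.
        if sep and value.strip().lstrip("-").isdigit():
            counters[name.strip()] = int(value)
    return counters


def hit_ratio(counters, hit_key, miss_key):
    """Return hits / (hits + misses), or None when there were no lookups."""
    hits = counters.get(hit_key, 0)
    misses = counters.get(miss_key, 0)
    total = hits + misses
    return hits / total if total else None


# Made-up sample in the assumed output format.
sample = ("packetcache-hit=900,packetcache-miss=100,"
          "query-cache-hit=30,query-cache-miss=10,qsize-q=0,")
stats = parse_counters(sample)
print(hit_ratio(stats, "packetcache-hit", "packetcache-miss"))  # 0.9
print(hit_ratio(stats, "query-cache-hit", "query-cache-miss"))  # 0.75
```

Because the counters accumulate since daemon startup, a ratio computed this way is a lifetime average; to see recent behaviour, take two samples and compute the ratio over the differences.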
540
541 Ring buffers
542 ~~~~~~~~~~~~
543
544 Besides counters, PowerDNS also maintains ringbuffers. A ringbuffer
545 records events; each new event gets a place in the buffer until it is
546 full. When full, earlier entries get overwritten, hence the name 'ring'.
547
548 By counting the entries in the buffer, statistics can be generated.
549 These statistics can currently only be viewed using the webserver and
550 are in fact not even collected without the webserver running.
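
The idea of a fixed-size buffer whose contents are counted to produce statistics can be illustrated with a short Python sketch. This is not how PowerDNS implements its ringbuffers; the ``RingBuffer`` class, its capacity and the sample queries are assumptions chosen for the example.

```python
from collections import Counter, deque


class RingBuffer:
    """Keep only the most recent `capacity` events."""

    def __init__(self, capacity):
        # A deque with maxlen silently discards the oldest entry when full.
        self._events = deque(maxlen=capacity)

    def record(self, event):
        self._events.append(event)

    def top(self, n):
        """Return the n most frequent events currently in the buffer."""
        return Counter(self._events).most_common(n)


ring = RingBuffer(capacity=4)
for query in ["a.example/A", "b.example/AAAA", "a.example/A",
              "c.example/A", "a.example/A"]:
    ring.record(query)

# The first event has been overwritten, so only two of the three
# 'a.example/A' events are still counted.
print(ring.top(1))  # [('a.example/A', 2)]
```

The statistics therefore always describe a recent window of events rather than everything seen since startup.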
551
552 The following ringbuffers are available:
553
554 - **logmessages**: All messages logged
555 - **noerror-queries**: Queries for existing records but for a type we
556 don't have. Queries for, say, the AAAA record of a domain, when only
557 an A is available. Queries are listed in the following format:
558 name/type. So an AAAA query for pdns.powerdns.com looks like
559 pdns.powerdns.com/AAAA.
560 - **nxdomain-queries**: Queries for non-existing records within
561 existing domains. If PowerDNS knows it is authoritative over a
562 domain, and it sees a question for a record in that domain that does
563 not exist, it is able to send out an authoritative 'no such domain'
564 message. Indicates that hosts are trying to connect to services
565 really not in your zone.
566 - **udp-queries**: All UDP queries seen.
567 - **remotes**: Remote server IP addresses. Number of hosts querying
568 PowerDNS. Be aware that UDP is anonymous - person A can send queries
569 that appear to be coming from person B.
570 - **remote-corrupts**: Remotes sending corrupt packets. Hosts sending
571 PowerDNS broken packets, possibly meant to disrupt service. Be aware
572 that UDP is anonymous - person A can send queries that appear to be
573 coming from person B.
574 - **remote-unauth**: Remotes querying domains for which we are not
575 authoritative. It may happen that there are misconfigured hosts on
576 the internet which are configured to think that a PowerDNS
577 installation is in fact a resolving nameserver. These hosts will not
578 get useful answers from PowerDNS. This buffer lists hosts sending
579 queries for domains which PowerDNS does not know about.
580 - **servfail-queries**: Queries that could not be answered due to
581 backend errors. For one reason or another, a backend may be unable to
582 extract answers for a certain domain from its storage. This may be
583 due to a corrupt database or to inconsistent data. When this happens,
584 PowerDNS sends out a 'servfail' packet indicating that it was unable
585 to answer the question. This buffer shows which queries have been
586 causing servfails.
587 - **unauth-queries**: Queries for domains that we are not authoritative
588 for. If a domain is delegated to a PowerDNS instance, but the backend
589 is not made aware of this fact, questions come in for which no answer
590 is available, nor is the authority. Use this ringbuffer to spot such
591 queries.
592
593 .. _metricscarbon:
594
595 Sending metrics to Graphite/Metronome over Carbon
596 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
597 For carbon/graphite/metronome, we use the following namespace.
598 Everything starts with 'pdns.', which is then followed by the local hostname.
599 Thirdly, we add 'auth' to signify the daemon generating the metrics.
600 This is then rounded off with the actual name of the metric. As an example: 'pdns.ns1.auth.questions'.
601
602 Care has been taken to make the sending of statistics as unobtrusive as possible: the daemons will not be hindered by an unreachable carbon server, timeouts or connection refused situations.
603
604 To benefit from our carbon/graphite support, either install Graphite, or use our own lightweight statistics daemon, Metronome, currently available on `GitHub <https://github.com/ahupowerdns/metronome/>`_.
605
606 To enable sending metrics, set :ref:`setting-carbon-server`, possibly :ref:`setting-carbon-interval` and possibly :ref:`setting-carbon-ourname` in the configuration.
607
608 .. warning::
609
610 If your hostname includes dots, they will be replaced by underscores so as not to confuse the namespace.
611
612 If you include dots in :ref:`setting-carbon-ourname`, they will **not** be replaced by underscores,
613 as PowerDNS assumes you know what you are doing if you override your hostname.
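
The naming rules described in this section can be summarised in a small Python sketch. It only mirrors the rules as stated in the text and is not taken from the PowerDNS source; the function name ``carbon_metric_name`` and the example hostnames are hypothetical.

```python
def carbon_metric_name(hostname, metric, ourname=None):
    """Build a metric name of the form pdns.<node>.auth.<metric>."""
    if ourname is not None:
        # An explicitly configured name is used as-is, dots included.
        node = ourname
    else:
        # Dots in the local hostname would add namespace levels,
        # so they are replaced by underscores.
        node = hostname.replace(".", "_")
    return f"pdns.{node}.auth.{metric}"


print(carbon_metric_name("ns1", "questions"))
# pdns.ns1.auth.questions
print(carbon_metric_name("ns1.example.com", "questions"))
# pdns.ns1_example_com.auth.questions
print(carbon_metric_name("ns1.example.com", "questions", ourname="dc1.ns1"))
# pdns.dc1.ns1.auth.questions
```

The last call shows why dots in an overridden name matter: they introduce an extra level in the metric namespace, which may or may not be what you want.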