Performance and Tuning
======================

In general, best performance is achieved on recent Linux 4.x kernels and
using MySQL, although many of the largest PowerDNS installations are
based on PostgreSQL. FreeBSD also performs very well.

Database servers can require configuration to achieve decent
performance. It is especially worth noting that several vendors ship
PostgreSQL with a slow default configuration.

.. warning::
  When deploying (large scale) IPv6, please be aware that some
  Linux distributions leave IPv6 routing cache tables at very small
  default values. Please check and if necessary raise
  ``sysctl net.ipv6.route.max_size``.

Performance related settings
----------------------------

When PowerDNS starts up it creates a number of threads to listen for
packets. This is configurable with the
:ref:`setting-receiver-threads` setting which
defines how many sockets will be opened by the PowerDNS process. In
versions of Linux before kernel 3.9 having too many receiver threads set
up resulted in decreased performance due to socket contention between
multiple CPUs - the typical sweet spot was 3 or 4. For optimal
performance on kernel 3.9 and later with
:ref:`setting-reuseport` enabled you'll typically want
a receiver thread for each core on your box if backend
latency/performance is not an issue and you want top performance.

Different backends will have different characteristics - some will want
to have more parallel instances than others. In general, if your backend
is latency bound, like most relational databases are, it pays to open
more backends.

This is done with the
:ref:`setting-distributor-threads` setting
which says how many distributors will be opened for each receiver
thread. Of special importance is the choice between 1 or more backends.
In case of only 1 thread, PowerDNS reverts to unthreaded operation which
may be a lot faster, depending on your operating system and
architecture.

Another very important setting is
:ref:`setting-cache-ttl`. PowerDNS caches entire
packets it sends out so as to save the time to query backends to
assemble all data. The default setting of 20 seconds may be low for high
traffic sites; a value of 60 seconds rarely leads to problems. Please be
aware that if any TTL in the answer is shorter than this setting, the
packet cache will respect the answer's shortest TTL.
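
The TTL rule described above can be sketched in a few lines of Python. This
only illustrates the documented behaviour and is not PowerDNS's actual
implementation; the function name and its arguments are hypothetical.

.. code-block:: python

  def effective_cache_lifetime(cache_ttl, answer_ttls):
      """Return how long (in seconds) a cached packet may be reused.

      cache_ttl   -- value of the cache-ttl setting
      answer_ttls -- TTLs of the records in the answer (may be empty)
      """
      if not answer_ttls:
          return cache_ttl
      # The shortest TTL in the answer caps the packet cache lifetime.
      return min(cache_ttl, min(answer_ttls))

With ``cache-ttl`` set to 60, an answer whose records carry TTLs of 300 and 10
seconds would be served from the packet cache for at most 10 seconds.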

Some PowerDNS operators set ``cache-ttl`` to many hours or even days, and
use :ref:`pdns_control purge <running-pdnscontrol>` to
selectively or globally notify PowerDNS of changes made in the backend.
Also look at the :ref:`query-cache` described in this
chapter. It may materially improve your performance.

To determine if PowerDNS is unable to keep up with packets, check
the value of the :ref:`stat-qsize-q` variable. This represents the number of
packets waiting for database attention. During normal operations the
queue should be small.
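
A monitoring script could watch this value along the following lines. This is
a minimal sketch: it assumes the statistics have already been collected as
comma-separated ``name=value`` text (for example with ``pdns_control``), and
both the helper names and the threshold of 100 are hypothetical choices, not
PowerDNS recommendations.

.. code-block:: python

  def parse_stats(text):
      """Parse 'name=value,name=value,' text into a dict of integers."""
      stats = {}
      for item in text.strip().split(","):
          name, sep, value = item.partition("=")
          if not sep:
              continue  # skip empty fragments such as a trailing comma
          try:
              stats[name.strip()] = int(value)
          except ValueError:
              pass  # ignore non-numeric values
      return stats

  def queue_is_backing_up(stats, threshold=100):
      """True if more packets wait for the backend than we tolerate."""
      return stats.get("qsize-q", 0) > threshold

If your statistics arrive in a different format, only ``parse_stats`` needs
to change; the check itself just compares ``qsize-q`` with a limit you pick
for your own installation.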

Logging truly kills performance as answering a question from the cache
is an order of magnitude less work than logging a line about it. Busy
sites will prefer to turn :ref:`setting-log-dns-details` off.

.. _packet-cache:

Packet Cache
------------

PowerDNS by default uses the 'Packet Cache' to recognise identical
questions and supply them with identical answers, without any further
processing. The default time to live is 20 seconds and can be changed by
setting ``cache-ttl``. It has been observed that the utility of the
packet cache increases with the load on your nameserver.
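
One way to judge that utility is the hit ratio derived from the
``packetcache-hit`` and ``packetcache-miss`` counters. Because counters
accumulate since daemon startup, the sketch below compares two samples taken
some time apart. The function name and the dict-based input are assumptions
made for illustration.

.. code-block:: python

  def packet_cache_hit_ratio(before, after):
      """Hit ratio of the packet cache between two counter samples.

      Both arguments are dicts holding the 'packetcache-hit' and
      'packetcache-miss' counters at two points in time.
      """
      hits = after["packetcache-hit"] - before["packetcache-hit"]
      misses = after["packetcache-miss"] - before["packetcache-miss"]
      total = hits + misses
      # No lookups in the interval: report 0.0 rather than divide by zero.
      return hits / total if total else 0.0

A ratio close to 1.0 suggests most questions are answered without touching
the backend, while a low ratio on a busy server may be a reason to look at
``cache-ttl``.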

Not all backends may benefit from the packet cache. If your backend is
memory based and does not lead to context switches, the packet cache may
actually hurt performance.

.. versionchanged:: 4.1.0
  The maximum size of the packet cache is controlled by the
  :ref:`setting-max-packet-cache-entries` setting. Before that both the
  query cache and the packet cache used the :ref:`setting-max-cache-entries` setting.

.. _query-cache:

Query Cache
-----------

Besides entire packets, PowerDNS can also cache individual backend
queries. Each DNS query leads to a number of backend queries; the most
obvious additional backend query is the check for a possible CNAME. So,
when a query comes in for the 'A' record for 'www.powerdns.com',
PowerDNS must first check for a CNAME for 'www.powerdns.com'.

The Query Cache caches these backend queries, many of which are quite
repetitive. The maximum number of entries in the cache is controlled by
the ``max-cache-entries`` setting. Before 4.1 this setting also controlled
the maximum number of entries in the packet cache.

Most gain is made from caching negative entries, i.e. queries that have
no answer. As these take little memory to store and are typically not a
real problem in terms of speed-of-propagation, the default TTL for
negative queries is a rather high 60 seconds.

This is only a problem when first doing a query for a record, adding it,
and immediately doing a query for that record again. It may then take up
to 60 seconds to appear. Changes to existing records however do not fall
under the negative query TTL
(:ref:`setting-negquery-cache-ttl`), but under
the generic :ref:`setting-query-cache-ttl` which
defaults to 20 seconds.

The default values should work fine for many sites. When tuning, keep in
mind that the Query Cache mostly saves database access but that the
Packet Cache also saves a lot of CPU because no internal processing is
done when answering a question from the Packet Cache.

Performance Monitoring
----------------------

A number of counters and variables are set during PowerDNS Authoritative
Server operation.

.. _counters:
.. _metricnames:

Counters
~~~~~~~~

All counters that show the "number of X" count since the last startup of the daemon.

.. _stat-corrupt-packets:

corrupt-packets
^^^^^^^^^^^^^^^
Number of corrupt packets received

.. _stat-deferred-cache-inserts:

deferred-cache-inserts
^^^^^^^^^^^^^^^^^^^^^^
Number of cache inserts that were deferred because of maintenance

.. _stat-deferred-cache-lookup:

deferred-cache-lookup
^^^^^^^^^^^^^^^^^^^^^
Number of cache lookups that were deferred because of maintenance

.. _stat-deferred-packetcache-inserts:

deferred-packetcache-inserts
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Number of packet cache inserts that were deferred because of maintenance

.. _stat-deferred-packetcache-lookup:

deferred-packetcache-lookup
^^^^^^^^^^^^^^^^^^^^^^^^^^^
Number of packet cache lookups that were deferred because of maintenance

.. _stat-dnsupdate-answers:

dnsupdate-answers
^^^^^^^^^^^^^^^^^
Number of DNS update packets successfully answered

.. _stat-dnsupdate-changes:

dnsupdate-changes
^^^^^^^^^^^^^^^^^
Total number of changes to records from DNS update

.. _stat-dnsupdate-queries:

dnsupdate-queries
^^^^^^^^^^^^^^^^^
Number of DNS update packets received

.. _stat-dnsupdate-refused:

dnsupdate-refused
^^^^^^^^^^^^^^^^^
Number of DNS update packets that were refused

.. _stat-incoming-notifications:

incoming-notifications
^^^^^^^^^^^^^^^^^^^^^^
Number of NOTIFY packets that were received

.. _stat-key-cache-size:

key-cache-size
^^^^^^^^^^^^^^
Number of entries in the key cache

.. _stat-latency:

latency
^^^^^^^
Average number of microseconds a packet spends within PowerDNS

.. _stat-meta-cache-size:

meta-cache-size
^^^^^^^^^^^^^^^
Number of entries in the metadata cache

.. _stat-overload-drops:

overload-drops
^^^^^^^^^^^^^^
Number of questions dropped because backends overloaded

.. _stat-packetcache-hit:

packetcache-hit
^^^^^^^^^^^^^^^
Number of packets which were answered out of the cache

.. _stat-packetcache-miss:

packetcache-miss
^^^^^^^^^^^^^^^^
Number of times a packet could not be answered out of the cache

.. _stat-packetcache-size:

packetcache-size
^^^^^^^^^^^^^^^^
Number of packets in the packetcache

.. _stat-qsize-q:

qsize-q
^^^^^^^
Number of packets waiting for database attention

.. _stat-query-cache-hit:

query-cache-hit
^^^^^^^^^^^^^^^
Number of hits on the :ref:`query-cache`

.. _stat-query-cache-miss:

query-cache-miss
^^^^^^^^^^^^^^^^
Number of misses on the :ref:`query-cache`

.. _stat-query-cache-size:

query-cache-size
^^^^^^^^^^^^^^^^
Number of entries in the query cache

.. _stat-rd-queries:

rd-queries
^^^^^^^^^^
Number of packets sent by clients requesting recursion (regardless of whether we'll be providing them with recursion).

.. _stat-recursing-answers:

recursing-answers
^^^^^^^^^^^^^^^^^
Number of packets we supplied an answer to after recursive processing

.. _stat-recursing-questions:

recursing-questions
^^^^^^^^^^^^^^^^^^^
Number of packets we performed recursive processing for.

.. _stat-recursion-unanswered:

recursion-unanswered
^^^^^^^^^^^^^^^^^^^^
Number of packets we sent to our recursor, but did not get a timely answer for.

.. _stat-security-status:

security-status
^^^^^^^^^^^^^^^
Security status based on :ref:`securitypolling`.

.. _stat-servfail-packets:

servfail-packets
^^^^^^^^^^^^^^^^
Number of packets that could not be answered due to database problems

.. _stat-signature-cache-size:

signature-cache-size
^^^^^^^^^^^^^^^^^^^^
Number of entries in the signature cache

.. _stat-signatures:

signatures
^^^^^^^^^^
Number of DNSSEC signatures created

.. _stat-sys-msec:

sys-msec
^^^^^^^^
Number of CPU milliseconds spent in system time

.. _stat-tcp-answers-bytes:

tcp-answers-bytes
^^^^^^^^^^^^^^^^^
Total number of answer bytes sent over TCP

.. _stat-tcp-answers:

tcp-answers
^^^^^^^^^^^
Number of answers sent out over TCP

.. _stat-tcp-queries:

tcp-queries
^^^^^^^^^^^
Number of questions received over TCP

.. _stat-tcp4-answers-bytes:

tcp4-answers-bytes
^^^^^^^^^^^^^^^^^^
Total number of answer bytes sent over TCPv4

.. _stat-tcp4-answers:

tcp4-answers
^^^^^^^^^^^^
Number of answers sent out over TCPv4

.. _stat-tcp4-queries:

tcp4-queries
^^^^^^^^^^^^
Number of questions received over TCPv4

.. _stat-tcp6-answers-bytes:

tcp6-answers-bytes
^^^^^^^^^^^^^^^^^^
Total number of answer bytes sent over TCPv6

.. _stat-tcp6-answers:

tcp6-answers
^^^^^^^^^^^^
Number of answers sent out over TCPv6

.. _stat-tcp6-queries:

tcp6-queries
^^^^^^^^^^^^
Number of questions received over TCPv6

.. _stat-timedout-packets:

timedout-packets
^^^^^^^^^^^^^^^^
Number of packets that were dropped because they had to wait too long internally

.. _stat-udp-answers-bytes:

udp-answers-bytes
^^^^^^^^^^^^^^^^^
Total number of answer bytes sent over UDP

.. _stat-udp-answers:

udp-answers
^^^^^^^^^^^
Number of answers sent out over UDP

.. _stat-udp-do-queries:

udp-do-queries
^^^^^^^^^^^^^^
Number of queries received with the DO (DNSSEC OK) bit set

.. _stat-udp-in-errors:

udp-in-errors
^^^^^^^^^^^^^
Number of packets received faster than the OS could process them

.. _stat-udp-noport-errors:

udp-noport-errors
^^^^^^^^^^^^^^^^^
Number of UDP packets where an ICMP response was received that the remote port was not listening

.. _stat-udp-queries:

udp-queries
^^^^^^^^^^^
Number of questions received over UDP

.. _stat-udp-recvbuf-errors:

udp-recvbuf-errors
^^^^^^^^^^^^^^^^^^
Number of errors caused in the UDP receive buffer

.. _stat-udp-sndbuf-errors:

udp-sndbuf-errors
^^^^^^^^^^^^^^^^^
Number of errors caused in the UDP send buffer

.. _stat-udp4-answers-bytes:

udp4-answers-bytes
^^^^^^^^^^^^^^^^^^
Total number of answer bytes sent over UDPv4

.. _stat-udp4-answers:

udp4-answers
^^^^^^^^^^^^
Number of answers sent out over UDPv4

.. _stat-udp4-queries:

udp4-queries
^^^^^^^^^^^^
Number of questions received over UDPv4

.. _stat-udp6-answers-bytes:

udp6-answers-bytes
^^^^^^^^^^^^^^^^^^
Total number of answer bytes sent over UDPv6

.. _stat-udp6-answers:

udp6-answers
^^^^^^^^^^^^
Number of answers sent out over UDPv6

.. _stat-udp6-queries:

udp6-queries
^^^^^^^^^^^^
Number of questions received over UDPv6

.. _stat-uptime:

uptime
^^^^^^
Uptime in seconds of the daemon

.. _stat-user-msec:

user-msec
^^^^^^^^^
Number of milliseconds spent in CPU 'user' time

Ring buffers
~~~~~~~~~~~~

Besides counters, PowerDNS also maintains ringbuffers. A ringbuffer
records events; each new event gets a place in the buffer until it is
full. When full, earlier entries get overwritten, hence the name 'ring'.

By counting the entries in the buffer, statistics can be generated.
These statistics can currently only be viewed using the webserver and
are in fact not even collected without the webserver running.
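
The idea of a ringbuffer and of deriving statistics by counting its entries
can be sketched as follows. This is a conceptual illustration only, not how
PowerDNS implements its ringbuffers; the class name, its methods and the
capacity are hypothetical.

.. code-block:: python

  from collections import Counter, deque

  class RingBuffer:
      """Fixed-size event log: once full, the oldest entry is overwritten."""

      def __init__(self, capacity):
          # A deque with maxlen silently drops the oldest item when full.
          self._events = deque(maxlen=capacity)

      def record(self, event):
          self._events.append(event)

      def top(self, n=10):
          """Return the n most frequent events currently in the buffer."""
          return Counter(self._events).most_common(n)

Recording every query as a ``name/type`` string and calling ``top()`` would
give the most frequently seen queries among the most recent ones, which is the
kind of statistic a ringbuffer is meant to provide.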

The following ringbuffers are available:

- **logmessages**: All messages logged
- **noerror-queries**: Queries for existing records but for a type we
  don't have. Queries for, say, the AAAA record of a domain, when only
  an A is available. Queries are listed in the following format:
  name/type. So an AAAA query for pdns.powerdns.com looks like
  pdns.powerdns.com/AAAA.
- **nxdomain-queries**: Queries for non-existing records within
  existing domains. If PowerDNS knows it is authoritative over a
  domain, and it sees a question for a record in that domain that does
  not exist, it is able to send out an authoritative 'no such domain'
  message. Indicates that hosts are trying to connect to services
  really not in your zone.
- **udp-queries**: All UDP queries seen.
- **remotes**: Remote server IP addresses. Number of hosts querying
  PowerDNS. Be aware that UDP is anonymous - person A can send queries
  that appear to be coming from person B.
- **remote-corrupts**: Remotes sending corrupt packets. Hosts sending
  PowerDNS broken packets, possibly meant to disrupt service. Be aware
  that UDP is anonymous - person A can send queries that appear to be
  coming from person B.
- **remote-unauth**: Remotes querying domains for which we are not
  authoritative. It may happen that there are misconfigured hosts on
  the internet which are configured to think that a PowerDNS
  installation is in fact a resolving nameserver. These hosts will not
  get useful answers from PowerDNS. This buffer lists hosts sending
  queries for domains which PowerDNS does not know about.
- **servfail-queries**: Queries that could not be answered due to
  backend errors. For one reason or another, a backend may be unable to
  extract answers for a certain domain from its storage. This may be
  due to a corrupt database or to inconsistent data. When this happens,
  PowerDNS sends out a 'servfail' packet indicating that it was unable
  to answer the question. This buffer shows which queries have been
  causing servfails.
- **unauth-queries**: Queries for domains that we are not authoritative
  for. If a domain is delegated to a PowerDNS instance, but the backend
  is not made aware of this fact, questions come in for which no answer
  is available, nor is the authority. Use this ringbuffer to spot such
  queries.

.. _metricscarbon:

Sending metrics to Graphite/Metronome over Carbon
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
For carbon/graphite/metronome, we use the following namespace.
Everything starts with 'pdns.', which is then followed by the local hostname.
Thirdly, we add 'auth' to signify the daemon generating the metrics.
This is then rounded off with the actual name of the metric. As an example: 'pdns.ns1.auth.questions'.
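
The naming scheme can be sketched in Python as follows. The helper names are
hypothetical and the code is an illustration rather than the daemon's own
logic. It assumes the documented behaviour that dots in the local hostname are
replaced by underscores while an explicitly configured ``carbon-ourname`` is
used verbatim, and it formats a sample in Carbon's plaintext line format
(``path value timestamp``).

.. code-block:: python

  def carbon_metric_name(hostname, metric, ourname=None):
      """Build 'pdns.<node>.auth.<metric>' for the Authoritative Server."""
      if ourname is not None:
          node = ourname  # an explicit override is not rewritten
      else:
          node = hostname.replace(".", "_")  # keep the namespace unambiguous
      return "pdns.{}.auth.{}".format(node, metric)

  def carbon_line(name, value, timestamp):
      """Format one sample for Carbon's plaintext protocol."""
      return "{} {} {}\n".format(name, int(value), int(timestamp))

For a host named ``ns1.example.com`` this yields names such as
``pdns.ns1_example_com.auth.udp-queries``, which keeps every metric of one
server under a single node in the Graphite tree.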

Care has been taken to make the sending of statistics as unobtrusive as possible; the daemons will not be hindered by an unreachable carbon server, timeouts or connection refused situations.

To benefit from our carbon/graphite support, either install Graphite, or use our own lightweight statistics daemon, Metronome, currently available on `GitHub <https://github.com/ahupowerdns/metronome/>`_.

To enable sending metrics, set :ref:`setting-carbon-server`, possibly :ref:`setting-carbon-interval` and possibly :ref:`setting-carbon-ourname` in the configuration.

.. warning::

  If your hostname includes dots, they will be replaced by underscores so as not to confuse the namespace.

  If you include dots in :ref:`setting-carbon-ourname`, they will **not** be replaced by underscores,
  as PowerDNS assumes you know what you are doing if you override your hostname.