Commit | Line | Data |
---|---|---|
0e2063c3 PL |
1 | Performance and Tuning |
2 | ====================== | |
3 | ||
4 | In general, best performance is achieved on recent Linux 4.x kernels and | |
5 | using MySQL, although many of the largest PowerDNS installations are | |
6 | based on PostgreSQL. FreeBSD also performs very well. | |
7 | ||
8 | Database servers can require configuration to achieve decent | |
9 | performance. It is especially worth noting that several vendors ship | |
10 | PostgreSQL with a slow default configuration. | |
11 | ||
12 | .. warning:: | |
13 | When deploying (large scale) IPv6, please be aware some | |
14 | Linux distributions leave IPv6 routing cache tables at very small | |
15 | default values. Please check and if necessary raise | |
16 | ``sysctl net.ipv6.route.max_size``. | |
17 | ||
18 | Performance related settings | |
19 | ---------------------------- | |
20 | ||
21 | When PowerDNS starts up it creates a number of threads to listen for | |
22 | packets. This is configurable with the | |
23 | :ref:`setting-receiver-threads` setting which | |
24 | defines how many sockets will be opened by the PowerDNS process. In | |
25 | versions of Linux before kernel 3.9, having too many receiver threads set | |
26 | up resulted in decreased performance due to socket contention between | |
27 | multiple CPUs - the typical sweet spot was 3 or 4. For optimal | |
28 | performance on kernel 3.9 and following with | |
29 | :ref:`setting-reuseport` enabled you'll typically want | |
30 | a receiver thread for each core on your box if backend | |
31 | latency/performance is not an issue and you want top performance. | |
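
The rule of thumb above can be written down as a small helper. This is only a sketch: ``suggest_receiver_threads`` is a hypothetical name, not part of PowerDNS, and treating "kernel 3.9 or later without reuseport" like an older kernel is an assumption made for the example.

```python
def suggest_receiver_threads(kernel, cores, reuseport):
    """Return a starting value for receiver-threads (heuristic only).

    kernel    -- (major, minor) tuple, e.g. (4, 19)
    cores     -- number of CPU cores available to PowerDNS
    reuseport -- whether the reuseport setting is enabled
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    if kernel >= (3, 9) and reuseport:
        # One receiver thread per core, as suggested in the text.
        return cores
    # Older kernels suffer from socket contention; the text names 3 or 4
    # as the typical sweet spot. Assumption: the same cap is used when
    # reuseport is disabled on a newer kernel.
    return min(cores, 4)


print(suggest_receiver_threads((4, 19), 16, True))
print(suggest_receiver_threads((3, 2), 16, False))
```

Treat the result as a starting point for measurement; if the backend is slow, more receiver threads will not help.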
32 | ||
33 | Different backends will have different characteristics - some will want | |
34 | to have more parallel instances than others. In general, if your backend | |
35 | is latency bound, like most relational databases are, it pays to open | |
36 | more backends. | |
37 | ||
38 | This is done with the | |
39 | :ref:`setting-distributor-threads` setting | |
40 | which says how many distributors will be opened for each receiver | |
41 | thread. Of special importance is the choice between 1 or more backends. | |
42 | In case of only 1 thread, PowerDNS reverts to unthreaded operation which | |
43 | may be a lot faster, depending on your operating system and | |
44 | architecture. | |
45 | ||
46 | Another very important setting is | |
47 | :ref:`setting-cache-ttl`. PowerDNS caches entire | |
48 | packets it sends out so as to save the time to query backends to | |
49 | assemble all data. The default setting of 20 seconds may be low for high | |
50 | traffic sites; a value of 60 seconds rarely leads to problems. Please be | |
51 | aware that if any TTL in the answer is shorter than this setting, the | |
52 | packet cache will respect the answer's shortest TTL. | |
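
The interaction between ``cache-ttl`` and the TTLs in an answer amounts to taking a minimum. The sketch below illustrates that rule only; ``effective_packet_cache_ttl`` is a hypothetical helper, not a PowerDNS function.

```python
def effective_packet_cache_ttl(cache_ttl, answer_ttls):
    """Return how long an answer may stay in the packet cache, in seconds.

    cache_ttl   -- the configured cache-ttl value
    answer_ttls -- TTLs of the records in the answer packet
    """
    if not answer_ttls:
        # Assumption: with no record TTLs to respect, only cache-ttl applies.
        return cache_ttl
    # The shortest TTL in the answer wins if it is below cache-ttl.
    return min(cache_ttl, min(answer_ttls))


print(effective_packet_cache_ttl(60, [300, 3600]))  # limited by cache-ttl
print(effective_packet_cache_ttl(60, [300, 30]))    # limited by the 30s record
```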
53 | ||
54 | Some PowerDNS operators set cache-ttl to many hours or even days, and | |
55 | use :ref:`pdns_control purge <running-pdnscontrol>` to | |
56 | selectively or globally notify PowerDNS of changes made in the backend. | |
57 | Also look at the :ref:`query-cache` described in this | |
58 | chapter. It may materially improve your performance. | |
59 | ||
60 | To determine if PowerDNS is unable to keep up with packets, check | |
61 | the value of the :ref:`stat-qsize-q` variable. This represents the number of | |
62 | packets waiting for database attention. During normal operations the | |
63 | queue should be small. | |
64 | ||
65 | Logging truly kills performance as answering a question from the cache | |
66 | is an order of magnitude less work than logging a line about it. Busy | |
67 | sites will prefer to turn :ref:`setting-log-dns-details` off. | |
68 | ||
69 | .. _packet-cache: | |
70 | ||
71 | Packet Cache | |
72 | ------------ | |
73 | ||
74 | PowerDNS by default uses the 'Packet Cache' to recognise identical | |
75 | questions and supply them with identical answers, without any further | |
76 | processing. The default time to live is 20 seconds and can be changed by | |
77 | setting ``cache-ttl``. It has been observed that the utility of the | |
78 | packet cache increases with the load on your nameserver. | |
79 | ||
80 | Not all backends may benefit from the packet cache. If your backend is | |
81 | memory based and does not lead to context switches, the packet cache may | |
82 | actually hurt performance. | |
83 | ||
84 | .. versionchanged:: 4.1.0 | |
85 | The maximum size of the packet cache is controlled by the | |
86 | :ref:`setting-max-packet-cache-entries` setting. Before that, both the | |
87 | query cache and the packet cache used the :ref:`setting-max-cache-entries` setting. | |
88 | ||
89 | .. _query-cache: | |
90 | ||
91 | Query Cache | |
92 | ----------- | |
93 | ||
94 | Besides entire packets, PowerDNS can also cache individual backend | |
95 | queries. Each DNS query leads to a number of backend queries; the most | |
96 | obvious additional backend query is the check for a possible CNAME. So, | |
97 | when a query comes in for the 'A' record for 'www.powerdns.com', | |
98 | PowerDNS must first check for a CNAME for 'www.powerdns.com'. | |
99 | ||
100 | The Query Cache caches these backend queries, many of which are quite | |
101 | repetitive. The maximum number of entries in the cache is controlled by | |
102 | the ``max-cache-entries`` setting. Before 4.1 this setting also controlled | |
103 | the maximum number of entries in the packet cache. | |
104 | ||
105 | Most gain is made from caching negative entries, i.e., queries that have | |
106 | no answer. As these take little memory to store and are typically not a | |
107 | real problem in terms of speed-of-propagation, the default TTL for | |
108 | negative queries is a rather high 60 seconds. | |
109 | ||
110 | This is only a problem when first doing a query for a record, adding it, | |
111 | and immediately doing a query for that record again. It may then take up | |
112 | to 60 seconds to appear. Changes to existing records however do not fall | |
113 | under the negative query ttl | |
114 | (:ref:`setting-negquery-cache-ttl`), but under | |
115 | the generic :ref:`setting-query-cache-ttl` which | |
116 | defaults to 20 seconds. | |
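
As a worked illustration of the two TTLs, the sketch below picks the lifetime of a query cache entry and computes the worst-case delay before a newly added record becomes visible. The function names are hypothetical, and the default values are the ones quoted in the text.

```python
def query_cache_entry_ttl(has_records, query_cache_ttl=20, negquery_cache_ttl=60):
    """Return the query cache lifetime (seconds) for one backend query result."""
    # Negative results (no records found) use the negative query TTL.
    return query_cache_ttl if has_records else negquery_cache_ttl


def seconds_until_visible(negative_cached_at, record_added_at, negquery_cache_ttl=60):
    """Seconds after adding a record until the cached negative entry expires."""
    expires_at = negative_cached_at + negquery_cache_ttl
    return max(0, expires_at - record_added_at)


print(query_cache_entry_ttl(has_records=False))
print(query_cache_entry_ttl(has_records=True))
# Queried at t=0 (negative answer cached), record added at t=5:
print(seconds_until_visible(0, 5))
```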
117 | ||
118 | The default values should work fine for many sites. When tuning, keep in | |
119 | mind that the Query Cache mostly saves database access but that the | |
120 | Packet Cache also saves a lot of CPU because no internal processing is | |
121 | done when answering a question from the Packet Cache. | |
122 | ||
123 | Performance Monitoring | |
124 | ---------------------- | |
125 | ||
126 | A number of counters and variables are set during PowerDNS Authoritative | |
127 | Server operation. | |
128 | ||
129 | .. _counters: | |
130 | .. _metricnames: | |
131 | ||
132 | Counters | |
133 | ~~~~~~~~ | |
134 | ||
135 | All counters that show the "number of X" count since the last startup of the daemon. | |
136 | ||
137 | .. _stat-corrupt-packets: | |
138 | ||
139 | corrupt-packets | |
140 | ^^^^^^^^^^^^^^^ | |
141 | Number of corrupt packets received | |
142 | ||
143 | .. _stat-deferred-cache-inserts: | |
144 | ||
145 | deferred-cache-inserts | |
146 | ^^^^^^^^^^^^^^^^^^^^^^ | |
147 | Number of cache inserts that were deferred because of maintenance | |
148 | ||
149 | .. _stat-deferred-cache-lookup: | |
150 | ||
151 | deferred-cache-lookup | |
152 | ^^^^^^^^^^^^^^^^^^^^^ | |
153 | Number of cache lookups that were deferred because of maintenance | |
154 | ||
155 | .. _stat-deferred-packetcache-inserts: | |
156 | ||
157 | deferred-packetcache-inserts | |
158 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
159 | Number of packet cache inserts that were deferred because of maintenance | |
160 | ||
161 | .. _stat-deferred-packetcache-lookup: | |
162 | ||
163 | deferred-packetcache-lookup | |
164 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
165 | Number of packet cache lookups that were deferred because of maintenance | |
166 | ||
167 | .. _stat-dnsupdate-answers: | |
168 | ||
169 | dnsupdate-answers | |
170 | ^^^^^^^^^^^^^^^^^ | |
171 | Number of DNS update packets successfully answered | |
172 | ||
173 | .. _stat-dnsupdate-changes: | |
174 | ||
175 | dnsupdate-changes | |
176 | ^^^^^^^^^^^^^^^^^ | |
177 | Total number of changes to records from DNS update | |
178 | ||
179 | .. _stat-dnsupdate-queries: | |
180 | ||
181 | dnsupdate-queries | |
182 | ^^^^^^^^^^^^^^^^^ | |
183 | Number of DNS update packets received | |
184 | ||
185 | .. _stat-dnsupdate-refused: | |
186 | ||
187 | dnsupdate-refused | |
188 | ^^^^^^^^^^^^^^^^^ | |
189 | Number of DNS update packets that were refused | |
190 | ||
191 | .. _stat-incoming-notifications: | |
192 | ||
193 | incoming-notifications | |
194 | ^^^^^^^^^^^^^^^^^^^^^^ | |
195 | Number of NOTIFY packets that were received | |
196 | ||
197 | .. _stat-key-cache-size: | |
198 | ||
199 | key-cache-size | |
200 | ^^^^^^^^^^^^^^ | |
201 | Number of entries in the key cache | |
202 | ||
203 | .. _stat-latency: | |
204 | ||
205 | latency | |
206 | ^^^^^^^ | |
207 | Average number of microseconds a packet spends within PowerDNS | |
208 | ||
209 | .. _stat-meta-cache-size: | |
210 | ||
211 | meta-cache-size | |
212 | ^^^^^^^^^^^^^^^ | |
213 | Number of entries in the metadata cache | |
214 | ||
215 | .. _stat-overload-drops: | |
216 | ||
217 | overload-drops | |
218 | ^^^^^^^^^^^^^^ | |
219 | Number of questions dropped because backends overloaded | |
220 | ||
221 | .. _stat-packetcache-hit: | |
222 | ||
223 | packetcache-hit | |
224 | ^^^^^^^^^^^^^^^ | |
225 | Number of packets which were answered out of the cache | |
226 | ||
227 | .. _stat-packetcache-miss: | |
228 | ||
229 | packetcache-miss | |
230 | ^^^^^^^^^^^^^^^^ | |
231 | Number of times a packet could not be answered out of the cache | |
232 | ||
233 | .. _stat-packetcache-size: | |
234 | ||
235 | packetcache-size | |
236 | ^^^^^^^^^^^^^^^^ | |
237 | Number of packets in the packet cache | |
238 | ||
239 | .. _stat-qsize-q: | |
240 | ||
241 | qsize-q | |
242 | ^^^^^^^ | |
243 | Number of packets waiting for database attention | |
244 | ||
245 | .. _stat-query-cache-hit: | |
246 | ||
247 | query-cache-hit | |
248 | ^^^^^^^^^^^^^^^ | |
249 | Number of hits on the :ref:`query-cache` | |
250 | ||
251 | .. _stat-query-cache-miss: | |
252 | ||
253 | query-cache-miss | |
254 | ^^^^^^^^^^^^^^^^ | |
255 | Number of misses on the :ref:`query-cache` | |
256 | ||
257 | .. _stat-query-cache-size: | |
258 | ||
259 | query-cache-size | |
260 | ^^^^^^^^^^^^^^^^ | |
261 | Number of entries in the query cache | |
262 | ||
263 | .. _stat-rd-queries: | |
264 | ||
265 | rd-queries | |
266 | ^^^^^^^^^^ | |
267 | Number of packets sent by clients requesting recursion (regardless of whether we will be providing them with recursion). | |
268 | ||
269 | .. _stat-recursing-answers: | |
270 | ||
271 | recursing-answers | |
272 | ^^^^^^^^^^^^^^^^^ | |
273 | Number of packets we supplied an answer to after recursive processing | |
274 | ||
275 | .. _stat-recursing-questions: | |
276 | ||
277 | recursing-questions | |
278 | ^^^^^^^^^^^^^^^^^^^ | |
279 | Number of packets we performed recursive processing for. | |
280 | ||
281 | .. _stat-recursion-unanswered: | |
282 | ||
283 | recursion-unanswered | |
284 | ^^^^^^^^^^^^^^^^^^^^ | |
285 | Number of packets we sent to our recursor, but did not get a timely answer for. | |
286 | ||
287 | .. _stat-security-status: | |
288 | ||
289 | security-status | |
290 | ^^^^^^^^^^^^^^^ | |
291 | Security status based on :ref:`securitypolling`. | |
292 | ||
293 | .. _stat-servfail-packets: | |
294 | ||
295 | servfail-packets | |
296 | ^^^^^^^^^^^^^^^^ | |
297 | Number of packets that could not be answered due to database problems | |
298 | ||
299 | .. _stat-signature-cache-size: | |
300 | ||
301 | signature-cache-size | |
302 | ^^^^^^^^^^^^^^^^^^^^ | |
303 | Number of entries in the signature cache | |
304 | ||
305 | .. _stat-signatures: | |
306 | ||
307 | signatures | |
308 | ^^^^^^^^^^ | |
309 | Number of DNSSEC signatures created | |
310 | ||
311 | .. _stat-sys-msec: | |
312 | ||
313 | sys-msec | |
314 | ^^^^^^^^ | |
315 | Number of CPU milliseconds spent in system time | |
316 | ||
317 | .. _stat-tcp-answers-bytes: | |
318 | ||
319 | tcp-answers-bytes | |
320 | ^^^^^^^^^^^^^^^^^ | |
321 | Total number of answer bytes sent over TCP | |
322 | ||
323 | .. _stat-tcp-answers: | |
324 | ||
325 | tcp-answers | |
326 | ^^^^^^^^^^^ | |
327 | Number of answers sent out over TCP | |
328 | ||
329 | .. _stat-tcp-queries: | |
330 | ||
331 | tcp-queries | |
332 | ^^^^^^^^^^^ | |
333 | Number of questions received over TCP | |
334 | ||
335 | .. _stat-tcp4-answers-bytes: | |
336 | ||
337 | tcp4-answers-bytes | |
338 | ^^^^^^^^^^^^^^^^^^ | |
339 | Total number of answer bytes sent over TCPv4 | |
340 | ||
341 | .. _stat-tcp4-answers: | |
342 | ||
343 | tcp4-answers | |
344 | ^^^^^^^^^^^^ | |
345 | Number of answers sent out over TCPv4 | |
346 | ||
347 | .. _stat-tcp4-queries: | |
348 | ||
349 | tcp4-queries | |
350 | ^^^^^^^^^^^^ | |
351 | Number of questions received over TCPv4 | |
352 | ||
353 | .. _stat-tcp6-answers-bytes: | |
354 | ||
355 | tcp6-answers-bytes | |
356 | ^^^^^^^^^^^^^^^^^^ | |
357 | Total number of answer bytes sent over TCPv6 | |
358 | ||
359 | .. _stat-tcp6-answers: | |
360 | ||
361 | tcp6-answers | |
362 | ^^^^^^^^^^^^ | |
363 | Number of answers sent out over TCPv6 | |
364 | ||
365 | .. _stat-tcp6-queries: | |
366 | ||
367 | tcp6-queries | |
368 | ^^^^^^^^^^^^ | |
369 | Number of questions received over TCPv6 | |
370 | ||
371 | .. _stat-timedout-packets: | |
372 | ||
373 | timedout-packets | |
374 | ^^^^^^^^^^^^^^^^ | |
375 | Number of packets that were dropped because they had to wait too long internally | |
376 | ||
377 | .. _stat-udp-answers-bytes: | |
378 | ||
379 | udp-answers-bytes | |
380 | ^^^^^^^^^^^^^^^^^ | |
381 | Total number of answer bytes sent over UDP | |
382 | ||
383 | .. _stat-udp-answers: | |
384 | ||
385 | udp-answers | |
386 | ^^^^^^^^^^^ | |
387 | Number of answers sent out over UDP | |
388 | ||
389 | .. _stat-udp-do-queries: | |
390 | ||
391 | udp-do-queries | |
392 | ^^^^^^^^^^^^^^ | |
393 | Number of queries received with the DO (DNSSEC OK) bit set | |
394 | ||
395 | .. _stat-udp-in-errors: | |
396 | ||
397 | udp-in-errors | |
398 | ^^^^^^^^^^^^^ | |
399 | Number of packets received faster than the OS could process them | |
400 | ||
401 | .. _stat-udp-noport-errors: | |
402 | ||
403 | udp-noport-errors | |
404 | ^^^^^^^^^^^^^^^^^ | |
405 | Number of UDP packets where an ICMP response was received that the remote port was not listening | |
406 | ||
407 | .. _stat-udp-queries: | |
408 | ||
409 | udp-queries | |
410 | ^^^^^^^^^^^ | |
411 | Number of questions received over UDP | |
412 | ||
413 | .. _stat-udp-recvbuf-errors: | |
414 | ||
415 | udp-recvbuf-errors | |
416 | ^^^^^^^^^^^^^^^^^^ | |
417 | Number of errors caused in the UDP receive buffer | |
418 | ||
419 | .. _stat-udp-sndbuf-errors: | |
420 | ||
421 | udp-sndbuf-errors | |
422 | ^^^^^^^^^^^^^^^^^ | |
423 | Number of errors caused in the UDP send buffer | |
424 | ||
425 | .. _stat-udp4-answers-bytes: | |
426 | ||
427 | udp4-answers-bytes | |
428 | ^^^^^^^^^^^^^^^^^^ | |
429 | Total number of answer bytes sent over UDPv4 | |
430 | ||
431 | .. _stat-udp4-answers: | |
432 | ||
433 | udp4-answers | |
434 | ^^^^^^^^^^^^ | |
435 | Number of answers sent out over UDPv4 | |
436 | ||
437 | .. _stat-udp4-queries: | |
438 | ||
439 | udp4-queries | |
440 | ^^^^^^^^^^^^ | |
441 | Number of questions received over UDPv4 | |
442 | ||
443 | .. _stat-udp6-answers-bytes: | |
444 | ||
445 | udp6-answers-bytes | |
446 | ^^^^^^^^^^^^^^^^^^ | |
447 | Total number of answer bytes sent over UDPv6 | |
448 | ||
449 | .. _stat-udp6-answers: | |
450 | ||
451 | udp6-answers | |
452 | ^^^^^^^^^^^^ | |
453 | Number of answers sent out over UDPv6 | |
454 | ||
455 | .. _stat-udp6-queries: | |
456 | ||
457 | udp6-queries | |
458 | ^^^^^^^^^^^^ | |
459 | Number of questions received over UDPv6 | |
460 | ||
461 | .. _stat-uptime: | |
462 | ||
463 | uptime | |
464 | ^^^^^^ | |
465 | Uptime in seconds of the daemon | |
466 | ||
467 | .. _stat-user-msec: | |
468 | ||
469 | user-msec | |
470 | ^^^^^^^^^ | |
471 | Number of milliseconds spent in CPU 'user' time | |
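
The hit and miss counters above are most useful as ratios. The following sketch derives cache hit ratios from a set of counter values; the numbers in ``counters`` are made up for illustration, and how the values are collected (for example from ``pdns_control`` or the webserver) is left out.

```python
def hit_ratio(hits, misses):
    """Return hits / (hits + misses), or 0.0 when nothing was counted yet."""
    total = hits + misses
    return hits / total if total else 0.0


# Illustrative values only, not measurements from a real server.
counters = {
    "packetcache-hit": 900,
    "packetcache-miss": 100,
    "query-cache-hit": 150,
    "query-cache-miss": 50,
}

packet_ratio = hit_ratio(counters["packetcache-hit"], counters["packetcache-miss"])
query_ratio = hit_ratio(counters["query-cache-hit"], counters["query-cache-miss"])
print(f"packet cache hit ratio: {packet_ratio:.2f}")
print(f"query cache hit ratio:  {query_ratio:.2f}")
```

Because the counters count since the last startup, comparing two samples taken some time apart gives a better picture of current behaviour than the raw totals.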
472 | ||
473 | Ring buffers | |
474 | ~~~~~~~~~~~~ | |
475 | ||
476 | Besides counters, PowerDNS also maintains ringbuffers. A ringbuffer | |
477 | records events; each new event gets a place in the buffer until it is | |
478 | full. When full, earlier entries get overwritten, hence the name 'ring'. | |
479 | ||
480 | By counting the entries in the buffer, statistics can be generated. | |
481 | These statistics can currently only be viewed using the webserver and | |
482 | are in fact not even collected without the webserver running. | |
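
The ringbuffer idea can be sketched in a few lines. This is an illustration of the concept, not PowerDNS's implementation: the ``RingBuffer`` class and the example query strings are hypothetical.

```python
from collections import Counter, deque


class RingBuffer:
    """Fixed-size event buffer; the oldest entry is overwritten when full."""

    def __init__(self, capacity):
        # deque with maxlen silently drops the oldest item on overflow.
        self._entries = deque(maxlen=capacity)

    def record(self, event):
        self._entries.append(event)

    def top(self, n):
        """Return the n most frequent events currently in the buffer."""
        return Counter(self._entries).most_common(n)


ring = RingBuffer(capacity=4)
for query in ["a.example/AAAA", "b.example/A", "a.example/AAAA",
              "c.example/MX", "a.example/AAAA"]:
    ring.record(query)

# The first event has been overwritten, so only two of the three
# a.example/AAAA queries are still counted.
print(ring.top(1))
```

The statistics are therefore a view of recent traffic, bounded by the buffer size, rather than totals since startup.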
483 | ||
484 | The following ringbuffers are available: | |
485 | ||
486 | - **logmessages**: All messages logged | |
487 | - **noerror-queries**: Queries for existing records but for a type we | |
488 | don't have. Queries for, say, the AAAA record of a domain, when only | |
489 | an A is available. Queries are listed in the following format: | |
490 | name/type. So an AAAA query for pdns.powerdns.com looks like | |
491 | pdns.powerdns.com/AAAA. | |
492 | - **nxdomain-queries**: Queries for non-existing records within | |
493 | existing domains. If PowerDNS knows it is authoritative over a | |
494 | domain, and it sees a question for a record in that domain that does | |
495 | not exist, it is able to send out an authoritative 'no such domain' | |
496 | message. Indicates that hosts are trying to connect to services | |
497 | really not in your zone. | |
498 | - **udp-queries**: All UDP queries seen. | |
499 | - **remotes**: Remote server IP addresses. Number of hosts querying | |
500 | PowerDNS. Be aware that UDP is anonymous - person A can send queries | |
501 | that appear to be coming from person B. | |
502 | - **remote-corrupts**: Remotes sending corrupt packets. Hosts sending | |
503 | PowerDNS broken packets, possibly meant to disrupt service. Be aware | |
504 | that UDP is anonymous - person A can send queries that appear to be | |
505 | coming from person B. | |
506 | - **remote-unauth**: Remotes querying domains for which we are not | |
507 | authoritative. It may happen that there are misconfigured hosts on | |
508 | the internet which are configured to think that a PowerDNS | |
509 | installation is in fact a resolving nameserver. These hosts will not | |
510 | get useful answers from PowerDNS. This buffer lists hosts sending | |
511 | queries for domains which PowerDNS does not know about. | |
512 | - **servfail-queries**: Queries that could not be answered due to | |
513 | backend errors. For one reason or another, a backend may be unable to | |
514 | extract answers for a certain domain from its storage. This may be | |
515 | due to a corrupt database or to inconsistent data. When this happens, | |
516 | PowerDNS sends out a 'servfail' packet indicating that it was unable | |
517 | to answer the question. This buffer shows which queries have been | |
518 | causing servfails. | |
519 | - **unauth-queries**: Queries for domains that we are not authoritative | |
520 | for. If a domain is delegated to a PowerDNS instance, but the backend | |
521 | is not made aware of this fact, questions come in for which no answer | |
522 | is available, nor is the authority. Use this ringbuffer to spot such | |
523 | queries. | |
524 | ||
525 | .. _metricscarbon: | |
526 | ||
527 | Sending metrics to Graphite/Metronome over Carbon | |
528 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
529 | For carbon/graphite/metronome, we use the following namespace. | |
530 | Everything starts with 'pdns.', which is then followed by the local hostname. | |
531 | Thirdly, we add 'auth' to signify the daemon generating the metrics. | |
532 | This is then rounded off with the actual name of the metric. As an example: 'pdns.ns1.auth.questions'. | |
533 | ||
534 | Care has been taken to make the sending of statistics as unobtrusive as possible; the daemons will not be hindered by an unreachable carbon server, timeouts or connection refused situations. | |
535 | ||
536 | To benefit from our carbon/graphite support, either install Graphite, or use our own lightweight statistics daemon, Metronome, currently available on `GitHub <https://github.com/ahupowerdns/metronome/>`_. | |
537 | ||
538 | To enable sending metrics, set :ref:`setting-carbon-server`, possibly :ref:`setting-carbon-interval` and possibly :ref:`setting-carbon-ourname` in the configuration. | |
539 | ||
540 | .. warning:: | |
541 | ||
542 | If your hostname includes dots, they will be replaced by underscores so as not to confuse the namespace. | |
543 | ||
544 | If you include dots in :ref:`setting-carbon-ourname`, they will **not** be replaced by underscores, | |
545 | as PowerDNS assumes you know what you are doing if you override your hostname. |
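
The naming scheme described in this section can be sketched as follows. Both helper functions are hypothetical and only mirror the rules stated above; the line format ``path value timestamp`` is that of Carbon's plaintext protocol.

```python
def carbon_metric_path(hostname, metric, ourname=None):
    """Build a metric path of the form 'pdns.<name>.auth.<metric>'."""
    if ourname is not None:
        # An explicit carbon-ourname is used verbatim, dots included.
        node = ourname
    else:
        # Dots in the local hostname are replaced by underscores.
        node = hostname.replace(".", "_")
    return f"pdns.{node}.auth.{metric}"


def carbon_line(path, value, timestamp):
    """Format one sample for Carbon's plaintext protocol."""
    return f"{path} {value} {timestamp}\n"


print(carbon_metric_path("ns1", "questions"))
print(carbon_metric_path("ns1.example.com", "latency"))
print(carbon_metric_path("ns1.example.com", "latency", ourname="dc1.ns1"))
print(carbon_line("pdns.ns1.auth.questions", 42, 1500000000), end="")
```

Note how the third call yields an extra level in the namespace, which is exactly the effect the warning above describes.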