From: Remi Gacogne Date: Fri, 15 Apr 2022 13:11:15 +0000 (+0200) Subject: dnsdist: Document external eBPF programs and maps X-Git-Tag: auth-4.8.0-alpha0~65^2~1 X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=f514078936dba5a4caa797f9192fcddaf19fa11c;p=thirdparty%2Fpdns.git dnsdist: Document external eBPF programs and maps --- diff --git a/pdns/dnsdistdist/docs/advanced/ebpf.rst b/pdns/dnsdistdist/docs/advanced/ebpf.rst index 12b7435816..1b55beddfa 100644 --- a/pdns/dnsdistdist/docs/advanced/ebpf.rst +++ b/pdns/dnsdistdist/docs/advanced/ebpf.rst @@ -80,3 +80,27 @@ In addition to the capabilities explained above, that feature might require an i When attaching an eBPF program to a socket, the size of the program is checked against this limit, and the default value might not be enough. Large map sizes might also require an increase of ``RLIMIT_MEMLOCK``, which can be done by adding ``LimitMEMLOCK=infinity`` in the systemd unit file. It can also be done manually for testing purposes, in a non-permanent way, by using ``ulimit -l``. + +External program, maps and XDP filtering +---------------------------------------- + +Since 1.7.0 dnsdist has the ability to expose its eBPF map to external programs. That feature makes it possible to populate the client IP addresses and qnames maps from dnsdist, usually using the dynamic block mechanism, and to act on the content of these maps from an external program, including a XDP one. +For example, to instruct dnsdist to create under the ``/sys/fs/bpf`` mount point of type ``bpf`` three maps of maximum 1024 entries each, respectively pinned to ``/sys/fs/bpf/dnsdist/addr-v4``, ``/sys/fs/bpf/dnsdist/addr-v6``, ``/sys/fs/bpf/dnsdist/qnames`` for IPv4 addresses, IPv6 ones, and qnames: + +.. code-block:: lua + + bpf = newBPFFilter({maxItems=1024, pinnedPath='/sys/fs/bpf/dnsdist/addr-v4'}, {maxItems=1024, pinnedPath='/sys/fs/bpf/dnsdist/addr-v6'}, {maxItems=1024, pinnedPath='/sys/fs/bpf/dnsdist/qnames'}, true) + +.. note:: + By default only root can write into a bpf mount point, but it is possible to create a ``dnsdist/`` sub-directory with ``mkdir`` and to make it owned by the ``dnsdist`` user with ``chown``. + +The last parameter to :func:`newBPFFilter` is set to ``true`` to indicate to dnsdist not to load its internal eBPF socket filter program, which is not needed since packets will be intercepted by an external program and would at best duplicate the work done by the other program. It also tell dnsdist to use a slightly different format for the eBPF maps: + + * IPv4 and IPv6 maps still use the address as key, but the value contains an action field in addition to the 'matched' counter, to allow for more actions than just dropping the packet + * the qname map now uses the qname and qtype as key, instead of using only the qname, and the value contains the action and counter fields described above instead of having a counter and the qtype + +The first, legacy format is still used because of the limitations of eBPF socket filter programs on older kernels, and the number of instructions in particular, that prevented us from using the qname and qtype as key. We will likely switch to the newer format by default once Linux distributions stop shipping these older kernels. XDP programs require newer kernel versions anyway and have thus fewer limitations. + +XDP programs are more powerful than eBPF socket filtering ones as they are not limited to accepting or denying a packet, but can immediately craft and send an answer. They are also executed a bit earlier in the kernel networking path so can provide better performance. + +A sample program using the maps populated by dnsdist in an external XDP program can be found in the `contrib/ directory of our git repository `__. That program supports answering with a TC=1 response instead of simply dropping the packet.