</listitem>
</varlistentry>
+ <varlistentry>
+ <term><varname>NFTSet=</varname><replaceable>source</replaceable>:<replaceable>family</replaceable>:<replaceable>table</replaceable>:<replaceable>set</replaceable></term>
+ <listitem>
+ <para>This setting provides a method for integrating dynamic cgroup, user and group IDs into
+ firewall rules with <ulink url="https://netfilter.org/projects/nftables/index.html">NFT</ulink>
+ sets. This makes it easy to use the IDs as selectors in firewall rules, which in turn allows more
+ fine-grained filtering. NFT rules for cgroup matching use numeric cgroup IDs, which change every
+ time a service is restarted, making them hard to use in a systemd environment otherwise. Dynamic
+ and random IDs used by <varname>DynamicUser=</varname> can also be integrated with this
+ setting.</para>
+
+ <para>This option expects a whitespace-separated list of NFT set definitions. Each definition
+ consists of a colon-separated tuple of source type (one of <literal>cgroup</literal>,
+ <literal>user</literal> or <literal>group</literal>), NFT address family (one of
+ <literal>arp</literal>, <literal>bridge</literal>, <literal>inet</literal>, <literal>ip</literal>,
+ <literal>ip6</literal>, or <literal>netdev</literal>), table name and set name. The names of tables
+ and sets must conform to the lexical restrictions of NFT table names. The type of the element used in
+ the NFT filter must match the type implied by the directive (<literal>cgroup</literal>,
+ <literal>user</literal> or <literal>group</literal>) as shown in the table below. When a control
+ group or a unit is realized, the corresponding ID is added to the NFT sets, and it is
+ removed again when the control group or unit is removed. <command>systemd</command> only inserts
+ elements into (or removes them from) the sets, so the related NFT rules, tables and sets must be prepared
+ elsewhere in advance. Failures to manage the sets are ignored.</para>
+
+ <table>
+ <title>Defined <varname>source type</varname> values</title>
+ <tgroup cols='3'>
+ <colspec colname='source type'/>
+ <colspec colname='description'/>
+ <colspec colname='NFT type name'/>
+ <thead>
+ <row>
+ <entry>Source type</entry>
+ <entry>Description</entry>
+ <entry>Corresponding NFT type name</entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry><literal>cgroup</literal></entry>
+ <entry>control group ID</entry>
+ <entry><literal>cgroupsv2</literal></entry>
+ </row>
+ <row>
+ <entry><literal>user</literal></entry>
+ <entry>user ID</entry>
+ <entry><literal>meta skuid</literal></entry>
+ </row>
+ <row>
+ <entry><literal>group</literal></entry>
+ <entry>group ID</entry>
+ <entry><literal>meta skgid</literal></entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+
+ <para>If the firewall rules are reinstalled so that the contents of the NFT sets are destroyed,
+ <command>systemctl daemon-reload</command> can be used to refill the sets.</para>
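+
+ <para>A minimal sketch of such a refill, assuming the ruleset is kept in
+ <filename>/etc/nftables.conf</filename> (a hypothetical path, adjust it to the local setup) and
+ that it defines a set <literal>my_service</literal> in table <literal>inet filter</literal>:
+ <programlisting># Reinstall the ruleset; sets that are flushed or recreated lose their elements
+nft -f /etc/nftables.conf
+# Ask systemd to add the IDs of the running units to the sets again
+systemctl daemon-reload
+# Optionally check that the set has been repopulated
+nft list set inet filter my_service</programlisting>
+ </para>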
+
+ <para>Example:
+ <programlisting>[Service]
+NFTSet=cgroup:inet:filter:my_service user:inet:filter:serviceuser
+</programlisting>
+ Corresponding NFT rules:
+ <programlisting>table inet filter {
+ set my_service {
+ type cgroupsv2
+ }
+ set serviceuser {
+ typeof meta skuid
+ }
+ chain x {
+ socket cgroupv2 level 2 @my_service accept
+ drop
+ }
+ chain y {
+ meta skuid @serviceuser accept
+ drop
+ }
+}</programlisting>
+ </para>
+ <xi:include href="version-info.xml" xpointer="v255"/></listitem>
+ </varlistentry>
+
</variablelist>
</refsect2><refsect2><title>BPF Programs</title>
<listitem>
<para><varname>BPFProgram=</varname> allows attaching custom BPF programs to the cgroup of a
unit. (This generalizes the functionality exposed via <varname>IPEgressFilterPath=</varname> and
- and <varname>IPIngressFilterPath=</varname> for other hooks.) Cgroup-bpf hooks in the form of BPF
+ <varname>IPIngressFilterPath=</varname> for other hooks.) Cgroup-bpf hooks in the form of BPF
programs loaded to the BPF filesystem are attached with cgroup-bpf attach flags determined by the
unit. For details about attachment types and flags see <ulink
url="https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/plain/include/uapi/linux/bpf.h"><filename>bpf.h</filename></ulink>. Also
<replaceable>type</replaceable>:<replaceable>program-path</replaceable>.</para>
<para>The BPF program type is equivalent to the BPF attach type used in
- <command>bpftool</command>. It may be one of <constant>egress</constant>,
- <constant>ingress</constant>, <constant>sock_create</constant>, <constant>sock_ops</constant>,
- <constant>device</constant>, <constant>bind4</constant>, <constant>bind6</constant>,
- <constant>connect4</constant>, <constant>connect6</constant>, <constant>post_bind4</constant>,
- <constant>post_bind6</constant>, <constant>sendmsg4</constant>, <constant>sendmsg6</constant>,
- <constant>sysctl</constant>, <constant>recvmsg4</constant>, <constant>recvmsg6</constant>,
- <constant>getsockopt</constant>, <constant>setsockopt</constant>.</para>
+ <citerefentry project='mankier'><refentrytitle>bpftool</refentrytitle><manvolnum>8</manvolnum></citerefentry>.
+ It may be one of
+ <constant>egress</constant>,
+ <constant>ingress</constant>,
+ <constant>sock_create</constant>,
+ <constant>sock_ops</constant>,
+ <constant>device</constant>,
+ <constant>bind4</constant>,
+ <constant>bind6</constant>,
+ <constant>connect4</constant>,
+ <constant>connect6</constant>,
+ <constant>post_bind4</constant>,
+ <constant>post_bind6</constant>,
+ <constant>sendmsg4</constant>,
+ <constant>sendmsg6</constant>,
+ <constant>sysctl</constant>,
+ <constant>recvmsg4</constant>,
+ <constant>recvmsg6</constant>,
+ <constant>getsockopt</constant>,
+ or <constant>setsockopt</constant>.
+ </para>
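+ <para>For illustration, a unit could attach programs to two different hooks as sketched here. The
+ pinned paths are hypothetical and assume the programs have already been loaded and pinned to
+ bpffs, for example with <command>bpftool</command>:
+ <programlisting>[Service]
+BPFProgram=egress:/sys/fs/bpf/egress-hook
+BPFProgram=bind6:/sys/fs/bpf/sock-addr-hook</programlisting>
+ </para>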
<para>The specified program path must be an absolute path referencing a BPF program inode in the
bpffs file system (which generally means it must begin with <filename>/sys/fs/bpf/</filename>). If
<varname>$MEMORY_PRESSURE_WATCH</varname> environment variable to the literal string
<filename>/dev/null</filename>. If <literal>on</literal> tells the service to watch for memory
pressure events. This enables memory accounting for the service, and ensures the
- <filename>memory.pressure</filename> cgroup attribute files is accessible for read and write to the
+ <filename>memory.pressure</filename> cgroup attribute file is accessible for reading and writing by the
service's user. It then sets the <varname>$MEMORY_PRESSURE_WATCH</varname> environment variable for
processes invoked by the unit to the file system path to this file. The threshold information
configured with <varname>MemoryPressureThresholdSec=</varname> is encoded in the
<xi:include href="version-info.xml" xpointer="v254"/></listitem>
</varlistentry>
+ </variablelist>
- <varlistentry>
- <term><varname>NFTSet=</varname><replaceable>family</replaceable>:<replaceable>table</replaceable>:<replaceable>set</replaceable></term>
- <listitem>
- <para>This setting provides a method for integrating dynamic cgroup, user and group IDs into
- firewall rules with <ulink url="https://netfilter.org/projects/nftables/index.html">NFT</ulink>
- sets. The benefit of using this setting is to be able to use the IDs as selectors in firewall rules
- easily and this in turn allows more fine grained filtering. NFT rules for cgroup matching use
- numeric cgroup IDs, which change every time a service is restarted, making them hard to use in
- systemd environment otherwise. Dynamic and random IDs used by <varname>DynamicUser=</varname> can
- be also integrated with this setting.</para>
+ </refsect2><refsect2><title>Coredump Control</title>
- <para>This option expects a whitespace separated list of NFT set definitions. Each definition
- consists of a colon-separated tuple of source type (one of <literal>cgroup</literal>,
- <literal>user</literal> or <literal>group</literal>), NFT address family (one of
- <literal>arp</literal>, <literal>bridge</literal>, <literal>inet</literal>, <literal>ip</literal>,
- <literal>ip6</literal>, or <literal>netdev</literal>), table name and set name. The names of tables
- and sets must conform to lexical restrictions of NFT table names. The type of the element used in
- the NFT filter must match the type implied by the directive (<literal>cgroup</literal>,
- <literal>user</literal> or <literal>group</literal>) as shown in the table below. When a control
- group or a unit is realized, the corresponding ID will be appended to the NFT sets and it will be
- be removed when the control group or unit is removed. <command>systemd</command> only inserts
- elements to (or removes from) the sets, so the related NFT rules, tables and sets must be prepared
- elsewhere in advance. Failures to manage the sets will be ignored.</para>
+ <variablelist class='unit-directives'>
- <table>
- <title>Defined <varname>source type</varname> values</title>
- <tgroup cols='3'>
- <colspec colname='source type'/>
- <colspec colname='description'/>
- <colspec colname='NFT type name'/>
- <thead>
- <row>
- <entry>Source type</entry>
- <entry>Description</entry>
- <entry>Corresponding NFT type name</entry>
- </row>
- </thead>
+ <varlistentry>
+ <term><varname>CoredumpReceive=</varname></term>
- <tbody>
- <row>
- <entry><literal>cgroup</literal></entry>
- <entry>control group ID</entry>
- <entry><literal>cgroupsv2</literal></entry>
- </row>
- <row>
- <entry><literal>user</literal></entry>
- <entry>user ID</entry>
- <entry><literal>meta skuid</literal></entry>
- </row>
- <row>
- <entry><literal>group</literal></entry>
- <entry>group ID</entry>
- <entry><literal>meta skgid</literal></entry>
- </row>
- </tbody>
- </tgroup>
- </table>
+ <listitem><para>Takes a boolean argument. This setting is used to enable coredump forwarding for containers
+ that belong to this unit's cgroup. Units with <varname>CoredumpReceive=yes</varname> must also be configured
+ with <varname>Delegate=yes</varname>. Defaults to false.</para>
- <para>If the firewall rules are reinstalled so that the contents of NFT sets are destroyed, command
- <command>systemctl daemon-reload</command> can be used to refill the sets.</para>
+ <para>When <command>systemd-coredump</command> is handling a coredump for a process from a container,
+ if the container's leader process is a descendant of a cgroup with <varname>CoredumpReceive=yes</varname>
+ and <varname>Delegate=yes</varname>, then <command>systemd-coredump</command> will attempt to forward
+ the coredump to <command>systemd-coredump</command> within the container.</para>
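+ <para>A minimal sketch of a service using this setting. The container manager command line and
+ the container directory are hypothetical and only stand in for whatever starts the container:
+ <programlisting>[Service]
+ExecStart=/usr/bin/systemd-nspawn --boot --directory=/var/lib/machines/mycontainer
+Delegate=yes
+CoredumpReceive=yes</programlisting>
+ </para>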
- <para>Example:
- <programlisting>[Unit]
-NFTSet=cgroup:inet:filter:my_service user:inet:filter:serviceuser
-</programlisting>
- Corresponding NFT rules:
- <programlisting>table inet filter {
- set my_service {
- type cgroupsv2
- }
- set serviceuser {
- typeof meta skuid
- }
- chain x {
- socket cgroupv2 level 2 @my_service accept
- drop
- }
- chain y {
- meta skuid @serviceuser accept
- drop
- }
-}</programlisting>
- </para>
<xi:include href="version-info.xml" xpointer="v255"/></listitem>
</varlistentry>
+
</variablelist>
</refsect2>
</refsect1>