<option>syslog</option> or <option>kmsg</option> (or their combinations with console output, see below)
automatically acquire dependencies of type <varname>After=</varname> on
<filename>systemd-journald.socket</filename>.</para></listitem>
+
+ <listitem><para>Units using <varname>LogNamespace=</varname> will automatically gain ordering and
+ requirement dependencies on the two socket units associated with
+ <filename>systemd-journald@.service</filename> instances.</para></listitem>
</itemizedlist>
</refsect1>
is set, the default group of the user is used. This setting does not affect commands whose command line is
prefixed with <literal>+</literal>.</para>
- <para>Note that restrictions on the user/group name syntax are enforced: the specified name must consist only
- of the characters a-z, A-Z, 0-9, <literal>_</literal> and <literal>-</literal>, except for the first character
- which must be one of a-z, A-Z or <literal>_</literal> (i.e. numbers and <literal>-</literal> are not permitted
- as first character). The user/group name must have at least one character, and at most 31. These restrictions
- are enforced in order to avoid ambiguities and to ensure user/group names and unit files remain portable among
- Linux systems.</para>
+ <para>Note that this enforces only weak restrictions on the user/group name syntax, but will generate
+ warnings in many cases where user/group names do not adhere to the following rules: the specified
+ name should consist only of the characters a-z, A-Z, 0-9, <literal>_</literal> and
+ <literal>-</literal>, except for the first character which must be one of a-z, A-Z or
+ <literal>_</literal> (i.e. digits and <literal>-</literal> are not permitted as first character). The
+ user/group name must have at least one character, and at most 31. These restrictions are made in
+ order to avoid ambiguities and to ensure user/group names and unit files remain portable among Linux
+ systems. For further details on the names accepted and the names warned about see <ulink
+ url="https://systemd.io/USER_NAMES">User/Group Name Syntax</ulink>.</para>
<para>When used in conjunction with <varname>DynamicUser=</varname> the user/group name specified is
- dynamically allocated at the time the service is started, and released at the time the service is stopped —
- unless it is already allocated statically (see below). If <varname>DynamicUser=</varname> is not used the
- specified user and group must have been created statically in the user database no later than the moment the
- service is started, for example using the
- <citerefentry><refentrytitle>sysusers.d</refentrytitle><manvolnum>5</manvolnum></citerefentry> facility, which
- is applied at boot or package install time.</para>
+ dynamically allocated at the time the service is started, and released at the time the service is
+ stopped — unless it is already allocated statically (see below). If <varname>DynamicUser=</varname>
+ is not used the specified user and group must have been created statically in the user database no
+ later than the moment the service is started, for example using the
+ <citerefentry><refentrytitle>sysusers.d</refentrytitle><manvolnum>5</manvolnum></citerefentry>
+ facility, which is applied at boot or package install time. If the user does not exist by then,
+ program invocation will fail.</para>
<para>If the <varname>User=</varname> setting is used the supplementary group list is initialized
from the specified user's default group list, as defined in the system's user and group
<varname>RestrictAddressFamilies=</varname>, <varname>RestrictNamespaces=</varname>,
<varname>PrivateDevices=</varname>, <varname>ProtectKernelTunables=</varname>,
<varname>ProtectKernelModules=</varname>, <varname>ProtectKernelLogs=</varname>,
- <varname>MemoryDenyWriteExecute=</varname>, <varname>RestrictRealtime=</varname>,
- <varname>RestrictSUIDSGID=</varname>, <varname>DynamicUser=</varname> or <varname>LockPersonality=</varname>
- are specified. Note that even if this setting is overridden by them, <command>systemctl show</command> shows the
- original value of this setting. Also see <ulink
- url="https://www.kernel.org/doc/html/latest/userspace-api/no_new_privs.html">No New Privileges
+ <varname>ProtectClock=</varname>, <varname>MemoryDenyWriteExecute=</varname>,
+ <varname>RestrictRealtime=</varname>, <varname>RestrictSUIDSGID=</varname>, <varname>DynamicUser=</varname>
+ or <varname>LockPersonality=</varname> are specified. Note that even if this setting is overridden by them,
+ <command>systemctl show</command> shows the original value of this setting.
+ Also see <ulink url="https://www.kernel.org/doc/html/latest/userspace-api/no_new_privs.html">No New Privileges
Flag</ulink>.</para></listitem>
</varlistentry>
<term><varname>LimitRTTIME=</varname></term>
<listitem><para>Set soft and hard limits on various resources for executed processes. See
- <citerefentry><refentrytitle>setrlimit</refentrytitle><manvolnum>2</manvolnum></citerefentry> for details on
- the resource limit concept. Resource limits may be specified in two formats: either as single value to set a
- specific soft and hard limit to the same value, or as colon-separated pair <option>soft:hard</option> to set
- both limits individually (e.g. <literal>LimitAS=4G:16G</literal>). Use the string <option>infinity</option> to
- configure no limit on a specific resource. The multiplicative suffixes K, M, G, T, P and E (to the base 1024)
- may be used for resource limits measured in bytes (e.g. LimitAS=16G). For the limits referring to time values,
- the usual time units ms, s, min, h and so on may be used (see
+ <citerefentry><refentrytitle>setrlimit</refentrytitle><manvolnum>2</manvolnum></citerefentry> for
+ details on the resource limit concept. Resource limits may be specified in two formats: either as
+ single value to set a specific soft and hard limit to the same value, or as colon-separated pair
+ <option>soft:hard</option> to set both limits individually (e.g. <literal>LimitAS=4G:16G</literal>).
+ Use the string <option>infinity</option> to configure no limit on a specific resource. The
+ multiplicative suffixes K, M, G, T, P and E (to the base 1024) may be used for resource limits
+ measured in bytes (e.g. <literal>LimitAS=16G</literal>). For the limits referring to time values, the
+ usual time units ms, s, min, h and so on may be used (see
<citerefentry><refentrytitle>systemd.time</refentrytitle><manvolnum>7</manvolnum></citerefentry> for
- details). Note that if no time unit is specified for <varname>LimitCPU=</varname> the default unit of seconds
- is implied, while for <varname>LimitRTTIME=</varname> the default unit of microseconds is implied. Also, note
- that the effective granularity of the limits might influence their enforcement. For example, time limits
- specified for <varname>LimitCPU=</varname> will be rounded up implicitly to multiples of 1s. For
- <varname>LimitNICE=</varname> the value may be specified in two syntaxes: if prefixed with <literal>+</literal>
- or <literal>-</literal>, the value is understood as regular Linux nice value in the range -20..19. If not
- prefixed like this the value is understood as raw resource limit parameter in the range 0..40 (with 0 being
- equivalent to 1).</para>
-
- <para>Note that most process resource limits configured with these options are per-process, and processes may
- fork in order to acquire a new set of resources that are accounted independently of the original process, and
- may thus escape limits set. Also note that <varname>LimitRSS=</varname> is not implemented on Linux, and
- setting it has no effect. Often it is advisable to prefer the resource controls listed in
+ details). Note that if no time unit is specified for <varname>LimitCPU=</varname> the default unit of
+ seconds is implied, while for <varname>LimitRTTIME=</varname> the default unit of microseconds is
+ implied. Also, note that the effective granularity of the limits might influence their
+ enforcement. For example, time limits specified for <varname>LimitCPU=</varname> will be rounded up
+ implicitly to multiples of 1s. For <varname>LimitNICE=</varname> the value may be specified in two
+ syntaxes: if prefixed with <literal>+</literal> or <literal>-</literal>, the value is understood as
+ regular Linux nice value in the range -20..19. If not prefixed like this the value is understood as
+ raw resource limit parameter in the range 0..40 (with 0 being equivalent to 1).</para>
+
+ <para>Note that most process resource limits configured with these options are per-process, and
+ processes may fork in order to acquire a new set of resources that are accounted independently of the
+ original process, and may thus escape limits set. Also note that <varname>LimitRSS=</varname> is not
+ implemented on Linux, and setting it has no effect. Often it is advisable to prefer the resource
+ controls listed in
<citerefentry><refentrytitle>systemd.resource-control</refentrytitle><manvolnum>5</manvolnum></citerefentry>
- over these per-process limits, as they apply to services as a whole, may be altered dynamically at runtime, and
- are generally more expressive. For example, <varname>MemoryLimit=</varname> is a more powerful (and working)
- replacement for <varname>LimitRSS=</varname>.</para>
-
- <para>For system units these resource limits may be chosen freely. For user units however (i.e. units run by a
- per-user instance of
- <citerefentry><refentrytitle>systemd</refentrytitle><manvolnum>1</manvolnum></citerefentry>), these limits are
- bound by (possibly more restrictive) per-user limits enforced by the OS.</para>
+ over these per-process limits, as they apply to services as a whole, may be altered dynamically at
+ runtime, and are generally more expressive. For example, <varname>MemoryMax=</varname> is a more
+ powerful (and working) replacement for <varname>LimitRSS=</varname>.</para>
<para>Resource limits not configured explicitly for a unit default to the value configured in the various
<varname>DefaultLimitCPU=</varname>, <varname>DefaultLimitFSIZE=</varname>, … options available in
<citerefentry><refentrytitle>systemd-system.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry>, and –
if not configured there – the kernel or per-user defaults, as defined by the OS (the latter only for user
- services, see above).</para>
+ services, see below).</para>
+
+ <para>For system units these resource limits may be chosen freely. When these settings are configured
+ in a user service (i.e. a service run by the per-user instance of the service manager) they cannot be
+ used to raise the limits above those set for the user manager itself when it was first invoked, as
+ the user's service manager generally lacks the privileges to do so. In user context these
+ configuration options are hence only useful to lower the limits passed in or to raise the soft limit
+ up to the hard limit configured for the user. To raise the user's limits further, the
+ available configuration mechanisms differ between operating systems, but typically require
+ privileges. In most cases it is possible to configure higher per-user resource limits via PAM or by
+ setting limits on the system service encapsulating the user's service manager, i.e. the user's
+ instance of <filename>user@.service</filename>. After making such changes, make sure to restart the
+ user's service manager.</para>
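+
+ <example>
+ <title>Illustrative sketch: raising a limit for the user's service manager</title>
+
+ <para>A minimal sketch, assuming a drop-in file at the hypothetical path
+ <filename>/etc/systemd/system/user@.service.d/limits.conf</filename>. The file name and the limit
+ value are illustrative assumptions, not recommendations:</para>
+
+ <programlisting># Hypothetical drop-in; as written it applies to every instance of user@.service
+ [Service]
+ LimitNOFILE=65536</programlisting>
+ </example>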
<table>
<title>Resource limit directives, their equivalent <command>ulimit</command> shell commands and the unit used</title>
<term><varname>UMask=</varname></term>
<listitem><para>Controls the file mode creation mask. Takes an access mode in octal notation. See
- <citerefentry><refentrytitle>umask</refentrytitle><manvolnum>2</manvolnum></citerefentry> for details. Defaults
- to 0022.</para></listitem>
+ <citerefentry><refentrytitle>umask</refentrytitle><manvolnum>2</manvolnum></citerefentry> for
+ details. Defaults to 0022 for system units. For units of the user service manager the default value
+ is inherited from the user instance (whose default is inherited from the system service manager, and
+ thus also is 0022). Hence changing the default value of a user instance, either via
+ <varname>UMask=</varname> or via a PAM module, will affect the user instance itself and all user
+ units started by the user instance unless a user unit has specified its own
+ <varname>UMask=</varname>.</para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>CoredumpFilter=</varname></term>
+
+ <listitem><para>Controls which types of memory mappings will be saved if the process dumps core
+ (using the <filename>/proc/<replaceable>pid</replaceable>/coredump_filter</filename> file). Takes a
+ whitespace-separated combination of mapping type names or numbers (with the default base 16). Mapping
+ type names are <constant>private-anonymous</constant>, <constant>shared-anonymous</constant>,
+ <constant>private-file-backed</constant>, <constant>shared-file-backed</constant>,
+ <constant>elf-headers</constant>, <constant>private-huge</constant>,
+ <constant>shared-huge</constant>, <constant>private-dax</constant>, <constant>shared-dax</constant>,
+ and the special values <constant>all</constant> (all types) and <constant>default</constant> (the
+ kernel default of <literal><constant>private-anonymous</constant>
+ <constant>shared-anonymous</constant> <constant>elf-headers</constant>
+ <constant>private-huge</constant></literal>). See
+ <citerefentry><refentrytitle>core</refentrytitle><manvolnum>5</manvolnum></citerefentry> for the
+ meaning of the mapping types. When specified multiple times, all specified masks are ORed. When not
+ set, or if the empty value is assigned, the inherited value is not changed.</para>
+
+ <example>
+ <title>Add DAX pages to the dump filter</title>
+
+ <programlisting>CoredumpFilter=default private-dax shared-dax</programlisting>
+ </example>
+ </listitem>
</varlistentry>
<varlistentry>
<term><varname>CPUAffinity=</varname></term>
<listitem><para>Controls the CPU affinity of the executed processes. Takes a list of CPU indices or ranges
- separated by either whitespace or commas. CPU ranges are specified by the lower and upper CPU indices separated
- by a dash. This option may be specified more than once, in which case the specified CPU affinity masks are
- merged. If the empty string is assigned, the mask is reset, all assignments prior to this will have no
- effect. See
+ separated by either whitespace or commas. Alternatively, takes the special value <literal>numa</literal>, in which case systemd
+ automatically derives the allowed CPU range from the value of the <varname>NUMAMask=</varname> option. CPU ranges
+ are specified by the lower and upper CPU indices separated by a dash. This option may be specified more than
+ once, in which case the specified CPU affinity masks are merged. If the empty string is assigned, the mask
+ is reset, all assignments prior to this will have no effect. See
<citerefentry><refentrytitle>sched_setaffinity</refentrytitle><manvolnum>2</manvolnum></citerefentry> for
details.</para></listitem>
</varlistentry>
<xi:include href="system-only.xml" xpointer="singular"/></listitem>
</varlistentry>
+ <varlistentry>
+ <term><varname>ProtectClock=</varname></term>
+
+ <listitem><para>Takes a boolean argument. If set, writes to the hardware clock or system clock will be denied.
+ It is recommended to turn this on for most services that do not need to modify the clock. Defaults to off. Enabling
+ this option removes <constant>CAP_SYS_TIME</constant> and <constant>CAP_WAKE_ALARM</constant> from the
+ capability bounding set for this unit, installs a system call filter to block calls that can set the
+ clock, and <varname>DeviceAllow=char-rtc r</varname> is implied. This ensures <filename>/dev/rtc0</filename>,
+ <filename>/dev/rtc1</filename>, etc. are made read-only to the service. See
+ <citerefentry><refentrytitle>systemd.resource-control</refentrytitle><manvolnum>5</manvolnum></citerefentry>
+ for the details about <varname>DeviceAllow=</varname>.</para>
+
+ <xi:include href="system-only.xml" xpointer="singular"/></listitem>
+ </varlistentry>
+
<varlistentry>
<term><varname>ProtectKernelTunables=</varname></term>
mappings. Specifically these are the options <varname>PrivateTmp=</varname>,
<varname>PrivateDevices=</varname>, <varname>ProtectSystem=</varname>, <varname>ProtectHome=</varname>,
<varname>ProtectKernelTunables=</varname>, <varname>ProtectControlGroups=</varname>,
- <varname>ProtectKernelLogs=</varname>, <varname>ReadOnlyPaths=</varname>,
+ <varname>ProtectKernelLogs=</varname>, <varname>ProtectClock=</varname>, <varname>ReadOnlyPaths=</varname>,
<varname>InaccessiblePaths=</varname> and <varname>ReadWritePaths=</varname>.</para></listitem>
</varlistentry>
</para></listitem>
</varlistentry>
+ <varlistentry>
+ <term><varname>LogNamespace=</varname></term>
+
+ <listitem><para>Run the unit's processes in the specified journal namespace. Expects a short
+ user-defined string identifying the namespace. If not used the processes of the service are run in
+ the default journal namespace, i.e. their log stream is collected and processed by
+ <filename>systemd-journald.service</filename>. If this option is used any log data generated by
+ processes of this unit (regardless of whether via <function>syslog()</function>, native journal logging
+ or stdout/stderr logging) is collected and processed by an instance of the
+ <filename>systemd-journald@.service</filename> template unit, which manages the specified
+ namespace. The log data is stored in a data store independent from the default log namespace's data
+ store. See
+ <citerefentry><refentrytitle>systemd-journald.service</refentrytitle><manvolnum>8</manvolnum></citerefentry>
+ for details about journal namespaces.</para>
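+
+ <example>
+ <title>Illustrative sketch: logging into a separate journal namespace</title>
+
+ <para>A minimal sketch, assuming a hypothetical namespace name <literal>myapp</literal> and a
+ hypothetical binary <filename>/usr/bin/myapp</filename>; both names are placeholders chosen for
+ illustration:</para>
+
+ <programlisting># Hypothetical service fragment; "myapp" is a placeholder namespace name
+ [Service]
+ ExecStart=/usr/bin/myapp
+ LogNamespace=myapp</programlisting>
+
+ <para>Under these assumptions the unit's log output would be read with
+ <command>journalctl --namespace=myapp</command>.</para>
+ </example>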
+
+ <para>Internally, journal namespaces are implemented through Linux mount namespacing and
+ over-mounting the directory that contains the relevant <constant>AF_UNIX</constant> sockets used for
+ logging in the unit's mount namespace. Since mount namespaces are used this setting disconnects
+ propagation of mounts from the unit's processes to the host, similar to how
+ <varname>ReadOnlyPaths=</varname> and similar settings (see above) work. Journal namespaces may hence
+ not be used for services that need to establish mount points on the host.</para>
+
+ <para>When this option is used the unit will automatically gain ordering and requirement dependencies
+ on the two socket units associated with the <filename>systemd-journald@.service</filename> instance
+ so that they are automatically established prior to the unit starting up. Note that when this option
+ is used log output of this service does not appear in the regular
+ <citerefentry><refentrytitle>journalctl</refentrytitle><manvolnum>1</manvolnum></citerefentry>
+ output, unless the <option>--namespace=</option> option is used.</para></listitem>
+ </varlistentry>
+
<varlistentry>
<term><varname>SyslogIdentifier=</varname></term>