X-Git-Url: http://git.ipfire.org/?a=blobdiff_plain;f=man%2Fsystemd.exec.xml;h=3e2ae93bf0e60d857a347710b979b73710d1d97d;hb=a9ab5cdb505d7d368c44fc02cc0183e75db1f657;hp=7b0b4f18e574306545356308c214b9556a77db8b;hpb=353a6f293e701bf97bf5f2659b8f0d6941c1121d;p=thirdparty%2Fsystemd.git diff --git a/man/systemd.exec.xml b/man/systemd.exec.xml index 7b0b4f18e57..3e2ae93bf0e 100644 --- a/man/systemd.exec.xml +++ b/man/systemd.exec.xml @@ -70,6 +70,10 @@ or (or their combinations with console output, see below) automatically acquire dependencies of type After= on systemd-journald.socket. + + Units using LogNamespace= will automatically gain ordering and + requirement dependencies on the two socket units associated with + systemd-journald@.service instances. @@ -117,12 +121,12 @@ RootImage= - Takes a path to a block device node or regular file as argument. This call is similar to - RootDirectory= however mounts a file system hierarchy from a block device node or loopback - file instead of a directory. The device node or file system image file needs to contain a file system without a - partition table, or a file system within an MBR/MS-DOS or GPT partition table with only a single - Linux-compatible partition, or a set of file systems within a GPT partition table that follows the Discoverable Partitions + Takes a path to a block device node or regular file as argument. This call is similar + to RootDirectory= however mounts a file system hierarchy from a block device node + or loopback file instead of a directory. The device node or file system image file needs to contain a + file system without a partition table, or a file system within an MBR/MS-DOS or GPT partition table + with only a single Linux-compatible partition, or a set of file systems within a GPT partition table + that follows the Discoverable Partitions Specification. When DevicePolicy= is set to closed or @@ -135,6 +139,9 @@ PrivateDevices= below, as it may change the setting of DevicePolicy=. 
+ Units making use of RootImage= automatically gain an + After= dependency on systemd-udevd.service. + @@ -213,20 +220,24 @@ is set, the default group of the user is used. This setting does not affect commands whose command line is prefixed with +. - Note that restrictions on the user/group name syntax are enforced: the specified name must consist only - of the characters a-z, A-Z, 0-9, _ and -, except for the first character - which must be one of a-z, A-Z or _ (i.e. numbers and - are not permitted - as first character). The user/group name must have at least one character, and at most 31. These restrictions - are enforced in order to avoid ambiguities and to ensure user/group names and unit files remain portable among - Linux systems. + Note that this enforces only weak restrictions on the user/group name syntax, but will generate + warnings in many cases where user/group names do not adhere to the following rules: the specified + name should consist only of the characters a-z, A-Z, 0-9, _ and + -, except for the first character which must be one of a-z, A-Z and + _ (i.e. digits and - are not permitted as first character). The + user/group name must have at least one character, and at most 31. These restrictions are made in + order to avoid ambiguities and to ensure user/group names and unit files remain portable among Linux + systems. For further details on the names accepted and the names warned about see User/Group Name Syntax. When used in conjunction with DynamicUser= the user/group name specified is - dynamically allocated at the time the service is started, and released at the time the service is stopped — - unless it is already allocated statically (see below). If DynamicUser= is not used the - specified user and group must have been created statically in the user database no later than the moment the - service is started, for example using the - sysusers.d(5) facility, which - is applied at boot or package install time.
+ dynamically allocated at the time the service is started, and released at the time the service is + stopped — unless it is already allocated statically (see below). If DynamicUser= + is not used the specified user and group must have been created statically in the user database no + later than the moment the service is started, for example using the + sysusers.d(5) + facility, which is applied at boot or package install time. If the user does not exist by then + program invocation will fail. If the User= setting is used the supplementary group list is initialized from the specified user's default group list, as defined in the system's user and group @@ -404,11 +415,11 @@ CapabilityBoundingSet=~CAP_B CAP_C RestrictAddressFamilies=, RestrictNamespaces=, PrivateDevices=, ProtectKernelTunables=, ProtectKernelModules=, ProtectKernelLogs=, - MemoryDenyWriteExecute=, RestrictRealtime=, - RestrictSUIDSGID=, DynamicUser= or LockPersonality= - are specified. Note that even if this setting is overridden by them, systemctl show shows the - original value of this setting. Also see No New Privileges + ProtectClock=, MemoryDenyWriteExecute=, + RestrictRealtime=, RestrictSUIDSGID=, DynamicUser= + or LockPersonality= are specified. Note that even if this setting is overridden by them, + systemctl show shows the original value of this setting. + Also see No New Privileges Flag. @@ -497,42 +508,51 @@ CapabilityBoundingSet=~CAP_B CAP_C LimitRTTIME= Set soft and hard limits on various resources for executed processes. See - setrlimit(2) for details on - the resource limit concept. Resource limits may be specified in two formats: either as single value to set a - specific soft and hard limit to the same value, or as colon-separated pair to set - both limits individually (e.g. LimitAS=4G:16G). Use the string to - configure no limit on a specific resource. The multiplicative suffixes K, M, G, T, P and E (to the base 1024) - may be used for resource limits measured in bytes (e.g. 
LimitAS=16G). For the limits referring to time values, - the usual time units ms, s, min, h and so on may be used (see + setrlimit(2) for + details on the resource limit concept. Resource limits may be specified in two formats: either as + single value to set a specific soft and hard limit to the same value, or as colon-separated pair + to set both limits individually (e.g. LimitAS=4G:16G). + Use the string to configure no limit on a specific resource. The + multiplicative suffixes K, M, G, T, P and E (to the base 1024) may be used for resource limits + measured in bytes (e.g. LimitAS=16G). For the limits referring to time values, the + usual time units ms, s, min, h and so on may be used (see systemd.time(7) for - details). Note that if no time unit is specified for LimitCPU= the default unit of seconds - is implied, while for LimitRTTIME= the default unit of microseconds is implied. Also, note - that the effective granularity of the limits might influence their enforcement. For example, time limits - specified for LimitCPU= will be rounded up implicitly to multiples of 1s. For - LimitNICE= the value may be specified in two syntaxes: if prefixed with + - or -, the value is understood as regular Linux nice value in the range -20..19. If not - prefixed like this the value is understood as raw resource limit parameter in the range 0..40 (with 0 being - equivalent to 1). - - Note that most process resource limits configured with these options are per-process, and processes may - fork in order to acquire a new set of resources that are accounted independently of the original process, and - may thus escape limits set. Also note that LimitRSS= is not implemented on Linux, and - setting it has no effect. Often it is advisable to prefer the resource controls listed in + details). Note that if no time unit is specified for LimitCPU= the default unit of + seconds is implied, while for LimitRTTIME= the default unit of microseconds is + implied.
Also, note that the effective granularity of the limits might influence their + enforcement. For example, time limits specified for LimitCPU= will be rounded up + implicitly to multiples of 1s. For LimitNICE= the value may be specified in two + syntaxes: if prefixed with + or -, the value is understood as + regular Linux nice value in the range -20..19. If not prefixed like this the value is understood as + raw resource limit parameter in the range 0..40 (with 0 being equivalent to 1). + + Note that most process resource limits configured with these options are per-process, and + processes may fork in order to acquire a new set of resources that are accounted independently of the + original process, and may thus escape limits set. Also note that LimitRSS= is not + implemented on Linux, and setting it has no effect. Often it is advisable to prefer the resource + controls listed in systemd.resource-control(5) - over these per-process limits, as they apply to services as a whole, may be altered dynamically at runtime, and - are generally more expressive. For example, MemoryLimit= is a more powerful (and working) - replacement for LimitRSS=. - - For system units these resource limits may be chosen freely. For user units however (i.e. units run by a - per-user instance of - systemd(1)), these limits are - bound by (possibly more restrictive) per-user limits enforced by the OS. + over these per-process limits, as they apply to services as a whole, may be altered dynamically at + runtime, and are generally more expressive. For example, MemoryMax= is a more + powerful (and working) replacement for LimitRSS=. Resource limits not configured explicitly for a unit default to the value configured in the various DefaultLimitCPU=, DefaultLimitFSIZE=, … options available in systemd-system.conf(5), and – if not configured there – the kernel or per-user defaults, as defined by the OS (the latter only for user - services, see above). + services, see below).
+ + For system units these resource limits may be chosen freely. When these settings are configured + in a user service (i.e. a service run by the per-user instance of the service manager) they cannot be + used to raise the limits above those set for the user manager itself when it was first invoked, as + the user's service manager generally lacks the privileges to do so. In user context these + configuration options are hence only useful to lower the limits passed in or to raise the soft limit + to the maximum of the hard limit as configured for the user. To raise the user's limits further, the + available configuration mechanisms differ between operating systems, but typically require + privileges. In most cases it is possible to configure higher per-user resource limits via PAM or by + setting limits on the system service encapsulating the user's service manager, i.e. the user's + instance of user@.service. After making such changes, make sure to restart the + user's service manager. Resource limit directives, their equivalent ulimit shell commands and the unit used @@ -638,8 +658,39 @@ CapabilityBoundingSet=~CAP_B CAP_C UMask= Controls the file mode creation mask. Takes an access mode in octal notation. See - umask(2) for details. Defaults - to 0022. + umask(2) for + details. Defaults to 0022 for system units. For units of the user service manager the default value + is inherited from the user instance (whose default is inherited from the system service manager, and + thus also is 0022). Hence changing the default value of a user instance, either via + UMask= or via a PAM module, will affect the user instance itself and all user + units started by the user instance unless a user unit has specified its own + UMask=. + + + + CoredumpFilter= + + Controls which types of memory mappings will be saved if the process dumps core + (using the /proc/pid/coredump_filter file).
Takes a + whitespace-separated combination of mapping type names or numbers (with the default base 16). Mapping + type names are private-anonymous, shared-anonymous, + private-file-backed, shared-file-backed, + elf-headers, private-huge, + shared-huge, private-dax, shared-dax, + and the special values all (all types) and default (the + kernel default of private-anonymous + shared-anonymous elf-headers + private-huge). See + core(5) for the + meaning of the mapping types. When specified multiple times, all specified masks are ORed. When not + set, or if the empty value is assigned, the inherited value is not changed. + + + Add DAX pages to the dump filter + + CoredumpFilter=default private-dax shared-dax + + @@ -760,10 +811,11 @@ CapabilityBoundingSet=~CAP_B CAP_C CPUAffinity= Controls the CPU affinity of the executed processes. Takes a list of CPU indices or ranges - separated by either whitespace or commas. CPU ranges are specified by the lower and upper CPU indices separated - by a dash. This option may be specified more than once, in which case the specified CPU affinity masks are - merged. If the empty string is assigned, the mask is reset, all assignments prior to this will have no - effect. See + separated by either whitespace or commas. Alternatively, takes a special "numa" value in which case systemd + automatically derives allowed CPU range based on the value of NUMAMask= option. CPU ranges + are specified by the lower and upper CPU indices separated by a dash. This option may be specified more than + once, in which case the specified CPU affinity masks are merged. If the empty string is assigned, the mask + is reset, all assignments prior to this will have no effect. See sched_setaffinity(2) for details. @@ -830,7 +882,8 @@ CapabilityBoundingSet=~CAP_B CAP_C Also note that some sandboxing functionality is generally not available in user services (i.e. services run by the per-user service manager).
Specifically, the various settings requiring file system namespacing support (such as ProtectSystem=) are not available, as the underlying kernel functionality is only - accessible to privileged processes. + accessible to privileged processes. However, most namespacing settings, that will not work on their own in user + services, will work when used in conjunction with PrivateUsers=. @@ -1251,6 +1304,13 @@ BindReadOnlyPaths=/var/lib/systemd such as CapabilityBoundingSet= will affect only the latter, and there's no way to acquire additional capabilities in the host's user namespace. Defaults to off. + When this setting is set up by a per-user instance of the service manager, the mapping of the + root user and group to itself is omitted (unless the user manager is root). + Additionally, in the per-user instance manager case, the + user namespace will be set up before most other namespaces. This means that combining + PrivateUsers= with other namespaces will enable use of features not + normally supported by the per-user instances of the service manager. + This setting is particularly useful in conjunction with RootDirectory=/RootImage=, as the need to synchronize the user and group databases in the root directory and on the host is reduced, as the only users and groups who need to be matched @@ -1258,9 +1318,7 @@ BindReadOnlyPaths=/var/lib/systemd Note that the implementation of this setting might be impossible (for example if user namespaces are not available), and the unit should be written in a way that does not solely rely on this setting for - security. - - + security. @@ -1280,6 +1338,21 @@ BindReadOnlyPaths=/var/lib/systemd + + ProtectClock= + + Takes a boolean argument. If set, writes to the hardware clock or system clock will be denied. + It is recommended to turn this on for most services that do not need to modify the clock. Defaults to off.
Enabling + this option removes CAP_SYS_TIME and CAP_WAKE_ALARM from the + capability bounding set for this unit, installs a system call filter to block calls that can set the + clock, and DeviceAllow=char-rtc r is implied. This ensures /dev/rtc0, + /dev/rtc1, etc. are made read-only to the service. See + systemd.resource-control(5) + for the details about DeviceAllow=. + + + + ProtectKernelTunables= @@ -1561,7 +1634,7 @@ RestrictNamespaces=~cgroup net points of the file system namespace created for each process of this unit. Other file system namespacing unit settings (see the discussion in PrivateMounts= above) will implicitly disable mount and unmount propagation from the unit's processes towards the host by changing the propagation setting of all mount - points in the unit's file system namepace to first. Setting this option to + points in the unit's file system namespace to first. Setting this option to does not reestablish propagation in that case. If not set – but file system namespaces are enabled through another file system namespace unit setting – @@ -1791,7 +1864,7 @@ SystemCallErrorNumber=EPERM mappings. Specifically these are the options PrivateTmp=, PrivateDevices=, ProtectSystem=, ProtectHome=, ProtectKernelTunables=, ProtectControlGroups=, - ProtectKernelLogs=, ReadOnlyPaths=, + ProtectKernelLogs=, ProtectClock=, ReadOnlyPaths=, InaccessiblePaths= and ReadWritePaths=. @@ -1902,7 +1975,8 @@ SystemCallErrorNumber=EPERM The files listed with this directive will be read shortly before the process is executed (more specifically, after all processes from a previous unit state terminated. This means you can generate these - files in one unit state, and read it with this option in the next). + files in one unit state, and read it with this option in the next. The files are read from the file + system of the service manager, before any file system changes like bind mounts take place). Settings from these files override settings made with Environment=.
If the same variable is set twice from these files, the files will be read in the order they are specified and the later @@ -2041,7 +2115,7 @@ SystemCallErrorNumber=EPERM StandardOutput= - Controls where file descriptor 1 (STDOUT) of the executed processes is connected + Controls where file descriptor 1 (stdout) of the executed processes is connected to. Takes one of , , , , , , , , @@ -2117,7 +2191,7 @@ SystemCallErrorNumber=EPERM StandardError= - Controls where file descriptor 2 (STDERR) of the executed processes is connected to. The + Controls where file descriptor 2 (stderr) of the executed processes is connected to. The available options are identical to those of StandardOutput=, with some exceptions: if set to the file descriptor used for standard output is duplicated for standard error, while will use a default file descriptor name of @@ -2222,6 +2296,36 @@ StandardInputData=SWNrIHNpdHplIGRhIHVuJyBlc3NlIEtsb3BzLAp1ZmYgZWVtYWwga2xvcHAncy + + LogNamespace= + + Run the unit's processes in the specified journal namespace. Expects a short + user-defined string identifying the namespace. If not used the processes of the service are run in + the default journal namespace, i.e. their log stream is collected and processed by + systemd-journald.service. If this option is used any log data generated by + processes of this unit (regardless if via the syslog(), journal native logging + or stdout/stderr logging) is collected and processed by an instance of the + systemd-journald@.service template unit, which manages the specified + namespace. The log data is stored in a data store independent from the default log namespace's data + store. See + systemd-journald.service(8) + for details about journal namespaces. + + Internally, journal namespaces are implemented through Linux mount namespacing and + over-mounting the directory that contains the relevant AF_UNIX sockets used for + logging in the unit's mount namespace.
Since mount namespaces are used this setting disconnects + propagation of mounts from the unit's processes to the host, similar to how + ReadOnlyPaths= and similar settings (see above) work. Journal namespaces may hence + not be used for services that need to establish mount points on the host. + + When this option is used the unit will automatically gain ordering and requirement dependencies + on the two socket units associated with the systemd-journald@.service instance + so that they are automatically established prior to the unit starting up. Note that when this option + is used log output of this service does not appear in the regular + journalctl(1) + output, unless the option is used. + + SyslogIdentifier=