From: Lennart Poettering
Date: Wed, 16 Apr 2025 13:32:45 +0000 (+0200)
Subject: man: explain coredump handling in context of containers better
X-Git-Tag: v258-rc1~763
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=80653ba9257552b7a840b8ffa99d8c2da568583d;p=thirdparty%2Fsystemd.git

man: explain coredump handling in context of containers better

We have two different mechanisms, let's discuss them explicitly, comparing their effect and intended usecase.
---

diff --git a/man/coredump.conf.xml b/man/coredump.conf.xml
index df5e7df29ce..c20879eda03 100644
--- a/man/coredump.conf.xml
+++ b/man/coredump.conf.xml
@@ -112,13 +112,30 @@
  EnterNamespace=

- Controls whether systemd-coredump will attempt to use the mount tree of
- a process that crashed in PID namespace. Access to the namespace's mount tree might be necessary to generate
- a fully symbolized backtrace. If set to yes, then systemd-coredump will
- obtain the mount tree from corresponding mount namespace and will try to generate the stack trace using the
- binary and libraries from the mount namespace. Note that the coredump of the namespaced process might
- still be saved in /var/lib/systemd/coredump/ even if EnterNamespace=
- is set to no. Defaults to no.
+ For processes belonging to a PID namespace, controls whether
+ systemd-coredump shall attempt to process core dumps on the host, using debug
+ information from the file system hierarchy (i.e. the mount namespace) of the process that
+ crashed. Access to the process' file system hierarchy might be necessary to generate a fully
+ symbolized backtrace. If set to yes, systemd-coredump will
+ obtain the tree of mounts from the crashing process' mount namespace and will try to generate the stack
+ trace in host context using the debug information of binaries and libraries contained in the crashing
+ process' hierarchy. Defaults to no, i.e. no attempt is made to acquire external
+ debug information from the process' mount namespace, in order to maximize security. This option has
+ no effect on processes that are part of the host's PID namespace.
+
+ Note that the coredump of the namespaced process is still saved in
+ /var/lib/systemd/coredump/ on the host even if
+ EnterNamespace= is set to no (subject to
+ Storage=).
+
+ Note that EnterNamespace= only has an effect if a core dump is generated by
+ a container whose unit does not have CoredumpReceive= enabled.
+
+ Note that it's typically preferable to let containers and other namespace-based sandboxes
+ process their own coredumps, if possible, for best security. This may be enabled on the container's
+ unit via the CoredumpReceive= setting, see
+ systemd.resource-control(5)
+ for details.

diff --git a/man/systemd-coredump.xml b/man/systemd-coredump.xml
index a7862f9c0fd..9972ba02e4d 100644
--- a/man/systemd-coredump.xml
+++ b/man/systemd-coredump.xml
@@ -39,11 +39,11 @@
  stack trace if possible. It may also save the core dump for later processing. See the
  "Information about the crashed process" section below.

- The behavior of a specific program upon reception of a signal is governed by a few
- factors which are described in detail in
- core(5).
- In particular, the core dump will only be processed when the related resource limits are sufficient.
-
+ The behavior of a specific program upon reception of a signal is governed by a few factors which
+ are described in detail in core(5). In
+ particular, the core dump will only be processed when the related process resource limits
+ (RLIMIT_CORE) are sufficient.

  Core dumps can be written to the journal or saved as a file. In both cases, they can be
  retrieved for further processing, for example in
@@ -53,7 +53,7 @@
  By default, systemd-coredump will log the core dump to the journal, including a
  backtrace if possible, and store the core dump (an image of the memory contents of the process) itself in
- an external file in /var/lib/systemd/coredump. These core dumps are deleted after a
+ an external file in /var/lib/systemd/coredump/. These core dumps are deleted after a
  few days by default; see /usr/lib/tmpfiles.d/systemd.conf for details. Note that
  the removal of core files from the file system and the purging of journal entries are independent, and
  the core file may be present without the journal entry, and journal entries may point to since-removed core
@@ -88,6 +88,43 @@
  metadata fields in the same way it does for core dumps received from the kernel. In this mode, no
  core dump is stored in the journal.
+
+
+ Core dumps in Containers/Namespaces
+
+ The systemd-coredump@.service service will automatically attempt to extract
+ a stacktrace from a process as it crashes. For this stacktrace symbols will be resolved based on debug
+ information embedded in the crashing ELF image, or equivalent debug information separately available on
+ the host OS. For processes that crash inside of local containers or other mount namespace-based
+ sandboxes, this auxiliary debug information is typically not available on the host (simply because
+ containers typically run different software versions than the
+ host). systemd-coredump provides two mechanisms to address this:
+
+
+ For full-OS containers running systemd inside it is a good idea to enable
+ CoredumpReceive= on the unit (see
+ systemd.resource-control(5)),
+ which ensures that coredumps of a container are attempted to be forwarded to
+ systemd-coredump@.service running inside the container, i.e the container gets
+ to process and store its own core dumps. Note that
+ systemd-nspawn(8)
+ defaults to this mode if invoked with the  switch. This mode of operation is
+ generally recommended for security reasons: the security-sensitive processing of the core dump is
+ done within the confinements of the container itself, by the container's own code, backed by the
+ container's own storage.
+
+ Alternatively, for more restricted containers (that do not run a proper
+ init system as PID 1) it is possible to enable processing of the core dump on
+ the host, with access to the debug information data from the container itself. This mode of operation
+ must be enabled via EnterNamespace= in
+ coredump.conf(5),
+ and defaults to off, for security reasons.
+
+
+ If both CoredumpReceive= is enabled on the unit of the container the core dump
+ belongs to, and EnterNamespace= is enabled in the coredump.conf
+ configuration file, the former takes precedence.
+
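
Illustration (not part of the commit): the two mechanisms described in the new section could be configured roughly as follows. This is a minimal sketch under stated assumptions. The drop-in file names and the unit name mycontainer.service are hypothetical, and pairing CoredumpReceive= with Delegate=yes reflects my reading of systemd.resource-control(5) rather than anything stated in this patch.

```ini
# Mechanism 1 (generally preferred): forward core dumps into the container.
# Hypothetical drop-in: /etc/systemd/system/mycontainer.service.d/50-coredump.conf
[Service]
CoredumpReceive=yes
# Assumption: cgroup delegation is expected alongside CoredumpReceive=.
Delegate=yes

# Mechanism 2: process core dumps on the host, using debug information
# from the crashing process' mount namespace. Off by default.
# Hypothetical drop-in: /etc/systemd/coredump.conf.d/50-enter-namespace.conf
[Coredump]
EnterNamespace=yes
```

The two fragments belong in separate files. If both are in effect for the same container, the patch text says CoredumpReceive= takes precedence, so EnterNamespace= matters only for containers whose unit leaves CoredumpReceive= disabled.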