From 1e785c50c9257ee13c9dbf57d8045febc195c8bc Mon Sep 17 00:00:00 2001 From: Lennart Poettering Date: Wed, 13 Mar 2024 10:04:42 +0100 Subject: [PATCH] docs: document new sd_notify() extensions --- docs/CONTAINER_INTERFACE.md | 23 +++++--- man/sd_notify.xml | 11 ++-- man/systemd.xml | 104 ++++++++++++++++++++++++++++++++---- 3 files changed, 116 insertions(+), 22 deletions(-) diff --git a/docs/CONTAINER_INTERFACE.md b/docs/CONTAINER_INTERFACE.md index dcecdecc3eb..4f59746ee77 100644 --- a/docs/CONTAINER_INTERFACE.md +++ b/docs/CONTAINER_INTERFACE.md @@ -165,10 +165,15 @@ manager, please consider supporting the following interfaces. issuing `journalctl -m`. The container machine ID can be determined from `/etc/machine-id` in the container. -3. If the container manager wants to cleanly shutdown the container, it might +3. If the container manager wants to cleanly shut down the container, it might be a good idea to send `SIGRTMIN+3` to its init process. systemd will then do a clean shutdown. Note however, that since only systemd understands - `SIGRTMIN+3` like this, this might confuse other init systems. + `SIGRTMIN+3` like this, this might confuse other init systems. A container + manager may implement the `$NOTIFY_SOCKET` protocol mentioned below in which + case it will receive a notification message `X_SYSTEMD_SIGNALS_LEVEL=2` that + indicates if and when these additional signal handlers are installed. If + these signals are sent to the container's PID 1 before this notification + message is sent they might not be handled correctly yet. 4. To support [Socket Activated Containers](https://0pointer.de/blog/projects/socket-activated-containers.html) @@ -190,12 +195,14 @@ manager, please consider supporting the following interfaces. unit they created for their container. That's private property of systemd, and no other code should modify it. -6. 
systemd running inside the container can report when boot-up is complete - using the usual `sd_notify()` protocol that is also used when a service - wants to tell the service manager about readiness. A container manager can - set the `$NOTIFY_SOCKET` environment variable to a suitable socket path to - make use of this functionality. (Also see information about - `/run/host/notify` below.) +6. systemd running inside the container can report when boot-up is complete, + boot progress and functionality as well as various other bits of system + information using the `sd_notify()` protocol that is also used when a + service wants to tell the service manager about readiness. A container + manager can set the `$NOTIFY_SOCKET` environment variable to a suitable + socket path to make use of this functionality. (Also see information about + `/run/host/notify` below, as well as the Readiness Protocol section on + [systemd(1)](https://www.freedesktop.org/software/systemd/man/latest/systemd.html).) ## Networking diff --git a/man/sd_notify.xml b/man/sd_notify.xml index a56d0394686..d8fe6468a29 100644 --- a/man/sd_notify.xml +++ b/man/sd_notify.xml @@ -446,9 +446,14 @@ The notification messages sent by services are interpreted by the service manager. Unknown - assignments may be logged, but are otherwise ignored. Thus, it is not useful to send assignments which - are not in this list. The service manager also sends some messages to its - notification socket, which are then consumed by the machine or container manager. + assignments are ignored. Thus, it is safe (but often without effect) to send assignments which are not + in this list. The protocol is extensible, but care should be taken to ensure private extensions are + recognizable as such. Specifically, it is recommended to prefix them with X_ followed by + some namespace identifier. 
The service manager also sends some messages to its + notification socket, which may then be consumed by a supervising machine or container manager further up the + stack. The service manager sends a number of extension fields, for example + X_SYSTEMD_UNIT_ACTIVE=, for details see + systemd(1). diff --git a/man/systemd.xml b/man/systemd.xml index 960df97f0b3..b66707faba6 100644 --- a/man/systemd.xml +++ b/man/systemd.xml @@ -372,6 +372,14 @@ Signals + The service manager listens to various UNIX process signals that can be used to request various actions + asynchronously. The signal handling is enabled very early during boot, before any further processes are + invoked. However, a supervising container manager or similar that intends to request these operations via + this mechanism must take into consideration that this functionality is not available during the earliest + initialization phase. An sd_notify() notification message carrying the + X_SYSTEMD_SIGNALS_LEVEL=2 field is emitted once the signal handlers are enabled, see + below. This may be used to schedule submission of these signals correctly. + SIGTERM @@ -769,10 +777,11 @@ $NOTIFY_SOCKET - Set by systemd for supervised processes for - status and start-up completion notification. See - sd_notify(3) - for more information. + Set by the service manager for its services for status and readiness notifications. Also + consumed by the service manager for notifying supervising container managers or service managers up the + stack about its own progress. See + sd_notify(3) and the + relevant section below for more information. @@ -1109,7 +1118,7 @@ - System credentials + System Credentials During initialization the service manager will import credentials from various sources into the system's set of credentials, which can then be propagated into services and consumed by @@ -1151,14 +1160,16 @@ vmm.notify_socket Contains a AF_VSOCK or AF_UNIX address where to - send a READY=1 notification datagram when the system has finished booting. 
See - sd_notify(3) for - more information. Note that in case the hypervisor does not support SOCK_DGRAM - over AF_VSOCK, SOCK_SEQPACKET will be tried instead. The - credential payload for AF_VSOCK should be in the form + send a READY=1 notification message when the service manager has completed + booting. See + sd_notify(3) and + the next section for more information. Note that in case the hypervisor does not support + SOCK_DGRAM over AF_VSOCK, + SOCK_SEQPACKET will be tried instead. The credential payload for + AF_VSOCK should be a string in the form vsock:CID:PORT. - This feature is useful for hypervisors/VMMs or other processes on the host to receive a + This feature is useful for machine managers or other processes on the host to receive a notification via VSOCK when a virtual machine has finished booting. @@ -1177,6 +1188,77 @@ + + For a list of system credentials various other components of systemd consume, see + systemd.system-credentials(7). + + + + Readiness Protocol + + The service manager implements a readiness notification protocol both between the manager and its + services (i.e. down the stack), and between the manager and a potential supervisor further up the stack + (the latter could be a machine or container manager, or in case of a per-user service manager the system + service manager instance). The basic protocol (and the suggested API for it) is described in + sd_notify(3). + + The notification socket the service manager (including PID 1) uses for reporting readiness to its + own supervisor is set via the usual $NOTIFY_SOCKET environment variable (see + above). Since this is directly settable only for container managers and for the per-user instance of the + service manager, an additional mechanism to configure this is available, in particular intended for use + in VM environments: the vmm.notify_socket system credential (see above) may be set to + a suitable socket (typically an AF_VSOCK one) via SMBIOS Type 11 vendor strings. For + details see above. 
+ + The notification protocol from the service manager up the stack towards a supervisor supports a + number of extension fields that allow a supervisor to learn about specific properties of the system and + track its boot progress. Specifically, the following fields are sent: + + + An X_SYSTEMD_HOSTNAME=… message will be sent out once the initial + hostname for the system has been determined. Note that during later runtime the hostname might be + changed again programmatically, and (currently) no further notifications are sent out in that case. + + + + An X_SYSTEMD_MACHINE_ID=… message will be sent out once the machine + ID of the system has been determined. See + machine-id(5) for + details. + + + + An X_SYSTEMD_SIGNALS_LEVEL=… message will be sent out once the + service manager has installed the various UNIX process signal handlers described above. The field's value + is an unsigned integer formatted as a decimal string, and indicates the supported UNIX process signal + feature level of the service manager. Currently, only a single feature level is defined: + + + X_SYSTEMD_SIGNALS_LEVEL=2 covers the various UNIX process signals + documented above – which are a superset of those supported by the historical SysV init + system. + + + Signals sent to PID 1 before this message is sent might not be handled correctly yet. A consumer + of these messages should parse the value as an unsigned integer indicating the level of support. For + now only the mentioned level 2 is defined, but later on additional levels might be defined with higher + integers that will implement a superset of the currently defined behaviour. + + + + X_SYSTEMD_UNIT_ACTIVE=… and + X_SYSTEMD_UNIT_INACTIVE=… messages will be sent out for each target unit as it + becomes active or stops being active. This is useful to track boot progress and functionality. For + example, once the ssh-access.target unit is reported as started, SSH access is + typically available, see + systemd.special(7) for + details. 
+ + + + + Note that these extension fields are sent in addition to the regular READY=1 and + RELOADING=1 notifications. -- 2.47.3
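
For reviewers who want to see how a supervisor might consume the extension fields documented in this patch, here is a minimal Python sketch. It is not part of the patch and not systemd code: the names `parse_notify_message` and `BootTracker` are hypothetical, and it assumes each notification arrives as one datagram of newline-separated `KEY=VALUE` assignments, as described in sd_notify(3). The rule that any level of 2 or higher permits sending signals follows the patch's statement that later levels will be supersets of level 2.

```python
from typing import Dict, List, Optional, Set


def parse_notify_message(datagram: bytes) -> Dict[str, List[str]]:
    """Split one notification datagram into KEY -> [values]."""
    fields: Dict[str, List[str]] = {}
    for line in datagram.decode("utf-8", errors="replace").split("\n"):
        key, sep, value = line.partition("=")
        if not sep or not key:
            continue  # skip empty lines and anything that is not KEY=VALUE
        fields.setdefault(key, []).append(value)
    return fields


class BootTracker:
    """Hypothetical supervisor-side state built from X_SYSTEMD_* fields."""

    def __init__(self) -> None:
        self.ready = False
        self.hostname: Optional[str] = None
        self.machine_id: Optional[str] = None
        self.signals_level = 0  # 0 means "not announced yet"
        self.active_units: Set[str] = set()

    def feed(self, datagram: bytes) -> None:
        """Update the tracked state from one received datagram."""
        fields = parse_notify_message(datagram)
        if "1" in fields.get("READY", []):
            self.ready = True
        for value in fields.get("X_SYSTEMD_HOSTNAME", []):
            self.hostname = value
        for value in fields.get("X_SYSTEMD_MACHINE_ID", []):
            self.machine_id = value
        for value in fields.get("X_SYSTEMD_SIGNALS_LEVEL", []):
            # The value is an unsigned decimal integer; ignore anything else.
            if value.isascii() and value.isdigit():
                self.signals_level = int(value)
        for unit in fields.get("X_SYSTEMD_UNIT_ACTIVE", []):
            self.active_units.add(unit)
        for unit in fields.get("X_SYSTEMD_UNIT_INACTIVE", []):
            self.active_units.discard(unit)

    def can_send_signals(self) -> bool:
        """True once a signal feature level of at least 2 was announced."""
        return self.signals_level >= 2
```

A container manager would typically bind an `AF_UNIX` datagram socket, pass its path to the container's PID 1 in `$NOTIFY_SOCKET`, and call `feed()` on every datagram it receives. It would hold back `SIGRTMIN+3` until `can_send_signals()` returns true.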