From f8a32e679eec7173db5f7ccfb63a2e3841ded1e1 Mon Sep 17 00:00:00 2001 From: Lennart Poettering Date: Thu, 16 Feb 2023 17:24:28 +0100 Subject: [PATCH] man: document the new sd_event_add_memory_pressure() API --- man/rules/meson.build | 6 + man/sd_event_add_memory_pressure.xml | 270 +++++++++++++++++++++++++++ 2 files changed, 276 insertions(+) create mode 100644 man/sd_event_add_memory_pressure.xml diff --git a/man/rules/meson.build b/man/rules/meson.build index dcae4442eaa..1279004161b 100644 --- a/man/rules/meson.build +++ b/man/rules/meson.build @@ -555,6 +555,12 @@ manpages = [ 'sd_event_source_set_io_fd', 'sd_event_source_set_io_fd_own'], ''], + ['sd_event_add_memory_pressure', + '3', + ['sd_event_source_set_memory_pressure_period', + 'sd_event_source_set_memory_pressure_type', + 'sd_event_trim_memory'], + ''], ['sd_event_add_signal', '3', ['SD_EVENT_SIGNAL_PROCMASK', diff --git a/man/sd_event_add_memory_pressure.xml b/man/sd_event_add_memory_pressure.xml new file mode 100644 index 00000000000..46cae0d16db --- /dev/null +++ b/man/sd_event_add_memory_pressure.xml @@ -0,0 +1,270 @@ + + + + + + + + sd_event_add_memory_pressure + systemd + + + + sd_event_add_memory_pressure + 3 + + + + sd_event_add_memory_pressure + sd_event_source_set_memory_pressure_type + sd_event_source_set_memory_pressure_period + sd_event_trim_memory + + Add and configure an event source run as result of memory pressure + + + + + #include <systemd/sd-event.h> + + typedef struct sd_event_source sd_event_source; + + + int sd_event_add_memory_pressure + sd_event *event + sd_event_source **ret_source + sd_event_handler_t handler + void *userdata + + + + int sd_event_source_set_memory_pressure_type + sd_event_source *source + const char *type + + + + int sd_event_source_set_memory_pressure_period + sd_event_source *source + uint64_t threshold_usec + uint64_t window_usec + + + + int sd_event_trim_memory + void + + + + + + Description + + sd_event_add_memory_pressure() adds a new event source that is triggered + whenever memory pressure is seen. This functionality is built around the Linux kernel's Pressure Stall Information (PSI) logic. + + Expects an event loop object as first parameter, and returns the allocated event source object in + the second parameter, on success. The handler parameter is a function to call when + memory pressure is seen, or NULL. The handler function will be passed the + userdata pointer, which may be chosen freely by the caller. The handler may return + negative to signal an error (see below), other return values are ignored. If + handler is NULL, a default handler that compacts allocation + caches maintained by libsystemd as well as glibc (via malloc_trim3) + will be used. + + To destroy an event source object use + sd_event_source_unref3, + but note that the event source is only removed from the event loop when all references to the event + source are dropped. To make sure an event source does not fire anymore, even if it is still referenced, + disable the event source using + sd_event_source_set_enabled3 + with SD_EVENT_OFF. + + If the second parameter of sd_event_add_memory_pressure() is + NULL no reference to the event source object is returned. In this case the event + source is considered "floating", and will be destroyed implicitly when the event loop itself is + destroyed. + + The event source will fire according to the following logic: + + + If the + $MEMORY_PRESSURE_WATCH/$MEMORY_PRESSURE_WRITE environment + variables are set at the time the event source is established, it will watch the file, FIFO or AF_UNIX + socket specified via $MEMORY_PRESSURE_WATCH (which must contain an absolute path + name) for POLLPRI (in case it is a regular file) or POLLIN + events (otherwise). After opening the inode, it will write the (decoded) Base64 data provided via + $MEMORY_PRESSURE_WRITE into it before it starts polling on it (the variable may be + unset in which case this is skipped). Typically, if used, $MEMORY_PRESSURE_WATCH + will contain a path such as /proc/pressure/memory or a path to a specific + memory.pressure file in the control group file system + (cgroupfs). + + If these environment variables are not set, the local PSI interface file + memory.pressure of the control group the invoking process is running in is + used. + + If that file does not exist, the system-wide PSI interface file + /proc/pressure/memory is watched instead. + + + Or in other words: preferably any explicit configuration passed in by an invoking service manager + (or similar) is used as notification source, before falling back to local notifications of the service, + and finally to global notifications of the system. + + Well-behaving services and applications are recommended to react to memory pressure events by + executing one or more of the following operations, in order to ensure optimal behaviour even on loaded + and resource-constrained systems: + + + Release allocation caches such as malloc_trim() or similar, both + implemented in the libraries consumed by the program and in private allocation caches of the program + itself. + + Release any other form of in-memory caches that can easily be recovered if + needed (e.g. browser caches). + + Terminate idle worker threads or processes, or similar. + + Even exit entirely from the program if it is idle and can be automatically started when + needed (for example via socket or bus activation). + + + Any of the suggested operations should help easing memory pressure situations and allowing the + system to make progress by reclaiming the memory for other purposes. + + This event source typically fires on memory pressure stalls, i.e. when operational latency above a + configured threshold already has been seen. This should be taken into consideration when discussing + whether later latency to re-aquire any released resources is acceptable: it's usually more important to + think of the latencies that already happened than those coming up in future. + + The sd_event_source_set_memory_pressure_type() and + sd_event_source_set_memory_pressure_period() functions can be used to fine-tune the + PSI parameters for pressure notifications. The former takes either some, + full as second parameter, the latter takes threshold and period times in microseconds + as parameters. For details about these three parameters see the PSI documentation. Note that these two + calls must be invoked immediately after allocating the event source, as they must be configured before + polling begins. Also note that these calls will fail if memory pressure paramterization has been passed + in via the $MEMORY_PRESSURE_WATCH/$MEMORY_PRESSURE_WRITE + environment variables (or in other words: configuration supplied by a service manager wins over internal + settings). + + The sd_event_trim_memory() function releases various internal allocation + caches maintained by libsystemd and then invokes glibc's malloc_trim3. This + makes the operation executed when the handler function parameter of + sd_event_add_memory_pressure is passed as NULL directly + accessible for invocation at any time (see above). This function will log a structured log message at + LOG_DEBUG level (with message ID f9b0be465ad540d0850ad32172d57c21) about the memory + pressure operation. + + + + Return Value + + On success, these functions return 0 or a positive + integer. On failure, they return a negative errno-style error + code. + + + Errors + + Returned errors may indicate the following problems: + + + + -ENOMEM + + Not enough memory to allocate an object. + + + + -EINVAL + + An invalid argument has been passed. + + + + -EHOSTDOWN + + The $MEMORY_PRESSURE_WATCH variable has been set to the literal + string /dev/null, in order to explicitly disable memory pressure + handling. + + + + -EBADMSG + + The $MEMORY_PRESSURE_WATCH variable has been set to an invalid + string, for example a relative rather than an absolute path. + + + + -ENOTTY + + The $MEMORY_PRESSURE_WATCH variable points to a regular file + outside of the procfs or cgroupfs file systems. + + + + -EOPNOTSUPP + + No configuration via $MEMORY_PRESSURE_WATCH has been specified + and the local kernel does not support the PSI interface. + + + + -EBUSY + + This is returned by sd_event_source_set_memory_pressure_type() + and sd_event_source_set_memory_pressure_period() if invoked on event sources + at a time later than immediately after allocting them. + + + + -ESTALE + + The event loop is already terminated. + + + + -ECHILD + + The event loop has been created in a different process. + + + + -EDOM + + The passed event source is not a signal event source. + + + + + + + + + + See Also + + + systemd1, + sd-event3, + sd_event_new3, + sd_event_add_io3, + sd_event_add_time3, + sd_event_add_child3, + sd_event_add_inotify3, + sd_event_add_defer3, + sd_event_source_set_enabled3, + sd_event_source_set_description3, + sd_event_source_set_userdata3, + sd_event_source_set_floating3 + + + + -- 2.47.3