Yu Watanabe [Sat, 16 May 2026 18:20:45 +0000 (03:20 +0900)]
ci/alpine: do not install util-linux-login
For some reason, after util-linux was bumped from 2.41.4-r0 to 2.42-r0,
the 'su' command from util-linux-login no longer seems to run commands correctly in
https://github.com/jirutka/setup-alpine/blob/v1.4.1/alpine.sh
and causes the following spurious failure:
```
2026-05-15T21:19:15.6539432Z ##[group]Set up user runner
2026-05-15T21:19:15.6981963Z /bin/sh: line 0: ��: not found
2026-05-15T21:19:15.6982503Z /bin/sh: line 1: ␡ELF␂␁␁␃: not found
2026-05-15T21:19:15.6985788Z /bin/sh: line 10: ␒␐␆␒B␈␒�␄␒y␄␒�␁␒␞␇␒:␁␒�␃␒�␄␒@␁␒9␈␒?␆␒␚␈␒x: not found
2026-05-15T21:19:15.7010731Z /bin/sh: line 33: can't open ␂␒-␂␒�: no such file
2026-05-15T21:19:15.7016026Z /bin/sh: line 33: syntax error: unexpected word (expecting ")")
2026-05-15T21:19:15.7049583Z
2026-05-15T21:19:15.7050199Z ␛[1;31mError occurred at line 338:␛[0m
2026-05-15T21:19:15.7050830Z 335 | echo 'permit nopass keepenv $SUDO_USER' | tee /etc/doas.d/root.conf
2026-05-15T21:19:15.7051287Z 336 | fi
2026-05-15T21:19:15.7051549Z 337 | SHELL
2026-05-15T21:19:15.7052039Z ␛[1;31m> 338 | abin/"$INPUT_SHELL_NAME" --root /.setup.sh␛[0m
2026-05-15T21:19:15.7052506Z 339 |
2026-05-15T21:19:15.7052796Z 340 | rm .setup.sh
2026-05-15T21:19:15.7053172Z 341 | endgroup
2026-05-15T21:19:15.7096322Z ##[error]Error occurred at line 338: abin/"$INPUT_SHELL_NAME" --root /.setup.sh (see the job log for more information)
2026-05-15T21:19:15.7101400Z ##[error]Process completed with exit code 1.
```
Let's not install the package. It seems no command provided by the
package is used.
test-verbs: dispatch via _dispatch_verb_with_args() directly
Drops the global-optind dependency from the test helper. Verb fixtures
stay inline as static const Verb[] — the section-based VERB() macro
would force unique verb names across the three test cases, which they
deliberately share to exercise overlap.
Co-developed-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Place VERB() declarations above each dispatch function and use
verbs_get_help_table() in help(). run() switches to
dispatch_verb_with_args(); the argv_looks_like_help() shortcut is
preserved since this is an internal tool with no proper option parsing.
Co-developed-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There is no --help implemented, so both verbs don't get help strings.
We should probably add --help + --version, and a proper description
of the program, but I'm leaving that for later.
Co-developed-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Place VERB() declarations above each dispatch function and use
verbs_get_help_table() in help() so the command listing stays in sync.
run() switches to dispatch_verb_with_args(); the argv_looks_like_help()
shortcut is preserved since this is an internal tool with no proper
option parsing.
Co-developed-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Place VERB() declarations directly above each dispatch function and use
verbs_get_help_table() in help() so the command listing stays in sync.
run() switches to dispatch_verb_with_args(); the argv_looks_like_help()
shortcut is preserved since this is an internal tool with no proper
option parsing.
Co-developed-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
storagectl: convert run_as_mount_helper to OPTION macros
This is the util-linux mount-helper interface (mount.storage), so all
options stay hidden via help=NULL — they are not user-facing. The
namespace "mount.storage" is distinct from the storagectl namespace
used for the user-facing CLI.
Co-developed-by: Claude Opus 4.7 <noreply@anthropic.com>
This bothered me for a while, but I didn't think too much about it and just
copied the existing usage pattern. But it really doesn't make sense. We expect
the compiler to align the section properly. But if it didn't align it, applying
alignment after the fact would just cause our pointer to point to the middle
of the structure. That'd be even worse than a misaligned pointer.
Similarly, when doing pointer arithmetic, p++ should really result in a value
with the appropriate alignment. This is the basic principle of C pointer
addition. So we really shouldn't try to adjust the pointer ourselves. At most,
we can assert that it is indeed aligned in tests.
Yu Watanabe [Sat, 16 May 2026 15:33:43 +0000 (00:33 +0900)]
sd-dhcp-client: use new message parser (#42123)
In 26b7c5ff3b944aa3a16d4e859e9c84ce7e968a5a, we introduced a new parser
for received DHCP message, but it was not used at that time. This PR
replaces the legacy parser with the new one, and makes the fuzzer also
use the new parser.
For the shell verb we want switches specified after the program name to
be passed to the program to execute, not processed by us. Mirror the
approach in 'userdbctl ssh-authorized-keys': start with
OPTION_PARSER_RETURN_POSITIONAL_ARGS, then later switch to
STOP_AT_FIRST_NONOPTION for "shell" or NORMAL otherwise.
VERB declarations are placed directly above each function; functions
that dispatch multiple verb names get stacked VERB() declarations.
chainload_importctl() now takes the args strv instead of relying on the
global optind.
--help output is mostly the same.
--no-pager/--no-legend/--no-ask-password/-q/--quiet are now at the end.
bind-volume/unbind-volume are documented.
Also, if the fuzzing engine provides a valid message, then try to build
json variant and UDP payload from the parsed message. We will drop
dhcp_lease_save() and dhcp_lease_load(), hence the tests for them are
dropped.
Currently translated at 100.0% (266 of 266 strings)
Co-authored-by: Fco. Javier F. Serrador <fserrador@gmail.com>
Translate-URL: https://translate.fedoraproject.org/projects/systemd/main/es/
Translation: systemd/main
Yu Watanabe [Tue, 31 Mar 2026 22:56:09 +0000 (07:56 +0900)]
networkctl: load information about DHCP client from varlink
As of the previous commit, networkd exposes the received DHCP message
in the Describe() DBus/Varlink method. Let's make networkctl deserialize
the DHCP message and use it where necessary.
This internally uses sd_dhcp_message object, and replaces functions
for creating and sending DHCP messages.
By using sd_dhcp_message internally, we can now correctly send long
(> 255 bytes) option data that cannot fit in a single DHCP option TLV.
This also fixes the value in DHCP option 57 (Maximum Message Size).
Previously, the IP and UDP header sizes were subtracted from the interface
MTU, but they should not be.
Except for the above, this should not change any effective behaviors.
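To illustrate the long-option point above, here is a minimal sketch of how option data longer than 255 bytes can be split across several TLVs that carry the same option code, in the style of RFC 3396. It is written in Python for brevity and is not the sd_dhcp_message implementation; encode_option() and MAX_TLV_DATA are hypothetical names.

```python
MAX_TLV_DATA = 255  # the length field of a DHCP option is a single byte


def encode_option(code: int, data: bytes) -> bytes:
    """Encode one option, splitting long data across TLVs with the same code."""
    if not 1 <= code <= 254:
        raise ValueError("option codes 0 (pad) and 255 (end) carry no data")
    if not data:
        return bytes([code, 0])
    out = bytearray()
    for start in range(0, len(data), MAX_TLV_DATA):
        chunk = data[start:start + MAX_TLV_DATA]
        out += bytes([code, len(chunk)])  # code and length of this fragment
        out += chunk
    return bytes(out)


# 300 bytes of data become two TLVs: one with 255 bytes, one with 45.
encoded = encode_option(43, b"x" * 300)
```

A receiver that follows the same scheme concatenates the fragments with equal option codes, in order, before interpreting the value.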
Luca Boccassi [Fri, 15 May 2026 17:19:41 +0000 (18:19 +0100)]
test-network: retry networkctl status in wait_operstate()
networkctl status may transiently fail right after start_networkd() because networkd has not yet picked up the freshly-created link from the kernel. The retry loop in wait_operstate() did not catch the resulting subprocess.CalledProcessError, so the test aborted on the first attempt instead of retrying for the configured timeout.
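The retry pattern described above can be sketched roughly as follows. This is not the actual wait_operstate() from the test suite; wait_for_output() and its parameters are hypothetical, and the sketch only assumes that a failing command should count as "not ready yet" until the timeout expires.

```python
import subprocess
import time


def wait_for_output(cmd, needle, timeout=5.0, interval=0.1):
    """Poll cmd until its stdout contains needle or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            output = subprocess.check_output(
                cmd, text=True, stderr=subprocess.DEVNULL)
        except subprocess.CalledProcessError:
            # Transient failure: treat it as "not ready yet" and retry.
            output = ""
        if needle in output:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Catching the exception inside the loop is the important part: without it, the first non-zero exit status aborts the wait instead of consuming the configured timeout.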
Observed in TEST-85-NETWORK-NetworkdBridgeTests, subtest test_bridge_configure_without_carrier[no-slave]:
Daan De Meyer [Fri, 15 May 2026 18:51:30 +0000 (18:51 +0000)]
meson: drop vestigial libgpg-error dependency
libgpg-error was added in 2017 (commit 76c8741060, Michael Biebl) to
gate HAVE_GCRYPT on its presence because src/resolve referenced
libgpg-error directly at the time. That usage is long gone — no source
file references any gpg-error API today — so the dependency only served
to fail HAVE_GCRYPT detection when gpg-error-dev wasn't installed.
libgcrypt's pkg-config Requires already pulls in the gpg-error headers
(via the transitive #include <gpg-error.h> in <gcrypt.h>), so dropping
the dep doesn't break compilation.
machinectl: reorder verb functions to match --help
The net diff is negative because some spurious whitespace and forward
declarations were dropped. One new forward declaration was added. (For
verb_poweroff_machine. The func could be moved, but I think it's better
to keep it adjacent to verb_reboot_machine which is very similar.)
Daan De Meyer [Fri, 15 May 2026 19:19:15 +0000 (21:19 +0200)]
nsresourced: detect and clean up registry entries for dead user namespaces (#42070)
The BPF kprobe that fires on user namespace destruction is the only thing
that triggers registry cleanup, so any time it doesn't run — ring buffer
overflow, kprobe missing, fdstore entry dropped outside our cleanup path
— a registry entry is left behind forever.
Stamp each registry entry with the kernel's unique namespace identifier
(NS_GET_ID, kernel ≥ 6.13) at allocation time. At manager startup, after
the existing fdstore→registry sweep, walk the registry and ask the kernel
to look each namespace up by id via open_by_handle_at() on nsfs; if the
lookup returns -ESTALE the namespace is gone and we release the entry.
Old entries written before this change carry no identifier and are left
alone.
Add a namespace_open_by_id() helper for the lookup. The kernel restricts
open_by_handle_at() on nsfs to processes in the initial user namespace,
collapsing both permission denials and dead namespaces onto -ESTALE; the
helper refuses early with -EPERM outside the initial user namespace
so callers can tell the two apart.
Daan De Meyer [Wed, 13 May 2026 10:54:02 +0000 (12:54 +0200)]
nsresourced: detect and clean up registry entries for dead user namespaces
The BPF kprobe that fires on user namespace destruction is the only thing
that triggers registry cleanup, so any time it doesn't run — ring buffer
overflow, kprobe missing, fdstore entry dropped outside our cleanup path
— a registry entry is left behind forever.
Stamp each registry entry with the kernel's unique namespace identifier
(NS_GET_ID, kernel ≥ 6.13) at allocation time. At manager startup, after
the existing fdstore→registry sweep, walk the registry and ask the kernel
to look each namespace up by id via open_by_handle_at() on nsfs; if the
lookup returns -ESTALE the namespace is gone and we release the entry.
Old entries written before this change carry no identifier and are left
alone.
Add a namespace_open_by_id() helper for the lookup. The kernel restricts
open_by_handle_at() on nsfs to processes in the initial user namespace,
collapsing both permission denials and dead namespaces onto -ESTALE; the
helper refuses early with -EHOSTDOWN outside the initial user namespace
so callers can tell the two apart.
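The startup sweep can be pictured with the following sketch. It models only the decision logic in Python and is not the nsresourced code: sweep_registry() is a hypothetical name, the registry is a plain dict, and open_by_id is a stand-in callback for the namespace lookup that raises OSError with ESTALE when the namespace is gone.

```python
import errno


def sweep_registry(registry, open_by_id):
    """Release entries whose namespace the lookup reports as gone (ESTALE)."""
    released = []
    for name, ns_id in list(registry.items()):
        if ns_id is None:
            continue  # entry written before ids were recorded: leave alone
        try:
            open_by_id(ns_id)
        except OSError as e:
            if e.errno != errno.ESTALE:
                raise  # any other error is not proof that the namespace died
            del registry[name]
            released.append(name)
    return released
```

Only ESTALE leads to a release here, which is why it matters that the helper reports "cannot perform the lookup from this namespace" with a different error code.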
Rewrite help() with help-util.h primitives + option_parser_get_help_table_group
for each User Record Properties section. The verbs[] table stays
unchanged for now; run() switches from dispatch_verb() (which depended
on the global optind) to _dispatch_verb_with_args() fed by
option_parser_get_args().
Explanations are improved for --birth-date[=DATE] (correct placement of
'['), --skel=, --shell= (short options listed). Some minor rewordings
for other options. The explanation for -E and -EE is split.
(OPTION_HELP_ENTRY_VERBATIM is used for -EE.)
Co-developed-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
homectl: reorder verb functions to match order in --help
Just a hand-crafted moving of blocks of code up and down, no other
changes. The net diff is -2 because add_signing_keys_from_credentials
forward declaration was dropped.
Luca Boccassi [Fri, 15 May 2026 15:07:48 +0000 (16:07 +0100)]
core: support FD Store propagation through manager instances, and preservation through kexec via LUO (#41683)
First of all, FD store propagation is enabled between manager instances,
and nspawn instances, via LISTEN_FDS and sd_notify. These can be nested
arbitrarily deeply all the way up to the system manager, and on restart
will be propagated all the way down to the origin. The FDs payload of an
nspawn container running as a user unit will be preserved, all the way
up to the system manager, and then down again.
The kernel Live Update Orchestrator (LUO) exposes /dev/liveupdate, which
lets userspace hand a set of "preservable" kernel objects to the new
kernel across a kexec-based reboot. For now it only supports memfds,
with more object types (virtio devices, etc.) expected to be added
later.
This is a natural fit for systemd's FD Store feature: services hand
memfds (containing serialized state or other service data) to PID 1 via
FDSTORE=1 sd_notify() messages, and get them back on their next start.
Today this works across service restarts, soft reboots and initrd→rootfs
transitions. With LUO, this series extends the same mechanism to work
across kexec too. The nested preservation of FD stores is thus extended
across kexec as well.
All preservable fds are collected into a single LUO session named
"systemd". Each fd is uploaded with an index (token). Token 0 is
reserved for a "mapping" memfd, which carries a JSON object describing
how to dispatch the other tokens back to units on the next boot.
Unit names are used as the unit identifier, as they are stable across
daemon-reexec, switch-root and kexec. token refers to the LUO index
assigned to the object in the session.
On shutdown for MANAGER_KEXEC, just before manager_free(), systemd walks
all services and serializes their persistent fd store contents (fds +
FDNAMEs + cgroup paths) into a JSON memfd. The fds themselves are
gathered into an FDSet. The fdset and the serialization memfd are passed
to systemd-shutdown via the SYSTEMD_LUO_SERIALIZE_FD environment
variable providing the fd number, so the actual LUO session creation and
ioctls happen as the very last step before kexec.
On boot, manager_luo_restore_fd_stores() opens /dev/liveupdate, tries to
retrieve the "systemd" session, reads the mapping memfd, then for each
entry retrieves the fd from the session and attempts to attach it to the
matching unit's fd store.
Because the initrd-stage PID 1 runs before the real rootfs units are
loaded, fds whose target unit is not (yet) known are not dropped: they
are stashed in a new luo_held_fds hashmap keyed by cgroup path. They are
re-tried in two places: after deserialization, and from unit_load(), so
fds land in the correct fd store as soon as the owning unit is parsed,
allowing units to be plugged in at runtime.
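The token mapping and the "hold until the unit is known" behaviour described above can be sketched as follows. This is an illustration in Python, not the PID 1 code: the JSON field names ("unit", "fdname", "token") and all function names are hypothetical, and the held entries are keyed by unit name for simplicity, whereas the text above describes a hashmap keyed by cgroup path.

```python
import json

MAPPING_TOKEN = 0  # reserved for the mapping object itself


def build_mapping(fd_stores):
    """Assign tokens 1..N to stored fds and describe them as JSON."""
    entries = []
    token = MAPPING_TOKEN + 1
    for unit in sorted(fd_stores):
        for fdname in fd_stores[unit]:
            entries.append({"unit": unit, "fdname": fdname, "token": token})
            token += 1
    return json.dumps(entries)


def restore(mapping_json, loaded_units, held):
    """Attach entries for known units; stash the rest in held for a retry."""
    attached = []
    for entry in json.loads(mapping_json):
        if entry["unit"] in loaded_units:
            attached.append(entry)
        else:
            held.setdefault(entry["unit"], []).append(entry)
    return attached


def retry_held(unit, held):
    """Called once a unit becomes known; returns its previously held entries."""
    return held.pop(unit, [])
```

In this model, retry_held() would be invoked from the two retry points named above: after deserialization and when a unit is loaded.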
Non-kexec shutdown paths are unaffected: if MANAGER_KEXEC is not the
final objective, no serialization file is produced and no LUO session is
ever created. Likewise if /dev/liveupdate does not exist, nothing
happens.
The LUO session creation is performed by systemd-shutdown, rather than
by PID 1, deliberately: it is the last point where we can be sure all
other processes have already been killed, so nothing else can race us
into creating (or worse, hijacking) the "systemd" session.
/dev/liveupdate is a singleton and session names are global. In
addition, any kernel-visible side effects of preserving objects (memory
pinning, etc.) are delayed until the absolute last moment, minimizing
the window in which they could affect the running system. There is no
behaviour change for shutdown paths other than kexec, or for kexec when
systemd didn't hand over a serialization fd (e.g. because no service had
any fds stored, or because LUO wasn't supported at serialization time).
Finally, since LUO sessions cannot be nested under other sessions,
third-party sessions need to be handled explicitly and held open in the
shutdown binary alongside our own internal session, to allow services to
create and preserve their own sessions. The requirement comes from VMMs
that wish to preserve VM state across kexec: some file descriptors (e.g.
KVM's vmfd from the KVM_CREATE_VM ioctl) cannot be transferred between
processes via SCM_RIGHTS, so they cannot be stashed in the FD Store
directly. Additionally, some file descriptors must be handled
all-or-nothing, again tied to KVM, where a VM and its associated devices
are one indivisible group.
LUO: add support for preserving third party sessions
LUO sessions cannot be nested under other sessions. This means we need
to handle them explicitly, and hold them open in the shutdown binary
like we do with our own internal session, to allow services to create
their own.
The requirement to support third party sessions comes from VMMs that
wish to preserve VM state across kexec, as some file descriptors
(KVM's vmfd from the KVM_CREATE_VM ioctl) cannot be transferred between
processes via SCM_RIGHTS, so they cannot be stashed in the FD Store
directly. Also, some file descriptors have to be handled all together or
not at all, again because of KVM, where a VM and its devices form one
group.
Luca Boccassi [Mon, 30 Mar 2026 23:29:19 +0000 (00:29 +0100)]
shutdown: prepare LUO session for FD Stores before kexec
Wires up the systemd-shutdown side of the kexec-via-LUO fd store preservation.
When rebooting via kexec, systemd builds a JSON description of the fd
stores of all loaded services and passes it to systemd-shutdown through
the SYSTEMD_LUO_SERIALIZE_FD environment variable. The FDs themselves
come in as part of the normal shutdown FDSet. systemd-shutdown's job is
then, at the very last moment before invoking the kexec syscall, to
move that state into a kernel LUO session so it survives the reboot.
Doing the LUO session creation here, rather than in PID 1, is
deliberate:
* It's the last point where we can be sure all other processes have
already been killed, so nothing else can race us into creating (or
worse, hijacking) the "systemd" session, as /dev/liveupdate is a
singleton and a session name is global.
* Any kernel-visible side effects of preserving objects (memory
pinning etc.) are delayed until the absolute last moment, minimizing
the window in which they could affect the running system
No behaviour change for shutdown paths other than kexec, or for kexec
when systemd didn't hand over a serialization fd (e.g. because no
service had any fds stored, or because LUO wasn't supported at
serialization time).
Luca Boccassi [Fri, 1 May 2026 13:25:11 +0000 (14:25 +0100)]
core: support FD Store preservation through kexec via LUO
The kernel Live Update Orchestrator (LUO) exposes /dev/liveupdate, which
allows userspace to hand a set of "preservable" kernel objects to the
new kernel across a kexec-based reboot. For now it only supports memfds,
with more object types (virtio devices, etc.) expected to be added later.
This is a natural fit for systemd's FD Store feature: services hand
memfds (containing serialized state or other service data) to PID 1 via
FDSTORE=1 sd_notify() messages, and get them back on their next start.
Today this works across service restarts, soft reboots and
initrd→rootfs transitions. With LUO we can extend the same mechanism to
work across kexec, too.
The protocol on the PID 1 side works roughly as follows:
* All preservable fds are collected into a single LUO session named
"systemd". Each FD gets uploaded with a token. Token 0 in that session
is reserved for a "mapping" memfd, which carries a JSON object
describing how to dispatch the other tokens back to units on the next
boot:
unit IDs are used as the unit identifier, as they're stable
across daemon-reexec, switch-root and kexec. token refers to the
LUO token assigned to the object in the session.
* On shutdown for MANAGER_KEXEC, just before manager_free(), systemd
walks all services and serializes their persistent fd store contents
(fds + FDNAMEs + unit IDs) into a JSON memfd. The FDs themselves are
gathered into an FDSet to be kept around. The fdset and the
serialization memfd are passed to systemd-shutdown via the
SYSTEMD_LUO_SERIALIZE_FD environment variable providing the fd number,
so the actual LUO session creation and ioctls can happen as the very
last step before kexec (shutdown implementation is the next commit).
* On boot, manager_luo_restore_fd_stores() opens /dev/liveupdate,
tries to retrieve the "systemd" session, reads the mapping memfd,
then for each entry retrieves the fd from the session and attempts
to attach it to the matching unit's fd store.
* The FDs are injected in the appropriate unit's FD stores using the
same mechanism as the LISTEN_FDS propagation that was set up earlier.
Non-kexec shutdown paths are unaffected: if MANAGER_KEXEC is not the
final objective, no serialization file is produced and no LUO session
is ever created. Likewise if /dev/liveupdate does not exist, nothing
happens.
Luca Boccassi [Fri, 1 May 2026 13:06:11 +0000 (14:06 +0100)]
nspawn: support forwarding FDs from payloads to managers
When there is a NOTIFY_SOCKET, and FDs are received from the
payload following the FD Store protocol, forward them up the
chain to the service manager that is managing nspawn.
This allows FD Store persistence across container restarts,
and can chain up for user managers as well to survive restarting
those, or reexecs, and in the future reboots too via LUO.
Add a new test case to exercise the PID1 -> user session -> nspawn -> payload
chain.
Luca Boccassi [Fri, 1 May 2026 13:19:33 +0000 (14:19 +0100)]
core: propagate FDs from store from user to system manager
In order to allow FD Stores of user units to survive a user
session restart, propagate FDs received via the protocol up one
level from user to system manager via sd_notify.
And the other way around, propagate them down via LISTEN_FDS
tagging them with the unit name so that the child manager can
inject them in the appropriate unit.
Ensure units that are dead or not loaded can get FDs added to
their stores, and that they are correctly propagated once the
unit is started or loaded. When the unit is not loaded we don't
know what the FD max limit is, so simply increase it for each FD
injected, and then when the unit is realised prune it down to
match the unit's now available config in case the limit is lower
than the number of FDs in the store.
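The limit handling for not-yet-loaded units can be modelled with a small sketch. This is a toy model in Python, not the manager code: the FdStore class and its methods are hypothetical, and which fds get pruned when the configured limit turns out to be lower is an assumption here (the most recently injected ones are dropped).

```python
class FdStore:
    """Toy model of a unit's fd store before and after the unit is loaded."""

    def __init__(self):
        self.fds = []
        self.limit = 0
        self.loaded = False

    def inject(self, fd):
        if not self.loaded:
            self.limit += 1  # real limit unknown yet: make room for every fd
        elif len(self.fds) >= self.limit:
            return False  # unit is loaded and its store is full
        self.fds.append(fd)
        return True

    def realize(self, configured_limit):
        """Apply the configured limit and return the fds that no longer fit."""
        self.loaded = True
        self.limit = configured_limit
        pruned = self.fds[configured_limit:]  # assumption: drop the newest
        del self.fds[configured_limit:]
        return pruned
```

For example, injecting three fds into an unloaded store and then realizing it with a limit of two leaves two fds in the store and returns the third for closing.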
Each FD sent up or down is assigned a monotonic index, and the manager
also sends a JSON map that associates the index with the original
unit and FDNAME:
Yu Watanabe [Sun, 22 Feb 2026 16:51:26 +0000 (01:51 +0900)]
udev/node: drop support of old file format in /run/udev/links/
The new file format in /run/udev/links/ has been introduced in 377a83f0d80376456d9be203796f66f543a8b943 (v250, released on 2021-12-23).
Let's drop the old format support, to simplify the logic.
Yu Watanabe [Sun, 22 Feb 2026 16:28:41 +0000 (01:28 +0900)]
udev/watch: use mapping from device ID -> watch on restart
The mapping from device ID to watch handle has been introduced by e7f781e473f5119bf9246208a6de9f6b76a39c5d (v249, released on 2021-07-07).
Let's drop the runtime upgradability of udevd from an ancient version.
core: add WorkingDirectory, Environment and SetCredential{,Encrypted} to io.systemd.Unit.StartTransient (#41874)
This PR adds some more properties to the io.systemd.Unit.StartTransient
varlink interface: WorkingDirectory, Environment and
SetCredential{,Encrypted}. Its also hopefully a useful starting point to
establish a pattern to add even more.
manager: skip reopening of console and signal reset when running as normal program
We want to reopen the console used for logging when running as PID1, but
also when running a user manager (c.f. 48a601fe5de8aa0d89ba6dadde168769fa7ce992
and 2a646b1d624e510a79785e1268b55a9c3a441db5). But this can cause
problems when the binary is invoked directly, e.g. to print --help.
E.g. if we ignore SIGPIPE, we'd remain running briefly after
'/usr/lib/systemd/systemd --help | head -n1'.
Previously, the getopt machinery would print to stderr unconditionally.
But after the rework of option parsing, we use the log_* functions to
report errors, and the test that checks if we print errors to stderr
started failing.
So let's skip some more of the setup if !invoked_by_systemd().
It'd be nice if we could not repeat the information about the option
list a second time. But I don't see a nice way to do this, since
(by design) with the macro approach, the macros must be intertwined
with the parse_argv() code. But that code in turn refers to a bunch
of variables, so lifting out the function is not immediately possible.
So I think it's best to keep the existing approach where we provide
a list of options, without additional context, and skip them using
a custom routine.
099663ff8c117303af369a4d412dafed0c5614c2 added "support" for
-b/-s/-z ARG with a comment of
> /* Just to eat away the sysvinit kernel cmdline args without getopt()
> * error messages that we'll parse in parse_proc_cmdline_word() or
> * ignore. */
And for PID1 that was valid. But when not running as PID1, those
options would be parsed as valid but then help() would immediately
return -EINVAL:
$ build-old/systemd -b; echo $?
1
At the same time, when running as PID1, if we encounter an error,
we shouldn't opine about the rest of the command line. So continuing
with the loop and the checks after the loop were iffy.
Later, cd57038a30aa9447bde3af7111ac8dc517b38bbf made a big refactoring,
and the 'break' (i.e. continuation of the loop) was changed to 'return 0',
making things even more confusing, since now we'd just silently stop in
the middle of the command line if -b/-s/-z were encountered.
So be more careful: when running as PID1, stop parsing on error
and return from the function. We didn't parse the full command line,
so the later checks are not useful. Silently ignore -b/-s/-z.
When not running as PID1, explicitly say that -b/-s/-z are not
supported, and propagate the error if parsing fails, e.g. with
an unknown option.
parse_argv() uses FOREACH_OPTION (not _OR_RETURN) so we can preserve the
existing PID 1 tolerance: an unknown option, or one of the sysvinit-style
-b/-s/-z catch-alls, returns 0 instead of -EINVAL when getpid_cached() == 1.
The docs documented --crash-vt=, but the code implemented "crash-chvt",
as introduced in b9e74c399458a1146894ce371e7d85c60658110c.
The output from --help is now modified to match code.
getopt-defs.h is intentionally left in place since
proc_cmdline_filter_pid1_args() in src/basic/proc-cmdline.c still uses
its COMMON_/SYSTEMD_/SHUTDOWN_GETOPT_* macros to walk the kernel
command line.
Previously, opterr=0 was used to suppress error messages about option
parsing in PID1. They are now logged at debug level (if debug logging
is enabled.)
Co-developed-by: Claude Opus 4.7 <noreply@anthropic.com>