imds-generator: replace static Condition=initrd by a check in the generator
After looking at the unit, I'm not sure if systemd-imds-import.service
is supposed to run in the host system or not. But if it is supposed to
only run in the initrd, then the static condition in the unit file gives
us the worst behaviour: the generator does not do any checks if we are
in the initrd or not, and if it enabled the unit, it'll influence the
transaction ordering (possibly causing loops or additional work) and
then the unit will be unconditionally skipped. So replace the static
condition by a check in the generator. If the user specifies
systemd.imds.import on the commandline, it'll be honoured also in the
host.
units: drop After=network-online.target from imds services
The imds services are (very-) early boot services, ordered before
sysinit.target. OTOH, network-online.target is something that is often
established very late [1]. In fact, we always recommended to _not_
depend on it for things that are ordered before the usual targets that
allow user logins (e.g. graphical.target). So we certainly cannot add
early-boot ordering for it. Currently systemd-imds-import.service
is conditionalized to only run in the initrd. But if the unit is enabled
in the host system, it would affect the transaction there, so we should
drop this bogus ordering everywhere.
(I think this might have been added by mistake. After all,
systemd-imds-import.service itself shouldn't care about the network.)
This fixes boot ordering issues reported upstream and in Fedora [2, 3].
Luca Boccassi [Tue, 26 May 2026 00:06:40 +0000 (01:06 +0100)]
test: skip TEST-55-OOMD entirely if stress-ng is broken on this host
This reverts commit a17efef137 ("test: try to detect SIGILL in stress-ng
and skip TEST-55-OOMD gracefully") and replaces it with a single
check to skip the test cases.
The previous check was not reliable as stress-ng can catch SIGILL
itself and exit with an error:
stress-ng[1068]: stress-ng: debug: [1068] caught SIGILL, address \
0x00005632f8330140 (ILL_ILLOPN)
stress-ng[1068]: stress-ng: debug: [1068] stress-ng: info: \
0x00005632f8330140:<62>71 fd 48 6f 2d 36 14 1c 00 c5 d1 ef ed 49 29
...
stress-ng[1053]: stress-ng: error: [1053] vm: [1061] terminated \
with an error, exit status=2 (stressor failed)
...
systemd[1]: TEST-55-OOMD-slowrule.service: Main process exited, \
code=exited, status=2/INVALIDARGUMENT
systemd[1]: TEST-55-OOMD-slowrule.service: Failed with result \
'exit-code'.
Try to detect at the beginning of the test and skip the test case if it
happens.
Mikhail Nogin [Mon, 25 May 2026 14:07:57 +0000 (17:07 +0300)]
dissect-image: reject empty partitions array
di is allocated while iterating over p.partitions. If the array is empty,
the loop is skipped and di remains NULL, but the code still assigns image
metadata through it.
Return EBADMSG before accessing di in this case.
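A minimal sketch of the guard, assuming a simplified stand-in for the real
code: the Image type and image_from_partitions() are hypothetical names used
only for illustration, not the actual dissect-image structures.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the image object; not the real structure. */
typedef struct Image {
        size_t n_partitions;
} Image;

/* Returns 0 and a newly allocated image in *ret, or a negative errno. */
int image_from_partitions(const int *partitions, size_t n, Image **ret) {
        Image *di = NULL;

        assert(partitions || n == 0);
        assert(ret);

        for (size_t i = 0; i < n; i++) {
                /* The object is allocated lazily, on the first iteration. */
                if (!di) {
                        di = calloc(1, sizeof(Image));
                        if (!di)
                                return -ENOMEM;
                }
                (void) partitions[i];
                di->n_partitions++;
        }

        /* With an empty array the loop never ran, so di is still NULL. */
        if (!di)
                return -EBADMSG;

        *ret = di;
        return 0;
}
```

The check sits between the loop and the first access through di, so no
later assignment can dereference NULL.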
Found by Linux Verification Center (linuxtesting.org) with Svace.
Yu Watanabe [Mon, 25 May 2026 18:43:47 +0000 (03:43 +0900)]
login: split out common varlink field definitions
The only effective change is that the Remote field is now nullable on
ListSession, even though logind always provides the field.
But that should not matter much.
Currently translated at 100.0% (270 of 270 strings)
Co-authored-by: Fco. Javier F. Serrador <fserrador@gmail.com>
Translate-URL: https://translate.fedoraproject.org/projects/systemd/main/es/
Translation: systemd/main
Mikhail Nogin [Mon, 25 May 2026 13:33:43 +0000 (16:33 +0300)]
sysusers: log new group name on write failure
When adding new groups, putgrent_with_members() is called with the newly
constructed group n, but the error path logs gr->gr_name. gr belongs to the
earlier loop over the existing group file and may be NULL after EOF.
Log n.gr_name instead.
Found by Linux Verification Center (linuxtesting.org) with Svace.
Luca Boccassi [Mon, 25 May 2026 15:54:48 +0000 (16:54 +0100)]
test: also check if networkd built with BTF support before BPF test
SUSE does not provide a vmlinux.h so the package is not built with
CO-RE support, hence the test fails. This was previously masked
by the fact that python3-packaging was never installed, so the
test always skipped everywhere as it could not detect the kernel
version.
Luca Boccassi [Mon, 25 May 2026 13:38:55 +0000 (14:38 +0100)]
test: avoid picking volatile user-session units in varlinkctl-unit test
Each <type>_id is picked from a Unit.List '{}' call and immediately
queried via a second Unit.List '{"name":"<id>"}' call. When the picked
unit is tied to a user session (run-user-N.mount, user@N.service,
user-runtime-dir@N.service, session-cN.scope) it may be torn down
between the two calls if the user manager happens to exit at that
moment, causing a spurious NoSuchUnit failure.
Filter those out of the candidate lists for mount/service/scope.
Excerpt from a failed CI run showing the race:
Luca Boccassi [Mon, 25 May 2026 12:25:15 +0000 (13:25 +0100)]
test: try to detect SIGILL in stress-ng and skip TEST-55-OOMD gracefully (#42290)
The attempt to fix SIGILL in TEST-55-OOMD actually made the test more
flaky, as the chosen stress-ng method generates less pressure, so it
often fails.
Try a different approach: try to detect that SIGILL caused the unit
calling stress-ng to fail, and skip gracefully in that case.
tools/test-crash-trace: do not use fixed signal numbers
The numbers of signals vary by arch. On the common arches the signals
listed here all use the same numbers, but people are likely to use
this on more fringe architectures too, so let's use symbolic names
instead.
Also the comment about gdb "hitting the same kill" didn't make sense.
The syntax is a bit baroque, but using a helper variable does not work.
Also shellcheck complains about $[ ] which would have made this more
legible.
Luca Boccassi [Mon, 25 May 2026 11:11:23 +0000 (12:11 +0100)]
test: try to detect SIGILL in stress-ng and skip TEST-55-OOMD gracefully
The attempt to fix SIGILL in TEST-55-OOMD actually made the test more
flaky, as the chosen stress-ng method generates less pressure, so it
often fails.
Try a different approach: try to detect that SIGILL caused the unit
calling stress-ng to fail, and skip gracefully in that case.
This allows multi-call binaries to be easily invoked with a different
name. After installation, the name is set by creating a symlink. But in
build directories, we don't create the symlinks. (There are also other
ways to achieve the same thing, e.g. zsh supports $ARGV0, and exec -a
can be used, but those are either non-portable or are more complicated
to use.) The primary use-case for me is to test --help output for
multicall binaries.
Also reorder the help for env vars to group the more generic ones near
the top.
This was initially proposed in https://github.com/systemd/systemd/pull/24054,
but there were some comments about the implementation. I had a branch
with the patch, but I don't think I ever actually submitted it as a
pull request.
Luca Boccassi [Sun, 24 May 2026 18:11:39 +0000 (19:11 +0100)]
nspawn: downgrade log level when unregistering and machine is already gone
nspawn started logging at notice level when exiting a container:
root@sid-arm64:~# exit
logout
Container sid-arm64 exited successfully.
Failed to unregister machine in system context, ignoring: No such device or address
Looks like it's racing with machined which already noticed it was
gone and removed it. Downgrade to debug on ENXIO.
Luca Boccassi [Sun, 24 May 2026 11:01:24 +0000 (12:01 +0100)]
shared/options: fix crash by aligning struct to sizeof(void*)
On some architectures like m68k, alignof(void*) is 2, not sizeof(void*)
(which is 4). So the natural alignment of struct Option is 2 and
sizeof(Option) == 26.
However, each variable placed in the SYSTEMD_OPTIONS ELF section via
_OPTION() carries _alignptr_ (= __attribute__((aligned(sizeof(void*))))),
which forces each entry to start at a 4-byte boundary. The linker
therefore inserts 2 bytes of padding between adjacent entries, producing
an actual stride of 28 in the section.
option_parse() iterates over the section with pointer arithmetic
("opt++"), which advances by sizeof(Option) == 26 and lands inside the
padding. The fields read back as zero, and since commit cf88903637
("tree-wide: get rid of most uses of ALIGN_PTR") added the ordering
assertion below, the resulting "0 < 0" trips it when running --help:
Assertion 'opt->id < (opt + 1)->id' failed at src/shared/options.c:116, function option_parse(). Aborting.
#0 0xc0a7b248 in ?? () from /usr/lib/m68k-linux-gnu/libc.so.6
#1 0xc0a7b2ce in pthread_kill () from /usr/lib/m68k-linux-gnu/libc.so.6
#2 0xc0a2edc6 in raise () from /usr/lib/m68k-linux-gnu/libc.so.6
#3 0xc0a1c128 in abort () from /usr/lib/m68k-linux-gnu/libc.so.6
#4 0xc05c6f78 in option_parse (options=0x0, options_end=0x0, state=0xc09ca968) at ../src/shared/options.c:116
Fix this by applying _alignptr_ to the struct definition itself, so that
sizeof(Option) is padded up to a multiple of sizeof(void*) and matches
the actual on-disk stride. Add an assert_cc() so any future regression
is caught at compile time.
The same latent bug applies to Verb and TestFunc, which use the same
section-placement pattern. Their natural sizeof was already a multiple
of sizeof(void*) so no crash was observed, but apply the same fix
defensively.
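The size/stride mismatch can be sketched as follows. NaturalEntry and
AlignedEntry are hypothetical types chosen so that the payload is 26 bytes
with an alignment of 2 on common architectures; they are not the real
struct Option.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 26-byte record whose natural alignment is only 2. */
typedef struct NaturalEntry {
        uint16_t fields[13];
} NaturalEntry;

/* Same payload, but the type itself is aligned to sizeof(void*), so the
 * compiler pads sizeof() up to a multiple of the pointer size. */
typedef struct AlignedEntry {
        uint16_t fields[13];
} __attribute__((aligned(sizeof(void*)))) AlignedEntry;

static_assert(sizeof(NaturalEntry) == 26, "payload is 26 bytes");
static_assert(sizeof(AlignedEntry) % sizeof(void*) == 0,
              "stride must be a multiple of the pointer size");

/* Walks entries with plain pointer arithmetic, as a section iterator would. */
size_t count_entries(const AlignedEntry *start, const AlignedEntry *end) {
        size_t n = 0;

        for (const AlignedEntry *e = start; e < end; e++)
                n++;

        return n;
}
```

With the attribute on the type, "e++" advances by the same stride the
linker uses when placing individually aligned entries.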
Daan De Meyer [Sat, 23 May 2026 10:33:12 +0000 (10:33 +0000)]
io-util: Drop faulty assumption that poll flags == epoll flags
Not the case on the sparc architecture. Replace with two translation
functions. The poll-to-epoll variant isn't used yet, but it will be needed
when we add io-uring support to sd-event, so add it now.
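A minimal sketch of such a pair of translation functions, assuming Linux
headers. The names poll_to_epoll() and epoll_to_poll() are hypothetical and
only a handful of flags are covered; this is not the actual io-util code.

```c
#include <assert.h>
#include <poll.h>
#include <stdint.h>
#include <sys/epoll.h>

/* Translate each flag individually instead of assuming equal bit values. */
uint32_t poll_to_epoll(short events) {
        uint32_t r = 0;

        if (events & POLLIN)
                r |= EPOLLIN;
        if (events & POLLPRI)
                r |= EPOLLPRI;
        if (events & POLLOUT)
                r |= EPOLLOUT;
        if (events & POLLERR)
                r |= EPOLLERR;
        if (events & POLLHUP)
                r |= EPOLLHUP;

        return r;
}

/* The reverse direction, again flag by flag. */
short epoll_to_poll(uint32_t events) {
        short r = 0;

        if (events & EPOLLIN)
                r |= POLLIN;
        if (events & EPOLLPRI)
                r |= POLLPRI;
        if (events & EPOLLOUT)
                r |= POLLOUT;
        if (events & EPOLLERR)
                r |= POLLERR;
        if (events & EPOLLHUP)
                r |= POLLHUP;

        return r;
}
```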
Daan De Meyer [Sun, 24 May 2026 12:34:55 +0000 (12:34 +0000)]
meson: Limit test-link-abi to developer mode
This test can regress without changes on our end when
an older systemd is built against a newer glibc and
the newer glibc has an ABI break for a symbol we use
but don't dlsym(). To avoid spurious test failures for
downstream packagers, let's only run the test in developer
mode for now.
Daan De Meyer [Sun, 24 May 2026 11:41:30 +0000 (11:41 +0000)]
test-link-abi: fall back to lowest observed glibc version on new architectures
On architectures where glibc support was introduced after our baseline
(e.g. loongarch64, added in glibc 2.36), every imported symbol is
tagged with a version newer than the baseline, so the baseline check
fails spuriously. Detect this by taking the minimum version observed
across all binaries: if it exceeds the baseline, treat that minimum
as the effective baseline instead.
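The fallback rule can be sketched like this. The Version type and both
functions are hypothetical, and the sketch is written in C for illustration
only; it says nothing about how the test itself is implemented.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Version {
        unsigned maj;
        unsigned min;
} Version;

/* Returns <0, 0 or >0 if a is older than, equal to or newer than b. */
int version_compare(Version a, Version b) {
        if (a.maj != b.maj)
                return a.maj < b.maj ? -1 : 1;
        if (a.min != b.min)
                return a.min < b.min ? -1 : 1;
        return 0;
}

/* If even the lowest observed symbol version is newer than the baseline,
 * the architecture postdates the baseline, so use that minimum instead. */
Version effective_baseline(Version baseline, const Version *observed, size_t n) {
        assert(observed || n == 0);

        if (n == 0)
                return baseline;

        Version lowest = observed[0];
        for (size_t i = 1; i < n; i++)
                if (version_compare(observed[i], lowest) < 0)
                        lowest = observed[i];

        return version_compare(lowest, baseline) > 0 ? lowest : baseline;
}
```

On an architecture whose oldest symbols already match the baseline, the
baseline is returned unchanged.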
Luca Boccassi [Sat, 23 May 2026 15:18:03 +0000 (16:18 +0100)]
libc: dlsym the time64 alias for epoll_pwait2() on 32-bit _TIME_BITS=64
Commit f795d54591 ("libc,shared: detect newer library symbols at
runtime") replaced the build-time HAVE_EPOLL_PWAIT2 gate with a runtime
shim that resolves the libc symbol via dlsym(RTLD_DEFAULT, "epoll_pwait2").
That breaks on 32-bit architectures built with _TIME_BITS=64 (e.g. Debian
armhf since the time64 transition).
When _TIME_BITS=64 is in effect on a 32-bit target, glibc's <sys/epoll.h>
asm-renames the C identifier `epoll_pwait2` to the matching 64-bit-time_t
ABI symbol `__epoll_pwait2_time64`, so a normal C call resolves to the
variant whose `struct timespec` layout (16 bytes: s64 tv_sec, long tv_nsec,
long pad) matches what the compiler is using:
dlsym() doesn't see that header-level rename and returns the legacy
32-bit-time_t `epoll_pwait2` symbol, which interprets only the first 8
bytes of the 16-byte timespec we hand it. With a little-endian layout
that means the legacy callee reads tv_sec = (low 32 bits of real tv_sec)
and tv_nsec = (high 32 bits of real tv_sec, normally 0). Sub-second
timeouts therefore become {0,0} and return immediately, while multi-
second timeouts get truncated to whole seconds.
(the 889680 us shortfall is consistent with a sub-second `some_time` that
got rounded down to zero by the truncated wait).
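The truncation described above can be reproduced with a small sketch. It
assumes a little-endian host and models the two layouts with fixed-width
types; Timespec64, Timespec32 and read_as_legacy() are hypothetical names,
not glibc definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* 64-bit time_t layout on a 32-bit target (long is 32 bits there). */
typedef struct Timespec64 {
        int64_t tv_sec;
        int32_t tv_nsec;
        int32_t pad;
} Timespec64;

/* Legacy 32-bit time_t layout. */
typedef struct Timespec32 {
        int32_t tv_sec;
        int32_t tv_nsec;
} Timespec32;

static_assert(sizeof(Timespec64) == 16, "expected a 16-byte timespec");
static_assert(sizeof(Timespec32) == 8, "expected an 8-byte timespec");

/* Mimics a legacy callee that interprets only the first 8 bytes. */
Timespec32 read_as_legacy(const Timespec64 *ts) {
        Timespec32 out;

        assert(ts);
        memcpy(&out, ts, sizeof(out));
        return out;
}
```

On a little-endian host, a 0.5 s timeout is read back as {0, 0}, and a
3.25 s timeout as {3, 0}.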
Fix it by adding a DEFINE_LIBC_ERRNO_SHIM_NAMED() variant of the macro
that takes an explicit dlsym symbol name string. Both macros keep their
own copy of the body so every `func` reference stays behind `#` or `##`,
otherwise the override-header `#define epoll_pwait2 epoll_pwait2_shim`
would rewrite the token before the inner macro could paste it (yielding
`epoll_pwait2_shim_shim`). src/libc/epoll.c then picks
"__epoll_pwait2_time64" under __USE_TIME_BITS64 so the cached function
pointer matches the timespec ABI the rest of the code is built with.
Each contains the same copy-pasted typo in the NULL-event branch:
r = sd_event_default(&obj->event);
if (r < 0)
return 0; /* swallows -ENOMEM / -ECHILD */
The caller is told the attach succeeded, but obj->event is still
NULL. A subsequent *_start() then trips assert_return(obj->event,
-EINVAL) and returns a spurious -EINVAL (or aborts on assert-abort
builds), and any consumer of *_get_event() dereferences NULL.
The sibling helper sd_dhcp_server_attach_event already uses the
correct pattern:
r = sd_event_default(&server->event);
if (r < 0)
return r;
Fix by returning r instead of 0 at each of the four sites so the
sd_event_default() errno is propagated to the caller and ownership
stays with the caller.
Assisted-by: kres (claude-opus-4-7)
Signed-off-by: Chris Mason <clm@meta.com>
Chris Mason [Fri, 22 May 2026 16:15:50 +0000 (09:15 -0700)]
core/exec-invoke: log write() allow-list widening
apply_syscall_filter() unconditionally inserts the write() syscall
into c->syscall_filter when exec_fd or handoff_timestamp_fd is in
use, so the parent can receive the exec status / handoff timestamp
from the child. When the unit configured a positive
SystemCallFilter= allow-list that deliberately omits write(), the
resulting widening of the operator's policy happens silently with
no trace in the journal.
Emit a log_debug() before the seccomp_filter_set_add_by_name() call
when syscall_allow_list is true, so the widening is at least
observable to operators inspecting the unit's debug log.
While here, document that mutating c->syscall_filter through a
'const ExecContext *c' is intentional: apply_syscall_filter() runs
only in the post-fork child, which owns a private copy of the
address space, so the hashmap change is never observed by the
manager.
No functional change for the allow-list itself; write() is still
added exactly as before.
Fixes: 84b79215ccc5 ("core: do not filter out write() if required in the very late stage")
Assisted-by: kres (claude-opus-4-7)
Signed-off-by: Chris Mason <clm@meta.com>
Chris Mason [Fri, 22 May 2026 15:49:27 +0000 (08:49 -0700)]
core/exec-invoke: chdir("/") after chroot in apply_root_directory
apply_root_directory() calls chroot() but never pairs it with a
chdir("/"). Moving cwd into the new root is left to the subsequent
apply_working_directory() call. That delegation breaks when the unit
sets WorkingDirectory=-/path (working_directory_missing_ok):
chroot(2) changes only the process root, not cwd. With cwd still
resolving to the pre-chroot host dentry, the process runs with the
kernel root pointing at the new root while relative-path accesses
and /proc/self/cwd reach paths outside RootDirectory=.
Only the chroot-only branch (exec_needs_mount_namespace() == false)
is affected. The mount-namespace branch uses pivot_root(), which
restructures the mount tree and makes the pre-chroot dentry
unreachable.
Fix by issuing chdir("/") immediately after the successful chroot()
in apply_root_directory() and propagating any failure with
EXIT_CHROOT, matching the standard chroot+chdir idiom and making
the chroot self-contained regardless of how
apply_working_directory() later handles a missing directory.
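The standard chroot+chdir idiom mentioned above looks roughly like this.
enter_root() is a hypothetical helper that returns a negative errno, not
the actual apply_root_directory() code.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Change the process root and immediately move cwd into it. */
int enter_root(const char *root) {
        assert(root);

        if (chroot(root) < 0)
                return -errno;

        /* chroot() does not touch cwd, so pin it to the new root. */
        if (chdir("/") < 0)
                return -errno;

        return 0;
}
```

Calling chroot() requires privileges (CAP_SYS_CHROOT on Linux), so
unprivileged callers get an error back.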
Assisted-by: kres (claude-opus-4-7)
Signed-off-by: Chris Mason <clm@meta.com>
Chris Mason [Fri, 22 May 2026 16:14:51 +0000 (09:14 -0700)]
sd-dhcp6-client: fix ref leak on duplicate option add
sd_dhcp6_client_add_option() and sd_dhcp6_client_add_vendor_option()
store the caller's sd_dhcp6_option in an ordered container whose value
destructor is sd_dhcp6_option_unref, registered via
DEFINE_HASH_OPS_WITH_VALUE_DESTRUCTOR in dhcp6_option_hash_ops. The
container therefore owns exactly one ref per stored slot.
Both helpers guarded the ref bump only on r < 0:
r = ordered_{hashmap,set}_ensure_put(..., v);
if (r < 0)
return r;
sd_dhcp6_option_ref(v);
ordered_hashmap_ensure_put() (and the ordered_set wrapper) forward the
underlying hashmap_put_boldly tri-state verbatim: 1 on fresh insert, 0
when the identical key+value pair is already stored (no new slot
allocated), <0 on error. The r == 0 path fell through and bumped the
refcount without a corresponding slot.
The extra ref is permanently stranded once the container's destructor
runs. In a long-running networkd that reapplies .network files on
reload this leaks one sd_dhcp6_option allocation per duplicate-add.
Fix by tightening the guard to r <= 0 in both helpers, so
sd_dhcp6_option_ref(v) is reached only on a genuine new slot (r == 1).
Propagate the tri-valued result by returning r from both helpers
instead of a hard-coded constant.
sd_dhcp6_client_add_vendor_option() is a public libsystemd sd_ API;
its success return changes from a constant 1 to {0, 1}.
sd_dhcp6_client_add_option() now also propagates the tri-valued
result (previously a hard-coded 0 on success), giving the two parallel
helpers identical return contracts.
The sole non-fuzz in-tree callers are both in dhcp6_configure()
(src/network/networkd-dhcp6.c), which treat any r >= 0 as success and
do not distinguish 0 from 1, so the contract widening is caller-safe.
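A minimal sketch of the tightened guard, using a toy container in place of
the real ordered hashmap. Option, OptionSet, option_set_put() and
client_add_option() are all hypothetical names for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef struct Option {
        unsigned n_ref;
        int code;
} Option;

#define SLOTS_MAX 8

typedef struct OptionSet {
        Option *slots[SLOTS_MAX];
        size_t n;
} OptionSet;

/* Tri-state put: 1 on fresh insert, 0 if already stored, <0 on error. */
int option_set_put(OptionSet *s, Option *o) {
        for (size_t i = 0; i < s->n; i++)
                if (s->slots[i] == o)
                        return 0;

        if (s->n >= SLOTS_MAX)
                return -ENOMEM;

        s->slots[s->n++] = o;
        return 1;
}

int client_add_option(OptionSet *s, Option *o) {
        int r;

        assert(s);
        assert(o);

        r = option_set_put(s, o);
        if (r <= 0) /* error or duplicate: no new slot, so no new ref */
                return r;

        o->n_ref++; /* the container now owns exactly one ref for its slot */
        return r;
}
```

Adding the same object twice leaves the refcount at two (caller plus
container) instead of three.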
Fixes: e7d5fe17db9d ("DHCP client: make SendOption work for DHCPv6 too.")
Assisted-by: kres (claude-opus-4-7)
Signed-off-by: Chris Mason <clm@meta.com>
Chris Mason [Fri, 22 May 2026 16:13:50 +0000 (09:13 -0700)]
dissect-image: cast to uint64_t before *512 for sig lookup
In the PARTITION_ROOT_VERITY_SIG / PARTITION_USR_VERITY_SIG branch of
dissect_image(), the call to acquire_sig_for_roothash() passes the
partition offset and size as
where start and size are blkid_loff_t (int64_t). The literal 512 is
int, so the multiplication is performed in int64_t and overflows for
LBA values in (INT64_MAX/512, INT64_MAX]. The overflowed signed result
is then converted to the uint64_t parameters of
acquire_sig_for_roothash(), yielding a near-UINT64_MAX
partition_offset that slips past the sole offset guard (an exact
== UINT64_MAX sentinel) and reaches pread() with a corrupt disk
offset. An overflowed partition_size would be rejected by the 4 MiB
EFBIG check at src/shared/dissect-image.c:710-712 before pread(); the
dangerous scenario is a corrupt offset combined with a legitimate
small size.
The three sibling call sites at src/shared/dissect-image.c:1284,
1589-1590 and 1671-1672 all cast to uint64_t before multiplying by
512; only this call site was missed.
Fix by casting start and size to uint64_t before the multiplication,
matching the sibling sites and making the arithmetic well-defined.
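The effect of the cast can be sketched with a hypothetical helper,
sectors_to_bytes(), which is not a function in the tree:

```c
#include <assert.h>
#include <stdint.h>

/* Convert a 512-byte sector count to bytes using unsigned arithmetic, so
 * the multiplication is well-defined for every non-negative input. */
uint64_t sectors_to_bytes(int64_t sectors) {
        assert(sectors >= 0);

        return (uint64_t) sectors * 512U;
}
```

INT64_MAX/512 + 1 sectors is the first value for which the signed
multiplication would overflow; with the cast it simply yields 2^63 bytes.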
Fixes: 98ca65c36aa9 ("dissect: check that roothash in signature matches before selecting partition")
Assisted-by: kres (claude-opus-4-7)
Signed-off-by: Chris Mason <clm@meta.com>
Chris Mason [Fri, 22 May 2026 16:23:03 +0000 (09:23 -0700)]
dissect-image: fix wrong errno on pread failure
In acquire_sig_for_roothash() the pread() failure branch returns
-ENOMEM, which is copy-pasted from the malloc() check immediately
above it:
_cleanup_free_ char *buf = new(char, partition_size+1);
if (!buf)
return -ENOMEM;
ssize_t n = pread(fd, buf, partition_size, partition_offset);
if (n < 0)
return -ENOMEM;
pread() sets errno to the actual I/O failure (EIO, EINTR, EBADF,
...). Aliasing those to -ENOMEM misleads dissect_log_error() and
any caller that branches on the returned code; verity signature
lookup failures get reported as out-of-memory.
Fix by returning -errno from the pread() failure branch. The
malloc() branch above is correct and unchanged.
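A minimal sketch of the corrected pattern, with a hypothetical wrapper
read_at() standing in for the relevant part of acquire_sig_for_roothash():

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns the number of bytes read, or the real failure as negative errno. */
ssize_t read_at(int fd, void *buf, size_t size, off_t offset) {
        ssize_t n;

        assert(buf);

        n = pread(fd, buf, size, offset);
        if (n < 0)
                return -errno; /* e.g. -EBADF or -EIO, never a made-up -ENOMEM */

        return n;
}
```

Reading from an invalid file descriptor now reports -EBADF to the caller.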
Fixes: 98ca65c36aa9 ("dissect: check that roothash in signature matches before selecting partition")
Assisted-by: kres (claude-opus-4-7)
Signed-off-by: Chris Mason <clm@meta.com>
Chris Mason [Fri, 22 May 2026 16:15:04 +0000 (09:15 -0700)]
dissect-image: fix swallowed open() error in mountfsd_make_directory
mountfsd_make_directory() opens the parent directory after a
successful path_extract_filename() call, but reports failure using
the stale path_extract_filename() return value:
r = path_extract_filename(path, &dirname);
if (r < 0)
return log_debug_errno(r, ...);
...
_cleanup_close_ int fd = open(parent, O_DIRECTORY|O_CLOEXEC);
if (fd < 0)
return log_debug_errno(r, "Failed to open '%s': %m", parent);
At the open() check site r holds the non-negative return of
path_extract_filename(). log_debug_errno() with a non-negative
first argument returns 0, so the function reports success even
though open() failed. Callers that pass a non-NULL ret_directory_fd
then consume an uninitialized fd value, and the diagnostic message
is suppressed because log_debug_errno(0, ...) emits nothing.
Fix by passing errno instead of r, matching the convention used in
mountfsd_mount_image() and mountfsd_mount_directory() in the same
file. The real open() failure (ENOENT, EACCES, ENOTDIR, EMFILE, ...)
now propagates to the caller and the log message is preserved.
Fixes: 1be8caa6be6f ("importd: support unpacking tarballs to foreign UID range")
Assisted-by: kres (claude-opus-4-7)
Signed-off-by: Chris Mason <clm@meta.com>
Julian Sparber [Fri, 22 May 2026 15:23:41 +0000 (17:23 +0200)]
repart: Flush varlink messages for progress updates
Progress updates are not sent out immediately, but only once the handler
for the run method completes, since the event loop is not processed while
the run handler is executing. Therefore flush varlink messages immediately
after a progress update is queued.
Luca Boccassi [Fri, 22 May 2026 15:41:25 +0000 (16:41 +0100)]
test: fix test-fileio leaving directory behind and failing on rerun
/* test_write_data_file_atomic_at */
src/test/test-fileio.c:740: Assertion failed: Expected "write_data_file_atomic_at(XAT_FDROOT, "tmp/zzz/wdfa", &a, 0)" to fail with error -2/ENOENT, but it succeeded
RSA PKCS#1 v1.5 is vulnerable to Bleichenbacher-style padding oracle
attacks, albeit very difficult and unlikely to actually happen in the
real world. Still, for hardening, switch new enrollments to RSA-OAEP,
with SHA-256 preferred and SHA-1 as fallback (probed at enrollment time,
since e.g. SoftHSM only accepts SHA-1, and older tokens might as well).
The actual padding scheme used to wrap a given key is recorded as a new
optional 'pkcs11-padding' / 'padding' field in the LUKS2 token JSON and
the homed user record. Decryption defaults to PKCS#1 v1.5 when absent so
existing enrollments keep working.
Luca Boccassi [Fri, 22 May 2026 13:07:06 +0000 (14:07 +0100)]
sysupdate: List default component only if transfer definition exists (#42179)
`sysupdate --json=short components` lists the components known to
sysupdate; these are the components which something like `updatectl`
will try to update.
The `default` component represents the host, and is meant to be listed
if transfer definitions exist in (for example) `/etc/sysupdate.d`
corresponding to the host OS. This then corresponds to `TARGET_HOST` in
`updatectl` and causes it to try updating that target.
The logic for working out whether the `default` component was present
essentially boiled down to “does `{/run,/etc,/usr/lib}/sysupdate.d`
exist”, and it didn’t check whether a `.transfer` or `.conf` file
actually existed in the config directory.
This is quite the corner case, but becomes more evident on systems where
sysupdate is being used to update a portable service but not the main
OS. At that point, if `/etc/sysupdate.d` exists empty (for some reason),
`updatectl` falls over because it starts trying to update the host OS
without any configuration to do so.
So, modify `sysupdate` to more fully load the available configuration
when listing components, and query it a bit more deeply to check whether
a default component exists.
If `sysupdate` is called with various command line arguments to affect
how its configuration is loaded, do *not* say that a default component
exists, as these arguments essentially annul the possibility of a
default being used in that process.
Add an integration test based on the reproducer provided by the issue
reporter. This test has been tested to fail if the changes to
`sysupdate.c` aren’t applied — if so, the second call to `sysupdate
components` would return
`{"default":true,"components":["some-component"]}`.
Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Fixes: https://github.com/systemd/systemd/issues/41501
Rocker Zhang [Thu, 21 May 2026 16:02:51 +0000 (00:02 +0800)]
test: cover LUO serialize-side anti-hijack guard in TEST-91
manager_luo_serialize_fd_stores() refuses to serialize a unit fd store
entry that holds a child LUO session named like PID 1's own ("systemd"),
to stop a service from hijacking PID 1's reserved session namespace
across kexec. That guard had no test coverage.
Add a test-luo store-hijack/check-hijack subcommand pair: on the first
boot a system service preserves a child LUO session named "systemd" in
its fd store; after kexec the test asserts the entry was not restored --
the unit's NFileDescriptorStore is 0, and check-hijack, run as the unit's
own second-boot ExecStart, confirms the hijack fd is absent from its
restored LISTEN_FDNAMES -- proving PID 1 skipped it during serialization.
The restore-side guards (corrupt mapping, reserved token 0, invalid unit
name, missing child session) are intentionally not covered: they only run
against PID 1's own "systemd" session built by luo_preserve_fd_stores(),
which a cooperating userspace helper cannot corrupt without racing or
displacing PID 1 (it single-owns /dev/liveupdate at shutdown). Triggering
them reliably would need kernel fault injection.
Co-developed-by: Claude Opus 4.7 <noreply@anthropic.com>
Philip Withnall [Tue, 19 May 2026 14:46:36 +0000 (15:46 +0100)]
sysupdate: List default component only if transfer definition exists
`sysupdate --json=short components` lists the components known to
sysupdate; these are the components which something like `updatectl`
will try to update.
The `default` component represents the host, and is meant to be listed
if transfer definitions exist in (for example) `/etc/sysupdate.d`
corresponding to the host OS. This then corresponds to `TARGET_HOST` in
`updatectl` and causes it to try updating that target.
The logic for working out whether the `default` component was present
essentially boiled down to “does `{/run,/etc,/usr/lib}/sysupdate.d`
exist”, and it didn’t check whether a `.transfer` or `.conf` file
actually existed in the config directory.
This is quite the corner case, but becomes more evident on systems where
sysupdate is being used to update a portable service but not the main
OS. At that point, if `/etc/sysupdate.d` exists empty (for some reason),
`updatectl` falls over because it starts trying to update the host OS
without any configuration to do so.
So, modify `sysupdate` to more fully load the available configuration
when listing components, and query it a bit more deeply to check whether
a default component exists.
If `sysupdate` is called with various command line arguments to affect
how its configuration is loaded, do *not* say that a default component
exists, as these arguments essentially annul the possibility of a
default being used in that process.
Add an integration test based on the reproducer provided by the issue
reporter. This test has been tested to fail if the changes to
`sysupdate.c` aren’t applied — if so, the second call to `sysupdate
components` would return
`{"default":true,"components":["some-component"]}`.
Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Fixes: https://github.com/systemd/systemd/issues/41501
Philip Withnall [Thu, 21 May 2026 10:17:56 +0000 (11:17 +0100)]
sysupdate: Add a flag to control error behaviour in internal function
Optionally prevent `context_read_definitions()` erroring out if zero
transfer definitions were found.
This commit makes no functional changes (the flag is always passed to
calls to `context_make_offline()` for the moment), but the new flag will be
used in the following commit.
Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Helps: https://github.com/systemd/systemd/issues/41501
Philip Withnall [Thu, 21 May 2026 10:14:01 +0000 (11:14 +0100)]
sysupdate: Convert an internal bool argument to flags
This commit makes no functional changes, but the new flags will be
used and extended in the following commit. We need a flags variable to
avoid having two bool arguments, which would be confusing.
Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Helps: https://github.com/systemd/systemd/issues/41501
Let's return -EBADMSG if the PE headers reference stuff missing in the
file, regardless of whether that's because the offsets are larger than
SSIZE_MAX or just larger than the file size. We generally use EBADMSG for
all cases where we deem the file not to be a conformant PE file, and these
two cases are the same. Hence, let's be systematic here.
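A rough sketch of the unified check. pe_check_range() is a hypothetical
helper and does not mirror the actual PE parsing code; it only shows both
conditions mapping to the same error.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdint.h>

/* Returns 0 if [offset, offset + size) lies inside the file, -EBADMSG if not. */
int pe_check_range(uint64_t offset, uint64_t size, uint64_t file_size) {
        /* Too large to be addressed with ssize_t at all. */
        if (offset > SSIZE_MAX || size > SSIZE_MAX)
                return -EBADMSG;

        /* Addressable, but beyond the end of the file. */
        if (offset > file_size || size > file_size - offset)
                return -EBADMSG;

        return 0;
}
```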
pcrextend: add support for measuring a user record, to be executed on first login of the user
This is supposed to be useful to mark an interactive user login as a
"break glass" event in the measurement logs, since in many typically
headless scenarios such a login indicates debug access or similar.
sysupdate: Add separate polkit actions for cancellation (#42209)
This allows us to have a separate, more permissive, policy for
cancelling ongoing sysupdate jobs. The new default policy for
cancellation actions is to allow them for the active user, without admin
authentication, because typically the user can just pull the plug on the
computer to cancel a job anyway.
Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Fixes: https://github.com/systemd/systemd/issues/38568
Daan De Meyer [Thu, 21 May 2026 22:00:28 +0000 (22:00 +0000)]
efi-api: fix unaligned access in efi_guid_to_id128()
EFI_GUID requires 4-byte alignment due to its uint32_t Data1 field, but
callers may pass pointers at arbitrary offsets into serialized EFI
variable buffers (e.g. bootctl walking BootXXXX entries). UBSan flagged
the misaligned member access; the old comment claiming the struct was
packed was wrong. Copy the bytes into an aligned local first.
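The copy-first approach can be sketched as follows. Guid and
guid_read_unaligned() are hypothetical stand-ins, not the real EFI_GUID
type or efi_guid_to_id128().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same field layout as a GUID: the uint32_t member forces 4-byte alignment. */
typedef struct Guid {
        uint32_t data1;
        uint16_t data2;
        uint16_t data3;
        uint8_t data4[8];
} Guid;

static_assert(sizeof(Guid) == 16, "a GUID is 16 bytes");

/* Copy the bytes into an aligned local instead of dereferencing a
 * possibly misaligned pointer into a serialized buffer. */
Guid guid_read_unaligned(const void *p) {
        Guid g;

        assert(p);
        memcpy(&g, p, sizeof(g));
        return g;
}
```

memcpy() has no alignment requirement on its source, so the members are
only ever accessed through the properly aligned copy.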
Co-developed-by: Claude Opus 4.7 <noreply@anthropic.com>
Yu Watanabe [Sun, 10 May 2026 15:26:33 +0000 (00:26 +0900)]
dhcp-server-request: rework when we should reply DHCPNAK
Previously, DHCPNAK was sent only when the client was in INIT-REBOOT
state. But when selecting or renewing, the request is directed to a
specific server, so we can safely reply with DHCPNAK.
Also, verify the existing bound lease even when there is no static lease
for the client.
Yaping Li [Sun, 10 May 2026 14:50:13 +0000 (14:50 +0000)]
logind: add ListInhibitors Varlink method
The Varlink ListInhibitors method is the counterpart of D-Bus
ListInhibitors. Like its D-Bus counterpart it is zero-filter and streams
the full list of currently registered inhibitors using the
SD_VARLINK_METHOD_MORE pattern, returning InhibitorInfo objects with
Id, What, Who, Why, Mode, UID, PID, and Since fields.
There is no D-Bus GetInhibitor getter to fold in, so no unique-key
filter is introduced here.
Yaping Li [Sun, 10 May 2026 14:50:13 +0000 (14:50 +0000)]
logind: add ListSeats Varlink method
The Varlink ListSeats method accepts an optional Id filter, folding in
the D-Bus GetSeat(s) lookup.
Passing Id yields a single reply on match, or NoSuchSeat on miss.
Passing no Id with the 'more' flag streams the full list; passing no
Id without 'more' resolves to the caller's seat (preserving the
ergonomic default of GetSeat). The Id filter supports the special
names "self" and "auto" which resolve to the caller's seat.
The SeatInfo type in the io.systemd.Login Varlink IDL carries all seat
properties matching the D-Bus org.freedesktop.login1.Seat interface.
Yaping Li [Sun, 10 May 2026 14:50:13 +0000 (14:50 +0000)]
logind: add ListUsers Varlink method
The Varlink ListUsers method accepts optional UID and PID filters,
folding in the D-Bus GetUser(u) and GetUserByPID(u) lookups.
Passing a unique-key filter (UID and/or PID) yields a single reply on
match, or NoSuchUser on miss. Passing no filter with the 'more' flag
streams the full list; passing no filter without 'more' resolves to
the caller's user (preserving the ergonomic default of GetUser). If
both UID and PID are specified they must reference the same user,
otherwise NoSuchUser is returned.
The UserInfo type in the io.systemd.Login Varlink IDL carries all
user properties matching the D-Bus org.freedesktop.login1.User
interface.
Yaping Li [Sun, 10 May 2026 14:50:12 +0000 (14:50 +0000)]
logind: add ListSessions Varlink method
The Varlink ListSessions method accepts optional Id and PID filters,
folding in the D-Bus GetSession(s) and GetSessionByPID(u) lookups.
Passing a unique-key filter (Id and/or PID) yields a single reply on
match, or NoSuchSession on miss. Passing no filter streams the full
list (requires the 'more' flag). Specifying both Id and PID acts as a
consistency check: both must refer to the same session, otherwise
NoSuchSession is returned.
The Id filter supports the special names "self" and "auto" which
resolve to the caller's session. The SessionInfo type in the
io.systemd.Login Varlink IDL carries all session properties matching
the D-Bus org.freedesktop.login1.Session interface.