git.ipfire.org Git - thirdparty/systemd.git/log

imds-generator: replace static Condition=initrd by a check in the generator

After looking at the unit, I'm not sure if systemd-imds-import.service
is supposed to run in the host system or not. But if it is supposed to
only run in the initrd, then the static condition in the unit file gives
us the worst behaviour: the generator does not do any checks if we are
in the initrd or not, and if it enabled the unit, it'll influence the
transaction ordering (possibly causing loops or additional work) and
then the unit will be unconditionally skipped. So replace the static
condition by a check in the generator. If the user specifies
systemd.imds.import on the commandline, it'll be honoured also in the
host.

units: drop After=network-online.target from imds services

The imds services are (very-) early boot services, ordered before
sysinit.target. OTOH, network-online.target is something that is often
established very late [1]. In fact, we always recommended to _not_
depend on it for things that are ordered before the usual targets that
allow user logins (e.g. graphical.target). So we certainly cannot add
early-boot ordering for it. Currently systemd-imds-import.service
is conditionalized to only run in the initrd. But if the unit is enabled
in the host system, it would affect the transaction there, so we should
drop this bogus ordering everywhere.

(I think this might have been added by mistake. After all,
systemd-imds-import.service itself shouldn't care about the network.)

This fixes boot ordering issues reported upstream and in Fedora [2, 3].

[1] https://systemd.io/NETWORK_ONLINE/#network-connectivity-has-been-established-network-onlinetarget
[2] https://github.com/systemd/systemd/pull/40980#discussion_r2949473028
[3] https://bugzilla.redhat.com/show_bug.cgi?id=2481304

test-execute: reenable the test with sanitizers

test: skip TEST-55-OOMD entirely if stress-ng is broken on this host

This reverts commit a17efef137 ("test: try to detect SIGILL in stress-ng
and skip TEST-55-OOMD gracefully") and replaces it with a single
check to skip the test cases.

The previous check was not reliable as stress-ng can catch SIGILL
itself and exit with an error:

  stress-ng[1068]: stress-ng: debug: [1068] caught SIGILL, address \
    0x00005632f8330140 (ILL_ILLOPN)
  stress-ng[1068]: stress-ng: debug: [1068] stress-ng: info: \
    0x00005632f8330140:<62>71 fd 48 6f 2d 36 14 1c 00 c5 d1 ef ed 49 29
  ...
  stress-ng[1053]: stress-ng: error: [1053] vm: [1061] terminated \
    with an error, exit status=2 (stressor failed)
  ...
  systemd[1]: TEST-55-OOMD-slowrule.service: Main process exited, \
    code=exited, status=2/INVALIDARGUMENT
  systemd[1]: TEST-55-OOMD-slowrule.service: Failed with result \
    'exit-code'.

Try to detect this at the beginning of the test and skip the test cases
if it happens.

dissect-image: reject empty partitions array

di is allocated while iterating over p.partitions. If the array is
empty, the loop is skipped and di remains NULL, but the code still
assigns image metadata through it.

Return EBADMSG before accessing di in this case.
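
A minimal sketch of the guard, assuming a lazily allocated result as
described above. The Image type and image_from_partitions() are
hypothetical stand-ins, not the real code in dissect-image:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the dissected image; not the real systemd type. */
typedef struct Image {
        uint64_t total_size;
        const char *label;
} Image;

/* Allocates the image lazily inside the loop and rejects an empty array
 * before any metadata is assigned through the pointer. */
int image_from_partitions(const uint64_t *sizes, size_t n, Image **ret) {
        Image *di = NULL;

        for (size_t i = 0; i < n; i++) {
                if (!di) {
                        di = calloc(1, sizeof(Image));
                        if (!di)
                                return -ENOMEM;
                }
                di->total_size += sizes[i];
        }

        if (!di)                /* empty array: the loop never ran */
                return -EBADMSG;

        di->label = "example";  /* safe: di is known to be non-NULL here */
        *ret = di;
        return 0;
}
```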

Found by Linux Verification Center (linuxtesting.org) with Svace.

login: split-out common varlink field definitions

The only effective change is that the Remote field is now nullable on
ListSession, even though logind always provides the field.
That should not matter much.

C.f. https://github.com/systemd/systemd/pull/41561#discussion_r3218411455

test-pressure: Fix checks

Follow up for 7e250f70

Revert "test: skip tpm2 test on s390x on GHA"

This reverts commit b735d01c8aa415f45fd562b245c673142cc8b4b4.

Closes #38229.

test-execute: drop workaround for fixed issue

The issue should be already fixed by f01f70a9a3f3609c0c8bdbaa4b0b4abbb2b43993.

po: Translated using Weblate (Catalan)

Currently translated at 100.0% (270 of 270 strings)

Co-authored-by: naly zzwd <xeanhort007@gmail.com>
Translate-URL: https://translate.fedoraproject.org/projects/systemd/main/ca/
Translation: systemd/main

po: Translated using Weblate (Italian)

Currently translated at 100.0% (270 of 270 strings)

Co-authored-by: Ali Ciloglu <ali.ciloqlu@murena.io>
Translate-URL: https://translate.fedoraproject.org/projects/systemd/main/it/
Translation: systemd/main

po: Translated using Weblate (Spanish)

Currently translated at 100.0% (270 of 270 strings)

Co-authored-by: Fco. Javier F. Serrador <fserrador@gmail.com>
Translate-URL: https://translate.fedoraproject.org/projects/systemd/main/es/
Translation: systemd/main

logind: migrate to Varlink (phase1) (#41561)

This implements phase 1 of #41560.

Add List Varlink methods for all logind object types (sessions, users,
seats, inhibitors).

Add $SYSTEMD_INVOKED_AS (#42286)

sysusers: log new group name on write failure

When adding new groups, putgrent_with_members() is called with the newly
constructed group n, but the error path logs gr->gr_name. gr belongs to
the earlier loop over the existing group file and may be NULL after EOF.

Log n.gr_name instead.

Found by Linux Verification Center (linuxtesting.org) with Svace.

test: also check if networkd built with BTF support before BPF test

SUSE does not provide a vmlinux.h so the package is not built with
CO-RE support, hence the test fails. This was previously masked
by the fact that python3-packaging was never installed, so the
test always skipped everywhere as it could not detect the kernel
version.

Follow-up for c310106c15ea83913e2765dcb0d7c81d83f08a0e

test: avoid picking volatile user-session units in varlinkctl-unit test

Each <type>_id is picked from a Unit.List '{}' call and immediately
queried via a second Unit.List '{"name":"<id>"}' call. When the picked
unit is tied to a user session (run-user-N.mount, user@N.service,
user-runtime-dir@N.service, session-cN.scope) it may be torn down
between the two calls if the user manager happens to exit at that
moment, causing a spurious NoSuchUnit failure.

Filter those out of the candidate lists for mount/service/scope.
Excerpt from a failed CI run showing the race:

  [ 2631.395826] systemd[1]: user-runtime-dir@4711.service: Trying to enqueue job user-runtime-dir@4711.service/stop/fail
  [ 2632.101414] systemd[1]: user-runtime-dir@4711.service: Got final SIGCHLD for state stop.
  [ 2632.182537] TEST-74-AUX-UTILS.sh[3051]: + mount_id=run-user-4711.mount
  [ 2632.186760] TEST-74-AUX-UTILS.sh[3099]: + varlinkctl call /run/systemd/io.systemd.Manager io.systemd.Unit.List '{"name":"run-user-4711.mount"}'
  [ 2632.189752] systemd[1]: varlink-api-3-3: Sending message: {"error":"io.systemd.Unit.NoSuchUnit"}
  [ 2632.190752] TEST-74-AUX-UTILS.sh[3099]: Method call io.systemd.Unit.List() failed: io.systemd.Unit.NoSuchUnit
  [ 2632.191594] TEST-74-AUX-UTILS.sh[101]: Subtest /usr/lib/systemd/tests/testdata/units/TEST-74-AUX-UTILS.varlinkctl-unit.sh failed

Co-developed-by: Claude Opus 4.7 <noreply@anthropic.com>

test: install python package needed by networkd-tests.py

systemd-networkd-tests.py[515]: Failed to import either platform or packaging module, assuming the comparison failed

test: couple of fixes for TEST-87-AUX-UTILS-VM.vmspawn (#42291)

TEST-87-AUX-UTILS-VM.vmspawn is flaky, try to make it more robust

e.g.:
https://github.com/systemd/systemd/actions/runs/26373447540/job/77629553363

test: try to detect SIGILL in stress-ng and skip TEST-55-OOMD gracefully (#42290)

The attempt to fix SIGILL in TEST-55-OOMD actually made the test more
flaky, as the chosen stress-ng method generates less pressure, so it
often fails.

Try a different approach: try to detect that SIGILL caused the unit
calling stress-ng to fail, and skip gracefully in that case.

test: ignore both kill and wait errors in TEST-87-AUX-UTILS-VM.vmspawn

If the process already exited it's not just wait, but also kill that
will fail, so guard them both

Follow-up for d32bc9cd07e61e852c624f228dda8669f540bf6d

test: bump varlinkctl timeout in TEST-87-AUX-UTILS-VM.vmspawn

10s is too short for flaky/overloaded CI, and it sometimes fails

Follow-up for d32bc9cd07e61e852c624f228dda8669f540bf6d

tools/test-crash-trace: do not use fixed signal numbers

The numbers of signals vary by arch. On the common arches the signals
listed here all use the same numbers, but people are likely to use
this on more fringe architectures too, so let's use symbolic names
instead.

Also the comment about gdb "hitting the same kill" didn't make sense.

The syntax is a bit baroque, but using a helper variable does not work.
Also shellcheck complains about $[ ] which would have made this more
legible.

test: try to detect SIGILL in stress-ng and skip TEST-55-OOMD gracefully

The attempt to fix SIGILL in TEST-55-OOMD actually made the test more
flaky, as the chosen stress-ng method generates less pressure, so it
often fails.

Try a different approach: try to detect that SIGILL caused the unit
calling stress-ng to fail, and skip gracefully in that case.

TEST-75-RESOLVED: use SYSTEMD_INVOKED_AS=resolvconf

Add $SYSTEMD_INVOKED_AS

This allows multi-call binaries to be easily invoked with a different
name. After installation, the name is set by creating a symlink. But in
build directories, we don't create the symlinks. (There are also other
ways to achieve the same thing, e.g. zsh supports $ARGV0, and exec -a
can be used, but those are either non-portable or are more complicated
to use.) The primary use-case for me is to test --help output for
multicall binaries.
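
A rough sketch of how a multi-call binary could pick its name, assuming
the environment variable simply takes precedence over argv[0]. The
helper invoked_as() is hypothetical and the real lookup in systemd may
differ:

```c
#include <stdlib.h>
#include <string.h>

/* Returns the name the binary should act as: $SYSTEMD_INVOKED_AS if it is
 * set and non-empty, otherwise the last path component of argv[0].
 * Illustrative only; precedence and validation are assumptions. */
const char *invoked_as(const char *argv0) {
        const char *e = getenv("SYSTEMD_INVOKED_AS");
        if (e && *e)
                return e;

        const char *slash = strrchr(argv0, '/');
        return slash ? slash + 1 : argv0;
}
```

With this shape, a binary sitting in a build directory can be exercised
under another name without creating a symlink first.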

Also reorder the help for env vars to group the more generic ones near
the top.

This was initially proposed in https://github.com/systemd/systemd/pull/24054,
but there were some comments about the implementation. I had a branch
with the patch, but I don't think I ever actually submitted it as a
pull request.

Revert "test: pin stress-ng --vm-method to a portable scalar method in TEST-55-OOMD"

This reverts commit 881e4717c7981b274853309e68b39153e3b292f4.

Revert "test: switch TEST-55-OOMD stress-ng --vm-method to lfsr32"

This reverts commit 4ac23697280bf54bb768f0aa7a5c7d7d0bcf3f6b.

Some fixes found by AI (#42263)

meson: Use `fs.relative_to()` instead of `realpath`

When building with Meson >=1.3.0, use the native `fs.relative_to()`
helper instead of shelling out to `realpath`.

This is more portable: BusyBox `realpath` does not support the
`--relative-to` option.

test-pressure: Skip if controller didn't get enabled

If for any reason the controller didn't get enabled on the
scope, skip the test.

nspawn: downgrade log level when unregistering and machine is already gone

nspawn started logging at notice level when exiting a container:

root@sid-arm64:~# exit
logout
Container sid-arm64 exited successfully.
Failed to unregister machine in system context, ignoring: No such device or address

Looks like it's racing with machined, which already noticed it was
gone and removed it. Downgrade to debug on ENXIO.

Follow-up for 1e084aad7d5c2132ed32ff2a75bbe21205c0f5f3

shared/options: fix crash by aligning struct to sizeof(void*)

On some architectures like m68k, alignof(void*) is 2, not sizeof(void*)
(which is 4). So the natural alignment of struct Option is 2 and
sizeof(Option) == 26.

However, each variable placed in the SYSTEMD_OPTIONS ELF section via
_OPTION() carries _alignptr_ (= __attribute__((aligned(sizeof(void*))))),
which forces each entry to start at a 4-byte boundary. The linker
therefore inserts 2 bytes of padding between adjacent entries, producing
an actual stride of 28 in the section.

option_parse() iterates over the section with pointer arithmetic
("opt++"), which advances by sizeof(Option) == 26 and lands inside the
padding. The fields read back as zero, and since commit cf88903637
("tree-wide: get rid of most uses of ALIGN_PTR") added the ordering
assertion below, the resulting "0 < 0" trips it when running --help:

Assertion 'opt->id < (opt + 1)->id' failed at src/shared/options.c:116, function option_parse(). Aborting.

#0  0xc0a7b248 in ?? () from /usr/lib/m68k-linux-gnu/libc.so.6
#1  0xc0a7b2ce in pthread_kill () from /usr/lib/m68k-linux-gnu/libc.so.6
#2  0xc0a2edc6 in raise () from /usr/lib/m68k-linux-gnu/libc.so.6
#3  0xc0a1c128 in abort () from /usr/lib/m68k-linux-gnu/libc.so.6
#4  0xc05c6f78 in option_parse (options=0x0, options_end=0x0, state=0xc09ca968) at ../src/shared/options.c:116

Fix this by applying _alignptr_ to the struct definition itself, so that
sizeof(Option) is padded up to a multiple of sizeof(void*) and matches
the actual on-disk stride. Add an assert_cc() so any future regression
is caught at compile time.

The same latent bug applies to Verb and TestFunc, which use the same
section-placement pattern. Their natural sizeof was already a multiple
of sizeof(void*) so no crash was observed, but apply the same fix
defensively.
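
The size mismatch can be illustrated with a small sketch. The two
structs below are hypothetical (13 x 16-bit fields to get a 26-byte,
2-byte-aligned type on any architecture), and the aligned attribute is
the GCC/Clang spelling:

```c
#include <stddef.h>
#include <stdint.h>

/* 26 bytes with 2-byte natural alignment, mimicking the m68k layout
 * described above (hypothetical fields, not the real struct Option). */
struct OptionNatural {
        uint16_t fields[13];
};

/* Same fields, but the type itself is aligned to pointer size, so
 * sizeof() is padded up to a multiple of sizeof(void*). */
struct OptionAligned {
        uint16_t fields[13];
} __attribute__((aligned(sizeof(void*))));

/* Compile-time check in the spirit of the assert_cc() mentioned above. */
_Static_assert(sizeof(struct OptionAligned) % sizeof(void*) == 0,
               "array stride must match the per-entry alignment");

/* Stride that pointer arithmetic ("opt++") uses for each variant. */
size_t stride_natural(void) { return sizeof(struct OptionNatural); }
size_t stride_aligned(void) { return sizeof(struct OptionAligned); }
```

Only the second variant guarantees that pointer arithmetic steps from
one pointer-aligned entry to the next.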

Follow-up for cf889036377092cbec6c5ec86fdf0dc1c9326032

Co-developed-by: Claude Opus 4.7 <noreply@anthropic.com>

io-util: Drop faulty assumption that poll flags == epoll flags

This is not the case on the sparc architecture. Replace the assumption
with two translation functions. The poll-to-epoll variant isn't used
yet, but it will be needed when we add io-uring support to sd-event, so
add it now so that it is in place.
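
A sketch of what such a pair of translation functions could look like.
The function names and the set of flags covered are assumptions, not
the ones used in io-util:

```c
#include <poll.h>
#include <stdint.h>
#include <sys/epoll.h>

/* Translate flag by flag instead of assuming the numeric values agree. */
uint32_t poll_to_epoll_events(short events) {
        uint32_t r = 0;

        if (events & POLLIN)  r |= EPOLLIN;
        if (events & POLLPRI) r |= EPOLLPRI;
        if (events & POLLOUT) r |= EPOLLOUT;
        if (events & POLLERR) r |= EPOLLERR;
        if (events & POLLHUP) r |= EPOLLHUP;
        return r;
}

/* The reverse direction, again one named flag at a time. */
short epoll_to_poll_events(uint32_t events) {
        short r = 0;

        if (events & EPOLLIN)  r |= POLLIN;
        if (events & EPOLLPRI) r |= POLLPRI;
        if (events & EPOLLOUT) r |= POLLOUT;
        if (events & EPOLLERR) r |= POLLERR;
        if (events & EPOLLHUP) r |= POLLHUP;
        return r;
}
```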

Follow-up for cf4c65afa86021a750de38bbed192eeb1c9fd425

meson: Limit test-link-abi to developer mode

This test can regress without changes on our end when
an older systemd is built against a newer glibc and
the newer glibc has an ABI break for a symbol we use
but don't dlsym(). To avoid spurious test failures for
downstream packagers, let's only run the test in developer
mode for now.

test-link-abi: fall back to lowest observed glibc version on new architectures

On architectures where glibc support was introduced after our baseline
(e.g. loongarch64, added in glibc 2.36), every imported symbol is
tagged with a version newer than the baseline, so the baseline check
fails spuriously. Detect this by taking the minimum version observed
across all binaries: if it exceeds the baseline, treat that minimum
as the effective baseline instead.
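
The fallback rule can be sketched as follows. This is an illustration
in C of the logic described above, not the test's actual
implementation; the Version type and both helpers are hypothetical:

```c
#include <stddef.h>

/* Hypothetical two-component version, e.g. { 2, 36 } for glibc 2.36. */
typedef struct Version {
        unsigned maj, mnr;
} Version;

int version_cmp(Version a, Version b) {
        if (a.maj != b.maj)
                return a.maj < b.maj ? -1 : 1;
        if (a.mnr != b.mnr)
                return a.mnr < b.mnr ? -1 : 1;
        return 0;
}

/* Returns the baseline to check against: the configured one, unless even
 * the lowest version observed across all binaries is newer than it. */
Version effective_baseline(Version baseline, const Version *observed, size_t n) {
        if (n == 0)
                return baseline;

        Version lowest = observed[0];
        for (size_t i = 1; i < n; i++)
                if (version_cmp(observed[i], lowest) < 0)
                        lowest = observed[i];

        return version_cmp(lowest, baseline) > 0 ? lowest : baseline;
}
```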

Follow up for d9600a2a

Co-developed-by: Claude Opus 4.7 <noreply@anthropic.com>

libc: dlsym the time64 alias for epoll_pwait2() on 32-bit _TIME_BITS=64

Commit f795d54591 ("libc,shared: detect newer library symbols at
runtime") replaced the build-time HAVE_EPOLL_PWAIT2 gate with a runtime
shim that resolves the libc symbol via dlsym(RTLD_DEFAULT, "epoll_pwait2").
That breaks on 32-bit architectures built with _TIME_BITS=64 (e.g. Debian
armhf since the time64 transition).

When _TIME_BITS=64 is in effect on a 32-bit target, glibc's <sys/epoll.h>
asm-renames the C identifier `epoll_pwait2` to the matching 64-bit-time_t
ABI symbol `__epoll_pwait2_time64`, so a normal C call resolves to the
variant whose `struct timespec` layout (16 bytes: s64 tv_sec, long tv_nsec,
long pad) matches what the compiler is using:

    /* /usr/include/sys/epoll.h */
    #ifndef __USE_TIME64_REDIRECTS
    extern int epoll_pwait2 (int __epfd, struct epoll_event *__events,
                             int __maxevents,
                             const struct timespec *__timeout,
                             const __sigset_t *__ss) ...;
    #else
    # ifdef __REDIRECT
    extern int __REDIRECT (epoll_pwait2, (int __epfd, ...,
                                          const struct timespec *__timeout,
                                          const __sigset_t *__ss),
                           __epoll_pwait2_time64) ...;
    # else
    #  define epoll_pwait2 __epoll_pwait2_time64
    # endif
    #endif

dlsym() doesn't see that header-level rename and returns the legacy
32-bit-time_t `epoll_pwait2` symbol, which interprets only the first 8
bytes of the 16-byte timespec we hand it. With a little-endian layout
that means the legacy callee reads tv_sec = (low 32 bits of real tv_sec)
and tv_nsec = (high 32 bits of real tv_sec, normally 0). Sub-second
timeouts therefore become {0,0} and return immediately, while multi-
second timeouts get truncated to whole seconds.

That matches the test failure reported on armv7:

    /* test_simple_timeout */
    src/libsystemd/sd-event/test-event.c:724: Assertion failed:
      Expected "t >= usec_add(f, some_time)",
      but 571153514285 < 571154403965

(the 889680 us shortfall is consistent with a sub-second `some_time` that
got rounded down to zero by the truncated wait).
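
The truncation can be modelled in isolation. The sketch below assumes a
little-endian host and uses two hypothetical structs that mirror the
layouts described above; it does not call epoll_pwait2() at all:

```c
#include <stdint.h>
#include <string.h>

/* Layouts as described above for a little-endian 32-bit target. */
struct ts64 { int64_t tv_sec; int32_t tv_nsec; int32_t pad; };  /* 16 bytes */
struct ts32 { int32_t tv_sec; int32_t tv_nsec; };               /*  8 bytes */

/* What a legacy 32-bit-time_t callee sees when it is handed a
 * 64-bit-time_t timespec: only the first 8 bytes. */
struct ts32 legacy_view(struct ts64 t) {
        struct ts32 r;

        memcpy(&r, &t, sizeof r);
        return r;
}
```

On such a host, a 0.5 s timeout is seen as {0, 0}, and a 3.75 s timeout
as {3, 0}, since the real tv_nsec lies beyond the 8 bytes read.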

Fix it by adding a DEFINE_LIBC_ERRNO_SHIM_NAMED() variant of the macro
that takes an explicit dlsym symbol name string. Both macros keep their
own copy of the body so every `func` reference stays behind `#` or `##`,
otherwise the override-header `#define epoll_pwait2 epoll_pwait2_shim`
would rewrite the token before the inner macro could paste it (yielding
`epoll_pwait2_shim_shim`). src/libc/epoll.c then picks
"__epoll_pwait2_time64" under __USE_TIME_BITS64 so the cached function
pointer matches the timespec ABI the rest of the code is built with.

Follow-up for f795d5459151ad84acf77557cf47dddddb3b4bce

Co-developed-by: Claude Opus 4.7 <noreply@anthropic.com>

po: "detect" msgmerge-nofuzzy using file()

With find_program, meson verbosely reports the detection of the file,
which we don't need since it's part of the repo and always present.

Update NEWS

NEWS: Add note that libsystemd may not link to libm anymore

base-filesystem: add powerpc ifdef branch

base-filesystem: add hppa ifdef branch

units: add missing Documentation= key to systemd-report units

libsystemd-network: propagate sd_event_default() failure

Four *_attach_event helpers swallow sd_event_default() failures and
return success while leaving obj->event NULL:

    sd_dhcp_client_attach_event   (sd-dhcp-client.c)
    sd_dhcp6_client_attach_event  (sd-dhcp6-client.c)
    sd_ndisc_attach_event         (sd-ndisc.c)
    sd_radv_attach_event          (sd-radv.c)

Each contains the same copy-pasted typo in the NULL-event branch:

    r = sd_event_default(&obj->event);
    if (r < 0)
            return 0;          /* swallows -ENOMEM / -ECHILD */

The caller is told the attach succeeded, but obj->event is still
NULL. A subsequent *_start() then trips assert_return(obj->event,
-EINVAL) and returns a spurious -EINVAL (or aborts on assert-abort
builds), and any consumer of *_get_event() dereferences NULL.

The sibling helper sd_dhcp_server_attach_event already uses the
correct pattern:

    r = sd_event_default(&server->event);
    if (r < 0)
            return r;

Fix by returning r instead of 0 at each of the four sites so the
sd_event_default() errno is propagated to the caller and ownership
stays with the caller.
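
A self-contained sketch of the corrected pattern. Event, Client and
fake_event_default() are hypothetical stand-ins so the example builds
without libsystemd; fail_with only exists to simulate a failure:

```c
#include <errno.h>
#include <stddef.h>

typedef struct Event { int dummy; } Event;
typedef struct Client { Event *event; } Client;

/* Hypothetical stand-in for sd_event_default(): fails on request. */
int fake_event_default(Event **ret, int fail_with) {
        static Event default_event;

        if (fail_with < 0)
                return fail_with;
        *ret = &default_event;
        return 0;
}

int client_attach_event(Client *c, Event *event, int fail_with) {
        int r;

        if (c->event)
                return -EBUSY;

        if (event) {
                c->event = event;
                return 0;
        }

        r = fake_event_default(&c->event, fail_with);
        if (r < 0)
                return r;       /* propagate, rather than "return 0" */

        return 0;
}
```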

Assisted-by: kres (claude-opus-4-7)
Signed-off-by: Chris Mason <clm@meta.com>

core/exec-invoke: log write() allow-list widening

apply_syscall_filter() unconditionally inserts the write() syscall
into c->syscall_filter when exec_fd or handoff_timestamp_fd is in
use, so the parent can receive the exec status / handoff timestamp
from the child. When the unit configured a positive
SystemCallFilter= allow-list that deliberately omits write(), the
resulting widening of the operator's policy happens silently with
no trace in the journal.

Emit a log_debug() before the seccomp_filter_set_add_by_name() call
when syscall_allow_list is true, so the widening is at least
observable to operators inspecting the unit's debug log.

While here, document that mutating c->syscall_filter through a
'const ExecContext *c' is intentional: apply_syscall_filter() runs
only in the post-fork child, which owns a private copy of the
address space, so the hashmap change is never observed by the
manager.

No functional change for the allow-list itself; write() is still
added exactly as before.

Fixes: 84b79215ccc5 ("core: do not filter out write() if required in the very late stage")
Assisted-by: kres (claude-opus-4-7)
Signed-off-by: Chris Mason <clm@meta.com>

core/exec-invoke: chdir("/") after chroot in apply_root_directory

apply_root_directory() calls chroot() but never pairs it with a
chdir("/"). Moving cwd into the new root is left to the subsequent
apply_working_directory() call. That delegation breaks when the unit
sets WorkingDirectory=-/path (working_directory_missing_ok):

    apply_root_directory()              apply_working_directory()
      chroot(new_root)        -->         chdir(wd) fails (ENOENT)
      return 0                            return 0  /* missing_ok */

chroot(2) changes only the process root, not cwd. With cwd still
resolving to the pre-chroot host dentry, the process runs with the
kernel root pointing at the new root while relative-path accesses
and /proc/self/cwd reach paths outside RootDirectory=.

Only the chroot-only branch (exec_needs_mount_namespace() == false)
is affected. The mount-namespace branch uses pivot_root(), which
restructures the mount tree and makes the pre-chroot dentry
unreachable.

Fix by issuing chdir("/") immediately after the successful chroot()
in apply_root_directory() and propagating any failure with
EXIT_CHROOT, matching the standard chroot+chdir idiom and making
the chroot self-contained regardless of how
apply_working_directory() later handles a missing directory.
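
The standard idiom referred to above, as a minimal sketch. enter_root()
is a hypothetical helper that returns a negative errno instead of an
exit status; it needs CAP_SYS_CHROOT to succeed:

```c
#include <errno.h>
#include <unistd.h>

/* Change the process root and immediately move cwd inside it, so that no
 * reference to a directory outside the new root is left behind. */
int enter_root(const char *new_root) {
        if (chroot(new_root) < 0)
                return -errno;

        if (chdir("/") < 0)
                return -errno;

        return 0;
}
```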

Assisted-by: kres (claude-opus-4-7)
Signed-off-by: Chris Mason <clm@meta.com>

sd-dhcp6-client: fix ref leak on duplicate option add

sd_dhcp6_client_add_option() and sd_dhcp6_client_add_vendor_option()
store the caller's sd_dhcp6_option in an ordered container whose value
destructor is sd_dhcp6_option_unref, registered via
DEFINE_HASH_OPS_WITH_VALUE_DESTRUCTOR in dhcp6_option_hash_ops. The
container therefore owns exactly one ref per stored slot.

Both helpers guarded the ref bump only on r < 0:

    r = ordered_{hashmap,set}_ensure_put(..., v);
    if (r < 0)
            return r;
    sd_dhcp6_option_ref(v);

ordered_hashmap_ensure_put() (and the ordered_set wrapper) forward the
underlying hashmap_put_boldly tri-state verbatim: 1 on fresh insert, 0
when the identical key+value pair is already stored (no new slot
allocated), <0 on error. The r == 0 path fell through and bumped the
refcount without a corresponding slot.

The extra ref is permanently stranded once the container's destructor
runs. In a long-running networkd that reapplies .network files on
reload this leaks one sd_dhcp6_option allocation per duplicate-add.

Fix by tightening the guard to r <= 0 in both helpers, so
sd_dhcp6_option_ref(v) is reached only on a genuine new slot (r == 1).
Propagate the tri-valued result by returning r from both helpers
instead of a hard-coded constant.
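
The refcount accounting can be shown with a toy one-slot container.
Option, Slot and both functions are hypothetical; only the tri-state
return convention (1, 0, negative) is taken from the description above:

```c
#include <errno.h>
#include <stddef.h>

typedef struct Option { unsigned n_ref; } Option;
typedef struct Slot { Option *value; } Slot;

/* Toy container: 1 on fresh insert, 0 if the same value is already
 * stored, negative errno on error. */
int slot_put(Slot *s, Option *v) {
        if (s->value == v)
                return 0;
        if (s->value)
                return -EEXIST;
        s->value = v;
        return 1;
}

int client_add_option(Slot *s, Option *v) {
        int r = slot_put(s, v);
        if (r <= 0)     /* error or duplicate: no new slot, so no new ref */
                return r;

        v->n_ref++;     /* the container owns one ref for the new slot */
        return r;
}
```

Adding the same option twice leaves the count unchanged on the second
call, so the container's single unref on destruction balances it.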

sd_dhcp6_client_add_vendor_option() is a public libsystemd sd_ API;
its success return changes from a constant 1 to {0, 1}.
sd_dhcp6_client_add_option() now also propagates the tri-valued
result (previously a hard-coded 0 on success), giving the two parallel
helpers identical return contracts.

The sole non-fuzz in-tree callers are both in dhcp6_configure()
(src/network/networkd-dhcp6.c), which treat any r >= 0 as success and
do not distinguish 0 from 1, so the contract widening is caller-safe.

Fixes: e7d5fe17db9d ("DHCP client: make SendOption work for DHCPv6 too.")
Assisted-by: kres (claude-opus-4-7)
Signed-off-by: Chris Mason <clm@meta.com>

dissect-image: cast to uint64_t before *512 for sig lookup

In the PARTITION_ROOT_VERITY_SIG / PARTITION_USR_VERITY_SIG branch of
dissect_image(), the call to acquire_sig_for_roothash() passes the
partition offset and size as

    acquire_sig_for_roothash(fd,
                             start * 512,
                             size * 512,
                             &root_hash, NULL);

where start and size are blkid_loff_t (int64_t). The literal 512 is
int, so the multiplication is performed in int64_t and overflows for
LBA values in (INT64_MAX/512, INT64_MAX]. The overflowed signed result
is then converted to the uint64_t parameters of
acquire_sig_for_roothash(), yielding a near-UINT64_MAX
partition_offset that slips past the sole offset guard (an exact
== UINT64_MAX sentinel) and reaches pread() with a corrupt disk
offset. An overflowed partition_size would be rejected by the 4 MiB
EFBIG check at src/shared/dissect-image.c:710-712 before pread(); the
dangerous scenario is a corrupt offset combined with a legitimate
small size.

The three sibling call sites at src/shared/dissect-image.c:1284,
1589-1590 and 1671-1672 all cast to uint64_t before multiplying by
512; only this call site was missed.

Fix by casting start and size to uint64_t before the multiplication,
matching the sibling sites and making the arithmetic well-defined.
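
A minimal sketch of the cast-before-multiply pattern. The helper name
is hypothetical. Note that the cast only makes the arithmetic
well-defined (unsigned wraparound instead of signed overflow); it does
not range-check the input:

```c
#include <stdint.h>

/* Convert a count of 512-byte sectors to bytes. The cast happens before
 * the multiplication, so the product is computed in uint64_t. */
uint64_t sectors_to_bytes(int64_t sectors) {
        return (uint64_t) sectors * 512;
}
```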

Fixes: 98ca65c36aa9 ("dissect: check that roothash in signature matches before selecting partition")
Assisted-by: kres (claude-opus-4-7)
Signed-off-by: Chris Mason <clm@meta.com>

dissect-image: fix wrong errno on pread failure

In acquire_sig_for_roothash() the pread() failure branch returns
-ENOMEM, which is copy-pasted from the malloc() check immediately
above it:

    _cleanup_free_ char *buf = new(char, partition_size+1);
    if (!buf)
            return -ENOMEM;

    ssize_t n = pread(fd, buf, partition_size, partition_offset);
    if (n < 0)
            return -ENOMEM;

pread() sets errno to the actual I/O failure (EIO, EINTR, EBADF,
...). Aliasing those to -ENOMEM misleads dissect_log_error() and
any caller that branches on the returned code; verity signature
lookup failures get reported as out-of-memory.

Fix by returning -errno from the pread() failure branch. The
malloc() branch above is correct and unchanged.
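
The two failure branches, sketched as a standalone function.
read_region() is a hypothetical helper, not the systemd function; it
only shows which error code belongs to which branch:

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Reads up to size bytes at offset into a freshly allocated,
 * NUL-terminated buffer. Returns 0 or a negative errno. */
int read_region(int fd, size_t size, off_t offset, char **ret) {
        char *buf = malloc(size + 1);
        if (!buf)
                return -ENOMEM;         /* allocation failure */

        ssize_t n = pread(fd, buf, size, offset);
        if (n < 0) {
                int r = -errno;         /* save before free() can clobber it */
                free(buf);
                return r;               /* the real I/O error, not -ENOMEM */
        }

        buf[n] = 0;
        *ret = buf;
        return 0;
}
```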

Fixes: 98ca65c36aa9 ("dissect: check that roothash in signature matches before selecting partition")
Assisted-by: kres (claude-opus-4-7)
Signed-off-by: Chris Mason <clm@meta.com>

dissect-image: fix swallowed open() error in mountfsd_make_directory

mountfsd_make_directory() opens the parent directory after a
successful path_extract_filename() call, but reports failure using
the stale path_extract_filename() return value:

    r = path_extract_filename(path, &dirname);
    if (r < 0)
            return log_debug_errno(r, ...);
    ...
    _cleanup_close_ int fd = open(parent, O_DIRECTORY|O_CLOEXEC);
    if (fd < 0)
            return log_debug_errno(r, "Failed to open '%s': %m", parent);

At the open() check site r holds the non-negative return of
path_extract_filename(). log_debug_errno() with a non-negative
first argument returns 0, so the function reports success even
though open() failed. Callers that pass a non-NULL ret_directory_fd
then consume an uninitialized fd value, and the diagnostic message
is suppressed because log_debug_errno(0, ...) emits nothing.

Fix by passing errno instead of r, matching the convention used in
mountfsd_mount_image() and mountfsd_mount_directory() in the same
file. The real open() failure (ENOENT, EACCES, ENOTDIR, EMFILE, ...)
now propagates to the caller and the log message is preserved.

Fixes: 1be8caa6be6f ("importd: support unpacking tarballs to foreign UID range")
Assisted-by: kres (claude-opus-4-7)
Signed-off-by: Chris Mason <clm@meta.com>

repart: Flush varlink messages for progress updates

Progress updates are not sent out immediately, but only once the handler
for the run method completes, since the event loop is not processed while
the run handler is executing. Therefore flush varlink messages immediately
after a progress update is queued.

Closes: github.com/systemd/systemd/issues/40238

Two test-fileio fixes (#42260)

NEWS: drop imagined --print-profiles option

In dc91cb82da2a599564d7ea7903c42b131d328f89 --print-profiles is
passed to swtpm_setup, that's it.

test: fix test-fileio leaving directory behind and failing on rerun

/* test_write_data_file_atomic_at */
src/test/test-fileio.c:740: Assertion failed: Expected "write_data_file_atomic_at(XAT_FDROOT, "tmp/zzz/wdfa", &a, 0)" to fail with error -2/ENOENT, but it succeeded

Follow-up for 67387626884afec7dbb64cb78a39a3676b7ff663

test: fix test-fileio failure in gitlab CI container

Follow-up for 67387626884afec7dbb64cb78a39a3676b7ff663

update NEWS a bit more for v261

meson: fix build with -Dmachine=false

Fixes https://github.com/systemd/systemd/issues/42257

Follow-up for 5d0ac2c23c7063b219df9fce74bc8d8481cb6e7a

man: add sd_varlink_connect_address() man page

meson: bump version to v261~rc1

NEWS: finalize date and time

NEWS: list contributors

Update NEWS

sd-dhcp-server-lease: always update all information in bound lease

We manage bound leases by their client ID. Hence, potentially, other
fields may be changed. Let's always update all information.

pkcs11: switch to RSA-OAEP SHA-256/SHA-1

RSA PKCS#1 v1.5 is vulnerable to Bleichenbacher-style padding oracle
attacks, albeit very difficult and unlikely to actually happen in the
real world. Still, for hardening, switch new enrollments to RSA-OAEP,
with SHA-256 preferred and SHA-1 as fallback (probed at enrollment time,
since e.g. SoftHSM only accepts SHA-1, and older tokens might as well).

The actual padding scheme used to wrap a given key is recorded as a new
optional 'pkcs11-padding' / 'padding' field in the LUKS2 token JSON and
the homed user record. Decryption defaults to PKCS#1 v1.5 when absent so
existing enrollments keep working.

sysupdate: List default component only if transfer definition exists (#42179)

`sysupdate --json=short components` lists the components known to
sysupdate; these are the components which something like `updatectl`
will try to update.

The `default` component represents the host, and is meant to be listed
if transfer definitions exist in (for example) `/etc/sysupdate.d`
corresponding to the host OS. This then corresponds to `TARGET_HOST` in
`updatectl` and causes it to try updating that target.

The logic for working out whether the `default` component was present
essentially boiled down to “does `{/run,/etc,/usr/lib}/sysupdate.d`
exist”, and it didn’t check whether a `.transfer` or `.conf` file
actually existed in the config directory.

This is quite the corner case, but becomes more evident on systems where
sysupdate is being used to update a portable service but not the main
OS. At that point, if `/etc/sysupdate.d` exists empty (for some reason),
`updatectl` falls over because it starts trying to update the host OS
without any configuration to do so.

So, modify `sysupdate` to more fully load the available configuration
when listing components, and query it a bit more deeply to check whether
a default component exists.
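
The difference between "the directory exists" and "the directory
contains definitions" can be sketched like this. Both helpers are
hypothetical and much simpler than loading the full configuration, which
is what the change actually does:

```c
#include <dirent.h>
#include <stdbool.h>
#include <string.h>

bool has_suffix(const char *s, const char *suffix) {
        size_t a = strlen(s), b = strlen(suffix);

        return a >= b && strcmp(s + a - b, suffix) == 0;
}

/* True only if the directory exists and holds at least one entry named
 * *.transfer or *.conf; an empty or missing directory yields false. */
bool dir_has_definitions(const char *dir) {
        DIR *d = opendir(dir);
        if (!d)
                return false;

        bool found = false;
        struct dirent *de;
        while ((de = readdir(d)))
                if (has_suffix(de->d_name, ".transfer") ||
                    has_suffix(de->d_name, ".conf")) {
                        found = true;
                        break;
                }

        closedir(d);
        return found;
}
```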

If `sysupdate` is called with various command line arguments to affect
how its configuration is loaded, do *not* say that a default component
exists, as these arguments essentially annul the possibility of a
default being used in that process.

Add an integration test based on the reproducer provided by the issue
reporter. This test has been tested to fail if the changes to
`sysupdate.c` aren’t applied — if so, the second call to `sysupdate
components` would return
`{"default":true,"components":["some-component"]}`.

Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Fixes: https://github.com/systemd/systemd/issues/41501

bootctl: refactor get_file_version() (#42246)

nsresourced: Verify user namespace identity on registry lookup (#42224)

report-basic: add a bunch more essential metrics (#42248)

tpm2: measure user records on first login (#42228)

This is useful so that we can "blow a fuse" in TPM policies once a user
has logged in.

sd-dhcp-server: update conditions when we should send DHCPNAK (#42252)

test: cover LUO serialize-side anti-hijack guard in TEST-91

manager_luo_serialize_fd_stores() refuses to serialize a unit fd store
entry that holds a child LUO session named like PID 1's own ("systemd"),
to stop a service from hijacking PID 1's reserved session namespace
across kexec. That guard had no test coverage.

Add a test-luo store-hijack/check-hijack subcommand pair: on the first
boot a system service preserves a child LUO session named "systemd" in
its fd store; after kexec the test asserts the entry was not restored --
the unit's NFileDescriptorStore is 0, and check-hijack, run as the unit's
own second-boot ExecStart, confirms the hijack fd is absent from its
restored LISTEN_FDNAMES -- proving PID 1 skipped it during serialization.
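
The check-hijack step amounts to inspecting the fd names handed to the
service on its second boot. A minimal sketch in Python, assuming the
preserved entry was stored under the hypothetical fd name "hijack" (the
name test-luo actually uses is not stated in this message):

```python
import os

def fd_names(environ=os.environ):
    """Return the fd store names passed via LISTEN_FDNAMES (colon-separated)."""
    raw = environ.get("LISTEN_FDNAMES", "")
    return [name for name in raw.split(":") if name]

def hijack_absent(environ=os.environ, name="hijack"):
    # "hijack" is a hypothetical fd name chosen for illustration only.
    return name not in fd_names(environ)
```

The real subcommand is part of the C test helper; this only illustrates
the shape of the assertion.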

The restore-side guards (corrupt mapping, reserved token 0, invalid unit
name, missing child session) are intentionally not covered: they only run
against PID 1's own "systemd" session built by luo_preserve_fd_stores(),
which a cooperating userspace helper cannot corrupt without racing or
displacing PID 1 (it single-owns /dev/liveupdate at shutdown). Triggering
them reliably would need kernel fault injection.

Co-developed-by: Claude Opus 4.7 <noreply@anthropic.com>

dhcp-server-request: drop unused value in DHCPRequest

sysupdate: List default component only if transfer definition exists

`sysupdate --json=short components` lists the components known to
sysupdate; these are the components which something like `updatectl`
will try to update.

The `default` component represents the host, and is meant to be listed
if transfer definitions exist in (for example) `/etc/sysupdate.d`
corresponding to the host OS. This then corresponds to `TARGET_HOST` in
`updatectl` and causes it to try updating that target.

The logic for working out whether the `default` component was present
essentially boiled down to “does `{/run,/etc,/usr/lib}/sysupdate.d`
exist”, and it didn’t check whether a `.transfer` or `.conf` file
actually existed in the config directory.
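
The difference between the old and the intended check can be sketched
roughly as follows. This is an illustration in Python, not the code in
`sysupdate.c` (which loads and parses the definitions rather than just
looking at file names); the function name is hypothetical:

```python
from pathlib import Path

CONFIG_DIRS = ("/run/sysupdate.d", "/etc/sysupdate.d", "/usr/lib/sysupdate.d")

def has_default_component(dirs=CONFIG_DIRS):
    """True only if some config directory holds a .transfer or .conf file.

    The old logic effectively stopped at "does the directory exist".
    """
    for d in map(Path, dirs):
        if not d.is_dir():
            continue
        if any(p.is_file() and p.suffix in (".transfer", ".conf")
               for p in d.iterdir()):
            return True
    return False
```

With this check, an existing but empty `/etc/sysupdate.d` no longer
counts as a default component.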

This is quite the corner case, but becomes more evident on systems where
sysupdate is being used to update a portable service but not the main
OS. At that point, if `/etc/sysupdate.d` exists but is empty (for some reason),
`updatectl` falls over because it starts trying to update the host OS
without any configuration to do so.

So, modify `sysupdate` to more fully load the available configuration
when listing components, and query it a bit more deeply to check whether
a default component exists.

If `sysupdate` is called with various command line arguments to affect
how its configuration is loaded, do *not* say that a default component
exists, as these arguments essentially annul the possibility of a
default being used in that process.

Add an integration test based on the reproducer provided by the issue
reporter. This test has been verified to fail if the changes to
`sysupdate.c` aren’t applied — if so, the second call to `sysupdate
components` would return
`{"default":true,"components":["some-component"]}`.
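
For illustration, a consumer of that JSON output might decide whether to
treat the host as an update target like this. The helper names are
hypothetical, and the shape of the fixed output (`"default":false`) is
an assumption; only the buggy output is quoted above:

```python
import json

def should_update_host(components_json):
    """Treat the host OS as an update target only if "default" is true."""
    info = json.loads(components_json)
    return bool(info.get("default", False))

def named_components(components_json):
    """Return the non-default components listed in the output."""
    return list(json.loads(components_json).get("components", []))
```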

Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Fixes: https://github.com/systemd/systemd/issues/41501

sysupdate: Add a flag to control error behaviour in internal function

Optionally prevent `context_read_definitions()` erroring out if zero
transfer definitions were found.

This commit makes no functional changes (the flag is always passed to
calls to `context_make_offline()` for the moment), but the new flag will be
used in the following commit.
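
The effect of such a flag can be sketched as follows, in Python for
brevity. All names here are hypothetical and do not correspond to the
identifiers used in sysupdate:

```python
from enum import Flag, auto

class ReadDefinitionsFlags(Flag):
    # Hypothetical flag names, for illustration only.
    NONE = 0
    ALLOW_EMPTY = auto()   # do not fail if zero definitions were found

def read_definitions(found, flags=ReadDefinitionsFlags.NONE):
    """Return the definitions, failing on an empty set unless ALLOW_EMPTY is set."""
    if not found and ReadDefinitionsFlags.ALLOW_EMPTY not in flags:
        raise FileNotFoundError("no transfer definitions found")
    return list(found)
```

A flags value keeps call sites readable compared to a growing list of
positional bool arguments.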

Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Helps: https://github.com/systemd/systemd/issues/41501

sysupdate: Convert an internal bool argument to flags

This commit makes no functional changes, but the new flags will be
used and extended in the following commit. We need a flags variable to
avoid having two bool arguments, which would be confusing.

Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Helps: https://github.com/systemd/systemd/issues/41501

bootctl: refactor get_file_version() to use proper PE parsing

This irked me for a while. Let's not scan for strings stupidly, but
properly parse PE files to find the magic marker.

It's easy with our own PE APIs, hence we should do it.

This moves logging to the callers (previously, this was all mixed up).
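
To illustrate what "properly parse" means here, a minimal Python sketch
that walks the PE section table instead of scanning the whole file for a
string. It is not the bootctl implementation (which uses systemd's own
PE APIs); the section name is left to the caller because this message
does not name it:

```python
import struct

def find_section(image, name):
    """Return the raw data of the PE section called `name`, or None if absent."""
    if len(image) < 0x40 or image[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ header")
    (pe_off,) = struct.unpack_from("<I", image, 0x3C)        # e_lfanew
    if pe_off + 24 > len(image) or image[pe_off:pe_off + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    # The 20-byte COFF header follows the 4-byte signature.
    (n_sections,) = struct.unpack_from("<H", image, pe_off + 6)
    (opt_size,) = struct.unpack_from("<H", image, pe_off + 20)
    table = pe_off + 24 + opt_size                           # section table start
    for i in range(n_sections):
        hdr = table + 40 * i                                 # 40-byte section header
        if hdr + 40 > len(image):
            raise ValueError("section table exceeds file size")
        sec_name = image[hdr:hdr + 8].rstrip(b"\0")
        raw_size, raw_off = struct.unpack_from("<II", image, hdr + 16)
        if sec_name == name:
            if raw_off + raw_size > len(image):
                raise ValueError("section data exceeds file size")
            return image[raw_off:raw_off + raw_size]
    return None
```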

pe-binary: tweak pe_read_section_data() error codes

Let's return -EBADMSG if the PE headers reference stuff missing in the
file, regardless of whether that's because the offsets are larger than
SSIZE_MAX or just larger than the file size. We generally use EBADMSG
for all cases where we deem the file to not be a conformant PE file, and
these two cases are the same. Hence, let's be systematic here.
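
The unified error handling can be sketched like this. It is a Python
illustration of the idea only; the function name mirrors the C one but
the signature is hypothetical, and a 64-bit ssize_t is assumed:

```python
import errno

SSIZE_MAX = 2**63 - 1  # assumption: 64-bit ssize_t

def read_section_data(file_bytes, offset, size):
    """Return file_bytes[offset:offset+size], or fail with EBADMSG if out of range."""
    if offset < 0 or size < 0:
        raise OSError(errno.EBADMSG, "negative offset or size in PE headers")
    too_big = offset > SSIZE_MAX or size > SSIZE_MAX - offset
    past_eof = not too_big and offset + size > len(file_bytes)
    if too_big or past_eof:
        # Both cases mean the same thing: the headers point at data that
        # is not in the file, so both get the same error code.
        raise OSError(errno.EBADMSG, "PE headers reference data missing from the file")
    return file_bytes[offset:offset + size]
```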

report-basic: report TPM2 vendor strings too

This is highly relevant to debug/analyze TPM related behaviour, hence
let's report this as part of the metrics.

report-basic: also report confidential computing tech

report-basic: report various SMBIOS fields as metrics, too

These are pretty fundamental hw fields, let's make them available.

report-basic: also report /etc/machine-info fields, just like os-release fields

The TAGS= field is really nice to have in reports, hence make this a
thing.

update TODO

ci: add simple test case for new logic

docs: document new measurement

logind: automatically pull in systemd-pcrlogin@.service on first login

pcrextend: add support for measuring a user record, to be executed on first login of the user

This is supposed to be useful to mark an interactive user login as a
"break glass" event in the measurement logs, since in many typically
headless scenarios such a login indicates debug access or similar.

pcrextend: port to help-util.[ch] APIs

sysupdate: Add separate polkit actions for cancellation (#42209)

This allows us to have a separate, more permissive, policy for
cancelling ongoing sysupdate jobs. The new default policy for
cancellation actions is to allow them for the active user, without admin
authentication, because typically the user can just pull the plug on the
computer to cancel a job anyway.

Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Fixes: https://github.com/systemd/systemd/issues/38568

efi-api: fix unaligned access in efi_guid_to_id128()

EFI_GUID requires 4-byte alignment due to its uint32_t Data1 field, but
callers may pass pointers at arbitrary offsets into serialized EFI
variable buffers (e.g. bootctl walking BootXXXX entries). UBSan flagged
the misaligned member access; the old comment claiming the struct was
packed was wrong. Copy the bytes into an aligned local first.
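
The copy-then-decode pattern can be illustrated in Python, which has no
alignment hazards of its own; the sketch only shows the shape of the
fix (copy the 16 bytes out of the serialized buffer, then interpret
them). The function name is hypothetical:

```python
import uuid

def efi_guid_at(buf, offset):
    """Decode the 16-byte EFI GUID stored at `offset`, whatever its alignment."""
    raw = bytes(buf[offset:offset + 16])   # copy out first, like the aligned local
    if len(raw) != 16:
        raise ValueError("buffer too short for an EFI GUID")
    # EFI GUIDs store the first three fields little-endian; bytes_le
    # handles exactly that layout.
    return uuid.UUID(bytes_le=raw)
```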

Co-developed-by: Claude Opus 4.7 <noreply@anthropic.com>

hwdb: reject out-of-bounds fnmatch prefixes

Ensure that fnmatch traversal doesn't run past the end of hwdb.bin in
case of a corrupted file.
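
As a generic illustration of this kind of guard (not the hwdb code, and
without reproducing the hwdb.bin layout), a Python sketch that refuses
to read a NUL-terminated string from an offset outside the mapped data:

```python
def string_at(blob, offset):
    """Return the NUL-terminated string at `offset`, rejecting bad offsets."""
    if offset < 0 or offset >= len(blob):
        raise ValueError("string offset outside of file")
    end = blob.find(b"\0", offset)
    if end < 0:
        raise ValueError("unterminated string at end of file")
    return blob[offset:end]
```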

Originally reported on yeswehack.com as #YWH-PGM9780-70

For the unit test:
Co-developed-by: GitHub Copilot <github-copilot[bot]@users.noreply.github.com>

dhcp-server-request: rename dhcp_server_handle_message() -> dhcp_server_process_message()

Then, make it take struct iovec.
No functional change, just refactoring.

dhcp-server-request: rework when we should reply DHCPNAK

Previously, DHCPNAK was sent only when the client was in INIT-REBOOT
state. But when selecting or renewing, the request is directed to a
specific server, so we can safely reply with DHCPNAK.

Also, verify existing bound lease even when there is no static lease for
the client.
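
For reference, the client states can be told apart from the fields of a
DHCPREQUEST as described in RFC 2131 section 4.3.2. The sketch below is
Python and is not the sd-dhcp-server logic; in particular, how
"directed to this server" is determined (matching server identifier,
unicast delivery) is an assumption left to the caller:

```python
from enum import Enum

class ClientState(Enum):
    SELECTING = "selecting"
    INIT_REBOOT = "init-reboot"
    RENEWING_OR_REBINDING = "renewing-or-rebinding"

def classify_request(server_id, requested_ip, ciaddr):
    """Classify a DHCPREQUEST by which fields are filled in (RFC 2131, 4.3.2)."""
    if server_id is not None:
        return ClientState.SELECTING           # server identifier option present
    if requested_ip is not None:
        return ClientState.INIT_REBOOT         # requested IP only, ciaddr zero
    if ciaddr not in (None, "0.0.0.0"):
        return ClientState.RENEWING_OR_REBINDING
    raise ValueError("malformed DHCPREQUEST")

def may_send_nak(state, *, addressed_to_us, lease_valid):
    """NAK an invalid request in INIT-REBOOT, or when it was directed at us."""
    if lease_valid:
        return False
    if state is ClientState.INIT_REBOOT:
        return True
    return addressed_to_us   # assumption: caller derives this from the message
```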

dhcp-server-request: logs more when received invalid messages

dhcp-server-request: also save IP_PKTINFO of received message

It will be used later.

This also drops the redundant ifindex check, as the socket is already
initialized with socket_bind_to_ifindex().

sd-dhcp-server: use sd_dhcp_message to parse/build DHCP messages (#42240)

sd-dhcp-server: drop unused enum

Follow-up for 2e5580a8c12427ddf421b7fee7be3677ff41bd5b.

logind: add ListInhibitors Varlink method

The Varlink ListInhibitors method is the counterpart of D-Bus
ListInhibitors. Like its D-Bus counterpart, it takes no filter and
streams the full list of currently registered inhibitors using the
SD_VARLINK_METHOD_MORE pattern, returning InhibitorInfo objects with
Id, What, Who, Why, Mode, UID, PID, and Since fields.
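
On the wire, a streamed Varlink reply is a sequence of NUL-terminated
JSON objects, each marked "continues" except the last. A minimal client
side sketch in Python; how the InhibitorInfo fields are nested inside
"parameters" is not specified here, so the sketch just collects the
parameters objects as they come:

```python
import json

def collect_replies(stream):
    """Split a NUL-delimited Varlink reply stream; stop after the final reply."""
    results = []
    for chunk in stream.split(b"\0"):
        if not chunk:
            continue
        reply = json.loads(chunk)
        if "error" in reply:
            raise RuntimeError(reply["error"])
        results.append(reply.get("parameters", {}))
        if not reply.get("continues", False):
            break   # last reply of the stream
    return results
```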

There is no D-Bus GetInhibitor getter to fold in, so no unique-key
filter is introduced here.

logind: add ListSeats Varlink method

The Varlink ListSeats method accepts an optional Id filter, folding in
the D-Bus GetSeat(s) lookup.

Passing Id yields a single reply on match, or NoSuchSeat on miss.
Passing no Id with the 'more' flag streams the full list; passing no
Id without 'more' resolves to the caller's seat (preserving the
ergonomic default of GetSeat). The Id filter supports the special
names "self" and "auto" which resolve to the caller's seat.
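
The two call styles can be sketched as raw Varlink messages. This is a
Python illustration that only builds the request bytes (one JSON object
followed by a NUL byte, per the Varlink protocol); the helper name is
hypothetical and no connection handling is shown:

```python
import json

def varlink_call(method, parameters=None, more=False):
    """Serialize a Varlink method call: a JSON object terminated by NUL."""
    msg = {"method": method, "parameters": parameters or {}}
    if more:
        msg["more"] = True   # ask the service to stream multiple replies
    return json.dumps(msg).encode() + b"\0"

# Look up the caller's own seat via the special "self" name.
one_seat = varlink_call("io.systemd.Login.ListSeats", {"Id": "self"})
# Stream the full list of seats.
all_seats = varlink_call("io.systemd.Login.ListSeats", more=True)
```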

The SeatInfo type in the io.systemd.Login Varlink IDL carries all seat
properties matching the D-Bus org.freedesktop.login1.Seat interface.

logind: add ListUsers Varlink method

The Varlink ListUsers method accepts optional UID and PID filters,
folding in the D-Bus GetUser(u) and GetUserByPID(u) lookups.

Passing a unique-key filter (UID and/or PID) yields a single reply on
match, or NoSuchUser on miss. Passing no filter with the 'more' flag
streams the full list; passing no filter without 'more' resolves to
the caller's user (preserving the ergonomic default of GetUser). If
both UID and PID are specified they must reference the same user,
otherwise NoSuchUser is returned.
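
The lookup rules above can be modelled compactly. The following Python
sketch is a model of the described semantics, not logind code; `users`
(UID to record) and `pid_to_uid` are hypothetical stand-ins for
logind's internal state:

```python
class NoSuchUser(Exception):
    pass

def list_users(users, pid_to_uid, caller_uid, uid=None, pid=None, more=False):
    """Apply the ListUsers filter rules and return the matching records."""
    if uid is None and pid is None:
        if more:
            return list(users.values())     # no filter + 'more': full list
        uid = caller_uid                    # no filter, no 'more': caller's user
    if pid is not None:
        pid_uid = pid_to_uid.get(pid)
        # UID and PID, if both given, must reference the same user.
        if pid_uid is None or (uid is not None and uid != pid_uid):
            raise NoSuchUser()
        uid = pid_uid
    if uid not in users:
        raise NoSuchUser()
    return [users[uid]]
```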

The UserInfo type in the io.systemd.Login Varlink IDL carries all
user properties matching the D-Bus org.freedesktop.login1.User
interface.

logind: add ListSessions Varlink method

The Varlink ListSessions method accepts optional Id and PID filters,
folding in the D-Bus GetSession(s) and GetSessionByPID(u) lookups.

Passing a unique-key filter (Id and/or PID) yields a single reply on
match, or NoSuchSession on miss. Passing no filter streams the full
list (requires the 'more' flag). Specifying both Id and PID acts as a
consistency check: both must refer to the same session, otherwise
NoSuchSession is returned.

The Id filter supports the special names "self" and "auto" which
resolve to the caller's session. The SessionInfo type in the
io.systemd.Login Varlink IDL carries all session properties matching
the D-Bus org.freedesktop.login1.Session interface.