Armaan Sandhu [Sat, 13 Jun 2026 09:25:12 +0000 (14:55 +0530)]
sysupdate: address review feedback on component/reboot guard
Move the --reboot/--component= rejection into parse_argv() alongside the
other cross-option checks, and tighten TEST-72 to assert the specific
guard message rather than merely a non-zero exit.
Armaan Sandhu [Sat, 13 Jun 2026 07:25:51 +0000 (12:55 +0530)]
sysupdate: refuse reboot/pending logic when a component is selected
The `pending` and `reboot` verbs, as well as the `--reboot` switch, compare
the newest installed version against the booted OS version (IMAGE_VERSION= from
os-release). When a component is selected via --component=, this compares the
component's version against the unrelated host OS version, even though the two
by design live in separate version spaces. The result is arbitrary reboot
decisions: depending on how the version strings happen to compare, sysupdate
would either always or never reboot.
Refuse the combination with a clear error instead of silently performing a
bogus comparison. Correctly tracking a per-component booted version is left as a
future feature.
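A minimal sketch of such a cross-option guard, written in Python for brevity
(the real check is C code in parse_argv()). The option subset, the default verb
and the error text are assumptions of this sketch, not the actual sysupdate
implementation:

```python
import argparse

def parse_argv(argv):
    """Parse a tiny subset of options and reject --reboot with --component=."""
    parser = argparse.ArgumentParser(prog="sysupdate-sketch")
    parser.add_argument("--component")
    parser.add_argument("--reboot", action="store_true")
    parser.add_argument("verb", nargs="?", default="list")
    args = parser.parse_args(argv)

    # The verbs and the switch that compare against the booted OS version.
    uses_booted_version = args.reboot or args.verb in ("pending", "reboot")
    if args.component is not None and uses_booted_version:
        # Hypothetical message; the real guard text is not quoted in the log.
        raise ValueError("--component= cannot be combined with reboot/pending logic")
    return args
```

Doing the check at parse time means the bogus version comparison is never
reached, whichever verb or switch would have triggered it.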
Ronan Pigott [Mon, 15 Jun 2026 23:58:42 +0000 (16:58 -0700)]
pam: use default auth pam_deny.so
run0 doesn't actually use the auth PAM stack, since polkit does the
requisite authorization. However, if the service type is left undefined,
PAM falls back to the definitions of the "other" service, which, at
least on Arch Linux and possibly on other distributions, includes
pam_warn.so to notify the user about this apparent error.
This creates a bit of logspam, as systemd does actually call pam_setcred
in its generic pam code, which depends on the auth pam stack, creating a
warning message in the journal on every invocation of run0.
pam_deny.so is a no-op, which avoids falling back to the other pam
service.
dongshengyuan [Wed, 17 Jun 2026 05:06:21 +0000 (13:06 +0800)]
hibernate: fix swap selection and prefer swap that can hold image
When multiple swap devices exist, prefer one with enough free space
to hold the hibernation image over one that cannot, regardless of
priority. If no swap can fit, fall back to priority-first selection.
This avoids deterministically failing hibernation when the
highest-priority swap is too small but a lower-priority one fits.
Signed-off-by: dongshengyuan <dongshengyuan@uniontech.com>
Co-developed-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
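The selection rule described above can be sketched as follows. This is an
illustration in Python, not the hibernate code itself; the Swap type, its field
names and the tie-break by priority among fitting swaps are assumptions:

```python
from dataclasses import dataclass

@dataclass
class Swap:
    device: str
    priority: int
    free_bytes: int

def pick_swap(swaps, image_size):
    """Prefer a swap that can hold the image; otherwise fall back to priority."""
    if not swaps:
        return None
    fitting = [s for s in swaps if s.free_bytes >= image_size]
    # If nothing fits, keep the old priority-first behaviour.
    candidates = fitting or swaps
    return max(candidates, key=lambda s: s.priority)
```

With a small high-priority swap and a large low-priority one, the large one is
picked whenever only it can hold the image.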
report: add support for signing reports via varlink backends + make report a varlink service (#42595)
This adds the following:
1. systemd-report gains a new --sign= option, taking a boolean. If true,
this makes both "systemd-report generate" and "systemd-report upload"
produce a signed report instead of a regular one. The signatures are
collected from Varlink-based backends.
2. One such backend is added which does a simple Ed25519 based signing
scheme.
3. This adds a new metrics source which just reports text files
symlinked into a special dir as metrics. This is used to report the
Ed25519 public key as a metric, by default, if it exists.
4. Finally, systemd-report itself is turned into a varlink service. This
is useful for example for extracting a report from a system coming in
via the varlink/http bridge.
I thought for a long time about the format for signing reports. Initially
I intended to do this like homed's user record signing, i.e. require
normalization of the record, then normalize the record, write it out
in dense form, and sign the result, finally inserting the resulting hash
into the user record itself. People have pointed me to the inherent messiness
of signing JSON this way though, as it requires any participant that
wishes to sign/authenticate records this way to implement the exact same
normalization/formatting rules, and in particular in the area of
floating point numbers (of which metrics presumably will have many) this
is quite problematic.
This signing hence goes a different way. Instead of expecting
signer+verifier to independently come to the same normalized text form
of the JSON data, let's instead output a JSON-SEQ sequence, where the
first object is the report, and any subsequent objects are one signature
each. The signatures are supposed to cover the precise binary
representation of the first element in the JSON-SEQ stream (i.e. from
the RS to the NL).
Or in other words: a verifier would receive the JSON-SEQ stream and split
it up before each RS. Then it would leave object 1 unparsed for the
moment, and parse objects 2…n. It would then authenticate object 1's
precise binary representation with objects 2…n. Once that checks out, it
would parse object 1, and use it as the report.
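The verifier flow described above can be sketched in Python. Loud assumptions:
HMAC-SHA256 with a shared key stands in for the Ed25519 backend (the Python
standard library has no Ed25519), the "signature" field name is hypothetical,
and the signed span is taken to be the bytes after the RS up to and including
the NL:

```python
import hashlib
import hmac
import json

RS = b"\x1e"  # JSON-SEQ record separator (RFC 7464)

def sign_stream(report_bytes, key):
    """Build a JSON-SEQ stream: the report first, then one signature object."""
    first = report_bytes + b"\n"
    mac = hmac.new(key, first, hashlib.sha256).hexdigest()
    sig = json.dumps({"signature": mac}).encode() + b"\n"  # hypothetical field name
    return RS + first + RS + sig

def verify_stream(stream, key):
    """Authenticate object 1 byte-for-byte before parsing it."""
    records = stream.split(RS)[1:]  # the stream starts with RS, so drop the empty head
    if len(records) < 2:
        raise ValueError("need a report and at least one signature")
    first, rest = records[0], records[1:]
    # Parse only objects 2..n; object 1 stays as raw bytes for now.
    signatures = [json.loads(r)["signature"] for r in rest]
    expected = hmac.new(key, first, hashlib.sha256).hexdigest()
    if not any(hmac.compare_digest(expected, s) for s in signatures):
        raise ValueError("no valid signature")
    return json.loads(first)
```

Because the signature covers raw bytes, rewriting 1.50 as 1.5 in the report
invalidates it even though both parse to the same number. That is the
floating point problem a normalization-based scheme would have to solve.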
Luca Boccassi [Fri, 29 May 2026 10:23:23 +0000 (11:23 +0100)]
udev: run workers in sibling cgroup and use cgroup.kill
Since a1f4fd387603673a79a84ca4e5ce25b439b85fe6 udev processes
run in a 'udev' subcgroup, to avoid killing control processes
when clearing workers. But the main process is still in the same
cgroup, so the atomic cgroup.kill cannot be used.
If the main subcgroup exists, try to create a sibling 'workers'
cgroup, use it for worker processes, and use cgroup.kill if
available.
This is especially useful as rules can spawn arbitrary programs/scripts
and TasksMax= is set to unlimited.
Luca Boccassi [Fri, 19 Jun 2026 18:35:16 +0000 (19:35 +0100)]
core: add LUOSession= unit setting (#42530)
Acquiring a LUO session from /dev/liveupdate requires privileges, and
also the device is a single-owner driver so only a single process can
open it at any given time.
Add a LUOSession= service setting that allows units running without
privileges to get a session assigned to them.
The kernel imposes a 64-character limit on session names, which is too short
to avoid clashes, so derive a hash from joining the unit name with the
parameter name; that way two units using the same setting don't clash.
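A rough sketch of deriving a bounded session name from the unit name and the
setting value. The hash function, the "/" separator and the truncation to 48
hex characters are assumptions of this sketch; the commit message does not say
which hash or encoding is actually used:

```python
import hashlib

LUO_NAME_MAX = 64  # limit quoted in the commit message

def luo_session_name(unit, param):
    """Derive a fixed-length session name from the unit name and the setting value."""
    joined = f"{unit}/{param}".encode()  # separator is an assumption of this sketch
    # 48 hex characters keeps the name comfortably under the 64-character limit.
    return hashlib.sha256(joined).hexdigest()[:48]
```

Two units that both configure the same value get different session names, and
the result has the same length however long the unit name is.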
Revert "units: drop After=network-online.target from imds services"
IMDS access requires networking, hence we need to run after
network-online.target. Everything else would be racy and result in
likely timeouts, because we might try to contact the network too early.
Fix handling of `UKI/SBAT` and `UKI/Firmware` config entries. (#42478)
This fixes 2 issues with `ukify`:
`config_list_prepend` needs a list. The two more commonly used entries are
specially treated (parsed into a list) and therefore happen to work, but
trying to set `UKI/Firmware` throws a type error, as it is passed as
just a string.
The `UKI/SBAT` config entry is a list but used the default `config_push`
value, which sets it only if unset (and which also failed, since its default
of `[]` is not `None`). Change this to `config_list_prepend`.
I did consider adding a `config_list_append`, but as far as I'm aware,
the order of SBAT entries should not change the outcome anyway.
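To illustrate the two failure modes, here is a simplified stand-in for a
list-prepending config handler. It is not the ukify implementation; the
signature and the string-to-list normalisation are assumptions made for this
sketch:

```python
import types

def config_list_prepend(namespace, name, value):
    """Prepend config-file values to a list-typed option (sketch)."""
    # A single config entry may arrive as a bare string; normalise to a list.
    if isinstance(value, str):
        value = [value]
    # Treat both None and [] as "nothing set yet".
    old = getattr(namespace, name, None) or []
    setattr(namespace, name, list(value) + list(old))
```

Normalising a bare string avoids the type error, and treating an empty list
like an unset value avoids the "default is `[]`, not `None`" trap.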
Luca Boccassi [Fri, 19 Jun 2026 11:00:37 +0000 (12:00 +0100)]
crypto-util: support OpenSSL 4
OpenSSL 4 broke ABI, so we need to look for both SONAMEs.
Try libcrypto.so.3 first, and fall back to libcrypto.so.4,
so that the older and more stable version is used if both
are installed, giving distros time to fix regressions.
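The fallback order can be sketched with Python's ctypes. The helper name and
the injectable loader parameter are inventions of this sketch (the real code
uses systemd's own dlopen helpers in C):

```python
import ctypes

SONAMES = ("libcrypto.so.3", "libcrypto.so.4")  # older, more stable ABI first

def dlopen_first(sonames=SONAMES, loader=ctypes.CDLL):
    """Return (soname, handle) for the first library that loads, else raise OSError."""
    last_error = None
    for soname in sonames:
        try:
            return soname, loader(soname)
        except OSError as e:
            last_error = e
    raise OSError(f"none of {sonames} could be loaded") from last_error
```

Listing the older SONAME first is what makes it win when both are installed.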
jmestwa-coder [Sat, 13 Jun 2026 17:58:08 +0000 (23:28 +0530)]
catalog: bound item offsets against the mmap in the binary reader
The binary catalog reader trusted two values straight from a (possibly
hostile) database: open_mmap() summed header_size + n_items *
catalog_item_size in uint64 with no overflow check, and find_id() added
the matched item's offset to the map base with no upper bound. Reachable
through sd_journal_get_catalog() with $SYSTEMD_CATALOG set, this let
catalog_get()/catalog_list() strdup() a string starting outside the
mapping. Guard the size math with MUL_SAFE/INC_SAFE and reject item
offsets that fall outside the file.
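A Python model of the two checks. Python integers do not overflow, so the
uint64 limit is enforced explicitly. mul_safe() and inc_safe() only mimic the
intent of MUL_SAFE/INC_SAFE; their return convention, the function names and
the parameters are assumptions of this sketch:

```python
UINT64_MAX = (1 << 64) - 1

def mul_safe(a, b):
    """Return a*b, or None if it would not fit in uint64 (models MUL_SAFE)."""
    r = a * b
    return r if r <= UINT64_MAX else None

def inc_safe(a, b):
    """Return a+b, or None on uint64 overflow (models INC_SAFE)."""
    r = a + b
    return r if r <= UINT64_MAX else None

def catalog_layout_ok(file_size, header_size, n_items, item_size):
    """Check that the header plus the item table fits inside the mapping."""
    table = mul_safe(n_items, item_size)
    if table is None:
        return False
    end = inc_safe(header_size, table)
    return end is not None and end <= file_size

def item_offset_ok(file_size, payload_start, offset):
    """Reject item offsets that point outside the file."""
    pos = inc_safe(payload_start, offset)
    return pos is not None and pos < file_size
```

Without the overflow check, a huge n_items could wrap the sum to a small value
that then passes the size comparison.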
Will Fancher [Fri, 15 May 2026 01:09:54 +0000 (21:09 -0400)]
cryptsetup: Invalidate ineffective try_discover_key before trying
A single incorrect password attempt from the user would be tried
twice, because the first failed try would think it was the
try_discover_key method that failed, and would invalidate that instead
of the actual failed password. This resulted in the user being
prompted one fewer time than they should be.
Luca Boccassi [Thu, 18 Jun 2026 15:46:32 +0000 (16:46 +0100)]
journalctl: dlopen gcrypt in the --setup-keys path
journalctl needs gcrypt to set up the journal sealing
keys and other such operations, but carries no dlopen note
for the dependency. Add the dlopen macro with a recommends
level so that it can be skipped, but is pulled in by default
on user/desktop systems.
Yu Watanabe [Wed, 17 Jun 2026 03:25:14 +0000 (12:25 +0900)]
meson: add -mstackrealign on i386 with musl and libucontext
On i386 with musl and libucontext, many tests crash with SIGSEGV
when using fibers implemented on top of ucontext APIs.
For example:
```
929/1498 libsystemd - systemd:test-event-future FAIL 0.97s killed by signal 11 SIGSEGV
/* test_sd_event_run_timer */
run-suspend-timer: Scheduling fiber
/home/pmos/build/src/systemd-261-rc3/tools/test-crash-trace.sh: line 18: 19994 Segmentation fault (core dumped) "$@"
===== exit 139 — replaying under gdb =====
/* test_sd_event_run_timer */
run-suspend-timer: Scheduling fiber
Program received signal SIGSEGV, Segmentation fault.
0xf7d68be3 in sd_event_add_time_relative (e=0xf7ffd3f0, ret=0xf7b01fa0, clock=1, usec=10000, accuracy=0, callback=0x565564ad <inner_timer_handler>, userdata=0xf7b01fa4) at ../src/libsystemd/sd-event/sd-event.c:1455
1455 void *userdata) {
Thread 1 (process 20485 "test-event-futu"):
#0 0xf7d68be3 in sd_event_add_time_relative (e=0xf7ffd3f0, ret=0xf7b01fa0, clock=1, usec=10000, accuracy=0, callback=0x565564ad <inner_timer_handler>, userdata=0xf7b01fa4) at ../src/libsystemd/sd-event/sd-event.c:1455
t = 17868423377094431728
r = <optimized out>
#1 0x565574c7 in sd_event_run_timer_fiber (userdata=0x0) at ../src/libsystemd/sd-event/test-event-future.c:329
inner = 0xf7ffd3f0
source = 0x0
counter = 0
r = <optimized out>
#2 0xf7dab308 in fiber_entry_point () at ../src/libsystemd/sd-future/fiber.c:208
_cleanup_log_unset_prefix_5 = 0x0
__unique_prefix_c6 = 0x5655bb30
f = 0xf7b06840
__func__ = "fiber_entry_point"
fake_stack_save = <optimized out>
#3 0xf7b032ae in setcontext () from /lib/libucontext.so.1
No symbol table info available.
#4 0x00000000 in ?? ()
No symbol table info available.
```
Building systemd with -mstackrealign makes all observed failures
disappear, suggesting a stack alignment issue in the interaction
between compiler-generated code and the ucontext implementation.
This resembles historical stack alignment issues in glibc's i386
makecontext() implementation, which required fixes to preserve the
expected stack alignment.
As a workaround, add -mstackrealign when building on i386 with musl
and libucontext.
Luca Boccassi [Wed, 17 Jun 2026 10:29:26 +0000 (11:29 +0100)]
mkosi: clean up generated rpm pre scripts in suse builds
2026-06-17T10:11:08.3789573Z Untracked files:
2026-06-17T10:11:08.3790064Z (use "git add <file>..." to include in what will be committed)
2026-06-17T10:11:08.3790566Z systemd-network.pre
2026-06-17T10:11:08.3790908Z systemd-resolve.pre
Daan De Meyer [Wed, 17 Jun 2026 08:10:12 +0000 (08:10 +0000)]
docs: Update AI usage policy
The previous policy was primarily written from the standpoint
that AI models are not very good and we didn't want to waste any
time reviewing PRs generated by AI. Now that AI models have become
actually good and their output is just as good as regular contributions,
let's stop requiring the disclosure, as it's pointless to still have it:
it doesn't really matter anymore whether a patch was written with or without
AI. It's up to the author to make sure they're not wasting our time by
submitting unreviewed, untested code upstream, regardless of whether that
code is written by an AI or not.
The new policy is inspired by https://github.com/lxc/incus/pull/3506, with
various removals to be less averse to the usage of AI.
Paul Meyer [Sat, 13 Jun 2026 10:07:45 +0000 (12:07 +0200)]
vmspawn: don't abort VM launch when an optional swtpm fails to start
When start_tpm() failed under TPM autodetect (arg_tpm <= 0), the swtpm
block logged "Failed to start tpm, ignoring" but then fell through to
event_add_child_pidref() with the still-unset child PidRef. That returns
-ESRCH (PIDREF_NULL fails its pidref_is_set() guard), and the VM launch
fails. Register the child only on start_tpm() success.
Co-developed-by: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: Paul Meyer <katexochen0@gmail.com>
Paul Meyer [Sat, 13 Jun 2026 09:38:02 +0000 (11:38 +0200)]
vmspawn: fix QEMU target names for ppc64/mips64el/mipsel hosts
architecture_to_qemu_table mapped four 64-bit/LE architectures to wrong
qemu-system-<target> names: MIPS64_LE and MIPS_LE both to "mips" (the BE
MIPS32 target), and PPC64/PPC64_LE to "ppc" (32-bit PowerPC). The string
feeds both find_qemu_binary() ("qemu-system-" + name) and the
firmware.json architecture match, so on those hosts vmspawn looked up a
wrong or nonexistent binary and never matched a firmware descriptor. Map
them to the real targets: mips64el, mipsel, and ppc64 (qemu-system-ppc64
runs LE pSeries guests; there is no qemu-system-ppc64le).
Co-developed-by: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: Paul Meyer <katexochen0@gmail.com>
Paul Meyer [Sat, 13 Jun 2026 09:24:13 +0000 (11:24 +0200)]
vmspawn: reject oversized fstab.extra credential before int-cast merge
The fstab.extra merge prepends the existing credential via
asprintf("%.*s", (int) existing->size, …). MachineCredential.size is
size_t, so for a credential >INT_MAX the (int) cast yields a negative
precision, which C treats as omitted — turning %.*s into an unbounded
read past the allocation. Reject such a credential up front with EFBIG;
for all realistic sizes the merge is unchanged.
Co-developed-by: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: Paul Meyer <katexochen0@gmail.com>
Paul Meyer [Sat, 13 Jun 2026 08:57:46 +0000 (10:57 +0200)]
vmspawn: return negative errno from qemu_config_key()/_section_impl()
errno_or_else() already returns a negative errno, so "return
-errno_or_else(EIO)" returned a positive value on fprintf()/ferror()
failure. Callers checking "r < 0" therefore missed the write error and
treated it as success. Drop the spurious negation; these were the only
two "-errno_or_else" uses in the tree.
Co-developed-by: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: Paul Meyer <katexochen0@gmail.com>
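The sign error fixed in qemu_config_key()/_section_impl() is easy to reproduce
in a small Python model. errno is passed in explicitly here, and the two
write_key_* helpers are hypothetical names used only for illustration:

```python
import errno

def errno_or_else(current_errno, fallback):
    """Model of errno_or_else(): always returns a negative errno-style value."""
    return -abs(current_errno) if current_errno > 0 else -abs(fallback)

def write_key_buggy(current_errno):
    return -errno_or_else(current_errno, errno.EIO)  # double negation: positive

def write_key_fixed(current_errno):
    return errno_or_else(current_errno, errno.EIO)   # negative, as callers expect
```

A caller that tests "r < 0" sees the buggy variant's positive result as
success, which is how the write error went unnoticed.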
Paul Meyer [Sat, 13 Jun 2026 08:37:20 +0000 (10:37 +0200)]
vmspawn: null freed fields and drain subscribers before bridge teardown
vmspawn_varlink_context_free() discarded the sd_varlink_server_unref()
and vmspawn_qmp_bridge_free() return values, leaving ctx->varlink_server
and ctx->bridge dangling. No current handler reads those fields, but use
the assign-back idiom so the fields are NULL during any synchronous
callback regardless of future changes.
Also drain subscribers before freeing the bridge, so subscriber teardown
can't run against a half-freed bridge.
Co-developed-by: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: Paul Meyer <katexochen0@gmail.com>
Paul Meyer [Sat, 13 Jun 2026 08:01:27 +0000 (10:01 +0200)]
vmspawn: reject --initrd= without --linux= for directory boot
determine_kernel() ran boot-entry discovery under a !arg_linux guard
only. With --initrd= but no --linux=, discovery overwrote arg_initrds
with the discovered entry's own initrds (leaking the user's strv) and
set arg_linux, so verify_arguments()'s "--initrd= needs --linux=" check,
which runs afterwards, never fired: the user's --initrd= was
silently discarded. Reject the combination before discovery. This also
keeps the discovery store leak-free.
Co-developed-by: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: Paul Meyer <katexochen0@gmail.com>
Paul Meyer [Sat, 13 Jun 2026 07:58:53 +0000 (09:58 +0200)]
vmspawn: don't leak prior arg_directory in ephemeral branch
The ephemeral-snapshot branch did a bare arg_directory = strdup(...),
orphaning the -D/--directory allocation from parse_argv.
Use free_and_strdup(), matching the idiom already used elsewhere in
the file.
Co-developed-by: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: Paul Meyer <katexochen0@gmail.com>
Paul Meyer [Sat, 13 Jun 2026 07:25:28 +0000 (09:25 +0200)]
vmspawn: enforce an identifier grammar for QEMU config types/ids/keys
qemu_config_section()/qemu_config_key() write section types and keys
unquoted ([type], key = …), but the type check only rejected '\n' while
the structurally identical id/key checks rejected more, even though the
unquoted type slot is the most sensitive. Replace the three ad-hoc
denylists with one allowlist matching QEMU's identifier grammar
([A-Za-z0-9._-], non-empty). This is merely hardening; all callers pass
constant input today.
For the value check use string_is_safe() for now, which is a bit
stricter than QEMU's own config validation.
Add test-vmspawn-qemu-config covering both.
Co-developed-by: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: Paul Meyer <katexochen0@gmail.com>
Luca Boccassi [Tue, 16 Jun 2026 21:40:20 +0000 (22:40 +0100)]
sd-dlopen: fix build on 'hppa'
On hppa '.equ' is overridden, so even this workaround ('.set' is
overridden on alpha) causes a build failure:
cc -Isrc/basic/libbasic.a.p -Isrc/basic -I../src/basic -Isrc/fundamental -I../src/fundamental -Isrc/systemd -I../src/systemd -Isrc/version -I../src/version -fdiagnostics-color=always -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wextra -std=gnu17 -O0 -g -Wno-missing-field-initializers -Wno-unused-parameter -Wno-nonnull-compare -Warray-bounds -Warray-bounds=2 -Wdate-time -Wendif-labels -Werror=bool-compare -Werror=discarded-qualifiers -Werror=flex-array-member-not-at-end -Werror=format=2 -Werror=format-signedness -Werror=implicit-function-declaration -Werror=implicit-int -Werror=incompatible-pointer-types -Werror=int-conversion -Werror=missing-declarations -Werror=missing-parameter-name -Werror=missing-prototypes -Werror=overflow -Werror=override-init -Werror=pointer-sign -Werror=return-type -Werror=sequence-point -Werror=shift-count-overflow -Werror=shift-overflow=2 -Werror=strict-flex-arrays -Werror=undef -Wfloat-equal -Wimplicit-fallthrough=5 -Winit-self -Wlogical-op -Wmissing-include-dirs -Wmissing-noreturn -Wnested-externs -Wold-style-definition -Wpointer-arith -Wredundant-decls -Wshadow -Wstrict-aliasing=2 -Wstrict-prototypes -Wsuggest-attribute=noreturn -Wunterminated-string-initialization -Wunused-function -Wwrite-strings -Wzero-as-null-pointer-constant -Wzero-length-bounds -fdiagnostics-show-option -fexcess-precision=standard -fno-common -fstack-protector -fstack-protector-strong -fstrict-flex-arrays=3 -fno-math-errno --param=ssp-buffer-size=4 -Wno-unused-result -Werror=shadow -fPIC -fno-strict-aliasing -fstrict-flex-arrays=1 -fvisibility=hidden -fno-omit-frame-pointer -include config.h -isystem../src/include/glibc -isystem../src/include/override -isystemsrc/include/override -isystem../src/include/uapi -fvisibility=default -MD -MQ src/basic/libbasic.a.p/compress.c.o -MF src/basic/libbasic.a.p/compress.c.o.d -o src/basic/libbasic.a.p/compress.c.o -c ../src/basic/compress.c
/tmp/ccxm7Waj.s: Assembler messages:
/tmp/ccxm7Waj.s:2085: Error: bad or irreducible absolute expression; zero assumed
/tmp/ccxm7Waj.s:2085: Error: junk at end of line, first unrecognized character is `,'
/tmp/ccxm7Waj.s:2268: Error: bad or irreducible absolute expression; zero assumed
/tmp/ccxm7Waj.s:2268: Error: junk at end of line, first unrecognized character is `,'
/tmp/ccxm7Waj.s:2544: Error: bad or irreducible absolute expression; zero assumed
/tmp/ccxm7Waj.s:2544: Error: junk at end of line, first unrecognized character is `,'
/tmp/ccxm7Waj.s:2800: Error: bad or irreducible absolute expression; zero assumed
/tmp/ccxm7Waj.s:2800: Error: junk at end of line, first unrecognized character is `,'
/tmp/ccxm7Waj.s:2956: Error: bad or irreducible absolute expression; zero assumed
/tmp/ccxm7Waj.s:2956: Error: junk at end of line, first unrecognized character is `,'
'.equiv' works on all architectures, but breaks on CentOS 9 due to binutils
2.35. Use an ifdef. It can be dropped in favor of plain '.equiv' once
binutils 2.36 is the baseline.
dongshengyuan [Wed, 17 Jun 2026 03:00:49 +0000 (11:00 +0800)]
sysext,sysusers: fix wrong error variable in two error paths
sysext: utimensat() failure was logged with stale r (which is 0 after
the preceding successful write_backing_file call). Pass errno instead
so the actual failure reason is recorded and returned.
sysusers: rename() failure in make_backup() returned the raw positive
errno value. All callers check 'if (r < 0)', so the error was silently
ignored, allowing execution to continue after a failed backup. Return
-errno instead.
Signed-off-by: dongshengyuan <dongshengyuan@uniontech.com>
Co-developed-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* 462bd9f5ea Update systemd to version 260.2 / rev 469 via SR 1356344
* 28967f9151 Update systemd to version 260.1 / rev 468 via SR 1353801
* 086bdf7ca5 Update systemd to version 260.1 / rev 467 via SR 1348897
* 8e7d3d3067 Update systemd to version 259.5 / rev 466 via SR 1338788
* 069ac9826b Update systemd to version 259.3 / rev 465 via SR 1336527
* 7ed02aefd6 Update systemd to version 258.5 / rev 464 via SR 1335466
* 811b7f2076 Update systemd to version 258.4 / rev 463 via SR 1332808
* 45a28d7f95 Update systemd to version 258.3 / rev 462 via SR 1329291
* 37342ddc36 Update systemd to version 257.9 / rev 461 via SR 1324470
* 7eafa80da7 Update systemd to version 258.3 / rev 460 via SR 1323386
* 29c9ee6b49 Update systemd to version 257.9 / rev 459 via SR 1321158
* 39613f8d2e Update systemd to version 258.2 / rev 458 via SR 1320482
* c235f1dcf5 Update systemd to version 257.9 / rev 457 via SR 1305565
dongshengyuan [Thu, 11 Jun 2026 07:14:49 +0000 (15:14 +0800)]
core: fix unit_merge() load state check evaluating after state overwrite
The condition on line 1206 checks other->load_state != UNIT_STUB to
decide whether to call the vtable done() callback, but the state was
already overwritten to UNIT_MERGED on line 1198, making the condition
always true.
Save the original load_state before overwriting it, so that units in
UNIT_STUB state (which never went through a load attempt) correctly
skip the done() call.
Signed-off-by: dongshengyuan <dongshengyuan@uniontech.com>
Co-developed-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Daan De Meyer [Wed, 10 Jun 2026 10:06:14 +0000 (10:06 +0000)]
nsresourced: reclaim ranges from dead namespaces during allocation
The only runtime trigger for registry cleanup is the BPF kprobe that fires
on user namespace destruction; when it is missed (ring buffer overflow,
kprobe missing, fdstore entry dropped), the dead namespace's registry entry
survives and keeps its UID/GID ranges blocked until the manager restarts and
its startup sweep runs. The allocation hot path checked whether a candidate
range was already taken but never whether the namespace holding it was still
alive, so a single dead namespace could permanently starve an allocation.
This is most visible when a parent delegates its entire container UID window
to a child that then dies: every subsequent allocation from the parent fails
with NoDynamicRange even though the ranges are reclaimable.
Add userns_registry_reap_if_dead(), which probes a registered namespace's
liveness via the kernel namespace identifier recorded at allocation time and,
if it is authoritatively dead, releases its registry entry — restoring any
ranges it received via delegation to their ancestors. Call it from the
allocation availability check for both transient registrations and delegated
ranges, walking a chain of dead ancestors in the delegation case. This
mirrors the existing inode-slot stale cleanup and makes allocation
self-healing without waiting for a restart.
The startup sweep grew the same load-probe-release logic, so route it through
the new helper too; its errno return distinguishes alive, no-recorded-id, and
unprobeable-environment cases so the sweep keeps its early-out when lookup by
id isn't possible at all.
Co-developed-by: Claude Opus 4.8 <noreply@anthropic.com>
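The core of the availability check can be sketched as below. This Python model
ignores delegation chains and the distinct errno cases; the registry shape (a
dict from range start to namespace id) and the is_alive callback are
assumptions of this sketch:

```python
def range_is_available(registry, candidate, is_alive):
    """Return True if candidate is free, reaping a dead owner if necessary.

    registry maps a UID range start to the namespace id that holds it;
    is_alive is a callable that probes a namespace id.
    """
    owner = registry.get(candidate)
    if owner is None:
        return True
    if is_alive(owner):
        return False
    # The owner is gone but its entry survived (e.g. a missed kprobe event):
    # release the entry so the range becomes allocatable again.
    del registry[candidate]
    return True
```

Probing liveness only when a candidate collides keeps the common path cheap
while still making allocation self-healing.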
Luca Boccassi [Mon, 15 Jun 2026 20:33:08 +0000 (21:33 +0100)]
core: add version and structure to LUO json payload
We might want to add more state to the LUO session json payload,
so add a version (to allow clean compat breaks if needed) and nest
the current fdstore contents under a 'units' object, so that more
top-level data can be added in the future without breaking
backward compatibility.
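One possible shape for such a payload, sketched in Python. Only the 'units'
key comes from the commit message; the "version" key name, the version number
and the per-unit contents are hypothetical:

```python
import json

PAYLOAD_VERSION = 1  # hypothetical value

def build_payload(fdstore_by_unit):
    """Wrap per-unit fdstore data in a versioned top-level object."""
    return json.dumps({"version": PAYLOAD_VERSION, "units": fdstore_by_unit})

def parse_payload(text):
    """Return the 'units' object, refusing versions this sketch doesn't know."""
    obj = json.loads(text)
    if obj.get("version") != PAYLOAD_VERSION:
        raise ValueError("unsupported payload version")
    return obj.get("units", {})
```

Nesting the existing data under one key leaves the top level free for new
fields, and the version gives a clean way to reject incompatible payloads.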
dongshengyuan [Tue, 16 Jun 2026 06:44:15 +0000 (14:44 +0800)]
misc: fix minor error handling issues
fstab-generator: pass k instead of r to bus_error_message() so the
fallback error string reflects the actual bus call failure, not the
accumulated result that was reset to 0 earlier.
networkd-ndisc: return -ENOMEM when newdup() fails, since r is 0 at
that point and the OOM would otherwise be reported as success.
storagetm: add missing NULL check after strndup() for attr_model,
matching the pattern already used for attr_firmware and attr_serial.
Signed-off-by: dongshengyuan <dongshengyuan@uniontech.com>
Co-developed-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Luca Boccassi [Mon, 15 Jun 2026 20:34:56 +0000 (21:34 +0100)]
core: only attempt to deserialize state from LUO on boot
Avoid trying to query for our LUO session on reexec/softreboot/reload/etc.
Currently /dev/liveupdate is only accessible to root so it's not a big
issue, but this might change in the future, so make sure nobody can
play games with us.
Luca Boccassi [Thu, 4 Jun 2026 19:20:51 +0000 (20:20 +0100)]
obs: prepare ParticleOS images in workflow
Link ParticleOS images in the workflow subproject for the PR,
so that they can be enabled with a click when needed.
But keep disabled by default, as they take a lot of resources,
especially disk space.
dongshengyuan [Tue, 16 Jun 2026 01:07:25 +0000 (09:07 +0800)]
gpt-auto-generator: fix error propagation in add_root_mount()
When generator_write_initrd_root_device_deps() fails, the error was
swallowed by returning 0 (success) instead of r. The two subsequent
calls in the same block correctly return r on failure.
Signed-off-by: dongshengyuan <dongshengyuan@uniontech.com>
Co-developed-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
dongshengyuan [Tue, 16 Jun 2026 02:38:17 +0000 (10:38 +0800)]
mount: log control command before clearing it in mount_sigchld_event()
control_command and control_command_id were cleared before being passed
to unit_log_process_exit(), so the log always showed an invalid/unknown
command name.
Move both clears after the log call, matching the ordering in
socket_sigchld_event() and service_sigchld_event().
Signed-off-by: dongshengyuan <dongshengyuan@uniontech.com>
Co-developed-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Daan De Meyer [Mon, 15 Jun 2026 09:06:42 +0000 (09:06 +0000)]
loop-util: shortcut block device fd when it carries no partition table
663f0bf5cb stopped reusing the original block device fd whenever
partition scanning was requested (LO_FLAGS_PARTSCAN) but couldn't be
enabled on the device, so that nested partition tables on devices the
kernel won't scan (e.g. the pmOS/android case) get exposed via a real
loop device.
However that also forced a pointless loop device for any partition that
carries a file system directly, e.g. a btrfs subvolume mounted via
MountImages=. For multi-device btrfs this is fatal: the kernel rejects
seeing the same member via both the original partition and the loop
device, and the mount fails.
A loop device is only ever needed here to expose a nested partition
table. So only refuse the shortcut when the device actually carries one,
probed via gpt_probe(), instead of whenever partition scanning is
disabled. Devices carrying a file system directly (or nothing) take the
shortcut as before.
Add an integration test to cover the failure scenario of the original
issue.
Fixes: https://github.com/systemd/systemd/issues/42520
Replaces: https://github.com/systemd/systemd/pull/42576
Follow-up for 663f0bf5cb79ecaf6dd71441ecdc9dc401e7eae6
Co-Authored-By: Luca Boccassi <luca.boccassi@gmail.com>
Co-developed-by: Claude Opus 4.8 <noreply@anthropic.com>
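The decision described in the loop-util commit reduces to a small predicate.
The Python sketch below is a simplification with made-up parameter names. The
real code probes the device with gpt_probe() and may consider conditions not
mentioned in the commit message:

```python
def can_reuse_block_fd(partscan_requested, partscan_enabled, has_partition_table):
    """Decide whether the original block device fd can be used directly."""
    if not partscan_requested or partscan_enabled:
        return True
    # Scanning was requested but the kernel won't do it: a loop device is
    # only needed if there is a nested partition table to expose.
    return not has_partition_table
```

A partition carrying a btrfs member directly has no partition table, so it
takes the shortcut and the kernel never sees the same member twice.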
Luca Boccassi [Mon, 15 Jun 2026 21:05:18 +0000 (22:05 +0100)]
report: place Upload() on io.systemd.Report.Uploader rather than io.systemd.Report interface (#42584)
We really want to use io.systemd.Report for the interface provided by
systemd-report itself, not by its backend. Hence, rename the interface
that uploading plugins shall implement to io.systemd.Report.Uploader.
Note that we ideally should have a varlink interface definition for that
interface. If we had one, we would have noticed this earlier.
Let's name the dir "/run/systemd/report.upload/" (rather than
"/run/systemd/metrics-upload/"). After all, these are reports that we
upload, not individual metrics. And it would be particularly confusing
since the dir to pick up metrics is called /run/systemd/report/, rather
than /run/systemd/metrics/. Hence the thing that deals with reports is
named metrics, and the thing that deals in metrics is named reports...
Daan De Meyer [Mon, 15 Jun 2026 07:55:22 +0000 (07:55 +0000)]
udev: only trigger the boot-disk loop device for optical drives
probe_gpt_boot_disk_needs_loop() sets ID_PART_GPT_AUTO_ROOT_DISK_NEEDS_LOOP
for any whole disk that holds the boot ESP/XBOOTLDR but whose partition table
the kernel cannot parse. Until now the udev rule turned that into a
systemd-loop@.service for every block device.
That is too broad: device-mapper devices also report kernel partition
scanning as disabled, but their partitions are managed in userspace by kpartx
(see 66-kpartx.rules). Setting up a loop device on top of them re-exposes the
same partition table a second time and only causes trouble.
Restrict the rule to optical drives, the one class that genuinely needs a
kernel-side loop device (El Torito GPT sector size mismatch, or drives that do
not support partition scanning) and that has no userspace partition manager of
its own.
Co-developed-by: Claude Fable 5 <noreply@anthropic.com>
Daan De Meyer [Mon, 15 Jun 2026 07:45:46 +0000 (07:45 +0000)]
udev-builtin-blkid: keep probing the boot disk when it needs a loop device
Since 4e0eabd40118 ("udev: also trigger loop device for boot disk when
partition scanning is unsupported"), builtin_blkid() bails out entirely as
soon as probe_gpt_boot_disk_needs_loop() reports that a loop device is
needed, skipping all superblock probing. As a result whole-disk properties
such as ID_PART_TABLE_UUID and ID_FS_* are no longer set.
This regresses any whole disk whose partitions the kernel cannot expose
itself but which is otherwise perfectly probeable, most notably
device-mapper multipath disks: kernel partition scanning is disabled on them
(their partitions are managed in userspace by kpartx), so they are now
flagged as needing a loop device and lose their ID_PART_TABLE_UUID.
The early return was never necessary. The original intent was only to skip
root partition discovery on the device, and that already happens on the loop
device instead: find_gpt_root() bails when the kernel can't scan partitions,
blkid probes at the device's own logical sector size so a GPT written for a
different sector size is simply not detected, and PART_ENTRY_* is only
emitted for partitions the kernel actually registered, of which a
loop-needing whole disk has none. So keep probing the device for its
whole-disk properties unconditionally and let partition and root discovery
happen on the loop device.
Co-developed-by: Claude Fable 5 <noreply@anthropic.com>
dongshengyuan [Mon, 15 Jun 2026 08:28:02 +0000 (16:28 +0800)]
portable: fix double-free in normalize_portable_changes()
Now that the fast path performs a deep copy identical to the general
loop (when n_changes_attached==0, found stays false for all entries),
the block is redundant. Remove it and let the general loop handle this
case.
Signed-off-by: dongshengyuan <dongshengyuan@uniontech.com>
Co-developed-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
dongshengyuan [Mon, 15 Jun 2026 07:43:51 +0000 (15:43 +0800)]
random-seed: fix wrong error variable in log_error_errno()
At line 285, ftruncate() failure was logged using 'r' which is 0
from the preceding successful loop_write() call. log_error_errno(0, ...)
triggers an assertion crash in developer builds (ASSERT_NON_ZERO) and
silently returns success in release builds, swallowing the ftruncate error.
Replace with errno which is set by ftruncate() on failure.
Signed-off-by: dongshengyuan <dongshengyuan@uniontech.com>
Co-developed-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Yu Watanabe [Mon, 15 Jun 2026 04:03:00 +0000 (13:03 +0900)]
musl: fix build on 32-bit architecture
```
../src/boot/test-efi-string.c: In function 'test_xvasprintf_status':
../src/boot/test-efi-string.c:744:34: error: format '%zi' expects argument of type 'signed size_t', but argument 4 has type 'long int' [-Werror=format=]
744 | test_printf_one("%i %i %zi", INT_MIN, INT_MAX, SSIZE_MAX);
| ~~^
| |
| int
| %li
cc1: some warnings being treated as errors
ninja: subcommand failed
```