Frantisek Sumsal [Mon, 10 Nov 2025 18:26:43 +0000 (19:26 +0100)]
test: ignore EC from the second `systemctl status -a` as well
There is a TOCTOU in the `systemctl status` where a unit might change
its state during the initial ListUnitsByPatterns call and the subsequent
individual GetAll calls, which then makes the systemctl call fail even
if the unit that was originally pulled in was active/running:
Frantisek Sumsal [Mon, 10 Nov 2025 16:42:06 +0000 (17:42 +0100)]
test: don't register short-living containers with machined
Registering the container creates a scope which might not be cleaned
up completely before we run the next command in the same container,
causing intermittent test failures:
[ 63.424739] TEST-13-NSPAWN.sh[4231]: + systemd-nspawn --directory=/var/lib/machines/TEST-13-NSPAWN.sanity.zH2 bash -xec '[[ $USER == root ]]'
[ 63.427504] systemd-nspawn[4381]: ░ Spawning container TEST-13-NSPAWN.sanity.zH2 on /var/lib/machines/TEST-13-NSPAWN.sanity.zH2.
[ 63.437154] systemd[1]: Started TEST-13-NSPAWN.sanity.zH2.scope - Container TEST-13-NSPAWN.sanity.zH2.
[ 63.437765] systemd-machined[1164]: New machine TEST-13-NSPAWN.sanity.zH2.
[ 63.440311] TEST-13-NSPAWN.sh[4381]: + [[ root == root ]]
[ 63.442046] systemd[1]: TEST-13-NSPAWN.sanity.zH2.scope: Killed unit cgroup '/machine.slice/TEST-13-NSPAWN.sanity.zH2.scope' with SIGKILL on client request.
[ 63.442583] systemd-nspawn[4381]: Container TEST-13-NSPAWN.sanity.zH2 exited successfully.
[ 63.443073] systemd-machined[1164]: Machine TEST-13-NSPAWN.sanity.zH2 terminated.
[ 63.448728] TEST-13-NSPAWN.sh[4231]: + systemd-nspawn --directory=/var/lib/machines/TEST-13-NSPAWN.sanity.zH2 --user=testuser bash -xec '[[ $USER == testuser ]]'
[ 63.451209] systemd-nspawn[4385]: ░ Spawning container TEST-13-NSPAWN.sanity.zH2 on /var/lib/machines/TEST-13-NSPAWN.sanity.zH2.
[ 63.455295] systemd-nspawn[4385]: Failed to allocate scope: Unit TEST-13-NSPAWN.sanity.zH2.scope was already loaded or has a fragment file.
[ 63.456139] systemd[1]: TEST-13-NSPAWN.sanity.zH2.scope: Deactivated successfully.
[ 63.461292] TEST-13-NSPAWN.sh[2839]: + at_exit
Since even systemd-nspawn's man page suggests not to register containers
with systemd-machined if they don't run a service manager, let's do just
that to mitigate the race.
Gero Schwäricke [Fri, 7 Nov 2025 15:09:17 +0000 (16:09 +0100)]
rules: add rule to generate unique symlinks for gpio devices
Regular generated paths make it hard to identify individual GPIO
devices. This is a challenge when using multiple USB-to-GPIO adapters
like Diolan DLN2.
The unique symlinks from this rule can be used, e.g., with gpiod tools.
Yu Watanabe [Mon, 10 Nov 2025 10:01:32 +0000 (19:01 +0900)]
test: avoid service name collision
The same service name was accidentally used for two invocations:
```
[ 1801.197993] H TEST-04-JOURNAL.sh[20563]: + assert_rc 0 journalctl -q -D /run/log/journal/e30adae55e664d328af442bf5df694c8/ -u test-23833.service --grep service=test-23833.service
[ 1801.198527] H TEST-04-JOURNAL.sh[20685]: + set +ex
[ 1801.222676] H TEST-04-JOURNAL.sh[20686]: Nov 10 03:18:51 H systemd[1]: test-23833.service: About to execute: /usr/bin/bash -c "echo service=test-23833.service invocation=\$INVOCATION_ID; journalctl --sync"
[ 1801.222676] H TEST-04-JOURNAL.sh[20686]: Nov 10 03:18:51 H systemd[1]: Started test-23833.service - [systemd-run] /usr/bin/bash -c "echo service=test-23833.service invocation=\$INVOCATION_ID; journalctl --sync".
[ 1801.222676] H TEST-04-JOURNAL.sh[20686]: Nov 10 03:18:51 H (bash)[20681]: test-23833.service: Executing: /usr/bin/bash -c "echo service=test-23833.service invocation=\$INVOCATION_ID; journalctl --sync"
[ 1801.222676] H TEST-04-JOURNAL.sh[20686]: Nov 10 03:18:51 H bash[20681]: service=test-23833.service invocation=1866f15e95924a688dcecde72bf345f6
[ 1801.227878] H TEST-04-JOURNAL.sh[20563]: + assert_rc 1 journalctl -q -D /var/log/journal/e30adae55e664d328af442bf5df694c8/ -u test-23833.service --grep service=test-23833.service
[ 1801.228265] H TEST-04-JOURNAL.sh[20689]: + set +ex
[ 1801.253412] H TEST-04-JOURNAL.sh[20690]: Nov 10 03:18:49 H systemd[1]: test-23833.service: About to execute: /usr/bin/bash -c "echo service=test-23833.service invocation=\$INVOCATION_ID; journalctl --sync"
[ 1801.253412] H TEST-04-JOURNAL.sh[20690]: Nov 10 03:18:49 H systemd[1]: Started test-23833.service - [systemd-run] /usr/bin/bash -c "echo service=test-23833.service invocation=\$INVOCATION_ID; journalctl --sync".
[ 1801.253412] H TEST-04-JOURNAL.sh[20690]: Nov 10 03:18:49 H (bash)[20581]: test-23833.service: Executing: /usr/bin/bash -c "echo service=test-23833.service invocation=\$INVOCATION_ID; journalctl --sync"
[ 1801.253412] H TEST-04-JOURNAL.sh[20690]: Nov 10 03:18:49 H bash[20581]: service=test-23833.service invocation=a3089a62b5624d21bac0a75a3995d8b5
[ 1801.258158] H TEST-04-JOURNAL.sh[20692]: FAIL: expected: '1' actual: '0'
```
This gives access to credentials within ExecCondition=. As described in
ticket #35788, I do have a use-case for this and as noted in the
commit that dropped this[1], this is OK to be revisited if there are
use-cases.
systemd-repart is incorrectly choosing the loop-mount
code path to copy files after formatting, instead of using the --rootdir
path, which is required by mkfs.btrfs to apply compression (since it's
on files, not the fs).
So two fixes (and an integ test):
1. If Btrfs compression is requested without a root directory (e.g.,
Compression= without CopyFiles=), we now log a warning and skip the
--compress flag. This prevents the mkfs.btrfs failure, and it's
meaningless anyway without any files.
2. The logic in repart now uses the --rootdir code path whenever the
partition is btrfs and compression is requested. Otherwise it still
won't work even in the legitimate case because we use the loop mounting
code, which is too late to use --compress.
Chris Down [Thu, 6 Nov 2025 15:36:19 +0000 (23:36 +0800)]
test: Add integration test for btrfs compression in repart
Add testcase_btrfs_compression() to verify that btrfs partitions with
Compression= and CopyFiles= directives work correctly.
The test verifies the fix for issue #39584, where mkfs.btrfs would fail
with "ERROR: --compression must be used with --rootdir" when repart
tried to create compressed btrfs filesystems.
The test creates a partition definition with Format=btrfs,
Compression=zstd, and CopyFiles=, then validates:
1. systemd-repart output shows "Rootdir from:" and "Compress:",
confirming that the --rootdir code path is used
2. mkfs.btrfs is invoked with both --compress and --rootdir options
3. The file is successfully copied to the filesystem
4. Compression is actually applied (verified via compsize output
containing "zstd")
Yu Watanabe [Sat, 8 Nov 2025 23:44:25 +0000 (08:44 +0900)]
libarchive-util: several cleanups
- use loop for checking existence of functions,
- rename HAVE_LIBARCHIVE_XYZ -> HAVE_ARCHIVE_XYZ to make them match with
the function name,
- do not conditionally include user-util.h in libarchive-util.h,
- sort library function symbols.
nsresource: allow multiple userns from the same process in parallel
When generating a name for a transient userns automatically we so far
just included our PID to make it unique. That doesn't really work if
multiple userns shall be kept in parallel by a single process. Let's hence
include a counter as well.
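A minimal sketch of the naming idea, written in Python rather than systemd's C. The helper name `transient_userns_name` and the `userns-<pid>-<counter>` format are illustrative assumptions, not the names systemd actually generates.

```python
import itertools
import os

# Hypothetical per-process counter (assumption; the real code is in C).
_counter = itertools.count(1)

def transient_userns_name(pid=None):
    """Return a name that stays unique across calls within one process."""
    if pid is None:
        pid = os.getpid()
    # The PID distinguishes processes; the counter distinguishes several
    # userns kept in parallel by the same process.
    return f"userns-{pid}-{next(_counter)}"

names = [transient_userns_name(1234) for _ in range(3)]
```

With the PID alone all three names would collide; the counter keeps them distinct.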
pull: there's no need to keep the downloaded image in memory, except for the sha256sums/gpg file
This seems to be a mistake, in place since the first commit: we only
want the downloaded data in memory if this is a sha256sums or gpg file,
which we need to process ourselves.
pull: now that PullJob can verify expected digests, let's rely on it for tar/raw pulling
Instead of authenticating the downloaded image explicitly in the tar and
in the raw downloader, we can now rely on the checksum checking in the
generic PullJob code. Hence do so: drop the checksum field from
TarPull and RawPull, and just initialize the ->expected_checksum in the
relevant PullJob instead.
import: rework pull logic to store download digests in binary form rather than string
We generally want to store data in parsed form, not formatted form,
hence let's follow our own rules on this, and store the message digest
as "struct iovec" rather than as string. This is generally more
efficient and safer, simply because of case issues.
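The "case issues" point can be illustrated with a small Python sketch. It only demonstrates the general idea of comparing parsed bytes instead of formatted hex strings and is not the PullJob code itself.

```python
import hashlib

# SHA-256 of the empty input, in parsed (binary) form: 32 raw bytes.
digest = hashlib.sha256(b"").digest()

hex_lower = digest.hex()
hex_upper = hex_lower.upper()

# Comparing the formatted strings fails as soon as the case differs ...
strings_match = hex_lower == hex_upper
# ... while the parsed bytes compare equal regardless of how the digest
# was written down.
bytes_match = bytes.fromhex(hex_lower) == bytes.fromhex(hex_upper)
```

Storing the binary form also halves the size, since each byte takes two hex characters when formatted.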
pull-job: always implicitly NUL terminate downloaded payload stored in memory
Just as a safety measure, let's always NUL terminate what we are
downloading; future code might parse it as a string and be sloppy by
accident.
(We have similar logic in read_full_file(), and I think it's a really
good rule, to always implicitly NUL terminate blobs we acquire that
might very well be used as text later on)
After the commit, the functions are only used to determine
whether journals shall be forwarded to selected targets,
hence rename as such and remove effectively unused condition
on EXEC_OUTPUT_TTY.
test: move the system time to exactly the timer's elapse time
When we moved the time to 1 minute after the timer would've elapsed,
systemd could pick RandomizedDelaySec= <= 1 minute which would then
cause the timer to elapse immediately and the InactiveExitTimestamp=
to get recalculated including a new next elapse time that would be for
the next "window":
systemd[1]: timer-RandomizedDelaySec-30785.timer: Adding 3.634672s random time.
systemd[1]: timer-RandomizedDelaySec-30785.timer: Realtime timer elapses at Fri 2025-11-07 00:10:03 UTC.
systemd[1]: timer-RandomizedDelaySec-30785.timer: Timer elapsed.
systemd[1]: timer-RandomizedDelaySec-30785.timer: Changed waiting -> running
systemd[1]: Found unit timer-RandomizedDelaySec-30785.timer at /run/systemd/system/timer-RandomizedDelaySec-30785.timer (regular file)
systemd[1]: Preset files say disable timer-RandomizedDelaySec-30785.timer.
systemd[1]: timer-RandomizedDelaySec-30785.timer: Got notified about unit deactivation.
systemd[1]: timer-RandomizedDelaySec-30785.timer: Adding 8h 39min 26.166418s random time.
systemd[1]: timer-RandomizedDelaySec-30785.timer: Realtime timer elapses at Sat 2025-11-08 08:49:26 UTC.
systemd[1]: timer-RandomizedDelaySec-30785.timer: Changed running -> waiting
...
TEST-53-TIMER.sh[1008]: InactiveExitTimestamp=Thu 2025-11-06 23:00:00 UTC
TEST-53-TIMER.sh[1010]: ++ systemctl show -P NextElapseUSecRealtime timer-RandomizedDelaySec-30785.timer
TEST-53-TIMER.sh[905]: + NEXT_ELAPSE_REALTIME='Sat 2025-11-08 08:49:26 UTC'
TEST-53-TIMER.sh[1011]: ++ date '--date=Sat 2025-11-08 08:49:26 UTC' +%s
TEST-53-TIMER.sh[905]: + NEXT_ELAPSE_REALTIME_S=1762591766
TEST-53-TIMER.sh[905]: + : 'Next elapse timestamp should be Fri 2025-11-07 00:10:00 UTC <= Sat 2025-11-08 08:49:26 UTC <= Fri 2025-11-07 22:10:00 UTC'
TEST-53-TIMER.sh[905]: + assert_ge 1762591766 1762474200
TEST-53-TIMER.sh[1012]: + set +ex
TEST-53-TIMER.sh[905]: + assert_le 1762591766 1762553400
TEST-53-TIMER.sh[1013]: + set +ex
TEST-53-TIMER.sh[1013]: FAIL: '1762591766' > '1762553400'
Technically, the race is still there, but the window for it should be
_much_ smaller now (< 1s on a reasonably fast system). Let's hope that's
enough.
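The failing check from the log above can be reproduced with a short Python sketch. The `epoch` helper is a hypothetical name introduced here; the three timestamps are taken from the log.

```python
from datetime import datetime, timezone

def epoch(text):
    # Parse "YYYY-MM-DD HH:MM:SS" as UTC and return seconds since the epoch.
    dt = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
    return int(dt.replace(tzinfo=timezone.utc).timestamp())

lower = epoch("2025-11-07 00:10:00")        # earliest acceptable elapse
upper = epoch("2025-11-07 22:10:00")        # latest acceptable elapse
next_elapse = epoch("2025-11-08 08:49:26")  # value reported by systemctl

# The recalculated elapse time belongs to the next window, so this is False.
in_window = lower <= next_elapse <= upper
```

The lower bound holds, but the upper bound does not, which matches the `assert_le` failure in the log.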
profile/osc-context: move and extend check for TERM=dumb
Let's do the check early and skip most of the file if appropriate. Also, treat
missing $TERM same as "dumb". We're almost certainly at a dumb terminal in that
case.
Francesco Valla [Sun, 27 Jul 2025 21:50:06 +0000 (23:50 +0200)]
modules-load: implement parallel module loading
Load modules in parallel using a pool of worker threads. The number of
threads is equal to the number of CPUs, with a maximum of 16 (to avoid
too many threads being started during boot on systems with a high
core count, since the number of modules loaded on boot is usually on
the small side).
The number of threads can optionally be specified manually using the
SYSTEMD_MODULES_LOAD_NUM_THREADS environment variable; in this case,
no limit is enforced. If SYSTEMD_MODULES_LOAD_NUM_THREADS is set to 0,
probing happens sequentially.
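The thread-count rules described above can be sketched as follows. This is a Python illustration of the policy only, not the C implementation; the function name `num_worker_threads` is hypothetical, and how invalid values are handled is not covered by the commit message, so the sketch simply lets `int()` raise.

```python
MAX_AUTO_THREADS = 16  # cap applied only to the automatic choice

def num_worker_threads(environ, cpu_count):
    """Return the worker count; 0 means modules are probed sequentially."""
    override = environ.get("SYSTEMD_MODULES_LOAD_NUM_THREADS")
    if override is not None:
        # Manual override: no upper limit is enforced.
        return int(override)
    # Default: one thread per CPU, capped at 16.
    return min(cpu_count, MAX_AUTO_THREADS)

auto_small = num_worker_threads({}, 4)
auto_large = num_worker_threads({}, 64)
manual = num_worker_threads({"SYSTEMD_MODULES_LOAD_NUM_THREADS": "32"}, 4)
sequential = num_worker_threads({"SYSTEMD_MODULES_LOAD_NUM_THREADS": "0"}, 4)
```

Passing the environment as a plain mapping keeps the sketch easy to test; a real caller would pass `os.environ`.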
Chris Down [Thu, 6 Nov 2025 15:17:01 +0000 (23:17 +0800)]
repart: Force --rootdir population for btrfs with compression
When a btrfs partition is configured with both Compression= and
CopyFiles=, we need to ensure files are copied during filesystem
creation using mkfs.btrfs --rootdir, rather than copying files
afterwards via loop device mounting.
This is required because mkfs.btrfs can only apply compression settings
when files are provided via --rootdir during filesystem creation. If we
format the filesystem first and then mount it to copy files, the
compression setting is meaningless.
Modify the partition_needs_populate() condition to force the --rootdir
code path when the format is btrfs and compression is requested.
This ensures that partition_populate_directory() runs and creates a
temporary directory with the files, which is then passed to
make_filesystem() as the root parameter, allowing mkfs.btrfs to create
the filesystem with compression applied.
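A rough Python sketch of the rule this commit adds. The name `must_use_rootdir` is hypothetical, and the real partition_needs_populate() condition in repart covers other cases as well; only the btrfs-with-compression case is modelled here.

```python
def must_use_rootdir(fmt, compression, copy_files):
    """True if files have to be handed to mkfs via --rootdir (sketch)."""
    # Without files to copy there is nothing that could be compressed.
    if not copy_files:
        return False
    # Compression is applied to files at mkfs time, so btrfs with
    # Compression= must not fall back to the loop-mount copy path.
    return fmt == "btrfs" and bool(compression)

forced = must_use_rootdir("btrfs", "zstd", ["/usr/lib/os-release"])
not_forced_ext4 = must_use_rootdir("ext4", "zstd", ["/usr/lib/os-release"])
not_forced_plain = must_use_rootdir("btrfs", None, ["/usr/lib/os-release"])
```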
Chris Down [Thu, 6 Nov 2025 15:11:55 +0000 (23:11 +0800)]
mkfs-util: Ignore btrfs compression when there is no dir to copy
mkfs.btrfs requires that the --compress option be used together with
--rootdir, as compression only makes sense in that context (because
compression is not a persistent setting).
Right now, if --compress is specified without --rootdir, mkfs.btrfs
fails with:
ERROR: --compression must be used with --rootdir
This can occur when repart is configured with Compression= but the
partition populate logic doesn't use the --rootdir code path (e.g. when
using loop device mounting to copy files after mkfs).
Add a defensive check to skip compression and emit a user-friendly
warning when compression is requested but no root directory is
provided. The warning message references the repart directive names
(Compression= and CopyFiles=) rather than low-level mkfs options to
help users understand the requirement.
This prevents crashes but doesn't enable compression. That requires
ensuring the --rootdir code path is used, which it currently is not;
this will be addressed in the next patch.
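The defensive check can be sketched in Python as below. The function `btrfs_mkfs_args`, the warning text, and the exact argument layout are assumptions for illustration; only the --compress and --rootdir option names come from the commit message.

```python
def btrfs_mkfs_args(device, compression=None, rootdir=None):
    """Build a mkfs.btrfs command line, skipping --compress if unusable."""
    args = ["mkfs.btrfs"]
    warnings = []
    if rootdir:
        args += ["--rootdir", rootdir]
    if compression:
        if rootdir:
            args += ["--compress", compression]
        else:
            # Hypothetical wording; the real message names the repart
            # directives rather than the mkfs options.
            warnings.append("Compression= ignored without CopyFiles=")
    args.append(device)
    return args, warnings

ok_args, ok_warn = btrfs_mkfs_args("/dev/loop0", "zstd", "/tmp/root")
skip_args, skip_warn = btrfs_mkfs_args("/dev/loop0", "zstd")
```

In the second call the command line carries no --compress flag, so mkfs.btrfs would not fail, but the resulting filesystem is not compressed either.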
Yu Watanabe [Thu, 6 Nov 2025 15:35:34 +0000 (00:35 +0900)]
reread-partition-table: trigger change events when we failed to lock device
Before aa47d8ade18cc4a079fef5a1aaa37d763507104e, when we failed to lock
the device node, we simply triggered change events for the device and its
partitions. But that commit killed the fallback logic. Let's restore it.
Luca Boccassi [Thu, 6 Nov 2025 17:13:16 +0000 (17:13 +0000)]
test: ensure test checking status runs first
The test messes a bit with the ESP, which might cause bootctl status output to change.
Run the test that simply checks status without changing anything first.
Mike Yuan [Tue, 4 Nov 2025 20:13:49 +0000 (21:13 +0100)]
logind: handle session leader termination during deserialization more gracefully
We track session leaders by pidfd precisely to make restarts reliable,
as leader exiting before deserialization is somewhat expected.
Such case is already handled gracefully (we'd GC sessions without leader
before kicking off the new cycle), but let's also tweak the log message
a bit to reduce annoyance.
In these two cases we need to sync the journal _after_ the unit finishes
as well, because we try to match messages from systemd itself, not
(only) from the unit, and the messages about units are dispatched
asynchronously.
That is, in the first case (silent-success.service) we want to make sure
that LogLevelMax= filters out messages _about_ units (from systemd) as
well, including messages like "Deactivated..." and "Finished...", which
are sent out only when/after the unit is stopped.
In the second case we try to match messages with the "systemd" syslog
tag, but these messages come from systemd (obviously) and are sent out
asynchronously, which means they might not reach the journal before we
call `journalctl --sync` from the test unit itself, like happened here:
By syncing the journal after the unit is stopped we have a much bigger
chance that the systemd messages already reached the journal - the race
is technically still there, but the chance we'd hit it should be pretty
negligible.
Like the regular status output, fields are omitted altogether when
empty, unless explicitly requested via one of the sub-commands dns,
domain, nta, etc.