profile/osc-context: move and extend check for TERM=dumb
Let's do the check early and skip most of the file if appropriate. Also, treat
missing $TERM same as "dumb". We're almost certainly at a dump terminal in that
case.
Francesco Valla [Sun, 27 Jul 2025 21:50:06 +0000 (23:50 +0200)]
modules-load: implement parallel module loading
Load modules in parallel using a pool of worker threads. The number of
threads is equal to the number of CPUs, with a maximum of 16 (to avoid
too many threads being started during boot on systems with many an high
core count, since the number of modules loaded on boot is usually on
the small side).
The number of threads can optionally be specified manually using the
SYSTEMD_MODULES_LOAD_NUM_THREADS environment variable; in this case,
no limit is enforced. If SYSTEMD_MODULES_LOAD_NUM_THREADS is set to 0,
probing happens sequentially.
Yu Watanabe [Thu, 6 Nov 2025 15:35:34 +0000 (00:35 +0900)]
reread-partition-table: trigger change events when we failed to lock device
Before aa47d8ade18cc4a079fef5a1aaa37d763507104e, when we failed to lock
the device node, we simply trigger change events for the device and its
partitions. But the commit killed the fallback logic. Let's restore that.
Luca Boccassi [Thu, 6 Nov 2025 17:13:16 +0000 (17:13 +0000)]
test: ensure test checking status runs first
The test messes a bit with the ESP, which might cause bootctl status output to change.
Run the test that simply checks status without changing anything first.
Mike Yuan [Tue, 4 Nov 2025 20:13:49 +0000 (21:13 +0100)]
logind: handle session leader termination during deserialization more gracefully
We track session leaders by pidfd precisely to make restarts reliable,
as leader exiting before deserialization is somewhat expected.
Such case is already handled gracefully (we'd GC sessions without leader
before kicking off the new cycle), but let's also tweak the log message
a bit to reduce annoyance.
In these two cases we need to sync the journal _after_ the unit finishes
as well, because we try to match messages from systemd itself, not
(only) from the unit, and the messages about units are dispatched
asynchronously.
That is, in the first case (silent-success.service) we want to make sure
that LogLevelMax= filters out messages _about_ units (from systemd) as
well, including messages like "Deactivated..." and "Finished...", which
are sent out only when/after the unit is stopped.
In the second case we try to match messages with the "systemd" syslog
tag, but these messages come from systemd (obviously) and are sent out
asynchronously, which means they might not reach the journal before we
call `journalctl --sync` from the test unit itself, like happened here:
By syncing the journal after the unit is stopped we have much bigger
chance that the systemd messages already reached the journal - the race
is technically still there, but the chance we'd hit it should be pretty
negligible.
Like the regular status output, fields are omitted all together when
empty, unless explicitly requested via one of the sub-commands dns,
domain, nta, etc.
David Tardon [Thu, 6 Nov 2025 13:04:32 +0000 (14:04 +0100)]
ask-password-api: return if read_credential() failed
The current code causes assertion in strv_parse_nulstr() if
read_credential() results in an error different from ENXIO or ENOENT
(strace shows I'm getting EACCES):
==25155== HEAP SUMMARY:
==25155== in use at exit: 12,879 bytes in 39 blocks
==25155== total heap usage: 90 allocs, 51 frees, 53,964 bytes allocated
==25155==
==25155== 8 bytes in 1 blocks are definitely lost in loss record 4 of 38
==25155== at 0x4845866: malloc (vg_replace_malloc.c:446)
==25155== by 0x547FC2E: strdup (strdup.c:42)
==25155== by 0x4B2647C: strv_env_replace_strdup_passthrough (env-util.c:435)
==25155== by 0x42D547: parse_argv (homectl.c:3909)
==25155== by 0x43999C: run (homectl.c:5606)
==25155== by 0x4399F5: main (homectl.c:5613)
==25155==
==25155== LEAK SUMMARY:
==25155== definitely lost: 8 bytes in 1 blocks
After:
==25224== HEAP SUMMARY:
==25224== in use at exit: 12,871 bytes in 38 blocks
==25224== total heap usage: 90 allocs, 52 frees, 53,964 bytes allocated
==25224==
==25224== LEAK SUMMARY:
==25224== definitely lost: 0 bytes in 0 blocks
profile/systemd-osc-context: fix overriding of PROMPT_COMMAND
In https://github.com/systemd/systemd/issues/39114 users are reporting
that our script overrides PROMPT_COMMAND that they had. After looking
at /etc/bashrc in Fedora, I see that it only sets PROMPT_COMMAND if
[ -z "$PROMPT_COMMAND" ]. Let's adjust the script so this continues to
work.
Fixes https://github.com/systemd/systemd/issues/39114.
(This is a bit of a stretch. 39114 was originally about SecureCRT,
but that was resolved in SecureCRT. But there was a lot of dicussion
about the prompt being overriden, which this commit should fix.)
Like the regular status output, fields are omitted all together when
empty, unless explicitly requested via one of the sub-commands dns,
domain, nta, etc.
Nick Rosbrook [Fri, 10 Oct 2025 19:56:35 +0000 (15:56 -0400)]
resolve: add DumpDNSConfiguration to varlink API
Add io.systemd.Resolve.DumpDNSConfiguration. This provides the same
information as io.systemd.Resolve.Monitor.SubscribeDNSConfiguration,
but just returns the configuration once without the subscription logic.
In order to use the same definitions for DNSConfiguration et al. between
both interfaces, move the definitions to io.systemd.Resolve, and include
them in io.systemd.Resolve.Monitor.
This will be used to implement --json for resolvectl status.
systemd.service(5)’s documentation of `ExecCondition=` uses “failed” with
respect to the unit active state.
In particular the unit won’t be considered failed when `ExecCondition=`’s
command exits with a status of 1 through 254 (inclusive). It will however, when
it exits with 255 or abnormally (e.g. timeout, killed by a signal, etc.).
The table “Defined $SERVICE_RESULT values” in systemd.exec(5) uses “failed”
however rather with respect to the condition.
Tests seem to have shown that, if the exit status of the `ExecCondition=`
command is one of 1 through 254 (inclusive), `$SERVICE_RESULT` will be
`exec-condition`, if it is 255, `$SERVICE_RESULT` will be `exit-code` (but
`$EXIT_CODE` and `$EXIT_STATUS` will be empty or unset), if it’s killed because
of `SIGKILL`, `$SERVICE_RESULT` will `signal` and if it times out,
`$SERVICE_RESULT` will be `timeout`.
This commit clarifies the table at least for the case of an exit status of 1
through 254 (inclusive).
The others (signal, timeout and 255 are probably also still ambiguous (e.g.
`signal` uses “A service process”, which could be considered as the actual
service process only).
Signed-off-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
When VirtIO VSOCK device is not present, IOCTL_VM_SOCKETS_GET_LOCAL_CID
returns VMADDR_CID_LOCAL/1, and we issue a hint to connect to vsock%1.
This does not work. Filter out VMADDR_CID_LOCAL and VMADDR_CID_HOST,
those are not real addresses that can be used from the outside.
jouyouyun [Wed, 5 Nov 2025 10:03:34 +0000 (18:03 +0800)]
nss-resolve: fix the ip addr family validity check method
`i` only counts the number of matches with the current family,
while `n_addresses` counts the number of matches with the family INET or INET6.
If the address contains both INET and INET6, `assert(i == n_addresses)` will fail.
Chris Down [Wed, 5 Nov 2025 09:46:40 +0000 (17:46 +0800)]
systemctl: Support --timestamp for otherwise named properties
`systemctl show`'s `--timestamp` flag is supposed to reformat all
timestamp-based properties. However, the logic for detecting these
properties was incomplete and only checked if the name ended in
Timestamp.
Expand the check to explicitly include some non-"timestamp" named
properties that really are timestamps.
Luca Boccassi [Wed, 5 Nov 2025 19:39:10 +0000 (19:39 +0000)]
test: wait until the nspawn process is completely dead (#39576)
Before calling io.systemd.MachineImage.List.
The systemd-nspawn process takes a lock in the run() function in
nspawn.c and holds it for the entire runtime of that function. If we
call `machinectl terminate` the machine gets unregistered _before_ we
release the lock, so the original `machinectl status` check would return
early, allowing for a race where we call io.systemd.MachineImage.List
over Varlink when systemd-nspawn still holds the lock because the
process is still running.:
```
[ 41.691826] TEST-13-NSPAWN.sh[1102]: + machinectl terminate long-running
[ 41.695009] systemd-nspawn[2171]: Trying to halt container by sending TERM to container PID 1. Send SIGTERM again to trigger immediate termination.
[ 41.698235] systemd-machined[1192]: Machine long-running terminated.
[ 41.709520] TEST-13-NSPAWN.sh[1102]: + systemctl kill --signal=KILL systemd-nspawn@long-running.service
[ 41.709169] systemd-nspawn[2171]: Failed to unregister machine: No machine 'long-running' known
[ 41.720869] TEST-13-NSPAWN.sh[2346]: + varlinkctl --more call /run/systemd/machine/io.systemd.MachineImage io.systemd.MachineImage.List '{}'
[ 41.723359] TEST-13-NSPAWN.sh[2347]: + grep long-running
...
[ 41.735453] TEST-13-NSPAWN.sh[2352]: + varlinkctl call /run/systemd/machine/io.systemd.MachineImage io.systemd.MachineImage.List '{"name":"long-running", "acquireMetadata": "yes"}'
[ 41.736222] TEST-13-NSPAWN.sh[2353]: + grep OSRelease
[ 41.739500] TEST-13-NSPAWN.sh[2352]: Method call io.systemd.MachineImage.List() failed: Device or resource busy
[ 41.740641] systemd[1]: Received SIGCHLD.
[ 41.740670] systemd[1]: Child 2171 (systemd-nspawn) died (code=killed, status=9/KILL)
[ 41.740725] systemd[1]: systemd-nspawn@long-running.service: Child 2171 belongs to systemd-nspawn@long-running.service.
[ 41.740748] systemd[1]: systemd-nspawn@long-running.service: Main process exited, code=killed, status=9/KILL
[ 41.740755] systemd[1]: systemd-nspawn@long-running.service: Will spawn child (service_enter_stop_post): systemd-nspawn
[ 41.740872] systemd[1]: systemd-nspawn@long-running.service: About to execute: systemd-nspawn --cleanup --machine=long-running
...
```
Let's mitigate this by waiting until the corresponding
systemd-nspawn@.service instance enters the 'inactive' state where the
lock should be properly released.
test: wait until the nspawn process is completely dead
Before calling io.systemd.MachineImage.List.
The systemd-nspawn process takes a lock in the run() function in
nspawn.c and holds it for the entire runtime of that function. If we
call `machinectl terminate` the machine gets unregistered _before_ we
release the lock, so the original `machinectl status` check would return
early, allowing for a race where we call io.systemd.MachineImage.List
over Varlink when systemd-nspawn still holds the lock because the
process is still running.:
[ 41.691826] TEST-13-NSPAWN.sh[1102]: + machinectl terminate long-running
[ 41.695009] systemd-nspawn[2171]: Trying to halt container by sending TERM to container PID 1. Send SIGTERM again to trigger immediate termination.
[ 41.698235] systemd-machined[1192]: Machine long-running terminated.
[ 41.709520] TEST-13-NSPAWN.sh[1102]: + systemctl kill --signal=KILL systemd-nspawn@long-running.service
[ 41.709169] systemd-nspawn[2171]: Failed to unregister machine: No machine 'long-running' known
[ 41.720869] TEST-13-NSPAWN.sh[2346]: + varlinkctl --more call /run/systemd/machine/io.systemd.MachineImage io.systemd.MachineImage.List '{}'
[ 41.723359] TEST-13-NSPAWN.sh[2347]: + grep long-running
...
[ 41.735453] TEST-13-NSPAWN.sh[2352]: + varlinkctl call /run/systemd/machine/io.systemd.MachineImage io.systemd.MachineImage.List '{"name":"long-running", "acquireMetadata": "yes"}'
[ 41.736222] TEST-13-NSPAWN.sh[2353]: + grep OSRelease
[ 41.739500] TEST-13-NSPAWN.sh[2352]: Method call io.systemd.MachineImage.List() failed: Device or resource busy
[ 41.740641] systemd[1]: Received SIGCHLD.
[ 41.740670] systemd[1]: Child 2171 (systemd-nspawn) died (code=killed, status=9/KILL)
[ 41.740725] systemd[1]: systemd-nspawn@long-running.service: Child 2171 belongs to systemd-nspawn@long-running.service.
[ 41.740748] systemd[1]: systemd-nspawn@long-running.service: Main process exited, code=killed, status=9/KILL
[ 41.740755] systemd[1]: systemd-nspawn@long-running.service: Will spawn child (service_enter_stop_post): systemd-nspawn
[ 41.740872] systemd[1]: systemd-nspawn@long-running.service: About to execute: systemd-nspawn --cleanup --machine=long-running
...
Let's mitigate this by waiting until the corresponding
systemd-nspawn@.service instance enters the 'inactive' state where the
lock should be properly released.
Chris Down [Wed, 5 Nov 2025 10:41:17 +0000 (18:41 +0800)]
core: Only apply unprivileged userns logic to user managers
Commit 38748596f078 ("core: Make DelegateNamespaces= work for user
managers with CAP_SYS_ADMIN") refactored the logic for when an
unprivileged process should create a new user namespace for sandboxing.
This refactor inadvertently removed a check (`params->runtime_scope !=
RUNTIME_SCOPE_USER`) that differentiated between system services and user
services.
This causes a regression in rootless containers where systemd runs
unprivileged. When starting a system service (like `dbus-broker`) that
uses sandboxing features (eg. with `PrivateTmp=yes`), systemd now
incorrectly creates a new, minimal `PRIVATE_USERS_SELF` namespace.
This new namespace only maps UID/GID 0. When dbus-broker attempts to
drop privileges to the `dbus` user (GID 81), the `setresgid(81, 81, 81)`
call fails because GID 81 is not mapped.
Restore the check to ensure that the special unprivileged sandboxing
logic is only applied to user services, as was the original intent.
System services in a rootless context will now correctly run in the
container's main user namespace, where all necessary UIDs/GIDs are
mapped.
jouyouyun [Tue, 4 Nov 2025 08:10:31 +0000 (16:10 +0800)]
cgls: print error messages when --unit and --user-unit are used together
Mixing the `--unit` and `--user-unit` options will result in error messages.
During the parsing phase, only the `arg_show_unit` record of the last
occurrence of the option is used; all names are placed in the same `arg_names`,
thus mixing the two types of units in the query.
For example, `-u foo --user-unit bar` will also treat `foo` as a user unit and
query it in the user service.
Chris Down [Tue, 4 Nov 2025 10:19:07 +0000 (18:19 +0800)]
systemctl: Fix shutdown time parsing across DST changes
When parsing an absolute time specification like `hh:mm` for the
`shutdown` command, the code interprets a time in the past as "tomorrow
at this time". It currently implements this by adding a fixed 24-hour
duration (`USEC_PER_DAY`) to the timestamp.
This assumption breaks across DST transitions, as the day might not be
24 hours long. This can cause the shutdown to be scheduled at the wrong
time (typically off by one hour in either direction).
Change the logic to perform calendar arithmetic instead of timestamp
arithmetic. If the calculated time is in the past, we increment
`tm.tm_mday` and call `mktime_or_timegm_usec()` a second time.
This delegates all date normalization logic to `mktime()`, which
correctly handles all edge cases, including DST transitions, month-end
rollovers, and leap years.
core: use proper service type of TEST-07-PID.user-namespace-path.sh
TEST-07-PID.user-namespace-path.sh is flaky as Type=simple is used
(implicitly), explicitly use Type=exec instead to ensure the namespaces
are created before starting another service reusing the same namespaces.
Kai Lueke [Tue, 28 Oct 2025 11:56:45 +0000 (20:56 +0900)]
sysext: Check for /etc/initrd-release in given --root= tree
Both sysext and confext used the host's /etc/initrd-release file even
when --root=/somewhere was specified. A workaround was the
SYSTEMD_IN_INITRD= env var but without knowing this it was quite
confusing. Aside from users validating their extensions, the primary
use case for this to matter is when the extensions are set up from the
initrd where the initrd-release file is present when running but we want
to prepare the extensions for the final system and thus should match
for the right scope.
Make systemd-sysext check for /etc/initrd-release inside the given
--root= tree. An alternative would be to always ignore the
initrd-release check when --root= is passed but this way it is more
consistent. The image policy logic for EFI-loader-passed extensions
won't take effect when --root= is used, though.
Kai Lueke [Tue, 28 Oct 2025 15:08:42 +0000 (00:08 +0900)]
test: Add missing test cleanup for the last sysext test
The last sysext test leaked things into new tests added later,
uncovered by any new tests leftover check.
Remove the mutable folder state through a trap as done in other tests.
varlink-idl: add infra to test our enum parsers against varlink IDL enums
In many cases we want to expose enums for which we have the usual
xyz_to_string()/xyz_from_string() via Varlink as enums. Let's add some
infra to test the tables against each other, to automatically detect
when they deviate.
In order to implement this properly, let's export/introduce clean
json_underscorefy()/json_dashify(), for dealing with the fact that our
enums usually use dash separates ames, but Varlink doesn't allow that.
(This does not add the test cases for all enum types we expose right
now, but only adds the general infra).
When Type=notify-reload got introduced, it wasn't intended to be
mutually exclusive with ExecReload=. However, currently ExecReload=
is immediately forked off after the service main process is signaled,
leaving states in between essentially undefined. Given so broken
it is I doubt any sane user is using this setup, hence I took a stab
to rework everything:
1. Extensions are refreshed (unchanged)
2. ExecReload= is forked off without signaling the process
3a. If RELOADING=1 is sent during the ExecReload= invocation,
we'd refrain from signaling the process again, instead
just transition to SERVICE_RELOAD_NOTIFY directly and
wait for READY=1
3b. If not, signal the process after ExecReload= finishes
(from now on the same as Type=notify-reload w/o ExecReload=)
4. To accomodate the use case of performing post-reload tasks,
ExecReloadPost= is introduced which executes after READY=1
The new model greatly simplifies things, as no control processes
will be around in SERVICE_RELOAD_SIGNAL and SERVICE_RELOAD_NOTIFY
states.
See also: https://github.com/systemd/systemd/issues/37515#issuecomment-2891229652