Fix the confusing behavior where when an incorrect configuration item such as
'ManagerEnvironment=SYSTEMD_LOG_LEVEL=' is set, the first daemon-reload uses
old environment variables while the second daemon-reload uses LogLevel=.
Co-authored-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl>
The difference in behaviour is that the operations that were done between the
first log_parse_environment() and the second one might not be logged now, e.g.
if the environment enabled debug logging. That is unfortunate, but parsing the
environment twice and not having the explicit configuration take effect until a
second daemon-reload is confusing. We will always have some window where the
configuration for logging does not apply, in particular this must be true when
parsing the logging configuration. To make that window smaller, move operations
that could log after the call to log_parse_environment() as far as possible.
man/systemd-systemd.conf: describe DefaultEnvironment= and ManagerEnvironment= better
The description of ME= said "see above", but it was actually above the other
one. So change the order. But while reading this, I found it very hard to
understand. So reword things, hopefully in a way that is easier to understand.
The current behaviour is rather complex and unintuitive, but this description
just tries to describe it truthfully.
Luca Boccassi [Wed, 15 Oct 2025 10:01:57 +0000 (11:01 +0100)]
Use verity sharing for user services and nspawn too (#39313)
https://github.com/systemd/systemd/pull/39168 made verity sharing
opt-in, and enabled it for system services.
Also enable it for user services for RootImage/etc, and for nspawn, for
the same reasons.
Govind Venugopal [Wed, 15 Oct 2025 09:20:41 +0000 (02:20 -0700)]
network: add DHCP server domain name option support (#39260)
Implements DHCP option 15 (Domain Name) for systemd-networkd's DHCP
server, allowing administrators to configure the DNS default domain that
clients should use.
This addresses the feature request in issue #37077, where users needed
to manually configure domain names using
SendOption=15:string:example.com as a workaround.
This adds two new configuration options to the [DHCPServer] section:
- EmitDomain= (boolean): whether to send domain name to clients
- Domain= (string): the domain name to send (e.g., "example.com")
Example configuration:
[DHCPServer] EmitDomain=yes Domain=example.com
This eliminates the need for manual workarounds using
SendOption=15:string:...
dissect-image: when autoprobing insist on vfat for XBOOTLDR
Let's reduce our attack surface by insisting that XBOOTLDR is vfat when
auto-probing, just like we do for the ESP. Given neither can
realistically be integrity protected (because firmware needs to access
them) let's insist on a vfat which has a much smaller attack surface,
and one we have to accept (for now) anyway, given that the ESP must be
VFAT.
This only applies to auto-probing of course. If people mount things
explicitly via fstab none of this matters. But we really shouldn't
automount a btrfs/xfs/ext4 partition as XBOOTLDR just because it looks
like one, as that would really defeat our otherwise possibly very strict
image policies.
This also introduces a new env var $SYSTEMD_DISSECT_FSTYPE_<DESIGNATOR>
environment variable that may override this hardcoding. This is in
particular useful in our testcases, since various actually do use ext4
as XBOOTLDR case. The tests are updated to make use of the new env var,
both as a mechanism to test this and to keep the tests working.
Luca Boccassi [Tue, 14 Oct 2025 17:46:08 +0000 (18:46 +0100)]
nspawn: enable verity sharing
Just like RootImage=, ExtensionImages= etc, nspawn can make use of
this to save a lot of time when starting containers that use an already
open image, since the default was changed to disabled.
Miroslav Lichvar [Tue, 14 Oct 2025 09:03:01 +0000 (11:03 +0200)]
udev: create symlinks for s390 PTP devices
Similarly to the udev rules handling KVM and Hyper-V PTP devices, create
symlinks for the s390-specific STCKE and Physical clocks (supported
since Linux 6.13) to have some stable names that can be specified in
default configurations of PTP/NTP applications.
core: allow split /usr/local/s?sbin with merged /usr/s?bin
Previously, we used either the fully split path or the fully merged path,
treating "split sbin" as a boolean condition. The idea was that conversion to
to merged bin would be a single event, so we don't need to care about the
details of the transition. But it turns out that some systems may be converted
in disparate steps. In https://bugzilla.redhat.com/show_bug.cgi?id=2400220,
there was a lengthy discussion about a coreos system where
/usr/local/{bin,sbin} were created as separate directories. Since /usr/local is
not part of the packaged system, it might remain split for a longer time. So
check /usr/local/s?bin separately and stop adding /usr/sbin to $PATH if only
/usr/local/s?bin is split. (I don't think it makes sense to handle the reverse
case, i.e. only /usr/s?bin being split, since that should be much rarer.)
Inspired by https://bugzilla.redhat.com/show_bug.cgi?id=2400220.
Frantisek Sumsal [Tue, 14 Oct 2025 12:23:55 +0000 (14:23 +0200)]
mkosi: explicitly pull in libz1 on OpenSUSE
Otherwise it pulls in libz-ng-compat1 which isn't 100% compatible with
libz1, and more importantly it requires an ldconfig drop-in in /etc/
(/etc/ld.so.conf.d/zlib-ng-compat-x86_64.conf) which breaks hermetic-usr
and TEST-07-PID1:
systemd[5582]: /usr/lib/systemd/systemd: error while loading shared libraries: libz.so.1: cannot open shared object file: No such file or directory
Frantisek Sumsal [Mon, 13 Oct 2025 15:36:55 +0000 (17:36 +0200)]
timer: rebase the next elapse timestamp only if timer didn't already run
The test added in f4c3c107d9be4e922a080fc292ed3889c4e0f4a5 uncovered a
corner case while recalculating the next elapse timestamp of a timer unit
that uses RandomizedDelaySec= during deserialization.
If the scheduled time (without RandomizedDelaySec=) already elapsed,
systemd "rebases" the next elapse timestamp to the time when systemd
first started, to make the RandomizedDelaySec= feature work even at
boot. However, since it was done unconditionally, it always overrode the
next elapse timestamp, which could then cause the final next elapse
timestamp to fall out of the expected window.
With a couple of additional debug logs one of the test fail looks like
this:
[ 132.129815] TEST-53-TIMER.sh[384]: + : 'Next elapse timestamp after daemon-reload, try #328'
[ 132.129815] TEST-53-TIMER.sh[384]: + systemctl daemon-reload
[ 132.136352] systemd[1]: Reload requested from client PID 16399 ('systemctl') (unit TEST-53-TIMER.service)...
[ 132.136636] systemd[1]: Reloading...
[ 132.446160] systemd[1]: Rebasing next elapse timestamp
[ 132.446168] systemd[1]: v->next_elapse: Tue 2025-10-14 00:10:00 CEST
[ 132.446170] systemd[1]: rebased: Tue 2025-10-14 00:10:56 CEST
[ 132.446172] systemd[1]: v->next_elapse after rebase: Tue 2025-10-14 00:10:56 CEST
[ 132.447361] systemd[1]: Reloading finished in 310 ms.
[ 132.484041] TEST-53-TIMER.sh[384]: + check_elapse_timestamp
[ 132.484041] TEST-53-TIMER.sh[384]: + systemctl status timer-RandomizedDelaySec-16377.timer
[ 132.533657] TEST-53-TIMER.sh[16440]: ● timer-RandomizedDelaySec-16377.timer
[ 132.533657] TEST-53-TIMER.sh[16440]: Loaded: loaded (/run/systemd/system/timer-RandomizedDelaySec-16377.timer; static)
[ 132.533657] TEST-53-TIMER.sh[16440]: Active: active (waiting) since Mon 2025-10-13 23:00:00 CEST; 1h 13min ago
[ 132.533657] TEST-53-TIMER.sh[16440]: Invocation: 5555d4f060114a5493ff228013830d17
[ 132.533657] TEST-53-TIMER.sh[16440]: Trigger: Tue 2025-10-14 22:10:04 CEST; 21h left
[ 132.533657] TEST-53-TIMER.sh[16440]: Triggers: ● timer-RandomizedDelaySec-16377.service
[ 132.533657] TEST-53-TIMER.sh[16440]: Oct 14 00:13:07 H systemd[1]: timer-RandomizedDelaySec-16377.timer: Changed dead -> waiting
[ 132.533657] TEST-53-TIMER.sh[16440]: Oct 14 00:13:07 H systemd[1]: timer-RandomizedDelaySec-16377.timer: Adding 15h 35min 1.230173s random time.
[ 132.533657] TEST-53-TIMER.sh[16440]: Oct 14 00:13:07 H systemd[1]: timer-RandomizedDelaySec-16377.timer: Realtime timer elapses at Tue 2025-10-14 15:45:58 CEST.
[ 132.533657] TEST-53-TIMER.sh[16440]: Oct 14 00:13:07 H systemd[1]: timer-RandomizedDelaySec-16377.timer: Changed dead -> waiting
[ 132.533657] TEST-53-TIMER.sh[16440]: Oct 14 00:13:08 H systemd[1]: timer-RandomizedDelaySec-16377.timer: Adding 16h 29min 44.084409s random time.
[ 132.533657] TEST-53-TIMER.sh[16440]: Oct 14 00:13:08 H systemd[1]: timer-RandomizedDelaySec-16377.timer: Realtime timer elapses at Tue 2025-10-14 16:40:41 CEST.
[ 132.533657] TEST-53-TIMER.sh[16440]: Oct 14 00:13:08 H systemd[1]: timer-RandomizedDelaySec-16377.timer: Changed dead -> waiting
[ 132.533657] TEST-53-TIMER.sh[16440]: Oct 14 00:13:08 H systemd[1]: timer-RandomizedDelaySec-16377.timer: Adding 21h 59min 7.955828s random time.
[ 132.533657] TEST-53-TIMER.sh[16440]: Oct 14 00:13:08 H systemd[1]: timer-RandomizedDelaySec-16377.timer: Realtime timer elapses at Tue 2025-10-14 22:10:04 CEST.
[ 132.533657] TEST-53-TIMER.sh[16440]: Oct 14 00:13:08 H systemd[1]: timer-RandomizedDelaySec-16377.timer: Changed dead -> waiting
[ 132.535386] TEST-53-TIMER.sh[384]: + systemctl show -p InactiveExitTimestamp timer-RandomizedDelaySec-16377.timer
[ 132.537727] TEST-53-TIMER.sh[16442]: InactiveExitTimestamp=Mon 2025-10-13 23:00:00 CEST
[ 132.540317] TEST-53-TIMER.sh[16444]: ++ systemctl show -P NextElapseUSecRealtime timer-RandomizedDelaySec-16377.timer
[ 132.547745] TEST-53-TIMER.sh[384]: + NEXT_ELAPSE_REALTIME='Tue 2025-10-14 22:10:04 CEST'
[ 132.548020] TEST-53-TIMER.sh[16445]: ++ date '--date=Tue 2025-10-14 22:10:04 CEST' +%s
[ 132.550218] TEST-53-TIMER.sh[384]: + NEXT_ELAPSE_REALTIME_S=1760472604
[ 132.550218] TEST-53-TIMER.sh[384]: + : 'Next elapse timestamp should be Tue 2025-10-14 00:10:00 CEST <= Tue 2025-10-14 22:10:04 CEST <= Tue 2025-10-14 22:10:00 CEST'
[ 132.550218] TEST-53-TIMER.sh[384]: + assert_ge 17604726041760393400
[ 132.550555] TEST-53-TIMER.sh[16446]: + set +ex
[ 132.550702] TEST-53-TIMER.sh[384]: + assert_le 17604726041760472600
[ 132.550832] TEST-53-TIMER.sh[16447]: + set +ex
[ 132.551091] TEST-53-TIMER.sh[16447]: FAIL: '1760472604' > '1760472600'
Here the original next elapse timestamp was Tue 2025-10-14 00:10:00 CEST
as expected, but it was overridden by the rebased timestamp:
Tue 2025-10-14 00:10:56 CEST. And when a new randomized delay was added
to it (21h 59min 7.955828s) the final next elapse timestamp fell out of
the expected window, i.e. Tue 2025-10-14 00:10:00 (scheduled time) <
Tue 2025-10-14 22:10:04 CEST (rebased elapse timestamp + randomized
delay) < Tue 2025-10-14 22:10:00 CEST (scheduled time + maximum from
RandomizedDelaySec=, i.e. 22h).
By limiting the timestamp rebase only the case where the unit hasn't
already run should prevent this from happening during daemon-reload.
This change was misguided. The warning is enough during development and will
get fixed, but turning this into a hard failure just makes WIP harder. Also, a
hard error increases the likelyhood of a build failure in scenarios where
somebody is disabling components (as seen e.g. in ba8801a07640205778c5a62539597c68d7bdb211). We already are not very good at
keeping our codebase compile correctly as it ages, because of changes in
compilers and dependencies, and we should not go out of our way to increase the
probability of failure. Such scenarios are painful for downstream builds.
meson: stop probing for paths of programs in /usr/sbin
We dropped support for split-usr a while ago, which means that the programs
will be in /usr/sbin, which actually may be the same as /usr/bin on merged-bin
systems. So the whole checking is mostly pointless in the usual case. OTOH, on
Nix the paths will be totally different and need to be set through the option
anyway. So save time during builds by using the "fallback" path unless the
option is specified.
This avoid some busywork during the slow serial build phase.
libsystemd: drop "const" decorators on public inline functions
The point of the "const" attribute is to give the compiler hints about
behaviour of functions if it only has the function prototype but no body
around. But inline functions are the ones where the compiler *always*
has the body around, hence the "const" decorator is really just noise:
the compuler can determine the constness on its own, just by looking at
the code.
This way we have can expose identical behaviour everywhere, can make use
of our atomic replacement calls, and openat() logic, and later apply
additional tracks while unpacking, such as putting limits on UID ranges
and similar.
dissect-image: take policy into consideration when unlocking verity, too
Previously, we'd take the image policy only into consideration when
dissecting the mage, but for the unlock/verity step we'd go via best
effort. Change that. This means we can now enforce policies such as
activating by root hash only even if a signature exists and similar.
Also, introduce a separate error code if we try to unlock a Verity
volume but have no root hash. Previously we'd return ENOKEY for that,
exactly like we do for encrypted volumes where we have no passparse. The
interctive unlock loop dissected_image_decrypt_interactively() is
otherwise very confused and will ask for a root hash, which makes no
sense. Hence use two distinct errors for this.
dissect-image: turn verity device sharing into opt-in
Sharing verity volumes is problematic for a veriety of reasons, for
example because it might pin the wrong backing device at the wrong time.
Let's hence turn this around: unless verity sharing is enabled, leave it
off, and turn $SYSTEMD_VERITY_SHARING into a true boolean that can be
set both ways.
The primary usecase for verity sharing is RootImage=, where it probably
makes sense to leave on, hence set the flag there.
This is crucial when putting together installers which install an OS on
a second disk: if verity sharing is always on we might mount the wrong
of the two disks at the wrong time.
Daan De Meyer [Mon, 13 Oct 2025 08:43:16 +0000 (10:43 +0200)]
sd-id128: Drop _sd_const_ from sd_id128_in_setv()
Both the const and pure attributes disallow modifying input arguments
but sd_id128_in_setv() clearly modifies its ap input argument by iterating
over it with va_arg() so drop the _sd_const_ attribute from
sd_id128_in_setv().
If systemd-pcrphase-initrd.service and friends failed for some reasons,
the test VM will reboot infinitely and the test will timeout. Let's
propagate the failure to the host and fail the test earlier in that case.
Duy Nguyen Van [Tue, 7 Oct 2025 04:09:19 +0000 (13:09 +0900)]
Fix build fail when add option "-fstack-protector-all"
When using canary check with "-fstack-protector-all" option. It causes a configure
error in systemd-boot when meson.build executes compile simple code to test linker option
"-static-pie" because -nolibstd option prevents using libc. It need for
canary to provide some function as "__stack_chk_guard". So need to turn off
canary check when compile sanity check.
mkosi: install test dependencies for EnterNamespace= test (#39268)
The test for the EnterNamespace= feature [0] has been both broken and
disabled since the migration to the mkosi framework, as it's missing the
libdw.pc file for pkg-config, so the test is skipped completely, and
it's also missing gcc to actually build the test binary.
Frantisek Sumsal [Fri, 10 Oct 2025 18:09:51 +0000 (20:09 +0200)]
test: temporarily skip the EnterNamespace= test w/o embedded debuginfo
The EnterNamespace= feature currently doesn't work if the debuginfo is
separated from the crashing binary. Until that's resolved, let's run the
test only if the test binary has embedded debuginfo (.debug_info
section; e.g. when systemd is built without WITH_DEBUG=1) or it contains
MiniDebugInfo (.gnu_debugdata section; default on Fedora and CentOS).
mkosi: install test dependencies for EnterNamespace= test
The test for the EnterNamespace= feature [0] has been both broken and
disabled since the migration to the mkosi framework, as it's missing the
libdw.pc file for pkg-config, so the test is skipped completely, and
it's also missing gcc to actually build the test binary.
logind: emit PropertiesChanged when lingering is enabled/disabled
Cockpit's podman plugin needs to know the lingering status so the UI can
advertise enabling `podman-restart` (which depends on lingering to
work). Currently it relies on watching `/var/lib/systemd/linger/${user}`
but that isn't a public API.
Drop `machine-id` OSC event field if /etc/machine-id doesn't exist
While we can safely assume that `/proc/sys/kernel/random/boot_id`
exists, the same can't be said for `/etc/machine-id` in environments
where systemd is installed, but not running. An example would be OCI
containers like with the official Arch Linux image, see [0].
Without this check the prompt would constantly output `/etc/machine-id:
no such file or directory` with the OSC events introduced in dadbb34
(v258).
First, let's say "must" rather than "shall" regarding creation of these
files, because without them group memberships will not work.
Secondly, suggest placing an empty JSON object in them, rather than
making them empty, simply to avoid issues with older systems that didn't
backport d6570eafe3b86584ca42979d1ced5bfd2228a5c7.
userdb: add support for looking up users or groups by uuid. (#37097)
Followon to #37024.
This implements (mostly) what was suggested there, except that only a
single UUID is accepted (modifying things to support multiple is a
relatively straightforward change from here)
I'm not really convinced this is the right approach:
* I can't really think of any cases where you'd need to query by
multiple UUIDs (I guess you might want to lookup multiple users, but in
that case why aren't there "usernames" or "uids" arrays?)
* If I specify username "foo" and UID 1234 and UID 1234 exists and has
username "bar", I get back the error `ConflictingRecordFound`
* If I specify username "foo" and UUID abcdef... and username "foo"
exists but has UUID 123456..., I get back the error
`NonMatchingRecordFound`
This makes the two ID types behave differently.
Additionally, when querying by `uuid`, the multiplexer will always sends
`more: true`, which is fine but a little unexpected.
I do think unifying things through the `UserDBMatch` struct could make
sense, but in that case I think it would make sense to unify all query
types in that way (username, uid, uuid), identify when the filter is for
a single or multiple records, and centralise determination of conflict
vs non matching record errors.
`userdb_by_name`/`userdb_by_uid` could then become helper functions for
the simple case where no additional filtering is needed.
Thoughts?
One other thought: Should the multiplexer just pass through all
parameters, even unknown ones, to the backend services? Even if it
doesn't know how to filter by every property, the backends might, and it
would be useful to allow them to optimise things. (I realise the
disadvantage of this, ofc, is loss of error checking)
creds: add explicit control on whether to allow null key decryption
The ability to encrypt/authenticate encryption with a null key was
originally just a fallback concept for cases where during early boot we
have no host key, but the local system has no TPM2. Nowadays it is used
for other stuff as well, such as pcrlock data propagation (i.e. data
that needs no protection itself and required to properly to TPM key
derivation).
Let's give better, explicit control over null key usage, i.e. let's make
it a tristate both on the systemd-creds command line and in the Varlink
IPC to control three cases:
- the default that we allow it only if SecureBoot is off
- explicitly allowed
- explicitly refused (this is new)
Ideally systemd-creds --allow-null switch would take a boolean argument
to control this as a tristate. Alas, that would be a compat break, hence
I added --refuse-null instead (which also maps to the low-level flag for
this).
This also normalizes that the null key is always called "null key" in
messages, and not sometimes "empty key" or "fallback key".
nspawn: register containers with both the system-wide and the per-user machined instance
Let's make sure each container run by unpriv users is known by both
instances. Making it known by the host instance ensures it is resovlable
by nss-myhostname (and various other tools), but making it known by the
per-user instance means the user can fully manage it on their own.
This simplifies the code a bit too, i.e. we always allocate a scope
ourselves now, and just register it, instead of letting machined do that
under some circumstances. THere's little value in that and it simplifies
things for us quite a bit.