udev: support reverting/serializing/deserializing configurations set by 'udevadm control' (#37067)
Previously, the log level, properties, maximum number of worker processes,
and so on set by 'udevadm control' were discarded on restart. This makes
udevd serialize the configuration on stop and deserialize it in the next
invocation. Also, this introduces 'udevadm control --revert' to clear
previous configurations.
Previously, configurations set by 'udevadm control' such as log level,
maximum number of children, global properties, and so on were discarded
on restart. This makes udevd serialize those configurations on stop, and
deserialize them in the next invocation.
test: Work around bug in meson when installing directory symlinks
Installing symlinks pointing to directories with install_subdir() is
broken (see https://github.com/mesonbuild/meson/pull/14471). Let's work
around the issue for now by manually installing the standalone directory
until the issue is fixed upstream and available in meson in all supported
distributions.
JSON User/Group records: Add properties for UUIDs (#37024)
It is useful to have stable and unique identifiers for a security
principal. The majority of identity management systems in use with Unix
systems today (e.g. Active Directory objectGUID, FreeIPA ipaUniqueID,
Kanidm UUIDs) assign each account and group a unique UUID and exposing
that to applications allows them to refer to accounts in a stable
manner.
At this time we are merely adding the properties to the user/group
records. Adding ways to perform lookups by these IDs is left for a
future PR.
See [discussion](https://mastodon.social/@pid_eins/114283987142625086) and
[this comment](https://github.com/systemd/systemd/issues/24032#issuecomment-2745246757).
I'm sure there are wording aspects which could be improved, but I
believe this is a reasonable initial stab at the problem.
mkosi: Make sure the mkosi image can be built without the source tree available (#37068)
Let's make sure the mkosi image can be built (with `NO_BUILD` enabled) without
the source tree available. This allows running the integration tests
when only distribution packages are available but the source tree is
not.
mkosi: Move TEST-24-CRYPTSETUP files to mkosi/ directory
If the integration tests have been installed in the systemd-tests
package, the path to these in mkosi.postinst.chroot will be wrong.
Let's fix the issue by moving these files into the mkosi/ directory
as they're only used by mkosi regardless so they make more sense to
be there anyway.
mkosi: Rely on tmpfiles to put nsswitch.conf in place
Let's rely on tmpfiles to put our nsswitch.conf in place instead of
doing it in the post-install script. This moves us one step closer
to being able to build the mkosi image without having the source
tree available when NO_BUILD is used.
* 11efce9445 Install /usr/share/factory for upstream profile
* 4c3d753649 d/t/upstream: copy mkosi key from mkosi/ subdir if it exists
* 00f2ab1bce Install etc.conf tmpfiles.d in upstream builds
* dcf5869729 Refresh patch for upstream review changes
* f94714d8cc d/copyright: use GPL URL instead of old FSF postal address
* bf005e69f5 Update changelog for 257.5-2 release
* 709e474e5b Backport new patch to workaround /lib64 symlink incompatibility
* fa6c61db40 Update changelog for 257.5-1 release
* 9c9ca29ceb Remove conflicts with dracut:arm64 and build nspawn:arm64 again
* 5899bcc55d Update changelog for 257.5-1 release
* dd5cb92d08 Drop backports, included in 257.5
* c1373fb99e d/t/upstream: run mkosi genkey before summary
* 223d7a412a Install new files for upstream
* b9d337abd9 Use Conflicts instead of Breaks/Replaces for file move
* 9379847813 d/t/upstream: write mkosi.local.conf in subdir if the rest of the configs are in subdir
* 86fc24b565 d/t/upstream: do not fail if 10-root.conf is not present
test-sd-device: limit the number of iterations when testing device parent/child functions
The test "hangs" and times out on some arm64 machines. It actually works as
expected, but the machine has 2016 children under /sys/devices/system/memory/,
and the tests do a double loop over this, which is slow enough to hit the 120 s
limit. Add a limit on the number of iterations.
Another option would be to exclude "memory" subsystem. But we may have other
subsystems which have the same problem in the future, so I think it'll be more
robust to not try to limit the fix to a specific subsystem.
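A minimal sketch of the idea, written in Python rather than the C of the
actual test: bound both levels of the double loop with a fixed budget so the
quadratic walk stays small no matter how many children a parent has. The names
`MAX_ITERATIONS` and `check_pairs`, and the value 128, are assumptions for
illustration and are not taken from the patch.

```python
from itertools import islice

# Hypothetical cap; the limit actually used by the test is not stated here.
MAX_ITERATIONS = 128

def check_pairs(children, limit=MAX_ITERATIONS):
    """Visit (a, b) pairs of children, bounding both loops by `limit`."""
    visited = 0
    for a in islice(children, limit):
        for b in islice(children, limit):
            # A real test would compare parent/child relations of a and b here.
            visited += 1
    return visited

# 2016 children, as on the affected arm64 machines.
print(check_pairs(list(range(2016))))
```

Without a cap, 2016 children would mean 2016 x 2016 = 4,064,256 pair visits;
with the assumed cap the walk is bounded by 128 x 128 = 16,384.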
Christian Hesse [Wed, 9 Apr 2025 21:03:06 +0000 (23:03 +0200)]
man: mention special functionality for reload-or-restart with --marked (#37076)
We had a downstream discussion on what `systemctl reload-or-restart
--marked` does, until upstream chimed in and pointed out very special
behavior for that combination. 😜
The second references the first, but not vice versa. Let's fix this.
Some AMD systems have support for features like custom brightness
curve or adaptive backlight management. These features allow the
display driver to adjust the brightness based upon other factors
than just the user brightness request.
The user's brightness request is indicated in the 'brightness' file
but the effective result of the logic in the display driver is stored
in the 'actual_brightness' file.
This leads to problems when shutting the system down because the value
of 'actual_brightness' may be lower than 'brightness' and the wrong value
gets stored for the next boot.
For example if the brightness a user requested was 150, the actual_brightness
might be 130. So the next boot the brightness will be "set" to 130, but the
actual brightness might be 115. If the user reboots again it will be set to 115
for the next boot but the actual brightness might be 100. That is, this gets worse
and worse each reboot cycle until the system eventually boots up at minimum
brightness.
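The drift described above can be sketched as a small simulation. This is only
an illustration: `effective()` is a hypothetical stand-in for whatever curve
the display driver applies (a fixed 87% scale is assumed here), so the
intermediate values differ slightly from the numbers in the example.

```python
def effective(requested: int) -> int:
    """Hypothetical driver curve: the panel ends up dimmer than requested."""
    return requested * 87 // 100

def simulate(initial: int, boots: int, save_actual: bool) -> list:
    """Return the 'brightness' value seen at each boot."""
    brightness = initial
    history = [brightness]
    for _ in range(boots):
        actual = effective(brightness)  # what 'actual_brightness' reports
        # Value stored at shutdown and restored on the next boot.
        brightness = actual if save_actual else brightness
        history.append(brightness)
    return history

print(simulate(150, 3, save_actual=True))   # decays every reboot
print(simulate(150, 3, save_actual=False))  # stays at the user's request
```

Saving 'actual_brightness' feeds the driver's output back in as the next
request, so the value shrinks on every cycle; saving 'brightness' does not.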
Furthermore the kernel documentation indicates that the brightness and
actual_brightness files are not guaranteed to be the same values.
Because of this, drop the use of 'actual_brightness' when saving/restoring brightness
and instead rely only upon 'brightness'.
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
test: Make sure symlinks in integration-tests are properly installed
meson follows symlinks by default, so make sure we use
follow_symlinks=False if meson is new enough and rsync otherwise like
we already do for other testdata subdirectories.
kmeaw [Sun, 30 Mar 2025 12:08:38 +0000 (13:08 +0100)]
shared/calendarspec: fix normalization when DST is negative
When trying to calculate the next firing of 'hourly', we'd lose the
tm_isdst value on the next iteration.
On most systems in Europe/Dublin it would cause a 100% cpu hang due to
timers restarting.
This happens in Europe/Dublin because Ireland defines the Irish Standard Time
as UTC+1, so winter time is encoded in tzdata as negative 1 hour of daylight
saving.
Before this patch:
$ env TZ=IST-1GMT-0,M10.5.0/1,M3.5.0/1 systemd-analyze calendar --base-time='Sat 2025-03-29 22:00:00 UTC' --iterations=5 'hourly'
Original form: hourly
Normalized form: *-*-* *:00:00
Next elapse: Sat 2025-03-29 23:00:00 GMT
(in UTC): Sat 2025-03-29 23:00:00 UTC
From now: 13h ago
Iteration #2: Sun 2025-03-30 00:00:00 GMT
(in UTC): Sun 2025-03-30 00:00:00 UTC
From now: 12h ago
Iteration #3: Sun 2025-03-30 00:00:00 GMT <-- note every next iteration having the same firing time
(in UTC): Sun 2025-03-30 00:00:00 UTC
From now: 12h ago
...
With this patch:
$ env TZ=IST-1GMT-0,M10.5.0/1,M3.5.0/1 systemd-analyze calendar --base-time='Sat 2025-03-29 22:00:00 UTC' --iterations=5 'hourly'
Original form: hourly
Normalized form: *-*-* *:00:00
Next elapse: Sat 2025-03-29 23:00:00 GMT
(in UTC): Sat 2025-03-29 23:00:00 UTC
From now: 13h ago
Iteration #2: Sun 2025-03-30 00:00:00 GMT
(in UTC): Sun 2025-03-30 00:00:00 UTC
From now: 12h ago
Iteration #3: Sun 2025-03-30 02:00:00 IST <-- the expected 1 hour jump
(in UTC): Sun 2025-03-30 01:00:00 UTC
From now: 11h ago
...
This bug isn't reproduced on Debian and Ubuntu because they mitigate it by
using the rearguard version of tzdata. ArchLinux and NixOS don't, so it would
cause pid1 to spin during DST transition.
This is what the affected tzdata looks like:
$ zdump -V -c 2024,2025 Europe/Dublin
Europe/Dublin Sun Mar 31 00:59:59 2024 UT = Sun Mar 31 00:59:59 2024 GMT isdst=1 gmtoff=0
Europe/Dublin Sun Mar 31 01:00:00 2024 UT = Sun Mar 31 02:00:00 2024 IST isdst=0 gmtoff=3600
Europe/Dublin Sun Oct 27 00:59:59 2024 UT = Sun Oct 27 01:59:59 2024 IST isdst=0 gmtoff=3600
Europe/Dublin Sun Oct 27 01:00:00 2024 UT = Sun Oct 27 01:00:00 2024 GMT isdst=1 gmtoff=0
Compare it to Europe/London:
$ zdump -V -c 2024,2025 Europe/London
Europe/London Sun Mar 31 00:59:59 2024 UT = Sun Mar 31 00:59:59 2024 GMT isdst=0 gmtoff=0
Europe/London Sun Mar 31 01:00:00 2024 UT = Sun Mar 31 02:00:00 2024 BST isdst=1 gmtoff=3600
Europe/London Sun Oct 27 00:59:59 2024 UT = Sun Oct 27 01:59:59 2024 BST isdst=1 gmtoff=3600
Europe/London Sun Oct 27 01:00:00 2024 UT = Sun Oct 27 01:00:00 2024 GMT isdst=0 gmtoff=0
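The inverted isdst column in the Europe/Dublin output above can also be
observed from a script. The following is a sketch that assumes a Unix-like
system whose C library accepts the POSIX TZ string used in the reproducer; the
helper name `local_info` is made up for this example.

```python
import calendar
import os
import time

# POSIX TZ string from the reproducer: standard time is IST (UTC+1), and the
# "daylight saving" zone is GMT (UTC+0), in effect during winter.
os.environ["TZ"] = "IST-1GMT-0,M10.5.0/1,M3.5.0/1"
time.tzset()  # Unix only

def local_info(year, month, day):
    """Return (zone name, tm_isdst, UTC offset in seconds) at 12:00 UTC."""
    ts = calendar.timegm((year, month, day, 12, 0, 0))
    tm = time.localtime(ts)
    return tm.tm_zone, tm.tm_isdst, tm.tm_gmtoff

winter = local_info(2025, 1, 15)
summer = local_info(2025, 7, 15)
print("winter:", winter)
print("summer:", summer)
```

With such a zone, tm_isdst is set in winter and clear in summer, which is the
reverse of Europe/London, so code that drops tm_isdst between iterations can
end up normalizing to the wrong side of the transition.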
Mike Yuan [Tue, 8 Apr 2025 20:35:14 +0000 (22:35 +0200)]
run0: make sure we submit $SHELL to remote
Normally, the service manager sets $SHELL to the target user's
login shell, but run0 always overrides that with either the
originating user's shell or the value from --setenv=SHELL=. In both cases
$SHELL needs to be sent.
Turns out makepkg sets $SOURCE_DATE_EPOCH= to the current time for
every build if not set explicitly which causes full rebuilds if we
don't set time-epoch explicitly ourselves, so let's do that everywhere
to avoid unnecessary rebuilds.
vcs_tag() always reruns even if the vcs-tag option is disabled. Let's
use custom_target() instead so that we can only enable build_always_stale
if the vcs-tag option is enabled.
mkosi: drop os-release symlink for minimal-base image
[ 385s] ERROR: link target doesn't exist (neither in build root nor in installed system):
[ 385s] /usr/lib/systemd/tests/mkosi/mkosi.images/minimal-base/mkosi.extra/etc/os-release -> ../usr/lib/os-release
It shouldn't even be needed; everything should look in /usr/lib/os-release too.
udev-config: restore log level set by systemd.log_level on reload
If the log level was previously specified in udev.conf but no longer is,
then let's make 'udevadm control --reload' set the log level
specified by systemd.log_level.
Yu Watanabe [Thu, 27 Mar 2025 03:57:30 +0000 (12:57 +0900)]
udev-watch: add inotify watch by manager process
Previously, inotify watch on a device node was added/removed by a
worker process processing the relevant uevent. However, that could not
avoid races. For example,
1. A device node X is removed by the kernel (e.g. unplug USB memory), and
the kernel removes the inotify watch for the device node and produces
IN_IGNORED event and 'remove' uevent for the device.
2. Before udevd processes the 'remove' uevent of the device, a worker
process may try to add an inotify watch on another device node Y.
As the inotify watch on X has been already removed, the worker may
acquire the same watch handle that was previously assigned to X.
3. Since the 'remove' uevent for X is not processed yet, the symlink
named with the watch handle still exists and points to X. So, the
worker process for Y cannot add the symlink...
To avoid such races, let's sequentially add/remove inotify watch by the
manager process.
Note, this potentially reduces performance on boot when there is a
huge number of disks and/or partitions.
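The race in the numbered steps above can be modeled with a toy simulation.
Everything here is a simplification for illustration: `ToyKernel` hands out
the lowest free watch handle (an assumption about allocation, not a statement
about the real kernel), a plain dict stands in for the handle-named symlinks,
and the device names are hypothetical.

```python
class ToyKernel:
    """Toy model: hands out the lowest free watch handle (an assumption)."""

    def __init__(self):
        self.watches = {}  # handle -> device node

    def add_watch(self, node):
        handle = 1
        while handle in self.watches:
            handle += 1
        self.watches[handle] = node
        return handle

    def remove_node(self, node):
        # Device removal drops the watch immediately, before udevd reacts.
        self.watches = {h: n for h, n in self.watches.items() if n != node}

def worker_add_watch(kernel, symlinks, node):
    """Worker-side logic: add a watch, then record handle -> node."""
    handle = kernel.add_watch(node)
    if handle in symlinks and symlinks[handle] != node:
        return handle, False  # stale entry for another device is in the way
    symlinks[handle] = node
    return handle, True

kernel, symlinks = ToyKernel(), {}
print(worker_add_watch(kernel, symlinks, "/dev/sdX"))  # X gets a handle
kernel.remove_node("/dev/sdX")  # step 1: the kernel drops the watch on X
print(worker_add_watch(kernel, symlinks, "/dev/sdY"))  # steps 2-3: conflict
```

In the model, the conflict exists only because the bookkeeping for X is
updated later than the kernel state. Doing all additions and removals
sequentially in one process removes that window.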
With the latest mkosi it's possible for MinimumVersion= to be a git
commit so let's start making use of that. This will make mkosi fail
if it's executed within the systemd repository and the checked out
commit is too old.
Putting the mkosi commit sha in mkosi/mkosi.conf also allows retrieving
it without having the full source tree available.
We also make a bunch of improvements to the fetch-mkosi.py script.
The arguments `(rd.)systemd.mount-extra` take a value that looks like
`WHAT:WHERE[:FSTYPE[:OPTIONS]]`. The `OPTIONS` were parsed into a nulstr
where a comma-separated c-string was expected. This leads to a bug where
only the first option was taken into account by the generator.
For example, if you passed `systemd.mount-extra=/x:/y:baz:ro,defaults`
to the kernel, `systemd-fstab-generator` would translate that into a
nulstr: `ro\0defaults\0`.
Since methods processing options in the generator expected a
comma-separated c-string, they would only see the first option, `ro` in
this case.
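The mismatch can be illustrated with a short Python sketch that mimics how C
string functions stop at the first NUL byte. The helper names are made up for
this example and do not correspond to functions in the generator.

```python
def to_nulstr(options):
    """Encode options as a nulstr: each entry followed by a NUL byte."""
    return b"".join(opt.encode() + b"\0" for opt in options)

def to_comma_string(options):
    """Encode options as one NUL-terminated, comma-separated C string."""
    return ",".join(options).encode() + b"\0"

def read_c_string(buf):
    """Read bytes the way C string functions do: stop at the first NUL."""
    return buf.split(b"\0", 1)[0].decode()

def parse_options(buf):
    """What a consumer expecting a comma-separated C string would see."""
    return read_c_string(buf).split(",")

opts = "ro,defaults".split(",")
print(parse_options(to_nulstr(opts)))        # only the first option survives
print(parse_options(to_comma_string(opts)))  # both options are seen
```

The nulstr encoding is valid on its own; the problem is only that the
consumer reads it as a single C string and never looks past the first NUL.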
It is useful to have stable and unique identifiers for a security principal.
The majority of identity management systems in use with Unix systems today
(e.g. Active Directory objectGUID, FreeIPA ipaUniqueID, Kanidm UUIDs) assign
each account and group a unique UUID and exposing that to applications allows
them to refer to accounts in a stable manner.
This change does not implement user or group lookup by UUID; that is left for
a later PR.
base-filesystem: avoid creating /lib64 symlink on existing rootfs
While all distributions agree on where the basic rootfs symlinks
(/bin /sbin /lib) should point to, not all of them agree on the
target of /lib64. Debian and derivatives, expect something different
than Fedora et al. This is mostly due to the different way multiarch
vs multilib are designed.
This can lead to the situation where running systemd-nspawn on Fedora
to boot a Debian container creates an incompatible symlink in the guest's
persistent, pre-created and pre-populated root filesystem, causing
issues due to these incompatibilities.
While it would be great if Debian and derivatives had the same
expectations as the rest of the world, this is baked in many places
and not likely to ever be fixable, as the multiarch vs multilib
behaviours are now very entrenched, and changing it would break
compatibilities left and right.
The core purpose of base-filesystem was to allow bringing up a system
with an empty/ephemeral/etc rootfs (and a /usr/ image on top). So as
a workaround, create /lib64 only if we detect that we have created
/bin /lib and /sbin, as that's a sure sign we are booting into an
empty rootfs that needs to be populated.
Conversely, if the filesystem _already_ has /bin /sbin and /lib,
it means it is not ephemeral and it is pre-prepared and persistent,
so it's a good idea to avoid second-guessing the image builder tool
or the package manager and overriding what it does, and just let them
carry on with the system however they configured it.
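The heuristic described above can be sketched as follows. This is a Python
model, not the C implementation: the function name `populate`, the symlink
targets, and in particular the `usr/lib64` target (which in practice depends
on the architecture and distribution) are assumptions for illustration.

```python
import os
import tempfile

# Assumed targets for the symlinks every distribution agrees on.
BASE_LINKS = {"bin": "usr/bin", "lib": "usr/lib", "sbin": "usr/sbin"}

def populate(root, lib64_target="usr/lib64"):
    """Create missing base symlinks; add lib64 only on an empty rootfs."""
    created = []
    for name, target in BASE_LINKS.items():
        path = os.path.join(root, name)
        if not os.path.lexists(path):
            os.symlink(target, path)
            created.append(name)
    # Only if we had to create all three do we treat the rootfs as empty.
    if len(created) == len(BASE_LINKS):
        lib64 = os.path.join(root, "lib64")
        if not os.path.lexists(lib64):
            os.symlink(lib64_target, lib64)
            created.append("lib64")
    return created

with tempfile.TemporaryDirectory() as empty_root:
    print(populate(empty_root))  # empty rootfs: lib64 is created as well
```

On a root that already contains /bin, /lib and /sbin the function creates
nothing, leaving any /lib64 decision to the image builder or package manager.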
Reworked and reworded, original author: Helmut Grohne <helmut@subdivi.de>
udev: drop unnecessary discardment of queued events
With the previous commit, now on_post_exit() checks only events
currently being processed. Hence, it is not necessary to discard
queued events in manager_exit().
Also, as SIGTERM has already been sent to all workers, the kill workers
timer is no longer necessary after manager_exit(), hence disable it.
This mostly does not change any behavior. Just refactoring and
preparation for later change.
udev: do not wait for event queue being empty on exit
When the manager process is requested to terminate, if a worker process
tries to lock a block device and fails, then the worker returns a
TRY_AGAIN notification and the event is requeued. Hence, the event queue
may have pending events even after manager_exit() is called. In such a
situation, sd_event_exit() will never be called, and udevd will get stuck.
With this change, after termination is requested, the manager only checks
whether there are any events currently being processed.
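The difference between the two exit conditions can be sketched with a toy
model. The `State` enum and both predicate names are hypothetical and only
mirror the description above, not the actual udevd data structures.

```python
from enum import Enum

class State(Enum):
    QUEUED = "queued"
    RUNNING = "running"

def can_exit_old(events):
    """Old condition: wait until the whole event queue is empty."""
    return len(events) == 0

def can_exit_new(events):
    """New condition: only events currently being processed block exit."""
    return not any(state is State.RUNNING for state in events.values())

# A worker failed to lock a block device, so its event was requeued.
events = {"sda": State.QUEUED}
print(can_exit_old(events), can_exit_new(events))
```

Under the old condition the requeued event keeps the queue non-empty forever
once no worker will pick it up again; under the new one it no longer blocks
termination.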
It is not necessary to wait for a worker processing an event before
sending SIGTERM. Workers will handle SIGTERM after they finish events
that they are currently processing. Let's send SIGTERM whenever it
necessary.