We didn't check the number of arguments first, hence ended up outputting
some ugly complaints with `(null)` in a format string. And what's worse
accepted any number of arguments, where we'd ignore all but the first
two though.
This partially reverts 8d04b8198d4c0cca0118f731369ad7156f0726b6.
If we completely drop the file, users will get a 404. But this document
has been in place for a long time and is referred to in many other places,
incl. our old wiki at https://www.freedesktop.org/wiki/Software/.
The page already says that it's been replaced
("… Please consult this document only as a historical reference. …").
We should only remove it from the index (which 8d04b8198d4c0cca0118f731369ad7156f0726b6 did).
In general, let's be more careful about preserving link stability.
When we change something in a way that breaks URLs, we're creating
pain for users.
emergency-action: sleep 5s before rebooting in various cases
This adds a new EMERGENCY_ACTION_SLEEP_5S flag, which when set will
delay the emergency action for 5s. This is supposed to be used together
with EMERGENCY_ACTION_WARN so that users can actually read the message
we output.
We enable this with all emergency action requests that already set
EMERGENCY_ACTION_WARN, except for the 7x ctrl-alt-del burst reboot,
where the user knows what they do and there's no real reason to wait,
they don't need to be informed.
This also enables both EMERGENCY_ACTION_WARN + EMERGENCY_ACTION_SLEEP_5S
for FailureAction= processing of regular units, where these were so far
off. (it leaves this off for SuccessAction= however!). This is a good
thing to make things more debuggable: if something fails and we reboot
this really deserves notification of the user.
(For SuccessAction= this logic does not apply, since the shutdown action
induced here is apparently intended part of the codeflow, for example in
systemd-reboot.service or a similar unit, where the shutdown is goal and
not exception and derserves no additional noisy reporting).
So far /run/systemd/ was created as side-effect of initializing the
D-Bus client/server. But in one of the next commits we'll suppress
connecting to D-Bus in test runs, hence let's move the logic our of the
D-Bus code and into manager_startup().
Then, also drop creating it again and again in PID 1 at various places,
and just rely on it to exist.
coredump,analyze: use read_full_file() for reading various top-level /proc/ files
Kernel API file systems typically use either "raw" or "seq_file" to
implement their various interface files. The former are really simple
(to point I'd call them broken), in that they have no understanding of
file offsets, and return their contents again and again on every read(),
and thus EOF is indicated by a short read, not by a zero read. The
latter otoh works like a typical file: you read until you get a
zero-sized read back.
We have read_virtual_file() to read the "raw" files, and can use regular
read_full_file() to read the "seq_file" ones.
Apparently all files in the top-level /proc/ directory use 'seq_file'.
but we accidentally used read_virtual_file() for them. Fix that.
bootctl: make sure bootctl --image= works on image with /usr/ but without / (#36727)
```
Let's make sure we can use the tool on ParticleOS images. They have no
root fs by default (until they are instantiated), but always have /usr/.
Hence add DISSECT_IMAGE_USR_NO_ROOT which has the desired effect.
```
bootctl: tweak status output when operating on --image= files
Let's not claim the system was not booted with UEFI if we use --image=.
The system wasn't booted at all, after all. Hence supress the whole
section altogether in this case.
bootctl: make sure bootctl --image= works on image with /usr/ but without /
Let's make sure we can use the tool on ParticleOS images. They have no
root fs by default (until they are instantiated), but always have /usr/.
Hence add DISSECT_IMAGE_USR_NO_ROOT which has the desired effect.
Yu Watanabe [Thu, 13 Mar 2025 03:11:40 +0000 (12:11 +0900)]
TEST-73-LOCALE: do not unnecessarily restart systemd-localed
It is not necessary to clear previous keymap assignment, as
`localectl set-keymap` will anyway overwrite the previous assignment.
This drops the unnecessary restart of systemd-localed in the loop.
The mkosi test image contains about 500~700 keymaps. The test
performance is greatly improved by reducing the number of restarts,
especially when the test is running with sanitizers.
On Fedora 41 with sanitizers,
Before:
1/1 systemd:integration-tests / TEST-73-LOCALE OK 1157.50s
After:
1/1 systemd:integration-tests / TEST-73-LOCALE OK 104.43s
Yu Watanabe [Wed, 12 Mar 2025 19:12:57 +0000 (04:12 +0900)]
udev: use INTERFACE property rather than sysname when processing network interface (#36627)
sd-device replaces '!' in sysname with '/', hence sysname may be
different from ifname.
Let's use INTERFACE property when we need network interface name.
This fixes the following unexpected renaming of network interfaces
created with '!' in their name, e.g. 'hoge!foo' -> 'hoge_foo':
```
$ run0 ip link add 'hoge!foo' type dummy
$ ip link show 'hoge!foo'
Device "hoge!foo" does not exist.
$ ip link show 'hoge_foo'
410: hoge_foo: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether ee:54:4a:dd:c4:c7 brd ff:ff:ff:ff:ff:ff
```
There are way too many users configuring the DNS= setting by mistake,
because what it seems to do is different from what it actually does. We
do not have consensus to change its behavior, so let's at least add a
warning comment.
Yu Watanabe [Wed, 5 Mar 2025 22:25:28 +0000 (07:25 +0900)]
udev/net: fix assignment of ID_NET_NAME=
E.g. sd_device object of network interface 'hoge!foo' has sysname 'hoge/foo'.
So, previously udevd assigned 'hoge/foo' rather than 'hoge!foo' to ID_NET_NAME,
hence even when renaming is not requested, such interface was renamed to 'hoge_foo'
(note '/' cannot be used in network interface name, hence escaped to underbar).
- Fixes a race in systemd-run caused by b7ba8d55b8e413ff326abc4814b92d42b8d3c3c3, which causes issue #36679.
- Skip verifying masked units in TEST-23.
- Avoid false-positive ASan warning by switching sanitizer run from
Fedora rawhide to Fedora 41, caused by recent update from
llvm-19.1.7-11.fc43 to llvm-20.1.0-1.fc43. Hopefully issue #36678 should
be fixed.
Michal KoutnĂ˝ [Mon, 3 Feb 2025 16:02:09 +0000 (17:02 +0100)]
test-cgroup-util: Ignore LXC group
LXC helper processes hide themselve in .lxc cgroup, we don't have to
deal with the inside tests (and the error in conversion to unit is handled).
Skip those but keep iterating over remaining processes to detect what
can be created around us.
Michal KoutnĂ˝ [Fri, 17 Jan 2025 17:00:25 +0000 (18:00 +0100)]
test-cgroup-util: Skip procs analysis without cgroupfs
cg_pidref_get_path() cannot work (current implementaion) without
cgroupfs (when it checks unified or not setup). Similarly,
cg_pidref_get_unit() assumes all processes are part of a unit. So carry
out the test only when running on a systemd setup.
Michal KoutnĂ˝ [Wed, 15 Jan 2025 15:36:28 +0000 (16:36 +0100)]
test-cgroup-util: Check return values
The test is supposed to check a battery of cgroup helpers on each
process found but it doesn't literally check anything besides presence
of procfs. (One can visually check printed output only. Introduction in aff38e74bd ("nspawn: suffix the nspawn cgroups with ".nspawn"").)
Make some assumptions about visible processes and turn the test into
testing that systemd helpers can deal with whatever process they find on
the SUT.
Yu Watanabe [Tue, 11 Mar 2025 19:50:33 +0000 (04:50 +0900)]
resolve question marks in /etc/hostname to characters hashed from machine ID (#36647)
So I have a bunch of particle os instances around, that I frequently
factory reset. and it's confusing, since they all have the same name.
Let's do something about this, and extend the hostname setup logic a bit
to deal better with "cattle" rather than "pet" deployments.
Specifically: if a hostname in /etc/hostname contains a bunch of
question marks we'll replace it with hex chars hashed from the machine
id.
Yu Watanabe [Mon, 10 Mar 2025 20:15:11 +0000 (05:15 +0900)]
run: check if the start job is finished on PropertiesChanged signal and so on
Otherwise, if systemd-run is disconnected from bus before JobRemoved
signal, then c->start_job will never freed, thus run_context_check_done()
will never call sd_event_exit() even after the service is finished.
This drops monitoring JobRemoved signal, and make systemd-run check if
the start job is started when PropertiesChanged signal is received.
Yu Watanabe [Mon, 10 Mar 2025 16:54:28 +0000 (01:54 +0900)]
ci/mkosi: enable sanitizers on Fedora 41
It seems the recent update of LLVM package in Fedora rawhide breaks
sanitizers, and udevd freezes after false-positive (I guess) issue is
detected:
systemd-udevd[2646]: =================================================================
systemd-udevd[2646]: ==2646==ERROR: AddressSanitizer: stack-buffer-underflow on address 0x7ffc3a642660 at pc 0x555627ac022b bp 0x7ffc3a6422b0 sp 0x7ffc3a6422a8
systemd-udevd[2646]: READ of size 8 at 0x7ffc3a642660 thread T0 ((udev-worker))
llvm-19.1.7-11.fc43 worked fine, but llvm-20.1.0-1.fc43 does not.
To avoid the issue, let's enable sanitizer on Fedora 41, and disable it
on Fedora rawhide.
Yu Watanabe [Mon, 10 Mar 2025 19:21:11 +0000 (04:21 +0900)]
TEST-23-UNIT-FILE: skip verifying masked unit
This fixes the following failure:
TEST-23-UNIT-FILE.sh[2408]: + systemd-analyze --recursive-errors=no --man=no verify /usr/lib/systemd/system/sysinit.target.wants/systemd-hwdb-update.service
systemd-analyze[2737]: sys-kernel-config.mount: symlinks are not allowed for units of this type, rejecting.
systemd-analyze[2737]: proc-sys-fs-binfmt_misc.automount: symlinks are not allowed for units of this type, rejecting.
systemd-analyze[2737]: dev-hugepages.mount: symlinks are not allowed for units of this type, rejecting.
systemd-analyze[2737]: sys-kernel-tracing.mount: symlinks are not allowed for units of this type, rejecting.
systemd-analyze[2737]: sys-kernel-debug.mount: symlinks are not allowed for units of this type, rejecting.
systemd-analyze[2737]: sys-fs-fuse-connections.mount: symlinks are not allowed for units of this type, rejecting.
systemd-analyze[2737]: dev-mqueue.mount: symlinks are not allowed for units of this type, rejecting.
systemd-analyze[2737]: Unit systemd-hwdb-update.service is masked.
TEST-23-UNIT-FILE.sh[166]: + :
TEST-23-UNIT-FILE.sh[166]: + kill -0 2408
TEST-23-UNIT-FILE.sh[166]: + wait 2408
TEST-23-UNIT-FILE.sh[166]: + echo 'Subtest /usr/lib/systemd/tests/testdata/units/TEST-23-UNIT-FILE.verify-unit-files.sh failed'
TEST-23-UNIT-FILE.sh[166]: Subtest /usr/lib/systemd/tests/testdata/units/TEST-23-UNIT-FILE.verify-unit-files.sh failed
mountfsd: also return suggested mount point paths for the returned partitions
When mounting a disk image we return a bunch of mount fds referencing
the various partitions in the disk, along with some metadata about them.
One key metadata field is the "designator" which is supposed to tell
clients what is what, and where to mount it.
Let's make this more explicit: let's also include the literal relative
path where each mount shall be placed, to simplify implementations of
clients that do not care about the concept of designators.
basic: move gethostname_full() from basic/hostname-util.c → shared/hostname-setup.c
In one of the next commits we'd like to introduce a concept of
optionally hashing the hostname from the machine ID. For that we we need
to optionally back gethostname_full() by code involving sd-id128, hence
let's move it from src/basic/ to src/shared/, since only there we are
allowed to use our public APIs.
David Tardon [Fri, 7 Mar 2025 15:22:00 +0000 (16:22 +0100)]
bus-polkit: shortcut auth. after first denial
A D-Bus/Varlink method can issue PolicyKit auth. requests for multiple
actions; in this case the method is expected to fail on the first one
that is not allowed. This is enforced by asserts in
async_polkit_read_reply(), but that's a wrong place for the check for
two reasons:
1. it doesn't allow to get a meaningful stack trace;
2. sending the query to polkit is already a pointless exercise.
Let's do the check in *_verify_polkit_async_full() and don't send
anything to PolicyKit in that case.
Inspired by https://bugzilla.redhat.com/show_bug.cgi?id=2349594 .
cgroup-util: Handle capsule@ paths like user@ paths (#36664)
The capsule instances are related to user instances, so treat them
equally to user@.service when handling cgroup paths. This also saves us
from polluting public libsystemd API with variant for capsules too.
Michal KoutnĂ˝ [Mon, 3 Feb 2025 13:44:20 +0000 (14:44 +0100)]
cgroup-util: Handle capsule@ paths like user@ paths
The capsule instances are related to user instances, so treat them
equally to user@.service when handling cgroup paths. This also saves us
from polluting public libsystemd API with variant for capsules too.
Mike Yuan [Fri, 25 Oct 2024 23:51:04 +0000 (01:51 +0200)]
core/service: introduce sd_notify() RESTART_RESET=1 for resetting restart counter
We have RestartMaxDelaySec= + RestartSteps= to exponentially increase
auto restart durations, but it currently cannot be reset by the service
itself, which makes it sometimes awkward to use. A typical pattern
in real life is that a service was once down (e.g. due to temporary
network interruption) and multiple restarts were attempted. Then,
future restarts would always wait for increated amount of time based on
RestartMaxDelaySec=, even after the original problem got resolved.
Such "persistence" could result in longer unavailablity than there
should be for failures that come later.
(C.f. https://utcc.utoronto.ca/~cks/space/blog/linux/SystemdResettingUnitBackoff)
Let's introduce a new sd_notify() notification for resetting the restart
counter. There were discussions about making this timer-based, but I think
it's more flexible to leave the decision-making to the service. This enables
them to do a combination of N successful requests + uptime check for instance.
Yu Watanabe [Mon, 10 Mar 2025 13:44:02 +0000 (22:44 +0900)]
udev: scan partitions and trigger synthetic change events in child process
Rereading partition table may take longer on slow disk. The main process
should not be blocked by the operation. Let's fork a child process and
do that on the child.