Luca Boccassi [Sat, 23 Nov 2024 13:28:03 +0000 (13:28 +0000)]
test: mask tmpfiles.d file shipped by selinux policy package in containers
This tmpfiles.d wants to write to sysfs, which is read-only in containers,
so systemd-tmpfiles --create fails in TEST-22-TMPFILES when ran in nspawn
if the selinux policy package is instealled. Mask it, as it's not our
config file, we don't need it in the test.
Michał Górny [Sun, 17 Nov 2024 15:34:35 +0000 (16:34 +0100)]
nspawn: Include arm_fadvise64_64 in syscall allow_list
Add the `arm_fadvise64_64` syscall to the allow_list, in addition
to the existing `fadvise64` and `fadvise64_64` syscalls, as this is
the syscall actually defined for `arm` architecture. Adding it fixes
the syscall being rejected in arm32 containers.
tests: fix access mode of root inode of throw-away container images
Otherwise the root inode will typically have what mkdtemp sets up, which
is something like 0700, which is weird and somewhat broken when trying
to look into containers from unpriv users.
nspawn: don't try to unregister a machine we never registered
When registering we condition this on "arg_register". Let's do the same
when unregistering, otherwise we might end up trying to unregister a
machine we never registered.
sd-varlink: fix bug when enqueuing messages with fds asynchronously
When determining the poll events to wait for we need to take the queue
of pending messages that carry fds into account. Otherwise we might end
up not waking up if such an fd-carrying message is enqueued
asynchronously (i.e. not from a dispatch callback).
Otherwise, when systemd-udev-trigger.service is (re)started just before
daemon-reexec, which can be easily happen on systemd package update, then
udev database files for many devices may have ID_PROCESSING=1 property,
thus devices may not be enumerated on daemon-reexec. That causes many
units especially mount units being deactivated after daemon-reexec.
Undeprecate commandline params forcequotacheck, fastboot, and forcefsck
Those are historical names, but there is nothing wrong with them. The files on
/ (/fastboot, /forcefsck, and /forcequotacheck) are problematic because they
require a modification of the root file system. But the commandline params work
fine. They have the obvious advantage compared to our "modern" option that they
are much easier to type without looking up the spelling in the docs. Undeprecate
them to avoid unnecessary churn.
killall: gracefully handle processes inserted into containers via nsenter -a
"nsenter -a" doesn't migrate the specified process into the target
cgroup (it really should). Thus the cgroup will remain in a cgroup
that is (due to cgroup ns) outside our visibility. The kernel will
report the cgroup path of such cgroups as starting with "/../". Detect
that and print a reasonably error message instead of trying to resolve
that.
Mike Yuan [Mon, 18 Nov 2024 18:41:07 +0000 (19:41 +0100)]
core/exec-invoke: suppress placeholder home only in build_environment()
Currently, get_fixed_user() employs USER_CREDS_SUPPRESS_PLACEHOLDER,
meaning home path is set to NULL if it's empty or root. However,
the path is also used for applying WorkingDirectory=~, and we'd
spuriously use the invoking user's home as fallback even if
User= is changed in that case.
Let's instead delegate such suppression to build_environment(),
so that home is proper initialized for usage at other steps.
shell doesn't actually suffer from such problem, but it's changed
too for consistency.
cryptenroll: show better log message if slot to wipe does not exist
```
$ systemd-cryptenroll /dev/vda3
SLOT TYPE
0 password
$ systemd-cryptenroll --wipe-slot 1 /dev/vda3
Failed to wipe slot 1, continuing: No such file or directory
```
systemctl: grey out tasks limit the same way we grey out the fd store limit in the output
"systemctl status systemd-logind" otherwise looks a bit weird, since the
tasks and the fdstore lines are so close to each other but formatted
quite differently when it comes to coloring.
pid1: make clear that $WATCHDOG_USEC is set for the shutdown binary, noone else
We use the $WATCHDOG_USEC variable for two very closely uses: as part of
the sd_watchdog_enabled() protocol for implementing service watchdogs.
And as part of the protocol between the service manager and
systemd-shutdown across the PID 1 execve() transition during shutdown.
Apparently some exitrds tools got confused by the latter use. Let's
address that by setting $WATCHDOG_PID to 1, in accordance to the
sd_watchdog_enabled() protocol to make clear this is only intended for
PID 1 and nothing else.
Luca Boccassi [Thu, 14 Nov 2024 16:19:25 +0000 (16:19 +0000)]
test: skip TEST-84-STORAGETM if running with bugged libnvme
libnvme 1.11 appears to require a kernel built with NVME TLS
kconfigs, and fails hard if it is not, as the expected
privileged keyring '.nvme' is not present. We cannot just
create it from userspace, as privileged keyrings can only
be created by the kernel itself (those starting with '.').
Skip the test if the library exactly matches this version.
andre4ik3 [Wed, 13 Nov 2024 14:53:25 +0000 (18:53 +0400)]
boot: allocate cleanup pages below 4GiB only on x86
Outside of x86, some machines (e.g. Apple silicon, AMD Opteron A1100) have
physical memory mapped above 4GiB, meaning this allocation will fail, causing
the entire boot process to fail on these machines.
This commit makes it so that the below-4GB address space allocation requirement
is only set on x86 platforms, and not on other platforms (that don't have the
specific Linux x86 boot protocol), thereby fixing boot on those that have no
memory mapped below 4GiB in their address space.
Tested on an Apple silicon M1 laptop and an AMD x86_64 desktop tower.
Mike Yuan [Wed, 13 Nov 2024 16:45:53 +0000 (17:45 +0100)]
portable: do not use SYNTHETIC_ERRNO for sd_bus_error_set_errno()
The concept of synthetic errnos is about logging, which
is irrelevant irt bus error and we don't do any special
treatment in sd-bus for them, meaning the value propagated
would be spurious.
With the offending commit, on remove event, database file for a device is once
removed in event_execute_rules_on_remove(), but later re-created here.
This fixes the issue, and makes the database file not re-created on remove event.
Štěpán Němec [Mon, 11 Nov 2024 19:10:00 +0000 (20:10 +0100)]
man: fix incorrect volume numbers in internal man page references
Some ambiguity (e.g., same-named man pages in multiple volumes)
makes it impossible to fully automate this, but the following
Python snippet (run inside the man/ directory of the systemd repo)
helped to generate the sed command lines (which were subsequently
manually reviewed, run and the false positives reverted):
for file in Path(".").glob("*.xml"):
tree = ET.parse(file, lxml.etree.XMLParser(recover=True))
meta = tree.find("refmeta")
if meta is not None:
title = meta.findtext("refentrytitle")
if title is not None:
vol = meta.findtext("manvolnum")
if vol is not None:
man2vol[title] = vol
citerefs = list(tree.iter("citerefentry"))
if citerefs:
man2citerefs[title] = citerefs
for man, refs in man2citerefs.items():
for ref in refs:
title = ref.findtext("refentrytitle")
if title is not None:
has = ref.findtext("manvolnum")
try:
should_have = man2vol[title]
except KeyError: # Non-systemd man page reference? Ignore.
continue
if has != should_have:
print(
f"sed -i '\\|<citerefentry><refentrytitle>{title}"
f"</refentrytitle><manvolnum>{has}</manvolnum>"
f"</citerefentry>|s|<manvolnum>{has}</manvolnum>|"
f"<manvolnum>{should_have}</manvolnum>|' {man}.xml"
)
`loginctl kill-session --kill-whom=leader <N>` (or the D-Bus equivalent)
doesn't work because logind ends up calling `KillUnit(..., "main", ...)`
on a scope unit and these don't have a `MainPID` property. Here, I just
make it send a signal to the `Leader` directly.
Lidong Zhong [Thu, 7 Nov 2024 06:41:11 +0000 (14:41 +0800)]
udev: skipping empty udev rules file while collecting the stats
To keep align with the logic used in udev_rules_parse_file(), we also
should skip the empty udev rules file while collecting the stats during
manager reload. Otherwise all udev rules files will be parsed again whenever
reloading udev manager with an empty udev rules file. It's time consuming
and the following uevents will fail with timeout.
man: drop whitespace from final <programlisting> lines
In the troff output, this doesn't seem to make any difference. But in the
html output, the whitespace is sometimes preserved, creating an additional
gap before the following content. Drop it everywhere to avoid this.
Since v256 we completely fail to boot if v1 is configured. Fedora 41 was just
released with v256.7 and this is probably the first major exposure of users to
this code. It turns out not work very well. Fedora switched to v2 as default in
F31 (2019) and at that time some people added configuration to use v1 either
because of Docker or for other reasons. But it's been long enough ago that
people don't remember this and are now very unhappy when the system refuses to
boot after an upgrade.
Refusing to boot is also unnecessarilly punishing to users. For machines that
are used remotely, this could mean somebody needs to physically access the
machine. For other users, the machine might be the only way to access the net
and help, and people might not know how to set kernel parameters without some
docs. And because this is in systemd, after an upgrade all boot choices are
affected, and it's not possible to e.g. select an older kernel for boot. And
crashing the machine doesn't really serve our goal either: we were giving a
hint how to continue using v1 and nothing else.
If the new override is configured, warn and immediately boot to v1.
If v1 is configured w/o the override, warn and wait 30 s and boot to v2.
Also give a hint how to switch to v2.
The advice is to set systemd.unified_cgroup_hierarchy=1 (instead of removing
systemd.unified_cgroup_hierarchy=0). I think this is easier to convey. Users
who are understand what is going on can just remove the option instead.
The caching is dropped in cg_is_legacy_wanted(). It turns out that the
order in which those functions are called during early setup is very fragile.
If cg_is_legacy_wanted() is called before we have set up the v2 hierarchy,
we incorrectly cache a true answer. The function is called just a handful
of times at most, so we don't really need to cache the response.
Let's systematically make sure that we link up the D-Bus interfaces from
the daemon man pages once in prose and once in short form at the bottom
("See Also"), for all daemons.
Also, add reverse links at the bottom of the D-Bus API docs.
man: tone down claims on processes having exited already in ExecStop=
Processes can easily survive the first kill operation we execute, hence
we shouldn't make strong claims about them having exited already. Let's
just say "likely" hence.
resolved: log error messages for openssl/gnutls context creation
In https://bugzilla.redhat.com/show_bug.cgi?id=2322937 we're getting
an error message:
Okt 29 22:21:03 fedora systemd-resolved[29311]: Could not create manager: Cannot allocate memory
I expect that this actually comes from dnstls_manager_init(), the
openssl version. But without real logs it's hard to know for sure.
Use EIO instead of ENOMEM, because the problem is unlikely to be actually
related to memory.
Martin Wilck [Wed, 30 Oct 2024 15:57:39 +0000 (16:57 +0100)]
udev-builtin-path_id: SAS wide ports must have num_phys > 1
Some kernel SAS drivers (e.g. smartpqi) expose ports with num_phys = 0. udev
shouldn't treat these ports as wide ports. SAS wide ports always have
num_phys > 1. See comments for sas_port_add_phy() in the kernel sources.
Sample data from a smartpqi system to illustrate the issue below.
Here the phy device is attached to port 0:0, which has no end devices attached
and the SAS end device (where sda is attached) is associated with SAS
port 0:1, which has no associated phy device. Thus num_phys for port-0:1 is 0.
This is arguably wrong, but it's how smartpqi has always set up its devices in
sysfs.
Mike Gilbert [Thu, 24 Oct 2024 16:24:35 +0000 (12:24 -0400)]
posix_spawn_wrapper: do not set POSIX_SPAWN_SETSIGDEF flag
Setting this flag is a noop without a corresponding call to
posix_spawnattr_setsigdefault.
If we call posix_spawnattr_setsigdefault with a full signal set,
it causes glibc's posix_spawn implementation to call sigaction 63 times,
once for each signal. That seems wasteful.
This feature is really only useful for signals which have their
disposition set to SIG_IGN. Otherwise the dispostion gets set to
SIG_DFL automatically, either by clone(CLONE_CLEAR_SIGHAND) or the
subsequent execve.
As far as I can tell, systemd does not have any signals set to SIG_IGN
under normal operating conditions.
Łukasz Stelmach [Tue, 29 Oct 2024 14:53:45 +0000 (15:53 +0100)]
core: make mount(8) and swapon(8) inherit SMACK label from systemd
By default mount(8), umount(8), swapon(8) and swapoff(8) should run with
with the SMACK label inherited from systemd rather than the default one
meant for services.
Yu Watanabe [Wed, 23 Oct 2024 19:40:45 +0000 (04:40 +0900)]
network: process queued remove requests before networkd is stopped
This makes networkd process all queued remove requests when a
terminating or restarting signal is received. Otherwise, e.g. DHCPv4
address will not be removed on stop, especially when
KeepConfiguration=no.
cryptenroll,homectl,journalctl: adjust messages before qrcodes
Users will generally know what a qrcode is, so let's not treat them as dumb and
explain that it can be scanned. OTOH, we should say what the qrcode contains
and it is useful to give a hint why the users would want to scan it. Reword
messages accordingly.
(Also, don't say "to your phone", when somebody might be using a stolen phone,
or something else then a phone.)
hugo303 [Fri, 25 Oct 2024 10:15:02 +0000 (12:15 +0200)]
analyze: Add times in seconds for Activating and Activated in tooltip
Print the times in seconds in the tooltip to remove the need to count
and trying to follow the lines in the svg diagram in order to see at
what times these events happen.
When invoked on a running system, bsod would not print the qrcode.
The check for "color support" on stdout is pointless, since we're not
printing to stdout but to a terminal fd that is opened separately.