nsresourced: check polkit before executing our operations
Let's tighten rules on namespace operations: let's always ask PK for
permission before doing anything.
Note that if polkit is absent we'll still allow things, and the default
PK policy will also still allow things, but there's now a clear way for
people to disallow things if they want, by modifying the PK policy.
nsresourced: explicitly remove network interfaces when their userns goes away
Let's tighten the screws a bit on the network interfaces we delegate,
and explicitly destroy them, just like we destroy delegated cgroups.
Ideally, this should happen automatically because the userns goes away
that pins the veth, or because the client holding an fd for a tap device
goes away as the userns goes away. But you never know who keeps a
reference, hence let's explicitly destroy this too.
This is a simple helper for creating a userns that just maps the
caller's user to UID 0 in the namespace. This can be acquired unprivileged,
which makes it useful for various purposes, for example for the logic in
is_idmapping_supported(), hence port it over.
(is_idmapping_supported() used a different mapping before, with the
nobody users, but there's no real reason for that, and we'll use
userns_acquire_self_root() elsewhere soon, where the root mapping is
important).
namespace-util: make "setgroups" users property writable via userns_acquire()
Unprivileged namespaces are only allowed if the "setgroups" file is set
to "deny" for processes. And we need to write it before writing the
gidmap. Hence add a parameter for that.
Then, also patch all current users to actually enable this. The use cases
generally don't need it (because they don't care about unprivileged
userns), but it doesn't hurt to enable the concept anyway in all current
users (none of them actually runs complex userspace in them, but they
mostly use userns_acquire() for idmapped mounts and similar).
Let's anyway make this option explicit in the function call, to indicate
that the concept exists and is applied.
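
The ordering constraint above can be illustrated with a small Python sketch. This is not systemd's implementation: the function name `userns_map_writes` and its parameters are hypothetical, and it only computes which /proc files would have to be written, and in which order, to map the caller's UID and GID to 0 in a user namespace.

```python
def userns_map_writes(pid, uid, gid, setgroups_deny=True):
    """Return (path, content) pairs in the order they would be written.

    Hypothetical helper for illustration only: maps the given UID/GID to 0
    inside the user namespace of process `pid`. Nothing is written here.
    """
    writes = [(f"/proc/{pid}/uid_map", f"0 {uid} 1\n")]
    if setgroups_deny:
        # For unprivileged callers this has to happen before gid_map.
        writes.append((f"/proc/{pid}/setgroups", "deny"))
    writes.append((f"/proc/{pid}/gid_map", f"0 {gid} 1\n"))
    return writes
```

Keeping `setgroups_deny` as an explicit parameter mirrors the idea of making the option visible at every call site, even where the caller is privileged and would not strictly need it.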
I recently noticed that our serial/VM terminals did not get fedora's
color shell prompt, nor got color support in "ls".
I spent a bit of time investigating and it's all a bit of a mess. If we
don't have any idea what kind of terminal we are talking to via serial
or hypervisor console then we so far just set TERM=vt220 as a reasonable
fallback: vt220 is quite universally defined in terminfo/termcap, and it
supports pageup/pagedown (unlike vt100).
However, real vt220 DEC terminals did not support color, and hence
termcap/terminfo says "no color, sorry". Which sucks, but actually
neither coreutils' "ls" (via `dircolors`) nor fedora's color shell
prompt actually care for termcap/terminfo. So why don't we get color?
In the coreutils case: it has its own mini-database of terminals. A
very skewed one, where TERM=vt100 enables colors (and DEC vt100
definitely never ever had color support!), but vt220 does not. However,
what it actually does is check $COLORTERM. If that's set then it would
enable color.
In the fedora color prompt case: it tries to derive color support by
looking for the word "color" in $TERM. Horrible hack if you ask me...
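
To make the two heuristics concrete, here is a rough Python model of them. It is a sketch under assumptions, not the actual code of either project: the function names are hypothetical, and `ASSUMED_COLOR_TERMS` is a tiny illustrative subset rather than the real coreutils terminal list.

```python
# Illustrative subset only, NOT the real dircolors terminal database.
ASSUMED_COLOR_TERMS = {"vt100", "xterm", "linux"}


def ls_like_wants_color(env):
    """Rough model of the coreutils behaviour described above."""
    # A non-empty $COLORTERM enables color regardless of $TERM.
    if env.get("COLORTERM"):
        return True
    return env.get("TERM", "") in ASSUMED_COLOR_TERMS


def prompt_like_wants_color(env):
    """Rough model of the prompt heuristic: substring match on $TERM."""
    return "color" in env.get("TERM", "")
```

With `TERM=vt220` alone both functions return False, which matches the missing colors on serial/VM terminals. Adding `COLORTERM=truecolor` flips only the first one, which is why the prompt package needs a separate change.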
In order to make things better I did a bunch of things:
1. I think the idea of actually having a fully correct and up-to-date
termcap/terminfo database is kinda illusory these days. But
apparently regarding color support $COLORTERM kinda took its place.
coreutils cares, and systemd itself cares too. To some point at least:
we consume it to determine color support, but we never propagate it in
nspawn, run0 and so on. So this PR fixes that.
2. Also, we are kinda stuck with vt220 I guess as default fallback for
serial terminals. But let's tweak it, and set $COLORTERM=truecolor as
default too. This means we default to a vt220 terminal, but with color.
Which is an ahistorical thing to do, but I think it's the best way out.
3. I also filed a bug against util-linux asking them to treat $COLORTERM
like $TERM, and let it propagate from getty into login shell:
https://github.com/util-linux/util-linux/issues/3463 – With that we
should get color support in ls by default now.
4. I also asked coreutils to treat vt220 the same as they already treat
vt100 and simply do color, even though that's ahistorical:
https://github.com/coreutils/coreutils/issues/96
5. I then asked the fedora color prompt package to check $COLORTERM:
https://bugzilla.redhat.com/show_bug.cgi?id=2352650
6. I also asked the fedora ssh package to propagate $COLORTERM to remote
hosts by default, like they already cover $TERM. Terminal emulators set
both these days generally, hence this would make sense.
https://bugzilla.redhat.com/show_bug.cgi?id=2352653
7. While at it, I figured it makes sense to not only propagate/consume
$COLORTERM at the same time as $TERM, but also consider $NO_COLOR. In
contrast to $COLORTERM for which no spec seems to exist, that one
actually does have a spec: https://no-color.org/
It might make sense for those interested in other distros than Fedora to
maybe ask for similar changes for their ssh and color shell prompt
packages (if they have something corresponding).
Luca Boccassi [Mon, 17 Mar 2025 11:29:33 +0000 (11:29 +0000)]
build: add C23 support (#35085)
To support C23, this introduces UTF8() macro to define UTF-8 literals,
as C23 changed the element type of u8 literals from char to char8_t
(an unsigned char).
This also makes pointer signedness warning critical, and updates C
standards table for tests.
When we pass information about our calling terminal on to some service
or command we invoke, propagate $COLORTERM + $NO_COLOR in addition to
$TERM, in order to always consider the triplet of the three env vars the
real deal.
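
As a minimal sketch of what "propagating the triplet" means, the following Python helper copies the three variables from the caller's environment into the environment of whatever is invoked. The names `TERMINAL_ENV_VARS` and `propagate_terminal_env` are hypothetical and only serve the illustration.

```python
TERMINAL_ENV_VARS = ("TERM", "COLORTERM", "NO_COLOR")


def propagate_terminal_env(caller_env, child_env=None):
    """Return a copy of child_env with the terminal triplet taken from caller_env."""
    result = dict(child_env or {})
    for name in TERMINAL_ENV_VARS:
        # Only forward what the caller actually has set.
        if name in caller_env:
            result[name] = caller_env[name]
    return result
```

Variables that are unset on the calling side are deliberately not invented on the receiving side, so an absent $NO_COLOR stays absent.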
main: explicitly pick up $COLORTERM + $NO_COLOR from kernel cmdline where we pick up $TERM
I think we should work towards always picking up the triplet of $TERM +
$COLORTERM + $NO_COLOR where we so far picked up $TERM only. I think
it's safe to say that at this time, $TERM is not enough anymore to
clearly communicate terminal feature support. Hence, teach PID 1 to pick
$COLORTERM + $NO_COLOR wherever we pick up $TERM.
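
A rough sketch of picking up all three variables from a kernel command line string could look like this in Python. It assumes the variables appear as plain `KEY=value` words and ignores quoting; how PID 1 actually parses the command line is not shown here, and the function name is hypothetical.

```python
TERMINAL_ENV_VARS = ("TERM", "COLORTERM", "NO_COLOR")


def terminal_env_from_cmdline(cmdline):
    """Extract TERM=, COLORTERM= and NO_COLOR= assignments from a cmdline string.

    Simplification: splits on whitespace and ignores kernel-style quoting.
    """
    found = {}
    for word in cmdline.split():
        key, sep, value = word.partition("=")
        if sep and key in TERMINAL_ENV_VARS:
            found[key] = value
    return found
```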
exec: when we have no $TERM configuration, and we default to vt220, also set $COLORTERM
When we configure a serial or VM terminal and have no $TERM
configuration, then we default to vt220 as a fallback. This is a pretty
safe bet, since the termcap/terminfo definitions for vt220 are
relatively widely available (much like vt100), and (in contrast to
vt100) it supports pageup/pagedown keys. vt220 is a terminal without
color support however, but we do want color support, because in 2025
there's really no terminal emulator without color in this world.
The $COLORTERM env var is used by many emulators and tools to
communicate that ANSI color support is available, despite what $TERM
says. Hence, let's tweak systemd's logic to also set $COLORTERM in case
we set the vt220 $TERM fallback.
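
The fallback rule can be summarized in a few lines of Python. This is only a sketch of the described behaviour, with a hypothetical function name; the `truecolor` value is an assumption here, chosen because it is a commonly used $COLORTERM setting.

```python
def terminal_env_with_fallback(configured_term=None):
    """Return the env assignments for a serial/VM terminal.

    If no $TERM is configured, fall back to vt220 and additionally
    advertise color support via $COLORTERM.
    """
    if configured_term:
        # An explicit configuration is passed through untouched.
        return {"TERM": configured_term}
    return {"TERM": "vt220", "COLORTERM": "truecolor"}
```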
This means we define an ahistoric frankenterminal: a vt220 (that
historically definitely didn't have color) that is explicitly configured
to have color.
One effect of this is that coreutils' dircolors command will start to
output color sequences in systemd's serial or VM terminals. (Since it
actually honours $COLORTERM).
Also note that systemd itself has checked $COLORTERM for a long time, hence
it makes sense for us to also set it.
Note that this unfortunately doesn't have the desired effect of
propagating $COLORTERM into any getty shell sessions yet. That's because
util-linux' login package currently filters $COLORTERM (but lets $TERM
through). I filed a bug about that here:
Yu Watanabe [Mon, 17 Mar 2025 03:18:41 +0000 (12:18 +0900)]
udev-builtin-btrfs: refuse to call for irrelevant device node
When the btrfs builtin command is called, check whether the specified
device node is owned by the device.
This also allows the command to be called with any device node specified.
Yu Watanabe [Sun, 16 Mar 2025 21:53:46 +0000 (06:53 +0900)]
nspawn: introduce --cleanup option (#34776)
This is useful when the previous invocation is unexpectedly killed.
Otherwise, if systemd-nspawn is killed forcibly, then the unix-export
directory is not cleared and unmounted, and the subsequent invocation
will fail. E.g.
```
[ 18.895515] TEST-13-NSPAWN.sh[645]: + machinectl start long-running
[ 18.945703] systemd-nspawn[1387]: Mount point '/run/systemd/nspawn/unix-export/long-running' exists already, refusing.
[ 18.949236] systemd[1]: systemd-nspawn@long-running.service: Failed with result 'exit-code'.
[ 18.949743] systemd[1]: Failed to start systemd-nspawn@long-running.service.
```
Mike Yuan [Mon, 10 Mar 2025 18:42:05 +0000 (19:42 +0100)]
semaphore-runner: disable cgroup setup in lxc
lxc tries to mount /sys/fs/cgroup/ following host hierarchy by default,
which is problematic for us since we want to unconditionally use
cgroup v2 in cgns. Disable it hence and let pid1 figure it out.
Yu Watanabe [Tue, 15 Oct 2024 08:25:09 +0000 (17:25 +0900)]
nspawn: introduce --cleanup option to clear propagation and unix-export directories
This is useful when the previous invocation is unexpectedly killed.
Otherwise, if systemd-nspawn is killed forcibly, then the unix-export
directory is not cleared and unmounted, and the subsequent invocation
will fail. E.g.
===
[ 18.895515] TEST-13-NSPAWN.sh[645]: + machinectl start long-running
[ 18.945703] systemd-nspawn[1387]: Mount point '/run/systemd/nspawn/unix-export/long-running' exists already, refusing.
[ 18.949236] systemd[1]: systemd-nspawn@long-running.service: Failed with result 'exit-code'.
[ 18.949743] systemd[1]: Failed to start systemd-nspawn@long-running.service.
===
Yu Watanabe [Sun, 16 Mar 2025 00:31:43 +0000 (09:31 +0900)]
macro: Introduce UTF8() macro to define UTF-8 string literal
C23 changed the element type of u8 literals from char to char8_t (an
unsigned char), hence assigning a u8 literal to const char* emits a
pointer sign warning, e.g.
========
../src/shared/qrcode-util.c: In function ‘print_border’:
../src/shared/qrcode-util.c:16:34: warning: pointer targets in passing argument 1 of ‘fputs’ differ in signedness [-Wpointer-sign]
16 | #define UNICODE_FULL_BLOCK u8"█"
| ^~~~~
| |
| const unsigned char *
../src/shared/qrcode-util.c:65:39: note: in expansion of macro ‘UNICODE_FULL_BLOCK’
65 | fputs(UNICODE_FULL_BLOCK, output);
| ^~~~~~~~~~~~~~~~~~
========
This introduces the UTF8() macro, which defines a u8 literal and casts it
to const char*, then rewrites all u8 literal definitions with the macro.
With this change, we can build systemd with C23.
The log line looked like this:
bootctl[1457]: ! Mount point '/efi' which backs the random seed file is world accessible, which is a security hole! !
which doesn't look nice.
Also upgrade the message to error. This is something to fix.
basic/glyph-util: rename "special glyph" to just "glyph"
Admittedly, some of our glyphs _are_ special, e.g. "O=" for SPECIAL_GLYPH_TOUCH ;)
But we don't need this in the name. The very long names make some invocations
very wordy, e.g. special_glyph(SPECIAL_GLYPH_SLIGHTLY_UNHAPPY_SMILEY).
Also, I want to add GLYPH_SPACE, which is not special at all.
Yu Watanabe [Sat, 15 Mar 2025 00:04:25 +0000 (09:04 +0900)]
test: drop redundant parentheses in ASSERT_OK() and friends
This reverts 278e3adf50e36518c5a5dd798ca998e7eac5436e, and drops more
redundant parentheses, as they unfortunately do not suppress the
false-positive warnings by coverity.
We didn't check the number of arguments first, hence ended up outputting
some ugly complaints with `(null)` in a format string. And what's worse,
we accepted any number of arguments, ignoring all but the first two.
This partially reverts 8d04b8198d4c0cca0118f731369ad7156f0726b6.
If we completely drop the file, users will get a 404. But this document
has been in place for a long time and is referred to in many other places,
incl. our old wiki at https://www.freedesktop.org/wiki/Software/.
The page already says that it's been replaced
("… Please consult this document only as a historical reference. …").
We should only remove it from the index (which 8d04b8198d4c0cca0118f731369ad7156f0726b6 did).
In general, let's be more careful about preserving link stability.
When we change something in a way that breaks URLs, we're creating
pain for users.
emergency-action: sleep 5s before rebooting in various cases
This adds a new EMERGENCY_ACTION_SLEEP_5S flag, which when set will
delay the emergency action for 5s. This is supposed to be used together
with EMERGENCY_ACTION_WARN so that users can actually read the message
we output.
We enable this with all emergency action requests that already set
EMERGENCY_ACTION_WARN, except for the 7x ctrl-alt-del burst reboot,
where the user knows what they do and there's no real reason to wait,
they don't need to be informed.
This also enables both EMERGENCY_ACTION_WARN + EMERGENCY_ACTION_SLEEP_5S
for FailureAction= processing of regular units, where these were so far
off. (it leaves this off for SuccessAction= however!). This is a good
thing to make things more debuggable: if something fails and we reboot
this really deserves notification of the user.
(For SuccessAction= this logic does not apply, since the shutdown action
induced here is apparently an intended part of the code flow, for example in
systemd-reboot.service or a similar unit, where the shutdown is the goal and
not an exception, and deserves no additional noisy reporting).
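
The interplay of the two flags can be sketched in Python as follows. The enum and function here are hypothetical stand-ins for the C flags named above, not a port of systemd's code; `warn` and `sleep` are injectable so the behaviour can be exercised without actually waiting.

```python
import enum
import time


class EmergencyActionFlags(enum.Flag):
    NONE = 0
    WARN = 1
    SLEEP_5S = 2


def run_emergency_action(action, flags, message, warn=print, sleep=time.sleep):
    """Optionally warn and wait, then invoke `action` (a callable)."""
    if EmergencyActionFlags.WARN in flags:
        warn(message)
    if EmergencyActionFlags.SLEEP_5S in flags:
        # Give the user a chance to read the warning before the action hits.
        sleep(5)
    action()
```

In this model a FailureAction= request would pass `WARN | SLEEP_5S`, the ctrl-alt-del burst would pass `WARN` only, and SuccessAction= would pass `NONE`.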
So far /run/systemd/ was created as side-effect of initializing the
D-Bus client/server. But in one of the next commits we'll suppress
connecting to D-Bus in test runs, hence let's move the logic out of the
D-Bus code and into manager_startup().
Then, also drop creating it again and again in PID 1 at various places,
and just rely on it to exist.
coredump,analyze: use read_full_file() for reading various top-level /proc/ files
Kernel API file systems typically use either "raw" or "seq_file" to
implement their various interface files. The former are really simple
(to the point I'd call them broken), in that they have no understanding of
file offsets, and return their contents again and again on every read(),
and thus EOF is indicated by a short read, not by a zero read. The
latter otoh works like a typical file: you read until you get a
zero-sized read back.
We have read_virtual_file() to read the "raw" files, and can use regular
read_full_file() to read the "seq_file" ones.
Apparently all files in the top-level /proc/ directory use 'seq_file',
but we accidentally used read_virtual_file() for them. Fix that.
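
The difference between the two EOF conventions can be sketched in Python. These are simplified illustrations with hypothetical names, not the implementations of read_full_file() or read_virtual_file(); the readers are passed in as callables so the logic is independent of any real /proc file.

```python
def read_until_zero(read, chunk_size=4096):
    """For "seq_file" style files: read until a zero-sized read signals EOF."""
    parts = []
    while True:
        data = read(chunk_size)
        if not data:
            return b"".join(parts)
        parts.append(data)


def read_until_short(read_from_start, size=4096, max_size=4 * 1024 * 1024):
    """For "raw" style files: a read shorter than the buffer means we saw it all.

    `read_from_start(n)` must return up to n bytes starting at offset 0,
    since such files hand out their contents again on every read.
    """
    while True:
        data = read_from_start(size)
        if len(data) < size:
            return data
        if size >= max_size:
            raise OSError("contents do not fit into the maximum buffer size")
        # Buffer was filled completely, so retry with a larger one.
        size *= 2
```

With a real file descriptor, `read_from_start` could for instance be `lambda n: os.pread(fd, n, 0)`. Note that `read_until_zero` keeps going after a short read, which is exactly what the "raw" files cannot cope with.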
bootctl: make sure bootctl --image= works on image with /usr/ but without / (#36727)
Let's make sure we can use the tool on ParticleOS images. They have no
root fs by default (until they are instantiated), but always have /usr/.
Hence add DISSECT_IMAGE_USR_NO_ROOT which has the desired effect.
bootctl: tweak status output when operating on --image= files
Let's not claim the system was not booted with UEFI if we use --image=.
The system wasn't booted at all, after all. Hence suppress the whole
section altogether in this case.
bootctl: make sure bootctl --image= works on image with /usr/ but without /
Let's make sure we can use the tool on ParticleOS images. They have no
root fs by default (until they are instantiated), but always have /usr/.
Hence add DISSECT_IMAGE_USR_NO_ROOT which has the desired effect.
Yu Watanabe [Thu, 13 Mar 2025 03:11:40 +0000 (12:11 +0900)]
TEST-73-LOCALE: do not unnecessarily restart systemd-localed
It is not necessary to clear previous keymap assignment, as
`localectl set-keymap` will anyway overwrite the previous assignment.
This drops the unnecessary restart of systemd-localed in the loop.
The mkosi test image contains about 500~700 keymaps. The test
performance is greatly improved by reducing the number of restarts,
especially when the test is running with sanitizers.
On Fedora 41 with sanitizers,
Before:
1/1 systemd:integration-tests / TEST-73-LOCALE OK 1157.50s
After:
1/1 systemd:integration-tests / TEST-73-LOCALE OK 104.43s