Daan De Meyer [Mon, 17 Mar 2025 10:35:23 +0000 (11:35 +0100)]
core: Make DelegateNamespaces= work for user managers with CAP_SYS_ADMIN
Currently DelegateNamespaces= only works for services spawned by the
system manager. User managers will always unshare the user namespace
first even if they're running with CAP_SYS_ADMIN.
Let's add support for DelegateNamespaces= for user managers if they're
running with CAP_SYS_ADMIN. By default, we'll still delegate all namespaces
for user managers, but this can now be overridden by explicitly passing
DelegateNamespaces=.
If a user manager is running without CAP_SYS_ADMIN, the user manager is
still always unshared first just like before.
Daan De Meyer [Mon, 17 Mar 2025 11:26:46 +0000 (12:26 +0100)]
capability-util: Ignore unknown capabilities instead of aborting
capability_ambient_set_apply() can be called with capability sets
containing unknown capabilities. Let's not crash when this is the
case but instead ignore the unknown capabilities.
This fixes a crash when running the following command:
Make sure the test has its own /proc and skip it in containers as
MountAPIVFS=yes in a container always results in a read-only /proc/sys
which means the test can't write to /proc/sys/kernel/ns_last_pid.
Daan De Meyer [Mon, 17 Mar 2025 15:20:00 +0000 (16:20 +0100)]
TEST-07-PID1.delegate-namespaces: Make sure fully visible procfs is available
To be able to mount /proc inside an unprivileged user namespace, we have
to make sure a fully visible procfs is available on the host, so let's make
sure that's the case.
Daan De Meyer [Mon, 17 Mar 2025 15:17:25 +0000 (16:17 +0100)]
core: Also check if we can mount /proc if pid namespace is delegated
If the pid namespace is delegated, it doesn't matter if we have CAP_SYS_ADMIN,
we'll still fail to mount /proc if part of it is masked on the host so also
check if we can mount /proc if the pid namespace is delegated.
Yu Watanabe [Mon, 17 Mar 2025 01:36:33 +0000 (10:36 +0900)]
getty-generator: unify add_serial_getty() and add_container_getty()
This also makes the generator not trigger an assertion added by 1cd3c49d09bf78a2a2e4cf25cb3d388e1f08a709. If getty.ttys.container
credential contains a line prefixed with '/dev/', then the assertion
assert(!path_startswith(tty, "/dev/"));
was triggered. This drops the offending assertion, and such lines
are handled gracefully now.
Also, an empty string, "/dev/", and "/dev/pts/" (that is, a directory
without tty name) are gracefully skipped now.
Let's return the size in a return parameter instead of the return value.
And if NULL is specified this tells us the caller doesn't care about the
size and expects a NUL terminated string. In that case look for an
embedded NUL byte, and refuse in that case.
This should lock things down a bit, as we'll systematically refuse
embedded NUL strings now when we expect strings.
Sonia Zorba [Tue, 18 Mar 2025 00:25:51 +0000 (02:25 +0200)]
hwdb: fix backspace not working on HP Pavilion laptop (#36777)
PR #34685 moved the handling of keys 66/65 from specific models to
generic HP laptops.
Key 66 has been linked to the `pickup_phone` function; however, this
action key is not available on all HP laptop models, particularly older
versions. On my HP Pavilion laptop, key 66 is mapped to the `backspace`
function, which caused the backspace key to stop working after the
change.
The following PR fixes the issue on my **HP Pavilion Laptop 15-eg0xxx**.
I have placed the modifications under the Pavilion section, but I cannot
guarantee that this solution will apply to all Pavilion models.
Additionally, I have included a line that checks for "HP" instead of
solely searching for "Hewlett-Packard," as my model is simply labeled as
HP.
Currently, the unit is only reffed in transient_unit_set_properties()
via AddRef(), which however would be dropped if a reconnection
is attempted. Make sure to explicitly re-add reference in that case.
Yu Watanabe [Mon, 17 Mar 2025 22:34:03 +0000 (07:34 +0900)]
nsresourced,vmspawn: allow unpriv "tap" based networking in vmspawn (#36688)
This extends nsresourced to also allow delegation of a network tap
device (in addition to veth) to unpriv clients, with a strictly enforced
naming scheme.
also tightens security on a couple of things:
* enforces polkit on all nsresourced ops too (though by default still
everything is allowed)
* put a limit on delegated network devices
* forcibly clean up delegated network devices when the userns goes away
tree-wide: refuse user/group records lacking UID or GID
userdb allows user/group records without UID/GID (it only really
requires a name), in order to permit "unfixated" records. But that means
we cannot just rely on the field to be valid. And we mostly got that
right, but not everywhere. Fix that.
nsresourced: check polkit before executing our operations
Let's tighten rules on namespace operations: let's always ask PK for
permission before doing anything.
Note that if polkit is absent we'll still allow things, and the default
PK policy will also still allow things, but there's now a clear way how
people can not allow things if they want, by modifying the PK policy.
nsresourced: explicitly remove network interfaces when their userns goes away
Let's tighten the screws a bit on the network interfaces we delegate,
and explicitly destroy them, just like we destroy delegated cgroups.
Ideally, this should happen automatically because the userns goes away
that pins the veth, or because the client holding an fd for a tap device
goes away as the userns goes away. But you never know who keeps a
reference, hence let's explicitly destroy this too.
This is a simple helper for creating a userns that just maps the
callers user to UID 0 in the namespace. This can be acquired unpriv,
which makes it useful for various purposes, for example for the logic in
is_idmapping_supported(), hence port it over.
(is_idmapping_supported() used a different mapping before, with the
nobody users, but there's no real reason for that, and we'll use
userns_acquire_self_root() elsewhere soon, where the root mapping is
important).
namespace-util: make "setgroups" users property writable via userns_acquire()
Unprivileged namespaces are only allowed if the "setgroups" file is set
to "deny" for processes. And we need to write it before writing the
gidmap. Hence add a parameter for that.
Then, also patch all current users to actually enable this. The usecase
generally don't need it (because they don't care about unprivileged
userns), but it doesn't hurt to enable the concept anyway in all current
users (none of them actually runs complex userspace in them, but they
mostly use userns_acquire() for idmapped mounts and similar).
Let's anyway make this option explicit in the function call, to indicate
that the concept exists and is applied.
I recently noticed that our serial/VM terminals did not get fedora's
color shell prompt, nor got color support in "ls".
I spend a bit of time investigating and it's all a bit of a mess. If we
don't have any idea what kind of terminal we are talking to via serial
or hypervisor console then we so far just set TERM=vt220 as a reasonable
fallback: vt220 is quite universally defined in terminfo/termcap, and it
supports pageup/pagedown (unlike vt100).
However, real vt220 DEC terminals did not support color, and hence
termcap/terminfo says "no color, sorry". Which sucks, but actually
neither coreutils' "ls" (via `dircolors`) nor fedora's color shell
prompt actually care for termcap/terminfo. So why don't we get color?
In the coreutils case: it has it's own mini-database of terminals. A
very skewed one, where TERM=vt100 enables colors (and DEC vt100
definitely never ever had color support!), but vt220 does not. However,
what it actually does is check $COLORTERM. If that's set then it would
enable color.
In the fedora color prmpt case: it tries to derive color support by
looking for the word "color" in $TERM. Horrible hack if you ask me...
In order to make things better I did a bunch of things:
1. I think the idea of actually having a fully correct and up-to-date
termcap/terminfo database is kinda illusionary these days. But
apparently regarding color support $COLORTERM kinda took it place.
coreutils cares, and systemd itself cares too. To some point at least:
we consume it to determine color support, but we never propagate it in
nspawn, run0 and so on. So this PR fixes that.
2. Also, we are kinda stuck with vt220 I guess as default fallback for
serial terminals. But let's tweak it, and set $COLORTERM=truecolor as
default too. this means we default to a vt220 terminal, but with color.
Which is an ahistorical thing to do, but I think it's the best way out.
3. I also filed a bug against util-linux asking them to treat $COLORTERM
like $TERM, and let it propagate from getty into login shell:
https://github.com/util-linux/util-linux/issues/3463 – With that we
should get color support in ls by default now.
4. I also asked coreutils to treat vt220 the same as they already treat
vt100 and simply do color, even if though that's ahistorical:
https://github.com/coreutils/coreutils/issues/96
5. I then asked the fedora color prompt package to check $COLORTERM:
https://bugzilla.redhat.com/show_bug.cgi?id=2352650
6. I also asked the fedora ssh package to propagate $COLORTERM to remote
hosts by default, like they already cover $TERM. terminal emulators set
both these days generally, hence this would make sense.
https://bugzilla.redhat.com/show_bug.cgi?id=2352653
7. while at it, I figured it makes sense to not only propagate/consume
$COLORTERM at the same time as $TERM, but also consider $NO_COLOR. In
contrast to $COLORTERM for which no spec seems to exist, that one
actually does have a spec: https://no-color.org/
It might make sense for those interested in other distros than Fedora to
maybe ask for similar changes for their ssh and color shell prompt
packages (if they have something coresponding).
Luca Boccassi [Mon, 17 Mar 2025 11:29:33 +0000 (11:29 +0000)]
build: add C23 support (#35085)
To support C23, this introduces UTF8() macro to define UTF-8 literals,
as C23 changed char8_t from char to unsigned char.
This also makes pointer signedness warning critical, and updates C
standards table for tests.
When we pass information about our calling terminal on to some service
or command we invoke, propagate $COLORTERM + $NO_COLOR in addition to
$TERM, in order to always consider the triplet of the three env vars the
real deal.
main: explicitly pick up $COLORTERM + $NO_COLOR from kernel cmdline where we pick up $TERM
I think we should work towards always picking up the triplet of $TERM +
$COLORTERM + $NO_COLOR where we so far picked up $TERM only. I think
it's safe to say that at this time, $TERM is not enough anymore to
clearly communicate terminal feature support. Hence, teach PID 1 to pick
$COLORTERM + $NO_COLOR wherever we pick up $TERM.
exec: when we have no $TERM configuration, and we default to vt220, also set $COLORTERM
When we configure a serial or VM terminal and have no $TERM
configuration, then we default to vt220 as a fallback. This is a pretty
safe bet, since the termcap/terminfo definitions for vt220 are
relatively widely available (much like vt100), and (in contrast to
vt100) it supports pageup/pagedown keys. vt220 is a terminal without
color support however, but we do want color support, because in 2025
there's really no terminal emulator without color in this world.
The $COLORTERM env var is used my many emulators and tools to
communicate that ANSI color support is available, despite what $TERM
says. Hence, let's tweak systemd's logic to also set $COLORTERM in case
we set the vt220 $TERM fallback.
This means we define an ahistoric frankenterminal: a vt220 (that
historically definitely didn't have color) that is explicitly configured
to have color.
One effect of this is that coreutils' dircolors command will start to
output color sequences in systemd's serial or VM terminals. (Since it
actually honours $COLORTERM).
Also note that systemd itself checks $COLORTERM since a long time, hence
it makes sense for us to also set it.
Note that this unfortunately doesn't have the desired effect of
propagating $COLORTERM into any getty shell sessions yet. That's because
util-linux' login package currently fiters $COLORTERM (but lets $TERM
though). I filed a bug about that here:
Yu Watanabe [Mon, 17 Mar 2025 03:18:41 +0000 (12:18 +0900)]
udev-builtin-btrfs: refuse to call for irrelevant device node
If btrfs builtin command is called, then check if the specified device
node is owned by the device.
This also allows the command is called specifying any device node.
Yu Watanabe [Sun, 16 Mar 2025 21:53:46 +0000 (06:53 +0900)]
nspawn: introduce --cleanup option (#34776)
This is useful when the previous invocation is unexpectedly killed.
Otherwise, if systemd-nspawn is killed forcibly, then unix-export
directory is not cleared and unmounted, and the subsequent invocation
will fail. E.g.
```
[ 18.895515] TEST-13-NSPAWN.sh[645]: + machinectl start long-running
[ 18.945703] systemd-nspawn[1387]: Mount point '/run/systemd/nspawn/unix-export/long-running' exists already, refusing.
[ 18.949236] systemd[1]: systemd-nspawn@long-running.service: Failed with result 'exit-code'.
[ 18.949743] systemd[1]: Failed to start systemd-nspawn@long-running.service.
```
Mike Yuan [Mon, 10 Mar 2025 18:42:05 +0000 (19:42 +0100)]
semaphore-runner: disable cgroup setup in lxc
lxc tries to mount /sys/fs/cgroup/ following host hierarchy by default,
which is problematic for us since we want to unconditionally use
cgroup v2 in cgns. Disable it hence and let pid1 figure it out.
Yu Watanabe [Tue, 15 Oct 2024 08:25:09 +0000 (17:25 +0900)]
nspawn: introduce --cleanup option to clear propagation and unix-export directories
This is useful when the previous invocation is unexpectedly killed.
Otherwise, if systemd-nspawn is killed forcibly, then unix-export
directory is not cleared and unmounted, and the subsequent invocation
will fail. E.g.
===
[ 18.895515] TEST-13-NSPAWN.sh[645]: + machinectl start long-running
[ 18.945703] systemd-nspawn[1387]: Mount point '/run/systemd/nspawn/unix-export/long-running' exists already, refusing.
[ 18.949236] systemd[1]: systemd-nspawn@long-running.service: Failed with result 'exit-code'.
[ 18.949743] systemd[1]: Failed to start systemd-nspawn@long-running.service.
===
Yu Watanabe [Sun, 16 Mar 2025 00:31:43 +0000 (09:31 +0900)]
macro: Introduce UTF8() macro to define UTF-8 string literal
C23 changed char8_t from char to unsigned char, hence assigning a u8 literal
to const char* emits pointer sign warning, e.g.
========
../src/shared/qrcode-util.c: In function ‘print_border’:
../src/shared/qrcode-util.c:16:34: warning: pointer targets in passing argument 1 of ‘fputs’ differ in signedness [-Wpointer-sign]
16 | #define UNICODE_FULL_BLOCK u8"█"
| ^~~~~
| |
| const unsigned char *
../src/shared/qrcode-util.c:65:39: note: in expansion of macro ‘UNICODE_FULL_BLOCK’
65 | fputs(UNICODE_FULL_BLOCK, output);
| ^~~~~~~~~~~~~~~~~~
========
This introduces UTF8() macro, which define u8 literal and casts to consth char*,
then rewrites all u8 literal definitions with the macro.
With this change, we can build systemd with C23.
The log line looked like this:
bootctl[1457]: ! Mount point '/efi' which backs the random seed file is world accessible, which is a security hole! !
which doesn't look nice.
Also upgrade the message to error. This is something to fix.
basic/glyph-util: rename "special glyph" to just "glyph"
Admittedly, some of our glyphs _are_ special, e.g. "O=" for SPECIAL_GLYPH_TOUCH ;)
But we don't need this in the name. The very long names make some invocations
very wordy, e.g. special_glyph(SPECIAL_GLYPH_SLIGHTLY_UNHAPPY_SMILEY).
Also, I want to add GLYPH_SPACE, which is not special at all.
Yu Watanabe [Sat, 15 Mar 2025 00:04:25 +0000 (09:04 +0900)]
test: drop redundant parentheses in ASSERT_OK() and friends
This reverts 278e3adf50e36518c5a5dd798ca998e7eac5436e, and drop more
redundant parentheses, as they unfortunately does not suppress the
false-positive warnings by coverity.