secure-boot: tighten enrollment logic a bit regarding file sizes
It's OK the dbx file is not loaded, but let's explicitly check for that
(i.e. if the buffer is actually non-NULL), rather than the size of the
bufer, since empty files actually do exist.
Or in other words, let's not magically suppress enrollment of empty
files, but let uefi firmware handle these on their own.
sd-journal: make sure sd_journal_add_match() also accepts SIZE_MAX as size
In many of our internal functions that take a pointer + a size we have
introduced the rule that SIZE_MAX as size means: take strlen().
sd_journal_add_match() has something similar, but the special value is
0, not SIZE_MAX. This is a bit ugly, since a zero size data block is
theoretically fine. The only reason sd_journal_add_match() gets away
with using this special value is because valid matches must consist of
at least 2 chars, hence cannot be zero.
But let's make this more robust and less surprising when compared to the
rest of our code, and *also* accept SIZE_MAX to mean strlen().
If we try to deserialize only a pidfd that points to a process that
has been reaped, creating the pidref object will fail, which means that
we'll try to create a pidref object from the serialized pid that comes
next. If the pid has already been reused, this will succeed and we'll
now have a pidref that points to a different process.
Let's avoid this issue by serializing both the pidfd and the pid and
creating the pidref object directly from both. This means we'll reuse
the deserialized pidfd instead of opening a new one. We'll then immediately
notice the pidfd is dead and do the appropriate follow up depending on
the unit type.
The timeout on sd-resolved's side is 5-10s (UDP or TCP), but dig's
default timeout is 5s. Let's give sd-resolved enough time to timeout
before either giving up or checking if it served stale data on dig's
side.
test: let curl show a potential error in silent mode
I collected a couple of fails in this particular test, but without any
output they're impossible to debug. Let's make this slightly less
annoying and let curl show an error (if any) even in silent mode.
This patch uncovers that curl has been (silently) complaining about not
being able to write to the output destination, because `grep -q`
short-circuits on the first match and doesn't bother reading the rest,
so replace `grep -q` with `grep ... >/dev/null` to force grep to always
read the whole thing from curl.
test: forward journal to console in TEST-24-CRYPTSETUP
If we fail to mount the encrypted /var during boot we're left with
nothing to debug, so let's do the same thing we do for TEST-08-INITRD
and forward journal to the console.
tree-wide: make sure net/if.h is included before any linux/ header
The linux/ headers include linux/libc-compat.h that makes sure the
linux/ headers won't redeclare symbols already declared by net/if.h, but
glibc's net/if.h doesn't do that, so if the include order is reversed
we'll end up with a bunch of errors about redeclared stuff:
This also drops remaining workarounds from the last time this issue was
brought up (6f270e6bd8) since they shouldn't be needed anymore if the
order of the includes is the "correct" one. I also added a comment to
each affected include when this is inevitably encountered again in the
future.
Just like we already have $SYSTEMD_PACKAGES for systemd packages to
re-install in the main image, let's add $INITRD_PACKAGES for all
systemd packages to re-install in the initrd.
mkosi: Install openSUSE-release instead of distribution-release
distribution-release is a virtual package that is by default satisfied
by the openSUSE MicroOS-release package. Let's make sure we pull in the
generic openSUSE-release package instead by installing
patterns-base-minimal_base which has a Suggests dependency on
openSUSE-release which makes sure it takes priority over the MicroOS one.
We might want to run the build scripts outside of mkosi as well at
some point, e.g. to build an rpm after booting the image, so let's
make them more generic by using /usr/lib/os-release to figure out
which pkg specs we should use instead of $PKG_SUBDIR. To make ubuntu
use the debian pkg spec, we add a symlink pkg/ubuntu which points to
debian/ in the same directory.
dns_transaction_request_dnssec_rr was already adjusted in 400171036592,
to allow for the return parameter to be passed uninitialized. However
this codepath was missed, meaning this function could sometimes return
success without having actually set the parameter.
Fixes: 400171036592 ("resolved: minor dnssec fixups") Fixes: 47690634f157 ("resolved: don't request the SOA for every dns label")
network/dhcp6: return earlier if no lease acquired
Previously, even If an interface has not acquired a DHCPv6 lease,
networkd logs a misleading message:
===
Apr 09 10:44:57 systemd-networkd[3970750]: veth99: DHCPv6 lease lost
===
The function should do nothing when no lease acquired. Let's return
earlier and suppress the log message.
This allows us to build and install after booting without having to
build a new image. Together with
https://github.com/systemd/mkosi/pull/2601 and after enabling
RuntimeBuildSources=yes, after booting, "meson install -C /work/build"
can be used to do an incremental build and install. This won't build
proper packages, but will be invaluable for having a quick compile,
edit, test cycle without having to rebuild the image all the time.
networkd: report error if lease file cannot be loaded and ignore
On my system, networkd would report that interface ve-rawhide is "Failed"
without anything in the logs:
systemd-networkd[651095]: ve-rawhide: Trying to reconfigure the interface.
systemd-networkd[651095]: ve-rawhide: Gained IPv6LL
systemd-networkd[651095]: ve-rawhide: Link DOWN
systemd-networkd[651095]: ve-rawhide: Lost carrier
systemd-networkd[651095]: ve-rawhide: Configuring with /usr/lib/systemd/network/80-container-ve.network.
systemd-networkd[651095]: ve-rawhide: Link UP
systemd-networkd[651095]: ve-rawhide: Gained carrier
systemd-networkd[651095]: ve-rawhide: Failed
At debug level:
systemd-networkd[799993]: dhcp-server-lease/ve-rawhide:1:1: Missing object field 'Address'.
I'm not sure why "Address" is missing, but anyway, in this case, we should ignore the
lease file rather than refusing to configure the interface. Also, warn at the point
where we know what the filename is.
test-execute: check for s390x first and duplicate test
s390x will define both s390x and s390, so exec-personality-s390.service is ran
in both cases but fails on s390x, as the personality returned is s390x.
Split the test and check specifically for s390x.
Mike Yuan [Sun, 7 Apr 2024 11:33:37 +0000 (19:33 +0800)]
systemctl-logind: auto soft-reboot only if /run/nextroot/ is mountpoint
Consider the following case: a user sets up a minimum rootfs for
file system maintenance work in /run/nextroot/ dir directly. When
they're done, they expect 'systemctl reboot' to perform a full reboot.
But they keep soft-rebooting back to the tmpfs root, until they
find out about $SYSTEMCTL_SKIP_AUTO_SOFT_REBOOT.
So currently, when /run/nextroot/ is a normal dir, pid1 automatically
turns it into a bind mount to soft-reboot into. This is good, but when
combined with automatic soft-reboot it has an arguably unexpected
behavior, since /run/nextroot/ can never go away in such a case.
OTOH, if /run/nextroot/ is a mountpoint in the first place, the mount
is *moved* so a second reboot would not trigger auto soft-reboot.
Let's just make things more friendly to users, and do auto soft-reboot
only if /run/nextroot/ is also a mountpoint.
With gcc-14.0.1-0.13.fc40, when compiling with -O2, the compiler doesn't understand
that sd_bus_error_setf() always returns negative on error when <name> is provided:
[28/576] Compiling C object systemd-resolved.p/src_resolve_resolved-bus.c.o
../src/resolve/resolved-bus.c: In function ‘call_link_method’:
../src/resolve/resolved-bus.c:1763:16: warning: ‘l’ may be used uninitialized [-Wmaybe-uninitialized]
1763 | return handler(message, l, error);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
../src/resolve/resolved-bus.c:1749:15: note: ‘l’ was declared here
1749 | Link *l;
| ^
../src/resolve/resolved-bus.c: In function ‘bus_method_get_link’:
../src/resolve/resolved-bus.c:1822:13: warning: ‘l’ may be used uninitialized [-Wmaybe-uninitialized]
1822 | p = link_bus_path(l);
| ^~~~~~~~~~~~~~~~
../src/resolve/resolved-bus.c:1810:15: note: ‘l’ was declared here
1810 | Link *l;
| ^
...
Let's make the assertion a bit more explicit. With this, the warning goes away,
but I think it's more obvious to a human reader too.
core: silence gcc warning about unitialized variable
When compiled with -O2, the compiler is not happy about dynamic_user_pop() and
would warn about the output variables not being set. It does have a point:
we were doing a cast from ssize_t to int, and theoretically there could be
wraparound. So let's add an explicit check that the cast to int is fine.
[540/2509] Compiling C object src/core/libsystemd-core-256.so.p/dynamic-user.c.o
../src/core/dynamic-user.c: In function ‘dynamic_user_close.isra’:
../src/core/dynamic-user.c:580:9: warning: ‘uid’ may be used uninitialized [-Wmaybe-uninitialized]
580 | unlink_uid_lock(lock_fd, uid, d->name);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../src/core/dynamic-user.c:560:15: note: ‘uid’ was declared here
560 | uid_t uid;
| ^~~
../src/core/dynamic-user.c: In function ‘dynamic_user_realize’:
../src/core/dynamic-user.c:476:29: warning: ‘new_uid’ may be used uninitialized [-Wmaybe-uninitialized]
476 | num = new_uid;
| ~~~~^~~~~~~~~
../src/core/dynamic-user.c:398:23: note: ‘new_uid’ was declared here
398 | uid_t new_uid;
| ^~~~~~~
nsresourced: add new daemon for granting clients user namespaces and assigning resources to them
This adds a small, socket-activated Varlink daemon that can delegate UID
ranges for user namespaces to clients asking for it.
The primary call is AllocateUserRange() where the user passes in an
uninitialized userns fd, which is then set up.
There are other calls that allow assigning a mount fd to a userns
allocated that way, to set up permissions for a cgroup subtree, and to
allocate a veth for such a user namespace.
Since the UID assignments are supposed to be transitive, i.e. not
permanent, care is taken to ensure that users cannot create inodes owned
by these UIDs, so that persistancy cannot be acquired. This is
implemented via a BPF-LSM module that ensures that any member of a
userns allocated that way cannot create files unless the mount it
operates on is owned by the userns itself, or is explicitly
allowelisted.
BPF LSM program with contributions from Alexei Starovoitov.
dissect-image: make dissected_image_acquire_metadata() operate within a userns if possible
This opens the door for making the call work without privileges: if we
pass in a userns fd and DissectedImage that has mount fds then we can
acquire all information without privs.
lock-util: make global lock return parameter to image_path_lock() optional
When adding unprivileged nspawn support we don't really want a global
lock file, since we cannot even access the dir they are stored in, hence
make the concept optional.