random-seed: refresh EFI boot seed when writing a new seed
Since this runs at shutdown to write a new seed, we should also keep the
bootloader's seed maximally fresh by doing the same. So we follow the
same pattern - hash some new random bytes with the old seed to make a
new seed. We let this fail without warning, because it's just an
opportunistic thing. If the user happens to have set up the random seed
with bootctl, and the RNG is initialized, then things should be fine. If
not, we create a new seed if systemd-boot is in use. And if not, then we
just don't do anything.
boot: implement kernel EFI RNG seed protocol with proper hashing
Rather than passing seeds up to userspace via EFI variables, pass seeds
directly to the kernel's EFI stub loader, via LINUX_EFI_RANDOM_SEED_TABLE_GUID.
EFI variables can potentially leak and suffer from forward secrecy
issues, and processing these with userspace means that they are
initialized much too late in boot to be useful. In contrast,
LINUX_EFI_RANDOM_SEED_TABLE_GUID uses EFI configuration tables, and so
is hidden from userspace entirely, and is parsed extremely early on by
the kernel, so that every single call to get_random_bytes() by the
kernel is seeded.
In order to do this properly, we use a bit more robust hashing scheme,
and make sure that each input is properly memzeroed out after use. The
scheme is:
The various inputs are:
- LINUX_EFI_RANDOM_SEED_TABLE_GUID from prior bootloaders
- 256 bits of seed from EFI's RNG
- The (immutable) system token, from its EFI variable
- The prior on-disk seed
- The UEFI monotonic counter
- A timestamp
This also adjusts the secure boot semantics, so that the operation is
only aborted if it's not possible to get random bytes from EFI's RNG or
a prior boot stage. With the proper hashing scheme, this should make
boot seeds safe even on secure boot.
There is currently a bug in Linux's EFI stub in which if the EFI stub
manages to generate random bytes on its own using EFI's RNG, it will
ignore what the bootloader passes. That's annoying, but it means that
either way, via systemd-boot or via EFI stub's mechanism, the RNG *does*
get initialized in a good safe way. And this bug is now fixed in the
efi.git tree, and will hopefully be backported to older kernels.
As the kernel recommends, the resultant seeds are 256 bits and are
allocated using pool memory of type EfiACPIReclaimMemory, so that it
gets freed at the right moment in boot.
bootctl,bootspec: make use of CHASE_PROHIBIT_SYMLINKS whenever we access the ESP/XBOOTLDR
Let's make use of the new flag whenever we access the ESP or XBOOTLDR.
The resources we make use of in these partitions can't possibly use
symlinks (because UEFI knows no symlink concept), and they are untrusted
territory, hence under no circumstances we should be tricked into
following symlinks that shouldn't be there in the first place.
Of course, you might argue thta ESP/XBOOTLDR are VFAT and thus don#t
know symlinks. But the thing is, they don#t have to be. Firmware can
support other file systems too, and people can use efifs to gain access
to arbitrary Linux file systems from EFI. Hence, let's better be safe
than sorry.
chase-symlinks: add new flag for prohibiting any following of symlinks
This is useful when operating in the ESP, which is untrusted territory,
and where under no circumstances we should be tricked by symlinks into
doing anything we don't want to.
nulstr-util: fix corner cases of strv_make_nulstr()
Let's change the return semantics of strv_make_nulstr() so that we can
properly distuingish the case where we have a no entries in the nulstr
from the case where we have a single empty string in a nulstr.
Previously we couldn't distuingish those, we'd in both cases return a
size of zero, and a buffer with two NUL bytes.
With this change, we'll still return a buffer with two NULL bytes, but
for the case where no entries are defined we'll return a size of zero,
and where we have two a size of one.
This is a good idea, as it makes sure we can properly handle all corner
cases.
Nowadays the function is used by one place only: ask-password-api.c. The
corner case never mattered there, since it was used to serialize
passwords, and it was known that there was exactly one password, not
less. But let's clean this up. This means the subtraction of the final
NUL byte now happens in ask-password-api.c instead.
nulstr-util: use memdup_suffix0() where appropriate
if the nulstr is not nul-terminated, we shouldn't use strndup() but
memdup_suffix0(), to not trip up static analyzers which imply we are
duping a string here.
This rework the logic for handling the "header" cells a bit. Instead of
special casing the first row in regards to uppercasing/coloring let's
just intrduce a proper cell type TABLE_HEADER which is in most ways
identical to TABLE_STRING except that it defaults to uppercase output
and underlined coloring.
This is mostly refactoring, but I think it makes a ton of sense as it
makes the first row less special and you could in fact insert
TABLE_HEADER (and in fact TABLE_FIELD) cells wherever you like and
something sensible would happen (i.e. a string cell is displayed with
a specific formatting).
Yu Watanabe [Fri, 11 Nov 2022 04:54:03 +0000 (13:54 +0900)]
ac-power: check battery existence and status
If a battery is not present or its status is not discharging, then
the battery should not be used as a power source.
Let's count batteries currently discharging.
MkfsSion [Sat, 29 Oct 2022 18:29:02 +0000 (14:29 -0400)]
libfido2-util: Perform pre-flight check for credentials in token
Do not attempt to decrypt using a key slot unless its corresponding
credential is found on an available FIDO2 token. Avoids multiple touches
/ confirmations when unlocking a LUKS2 device with multiple FIDO2 tokens
enrolled.
Partially fixes #19208 (when the libcryptsetup plugin is in use).
Daan De Meyer [Fri, 11 Nov 2022 10:26:54 +0000 (11:26 +0100)]
strv: Make sure strv_make_nulstr() always returns a valid nulstr
strv_make_nulstr() is documented to always return a valid nulstr,
but if the input is `NULL` we return a string terminated with only
a single NUL terminator, so let's fix that and always terminate the
resulting string with two NUL bytes.
Originally, the table formatting code was written to display a number of
records, one per line, and within each line multiple fields of the same
record. The first line contains the column names.
It was then started to be used in a "vertical" mode however,
i.e. with field names on the left instead of the top. Let's support such
a mode explicitly, so that we can provide systematic styling, and can
properly convert this mode to JSON.
A new constructor "table_new_vertical()" is added creating such
"vertical" tables. Internally, this is a table with two columns: "key"
and "value". When outputting this as JSON we'll output a single JSON
object, with key/value as fields. (Which is different from the
traditional output where we'd use the first line as JSON field names,
and output an array of objects).
A new cell type TABLE_FIELD is added for specifically marking the
"field" cells, i.e. the cells in the first column. We'll automatically
suffic ":" to these fields on output.
This is useful to force off fancy unicode glyph use (i.e. use "->"
instead of "→"), which is useful in tests where locales might be
missing, and thus control via $LC_CTYPE is not reliable.
Use this in TEST-58, to ensure the output checks we do aren't confused
by missing these glyphs being unicode or not.
repart: Don't descend into directories assigned to other partitions
Let's say we have the following repart definitions files root.conf
and home.conf:
```
[Partition]
Type=root
CopyFiles=/
```
```
[Partition]
Type=home
CopyFiles=/home
```
Currently, we'd end up copying /home to both the root partition and
the home partition. To prevent this from happening, let's adopt a
new policy when copying files for a partition: We won't copy any
files/directories that appear in the CopyFiles= list of another
partition, unless that directory explicitly appears in our own
CopyFiles= list.
This way, we prevent copying /home twice into the root and home
partition, but should a user really want that behavior, they can
have it by adding /home to the CopyFIles= list of the root partition
as well.
dissect: also parse out the top-level GPT table uuid and expose this as image UUID
systemd-repart generates this in a suitably stable fashion, hence let's
actually use it as an identifier for the image. As a first step parse
it, and show it.
Due to "historical reasons" both gcc and clang treat *all* trailing
arrays members as flexible arrays, this has an evil side effect
of inhibiting bounds checks on such members as __builtin_object_size
cannot say for sure that:
struct {
...
type foo[3];
}
has a trailing foo member of fixed size rather than unspecified.
Ideally we should use -fstrict-flex-arrays as is, but we have to
tolerate kernel uapi headers that use [0] and third party libraries
written in c89 that may use [1] like curl.
in_initrd() was really doing two things: checking if we're in the initrd, and
also verifying that the initrd is set up correctly. But this second check is
complicated, in particular it would return false for overlayfs, even with an
upper tmpfs layer. It also doesn't support the use case of having an initial
initrd with tmpfs, and then transitioning into an intermediate initrd that is
e.g. a DDI, i.e. a filesystem possibly with verity arranged as a disk image.
We don't need to check if we're in initrd in every program. Instead, concerns
are separated:
- in_initrd() just does a simple check for /etc/initrd-release.
- When doing cleanup, pid1 checks if it's on a tmpfs before starting to wipe
the old root. The only case where we want to remove the old root is when
we're on a plain tempory filesystem. With an overlay, we'd be creating
whiteout files, which is not very useful. (*)
This should resolve https://bugzilla.redhat.com/show_bug.cgi?id=2137631
which is caused by systemd refusing to treat the system as an initrd because
overlayfs is used.
(*) I think the idea of keeping the initrd fs around for shutdown is outdated.
We should just have a completely separate exitrd that is unpacked when we want
to shut down. This way, we don't waste memory at runtime, and we also don't
transition to a potentially older version of systemd. But we don't have support
for this yet.
```
archlinux_systemd_ci: In file included from ../build/src/dissect/dissect.c:15:
archlinux_systemd_ci: ../build/src/basic/build.h:4:10: fatal error: version.h: No such file or directory
archlinux_systemd_ci: 4 | #include "version.h"
archlinux_systemd_ci: | ^~~~~~~~~~~
archlinux_systemd_ci: compilation terminated.
```
```
archlinux_systemd_ci: In file included from ../build/src/journal/cat.c:13:
archlinux_systemd_ci: ../build/src/basic/build.h:4:10: fatal error: 'version.h' file not found
archlinux_systemd_ci: #include "version.h"
archlinux_systemd_ci: ^~~~~~~~~~~
archlinux_systemd_ci: 1 error generated.
```
```
archlinux_systemd_ci: In file included from ../build/src/sysext/sysext.c:10:
archlinux_systemd_ci: ../build/src/basic/build.h:4:10: fatal error: version.h: No such file or directory
archlinux_systemd_ci: 4 | #include "version.h"
archlinux_systemd_ci: | ^~~~~~~~~~~
archlinux_systemd_ci: compilation terminated.
archlinux_systemd_ci: FAILED: systemd-inhibit.p/src_login_inhibit.c.o
```
```
archlinux_systemd_ci: In file included from ../build/src/login/inhibit.c:12:
archlinux_systemd_ci: ../build/src/basic/build.h:4:10: fatal error: version.h: No such file or directory
archlinux_systemd_ci: 4 | #include "version.h"
archlinux_systemd_ci: | ^~~~~~~~~~~
archlinux_systemd_ci: compilation terminated.
```
recurse-dir: optionally, call callback when entering/leaving toplevel dir, too
So far recurse_dir() will call the callback whenever we enter a
directory, and then pass the struct dirent for that directory, and an fd
for the directory the dirent is part of (i.e. the parent of the
directory we call things for). For the top-level dir the function is
invoked for we will not call the callback however, because we have no
dirent for that, and not fd for the directory the top-level dir is part
of. Let's add a flag to call it anyway, and in that case pass a NULL
dirent and -1 as directory fd.
This is useful when we want to treat the top-level dir the same as any
dir further down.
This is done opt-in since the callback must be ablet to handle a NULL
dirent and a -1 directory fd.
We are not interested in the struct dirent data, hence there's no point
in passing RECURSE_DIR_ENSURE_TYPE. Let's drop it, and thus avoid some
extrac work on file systems where readdir() does not report .d_type.
Also drop RECURSE_DIR_SAME_MOUNT, because DDIs after all may contain
multiple partitions, and we mount them all together. The --list command
really should report the full set of files in an image.
In tests it's useful to be able to delete temporary directories
via a file descriptor to them, so let's add rm_rf_physical_and_close()
which gets the file descriptor path via /proc and tries to remove it
that way.
basic/virt: treat missing /proc as sign of being in a chroot
The logic of running_in_chroot() has been the same since the introduction of
this function in b4f10a5e8956d26f0bc6b9aef12846b57caee08b: if /proc is not
mounted, the function returns -ENOENT and all callers treat this as false. But
that might be the most common case of chrooted calls, esp. in all the naïve
chroots that were done with the chroot binary without additional setup.
(In particular rpm executes all scriptlets in a chroot without bothering to set
up /proc or /sys, and we have codepaths in sysusers and tmpfiles to support
running in such an environment.)
This change effectively shortcircuits various calls to udevadm, downgrades
logging in tmpfiles, and disables all verbs marked with VERB_ONLINE_ONLY in
systemctl. detect-virt -r is also affected:
$ sudo chroot /var/lib/machines/rawhide
before> systemd-detect-virt -r && echo OK
Failed to check for chroot() environment: No such file or directory
after> systemd-detect-virt -r && echo OK
OK