Emil Velikov [Wed, 4 Oct 2023 10:51:47 +0000 (11:51 +0100)]
sd-boot: introduce and use efivar_unset()
Currently some of the code base check for the variable presence before
removing it, and some do not.
More so, in all cases (being updated) we're dealing with non-volatile
variables where changing those attribute to NVRAM wear out.
From what information I could find, there is no definitive answer if the
UEFI implementation will write to the NVRAM even when the variable is
missing.
So add a simple helper that checks for the variable presence before
removing it. While also having a bit cleaner API than the current
efivar_set(..., NULL, ...);
efivar_unset() follows the design from efivar_set*() where it returns an
EFI_STATUS even though its (presently) unused.
v2:
- add inline comment, use early return
v3:
- typos? typos!
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Jade Lovelace [Sun, 1 Oct 2023 05:21:33 +0000 (22:21 -0700)]
analyze: add tooltips with dependency information to "plot"
This helps a lot with figuring out why units were started when they
were, rather than guessing there is a dependency relation. We could
perhaps also do fun JavaScript things in the future to highlight
dependencies on mouse-over.
NRK [Mon, 2 Oct 2023 13:25:00 +0000 (19:25 +0600)]
macro: use __builtin_unreachable on NDEBUG
note that this slightly changes the semantic of assert when NDEBUG is
defined. if there's an extern function call (without attribute pure or
similar) then the compiler has to assume it has side effects and still
emit the function call.
whereas the old assert guaranteed that nothing will be evaluated on
NDEBUG.
Dan Streetman [Thu, 31 Aug 2023 13:10:40 +0000 (09:10 -0400)]
tpm2: change tpm2_unseal() to accept Tpm2Context instead of device string
This matches the change to tpm2_seal(), which now accepts a Tpm2Context instead
of a device string.
This also allows using the same TPM context for sealing and unsealing, which
will be required by (future) test code when sealing/unsealing using a transient
key.
Dan Streetman [Fri, 8 Sep 2023 18:22:11 +0000 (14:22 -0400)]
tpm2: use GREEDY_REALLOC_APPEND() in tpm2_get_capability_handles(), cap max value
Simplify the function with GREEDY_REALLOC_APPEND(). Also limit the size_t-sized
max value to UINT32_MAX since that's the maximum of the range this searches,
and the max parameter for tpm2_get_capability() is uint32_t.
Dan Streetman [Wed, 2 Aug 2023 17:35:46 +0000 (13:35 -0400)]
tpm2: update tpm2 test for supported commands
The test expects TPM2_CC_FIRST - 1 and TPM2_CC_LAST + 1 to be unsupported, but
those are not necessarily invalid commands. Instead test known-invalid
commands. Also add some more valid commands.
Dan Streetman [Fri, 8 Sep 2023 16:39:49 +0000 (12:39 -0400)]
tpm2: downgrade most log functions from error to debug
Because most TPM2 functions here are 'library-like' functions, they should be
at debug level, not error level.
The only functions not reduced to logging at debug are tpm2_list_devices(),
since it is expected to print output, and the tpm2_parse_pcr_argument_*()
functions, since the system-wide parse_*_argument() functions generally log at
error level.
test: spawn the to-be-killed-on-soft-reboot units with --collect
Otherwise they might leave stuff behind if they don't respond fast
enough to the first SIGTERM and get SIGKILLEd, which then breaks reusing
the unit name further in the test:
[ 2993.620849] H testsuite-82.sh[43]: + systemd-run -p Type=exec -p DefaultDependencies=no -p IgnoreOnIsolate=yes --unit=testsuite-82-nosurvive.service sleep infinity
[ 2993.628686] H systemd[1]: testsuite-82-nosurvive.service: About to execute: /usr/bin/sleep infinity
[ 2993.628886] H systemd[1]: testsuite-82-nosurvive.service: Forked /usr/bin/sleep as 65
[ 2993.629328] H systemd[1]: testsuite-82-nosurvive.service: Changed dead -> start
...
[ 2993.699892] H testsuite-82.sh[43]: + systemctl --no-block --check-inhibitors=yes soft-reboot
[ 2993.704326] H systemd-logind[41]: The system will soft-reboot now!
...
[ 3001.249302] H systemd[1]: Sending SIGKILL to PID 65 (sleep).
...
[ 3001.303158] H testsuite-82.sh[136]: + systemd-notify '--status=Second Boot'
...
[ 3001.409504] H testsuite-82.sh[136]: + systemd-run -p Type=exec --unit=testsuite-82-nosurvive.service sleep infinity
[ 3001.414061] H testsuite-82.sh[165]: Failed to start transient service unit: Unit testsuite-82-nosurvive.service was already loaded or has a fragment file.
resolve: tolerate merging a zero-ttl RR and a nonzero-ttl RR if not mDNS
resolved rejected RRsets containing a RR with a zero TTL and a RR with a nonzero TTL. In practice—see the linked issues—, this case triggered when an AF_UNSPEC query to a CNAMEd domain returned a zero TTL for the CNAME on one address family and a nonzero TTL for the CNAME on the other address family.
The zero-nonzero TTL check cites RFC 2181 § 5.2 in a comment. That section says DNS clients should reject any RRset containing differing TTLs, which the check only implements a very special case of. That the old behavior caused real-world false NXDOMAIN results is reason enough to completely ignore the RFC's recommendation. However, mDNS treats zero TTLs specially, so the error case needs to be kept for mDNS.
Luca Boccassi [Sun, 1 Oct 2023 17:55:12 +0000 (18:55 +0100)]
docs: add document about UEFI security posture in src/boot/efi/
This is not intended as a user guide, but to describe the generic security
posture of the UEFI components. Hence we do not publish it on systemd.io
but only in the repository.
dissect-image: optionally allow mounting via new kernel mount API in two steps
This adds support for the new fsmount() logic of the kernel: we'll first
create an unattached fsmount fd, and then in a second step attach this
to some real file system inode – as opposed to attaching file system
directly. The benefit of this is that we can pass the open fsmount fds
over some sockets if need be, to isolate the mounting code from the
attaching code.
core: Use a subdirectory of /run/ for PrivateDevices=
When we're starting early boot services such as systemd-userdbd.service,
/tmp might not yet be mounted, so let's use a directory in /run instead
which is guaranteed to be available.
Daan De Meyer [Sun, 1 Oct 2023 18:40:45 +0000 (20:40 +0200)]
mount: Log when we can't create the mount point
Debugging mount unit failures caused by systemd not being able to
create the mount point is currently rather hard. Let's log about
failures to create mount points to simplify debugging.
journalctl: find boot ID more gracefully in corrupted journal
In discover_next_boot(), first we find a new boot ID based on the value
stored in the entry object. Then, find the tail (or head when we are going
upwards) entry of the boot based on the _BOOT_ID= field data.
If boot IDs of an entry in the entry object and _BOOT_ID field data
are inconsistent, which may happen on corrupted journal, then previously
discover_next_boot() failed with -ENODATA.
This makes the function check if the two boot IDs in each entry are
consistent, and skip the entry if not.
Fixes the failure of `journalctl -b -1` for 'truncated' journal:
https://github.com/systemd/systemd/pull/29334#issuecomment-1736567951
Yu Watanabe [Sun, 1 Oct 2023 07:48:36 +0000 (16:48 +0900)]
fileio: make read_full_file_full() usable with size and READ_FULL_FILE_UNBASE64
When READ_FULL_FILE_UNBASE64 (or READ_FULL_FILE_UNHEX) is specified,
setting size argument by caller is difficult, as it is hard to estimate
the encoded length.
This makes when size is specified with decoding option, let's read file
more, and check decoded size later with the specified size.
This adds a new script tools/check-version-history.py and a corresponding
test when building in developer mode. It checks manpages (except dbus
documentation which is handled by update-dbus-docs) for missing version
history information.
It also adds ignore lists based on version 183 (the version that our version
annotations go back to). These can be augmented if we want to ignore other
elements if it doesn't make sense for them to have version annotations.
Yu Watanabe [Sun, 1 Oct 2023 03:04:59 +0000 (12:04 +0900)]
sd-netlink: make the default timeout configurable by environment variable
On normal systems, triggering a timeout should be a bug in code or
configuration error, so I do not think we should extend the default
timeout. Also, we should not introduce a 'first class' configuration
option about that. But, making it configurable may be useful for cases
such that "an extremely highly utilized system (lots of OOM kills,
very high CPU utilization, etc)".
sd-journal: merge journal_file_next_entry_for_data() with generic_array_get_plus_one()
Because journal_file_next_entry_for_data() provides the first entry, while
journal_file_next_entry() actually provides the next entry of the input,
this also renames it to journal_file_move_to_entry_for_data().
Also, previously, on DIRECTION_UP the function did not fall back to the
'extra' entry when all entries linked in the chained array are broken.
This also fixes the issue, and now it fall back to the extra entry.
sd-journal: fix calculation of number of 'total' entries in the chained arrays
If there's corruption and we are going upwards, then the 'total'
must be decreased when we go to the previous array. However,
previously, we wrongly kept or increased the number. This fixes
the behavior.
In the do-while loop, we do not read any other entry array object, hence
the current object is always in the mmap cache and not necessary to re-read it.