Michal Sekletar [Wed, 1 Jun 2022 08:15:06 +0000 (10:15 +0200)]
scope: allow unprivileged delegation on scopes
Previously it was possible to set the delegate property for a scope, but
you were not able to allow an unprivileged process to manage the scope's
cgroup hierarchy. This is useful when launching a manager process that
will run unprivileged but is supposed to manage its own (scope)
sub-hierarchy.
test: skip the relevant test case if systemd-measure is not present
systemd-measure is not built without gnu-efi, which is the case, for
example, on ppc64le. Let's skip the relevant test case in this case
instead of failing.
```
The Meson build system
Version: 0.58.2
...
Host machine cpu family: ppc64
Host machine cpu: ppc64le
...
Message: Skipping systemd-measure.1 because HAVE_GNU_EFI is false
...
[ 115.711775] testsuite-70.sh[745]: + cat
[ 115.741996] testsuite-70.sh[832]: + /usr/lib/systemd/systemd-measure calculate --linux=/tmp/tpmdata1 --initrd=/tmp/tpmdata2
[ 115.754015] testsuite-70.sh[833]: + cmp - /tmp/result
[ 115.758004] testsuite-70.sh[832]: /usr/lib/systemd/tests/testdata/units/testsuite-70.sh: line 56: /usr/lib/systemd/systemd-measure: No such file or directory
[ 115.773851] testsuite-70.sh[833]: cmp: EOF on - which is empty
[ 115.983681] sh[835]: + systemctl poweroff --no-block
```
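The skip logic can be sketched as follows. This is a minimal illustration in Python, not the shell code the test suite uses; the helper name should_run_measure_test and its default path are assumptions taken from the log above.

```python
import os

def should_run_measure_test(binary="/usr/lib/systemd/systemd-measure"):
    """Return True only if the tool exists and is executable."""
    return os.path.isfile(binary) and os.access(binary, os.X_OK)

if __name__ == "__main__":
    if should_run_measure_test():
        print("running systemd-measure test case")
    else:
        # Skip instead of failing when the tool was not built.
        print("systemd-measure not found, skipping test case")
```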
Daan De Meyer [Wed, 3 Aug 2022 09:37:17 +0000 (11:37 +0200)]
repart: Only lock block device once
Let's lock the backing fd instead of locking/unlocking multiple
times when doing multiple operations with repart. It doesn't make
much sense for anything else to touch the block device while there
are still repart operations pending on it. By keeping the lock over
the full duration of repart, we prevent anything else from interfering
with the block device in between operations.
Luca Boccassi [Wed, 3 Aug 2022 17:41:13 +0000 (18:41 +0100)]
integritysetup: do not use crypt_init_data_device after crypt_init
crypt_init_data_device() replaces the crypt_device struct with a
new allocation, losing the old one, which we get from crypt_init().
Use crypt_set_data_device() instead.
This command takes a mountpoint, unmounts it and makes sure the
underlying partition devices and block device are removed before
exiting.
To mirror the --mount operation, we also add a --rmdir option which
does the opposite of --mkdir, and a -U option which is a shortcut
for --umount --rmdir.
test: optionally wait a bit when checking the mount unit
On fast systems we might race against systemd and check the mount unit
too early after mounting it, before systemd has had a chance to react to
the change.
```
[ 4.677701] H systemd[1]: Event source 0x210b3b0 (mount-monitor-dispatch) entered rate limit state.
...
[ 4.863731] H testsuite-64.sh[812]: + mount /logsysfsRxx
[ 4.865918] H kernel: EXT4-fs (vda2): mounted filesystem with ordered data mode. Opts: (null)
[ 4.866213] H testsuite-64.sh[812]: + systemctl status /logsysfsRxx
[ 4.877502] H testsuite-64.sh[919]: ○ logsysfsRxx.mount - /logsysfsRxx
[ 4.877502] H testsuite-64.sh[919]: Loaded: loaded (/etc/fstab; generated)
[ 4.877502] H testsuite-64.sh[919]: Active: inactive (dead)
[ 4.877502] H testsuite-64.sh[919]: Where: /logsysfsRxx
[ 4.877502] H testsuite-64.sh[919]: What: /dev/disk/by-uuid/deadbeef-dead-dead-beef-222222222222
[ 4.877502] H testsuite-64.sh[919]: Docs: man:fstab(5)
[ 4.877502] H testsuite-64.sh[919]: man:systemd-fstab-generator(8)
[ 4.877502] H testsuite-64.sh[919]: Aug 03 10:10:10 H systemd[1]: logsysfsRxx.mount: Processing implicit device dependencies
[ 4.877502] H testsuite-64.sh[919]: Aug 03 10:10:10 H systemd[1]: logsysfsRxx.mount: Added Requires dependency on /dev/disk/by-uuid/deadbeef-dead-dead-beef-222222222222
[ 4.877502] H testsuite-64.sh[919]: Aug 03 10:10:10 H systemd[1]: logsysfsRxx.mount: Added StopPropagatedFrom dependency on /dev/disk/by-uuid/deadbeef-dead-dead-beef-222222222222
[ 4.895683] H sh[920]: + systemctl poweroff --no-block
[ 4.906533] H systemd[1]: Found unit logsysfsRxx.mount at /run/systemd/generator/logsysfsRxx.mount (regular file)
[ 4.906594] H systemd[1]: Preset files don't specify rule for logsysfsRxx.mount. Enabling.
[ 4.906990] H systemd[1]: testsuite-64.service: Main process exited, code=exited, status=3/NOTIMPLEMENTED
[ 4.907057] H systemd[1]: testsuite-64.service: Failed with result 'exit-code'.
[ 4.907287] H systemd[1]: Failed to start testsuite-64.service.
[ 4.955293] H systemd[1]: Starting end.service...
[ 4.955736] H systemd-logind[809]: The system will power off now!
[ 4.955868] H systemd-logind[809]: System is powering down.
[ 4.975781] H systemd[1]: Event source 0x210b3b0 (mount-monitor-dispatch) left rate limit state.
[ 4.975821] H systemd[1]: logsysfsRxx.mount: Processing implicit device dependencies
[ 4.975857] H systemd[1]: logsysfsRxx.mount: Added Requires dependency on /dev/vda2
[ 4.975893] H systemd[1]: logsysfsRxx.mount: Added StopPropagatedFrom dependency on /dev/vda2
[ 4.975928] H systemd[1]: Unit blockdev@dev-vda2.target has alias blockdev@.target.
[ 4.975967] H systemd[1]: logsysfsRxx.mount: Added After dependency on /dev/vda2
[ 4.976081] H systemd[1]: logsysfsRxx.mount: Changed dead -> mounted
```
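One way to tolerate such a race is to poll for the expected state with a timeout rather than checking once. The sketch below is a generic Python illustration; wait_for and its parameters are hypothetical names, not what the test script uses.

```python
import time

def wait_for(check, timeout=5.0, interval=0.1):
    """Poll check() until it returns True or timeout seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        # Give the service manager time to process the mount event.
        time.sleep(interval)
```

A caller could pass a check that runs something like "systemctl is-active --quiet" on the mount unit and tests for a zero exit status.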
James Hilliard [Mon, 1 Aug 2022 01:11:47 +0000 (01:11 +0000)]
bpf: fix is_allow_list section
The llvm bpf compiler appears to place const volatile variables in
a non-standard section which creates an incompatibility with the gcc
bpf compiler.
To fix this, force GCC to also use the rodata section.
Note this does emit an assembler warning:
Generating src/core/bpf/restrict_ifaces/restrict-ifaces.bpf.unstripped.o with a custom command
/tmp/ccM2b7jP.s: Assembler messages:
/tmp/ccM2b7jP.s:87: Warning: setting incorrect section attributes for .rodata
Fixes:
../src/core/restrict-ifaces.c:45:14: error: ‘struct
restrict_ifaces_bpf’ has no member named ‘rodata’; did you mean
‘data’?
45 | obj->rodata->is_allow_list = is_allow_list;
| ^~~~~~
| data
Loïc Collignon [Wed, 3 Aug 2022 09:42:28 +0000 (11:42 +0200)]
Fix 24172: __STDC_VERSION__ may be defined in C++
According to the C++ ISO standard, a conformant compiler is allowed to
define this macro to any value for any reason as it is implementation
defined: https://timsong-cpp.github.io/cppwp/cpp.predefined#2.3
This means that it cannot be assumed to be undefined in C++.
Change the condition to reflect that.
This patch adds support for enrolling secure boot keys from sd-boot.
***DANGER*** NOTE ***DANGER***
This feature might result in your device becoming soft-bricked as outlined
below, please use this feature carefully.
***DANGER*** NOTE ***DANGER***
If secure-boot-enroll is set to no, then no action whatsoever is performed,
no matter the files on the ESP.
If secure boot keys are found under $ESP/loader/keys and secure-boot-enroll
is set to either manual or force then sd-boot will generate enrollment entries
named after the directories they are in. The entries are shown at the very bottom
of the list and can be selected by the user from the menu. If the user selects it,
the user is shown a screen allowing for cancellation before a timeout. The enrollment
proceeds if the action is not cancelled after the timeout.
Additionally, if the secure-boot-enroll option is set to 'force' then the keys
located in the directory named 'auto' are going to be enrolled automatically. The user
is still going to be shown a screen allowing them to cancel the action if they want to,
however the enrollment will proceed automatically after a timeout without
user cancellation.
After keys are enrolled, the system reboots with secure boot enabled. Therefore, it is
***critical*** to ensure that everything needed for the system to boot is signed
properly (sd-boot itself, kernel, initramfs, PCI option ROMs).
This feature currently only allows loading the most simple set of variables: PK, KEK
and db.
The files need to be prepared with cert-to-efi-sig-list and then signed with
sign-efi-sig-list.
Here is a short example to generate your own keys and the right files for
auto-enrollment.
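```
keys="PK KEK DB"
uuid="{$(systemd-id128 new -u)}"
for key in ${keys}; do
  # Generate a self-signed certificate and a matching private key
  openssl req -new -x509 -subj "/CN=${key}/" -keyout "${key}.key" -out "${key}.crt"
  # Convert the certificate to DER
  openssl x509 -outform DER -in "${key}.crt" -out "${key}.cer"
  # Build a (not yet signed) EFI signature list owned by ${uuid}
  cert-to-efi-sig-list -g "${uuid}" "${key}.crt" "${key}.esl.nosign"
done
```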
Once these keys are enrolled, all the files needed for boot ***NEED*** to be signed in
order to run. You can sign the binaries with the sbsign tool, for example:
Einsler Lee [Tue, 2 Mar 2021 12:21:21 +0000 (20:21 +0800)]
main: reopen /dev/console for user service manager
Currently the console_fd of the user service manager is 2. Even if LogTarget=console is set in /etc/systemd/user.conf, no log output appears on the console.
This reopens /dev/console, so that user service logs can be output to the console.
repart: when keeping ref to backing inode/devnode, use fd_reopen() rather than F_DUPFD
Via the "backing_fd" variable we intend to pin the backing inode through
our entire code. So far we typically created the fd via F_DUPFD_CLOEXEC,
and thus any BSD lock taken on the original fd is shared with our
backing_fd reference. And if the original fd is closed but our backing_fd
is not, we'll keep the BSD lock held, even if we then reopen the block
device through the backing_fd. If hit, this results in a deadlock.
Let's fix that by creating the backing_fd via fd_reopen(), so that the
locks are no longer shared, and if the original fd is closed all BSD
locks on it that are in effect are auto-released.
(Note the deadlock is only triggered if multiple operations on the same
backing inode are executed, i.e. factory reset, resize and applying of
partitions.)
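The difference between a duplicated and a reopened fd can be shown with a small Python sketch. It assumes Linux with /proc mounted and uses a temporary regular file in place of a block device; opening /proc/self/fd/N stands in for fd_reopen(), and lock_survives_close is a hypothetical helper name.

```python
import fcntl
import os
import tempfile

def lock_survives_close(reopen):
    """Lock an fd, keep a second reference, close the first.

    Returns True if the BSD lock is still held afterwards.
    """
    with tempfile.NamedTemporaryFile() as tmp:
        orig = os.open(tmp.name, os.O_RDWR)
        if reopen:
            # New open file description: locks are not shared with orig.
            backing = os.open(f"/proc/self/fd/{orig}", os.O_RDWR)
        else:
            # Same open file description: locks are shared with orig.
            backing = os.dup(orig)
        fcntl.flock(orig, fcntl.LOCK_EX)
        os.close(orig)

        # Probe through an unrelated open file description.
        probe = os.open(tmp.name, os.O_RDWR)
        try:
            fcntl.flock(probe, fcntl.LOCK_EX | fcntl.LOCK_NB)
            held = False
        except BlockingIOError:
            held = True
        finally:
            os.close(probe)
            os.close(backing)
        return held
```

With reopen=False the lock outlives the original fd, so a blocking lock attempt on a fresh open would hang. With reopen=True closing the original fd releases the lock.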
Calling fd_is_mountpoint() with AT_EMPTYPATH and an empty filename can
only work if we have new statx() available. If we do not, we can still
make things work for directories, but not for other inodes (since there
we cannot query information about the parent inode to compare things.)
Hence, let's handle and test this explicitly, to support this to the
level this is possible.
test: install libgcc_s.so.1 explicitly if available
Since the library is dlopen()ed by libpthread and required during
pthread_exit()/pthread_cancel(), let's install it explicitly if available to
avoid unexpected fails in tests. This also consolidates all related
workarounds for this library across the test scripts.
Daan De Meyer [Tue, 2 Aug 2022 09:51:40 +0000 (11:51 +0200)]
mkosi: Update to latest commit
With this update, Arch Linux keyring updates will be automatically
pulled in instead of having to update to a new mkosi commit every
time the keyring gets outdated.
measure: add new tool to precalculate PCR values for a kernel image
For now, this simply outputs the PCR hash values expected for a kernel
image, if it's measured like sd-stub would do it.
(Later on, we can extend the tool, to optionally sign these
pre-calculated measurements, in order to implement signed PCR policies
for disk encryption.)
This is not actually used (or even supposed to be used) in clean
codepaths, but is tremendously useful when verifying things work
correctly, as a debugging tool.
Report whether the devicetree + sort-key boot loader spec type #1
fields are supported, and whether the "@saved" pseudo-entry is
supported.
Strictly speaking, these features have been added in versions that are
already released (250+), so by adding this those versions, even though
they support the features, will be considered not supporting them, but
that should be OK (the opposite would be a problem though, i.e. if we'd
assume a boot loader had a feature it actually does not).
These three features are features relevant to userspace, as they allow
userspace to tweak/generate BLS entries or set EFI vars correctly.
Other features (i.e. that have no implications to userspace) are not
reported.
stub: clean up kernel command line when converting to ASCII
Let's be a bit more careful when converting the UTF-16 cmdline to ASCII.
Let's convert all characters out of the printable ASCII range to spaces,
instead of blindly relying on C's downcasting behaviour.
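A rough Python sketch of this conversion follows. It treats the input as UTF-16LE code units; the function name cmdline_to_ascii is hypothetical and the stub itself is written in C.

```python
import struct

def cmdline_to_ascii(raw: bytes) -> str:
    """Convert UTF-16LE code units to ASCII, blanking anything unprintable."""
    count = len(raw) // 2
    units = struct.unpack(f"<{count}H", raw[: count * 2])
    # Keep 0x20..0x7E as-is, replace every other code unit with a space.
    return "".join(chr(u) if 0x20 <= u <= 0x7E else " " for u in units)
```

For comparison, blindly truncating each code unit to 8 bits would turn U+0141 into 'A' (0x41), silently changing the command line instead of blanking the character.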
stub: introduce StubFeatures, similar to LoaderFeatures
systemd-boot reports its features via the LoaderFeatures EFI variable.
Let's add something similar for stub features, given they have been
growing.
For starters, only define four feature flags. One is a baseline feature
we pretty much always supported (see comment in code), two are features
added in one of the most recently released systemd versions, and the
final one is a feature we added a few commits ago.
This is useful for userspace to figure out what is supported and what
not.
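Userspace could decode such a feature bitmask roughly as sketched below. The flag names, bit positions and the 64-bit little-endian encoding are assumptions for illustration only; the commit message does not spell them out.

```python
import enum
import struct

class StubFeature(enum.IntFlag):
    # Names and bit positions are illustrative assumptions.
    REPORT_BOOT_PARTITION = 1 << 0
    PICK_UP_CREDENTIALS = 1 << 1
    PICK_UP_SYSEXTS = 1 << 2
    THREE_PCRS = 1 << 3

def parse_features(payload: bytes) -> StubFeature:
    """Decode a 64-bit little-endian bitmask into a set of flags."""
    (mask,) = struct.unpack("<Q", payload)
    return StubFeature(mask)

def supports(payload: bytes, feature: StubFeature) -> bool:
    """Check whether a single feature bit is set in the payload."""
    return feature in parse_features(payload)
```

A missing bit is read as "not supported", which is the safe direction: userspace may underestimate a stub, but will not rely on a feature that is absent.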
sd-stub: measure sysext images picked up by sd-stub into PCR 13
Let's grab another so far unused PCR, and measure all sysext images into
it that we load from the ESP. Note that this is possibly partly redundant,
since sysext images should have dm-verity enabled, and that is hooked up
to IMA. However, measuring this explicitly has the benefit that we can
measure filenames too, easily, and that all without need for IMA or
anything like that.
This means: when booting a unified sd-stub kernel through sd-boot we'll
now have:
1. PCR 11: static components of the unified kernel image (i.e. kernel,
built-in initrd and so on)
2. PCR 12: kernel command line (i.e. the one embedded in the image, plus
optionally an overridden one) + any credential files picked up by
sd-stub
3. PCR 13: sysext images picked up by sd-stub
And each of these three PCRs should carry just the above, and start from
zero, thus be pre-calculable.
Thus, all components and parameters of the OS boot process (i.e.
everything after the boot loader) are now nicely pre-calculable.
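Pre-calculation works because a PCR that starts from zero and only receives known measurements can be replayed offline. The Python sketch below shows the generic TPM extend operation, new = H(old || H(data)). Which byte strings sd-stub measures, and in which order, is not modeled here, and the function names are hypothetical.

```python
import hashlib

def pcr_extend(pcr: bytes, data: bytes, alg: str = "sha256") -> bytes:
    """Return the PCR value after measuring data: H(old_pcr || H(data))."""
    digest = hashlib.new(alg, data).digest()
    return hashlib.new(alg, pcr + digest).digest()

def precalculate_pcr(measurements, alg: str = "sha256") -> bytes:
    """Replay a sequence of measurements starting from an all-zero PCR."""
    pcr = bytes(hashlib.new(alg).digest_size)
    for data in measurements:
        pcr = pcr_extend(pcr, data, alg)
    return pcr
```

Since each extend hashes the previous value, the result depends on both the content and the order of the measurements. This is why unrelated records landing in the same PCR make it hard to predict.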
NOTE: this actually replaces previous measuring of the sysext images into
PCR 4. I added this back in 845707aae23b3129db635604edb95c4048a5922a,
following the train of thought, that sysext images for the initrd should
be measured like the initrd itself they are for, and according to my
thinking that would be a unified kernel which is measured by firmware
into PCR 4 like any other UEFI executables.
However, I think we should depart from that idea. First and foremost
that makes it harder to pre-calculate PCR 4 (since we actually measured
quite incompatible records to the TPM event log), but also I think
there's great value in being able to write policies that bind to the
used sysexts independently of the earlier boot chain (i.e. shim, boot
loader, unified kernel), hence a separate PCR makes more sense.
Strictly speaking, this is a compatibility break, but I think one we can
get away with, simply because the initrd sysext images are currently not
picked up by systemd-sysext yet in the initrd, and because of that we
can be reasonably sure no one uses this yet, and hence relies on the PCR
register used. Hence, let's clean this up before people actually do
start relying on this.
efi: from the stub measure the ELF kernel + built-in initrd and so on into PCR 11
Here we grab a new PCR, so far unused on Linux (by my Googling skills,
that is), and measure all static components of the PE kernel image into
it. This is useful since for the first time we'll have a PCR that
contains only a measurement of the booted kernel, nothing else. That allows putting
together TPM policies that bind to a specific kernel (+ builtin initrd),
without having to have booted that kernel first. PCRs can be
pre-calculated. Yay!
You might wonder, why we measure just the discovered PE sections we are
about to use, instead of the whole PE image. That's because of the next
step I have in mind: PE images should also be able to carry an
additional section that contains a signature for its own expected,
pre-calculated PCR values. This signature data should then be passed
into the booted kernel and can be used there in TPM policies. Benefit:
TPM policies can now be bound to *signatures* of PCRs, instead of the
raw hash values themselves. This makes update management a *lot* easier,
as policies don't need to be updated whenever a kernel is updated, as
long as the signature is available. Now, if the PCR signature is
embedded in the kernel PE image it cannot be of a PCR hash of the kernel
PE image itself, because that would be a chicken-and-egg problem. Hence,
by only measuring the relevant payload sections (and that means
excluding the future section that will contain the PCR hash signature)
we avoid this problem, naturally.
efi: optionally report when measuring to TPM whether we actually did
The measurement calls can succeed either when they actually measured
something, or when they skipped measurement because the local system
didn't support TPMs.
Let's optionally return a boolean saying which case it is. This is later
useful to tell userspace how and if we measured something.
Eli Schwartz [Wed, 27 Jul 2022 01:49:48 +0000 (21:49 -0400)]
meson: fix broken boolean kwarg
Everywhere else that `conf.get('ENABLE_*')` is used as a boolean key for
something (for example in if statements) it always checks if == 1, but
in this one case it neglects to do so. This is important because
conf.get yields the same int that was stored, but if statements require
booleans.
So does executable's "install" kwarg, at least according to the
documentation. In actuality, it accepts all types without sanity
checking, then uses python "if bool(var)", so you can actually do
`install: 'do not'` and that's treated identical to `true`. This is a
type-checking bug which Meson will eventually fix.
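The truthiness pitfall is easy to reproduce in plain Python. The two helper functions below are hypothetical and only contrast an unchecked bool() conversion with a strict type check; they are not Meson's code.

```python
def install_unchecked(value):
    """Mimic an unchecked truthiness test: any non-empty value counts as true."""
    return bool(value)

def install_checked(value):
    """Accept real booleans only, rejecting ints and strings."""
    if not isinstance(value, bool):
        raise TypeError(f"'install' must be a boolean, got {type(value).__name__}")
    return value

if __name__ == "__main__":
    print(install_unchecked("do not"))  # True: a non-empty string is truthy
    print(install_unchecked(1))         # True: int, not a boolean
    print(install_checked(True))        # True: the only accepted form
```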
Eli Schwartz [Wed, 27 Jul 2022 01:09:07 +0000 (21:09 -0400)]
meson: use files in run_command with relativized path
Passing a file as a command argument in string form assumes that
run_command has the current subdir as its cwd, but Meson's documentation
*explicitly* calls this out as undefined and wrong to use.
Indeed, muon has a different implementation that uses a different cwd,
and this argument cannot be found. Instead, passing a files() object
means that it's the job of meson itself to verify the file exists, then
pass it to the run_command in some format that guarantees it is a valid
path reference.
Eli Schwartz [Thu, 19 May 2022 10:54:40 +0000 (06:54 -0400)]
meson: move i18n module import to only when it is used
When translations are disabled, it's not necessary to `import('i18n')`
and do nothing with it. Also, importing it is (slightly) slow as Meson
needs to load another implementation file from disk, so why bother with
that work?
More particularly, muon does not yet implement this module and fails to
setup. Since there's already an option to disable using it, it makes
sense to let that option completely skip the not-implemented
functionality and actually succeed.
Eli Schwartz [Thu, 19 May 2022 10:50:35 +0000 (06:50 -0400)]
meson: fix type for many build options
Integers and booleans are supposed to be actual integers and booleans,
not strings describing their value, but Meson silently accepted either
one. It's still wrong to do it though, and other implementations of
Meson such as muon choke on it.
Fei Li [Fri, 17 Jun 2022 11:26:28 +0000 (19:26 +0800)]
virt: detect KubeVirt instance
KubeVirt is currently technically based on KVM (but not Xen yet[1]).
The systemd-detect-virt command, used to differentiate the current
virtualization environment, works fine on x86 relying on CPUID, but
fails to get the correct value (none instead of kvm) on aarch64.
Let's fix this by adding a new 'vendor[KubeVirt] = kvm' classification
considering the sys_vendor is always KubeVirt.
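The vendor-based classification can be pictured as a lookup table keyed by the DMI sys_vendor string. The Python sketch below is an illustration only: the table is a small subset chosen for the example, prefix matching is an assumption, and the function name is hypothetical.

```python
# Illustrative subset; the real table in systemd is longer and may differ.
DMI_VENDOR_TABLE = [
    ("KVM", "kvm"),
    ("QEMU", "qemu"),
    ("VMware", "vmware"),
    ("Xen", "xen"),
    ("KubeVirt", "kvm"),  # the classification this commit describes
]

def detect_vm_from_vendor(sys_vendor: str) -> str:
    """Map a DMI vendor string to a virtualization name, or 'none'."""
    for prefix, virt in DMI_VENDOR_TABLE:
        if sys_vendor.startswith(prefix):
            return virt
    return "none"
```

On a real system the input would typically come from a file such as /sys/class/dmi/id/sys_vendor, which is useful on architectures where CPUID is not available.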
Let's remove the baud settings for the container getty units since
they don't have any effect there anyway. On top of that, when we're
dealing with container TTYs, we can handle all the setup involved
ourselves so let's prevent agetty/login from touching the container
tty at all.
One example where this helps is that it actually makes disabling
TTYVHangup have an effect since before, login would unconditionally
call vhangup() on the tty.
Alexander Wilson [Fri, 22 Jul 2022 11:08:11 +0000 (04:08 -0700)]
machinectl: Add plumbing for a `--force` flag for file copy
machine: Add APIs CopyTo[Machine]WithFlags + CopyFrom[Machine]WithFlags
- Same API as those without `WithFlags` (except this can take flags)
- Initially, only a flag to allow replacing a file if it already exists
Alexander Wilson [Fri, 22 Jul 2022 11:13:31 +0000 (04:13 -0700)]
copy: Respect COPY_REPLACE flag for copy_tree
- Add a test that asserts that copy_tree on an existing file will fail without COPY_REPLACE
- Add a test that asserts that copy_tree with COPY_MERGE and COPY_REPLACE on an existing directory will overwrite files that already exist.
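The behaviour these tests assert can be sketched in Python for a single file. This is not how copy_tree is implemented; copy_file, its replace parameter and the temporary file name are hypothetical stand-ins for the COPY_REPLACE semantics.

```python
import os
import shutil

def copy_file(src, dst, replace=False):
    """Copy src to dst; refuse to overwrite unless replace is set."""
    if replace:
        tmp = dst + ".tmp"  # hypothetical temporary name
        shutil.copyfile(src, tmp)
        # Atomically swap the new file into place.
        os.replace(tmp, dst)
    else:
        # O_EXCL makes the open fail if dst already exists.
        fd = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
        with os.fdopen(fd, "wb") as out, open(src, "rb") as inp:
            shutil.copyfileobj(inp, out)
```

Without the flag, an existing destination raises FileExistsError and is left untouched. With it, the destination ends up with the source content.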
Alexander Wilson [Fri, 22 Jul 2022 11:15:08 +0000 (04:15 -0700)]
copy.[ch]: Refactor
- Refactor: Move HardlinkContext to header file
- Refactor: Create `fd_copy_tree_generic` which isolates the functionality to check stat type and appropriately copy.
- Refactor: Create `fd_copy_leaf` which handles copying leaf nodes of a file tree.
stub: override StubInfo EFI variable unconditionally, since *we* own it
The other variables are owned by the boot menu (i.e. sd-boot), we only
fill those in if it didn't do so for us (to support cases where our stub
kernel is directly invoked by UEFI). But StubInfo is genuinely about the
stub, hence let's simplify things and unconditionally set it from the
stub.
boot: introduce common shortcut exit path in pack_cpio()
This will be useful in a later commit, when we add more stuff to the
common exit path. But even without that, it's a nice simplification,
removing redundant lines.