repart: Add support for formatting verity partitions
This commit adds a new Verity= setting to repart definition files
with two possible values: "data" and "hash".
If Verity= is set to "data", repart works as before, and populates
the partition with the content from CopyBlocks= or CopyFiles=.
If Verity= is set to "hash", repart will try to find a matching
data partition with Verity=data and equal values for CopyBlocks=
or CopyFiles=, Format= and MakeDirectories=. If a matching data
partition is found, repart will generate verity hashes for that
data partition in the verity partition. The UUID of the data
partition is set to the first 128 bits of the verity root hash. The
UUID of the hashes partition is set to the final 128 bits of the
verity root hash.
Kai Lueke [Mon, 15 Aug 2022 15:47:03 +0000 (17:47 +0200)]
Use original filename for extension name check
The loading of an extension image from a symlink "NAME.raw" to
"NAME-VERSION.raw" failed because the release file name check worked
with the backing file of the loop device which already resolves the
symlink and thus the found name "NAME-VERSION" mismatched "NAME".
Pass the original filename and use it instead of the backing file
when available. This fixes the loading of "NAME.raw" extensions which
are a symlink to "NAME-VERSION.raw" as, e.g., may be the case when
systemd-sysupdate manages multiple versions.
rootidmap bind option will map the root user from the container to the
owner of the mounted directory on the filesystem. This will ensure files
and directories created by the root user in the container will be owned
by the directory owner on the filesystem. All other user will remain
unmapped.
nspawn: rename RemountIdmapFlags enum to RemountIdmapping
This enum should be used to define various idmapping modes for bind
mounts which might be incompatible. Changing its name and the values
name to reflect that.
If multiple service is starting simultaneously with a shared image,
then one of the service may fail to create a mount node:
systemd[695]: Bind-mounting /usr/lib/os-release on /run/systemd/unit-root/run/host/os-release (MS_BIND|MS_REC "")...
systemd[696]: Bind-mounting /usr/lib/os-release on /run/systemd/unit-root/run/host/os-release (MS_BIND|MS_REC "")...
systemd[695]: Failed to mount /usr/lib/os-release (type n/a) on /run/systemd/unit-root/run/host/os-release (MS_BIND|MS_REC ""): No such file or directory
systemd[696]: Failed to mount /usr/lib/os-release (type n/a) on /run/systemd/unit-root/run/host/os-release (MS_BIND|MS_REC ""): No such file or directory
systemd[695]: Bind-mounting /usr/lib/os-release on /run/systemd/unit-root/run/host/os-release (MS_BIND|MS_REC "")...
systemd[696]: Failed to create destination mount point node '/run/systemd/unit-root/run/host/os-release': Operation not permitted
systemd[695]: Successfully mounted /usr/lib/os-release to /run/systemd/unit-root/run/host/os-release
The function apply_one_mount() in src/core/namespace.c gracefully
handles -EEXIST from make_mount_point_inode_from_path(), but it erroneously
returned -EPERM previously. This fixes the issue.
Fixes one of the issues in #24147, especially reported at
https://github.com/systemd/systemd/issues/24147#issuecomment-1236194671.
bootspec: do not build two many json object at once
This is a workaround for an issue in the memory sanitizer.
If a function is called with too many arguments, then the sanitizer
triggers the following false-positive warning:
==349==WARNING: MemorySanitizer: use-of-uninitialized-value
#0 0x7f8b247134a7 in json_buildv /work/build/../../src/systemd/src/shared/json.c:3213:17
#1 0x7f8b24714231 in json_build /work/build/../../src/systemd/src/shared/json.c:4117:13
#2 0x7f8b24487fa5 in show_boot_entries /work/build/../../src/systemd/src/shared/bootspec.c:1424:29
#3 0x4a6a1b in LLVMFuzzerTestOneInput /work/build/../../src/systemd/src/fuzz/fuzz-bootspec.c:119:16
#4 0x4c6693 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:611:15
#5 0x4c5e7a in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool, bool*) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:514:3
#6 0x4c7ce4 in fuzzer::Fuzzer::ReadAndExecuteSeedCorpora(std::__Fuzzer::vector<fuzzer::SizedFile, std::__Fuzzer::allocator<fuzzer::SizedFile> >&) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:826:7
#7 0x4c7f19 in fuzzer::Fuzzer::Loop(std::__Fuzzer::vector<fuzzer::SizedFile, std::__Fuzzer::allocator<fuzzer::SizedFile> >&) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:857:3
#8 0x4b757f in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:912:6
#9 0x4e0bd2 in main /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerMain.cpp:20:10
#10 0x7f8b23ead082 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24082) (BuildId: 1878e6b475720c7c51969e69ab2d276fae6d1dee)
#11 0x41f69d in _start (build-out/fuzz-bootspec+0x41f69d)
Dumping everything to console slows the test quite considerably on
slower machines, so let's forward nspawn logs to the journal to still
have them available in case something goes south.
This should, hopefully, help with TEST-13 timeouts in Ubuntu CI and
maybe with CPU soft lockups in CentOS CI.
This should make the test faster on fast machines and more reliable on
slower/under-load machines, where the 4 sec sleep wasn't sometimes enough.
Spotted on C8S machines under load:
```
test_added_after (__main__.ExecutionResumeTest) ... FAIL
test_added_before (__main__.ExecutionResumeTest) ... ok
test_interleaved (__main__.ExecutionResumeTest) ... ok
test_issue_6533 (__main__.ExecutionResumeTest) ... ok
test_no_change (__main__.ExecutionResumeTest) ... ok
test_removal (__main__.ExecutionResumeTest) ... ok
test_swapped (__main__.ExecutionResumeTest) ... ok
======================================================================
FAIL: test_added_after (__main__.ExecutionResumeTest)
----------------------------------------------------------------------
Traceback (most recent call last):
File "./test/test-exec-deserialization.py", line 101, in check_output
with open(self.output_file, 'r') as log:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpjnec1dj4'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "./test/test-exec-deserialization.py", line 150, in test_added_after
self.check_output(expected_output)
File "./test/test-exec-deserialization.py", line 104, in check_output
self.fail()
AssertionError: None
----------------------------------------------------------------------
Ran 7 tests in 44.270s
```
In sd-device, `devpath` is a kind of syspath without '/sys' prefix, e.g.
/devices/pci0000:00/0000:00:1c.4/0000:3c:00.0/nvme/nvme0/nvme0n1,
and `devname` is a path to the device node, e.g. /dev/nvme0n1.
Let's use the consistent name for the helper function.
condition: change operator logic to use $= instead of =$ for glob comparisons
So this is a bit of a bikeshedding thing. But I think we should do this
nonetheless, before this is released.
Playing around with the glob matches I realized that "=$" is really hard
to grep for, since in shell code it's an often seen construct. Also,
when reading code I often found myself thinking first that the "$"
belongs to the rvalue instead of the operator, in a variable expansion
scheme.
If we move the $ character to the left hand, I think we are on the safer
side, since usually lvalues are much more restricted in character sets
than rvalues (at least most programming languages do enforce limits on
the character set for identifiers).
It makes it much easier to grep for the new operator, and easier to read
too. Example:
None of our other fnmatch() calls make use of this, and the concept was
new to me at least. Given that this is only used for the recently added
SMBIOS field matches (and is not included in any release) let's disable
"extended" matches for now. We can certainly revisit this, and enable it
later if there is real demand, but if we do, we should probably add that
all over the place, not just for smbios matches.
Let's move the operator enum into its own .c/.h file, so that we can
reuse it elsewhere, in particular systemd-analyze's compare-versions
logic.
Let's rename the concept CompareOperator, since it is nowadays
genericlaly about both order *and* fnmatch comparisons, hence just
naming it "order" is misleading.
loop-util: lock the control device around clearing the loopback device and deleting it
This mirrors what we already do during allocation. We lock the control
device first, and then release the block device and then delete it.
This makes things substantially more robust as long all participants do
such locking: we won't attempt to delete a block device somebody else
already is using.
Back when I wrote this code I wasn't aware of BLKPG and what it can do.
Hence I came up with this hack to attach an empty file to delete all
partitions. But today we can do better with BLKPG: let's just explicitly
remove all partitions, and then try again.
loop-util: rework how we lock loopback block devices
Let's rework how we lock loopback block devices in two ways:
1. Lock a separate fd, instead of the main block device fd. We already
did that for our internal locking when allocating loopback block
devices, but do so for the exposed locking (i.e.
loop_device_flock()), too, so that the lock is independent of the
main fd we actually use of IO.
2. Instead of locking the device during allocation of the loopback
device, then unlocking it (which will make udev run), and then
re-locking things if we need, let's instead just keep the lock the
whole time, to make things a bit safer and faster, and not have to
wait for udev at all. This is done by adding a "lock_op" parameter to
loop device allocation functions that declares the initial state of
the lock, and is one of LOCK_UN/LOCK_SH/LOCK_EX. This change also
shortens a lot of code, since we allocate + immediately lock loopback
devices pretty much everywhere.
Now that the loopback device code already destroys the partitions we
don't have to do this here anymore.
I am sure the right place to delete the partitions is in the loopback
code, since we really only should do that for loopback devices, see
bug #24431, and not on "real" block devices.
I am also not convinced dropping partitions the dissection logic doesn't
care about is a good idea, after all. The dissection stuff should
probably not consider itself the "owner" of the block devices it
analyzes, but take a more passive role: figure out what is what, but not
modify it.
loop-util: when clearing a loopback device delete partitions first, and take BSD lock
Whenever we release a loopback device, let's first synchronously delete
all partitions, so that we know that's complete and not done
asynchronously in the background. Take a BSD lock on the device while
doing so, so that udev won't make the devices busy while we do this.