Yu Watanabe [Tue, 31 May 2022 19:01:10 +0000 (04:01 +0900)]
test-network: call networkctl only when specified interface exists
Otherwise, this easily trigger another exception:
```
======================================================================
ERROR: test_erspan_tunnel_v0 (__main__.NetworkdNetDevTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "./test/test-network/systemd-networkd-tests.py", line 686, in wait_online
check_output(*args, env=env)
File "./test/test-network/systemd-networkd-tests.py", line 65, in check_output
return subprocess.check_output(command, universal_newlines=True, **kwargs).rstrip()
File "/usr/lib64/python3.6/subprocess.py", line 356, in check_output
**kwargs).stdout
File "/usr/lib64/python3.6/subprocess.py", line 438, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['/usr/lib/systemd/systemd-networkd-wait-online', '--timeout=20s', '--interface=erspan99:routable', '--interface=erspan98:routable', '--interface=dummy98:degraded']' returned non-zero exit status 1.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "./test/test-network/systemd-networkd-tests.py", line 1808, in test_erspan_tunnel_v0
self.wait_online(['erspan99:routable', 'erspan98:routable', 'dummy98:degraded'])
File "./test/test-network/systemd-networkd-tests.py", line 689, in wait_online
output = check_output(*networkctl_cmd, '-n', '0', 'status', link.split(':')[0], env=env)
File "./test/test-network/systemd-networkd-tests.py", line 65, in check_output
return subprocess.check_output(command, universal_newlines=True, **kwargs).rstrip()
File "/usr/lib64/python3.6/subprocess.py", line 356, in check_output
**kwargs).stdout
File "/usr/lib64/python3.6/subprocess.py", line 438, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['/usr/bin/networkctl', '-n', '0', 'status', 'erspan99']' returned non-zero exit status 1.
```
We currently have a convoluted and complex selection of which random
numbers to use. We can simplify this down to two functions that cover
all of our use cases:
1) Randomness for crypto: this one needs to wait until the RNG is
initialized. So it uses getrandom(0). If that's not available, it
polls on /dev/random, and then reads from /dev/urandom. This function
returns whether or not it was successful, as before.
2) Randomness for other things: this one uses getrandom(GRND_INSECURE).
If it's not available it uses getrandom(GRND_NONBLOCK). And if that
would block, then it falls back to /dev/urandom. And if /dev/urandom
isn't available, it uses the fallback code. It never fails and
doesn't return a value.
These two cases match all the uses of randomness inside of systemd.
I would prefer to make both of these return void, and get rid of the
fallback code, and simply assert in the incredibly unlikely case that
/dev/urandom doesn't exist. But Luca disagrees, so this commit attempts
to instead keep case (1) returning a return value, which all the callers
already check, and fix the fallback code in (2) to be less bad than
before.
For the less bad fallback code for (2), we now use auxval and some
timestamps, together with various counters representing the invocation,
hash it all together and provide the output. Provided that AT_RANDOM is
secure, this construction is probably okay too, though notably it
doesn't have any forward secrecy. Fortunately, it's only used by
random_bytes() and not by crypto_random_bytes().
msizanoen1 [Mon, 30 May 2022 15:08:07 +0000 (22:08 +0700)]
cgroup-util: Properly handle conditions where cgroup.threads is empty after SIGKILL but processes still remain
After sending a SIGKILL to a process, the process might disappear from
`cgroup.threads` but still show up in `cgroup.procs` and still remains in the
cgroup and cause migrating new processes to `Delegate=yes` cgroups to fail with
`-EBUSY`. This is especially likely for heavyweight processes that consume more
kernel CPU time to clean up.
Fix this by only returning 0 when both `cgroup.threads` and
`cgroup.procs` are empty.
Benjamin Franzke [Mon, 30 May 2022 18:21:48 +0000 (20:21 +0200)]
man/nspawn: os-release is only checked for booted containers
/etc/os-release existence is only enforced in --boot mode,
therefore the term "starting" (which also applies to chroot-like mode)
is substituted with "booting" in this context.
Benjamin Franzke [Sat, 28 May 2022 12:55:22 +0000 (14:55 +0200)]
man/nspawn: add a sentence-connecting adverb to machinectl note
The recommendation to use machinectl login/shell instead of
trying to combine two distinct container instances seemed a
litte bit out of context and is now combined via "rather".
Yu Watanabe [Fri, 27 May 2022 05:11:56 +0000 (14:11 +0900)]
portable: remove drop-in configs even if the main unit file does not exist
When we run `portablectl detach --enable --runtime`, then it triggers
`DisableUnitFilesWithFlags` DBus method and the main unit file is
removed, but its drop-ins are not. Hence, portable_detach() failed to
list existing portable units.
This makes the loop for listing portable units also accept drop-in
directories. So, all remaining drop-in directories are correctly
removed.
Before:
```
testsuite-29.sh[600]: + portablectl detach --now --runtime --enable /tmp/rootdir minimal-app0
portablectl[1391]: (Matching unit files with prefixes 'minimal-app0'.)
portablectl[1391]: Queued /org/freedesktop/systemd1/job/1812 to call StopUnit on portable service minimal-app0-foo.service.
portablectl[1391]: Removed "/run/systemd/system.attached/minimal-app0-foo.service".
portablectl[1391]: Queued /org/freedesktop/systemd1/job/1813 to call StopUnit on portable service minimal-app0.service.
portablectl[1391]: Removed "/run/systemd/system.attached/minimal-app0.service".
portablectl[1391]: Got result done/Success for job minimal-app0-foo.service
portablectl[1391]: Got result done/Success for job minimal-app0.service
portablectl[1391]: DetachImage failed: No unit files associated with '/tmp/rootdir' found attached to the system. Image not attached?
```
After:
```
testsuite-29.sh[508]: + portablectl detach --now --runtime --enable /tmp/rootdir minimal-app0
portablectl[1076]: (Matching unit files with prefixes 'minimal-app0'.)
portablectl[1076]: Queued /org/freedesktop/systemd1/job/1946 to call StopUnit on portable service minimal-app0-foo.service.
portablectl[1076]: Removed "/run/systemd/system.attached/minimal-app0-foo.service".
portablectl[1076]: Queued /org/freedesktop/systemd1/job/1947 to call StopUnit on portable service minimal-app0.service.
portablectl[1076]: Removed "/run/systemd/system.attached/minimal-app0.service".
portablectl[1076]: Removed /run/systemd/system.attached/minimal-app0.service.d/10-profile.conf.
portablectl[1076]: Removed /run/systemd/system.attached/minimal-app0.service.d/20-portable.conf.
portablectl[1076]: Removed /run/systemd/system.attached/minimal-app0.service.d.
portablectl[1076]: Removed /run/systemd/system.attached/minimal-app0-foo.service.d/10-profile.conf.
portablectl[1076]: Removed /run/systemd/system.attached/minimal-app0-foo.service.d/20-portable.conf.
portablectl[1076]: Removed /run/systemd/system.attached/minimal-app0-foo.service.d.
portablectl[1076]: Removed /run/portables/rootdir.
portablectl[1076]: Removed /run/systemd/system.attached.
```
Michal Sekletar [Mon, 30 May 2022 09:55:41 +0000 (11:55 +0200)]
unit: check for mount rate limiting before checking active state
Having this check as part of mount_can_start() is too late because
UNIT(u)->can_start() virtual method is called after checking the active
state of unit in unit_start().
We need to hold off running mount start jobs when /p/s/mountinfo monitor
is rate limited even when given mount unit is already active.
Jan Janssen [Fri, 27 May 2022 19:15:22 +0000 (21:15 +0200)]
meson: Build header tests with -pedantic
By using __extension__, we can silence pedantic errors we cannot or
do not want to fix.
This in particular silences:
- enum values being outside of int range
- variadic macros
- long long being C99
- type of bit-field ‘type’ is a GCC extension
- use of C99 bool in public header functions
tests: link tests using fabs against libm explicitly
Some compiler wrappers like honggfuzz pass -fno-builtin explicitly
and because of that the tests where fabs is used fail to compile
with something like
```
FAILED: test-bus-marshal
...
/usr/bin/ld: test-bus-marshal.p/src_libsystemd_sd-bus_test-bus-marshal.c.o: undefined reference to symbol 'fabs@@GLIBC_2.2.5'
/usr/bin/ld: /usr/lib64/libm.so.6: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status
```
Fun fact: it took honggfuzz less than a minute to discover
https://github.com/advisories/GHSA-gmc7-pqv9-966m used by
systemd to compress/descompress some stuff.
Nick Rosbrook [Tue, 24 May 2022 17:15:13 +0000 (13:15 -0400)]
hwdb: implement --root option for systemd-hwdb query
Currently, the systemd-hwdb --root flag only has an effect for the
'update' verb. It would be useful to be able to use the --root option
for the 'query' verb too (e.g. for testing a hwdb.bin created with
systemd-hwdb update --root <path>).
Use sd_hwdb_new_from_path to initialize the hwdb if --root is passed to
systemd-hwdb query.
Note that this functionality was not added to 'udevadm hwdb' since that
command is deprecated.
Nick Rosbrook [Tue, 24 May 2022 17:08:06 +0000 (13:08 -0400)]
sd-hwdb: add sd_hwdb_new_from_path
The existing sd_hwdb_new function always initializes the hwdb from the
first successful hwdb.bin it finds from hwdb_bin_paths. This means there
is currently no way to initialize a hwdb from an explicit path, which
would be useful for systemd-hwdb query.
Add sd_hwdb_new_from_path to allow a sd_hwdb to be initialized from a
custom path outside of hwdb_bin_paths.
Nick Rosbrook [Thu, 26 May 2022 18:32:20 +0000 (14:32 -0400)]
sd-hwdb: include sys/stat.h in hwdb-internal.h
Include this header to fix errors when including hwdb-internal.h:
../src/libsystemd/sd-hwdb/hwdb-internal.h:16:21: error: field ‘st’ has incomplete type
16 | struct stat st;
If something doesn't match, let's print the non-matching value.
If we can't query something, say what.
And make the messages in the udev and blkid paths different, so
we tell which approach failed from a log.
kernel-install: if a plugin fails, return error immediately
Since the first version in 81516adcb71a47837544340f72eb8ee810274119,
kernel-install would "gather" a return value by summing the exit codes
of the plugins… This makes no sense, because those are not additive values.
Let's just break off immediately. We now implement cleanup via trap, so if we
break, we should leave no garbage behind.
docs/BLS: clear up the confusion about what $BOOT means
The text used was originally written for everything being on the ESP. It was
later generalized for support XBOOTLDR, and "$BOOT" was introduced to mean
something like "XBOOTLDR if present, the ESP otherwise", and most of the text
was changed to talk about $BOOT. Sadly, this doesn't work, because the two
partitions are not interchangeable. sd-boot loads entries from both partitions,
and its configuration, random-seed, etc. only from the ESP.
The terms are redefined: $BOOT now means either the ESP or the "boot partition"
playing the same role on MBR systems, and $XBOOTLDR is XBOOTLDR.
Like various previous commits, this makes the specification describe our
current implementation.
Also, the let's just accept the common practice of using /boot and /boot/efi.
Since both partitions need to be read to gather configuration, it isn't a
problem that one is mounted underneath the other one. I think having /boot and
/efi is OK, but not better in any measureable way, so let's stop trying to push
people towards this setup.
A note that XBOOTLDR must be on the same disk as ESP is added.
Yu Watanabe [Fri, 20 May 2022 08:25:12 +0000 (10:25 +0200)]
core/device: do not downgrade device state if it is already enumerated
On switching root, a device may have a persistent databse. In that case,
Device.enumerated_found may have DEVICE_FOUND_UDEV flag, and it is not
necessary to downgrade the Device.deserialized_found and
Device.deserialized_state. Otherwise, the state of the device unit may
be changed plugged -> dead -> plugged, if the device has not been mounted.
Martin Wilck [Wed, 25 May 2022 10:01:00 +0000 (12:01 +0200)]
core/device: device_coldplug(): don't set DEVICE_DEAD
dm-crypt device units generated by systemd-cryptsetup-generator
habe BindsTo= dependencies on their backend devices. The dm-crypt
devices have the db_persist flag set, and thus survive the udev db
cleanup while switching root. But backend devices usually don't survive.
These devices are neither mounted nor used for swap, thus they will
seen as DEVICE_NOT_FOUND after switching root.
The BindsTo dependency will cause systemd to schedule a stop
job for the dm-crypt device, breaking boot:
[ 68.929457] krypton systemd[1]: systemd-cryptsetup@cr_root.service: Unit is stopped because bound to inactive unit dev-disk-by\x2duuid-3bf91f73\x2d1ee8\x2d4cfc\x2d9048\x2d93ba349b786d.device.
[ 68.945660] krypton systemd[1]: systemd-cryptsetup@cr_root.service: Trying to enqueue job systemd-cryptsetup@cr_root.service/stop/replace
[ 69.473459] krypton systemd[1]: systemd-cryptsetup@cr_root.service: Installed new job systemd-cryptsetup@cr_root.service/stop as 343
Avoid this by not setting the state of the backend devices to
DEVICE_DEAD.
analyze: use '' instead of the empty string when showing versions
It looks like garbled output… I didn't use shell-escape, because the other
characters that are special for the shell that are used in versions should
not be escaped.
basic/string-util: tweak strverscmp_improved() for some corner cases
So far we had the rule that '' == '', '0_' == '0', but '_' > ''. This means
that the general rule that strings are compared iteratively, and each
segment that compares equal can be dropped and the comparison resumes at
the following characters wasn't true in such cases. Similarly, '0~' < '0',
but after dropping the common segment, '~' > ''.
The special handling of empty strings is dropped, and '_' == '' and
'~' < ''.
Anita Zhang [Tue, 24 May 2022 17:51:27 +0000 (10:51 -0700)]
test-seccomp: check for CAP_IPC_OWNER before calling shmat()
shmat() requires the CAP_IPC_OWNER capability. When running test-seccomp
in environments with root + CAP_SYS_ADMIN, but not CAP_IPC_OWNER,
memory_deny_write_execute_shmat would fail. This fixes it.
kernel-install: ignore extra args passed when invoked as installkernel
kernel's 'make install' invokes install.sh which calls /sbin/install-kernel.
Thus we are invoked as e.g.
/sbin/installkernel 5.18.0 arch/x86/boot/bzImage System.map /boot
The last two arguments would be passed as "initrds".
Before , we would just quitely ignore
/boot, because it doesn't pass the 'test -f' test, and possibly try to do
something with System.map. 742561efbe938c45936f2e4f5d81b3ff6b352882 tightened
the check, so we now throw an error.
It seems that the correct thing is to ignore those two arguments, because
our plugin syntax has no notion of System.map. And the installation directory
we can figure out ourselves better. Effectively, this makes things behave
like before, but less by accident.
docs/BLS: rework the description of directory layout
We said "`$BOOT/loader/` is the directory containing all files needed
for Type #1 entries" which is blatantly wrong. And also saying that we
define two directories, /loader and /loader/entries, but only ever defining
the second one was not very consistent.
Instead, let's say that /loader/ is for "boot loader configuration", and
/loader/entries has the snippets. A new section about /<entry-token>/<version>/
is added. This is described as the "recommended layout for additional files".
Also, we said that ID= should be used in the file name, but in fact it
wasn't in the example that was given, and afaik, nobody ever did that. So
this part is reduced to say "kernel version (as returned by `uname -r`,
including the OS identifier)". AFAIK, all distros include some form of
OS identifier in the version, so this should be good enough.
Since we now don't depend on autodetection (e.g. with entry-token and layout
configured), the installed doesn't need to always create /loader/entries and
things will still work. So don't say that the installer needs to create it.
Part of the discussion is moved to the Discussion section.
Overall, this brings the specification more in line with actual practice.
docs: reworder/rewrite BLS to read more like a specification
I tried not to introduce any semantic changes, but to reorder the whole
text to be more usable as a reference specification: more sections are
created and the discussion and justifications are moved to the end.
Also, "BIOS" is changed to "firmware" in various places, and other parts
of the text that made sense when this was originally written are now dated
are adjusted. I separated and extended the examples a bit.
The abstract at the top ("TL;DR: Currently there’s no common boot scheme…")
is dropped. It didn't seem to fit anywhere.