Yu Watanabe [Mon, 14 Mar 2022 13:02:37 +0000 (22:02 +0900)]
test: wait for loopback device being actually created
It seems there exists a short time period that we cannot see the
loopback device after `losetup` is finished:
```
testsuite-58.sh[367]: ++ losetup -b 1024 -P --show -f /tmp/testsuite-58-sector-1024.img
kernel: loop1: detected capacity change from 0 to 204800
testsuite-58.sh[285]: + LOOP=/dev/loop1
testsuite-58.sh[285]: + systemd-repart --pretty=yes --definitions=/tmp/testsuite-58-sector/ --seed=750b6cd5c4ae4012a15e7be3c29e6a47 --empty=require --dry-run=no /dev/loop1
testsuite-58.sh[368]: Device '/dev/loop1' has no dm-crypt/dm-verity device, no need to look for underlying block device.
testsuite-58.sh[368]: Failed to determine canonical path for '/dev/loop1': No such file or directory
testsuite-58.sh[368]: Failed to open file or determine backing device of /dev/loop1: No such file or directory
```
Yu Watanabe [Sun, 13 Mar 2022 12:38:10 +0000 (21:38 +0900)]
test: use /var/tmp for storing disk images
The Ubuntu CI on ppc64el seems to have a issue on tmpfs, and files
may not be fsynced. See c10caebb98803b812ebc4dd6cdeaab2ca17826d7.
For safety, let's use /var/tmp to store disk images.
Vivien Didelot [Mon, 14 Mar 2022 20:34:57 +0000 (16:34 -0400)]
units: fix factory-reset.target description
The current description for the factory reset target does not add any
value and doesn't respect the definition of the related property as
described in systemd.unit(5).
Starting the target currently results in the following log:
[ 11.139174] systemd[1]: Reached target Target that triggers factory reset. Does nothing by default..
[ OK ] Reached target Target that…set. Does nothing by default..
Simply update the target description to "Factory Reset".
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
/dev/urandom is seeded with RDRAND. Calling genuine_random_bytes(...,
..., 0) will use /dev/urandom as a last resort. Hence, we gain nothing
here by having our own RDRAND wrapper, because /dev/urandom already is
based on RDRAND output, even before /dev/urandom has fully initialized.
Furthermore, RDRAND is not actually fast! And on each successive
generation of new x86 CPUs, from both AMD and Intel, it just gets
slower.
This commit simplifies things by just using /dev/urandom in cases where
we before might use RDRAND, since /dev/urandom will always have RDRAND
mixed in as part of it.
And above where I say "/dev/urandom", what I actually mean is
GRND_INSECURE, which is the same thing but won't generate warnings in
dmesg.
macro: handle DECIMAL_STR_MAX() special cases more accurately
So far DECIMAL_STR_MAX() overestimated the types in two ways: it would
also adds space for a "-" for unsigned types.
And it would always return the same size for 64bit values regardless of
signedness, even though the longest maximum numbers for signed and
unsigned differ in length by one digit. i.e. 2^64-1 (i.e. UINT64_MAX) is
one decimal digit longer than -2^63 (INT64_MIN) - for the other integer
widths the number of digits in the "longest" decimal value is always the
same, regardless of signedness. by example: strlen("65535") ==
strlen("32768") (i.e. the relevant 16 bit limits) holds — and similar
for 8bit and 32bit integer width limits — but
strlen("18446744073709551615") > strlen("9223372036854775808") (i.e. the
relevant 64 bit limits).
Franck Bui [Mon, 14 Mar 2022 17:03:02 +0000 (18:03 +0100)]
journal: preserve acls when rotating user journals with NOCOW attribute set
When restoring the COW flag for journals on BTRFS, the full journal contents
are copied into new files. But during these operations, the acls of the
previous files were lost and users were not able to access to their old
journal contents anymore.
Frantisek Sumsal [Sun, 13 Mar 2022 13:45:03 +0000 (14:45 +0100)]
macro: account for negative values in DECIMAL_STR_WIDTH()
With negative numbers we wouldn't account for the minus sign, thus
returning a string with one character too short, triggering buffer
overflows in certain situations.
1. The "entry-token" concept already introduced in kernel-install is now
made use of. i.e. specifically there's a new option --entry-token=
that can be used to explicitly select by which ID to identify boot
loader entries: the machine ID, or some OS ID (ID= or IMAGE_ID= from
/etc/os-release, or even some completely different string. The
selected string is then persisted to /etc/kernel/entry-token, so that
kernel-install can find it there.
2. The --make-machine-id-directory= switch is renamed to
--make-entry-directory= since after all it's not necessarily the
machine ID the dir is named after, but can be any other string as
selected by the entry token.
3. This drops all code to make automatic changes to /etc/machine-info.
Specifically, the KERNEL_INSTALL_MACHINE_ID= field is now more
generically implemented in /etc/kernel/entry-token described above,
hence no need to place it at two locations. And the
KERNEL_INSTALL_LAYOUT= field is not configurable by user switch or
similar anyway in bootctl, but only read from
/etc/kernel/install.conf, and hence copying it from one configuration
file to another appears unnecessary, the second copy is fully
redundant. Note that this just drops writing these fields, they'll
still be honoured when already set.
This drops documentation of KERNEL_INSTALL_MACHINE_ID as machine-info
field (though we'll still read it for compat).
This updates the kernel-install man page to always say "ENTRY-TOKEN"
instead of "MACHINE-ID" where appropriate, to clear the confusion up
between the two.
This also tries to fix how we denote env vars (always prefix with $ and
without = suffix), and other vars (without $ but with = suffix)
kernel-install: search harder for kernel image/initrd drop-in dir
If not explicitly configured, let's search a bit harder for the
ENTRY_TOKEN, and let's try the machine ID, the IMAGE_ID and ID fields of
/etc/os-release and finally "Default", all below potential $XBOOTLDR.
kernel-install: only generate systemd.boot_id= in kernel command line if used for naming the boot loader spec files/dirs
Now that we can distinguish the naming of the boot loader spec
dirs/files and the machine ID let's tweak the logic for suffixing the
kernel cmdline with systemd.boot_id=: let's only do that when we
actually need the boot ID for naming these dirs/files. If we don't,
let's not bother.
This should be beneficial for "golden" images that shall not carry any
machine IDs at all, i.e acquire their identity only once the final
userspace is actually reached.
kernel-install: add a new $ENTRY_TOKEN variable for naming boot entries
This cleans up naming of boot loader spec boot entries a bit (i.e. the
naming of the .conf snippet files, and the directory in $BOOT where the
kernel images and initrds are placed), and isolates it from the actual machine
ID concept.
Previously there was a sinlge concept for both things, because typically
the entries are just named after the machine ID. However one could also
use a different identifier, i.e. not a 128bit ID in which cases issues
pop up everywhere. For example, the "machine-id" field in the generated
snippets would not be a machine ID anymore, and the newly added
systemd.machine_id= kernel parameter would possibly get passed invalid
data.
Hence clean this up:
$MACHINE_ID → always a valid 128bit ID.
$ENTRY_TOKEN → usually the $MACHINE_ID but can be any other string too.
This is used to name the directory to put kernels/initrds in. It's also
used for naming the *.conf snippets that implement the Boot Loader Type
1 spec.
kernel-install: don't try to persist used machine ID locally
This reworks the how machine ID used by the boot loader spec snippet
generation logic. Instead of persisting it automatically to /etc/ we'll
append it via systemd.machined_id= to the kernel command line, and thus
persist it in the generated boot loader spec snippets instead. This has
nice benefits:
1. We do not collide with read-only root
2. The machine ID remains stable across factory reset, so that we can
safely recognize the path in $BOOT we drop our kernel images in
again, i.e. kernel updates will work correctly and safely across
kernel factory resets.
3. Previously regular systems had different machine IDs while in
initrd and after booting into the host system. With this change
they will now have the same.
This then drops implicit persisting of KERNEL_INSTALL_MACHINE_ID, as its
unnecessary then. The field is still honoured though, for compat
reasons.
This also drops the "Default" fallback previously used, as it actually
is without effect, the randomized ID generation already took precedence
in all cases. This means $MACHNE_ID/KERNEL_INSTALL_MACHINE_ID are now
guaranteed to look like a proper machine ID, which is useful for us,
given you need it that way to be able to pass it to the
systemd.machine_id= kernel command line option.
Yu Watanabe [Fri, 11 Mar 2022 06:13:23 +0000 (15:13 +0900)]
meson: move to c_std=gnu11
Recently, the kernel communitiy started to discuss to move C11 (gnu11) [1],
and it seems to come near future.
Let's also move to c_std=gnu11. Unlike the kernel, we already uses
gnu99, hence hopefully we can move to C11 without changing anything.
Yu Watanabe [Mon, 28 Feb 2022 01:55:51 +0000 (10:55 +0900)]
network: re-design request queue
This makes Request object takes hash, compare, free, and process functions.
With this change, the logic in networkd-queue.c can be mostly
independent of the type of the request or the object (e.g. Address) assigned
to the request, and it becomes simpler.
Yu Watanabe [Mon, 28 Feb 2022 02:15:01 +0000 (11:15 +0900)]
network: refuse to configure link properties when in initialized state
The condition should be satisfied only when users request to reconfigure
the link, and in that case, all request will be cancelled. Hence, it is
not necessary to process the request.
Yu Watanabe [Mon, 28 Feb 2022 00:20:42 +0000 (09:20 +0900)]
network: introduce request_call_netlink_async()
In most netlink handlers, we do the following,
1. decrease the message counter,
2. check the link state,
3. error handling,
4. update link state via e.g. link_check_ready().
The first two steps are mostly common, hence let's extract it.
Moreover, this is not only extracting the common logic, but provide a
strong advantage; `request_call_netlink_async()` assigns the relevant
Request object to the userdata of the netlink slot, and the request object
has full information about the message we sent. Hence, in the future,
netlink handler can print more detailed error message. E.g. when
an address is failed to configure, then currently we only show an
address is failed to configure, but with this commit, potentially we can
show which address is failed explicitly.
This does not change such error handling yet. But let's do that later.
Yu Watanabe [Sun, 27 Feb 2022 06:39:16 +0000 (15:39 +0900)]
network: make Request object take Manager*
Previously, even though all Request object are owned by Manager, they
do not have direct reference to Manager, but through Link or NetDev
object. But, as Link or NetDev can be NULL, we need to conditionalize
how to access Manager from Request with the type of the request.
This makes the way simpler, as now Request object has direct reference
to Manager.
This also rename request_drop() -> request_detach(), as in the previous
commit, the reference counter is introduced, so even if a reference of
a Request object from Manager is dropped, the object may still alive.
The naming `request_drop()` sounds the object will freed by the
function. But it may not. And `request_detach()` suggests the object
will not be managed by Manager any more, and I think it is more
appropreate.
This is just a cleanup, and should not change any behavior.
Yu Watanabe [Sun, 27 Feb 2022 06:18:01 +0000 (15:18 +0900)]
network: introduce reference counter for Request object
Currently, all Request object are always owned by Manager, and freed
when it is processed, especially, soon after a netlink message is sent.
So, it is not necessary to introduce the reference counter.
In a later commit, the Request object will _not_ be freed at the time
when a netlink message is sent, but assigned to the relevant netlink
slot as a userdata, and will be freed when a reply is received. So, the
owner of the Request object is changed in its lifetime. In that case, it
is convenient that the object has reference counter to avoid memleak or
double free.
Yu Watanabe [Sat, 26 Feb 2022 06:56:39 +0000 (15:56 +0900)]
network: make request_process_address() and friends take Link and corresponding object
This also renames e.g. request_process_address() -> address_process_request().
Also, this drops type checks such as `assert(req->type == REQUEST_TYPE_ADDRESS)`,
as in the later commits, the function of processing request, e.g.
`address_process_request()`, will be assigned to the Request object when
it is created. And the request type will be used to distinguish and to
avoid deduplicating requests which do not have any assigned objects,
like REQUEST_TYPE_DHCP4_CLIENT. Hence, the type checks in process functions
are mostly not necessary and redundant.
This is mostly cleanups and preparation for later commits, and should
not change any behavior.
get_pretty_hostname() so far had semantics not in line with our usual
ones: the return parameter was actually freed before the return string
written into it, because that's what parse_env_file() does. Moreover,
when the value was not set it would return NULL but succeed.
Let's normalize this, and only fill in the return value if there's
something set, and never read from it, like we usually do with return
parameter, and in particular those named "ret_xyz".
The existing callers don't really care about the differences, but it's
nicer to normalize behaviour to minimize surprises.
Frantisek Sumsal [Thu, 10 Mar 2022 14:18:45 +0000 (15:18 +0100)]
cgls: mangle user-provided unit names
so the CLI interface is now similar to `systemctl`, i.e. if no unit name
suffix is provided, assume `.service`.
Fixes: #20492
Before:
```
$ systemd-cgls --unit user@1000
Failed to query unit control group path: Invalid argument
Failed to list cgroup tree: Invalid argument
```
Luca Boccassi [Thu, 10 Mar 2022 01:30:08 +0000 (01:30 +0000)]
core: support ExtensionDirectories in user manager
Unprivileged overlayfs is supported since Linux 5.11. The only
change needed to get ExtensionDirectories to work is to avoid
hard-coding the staging directory to the system manager runtime
directory, everything else just works (TM).
This change does two things: raise the default limit for nspawn
containers (where we try to mimic closely what the kernel does), and
bump it when running on old kernels which still have the lower setting.
The first ExecStartPre or the first ExecStart commands would get the metadata,
but not the subsequent ones. Also check that we do not pass it in
ExecStartPost.
manager: prevent cleanup of triggering units before we start the handler
This fixes the following case:
OnFailure= would be spawned correctly, but OnSuccess= would be
spawned without the MONITOR_* metadata, because we'd "collect" the unit
that started successfully. So let's block cleanup while we have a job
running for the handler. The job cannot last infinitely, so at some point
we'll be able to collect both.
We already logged what we are spawning, but not so much why. Let's
add this, so it's easier to distinguish execstartpre/execstart/execstartpost
and such.