Richard Phibel [Thu, 6 Jul 2023 12:03:35 +0000 (14:03 +0200)]
service: fix for RestartMode=direct option
With the fix done in PR28215, the unit restart job is created with type JOB_START.
Because of that, it is not properly merged anymore with the old one: the
merged job has state JOB_RUNNING. It should have state JOB_WAITING.
I think that the old job is not cleaned up because we don't go through the failed state.
With this fix, the merged job is properly created with state JOB_WAITING.
Richard Phibel [Thu, 6 Jul 2023 12:33:52 +0000 (14:33 +0200)]
service: add new RestartMode option
When this option is set to direct, the service restarts without entering a failed
state. Dependent units are not notified of transitory failure.
This is useful for the following use case:
We have a target with Requires=my-service, After=my-service.
my-service.service is a oneshot service and has Restart=on-failure in
its definition.
my-service.service can get stuck for various reasons and time out, in
which case it is restarted. Currently, when it fails the first time, the
target fails, even though my-service is restarted.
The behavior we're looking for is that, as long as my-service is still
being restarted, the target stays pending, waiting for my-service.service
to start successfully or to fail without being restarted anymore.
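The use case above could be configured roughly as follows. This is only a sketch: the target name, the ExecStart= path and the timeout value are hypothetical placeholders, and the two fragments would live in separate unit files.

```ini
# my-target.target (hypothetical name)
[Unit]
Requires=my-service.service
After=my-service.service

# my-service.service
[Service]
Type=oneshot
# Hypothetical command and timeout, for illustration only
ExecStart=/usr/local/bin/my-service-task
TimeoutStartSec=30
Restart=on-failure
RestartMode=direct
```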
The article "a" goes before consonant sounds and "an" goes before vowel
sounds. This commit changes "an" to "a" for UKI, UDP, UTF-8, URL, UUID, U-Label, UI
and USB, since they start with the sound /ˌjuː/.
Let's cast these floats explicitly to usec_t, since implicit
float-to-integer casts are dangerous business, and we should underline
that there's a cast happening here.
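A minimal sketch of such an explicit cast follows. The usec_t typedef and USEC_PER_SEC mirror systemd's definitions; seconds_to_usec() is a hypothetical helper used for illustration and is not part of the commit.

```c
#include <stdint.h>

typedef uint64_t usec_t;
#define USEC_PER_SEC ((usec_t) 1000000ULL)

/* Hypothetical helper, for illustration only. */
usec_t seconds_to_usec(double seconds) {
        /* The multiplication is carried out in floating point. The explicit
         * cast makes the truncating float-to-integer conversion visible.
         * Callers must ensure the value is non-negative and fits into
         * usec_t, since out-of-range conversions are undefined in C. */
        return (usec_t) (seconds * USEC_PER_SEC);
}
```

With this helper, seconds_to_usec(1.5) yields 1500000.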
network: handle captive portal with multiple routers
Before this patch, if a network had multiple routers and one of them
provided a captive portal, then the portal was overwritten or cleared
when another RA from another router was received.
This makes captive portals managed in a similar way to DNS servers or
DNS domains. So now a captive portal can be safely handled even if a network
has multiple routers.
For confidential computing they want to be able to revoke initrds too, so allow
passing a specific --sbat section when building a UKI too, not just an addon.
Merge it with the stub and kernel sections.
While we don't strictly follow the rule, most of our userspace names
these fields that count entries in some array n_xyz, hence let's do so
in the EFI boot code too, to make things less special.
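The naming convention might look like this in practice. The structure and function below are hypothetical and are not taken from the EFI boot code; they only illustrate pairing an array with an n_ prefixed counter.

```c
#include <stddef.h>

/* Hypothetical structure, for illustration only. */
typedef struct SketchMenu {
        const char **entries;   /* the array */
        size_t n_entries;       /* number of items in 'entries' */
} SketchMenu;

/* Counts the entries that are neither NULL nor the empty string. */
size_t sketch_menu_count_nonempty(const SketchMenu *m) {
        size_t n = 0;

        for (size_t i = 0; i < m->n_entries; i++)
                if (m->entries[i] && m->entries[i][0] != '\0')
                        n++;

        return n;
}
```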
test: change partition label to test if the outdated devlinks are removed
The change is intended to reproduce the issue #27983, though the
original issue is highly racy, and the test does not reproduce it
reliably. But, anyway, it is better to change the partition label to
test the devlink removal.
When the function is called, the device may already have been removed, and
another device may have the same syspath. Such a situation can occur when a
partition is removed and another is created. In that case, the sysfs paths
of the removed and newly created partitions can be the same, but their
devnums are different, and thus the database files corresponding to the
devices are also different.
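The distinction can be sketched as below. SketchDevice and sketch_same_device() are hypothetical names, not libsystemd API; the point is only that identity is decided by devnum, because a syspath can be reused by a newly created partition.

```c
#include <stdbool.h>
#include <string.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/* Hypothetical record, for illustration only. */
typedef struct SketchDevice {
        const char *syspath;
        dev_t devnum;
} SketchDevice;

/* Two records describe the same device only if major and minor numbers
 * match. An equal syspath alone is not sufficient. */
bool sketch_same_device(const SketchDevice *a, const SketchDevice *b) {
        return major(a->devnum) == major(b->devnum) &&
               minor(a->devnum) == minor(b->devnum);
}
```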
Let's make our units more robust to being added to an initrd:
1. systemd-boot-update only makes sense if sd-boot is available in /usr/
to copy into the ESP. This is generally not the case in initrds, and
even if it was, we shouldn't update the ESP from the initrd, but from
the host instead.
2. The rfkill services save/restore rfkill state, but that information
is only available once /var/ is mounted, which generally happens
after the initrd transition.
3. utmp management is partly in /var/, and legacy anyway, hence don't
bother with it in the initrd.
Before this commit, when a unit that is restarting propagates stop
to other units, it can also depend on them, which results in
job type conflict and thus failure to pull in the dependencies.
So, let's introduce a new dependency atom UNIT_ATOM_PROPAGATE_STOP_GRACEFUL,
and use it for PropagatesStopTo=. It will enqueue a restart job if
there's already a start job, which meets the ultimate goal and avoids
job type conflict.
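The decision described above can be sketched roughly as follows. The enum and function are simplified, hypothetical stand-ins and not the actual transaction code; the fallback to a stop job when no start job exists is an assumption based on the description.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for systemd's job types. */
typedef enum SketchJobType {
        SKETCH_JOB_START,
        SKETCH_JOB_STOP,
        SKETCH_JOB_RESTART,
} SketchJobType;

/* Picks the job to enqueue on a unit that a stop is propagated to.
 * If that unit already has a start job, a restart job still stops it
 * first but can be merged with the start job, avoiding a conflict. */
SketchJobType sketch_propagated_stop_job(bool has_start_job) {
        return has_start_job ? SKETCH_JOB_RESTART : SKETCH_JOB_STOP;
}
```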
This extends the test framework a bit, and allows adding additional
initrds to the qemu invocation, which we use here to place credentials
in the new /run/systemd/@initrd/ credentials dir which are then passed
to the host.
Let's add two new helpers: mount_credentials_fs() and
credentials_fs_mount_flags(). The former mounts a file system suitable
for storing unencrypted credentials at runtime (i.e. a ramfs or
tmpfs). The latter determines the right mount flags to use for such a
mount.
Both functions mostly just take code from execute.c, but make two
changes:
1. If the kernel supports it we'll use a tmpfs with the new "noswap"
mount option instead of ramfs. The option was added in kernel 6.4, hence is very
recent, but tmpfs is so much less crappy than ramfs, hence worth it.
2. We'll set MS_NOSYMFOLLOW on the mounts if supported. These file
systems should only contain regular files, hence no need to allow
symlinks.
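The two changes could be sketched as below. This is not the code from the commit: the function names, the exact set of base flags and the mount options other than "noswap" are assumptions chosen for illustration, and the caller is assumed to have probed MS_NOSYMFOLLOW support beforehand.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/mount.h>

#ifndef MS_NOSYMFOLLOW
#define MS_NOSYMFOLLOW 256 /* value from linux/mount.h, for older libc headers */
#endif

/* Hypothetical counterpart of the flags helper. The base flags are an
 * assumption. MS_NOSYMFOLLOW is only added if the caller found it to be
 * supported. */
unsigned long sketch_creds_mount_flags(bool read_only, bool nosymfollow_supported) {
        unsigned long flags = MS_NODEV | MS_NOEXEC | MS_NOSUID;

        if (nosymfollow_supported)
                flags |= MS_NOSYMFOLLOW;
        if (read_only)
                flags |= MS_RDONLY;

        return flags;
}

/* Hypothetical counterpart of the mount helper: prefer tmpfs with
 * "noswap", and fall back to ramfs if that mount fails (for example on
 * kernels that do not know the option). Returns 0 or a negative errno. */
int sketch_mount_creds_fs(const char *where, unsigned long flags) {
        if (mount("tmpfs", where, "tmpfs", flags, "mode=0700,noswap") >= 0)
                return 0;

        if (mount("ramfs", where, "ramfs", flags, "mode=0700") >= 0)
                return 0;

        return -errno;
}
```

Mounting requires privileges, so sketch_mount_creds_fs() returns a negative errno when called by an unprivileged process.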
creds-util: add new helper read_credential_with_decryption()
This is just like read_credential() but also looks into the encrypted
credential directory, not just the regular one.
Normally, we decrypt credentials at the moment we pass them to services.
From the service's point of view all credentials are hence decrypted credentials.
However, when we want to access credentials in a generator this logic
does not apply: here we have the regular and the encrypted credentials
directory. Hence, so far we didn't attempt to make use of credentials in
generators.
Let's address this and add a helper that looks into both directories, and talks
to the TPM if necessary to decrypt the credentials.
execute: fix credential dir handling for fs which support ACLs
When the credential dir is backed by an fs that supports ACLs we must be
more careful with adjusting the 'x' bit of the directory, as any chmod()
call on the dir will reset the mask entry of the ACL entirely which we
don't want. Hence, do a manual set of ACL changes, that only add/drop
the 'x' bit but otherwise leave the ACL as it is.
This matters if we use tmpfs rather than ramfs to store credentials.
This log message is shown pretty regularly at boot in various scenarios
(such as CI builds), and it's not a reason for any concern; it's just the
immediate effect of explicit configuration. Hence let's downgrade from
LOG_NOTICE to LOG_INFO so that it is still usually in the boot output,
but not particularly highlighted, since there's really no reason to.