docs: add a new document describing the VM interface of systemd
This mirrors the existing CONTAINER_INTERFACE.md document, but describes
extension points of systemd running in a VM with a machine manager
supervising it.
manager: send an sd_notify() message informing the container manager when systemd's special UNIX signals become available
From the outside it's difficult to determine whether (and when) the PID1
inside a container supports systemd's more complete set of UNIX process
signals or not. Let's make this easier, and simply send a notification
message when we are ready.
This new passive target is supposed to be pulled in by SSH
implementations and should be reached when remote SSH access is
possible. The idea is that this target can be used as an indicator for
other components to determine if and when SSH access is possible.
One specific use case for this is the new sd_notify() logic in PID 1 that
sends its own supervisor notifications whenever target units are
reached. This can be used to precisely schedule SSH connections from
host to VM/container, or just to identify systems where SSH is even
available.
core: notify supervisor over targets we reach, as we reach them
Let's inform the supervisor about various happenings of our service
manager, specifically the boot milestones we reach.
We so far have only a singular READY=1 message, to inform about bootup
completion. But sometimes it is interesting to have something more
fine-grained, in particular something that indicates optional components
that have been activated.
Use case for this: in a later PR I intend to introduce a generic
"ssh.target" that is supposed to be activated when SSH becomes available
on a host. A supervisor (i.e. a VMM/hypervisor/container mgr/…) can
watch for that, and know two things:
1. that SSH is generally available in the system
2. when it is available
In order to not flood the supervisor with events I only send these out
for target units. We could open this up later, in theory, but I think it
makes sense to tell people instead to define clear milestone target
units if they want a supervisor to be able to track system state.
machine-id-setup: inform supervisor about chosen machine ID
Similar to the previous commit, it's useful for a supervisor to know
what machine ID we settled on, in particular as various other things
are deterministically derived from it, for example MAC addresses and
such.
hostname-setup: send chosen hostname to supervisor via sd_notify()
Once we have decided on a hostname, let's tell the supervisor about it. This
is useful for example in order to recognize the system via mDNS/LLMNR or
in a DHCP lease.
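
As a rough illustration of the supervisor side of the notifications
described above, here is a minimal sketch of parsing sd_notify()
datagrams, which are newline-separated KEY=VALUE assignments. READY=1
comes from the text above; the field name X_SYSTEMD_UNIT_ACTIVE and both
helper functions are assumptions made for this example, since the commit
messages do not name the fields that are sent.

```python
def parse_notify_datagram(payload: bytes) -> dict:
    """Split one sd_notify() datagram into its KEY=VALUE assignments."""
    fields = {}
    for line in payload.decode("utf-8", errors="replace").splitlines():
        key, sep, value = line.partition("=")
        if sep and key:
            # Later assignments of the same key override earlier ones.
            fields[key] = value
    return fields


def units_reached(datagrams, key="X_SYSTEMD_UNIT_ACTIVE"):
    """Collect unit names announced via the given field, in arrival order.

    The default field name is an assumption, not taken from the commit text.
    """
    reached = []
    for payload in datagrams:
        unit = parse_notify_datagram(payload).get(key)
        if unit is not None:
            reached.append(unit)
    return reached
```

A supervisor would typically feed this from an AF_UNIX datagram socket
whose address it passed to the guest, and start an SSH connection only
once "ssh.target" shows up in the list.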
Piotr Drąg [Thu, 14 Mar 2024 12:50:12 +0000 (13:50 +0100)]
po: add pkg/debian to POTFILES.skip
Debian packaging includes the exploded tarball, so scripts used to
detect files that should be in POTFILES.in, like intltool-update -m
used on https://l10n.gnome.org/module/systemd/, falsely detect its
files as needing translation. Avoid this behavior by putting
the whole submodule in POTFILES.skip.
"Starting Boot Control…" would be a fairly confusing message in the boot logs.
Use "… Service" to mirror what we have in other services like
systemd-{hostnamed,timedated,portabled,machined,…}.service.
We generally don't specify the protocol implementation in unit descriptions.
For journald, we have:
$ git grep Description 'units/*journald*'
units/systemd-journald-audit.socket:Description=Journal Audit Socket
units/systemd-journald-dev-log.socket:Description=Journal Socket (/dev/log)
units/systemd-journald-varlink@.socket:Description=Journal Varlink Socket for Namespace %i
units/systemd-journald.service.in:Description=Journal Service
units/systemd-journald.socket:Description=Journal Sockets
units/systemd-journald@.service.in:Description=Journal Service for Namespace %i
units/systemd-journald@.socket:Description=Journal Sockets for Namespace %i
so we need to keep "Varlink" in the name. But also use "Sockets" (plural)
for the "main" socket unit, since it opens multiple sockets.
RuntimeError is documented as "Unspecified run-time error". It doesn't make
much sense for Python. (It originated in Java, where exceptions that can be
thrown by a function are declared in the function signature. All code calling
such a function must either explicitly catch all possible exception types, or
allow them to propagate by listing them in its own exception type list. This is
nice in theory, but in practice very annoying. Especially during development,
when the list of possible exception types is not finalized, we would end up
adding and removing exceptions in function signatures all the time. Also, for
code which is designed to call functions recursively, we would soon end up with
all functions declaring all possible exception types… To avoid this, people
would quite often do fake handling with a block that either prints and ignores
an exception, or has just a comment like "fix me later", or even nothing. This
often led to people forgetting to adjust this later on and production code
containing such constructs. An escape hatch was opened with RuntimeException and
its subclasses, which do not need to be pre-declared. Various memory-related
exceptions were added as subclasses of RuntimeException. But later on, people
started using this to avoid having to declare all exception types everywhere.)
In Python, exceptions do not have to be pre-declared, and for code which just
encounters a failure, we should raise a specific exception type. The catch-all
class for unexpected input is ValueError.
For https://github.com/systemd/systemd/issues/31637:
BadSectionError: Section '.data' @0x28000 overlaps previous section @0x28000+0x300=@0x28300
Also, exception strings should not contain trailing periods, because they are
often embedded in sentences.
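
A minimal sketch of the convention described above: a specific exception
type derived from ValueError, with a message that has no trailing period.
The name BadSectionError and the message format are taken from the example
above; the base class, the check_no_overlap() helper and its
(name, address, size) tuples are assumptions made for illustration and are
not necessarily how tools/elf2efi is structured.

```python
class BadSectionError(ValueError):
    """A section is laid out in a way we cannot convert."""


def check_no_overlap(sections):
    """Verify that each section starts at or after the end of the previous one.

    `sections` is an iterable of (name, address, size) tuples in file order.
    """
    prev = None
    for name, addr, size in sections:
        if prev is not None:
            prev_addr, prev_size = prev
            prev_end = prev_addr + prev_size
            if addr < prev_end:
                # No trailing period: callers may embed this in a sentence.
                raise BadSectionError(
                    f"Section {name!r} @0x{addr:x} overlaps previous section "
                    f"@0x{prev_addr:x}+0x{prev_size:x}=@0x{prev_end:x}"
                )
        prev = (addr, size)
```

Callers that only care about "bad input" can catch ValueError, while
callers that want to handle this case specially can catch
BadSectionError.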
tools/elf2efi: align columns in tables, unify formatting
For tables which represent binary data structures, readability is greatly
enhanced if the part which shows field size and type is aligned. This follows
the usual style for tables in the rest of the systemd codebase.
Also, use the same style for functions: if the function signature is too long
to fit in one line, put each parameter on a separate line.
Also, for comprehension expressions, if they are split, use the usual Python
style.
Also, drop format annotations, since the code isn't automatically formatted
anymore, and automatic formatting is neither feasible nor a goal for the
systemd codebase.
I was looking at the logs in some bug and saw this:
Mar 13 15:55:12 fedora systemd[1]: systemd-pcrmachine.service - TPM2 PCR Machine ID Measurement was skipped because of an unmet condition check (ConditionSecurity=measured-uki).
Mar 13 15:55:12 fedora systemd[1]: Starting systemd-remount-fs.service - Remount Root and Kernel File Systems...
Mar 13 15:55:12 fedora systemd[1]: systemd-tpm2-setup-early.service - TPM2 SRK Setup (Early) was skipped because of an unmet condition check (ConditionSecurity=measured-uki).
This is overly technical: for most units we don't provide this level of
detail about the implementation. So retitle the units to be more accessible.
Also, the fact that it's version 2 of the TPM is not that important. We don't
support TPM 1.2, but computers without TPM v2 are getting rare. For other
units we don't advertise the version of hardware, and let's not do this here,
to reduce some complexity.
Daan De Meyer [Wed, 13 Mar 2024 13:18:03 +0000 (14:18 +0100)]
mkosi: Enable KVM
Since https://github.blog/2024-01-17-github-hosted-runners-double-the-power-for-open-source/,
it seems that KVM is supported on GA runners, so let's explicitly
enable it to make sure it is used.
We update mkosi to latest and set QemuFirmware=uefi to disable
secure boot which crashes qemu until https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2038777
is fixed.
The branch with configure_file() was broken: meson doesn't know that
this file is a prerequisite for other targets, so partial rebuilds were broken.
Easy reproducer:
git mv .git{,.no}
touch meson.build && ninja -C build src/basic/libbasic.a
rm build/version.h
ninja -C build src/basic/libbasic.a
Using vcs_tag() also in that case makes meson always build the file.
(Combined with the issue fixed in previous commit, I was encountering
failed builds quite often.)
With git-worktree, .git is just a file that specifies where
the parent git directory is. All the git information is available
in a git worktree, so it should be treated the same as a checkout
with a .git directory.
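
To illustrate the distinction, here is a small sketch of a check that
accepts both layouts. The helper name is_git_checkout() is hypothetical
and this is not the logic used in the build system; it only relies on the
fact that in a linked worktree .git is a text file starting with
"gitdir:".

```python
from pathlib import Path


def is_git_checkout(source_root: Path) -> bool:
    """Return True for a regular checkout and for a git worktree."""
    dot_git = source_root / ".git"
    if dot_git.is_dir():
        # Regular checkout: .git is the repository directory itself.
        return True
    if dot_git.is_file():
        # Linked worktree: .git is a file of the form "gitdir: <path>".
        content = dot_git.read_text(encoding="utf-8", errors="replace")
        return content.startswith("gitdir:")
    return False
```

A check that only tests for a .git directory would wrongly treat a
worktree like an unpacked tarball.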
Daan De Meyer [Wed, 13 Mar 2024 09:26:52 +0000 (10:26 +0100)]
units: Bump various oneshot unit timeouts to 90s
In mkosi, we've been having CI failures caused by
systemd-machine-id-commit.service timing out. Let's bump the timeout
for it and systemd-rfkill.service to 90s which we also use for other
oneshot services to avoid transient failures on slower systems.
Daan De Meyer [Fri, 8 Mar 2024 10:33:25 +0000 (11:33 +0100)]
mkosi: Introduce packaging sources as submodules
By always cloning the latest branch commit, we can't bisect properly
using mkosi: when bisecting, wildly different packaging sources will
be used compared to when the commit was merged. By using submodules, we
track individual commits, which means that when bisecting, the same
packaging sources will be used.
We use git submodules as dependabot has support for automatically making
PRs to update git submodules. This commit also includes the necessary
dependabot configuration to enable this.
We make ubuntu/debian use the same submodule instead of adding the debian
packaging sources twice by introducing a new $PKG_SUBDIR environment variable
and using it instead of $DISTRIBUTION.
polkit: add another flag that controls how to treat the PK absent case
Typically if PK is not present we want to treat this as "denied". But
sometimes it makes sense to treat this case as "allowed".
In particular the combination POLKIT_ALWAYS_QUERY and
POLKIT_DEFAULT_ALLOW makes a lot of sense: it means we can enable PK
logic for actions where we so far bypassed the checks for root. With the
new combination we can have a default policy of allowing some operation
but still provide an effective hook to disable it.
Also add some debug logging about PK operations and results as they are ongoing.
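
The interaction of the two flags can be modelled roughly as follows. This
is a simplified sketch, not the actual C implementation: the PolkitFlags
enum, the decide() function and its parameters are hypothetical, and the
root bypass is inferred from the description above ("actions where we so
far bypassed the checks for root").

```python
from enum import Flag, auto


class PolkitFlags(Flag):
    NONE = 0
    ALWAYS_QUERY = auto()   # ask PK even if the caller is root
    DEFAULT_ALLOW = auto()  # treat "PK not present" as allowed


def decide(flags, caller_is_root, polkit_available, query):
    """Return True if the action is allowed.

    `query` is a callable that asks PK and returns its verdict as a bool.
    """
    if caller_is_root and not (flags & PolkitFlags.ALWAYS_QUERY):
        # Traditional shortcut: root is allowed without consulting PK.
        return True
    if not polkit_available:
        # PK absent: denied unless the caller opted into default-allow.
        return bool(flags & PolkitFlags.DEFAULT_ALLOW)
    return query()
```

With both flags set, an operation is allowed by default on systems
without PK, but a PK policy can still deny it, even for root.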
polkit: allow checking if we already acquired some action
This adds a new helper that basically just wraps
async_polkit_query_have_action() and allows calling this without
actually triggering a PK authentication operation: it just checks if we
already have acquired an action or not.
Yu Watanabe [Thu, 29 Feb 2024 04:06:31 +0000 (13:06 +0900)]
sd-ndisc-router: adjust function names and type of returned value
- prefix length and preference should fit in uint8_t, and indeed
the kernel and networkd use uint8_t to store them.
- captive portal is now stored as a NUL-terminated string. Hence, it
is not necessary to also provide its length.