Use OPTION_NAMESPACE() to keep the resolvectl and systemd-resolve
option sets separate. The resolvconf-compat path (resolvconf
invocation) keeps its own getopt-based parsing.
--help output has the expected changes to formatting. The synopsis
for [status] now shows that the verb is optional.
Co-developed-by: Claude Opus 4.7 <noreply@anthropic.com>
PR #41776 introduced the io.systemd.StorageProvider Varlink interface
and two backends ('block' exposes host block devices, 'fs' exposes
regular files / dirs / subvolumes under /var/lib/storage), plus the
storagectl(1) CLI to enumerate them. The only consumer so far was
mount.storage. This series wires up the first of the three integrations
called out in TODO.md:
systemd-vmspawn --bind-volume=PROVIDER:VOLUME[:CONFIG][:K=V,...]
Boot-time attach. Drives added this way are immutable at runtime.
io.systemd.MachineInstance.AddStorage / .RemoveStorage
Two new generic methods on the per-machine control socket. vmspawn
implements them (this series); systemd-nspawn will reuse the same
methods later.
machinectl bind-volume MACHINE PROVIDER:VOLUME[:CONFIG][:K=V,...]
machinectl unbind-volume MACHINE PROVIDER:VOLUME
Runtime hotplug front-end: machinectl Acquire()s the fd locally and
pushes it across to the target machine's MachineInstance socket.
Volumes are identified by a user-visible name "<provider>:<volume>"
(e.g. "block:/dev/sda"). The 3rd 'config' field is opaque to the shared layer
and interpreted per backend — vmspawn maps it to a DiskType from
disk_type_table[] (virtio-blk default, virtio-scsi, nvme, scsi-cd; same
vocabulary as --extra-drive); future nspawn will read it as a mount
path.
- Document the new --bind-volume= option in systemd-vmspawn(1) and
the new bind-volume / unbind-volume verbs in machinectl(1).
- Add an integration test
(TEST-87-AUX-UTILS-VM.bind-volume.sh) covering boot-time attach
via --bind-volume, runtime attach via 'machinectl bind-volume',
runtime detach via 'machinectl unbind-volume', the StorageImmutable
rejection of attempts to detach boot-time volumes, and the
NoSuchStorage rejection of detach on unknown names.
- Strike "hook-up in systemd-vmspawn" from TODO.md; the nspawn and
service-manager hookups remain.
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
For bind-volume, machinectl parses the SPEC with the shared
bind_volume_parse(), Acquires the storage volume from the named
provider on the machinectl side, locates the target machine's
io.systemd.MachineInstance control socket via
machine_get_control_address(), pushes the fd across, and calls
io.systemd.MachineInstance.AddStorage with name='<provider>:<volume>'
and the user-supplied config string.
For unbind-volume, machinectl just forwards the name string to
io.systemd.MachineInstance.RemoveStorage.
Volumes attached at machine startup (e.g. via systemd-vmspawn's
--bind-volume=) are rejected with StorageImmutable when the user
attempts to unbind them at runtime.
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
Wire up the runtime hotplug Varlink methods on the per-VM control
socket:
AddStorage → take fd from the link, look up the DiskType from the
'config' field, build a DriveInfo flagged
QMP_DRIVE_REMOVABLE, dispatch to
vmspawn_qmp_add_block_device(). Reply delivered async
by on_add_device_add_complete() once the guest sees
the device.
RemoveStorage → forward the user-visible name to
vmspawn_qmp_remove_block_device(); the existing
device_del / DEVICE_DELETED / blockdev-del chain
replies on the link.
Add SD_VARLINK_SERVER_ALLOW_FD_PASSING_INPUT to the server flags so
clients can push storage fds across via sd_varlink_push_fd().
Map -EEXIST → StorageExists and -EOPNOTSUPP/-EINVAL →
ConfigNotSupported in the AddStorage handler so callers see the
specific MachineInstance errors.
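The errno mapping above can be sketched as a small lookup table. This is
an illustrative Python model, not the C handler: the error names are the
ones listed in this series, while map_add_storage_error() and the
"no specific mapping" fallback are assumptions made for the example.

```python
import errno

# Hypothetical table mirroring the mapping described above.
ERRNO_TO_VARLINK_ERROR = {
    errno.EEXIST: "io.systemd.MachineInstance.StorageExists",
    errno.EOPNOTSUPP: "io.systemd.MachineInstance.ConfigNotSupported",
    errno.EINVAL: "io.systemd.MachineInstance.ConfigNotSupported",
}


def map_add_storage_error(r):
    """Translate a negative errno-style return value into an error name.

    Returns None for success (r >= 0) and for codes without a specific
    mapping, so the caller can fall back to a generic error.
    """
    if r >= 0:
        return None
    return ERRNO_TO_VARLINK_ERROR.get(-r)
```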
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
For each --bind-volume passed at startup, vmspawn calls Acquire() on
the named StorageProvider and attaches the resulting fd to the VM as
an additional drive. The drive is identified by the user-visible name
'<provider>:<volume>' on the bridge — that is also the handle used
later when machinectl unbind-volume detaches drives at runtime
(though boot-time drives like these are NOT removable; that is the
StorageImmutable behaviour added earlier).
The colon grammar is parsed by the shared bind_volume_parse() helper.
The 3rd 'config' field selects the guest device type from the
disk_type_table[] vocabulary (virtio-blk, virtio-scsi, nvme, scsi-cd);
empty defaults to virtio-blk per the TASK grammar.
Wiring lives next to the existing --extra-drive setup: parse_argv()
appends a parsed BindVolume to arg_bind_volumes, and prepare_device_info()
hands the array to vmspawn_bind_volume_prepare_boot() which Acquires
each volume and pushes a DriveInfo onto the existing drives array.
PCIe port assignment (assign_pcie_ports()) and the QMP setup loop pick
them up automatically.
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
This is vmspawn's per-backend code for the StorageProvider integration.
Other backends (future systemd-nspawn, future service-manager
BindVolume=) consume the same shared parser and Acquire helper but
each provides its own attach/detach glue; this is vmspawn's.
- disk_type_from_bind_volume_config() turns the opaque BindVolume
'config' field (e.g. "scsi-cd") into a DiskType. Empty defaults to
virtio-blk to match the --bind-volume CLI grammar.
- vmspawn_bind_volume_acquire() takes a parsed BindVolume, calls
storage_acquire_volume() for the fd, and builds a DriveInfo ready
for vmspawn_qmp_setup_drives() (boot) or vmspawn_qmp_add_block_device()
(hotplug). Rejects directory-typed volumes (vmspawn block devices
need a regular file or a host block device).
- vmspawn_bind_volume_attach_fd() is the runtime path: takes a fd
that was already pushed across by an AddStorage caller plus the
name+config it specified, builds the DriveInfo with
QMP_DRIVE_REMOVABLE set and a varlink link, and dispatches to
vmspawn_qmp_add_block_device(). Reply is delivered asynchronously
by the existing on_add_device_add_complete() callback.
- vmspawn_bind_volume_prepare_boot() is a thin loop the boot-time
path uses to populate DriveInfos.
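A rough Python model of the config-to-DiskType step, for illustration
only. The vocabulary is the one listed for disk_type_table[] in this
series; the function name and the use of ValueError (where the C code
would return a negative errno) are assumptions.

```python
# Vocabulary as listed in this series; the real table lives in C.
DISK_TYPES = ("virtio-blk", "virtio-scsi", "nvme", "scsi-cd")


def disk_type_from_config(config):
    """Map the opaque 'config' field to a disk type name.

    An empty or missing field selects virtio-blk. Unknown values raise
    ValueError in this sketch.
    """
    if not config:
        return "virtio-blk"
    if config not in DISK_TYPES:
        raise ValueError(f"unsupported disk type: {config!r}")
    return config
```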
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
vmspawn: track removability as a QmpDriveFlags bit and expose add_block_device
Drives attached at boot via the existing CLI options (--image,
--extra-drive) must not be detachable at runtime via the upcoming
RemoveStorage Varlink method, while drives added at runtime via
AddStorage must be. Track this distinction with a new QMP_DRIVE_REMOVABLE
property flag — placed alongside QMP_DRIVE_BLOCK_DEVICE, not in the
transient BlockDeviceStateFlags state-machine, since "may be removed"
is a permanent property of the drive.
vmspawn_qmp_remove_block_device() now early-rejects unknown ids with
io.systemd.MachineInstance.NoSuchStorage and immutable drives with
io.systemd.MachineInstance.StorageImmutable.
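The early-reject logic can be modelled roughly as follows. This is a
Python sketch under stated assumptions, not the vmspawn code: the flag
and error names follow this commit message, but the Bridge class and
its methods are hypothetical stand-ins for the drive registry.

```python
from enum import Flag, auto


class QmpDriveFlags(Flag):
    # Names follow the commit message; the values are illustrative.
    BLOCK_DEVICE = auto()
    REMOVABLE = auto()


class Bridge:
    """Toy stand-in for the drive registry, keyed by user-visible id."""

    def __init__(self):
        self.drives = {}

    def add(self, drive_id, flags):
        self.drives[drive_id] = flags

    def remove(self, drive_id):
        """Return None on success, or an error name on rejection."""
        flags = self.drives.get(drive_id)
        if flags is None:
            # Unknown id: reject before touching anything.
            return "io.systemd.MachineInstance.NoSuchStorage"
        if not flags & QmpDriveFlags.REMOVABLE:
            # Drive exists but was not added as removable.
            return "io.systemd.MachineInstance.StorageImmutable"
        del self.drives[drive_id]
        return None
```

Keeping removability as a property flag means the check is a single
bit test that does not depend on the drive's current state.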
vmspawn_qmp_add_block_device() loses its 'static' qualifier and gets a
declaration in the header, so the runtime hotplug path
(vmspawn-bind-volume.c, next) can dispatch into it directly.
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
shared: add AddStorage / RemoveStorage to io.systemd.MachineInstance
Define two new methods on the generic 'MachineInstance' Varlink
interface that systemd-vmspawn (this series) and (future)
systemd-nspawn implement on their per-machine control sockets:
AddStorage(fileDescriptorIndex, name, config?) -> ()
Attach a storage volume — the caller passes an fd previously
acquired from a StorageProvider, plus a unique name of the form
'<provider>:<volume>' that identifies this binding for later
removal, plus a backend-specific 'config' field (vmspawn: guest
device type; future nspawn: mount path).
RemoveStorage(name) -> ()
Detach a previously-added storage volume.
Plus errors NoSuchStorage, StorageExists, StorageImmutable (the volume
was attached at boot and cannot be removed), BadConfig, and
ConfigNotSupported. Names follow the io.systemd.StorageProvider
vocabulary (NoSuchVolume, BadTemplate, TypeNotSupported, etc.) so the
two interfaces are visually consistent.
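As an illustration of the method shapes above, the call messages could
be assembled like this. The parameter names come from the signatures in
this commit message; the helper names and the default fd index of 0 are
assumptions, and the sketch does not send anything.

```python
def add_storage_call(provider, volume, config=None, fd_index=0):
    """Build the message body of an AddStorage call.

    'config' is optional and is left out when not given. fd_index is the
    position of the volume fd among the fds passed with the message.
    """
    params = {"fileDescriptorIndex": fd_index, "name": f"{provider}:{volume}"}
    if config is not None:
        params["config"] = config
    return {
        "method": "io.systemd.MachineInstance.AddStorage",
        "parameters": params,
    }


def remove_storage_call(name):
    """Build the message body of a RemoveStorage call."""
    return {
        "method": "io.systemd.MachineInstance.RemoveStorage",
        "parameters": {"name": name},
    }
```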
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
storagectl: refactor mount.storage helper to use storage_acquire_volume()
Drop the inline socket-build + sd_varlink_callbo() + reply-dispatch
+ take_fd block from run_as_mount_helper() in favour of the shared
helper. Preserves the type-fallback retry (TypeNotSupported / WrongType
re-tries with requestAs="blk") and the per-error-id message mapping;
the helper just reports the io.systemd.StorageProvider.* error name
back to the caller.
Net effect: ~50 lines of dedup, no functional change.
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
storagectl's mount.storage helper bundles "open StorageProvider socket
+ Acquire() + dispatch reply + take fd" inline. Future consumers
(systemd-vmspawn boot-time --bind-volume, machinectl bind-volume) need
the same dance.
Factor it into a single libshared helper that takes the Acquire()
parameters by value and returns the fd plus the actual type/read-only
flags. Library code, so no logging — varlink errors are surfaced via
sd_varlink_error_to_errno() and the StorageProvider error_id is
returned to the caller via reterr_error_id (caller decides how to
format messages).
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
Add a universal parser for the colon-separated grammar
'PROVIDER:VOLUME[:CONFIG][:K=V,K=V,…]' that backs --bind-volume on
systemd-vmspawn (next), machinectl bind-volume, and the future nspawn
+ service-manager BindVolume= integrations.
The 'config' field is opaque to shared code and interpreted per
backend (vmspawn: a DiskType name, future nspawn: a mount path). The
trailing key=value list is parsed into the io.systemd.StorageProvider
.Acquire() parameters (template, create, read-only/ro, size/create-size
and request-as), with values validated against the existing
storage-util enums and validators. Provider/volume names are checked
with storage_provider_name_is_valid() and storage_volume_name_is_valid();
the combined "<provider>:<volume>" string is also validated as
string_is_safe so it is safe to use as a QEMU device id.
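A minimal Python sketch of the colon grammar, for illustration only. It
assumes the fields are strictly positional (the third field is always
CONFIG, possibly empty, and the fourth is the K=V list) and it skips
the name and value validation that the real parser performs; both the
function name and that positional reading are assumptions.

```python
def parse_bind_volume(spec):
    """Split 'PROVIDER:VOLUME[:CONFIG][:K=V,...]' into its parts.

    Returns (provider, volume, config, params), where config is "" when
    absent and params is a dict of the trailing key=value items.
    """
    fields = spec.split(":", 3)
    if len(fields) < 2 or not fields[0] or not fields[1]:
        raise ValueError("expected PROVIDER:VOLUME")
    provider, volume = fields[0], fields[1]
    config = fields[2] if len(fields) > 2 else ""
    params = {}
    if len(fields) > 3 and fields[3]:
        for item in fields[3].split(","):
            key, sep, value = item.partition("=")
            if not sep or not key:
                raise ValueError(f"malformed parameter: {item!r}")
            params[key] = value
    return provider, volume, config, params
```

For example, parse_bind_volume("block:/dev/sda") yields an empty config
and no parameters, so the backend default applies.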
Add a test-machine-util unit test covering the happy paths plus a
handful of malformed inputs.
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
The storage backend providers (block, fs) and storagectl currently each
extract storage-util.c into their target. Several upcoming consumers
(machine-util's BindVolume parser, vmspawn's hotplug glue, machinectl's
new bind-volume verbs) need the StorageProvider type/string-table
helpers and a future shared Acquire client helper.
Move storage-util.{c,h} to src/shared so libshared exports the symbols
once and every consumer (storage providers, storagectl, libshared
itself) picks them up by linking libshared. Drop the now-redundant
'extract'/'objects' wiring in src/storage/meson.build.
No code changes; this is purely a relocation.
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
The mount.storage helper open-codes the conventional 64K UID/GID
delegation block size as 0x10000 / 0x10000U in four places. Several
other places in the tree do the same (nspawn's arg_uid_range default,
homed's mount setup, …), but with no shared name.
Add USERNS_RANGE_SIZE in user-util.h alongside UID_NOBODY and friends,
and switch storagectl over to it. Other call sites can adopt it
incrementally.
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
test-homectl-prompts: add manual test to exercise prompt functionality
The prompt for groups is nice. The prompt for a shell could use some
love. Looking at this is much easier if we can invoke the code in
isolation.
I wrote this when looking at https://github.com/systemd/systemd/pull/41947,
where I wanted to see how the homectl prompt works with the changes.
Luca Boccassi [Tue, 5 May 2026 18:24:41 +0000 (19:24 +0100)]
test: make TEST-04-JOURNAL.journalctl-varlink more robust (#41953)
This test is sometimes flaky under sanitizers, and it does repeated
calls with the same parameters to run through different greps, and
the second one sometimes fails.
Store the result and grep it twice instead to try and reduce
flakiness.
Jonas Dreßler [Thu, 30 Apr 2026 18:27:43 +0000 (20:27 +0200)]
sysupdate: Ensure that end of the MatchPattern is matched correctly
An error snuck into the pattern parsing of the `MatchPattern` key in the
sysupdate transfer files. If there's two files "part1-v2.raw", and
"part1-v2.raw.tar" in the source folder, and MatchPattern="part1-@v.raw",
sysupdate will incorrectly choose "part1-v2.raw.tar" instead of
"part1-v2.raw".
While the pattern matching works perfectly fine, after the full pattern
is successfully matched to the string, we don't ensure that the string
actually ends when the pattern just did.
This means we can end up choosing a wrong file for the update, if the
filename/path happens to start with the same MatchPattern.
Fix it by ensuring the string ends after our match pattern ended.
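The difference between the two behaviours can be shown with a small
Python model. It is not the sysupdate matcher: it handles only the @v
wildcard by translating the pattern to a regular expression, and the
function names are made up for the example.

```python
import re


def pattern_to_regex(pattern):
    """Translate a pattern containing '@v' wildcards into regex source."""
    parts = [re.escape(p) for p in pattern.split("@v")]
    return "(.+?)".join(parts)


def matches_without_end_check(pattern, name):
    """Old behaviour: succeed once the pattern is consumed."""
    return re.match(pattern_to_regex(pattern), name) is not None


def matches_with_end_check(pattern, name):
    """Fixed behaviour: the name must also end where the pattern ends."""
    return re.fullmatch(pattern_to_regex(pattern), name) is not None
```

With the pattern "part1-@v.raw", the first function accepts both
"part1-v2.raw" and "part1-v2.raw.tar", while the second accepts only
"part1-v2.raw".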
Michael Vogt [Tue, 5 May 2026 12:55:18 +0000 (14:55 +0200)]
report: fold io.systemd.Basic facts into metrics
We removed the concept of facts, so we need to update the existing
io.systemd.Basic facts provider to metrics. This commit does just
that. It's mostly mechanical.
This also means that facts.{c,h} and varlink-io.systemd.Facts.{c,h}
are gone now.
Michael Vogt [Wed, 29 Apr 2026 15:52:50 +0000 (17:52 +0200)]
report: when a report fails, print the json error details
When a report upload fails the backend often provides useful
details via the varlink error. Show them as part of the upload
error message. For now we just dump the json because we have
no structure that the backends should follow. We may want to
consider adding one (like check for an "error_message" key in
the json). But for now this is a nice step forward.
report: upload reports using a "varlink socket directory"
Two new verbs are added: "generate" and "upload". The first one just
creates a "report", i.e. puts the metrics into a structured JSON object
that in the future is intended to carry additional data like a
signature:
The second verb can be used to upload or otherwise process the report.
It builds on the code added in 0a8560eed873a5f89487630a19db550fdbee3c15.
In /run/systemd/metrics-upload/ we expect a set of sockets. We'll call
out to each one of them. This allows the data to be processed in custom
ways, incl. writing to storage or sending over the network.
Each socket must provide a single interface:
io.systemd.Metrics.Upload {"report":$data}
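A rough sketch of the client side, under stated assumptions: it treats
io.systemd.Metrics.Upload as the method name to call, serializes the
call as a NUL-terminated JSON object as on the Varlink wire, and lists
the sockets found in the directory. The helper names are hypothetical,
and nothing is actually sent.

```python
import json
import os
import stat

UPLOAD_DIR = "/run/systemd/metrics-upload"  # directory named above


def encode_upload_call(report):
    """Serialize the upload call for one report, NUL-terminated."""
    message = {
        "method": "io.systemd.Metrics.Upload",
        "parameters": {"report": report},
    }
    return json.dumps(message).encode("utf-8") + b"\0"


def list_upload_sockets(directory=UPLOAD_DIR):
    """Return sorted paths of sockets in the directory, [] if it is missing."""
    try:
        names = sorted(os.listdir(directory))
    except FileNotFoundError:
        return []
    paths = (os.path.join(directory, n) for n in names)
    return [p for p in paths if stat.S_ISSOCK(os.stat(p).st_mode)]
```

Each path returned by list_upload_sockets() would receive the same
encoded message, so a failure of one backend need not affect the others.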
Luca Boccassi [Tue, 5 May 2026 15:50:40 +0000 (16:50 +0100)]
test: reduce number of identical io.systemd.JournalAccess.GetEntries calls
This test is sometimes flaky under sanitizers, and it does repeated
calls with the same parameters to run through different greps, and
the second one sometimes fails.
Store the result and grep it twice instead to try and reduce
flakiness.
terminal-util: when prompting for a choice from a list, preselect longest prefix
If all entries of a menu prompt start with the same prefix, let's
preselect the prefix to enhance user experience.
This is particularly relevant when prompting for a disk to install
things on, as typically they all start with the same prefix /dev/, and
if there's only a single target medium discoverable, then we can even
fill it out fully.
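The preselection rule amounts to taking the longest common prefix of
all entries. A small Python sketch (the function name is made up; the
C implementation differs):

```python
import os


def preselect(entries):
    """Return the text to pre-fill the prompt with.

    os.path.commonprefix() compares character by character, so a single
    entry is returned in full and an empty list yields "".
    """
    return os.path.commonprefix(list(entries))
```

With "/dev/sda" and "/dev/nvme0n1" this pre-fills "/dev/"; with
"/dev/sda" alone it pre-fills the whole path.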
Luca Boccassi [Tue, 5 May 2026 14:33:49 +0000 (15:33 +0100)]
test-oomd: fix flakiness under sanitizers
The test asserts that pgscan is 0, but under sanitizers this sometimes
fails and shows up as 1. We cannot control what the kernel scans, and
with sanitizers the runtime can be slow enough it's possible that the
kernel does a pass on the cgroup of the unit test.
Instead of asserting that it's 0, assert that it's between 0 and 9,
which seems a reasonable range.
bootctl,mute-console,pcrextend,pcrlock,repart: allow connections from self
With SD_VARLINK_SERVER_ROOT_ONLY, we refuse all unprivileged operations.
This is silly, the user can and should be able to do anything that doesn't
require privileges.
E.g.:
$ SYSTEMD_LOG_LEVEL=debug varlinkctl introspect /usr/lib/systemd/systemd-pcrextend
Forking off Varlink child process '/usr/lib/systemd/systemd-pcrextend'.
Successfully forked off '(sd-vlexec)' as PID 568993.
varlink: Setting state idle-client
json-stream: Sending message: {"method":"org.varlink.service.GetInterfaceDescription","parameters":{"interface":"io.systemd.PCRExtend"}}
Skipping PR_SET_MM, as we don't have privileges.
varlink: Changing state idle-client → calling
varlink: Unprivileged client attempted connection, refusing.
Failed to run Varlink event loop: Operation not permitted
json-stream: Got POLLHUP from socket.
varlink: Changing state calling → pending-disconnect
varlink: Connection was closed.
Failed to issue org.varlink.service.GetInterfaceDescription() varlink call: Connection reset by peer
This and similar commands now work, e.g.
$ SYSTEMD_LOG_LEVEL=debug varlinkctl call --more ./build/bootctl io.systemd.BootControl.ListBootEntries {}
...
Failed to open directory "/efi": No such file or directory
File system "/boot" is not a FAT EFI System Partition (ESP) file system.
...
Method call failed: Permission denied
{
"origin" : "linux",
"errno" : 13,
"errnoName" : "EACCES"
}
Which is fine — we lack privileges to actually return a useful answer, but the
call itself should go through.
I didn't touch udevd, which refuses to run if it is not root, and does a lot of
privileged setup, so would refuse to start even if the check was removed.
Luca Boccassi [Tue, 5 May 2026 09:43:45 +0000 (10:43 +0100)]
test: make TEST-64 btrfs_basic cleanup robust against reruns
The LUKS subtest in testcase_btrfs_basic leaves stale LUKS headers on
the underlying SCSI devices, so if the VM is rebooted the test fails
because the LUKS signature is still there and blkid finds it.
Luca Boccassi [Tue, 5 May 2026 12:55:54 +0000 (13:55 +0100)]
vmspawn-qmp: take temporary ref in drive_info_add_fail
drive_info_add_fail() calls bridge_unregister_drive() followed by
drive_info_unref(), then continues to access the DriveInfo object.
While all current callers hold their own reference, it is a bit
fragile and it trips static analyzers. Take a local reference.
Both the main parser and the util-linux mount-helper-mode parser
(invoked as mount.mstack) are converted with "systemd-mstack" and
"mount.mstack" as namespaces. The latter has no help.
For systemd-mstack, Commands are listed first, and then Options.
And --no-pager, --no-legend, --json= are moved to the end.
Co-developed-by: Claude Opus 4.7 <noreply@anthropic.com>
Also fix a latent bug in parse_interface_with_operstate_range() where
the global 'optarg' was used instead of the 'str' parameter when
extracting the interface name; with getopt removed they would have
diverged.
The help strings are adjusted a bit to be grammatical and short so
that the table formatting is easier.
Co-developed-by: Claude Opus 4.7 <noreply@anthropic.com>
--help and --version are moved to the beginning of the option list.
This is the usual location. Custom '-v' alias for --version is dropped.
It is not used by anything and it's better to follow the usual style.
Co-developed-by: Claude Opus 4.7 <noreply@anthropic.com>
The verb implementation functions are reordered to match the listing in --help.
The options are reordered a bit to have the "important" options that determine
behaviour first, and various display options and tweaks later. The cases in
parse_argv are ordered in the same way. No functional change.
Here we have the unusual situation that the option list is
conditionalized. I thought about embedding some "tag" information in
individual options to allow the options to be filtered by some arbitrary
conditions. But it seems that using groups works quite well. It wouldn't
scale well if there was a lot more options and conditions, but for the
current set it's good enough.
For options that are not supported in a given service, we print a custom
message ("This service does not support [this] option"), instead of the
generic "Unknown option …". I think this is actually better: we don't
have to pretend that we don't know about the option, and can directly
say that it's a valid option in general but this service does not
support it (yet).
This converts systemd-homed, systemd-hostnamed, systemd-importd,
systemd-localed, systemd-logind, systemd-machined, systemd-networkd,
systemd-portabled, systemd-resolved, systemd-sysupdated,
systemd-timedated, and systemd-timesyncd.
When we add introspection of the option data, we'll somehow have to deal
with conditionalization. But let's cross that bridge when we need to.
Convert systemd-analyze to option and verb macros (#41945)
I thought that this would require some bigger changes, but it turns out
that the existing functionality is good enough with some minor
adjustments if used appropriately.
The behaviour of help_section() is changed to simplify all callers.
systemd-dissect: do not fail dissection on LUKS v1 partitions
partition_is_luks2_integrity() was returning -EINVAL when it
encountered a non-LUKS2 header (e.g. LUKS v1), which caused the
caller to abort the entire disk dissection. A LUKS v1 partition
simply isn't LUKS2-with-integrity, so return 0 instead and let
dissection continue normally.
Luca Boccassi [Tue, 5 May 2026 08:52:29 +0000 (09:52 +0100)]
test: skip TEST-70-TPM2.nvpcr check if pcrextend socket inactive
systemd-dissect --mtree calls io.systemd.PCRExtend over Varlink to extend
the verity NvPCR after activation, and the test then diffs the measure
log to find the new entry. But systemd-pcrextend.socket has
ConditionSecurity=measured-os, which fails when the firmware did not
initialize PCRs, so the test fails.
Simon Lucido [Mon, 4 May 2026 09:40:41 +0000 (11:40 +0200)]
core/varlink-metrics: expose ReloadCount as a metric
Add ReloadCount to the io.systemd.Metrics family table so it can be
queried alongside other manager-level metrics via systemd-report.
Also extend the existing integration test to cross-check the value
returned by systemd-report against the D-Bus and Varlink transports
on every assertion.
Co-developed-by: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Simon Lucido <simonlucido@meta.com>
The logic that was tested in the previous commit is used to implement
the behaviour for unit-shell and other verbs without changes.
The compare-versions synopsis is shortened to "V1 [OP] V2" to make the
verb synopsis fit. Unusual capitalization of "Command" is changed to
"COMMAND" (it's a replaceable arg, not a fixed string), and some help
strings are adjusted. The order of options in --help is based on the
existing order in parse_argv(). The old order in --help was mostly
random. I think it might be good to figure out something more rational
here, but I'm leaving that as a separate step.
The urlification of dot(1) in the --help string is lost. It's hard to
do this with the help string being stored in a read-only section.
I think this is not worth the trouble to reimplement in the current
scheme.
resolve: enforce the search domain limit earlier (#41938)
The search domain limit is already enforced by dns_search_domain_new(),
but in this case it's way too late. Let's enforce it during the first
loop to avoid unnecessary parsing.
---
Also, set a similar limit for NTAs - introduce a new constant, since
there's no pre-existing limit. I pulled the value out of thin air since
there's (AFAIK) no mandated maximum/minimum for NTAs, but given they're
supposed to be manual and _temporary_ workarounds, hopefully 2K of
NTAs will be more than enough (if not, please yell).
Also note: the newly added error messages don't have the trailing "."
and similarly the newly introduced constant doesn't have the "u" suffix
to match the style of the surrounding code (and I didn't want to fix the
surrounding code to make the diff minimal). If this is not desirable,
please also yell.
cryptsetup: avoid a segfault when a keyfile is passed along with a TPM device (#41892)
A segfault is observed when both key_file and tpm2-device are
simultaneously passed to systemd-cryptsetup, e.g.:
systemd-cryptsetup attach test_data /vol /my-pass tpm2-device=auto
The crash appears after commit 5c6aad9 but the flaw in the logic was
pre-existing.
Luca Boccassi [Mon, 4 May 2026 13:42:03 +0000 (14:42 +0100)]
test: suppress PCR public key auto-loading in TEST-70-TPM2 dditest
The dditest block calls systemd-repart with Encrypt=tpm2 but without
--tpm2-public-key-pcrs=. Since systemd-stub drops
/run/systemd/tpm2-pcr-public-key.pem when booting from a signed UKI,
systemd-repart auto-loads it and enrolls a signed PCR policy, and
then systemd-cryptsetup tpm2-device=auto has no matching signature file,
so unlock fails.
--tpm2-public-key= is not enough as the default kicks in then.
Luca Boccassi [Sun, 3 May 2026 21:16:15 +0000 (22:16 +0100)]
test: make TEST-64 mdadm_lvm cleanup robust against reruns
mdadm --zero-superblock only wipes the MD metadata on the underlying
disks, not the LVM PV header that lives in the array data area. When
the VM is restarted and the test re-creates the array with the same
UUID, /dev/md127 exposes the old data including the LVM PV header, so
udev's 69-lvm.rules auto-triggers lvm-activate-mdlvm_vg.service which
races with the test's own pvcreate for exclusive access on /dev/md127.
Wipe the LVM signature off the MD device (and the underlying disks as
a belt-and-braces measure) to avoid the race on re-run, fixing failures
when the VM is rebooted instead of shut down.
Co-developed-by: Claude Opus 4.7 <noreply@anthropic.com>
Luca Boccassi [Mon, 4 May 2026 11:58:33 +0000 (12:58 +0100)]
semaphore: stop deleting all apt sources
The image configuration was changed and the main sources are
now in a drop-in apt sources file too, so deleting the whole
drop-in directory breaks installing packages. Just delete the
disabled ones and chrome.
Valentin David [Mon, 4 May 2026 08:25:19 +0000 (10:25 +0200)]
core: Open netfilter socket only when needed
On initrds where the nfnetlink module is missing, trying to open
a NETLINK_NETFILTER netlink socket takes a lot of time and then fails.
This makes boot noticeably slower, even though probably no
unit in an initrd needs netfilter.
So here we delay opening the socket until we know we need it.
TEST-70-TPM2: Test the key_file + tpm2-device= combo
When key_file is passed along with tpm2-device= to systemd-cryptsetup, the
logic is to try the blob as a TPM blob first, and then fall back to trying the
file as a regular key file. Check that this fallback works.
the logic in attach_luks_or_plain_or_bitlk_by_tpm2() tries to process it as a
TPM blob first. This did not work properly because it passes n_blobs=1 to
acquire_tpm2_key(), and the key_file is only read when n_blobs == 0. As a
result, the code ends up calling tpm2_unseal(..., blobs=NULL, n_blobs=1, ...).
Before commit 5c6aad9 ("cryptsetup-tokens: Print tpm2-primary-alg: only when it
is known"), the segfault was not observed because tpm2_unseal() was bailing out
early when primary_alg == 0. However, after that change, it attempts to process
the blob (which is NULL) and crashes.
Fix this logic by passing n_blobs=0 to acquire_tpm2_key() so that it actually
reads the key_file. Additionally, assert 'blobs' in tpm2_unseal() as a
safeguard.
Valentin David [Sat, 18 Apr 2026 13:09:00 +0000 (15:09 +0200)]
boot: Try to load UKI from simple filesystem before LoadImage
When the source buffer is NULL, the firmware is supposed to try to load the UKI
with the simple filesystem protocol and then the load file 2 protocol. But it
seems that some versions of AMI firmware do not use the simple filesystem
protocol, and then fail to load if the ESP was loaded from an El Torito boot
catalog. Loading the source buffer via the simple filesystem protocol
seems to work around this limitation.
Shim, for example, also loads the source buffer before calling LoadImage. So it
seems to be a safe thing to do. In the future we could maybe also fall back to
the load file 2 protocol if the simple filesystem protocol fails.
test: make TEST-70-TPM2 and TEST-86-MULTI-PROFILE-UKI robust against reruns (#41922)
These tests leave a lot of state around, and when the test is re-run,
for example due to the qemu bug that makes a VM reboot instead of
shutting down, it fails.
Luca Boccassi [Sun, 3 May 2026 15:33:38 +0000 (16:33 +0100)]
test: make TEST-86-MULTI-PROFILE-UKI robust against reruns
When qemu reboots instead of shutting down after the last iteration,
the profile is already set to profile2 but the /root/encrypted.raw is
gone so the test fails. Reset the default boot entry at the end of the
test to make it robust against reruns.
Luca Boccassi [Sun, 3 May 2026 15:23:41 +0000 (16:23 +0100)]
test: make TEST-70-TPM2 robust against reruns
The test leaves a lot of state around, and when the test is re-run,
for example due to the qemu bug that makes a VM reboot instead of
shutting down, it fails.
Do more cleanups in the traps.
[ 162.642175] TEST-70-TPM2.sh[2815]: Calculated public key name: 000b2b66edc3a466e81059286aaf38d09ea42a7a9dcdf6ba3b664c62f0cae4ce4f66
[ 162.642628] TEST-70-TPM2.sh[2815]: PolicyAuthorize calculated digest: 2caa740101f65734d50395d6abc64fa46015d40d1f5de239434578544e592a92
[ 162.643681] TEST-70-TPM2.sh[2815]: Calculated NV index name: 000b439cfa1534815bbe8d33b80c56f5a8d17d36fe94a7782b23a37b50def5fc5eaa
[ 162.645111] TEST-70-TPM2.sh[2815]: PolicyAuthorizeNV calculated digest: 69ee0e89fafe6b9df2cd6a5defbf74aa46cf6d92703e645d463549da4ba5e1a4
[ 162.645407] TEST-70-TPM2.sh[2815]: Combined signed PCR policies and pcrlock policies cannot be calculated offline, currently.
[ 162.649576] TEST-70-TPM2.sh[2815]: Releasing crypt device /dev/loop0 context.
[ 162.652433] TEST-70-TPM2.sh[2815]: Releasing device-mapper backend.
[ 162.653518] TEST-70-TPM2.sh[2815]: Closing read only fd for /dev/loop0.
[ 162.654359] TEST-70-TPM2.sh[2815]: Closing read write fd for /dev/loop0.
[ 162.654786] TEST-70-TPM2.sh[2815]: Failed to encrypt device: Operation not supported
Luca Boccassi [Sat, 2 May 2026 22:18:22 +0000 (23:18 +0100)]
test: make varlink StartTransient checks compatible with jq 1.6
The new "varlinkctl --more StartTransient" subtest pipes a JSON-SEQ
stream of multiple records into "jq --seq -e ...". CentOS 9
ships jq 1.6, where -e only inspects the last input record's output:
when the trailing record (the final reply) doesn't match the
"select()" filter, jq exits non-zero even though earlier records
match, so the test fails.
Use --slurp which collapses the records into an array first and
returns a single bool.
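The two exit-status behaviours can be contrasted with a toy Python
model. This only mirrors the semantics described above and does not
invoke jq; the function names and the record shapes in the usage note
are hypothetical.

```python
def ok_last_record_only(records, predicate):
    """Model of the jq 1.6 -e case: only the last record decides."""
    return bool(records) and bool(predicate(records[-1]))


def ok_slurped(records, predicate):
    """Model of the --slurp case: one result over the whole array."""
    return any(predicate(r) for r in records)
```

For a stream whose matching record comes first and whose final reply
does not match, ok_last_record_only() reports failure while
ok_slurped() reports success, which is the difference the fix relies on.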