A follow-up to the AddStorage / RemoveStorage series. ReplaceStorage
swaps the *backing file* of an already-attached storage device on a
running vmspawn-managed VM, leaving the guest-visible device frontend
(virtio-blk, virtio-scsi, nvme, scsi-cd) and every other property of
the device untouched. The intended use is to point an existing disk
at a new image without the guest seeing a hot-unplug/hot-plug cycle.

The signature mirrors AddStorage minus the 'config' field: the
device frontend doesn't change, only the backing behind it. Read-
only / read-write is derived from the new fd's O_ACCMODE; scsi-cd is
forced read-only to match the boot-time policy. S_ISBLK on the new
fd selects host_device vs file driver, matching AddStorage.
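
As a rough illustration of the fd-derived policy above, the Python
sketch below mirrors the two checks. The function name and the frontend
strings are placeholders, not vmspawn identifiers. How an O_WRONLY fd is
treated is not stated in this series, so the sketch only distinguishes
O_RDONLY from everything else.

```python
import fcntl
import os
import stat


def classify_backing_fd(fd: int, frontend: str) -> tuple[bool, str]:
    """Return (read_only, driver) for a new backing fd.

    Hypothetical helper for illustration; not the vmspawn implementation.
    """
    # The access mode is part of the file status flags returned by F_GETFL.
    accmode = fcntl.fcntl(fd, fcntl.F_GETFL) & os.O_ACCMODE
    read_only = accmode == os.O_RDONLY
    # scsi-cd is forced read-only regardless of how the fd was opened.
    if frontend == "scsi-cd":
        read_only = True
    # Block devices use the host_device driver, everything else uses file.
    driver = "host_device" if stat.S_ISBLK(os.fstat(fd).st_mode) else "file"
    return read_only, driver
```

Deriving both properties from the fd keeps the method signature down to
the fd index and the storage name.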

The QMP primitive is blockdev-reopen. It cannot change the 'filename'
of a file / host_device node, so we can't just point the existing
file node at a new fd. It can, however, swap a format node's 'file'
child to a different existing monitor-owned node by node-name
reference (case 3 in qemu/qapi/block-core.json:5034-5040). The chain
is:

  add-fd           (host fd → new fdset)
  blockdev-add     (new file node, filename=/dev/fdset/N; fd-only)
  remove-fd        (release monitor's ref; new file node holds the dup)
  blockdev-reopen  (format node, file = new file node-name)
  blockdev-del     (old file node; releasing its dup frees the old fdset)

The reopen options must restate every option the original
blockdev-add emitted on the format node, because blockdev-reopen
resets any unspecified option to its driver default. The 'file' field
is a node-name string reference, never a path.
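
The chain can be sketched as a list of QMP command objects. This is a
minimal Python illustration, not the vmspawn code: the function name,
the node names and the choice of a caller-supplied fdset id are
assumptions, and a real format node would restate all of its original
options, not just the two shown here.

```python
def build_replace_chain(fdset_id, fmt_node, fmt_driver, old_file_node,
                        new_file_node, file_driver, read_only):
    """Return the QMP commands for one replace, in issue order.

    All node names and drivers are caller-supplied placeholders. The
    host fd itself travels as SCM_RIGHTS ancillary data alongside
    add-fd, so it does not appear in the JSON.
    """
    fdset_path = f"/dev/fdset/{fdset_id}"
    return [
        {"execute": "add-fd", "arguments": {"fdset-id": fdset_id}},
        {"execute": "blockdev-add", "arguments": {
            "driver": file_driver, "node-name": new_file_node,
            "filename": fdset_path, "read-only": read_only}},
        {"execute": "remove-fd", "arguments": {"fdset-id": fdset_id}},
        # 'file' is a node-name reference; every other format-node option
        # has to be restated or it falls back to the driver default.
        {"execute": "blockdev-reopen", "arguments": {"options": [{
            "driver": fmt_driver, "node-name": fmt_node,
            "file": new_file_node, "read-only": read_only}]}},
        {"execute": "blockdev-del",
         "arguments": {"node-name": old_file_node}},
    ]
```

For example, build_replace_chain(7, "fmt0", "qcow2", "file0-g0",
"file0-g1", "file", False) yields five commands whose reopen step
carries "file": "file0-g1" as a plain string.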

No new errors and no new IDL types beyond the method itself;
everything is built on the existing NoSuchStorage / StorageImmutable
/ NotConnected / EBUSY vocabulary.

The series is:

  vmspawn: split blockdev-add into separate file and format calls
    Preparatory refactor. qemu/blockdev.c:3440 only marks the
    top-level BDS returned by blockdev-add as monitor-owned;
    inline children are NOT, so blockdev-del later rejects them
    with "Node X is not owned by the monitor". Split into two
    blockdev-add calls so the file node is independently
    deletable. DriveInfo gains qmp_file_node_name and a
    file_generation counter; the teardown helper deletes format
    then file (file-first is rejected as "node used as 'file'
    of Y"). The ephemeral path was already structured this way;
    only the regular add path changes. Drops the now-unused
    qmp_build_blockdev_add_inline().

  shared/varlink-io.systemd.MachineInstance: add ReplaceStorage method
    IDL only: ReplaceStorage(fileDescriptorIndex, name). No new
    errors.

  vmspawn: implement io.systemd.MachineInstance.ReplaceStorage
    vmspawn_qmp_replace_block_device() entry point, ReplaceCtx
    (refcounted, ReplaceCtxStateFlags for partial-state tracking)
    and four async callbacks plus an idempotent replace_fail.
    file_generation is bumped before issuing blockdev-add so
    retries don't collide on node-name.
    BLOCK_DEVICE_STATE_REPLACE_PENDING gates concurrent
    Replace / Remove on the same drive. On reopen success the
    trailing blockdev-del of the old file node fires from the
    reopen callback; its failure logs a warning and still replies
    success (the swap already committed; the orphan resolves at VM
    exit). QMP disconnect mid-replace routes via
    qmp_client_fail_pending → replace_fail → NotConnected.
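
The generation bump and the pending gate described in this patch can be
modelled in a few lines of Python. This is only a sketch: the class, the
node-name scheme and the use of EBUSY for a gated request are
assumptions made for illustration, and the boolean stands in for
BLOCK_DEVICE_STATE_REPLACE_PENDING.

```python
import errno
from dataclasses import dataclass


@dataclass
class Drive:
    name: str
    file_node: str                 # node currently backing the format node
    file_generation: int = 0
    replace_pending: bool = False  # stand-in for the REPLACE_PENDING state

    def begin_replace(self) -> str:
        """Claim the drive and return a fresh name for the new file node."""
        if self.replace_pending:
            raise OSError(errno.EBUSY, "replace already in flight")
        self.replace_pending = True
        # Bump before blockdev-add so a retry never reuses a node name.
        self.file_generation += 1
        return f"{self.name}-file-{self.file_generation}"  # hypothetical

    def commit(self, new_file_node: str) -> None:
        """Reopen succeeded: the new node now backs the format node."""
        self.file_node = new_file_node
        self.replace_pending = False

    def fail(self) -> None:
        """Idempotent failure path: keep the old node, release the gate."""
        self.replace_pending = False

    def begin_remove(self) -> None:
        """Refuse removal while a replace is in flight."""
        if self.replace_pending:
            raise OSError(errno.EBUSY, "replace in flight")
```

Because the counter is never rolled back, a replace that fails after
blockdev-add and is then retried asks for a node name that cannot clash
with a leftover node from the failed attempt.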

  test: integration test for io.systemd.MachineInstance.ReplaceStorage
    TEST-87-AUX-UTILS-VM.replace-storage covers happy-path replace,
    successive replaces (file_generation rotation), StorageImmutable
    rejection on the boot-time drive, NoSuchStorage on unknown
    names, InvalidParameter on malformed names, and clean
    RemoveStorage after a replace (proves the new file node is
    monitor-owned and the teardown order works). Backing files are
    passed via 'varlinkctl --push-fd'; no machinectl front-end is
    added in this round.
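
The ownership and teardown-order rules that the first patch and the
last test case rely on can be captured in a toy model. The Python below
is not QEMU code: the class is hypothetical, it tracks only monitor
ownership and 'file' references, and its error strings simply echo the
wording used in this cover letter.

```python
class ToyMonitor:
    """Toy model of the two deletion rules; not QEMU code."""

    def __init__(self):
        self.owned = set()   # nodes created by a top-level blockdev-add
        self.file_of = {}    # format node-name -> its 'file' child

    def blockdev_add(self, node_name, file=None):
        # 'file' is None (a file node), a node-name string reference,
        # or a dict describing an inline child.
        if isinstance(file, dict):
            child = file["node-name"]   # inline child: never monitor-owned
        else:
            child = file
        self.owned.add(node_name)
        if child is not None:
            self.file_of[node_name] = child

    def blockdev_reopen(self, node_name, file):
        # Swap the format node's 'file' child by node-name reference.
        self.file_of[node_name] = file

    def blockdev_del(self, node_name):
        if node_name not in self.owned:
            raise RuntimeError(
                f"Node {node_name} is not owned by the monitor")
        for parent, child in self.file_of.items():
            if child == node_name:
                raise RuntimeError(f"node used as 'file' of {parent}")
        self.owned.discard(node_name)
        self.file_of.pop(node_name, None)
```

In this model an inline child can never be deleted on its own, while a
separately added file node can be deleted once the format node has been
reopened onto a different file node or has itself been deleted.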

Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>