Warner Losh [Thu, 5 Feb 2026 16:29:58 +0000 (09:29 -0700)]
bsd-user: Add target_semid_ds and target_msqid_ds structures
Add the target ABI definitions for System V semaphore and message queue
data structures, needed for semctl() and msgctl() syscall emulation.
Signed-off-by: Stacey Son <sson@FreeBSD.org>
Signed-off-by: Mikael Urankar <mikael.urankar@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Warner Losh <imp@bsdimp.com>
Warner Losh [Thu, 5 Feb 2026 16:25:08 +0000 (09:25 -0700)]
common-user: Drop __linux__ around .note.GNU-stack
GNU-stack tagging is a toolchain issue, not an OS issue. All the
toolchains require this for ELF.
Signed-off-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Warner Losh [Fri, 6 Feb 2026 15:04:01 +0000 (08:04 -0700)]
bsd-user: Remove NetBSD-specific code
Remove the NetBSD-specific code from bsd-user. It's not been maintained
in any meaningful way since it was introduced to the tree in 2008. It
hasn't been connected to the build since 2021, and last time (in 2023) I
tried to mock up the meson support it needed, it failed to build. While
there was some out-of-tree work, I've not been able to connect with
that code.
Cc: Reinoud Zandijk <reinoud@netbsd.org>
Cc: Ryo ONODERA <ryoon@netbsd.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Warner Losh <imp@bsdimp.com>
Warner Losh [Fri, 6 Feb 2026 14:55:00 +0000 (07:55 -0700)]
bsd-user: Remove OpenBSD-specific code
Remove the OpenBSD-specific code from bsd-user. It's not been maintained
in any meaningful way since it was introduced to the tree in 2008. It
hasn't been connected to the build since 2021, and last time (in 2023) I
tried to mock up the meson support it needed, it failed to build. I
contacted the OpenBSD people in 2018, it appears, and even at that time
they thought this code was not at all useful to them.
Cc: Brad Smith <brad@comstyle.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Warner Losh <imp@bsdimp.com>
Warner Losh [Fri, 6 Feb 2026 16:00:30 +0000 (09:00 -0700)]
freebsd: FreeBSD 15 has native inotify
Check to make sure that we have inotify in libc, before looking for it
in libinotify.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Daniel P. Berrange <berrange@redhat.com>
Cc: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Warner Losh <imp@bsdimp.com>
Jon Kohler [Thu, 6 Nov 2025 17:46:25 +0000 (10:46 -0700)]
target/i386: introduce ClearwaterForest-v3 to expose ITS_NO
Expose ITS_NO by default, as users using Clearwater Forest and higher
CPU models would not be able to live migrate to lower CPU hosts due to
missing features. In that case, they would not be vulnerable to ITS.
its-no was originally added on [1], but needs to be exposed on the
individual CPU models for the guests to see by default.
Note: Version 1 already exposes ARCH_CAP_BHI_NO, which would already
mark the CPU as invulnerable to ITS (at least in Linux); however,
expose ITS_NO for completeness.
[1] 74978391b2da ("target/i386: Make ITS_NO available to guests")
Jon Kohler [Thu, 6 Nov 2025 17:46:24 +0000 (10:46 -0700)]
target/i386: introduce SierraForest-v5 to expose ITS_NO
Expose ITS_NO by default, as users using Sierra Forest and higher
CPU models would not be able to live migrate to lower CPU hosts due to
missing features. In that case, they would not be vulnerable to ITS.
its-no was originally added on [1], but needs to be exposed on the
individual CPU models for the guests to see by default.
Note: For SRF, version 2 already exposed BHI_CTRL, which would already
mark the CPU as invulnerable to ITS (at least in Linux); however,
expose ITS_NO for completeness.
[1] 74978391b2da ("target/i386: Make ITS_NO available to guests")
Jon Kohler [Thu, 6 Nov 2025 17:46:23 +0000 (10:46 -0700)]
target/i386: introduce GraniteRapids-v5 to expose ITS_NO
Expose ITS_NO by default, as users using Granite Rapids and higher
CPU models would not be able to live migrate to lower CPU hosts due to
missing features. In that case, they would not be vulnerable to ITS.
its-no was originally added on [1], but needs to be exposed on the
individual CPU models for the guests to see by default.
[1] 74978391b2da ("target/i386: Make ITS_NO available to guests")
Jon Kohler [Thu, 6 Nov 2025 17:46:22 +0000 (10:46 -0700)]
target/i386: introduce SapphireRapids-v6 to expose ITS_NO
Expose ITS_NO by default, as users using Sapphire Rapids and higher
CPU models would not be able to live migrate to lower CPU hosts due to
missing features. In that case, they would not be vulnerable to ITS.
its-no was originally added on [1], but needs to be exposed on the
individual CPU models for the guests to see by default.
[1] 74978391b2da ("target/i386: Make ITS_NO available to guests")
Enumerate ability to enable Intel Mode-Based Execute Control (MBEC)
on secondary execution control bit 22.
Intel MBEC is a hardware feature, introduced in the Kabylake
generation, that allows for more granular control over execution
permissions. MBEC enables the separation and tracking of execution
permissions for supervisor (kernel) and user-mode code. It is used as
an accelerator for Microsoft's Memory Integrity [1] (also known as
hypervisor-protected code integrity or HVCI).
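As a rough illustration of what enumerating this control involves, the
sketch below tests bit 22 in a VMX secondary-controls capability value.
It is written in Python for brevity and is not QEMU code: the function
name is made up for this example, and it assumes the usual VMX capability
MSR layout in which the high 32 bits report the "allowed-1" settings.

```python
# Bit position taken from the commit message: secondary execution control bit 22.
MBEC_BIT = 1 << 22


def mbec_allowed(procbased_ctls2: int) -> bool:
    """Return True if the MBEC control may be set to 1.

    Assumption: bits 63:32 of the capability value hold the allowed-1
    settings of the secondary processor-based execution controls.
    """
    allowed1 = (procbased_ctls2 >> 32) & 0xFFFFFFFF
    return bool(allowed1 & MBEC_BIT)
```

A value with bit 22 set only in the low half does not count, since that
half does not describe which controls may be enabled.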
Code is mirrored here:
https://github.com/JonKohler/linux/tree/mbec-v1-6.18
https://github.com/JonKohler/kvm-unit-tests/tree/mbec-v1
LKML thread(s) are here:
Original RFC: https://lore.kernel.org/all/20250313203702.575156-1-jon@nutanix.com/
V1 code: https://lore.kernel.org/all/20251223054806.1611168-1-jon@nutanix.com/
KVM unit test changes: https://lore.kernel.org/all/20251223054850.1611618-1-jon@nutanix.com/
Cc: Xiaoyao Li <xiaoyao.li@intel.com>
Cc: Zhao Liu <zhao1.liu@intel.com>
Co-authored-by: Jon Kohler <jon@nutanix.com>
Co-authored-by: Aditya Desai <aditya.desai@nutanix.com>
Signed-off-by: Jon Kohler <jon@nutanix.com>
Link: https://lore.kernel.org/r/20251223060834.1618428-1-jon@nutanix.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Peter Maydell [Sat, 28 Feb 2026 14:30:23 +0000 (14:30 +0000)]
Merge tag 'pull-11.0-testing-updates-270226-2' of https://gitlab.com/stsquad/qemu into staging
testing updates (vm, docker, arm functional)
- migrate non-lcitool Debian containers to Trixie (13)
- remove legacy-test-cross hacks
- fix some minor make vm- Makefile issues
- bump OpenBSD to 7.8
- add VBSA EFI functional test for Arm
* tag 'pull-11.0-testing-updates-270226-2' of https://gitlab.com/stsquad/qemu:
tests/functional: add Arm VBSA uefi conformance test
tests/vm: build openbsd from lcitool data
tests/vm: fix interactive boot
tests/vm: remove unused import
tests/vm: bump OpenBSD to the current 7.8 release
tests/docker: migrate legacy-test-cross compilers to trixie
tests/docker: upgrade most non-lcitool debian tests to debian 13
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Sat, 28 Feb 2026 14:30:13 +0000 (14:30 +0000)]
Merge tag 'pull-9p-20260228' of https://github.com/cschoenebeck/qemu into staging
9pfs changes:
* Fix crash under unlink-heavy load in v9fs_mark_fids_unreclaim().
* Fix crash with the synth fs driver.
# gpg: Signature made Sat Feb 28 13:29:54 2026 GMT
# gpg: using RSA key 96D8D110CF7AF8084F88590134C2B58765A47395
# gpg: issuer "qemu_oss@crudebyte.com"
# gpg: Good signature from "Christian Schoenebeck <qemu_oss@crudebyte.com>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: ECAB 1A45 4014 1413 BA38 4926 30DB 47C3 A012 D5F4
# Subkey fingerprint: 96D8 D110 CF7A F808 4F88 5901 34C2 B587 65A4 7395
* tag 'pull-9p-20260228' of https://github.com/cschoenebeck/qemu:
hw/9pfs: fix missing EOPNOTSUPP on Twstat and Trenameat for fs synth driver
hw/9pfs: fix data race in v9fs_mark_fids_unreclaim()
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
hw/9pfs: fix missing EOPNOTSUPP on Twstat and Trenameat for fs synth driver
Renaming files/dirs is only supported by path-based fs drivers. EOPNOTSUPP
should be returned on any renaming attempt for fs drivers that are not
path-based. This was already the case for the 9p "Trename" request type.
However, for the 9p request types "Trenameat" and "Twstat" this check was
still missing.
So fix this by checking in the Twstat and Trenameat request handlers whether
the fs driver in use is really path-based; if not, return EOPNOTSUPP and
abort further handling of the request.
This fixes a crash with the 9p "synth" fs driver, which is not path-based.
The crash happened because the synth driver stores and expects a raw
V9fsSynthNode pointer instead of a C-string on V9fsPath.data. So the
C-string delivered by the 9p server to the synth fs driver was incorrectly
cast to a V9fsSynthNode pointer, eventually causing a segfault.
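The guard described above can be modelled in a few lines. This is a
Python sketch for illustration only, not the QEMU C code: FsDriver,
path_based and handle_rename_request are hypothetical names, and the
negative-errno return value simply mirrors the usual C convention.

```python
import errno


class FsDriver:
    """Hypothetical stand-in for a 9p fs driver; not the QEMU structure."""

    def __init__(self, name, path_based):
        self.name = name
        self.path_based = path_based


def handle_rename_request(driver, do_rename):
    """Model of a Twstat/Trenameat handler with the guard in place.

    Returns 0 (or whatever do_rename returns) on success and a negative
    errno value on failure.
    """
    if not driver.path_based:
        # Bail out before the driver is handed a path string that it
        # would misinterpret as its own node pointer.
        return -errno.EOPNOTSUPP
    return do_rename()
```

With the check placed first, the rename callback is never reached for a
driver such as "synth".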
Reported-by: Oliver Chang <ochang@google.com>
Fixes: https://issues.oss-fuzz.com/issues/477990727
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3298
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Link: https://lore.kernel.org/qemu-devel/E1vrbaP-000Gqb-B3@kylie.crudebyte.com/
Richie Buturla [Wed, 11 Feb 2026 15:44:50 +0000 (16:44 +0100)]
hw/9pfs: fix data race in v9fs_mark_fids_unreclaim()
A data race between v9fs_mark_fids_unreclaim() and v9fs_path_copy()
causes an inconsistent read of fidp->path. In v9fs_path_copy(), the
path size is set before the data pointer is allocated, creating a
window where size is non-zero but data is NULL.
v9fs_co_open2() holds a write lock during path modifications,
but v9fs_mark_fids_unreclaim() was not acquiring a read
lock, allowing it to race.
Fix by holding the path read lock during FID table iteration.
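The window described above, and why taking the lock on the reader side
closes it, can be sketched as follows. This is a simplified Python model,
not the QEMU code: a plain mutex stands in for the read/write lock, and
the class and function names are invented for the example.

```python
import threading


class Path:
    def __init__(self):
        self.size = 0
        self.data = None


class State:
    def __init__(self, nfids):
        # Stand-in for the path lock; the real code uses a read/write lock.
        self.path_lock = threading.Lock()
        self.fids = [Path() for _ in range(nfids)]


def path_copy(state, dst, src):
    # Writer: size is set before data, so an unlocked reader could
    # observe size != 0 while data is still None.
    with state.path_lock:
        dst.size = len(src)
        dst.data = bytes(src)


def scan_fid_paths(state):
    # Reader: hold the lock for the whole iteration so that size and
    # data are always read as a consistent pair.
    with state.path_lock:
        return [(p.size, p.data) for p in state.fids]
```

Because both sides take the same lock, a scan can only ever see a path
that is either untouched or fully copied.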
The commit says:
> This reverts commit 55d98e3edeeb17dd8445db27605d2b34f4c3ba85.
>
> The commit introduced a regression in the replay functional test
> on alpha (tests/functional/alpha/test_replay.py), that causes CI
> failures regularly. Thus revert this change until someone has
> figured out what is going wrong here.
myrslint [Fri, 15 Aug 2025 16:53:09 +0000 (16:53 +0000)]
KVM: i386: Default disable ignore guest PAT quirk
Add a new accelerator option that allows the guest to adjust the PAT.
This is already the case for TDX guests and allows using virtio-gpu
Venus with RADV or NVIDIA drivers.
The quirk is disabled by default. Since this caused problems with
Linux's Bochs video device driver, add a knob to leave it enabled,
and for now do not enable it by default.
Signed-off-by: Myrsky Lintu <qemu.haziness801@passinbox.com>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2943
Link: https://lore.kernel.org/r/175527721636.15451.4393515241478547957-1@git.sr.ht
[Add property; for now leave it off by default. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
John Snow [Thu, 19 Feb 2026 18:54:09 +0000 (13:54 -0500)]
rust: use checked_div to make clippy happy
When upgrading from Fedora 41 to Fedora 43 for CI tests, clippy begins
complaining about manually checking divisors instead of using
checked_div. Make clippy happy and use checked_div() instead.
Ani Sinha [Wed, 25 Feb 2026 03:49:39 +0000 (09:19 +0530)]
qom: add 'confidential-guest-reset' property for x86 confidential vms
Through the new 'confidential-guest-reset' property, the control plane should
be able to detect if the hypervisor supports x86 confidential guest resets. Older
hypervisors that do not support resets will not have this property populated.
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-35-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Ani Sinha [Wed, 25 Feb 2026 03:49:38 +0000 (09:19 +0530)]
tests/functional/x86_64: add functional test to exercise vm fd change on reset
A new functional test is added that exercises the code changes related to
closing of the old KVM VM file descriptor and opening a new one upon VM reset.
This normally happens when confidential guests are reset but for
non-confidential guests, we use a special machine specific debug/test parameter
'x-change-vmfd-on-reset' to enable this behavior.
Only specific code changes related to re-initialisation of SEV-ES, SEV-SNP and
TDX platforms are not exercised in this test as they require hardware that
supports running confidential guests.
Ani Sinha [Wed, 25 Feb 2026 03:49:37 +0000 (09:19 +0530)]
hw/machine: introduce machine specific option 'x-change-vmfd-on-reset'
A new machine specific option 'x-change-vmfd-on-reset' is introduced for
debugging and testing only (hence the 'x-' prefix). This option when enabled
will force KVM VM file descriptor to be changed upon guest reset like
in the case of confidential guests. This can be used to exercise the code
changes that are specific for confidential guests on non-confidential
guests as well (except changes that require hardware support for
confidential guests).
A new functional test has been added in the next patch that uses this new
parameter to test the VM file descriptor changes.
Ani Sinha [Wed, 25 Feb 2026 03:49:36 +0000 (09:19 +0530)]
kvm/clock: add support for confidential guest reset
Confidential guests change the KVM VM file descriptor upon reset and also create
new VCPU file descriptors against the new KVM VM file descriptor. We need to
save the clock state from KVM before the KVM VM file descriptor changes and
restore it after. Also, after the VCPU file descriptors have changed, we must call
KVM_KVMCLOCK_CTRL on the VCPU file descriptor to inform KVM that the VCPU is
in paused state.
Ani Sinha [Wed, 25 Feb 2026 03:49:35 +0000 (09:19 +0530)]
kvm/vcpu: add notifiers to inform vcpu file descriptor change
When new vcpu file descriptors are created and bound to the new kvm file
descriptor as a part of the confidential guest reset mechanism, various
subsystems need to know about it. This change adds notifiers so that various
subsystems can take appropriate actions when vcpu fds change by registering
their handlers to this notifier.
Subsequent changes will register specific handlers to this notifier.
Ani Sinha [Wed, 25 Feb 2026 03:49:34 +0000 (09:19 +0530)]
ppc/openpic: create a new openpic device and reattach mem region on coco reset
For confidential guests during the reset process, the old KVM VM file
descriptor is closed and a new one is created. When a new file descriptor is
created, a new openpic device needs to be created against this new KVM VM file
descriptor as well. Additionally, existing memory region needs to be reattached
to this new openpic device and proper CPU attributes set associating new file
descriptor. This change makes this happen with the help of a callback handler
that gets called when the KVM VM file descriptor changes as a part of the
confidential guest reset process.
Ani Sinha [Wed, 25 Feb 2026 03:49:33 +0000 (09:19 +0530)]
kvm/xen-emu: re-initialize capabilities during confidential guest reset
On confidential guests, the KVM virtual machine file descriptor changes as a
part of the guest reset process. Xen capabilities need to be re-initialized in
KVM against the new file descriptor.
Ani Sinha [Wed, 25 Feb 2026 03:49:32 +0000 (09:19 +0530)]
hw/hyperv/vmbus: add support for confidential guest reset
On confidential guests, when the KVM virtual machine file descriptor changes as
a part of the reset process, event file descriptors need to be reassociated
with the new KVM VM file descriptor. This is achieved with the help of a
callback handler that gets called when the KVM VM file descriptor changes during
the confidential guest reset process.
This patch is tested on non-confidential platform only.
Ani Sinha [Wed, 25 Feb 2026 03:49:31 +0000 (09:19 +0530)]
kvm/hyperv: add synic feature to CPU only if its not enabled
We need to make sure that synic CPU feature is not already enabled. If it is,
trying to enable it again will result in the following assertion:
Unexpected error in object_property_try_add() at ../qom/object.c:1268:
qemu-system-x86_64: attempt to add duplicate property 'synic' to object (type 'host-x86_64-cpu')
Ani Sinha [Wed, 25 Feb 2026 03:49:30 +0000 (09:19 +0530)]
kvm/i8254: add support for confidential guest reset
A confidential guest reset involves closing the old virtual machine KVM file
descriptor and opening a new one. Since it's a new KVM fd, the PIT needs to be
re-initialized. This is done with the help of a notifier which is invoked
upon KVM VM file descriptor change during the confidential guest reset process.
Ani Sinha [Wed, 25 Feb 2026 03:49:29 +0000 (09:19 +0530)]
kvm/i8254: refactor pit initialization into a helper
The initialization code will be used again by VM file descriptor change
notifier callback in a subsequent change. So refactor common code into a new
helper function.
Ani Sinha [Fri, 27 Feb 2026 07:24:45 +0000 (12:54 +0530)]
hw/vfio: generate new file fd for pseudo device and rebind existing descriptors
Normally the vfio pseudo device file descriptor lives for the life of the VM.
However, when the kvm VM file descriptor changes, a new file descriptor
for the pseudo device needs to be generated against the new kvm VM descriptor.
Other existing vfio descriptors need to be reattached to the new pseudo device
descriptor. This change performs the above steps.
Ani Sinha [Wed, 25 Feb 2026 03:49:27 +0000 (09:19 +0530)]
i386/sev: add support for confidential guest reset
When the KVM VM file descriptor changes as a part of the confidential guest
reset mechanism, it is necessary to create a new confidential guest context and
re-encrypt the VM memory. This happens for SEV-ES and SEV-SNP virtual machines
as a part of SEV_LAUNCH_FINISH, SEV_SNP_LAUNCH_FINISH operations.
A new resettable interface for SEV module has been added. A new reset callback
for the reset 'exit' state has been implemented to perform the above operations
when the VM file descriptor has changed during VM reset.
Tracepoints have also been added for tracing purposes.
Ani Sinha [Wed, 25 Feb 2026 03:49:26 +0000 (09:19 +0530)]
i386/sev: free existing launch update data and kernel hashes data on init
If there is existing launch update data and kernel hashes data, they need to be
freed when initialization code is executed. This is important for resettable
confidential guests where the initialization happens once every reset.
Ani Sinha [Wed, 25 Feb 2026 03:49:25 +0000 (09:19 +0530)]
i386/sev: add notifiers only once
The various notifiers that are used need to be installed only once, not on
every initialization. This includes the vm state change notifier and others.
This change uses the 'cgs->ready' flag to install the notifiers only one time,
the first time.
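A minimal sketch of this install-once pattern, in Python for brevity. Only
the 'ready' flag comes from the commit message; the class, the notifier
list and the point at which the flag is set are assumptions made for
illustration.

```python
class ConfidentialGuest:
    """Hypothetical model of the guest support object."""

    def __init__(self):
        self.ready = False
        self.notifiers = []


def guest_init(cgs, notifier):
    """Initialization that may run again on every guest reset."""
    if not cgs.ready:
        # First initialization only: later resets skip this branch.
        cgs.notifiers.append(notifier)
    # ... per-reset setup would go here ...
    cgs.ready = True
```

Calling guest_init() repeatedly, as happens across resets, leaves exactly
one notifier installed.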
Ani Sinha [Wed, 25 Feb 2026 03:49:24 +0000 (09:19 +0530)]
i386/sev: add migration blockers only once
sev_launch_finish() and sev_snp_launch_finish() could be called multiple times
when the confidential guest is being reset/rebooted. The migration
blockers should not be added multiple times, once per invocation. This change
makes sure that the migration blockers are added only one time by adding the
migration blockers to the vm state change handler when the vm transitions to
the running state. Subsequent reboots do not change the state of the vm.
Ani Sinha [Wed, 25 Feb 2026 03:49:23 +0000 (09:19 +0530)]
i386/tdx: add a pre-vmfd change notifier to reset tdx state
During reset, when the VM file descriptor is changed, the TDX state needs to be
re-initialized. A notifier callback is implemented to reset the old
state and free memory before the new state is initialized post VM file
descriptor change.
Ani Sinha [Wed, 25 Feb 2026 03:49:22 +0000 (09:19 +0530)]
i386/tdx: finalize TDX guest state upon reset
When the confidential virtual machine KVM file descriptor changes due to the
guest reset, some TDX specific setup steps need to be done again. This
includes finalizing the initial guest launch state again. This change
re-executes some parts of the TDX setup during the device reset phase using a
resettable interface. This finalizes the guest launch state again and locks
it in. The machine done notifier which was previously used is no longer needed as
the same code is now executed as a part of VM reset.
Ani Sinha [Wed, 25 Feb 2026 03:49:20 +0000 (09:19 +0530)]
accel/kvm: rebind current VCPUs to the new KVM VM file descriptor upon reset
Confidential guests need to generate a new KVM file descriptor upon virtual
machine reset. Existing VCPUs need to be reattached to this new
KVM VM file descriptor. As a part of this, new VCPU file descriptors against
this new KVM VM file descriptor need to be created and re-initialized.
Resources allocated against the old VCPU fds need to be released. This change
makes this happen.
Ani Sinha [Wed, 25 Feb 2026 03:49:19 +0000 (09:19 +0530)]
kvm/i386: reload firmware for confidential guest reset
When IGVM is not being used by the confidential guest, the guest firmware has
to be reloaded explicitly again into memory. This is because the memory into
which the firmware was loaded before reset was encrypted and is thus lost
upon reset. When IGVM is used, it is expected that the IGVM will contain the
guest firmware and the execution of the IGVM directives will set up the guest
firmware memory.
Ani Sinha [Wed, 25 Feb 2026 03:49:18 +0000 (09:19 +0530)]
hw/i386: export a new function x86_bios_rom_reload
Confidential guests must reload their bios rom upon reset. This is because
bios memory is encrypted and upon reset, the contents of the old bios memory
are lost and cannot be re-used. To this end, export a new x86 function
x86_bios_rom_reload() to reload the bios again. This function will be used in
the subsequent patches.
Ani Sinha [Wed, 25 Feb 2026 03:49:17 +0000 (09:19 +0530)]
hw/i386: refactor x86_bios_rom_init for reuse in confidential guest reset
For confidential guests, the bios image must be reinitialized upon reset. This
is because bios memory is encrypted and hence once the old confidential
kvm context is destroyed, it cannot be decrypted. It needs to be reinitialized.
Towards that, this change refactors x86_bios_rom_init() code so that
parts of it can be called during confidential guest reset.
No functional change.
Ani Sinha [Wed, 25 Feb 2026 03:49:15 +0000 (09:19 +0530)]
kvm/i386: implement architecture support for kvm file descriptor change
When the kvm file descriptor changes as a part of confidential guest reset,
some architecture specific setups including SEV/SEV-SNP/TDX specific setups
need to be redone. These changes are implemented as a part of the
kvm_arch_on_vmfd_change() callback which was introduced previously.
Ani Sinha [Wed, 25 Feb 2026 03:49:14 +0000 (09:19 +0530)]
i386/kvm: unregister smram listeners prior to vm file descriptor change
We will re-register smram listeners after the VM file descriptor has changed.
We need to unregister them first to make sure addresses and reference counters
work properly.
Ani Sinha [Wed, 25 Feb 2026 03:49:13 +0000 (09:19 +0530)]
accel/kvm: notify when KVM VM file fd is about to be changed
Various subsystems might need to take some steps before the KVM file descriptor
for a virtual machine is changed. So a new boolean attribute is added to the
vmfd_notifier structure which is passed to the notifier callbacks.
vmfd_notifer.pre is true for pre-notification of vmfd change and false for
post notification. Notifier callback implementations can simply check
the boolean value for (vmfd_notifer*)->pre and can take actions for pre or
post vmfd change based on the value.
Subsequent patches will add callback implementations for specific components
that need this pre-notification.
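The pre/post notification flow can be sketched as below. This is a Python
model for illustration, not the QEMU notifier API: only the boolean 'pre'
field comes from the commit message, while VmfdEvent, NotifierList and the
clock handler (loosely modelled on the save/restore idea from the kvm/clock
patch in this series) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class VmfdEvent:
    pre: bool  # True before the VM fd changes, False afterwards


class NotifierList:
    def __init__(self):
        self._callbacks = []

    def add(self, callback):
        self._callbacks.append(callback)

    def notify(self, event):
        for callback in self._callbacks:
            callback(event)


def make_clock_handler(clock):
    """Return a callback that saves state on 'pre' and restores it on 'post'."""
    saved = {}

    def on_vmfd_change(event):
        if event.pre:
            saved["value"] = clock["value"]
        else:
            clock["value"] = saved["value"]

    return on_vmfd_change


def change_vmfd(notifiers, clock):
    notifiers.notify(VmfdEvent(pre=True))
    clock["value"] = 0  # stand-in: state tied to the old fd is lost
    notifiers.notify(VmfdEvent(pre=False))
```

A single callback handles both phases and branches on the flag, which is
the shape the commit message describes.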
Ani Sinha [Wed, 25 Feb 2026 03:49:12 +0000 (09:19 +0530)]
accel/kvm: add a notifier to indicate KVM VM file descriptor has changed
A notifier callback can be used by various subsystems to perform actions when
KVM file descriptor for a virtual machine changes as a part of confidential
guest reset process. This change adds this notifier mechanism. Subsequent
patches will add specific implementations for various notifier callbacks
corresponding to various subsystems that need to take action when KVM VM file
descriptor changes.
Ani Sinha [Wed, 25 Feb 2026 03:49:11 +0000 (09:19 +0530)]
accel/kvm: mark guest state as unprotected after vm file descriptor change
When the KVM VM file descriptor has changed and a new one created, the guest
state is no longer in protected state. Mark it as such.
The guest state becomes protected again when TDX and SEV-ES and SEV-SNP mark
it as such.
Ani Sinha [Wed, 25 Feb 2026 03:49:10 +0000 (09:19 +0530)]
accel/kvm: add changes required to support KVM VM file descriptor change
This change adds common kvm specific support to handle KVM VM file descriptor
change. KVM VM file descriptor can change as a part of confidential guest reset
mechanism. A new function api kvm_arch_on_vmfd_change() per
architecture platform is added in order to implement architecture specific
changes required to support it. A subsequent patch will add x86 specific
implementation for kvm_arch_on_vmfd_change() as currently only x86 supports
confidential guest reset.
Ani Sinha [Wed, 25 Feb 2026 03:49:09 +0000 (09:19 +0530)]
system/physmem: add helper to reattach existing memory after KVM VM fd change
After the guest KVM file descriptor has changed as a part of the process of
confidential guest reset mechanism, existing memory needs to be reattached to
the new file descriptor. This change adds a helper function ram_block_rebind()
for this purpose. The next patch will make use of this function.
Ani Sinha [Wed, 25 Feb 2026 03:49:08 +0000 (09:19 +0530)]
hw/accel: add a per-accelerator callback to change VM accelerator handle
When a confidential virtual machine is reset, a new guest context in the
accelerator must be generated post reset. Therefore, the old accelerator guest
file handle must be closed and a new one created. To this end, a per-accelerator
callback, "rebuild_guest" is introduced that would get called when a confidential
guest is reset. Subsequent patches will introduce specific implementation of
this callback for KVM accelerator.
Ani Sinha [Wed, 25 Feb 2026 03:49:07 +0000 (09:19 +0530)]
accel/kvm: add confidential class member to indicate guest rebuild capability
As a part of the confidential guest reset process, the existing encrypted guest
state must be made mutable since it would be discarded after reset. A new
encrypted and locked guest state must be established after the reset. To this
end, a new boolean member per confidential guest support class
(e.g., tdx or sev-snp) is added that will indicate whether it's possible to
rebuild guest state:
bool can_rebuild_guest_state;
This is true if rebuilding guest state is possible, false otherwise.
A KVM based confidential guest reset is only possible when
the existing state is locked but it's possible to rebuild guest state.
Otherwise, the guest is not resettable.
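The rule above reduces to a small predicate. The sketch below is Python for
illustration, not the QEMU code; the function name is invented, and
treating a guest whose state is not locked as resettable in the ordinary
way is an assumption of this example rather than something the commit
message states.

```python
def guest_is_resettable(state_locked: bool, can_rebuild_guest_state: bool) -> bool:
    """Decide whether a reset can proceed.

    Assumption: a guest whose state is not locked needs no rebuild and
    can be reset normally.
    """
    if not state_locked:
        return True
    # Locked (encrypted) state: reset only works if it can be rebuilt.
    return can_rebuild_guest_state
```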
Ani Sinha [Wed, 25 Feb 2026 03:49:06 +0000 (09:19 +0530)]
i386/kvm: avoid installing duplicate msr entries in msr_handlers
kvm_filter_msr() does not check if an msr entry is already present in the
msr_handlers table and installs a new handler unconditionally. If the function
is called again with the same MSR, it will result in duplicate entries in the
table and multiple such calls will fill up the table needlessly. Fix that.
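A sketch of the duplicate check, in Python for brevity. The table size,
the tuple layout and the function name are illustrative assumptions and do
not reflect the actual kvm_filter_msr() signature or the msr_handlers
layout.

```python
MAX_HANDLERS = 4  # illustrative table size, not QEMU's


def install_msr_handler(table, msr, handler):
    """Add (msr, handler) unless the MSR already has an entry.

    Returns True if the MSR is covered after the call, False if the
    table is full.
    """
    for existing_msr, _ in table:
        if existing_msr == msr:
            return True  # already installed; do not consume another slot
    if len(table) >= MAX_HANDLERS:
        return False
    table.append((msr, handler))
    return True
```

Without the loop at the top, calling the function once per reset with the
same MSR would eventually exhaust the fixed-size table.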
Alexander Graf [Wed, 25 Feb 2026 22:08:04 +0000 (22:08 +0000)]
hw/nitro: Enable direct kernel boot
Nitro Enclaves can only boot EIF files which are a combination of
kernel, initramfs and cmdline in a single file. When the kernel image is
not an EIF, treat it like a kernel image and assemble an EIF image on
the fly. This way, users can call QEMU with a direct
kernel/initrd/cmdline combination and everything "just works".
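The decision described above can be sketched as follows. This is a Python
illustration only: it assumes that an EIF file starts with the 4-byte
magic ".eif", and the function names and returned labels are made up for
the example.

```python
EIF_MAGIC = b".eif"  # assumed magic at the start of an EIF file


def is_eif(image: bytes) -> bool:
    """Return True if the image starts with the assumed EIF magic."""
    return image[:4] == EIF_MAGIC


def boot_plan(kernel_image: bytes) -> str:
    """Pick between booting an EIF as-is and wrapping a plain kernel."""
    if is_eif(kernel_image):
        return "boot-eif-as-is"
    return "assemble-eif-from-kernel-initrd-cmdline"
```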
Alexander Graf [Wed, 25 Feb 2026 22:08:03 +0000 (22:08 +0000)]
hw/core/eif: Move definitions to header
In follow-up patches we need some EIF file definitions that are
currently in the eif.c file, but want to access them from a separate
device. Move them into the header instead.
Alexander Graf [Wed, 25 Feb 2026 22:08:02 +0000 (22:08 +0000)]
hw/nitro: Add nitro machine
Add a machine model to spawn a Nitro Enclave. Unlike the existing -M
nitro-enclave, this machine model works exclusively with the -accel
nitro accelerator to drive real Nitro Enclave creation. It supports
memory allocation, CPU count selection, and both x86_64 and aarch64,
and implements the Enclave heartbeat logic and debug serial
console.
Alex Bennée [Thu, 26 Feb 2026 18:53:02 +0000 (18:53 +0000)]
tests/functional: add Arm VBSA uefi conformance test
The VBSA test is a subset of the wider Arm architecture compliance
suites (ACS), which validate that machines meet a particular minimum set of
requirements. The VBSA is for virtual machines, so it makes sense that we
should check the -M virt machine is compliant.
Fortunately there are prebuilt binaries published via github so all we
need to do is build an EFI partition and place things in the right
place.
There are some additional Linux based tests which are left for later.
Alex Bennée [Thu, 26 Feb 2026 18:53:00 +0000 (18:53 +0000)]
tests/vm: fix interactive boot
For reasons still not clear to me, passing the single-dashed
-interactive would confuse the argument parsing enough that we tried to
pass "nterative" as a string to the launch command, causing failure and
head scratching.
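One possible mechanism, shown here purely as an assumption about how such a
string could arise: with Python's argparse, a single-dash argument is
matched against short options, so if a short option "-i" that takes a
value exists, "-interactive" is read as "-i" with the attached value
"nteractive". The option names below are hypothetical and are not taken
from the tests/vm scripts.

```python
import argparse

parser = argparse.ArgumentParser()
# Hypothetical short option that takes a value.
parser.add_argument("-i", "--image")
parser.add_argument("--interactive", action="store_true")

# Single dash: parsed as "-i" followed by the attached value.
single = parser.parse_args(["-interactive"])

# Double dash: parsed as the intended boolean flag.
double = parser.parse_args(["--interactive"])
```

Here single.image holds the leftover characters and single.interactive
stays False, while the double-dash form sets the flag as intended.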
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-ID: <20260226185303.1920021-6-alex.bennee@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Alex Bennée [Thu, 26 Feb 2026 18:52:57 +0000 (18:52 +0000)]
tests/docker: migrate legacy-test-cross compilers to trixie
The bugs have evidently been fixed in the latest release so we can
migrate the laggards into our all-test-cross container and remove the
legacy hacks. They are also packaged for the main architectures so we
don't need to jump through the amd64 hoops.
Suggested-by: John Snow <jsnow@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20260226185303.1920021-3-alex.bennee@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
John Snow [Thu, 26 Feb 2026 18:52:56 +0000 (18:52 +0000)]
tests/docker: upgrade most non-lcitool debian tests to debian 13
Debian 11 was EOL in 2024, and Debian 12 will be EOL this June. This
patch moves all but one of our tests, debian-legacy-test-cross, onto
Debian 13.
This patch does the bare minimum to upgrade these tests and doesn't make
any attempt at optimization or cleanup that may or may not be possible
with this upgrade.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
[AJB: tweak summary line]
Message-ID: <20260226185303.1920021-2-alex.bennee@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Alexander Graf [Wed, 25 Feb 2026 22:08:01 +0000 (22:08 +0000)]
target/arm/cpu64: Allow -host for nitro
The nitro accel does not actually make use of CPU emulation or details:
It always uses the host CPU regardless of configuration. Machines for
the nitro accel select the host CPU type as default to have a clear
statement of the above and to have a unified cpu type across all
supported architectures.
The arm64 logic on Linux currently only allows -cpu host for KVM based
virtual machines. Add a special case for nitro so that when the nitro
accel is active, it allows use of the host cpu type.
Nitro Enclaves expect the parent instance to host a vsock heartbeat listener
at port 9000. To host a Nitro Enclave with the nitro accel in QEMU, add
such a heartbeat listener as device model, so that the machine can
easily instantiate it.
Nitro Enclaves support a special "debug" mode. When in debug mode, the
Nitro Hypervisor provides a vsock port that the parent can connect to in order to
receive serial console output of the Enclave. Add a new nitro-serial-vsock
driver that implements short-circuit logic to establish the vsock
connection to that port and feed its data into a chardev, so that a machine
model can use it as serial device.
Alexander Graf [Wed, 25 Feb 2026 22:07:58 +0000 (22:07 +0000)]
accel: Add Nitro Enclaves accelerator
Nitro Enclaves are a confidential compute technology which
allows a parent instance to carve out resources from itself
and spawn a confidential sibling VM next to itself. Similar
to other confidential compute solutions, this sibling is
controlled by an underlying vmm, but still has a higher level
vmm (QEMU) to implement some of its I/O functionality and
lifecycle.
Add an accelerator to drive this interface. In combination with
follow-on patches to enhance the Nitro Enclaves machine model, this
will allow users to run a Nitro Enclave using QEMU.
Alexander Graf [Wed, 25 Feb 2026 22:07:57 +0000 (22:07 +0000)]
hw/nitro: Add Nitro Vsock Bus
Add a dedicated bus for Nitro Enclave vsock devices. In Nitro Enclaves,
communication between parent and enclave/hypervisor happens almost
exclusively through vsock. The nitro-vsock-bus models this dependency
in QEMU, which allows devices in this bus to implement individual services
on top of vsock.
The nitro machine spawns this bus by creating the included
nitro-vsock-bridge sysbus device.
The nitro accel then advertises the Enclave's CID to the bus by calling
nitro_vsock_bridge_start_enclave() on the bridge device as soon as it
knows the CID.
Nitro vsock devices can listen to that event and learn the Enclave's CID
when it is available to perform actions, such as connect to the debug
serial vsock port.
Alexander Graf [Wed, 25 Feb 2026 22:07:56 +0000 (22:07 +0000)]
linux-headers: Add nitro_enclaves.h
QEMU is learning to drive the /dev/nitro_enclaves device node. Include
its UAPI header into our local copy of kernel headers so it has all
defines we need to drive it.
We want to enable QEMU to drive the /dev/nitro_enclaves device node. Add
its UAPI header into our kernel sync so we have all defines we need to
drive it.
Maxim Levitsky [Mon, 23 Feb 2026 22:19:08 +0000 (17:19 -0500)]
accel/kvm: Don't clear pending #SMI in kvm_get_vcpu_events
kvm_get_vcpu_events propagates the state of the pending SMI
from the kernel to cpu->interrupt_request, with the intention
of having an up-to-date migration state.
Later the opposite is done: kvm_put_vcpu_events restores the state
of the pending #SMI from 'cs->interrupt_request'.
The only problem is that kvm_get_vcpu_events also resets the SMI
in cpu->interrupt_request when there is no pending #SMI indicated by the kernel,
and that is wrong, as the SMI might still be raised by qemu.
While at it, also fix a similar but more theoretical bug with regard to a
latched #INIT while in SMM.
A simple reproducer for this bug is to read an EFI variable in a loop
from within a guest, while at the same time run 'info registers' on
the qemu HMP monitor.
The reads will, once in a while, fail with an 'Invalid argument' error.
Mohamed Mediouni [Mon, 23 Feb 2026 23:39:41 +0000 (00:39 +0100)]
target/i386: emulate, hvf, mshv: rework MMU code
target/i386/emulate doesn't currently properly emulate instructions
which might cause a page fault during their execution. Notably, REP STOS/MOVS
from MMIO to an address which is unmapped until a page fault exception is raised
causes an abort() in vmx_write_mem.
Change the interface between the HW accel backend and target/i386/emulate as a first step towards addressing that.
Adapt the page table walker code to give actionable errors,
while leaving a possibility for backends to provide their own walker.
This removes the usage of the Hyper-V page walker in the mshv backend.