git.ipfire.org Git - thirdparty/qemu.git/log

bsd-user: Add target_semid_ds and target_msqid_ds structures

Add the target ABI definitions for System V semaphore and message queue
data structures, needed for semctl() and msgctl() syscall emulation.

Signed-off-by: Stacey Son <sson@FreeBSD.org>
Signed-off-by: Mikael Urankar <mikael.urankar@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Warner Losh <imp@bsdimp.com>

common-user: Drop __linux__ around .note.GNU-stack

GNU-stack tagging is a toolchain issue, not an OS issue. All the
toolchains require this for ELF.

Signed-off-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>

bsd-user: Remove NetBSD-specific code

Remove the NetBSD-specific code from bsd-user. It has not been maintained
in any meaningful way since it was introduced to the tree in 2008. It
hasn't been connected to the build since 2021, and the last time I tried
to mock up the meson support it needed (in 2023), it failed to build.
While there was some out-of-tree work, I've not been able to connect with
that code.

Cc: Reinoud Zandijk <reinoud@netbsd.org>
Cc: Ryo ONODERA <ryoon@netbsd.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Warner Losh <imp@bsdimp.com>

bsd-user: Remove OpenBSD-specific code

Remove the OpenBSD-specific code from bsd-user. It has not been maintained
in any meaningful way since it was introduced to the tree in 2008. It
hasn't been connected to the build since 2021, and the last time I tried
to mock up the meson support it needed (in 2023), it failed to build. I
contacted the OpenBSD people in 2018, it appears, and even at that time
they thought this code was not at all useful to them.

Cc: Brad Smith <brad@comstyle.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Warner Losh <imp@bsdimp.com>

freebsd: FreeBSD 15 has native inotify

Check to make sure that we have inotify in libc, before looking for it
in libinotify.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Daniel P. Berrange <berrange@redhat.com>
Cc: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Warner Losh <imp@bsdimp.com>
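
The probe order described in the inotify commit above can be sketched in Python. This only illustrates the ordering (libc first, libinotify second) and is not the actual meson logic; `pick_inotify_provider` and the `has_symbol` callback are hypothetical names introduced for the example.

```python
from typing import Callable, Optional


def pick_inotify_provider(has_symbol: Callable[[str, str], bool]) -> Optional[str]:
    """Return the library that should provide inotify, or None if neither does.

    has_symbol(lib, symbol) is a hypothetical probe supplied by the caller;
    a real build system would run a link test instead.
    """
    # Prefer libc: per the commit message, FreeBSD 15 has native inotify.
    if has_symbol("c", "inotify_init"):
        return "c"
    # Otherwise fall back to the separate libinotify library.
    if has_symbol("inotify", "inotify_init"):
        return "inotify"
    return None
```

Checking libc first means a system that has both does not pick up an unnecessary dependency on libinotify.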

target/i386: emulate: fix scas

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260228214704.19048-9-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

whpx: i386: expose HV_X64_MSR_APIC_FREQUENCY when kernel-irqchip=off

Now that we expose AccessFrequencyRegs, expose HV_X64_MSR_APIC_FREQUENCY as well for the case when the Hyper-V LAPIC is not used.

If the Hyper-V LAPIC is used, this will be handled by the hypervisor instead of the VMM, hence gating it on !whpx_irqchip_in_kernel().

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260228214704.19048-8-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

whpx: i386: enable PMU

Also a partition property instead of a CPU one...

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260228214704.19048-7-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: emulate: more 64-bit register handling

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260228214704.19048-6-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

whpx: i386: warn on unsupported MSR access instead of failing silently

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260228214704.19048-5-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

whpx: i386: enable synthetic processor features

At the point in time at which we set up the partition, the vCPUs
aren't available yet.

So enable them by default for now, as the MSHV backend does.

AccessFrequencyRegs is shared by both the LAPIC frequency reporting and the TSC frequency.

To keep the benefit of fixed TSC frequency reporting when kernel-irqchip=off, enable AccessFrequencyRegs in that case too.

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260228214704.19048-4-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

whpx: i386: enable all supported host features

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260228214704.19048-3-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

whpx: i386: move whpx_vcpu_kick_out_of_hlt() invocation to interrupt raise time

This fixes the sti followed by hlt kvm_unit_tests.

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260228214704.19048-2-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: introduce ClearwaterForest-v3 to expose ITS_NO

Expose ITS_NO by default, as users using Clearwater Forest and higher
CPU models would not be able to live migrate to lower CPU hosts due to
missing features. In that case, they would not be vulnerable to ITS.

its-no was originally added on [1], but needs to be exposed on the
individual CPU models for the guests to see by default.

Note: Version 1 already exposes ARCH_CAP_BHI_NO, which would already
mark the CPU as invulnerable to ITS (at least in Linux); however,
expose ITS_NO for completeness.

[1] 74978391b2da ("target/i386: Make ITS_NO available to guests")

Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Jon Kohler <jon@nutanix.com>
Link: https://lore.kernel.org/r/20251106174626.49930-6-jon@nutanix.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: introduce SierraForest-v5 to expose ITS_NO

Expose ITS_NO by default, as users using Sierra Forest and higher
CPU models would not be able to live migrate to lower CPU hosts due to
missing features. In that case, they would not be vulnerable to ITS.

its-no was originally added on [1], but needs to be exposed on the
individual CPU models for the guests to see by default.

Note: For SRF, version 2 already exposed BHI_CTRL, which would already
mark the CPU as invulnerable to ITS (at least in Linux); however,
expose ITS_NO for completeness.

[1] 74978391b2da ("target/i386: Make ITS_NO available to guests")

Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Jon Kohler <jon@nutanix.com>
Link: https://lore.kernel.org/r/20251106174626.49930-5-jon@nutanix.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: introduce GraniteRapids-v5 to expose ITS_NO

Expose ITS_NO by default, as users using Granite Rapids and higher
CPU models would not be able to live migrate to lower CPU hosts due to
missing features. In that case, they would not be vulnerable to ITS.

its-no was originally added on [1], but needs to be exposed on the
individual CPU models for the guests to see by default.

[1] 74978391b2da ("target/i386: Make ITS_NO available to guests")

Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Jon Kohler <jon@nutanix.com>
Link: https://lore.kernel.org/r/20251106174626.49930-4-jon@nutanix.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: introduce SapphireRapids-v6 to expose ITS_NO

Expose ITS_NO by default, as users using Sapphire Rapids and higher
CPU models would not be able to live migrate to lower CPU hosts due to
missing features. In that case, they would not be vulnerable to ITS.

its-no was originally added on [1], but needs to be exposed on the
individual CPU models for the guests to see by default.

[1] 74978391b2da ("target/i386: Make ITS_NO available to guests")

Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Jon Kohler <jon@nutanix.com>
Link: https://lore.kernel.org/r/20251106174626.49930-3-jon@nutanix.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: Add MSR_IA32_ARCH_CAPABILITIES ITS_NO

Add bit definition for Indirect Target Selection (ITS_NO) bit 62, to
allow ITS_NO to be added directly to a CPU model in the future.

Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Jon Kohler <jon@nutanix.com>
Link: https://lore.kernel.org/r/20251106174626.49930-2-jon@nutanix.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
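
As a small illustration of the bit definition in the ITS_NO commit above, the following Python sketch tests bit 62 of an IA32_ARCH_CAPABILITIES value. The bit position comes from the commit message; the constant and function names are illustrative and are not claimed to match the identifiers used in the tree.

```python
# Bit position taken from the commit message; the name is illustrative only.
ARCH_CAP_ITS_NO = 1 << 62


def has_its_no(arch_capabilities: int) -> bool:
    """Return True if the ITS_NO bit is set in an IA32_ARCH_CAPABILITIES value."""
    return bool(arch_capabilities & ARCH_CAP_ITS_NO)
```

For example, `has_its_no(1 << 62)` is true, while a value with only lower bits set is reported as not advertising ITS_NO.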

target/i386: Add VMX_SECONDARY_EXEC_MODE_BASED_EPT_EXEC

Enumerate ability to enable Intel Mode-Based Execute Control (MBEC)
on secondary execution control bit 22.

Intel MBEC is a hardware feature, introduced in the Kabylake
generation, that allows for more granular control over execution
permissions. MBEC enables the separation and tracking of execution
permissions for supervisor (kernel) and user-mode code. It is used as
an accelerator for Microsoft's Memory Integrity [1] (also known as
hypervisor-protected code integrity or HVCI).

[1] https://learn.microsoft.com/en-us/windows/security/hardware-security/enable-virtualization-based-protection-of-code-integrity

Code is mirrored here:
https://github.com/JonKohler/linux/tree/mbec-v1-6.18
https://github.com/JonKohler/kvm-unit-tests/tree/mbec-v1

LKML thread(s) are here:
Original RFC: https://lore.kernel.org/all/20250313203702.575156-1-jon@nutanix.com/
V1 code: https://lore.kernel.org/all/20251223054806.1611168-1-jon@nutanix.com/
KVM unit test changes: https://lore.kernel.org/all/20251223054850.1611618-1-jon@nutanix.com/

Cc: Xiaoyao Li <xiaoyao.li@intel.com>
Cc: Zhao Liu <zhao1.liu@intel.com>
Co-authored-by: Jon Kohler <jon@nutanix.com>
Co-authored-by: Aditya Desai <aditya.desai@nutanix.com>
Signed-off-by: Jon Kohler <jon@nutanix.com>
Link: https://lore.kernel.org/r/20251223060834.1618428-1-jon@nutanix.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Merge tag 'pull-11.0-testing-updates-270226-2' of https://gitlab.com/stsquad/qemu into staging

testing updates (vm, docker, arm functional)

  - migrate non-lcitool Debian containers to Trixie (13)
  - remove legacy-test-cross hacks
  - fix some minor make vm- Makefile issues
  - bump OpenBSD to 7.8
  - add VBSA EFI functional test for Arm

# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmmhlHYACgkQ+9DbCVqe
# KkQVsQf+Mnow3ceQ4Tx9ovnn18SyS6+hXzBqUabd2aV4ybnkcwXAsY1XnArINdQP
# FuaJNgQalcaQYF9iCgZpSE0hcdk8Zt2lISZOPOAMvZ5zvia+fT2FoqKQYevIK/Oq
# 1A8g96yZW33EPi4SemEgnmXoKl8a0/HDqD7AT/L0JJuDWuldplRC2vJHoyT0tnC0
# qgTmENOGJsxbLJnpu9y2PyHpTgeRw7TdHwjN56c8Q0RjIptUFHotU47xYvMAbcdw
# E6mbeppnrOlF+0kNBy+jtAg1Fleh7JGIYbwyRh4QctRxIgTgvBJn/ej9BuljI230
# 8bCPaG6X5Ijrru66mTSQNyrsMAu4tA==
# =0Rb8
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri Feb 27 12:56:22 2026 GMT
# gpg:                using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [full]
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8  DF35 FBD0 DB09 5A9E 2A44

* tag 'pull-11.0-testing-updates-270226-2' of https://gitlab.com/stsquad/qemu:
  tests/functional: add Arm VBSA uefi conformance test
  tests/vm: build openbsd from lcitool data
  tests/vm: fix interactive boot
  tests/vm: remove unused import
  tests/vm: bump OpenBSD to the current 7.8 release
  tests/docker: migrate legacy-test-cross compilers to trixie
  tests/docker: upgrade most non-lcitool debian tests to debian 13

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge tag 'pull-9p-20260228' of https://github.com/cschoenebeck/qemu into staging

9pfs changes:

* Fix crash under unlink-heavy load in v9fs_mark_fids_unreclaim().

* Fix crash with the synth fs driver.

# -----BEGIN PGP SIGNATURE-----
#
# iQJLBAABCgA1FiEEltjREM96+AhPiFkBNMK1h2Wkc5UFAmmi7dIXHHFlbXVfb3Nz
# QGNydWRlYnl0ZS5jb20ACgkQNMK1h2Wkc5Uitw//SvhQlsjdtoJjsQYNpYN7gbkC
# PwU3b88xP1iPEhZ7ZWoQYdd0bJEdb4CmfG/iKlz8jwCvY2VnX4xkfU15GYLzEtOR
# PU13dBxXPLmhRFFO6TzLm8kFwa8LkZB/Hm8cGCwBshsGwIgQAE9RNoE/LyQb6C7B
# ocgdNFXkSvaqnWY6m3PjQb59IZ1Smg5P7GMvoIzqeiVYJ6PvsLvW9V/w3UrT5dhB
# hFqK8eUgYV5Wgx1qUeYqr8O9425tdGf9cO7w1eIl/YKTQkV3wbkvY8LlARYeLU6P
# bg2eVqj3c+dDVpO0+VSNUNutV5STHYP6Ub/WEZBP92MIOx8VIvus62/zKR2Hq2uY
# e0qpC3lCvKGxzxH54GGYzUfKonLi3uv5tLMfB/EPcLQd4bH0bf23a/F5gesYvzwZ
# N8WeHxV/cNpxxkM6lIDTdSIoxtj8HXLsZxkSJ8bOPpcXd7JPfIQATfYKVvVO7AzB
# JHsGSwHZ4xKw2KuDtN6xsalf48kVi8VZpcmgmCCgFN/m0ubQTcRiIXoZ3c8j9xp/
# UqrmcpBX5uU4t0CDEm0RBwyHVey7Gv0xFg8VKfIdWczdIcGNh/VCzp8rm+zcl6FB
# XFkA7O2q/qIPNpj0JNaBekKSLvDtqjgR0rOHh0iJhhzdQWIVkbKd0OvMtmpgKlbk
# vdAulpGAJftqpoe6/zo=
# =pZuI
# -----END PGP SIGNATURE-----
# gpg: Signature made Sat Feb 28 13:29:54 2026 GMT
# gpg:                using RSA key 96D8D110CF7AF8084F88590134C2B58765A47395
# gpg:                issuer "qemu_oss@crudebyte.com"
# gpg: Good signature from "Christian Schoenebeck <qemu_oss@crudebyte.com>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: ECAB 1A45 4014 1413 BA38  4926 30DB 47C3 A012 D5F4
#      Subkey fingerprint: 96D8 D110 CF7A F808 4F88  5901 34C2 B587 65A4 7395

* tag 'pull-9p-20260228' of https://github.com/cschoenebeck/qemu:
  hw/9pfs: fix missing EOPNOTSUPP on Twstat and Trenameat for fs synth driver
  hw/9pfs: fix data race in v9fs_mark_fids_unreclaim()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

hw/9pfs: fix missing EOPNOTSUPP on Twstat and Trenameat for fs synth driver

Renaming files/dirs is only supported by path-based fs drivers. EOPNOTSUPP
should be returned on any renaming attempt for non-path-based fs drivers.
This was already the case for the 9p "Trename" request type. However, for
the 9p request types "Trenameat" and "Twstat" this check was still missing.

So fix this by checking in the Twstat and Trenameat request handlers whether
the fs driver in use is really path-based; if not, return EOPNOTSUPP and
abort further handling of the request.

This fixes a crash with the 9p "synth" fs driver which is not path-based.

The crash happened because the synth driver stores and expects a raw
V9fsSynthNode pointer instead of a C-string in V9fsPath.data. So the
C-string delivered by the 9p server to the synth fs driver was incorrectly
cast to a V9fsSynthNode pointer, eventually causing a segfault.

Reported-by: Oliver Chang <ochang@google.com>
Fixes: https://issues.oss-fuzz.com/issues/477990727
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3298
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Link: https://lore.kernel.org/qemu-devel/E1vrbaP-000Gqb-B3@kylie.crudebyte.com/
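
The guard described in the Twstat/Trenameat commit above can be modelled roughly as follows. This is a sketch of the control flow only, written in Python rather than the C of the actual handlers; `FsDriver`, its `path_based` flag and `handle_rename_request` are hypothetical stand-ins.

```python
import errno
from dataclasses import dataclass
from typing import Callable


@dataclass
class FsDriver:
    name: str
    path_based: bool  # hypothetical flag standing in for the driver capability check


def handle_rename_request(driver: FsDriver, do_rename: Callable[[], None]) -> int:
    """Model of a rename-type request handler: returns 0 or a negative errno."""
    if not driver.path_based:
        # Abort before the driver ever interprets path data as a C-string.
        return -errno.EOPNOTSUPP
    do_rename()
    return 0
```

The point of the early return is that a driver which is not path-based never sees the rename at all, so it cannot misinterpret the path payload.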

hw/9pfs: fix data race in v9fs_mark_fids_unreclaim()

A data race between v9fs_mark_fids_unreclaim() and v9fs_path_copy()
causes an inconsistent read of fidp->path. In v9fs_path_copy(), the
path size is set before the data pointer is allocated, creating a
window where size is non-zero but data is NULL.

v9fs_co_open2() holds a write lock during path modifications,
but v9fs_mark_fids_unreclaim() was not acquiring a read
lock, allowing it to race.

Fix by holding the path read lock during FID table iteration.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3300
Signed-off-by: Richie Buturla <richie@linux.ibm.com>
Link: https://lore.kernel.org/qemu-devel/20260211154450.254338-1-richie@linux.ibm.com/
Fixes: 7a46274529 ("hw/9pfs: Add file descriptor reclaim support")
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
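
The window described in the data race commit above, and the effect of taking the lock on the reader side, can be illustrated with a small Python model. All names here are hypothetical; a plain `threading.Lock` stands in for the read/write lock, and `mid_copy_hook` exists only to expose the intermediate state deterministically.

```python
import threading
from typing import Callable, Optional, Tuple


class Path:
    def __init__(self) -> None:
        self.size = 0
        self.data: Optional[bytes] = None


class Fid:
    def __init__(self) -> None:
        self.path = Path()
        # Stand-in for the path lock; the real code uses a read/write lock.
        self.path_lock = threading.Lock()


def path_copy(fid: Fid, src: bytes,
              mid_copy_hook: Optional[Callable[[], None]] = None) -> None:
    """Copy src into fid.path under the lock, size first and data second."""
    with fid.path_lock:
        fid.path.size = len(src)
        if mid_copy_hook is not None:
            # At this point size is non-zero but data is still None.
            mid_copy_hook()
        fid.path.data = bytes(src)


def read_path_locked(fid: Fid) -> Tuple[int, Optional[bytes]]:
    """Reader that takes the lock, so it never observes the half-updated state."""
    with fid.path_lock:
        return fid.path.size, fid.path.data
```

An unlocked peek from inside the hook sees a non-zero size with no data, which is the inconsistent read the commit describes; a reader that takes the lock can only run before or after the whole update.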

Reapply "rcu: Unify force quiescent state"

This reverts commit ddb4d9d1748681cfde824d765af6cda4334fcce3.

The commit says:
> This reverts commit 55d98e3edeeb17dd8445db27605d2b34f4c3ba85.
>
> The commit introduced a regression in the replay functional test
> on alpha (tests/functional/alpha/test_replay.py), that causes CI
> failures regularly. Thus revert this change until someone has
> figured out what is going wrong here.

Reapply the change as alpha is fixed.

Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20260217-alpha-v1-2-0dcc708c9db3@rsg.ci.i.u-tokyo.ac.jp
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/alpha: Reset CPU

alpha_cpu_realizefn() did not properly call cpu_reset(), which
corrupted icount. Add the missing function call to fix icount.

Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Tested-by: Thomas Huth <thuth@redhat.com>
Link: https://lore.kernel.org/r/20260217-alpha-v1-1-0dcc708c9db3@rsg.ci.i.u-tokyo.ac.jp
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

hw: i386: vapic: enable on WHPX with user-mode irqchip

Alleviate a performance bottleneck on legacy Windows guests.

In my test setup, this makes Windows XP boot about 20x faster
than it otherwise does.

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260226181930.53170-4-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

whpx: x86: kick out of HLT manually when using the kernel-irqchip

Otherwise, interrupts processed through the cancel vCPU and inject path will not cause the vCPU to go out of its halt state.

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260226181930.53170-3-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

whpx: x86: remove inaccurate comment

WHvRunVpExitReasonX64Halt _is_ triggered on halt with kernel-irqchip=off as of Windows 11 version 25H2.

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260226181930.53170-2-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: i386: Default disable ignore guest PAT quirk

Add a new accelerator option that allows the guest to adjust the PAT.
This is already the case for TDX guests and allows using virtio-gpu
Venus with RADV or NVIDIA drivers.

The quirk is disabled by default. Since this caused problems with
Linux's Bochs video device driver, add a knob to leave it enabled,
and for now do not enable it by default.

Signed-off-by: Myrsky Lintu <qemu.haziness801@passinbox.com>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2943
Link: https://lore.kernel.org/r/175527721636.15451.4393515241478547957-1@git.sr.ht
[Add property; for now leave it off by default. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust: use checked_div to make clippy happy

When upgrading from Fedora 41 to Fedora 43 for CI tests, clippy began
complaining that divisors are checked manually instead of using
checked_div. Make clippy happy and use checked_div() instead.

Signed-off-by: John Snow <jsnow@redhat.com>
Link: https://lore.kernel.org/r/20260219185409.708130-2-jsnow@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
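
For readers unfamiliar with the idiom in the clippy commit above: Rust's `checked_div` returns `None` instead of panicking when the divisor is zero. A rough Python analogue is sketched below. It is only an analogy, since Python integers do not overflow and `//` rounds towards negative infinity rather than towards zero, so the comparison holds for non-negative operands.

```python
from typing import Optional


def checked_div(dividend: int, divisor: int) -> Optional[int]:
    """Return dividend // divisor, or None when divisor is zero."""
    if divisor == 0:
        return None
    return dividend // divisor
```

Folding the zero check into the return value keeps the check and the division in one place, which is the style the lint asks for.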

ui: drop spice-protocol < 0.14.3 support

According to repology, all our supported distributions have 0.14.3.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Link: https://lore.kernel.org/r/20260211-cleanups-v1-7-e63c96572389@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

audio: require spice >= 0.15

Spice server 0.15.0 was released on 2021-04-16. It is part of all our
supported distros (except CentOS 9, which doesn't include it).

It has all the new required audio APIs/interfaces.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Link: https://lore.kernel.org/r/20260211-cleanups-v1-5-e63c96572389@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

audio: require pulse >= 0.9.13

pulseaudio 0.9.13 was released on 2009-09-10. All our supported
distros have it.

PA_*_IS_GOOD are from 0.9.11.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Link: https://lore.kernel.org/r/20260211-cleanups-v1-4-e63c96572389@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

scripts/vendor.py: add pycotap

Related to commit 5ec1eec11000ef118b2a87c330245ffaa475f5ee ("python:
Install pycotap in our venv if necessary")

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Link: https://lore.kernel.org/r/20260211-cleanups-v1-3-e63c96572389@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

python/wheel: remove meson-1.9.0

Leftover from commit 8c04b6a48b15a478ff3f9d152592a0ba503a31e4.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Link: https://lore.kernel.org/r/20260211-cleanups-v1-2-e63c96572389@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

audio: fix nominal volume channel (cosmetic)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Link: https://lore.kernel.org/r/20260211-cleanups-v1-1-e63c96572389@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

qom: add 'confidential-guest-reset' property for x86 confidential vms

Through the new 'confidential-guest-reset' property, the control plane can
detect whether the hypervisor supports x86 confidential guest resets. Older
hypervisors that do not support resets will not have this property populated.

Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-35-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

tests/functional/x86_64: add functional test to exercise vm fd change on reset

Add a functional test that exercises the code paths for closing the old
KVM VM file descriptor and opening a new one upon VM reset. This normally
happens when confidential guests are reset; for non-confidential guests,
the test uses the machine-specific debug/test parameter
'x-change-vmfd-on-reset' to enable this behavior.
The code changes specific to re-initialisation of the SEV-ES, SEV-SNP and
TDX platforms are not exercised by this test, as they require hardware
that supports running confidential guests.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-34-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

hw/machine: introduce machine specific option 'x-change-vmfd-on-reset'

A new machine-specific option 'x-change-vmfd-on-reset' is introduced for
debugging and testing only (hence the 'x-' prefix). When enabled, this option
forces the KVM VM file descriptor to be changed upon guest reset, as happens
for confidential guests. This can be used to exercise the code
changes that are specific to confidential guests on non-confidential
guests as well (except changes that require hardware support for
confidential guests).
The next patch adds a functional test that uses this new
parameter to test the VM file descriptor changes.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-33-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

kvm/clock: add support for confidential guest reset

Confidential guests change the KVM VM file descriptor upon reset and also create
new VCPU file descriptors against the new KVM VM file descriptor. We need to
save the clock state from KVM before the KVM VM file descriptor changes and
restore it afterwards. Also, after the VCPU file descriptors have changed, we
must call KVM_KVMCLOCK_CTRL on the VCPU file descriptor to inform KVM that the
VCPU is in the paused state.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-32-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

kvm/vcpu: add notifiers to inform vcpu file descriptor change

When new vcpu file descriptors are created and bound to the new kvm file
descriptor as a part of the confidential guest reset mechanism, various
subsystems need to know about it. This change adds notifiers so that these
subsystems can take appropriate action when vcpu fds change, by registering
their handlers with this notifier.
Subsequent changes will register specific handlers with this notifier.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-31-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
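
The notifier pattern that the vcpu fd commit above relies on can be sketched in a few lines of Python. This is a generic model of "register a handler, call every handler when the fd changes", not the tree's notifier API; `NotifierList`, `add` and `notify` are names chosen for the example.

```python
from typing import Callable, List


class NotifierList:
    """Minimal model of a list of callbacks fired on a file descriptor change."""

    def __init__(self) -> None:
        self._handlers: List[Callable[[int], None]] = []

    def add(self, handler: Callable[[int], None]) -> None:
        # Subsystems register once and are then called on every change.
        self._handlers.append(handler)

    def notify(self, new_fd: int) -> None:
        # Iterate over a copy so a handler may register further handlers safely.
        for handler in list(self._handlers):
            handler(new_fd)
```

Handlers run in registration order here. The code performing the reset only needs to call `notify` once with the new descriptor and does not need to know which subsystems are listening.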

ppc/openpic: create a new openpic device and reattach mem region on coco reset

For confidential guests during the reset process, the old KVM VM file
descriptor is closed and a new one is created. When a new file descriptor is
created, a new openpic device needs to be created against this new KVM VM file
descriptor as well. Additionally, existing memory region needs to be reattached
to this new openpic device and proper CPU attributes set associating new file
descriptor. This change makes this happen with the help of a callback handler
that gets called when the KVM VM file descriptor changes as a part of the
confidential guest reset process.

Reviewed-by: Bernhard Beschow <shentey@gmail.com>
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-30-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

kvm/xen-emu: re-initialize capabilities during confidential guest reset

On confidential guests, the KVM virtual machine file descriptor changes as a
part of the guest reset process. Xen capabilities need to be re-initialized in
KVM against the new file descriptor.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-29-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

hw/hyperv/vmbus: add support for confidential guest reset

On confidential guests, when the KVM virtual machine file descriptor changes as
a part of the reset process, event file descriptors need to be reassociated
with the new KVM VM file descriptor. This is achieved with the help of a
callback handler that gets called when the KVM VM file descriptor changes during
the confidential guest reset process.

This patch is tested on non-confidential platform only.

Acked-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-28-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

kvm/hyperv: add synic feature to CPU only if its not enabled

We need to make sure that synic CPU feature is not already enabled. If it is,
trying to enable it again will result in the following assertion:

Unexpected error in object_property_try_add() at ../qom/object.c:1268:
qemu-system-x86_64: attempt to add duplicate property 'synic' to object (type 'host-x86_64-cpu')

So enable synic only if it's not enabled already.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-27-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
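
The failure mode and the guard from the synic commit above can be modelled as follows. The class below is a toy stand-in for an object with named properties, not the QOM API; `ToyObject`, `has_property`, `add_property` and `ensure_synic` are hypothetical names.

```python
class ToyObject:
    """Toy object whose properties may only be added once, like the error quoted above."""

    def __init__(self) -> None:
        self._props = {}

    def has_property(self, name: str) -> bool:
        return name in self._props

    def add_property(self, name: str, value: object) -> None:
        if name in self._props:
            raise KeyError(f"attempt to add duplicate property '{name}'")
        self._props[name] = value


def ensure_synic(cpu: ToyObject) -> bool:
    """Add the 'synic' property only if absent; return True if it was added."""
    if cpu.has_property("synic"):
        return False
    cpu.add_property("synic", True)
    return True
```

Because the enabling code can run again on every reset, making it idempotent in this way avoids the duplicate-property error on the second pass.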

kvm/i8254: add support for confidential guest reset

A confidential guest reset involves closing the old virtual machine KVM file
descriptor and opening a new one. Since it's a new KVM fd, the PIT needs to be
re-initialized. This is done with the help of a notifier which is invoked
upon KVM VM file descriptor change during the confidential guest reset process.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-26-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

kvm/i8254: refactor pit initialization into a helper

The initialization code will be used again by VM file descriptor change
notifier callback in a subsequent change. So refactor common code into a new
helper function.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-25-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

hw/vfio: generate new file fd for pseudo device and rebind existing descriptors

Normally the vfio pseudo device file descriptor lives for the life of the VM.
However, when the kvm VM file descriptor changes, a new file descriptor
for the pseudo device needs to be generated against the new kvm VM descriptor.
Other existing vfio descriptors need to be reattached to the new pseudo device
descriptor. This change performs the above steps.

Tested-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260227072445.406907-1-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/sev: add support for confidential guest reset

When the KVM VM file descriptor changes as a part of the confidential guest
reset mechanism, it is necessary to create a new confidential guest context and
re-encrypt the VM memory. This happens for SEV-ES and SEV-SNP virtual machines
as a part of the SEV_LAUNCH_FINISH and SEV_SNP_LAUNCH_FINISH operations.

A new resettable interface for SEV module has been added. A new reset callback
for the reset 'exit' state has been implemented to perform the above operations
when the VM file descriptor has changed during VM reset.

Tracepoints have also been added for tracing purposes.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-23-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/sev: free existing launch update data and kernel hashes data on init

If there is existing launch update data and kernel hashes data, they need to be
freed when initialization code is executed. This is important for resettable
confidential guests where the initialization happens once every reset.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-22-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/sev: add notifiers only once

The various notifiers that are used need to be installed only once, not on
every initialization. This includes the vm state change notifier and others.
This change uses the 'cgs->ready' flag to install the notifiers only once,
on the first initialization.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-21-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
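
The "install only on first initialization" idea from the commit above can be sketched like this. It is a simplified model, assuming a boolean `ready` flag that is false before the first initialization and true afterwards; exactly how the real code sets and reads 'cgs->ready' is not shown in the commit message, and all names below are hypothetical.

```python
from typing import Callable, List


class GuestSupport:
    """Toy stand-in for the confidential guest support object."""

    def __init__(self) -> None:
        self.ready = False
        self.notifiers: List[Callable[[], None]] = []


def guest_init(cgs: GuestSupport, notifier: Callable[[], None]) -> None:
    """Initialization that may run on every reset but installs notifiers once."""
    if not cgs.ready:
        # First initialization only: register long-lived notifiers.
        cgs.notifiers.append(notifier)
    # Per-reset initialization work would go here.
    cgs.ready = True
```

Calling `guest_init` on each reset then leaves exactly one copy of the notifier installed, however many times the guest is reset.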

i386/sev: add migration blockers only once

sev_launch_finish() and sev_snp_launch_finish() could be called multiple times
when the confidential guest is being reset/rebooted. The migration
blockers should not be added multiple times, once per invocation. This change
makes sure that the migration blockers are added only one time by adding the
migration blockers to the vm state change handler when the vm transitions to
the running state. Subsequent reboots do not change the state of the vm.

Reviewed-by: Prasad Pandit <pjp@fedoraproject.org>
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-20-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/tdx: add a pre-vmfd change notifier to reset tdx state

During reset, when the VM file descriptor is changed, the TDX state needs to be
re-initialized. A notifier callback is implemented to reset the old
state and free memory before the new state is initialized post VM file
descriptor change.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-19-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/tdx: finalize TDX guest state upon reset

When the confidential virtual machine KVM file descriptor changes due to the
guest reset, some TDX-specific setup steps need to be done again. This
includes finalizing the initial guest launch state again. This change
re-executes some parts of the TDX setup during the device reset phase using a
resettable interface. This finalizes the guest launch state again and locks
it in. The machine done notifier that was previously used is no longer needed,
as the same code is now executed as a part of VM reset.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-18-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/tdx: refactor TDX firmware memory initialization code into a new function

A new helper function is introduced that refactors all firmware memory
initialization code into a separate function. No functional change.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-17-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

accel/kvm: rebind current VCPUs to the new KVM VM file descriptor upon reset

Confidential guests need to generate a new KVM file descriptor upon virtual
machine reset. Existing VCPUs need to be reattached to this new
KVM VM file descriptor. As a part of this, new VCPU file descriptors against
this new KVM VM file descriptor need to be created and re-initialized.
Resources allocated against the old VCPU fds need to be released. This change
makes this happen.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-16-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

kvm/i386: reload firmware for confidential guest reset

When IGVM is not being used by the confidential guest, the guest firmware has
to be reloaded explicitly again into memory. This is because the memory into
which the firmware was loaded before reset was encrypted and is thus lost
upon reset. When IGVM is used, it is expected that the IGVM will contain the
guest firmware and the execution of the IGVM directives will set up the guest
firmware memory.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-15-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

hw/i386: export a new function x86_bios_rom_reload

Confidential guests must reload their bios rom upon reset. This is because
bios memory is encrypted and upon reset, the contents of the old bios memory
are lost and cannot be re-used. To this end, export a new x86 function
x86_bios_rom_reload() to reload the bios again. This function will be used in
the subsequent patches.

Reviewed-by: Bernhard Beschow <shentey@gmail.com>
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-14-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

hw/i386: refactor x86_bios_rom_init for reuse in confidential guest reset

For confidential guests, the bios image must be reinitialized upon reset. This
is because bios memory is encrypted and hence once the old confidential
kvm context is destroyed, it cannot be decrypted. It needs to be reinitialized.
Towards that, this change refactors x86_bios_rom_init() code so that
parts of it can be called during confidential guest reset.
No functional change.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-13-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/kvm: refactor xen init into a new function

Cosmetic - no new functionality added. Xen initialisation code is refactored
into its own function.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-12-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

kvm/i386: implement architecture support for kvm file descriptor change

When the kvm file descriptor changes as a part of confidential guest reset,
some architecture specific setups including SEV/SEV-SNP/TDX specific setups
need to be redone. These changes are implemented as a part of the
kvm_arch_on_vmfd_change() callback which was introduced previously.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-11-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/kvm: unregister smram listeners prior to vm file descriptor change

We will re-register smram listeners after the VM file descriptor has changed.
We need to unregister them first to make sure addresses and reference counters
work properly.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-10-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

accel/kvm: notify when KVM VM file fd is about to be changed

Various subsystems might need to take some steps before the KVM file descriptor
for a virtual machine is changed. So a new boolean attribute is added to the
vmfd_notifier structure which is passed to the notifier callbacks.
vmfd_notifier.pre is true for pre-notification of a vmfd change and false for
post-notification. Notifier callback implementations can simply check
the boolean value of (vmfd_notifier *)->pre and take actions for pre or
post vmfd change based on the value.

Subsequent patches will add callback implementations for specific components
that need this pre-notification.
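
A callback that branches on the pre/post flag might look like the sketch
below. Only the boolean 'pre' member is taken from the description above;
DemoVmfdChange, DemoSubsystem and demo_vmfd_change_cb() are hypothetical
names, and the real notifier signature in QEMU may differ.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical notifier payload; only the 'pre' flag is from the commit. */
typedef struct DemoVmfdChange {
    bool pre;   /* true: vmfd is about to change, false: it has changed */
} DemoVmfdChange;

/* Hypothetical subsystem state that records which phase ran. */
typedef struct DemoSubsystem {
    int teardown_calls;   /* work done before the old vmfd goes away */
    int setup_calls;      /* work done against the new vmfd */
} DemoSubsystem;

/* One callback serves both phases and checks the flag to pick its action. */
static void demo_vmfd_change_cb(DemoSubsystem *s, const DemoVmfdChange *info)
{
    if (info->pre) {
        s->teardown_calls++;   /* e.g. release state tied to the old vmfd */
    } else {
        s->setup_calls++;      /* e.g. re-create state on the new vmfd */
    }
}
```

Using a single callback with a flag keeps the pre and post handling of one
subsystem next to each other, at the cost of a branch in every callback.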

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-9-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

accel/kvm: add a notifier to indicate KVM VM file descriptor has changed

A notifier callback can be used by various subsystems to perform actions when
KVM file descriptor for a virtual machine changes as a part of confidential
guest reset process. This change adds this notifier mechanism. Subsequent
patches will add specific implementations for various notifier callbacks
corresponding to various subsystems that need to take action when KVM VM file
descriptor changes.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-8-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

accel/kvm: mark guest state as unprotected after vm file descriptor change

When the KVM VM file descriptor has changed and a new one created, the guest
state is no longer in protected state. Mark it as such.
The guest state becomes protected again when TDX and SEV-ES and SEV-SNP mark
it as such.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-7-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

accel/kvm: add changes required to support KVM VM file descriptor change

This change adds common kvm specific support to handle KVM VM file descriptor
change. KVM VM file descriptor can change as a part of confidential guest reset
mechanism. A new function api kvm_arch_on_vmfd_change() per
architecture platform is added in order to implement architecture specific
changes required to support it. A subsequent patch will add x86 specific
implementation for kvm_arch_on_vmfd_change() as currently only x86 supports
confidential guest reset.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-6-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

system/physmem: add helper to reattach existing memory after KVM VM fd change

After the guest KVM file descriptor has changed as a part of the process of
confidential guest reset mechanism, existing memory needs to be reattached to
the new file descriptor. This change adds a helper function ram_block_rebind()
for this purpose. The next patch will make use of this function.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-5-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

hw/accel: add a per-accelerator callback to change VM accelerator handle

When a confidential virtual machine is reset, a new guest context in the
accelerator must be generated post reset. Therefore, the old accelerator guest
file handle must be closed and a new one created. To this end, a per-accelerator
callback, "rebuild_guest" is introduced that would get called when a confidential
guest is reset. Subsequent patches will introduce specific implementation of
this callback for KVM accelerator.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-4-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

accel/kvm: add confidential class member to indicate guest rebuild capability

As a part of the confidential guest reset process, the existing encrypted guest
state must be made mutable since it would be discarded after reset. A new
encrypted and locked guest state must be established after the reset. To this
end, a new boolean member per confidential guest support class
(e.g. tdx or sev-snp) is added that will indicate whether it's possible to
rebuild guest state:

bool can_rebuild_guest_state;

This is true if rebuilding guest state is possible, false otherwise.
A KVM based confidential guest reset is only possible when
the existing state is locked but it's possible to rebuild guest state.
Otherwise, the guest is not resettable.
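
The resettability rule stated above can be written as a small predicate.
This is a sketch under assumptions: DemoCgsClass and
demo_guest_is_resettable() are hypothetical names, only the
can_rebuild_guest_state member comes from the commit message, and treating
a guest whose state is not locked as resettable is an assumption of this
sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-class data; the member name is from the commit message. */
typedef struct DemoCgsClass {
    bool can_rebuild_guest_state;
} DemoCgsClass;

/*
 * Returns true if a reset can proceed. With locked (encrypted) guest state
 * a reset is only possible when the class can rebuild that state.
 */
static bool demo_guest_is_resettable(bool state_locked,
                                     const DemoCgsClass *klass)
{
    if (!state_locked) {
        return true;   /* assumption: nothing locked, ordinary reset path */
    }
    return klass != NULL && klass->can_rebuild_guest_state;
}
```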

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-3-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/kvm: avoid installing duplicate msr entries in msr_handlers

kvm_filter_msr() does not check if an msr entry is already present in the
msr_handlers table and installs a new handler unconditionally. If the function
is called again with the same MSR, it will result in duplicate entries in the
table and multiple such calls will fill up the table needlessly. Fix that.
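
The duplicate check can be illustrated with a small fixed-size table. This
is a sketch only: the table layout, DEMO_MAX_HANDLERS and
demo_filter_msr() are hypothetical and do not mirror the real
msr_handlers structure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_MAX_HANDLERS 4   /* hypothetical table size */

typedef struct DemoMsrHandler {
    uint32_t msr;
    bool used;
} DemoMsrHandler;

typedef struct DemoMsrTable {
    DemoMsrHandler slots[DEMO_MAX_HANDLERS];
} DemoMsrTable;

/* Installs a handler for 'msr' unless one is already present. */
static bool demo_filter_msr(DemoMsrTable *t, uint32_t msr)
{
    int free_slot = -1;

    for (int i = 0; i < DEMO_MAX_HANDLERS; i++) {
        if (t->slots[i].used && t->slots[i].msr == msr) {
            return true;               /* already installed, nothing to do */
        }
        if (!t->slots[i].used && free_slot < 0) {
            free_slot = i;             /* remember the first free slot */
        }
    }
    if (free_slot < 0) {
        return false;                  /* table is full */
    }
    t->slots[free_slot].msr = msr;
    t->slots[free_slot].used = true;
    return true;
}

/* Counts occupied slots, used here to show that repeats add nothing. */
static int demo_handler_count(const DemoMsrTable *t)
{
    int n = 0;

    for (int i = 0; i < DEMO_MAX_HANDLERS; i++) {
        if (t->slots[i].used) {
            n++;
        }
    }
    return n;
}
```

Registering the same MSR any number of times occupies one slot, so repeated
initialization after a reset cannot exhaust the table.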

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-2-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

docs: Add Nitro Enclaves documentation

Now that all pieces are in place to spawn Nitro Enclaves using
a special purpose accelerator and machine model, document how
to use it.

Signed-off-by: Alexander Graf <graf@amazon.com>
Link: https://lore.kernel.org/r/20260225220807.33092-12-graf@amazon.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

hw/nitro: Enable direct kernel boot

Nitro Enclaves can only boot EIF files which are a combination of
kernel, initramfs and cmdline in a single file. When the kernel image is
not an EIF, treat it like a kernel image and assemble an EIF image on
the fly. This way, users can call QEMU with a direct
kernel/initrd/cmdline combination and everything "just works".

Signed-off-by: Alexander Graf <graf@amazon.com>
Reviewed-by: Dorjoy Chowdhury <dorjoychy111@gmail.com>
Link: https://lore.kernel.org/r/20260225220807.33092-11-graf@amazon.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

hw/core/eif: Move definitions to header

In follow-up patches we need some EIF file definitions that are
currently in the eif.c file, but want to access them from a separate
device. Move them into the header instead.

Signed-off-by: Alexander Graf <graf@amazon.com>
Reviewed-by: Dorjoy Chowdhury <dorjoychy111@gmail.com>
Link: https://lore.kernel.org/r/20260225220807.33092-10-graf@amazon.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

hw/nitro: Add nitro machine

Add a machine model to spawn a Nitro Enclave. Unlike the existing -M
nitro-enclave, this machine model works exclusively with the -accel
nitro accelerator to drive real Nitro Enclave creation. It supports
memory allocation, CPU count selection, both x86_64 as well as
aarch64, implements the Enclave heartbeat logic and debug serial
console.

To use it, create an EIF file and run

  $ qemu-system-x86_64 -accel nitro,debug-mode=on -M nitro -nographic \
                       -kernel test.eif

or

  $ qemu-system-aarch64 -accel nitro,debug-mode=on -M nitro -nographic \
                        -kernel test.eif

Signed-off-by: Alexander Graf <graf@amazon.com>
Link: https://lore.kernel.org/r/20260225220807.33092-9-graf@amazon.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

tests/functional: add Arm VBSA uefi conformance test

The VBSA test is a subset of the wider Arm architecture compliance
suites (ACS) which validate that machines meet a particular minimum set of
requirements. The VBSA is for virtual machines so it makes sense we
should check the -M virt machine is compliant.

Fortunately there are prebuilt binaries published via github so all we
need to do is build an EFI partition and place things in the right
place.

There are some additional Linux based tests which are left for later.

Reviewed-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-ID: <20260226185303.1920021-8-alex.bennee@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

tests/vm: build openbsd from lcitool data

For now only use the minimal dependency set until all the OpenBSD
mappings can be divined.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-ID: <20260226185303.1920021-7-alex.bennee@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

tests/vm: fix interactive boot

For reasons still not clear to me, passing the single dashed
-interactive would confuse the argument parsing enough that we tried to
pass "nteractive" as a string to the launch command, causing failure and
head scratching.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-ID: <20260226185303.1920021-6-alex.bennee@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

tests/vm: remove unused import

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-ID: <20260226185303.1920021-5-alex.bennee@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

tests/vm: bump OpenBSD to the current 7.8 release

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-ID: <20260226185303.1920021-4-alex.bennee@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

tests/docker: migrate legacy-test-cross compilers to trixie

The bugs have evidently been fixed in the latest release so we can
migrate the laggards into our all-test-cross container and remove the
legacy hacks. They are also packaged for the main architectures so we
don't need to jump through the amd64 hoops.

Suggested-by: John Snow <jsnow@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20260226185303.1920021-3-alex.bennee@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

tests/docker: upgrade most non-lcitool debian tests to debian 13

Debian 11 was EOL in 2024, and Debian 12 will be EOL this June. This
patch moves all but one of our tests, debian-legacy-test-cross, onto
Debian 13.

This patch does the bare minimum to upgrade these tests and doesn't make
any attempt at optimization or cleanup that may or may not be possible
with this upgrade.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
[AJB: tweak summary line]
Message-ID: <20260226185303.1920021-2-alex.bennee@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

target/arm/cpu64: Allow -host for nitro

The nitro accel does not actually make use of CPU emulation or details:
It always uses the host CPU regardless of configuration. Machines for
the nitro accel select the host CPU type as default to have a clear
statement of the above and to have a unified cpu type across all
supported architectures.

The arm64 logic on Linux currently only allows -cpu host for KVM based
virtual machines. Add a special case for nitro so that when the nitro
accel is active, it allows use of the host cpu type.

Signed-off-by: Alexander Graf <graf@amazon.com>
Link: https://lore.kernel.org/r/20260225220807.33092-8-graf@amazon.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

hw/nitro: Introduce Nitro Enclave Heartbeat device

Nitro Enclaves expect the parent instance to host a vsock heartbeat listener
at port 9000. To host a Nitro Enclave with the nitro accel in QEMU, add
such a heartbeat listener as device model, so that the machine can
easily instantiate it.

Signed-off-by: Alexander Graf <graf@amazon.com>
Link: https://lore.kernel.org/r/20260225220807.33092-7-graf@amazon.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

hw/nitro/nitro-serial-vsock: Nitro Enclaves vsock console

Nitro Enclaves support a special "debug" mode. When in debug mode, the
Nitro Hypervisor provides a vsock port that the parent can connect to to
receive serial console output of the Enclave. Add a new nitro-serial-vsock
driver that implements short-circuit logic to establish the vsock
connection to that port and feed its data into a chardev, so that a machine
model can use it as serial device.

Signed-off-by: Alexander Graf <graf@amazon.com>
Link: https://lore.kernel.org/r/20260225220807.33092-6-graf@amazon.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

accel: Add Nitro Enclaves accelerator

Nitro Enclaves are a confidential compute technology which
allows a parent instance to carve out resources from itself
and spawn a confidential sibling VM next to itself. Similar
to other confidential compute solutions, this sibling is
controlled by an underlying vmm, but still has a higher level
vmm (QEMU) to implement some of its I/O functionality and
lifecycle.

Add an accelerator to drive this interface. In combination with
follow-on patches to enhance the Nitro Enclaves machine model, this
will allow users to run a Nitro Enclave using QEMU.

Signed-off-by: Alexander Graf <graf@amazon.com>
Link: https://lore.kernel.org/r/20260225220807.33092-5-graf@amazon.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

hw/nitro: Add Nitro Vsock Bus

Add a dedicated bus for Nitro Enclave vsock devices. In Nitro Enclaves,
communication between parent and enclave/hypervisor happens almost
exclusively through vsock. The nitro-vsock-bus models this dependency
in QEMU, which allows devices in this bus to implement individual services
on top of vsock.

The nitro machine spawns this bus by creating the included
nitro-vsock-bridge sysbus device.

The nitro accel then advertises the Enclave's CID to the bus by calling
nitro_vsock_bridge_start_enclave() on the bridge device as soon as it
knows the CID.

Nitro vsock devices can listen to that event and learn the Enclave's CID
when it is available to perform actions, such as connect to the debug
serial vsock port.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alexander Graf <graf@amazon.com>
Link: https://lore.kernel.org/r/20260225220807.33092-4-graf@amazon.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

linux-headers: Add nitro_enclaves.h

QEMU is learning to drive the /dev/nitro_enclaves device node. Include
its UAPI header into our local copy of kernel headers so it has all
defines we need to drive it.

Signed-off-by: Alexander Graf <graf@amazon.com>
Link: https://lore.kernel.org/r/20260225220807.33092-3-graf@amazon.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

scripts/update-linux-headers: Add Nitro Enclaves header

We want to enable QEMU to drive the /dev/nitro_enclaves device node. Add
its UAPI header into our kernel sync so we have all defines we need to
drive it.

Signed-off-by: Alexander Graf <graf@amazon.com>
Link: https://lore.kernel.org/r/20260225220807.33092-2-graf@amazon.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

accel/kvm: Don't clear pending #SMI in kvm_get_vcpu_events

The kvm_get_vcpu_events propagates the state of the pending smi
from the kernel to the cpu->interrupt_request, with the intention
of having an up-to-date migration state.

Later the opposite is done, the kvm_put_vcpu_events restores the state
of the pending #SMI from the 'cs->interrupt_request'.

The only problem is that kvm_get_vcpu_events also resets the SMI
in cpu->interrupt_request when there is no pending #SMI indicated by the kernel,
and that is wrong as the SMI might be still raised by qemu.

While at it, also fix a similar but more theoretical bug with regard to a
latched #INIT while in SMM.

A simple reproducer for this bug is to read an EFI variable in a loop
from within a guest, while at the same time run 'info registers' on
the qemu HMP monitor.

The reads will, once in a while, fail with an 'Invalid argument' error.
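
The behavioural difference can be sketched as a pure function over the
interrupt request bits. This is an illustration under assumptions, not the
patch itself: DEMO_INTERRUPT_SMI is an arbitrary bit value and
demo_sync_smi() is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_INTERRUPT_SMI 0x40u   /* arbitrary bit chosen for this sketch */

/*
 * Merge the kernel's view of a pending SMI into the interrupt request word.
 * The bit is set when the kernel reports a pending SMI, and is deliberately
 * left alone otherwise, because the SMI may still be raised on the
 * userspace side and not yet delivered to the kernel.
 */
static uint32_t demo_sync_smi(uint32_t interrupt_request,
                              bool kernel_smi_pending)
{
    if (kernel_smi_pending) {
        interrupt_request |= DEMO_INTERRUPT_SMI;
    }
    /* No else branch: clearing here would drop an SMI raised in userspace. */
    return interrupt_request;
}
```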

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20260223221908.361456-1-mlevitsk@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: emulate: propagate errors all the way and stop early

This ended up being a bigger patch than I thought it'd be...

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260223233950.96076-29-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

whpx: i386: ignore send_msi to interrupt vector 0

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260223233950.96076-28-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

whpx: i386: bump to x2apic

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260223233950.96076-27-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

whpx: i386: inject exceptions

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260223233950.96076-26-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: emulate: propagate memory errors on most reads/writes

Use that to not bump RIP for those cases.

Warn on read/write from/to unmapped MMIO, but do not consider that an exception.
For reads, return 0xFF(s) as the register value in that case.

Leaves a coverage gap for read_val_ext(), to be handled in a later commit.
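
The three outcomes described above can be sketched for a load as follows.
This is a simplified illustration with hypothetical names (DemoMemResult,
demo_emulate_load()); the real emulator interface is not shown in this log
and may look quite different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical result of the backend memory access. */
typedef enum DemoMemResult {
    DEMO_MEM_OK,         /* access succeeded */
    DEMO_MEM_UNMAPPED,   /* unmapped MMIO: warn, not an exception */
    DEMO_MEM_FAULT,      /* memory error that must be propagated */
} DemoMemResult;

/*
 * Completes an emulated load. Returns true if the instruction retired and
 * RIP was advanced, false if the error was propagated and RIP left alone.
 */
static bool demo_emulate_load(DemoMemResult res, const uint8_t *src,
                              uint8_t *dst, size_t len,
                              uint64_t *rip, uint64_t insn_len)
{
    switch (res) {
    case DEMO_MEM_FAULT:
        return false;                  /* do not bump RIP */
    case DEMO_MEM_UNMAPPED:
        fprintf(stderr, "warning: read from unmapped MMIO\n");
        memset(dst, 0xFF, len);        /* reads return all-ones bytes */
        break;
    case DEMO_MEM_OK:
        memcpy(dst, src, len);
        break;
    }
    *rip += insn_len;
    return true;
}
```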

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260223233950.96076-25-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: emulate: remove fetch_instruction helper too

Not used anymore.
Link: https://lore.kernel.org/r/20260223233950.96076-24-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: emulate: raise an exception on translation fault

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260223233950.96076-23-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: emulate: get rid of write_val_to_mem() helper

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260223233950.96076-22-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

hvf: i386: save/restore CR0/2/3

For symmetry, save/restore the same set of registers even when not needed.

CR2 save/restore needed as page faults injected to the guest imply modifying CR2.

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260223233950.96076-21-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: emulate, hvf, mshv: rework MMU code

target/i386/emulate doesn't currently properly emulate instructions
which might cause a page fault during their execution. Notably, REP STOS/MOVS
from MMIO to an address which is unmapped until a page fault exception is raised
causes an abort() in vmx_write_mem.

Change the interface between the HW accel backend and target/i386/emulate
as a first step towards addressing that.

Adapt the page table walker code to give actionable errors,
while leaving a possibility for backends to provide their own walker.

This removes the usage of the Hyper-V page walker in the mshv backend.

Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260223233950.96076-20-mohamed@unpredictable.fr
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>