Stefan Hajnoczi [Fri, 30 May 2025 15:41:21 +0000 (11:41 -0400)]
Merge tag 'pull-target-arm-20250530-2' of https://git.linaro.org/people/pmaydell/qemu-arm into staging
target-arm queue:
* hw/arm: Add GMAC devices to NPCM8XX SoC
* hw/arm: Add missing psci_conduit to NPCM8XX SoC boot info
* docs/interop: convert text files to restructuredText
* target/arm: Some minor refactorings
* tests/functional: Add a test for the Stellaris arm machines
* hw/block: Drop unused nand.c
* tag 'pull-target-arm-20250530-2' of https://git.linaro.org/people/pmaydell/qemu-arm:
hw/block: Drop unused nand.c
tests/functional: Add a test for the Stellaris arm machines
target/arm/hvf: Include missing 'cpu-qom.h' header
target/arm/kvm: Include missing 'cpu-qom.h' header
target/arm/qmp: Include missing 'cpu.h' header
target/arm/cpu-features: Include missing 'cpu.h' header
hw/arm/boot: Include missing 'system/memory.h' header
target/arm/cpregs: Include missing 'target/arm/cpu.h' header
target/arm: Only link with zlib when TCG is enabled
target/arm/hvf_arm: Avoid using poisoned CONFIG_HVF definition
target/arm/tcg-stubs: compile file once (system)
docs/interop: convert text files to restructuredText
hw/arm: Add missing psci_conduit to NPCM8XX SoC boot info
tests/qtest: Migrate GMAC test from 7xx to 8xx
hw/arm: Add GMAC devices to NPCM8XX SoC
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Stefan Hajnoczi [Fri, 30 May 2025 15:41:13 +0000 (11:41 -0400)]
Merge tag 'pull-request-2025-05-30' of https://gitlab.com/thuth/qemu into staging
* Functional tests improvements
* Endianness improvements/clean-ups for the Microblaze machines
* Remove obsolete -2.4 and -2.5 i440fx and q35 machine types and related code
Stefan Hajnoczi [Fri, 30 May 2025 15:41:07 +0000 (11:41 -0400)]
Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging
* target/i386/kvm: Intel TDX support
* target/i386/emulate: more lflags cleanups
* meson: remove need for explicit listing of dependencies in hw_common_arch and
target_common_arch
* rust: small fixes
* hpet: Reorganize register decoding to be more similar to Rust code
* target/i386: fixes for AMD models
* target/i386: new EPYC-Turin CPU model
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (77 commits)
target/i386/tcg/helper-tcg: fix file references in comments
target/i386: Add support for EPYC-Turin model
target/i386: Update EPYC-Genoa for Cache property, perfmon-v2, RAS and SVM feature bits
target/i386: Add couple of feature bits in CPUID_Fn80000021_EAX
target/i386: Update EPYC-Milan CPU model for Cache property, RAS, SVM feature bits
target/i386: Update EPYC-Rome CPU model for Cache property, RAS, SVM feature bits
target/i386: Update EPYC CPU model for Cache property, RAS, SVM feature bits
rust: make declaration of dependent crates more consistent
docs: Add TDX documentation
i386/tdx: Validate phys_bits against host value
i386/tdx: Make invtsc default on
i386/tdx: Don't treat SYSCALL as unavailable
i386/tdx: Fetch and validate CPUID of TD guest
target/i386: Print CPUID subleaf info for unsupported feature
i386: Remove unused parameter "uint32_t bit" in feature_word_description()
i386/cgs: Introduce x86_confidential_guest_check_features()
i386/tdx: Define supported KVM features for TDX
i386/tdx: Add XFD to supported bit of TDX
i386/tdx: Add supported CPUID bits relates to XFAM
i386/tdx: Add supported CPUID bits related to TD Attributes
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* tag 'pull-nbd-2025-05-29' of https://repo.or.cz/qemu/ericb:
iotests: Filter out ZFS in several tests
iotests: Improve mirror-sparse on ext4 and xfs
iotests: Use disk_usage in more places
nbd: Set unix socket send buffer on Linux
nbd: Set unix socket send buffer on macOS
io: Add helper for setting socket send buffer size
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
tests/unit/test-util-sockets: fix mem-leak on error object
The test fails with --enable-asan as the error struct is never freed.
In the case where the test expects a success but it fails, let's also
report the error for debugging (it will be freed internally).
Fixes 316e8ee8d6 ("util/qemu-sockets: Refactor inet_parse() to use QemuOpts")
Signed-off-by: Matheus Tavares Bernardino <matheus.bernardino@oss.qualcomm.com> Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Message-ID: <518d94c7db20060b2a086cf55ee9bffab992a907.1748280011.git.matheus.bernardino@oss.qualcomm.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
hw/net/vmxnet3: Merge DeviceRealize in InstanceInit
Simplify merging vmxnet3_realize() within vmxnet3_instance_init(),
removing the need for device_class_set_parent_realize().
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-20-philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
VMXNET3_COMPAT_FLAG_DISABLE_PCIE was only used by the
hw_compat_2_5[] array, via the 'x-disable-pcie=on' property.
We removed all machines using that array, lets remove all the
code around VMXNET3_COMPAT_FLAG_DISABLE_PCIE.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-19-philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
VMXNET3_COMPAT_FLAG_OLD_MSI_OFFSETS was only used by the
hw_compat_2_5[] array, via the 'x-old-msi-offsets=on' property.
We removed all machines using that array, lets remove all the
code around VMXNET3_COMPAT_FLAG_OLD_MSI_OFFSETS.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-18-philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
Simplify replacing pvscsi_realize() by pvscsi_instance_init(),
removing the need for device_class_set_parent_realize().
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-17-philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
PVSCSI_COMPAT_DISABLE_PCIE_BIT was only used by the
hw_compat_2_5[] array, via the 'x-disable-pcie=on' property.
We removed all machines using that array, lets remove all the
code around PVSCSI_COMPAT_DISABLE_PCIE_BIT, including the now
unused PVSCSIState::compat_flags field.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-16-philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
PVSCSI_COMPAT_OLD_PCI_CONFIGURATION was only used by the
hw_compat_2_5[] array, via the 'x-old-pci-configuration=on'
property. We removed all machines using that array, lets remove
all the code around PVSCSI_COMPAT_OLD_PCI_CONFIGURATION.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-15-philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
The hw_compat_2_5[] array was only used by the pc-q35-2.5 and
pc-i440fx-2.5 machines, which got removed. Remove it.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-13-philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-12-philmd@linaro.org>
[thuth: Fix error from check_patch.pl wrt to an empty "for" loop] Signed-off-by: Thomas Huth <thuth@redhat.com>
hw/i386/x86: Remove X86MachineClass::save_tsc_khz field
The X86MachineClass::save_tsc_khz boolean was only used
by the pc-q35-2.5 and pc-i440fx-2.5 machines, which got
removed. Remove it and simplify tsc_khz_needed().
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-11-philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
hw/i386/pc: Remove deprecated pc-q35-2.5 and pc-i440fx-2.5 machines
These machines has been supported for a period of more than 6 years.
According to our versioned machine support policy (see commit ce80c4fa6ff "docs: document special exception for machine type
deprecation & removal") they can now be removed.
Remove the now unused empty pc_compat_2_5[] array.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-10-philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
VIRTIO_PCI_FLAG_DISABLE_PCIE was only used by the hw_compat_2_4[]
array, via the 'x-disable-pcie=false' property. We removed all
machines using that array, lets remove all the code around
VIRTIO_PCI_FLAG_DISABLE_PCIE (see commit 9a4c0e220d8 for similar
VIRTIO_PCI_FLAG_* enum removal).
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-9-philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
VIRTIO_PCI_FLAG_MIGRATE_EXTRA was only used by the
hw_compat_2_4[] array, via the 'migrate-extra=true'
property. We removed all machines using that array,
lets remove all the code around VIRTIO_PCI_FLAG_MIGRATE_EXTRA.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20250512083948.39294-8-philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
E1000_FLAG_MAC was only used by the hw_compat_2_4[] array,
via the 'extra_mac_registers=off' property. We removed all
machines using that array, lets remove all the code around
E1000_FLAG_MAC, including the MAC_ACCESS_FLAG_NEEDED enum,
similarly to commit fa4ec9ffda7 ("e1000: remove old
compatibility code").
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20250512083948.39294-7-philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
The hw_compat_2_4[] array was only used by the pc-q35-2.4 and
pc-i440fx-2.4 machines, which got removed. Remove it.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-6-philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
The pc_compat_2_4[] array was only used by the pc-q35-2.4
and pc-i440fx-2.4 machines, which got removed. Remove it.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-4-philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
hw/i386/pc: Remove PCMachineClass::broken_reserved_end field
The PCMachineClass::broken_reserved_end field was only used
by the pc-q35-2.4 and pc-i440fx-2.4 machines, which got removed.
Remove it and simplify pc_memory_init().
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-3-philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
hw/i386/pc: Remove deprecated pc-q35-2.4 and pc-i440fx-2.4 machines
These machines has been supported for a period of more than 6 years.
According to our versioned machine support policy (see commit ce80c4fa6ff "docs: document special exception for machine type
deprecation & removal") they can now be removed.
Remove the qtest in test-x86-cpuid-compat.c file.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-2-philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
Thomas Huth [Thu, 15 May 2025 13:20:19 +0000 (15:20 +0200)]
docs: Deprecate the qemu-system-microblazeel binary
The (former big-endian only) binary qemu-system-microblaze can
handle both endiannesses nowadays, so we don't need the separate
qemu-system-microblazeel binary for little endian anymore. Let's
deprecate it to avoid unnecessary compilation and test time in
the future.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20250515132019.569365-5-thuth@redhat.com>
Thomas Huth [Thu, 15 May 2025 13:20:18 +0000 (15:20 +0200)]
hw/microblaze: Remove the big-endian variants of ml605 and xlnx-zynqmp-pmu
Both machines were added with little-endian in mind only (the
"endianness" CPU property was hard-wired to "true", see commits 133d23b3ad1 and a88bbb006a52), so the variants that showed up
on the big endian target likely never worked. We deprecated these
non-working machine variants two releases ago, and so far nobody
complained, so it should be fine now to disable them. Hard-wire
the machines to little endian now.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20250515132019.569365-4-thuth@redhat.com>
Thomas Huth [Thu, 15 May 2025 13:20:17 +0000 (15:20 +0200)]
tests/functional: Test both microblaze s3adsp1800 endianness variants
Now that the endianness of the petalogix-s3adsp1800 can be configured,
we should test that the cross-endianness also works as expected, thus
test the big endian variant on the little endian target and vice versa.
(based on an original idea from Philippe Mathieu-Daudé)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20250515132019.569365-3-thuth@redhat.com>
Thomas Huth [Thu, 15 May 2025 13:20:16 +0000 (15:20 +0200)]
hw/microblaze: Add endianness property to the petalogix_s3adsp1800 machine
Since the microblaze target can now handle both endianness, big and
little, we should provide a config knob for the user to select the
desired endianness.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20250515132019.569365-2-thuth@redhat.com>
Thomas Huth [Wed, 21 May 2025 14:37:32 +0000 (16:37 +0200)]
tests/functional/test_mem_addr_space: Use set_machine() to select the machine
By using self.set_machine() the tests get properly skipped in case
the machine has not been compiled into the QEMU binary, e.g. when
"configure" has been run with "--without-default-devices".
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20250521143732.140711-1-thuth@redhat.com>
Thomas Huth [Thu, 22 May 2025 08:02:08 +0000 (10:02 +0200)]
tests/functional/test_mips_malta: Re-enable the check for the PCI host bridge
The problem with the PCI bridge has been fixed in commit e5894fd6f411c1
("hw/pci-host/gt64120: Fix endianness handling"), so we can enable the
corresponding test again.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20250522080208.205489-1-thuth@redhat.com>
Thomas Huth [Wed, 21 May 2025 14:51:12 +0000 (16:51 +0200)]
tests/functional/test_sparc64_tuxrun: Explicitly set the 'sun4u' machine
Use self.set_machine() to set the machine instead of relying on the
default machine of the binary. This way the test can be skipped in
case the machine has not been compiled into the QEMU binary.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20250521145112.142222-1-thuth@redhat.com>
Eric Blake [Fri, 23 May 2025 16:27:23 +0000 (11:27 -0500)]
iotests: Filter out ZFS in several tests
Fiona reported that ZFS makes sparse file testing awkward, since:
- it has asynchronous allocation (not even 'fsync $file' makes du see
the desired size; it takes the slower 'fsync -f $file' which is not
appropriate for the tests)
- for tests of fully allocated files, ZFS with compression enabled
still reports smaller disk usage
Add a new _require_disk_usage that quickly probes whether an attempt
to create a sparse 5M file shows as less than 1M usage, while the same
file with -o preallocation=full shows as more than 4M usage without
sync, which should filter out ZFS behavior. Then use it in various
affected tests.
This does not add the new filter on all tests that Fiona is seeing ZFS
failures on, but only those where I could quickly spot that there is
at least one place where the test depends on the output of 'du -b' or
'stat -c %b'.
Eric Blake [Fri, 23 May 2025 16:27:22 +0000 (11:27 -0500)]
iotests: Improve mirror-sparse on ext4 and xfs
Fiona reported that an ext4 filesystem on top of LVM can sometimes
report over-allocation to du (based on the heuristics the filesystem
is making while observing the contents being mirrored); even though
the contents and actual size matched, about 50% of the time the size
reported by disk_usage was too large by 4k, failing the test. In
auditing other iotests, this is a common problem we've had to deal
with.
Meanwhile, Markus reported that an xfs filesystem reports disk usage
at a default granularity of 1M (so the sparse file occupies 3M, since
it has just over 2M data).
Reported-by: Fiona Ebner <f.ebner@proxmox.com> Reported-by: Markus Armbruster <armbru@redhat.com> Fixes: c0ddcb2c ("tests: Add iotest mirror-sparse for recent patches") Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fiona Ebner <f.ebner@proxmox.com> Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20250523163041.2548675-7-eblake@redhat.com>
[eblake: Also fix xfs issue] Signed-off-by: Eric Blake <eblake@redhat.com>
Nir Soffer [Sat, 17 May 2025 20:11:54 +0000 (23:11 +0300)]
nbd: Set unix socket send buffer on Linux
Like macOS we have similar issue on Linux. For TCP socket the send
buffer size is 2626560 bytes (~2.5 MiB) and we get good performance.
However for unix socket the default and maximum buffer size is 212992
bytes (208 KiB) and we see poor performance when using one NBD
connection, up to 4 times slower than macOS on the same machine.
Tracing shows that for every 2 MiB payload (qemu uses 2 MiB io size), we
do 1 recvmsg call with TCP socket, and 10 recvmsg calls with unix
socket.
Fixing this issue requires changing the maximum send buffer size (the
receive buffer size is ignored). This can be done using:
With this we can set the socket buffer size to 2 MiB. With the defaults
the value requested by qemu is clipped to the maximum size and has no
effect.
I tested on 2 machines:
- Fedora 42 VM on MacBook Pro M2 Max
- Dell PowerEdge R640 (Intel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz)
On the older Dell machine we see very little improvement, up to 1.03
higher throughput. On the M2 machine we see up to 2.67 times higher
throughput. The following results are from the M2 machine.
Reading from qemu-nbd with qemu-img convert. In this test buffer size of
4m is optimal (2.28 times faster).
Signed-off-by: Nir Soffer <nirsof@gmail.com>
Message-ID: <20250517201154.88456-4-nirsof@gmail.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
Nir Soffer [Sat, 17 May 2025 20:11:53 +0000 (23:11 +0300)]
nbd: Set unix socket send buffer on macOS
On macOS we need to increase unix socket buffers size on the client and
server to get good performance. We set socket buffers on macOS after
connecting or accepting a client connection.
Testing shows that setting socket receive buffer size (SO_RCVBUF) has no
effect on performance, so we set only the send buffer size (SO_SNDBUF).
It seems to work like Linux but not documented.
Testing shows that optimal buffer size is 512k to 4 MiB, depending on
the test case. The difference is very small, so I chose 2 MiB.
I tested reading from qemu-nbd and writing to qemu-nbd with qemu-img and
computing a blkhash with nbdcopy and blksum.
To focus on NBD communication and get less noisy results, I tested
reading and writing to null-co driver. I added a read-pattern option to
the null-co driver to return data full of 0xff:
Signed-off-by: Nir Soffer <nirsof@gmail.com>
Message-ID: <20250517201154.88456-3-nirsof@gmail.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
Nir Soffer [Sat, 17 May 2025 20:11:52 +0000 (23:11 +0300)]
io: Add helper for setting socket send buffer size
Testing reading and writing from qemu-nbd using a unix domain socket
shows that the platform default send buffer size is too low, leading to
poor performance and hight cpu usage.
Add a helper for setting socket send buffer size to be used in NBD code.
It can also be used in other contexts.
We don't need a helper for receive buffer size since it is not used with
unix domain sockets. This is documented for Linux, and not documented
for macOS.
Failing to set the socket buffer size is not a fatal error, but the
caller may want to warn about the failure.
Signed-off-by: Nir Soffer <nirsof@gmail.com>
Message-ID: <20250517201154.88456-2-nirsof@gmail.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
Peter Maydell [Thu, 29 May 2025 16:45:13 +0000 (17:45 +0100)]
hw/block: Drop unused nand.c
The nand.c device (TYPE_NAND) is an emulation of a NAND flash memory
chip which was used by the old OMAP boards. No current QEMU board
uses it, and although techically "-device nand,chip-id=0x6b" doesn't
error out, it's not possible to usefully use it from the command
line because the only interface it has is via calling C functions
like nand_setpins() and nand_setio().
The "config OMAP" stanza (used only by the SX1 board) is the only
thing that does "select NAND" to compile in this code, but the SX1
board doesn't actually use the NAND device.
Remove the NAND device code entirely; this is effectively leftover
cleanup from when we dropped the PXA boards and the OMAP boards
other than the sx1.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250522142859.3122389-1-peter.maydell@linaro.org
Thomas Huth [Thu, 29 May 2025 16:45:12 +0000 (17:45 +0100)]
tests/functional: Add a test for the Stellaris arm machines
The 2023 edition of the QEMU advent calendar featured an image
that we can use to test whether the lm3s6965evb machine is basically
still working.
And for the lm3s811evb there is a small test kernel on github
which can be used to check its UART.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-id: 20250519170242.520805-1-thuth@redhat.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
target/arm/cpu-features: Include missing 'cpu.h' header
"target/arm/cpu-features.h" dereferences the ARMISARegisters
structure, which is defined in "cpu.h". Include the latter to
avoid when refactoring unrelated headers:
In file included from target/arm/internals.h:33:
target/arm/cpu-features.h:45:54: error: unknown type name 'ARMISARegisters'
45 | static inline bool isar_feature_aa32_thumb_div(const ARMISARegisters *id)
| ^
target/arm/cpu-features.h:47:12: error: use of undeclared identifier 'R_ID_ISAR0_DIVIDE_SHIFT'
47 | return FIELD_EX32(id->id_isar0, ID_ISAR0, DIVIDE) != 0;
| ^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-id: 20250513173928.77376-7-philmd@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
hw/arm/boot: Include missing 'system/memory.h' header
default_reset_secondary() uses address_space_stl_notdirty(),
itself declared in "system/memory.h". Include this header in
order to avoid when refactoring headers:
../hw/arm/boot.c:281:5: error: implicit declaration of function 'address_space_stl_notdirty' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
address_space_stl_notdirty(as, info->smp_bootreg_addr,
^
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250513173928.77376-6-philmd@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
target/arm/cpregs: Include missing 'target/arm/cpu.h' header
CPReadFn type definitions use the CPUARMState type, itself
declared in "cpu.h". Include this file in order to avoid when
refactoring headers:
../target/arm/cpregs.h:241:27: error: unknown type name 'CPUARMState'
typedef uint64_t CPReadFn(CPUARMState *env, const ARMCPRegInfo *opaque);
^
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250513173928.77376-5-philmd@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Souleymane Conte [Thu, 29 May 2025 16:45:10 +0000 (17:45 +0100)]
docs/interop: convert text files to restructuredText
buglink: https://gitlab.com/qemu-project/qemu/-/issues/527 Signed-off-by: Souleymane Conte <conte.souleymane@gmail.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20250522092622.40869-1-conte.souleymane@gmail.com
[PMM: switched a few more bits of formatting to monospaced;
updated references to qcow2.txt in MAINTAINERS, qcow2-cache.txt
and bitmaps.rst] Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Nabih Estefan [Thu, 29 May 2025 16:45:09 +0000 (17:45 +0100)]
tests/qtest: Migrate GMAC test from 7xx to 8xx
For upstreaming we migrated this test to 7xx (since that was already
upstream) move it back to 8xx where it can check the 4 GMACs since that
is the board this test was originally created for.
Signed-off-by: Nabih Estefan <nabihestefan@google.com>
Message-id: 20250508220718.735415-3-nabihestefan@google.com Reviewed-by: Tyrone Ting <kfting@nuvoton.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Hao Wu [Thu, 29 May 2025 16:45:09 +0000 (17:45 +0100)]
hw/arm: Add GMAC devices to NPCM8XX SoC
The GMAC was originally created for the 8xx machine. During upstreaming
both the GMAC and the 8XX we removed it so they would not depend on each
other for the process, that connection should be added back in.
* tag 'pull-qapi-2025-05-28' of https://repo.or.cz/qemu/armbru:
qapi: use imperative style in documentation
qapi: make all generated files common
qapi: remove qapi_specific_outputs from meson.build
qapi: make s390x specific CPU commands unconditionally available
qapi: make most CPU commands unconditionally available
qapi: Make CpuModelExpansionInfo::deprecated-props optional and generic
qapi: remove the misc-target.json file
qapi: make Xen event commands unconditionally available
qapi: make SGX commands unconditionally available
qapi: expose query-gic-capability command unconditionally
qapi: make SEV commands unconditionally available
qapi: expand docs for SEV commands
qapi: expose rtc-reset-reinjection command unconditionally
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* tag 'pull-misc-2025-05-28' of https://repo.or.cz/qemu/armbru:
docs/about/removed-features: Move removal notes to tidy up order
docs/about/deprecated: Move deprecation notes to tidy up order
qapi/migration: Deprecate migrate argument @detach
docs/about: Belatedly document tightening of QMP device_add checking
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Stefan Hajnoczi [Wed, 28 May 2025 19:17:25 +0000 (15:17 -0400)]
Merge tag 'pull-tcg-20250528' of https://gitlab.com/rth7680/qemu into staging
accel/tcg: Fix atomic_mmu_lookup vs TLB_FORCE_SLOW
linux-user: implement pgid field of /proc/self/stat
target/sh4: Use MO_ALIGN for system UNALIGN()
target/microblaze: Use TARGET_LONG_BITS == 32 for system mode
accel/tcg: Add TCGCPUOps.pointer_wrap
target/*: Populate TCGCPUOps.pointer_wrap
* tag 'pull-tcg-20250528' of https://gitlab.com/rth7680/qemu: (28 commits)
accel/tcg: Assert TCGCPUOps.pointer_wrap is set
target/sparc: Fill in TCGCPUOps.pointer_wrap
target/s390x: Fill in TCGCPUOps.pointer_wrap
target/riscv: Fill in TCGCPUOps.pointer_wrap
target/ppc: Fill in TCGCPUOps.pointer_wrap
target/mips: Fill in TCGCPUOps.pointer_wrap
target/loongarch: Fill in TCGCPUOps.pointer_wrap
target/i386: Fill in TCGCPUOps.pointer_wrap
target/arm: Fill in TCGCPUOps.pointer_wrap
target: Use cpu_pointer_wrap_uint32 for 32-bit targets
target: Use cpu_pointer_wrap_notreached for strict align targets
accel/tcg: Add TCGCPUOps.pointer_wrap
target/sh4: Use MO_ALIGN for system UNALIGN()
tcg: Drop TCGContext.page_{mask,bits}
tcg: Drop TCGContext.tlb_dyn_max_bits
target/microblaze: Simplify compute_ldst_addr_type{a,b}
target/microblaze: Drop DisasContext.r0
target/microblaze: Use TARGET_LONG_BITS == 32 for system mode
target/microblaze: Fix printf format in mmu_translate
target/microblaze: Use TCGv_i64 for compute_ldst_addr_ea
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Babu Moger [Thu, 8 May 2025 19:58:04 +0000 (14:58 -0500)]
target/i386: Add support for EPYC-Turin model
Add the support for AMD EPYC zen 5 processors (EPYC-Turin).
Add the following new feature bits on top of the feature bits from
the previous generation EPYC models.
movdiri : Move Doubleword as Direct Store Instruction
movdir64b : Move 64 Bytes as Direct Store Instruction
avx512-vp2intersect : AVX512 Vector Pair Intersection to a Pair
of Mask Register
avx-vnni : AVX VNNI Instruction
prefetchi : Indicates support for IC prefetch
sbpb : Selective Branch Predictor Barrier
ibpb-brtype : IBPB includes branch type prediction flushing
srso-user-kernel-no : Not vulnerable to SRSO at the user-kernel boundary
Babu Moger [Thu, 8 May 2025 19:58:03 +0000 (14:58 -0500)]
target/i386: Update EPYC-Genoa for Cache property, perfmon-v2, RAS and SVM feature bits
Found that some of the cache properties are not set correctly for EPYC models.
l1d_cache.no_invd_sharing should not be true.
l1i_cache.no_invd_sharing should not be true.
L2.self_init should be true.
L2.inclusive should be true.
L3.inclusive should not be true.
L3.no_invd_sharing should be true.
Fix these cache properties.
Also add the missing RAS and SVM features bits on AMD EPYC-Genoa model.
The SVM feature bits are used in nested guests.
perfmon-v2 : Allow guests to make use of the PerfMonV2 features.
succor : Software uncorrectable error containment and recovery capability.
overflow-recov : MCA overflow recovery support.
lbrv : LBR virtualization
tsc-scale : MSR based TSC rate control
vmcb-clean : VMCB clean bits
flushbyasid : Flush by ASID
pause-filter : Pause intercept filter
pfthreshold : PAUSE filter threshold
v-vmsave-vmload: Virtualized VMLOAD and VMSAVE
vgif : Virtualized GIF
fs-gs-base-ns : WRMSR to {FS,GS,KERNEL_GS}_BASE is non-serializing
The feature details are available in APM listed below [1].
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
Publication # 24593 Revision 3.41.
Babu Moger [Thu, 8 May 2025 19:58:02 +0000 (14:58 -0500)]
target/i386: Add couple of feature bits in CPUID_Fn80000021_EAX
Add CPUID bit indicates that a WRMSR to MSR_FS_BASE, MSR_GS_BASE, or
MSR_KERNEL_GS_BASE is non-serializing amd PREFETCHI that the indicates
support for IC prefetch.
CPUID_Fn80000021_EAX
Bit Feature description
20 Indicates support for IC prefetch.
1 FsGsKernelGsBaseNonSerializing.
WRMSR to FS_BASE, GS_BASE and KernelGSbase are non-serializing.
Babu Moger [Thu, 8 May 2025 19:58:01 +0000 (14:58 -0500)]
target/i386: Update EPYC-Milan CPU model for Cache property, RAS, SVM feature bits
Found that some of the cache properties are not set correctly for EPYC models.
l1d_cache.no_invd_sharing should not be true.
l1i_cache.no_invd_sharing should not be true.
L2.self_init should be true.
L2.inclusive should be true.
L3.inclusive should not be true.
L3.no_invd_sharing should be true.
Fix these cache properties.
Also add the missing RAS and SVM features bits on AMD EPYC-Milan model.
The SVM feature bits are used in nested guests.
Paolo Bonzini [Mon, 26 May 2025 07:22:20 +0000 (09:22 +0200)]
rust: make declaration of dependent crates more consistent
Crates like "bilge" and "libc" can be shared by more than one directory,
so declare them directly in rust/meson.build. While at it, make their
variable names end with "_rs" and always add a subproject() statement
(as that pinpoints the error better if the subproject is missing and
cannot be downloaded).
Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Xiaoyao Li [Thu, 8 May 2025 14:59:57 +0000 (10:59 -0400)]
i386/tdx: Fetch and validate CPUID of TD guest
Use KVM_TDX_GET_CPUID to get the CPUIDs that are managed and enfored
by TDX module for TD guest. Check QEMU's configuration against the
fetched data.
Print wanring message when 1. a feature is not supported but requested
by QEMU or 2. QEMU doesn't want to expose a feature while it is enforced
enabled.
- If cpu->enforced_cpuid is not set, prints the warning message of both
1) and 2) and tweak QEMU's configuration.
- If cpu->enforced_cpuid is set, quit if any case of 1) or 2).
Lei Wang [Tue, 17 Dec 2024 12:39:31 +0000 (07:39 -0500)]
i386: Remove unused parameter "uint32_t bit" in feature_word_description()
Parameter "uint32_t bit" is not used in function feature_word_description(),
so remove it.
Signed-off-by: Lei Wang <lei4.wang@intel.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Link: https://lore.kernel.org/r/20241217123932.948789-2-xiaoyao.li@intel.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
To do cgs specific feature checking. Note the feature checking in
x86_cpu_filter_features() is valid for non-cgs VMs. For cgs VMs like
TDX, what features can be supported has more restrictions.
Xiaoyao Li [Thu, 8 May 2025 14:59:53 +0000 (10:59 -0400)]
i386/tdx: Add supported CPUID bits relates to XFAM
Some CPUID bits are controlled by XFAM. They are not covered by
tdx_caps.cpuid (which only contians the directly configurable bits), but
they are actually supported when the related XFAM bit is supported.
Add these XFAM controlled bits to TDX supported CPUID bits based on the
supported_xfam.
Besides, incorporate the supported_xfam into the supported CPUID leaf of
0xD.
Xiaoyao Li [Thu, 8 May 2025 14:59:52 +0000 (10:59 -0400)]
i386/tdx: Add supported CPUID bits related to TD Attributes
For TDX, some CPUID feature bit is configured via TD attributes. They
are not covered by tdx_caps.cpuid (which only contians the directly
configurable CPUID bits), but they are actually supported when the
related attributre bit is supported.
Note, LASS and KeyLocker are not supported by KVM for TDX, nor does
QEMU support it (see TDX_SUPPORTED_TD_ATTRS). They are defined in
tdx_attrs_maps[] for the completeness of the existing TD Attribute
bits that are related with CPUID features.
Xiaoyao Li [Thu, 8 May 2025 14:59:51 +0000 (10:59 -0400)]
i386/tdx: Add TDX fixed1 bits to supported CPUIDs
TDX architecture forcibly sets some CPUID bits for TD guest that VMM
cannot disable it. They are fixed1 bits.
Fixed1 bits are not covered by tdx_caps.cpuid (which only contains the
directly configurable bits), while fixed1 bits are supported for TD guest
obviously.
Add fixed1 bits to tdx_supported_cpuid. Besides, set all the fixed1
bits to the initial set of KVM's support since KVM might not report them
as supported.
Xiaoyao Li [Thu, 8 May 2025 14:59:50 +0000 (10:59 -0400)]
i386/tdx: Implement adjust_cpuid_features() for TDX
Maintain a TDX specific supported CPUID set, and use it to mask the
common supported CPUID value of KVM. It can avoid newly added supported
features (reported via KVM_GET_SUPPORTED_CPUID) for common VMs being
falsely reported as supported for TDX.
As the first step, initialize the TDX supported CPUID set with all the
configurable CPUID bits. It's not complete because there are other CPUID
bits are supported for TDX but not reported as directly configurable.
E.g. the XFAM related bits, attribute related bits and fixed-1 bits.
They will be handled in the future.
Also, what matters are the CPUID bits related to QEMU's feature word.
Only mask the CPUID leafs which are feature word leaf.
Xiaoyao Li [Thu, 8 May 2025 14:59:48 +0000 (10:59 -0400)]
cpu: Don't set vcpu_dirty when guest_state_protected
QEMU calls kvm_arch_put_registers() when vcpu_dirty is true in
kvm_vcpu_exec(). However, for confidential guest, like TDX, putting
registers is disallowed due to guest state is protected.
Only set vcpu_dirty to true with guest state is not protected when
creating the vcpu.
Xiaoyao Li [Thu, 8 May 2025 14:59:46 +0000 (10:59 -0400)]
i386/tdx: Only configure MSR_IA32_UCODE_REV in kvm_init_msrs() for TDs
For TDs, only MSR_IA32_UCODE_REV in kvm_init_msrs() can be configured
by VMM, while the features enumerated/controlled by other MSRs except
MSR_IA32_UCODE_REV in kvm_init_msrs() are not under control of VMM.
Isaku Yamahata [Thu, 8 May 2025 14:59:45 +0000 (10:59 -0400)]
i386/tdx: Don't synchronize guest tsc for TDs
TSC of TDs is not accessible and KVM doesn't allow access of
MSR_IA32_TSC for TDs. To avoid the assert() in kvm_get_tsc, make
kvm_synchronize_all_tsc() noop for TDs,
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com> Reviewed-by: Connor Kuehl <ckuehl@redhat.com> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Acked-by: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Link: https://lore.kernel.org/r/20250508150002.689633-40-xiaoyao.li@intel.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Xiaoyao Li [Thu, 8 May 2025 14:59:44 +0000 (10:59 -0400)]
i386/tdx: Set and check kernel_irqchip mode for TDX
KVM mandates kernel_irqchip to be split mode.
Set it to split mode automatically when users don't provide an explicit
value, otherwise check it to be the split mode.
Suggested-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Link: https://lore.kernel.org/r/20250508150002.689633-39-xiaoyao.li@intel.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Xiaoyao Li [Thu, 8 May 2025 14:59:43 +0000 (10:59 -0400)]
i386/tdx: Disable PIC for TDX VMs
Legacy PIC (8259) cannot be supported for TDX VMs since TDX module
doesn't allow directly interrupt injection. Using posted interrupts
for the PIC is not a viable option as the guest BIOS/kernel will not
do EOI for PIC IRQs, i.e. will leave the vIRR bit set.
Hence disable PIC for TDX VMs and error out if user wants PIC.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Acked-by: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Link: https://lore.kernel.org/r/20250508150002.689633-38-xiaoyao.li@intel.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Xiaoyao Li [Thu, 8 May 2025 14:59:42 +0000 (10:59 -0400)]
i386/tdx: Disable SMM for TDX VMs
TDX doesn't support SMM and VMM cannot emulate SMM for TDX VMs because
VMM cannot manipulate TDX VM's memory.
Disable SMM for TDX VMs and error out if user requests to enable SMM.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Acked-by: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Link: https://lore.kernel.org/r/20250508150002.689633-37-xiaoyao.li@intel.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Xiaoyao Li [Thu, 8 May 2025 14:59:41 +0000 (10:59 -0400)]
i386/tdx: Set kvm_readonly_mem_enabled to false for TDX VM
TDX only supports readonly for shared memory but not for private memory.
In the view of QEMU, it has no idea whether a memslot is used as shared
memory of private. Thus just mark kvm_readonly_mem_enabled to false to
TDX VM for simplicity.
Xiaoyao Li [Thu, 8 May 2025 14:59:39 +0000 (10:59 -0400)]
i386/cpu: Introduce enable_cpuid_0x1f to force exposing CPUID 0x1f
Currently, QEMU exposes CPUID 0x1f to guest only when necessary, i.e.,
when topology level that cannot be enumerated by leaf 0xB, e.g., die or
module level, are configured for the guest, e.g., -smp xx,dies=2.
However, TDX architecture forces to require CPUID 0x1f to configure CPU
topology.
Introduce a bool flag, enable_cpuid_0x1f, in CPU for the case that
requires CPUID leaf 0x1f to be exposed to guest.
Introduce a new function x86_has_cpuid_0x1f(), which is the wrapper of
cpu->enable_cpuid_0x1f and x86_has_extended_topo() to check if it needs
to enable cpuid leaf 0x1f for the guest.
Xiaoyao Li [Thu, 8 May 2025 14:59:34 +0000 (10:59 -0400)]
i386/tdx: Handle KVM_SYSTEM_EVENT_TDX_FATAL
TD guest can use TDG.VP.VMCALL<REPORT_FATAL_ERROR> to request
termination. KVM translates such request into KVM_EXIT_SYSTEM_EVENT with
type of KVM_SYSTEM_EVENT_TDX_FATAL.
Add hanlder for such exit. Parse and print the error message, and
terminate the TD guest in the handler.
Xiaoyao Li [Thu, 8 May 2025 14:59:33 +0000 (10:59 -0400)]
i386/tdx: Enable user exit on KVM_HC_MAP_GPA_RANGE
KVM translates TDG.VP.VMCALL<MapGPA> to KVM_HC_MAP_GPA_RANGE, and QEMU
needs to enable user exit on KVM_HC_MAP_GPA_RANGE in order to handle the
memory conversion requested by TD guest.
Xiaoyao Li [Thu, 8 May 2025 14:59:29 +0000 (10:59 -0400)]
i386/tdx: Setup the TD HOB list
The TD HOB list is used to pass the information from VMM to TDVF. The TD
HOB must include PHIT HOB and Resource Descriptor HOB. More details can
be found in TDVF specification and PI specification.
Build the TD HOB in TDX's machine_init_done callback.
Co-developed-by: Isaku Yamahata <isaku.yamahata@intel.com> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com> Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Acked-by: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Link: https://lore.kernel.org/r/20250508150002.689633-24-xiaoyao.li@intel.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Xiaoyao Li [Thu, 8 May 2025 14:59:28 +0000 (10:59 -0400)]
headers: Add definitions from UEFI spec for volumes, resources, etc...
Add UEFI definitions for literals, enums, structs, GUIDs, etc... that
will be used by TDX to build the UEFI Hand-Off Block (HOB) that is passed
to the Trusted Domain Virtual Firmware (TDVF).
All values come from the UEFI specification [1], PI spec [2] and TDVF
design guide[3].
Xiaoyao Li [Thu, 8 May 2025 14:59:27 +0000 (10:59 -0400)]
i386/tdx: Track RAM entries for TDX VM
The RAM of TDX VM can be classified into two types:
- TDX_RAM_UNACCEPTED: default type of TDX memory, which needs to be
accepted by TDX guest before it can be used and will be all-zeros
after being accepted.
- TDX_RAM_ADDED: the RAM that is ADD'ed to TD guest before running, and
can be used directly. E.g., TD HOB and TEMP MEM that needed by TDVF.
Maintain TdxRamEntries[] which grabs the initial RAM info from e820 table
and mark each RAM range as default type TDX_RAM_UNACCEPTED.
Then turn the range of TD HOB and TEMP MEM to TDX_RAM_ADDED since these
ranges will be ADD'ed before TD runs and no need to be accepted runtime.
The TdxRamEntries[] are later used to setup the memory TD resource HOB
that passes memory info from QEMU to TDVF.
Xiaoyao Li [Thu, 8 May 2025 14:59:26 +0000 (10:59 -0400)]
i386/tdx: Track mem_ptr for each firmware entry of TDVF
For each TDVF sections, QEMU needs to copy the content to guest
private memory via KVM API (KVM_TDX_INIT_MEM_REGION).
Introduce a field @mem_ptr for TdxFirmwareEntry to track the memory
pointer of each TDVF sections. So that QEMU can add/copy them to guest
private memory later.
TDVF sections can be classified into two groups:
- Firmware itself, e.g., TDVF BFV and CFV, that located separately from
guest RAM. Its memory pointer is the bios pointer.
- Sections located at guest RAM, e.g., TEMP_MEM and TD_HOB.
mmap a new memory range for them.
Register a machine_init_done callback to do the stuff.