Luc Michel [Fri, 26 Sep 2025 07:07:23 +0000 (09:07 +0200)]
hw/arm/xlnx-versal: sdhci: refactor creation
Refactor the SDHCI controllers creation using the VersalMap structure.
Signed-off-by: Luc Michel <luc.michel@amd.com> Reviewed-by: Francisco Iglesias <francisco.iglesias@amd.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@amd.com> Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250926070806.292065-6-luc.michel@amd.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Luc Michel [Fri, 26 Sep 2025 07:07:22 +0000 (09:07 +0200)]
hw/arm/xlnx-versal: canfd: refactor creation
Refactor the CAN controllers creation using the VersalMap structure.
Note that the connection to the CRL is removed for now and will be
re-added by next commits.
The xlnx-versal-virt machine now dynamically creates the correct amount
of CAN bus link properties based on the number of CAN controller
advertised by the SoC.
Signed-off-by: Luc Michel <luc.michel@amd.com> Reviewed-by: Francisco Iglesias <francisco.iglesias@amd.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@amd.com> Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250926070806.292065-5-luc.michel@amd.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Luc Michel [Fri, 26 Sep 2025 07:07:21 +0000 (09:07 +0200)]
hw/arm/xlnx-versal: uart: refactor creation
Refactor the UARTs creations. The VersalMap struct is now used to
describe the SoC and its peripherals. For now it contains the two UARTs
mapping information. The creation function now embeds the FDT creation
logic as well. The devices are now created dynamically using qdev_new
and (qdev|sysbus)_realize_and_unref.
This will allow to rely entirely on the VersalMap structure to create
the SoC and allow easy addition of new SoCs of the same family (like
versal2 coming with next commits).
Note that the connection to the CRL is removed for now and will be
re-added by next commits.
Signed-off-by: Luc Michel <luc.michel@amd.com> Reviewed-by: Francisco Iglesias <francisco.iglesias@amd.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@amd.com> Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250926070806.292065-4-luc.michel@amd.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Luc Michel [Fri, 26 Sep 2025 07:07:20 +0000 (09:07 +0200)]
hw/arm/xlnx-versal: prepare for FDT creation
The following commits will move FDT creation logic from the
xlnx-versal-virt machine to the xlnx-versal SoC itself. Prepare this by
passing the FDT handle to the SoC before it is realized.
For now the SoC only creates the two clock nodes. The ones from the
xlnx-versal virt machine are renamed with a `old-' prefix and will be
removed once they are not referenced anymore.
Signed-off-by: Luc Michel <luc.michel@amd.com> Reviewed-by: Francisco Iglesias <francisco.iglesias@amd.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@amd.com> Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250926070806.292065-3-luc.michel@amd.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Luc Michel [Fri, 26 Sep 2025 07:07:19 +0000 (09:07 +0200)]
hw/arm/xlnx-versal: split the xlnx-versal type
Split the xlnx-versal device into two classes, a base, abstract class
and the existing concrete one. Introduce a VersalVersion type that will
be used across several device models when versal2 implementation is
added.
This is in preparation for versal2 implementation.
Signed-off-by: Luc Michel <luc.michel@amd.com> Reviewed-by: Francisco Iglesias <francisco.iglesias@amd.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@amd.com> Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250926070806.292065-2-luc.michel@amd.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Thu, 25 Sep 2025 11:57:23 +0000 (12:57 +0100)]
target/arm: Don't set HCR.RW for AArch32 only CPUs
In commit 39ec3fc0301 we fixed a bug where we were not implementing
HCR_EL2.RW as RAO/WI for CPUs where EL1 doesn't support AArch32.
However, we got the condition wrong, so we now set this bit even on
CPUs which have no AArch64 support at all. This is wrong because the
AArch32 HCR register defines this bit as RES0.
Correct the condition we use for forcing HCR_RW to be set.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3128 Fixes: 39ec3fc0301 ("target/arm: HCR_EL2.RW should be RAO/WI if EL1 doesn't support AArch32") Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250925115723.1293233-1-peter.maydell@linaro.org
Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging
virtio,pci,pc: features, fixes
users can now control VM bit in smbios.
vhost-user-device is now user-createable.
intel_iommu now supports PRI
virtio-net now supports GSO over UDP tunnel
ghes now supports error injection
amd iommu now supports dma remapping for vfio
better error messages for virtio
small fixes all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCgAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmji0s0PHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRpuH4H/09h70IqAWZGHIWKGmmGGtdKOj3g54KuI0Ss
# mGECEsHvvBexOy670Qy8jdgXfaW4UuNui8BiOnJnGsBX8Y0dy+/yZori3KhkXkaY
# D57Ap9agkpHem7Vw0zgNsAF2bzDdlzTiQ6ns5oDnSq8yt82onCb5WGkWTGkPs/jL
# Gf8Jv+Ddcpt5SU4/hHPYC8pUhl7z4xPOOyl0Qp1GG21Pxf5v4sGFcWuGGB7UEPSQ
# MjZeoM0rSnLDtNg18sGwD5RPLQs13TbtgsVwijI79c3w3rcSpPNhGR5OWkdRCIYF
# 8A0Nhq0Yfo0ogTht7yt1QNPf/ktJkuoBuGVirvpDaix2tCBECes=
# =Zvq/
# -----END PGP SIGNATURE-----
# gpg: Signature made Sun 05 Oct 2025 01:19:25 PM PDT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [unknown]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (75 commits)
virtio: improve virtqueue mapping error messages
pci: Fix wrong parameter passing to pci_device_get_iommu_bus_devfn()
intel_iommu: Simplify caching mode check with VFIO device
intel_iommu: Enable Enhanced Set Root Table Pointer Support (ESRTPS)
vdpa-dev: add get_vhost() callback for vhost-vdpa device
amd_iommu: HATDis/HATS=11 support
intel-iommu: Move dma_translation to x86-iommu
amd_iommu: Refactor amdvi_page_walk() to use common code for page walk
amd_iommu: Do not assume passthrough translation when DTE[TV]=0
amd_iommu: Toggle address translation mode on devtab entry invalidation
amd_iommu: Add dma-remap property to AMD vIOMMU device
amd_iommu: Set all address spaces to use passthrough mode on reset
amd_iommu: Toggle memory regions based on address translation mode
amd_iommu: Invalidate address translations on INVALIDATE_IOMMU_ALL
amd_iommu: Add replay callback
amd_iommu: Unmap all address spaces under the AMD IOMMU on reset
amd_iommu: Use iova_tree records to determine large page size on UNMAP
amd_iommu: Sync shadow page tables on page invalidation
amd_iommu: Add basic structure to support IOMMU notifier updates
amd_iommu: Add a page walker to sync shadow page tables on invalidation
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Merge tag 'pull-trivial-patches' of https://gitlab.com/mjt0k/qemu into staging
trivial patches for 2025-10-05
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCgAdFiEEZKoqtTHVaQM2a/75gqpKJDselHgFAmjiFiUACgkQgqpKJDse
# lHitTBAAnv7RI9gCW7cc1y+BDPl+gqRuo8f+d1Mg/bUMf5BVjtPYUjzlwW1dShT+
# W4cOwBZbuKt/EqodQvkBpaZrv3mlfVSmSWAYy5egL5naRgHOqnzcbt3nVSpJBfrI
# VWMYJDT8owU80gRpeJ70UEQgBUxJ6ipnBgTuo6ILfDYAaOGIgvh6UMqdPQqHjULV
# CCv+TgIs6Uu4clZvPunHjaqnocvxnSWzpn6sy0xzk94QDNTW2ijBMAEhjKcUE9GA
# 2UhqVeiHvRVhAkGOTR2JNXtwwl864JHtJ4TqfilE8OUVF2+KcG3t5j8SI0JLgvRz
# sjHcFaOuVQXz2xVQv1dD6xVeq0YxkMIHMVe0ScN4WtVNTc8y5zdUSaphEcT7ELWe
# a4juN1qTqJ7B0h1BqCJoKY6fhWAcKhQccESKmXoxuiiXTOJb436F0IXPWzeByg+2
# Hm+dtgjCoOaR8KRgx7dS3zowMFCUDE0HqyHQVj974455bzlwdc3LIO2KniLtbZgV
# rt6JWbn5poBcRkSBXV2SI78dls10Dn+So/ecmJjxMz+61hBN+t6oqTdaaPaLnyUL
# yDwwn5Ji91hQO+hBL4+wRw+Ssbmz8YcdEyacbtgIE+erS0lHt9Df/21rSTSI3xRl
# 0sj+WTSQ1Ar7CwE8veVNqvgdZRc45kHvBwYXyxgLADmcGqok5wM=
# =gc8G
# -----END PGP SIGNATURE-----
# gpg: Signature made Sat 04 Oct 2025 11:54:29 PM PDT
# gpg: using RSA key 64AA2AB531D56903366BFEF982AA4A243B1E9478
# gpg: Good signature from "Michael Tokarev <mjt@debian.org>" [unknown]
# gpg: aka "Michael Tokarev <mjt@corpit.ru>" [unknown]
# gpg: aka "Michael Tokarev <mjt@tls.msk.ru>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 9D8B E14E 3F2A 9DD7 9199 28F1 61AD 3D98 ECDF 2C8E
# Subkey fingerprint: 64AA 2AB5 31D5 6903 366B FEF9 82AA 4A24 3B1E 9478
* tag 'pull-trivial-patches' of https://gitlab.com/mjt0k/qemu:
system/runstate: remove duplicate in runstate transitions
docs/specs/spdm.rst: Fix typo in x86_64 architecture name
docs/devel: Correct uefi-vars-x64 device name
wdt_i6300esb: fix incorrect mask for interrupt type
hid: fix incorrect return value for hid
vhost-user-test: remove trailing newlines in g_test_message() calls
hw/net/can: Remove redundant status bit setting in can_sja1000
ui/gtk: Fix callback function signature
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
qemu_bh_new_guarded() is considered legacy since commit 9c86c97f12c
("async: Add an optional reentrancy guard to the BH API"); recommend
the new API: aio_bh_new_guarded().
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20250924163911.51479-1-philmd@linaro.org> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Improve error reporting when virtqueue ring mapping fails by including a
device identifier in the error message.
Introduce a helper qdev_get_printable_name() in qdev-core, which returns
either:
- the device ID, if explicitly provided (e.g. -device ...,id=foo)
- the QOM path from qdev_get_dev_path(dev) otherwise
- "<unknown device>" as a fallback when no identifier is present
This makes it easier to identify which device triggered the error in
multi-device setups or when debugging complex guest configurations.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/230 Buglink: https://bugs.launchpad.net/qemu/+bug/1919021 Suggested-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Alessandro Ratti <alessandro@0x65c.net>
Message-Id: <20250924093138.559872-2-alessandro@0x65c.net> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
pci: Fix wrong parameter passing to pci_device_get_iommu_bus_devfn()
The 2nd parameter of pci_device_get_iommu_bus_devfn() about root PCIBus
backed by an IOMMU for the PCI device, the 3rd is about aliased PCIBus
of the PCI device.
Meanwhile the 3rd and 4th parameters are optional, pass NULL if they
are not needed.
Reviewed-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250929034206.439266-4-zhenzhong.duan@intel.com> Fixes: a849ff5d6f ("pci: Add a pci-level initialization function for IOMMU notifiers") Fixes: f0f37daf8e ("pci: Add a PCI-level API for PRI") Fixes: e9b457500a ("pci: Add a pci-level API for ATS") Fixes: 042cbc9aec ("pci: Add an API to get IOMMU's min page size and virtual address width") Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
intel_iommu: Simplify caching mode check with VFIO device
In early days, we had different tricks to ensure caching-mode=on with VFIO
device:
28cf553afe ("intel_iommu: Sanity check vfio-pci config on machine init done") c6cbc29d36 ("pc/q35: Disallow vfio-pci hotplug without VT-d caching mode")
There is also a patch with the same purpose but for VDPA device:
b8d78277c0 ("intel-iommu: fail MAP notifier without caching mode")
Because without caching mode, MAP notifier won't work correctly since guest
won't send IOTLB update event when it establishes new mappings in the I/O page
tables.
Now with host IOMMU device interface between VFIO and vIOMMU, we can simplify
first two commits above with a small check in set_iommu_device(). This also
works for future IOMMUFD backed VDPA implementation which may also need caching
mode on. But for legacy VDPA we still need commit b8d78277c0 as it doesn't
use the host IOMMU device interface.
For coldplug VFIO device:
qemu-system-x86_64: -device vfio-pci,host=0000:3b:00.0,id=hostdev3,bus=root0,iommufd=iommufd0: vfio 0000:3b:00.0: Failed to set vIOMMU: Device assignment is not allowed without enabling caching-mode=on for Intel IOMMU.
For hotplug VFIO device:
if "iommu=off" is configured in guest,
Error: vfio 0000:3b:00.0: Failed to set vIOMMU: Device assignment is not allowed without enabling caching-mode=on for Intel IOMMU.
else
Error: vfio 0000:3b:00.0: memory listener initialization failed: Region vtd-00.0-dmar: device 01.00.0 requires caching mode: Operation not supported
The specialty for hotplug is due to the check in commit b8d78277c0 happen before
the check in set_iommu_device.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250929034206.439266-3-zhenzhong.duan@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
intel_iommu: Enable Enhanced Set Root Table Pointer Support (ESRTPS)
According to VTD spec rev 4.1 section 6.6:
"For implementations reporting the Enhanced Set Root Table Pointer Support
(ESRTPS) field as Clear, on a 'Set Root Table Pointer' operation, software
must perform a global invalidate of the context cache, PASID-cache (if
applicable), and IOTLB, in that order. This is required to ensure hardware
references only the remapping structures referenced by the new root table
pointer and not stale cached entries.
For implementations reporting the Enhanced Set Root Table Pointer Support
(ESRTPS) field as Set, as part of 'Set Root Table Pointer' operation,
hardware performs global invalidation on all DMA remapping translation
caches and hence software is not required to perform additional
invalidations"
We already implemented ESRTPS capability in vtd_handle_gcmd_srtp() by
calling vtd_reset_caches(), just set ESRTPS in DMAR_CAP_REG to avoid
unnecessary global invalidation requests of context, PASID-cache and
IOTLB from guest.
This change doesn't impact migration as the content of DMAR_CAP_REG is
migrated too.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250929034206.439266-2-zhenzhong.duan@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Li Zhaoxin [Fri, 26 Sep 2025 11:08:17 +0000 (19:08 +0800)]
vdpa-dev: add get_vhost() callback for vhost-vdpa device
Commit c255488d67 "virtio: add vhost support for virtio devices"
added the get_vhost() function, but it did not include vhost-vdpa devices.
So when I use the vdpa device and query the status of the vdpa device
with the x-query-virtio-status qmp command, since vdpa does not implement
vhost_get, it will cause qemu to crash.
Therefore, in order to obtain the status of the virtio device under vhost-vdpa,
we need to add a vhost_get implement for the vdpa device.
Co-developed-by: Miao Kezhan <miaokezhan@baidu.com> Signed-off-by: Miao Kezhan <miaokezhan@baidu.com> Signed-off-by: Li Zhaoxin <lizhaoxin04@baidu.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <2778f817cb6740a15ecb37927804a67288b062d1.1758860411.git.lizhaoxin04@baidu.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add a way to disable DMA translation support in AMD IOMMU by
allowing to set IVHD HATDis to 1, and exposing HATS (Host Address
Translation Size) as Reserved value.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250919213515.917111-23-alejandro.j.jimenez@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
To be later reused by AMD, now that it shares similar property.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250919213515.917111-22-alejandro.j.jimenez@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
amd_iommu: Refactor amdvi_page_walk() to use common code for page walk
Simplify amdvi_page_walk() by making it call the fetch_pte() helper that is
already in use by the shadow page synchronization code. Ensures all code
uses the same page table walking algorithm.
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250919213515.917111-21-alejandro.j.jimenez@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
amd_iommu: Do not assume passthrough translation when DTE[TV]=0
The AMD I/O Virtualization Technology (IOMMU) Specification (see Table
8: V, TV, and GV Fields in Device Table Entry), specifies that a DTE
with V=1, TV=0 does not contain a valid address translation information.
If a request requires a table walk, the walk is terminated when this
condition is encountered.
Do not assume that addresses for a device with DTE[TV]=0 are passed
through (i.e. not remapped) and instead terminate the page table walk
early.
Fixes: d29a09ca6842 ("hw/i386: Introduce AMD IOMMU") Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250919213515.917111-20-alejandro.j.jimenez@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
amd_iommu: Toggle address translation mode on devtab entry invalidation
A guest must issue an INVALIDATE_DEVTAB_ENTRY command after changing a
Device Table entry (DTE) e.g. after attaching a device and setting up its
DTE. When intercepting this event, determine if the DTE has been configured
for paging or not, and toggle the appropriate memory regions to allow DMA
address translation for the address space if needed. Requires dma-remap=on.
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250919213515.917111-19-alejandro.j.jimenez@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
amd_iommu: Add dma-remap property to AMD vIOMMU device
In order to enable device assignment with IOMMU protection and guest DMA
address translation, IOMMU MAP notifier support is necessary to allow users
like VFIO to synchronize the shadow page tables i.e. to receive
notifications when the guest updates its I/O page tables and replay the
mappings onto host I/O page tables.
Provide a new dma-remap property to govern the ability to register for MAP
notifications, effectively providing global control over the DMA address
translation functionality that was implemented in previous changes.
Note that DMA remapping support also requires the vIOMMU is configured with
the NpCache capability, so a guest driver issues IOMMU invalidations for
both map() and unmap() operations. This capability is already set by default
and written to the configuration in amdvi_pci_realize() as part of
AMDVI_CAPAB_FEATURES.
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250919213515.917111-18-alejandro.j.jimenez@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
amd_iommu: Set all address spaces to use passthrough mode on reset
On reset, restore the default address translation mode (passthrough) for all
the address spaces managed by the vIOMMU.
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250919213515.917111-17-alejandro.j.jimenez@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
amd_iommu: Toggle memory regions based on address translation mode
Enable the appropriate memory region for an address space depending on the
address translation mode selected for it. This is currently based on a
generic x86 IOMMU property, and only done during the address space
initialization. Extract the code into a helper and toggle the regions based
on whether the specific address space is using address translation (via the
newly introduced addr_translation field). Later, region activation will also
be controlled by availability of DMA remapping capability (via dma-remap
property to be introduced in follow up changes).
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250919213515.917111-16-alejandro.j.jimenez@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
amd_iommu: Invalidate address translations on INVALIDATE_IOMMU_ALL
When the kernel IOMMU driver issues an INVALIDATE_IOMMU_ALL, the address
translation and interrupt remapping information must be cleared for all
Device IDs and all domains. Introduce a helper to sync the shadow page table
for all the address spaces with registered notifiers, which replays both MAP
and UNMAP events.
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250919213515.917111-15-alejandro.j.jimenez@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
A replay() method is necessary to efficiently synchronize the host page
tables after VFIO registers a notifier for IOMMU events. It is called to
ensure that existing mappings from an IOMMU memory region are "replayed" to
a specified notifier, initializing or updating the shadow page tables on the
host.
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250919213515.917111-14-alejandro.j.jimenez@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
amd_iommu: Unmap all address spaces under the AMD IOMMU on reset
Support dropping all existing mappings on reset. When the guest kernel
reboots it will create new ones, but other components that run before
the kernel (e.g. OVMF) should not be able to use existing mappings from
the previous boot.
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250919213515.917111-13-alejandro.j.jimenez@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
amd_iommu: Use iova_tree records to determine large page size on UNMAP
Keep a record of mapped IOVA ranges per address space, using the iova_tree
implementation. Besides enabling optimizations like avoiding unnecessary
notifications, a record of existing <IOVA, size> mappings makes it possible
to determine if a specific IOVA is mapped by the guest using a large page,
and adjust the size when notifying UNMAP events.
When unmapping a large page, the information in the guest PTE encoding the
page size is lost, since the guest clears the PTE before issuing the
invalidation command to the IOMMU. In such case, the size of the original
mapping can be retrieved from the iova_tree and used to issue the UNMAP
notification. Using the correct size is essential since the VFIO IOMMU
Type1v2 driver in the host kernel will reject unmap requests that do not
fully cover previous mappings.
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250919213515.917111-12-alejandro.j.jimenez@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
amd_iommu: Sync shadow page tables on page invalidation
When the guest issues an INVALIDATE_IOMMU_PAGES command, decode the address
and size of the invalidation and sync the guest page table state with the
host. This requires walking the guest page table and calling notifiers
registered for address spaces matching the domain ID encoded in the command.
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250919213515.917111-11-alejandro.j.jimenez@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
amd_iommu: Add basic structure to support IOMMU notifier updates
Add the minimal data structures required to maintain a list of address
spaces (i.e. devices) with registered notifiers, and to update the type of
events that require notifications.
Note that the ability to register for MAP notifications is not available.
It will be unblocked by following changes that enable the synchronization of
guest I/O page tables with host IOMMU state, at which point an amd-iommu
device property will be introduced to control this capability.
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250919213515.917111-10-alejandro.j.jimenez@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
amd_iommu: Add a page walker to sync shadow page tables on invalidation
For the specified address range, walk the page table identifying regions
as mapped or unmapped and invoke registered notifiers with the
corresponding event type.
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250919213515.917111-9-alejandro.j.jimenez@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
amd_iommu: Add helpers to walk AMD v1 Page Table format
The current amdvi_page_walk() is designed to be called by the replay()
method. Rather than drastically altering it, introduce helpers to fetch
guest PTEs that will be used by a page walker implementation.
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250919213515.917111-8-alejandro.j.jimenez@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
amd_iommu: Return an error when unable to read PTE from guest memory
Make amdvi_get_pte_entry() return an error value (-1) in cases where the
memory read fails, versus the current return of 0 to indicate failure.
The reason is that 0 is also a valid value to have stored in the PTE in
guest memory i.e. the guest does not have a mapping. Before this change,
amdvi_get_pte_entry() returned 0 for both an error and for empty PTEs, but
the page walker implementation that will be introduced in upcoming changes
needs a method to differentiate between the two scenarios.
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250919213515.917111-7-alejandro.j.jimenez@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Extracting the DTE from a given AMDVIAddressSpace pointer structure is a
common operation required for syncing the shadow page tables. Implement a
helper to do it and check for common error conditions.
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250919213515.917111-6-alejandro.j.jimenez@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
amd_iommu: Helper to decode size of page invalidation command
The size of the region to invalidate depends on the S bit and address
encoded in the command. Add a helper to extract this information, which
will be used to sync shadow page tables in upcoming changes.
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250919213515.917111-5-alejandro.j.jimenez@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Move code related to Device Table and Page Table to an earlier location in
the file, where it does not require forward declarations to be used by the
various invalidation functions that will need to query the DTE and walk the
page table in upcoming changes.
This change consist of code movement only, no functional change intended.
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250919213515.917111-4-alejandro.j.jimenez@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
amd_iommu: Document '-device amd-iommu' common options
Document the common parameters used when emulating AMD vIOMMU.
Besides the two amd-iommu specific options: 'xtsup' and 'dma-remap', the
the generic x86 IOMMU option 'intremap' is also included, since it is
typically specified in QEMU command line examples and mailing list threads.
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250919213515.917111-3-alejandro.j.jimenez@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
memory: Adjust event ranges to fit within notifier boundaries
Invalidating the entire address space (i.e. range of [0, ~0ULL]) is a
valid and required operation by vIOMMU implementations. However, such
invalidations currently trigger an assertion unless they originate from
device IOTLB invalidations.
Although in recent Linux guests this case is not exercised by the VTD
implementation due to various optimizations, the assertion will be hit
by upcoming AMD vIOMMU changes to support DMA address translation. More
specifically, when running a Linux guest with VFIO passthrough device,
and a kernel that does not contain commmit 3f2571fed2fa ("iommu/amd:
Remove redundant domain flush from attach_device()").
Remove the assertion altogether and adjust the range to ensure it does
not cross notifier boundaries.
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Peter Xu <peterx@redhat.com>
Message-Id: <20201116165506.31315-6-eperezma@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250919213515.917111-2-alejandro.j.jimenez@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Viktor Kurilko [Fri, 8 Aug 2025 14:29:25 +0000 (21:29 +0700)]
Add a feature for mapping a host unix socket to a guest tcp socket
This patch adds the ability to map a host unix socket to a guest tcp socket when
using the slirp backend. This feature was added in libslirp version 4.7.0.
A new syntax for unix socket: -hostfwd=unix:hostpath-[guestaddr]:guestport
Signed-off-by: Viktor Kurilko <murlockkinght@gmail.com> Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Message-ID: <20250808143904.363907-1-murlockkinght@gmail.com>
This results in undefined behavior when dev->exp.sriov_cap is 0 because
this is not an SR-IOV device. For example, unparent_vfs() segfaults when
total_vfs happens to be non-zero.
Fix this by returning early from pcie_sriov_pf_exit() when
dev->exp.sriov_cap is 0 because this is not an SR-IOV device.
Cc: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> Cc: Michael S. Tsirkin <mst@redhat.com> Reported-by: Qing Wang <qinwang@redhat.com> Buglink: https://issues.redhat.com/browse/RHEL-116443 Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> Fixes: cab1398a60eb ("pcie_sriov: Reuse SR-IOV VF device instances") Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250924155153.579495-1-stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Stefan Hajnoczi [Mon, 22 Sep 2025 22:01:49 +0000 (18:01 -0400)]
tests/virtio-scsi: add a virtio_error() IOThread test
Now that virtio_error() calls should work in an IOThread, add a
virtio-scsi IOThread test cases that triggers virtio_error().
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250922220149.498967-6-stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Stefan Hajnoczi [Mon, 22 Sep 2025 22:01:48 +0000 (18:01 -0400)]
tests/libqos: extract qvirtqueue_set_avail_idx()
Setting the vring's avail.idx can be useful for low-level VIRTIO tests,
especially for testing error scenarios with invalid vrings. Extract it
into a new function so that the next commit can add a test that uses
this new test API.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250922220149.498967-5-stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Stefan Hajnoczi [Mon, 22 Sep 2025 22:01:47 +0000 (18:01 -0400)]
virtio: support irqfd in virtio_notify_config()
virtio_error() calls virtio_notify_config() to inject a VIRTIO
Configuration Change Notification. This doesn't work from IOThreads
because the BQL is not held and the interrupt code path requires the
BQL.
Follow the same approach as virtio_notify() and use ->config_notifier
(an irqfd) when called from the IOThread.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250922220149.498967-4-stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Stefan Hajnoczi [Mon, 22 Sep 2025 22:01:46 +0000 (18:01 -0400)]
virtio: unify virtio_notify_irqfd() and virtio_notify()
The difference between these two functions:
- virtio_notify() uses the interrupt code path (MSI or classic IRQs)
- virtio_notify_irqfd() uses guest notifiers (irqfds)
virtio_notify() can only be called with the BQL held because the
interrupt code path requires the BQL. Device models use
virtio_notify_irqfd() from IOThreads since the BQL is not held.
The two functions can be unified by pushing down the if
(qemu_in_iothread()) check from virtio-blk and virtio-scsi into core
virtio code. This is in preparation for the next commit that will add
irqfd support to virtio_notify_config() and where it's unattractive to
introduce another irqfd-only API for device model callers.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250922220149.498967-3-stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Stefan Hajnoczi [Mon, 22 Sep 2025 22:01:45 +0000 (18:01 -0400)]
vhost: use virtio_config_get_guest_notifier()
There is a getter function so avoid accessing the ->config_notifier
field directly.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250922220149.498967-2-stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Fixes: 920557971b6 (ich9: add TCO interface emulation) Signed-off-by: Igor Mammedov <imammedo@redhat.com> Tested-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250922132600.562193-1-imammedo@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250901111630.1018573-6-clement.mathieu--drif@eviden.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250901111630.1018573-5-clement.mathieu--drif@eviden.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250901111630.1018573-4-clement.mathieu--drif@eviden.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
wait_desc with SW=0,IF=0,FN=1 must not be considered as an
invalid descriptor as it is used to implement section 7.10 of
the VT-d spec.
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250901111630.1018573-3-clement.mathieu--drif@eviden.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
pcie: Add a way to get the outstanding page request allocation (pri) from the config space.
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250901111630.1018573-2-clement.mathieu--drif@eviden.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Igor Mammedov [Mon, 1 Sep 2025 08:49:15 +0000 (10:49 +0200)]
smbios: cap DIMM size to 2Tb as workaround for broken Windows
With current limit set to match max spec size (2PTb),
Windows fails to parse type 17 records when DIMM size reaches 4Tb+.
Failure happens in GetPhysicallyInstalledSystemMemory() function,
and fails "Check SMBIOS System Memory Tables" SVVP test.
Though not fatal, it might cause issues for userspace apps,
something like [1].
Lets cap default DIMM size to 2Tb for now, until MS fixes it.
PS: It's obvious 32 int overflow math somewhere in Windows,
MS admitted that it's Windows bug and in a process of fixing it.
However it's unclear if W10 and earlier would get the fix.
So however I dislike changing defaults, we heed to work around
the issue (it looks like QEMU regression while not being it).
Hopefully 2Tb/DIMM split will last longer until VM memory size
will become large enough to cause to many type 17 records issue
again.
PPS:
Alternatively, instead of messing with defaults, we can create
a dedicated knob to ask for desired DIMM size cap explicitly
on CLI. That will let users to enable workaround when they
hit this corner case. Downside is that knob has to be propagated
up all mgmt stack, which might be not desirable.
PPPS:
Yet alternatively, users can configure initial RAM to be less
than 4Tb and all additional RAM add as DIMMs on QEMU CLI.
(however it's the job to be done by mgmt which could know
Windows version and total amount of RAM)
Signed-off-by: Igor Mammedov <imammedo@redhat.com> Fixes: 62f182c97b ("smbios: make memory device size configurable per Machine") Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250901084915.2607632-1-imammedo@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Alex Bennée [Mon, 1 Sep 2025 10:59:48 +0000 (11:59 +0100)]
hw/virtio: rename vhost-user-device and make user creatable
We didn't make the device user creatable in the first place because we
were worried users might get confused. Rename the device to make its
nature as a test device even more explicit. While we are at it add a
Kconfig variable so it can be skipped for those that want to thin out
their build configuration even further.
Acked-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-ID: <20250820195632.1956795-1-alex.bennee@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250901105948.982583-1-alex.bennee@linaro.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
pcie_sriov: Fix broken MMIO accesses from SR-IOV VFs
Starting with commit cab1398a60eb, SR-IOV VFs are realized as soon as
pcie_sriov_pf_init() is called. Because pcie_sriov_pf_init() must be
called before pcie_sriov_pf_init_vf_bar(), the VF BARs types won't be
known when the VF realize function calls pcie_sriov_vf_register_bar().
This breaks the memory regions of the VFs (for instance with igbvf):
$ lspci
...
Region 0: Memory at 281a00000 (64-bit, prefetchable) [virtual] [size=16K]
Region 3: Memory at 281a20000 (64-bit, prefetchable) [virtual] [size=16K]
peng guo [Tue, 5 Aug 2025 14:23:00 +0000 (22:23 +0800)]
hw/i386/pc: Avoid overlap between CXL window and PCI 64bit BARs in QEMU
When using a CXL Type 3 device together with a virtio 9p device in QEMU on a
physical server, the 9p device fails to initialize properly. The kernel reports
the following error:
virtio: device uses modern interface but does not have VIRTIO_F_VERSION_1
9pnet_virtio virtio0: probe with driver 9pnet_virtio failed with error -22
Further investigation revealed that the 64-bit BAR space assigned to the 9pnet
device was overlapped by the memory window allocated for the CXL devices. As a
result, the kernel could not correctly access the BAR region, causing the
virtio device to malfunction.
To address this issue, this patch adds the reserved memory end calculation
for cxl devices to reserve sufficient address space and ensure that CXL memory
windows are allocated beyond all PCI 64-bit BARs. This prevents overlap with
64-bit BARs regions such as those used by virtio or other pcie devices,
resolving the conflict.
Fixes: 03b39fcf64bc ("hw/cxl: Make the CXL fixed memory window setup a machine parameter") Signed-off-by: peng guo <engguopeng@buaa.edu.cn> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250805142300.15226-1-engguopeng@buaa.edu.cn> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
hw/smbios: allow clearing the VM bit in SMBIOS table 0
This is useful to be able to freeze a specific version of SeaBIOS to
prevent guest visible changes between BIOS updates. This is currently
not possible since the extension byte 2 provided by SeaBIOS does not
set the VM bit, whereas QEMU sets it unconditionally.
Allowing to clear it also seems useful if we want to hide the fact that
the guest system is running inside a virtual machine.
Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250724195409.43499-1-d-tatianin@yandex-team.ru> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Such script allows customizing the error data, allowing to change
all fields at the record. Please use:
$ ghes_inject.py arm -h
For more details about its usage.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <5ea174638e33d23635332fa6d4ae9d751355f127.1758610789.git.mchehab+huawei@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
docs: hest: add new "etc/acpi_table_hest_addr" and update workflow
While the HEST layout didn't change, there are some internal
changes related to how offsets are calculated and how memory error
events are triggered.
Update specs to reflect such changes.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <e3e8bd92ce40d997c67ac1d4d973c0041b8f59fc.1758610789.git.mchehab+huawei@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <2253eb50df797ab320b4ca610bd22a38e5cfd17a.1758610789.git.mchehab+huawei@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
acpi/generic_event_device.c: enable use_hest_addr for QEMU 10.x
Now that we have everything in place, enable using HEST GPA
instead of etc/hardware_errors GPA.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <ad77b64aa1f09141efe942539445908631423975.1758610789.git.mchehab+huawei@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
qapi/acpi-hest: add an interface to do generic CPER error injection
Create a QMP command to be used for generic ACPI APEI hardware error
injection (HEST) via GHESv2, and add support for it for ARM guests.
Error injection uses ACPI_HEST_SRC_ID_QMP source ID to be platform
independent. This is mapped at arch virt bindings, depending on the
types supported by QEMU and by the BIOS. So, on ARM, this is supported
via ACPI_GHES_NOTIFY_GPIO notification type.
This patch was co-authored:
- original ghes logic to inject a simple ARM record by Shiju Jose;
- generic logic to handle block addresses by Jonathan Cameron;
- generic GHESv2 error inject by Mauro Carvalho Chehab;
Co-authored-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Co-authored-by: Shiju Jose <shiju.jose@huawei.com> Co-authored-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Acked-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <81e2118b3c8b7e5da341817f277d61251655e0db.1758610789.git.mchehab+huawei@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
arm/virt: Wire up a GED error device for ACPI / GHES
Adds support to ARM virtualization to allow handling
generic error ACPI Event via GED & error source device.
It is aligned with Linux Kernel patch:
https://lore.kernel.org/lkml/1272350481-27951-8-git-send-email-ying.huang@intel.com/
Co-authored-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Co-authored-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Acked-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <3237a76b1469d669436399495825348bf34122cd.1758610789.git.mchehab+huawei@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
tests/acpi: virt: allow acpi table changes at DSDT and HEST tables
We'll be adding a new GED device for HEST GPIO notification and
increasing the number of entries at the HEST table.
Blocklist testing HEST and DSDT tables until such changes
are completed.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Acked-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <7fca7eb9b801f1b196210f66538234b94bd31c23.1758610789.git.mchehab+huawei@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Marco Cavenati [Thu, 2 Oct 2025 14:41:45 +0000 (16:41 +0200)]
system/runstate: remove duplicate in runstate transitions
Remove duplicate entry PRELAUNCH->INMIGRATE from runstate_transitions_def.
Move PRELAUNCH->SUSPENDED entry with all the other PRELAUNCH->XXX entries.
Signed-off-by: Marco Cavenati <Marco.Cavenati@eurecom.fr> Reviewed-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Peter Maydell [Tue, 23 Sep 2025 12:01:18 +0000 (13:01 +0100)]
docs/specs/spdm.rst: Fix typo in x86_64 architecture name
The spdm.rst docs call the 64-bit x86 architecture "x64-64".
This is a typo; correct it to our canonical name for the
architecture, "x86_64".
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The documentation for UEFI variable storage in uefi-vars.rst
incorrectly listed the device name as `uefi-vars-x86`.
The correct device name as implemented in the source code is
`uefi-vars-x64`.
This commit updates the documentation to use the correct name,
aligning it with the implementation.
Signed-off-by: Nana Liu <nanliu@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
wdt_i6300esb: fix incorrect mask for interrupt type
According to Intel 6300ESB Controller Hub Datasheet 14.4.15, the interrupt
type mask should be 0x03 (0b11) instead of 0x11. In the original
implementation, when we want to disable all interrupt by setting the
value to 0x03, we will get 0x01 which is incorrect when we want to read
the value again. However, there is no problem when considering the correct
behavior since 0x01 is reserved and unused just like 0x03. This patch is
just a fix to return the register value.
Signed-off-by: ShengYi Hung <aokblast@FreeBSD.org> Reviewed-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The return value of hid_keyboard_write is used to set the packet's actual_length
and pass to xhci directly to allow guest know how many byte actually processed.
Therefore, return 1 to indicate a successful transfer or it will be
considered as a wrong xfer.
Signed-off-by: ShengYi Hung <aokblast@FreeBSD.org> Reviewed-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
hw/net/can: Remove redundant status bit setting in can_sja1000
In PeliCAN mode reception, the RBS (Receive Buffer Status) bit
is set twice at line 842 and 845 with identical operations:
s->status_pel |= 0x01;
s->status_pel |= (1 << 0);
Between these two operations, only interrupt_pel is modified and
status_pel bit 4 is cleared, neither affecting bit 0. The second
operation is redundant.
This cleanup aligns PeliCAN mode with BasicCAN mode, which correctly
sets this bit only once (line 883).
Signed-off-by: SillyZ <1357816113@qq.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Filip Hejsek [Fri, 12 Sep 2025 22:58:35 +0000 (00:58 +0200)]
ui/gtk: Fix callback function signature
The correct type for opaque pointer is gpointer,
not gpointer * (which is a pointer to a pointer).
Signed-off-by: Filip Hejsek <filip.hejsek@gmail.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Merge tag 'staging-pull-request' of https://gitlab.com/peterx/qemu into staging
Migration/Memory Pull for 10.2
- PeterX's fix on tls warning for preempt channel when migratino completes
- Arun's series to enhance error reporting for vTPM and migration framework
- PeterX's patch to cleanup multifd send TLS BYE messages
- Juraj's fix on postcopy start state transition when switchover failed
- Yanfei's fix to migrate APIC before VFIO-PCI to avoid irq fallbacks
- Dan's cleanup to simplify error reporting in qemu_fill_buffer()
- PeterM's fix on address space leak when cpu hot plug / unplug
- Steve's cpr-exec wholeset
# -----BEGIN PGP SIGNATURE-----
#
# iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCaN/uIhIccGV0ZXJ4QHJl
# ZGhhdC5jb20ACgkQO1/MzfOr1wZ+mAEA1l2RS9sZS1W3vXQMCNb+Nu8Uo2p+e5Qj
# Uu6J0WVV+XsBANtzGZk2UM/frqlABywW3/ozJ4qBvIPKo758Mr6/lqUH
# =asUv
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 03 Oct 2025 08:39:14 AM PDT
# gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
# gpg: issuer "peterx@redhat.com"
# gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [unknown]
# gpg: aka "Peter Xu <peterx@redhat.com>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706
* tag 'staging-pull-request' of https://gitlab.com/peterx/qemu: (45 commits)
migration-test: test cpr-exec
vfio: cpr-exec mode
migration: cpr-exec docs
migration: cpr-exec mode
migration: cpr-exec save and load
migration: cpr-exec-command parameter
oslib: qemu_clear_cloexec
migration: add cpr_walk_fd
migration: multi-mode notifier
migration: simplify error reporting after channel read
physmem: Destroy all CPU AddressSpaces on unrealize
memory: New AS helper to serialize destroy+free
include/system/memory.h: Clarify address_space_destroy() behaviour
migration: ensure APIC is loaded prior to VFIO PCI devices
migration: Fix state transition in postcopy_start() error handling
migration/multifd/tls: Cleanup BYE message processing on sender side
migration: HMP: Adjust the order of output fields
migration: Make migration_has_failed() work even for CANCELLING
io/crypto: Move tls premature termination handling into QIO layer
backends/tpm: Propagate vTPM error on migration failure
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
acpi/generic_event_device: add an APEI error device
Adds a generic error device to handle generic hardware error
events as specified at ACPI 6.5 specification at 18.3.2.7.2:
https://uefi.org/specs/ACPI/6.5/18_Platform_Error_Interfaces.html#event-notification-for-generic-error-sources
using HID PNP0C33.
The PNP0C33 device is used to report hardware errors to
the guest via ACPI APEI Generic Hardware Error Source (GHES).
Co-authored-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Co-authored-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <2790f664c849d53de0ce3049fa8c7950c1de1f86.1758610789.git.mchehab+huawei@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
acpi/generic_event_device: add logic to detect if HEST addr is available
Create a new property (x-has-hest-addr) and use it to detect if
the GHES table offsets can be calculated from the HEST address
(qemu 10.0 and upper) or via the legacy way via an offset obtained
from the hardware_errors firmware file.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <c4eb3cf32a3f158ae62dac29e866ac3f373956c3.1758610789.git.mchehab+huawei@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
acpi/generic_event_device: Update GHES migration to cover hest addr
The GHES migration logic should now support HEST table location too.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <ede7ddf4b10f34094a4327dc458d630ad319bd1c.1758610789.git.mchehab+huawei@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
acpi/ghes: add a notifier to notify when error data is ready
Some error injection notify methods are async, like GPIO
notify. Add a notifier to be used when the error record is
ready to be sent to the guest OS.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <edf9c6e5b80dc57e3443893bf9e1eb25cb9d266b.1758610789.git.mchehab+huawei@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
acpi/ghes: don't hard-code the number of sources for HEST table
The current code is actually dependent on having just one error
structure with a single source, as any change there would cause
migration issues.
As the number of sources should be arch-dependent, as it will depend on
what kind of notifications will exist, and how many errors can be
reported at the same time, change the logic to be more flexible,
allowing the number of sources to be defined when building the
HEST table by the caller.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <1698680848c11d6f26368426f1657e14faaf55c4.1758610789.git.mchehab+huawei@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
acpi/ghes: Use HEST table offsets when preparing GHES records
There are two pointers that are needed during error injection:
1. The start address of the CPER block to be stored;
2. The address of the read ack.
It is preferable to calculate them from the HEST table. This allows
checking the source ID, the size of the table and the type of the
HEST error block structures.
Yet, keep the old code, as this is needed for migration purposes
from older QEMU versions.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <d4344e8dbe66372e1e093d968eda2e8b0527ba48.1758610789.git.mchehab+huawei@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Store HEST table address at GPA, placing its the start of the table at
hest_addr_le variable.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <c29aa5e6ab9b2d93dd5328481630c3b03da86261.1758610789.git.mchehab+huawei@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
acpi/ghes: prepare to change the way HEST offsets are calculated
Add a new ags flag to change the way HEST offsets are calculated.
Currently, offsets needed to store ACPI HEST offsets and read ack
are calculated based on a previous knowledge from the logic
which creates the HEST table.
Such logic is not generic, not allowing to easily add more HEST
entries nor replicates what OSPM does.
As the next patches will be adding a more generic logic, add a
new use_hest_addr, set to false, in preparation for such changes.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <f5de17bf04b27828e1a439ad396b4f7982eaf156.1758610789.git.mchehab+huawei@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
acpi/ghes: Cleanup the code which gets ghes ged state
Move the check logic into a common function and simplify the
code which checks if GHES is enabled and was properly setup.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <2bbb1d3eb88b0a668114adef2f1c2a94deebba0e.1758610789.git.mchehab+huawei@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Revert "hw/acpi/ghes: Make ghes_record_cper_errors() static"
The ghes_record_cper_errors() function was introduced to be used
by other types of errors, as part of the error injection
patch series. That's why it is not static.
Make it non-static again to allow its usage outside ghes.c
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <14f2a888cfbf922d5f2bf94d7612114f25107d59.1758610789.git.mchehab+huawei@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Paolo Abeni [Mon, 22 Sep 2025 14:18:28 +0000 (16:18 +0200)]
net: implement UDP tunnel features offloading
When any host or guest GSO over UDP tunnel offload is enabled the
virtio net header includes the additional tunnel-related fields,
update the size accordingly.
Push the GSO over UDP tunnel offloads all the way down to the tap
device extending the newly introduced NetFeatures struct, and
eventually enable the associated features.
As per virtio specification, to convert features bit to offload bit,
map the extended features into the reserved range.
Finally, make the vhost backend aware of the exact header layout, to
copy it correctly. The tunnel-related field are present if either
the guest or the host negotiated any UDP tunnel related feature:
add them to the kernel supported features list, to allow qemu
transfer to the backend the needed information.
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <093b4bc68368046bffbcab2202227632d6e4e83b.1758549625.git.pabeni@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Paolo Abeni [Mon, 22 Sep 2025 14:18:27 +0000 (16:18 +0200)]
net: implement tunnel probing
Tap devices support GSO over UDP tunnel offload. Probe for such
feature in a similar manner to other offloads.
GSO over UDP tunnel needs to be enabled in addition to a "plain"
offload (TSO or USO).
No need to check separately for the outer header checksum offload:
the kernel is going to support both of them or none.
The new features are disabled by default to avoid compat issues,
and could be enabled, after that hw_compat_10_1 will be added,
together with the related compat entries.
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <a987a8a7613cbf33bb2209c7c7f5889b512638a7.1758549625.git.pabeni@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Paolo Abeni [Mon, 22 Sep 2025 14:18:26 +0000 (16:18 +0200)]
virtio-net: implement extended features support
Use the extended types and helpers to manipulate the virtio_net
features.
Note that offloads are still 64bits wide, as per specification,
and extended offloads will be mapped into such range.
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <bc5afdc5c1cb1a37238dd2b36004db3d46cbf211.1758549625.git.pabeni@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Paolo Abeni [Mon, 22 Sep 2025 14:18:25 +0000 (16:18 +0200)]
vhost-net: implement extended features support
Provide extended version of the features manipulation helpers,
and let the device initialization deal with the full features space,
adjusting the relevant format strings accordingly.
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <69c78c432e28e146a8874b2a7d00e9cbd111b1ba.1758549625.git.pabeni@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Paolo Abeni [Mon, 22 Sep 2025 14:18:24 +0000 (16:18 +0200)]
vhost-backend: implement extended features support
Leverage the kernel extended features manipulation ioctls(), if
available, and fallback to old ops otherwise. Error out when setting
extended features but kernel support is not available.
Note that extended support for get/set backend features is not needed,
as the only feature that can be changed belongs to the 64 bit range.
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <150daade3d59e77629276920e014ee8e5fc12121.1758549625.git.pabeni@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Paolo Abeni [Mon, 22 Sep 2025 14:18:23 +0000 (16:18 +0200)]
qmp: update virtio features map to support extended features
Extend the VirtioDeviceFeatures struct with an additional u64
to track unknown features in the 64-127 bit range and decode
the full virtio features spaces for vhost and virtio devices.
Also add entries for the soon-to-be-supported virtio net GSO over
UDP features.
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <e51969f94d89045b333f1bc5ef5fca9e12fc371a.1758549625.git.pabeni@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Paolo Abeni [Mon, 22 Sep 2025 14:18:22 +0000 (16:18 +0200)]
vhost: add support for negotiating extended features
Similar to virtio infra, vhost core maintains the features status
in the full extended format and allows the devices to implement
extended version of the getter/setter.
Note that 'protocol_features' are not extended: they are only
used by vhost-user, and the latter device is not going to implement
extended features soon.
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <a0062c3b1847fb2baedd6cd8f6ef13b051d6beb2.1758549625.git.pabeni@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Paolo Abeni [Mon, 22 Sep 2025 14:18:21 +0000 (16:18 +0200)]
virtio-pci: implement support for extended features
Extend the features configuration space to 128 bits. If the virtio
device supports any extended features, allow the common read/write
operation to access all of it, otherwise keep exposing only the
lower 64 bits.
On migration, save the 128 bit version of the features only if the
upper bits are non zero. Relay on reset to clear all the feature
space before load.
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <c0b81601f65b41ca8310eba8f05e2dcf3702de89.1758549625.git.pabeni@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Paolo Abeni [Mon, 22 Sep 2025 14:18:20 +0000 (16:18 +0200)]
virtio: add support for negotiating extended features
The virtio specifications allows for a device features space up
to 128 bits and more. Soon we are going to use some of the 'extended'
bits features for the virtio net driver.
Add support to allow extended features negotiation on a per
devices basis. Devices willing to negotiated extended features
need to implemented a new pair of features getter/setter, the
core will conditionally use them instead of the basic one.
Note that 'bad_features' don't need to be extended, as they are
bound to the 64 bits limit.
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <9bb29d70adc3f2b8c7756d4e3cd076cffee87826.1758549625.git.pabeni@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Paolo Abeni [Mon, 22 Sep 2025 14:18:19 +0000 (16:18 +0200)]
virtio: serialize extended features state
If the driver uses any of the extended features (i.e. 64 or above),
store the extended features range (64-127 bits).
At load time, let legacy features initialize the full features range
and pass it to the set helper; sub-states loading will have filled-up
the extended part as needed.
This is one of the few spots that need explicitly to know and set
in stone the extended features array size; add a build bug to prevent
breaking the migration should such size change again in the future:
more serialization plumbing will be needed.
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <d5d9d398675bee6c4c7d7308c5d3d5d3c6d17d87.1758549625.git.pabeni@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Paolo Abeni [Mon, 22 Sep 2025 14:18:18 +0000 (16:18 +0200)]
virtio: introduce extended features type
The virtio specifications allows for up to 128 bits for the
device features. Soon we are going to use some of the 'extended'
bits features (bit 64 and above) for the virtio net driver.
Represent the virtio features bitmask with a fixed size array, and
introduce a few helpers to help manipulate them.
Most drivers will keep using only 64 bits features space: use union
to allow them access the lower part of the extended space without any
per driver change.
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <6a9bbb5eb33830f20afbcb7e64d300af4126dd98.1758549625.git.pabeni@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Paolo Abeni [Mon, 22 Sep 2025 14:18:17 +0000 (16:18 +0200)]
linux-headers: Update to Linux v6.17-rc1
Update headers to include the virtio GSO over UDP tunnel features
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <0b1f3c011f90583ab52aa4fef04df6db35cc4a69.1758549625.git.pabeni@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Paolo Abeni [Mon, 22 Sep 2025 14:18:16 +0000 (16:18 +0200)]
linux-headers: deal with counted_by annotation
Such annotation is present into the kernel uAPI headers since
v6.7, and will be used soon by the vhost_type.h. Deal with it
just stripping it.
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <a1430f43cc954d2a931fa60581bda6d6af4bc771.1758549625.git.pabeni@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Paolo Abeni [Mon, 22 Sep 2025 14:18:15 +0000 (16:18 +0200)]
net: bundle all offloads in a single struct
The set_offload() argument list is already pretty long and
we are going to introduce soon a bunch of additional offloads.
Replace the offload arguments with a single struct and update
all the relevant call-sites.
No functional changes intended.
Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <a9d4dd043b8c71b791e9ff05e17ef06072d9714e.1758549625.git.pabeni@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>