]> git.ipfire.org Git - thirdparty/qemu.git/log
thirdparty/qemu.git
10 days agovfio: vfio_find_ram_discard_listener
Steve Sistare [Thu, 29 May 2025 19:24:03 +0000 (12:24 -0700)] 
vfio: vfio_find_ram_discard_listener

Define vfio_find_ram_discard_listener as a subroutine so additional calls to
it may be added in a subsequent patch.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Link: https://lore.kernel.org/qemu-devel/1748546679-154091-8-git-send-email-steven.sistare@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
10 days agoMAINTAINERS: Add reviewer for CPR
Steve Sistare [Thu, 29 May 2025 19:23:57 +0000 (12:23 -0700)] 
MAINTAINERS: Add reviewer for CPR

CPR is integrated with live migration, and has the same maintainers.
But, add a CPR section to add a reviewer.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/1748546679-154091-2-git-send-email-steven.sistare@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
10 days agovfio/iommufd: Save vendor specific device info
Zhenzhong Duan [Wed, 4 Jun 2025 06:21:15 +0000 (14:21 +0800)] 
vfio/iommufd: Save vendor specific device info

Some device information returned by ioctl(IOMMU_GET_HW_INFO) are vendor
specific. Save them as raw data in a union supporting different vendors,
then vendor IOMMU can query the raw data with its fixed format for
capability directly.

Because IOMMU_GET_HW_INFO is only supported in linux, so declare those
capability related structures with CONFIG_LINUX.

Suggested-by: Eric Auger <eric.auger@redhat.com>
Suggested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250604062115.4004200-5-zhenzhong.duan@intel.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
10 days agovfio/iommufd: Implement [at|de]tach_hwpt handlers
Zhenzhong Duan [Wed, 4 Jun 2025 06:21:14 +0000 (14:21 +0800)] 
vfio/iommufd: Implement [at|de]tach_hwpt handlers

Implement [at|de]tach_hwpt handlers in VFIO subsystem. vIOMMU
utilizes them to attach to or detach from hwpt on host side.

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Link: https://lore.kernel.org/qemu-devel/20250604062115.4004200-4-zhenzhong.duan@intel.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
10 days agovfio/iommufd: Add properties and handlers to TYPE_HOST_IOMMU_DEVICE_IOMMUFD
Zhenzhong Duan [Wed, 4 Jun 2025 06:21:13 +0000 (14:21 +0800)] 
vfio/iommufd: Add properties and handlers to TYPE_HOST_IOMMU_DEVICE_IOMMUFD

Enhance HostIOMMUDeviceIOMMUFD object with 3 new members, specific
to the iommufd BE + 2 new class functions.

IOMMUFD BE includes IOMMUFD handle, devid and hwpt_id. IOMMUFD handle
and devid are used to allocate/free ioas and hwpt. hwpt_id is used to
re-attach IOMMUFD backed device to its default VFIO sub-system created
hwpt, i.e., when vIOMMU is disabled by guest. These properties are
initialized in hiod::realize() after attachment.

2 new class functions are [at|de]tach_hwpt(). They are used to
attach/detach hwpt. VFIO and VDPA can have different implementions,
so implementation will be in sub-class instead of HostIOMMUDeviceIOMMUFD,
e.g., in HostIOMMUDeviceIOMMUFDVFIO.

Add two wrappers host_iommu_device_iommufd_[at|de]tach_hwpt to wrap the
two functions.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250604062115.4004200-3-zhenzhong.duan@intel.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
10 days agobackends/iommufd: Add a helper to invalidate user-managed HWPT
Zhenzhong Duan [Wed, 4 Jun 2025 06:21:12 +0000 (14:21 +0800)] 
backends/iommufd: Add a helper to invalidate user-managed HWPT

This helper passes cache invalidation request from guest to invalidate
stage-1 page table cache in host hardware.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250604062115.4004200-2-zhenzhong.duan@intel.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
10 days agovfio/container: pass MemoryRegion to DMA operations
John Levon [Wed, 21 May 2025 21:55:34 +0000 (22:55 +0100)] 
vfio/container: pass MemoryRegion to DMA operations

Pass through the MemoryRegion to DMA operation handlers of vfio
containers. The vfio-user container will need this later, to translate
the vaddr into an offset for the dma map vfio-user message; CPR will
also will need this.

Originally-by: John Johnson <john.g.johnson@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Steve Sistare <steven.sistare@oracle.com>
Link: https://lore.kernel.org/qemu-devel/20250521215534.2688540-1-john.levon@nutanix.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
10 days agovfio: return mr from vfio_get_xlat_addr
Steve Sistare [Mon, 19 May 2025 13:26:43 +0000 (06:26 -0700)] 
vfio: return mr from vfio_get_xlat_addr

Modify memory_get_xlat_addr and vfio_get_xlat_addr to return the memory
region that the translated address is found in.  This will be needed by
CPR in a subsequent patch to map blocks using IOMMU_IOAS_MAP_FILE.

Also return the xlat offset, so we can simplify the interface by removing
the out parameters that can be trivially derived from mr and xlat.

Lastly, rename the functions to  to memory_translate_iotlb() and
vfio_translate_iotlb().

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Link: https://lore.kernel.org/qemu-devel/1747661203-136490-1-git-send-email-steven.sistare@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
10 days agovfio/igd: Fix incorrect error propagation in vfio_pci_igd_opregion_detect()
Tomita Moeko [Thu, 22 May 2025 15:16:36 +0000 (23:16 +0800)] 
vfio/igd: Fix incorrect error propagation in vfio_pci_igd_opregion_detect()

In vfio_pci_igd_opregion_detect(), errp will be set when the device does
not have OpRegion or is hotplugged. This errp will be propagated to
pci_qdev_realize(), which interprets it as failure, causing unexpected
termination on devices without OpRegion like SR-IOV VFs or discrete
GPUs. Fix it by not setting errp in vfio_pci_igd_opregion_detect().

This patch also checks if the device has OpRegion before hotplug status
to prevent unwanted warning messages on non-IGD devices.

Fixes: c0273e77f2d7 ("vfio/igd: Detect IGD device by OpRegion")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2968
Reported-by: Edmund Raile <edmund.raile@protonmail.com>
Link: https://lore.kernel.org/qemu-devel/30044d14-17ec-46e3-b9c3-63d27a5bde27@gmail.com
Tested-by: Edmund Raile <edmund.raile@protonmail.com>
Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Corvin Köhne <c.koehne@beckhoff.com>
Link: https://lore.kernel.org/qemu-devel/20250522151636.20001-1-tomitamoeko@gmail.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
10 days agovfio/iommufd: Add comment emphasizing no movement of hiod->realize() call
Zhenzhong Duan [Wed, 21 May 2025 11:03:01 +0000 (19:03 +0800)] 
vfio/iommufd: Add comment emphasizing no movement of hiod->realize() call

The nested IOMMU support needs device and hwpt id which are generated
only after attachment. Hiod encapsulates these information in realize()
and passes to vIOMMU.

Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250521110301.3313877-1-zhenzhong.duan@intel.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
10 days agovfio: refactor out IRQ signalling setup
John Levon [Tue, 20 May 2025 15:03:53 +0000 (16:03 +0100)] 
vfio: refactor out IRQ signalling setup

This makes for a slightly more readable vfio_msix_vector_do_use()
implementation, and we will rely on this shortly.

Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250520150419.2172078-5-john.levon@nutanix.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
10 days agovfio: move config space read into vfio_pci_config_setup()
John Levon [Tue, 20 May 2025 15:03:52 +0000 (16:03 +0100)] 
vfio: move config space read into vfio_pci_config_setup()

Small cleanup that reduces duplicate code for vfio-user and reduces the
size of vfio_realize(); while we're here, correct that name to
vfio_pci_realize().

Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250520150419.2172078-4-john.levon@nutanix.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
10 days agovfio: move more cleanup into vfio_pci_put_device()
John Levon [Tue, 20 May 2025 15:03:51 +0000 (16:03 +0100)] 
vfio: move more cleanup into vfio_pci_put_device()

All of the cleanup can be done in the same place, and vfio-user will
want to do the same.

Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250520150419.2172078-3-john.levon@nutanix.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
10 days agovfio: add more VFIOIOMMUClass docs
John Levon [Tue, 20 May 2025 16:25:30 +0000 (17:25 +0100)] 
vfio: add more VFIOIOMMUClass docs

Add some additional doc comments for these class methods.

Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250520162530.2194548-1-john.levon@nutanix.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
10 days agovfio/igd: OpRegion not found fix error typo
Edmund Raile [Mon, 19 May 2025 11:24:23 +0000 (11:24 +0000)] 
vfio/igd: OpRegion not found fix error typo

Signed-off-by: Edmund Raile <edmund.raile@protonmail.com>
Reviewed-by: Tomita Moeko <tomitamoeko@gmail.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/MFFbQoTpea_CK5ELq8oJ-a3Q57wo7ywQlrIqDvdIDKhUuCm59VUz2QzvdojO5r_wb_7SHifU0Kym3loj4eASPhdzYpjtiMCTePzyg1zrroo=@protonmail.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
12 days agoMerge tag 'pull-qapi-2025-06-03' of https://repo.or.cz/qemu/armbru into staging
Stefan Hajnoczi [Tue, 3 Jun 2025 13:19:26 +0000 (09:19 -0400)] 
Merge tag 'pull-qapi-2025-06-03' of https://repo.or.cz/qemu/armbru into staging

QAPI patches patches for 2025-06-03

# -----BEGIN PGP SIGNATURE-----
#
# iQJGBAABCAAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAmg+l58SHGFybWJydUBy
# ZWRoYXQuY29tAAoJEDhwtADrkYZTKhYP/jp/b96B6341Z7czsBkU+CheIbPzLhvw
# QaahaM8C2B8opiiEIU46rRdV2ikccd5npj5rVEioJ8z3TLPfpQiWcKKyBBHBQGLW
# bIlAX0Ti/s6RTsSpduwAqsbwThJYEeERA5Bzn9qZTubRy9O8JYKisvRIs0SsqIU0
# kp3MXg4xWZUs+OGGl5SzLsoei7FaTmF3KGN9DMHM8ra21c82lWwKAFOUIERFWI/J
# 9Ed6pU58oE0hFd3LD7N4HAxyExCZN5ifcPI1ILEj/RSTaYedoQZ1PMP9PRfmyEXJ
# StgbbpnuaSBd8uWnahDutTpsZvBHenZpZF95loPZOSWNHIB7djCJTk9nI6Uc8bUH
# UytdLkcGXoWjbRJHua9feW7k8HJAMHZq+6m7AqvbdWUBrxpvutuqGE2vJqZSEjad
# 43+azaQRnXT0bNJ4oB6oXccyteaRf0QdZnKjdSCRtMsu6RZNNtVkx9kaE/lnwvBF
# YigN0hFeGc+0LxjOUjD2JgsJS+i//jW3LFpxwXaVXBqmpl9iiBZYjAOdoC0tJzsE
# eMOXcQGZJtLCmhOEVs7bRevuKCIjwIm/XQw6R31nE1kLf/jEjGox5IaBv8VP4mIf
# EoEiL5Euh5zAejGa5vo7SIJ5G8LglV4U9eK9ee9iveITENhlcOUfMDWnFkYjbCt+
# n6aPxPvN9kQ2
# =MPkT
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 03 Jun 2025 02:35:11 EDT
# gpg:                using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653
# gpg:                issuer "armbru@redhat.com"
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [full]
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>" [full]
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* tag 'pull-qapi-2025-06-03' of https://repo.or.cz/qemu/armbru:
  qapi: Improve documentation around job state @concluded
  qapi: Tidy up references to job state CONCLUDED
  qapi: Mention both job-cancel and block-job-cancel in doc comments
  qapi: Refer to job-FOO instead of deprecated block-job-FOO in docs
  qapi: Spell JSON null correctly in blockdev-reopen documentation
  qapi: Use proper markup instead of CAPS for emphasis in doc comments
  qapi: Fix capitalization in doc comments
  qapi: Correct spelling of QEMU in doc comments
  qapi: Drop a problematic (Since: 2.11) from query-hotpluggable-cpus
  qapi: Avoid breaking lines within (since X.Y)
  qapi: Move (since X.Y) to end of description
  qapi: Tidy up whitespace in doc comments
  qapi: Tidy up run-together sentences in doc comments

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
12 days agoMerge tag 'tracing-pull-request' of https://gitlab.com/stefanha/qemu into staging
Stefan Hajnoczi [Tue, 3 Jun 2025 13:19:12 +0000 (09:19 -0400)] 
Merge tag 'tracing-pull-request' of https://gitlab.com/stefanha/qemu into staging

Pull request

# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEhpWov9P5fNqsNXdanKSrs4Grc8gFAmg+JcgACgkQnKSrs4Gr
# c8jRKQgAgBwP5c+5YAN868Uu9nIZjT/B544FkQSp77t4SPfzzzChYHy4CGlbspYm
# vGnAkYRn5u7EXLnJ7bm9J5wLvGLVLtyWJbpCRUHjYTG37xa4Q0NZ/I2iJqUbU863
# D8lv/R5kjlUsa/p955v2TCl2q8Oif++slqsLeFOoH0dy26ehalasLkqCf5SXlhlF
# 5ULMRDKvHxkQhntp3k3DjzZVI7cUDhhLSYK9jpEVy+BVlhmUtWEeLp/mDdhdBQps
# fy4c7G0VBpsUEIZP8+DFPSwTdQ+p2jjJXSlPGGCYBh5KfAKnOD8XaGWlozR5Gngz
# v4bSHzgxU0HArmAfTh3vVftljyNvug==
# =a0+Y
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 02 Jun 2025 18:29:28 EDT
# gpg:                using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [ultimate]
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>" [ultimate]
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* tag 'tracing-pull-request' of https://gitlab.com/stefanha/qemu:
  trace/simple: seperate hot paths of tracing fucntions

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
12 days agoqapi: Improve documentation around job state @concluded
Markus Armbruster [Tue, 27 May 2025 07:39:16 +0000 (09:39 +0200)] 
qapi: Improve documentation around job state @concluded

We use "the query list" in a few places.  It's not entirely obvious
what that means.  It's actually the output of query-jobs or
query-block-jobs.

Documentation of @auto-dismiss talks about the job disappearing from
the query list when it reaches state @concluded.  This is less than
precise.  The job doesn't merely disappear from the query list, it
disappears, period.

Documentation of JobStatus @concluded explains "the job will remain in
the query list until it is dismissed".  Again less than precise.  It
remains in state @concluded until dismissed.

Rephrase without use of "the query list" for clarity and precision.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250527073916.1243024-14-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
12 days agoqapi: Tidy up references to job state CONCLUDED
Markus Armbruster [Tue, 27 May 2025 07:39:15 +0000 (09:39 +0200)] 
qapi: Tidy up references to job state CONCLUDED

When talking about the job state machine, we refer to the states like
READY, ABORTING, CONCLUDED, and so forth.  Except in two places, where
we use JOB_STATUS_CONCLUDED.  Replace by CONCLUDED for consistency.

We should arguably use the JobStatus enum values instead.  Left for
another day.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250527073916.1243024-13-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
12 days agoqapi: Mention both job-cancel and block-job-cancel in doc comments
Markus Armbruster [Tue, 27 May 2025 07:39:14 +0000 (09:39 +0200)] 
qapi: Mention both job-cancel and block-job-cancel in doc comments

Several doc comments mention block-job-cancel where the more generic
job-cancel would also work.  Adjust them to mention both.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250527073916.1243024-12-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
12 days agoqapi: Refer to job-FOO instead of deprecated block-job-FOO in docs
Markus Armbruster [Tue, 27 May 2025 07:39:13 +0000 (09:39 +0200)] 
qapi: Refer to job-FOO instead of deprecated block-job-FOO in docs

We deprecated several block-job-FOO commands in commit
b836bf2ab68 (qapi/block-core: deprecate some block-job- APIs).  Update
the doc comments to refer to their replacements instead.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250527073916.1243024-11-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
12 days agoqapi: Spell JSON null correctly in blockdev-reopen documentation
Markus Armbruster [Tue, 27 May 2025 07:39:12 +0000 (09:39 +0200)] 
qapi: Spell JSON null correctly in blockdev-reopen documentation

The doc comment misspells JSON null as NULL.  Fix that.

Cc: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250527073916.1243024-10-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
12 days agoqapi: Use proper markup instead of CAPS for emphasis in doc comments
Markus Armbruster [Tue, 27 May 2025 07:39:11 +0000 (09:39 +0200)] 
qapi: Use proper markup instead of CAPS for emphasis in doc comments

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250527073916.1243024-9-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
12 days agoqapi: Fix capitalization in doc comments
Markus Armbruster [Tue, 27 May 2025 07:39:10 +0000 (09:39 +0200)] 
qapi: Fix capitalization in doc comments

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250527073916.1243024-8-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
12 days agoqapi: Correct spelling of QEMU in doc comments
Markus Armbruster [Tue, 27 May 2025 07:39:09 +0000 (09:39 +0200)] 
qapi: Correct spelling of QEMU in doc comments

Improve awkward phrasing in migrate-incoming While there.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250527073916.1243024-7-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
12 days agoqapi: Drop a problematic (Since: 2.11) from query-hotpluggable-cpus
Markus Armbruster [Tue, 27 May 2025 07:39:08 +0000 (09:39 +0200)] 
qapi: Drop a problematic (Since: 2.11) from query-hotpluggable-cpus

There is a (Since: 2.11) in a query-hotpluggable-cpus example.
Versioning information ought to be in the command description, not
examples.  The command description is basically empty (there is a TODO
about it).

What exactly didn't work before 2.11 is not quite clear from the
documentation.  The example was added in commit 4dc3b151882 (s390x:
implement query-hotpluggable-cpus), which suggests the command failed
for the s390x target until then.  This was almost eight years ago, and
I doubt anyone still cares about this detail.  Simply delete
the problematic (Since: 2.11).

Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250527073916.1243024-6-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
12 days agoqapi: Avoid breaking lines within (since X.Y)
Markus Armbruster [Tue, 27 May 2025 07:39:07 +0000 (09:39 +0200)] 
qapi: Avoid breaking lines within (since X.Y)

Easier on the eyes and for grep.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250527073916.1243024-5-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
12 days agoqapi: Move (since X.Y) to end of description
Markus Armbruster [Tue, 27 May 2025 07:39:06 +0000 (09:39 +0200)] 
qapi: Move (since X.Y) to end of description

By convention, we put (since X.Y) at the end of the description.  Move
the ones that somehow ended up in the middle of the description to the
end.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250527073916.1243024-4-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
12 days agoqapi: Tidy up whitespace in doc comments
Markus Armbruster [Tue, 27 May 2025 07:39:05 +0000 (09:39 +0200)] 
qapi: Tidy up whitespace in doc comments

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250527073916.1243024-3-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
12 days agoqapi: Tidy up run-together sentences in doc comments
Markus Armbruster [Tue, 27 May 2025 07:39:04 +0000 (09:39 +0200)] 
qapi: Tidy up run-together sentences in doc comments

Fixes: a937b6aa739f (qapi: Reformat doc comments to conform to current conventions)
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250527073916.1243024-2-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
12 days agotrace/simple: seperate hot paths of tracing fucntions
Tanish Desai [Wed, 28 May 2025 19:25:28 +0000 (19:25 +0000)] 
trace/simple: seperate hot paths of tracing fucntions

This change improves performance by moving the hot path of the trace_vhost_commit()(or any other trace function) logic to the header file.
Previously, even when the trace event was disabled, the function call chain:-
trace_vhost_commit()(Or any other trace function) →  _nocheck__trace_vhost_commit() →  _simple_trace_vhost_commit()
incurred a significant function prologue overhead before checking the trace state.

Disassembly of _simple_trace_vhost_commit() (from the .c file) showed that 11 out of the first 14 instructions were prologue-related, including:
0x10 stp x29, x30, [sp, #-64]! Prologue: allocates 64-byte frame and saves old FP (x29) & LR (x30)
0x14 adrp x3, trace_events_enabled_count Prologue: computes page-base of the trace-enable counter
0x18 adrp x2, __stack_chk_guard Important (maybe prolog don't know?)(stack-protector): starts up the stack-canary load
0x1c mov x29, sp Prologue: sets new frame pointer
0x20 ldr x3, [x3] Prologue: loads the actual trace-enabled count
0x24 stp x19, x20, [sp, #16] Prologue: spills callee-saved regs used by this function (x19, x20)
0x28 and w20, w0, #0xff Tracepoint setup: extracts the low-8 bits of arg0 as the “event boolean”
0x2c ldr x2, [x2] Prologue (cont’d): completes loading of the stack-canary value
0x30 and w19, w1, #0xff Tracepoint setup: extracts low-8 bits of arg1
0x34 ldr w0, [x3] Important: loads the current trace-enabled flag from memory
0x38 ldr x1, [x2] Prologue (cont’d): reads the canary
0x3c str x1, [sp, #56] Prologue (cont’d): writes the canary into the new frame
0x40 mov x1, #0 Prologue (cont’d): zeroes out x1 for the upcoming branch test
0x44 cbnz w0, 0x88 Important: if tracing is disabled (w0==0) skip the heavy path entirely

The trace-enabled check happens after the prologue. This is wasteful when tracing is disabled, which is often the case in production.
To optimize this:
_nocheck__trace_vhost_commit() is now fully inlined in the .h file with
the hot path.It checks trace_event_get_state() before calling into _simple_trace_vhost_commit(), which remains in .c.
This avoids calling into the .c function altogether when the tracepoint is disabled, thereby skipping unnecessary prologue instructions.

This results in better performance by removing redundant instructions in the tracing fast path.

Signed-off-by: Tanish Desai <tanishdesai37@gmail.com>
Message-id: 20250528192528.3968-1-tanishdesai37@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
12 days agoMerge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into...
Stefan Hajnoczi [Mon, 2 Jun 2025 18:52:44 +0000 (14:52 -0400)] 
Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging

virtio,pci,pc: features, fixes, tests

vhost will now no longer set a call notifier if unused
some work towards loongarch testing based on bios-tables-test
some core pci work for SVM support in vtd
vhost vdpa init has been optimized for response time to QMP
A couple more fixes

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCgAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmg97ZUPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRpRBsH/0Fx4NNMaynXmVOgV1rMFirTydhQG5NSdeJv
# i1RHd25Rne/RXH0CL71UPuOPADWh6bv9iZTg6RU6g7TwI8K9v3M0R71RlPLh1Lh1
# x7fifWNSNXVi18fM9/j+mIg7I2Ye0AaqveezRJWGzqoOxQKKlVI2xspKZBCCkygd
# i2tgtR1ORB6+ji6wVoTDPlL42X5Jef5MUT3XOcRR5biHm0JfqxxQKVM83mD+5yMI
# 0YqjT2BVRzo5rGN7mSuf7tQ50xI6I0wI1+eoWeKHRbg08f709M8TZRDKuVh24Evg
# 9WnIhKLTzRVdCNLNbw9h9EhxoANpWCyvmnn6GCfkJui40necFHY=
# =0lO6
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 02 Jun 2025 14:29:41 EDT
# gpg:                using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg:                issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (26 commits)
  hw/i386/pc_piix: Fix RTC ISA IRQ wiring of isapc machine
  vdpa: move memory listener register to vhost_vdpa_init
  vdpa: move iova_tree allocation to net_vhost_vdpa_init
  vdpa: reorder listener assignment
  vdpa: add listener_registered
  vdpa: set backend capabilities at vhost_vdpa_init
  vdpa: reorder vhost_vdpa_set_backend_cap
  vdpa: check for iova tree initialized at net_client_start
  vhost: Don't set vring call if guest notifier is unused
  tests/qtest/bios-tables-test: Use MiB macro rather hardcode value
  tests/data/uefi-boot-images: Add ISO image for LoongArch system
  uefi-test-tools:: Add LoongArch64 support
  pci: Add a PCI-level API for PRI
  pci: Add a pci-level API for ATS
  pci: Add a pci-level initialization function for IOMMU notifiers
  memory: Store user data pointer in the IOMMU notifiers
  pci: Add an API to get IOMMU's min page size and virtual address width
  pci: Cache the bus mastering status in the device
  pcie: Helper functions to check to check if PRI is enabled
  pcie: Add a helper to declare the PRI capability for a pcie device
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
12 days agohw/i386/pc_piix: Fix RTC ISA IRQ wiring of isapc machine
Bernhard Beschow [Mon, 26 May 2025 20:38:20 +0000 (22:38 +0200)] 
hw/i386/pc_piix: Fix RTC ISA IRQ wiring of isapc machine

Commit 56b1f50e3c10 ("hw/i386/pc: Wire RTC ISA IRQs in south bridges")
attempted to refactor RTC IRQ wiring which was previously done in
pc_basic_device_init() but forgot about the isapc machine. Fix this by
wiring in the code section dedicated exclusively to the isapc machine.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2961
Fixes: 56b1f50e3c10 ("hw/i386/pc: Wire RTC ISA IRQs in south bridges")
cc: qemu-stable
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Mark Cave-Ayland <mark.caveayland@nutanix.com>
Message-Id: <20250526203820.1853-1-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
12 days agovdpa: move memory listener register to vhost_vdpa_init
Eugenio Pérez [Thu, 22 May 2025 14:58:39 +0000 (10:58 -0400)] 
vdpa: move memory listener register to vhost_vdpa_init

Current memory operations like pinning may take a lot of time at the
destination.  Currently they are done after the source of the migration is
stopped, and before the workload is resumed at the destination.  This is a
period where neigher traffic can flow, nor the VM workload can continue
(downtime).

We can do better as we know the memory layout of the guest RAM at the
destination from the moment that all devices are initializaed.  So
moving that operation allows QEMU to communicate the kernel the maps
while the workload is still running in the source, so Linux can start
mapping them.

As a small drawback, there is a time in the initialization where QEMU
cannot respond to QMP etc.  By some testing, this time is about
0.2seconds.  This may be further reduced (or increased) depending on the
vdpa driver and the platform hardware, and it is dominated by the cost
of memory pinning.

This matches the time that we move out of the called downtime window.
The downtime is measured as the elapsed trace time between the last
vhost_vdpa_suspend on the source and the last vhost_vdpa_set_vring_enable_one
on the destination. In other words, from "guest CPUs freeze" to the
instant the final Rx/Tx queue-pair is able to start moving data.

Using ConnectX-6 Dx (MLX5) NICs in vhost-vDPA mode with 8 queue-pairs,
the series reduces guest-visible downtime during back-to-back live
migrations by more than half:
- 39G VM:   4.72s -> 2.09s (-2.63s, ~56% improvement)
- 128G VM:  14.72s -> 5.83s (-8.89s, ~60% improvement)

Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <20250522145839.59974-8-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
12 days agovdpa: move iova_tree allocation to net_vhost_vdpa_init
Eugenio Pérez [Thu, 22 May 2025 14:58:38 +0000 (10:58 -0400)] 
vdpa: move iova_tree allocation to net_vhost_vdpa_init

As we are moving to keep the mapping through all the vdpa device life
instead of resetting it at VirtIO reset, we need to move all its
dependencies to the initialization too.  In particular devices with
x-svq=on need a valid iova_tree from the beginning.

Simplify the code also consolidating the two creation points: the first
data vq in case of SVQ active and CVQ start in case only CVQ uses it.

Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Suggested-by: Si-Wei Liu <si-wei.liu@oracle.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <20250522145839.59974-7-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
12 days agovdpa: reorder listener assignment
Eugenio Pérez [Thu, 22 May 2025 14:58:37 +0000 (10:58 -0400)] 
vdpa: reorder listener assignment

Since commit f6fe3e333f ("vdpa: move memory listener to
vhost_vdpa_shared") this piece of code repeatedly assign
shared->listener members.  This was not a problem as it was not used
until device start.

However next patches move the listener registration to this
vhost_vdpa_init function.  When the listener is registered it is added
to an embedded linked list, so setting its members again will cause
memory corruption to the linked list node.

Do the right thing and only set it in the first vdpa device.

Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <20250522145839.59974-6-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
12 days agovdpa: add listener_registered
Eugenio Pérez [Thu, 22 May 2025 14:58:36 +0000 (10:58 -0400)] 
vdpa: add listener_registered

Check if the listener has been registered or not, so it needs to be
registered again at start.

Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <20250522145839.59974-5-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
12 days agovdpa: set backend capabilities at vhost_vdpa_init
Eugenio Pérez [Thu, 22 May 2025 14:58:35 +0000 (10:58 -0400)] 
vdpa: set backend capabilities at vhost_vdpa_init

The backend does not reset them until the vdpa file descriptor is closed
so there is no harm in doing it only once.

This allows the destination of a live migration to premap memory in
batches, using VHOST_BACKEND_F_IOTLB_BATCH.

Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <20250522145839.59974-4-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
12 days agovdpa: reorder vhost_vdpa_set_backend_cap
Eugenio Pérez [Thu, 22 May 2025 14:58:34 +0000 (10:58 -0400)] 
vdpa: reorder vhost_vdpa_set_backend_cap

It will be used directly by vhost_vdpa_init.

Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <20250522145839.59974-3-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
12 days agovdpa: check for iova tree initialized at net_client_start
Eugenio Pérez [Thu, 22 May 2025 14:58:33 +0000 (10:58 -0400)] 
vdpa: check for iova tree initialized at net_client_start

To map the guest memory while it is migrating we need to create the
iova_tree, as long as the destination uses x-svq=on. Checking to not
override it.

The function vhost_vdpa_net_client_stop clear it if the device is
stopped. If the guest starts the device again, the iova tree is
recreated by vhost_vdpa_net_data_start_first or vhost_vdpa_net_cvq_start
if needed, so old behavior is kept.

Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <20250522145839.59974-2-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
12 days agovhost: Don't set vring call if guest notifier is unused
Huaitong Han [Thu, 22 May 2025 10:05:48 +0000 (18:05 +0800)] 
vhost: Don't set vring call if guest notifier is unused

The vring call fd is set even when the guest does not use MSI-X (e.g., in the
case of virtio PMD), leading to unnecessary CPU overhead for processing
interrupts.

The commit 96a3d98d2c("vhost: don't set vring call if no vector") optimized the
case where MSI-X is enabled but the queue vector is unset. However, there's an
additional case where the guest uses INTx and the INTx_DISABLED bit in the PCI
config is set, meaning that no interrupt notifier will actually be used.

In such cases, the vring call fd should also be cleared to avoid redundant
interrupt handling.

Fixes: 96a3d98d2c("vhost: don't set vring call if no vector")
Reported-by: Zhiyuan Yuan <yuanzhiyuan@chinatelecom.cn>
Signed-off-by: Jidong Xia <xiajd@chinatelecom.cn>
Signed-off-by: Huaitong Han <hanht2@chinatelecom.cn>
Message-Id: <20250522100548.212740-1-hanht2@chinatelecom.cn>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 weeks agotests/qtest/bios-tables-test: Use MiB macro rather hardcode value
Bibo Mao [Tue, 20 May 2025 13:01:53 +0000 (21:01 +0800)] 
tests/qtest/bios-tables-test: Use MiB macro rather hardcode value

Replace 1024 * 1024 with MiB macro.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Message-Id: <20250520130158.767083-4-maobibo@loongson.cn>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 weeks agotests/data/uefi-boot-images: Add ISO image for LoongArch system
Bibo Mao [Tue, 20 May 2025 13:01:52 +0000 (21:01 +0800)] 
tests/data/uefi-boot-images: Add ISO image for LoongArch system

To test ACPI tables, edk2 needs to be booted with a disk image having
EFI partition. This image is created using UefiTestToolsPkg.

The image is generated with the following command:
  make -f tests/uefi-test-tools/Makefile

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Message-Id: <20250520130158.767083-3-maobibo@loongson.cn>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 weeks agouefi-test-tools:: Add LoongArch64 support
Bibo Mao [Tue, 20 May 2025 13:01:51 +0000 (21:01 +0800)] 
uefi-test-tools:: Add LoongArch64 support

Add support to build bios-tables-test iso image for LoongArch system.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Message-Id: <20250520130158.767083-2-maobibo@loongson.cn>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 weeks agopci: Add a PCI-level API for PRI
CLEMENT MATHIEU--DRIF [Tue, 20 May 2025 07:19:04 +0000 (07:19 +0000)] 
pci: Add a PCI-level API for PRI

A device can send a PRI request to the IOMMU using pci_pri_request_page.
The PRI response is sent back using the notifier managed with
pci_pri_register_notifier and pci_pri_unregister_notifier.

Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Co-authored-by: Ethan Milon <ethan.milon@eviden.com>
Message-Id: <20250520071823.764266-12-clement.mathieu--drif@eviden.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 weeks agopci: Add a pci-level API for ATS
CLEMENT MATHIEU--DRIF [Tue, 20 May 2025 07:19:03 +0000 (07:19 +0000)] 
pci: Add a pci-level API for ATS

Devices implementing ATS can send translation requests using
pci_ats_request_translation. The invalidation events are sent
back to the device using the iommu notifier managed with
pci_iommu_register_iotlb_notifier / pci_iommu_unregister_iotlb_notifier.

Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Co-authored-by: Ethan Milon <ethan.milon@eviden.com>
Message-Id: <20250520071823.764266-11-clement.mathieu--drif@eviden.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 weeks agopci: Add a pci-level initialization function for IOMMU notifiers
CLEMENT MATHIEU--DRIF [Tue, 20 May 2025 07:19:01 +0000 (07:19 +0000)] 
pci: Add a pci-level initialization function for IOMMU notifiers

This is meant to be used by ATS-capable devices.

Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Message-Id: <20250520071823.764266-10-clement.mathieu--drif@eviden.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 weeks agomemory: Store user data pointer in the IOMMU notifiers
CLEMENT MATHIEU--DRIF [Tue, 20 May 2025 07:19:00 +0000 (07:19 +0000)] 
memory: Store user data pointer in the IOMMU notifiers

This will help developers of ATS-capable devices to track a state.

Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Message-Id: <20250520071823.764266-9-clement.mathieu--drif@eviden.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 weeks agopci: Add an API to get IOMMU's min page size and virtual address width
CLEMENT MATHIEU--DRIF [Tue, 20 May 2025 07:18:59 +0000 (07:18 +0000)] 
pci: Add an API to get IOMMU's min page size and virtual address width

This kind of information is needed by devices implementing ATS in order
to initialize their translation cache.

Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Message-Id: <20250520071823.764266-8-clement.mathieu--drif@eviden.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 weeks agopci: Cache the bus mastering status in the device
CLEMENT MATHIEU--DRIF [Tue, 20 May 2025 07:18:58 +0000 (07:18 +0000)] 
pci: Cache the bus mastering status in the device

The cached is_master value is necessary to know if a device is
allowed to issue ATS/PRI requests or not as these operations do not go
through the master_enable memory region.

Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Message-Id: <20250520071823.764266-7-clement.mathieu--drif@eviden.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 weeks agopcie: Helper functions to check to check if PRI is enabled
CLEMENT MATHIEU--DRIF [Tue, 20 May 2025 07:18:57 +0000 (07:18 +0000)] 
pcie: Helper functions to check to check if PRI is enabled

pri_enabled can be used to check whether the capability is present and
enabled on a PCIe device

Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Message-Id: <20250520071823.764266-6-clement.mathieu--drif@eviden.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 weeks agopcie: Add a helper to declare the PRI capability for a pcie device
CLEMENT MATHIEU--DRIF [Tue, 20 May 2025 07:18:54 +0000 (07:18 +0000)] 
pcie: Add a helper to declare the PRI capability for a pcie device

Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Message-Id: <20250520071823.764266-5-clement.mathieu--drif@eviden.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 weeks agopcie: Helper function to check if ATS is enabled
CLEMENT MATHIEU--DRIF [Tue, 20 May 2025 07:18:52 +0000 (07:18 +0000)] 
pcie: Helper function to check if ATS is enabled

ats_enabled checks whether the capability is
present or not. If so, we read the configuration space to get
the status of the feature (enabled or not).

Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Message-Id: <20250520071823.764266-4-clement.mathieu--drif@eviden.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 weeks agopcie: Helper functions to check if PASID is enabled
CLEMENT MATHIEU--DRIF [Tue, 20 May 2025 07:18:51 +0000 (07:18 +0000)] 
pcie: Helper functions to check if PASID is enabled

pasid_enabled checks whether the capability is
present or not. If so, we read the configuration space to get
the status of the feature (enabled or not).

Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Message-Id: <20250520071823.764266-3-clement.mathieu--drif@eviden.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 weeks agopcie: Add helper to declare PASID capability for a pcie device
CLEMENT MATHIEU--DRIF [Tue, 20 May 2025 07:18:51 +0000 (07:18 +0000)] 
pcie: Add helper to declare PASID capability for a pcie device

Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Message-Id: <20250520071823.764266-2-clement.mathieu--drif@eviden.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 weeks agohw/i386/amd_iommu: Fix xtsup when vcpus < 255
Vasant Hegde [Fri, 16 May 2025 10:05:35 +0000 (15:35 +0530)] 
hw/i386/amd_iommu: Fix xtsup when vcpus < 255

If vCPUs > 255 then x86 common code (x86_cpus_init()) call kvm_enable_x2apic().
But if vCPUs <= 255 then the common code won't calls kvm_enable_x2apic().

This is because commit 8c6619f3e692 ("hw/i386/amd_iommu: Simplify non-KVM
checks on XTSup feature") removed the call to kvm_enable_x2apic when xtsup
is "on", which break things when guest is booted with x2apic mode and
there are <= 255 vCPUs.

Fix this by adding back kvm_enable_x2apic() call when xtsup=on.

Fixes: 8c6619f3e692 ("hw/i386/amd_iommu: Simplify non-KVM checks on XTSup feature")
Reported-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Tested-by: Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Cc: Philippe Mathieu-Daudé <philmd@linaro.org>
Cc: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Sairaj Kodilkar <sarunkod@amd.com>
Message-Id: <20250516100535.4980-3-sarunkod@amd.com>
Fixes: 8c6619f3e692 ("hw/i386/amd_iommu: Simplify non-KVM checks on XTSup feature")
Reported-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Tested-by: Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Cc: Philippe Mathieu-Daudé <philmd@linaro.org>
Cc: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Sairaj Kodilkar <sarunkod@amd.com>
2 weeks agohw/i386/amd_iommu: Fix device setup failure when PT is on.
Sairaj Kodilkar [Fri, 16 May 2025 10:05:34 +0000 (15:35 +0530)] 
hw/i386/amd_iommu: Fix device setup failure when PT is on.

Commit c1f46999ef506 ("amd_iommu: Add support for pass though mode")
introduces the support for "pt" flag by enabling nodma memory when
"pt=off". This allowed VFIO devices to successfully register notifiers
by using nodma region.

But, This also broke things when guest is booted with the iommu=nopt
because, devices bypass the IOMMU and use untranslated addresses (IOVA) to
perform DMA reads/writes to the nodma memory region, ultimately resulting in
a failure to setup the devices in the guest.

Fix the above issue by always enabling the amdvi_dev_as->iommu memory region.
But this will once again cause VFIO devices to fail while registering the
notifiers with AMD IOMMU memory region.

Fixes: c1f46999ef506 ("amd_iommu: Add support for pass though mode")
Signed-off-by: Sairaj Kodilkar <sarunkod@amd.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Message-Id: <20250516100535.4980-2-sarunkod@amd.com>
Fixes: c1f46999ef506 ("amd_iommu: Add support for pass though mode")
Signed-off-by: Sairaj Kodilkar <sarunkod@amd.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
2 weeks agovirtio: check for validity of indirect descriptors
Yuri Benditovich [Thu, 15 May 2025 06:32:37 +0000 (09:32 +0300)] 
virtio: check for validity of indirect descriptors

virtio processes indirect descriptors even if the respected
feature VIRTIO_RING_F_INDIRECT_DESC was not negotiated.
If qemu is used with reduced set of features to emulate the
hardware device that does not support indirect descriptors,
the will probably trigger problematic flows on the hardware
setup but do not reveal the  mistake on qemu.
Add LOG_GUEST_ERROR for such case. This will issue logs with
'-d guest_errors' in the command line

Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Message-Id: <20250515063237.808293-1-yuri.benditovich@daynix.com>
Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
2 weeks agoMerge tag 'pull-target-arm-20250530-2' of https://git.linaro.org/people/pmaydell...
Stefan Hajnoczi [Fri, 30 May 2025 15:41:21 +0000 (11:41 -0400)] 
Merge tag 'pull-target-arm-20250530-2' of https://git.linaro.org/people/pmaydell/qemu-arm into staging

target-arm queue:
 * hw/arm: Add GMAC devices to NPCM8XX SoC
 * hw/arm: Add missing psci_conduit to NPCM8XX SoC boot info
 * docs/interop: convert text files to restructuredText
 * target/arm: Some minor refactorings
 * tests/functional: Add a test for the Stellaris arm machines
 * hw/block: Drop unused nand.c

# -----BEGIN PGP SIGNATURE-----
#
# iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmg5qPYZHHBldGVyLm1h
# eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3tXUD/9tKWMUEYl23gd9IB5Ee3xK
# dcgG4Fzv0Ae8HLTd1agyhrg5S2LiXmFi37IO65d8Wxf7Y2TBU+kj1m3aB/C3w9Bx
# VdHGfNsHAMuYdYCOEm9OvmuSMYSxDRd43pNWdBxbc9/MgLM24rImJ05YHoZFVGrY
# S5olcZOl3/ttFHtigO4AYAbxkHMAJ5gDyNJiuk88IPx9WGYdmmM4mzJ/m17/Re01
# hdOUi0DKQO7kl+646knSU0dicu8NeO5rBAyJzu3vFBnvYXznjd9XaxF+A0Opl54P
# aBUZz27nDLvnGQrN8B5CjevjUysko+KL/L4NRqebeQKhSe4C8tKFIDocRTGyOEoR
# SAI0UpZbcX/mXt52aksSwMNG8oRvHOqpJRnNaaCZQoMjK7SlFwi6WctDpwiGt/Hu
# WaVlXaC77YRiKf1RAgH2CxV04ts342v+bndjfi4vy8D4zbTvwgqKxg+qk3N+JBMR
# ZUI5Gz3OcGXbw5awJAYbJmyo6qxBysmdHpPY8I1eW0ohzRx1rZ3Vka4yIje5mgO+
# 5yFpSy4GDRqNYKgGwlXRaseB38qKL4bEz0+uGzXYqdG7ACBz0xhT5H10npXkX/au
# LumtwW1sohsv3Xf9oBHQ1WQel7LDcWGVEZHZn6q67mazjvivLjREvA74dq1e8bqD
# zovTStIpBYRChXTRK1ShUQ==
# =Xts4
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 30 May 2025 08:47:50 EDT
# gpg:                using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg:                issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [full]
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>" [full]
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [full]
# gpg:                 aka "Peter Maydell <peter@archaic.org.uk>" [unknown]
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* tag 'pull-target-arm-20250530-2' of https://git.linaro.org/people/pmaydell/qemu-arm:
  hw/block: Drop unused nand.c
  tests/functional: Add a test for the Stellaris arm machines
  target/arm/hvf: Include missing 'cpu-qom.h' header
  target/arm/kvm: Include missing 'cpu-qom.h' header
  target/arm/qmp: Include missing 'cpu.h' header
  target/arm/cpu-features: Include missing 'cpu.h' header
  hw/arm/boot: Include missing 'system/memory.h' header
  target/arm/cpregs: Include missing 'target/arm/cpu.h' header
  target/arm: Only link with zlib when TCG is enabled
  target/arm/hvf_arm: Avoid using poisoned CONFIG_HVF definition
  target/arm/tcg-stubs: compile file once (system)
  docs/interop: convert text files to restructuredText
  hw/arm: Add missing psci_conduit to NPCM8XX SoC boot info
  tests/qtest: Migrate GMAC test from 7xx to 8xx
  hw/arm: Add GMAC devices to NPCM8XX SoC

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2 weeks agoMerge tag 'pull-request-2025-05-30' of https://gitlab.com/thuth/qemu into staging
Stefan Hajnoczi [Fri, 30 May 2025 15:41:13 +0000 (11:41 -0400)] 
Merge tag 'pull-request-2025-05-30' of https://gitlab.com/thuth/qemu into staging

* Functional tests improvements
* Endianness improvements/clean-ups for the Microblaze machines
* Remove obsolete -2.4 and -2.5 i440fx and q35 machine types and related code

# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmg5mlARHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbX1eRAAjvTK4noIfzc9QQI7EyUafgdp65m44wwx
# vfjlLbhmEnWFF11Qhovc6o36N4zF4Pt30mbXZs0gQaDR5H9RT8wrg9kShirhZX3O
# 4raPHCJFBviUCktSg90eFtvuQnfVK9cBMB8PMRQix+V5wRXcCx+cc6ebnQZ+UHp4
# L2d+qKRoHCPRO/dvQth4Be7a5pXqFQeu4gq7i/w9PCa7O+akSM3lc8dsJPuCiXgQ
# R7dkwsrRQzmiEC6aDmauNpsRRs0yptQs+9b83V4moLX07hk/R/I59EDFQqALLim7
# jmSbLnulKSSCeatV54PE/K4QxT62iA2OuJ6wo/vzVBGpzLdKE4aq99OcNPDxwWi0
# wc6xVDNtMyr81Ex4pZ0WgVKt57tDBIp9RijB5wTAhRPqKgnHtRGVNqX9TrsFls5L
# jIyKgfTxFKf9RA/a53p3uUXNmpLDVG63AhA9jWrAUtGOGJ0V+cDD2hTygXai8XTS
# 66aiEdMiuPFV2fApaEftcySFrMoT7RG1JHlcMjsTOpRdZF/x+rehFQKOHcdBeJ6r
# /zJ18MXbd5vEcglBz8joPwHu3mt2NLew+IvLPoAlwMfrniiNnUC+IY2Jzz3jYpBI
# WbbaesVG7J8SzJ6SwNOVuiCbiAImOkrxEz/8Jm783sZvWSzLYmwI9bBp9KXVxGty
# ed14fLi8g5U=
# =8SJJ
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 30 May 2025 07:45:20 EDT
# gpg:                using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg:                issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg:                 aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg:                 aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg:                 aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3  EAB9 2ED9 D774 FE70 2DB5

* tag 'pull-request-2025-05-30' of https://gitlab.com/thuth/qemu: (25 commits)
  tests/unit/test-util-sockets: fix mem-leak on error object
  hw/net/vmxnet3: Merge DeviceRealize in InstanceInit
  hw/net/vmxnet3: Remove VMXNET3_COMPAT_FLAG_DISABLE_PCIE definition
  hw/net/vmxnet3: Remove VMXNET3_COMPAT_FLAG_OLD_MSI_OFFSETS definition
  hw/scsi/vmw_pvscsi: Convert DeviceRealize -> InstanceInit
  hw/scsi/vmw_pvscsi: Remove PVSCSI_COMPAT_DISABLE_PCIE_BIT definition
  hw/scsi/vmw_pvscsi: Remove PVSCSI_COMPAT_OLD_PCI_CONFIGURATION definition
  hw/core/machine: Remove hw_compat_2_5[] array
  hw/nvram/fw_cfg: Remove legacy FW_CFG_ORDER_OVERRIDE
  hw/i386/x86: Remove X86MachineClass::save_tsc_khz field
  hw/i386/pc: Remove deprecated pc-q35-2.5 and pc-i440fx-2.5 machines
  hw/virtio/virtio-pci: Remove VIRTIO_PCI_FLAG_DISABLE_PCIE definition
  hw/virtio/virtio-pci: Remove VIRTIO_PCI_FLAG_MIGRATE_EXTRA definition
  hw/net/e1000: Remove unused E1000_FLAG_MAC flag
  hw/core/machine: Remove hw_compat_2_4[] array
  hw/i386/pc: Remove pc_compat_2_4[] array
  hw/i386/pc: Remove PCMachineClass::broken_reserved_end field
  hw/i386/pc: Remove deprecated pc-q35-2.4 and pc-i440fx-2.4 machines
  docs: Deprecate the qemu-system-microblazeel binary
  hw/microblaze: Remove the big-endian variants of ml605 and xlnx-zynqmp-pmu
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2 weeks agoMerge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging
Stefan Hajnoczi [Fri, 30 May 2025 15:41:07 +0000 (11:41 -0400)] 
Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging

* target/i386/kvm: Intel TDX support
* target/i386/emulate: more lflags cleanups
* meson: remove need for explicit listing of dependencies in hw_common_arch and
  target_common_arch
* rust: small fixes
* hpet: Reorganize register decoding to be more similar to Rust code
* target/i386: fixes for AMD models
* target/i386: new EPYC-Turin CPU model

# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCgAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmg4BxwUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroP67gf+PEP4EDQP0AJUfxXYVsczGf5snGjz
# ro8jYmKG+huBZcrS6uPK5zHYxtOI9bHr4ipTHJyHd61lyzN6Ys9amPbs/CRE2Q4x
# Ky4AojPhCuaL2wHcYNcu41L+hweVQ3myj97vP3hWvkatulXYeMqW3/4JZgr4WZ69
# A9LGLtLabobTz5yLc8x6oHLn/BZ2y7gjd2LzTz8bqxx7C/kamjoDrF2ZHbX9DLQW
# BKWQ3edSO6rorSNHWGZsy9BE20AEkW2LgJdlV9eXglFEuEs6cdPKwGEZepade4bQ
# Rdt2gHTlQdUDTFmAbz8pttPxFGMC9Zpmb3nnicKJpKQAmkT/x4k9ncjyAQ==
# =XmkU
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 29 May 2025 03:05:00 EDT
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (77 commits)
  target/i386/tcg/helper-tcg: fix file references in comments
  target/i386: Add support for EPYC-Turin model
  target/i386: Update EPYC-Genoa for Cache property, perfmon-v2, RAS and SVM feature bits
  target/i386: Add couple of feature bits in CPUID_Fn80000021_EAX
  target/i386: Update EPYC-Milan CPU model for Cache property, RAS, SVM feature bits
  target/i386: Update EPYC-Rome CPU model for Cache property, RAS, SVM feature bits
  target/i386: Update EPYC CPU model for Cache property, RAS, SVM feature bits
  rust: make declaration of dependent crates more consistent
  docs: Add TDX documentation
  i386/tdx: Validate phys_bits against host value
  i386/tdx: Make invtsc default on
  i386/tdx: Don't treat SYSCALL as unavailable
  i386/tdx: Fetch and validate CPUID of TD guest
  target/i386: Print CPUID subleaf info for unsupported feature
  i386: Remove unused parameter "uint32_t bit" in feature_word_description()
  i386/cgs: Introduce x86_confidential_guest_check_features()
  i386/tdx: Define supported KVM features for TDX
  i386/tdx: Add XFD to supported bit of TDX
  i386/tdx: Add supported CPUID bits relates to XFAM
  i386/tdx: Add supported CPUID bits related to TD Attributes
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2 weeks agoMerge tag 'pull-nbd-2025-05-29' of https://repo.or.cz/qemu/ericb into staging
Stefan Hajnoczi [Fri, 30 May 2025 15:40:56 +0000 (11:40 -0400)] 
Merge tag 'pull-nbd-2025-05-29' of https://repo.or.cz/qemu/ericb into staging

NBD patches for 2025-05-29

- Nir Soffer: Allow for larger Unix socket buffers in NBD
- Eric Blake: clean up mirror-sparse iotest issues

# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCAAdFiEEccLMIrHEYCkn0vOqp6FrSiUnQ2oFAmg42T0ACgkQp6FrSiUn
# Q2r5nwgAg4ftfPBnynqL54dQ6rPKPOwW3n4Ei26EsC86OcFIGEGuCK6UGBH4bH6d
# BgyjNWY/6/t90vnXcBGVFmxrugHGh3TwOpAY08TqW0LGmpJiwX5wZTk3cVbcwXat
# ME8oYeOQwLwqboFthlgnXsUuQrKtXrkY27154ztH354x4bi5AmHi//Or4+EdFf8L
# /cCmS7uHPiHV9l1+U1hV4i1UQ+3rWHIOcfn/sKeEwPfrlyEW+2fxWUjl7qyf/Mqz
# EwCtkjz4WsFTxYyQPN6r3NyoEIZDRK27srubVhat6Fk9gOnR5Rh2MCntyxUpXmo5
# 4xD3QkVbXVRhXv6n6rjmA/Q3bvZ1oQ==
# =yjPj
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 29 May 2025 18:01:33 EDT
# gpg:                using RSA key 71C2CC22B1C4602927D2F3AAA7A16B4A2527436A
# gpg: Good signature from "Eric Blake <eblake@redhat.com>" [full]
# gpg:                 aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>" [full]
# gpg:                 aka "[jpeg image of size 6874]" [full]
# Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2  F3AA A7A1 6B4A 2527 436A

* tag 'pull-nbd-2025-05-29' of https://repo.or.cz/qemu/ericb:
  iotests: Filter out ZFS in several tests
  iotests: Improve mirror-sparse on ext4 and xfs
  iotests: Use disk_usage in more places
  nbd: Set unix socket send buffer on Linux
  nbd: Set unix socket send buffer on macOS
  io: Add helper for setting socket send buffer size

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2 weeks agotests/unit/test-util-sockets: fix mem-leak on error object
Matheus Tavares Bernardino [Mon, 26 May 2025 17:20:55 +0000 (10:20 -0700)] 
tests/unit/test-util-sockets: fix mem-leak on error object

The test fails with --enable-asan as the error struct is never freed.
In the case where the test expects a success but it fails, let's also
report the error for debugging (it will be freed internally).

Fixes 316e8ee8d6 ("util/qemu-sockets: Refactor inet_parse() to use QemuOpts")

Signed-off-by: Matheus Tavares Bernardino <matheus.bernardino@oss.qualcomm.com>
Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Message-ID: <518d94c7db20060b2a086cf55ee9bffab992a907.1748280011.git.matheus.bernardino@oss.qualcomm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2 weeks agohw/net/vmxnet3: Merge DeviceRealize in InstanceInit
Philippe Mathieu-Daudé [Mon, 12 May 2025 08:39:48 +0000 (10:39 +0200)] 
hw/net/vmxnet3: Merge DeviceRealize in InstanceInit

Simplify merging vmxnet3_realize() within vmxnet3_instance_init(),
removing the need for device_class_set_parent_realize().

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-20-philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2 weeks agohw/net/vmxnet3: Remove VMXNET3_COMPAT_FLAG_DISABLE_PCIE definition
Philippe Mathieu-Daudé [Mon, 12 May 2025 08:39:47 +0000 (10:39 +0200)] 
hw/net/vmxnet3: Remove VMXNET3_COMPAT_FLAG_DISABLE_PCIE definition

VMXNET3_COMPAT_FLAG_DISABLE_PCIE was only used by the
hw_compat_2_5[] array, via the 'x-disable-pcie=on' property.
We removed all machines using that array, lets remove all the
code around VMXNET3_COMPAT_FLAG_DISABLE_PCIE.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-19-philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2 weeks agohw/net/vmxnet3: Remove VMXNET3_COMPAT_FLAG_OLD_MSI_OFFSETS definition
Philippe Mathieu-Daudé [Mon, 12 May 2025 08:39:46 +0000 (10:39 +0200)] 
hw/net/vmxnet3: Remove VMXNET3_COMPAT_FLAG_OLD_MSI_OFFSETS definition

VMXNET3_COMPAT_FLAG_OLD_MSI_OFFSETS was only used by the
hw_compat_2_5[] array, via the 'x-old-msi-offsets=on' property.
We removed all machines using that array, lets remove all the
code around VMXNET3_COMPAT_FLAG_OLD_MSI_OFFSETS.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-18-philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2 weeks agohw/scsi/vmw_pvscsi: Convert DeviceRealize -> InstanceInit
Philippe Mathieu-Daudé [Mon, 12 May 2025 08:39:45 +0000 (10:39 +0200)] 
hw/scsi/vmw_pvscsi: Convert DeviceRealize -> InstanceInit

Simplify replacing pvscsi_realize() by pvscsi_instance_init(),
removing the need for device_class_set_parent_realize().

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-17-philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2 weeks agohw/scsi/vmw_pvscsi: Remove PVSCSI_COMPAT_DISABLE_PCIE_BIT definition
Philippe Mathieu-Daudé [Mon, 12 May 2025 08:39:44 +0000 (10:39 +0200)] 
hw/scsi/vmw_pvscsi: Remove PVSCSI_COMPAT_DISABLE_PCIE_BIT definition

PVSCSI_COMPAT_DISABLE_PCIE_BIT was only used by the
hw_compat_2_5[] array, via the 'x-disable-pcie=on' property.
We removed all machines using that array, lets remove all the
code around PVSCSI_COMPAT_DISABLE_PCIE_BIT, including the now
unused PVSCSIState::compat_flags field.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-16-philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2 weeks agohw/scsi/vmw_pvscsi: Remove PVSCSI_COMPAT_OLD_PCI_CONFIGURATION definition
Philippe Mathieu-Daudé [Mon, 12 May 2025 08:39:43 +0000 (10:39 +0200)] 
hw/scsi/vmw_pvscsi: Remove PVSCSI_COMPAT_OLD_PCI_CONFIGURATION definition

PVSCSI_COMPAT_OLD_PCI_CONFIGURATION was only used by the
hw_compat_2_5[] array, via the 'x-old-pci-configuration=on'
property. We removed all machines using that array, lets remove
all the code around PVSCSI_COMPAT_OLD_PCI_CONFIGURATION.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-15-philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2 weeks agohw/core/machine: Remove hw_compat_2_5[] array
Philippe Mathieu-Daudé [Mon, 12 May 2025 08:39:41 +0000 (10:39 +0200)] 
hw/core/machine: Remove hw_compat_2_5[] array

The hw_compat_2_5[] array was only used by the pc-q35-2.5 and
pc-i440fx-2.5 machines, which got removed. Remove it.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-13-philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2 weeks agohw/nvram/fw_cfg: Remove legacy FW_CFG_ORDER_OVERRIDE
Philippe Mathieu-Daudé [Mon, 12 May 2025 08:39:40 +0000 (10:39 +0200)] 
hw/nvram/fw_cfg: Remove legacy FW_CFG_ORDER_OVERRIDE

The MachineClass::legacy_fw_cfg_order boolean was only used
by the pc-q35-2.5 and pc-i440fx-2.5 machines, which got
removed. Remove it along with:

- FW_CFG_ORDER_OVERRIDE_* definitions
- fw_cfg_set_order_override()
- fw_cfg_reset_order_override()
- fw_cfg_order[]
- rom_set_order_override()
- rom_reset_order_override()

Simplify CLI and pc_vga_init() / pc_nic_init().

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-12-philmd@linaro.org>
[thuth: Fix error from check_patch.pl wrt to an empty "for" loop]
Signed-off-by: Thomas Huth <thuth@redhat.com>
2 weeks agohw/i386/x86: Remove X86MachineClass::save_tsc_khz field
Philippe Mathieu-Daudé [Mon, 12 May 2025 08:39:39 +0000 (10:39 +0200)] 
hw/i386/x86: Remove X86MachineClass::save_tsc_khz field

The X86MachineClass::save_tsc_khz boolean was only used
by the pc-q35-2.5 and pc-i440fx-2.5 machines, which got
removed. Remove it and simplify tsc_khz_needed().

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-11-philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2 weeks agohw/i386/pc: Remove deprecated pc-q35-2.5 and pc-i440fx-2.5 machines
Philippe Mathieu-Daudé [Mon, 12 May 2025 08:39:38 +0000 (10:39 +0200)] 
hw/i386/pc: Remove deprecated pc-q35-2.5 and pc-i440fx-2.5 machines

These machines has been supported for a period of more than 6 years.
According to our versioned machine support policy (see commit
ce80c4fa6ff "docs: document special exception for machine type
deprecation & removal") they can now be removed.

Remove the now unused empty pc_compat_2_5[] array.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-10-philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2 weeks agohw/virtio/virtio-pci: Remove VIRTIO_PCI_FLAG_DISABLE_PCIE definition
Philippe Mathieu-Daudé [Mon, 12 May 2025 08:39:37 +0000 (10:39 +0200)] 
hw/virtio/virtio-pci: Remove VIRTIO_PCI_FLAG_DISABLE_PCIE definition

VIRTIO_PCI_FLAG_DISABLE_PCIE was only used by the hw_compat_2_4[]
array, via the 'x-disable-pcie=false' property. We removed all
machines using that array, lets remove all the code around
VIRTIO_PCI_FLAG_DISABLE_PCIE (see commit 9a4c0e220d8 for similar
VIRTIO_PCI_FLAG_* enum removal).

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-9-philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2 weeks agohw/virtio/virtio-pci: Remove VIRTIO_PCI_FLAG_MIGRATE_EXTRA definition
Philippe Mathieu-Daudé [Mon, 12 May 2025 08:39:36 +0000 (10:39 +0200)] 
hw/virtio/virtio-pci: Remove VIRTIO_PCI_FLAG_MIGRATE_EXTRA definition

VIRTIO_PCI_FLAG_MIGRATE_EXTRA was only used by the
hw_compat_2_4[] array, via the 'migrate-extra=true'
property. We removed all machines using that array,
lets remove all the code around VIRTIO_PCI_FLAG_MIGRATE_EXTRA.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20250512083948.39294-8-philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2 weeks agohw/net/e1000: Remove unused E1000_FLAG_MAC flag
Philippe Mathieu-Daudé [Mon, 12 May 2025 08:39:35 +0000 (10:39 +0200)] 
hw/net/e1000: Remove unused E1000_FLAG_MAC flag

E1000_FLAG_MAC was only used by the hw_compat_2_4[] array,
via the 'extra_mac_registers=off' property. We removed all
machines using that array, lets remove all the code around
E1000_FLAG_MAC, including the MAC_ACCESS_FLAG_NEEDED enum,
similarly to commit fa4ec9ffda7 ("e1000: remove old
compatibility code").

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20250512083948.39294-7-philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2 weeks agohw/core/machine: Remove hw_compat_2_4[] array
Philippe Mathieu-Daudé [Mon, 12 May 2025 08:39:34 +0000 (10:39 +0200)] 
hw/core/machine: Remove hw_compat_2_4[] array

The hw_compat_2_4[] array was only used by the pc-q35-2.4 and
pc-i440fx-2.4 machines, which got removed. Remove it.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-6-philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2 weeks agohw/i386/pc: Remove pc_compat_2_4[] array
Philippe Mathieu-Daudé [Mon, 12 May 2025 08:39:32 +0000 (10:39 +0200)] 
hw/i386/pc: Remove pc_compat_2_4[] array

The pc_compat_2_4[] array was only used by the pc-q35-2.4
and pc-i440fx-2.4 machines, which got removed. Remove it.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-4-philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2 weeks agohw/i386/pc: Remove PCMachineClass::broken_reserved_end field
Philippe Mathieu-Daudé [Mon, 12 May 2025 08:39:31 +0000 (10:39 +0200)] 
hw/i386/pc: Remove PCMachineClass::broken_reserved_end field

The PCMachineClass::broken_reserved_end field was only used
by the pc-q35-2.4 and pc-i440fx-2.4 machines, which got removed.
Remove it and simplify pc_memory_init().

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-3-philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2 weeks agohw/i386/pc: Remove deprecated pc-q35-2.4 and pc-i440fx-2.4 machines
Philippe Mathieu-Daudé [Mon, 12 May 2025 08:39:30 +0000 (10:39 +0200)] 
hw/i386/pc: Remove deprecated pc-q35-2.4 and pc-i440fx-2.4 machines

These machines has been supported for a period of more than 6 years.
According to our versioned machine support policy (see commit
ce80c4fa6ff "docs: document special exception for machine type
deprecation & removal") they can now be removed.

Remove the qtest in test-x86-cpuid-compat.c file.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-2-philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2 weeks agodocs: Deprecate the qemu-system-microblazeel binary
Thomas Huth [Thu, 15 May 2025 13:20:19 +0000 (15:20 +0200)] 
docs: Deprecate the qemu-system-microblazeel binary

The (former big-endian only) binary qemu-system-microblaze can
handle both endiannesses nowadays, so we don't need the separate
qemu-system-microblazeel binary for little endian anymore. Let's
deprecate it to avoid unnecessary compilation and test time in
the future.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20250515132019.569365-5-thuth@redhat.com>

2 weeks agohw/microblaze: Remove the big-endian variants of ml605 and xlnx-zynqmp-pmu
Thomas Huth [Thu, 15 May 2025 13:20:18 +0000 (15:20 +0200)] 
hw/microblaze: Remove the big-endian variants of ml605 and xlnx-zynqmp-pmu

Both machines were added with little-endian in mind only (the
"endianness" CPU property was hard-wired to "true", see commits
133d23b3ad1 and a88bbb006a52), so the variants that showed up
on the big endian target likely never worked. We deprecated these
non-working machine variants two releases ago, and so far nobody
complained, so it should be fine now to disable them. Hard-wire
the machines to little endian now.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20250515132019.569365-4-thuth@redhat.com>

2 weeks agotests/functional: Test both microblaze s3adsp1800 endianness variants
Thomas Huth [Thu, 15 May 2025 13:20:17 +0000 (15:20 +0200)] 
tests/functional: Test both microblaze s3adsp1800 endianness variants

Now that the endianness of the petalogix-s3adsp1800 can be configured,
we should test that the cross-endianness also works as expected, thus
test the big endian variant on the little endian target and vice versa.
(based on an original idea from Philippe Mathieu-Daudé)

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20250515132019.569365-3-thuth@redhat.com>

2 weeks agohw/microblaze: Add endianness property to the petalogix_s3adsp1800 machine
Thomas Huth [Thu, 15 May 2025 13:20:16 +0000 (15:20 +0200)] 
hw/microblaze: Add endianness property to the petalogix_s3adsp1800 machine

Since the microblaze target can now handle both endianness, big and
little, we should provide a config knob for the user to select the
desired endianness.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20250515132019.569365-2-thuth@redhat.com>

2 weeks agotests/functional/test_mem_addr_space: Use set_machine() to select the machine
Thomas Huth [Wed, 21 May 2025 14:37:32 +0000 (16:37 +0200)] 
tests/functional/test_mem_addr_space: Use set_machine() to select the machine

By using self.set_machine() the tests get properly skipped in case
the machine has not been compiled into the QEMU binary, e.g. when
"configure" has been run with "--without-default-devices".

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20250521143732.140711-1-thuth@redhat.com>

2 weeks agotests/functional/test_mips_malta: Re-enable the check for the PCI host bridge
Thomas Huth [Thu, 22 May 2025 08:02:08 +0000 (10:02 +0200)] 
tests/functional/test_mips_malta: Re-enable the check for the PCI host bridge

The problem with the PCI bridge has been fixed in commit e5894fd6f411c1
("hw/pci-host/gt64120: Fix endianness handling"), so we can enable the
corresponding test again.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20250522080208.205489-1-thuth@redhat.com>

2 weeks agotests/functional/test_sparc64_tuxrun: Explicitly set the 'sun4u' machine
Thomas Huth [Wed, 21 May 2025 14:51:12 +0000 (16:51 +0200)] 
tests/functional/test_sparc64_tuxrun: Explicitly set the 'sun4u' machine

Use self.set_machine() to set the machine instead of relying on the
default machine of the binary. This way the test can be skipped in
case the machine has not been compiled into the QEMU binary.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20250521145112.142222-1-thuth@redhat.com>

2 weeks agoiotests: Filter out ZFS in several tests
Eric Blake [Fri, 23 May 2025 16:27:23 +0000 (11:27 -0500)] 
iotests: Filter out ZFS in several tests

Fiona reported that ZFS makes sparse file testing awkward, since:
- it has asynchronous allocation (not even 'fsync $file' makes du see
  the desired size; it takes the slower 'fsync -f $file' which is not
  appropriate for the tests)
- for tests of fully allocated files, ZFS with compression enabled
  still reports smaller disk usage

Add a new _require_disk_usage that quickly probes whether an attempt
to create a sparse 5M file shows as less than 1M usage, while the same
file with -o preallocation=full shows as more than 4M usage without
sync, which should filter out ZFS behavior.  Then use it in various
affected tests.

This does not add the new filter on all tests that Fiona is seeing ZFS
failures on, but only those where I could quickly spot that there is
at least one place where the test depends on the output of 'du -b' or
'stat -c %b'.

Reported-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20250523163041.2548675-8-eblake@redhat.com>
Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
2 weeks agoiotests: Improve mirror-sparse on ext4 and xfs
Eric Blake [Fri, 23 May 2025 16:27:22 +0000 (11:27 -0500)] 
iotests: Improve mirror-sparse on ext4 and xfs

Fiona reported that an ext4 filesystem on top of LVM can sometimes
report over-allocation to du (based on the heuristics the filesystem
is making while observing the contents being mirrored); even though
the contents and actual size matched, about 50% of the time the size
reported by disk_usage was too large by 4k, failing the test.  In
auditing other iotests, this is a common problem we've had to deal
with.

Meanwhile, Markus reported that an xfs filesystem reports disk usage
at a default granularity of 1M (so the sparse file occupies 3M, since
it has just over 2M data).

Reported-by: Fiona Ebner <f.ebner@proxmox.com>
Reported-by: Markus Armbruster <armbru@redhat.com>
Fixes: c0ddcb2c ("tests: Add iotest mirror-sparse for recent patches")
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20250523163041.2548675-7-eblake@redhat.com>
[eblake: Also fix xfs issue]
Signed-off-by: Eric Blake <eblake@redhat.com>
2 weeks agoiotests: Use disk_usage in more places
Eric Blake [Fri, 23 May 2025 16:27:21 +0000 (11:27 -0500)] 
iotests: Use disk_usage in more places

Commit be9bac07 added a utility disk_usage function, but there are
a couple of other tests that could also use it.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20250523163041.2548675-6-eblake@redhat.com>
Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
2 weeks agonbd: Set unix socket send buffer on Linux
Nir Soffer [Sat, 17 May 2025 20:11:54 +0000 (23:11 +0300)] 
nbd: Set unix socket send buffer on Linux

Like macOS we have similar issue on Linux. For TCP socket the send
buffer size is 2626560 bytes (~2.5 MiB) and we get good performance.
However for unix socket the default and maximum buffer size is 212992
bytes (208 KiB) and we see poor performance when using one NBD
connection, up to 4 times slower than macOS on the same machine.

Tracing shows that for every 2 MiB payload (qemu uses 2 MiB io size), we
do 1 recvmsg call with TCP socket, and 10 recvmsg calls with unix
socket.

Fixing this issue requires changing the maximum send buffer size (the
receive buffer size is ignored). This can be done using:

    $ cat /etc/sysctl.d/net-mem-max.conf
    net.core.wmem_max = 2097152

    $ sudo sysctl -p /etc/sysctl.d/net-mem-max.conf

With this we can set the socket buffer size to 2 MiB. With the defaults
the value requested by qemu is clipped to the maximum size and has no
effect.

I tested on 2 machines:
- Fedora 42 VM on MacBook Pro M2 Max
- Dell PowerEdge R640 (Intel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz)

On the older Dell machine we see very little improvement, up to 1.03
higher throughput. On the M2 machine we see up to 2.67 times higher
throughput. The following results are from the M2 machine.

Reading from qemu-nbd with qemu-img convert. In this test buffer size of
4m is optimal (2.28 times faster).

| buffer size | time    | user    | system  |
|-------------|---------|---------|---------|
|     default |   4.292 |   0.243 |   1.604 |
|      524288 |   2.167 |   0.058 |   1.288 |
|     1048576 |   2.041 |   0.060 |   1.238 |
|     2097152 |   1.884 |   0.060 |   1.191 |
|     4194304 |   1.881 |   0.054 |   1.196 |

Writing to qemu-nbd with qemu-img convert. In this test buffer size of
1m is optimal (2.67 times faster).

| buffer size | time    | user    | system  |
|-------------|---------|---------|---------|
|     default |   3.113 |   0.334 |   1.094 |
|      524288 |   1.173 |   0.179 |   0.654 |
|     1048576 |   1.164 |   0.164 |   0.670 |
|     2097152 |   1.227 |   0.197 |   0.663 |
|     4194304 |   1.227 |   0.198 |   0.666 |

Computing a blkhash with nbdcopy. In this test buffer size of 512k is
optimal (1.19 times faster).

| buffer size | time    | user    | system  |
|-------------|---------|---------|---------|
|     default |   2.140 |   4.483 |   2.681 |
|      524288 |   1.794 |   4.467 |   2.572 |
|     1048576 |   1.807 |   4.447 |   2.644 |
|     2097152 |   1.822 |   4.461 |   2.698 |
|     4194304 |   1.827 |   4.465 |   2.700 |

Computing a blkhash with blksum. In this test buffer size of 4m is
optimal (2.65 times faster).

| buffer size | time    | user    | system  |
|-------------|---------|---------|---------|
|     default |   3.582 |   4.595 |   2.392 |
|      524288 |   1.499 |   4.384 |   1.482 |
|     1048576 |   1.377 |   4.381 |   1.345 |
|     2097152 |   1.388 |   4.389 |   1.354 |
|     4194304 |   1.352 |   4.395 |   1.302 |

Signed-off-by: Nir Soffer <nirsof@gmail.com>
Message-ID: <20250517201154.88456-4-nirsof@gmail.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2 weeks agonbd: Set unix socket send buffer on macOS
Nir Soffer [Sat, 17 May 2025 20:11:53 +0000 (23:11 +0300)] 
nbd: Set unix socket send buffer on macOS

On macOS we need to increase unix socket buffers size on the client and
server to get good performance. We set socket buffers on macOS after
connecting or accepting a client connection.

Testing shows that setting socket receive buffer size (SO_RCVBUF) has no
effect on performance, so we set only the send buffer size (SO_SNDBUF).
It seems to work like Linux but not documented.

Testing shows that optimal buffer size is 512k to 4 MiB, depending on
the test case. The difference is very small, so I chose 2 MiB.

I tested reading from qemu-nbd and writing to qemu-nbd with qemu-img and
computing a blkhash with nbdcopy and blksum.

To focus on NBD communication and get less noisy results, I tested
reading and writing to null-co driver. I added a read-pattern option to
the null-co driver to return data full of 0xff:

    NULL="json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '10g', 'read-pattern': 255}}"

For testing buffer size I added an environment variable for setting the
socket buffer size.

Read from qemu-nbd via qemu-img convert. In this test buffer size of 2m
is optimal (12.6 times faster).

    qemu-nbd -r -t -e 0 -f raw -k /tmp/nbd.sock "$NULL" &
    qemu-img convert -f raw -O raw -W -n "nbd+unix:///?socket=/tmp/nbd.sock" "$NULL"

| buffer size | time    | user    | system  |
|-------------|---------|---------|---------|
|     default |  13.361 |   2.653 |   5.702 |
|       65536 |   2.283 |   0.204 |   1.318 |
|      131072 |   1.673 |   0.062 |   1.008 |
|      262144 |   1.592 |   0.053 |   0.952 |
|      524288 |   1.496 |   0.049 |   0.887 |
|     1048576 |   1.234 |   0.047 |   0.738 |
|     2097152 |   1.060 |   0.080 |   0.602 |
|     4194304 |   1.061 |   0.076 |   0.604 |

Write to qemu-nbd with qemu-img convert. In this test buffer size of 2m
is optimal (9.2 times faster).

    qemu-nbd -t -e 0 -f raw -k /tmp/nbd.sock "$NULL" &
    qemu-img convert -f raw -O raw -W -n "$NULL" "nbd+unix:///?socket=/tmp/nbd.sock"

| buffer size | time    | user    | system  |
|-------------|---------|---------|---------|
|     default |   8.063 |   2.522 |   4.184 |
|       65536 |   1.472 |   0.430 |   0.867 |
|      131072 |   1.071 |   0.297 |   0.654 |
|      262144 |   1.012 |   0.239 |   0.587 |
|      524288 |   0.970 |   0.201 |   0.514 |
|     1048576 |   0.895 |   0.184 |   0.454 |
|     2097152 |   0.877 |   0.174 |   0.440 |
|     4194304 |   0.944 |   0.231 |   0.535 |

Compute a blkhash with nbdcopy, using 4 NBD connections and 256k request
size. In this test buffer size of 4m is optimal (5.1 times faster).

    qemu-nbd -r -t -e 0 -f raw -k /tmp/nbd.sock "$NULL" &
    nbdcopy --blkhash "nbd+unix:///?socket=/tmp/nbd.sock" null:

| buffer size | time    | user    | system  |
|-------------|---------|---------|---------|
|     default |   8.624 |   5.727 |   6.507 |
|       65536 |   2.563 |   4.760 |   2.498 |
|      131072 |   1.903 |   4.559 |   2.093 |
|      262144 |   1.759 |   4.513 |   1.935 |
|      524288 |   1.729 |   4.489 |   1.924 |
|     1048576 |   1.696 |   4.479 |   1.884 |
|     2097152 |   1.710 |   4.480 |   1.763 |
|     4194304 |   1.687 |   4.479 |   1.712 |

Compute a blkhash with blksum, using 1 NBD connection and 256k read
size. In this test buffer size of 512k is optimal (10.3 times faster).

    qemu-nbd -r -t -e 0 -f raw -k /tmp/nbd.sock "$NULL" &
    blksum "nbd+unix:///?socket=/tmp/nbd.sock"

| buffer size | time    | user    | system  |
|-------------|---------|---------|---------|
|     default |  13.085 |   5.664 |   6.461 |
|       65536 |   3.299 |   5.106 |   2.515 |
|      131072 |   2.396 |   4.989 |   2.069 |
|      262144 |   1.607 |   4.724 |   1.555 |
|      524288 |   1.271 |   4.528 |   1.224 |
|     1048576 |   1.294 |   4.565 |   1.333 |
|     2097152 |   1.299 |   4.569 |   1.344 |
|     4194304 |   1.291 |   4.559 |   1.327 |

Signed-off-by: Nir Soffer <nirsof@gmail.com>
Message-ID: <20250517201154.88456-3-nirsof@gmail.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2 weeks agoio: Add helper for setting socket send buffer size
Nir Soffer [Sat, 17 May 2025 20:11:52 +0000 (23:11 +0300)] 
io: Add helper for setting socket send buffer size

Testing reading and writing from qemu-nbd using a unix domain socket
shows that the platform default send buffer size is too low, leading to
poor performance and hight cpu usage.

Add a helper for setting socket send buffer size to be used in NBD code.
It can also be used in other contexts.

We don't need a helper for receive buffer size since it is not used with
unix domain sockets. This is documented for Linux, and not documented
for macOS.

Failing to set the socket buffer size is not a fatal error, but the
caller may want to warn about the failure.

Signed-off-by: Nir Soffer <nirsof@gmail.com>
Message-ID: <20250517201154.88456-2-nirsof@gmail.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2 weeks agohw/block: Drop unused nand.c
Peter Maydell [Thu, 29 May 2025 16:45:13 +0000 (17:45 +0100)] 
hw/block: Drop unused nand.c

The nand.c device (TYPE_NAND) is an emulation of a NAND flash memory
chip which was used by the old OMAP boards.  No current QEMU board
uses it, and although techically "-device nand,chip-id=0x6b" doesn't
error out, it's not possible to usefully use it from the command
line because the only interface it has is via calling C functions
like nand_setpins() and nand_setio().

The "config OMAP" stanza (used only by the SX1 board) is the only
thing that does "select NAND" to compile in this code, but the SX1
board doesn't actually use the NAND device.

Remove the NAND device code entirely; this is effectively leftover
cleanup from when we dropped the PXA boards and the OMAP boards
other than the sx1.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250522142859.3122389-1-peter.maydell@linaro.org

2 weeks agotests/functional: Add a test for the Stellaris arm machines
Thomas Huth [Thu, 29 May 2025 16:45:12 +0000 (17:45 +0100)] 
tests/functional: Add a test for the Stellaris arm machines

The 2023 edition of the QEMU advent calendar featured an image
that we can use to test whether the lm3s6965evb machine is basically
still working.

And for the lm3s811evb there is a small test kernel on github
which can be used to check its UART.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-id: 20250519170242.520805-1-thuth@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2 weeks agotarget/arm/hvf: Include missing 'cpu-qom.h' header
Philippe Mathieu-Daudé [Thu, 29 May 2025 16:45:12 +0000 (17:45 +0100)] 
target/arm/hvf: Include missing 'cpu-qom.h' header

ARMCPU typedef is declared in "cpu-qom.h". Include it in
order to avoid when refactoring unrelated headers:

  target/arm/hvf_arm.h:23:41: error: unknown type name 'ARMCPU'
     23 | void hvf_arm_set_cpu_features_from_host(ARMCPU *cpu);
        |                                         ^

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-id: 20250513173928.77376-10-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2 weeks agotarget/arm/kvm: Include missing 'cpu-qom.h' header
Philippe Mathieu-Daudé [Thu, 29 May 2025 16:45:12 +0000 (17:45 +0100)] 
target/arm/kvm: Include missing 'cpu-qom.h' header

ARMCPU typedef is declared in "cpu-qom.h". Include it in
order to avoid when refactoring unrelated headers:

  target/arm/kvm_arm.h:54:29: error: unknown type name 'ARMCPU'
     54 | bool write_list_to_kvmstate(ARMCPU *cpu, int level);
        |                             ^

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-id: 20250513173928.77376-9-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2 weeks agotarget/arm/qmp: Include missing 'cpu.h' header
Philippe Mathieu-Daudé [Thu, 29 May 2025 16:45:12 +0000 (17:45 +0100)] 
target/arm/qmp: Include missing 'cpu.h' header

arm-qmp-cmds.c uses ARM_MAX_VQ, which is defined in "cpu.h".
Include the latter to avoid when refactoring unrelated headers:

  target/arm/arm-qmp-cmds.c:83:19: error: use of undeclared identifier 'ARM_MAX_VQ'
     83 | QEMU_BUILD_BUG_ON(ARM_MAX_VQ > 16);
        |                   ^

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-id: 20250513173928.77376-8-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2 weeks agotarget/arm/cpu-features: Include missing 'cpu.h' header
Philippe Mathieu-Daudé [Thu, 29 May 2025 16:45:11 +0000 (17:45 +0100)] 
target/arm/cpu-features: Include missing 'cpu.h' header

"target/arm/cpu-features.h" dereferences the ARMISARegisters
structure, which is defined in "cpu.h". Include the latter to
avoid when refactoring unrelated headers:

  In file included from target/arm/internals.h:33:
  target/arm/cpu-features.h:45:54: error: unknown type name 'ARMISARegisters'
     45 | static inline bool isar_feature_aa32_thumb_div(const ARMISARegisters *id)
        |                                                      ^
  target/arm/cpu-features.h:47:12: error: use of undeclared identifier 'R_ID_ISAR0_DIVIDE_SHIFT'
     47 |     return FIELD_EX32(id->id_isar0, ID_ISAR0, DIVIDE) != 0;
        |            ^

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-id: 20250513173928.77376-7-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2 weeks agohw/arm/boot: Include missing 'system/memory.h' header
Philippe Mathieu-Daudé [Thu, 29 May 2025 16:45:11 +0000 (17:45 +0100)] 
hw/arm/boot: Include missing 'system/memory.h' header

default_reset_secondary() uses address_space_stl_notdirty(),
itself declared in "system/memory.h". Include this header in
order to avoid when refactoring headers:

  ../hw/arm/boot.c:281:5: error: implicit declaration of function 'address_space_stl_notdirty' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
    address_space_stl_notdirty(as, info->smp_bootreg_addr,
    ^

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250513173928.77376-6-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>