Stefan Hajnoczi [Tue, 7 Nov 2023 01:42:17 +0000 (09:42 +0800)]
Merge tag 'pull-block-2023-11-06' of https://gitlab.com/hreitz/qemu into staging
Block patches:
- One patch to make qcow2's discard-no-unref option do better what it is
supposed to do (i.e. prevent fragmentation)
- Two fixes for zoned requests
# -----BEGIN PGP SIGNATURE-----
#
# iQJGBAABCAAwFiEEy2LXoO44KeRfAE00ofpA0JgBnN8FAmVJHbgSHGhyZWl0ekBy
# ZWRoYXQuY29tAAoJEKH6QNCYAZzfLn4QAKxuUYZaXirv6K4U2tW4aAJtc5uESdwv
# WYhG7YU7MleBGCY0fRoih5thrPrzRLC8o1QhbRcA36+/PAZf4BYrJEfqLUdzuN5x
# 6Vb1n3NRUzPD1+VfL/B9hVZhFbtTOUZuxPGEqCoHAmqBaeKuYRT1bLZbtRtPVLSk
# 5eTMiyrpRMlBWc7O71eGKLqU4k0vAznwHBGf2Z93qWAsKcRZCwbAWYa7Q6rJ9jJ8
# 1jNsQuAk0p74/uGEpFhoEVrFEcV6pMbI4+jB9i0t9YYxT0tLIdIX1VUx+AHJfItk
# IF2stB6SFOaAy2W3Fn+0oJvz40aMLzg9VjEeTpGmdlKC67ZTYa6Obwzy5WNLPIap
# k7VUheUEe8qoKUtxQNxGLR/HKEJSFXyhU0lgAGxE1gl2xc1QFFFsrimpwFd3d37j
# 3PwfhjARHonf4ZXgsvtIjb7nG9seMZYO7Vht0OztJyW8c2XN5OFVPir9xLbd9VUg
# wZNGB8jAsHgj77+S/mRIwpP+laKL8wB7zYZ1mgFI98QJIYqL8tGdV/IiUhLljHzc
# XAmwekOhBMMbgHhliBy9zDuTy59+zZ0FoxZPn/JvBjqBAkEnz9EbhHxi2imQg+1d
# XSoLbx1X1yEbepWz8mCGiveLIPkt+3qMJuuQF76nURaA+nm3tCl/nKca6QLnVKzU
# 2QtPWS0qRmwd
# =5w7S
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 07 Nov 2023 01:09:12 HKT
# gpg: using RSA key CB62D7A0EE3829E45F004D34A1FA40D098019CDF
# gpg: issuer "hreitz@redhat.com"
# gpg: Good signature from "Hanna Reitz <hreitz@redhat.com>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: CB62 D7A0 EE38 29E4 5F00 4D34 A1FA 40D0 9801 9CDF
* tag 'pull-block-2023-11-06' of https://gitlab.com/hreitz/qemu:
file-posix: fix over-writing of returning zone_append offset
block/file-posix: fix update_zones_wp() caller
qcow2: keep reference on zeroize with discard-no-unref enabled
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Stefan Hajnoczi [Tue, 7 Nov 2023 01:41:52 +0000 (09:41 +0800)]
Merge tag 'pull-vfio-20231106' of https://github.com/legoater/qemu into staging
vfio queue:
* Support for non 64b IOVA space
* Introduction of a PCIIOMMUOps callback structure to ease future
extensions
* Fix for a buffer overrun when writing the VF token
* PPC cleanups preparing ground for IOMMUFD support
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmVI+bIACgkQUaNDx8/7
# 7KHW4g/9FmgX0k2Elm1BAul3slJtuBT8/iHKfK19rhXICxhxS5xBWJA8FmosTWAT
# 91YqQJhOHARxLd9VROfv8Fq8sAo+Ys8bP3PTXh5satjY5gR9YtmMSVqvsAVLn7lv
# a/0xp7wPJt2UeKzvRNUqFXNr7yHPwxFxbJbmmAJbNte8p+TfE2qvojbJnu7BjJbg
# sTtS/vFWNJwtuNYTkMRoiZaUKEoEZ8LnslOqKUjgeO59g4i3Dq8e2JCmHANPFWUK
# cWmr7AqcXgXEnLSDWTtfN53bjcSCYkFVb4WV4Wv1/7hUF5jQ4UR0l3B64xWe0M3/
# Prak3bWOM/o7JwLBsgaWPngXA9V0WFBTXVF4x5qTwhuR1sSV8MxUvTKxI+qqiEzA
# FjU89oSZ+zXId/hEUuTL6vn1Th8/6mwD0L9ORchNOQUKzCjBzI4MVPB09nM3AdPC
# LGThlufsZktdoU2KjMHpc+gMIXQYsxkgvm07K5iZTZ5eJ4tV5KB0aPvTZppGUxe1
# YY9og9F3hxjDHQtEuSY2rzBQI7nrUpd1ZI5ut/3ZgDWkqD6aGRtMme4n4GsGsYb2
# Ht9+d2RL9S8uPUh+7rV8K/N3+vXgXRaEYTuAScKtflEbA7YnZA5nUdMng8x0kMTQ
# Y73XCd4UGWDfSSZsgaIHGkM/MRIHgmlrfcwPkWqWW9vF+92O6Hw=
# =/Du0
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 06 Nov 2023 22:35:30 HKT
# gpg: using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@redhat.com>" [unknown]
# gpg: aka "Cédric Le Goater <clg@kaod.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1
* tag 'pull-vfio-20231106' of https://github.com/legoater/qemu: (22 commits)
vfio/common: Move vfio_host_win_add/del into spapr.c
vfio/spapr: Make vfio_spapr_create/remove_window static
vfio/container: Move spapr specific init/deinit into spapr.c
vfio/container: Move vfio_container_add/del_section_window into spapr.c
vfio/container: Move IBM EEH related functions into spapr_pci_vfio.c
util/uuid: Define UUID_STR_LEN from UUID_NONE string
util/uuid: Remove UUID_FMT_LEN
vfio/pci: Fix buffer overrun when writing the VF token
util/uuid: Add UUID_STR_LEN definition
hw/pci: modify pci_setup_iommu() to set PCIIOMMUOps
test: Add some tests for range and resv-mem helpers
virtio-iommu: Consolidate host reserved regions and property set ones
virtio-iommu: Implement set_iova_ranges() callback
virtio-iommu: Record whether a probe request has been issued
range: Introduce range_inverse_array()
virtio-iommu: Introduce per IOMMUDevice reserved regions
util/reserved-region: Add new ReservedRegion helpers
range: Make range_compare() public
virtio-iommu: Rename reserved_regions into prop_resv_regions
vfio: Collect container iova range info
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Stefan Hajnoczi [Tue, 7 Nov 2023 01:41:42 +0000 (09:41 +0800)]
Merge tag 'pull-hv-balloon-20231106' of https://github.com/maciejsszmigiero/qemu into staging
Hyper-V Dynamic Memory protocol driver.
This driver is like virtio-balloon on steroids for Windows guests:
it allows both changing the guest memory allocation via ballooning and
inserting pieces of extra RAM into it on demand from a provided memory
backend via Windows-native Hyper-V Dynamic Memory protocol.
* Preparatory patches to support empty memory devices and ones with
large alignment requirements.
* Revert of recently added "hw/virtio/virtio-pmem: Replace impossible
check by assertion" commit 5960f254dbb4 since this series makes this
situation possible again.
* Protocol definitions.
* Hyper-V DM protocol driver (hv-balloon) base (ballooning only).
Stefan Hajnoczi [Tue, 7 Nov 2023 01:41:24 +0000 (09:41 +0800)]
Merge tag 'pull-xenfv-stable-20231106' of git://git.infradead.org/users/dwmw2/qemu into staging
Bugfixes for emulated Xen support
Selected bugfixes for mainline and stable, especially to the per-vCPU
local APIC vector delivery mode for event channel notifications, which
was broken in a number of ways.
The xen-block driver has been defaulting to the wrong protocol for x86
guest, and this fixes that — which is technically an incompatible change
but I'm fairly sure nobody relies on the broken behaviour (and in
production I *have* seen guests which rely on the correct behaviour,
which now matches the blkback driver in the Linux kernel).
A handful of other simple fixes for issues which came to light as new
features (qv) were being developed.
# -----BEGIN PGP SIGNATURE-----
#
# iQJIBAABCAAyFiEEvgfZ/VSAmrLEsP9fY3Ys2mfi81kFAmVIvv4UHGR3bXcyQGlu
# ZnJhZGVhZC5vcmcACgkQY3Ys2mfi81nFmRAAvK3VNuGDV56TJqFdtEWD+3jzSZU0
# CoL1mxggvwnlFn1SdHvbC5jl+UscknErcNbqlxMTTg9jQiiQqzFuaWujJnL0dEOY
# RJiS2scKln/1gv9NRbLE31FjPwoNz+zJI/iMvdutjT7Ll//v34jY0vd1Y5Wo53ay
# MBschuuxD1sUUTHNj5f9afrgZaetJfgBSNZraiLR5T2HEadJVJuhItdGxW1+KaPI
# zBIcflIeZmJl9b/L1a2bP3KJmRo8QzHB56X3uzwkPhYhYSU2dnCaJTLCkiNfK+Qh
# SgCBMlzsvJbIZqDA9YPOGdKK1ArfTJRmRDwAkqH0YQknQGoIkpN+7eQiiSv6PMS5
# U/93V7r6MfaftIs6YdWSnFozWeBuyKZL9H2nAXqZgL5t6uEMVR8Un/kFnGfslTFY
# 9gQ1o4IM6ECLiXhIP/sPNOprrbFb0HU7QPtEDJOxrJzBM+IfLbldRHn4p9CccqQA
# LHvJF98VhX1d0nA0iZBT3qqfKPbmUhRV9Jrm+WamqNrRXhiGdF8EidsUf8RWX+JD
# xZWJiqhTwShxdLE6TC/JgFz4cQCVHG8QiZstZUbdq59gtz9YO5PGByMgI3ds7iNQ
# lGXAPFm+1wU85W4dZOH7qyim6d9ytFm2Fm110BKM8l9B6UKEuKHpsxXMqdo65JXI
# 7uBKbVpdPKul0DY=
# =dQ7h
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 06 Nov 2023 18:25:02 HKT
# gpg: using RSA key BE07D9FD54809AB2C4B0FF5F63762CDA67E2F359
# gpg: issuer "dwmw2@infradead.org"
# gpg: Good signature from "David Woodhouse <dwmw2@infradead.org>" [unknown]
# gpg: aka "David Woodhouse <dwmw2@exim.org>" [unknown]
# gpg: aka "David Woodhouse <david@woodhou.se>" [unknown]
# gpg: aka "David Woodhouse <dwmw2@kernel.org>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: BE07 D9FD 5480 9AB2 C4B0 FF5F 6376 2CDA 67E2 F359
* tag 'pull-xenfv-stable-20231106' of git://git.infradead.org/users/dwmw2/qemu:
hw/xen: use correct default protocol for xen-block on x86
hw/xen: take iothread mutex in xen_evtchn_reset_op()
hw/xen: fix XenStore watch delivery to guest
hw/xen: don't clear map_track[] in xen_gnttab_reset()
hw/xen: select kernel mode for per-vCPU event channel upcall vector
i386/xen: fix per-vCPU upcall vector for Xen emulation
i386/xen: Don't advertise XENFEAT_supervisor_mode_kernel
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* tag 'q800-for-8.2-pull-request' of https://github.com/vivier/qemu-m68k:
macfb: allow reads from the DAFB_LUT register
macfb: allow larger write accesses to the DAFB_LUT register
macfb: rename DAFB_RESET to DAFB_LUT_INDEX
macfb: don't clear interrupts when writing to DAFB_RESET
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Naohiro Aota [Mon, 30 Oct 2023 07:38:53 +0000 (16:38 +0900)]
file-posix: fix over-writing of returning zone_append offset
raw_co_zone_append() sets "s->offset" where "BDRVRawState *s". This pointer
is used later at raw_co_prw() to save the block address where the data is
written.
When multiple IOs are on-going at the same time, a later IO's
raw_co_zone_append() call over-writes a former IO's offset address before
raw_co_prw() completes. As a result, the former zone append IO returns the
initial value (= the start address of the writing zone), instead of the
proper address.
Fix the issue by passing the offset pointer to raw_co_prw() instead of
passing it through s->offset. Also, remove "offset" from BDRVRawState as
there is no usage anymore.
Fixes: 4751d09adcc3 ("block: introduce zone append write for zoned devices") Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Message-Id: <20231030073853.2601162-1-naohiro.aota@wdc.com> Reviewed-by: Sam Li <faithilikerun@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Sam Li [Fri, 25 Aug 2023 04:05:56 +0000 (12:05 +0800)]
block/file-posix: fix update_zones_wp() caller
When the zoned request fail, it needs to update only the wp of
the target zones for not disrupting the in-flight writes on
these other zones. The wp is updated successfully after the
request completes.
Fixed the callers with right offset and nr_zones.
Signed-off-by: Sam Li <faithilikerun@gmail.com>
Message-Id: <20230825040556.4217-1-faithilikerun@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[hreitz: Rebased and fixed comment spelling] Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
qcow2: keep reference on zeroize with discard-no-unref enabled
When the discard-no-unref flag is enabled, we keep the reference for
normal discard requests.
But when a discard is executed on a snapshot/qcow2 image with backing,
the discards are saved as zero clusters in the snapshot image.
When committing the snapshot to the backing file, not
discard_in_l2_slice is called but zero_in_l2_slice. Which did not had
any logic to keep the reference when discard-no-unref is enabled.
Therefor we add logic in the zero_in_l2_slice call to keep the reference
on commit.
Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1621 Signed-off-by: Jean-Louis Dupond <jean-louis@dupond.be>
Message-Id: <20231003125236.216473-2-jean-louis@dupond.be>
[hreitz: Made the documentation change more verbose, as discussed
on-list] Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Peter Maydell [Mon, 6 Nov 2023 15:00:29 +0000 (15:00 +0000)]
target/arm: Fix A64 LDRA immediate decode
In commit be23a049 in the conversion to decodetree we broke the
decoding of the immediate value in the LDRA instruction. This should
be a 10 bit signed value that is scaled by 8, but in the conversion
we incorrectly ended up scaling it only by 2. Fix the scaling
factor.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1970 Fixes: be23a049 ("target/arm: Convert load (pointer auth) insns to decodetree") Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20231106113445.1163063-1-peter.maydell@linaro.org
The flash "wins" and the RAM mapping is useless (but also harmless).
This happened as a result of commit 6ec1588e in 2014, which changed
"we always map the RAM to the low addresses for vexpress-a9" to "we
always map flash in the low addresses", but forgot to stop mapping
the RAM.
In real hardware, this low part of memory is remappable, both at
runtime by the guest writing to a control register, and configurably
as to what you get out of reset -- you can have the first flash
device, or the second, or the DDR2 RAM, or the external AXI bus
(which for QEMU means "nothing there"). In an ideal world we would
support that remapping both at runtime and via a machine property to
select the out-of-reset behaviour.
Pending anybody caring enough to implement the full remapping
behaviour:
* remove the useless mapped-but-inaccessible lowram MR
* document that QEMU doesn't support remapping of low memory
Fixes: 6ec1588e ("hw/arm/vexpress: Alias NOR flash at 0 for vexpress-a9")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1761 Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20231103185602.875849-1-peter.maydell@linaro.org
For SO_EE_ORIGIN_ZEROCOPY the 32-bit notification range is encoded
as [ee_info, ee_data] inclusively, so ee_info should be less or
equal to ee_data.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Maksim Davydov <davydov-max@yandex-team.ru>
Message-id: 20231017125941.810461-7-vsementsov@yandex-team.ru Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Coverity signals that variable as being used uninitialized. And really,
when work with external APIs that's better to zero out the structure,
where we set some fields by hand.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Maksim Davydov <davydov-max@yandex-team.ru>
Message-id: 20231017125941.810461-6-vsementsov@yandex-team.ru Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
block/nvme: nvme_process_completion() fix bound for cid
NVMeQueuePair::reqs has length NVME_NUM_REQS, which less than
NVME_QUEUE_SIZE by 1.
Fixes: 1086e95da17050 ("block/nvme: switch to a NVMeRequest freelist") Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Maksim Davydov <davydov-max@yandex-team.ru>
Message-id: 20231017125941.810461-5-vsementsov@yandex-team.ru Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
mc146818rtc: rtc_set_time(): initialize tm to zeroes
set_time() function doesn't set all the fields, so it's better to
initialize tm structure. And Coverity will be happier about it.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Maksim Davydov <davydov-max@yandex-team.ru>
Message-id: 20231017125941.810461-4-vsementsov@yandex-team.ru Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
util/filemonitor-inotify: qemu_file_monitor_watch(): assert no overflow
Prefer clear assertions instead of [im]possible array overflow.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Maksim Davydov <davydov-max@yandex-team.ru>
Message-id: 20231017125941.810461-3-vsementsov@yandex-team.ru Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
hw/i386/intel_iommu: vtd_slpte_nonzero_rsvd(): assert no overflow
We support only 3- and 4-level page-tables, which is firstly checked in
vtd_decide_config(), then setup in vtd_init(). Than level fields are
checked by vtd_is_level_supported().
So here we can't have level out from 1..4 inclusive range. Let's assert
it. That also explains Coverity that we are not going to overflow the
array.
CID: 1487158, 1487186 Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Maksim Davydov <davydov-max@yandex-team.ru>
Message-id: 20231017125941.810461-2-vsementsov@yandex-team.ru Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Udo Steinberg [Mon, 6 Nov 2023 15:00:26 +0000 (15:00 +0000)]
hw/arm/virt: Report correct register sizes in ACPI DBG2/SPCR tables.
Documentation for using the GAS in ACPI tables to report debug UART addresses at
https://learn.microsoft.com/en-us/windows-hardware/drivers/bringup/acpi-debug-port-table
states the following:
- The Register Bit Width field contains the register stride and must be a
power of 2 that is at least as large as the access size. On 32-bit
platforms this value cannot exceed 32. On 64-bit platforms this value
cannot exceed 64.
- The Access Size field is used to determine whether byte, WORD, DWORD, or
QWORD accesses are to be used. QWORD accesses are only valid on 64-bit
architectures.
Documentation for the ARM PL011 at
https://developer.arm.com/documentation/ddi0183/latest/
states that the registers are:
- spaced 4 bytes apart (see Table 3-2), so register stride must be 32.
- 16 bits in size in some cases (see individual registers), so access
size must be at least 2.
Linux doesn't seem to care about this error in the table, but it does
affect at least the NOVA microhypervisor.
In theory we therefore have a choice between reporting the access
size as 2 (16 bit accesses) or 3 (32-bit accesses). In practice,
Linux does not correctly handle the case where the table reports the
access size as 2: as of kernel commit 750b95887e5678, the code in
acpi_parse_spcr() tries to tell the serial driver to use 16 bit
accesses by passing "mmio16" in the option string, but the PL011
driver code in pl011_console_match() only recognizes "mmio" or
"mmio32". The result is that unless the user has enabled 'earlycon'
there is no console output from the guest kernel.
We therefore choose to report the access size as 32 bits; this works
for NOVA and also for Linux. It is also what the UEFI firmware on a
Raspberry Pi 4 reports, so we're in line with existing real-world
practice.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1938 Signed-off-by: Udo Steinberg <udo@hypervisor.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: minor commit message tweaks; use 32 bit accesses] Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Sebastian Ott [Mon, 6 Nov 2023 15:00:26 +0000 (15:00 +0000)]
hw/arm/virt: fix PMU IRQ registration
Since commit 9036e917f8 ("{include/}hw/arm: refactor virt PPI logic")
PMU IRQ registration fails for arm64 guests:
[ 0.563689] hw perfevents: unable to request IRQ14 for ARM PMU counters
[ 0.565160] armv8-pmu: probe of pmu failed with error -22
That commit re-defined VIRTUAL_PMU_IRQ to be a INTID but missed a case
where the PMU IRQ is actually referred by its PPI index. Fix that by using
INTID_TO_PPI() in that case.
Fixes: 9036e917f8 ("{include/}hw/arm: refactor virt PPI logic")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1960 Signed-off-by: Sebastian Ott <sebott@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 475d918d-ab0e-f717-7206-57a5beb28c7b@redhat.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add the necessary plumbing for the hv-balloon driver to the PC machine.
Co-developed-by: David Hildenbrand <david@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
qapi: Add query-memory-devices support to hv-balloon
Used by the driver to report its provided memory state information.
Co-developed-by: David Hildenbrand <david@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Add Hyper-V Dynamic Memory Protocol driver (hv-balloon) hot-add support
One of advantages of using this protocol over ACPI-based PC DIMM hotplug is
that it allows hot-adding memory in much smaller granularity because the
ACPI DIMM slot limit does not apply.
In order to enable this functionality a new memory backend needs to be
created and provided to the driver via the "memdev" parameter.
This can be achieved by, for example, adding
"-object memory-backend-ram,id=mem1,size=32G" to the QEMU command line and
then instantiating the driver with "memdev=mem1" parameter.
The device will try to use multiple memslots to cover the memory backend in
order to reduce the size of metadata for the not-yet-hot-added part of the
memory backend.
Co-developed-by: David Hildenbrand <david@redhat.com> Acked-by: David Hildenbrand <david@redhat.com> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Add Hyper-V Dynamic Memory Protocol driver (hv-balloon) base
This driver is like virtio-balloon on steroids: it allows both changing the
guest memory allocation via ballooning and (in the next patch) inserting
pieces of extra RAM into it on demand from a provided memory backend.
The actual resizing is done via ballooning interface (for example, via
the "balloon" HMP command).
This includes resizing the guest past its boot size - that is, hot-adding
additional memory in granularity limited only by the guest alignment
requirements, as provided by the next patch.
In contrast with ACPI DIMM hotplug where one can only request to unplug a
whole DIMM stick this driver allows removing memory from guest in single
page (4k) units via ballooning.
After a VM reboot the guest is back to its original (boot) size.
In the future, the guest boot memory size might be changed on reboot
instead, taking into account the effective size that VM had before that
reboot (much like Hyper-V does).
For performance reasons, the guest-released memory is tracked in a few
range trees, as a series of (start, count) ranges.
Each time a new page range is inserted into such tree its neighbors are
checked as candidates for possible merging with it.
Besides performance reasons, the Dynamic Memory protocol itself uses page
ranges as the data structure in its messages, so relevant pages need to be
merged into such ranges anyway.
One has to be careful when tracking the guest-released pages, since the
guest can maliciously report returning pages outside its current address
space, which later clash with the address range of newly added memory.
Similarly, the guest can report freeing the same page twice.
The above design results in much better ballooning performance than when
using virtio-balloon with the same guest: 230 GB / minute with this driver
versus 70 GB / minute with virtio-balloon.
During a ballooning operation most of time is spent waiting for the guest
to come up with newly freed page ranges, processing the received ranges on
the host side (in QEMU and KVM) is nearly instantaneous.
The unballoon operation is also pretty much instantaneous:
thanks to the merging of the ballooned out page ranges 200 GB of memory can
be returned to the guest in about 1 second.
With virtio-balloon this operation takes about 2.5 minutes.
These tests were done against a Windows Server 2019 guest running on a
Xeon E5-2699, after dirtying the whole memory inside guest before each
balloon operation.
Using a range tree instead of a bitmap to track the removed memory also
means that the solution scales well with the guest size: even a 1 TB range
takes just a few bytes of such metadata.
Since the required GTree operations aren't present in every Glib version
a check for them was added to the meson build script, together with new
"--enable-hv-balloon" and "--disable-hv-balloon" configure arguments.
If these GTree operations are missing in the system's Glib version this
driver will be skipped during QEMU build.
An optional "status-report=on" device parameter requests memory status
events from the guest (typically sent every second), which allow the host
to learn both the guest memory available and the guest memory in use
counts.
Following commits will add support for their external emission as
"HV_BALLOON_STATUS_REPORT" QMP events.
The driver is named hv-balloon since the Linux kernel client driver for
the Dynamic Memory Protocol is named as such and to follow the naming
pattern established by the virtio-balloon driver.
The whole protocol runs over Hyper-V VMBus.
The driver was tested against Windows Server 2012 R2, Windows Server 2016
and Windows Server 2019 guests and obeys the guest alignment requirements
reported to the host via DM_CAPABILITIES_REPORT message.
Acked-by: David Hildenbrand <david@redhat.com> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
This commit adds Hyper-V Dynamic Memory Protocol definitions, taken
from hv_balloon Linux kernel driver, adapted to the QEMU coding style and
definitions.
Acked-by: David Hildenbrand <david@redhat.com> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Zhenzhong Duan [Thu, 2 Nov 2023 07:12:26 +0000 (15:12 +0800)]
vfio/common: Move vfio_host_win_add/del into spapr.c
Only spapr supports a customed host window list, other vfio driver
assume 64bit host window. So remove the check in listener callback
and move vfio_host_win_add/del into spapr.c and make it static.
With the check removed, we still need to do the same check for
VFIO_SPAPR_TCE_IOMMU which allows a single host window range
[dma32_window_start, dma32_window_size). Move vfio_find_hostwin
into spapr.c and do same check in vfio_container_add_section_window
instead.
When mapping a ram device section, if it's unaligned with
hostwin->iova_pgsizes, this mapping is bypassed. With hostwin
moved into spapr, we changed to check container->pgsizes.
Suggested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Zhenzhong Duan [Thu, 2 Nov 2023 07:12:24 +0000 (15:12 +0800)]
vfio/container: Move spapr specific init/deinit into spapr.c
Move spapr specific init/deinit code into spapr.c and wrap
them with vfio_spapr_container_init/deinit, this way footprint
of spapr is further reduced, vfio_prereg_listener could also
be made static.
vfio_listener_release is unnecessary when prereg_listener is
moved out, so have it removed.
No functional changes intended.
Suggested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
BALATON Zoltan [Wed, 1 Nov 2023 20:45:38 +0000 (21:45 +0100)]
ati-vga: Add 30 bit palette access register
Radeon cards have a 30 bit DAC and corresponding palette register to
access it. We only use 8 bits but let the guests use 10 bit color
values for those that access it through this register.
BALATON Zoltan [Wed, 1 Nov 2023 20:45:37 +0000 (21:45 +0100)]
ati-vga: Support unaligned access to GPIO DDC registers
The GPIO_VGA_DDC and GPIO_DVI_DDC registers are used on Radeon for DDC
access. Some drivers like the PPC Mac FCode ROM uses unaligned writes
to these registers so implement this the same way as already done for
GPIO_MONID which is used the same way for the Rage 128 Pro.
BALATON Zoltan [Wed, 1 Nov 2023 20:45:36 +0000 (21:45 +0100)]
ati-vga: Fix aperture sizes
Apparently these should be half the memory region sizes confirmed at
least by Radeon FCocde ROM while Rage 128 Pro ROMs don't seem to use
these. Linux r100 DRM driver also checks for a bit in HOST_PATH_CNTL
so we also add that even though the FCode ROM does not seem to set it.
Cong Liu [Tue, 31 Oct 2023 01:25:15 +0000 (09:25 +0800)]
virtio-gpu-rutabaga: Add empty interface to fix arm64 crash
Add an empty element to the interfaces array, which is consistent with
the behavior of other devices in qemu and fixes the crash on arm64.
0 0x0000fffff5c18550 in () at /usr/lib64/libc.so.6
1 0x0000fffff6c9cd6c in g_strdup () at /usr/lib64/libglib-2.0.so.0
2 0x0000aaaaab4945d8 in g_strdup_inline (str=<optimized out>) at /usr/include/glib-2.0/glib/gstrfuncs.h:321
3 type_new (info=info@entry=0xaaaaabc1b2c8 <virtio_gpu_rutabaga_pci_info>) at ../qom/object.c:133
4 0x0000aaaaab494f14 in type_register_internal (info=0xaaaaabc1b2c8 <virtio_gpu_rutabaga_pci_info>) at ../qom/object.c:143
5 type_register (info=0xaaaaabc1b2c8 <virtio_gpu_rutabaga_pci_info>) at ../qom/object.c:152
6 type_register_static (info=0xaaaaabc1b2c8 <virtio_gpu_rutabaga_pci_info>) at ../qom/object.c:157
7 type_register_static_array (infos=<optimized out>, nr_infos=<optimized out>) at ../qom/object.c:165
8 0x0000aaaaab6147e8 in module_call_init (type=type@entry=MODULE_INIT_QOM) at ../util/module.c:109
9 0x0000aaaaab10a0ec in qemu_init_subsystems () at ../system/runstate.c:817
10 0x0000aaaaab10d334 in qemu_init (argc=13, argv=0xfffffffff198) at ../system/vl.c:2760
11 0x0000aaaaaae4da6c in main (argc=<optimized out>, argv=<optimized out>) at ../system/main.c:47
David Woodhouse [Tue, 24 Oct 2023 21:22:47 +0000 (22:22 +0100)]
hw/xen: take iothread mutex in xen_evtchn_reset_op()
The xen_evtchn_soft_reset() function requires the iothread mutex, but is
also called for the EVTCHNOP_reset hypercall. Ensure the mutex is taken
in that case.
Cc: qemu-stable@nongnu.org Fixes: a15b10978fe6 ("hw/xen: Implement EVTCHNOP_reset") Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
David Woodhouse [Tue, 17 Oct 2023 12:34:18 +0000 (13:34 +0100)]
hw/xen: fix XenStore watch delivery to guest
When fire_watch_cb() found the response buffer empty, it would call
deliver_watch() to generate the XS_WATCH_EVENT message in the response
buffer and send an event channel notification to the guest… without
actually *copying* the response buffer into the ring. So there was
nothing for the guest to see. The pending response didn't actually get
processed into the ring until the guest next triggered some activity
from its side.
Add the missing call to put_rsp().
It might have been slightly nicer to call xen_xenstore_event() here,
which would *almost* have worked. Except for the fact that it calls
xen_be_evtchn_pending() to check that it really does have an event
pending (and clear the eventfd for next time). And under Xen it's
defined that setting that fd to O_NONBLOCK isn't guaranteed to work,
so the emu implementation follows suit.
This fixes Xen device hot-unplug.
Cc: qemu-stable@nongnu.org Fixes: 0254c4d19df ("hw/xen: Add xenstore wire implementation and implementation stubs") Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
David Woodhouse [Wed, 18 Oct 2023 12:31:20 +0000 (13:31 +0100)]
hw/xen: don't clear map_track[] in xen_gnttab_reset()
The refcounts actually correspond to 'active_ref' structures stored in a
GHashTable per "user" on the backend side (mostly, per XenDevice).
If we zero map_track[] on reset, then when the backend drivers get torn
down and release their mapping we hit the assert(s->map_track[ref] != 0)
in gnt_unref().
So leave them in place. Each backend driver will disconnect and reconnect
as the guest comes back up again and reconnects, and it all works out OK
in the end as the old refs get dropped.
Cc: qemu-stable@nongnu.org Fixes: de26b2619789 ("hw/xen: Implement soft reset for emulated gnttab") Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
David Woodhouse [Wed, 11 Oct 2023 23:06:26 +0000 (00:06 +0100)]
hw/xen: select kernel mode for per-vCPU event channel upcall vector
A guest which has configured the per-vCPU upcall vector may set the
HVM_PARAM_CALLBACK_IRQ param to fairly much anything other than zero.
For example, Linux v6.0+ after commit b1c3497e604 ("x86/xen: Add support
for HVMOP_set_evtchn_upcall_vector") will just do this after setting the
vector:
/* Trick toolstack to think we are enlightened. */
if (!cpu)
rc = xen_set_callback_via(1);
That's explicitly setting the delivery to GSI#1, but it's supposed to be
overridden by the per-vCPU vector setting. This mostly works in Qemu
*except* for the logic to enable the in-kernel handling of event channels,
which falsely determines that the kernel cannot accelerate GSI delivery
in this case.
Add a kvm_xen_has_vcpu_callback_vector() to report whether vCPU#0 has
the vector set, and use that in xen_evtchn_set_callback_param() to
enable the kernel acceleration features even when the param *appears*
to be set to target a GSI.
Preserve the Xen behaviour that when HVM_PARAM_CALLBACK_IRQ is set to
*zero* the event channel delivery is disabled completely. (Which is
what that bizarre guest behaviour is working round in the first place.)
Cc: qemu-stable@nongnu.org Fixes: 91cce756179 ("hw/xen: Add xen_evtchn device for event channel emulation") Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
David Woodhouse [Wed, 11 Oct 2023 22:30:08 +0000 (23:30 +0100)]
i386/xen: fix per-vCPU upcall vector for Xen emulation
The per-vCPU upcall vector support had three problems. Firstly it was
using the wrong hypercall argument and would always return -EFAULT when
the guest tried to set it up. Secondly it was using the wrong ioctl() to
pass the vector to the kernel and thus the *kernel* would always return
-EINVAL. Finally, even when delivering the event directly from userspace
with an MSI, it put the destination CPU ID into the wrong bits of the
MSI address.
Linux doesn't (yet) use this mode so it went without decent testing
for a while.
Cc: qemu-stable@nongnu.org Fixes: 105b47fdf2d0 ("i386/xen: implement HVMOP_set_evtchn_upcall_vector") Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
* tag 'pull-sp-20231105' of https://gitlab.com/rth7680/qemu: (21 commits)
target/sparc: Check for invalid cond in gen_compare_reg
target/sparc: Implement UDIV inline
target/sparc: Implement UDIVX and SDIVX inline
target/sparc: Discard cpu_cond at the end of each insn
target/sparc: Record entire jump condition in DisasContext
target/sparc: Merge gen_op_next_insn into only caller
target/sparc: Pass displacement to advance_jump_cond
target/sparc: Merge advance_jump_uncond_{never,always} into advance_jump_cond
target/sparc: Merge gen_branch2 into advance_pc
target/sparc: Do flush_cond in advance_jump_cond
target/sparc: Always copy conditions into a new temporary
target/sparc: Change DisasCompare.c2 to int
target/sparc: Remove DisasCompare.is_bool
target/sparc: Remove CC_OP leftovers
target/sparc: Remove CC_OP_TADDTV, CC_OP_TSUBTV
target/sparc: Remove CC_OP_SUB, CC_OP_SUBX, CC_OP_TSUB
target/sparc: Remove CC_OP_ADD, CC_OP_ADDX, CC_OP_TADD
target/sparc: Remove CC_OP_DIV
target/sparc: Remove CC_OP_LOGIC
target/sparc: Split psr and xcc into components
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* tag 'migration-20231103-pull-request' of https://gitlab.com/juan.quintela/qemu:
migration: Unlock mutex in error case
docs/migration: Add the dirty limit section
tests/migration: Introduce dirty-limit into guestperf
tests/migration: Introduce dirty-ring-size option into guestperf
tests: Add migration dirty-limit capability test
system/dirtylimit: Drop the reduplicative check
system/dirtylimit: Fix a race situation
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Stefan Hajnoczi [Mon, 6 Nov 2023 00:36:47 +0000 (08:36 +0800)]
Merge tag 'dump-pull-request' of https://gitlab.com/marcandre.lureau/qemu into staging
dump queue
Hi
The "dump" queue, with:
- [PATCH v3 qemu 0/3] Allow dump-guest-memory to output standard kdump format
- [PATCH v2 0/5] dump: Minor fixes & improvements
* tag 'dump-pull-request' of https://gitlab.com/marcandre.lureau/qemu:
dump: Drop redundant check for empty dump
dump: Improve some dump-guest-memory error messages
dump: Recognize "fd:" protocols on Windows hosts
dump: Fix g_array_unref(NULL) in dump-guest-memory
dump: Rename qmp_dump_guest_memory() parameter to match QAPI schema
dump: Add command interface for kdump-raw formats
dump: Allow directly outputting raw kdump format
dump: Pass DumpState to write_ functions
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iLMEAAEKAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCZUSQIgAKCRBAov/yOSY+
# 31aIBADj5FzdUxyFB813SouAiEiyMdI4bN98AunomAk3Kt8PF1XPoP8kPzcjxcMI
# kCW4eoHb12MVs9OclkqFY3VyaxtSD3YSG/h8W9YxaDyU+L/q89RS+J4r6CAZ8ylg
# J4uxs3Lv8nwPEvRb4zITAt8JQllLey1100j/uu4fU0Rx7vUcMA==
# =9RMx
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 03 Nov 2023 14:16:02 HKT
# gpg: using RSA key B8FF1DA0D2FDCB2DA09C6C2C40A2FFF239263EDF
# gpg: Good signature from "Song Gao <m17746591750@163.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B8FF 1DA0 D2FD CB2D A09C 6C2C 40A2 FFF2 3926 3EDF
* tag 'pull-loongarch-20231103' of https://gitlab.com/gaosong/qemu:
linux-user/loongarch64: Add LASX sigcontext save/restore
linux-user/loongarch64: Add LSX sigcontext save/restore
linux-user/loongarch64: Use abi_{ulong,uint} types
linux-user/loongarch64: setup_sigframe() set 'end' context size 0
linux-user/loongarch64: Fix setup_extcontext alloc wrong fpu_context size
linux-user/loongarch64: Use traps to track LSX/LASX usage
target/loongarch: Support 4K page size
target/loongarch: Implement query-cpu-model-expansion
target/loongarch: Allow user enable/disable LSX/LASX features
target/loongarch: Add cpu model 'max'
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
target/sparc: Check for invalid cond in gen_compare_reg
Consolidate the test here; drop the "inverted logic".
Fix MOVr and FMOVR, which were missing the invalid test.
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
target/sparc: Discard cpu_cond at the end of each insn
If the insn raises no exceptions, there will be no path in which
cpu_cond is used, and so the computation may be optimized away.
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
target/sparc: Record entire jump condition in DisasContext
Use the original condition instead of consuming cpu_cond,
which will now only be live along exception paths.
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
target/sparc: Merge gen_op_next_insn into only caller
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
target/sparc: Pass displacement to advance_jump_cond
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
target/sparc: Merge advance_jump_uncond_{never,always} into advance_jump_cond
Handle these via TCG_COND_{ALWAYS,NEVER}.
Allow dc->npc to be variable, using gen_mov_pc_npc.
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The function had only one caller. Canonicalize the cpu_cond
test to TCG_COND_NE, the "natural" sense of its value.
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
target/sparc: Always copy conditions into a new temporary
This will allow the condition to live across changes to
the global cc variables.
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We don't require c2 to be variable, so emphasize that.
We don't currently require c2 to be non-zero, but that will change.
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Since we're going to feed cpu_cond to another comparison, we don't
reqire a boolean value -- anything non-zero is sufficient.
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
All instructions have been converted to generate
full condition codes explicitly.
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These are all related and implementable with common code.
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These are all related and implementable with common code.
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Return both result and overflow from helper_[us]div.
Compute all flags explicitly in gen_op_[us]divcc.
Marginally improve the INT64_MIN special case in helper_sdiv.
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Step in removing CC_OP: change the representation of CC_OP_FLAGS.
The 8 bits are distributed between 6 variables, which should make
it easy to keep up to date.
The code within cc_helper.c is quite ugly but is only temporary.
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Isolate linux-user from changes to icc representation.
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Mark Cave-Ayland [Thu, 26 Oct 2023 08:56:49 +0000 (09:56 +0100)]
macfb: allow larger write accesses to the DAFB_LUT register
The original tests with MacOS showed that only the bottom 8 bits of the DAFB_LUT
register were used when writing to the LUT, however A/UX performs some of its
writes using 4 byte accesses. Expand the address range for the DAFB_LUT register
so that different size accesses write the correct value to the color_palette
array.
Mark Cave-Ayland [Thu, 26 Oct 2023 08:56:48 +0000 (09:56 +0100)]
macfb: rename DAFB_RESET to DAFB_LUT_INDEX
When A/UX uses the MacOS Device Manager Status (GetEntries) call to read the
contents of the CLUT, it is easy to see that the requested index is written to
the DAFB_RESET register. Update the palette_current index with the requested
value, and rename it to DAFB_LUT_INDEX to reflect its true purpose.
Let's support empty memory devices -- memory devices that don't have a
memory device region in the current configuration. hv-balloon with an
optional memdev is the primary use case.
Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Cédric Le Goater [Thu, 26 Oct 2023 07:06:35 +0000 (09:06 +0200)]
vfio/pci: Fix buffer overrun when writing the VF token
qemu_uuid_unparse() includes a trailing NUL when writing the uuid
string and the buffer size should be UUID_FMT_LEN + 1 bytes. Use the
recently added UUID_STR_LEN which defines the correct size.
Fixes: CID 1522913 Fixes: 2dca1b37a760 ("vfio/pci: add support for VF token") Cc: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: "Denis V. Lunev" <den@openvz.org> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Cédric Le Goater [Thu, 26 Oct 2023 07:06:34 +0000 (09:06 +0200)]
util/uuid: Add UUID_STR_LEN definition
qemu_uuid_unparse() includes a trailing NUL when writing the uuid
string and the buffer size should be UUID_FMT_LEN + 1 bytes. Add a
define for this size and use it where required.
Cc: Fam Zheng <fam@euphon.net> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: "Denis V. Lunev" <den@openvz.org> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Yi Liu [Tue, 17 Oct 2023 16:14:04 +0000 (18:14 +0200)]
hw/pci: modify pci_setup_iommu() to set PCIIOMMUOps
This patch modifies pci_setup_iommu() to set PCIIOMMUOps
instead of setting PCIIOMMUFunc. PCIIOMMUFunc is used to
get an address space for a PCI device in vendor specific
way. The PCIIOMMUOps still offers this functionality. But
using PCIIOMMUOps leaves space to add more iommu related
vendor specific operations.
Cc: Kevin Tian <kevin.tian@intel.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Peter Xu <peterx@redhat.com> Cc: Eric Auger <eric.auger@redhat.com> Cc: Yi Sun <yi.y.sun@linux.intel.com> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Eric Auger <eric.auger@redhat.com> Cc: Peter Maydell <peter.maydell@linaro.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Andrey Smirnov <andrew.smirnov@gmail.com> Cc: Helge Deller <deller@gmx.de> Cc: Hervé Poussineau <hpoussin@reactos.org> Cc: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Cc: BALATON Zoltan <balaton@eik.bme.hu> Cc: Elena Ufimtseva <elena.ufimtseva@oracle.com> Cc: Jagannathan Raman <jag.raman@oracle.com> Cc: Matthew Rosato <mjrosato@linux.ibm.com> Cc: Eric Farman <farman@linux.ibm.com> Cc: Halil Pasic <pasic@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Thomas Huth <thuth@redhat.com> Cc: Helge Deller <deller@gmx.de> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Eric Auger <eric.auger@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com>
[ clg: - refreshed on latest QEMU
- included hw/remote/iommu.c
- documentation update
- asserts in pci_setup_iommu()
- removed checks on iommu_bus->iommu_ops->get_address_space
- included Elroy PCI host (PA-RISC) ] Signed-off-by: Cédric Le Goater <clg@redhat.com>
Eric Auger [Thu, 19 Oct 2023 13:45:18 +0000 (15:45 +0200)]
test: Add some tests for range and resv-mem helpers
Add unit tests for both resv_region_list_insert() and
range_inverse_array().
Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: "Michael S. Tsirkin" <mst@redhat.com>
[ clg: Removal of unused variable in compare_ranges() ] Signed-off-by: Cédric Le Goater <clg@redhat.com>
Eric Auger [Thu, 19 Oct 2023 13:45:17 +0000 (15:45 +0200)]
virtio-iommu: Consolidate host reserved regions and property set ones
Up to now we were exposing to the RESV_MEM probe requests the
reserved memory regions set though the reserved-regions array property.
Combine those with the host reserved memory regions if any. Those
latter are tagged as RESERVED. We don't have more information about
them besides then cannot be mapped. Reserved regions set by
property have higher priority.
Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: "Michael S. Tsirkin" <mst@redhat.com> Tested-by: Yanghang Liu <yanghliu@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
The implementation populates the array of per IOMMUDevice
host reserved ranges.
It is forbidden to have conflicting sets of host IOVA ranges
to be applied onto the same IOMMU MR (implied by different
host devices).
In case the callback is called after the probe request has
been issues by the driver, a warning is issued.
Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: "Michael S. Tsirkin" <mst@redhat.com> Tested-by: Yanghang Liu <yanghliu@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Eric Auger [Thu, 19 Oct 2023 13:45:14 +0000 (15:45 +0200)]
range: Introduce range_inverse_array()
This helper reverses a list of regions within a [low, high]
span, turning original regions into holes and original
holes into actual regions, covering the whole UINT64_MAX span.
Signed-off-by: Eric Auger <eric.auger@redhat.com> Tested-by: Yanghang Liu <yanghliu@redhat.com> Reviewed-by: "Michael S. Tsirkin" <mst@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Eric Auger [Thu, 19 Oct 2023 13:45:13 +0000 (15:45 +0200)]
virtio-iommu: Introduce per IOMMUDevice reserved regions
For the time being the per device reserved regions are
just a duplicate of IOMMU wide reserved regions. Subsequent
patches will combine those with host reserved regions, if any.
Signed-off-by: Eric Auger <eric.auger@redhat.com> Tested-by: Yanghang Liu <yanghliu@redhat.com> Reviewed-by: "Michael S. Tsirkin" <mst@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Eric Auger [Thu, 19 Oct 2023 13:45:12 +0000 (15:45 +0200)]
util/reserved-region: Add new ReservedRegion helpers
Introduce resv_region_list_insert() helper which inserts
a new ReservedRegion into a sorted list of reserved region.
In case of overlap, the new region has higher priority and
hides the existing overlapped segments. If the overlap is
partial, new regions are created for parts which are not
overlapped. The new region has higher priority independently
on the type of the regions.
Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Tested-by: Yanghang Liu <yanghliu@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Eric Auger [Thu, 19 Oct 2023 13:45:11 +0000 (15:45 +0200)]
range: Make range_compare() public
Let's expose range_compare() in the header so that it can be
reused outside of util/range.c
Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Eric Auger [Thu, 19 Oct 2023 13:45:10 +0000 (15:45 +0200)]
virtio-iommu: Rename reserved_regions into prop_resv_regions
Rename VirtIOIOMMU (nb_)reserved_regions fields with the "prop_" prefix
to highlight those fields are set through a property, at machine level.
They are IOMMU wide.
A subsequent patch will introduce per IOMMUDevice reserved regions
that will include both those IOMMU wide property reserved
regions plus, sometimes, host reserved regions, if the device is
backed by a host device protected by a physical IOMMU. Also change
nb_ prefix by nr_.
Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: "Michael S. Tsirkin" <mst@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Eric Auger [Thu, 19 Oct 2023 13:45:09 +0000 (15:45 +0200)]
vfio: Collect container iova range info
Collect iova range information if VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE
capability is supported.
This allows to propagate the information though the IOMMU MR
set_iova_ranges() callback so that virtual IOMMUs
get aware of those aperture constraints. This is only done if
the info is available and the number of iova ranges is greater than
0.
A new vfio_get_info_iova_range helper is introduced matching
the coding style of existing vfio_get_info_dma_avail. The
boolean returned value isn't used though. Code is aligned
between both.
Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Tested-by: Yanghang Liu <yanghliu@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
This helper will allow to convey information about valid
IOVA ranges to virtual IOMMUS.
Signed-off-by: Eric Auger <eric.auger@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: "Michael S. Tsirkin" <mst@redhat.com>
[ clg: fixes in memory_region_iommu_set_iova_ranges() and
iommu_set_iova_ranges() documentation ] Signed-off-by: Cédric Le Goater <clg@redhat.com>