Chenyi Qiang [Thu, 12 Jun 2025 08:27:43 +0000 (16:27 +0800)]
memory: Change memory_region_set_ram_discard_manager() to return the result
Modify memory_region_set_ram_discard_manager() to return -EBUSY if a
RamDiscardManager is already set in the MemoryRegion. The caller must
handle this failure, such as having virtio-mem undo its actions and fail
the realize() process. Opportunistically move the call earlier to avoid
complex error handling.
This change is beneficial when introducing a new RamDiscardManager
instance besides virtio-mem. After
ram_block_coordinated_discard_require(true) unlocks all
RamDiscardManager instances, only one instance is allowed to be set for
one MemoryRegion at present.
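A minimal sketch of the new contract, using simplified stand-in types rather than QEMU's real MemoryRegion and RamDiscardManager definitions. All `Sketch*`/`sketch_*` names are hypothetical, and the rule that clearing with NULL always succeeds is an assumption made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for RamDiscardManager and MemoryRegion. */
typedef struct SketchDiscardManager { const char *name; } SketchDiscardManager;
typedef struct SketchRegion { SketchDiscardManager *rdm; } SketchRegion;

/* Returns 0 on success, -EBUSY if a different manager is already set. */
static int sketch_set_discard_manager(SketchRegion *mr,
                                      SketchDiscardManager *rdm)
{
    assert(mr != NULL);
    if (mr->rdm && rdm) {
        return -EBUSY;
    }
    mr->rdm = rdm;   /* rdm == NULL clears the manager (assumed) */
    return 0;
}

/*
 * A realize()-like caller. Registering the manager early means there is
 * nothing else to undo yet when the call fails.
 */
static int sketch_realize(SketchRegion *mr, SketchDiscardManager *rdm)
{
    int ret = sketch_set_discard_manager(mr, rdm);

    if (ret < 0) {
        return ret;
    }
    /* ... remaining device setup would follow here ... */
    return 0;
}
```

A second caller that tries to register on the same region gets -EBUSY and can fail its own realize step cleanly.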
Suggested-by: David Hildenbrand <david@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com>
Tested-by: Alexey Kardashevskiy <aik@amd.com>
Reviewed-by: Alexey Kardashevskiy <aik@amd.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Link: https://lore.kernel.org/r/20250612082747.51539-3-chenyi.qiang@intel.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Chenyi Qiang [Thu, 12 Jun 2025 08:27:42 +0000 (16:27 +0800)]
memory: Export a helper to get intersection of a MemoryRegionSection with a given range
Rename the helper to memory_region_section_intersect_range() to make it
more generic. Meanwhile, define the @end as Int128 and replace the
related operations with Int128_* format since the helper is exported as
a wider API.
Suggested-by: Alexey Kardashevskiy <aik@amd.com>
Reviewed-by: Alexey Kardashevskiy <aik@amd.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Link: https://lore.kernel.org/r/20250612082747.51539-2-chenyi.qiang@intel.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Chaney, Ben [Mon, 16 Jun 2025 20:56:50 +0000 (20:56 +0000)]
migration: Don't sync volatile memory after migration completes
Syncing volatile memory provides no benefit; instead, it can cause
performance issues in some cases. Only sync memory that is marked as
non-volatile after migration completes on the destination.
Jaehoon Kim [Wed, 11 Jun 2025 20:56:09 +0000 (15:56 -0500)]
tests/migration: Setup pre-listened cpr.sock to remove race-condition.
When the source VM attempts to connect to the destination VM's Unix
domain socket (cpr.sock) during a cpr-transfer test, race conditions can
occur if the socket file isn't ready. This can lead to connection
failures when running tests.
This patch creates and listens on the socket in advance, and passes the
pre-listened FD directly. This avoids timing issues and improves the
reliability of CPR tests.
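The idea can be sketched with plain POSIX sockets. This is not the qtest code: the helper names and the /tmp path are hypothetical, and handing the listening FD to the destination process is left out. Once listen() has returned, a connect() from the source side succeeds even though nobody has called accept() yet, which is what removes the race.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Fill in a sockaddr_un for 'path'; returns -1 if the path is too long. */
static int sketch_unix_addr(struct sockaddr_un *addr, const char *path)
{
    assert(addr != NULL && path != NULL);
    if (strlen(path) >= sizeof(addr->sun_path)) {
        return -1;
    }
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    strcpy(addr->sun_path, path);
    return 0;
}

/* Create, bind and listen in advance; returns the listening FD or -1. */
static int sketch_prelisten(const char *path)
{
    struct sockaddr_un addr;
    int fd;

    if (sketch_unix_addr(&addr, path) < 0) {
        return -1;
    }
    unlink(path);   /* drop a stale socket file, if any */
    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0) {
        return -1;
    }
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 1) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* What the source side does: connect to the socket path. */
static int sketch_connect(const char *path)
{
    struct sockaddr_un addr;
    int fd;

    if (sketch_unix_addr(&addr, path) < 0) {
        return -1;
    }
    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0) {
        return -1;
    }
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Returns 0 if a client can connect right after the socket is pre-listened. */
static int sketch_demo(void)
{
    char path[64];
    int listen_fd, client_fd;

    snprintf(path, sizeof(path), "/tmp/cpr-sketch-%ld.sock", (long)getpid());
    listen_fd = sketch_prelisten(path);
    if (listen_fd < 0) {
        return -1;
    }
    client_fd = sketch_connect(path);
    if (client_fd >= 0) {
        close(client_fd);
    }
    close(listen_fd);
    unlink(path);
    return client_fd >= 0 ? 0 : -1;
}
```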
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Jaehoon Kim <jhkim@linux.ibm.com>
Reviewed-by: Steve Sistare <steven.sistare@oracle.com>
Link: https://lore.kernel.org/r/20250611205610.147008-2-jhkim@linux.ibm.com
[peterx: null-initialize opts_target, per Steve]
Signed-off-by: Peter Xu <peterx@redhat.com>
Jaehoon Kim [Wed, 11 Jun 2025 20:56:10 +0000 (15:56 -0500)]
migration: Support fd-based socket address in cpr_transfer_input
Extend cpr_transfer_input to handle SOCKET_ADDRESS_TYPE_FD alongside
SOCKET_ADDRESS_TYPE_UNIX. This change supports the use of pre-listened
socket file descriptors for cpr migration channels.
This change is particularly useful in qtest environments, where the
socket may be created externally and passed via fd.
Juraj Marcin [Wed, 21 May 2025 15:16:13 +0000 (17:16 +0200)]
ui/vnc: Update display update interval when VM state changes to RUNNING
If a virtual machine is paused for an extended period of time, for
example, due to an incoming migration, there are also no changes on the
screen. In such a case, VNC increases the display update interval by
VNC_REFRESH_INTERVAL_INC (50 ms). The update interval can then grow up
to VNC_REFRESH_INTERVAL_MAX (3000 ms).
When the machine resumes, it can then take up to 3 seconds for the first
display update. Furthermore, the update interval is then halved with
each display update with changes on the screen. If there are moving
elements on the screen, such as a video, this can be perceived as
freezing and stuttering for a few seconds before the movement is smooth
again.
This patch resolves this issue by adding a listener to VM state changes
and changing the update interval when the VM state changes to RUNNING.
The update_displaychangelistener() function updates the internal timer,
and the display is refreshed immediately if the timer has expired.
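The interval arithmetic described above can be sketched as below. The 50 ms increment and the 3000 ms maximum come from the commit message. The 30 ms base value, the floor applied when halving, and the reset to the base value on RUNNING are assumptions, and all `sketch_*` names are hypothetical.

```c
#include <assert.h>

#define SKETCH_INTERVAL_BASE 30     /* ms, assumed base interval */
#define SKETCH_INTERVAL_INC  50     /* ms, VNC_REFRESH_INTERVAL_INC */
#define SKETCH_INTERVAL_MAX  3000   /* ms, VNC_REFRESH_INTERVAL_MAX */

/* A refresh without screen changes: back off by one increment. */
static int sketch_idle_tick(int interval)
{
    assert(interval > 0);
    interval += SKETCH_INTERVAL_INC;
    return interval > SKETCH_INTERVAL_MAX ? SKETCH_INTERVAL_MAX : interval;
}

/* A refresh with screen changes: halve, but not below the base (assumed). */
static int sketch_active_tick(int interval)
{
    assert(interval > 0);
    interval /= 2;
    return interval < SKETCH_INTERVAL_BASE ? SKETCH_INTERVAL_BASE : interval;
}

/* VM state changed to RUNNING: drop the accumulated back-off at once. */
static int sketch_on_vm_running(int interval)
{
    (void)interval;
    return SKETCH_INTERVAL_BASE;
}
```

Under these assumptions, recovering from 3000 ms by halving alone takes seven updates with changes (3000, 1500, 750, 375, 187, 93, 46, then the 30 ms base), which is the stutter the listener avoids.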
Signed-off-by: Juraj Marcin <jmarcin@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Link: https://lore.kernel.org/r/20250521151616.3951178-1-jmarcin@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Fabiano Rosas [Fri, 23 May 2025 12:30:23 +0000 (09:30 -0300)]
tests/qtest: Remove migration-helpers.c
Commit 407bc4bf90 ("qapi: Move include/qapi/qmp/ to include/qobject/")
brought the migration-helpers.c back by mistake. This file has been
replaced with migration/migration-qmp.c and
migration/migration-util.c.
Fixes: 407bc4bf90 ("qapi: Move include/qapi/qmp/ to include/qobject/")
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Message-id: 20200310152141.13959-1-peter.maydell@linaro.org
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Link: https://lore.kernel.org/r/20250523123023.19284-1-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
Yanfei Xu [Wed, 14 May 2025 11:58:27 +0000 (19:58 +0800)]
migration/ram: avoid to do log clear in the last round
There won't be any RAM sync after the save_complete stage, therefore
it's unnecessary to manually protect dirty pages that are being sent.
Skipping this in the last round can noticeably reduce downtime.
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (24 commits)
i386/tdx: handle TDG.VP.VMCALL<GetQuote>
i386/tdx: handle TDG.VP.VMCALL<GetTdVmCallInfo>
update Linux headers to v6.16-rc3
i386/tdx: Clarify the error message of mrconfigid/mrowner/mrownerconfig
i386/tdx: Fix the typo of the comment of struct TdxGuest
i386/cpu: Rename enable_cpuid_0x1f to force_cpuid_0x1f
i386/tdx: Error and exit when named cpu model is requested
i386/cpu: Warn about why CPUID_EXT_PDCM is not available
i386/cpu: Move adjustment of CPUID_EXT_PDCM before feature_dependencies[] check
rust: hpet: fix new warning
rust: pl011: Add missing logging to match C version
rust: pl011: Implement logging
rust/qemu-api: Add initial logging support based on C API
rust: move rust.bindgen to qemu-api crate
rust: prepare variable definitions for multiple bindgen invocations
rust: qom: change instance_init to take a ParentInit<>
rust: qom: make ParentInit lifetime-invariant
rust: qom: introduce ParentInit
rust: hpet: fully initialize object during instance_init
rust: qemu_api: introduce MaybeUninit field projection
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Isaku Yamahata [Mon, 28 Nov 2022 09:43:52 +0000 (17:43 +0800)]
i386/tdx: handle TDG.VP.VMCALL<GetQuote>
Add property "quote-generation-socket" to tdx-guest, which is a property
of type SocketAddress to specify the Quote Generation Service (QGS).
On a GetQuote request, QEMU connects to the QGS socket, reads the request
data from shared guest memory, sends the request data to the QGS,
stores the response into shared guest memory, and finally notifies the
TD guest by interrupt.
Note that the socket need not be a unix socket. It can be another type,
such as vsock, depending on the implementation of QGS.
To guard against an unresponsive QGS server, set up a timer for the
transaction. On timeout, treat it as an error and interrupt the guest.
The timeout is currently set to 30s; it may be changed later if that
value proves inappropriate.
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Co-developed-by: Chenyi Qiang <chenyi.qiang@intel.com>
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Tested-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Xiaoyao Li [Tue, 3 Jun 2025 05:03:05 +0000 (01:03 -0400)]
i386/tdx: Clarify the error message of mrconfigid/mrowner/mrownerconfig
The error message is misleading - we successfully decoded the data,
the decoded data simply had the wrong length.
Change the error message to show it is a length check failure with both
the received and expected values.
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Link: https://lore.kernel.org/r/20250603050305.1704586-4-xiaoyao.li@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Xiaoyao Li [Tue, 3 Jun 2025 05:03:03 +0000 (01:03 -0400)]
i386/cpu: Rename enable_cpuid_0x1f to force_cpuid_0x1f
The name "enable_cpuid_0x1f" doesn't match its behavior because the
leaf 0x1f can be enabled even when "enable_cpuid_0x1f" is false.
Rename it to "force_cpuid_0x1f" to better reflect its behavior.
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Link: https://lore.kernel.org/r/20250603050305.1704586-2-xiaoyao.li@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Xiaoyao Li [Tue, 4 Mar 2025 05:24:49 +0000 (00:24 -0500)]
i386/cpu: Move adjustment of CPUID_EXT_PDCM before feature_dependencies[] check
There is one entry related to CPUID_EXT_PDCM in feature_dependencies[].
So the correct value of CPUID_EXT_PDCM must be determined before using
feature_dependencies[] to apply dependencies.
Besides, this also ensures that the CPUID_EXT_PDCM value is tracked in
env->features[FEAT_1_ECX].
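The ordering problem can be illustrated with a reduced model. This is a sketch, not QEMU's feature code: the feature words, the dependency table layout, the dependent word, and the rule that PDCM is cleared when the PMU is disabled are all assumptions, and every `sketch_*`/`SKETCH_*` name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { SKETCH_FEAT_1_ECX, SKETCH_FEAT_DEPENDENT, SKETCH_FEAT_COUNT };
#define SKETCH_PDCM (1ull << 15)

/* If none of from_mask is set, everything in to_mask must be cleared. */
typedef struct SketchDep {
    int from_word; uint64_t from_mask;
    int to_word;   uint64_t to_mask;
} SketchDep;

static const SketchDep sketch_deps[] = {
    { SKETCH_FEAT_1_ECX, SKETCH_PDCM, SKETCH_FEAT_DEPENDENT, ~0ull },
};

/* Assumed adjustment: PDCM is not available without a PMU. */
static void sketch_adjust_pdcm(uint64_t *features, bool pmu_enabled)
{
    if (!pmu_enabled) {
        features[SKETCH_FEAT_1_ECX] &= ~SKETCH_PDCM;
    }
}

static void sketch_apply_deps(uint64_t *features)
{
    size_t i;

    for (i = 0; i < sizeof(sketch_deps) / sizeof(sketch_deps[0]); i++) {
        const SketchDep *d = &sketch_deps[i];

        if (!(features[d->from_word] & d->from_mask)) {
            features[d->to_word] &= ~d->to_mask;
        }
    }
}

/* Returns the dependent feature word after expansion in the given order. */
static uint64_t sketch_expand(bool adjust_first, bool pmu_enabled)
{
    uint64_t features[SKETCH_FEAT_COUNT] = {
        [SKETCH_FEAT_1_ECX] = SKETCH_PDCM,
        [SKETCH_FEAT_DEPENDENT] = 0xff,
    };

    if (adjust_first) {
        sketch_adjust_pdcm(features, pmu_enabled);
        sketch_apply_deps(features);
    } else {
        sketch_apply_deps(features);   /* sees a stale PDCM bit */
        sketch_adjust_pdcm(features, pmu_enabled);
    }
    assert(pmu_enabled || !(features[SKETCH_FEAT_1_ECX] & SKETCH_PDCM));
    return features[SKETCH_FEAT_DEPENDENT];
}
```

When the adjustment runs last, the dependent bits survive even though the bit they depend on has been cleared.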
Paolo Bonzini [Mon, 16 Jun 2025 16:56:49 +0000 (18:56 +0200)]
rust: hpet: fix new warning
Nightly rustc complains that HPETAddrDecode has a lifetime but it is not
clearly noted that it comes from &self. Apply the compiler's suggestion
to shut it up.
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Bernhard Beschow [Sun, 15 Jun 2025 11:20:34 +0000 (13:20 +0200)]
rust/qemu-api: Add initial logging support based on C API
A log_mask_ln!() macro is provided which expects arguments similar to the
C version. However, the formatting works as one would expect from Rust.
To maximize code reuse the macro is just a thin wrapper around
qemu_log(). Also, just the bare minimum of logging masks is provided
which should suffice for the current use case of Rust in QEMU.
Paolo Bonzini [Fri, 13 Jun 2025 12:49:27 +0000 (14:49 +0200)]
rust: move rust.bindgen to qemu-api crate
Once qemu-api is split into multiple crates, each of them will have
its own invocation of bindgen. There cannot be only one, because
there are occasional "impl" blocks for the bindgen-generated
structs (e.g. VMStateFlags or QOM classes) that have to
reside in the same crate as the bindgen-generated code.
For now, prepare for this new organization by invoking bindgen
within the qemu-api crate's build definitions; it's also a
much better place to list enums that need specific treatment
from bindgen.
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Fri, 13 Jun 2025 12:51:54 +0000 (14:51 +0200)]
rust: prepare variable definitions for multiple bindgen invocations
When splitting the QEMU Rust bindings into multiple crates, the
bindgen-generated structs also have to be split so that it's
possible to add "impl" blocks (e.g. for Sync/Send or Default,
or even for utility methods in cases such as VMStateFlags).
Tweak various variable definitions in meson.build, to avoid naming
conflicts once there are multiple bindgen invocations.
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Tue, 4 Mar 2025 19:48:05 +0000 (20:48 +0100)]
rust: qom: change instance_init to take a ParentInit<>
This removes undefined behavior associated with writing to uninitialized
fields, and makes it possible to remove "unsafe" from the instance_init
implementation.
However, the init function itself is still unsafe, because it must promise
(as a sort of MaybeUninit::assume_init) that all fields have been
initialized.
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Fri, 28 Feb 2025 09:20:48 +0000 (10:20 +0100)]
rust: qom: make ParentInit lifetime-invariant
This is the trick that allows the parent-field initializer to be used
only for the object that it's meant to initialize. This way,
the owner of a MemoryRegion must be the object that embeds it.
More information is in the comments; it's best explained with a simplified
example.
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Fri, 28 Feb 2025 08:40:30 +0000 (09:40 +0100)]
rust: qom: introduce ParentInit
This is a smart pointer for MaybeUninit; it can be upcast to the
already-initialized parent classes, or dereferenced to a MaybeUninit
for the class that is being initialized.
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Tue, 15 Apr 2025 11:13:19 +0000 (13:13 +0200)]
rust: hpet: fully initialize object during instance_init
The array of BqlRefCell<HPETTimer> is not initialized yet at the
end of instance_init. In particular, the "state" field is NonNull
and therefore it is invalid to have it as zero bytes.
Note that MaybeUninit is necessary because assigning to self.timers[index]
would trigger Drop of the old value.
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Fri, 28 Feb 2025 08:41:42 +0000 (09:41 +0100)]
rust: qemu_api: introduce MaybeUninit field projection
Add a macro that makes it possible to convert a MaybeUninit<> into
another MaybeUninit<> for a single field within it. Furthermore, it is
possible to use the resulting MaybeUninitField<> in APIs that take the
parent object, such as memory_region_init_io().
This allows removing some of the undefined behavior from instance_init()
functions, though this may not be the definitive implementation.
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Bernhard Beschow [Tue, 10 Jun 2025 20:41:27 +0000 (22:41 +0200)]
hw: Fix type constant for DTB files
Commit fcb1ad456c58 ("system/datadir: Add new type constant for DTB files")
introduced a new type constant for DTB files and converted the boards with
bundled device trees to use it. Convert the other boards for consistency.
Mark Cave-Ayland [Wed, 11 Jun 2025 13:03:15 +0000 (14:03 +0100)]
target/i386: fix TB exit logic in gen_movl_seg() when writing to SS
Before commit e54ef98c8a ("target/i386: do not trigger IRQ shadow for LSS"), any
write to SS in gen_movl_seg() would cause a TB exit. The changes introduced by
this commit were intended to restrict the DISAS_EOB_INHIBIT_IRQ exit to the case
where inhibit_irq is true, but missed that a DISAS_EOB_NEXT exit can still be
required when writing to SS and inhibit_irq is false.
Comparing the PE(s) && !VM86(s) section with the logic in x86_update_hflags(), we
can see that the DISAS_EOB_NEXT exit is still required for the !CODE32 case when
writing to SS in gen_movl_seg() because any change to the SS flags can affect
hflags. Similarly we can see that the existing CODE32 case is still correct since
a change to any of DS, ES and SS can affect hflags. Finally for the
gen_op_movl_seg_real() case an explicit TB exit is not needed because the segment
register selector does not affect hflags.
Update the logic in gen_movl_seg(), along with the inline comment, so that
a write to SS with inhibit_irq set to false where PE(s) && !VM86(s)
generates a DISAS_EOB_NEXT exit. This has the effect of allowing Win98SE
to boot in QEMU once again.
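The resulting decision can be written as a small pure function. This sketch is derived from the description above rather than copied from gen_movl_seg(): the enum names are hypothetical, and the behavior of the real-mode/VM86 branch when inhibit_irq is true is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    SKETCH_EXIT_NONE,             /* no explicit TB exit */
    SKETCH_EXIT_EOB_NEXT,         /* stands in for DISAS_EOB_NEXT */
    SKETCH_EXIT_EOB_INHIBIT_IRQ,  /* stands in for DISAS_EOB_INHIBIT_IRQ */
} SketchExit;

typedef enum {
    SKETCH_ES, SKETCH_CS, SKETCH_SS, SKETCH_DS, SKETCH_FS, SKETCH_GS,
} SketchSeg;

/* Which TB exit a segment register write needs, per the commit message. */
static SketchExit sketch_movl_seg_exit(bool pe, bool vm86, bool code32,
                                       SketchSeg seg, bool inhibit_irq)
{
    assert(seg >= SKETCH_ES && seg <= SKETCH_GS);
    if (pe && !vm86) {
        if (seg == SKETCH_SS) {
            /* Any SS write can change hflags, even outside CODE32. */
            return inhibit_irq ? SKETCH_EXIT_EOB_INHIBIT_IRQ
                               : SKETCH_EXIT_EOB_NEXT;
        }
        if (code32 && (seg == SKETCH_ES || seg == SKETCH_DS)) {
            /* In CODE32, DS and ES writes can change hflags as well. */
            return SKETCH_EXIT_EOB_NEXT;
        }
        return SKETCH_EXIT_NONE;
    }
    /* Real mode or VM86: the selector alone does not affect hflags. */
    return inhibit_irq ? SKETCH_EXIT_EOB_INHIBIT_IRQ : SKETCH_EXIT_NONE;
}
```

The case that regressed is the first one with inhibit_irq false and CODE32 clear: it previously fell through without any exit.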
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Fixes: e54ef98c8a ("target/i386: do not trigger IRQ shadow for LSS")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2987
Link: https://lore.kernel.org/r/20250611130315.383151-1-mark.cave-ayland@ilande.co.uk
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Mon, 9 Jun 2025 10:58:54 +0000 (12:58 +0200)]
meson: cleanup win32 library detection
As pointed out by Akihiko Odaki, all Win32 libraries in MinGW have lowercase
names. This means that on (case-insensitive) Windows you can use the mixed-case
names suggested by Microsoft or all-lowercase names, while on Linux you need to
make them lowercase.
QEMU was already using lowercase names, so there is no need to test the
mixed-case name version of libSynchronization. Remove the unnecessary test
and while at it make all the tests use "required: true".
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
oltolm [Thu, 12 Jun 2025 22:15:22 +0000 (00:15 +0200)]
meson: fix Windows build
The build fails on Windows. Replace calls to Unix programs like 'cat',
'sed' and 'true' with calls to 'python' and wrap calls to
'os.path.relpath' in try-except because it can fail when the two paths
are on different drives. Make sure to convert the Windows paths to Unix
paths to prevent warnings in generated files.
Signed-off-by: oltolm <oleg.tolmatcev@gmail.com>
Message-id: 20250612221521.1109-2-oleg.tolmatcev@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Stefan Hajnoczi [Mon, 16 Jun 2025 17:14:42 +0000 (13:14 -0400)]
Merge tag 'pull-target-arm-20250616' of https://git.linaro.org/people/pmaydell/qemu-arm into staging
target-arm queue:
* hw/arm/virt: Check bypass iommu is not set for iommu-map DT property
* tests/functional: Add a test for the realview-eb-mpcore machine
* qemu-options.hx: Fix reversed description of icount sleep behavior
* target/arm: Define raw write for PMU CLR registers
* docs/interop: convert qed_spec.txt to reStructuredText format
* hw/arm: make cpu targeted by arm_load_kernel the primary CPU.
* hw/intc/arm_gic: introduce a first-cpu-index property
* hw/arm/mps2: Configure the AN500 CPU with 16 MPU regions
* linux-user/arm: Fix return value of SYS_cacheflush
* tag 'pull-target-arm-20250616' of https://git.linaro.org/people/pmaydell/qemu-arm:
linux-user/arm: Fix return value of SYS_cacheflush
hw/arm/mps2: Configure the AN500 CPU with 16 MPU regions
hw/intc/arm_gic: introduce a first-cpu-index property
hw/arm: make cpu targeted by arm_load_kernel the primary CPU.
docs/interop: convert qed_spec.txt to reStructuredText format
target/arm: Define raw write for PMU CLR registers
qemu-options.hx: Fix reversed description of icount sleep behavior
tests/functional: Add a test for the realview-eb-mpcore machine
hw/arm/virt: Check bypass iommu is not set for iommu-map DT property
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Peter Maydell [Thu, 5 Jun 2025 14:18:01 +0000 (15:18 +0100)]
hw/arm/mps2: Configure the AN500 CPU with 16 MPU regions
The AN500 application note documents that it configures the Cortex-M7
CPU to have 16 MPU regions. We weren't doing this in our emulation,
so the CPU had only the default 8 MPU regions. Set the mpu-ns-regions
property to 16 for this board.
This bug doesn't affect any of the other board types we model in
this source file, because they all use either the Cortex-M3 or
Cortex-M4. Those CPUs do not have an RTL configurable number of
MPU regions, and always provide 8 regions if the MPU is built in.
Cc: qemu-stable@nongnu.org
Reported-by: Corentin GENDRE <cocotroupe20@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20250605141801.1083266-1-peter.maydell@linaro.org
Clément Chigot [Mon, 26 May 2025 08:55:20 +0000 (10:55 +0200)]
hw/arm: make cpu targeted by arm_load_kernel the primary CPU.
Currently, the Arm boot process assumes that first_cpu is the CPU
that will boot: `arm_load_kernel` powers off all but the `first_cpu`;
`do_cpu_reset` sets the loader address only for this `first_cpu`.
For most boards, this isn't an issue as the kernel is loaded and
booted on the first CPU anyway. However, for zynqmp, the "boot-cpu"
option allows any CPU to be chosen.
Create a new arm_boot_info entry `primary_cpu` recording which CPU will
be booted first. It is set when `arm_load_kernel` is called.
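A reduced model of the change, with hypothetical `Sketch*` types standing in for the CPU state and arm_boot_info. It only shows the idea that both the power-off loop and the reset hook key off a recorded primary CPU instead of the first one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; not QEMU's CPUState or arm_boot_info. */
typedef struct SketchCPU {
    bool powered_off;
    uint64_t pc;
} SketchCPU;

typedef struct SketchBootInfo {
    SketchCPU *primary_cpu;   /* the CPU that boots the kernel */
    uint64_t entry;           /* loader entry address */
} SketchBootInfo;

/* Record the boot CPU and power off every other CPU. */
static void sketch_load_kernel(SketchCPU *cpus, size_t n,
                               SketchCPU *boot_cpu, SketchBootInfo *info)
{
    size_t i;

    assert(boot_cpu != NULL && info != NULL);
    info->primary_cpu = boot_cpu;
    for (i = 0; i < n; i++) {
        cpus[i].powered_off = (&cpus[i] != info->primary_cpu);
    }
}

/* Reset hook: only the primary CPU starts at the loader address. */
static void sketch_cpu_reset(SketchCPU *cpu, const SketchBootInfo *info)
{
    if (cpu == info->primary_cpu) {
        cpu->pc = info->entry;
    }
}
```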
Signed-off-by: Clément Chigot <chigot@adacore.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20250526085523.809003-2-chigot@adacore.com
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
docs/interop: convert qed_spec.txt to reStructuredText format
Convert the qed_spec.txt file to reStructuredText and
include it in the manual.
buglink: https://gitlab.com/qemu-project/qemu/-/issues/527
Signed-off-by: Souleymane Conte <conte.souleymane@gmail.com>
Message-id: 20250609135124.45078-1-conte.souleymane@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: adjusted position of doc in the table of contents;
bulked up commit message; added file to MAINTAINERS section
for QED; made 'Consistency checking' a higher level section;
fixed one preexisting grammar nit (s/by from/from/)]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Akihiko Odaki [Sat, 31 May 2025 12:11:06 +0000 (21:11 +0900)]
target/arm: Define raw write for PMU CLR registers
Raw writes to PMCNTENCLR and PMCNTENCLR_EL0 incorrectly used their
default write function, which clears written bits instead of writing the
raw value.
PMINTENCLR and PMINTENCLR_EL1 are similar registers, but they instead
had ARM_CP_NO_RAW. Commit 7a0e58fa6487 ("target-arm: Split NO_MIGRATE
into ALIAS and NO_RAW") suggests ARM_CP_ALIAS should be used instead of
ARM_CP_NO_RAW in such a case:
> We currently mark ARM coprocessor/system register definitions with
> the flag ARM_CP_NO_MIGRATE for two different reasons:
> 1) register is an alias on to state that's also visible via
> some other register, and that other register is the one
> responsible for migrating the state
> 2) register is not actually state at all (for instance the TLB
> or cache maintenance operation "registers") and it makes no
> sense to attempt to migrate it or otherwise access the raw state
>
> This works fine for identifying which registers should be ignored
> when performing migration, but we also use the same functions for
> synchronizing system register state between QEMU and the kernel
> when using KVM. In this case we don't want to try to sync state
> into registers in category 2, but we do want to sync into registers
> in category 1, because the kernel might have picked a different
> one of the aliases as its choice for which one to expose for
> migration.
These registers fall in category 1 (ARM_CP_ALIAS), not category 2
(ARM_CP_NO_RAW).
ARM_CP_NO_RAW also has another undesired side effect: it hides
registers from GDB.
Properly set raw write functions and drop the ARM_CP_NO_RAW flag from
PMINTENCLR and PMINTENCLR_EL1; this fixes GDB/KVM state synchronization
of PMCNTENCLR and PMCNTENCLR_EL0, and exposes all four of these registers
to GDB.
It is not necessary to add ARM_CP_ALIAS to these registers because the
flag is already set.
Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Message-id: 20250531-clr-v3-1-377f9bf1746d@rsg.ci.i.u-tokyo.ac.jp
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Ethan Chen [Fri, 6 Jun 2025 09:57:28 +0000 (17:57 +0800)]
qemu-options.hx: Fix reversed description of icount sleep behavior
The documentation for the -icount option incorrectly describes the behavior
of the sleep suboption. Based on the actual implementation and system
behavior, the effects of sleep=on and sleep=off were inadvertently reversed.
This commit updates the description to reflect their intended functionality.
Cc: qemu-stable@nongnu.org
Fixes: fa647905e6ba ("qemu-options.hx: Fix minor issues in icount documentation")
Signed-off-by: Ethan Chen <ethan84@andestech.com>
Message-id: 20250606095728.3672832-1-ethan84@andestech.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* tag 'block-pull-request' of https://gitlab.com/stefanha/qemu: (31 commits)
net/stream: skip automatic zero-init of large array
net/socket: skip automatic zero-init of large array
hw/ufs/lu: skip automatic zero-init of large array
hw/scsi/megasas: skip automatic zero-init of large arrays
hw/scsi/lsi53c895a: skip automatic zero-init of large array
hw/usb/hcd-ohci: skip automatic zero-init of large array
hw/ppc/spapr_tpm_proxy: skip automatic zero-init of large arrays
hw/ppc/pnv_occ: skip automatic zero-init of large struct
hw/nvme/ctrl: skip automatic zero-init of large arrays
hw/net/xgamc: skip automatic zero-init of large array
hw/net/virtio-net: skip automatic zero-init of large arrays
hw/net/tulip: skip automatic zero-init of large array
hw/net/rtl8139: skip automatic zero-init of large array
hw/misc/aspeed_hace: skip automatic zero-init of large array
hw/hyperv/syndbg: skip automatic zero-init of large array
hw/display/vmware_vga: skip automatic zero-init of large struct
hw/dma/xlnx_csu_dma: skip automatic zero-init of large array
hw/char/sclpconsole-lm: skip automatic zero-init of large array
hw/audio/via-ac97: skip automatic zero-init of large array
hw/audio/sb16: skip automatic zero-init of large array
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
net/stream: skip automatic zero-init of large array
The 'net_stream_send' method has a 68k byte array used for copying
data between guest and host. Skip the automatic zero-init of this
array to eliminate the performance overhead in the I/O hot path.
The 'buf1' array will be fully initialized when reading data off
the network socket.
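The pattern used throughout this series can be sketched as follows. It assumes a build where automatic variables are zero-initialized by the compiler (for example via -ftrivial-auto-var-init=zero) and a compiler that offers the `uninitialized` variable attribute. `SKETCH_UNINITIALIZED` and `sketch_copy_packet` are hypothetical names, not the macro or functions touched by the patches.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

/* Opt a variable out of automatic zero-init where the compiler allows it. */
#if __has_attribute(uninitialized)
#define SKETCH_UNINITIALIZED __attribute__((uninitialized))
#else
#define SKETCH_UNINITIALIZED
#endif

#define SKETCH_BUF_SIZE (68 * 1024)

/* Copy up to SKETCH_BUF_SIZE bytes from src to dst via a stack buffer. */
static size_t sketch_copy_packet(const unsigned char *src, size_t len,
                                 unsigned char *dst)
{
    /* Not zeroed on entry: every byte that is read is written first. */
    SKETCH_UNINITIALIZED unsigned char buf[SKETCH_BUF_SIZE];

    assert(src != NULL && dst != NULL);
    if (len > sizeof(buf)) {
        len = sizeof(buf);
    }
    memcpy(buf, src, len);
    memcpy(dst, buf, len);
    return len;
}
```

The annotation is only safe when, as here, no byte of the buffer is read before it has been written.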
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Message-id: 20250610123709.835102-32-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
net/socket: skip automatic zero-init of large array
The 'net_socket_send' method has a 68k byte array used for copying
data between guest and host. Skip the automatic zero-init of this
array to eliminate the performance overhead in the I/O hot path.
The 'buf1' array will be fully initialized when reading data off
the network socket.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Message-id: 20250610123709.835102-31-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
hw/ufs/lu: skip automatic zero-init of large array
The 'ufs_emulate_scsi_cmd' method has a 4k byte array used for
copying data from the device. Skip the automatic zero-init of
this array to eliminate the performance overhead in the I/O hot
path.
The 'outbuf' array will be fully initialized when data is copied
from the guest.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Message-id: 20250610123709.835102-30-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
hw/scsi/megasas: skip automatic zero-init of large arrays
The 'megasas_dcmd_pd_get_list' and 'megasas_dcmd_get_properties'
methods have 4k structs used for copying data from the device.
Skip the automatic zero-init of these structs to eliminate the
performance overhead in the I/O hot path.
The 'info' structs are manually initialized with memset(). The
compiler ought to be intelligent enough to turn the memset()
into a static initialization operation, and thus not duplicate
the automatic zero-init. Replacing memset() with '{}' makes it
unambiguous that the arrays are statically initialized.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Message-id: 20250610123709.835102-29-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
hw/scsi/lsi53c895a: skip automatic zero-init of large array
The 'lsi_memcpy' method has a 4k byte array used for copying data
to/from the device. Skip the automatic zero-init of this array to
eliminate the performance overhead in the I/O hot path.
The 'buf' array will be fully initialized when data is copied.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Message-id: 20250610123709.835102-28-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
hw/usb/hcd-ohci: skip automatic zero-init of large array
The 'ohci_service_iso_td' method has an 8k byte array used for copying
data between guest and host. Skip the automatic zero-init of this
array to eliminate the performance overhead in the I/O hot path.
The 'buf' array will be fully initialized when reading data from guest
memory.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Message-id: 20250610123709.835102-27-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
hw/ppc/spapr_tpm_proxy: skip automatic zero-init of large arrays
The 'tpm_execute' method has a pair of 4k arrays used for copying
data between guest and host. Skip the automatic zero-init of these
arrays to eliminate the performance overhead in the I/O hot path.
The two arrays will be fully initialized when reading data from
guest memory or reading data from the proxy FD.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Message-id: 20250610123709.835102-26-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
hw/ppc/pnv_occ: skip automatic zero-init of large struct
The 'occ_model_tick' method has a 12k struct used for copying
data between guest and host. Skip the automatic zero-init of this
struct to eliminate the performance overhead in the I/O hot path.
The 'dynamic_data' buffer will be fully initialized when reading
data from the guest.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Message-id: 20250610123709.835102-25-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
hw/nvme/ctrl: skip automatic zero-init of large arrays
The 'nvme_map_sgl' method has a 256 element array used for copying
data from the device. Skip the automatic zero-init of this array
to eliminate the performance overhead in the I/O hot path.
The 'segment' array will be fully initialized when reading data from
the device.
The 'nvme_changed_nslist' method has a 4k byte array that is manually
initialized with memset(). The compiler ought to be intelligent
enough to turn the memset() into a static initialization operation,
and thus not duplicate the automatic zero-init. Replacing memset()
with '{}' makes it unambiguous that the array is statically initialized.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Message-id: 20250610123709.835102-24-berrange@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
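The memset()-versus-'{}' point in the commit message above can be illustrated with a small sketch. Everything named here (LIST_SIZE, is_all_zero(), and the two zeroed_with_*() functions) is hypothetical and not taken from hw/nvme/ctrl.c. Note that the empty initializer '{}' is standard only from C23 onwards; older GCC and Clang modes accept it as an extension.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LIST_SIZE 4096   /* hypothetical size, matching the "4k byte array" */

/* Helper for the illustration: true if every byte in p[0..len) is zero. */
static bool is_all_zero(const uint8_t *p, size_t len)
{
    assert(p != NULL);
    for (size_t i = 0; i < len; i++) {
        if (p[i] != 0) {
            return false;
        }
    }
    return true;
}

/* Before: the array is declared, then cleared with an explicit memset(). */
static bool zeroed_with_memset(void)
{
    uint8_t list[LIST_SIZE];

    memset(list, 0, sizeof(list));
    return is_all_zero(list, sizeof(list));
}

/* After: the initializer states directly that the array starts zeroed. */
static bool zeroed_with_initializer(void)
{
    uint8_t list[LIST_SIZE] = {};   /* C23, or a GNU extension before that */

    return is_all_zero(list, sizeof(list));
}
```

Both variants leave the buffer fully zeroed; the second simply expresses the zeroing in the declaration itself, so there is a single initialization for the compiler to reason about.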
hw/net/xgmac: skip automatic zero-init of large array
The 'xgmac_enet_send' method has an 8k byte array used for copying
data between guest and host. Skip the automatic zero-init of this
array to eliminate the performance overhead in the I/O hot path.
The 'frame' buffer will be fully initialized when reading guest
memory to fetch the data to send.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20250610123709.835102-23-berrange@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
hw/net/virtio-net: skip automatic zero-init of large arrays
The 'virtio_net_receive_rcu' method has three arrays with
VIRTQUEUE_MAX_SIZE elements, which are approximately 32k in
size, used for copying data between guest and host. Skip the
automatic zero-init of these arrays to eliminate the
performance overhead in the I/O hot path.
The three arrays will be selectively initialized as required
when processing network buffers.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20250610123709.835102-22-berrange@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
hw/net/tulip: skip automatic zero-init of large array
The 'tulip_setup_frame' method has a 4k byte array used for copying
DMA data from the device. Skip the automatic zero-init of this array
to eliminate the performance overhead in the I/O hot path.
The 'buf' array will be fully initialized when reading data from the
device.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20250610123709.835102-21-berrange@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
hw/net/rtl8139: skip automatic zero-init of large array
The 'rtl8139_transmit_one' method has an 8k byte array used for
copying data between guest and host. Skip the automatic zero-init
of this array to eliminate the performance overhead in the I/O
hot path.
The 'txbuffer' will be fully initialized when reading PCI DMA
buffers.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20250610123709.835102-20-berrange@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
hw/misc/aspeed_hace: skip automatic zero-init of large array
The 'do_hash_operation' method has a 256 element iovec array used for
holding pointers to data that is to be hashed. Skip the automatic
zero-init of this array to eliminate the performance overhead in the
I/O hot path.
The 'iovec' array will be selectively initialized based on data that
needs to be hashed.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20250610123709.835102-19-berrange@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
hw/hyperv/syndbg: skip automatic zero-init of large array
The 'handle_recv_msg' method has a 4k byte array used for copying
data between the network socket and guest memory. Skip the automatic
zero-init of this array to eliminate the performance overhead in the
I/O hot path.
The 'data_buf' array will be fully initialized when data is read
off the network socket.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20250610123709.835102-18-berrange@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
hw/display/vmware_vga: skip automatic zero-init of large struct
The 'vmsvga_fifo_run' method has a struct which is a little over 20k
in size, used for holding image data for cursor changes. Skip the
automatic zero-init of this struct to eliminate the performance
overhead in the I/O hot path.
The cursor variable will be fully initialized only when processing
a cursor definition message from the guest.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20250610123709.835102-17-berrange@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
hw/dma/xlnx_csu_dma: skip automatic zero-init of large array
The 'xlnx_csu_dma_src_notify' method has a 4k byte array used for
copying DMA data. Skip the automatic zero-init of this array to
eliminate the performance overhead in the I/O hot path.
The 'buf' array will be fully initialized when data is copied.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20250610123709.835102-16-berrange@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
hw/char/sclpconsole-lm: skip automatic zero-init of large array
The 'process_mdb' method has a 4k byte array used for copying data
between the guest and the chardev backend. Skip the automatic zero-init
of this array to eliminate the performance overhead in the I/O hot
path.
The 'buffer' array will be selectively initialized when data is converted
between EBCDIC and ASCII.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20250610123709.835102-15-berrange@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
hw/audio/via-ac97: skip automatic zero-init of large array
The 'out_cb' method has a 4k byte array used for copying data
between the audio backend and device. Skip the automatic zero-init
of this array to eliminate the performance overhead in the I/O hot
path.
The 'tmpbuf' array will be fully initialized when reading data from
device memory.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20250610123709.835102-14-berrange@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
hw/audio/sb16: skip automatic zero-init of large array
The 'write_audio' method has a 4k byte array used for copying data
between the audio backend and device. Skip the automatic zero-init
of this array to eliminate the performance overhead in the I/O hot
path.
The 'tmpbuf' array will be fully initialized when reading data from
device memory.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20250610123709.835102-13-berrange@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
hw/audio/marvell_88w8618: skip automatic zero-init of large array
The 'mv88w8618_audio_callback' method has a 4k byte array used for
copying data between the audio backend and device. Skip the automatic
zero-init of this array to eliminate the performance overhead in
the I/O hot path.
The 'buf' array will be fully initialized when reading data from
device memory.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20250610123709.835102-12-berrange@redhat.com
[Fixed hw/audio/gus in commit message --Stefan] Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
hw/audio/gus: skip automatic zero-init of large array
The 'GUS_read_DMA' method has a 4k byte array used for copying
data between the audio backend and device. Skip the automatic
zero-init of this array to eliminate the performance overhead in
the I/O hot path.
The 'tmpbuf' array will be fully initialized when reading data
from device memory.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20250610123709.835102-11-berrange@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
hw/audio/es1370: skip automatic zero-init of large array
The 'es1370_transfer_audio' method has a 4k byte array used for
copying data between the audio backend and device. Skip the automatic
zero-init of this array to eliminate the performance overhead in
the I/O hot path.
The 'tmpbuf' array will be fully initialized when reading data from
the audio backend and/or device memory.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20250610123709.835102-10-berrange@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
hw/audio/cs4231a: skip automatic zero-init of large arrays
The 'cs_write_audio' method has a pair of byte arrays, one 4k in size
and one 8k, which are used in converting audio samples. Skip the
automatic zero-init of these arrays to eliminate the performance
overhead in the I/O hot path.
The 'tmpbuf' array will be fully initialized when reading a block of
data from the guest. The 'linbuf' array will be fully initialized
when converting the audio samples.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20250610123709.835102-9-berrange@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
hw/audio/ac97: skip automatic zero-init of large arrays
The 'read_audio' & 'write_audio' methods have a 4k byte array used
for copying data between the audio backend and device. Skip the
automatic zero-init of these arrays to eliminate the performance
overhead in the I/O hot path.
The 'tmpbuf' array will be fully initialized when reading data from
the audio backend and/or device memory.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20250610123709.835102-8-berrange@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
chardev/char-socket: skip automatic zero-init of large array
The 'tcp_chr_read' method has a 4k byte array used for copying
data between the socket and device. Skip the automatic zero-init
of this array to eliminate the performance overhead in the I/O
hot path.
The 'buf' array will be fully initialized when reading data off
the network socket.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20250610123709.835102-7-berrange@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
chardev/char-pty: skip automatic zero-init of large array
The 'pty_chr_read' method has a 4k byte array used for copying
data between the PTY and device. Skip the automatic zero-init
of this array to eliminate the performance overhead in the I/O
hot path.
The 'buf' array will be fully initialized when reading data off
the PTY.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20250610123709.835102-6-berrange@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
chardev/char-fd: skip automatic zero-init of large array
The 'fd_chr_read' method has a 4k byte array used for copying
data between the socket and device. Skip the automatic zero-init
of this array to eliminate the performance overhead in the I/O
hot path.
The 'buf' array will be fully initialized when reading data off
the network socket.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20250610123709.835102-5-berrange@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
block: skip automatic zero-init of large array in ioq_submit
The 'ioq_submit' method has a struct array that is 8k in size.
Skip the automatic zero-init of this array to eliminate the
performance overhead in the I/O hot path.
The 'iocbs' array will be selectively initialized when processing
the I/O data.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20250610123709.835102-4-berrange@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Stefan Hajnoczi [Tue, 10 Jun 2025 12:36:40 +0000 (13:36 +0100)]
hw/virtio/virtio: avoid cost of -ftrivial-auto-var-init in hot path
Since commit 7ff9ff039380 ("meson: mitigate against use of uninitialize
stack for exploits") the -ftrivial-auto-var-init=zero compiler option is
used to zero local variables. While this reduces security risks
associated with uninitialized stack data, it introduced a measurable
bottleneck in the virtqueue_split_pop() and virtqueue_packed_pop()
functions.
These virtqueue functions are in the hot path. They are called for each
element (request) that is popped from a VIRTIO device's virtqueue. Using
__attribute__((uninitialized)) on large stack variables in these
functions improves fio randread bs=4k iodepth=64 performance from 304k
to 332k IOPS (+9%).
This issue was found using perf-top(1). virtqueue_split_pop() was one of
the top CPU consumers and the "annotate" feature showed that the memory
zeroing instructions at the beginning of the functions were hot.
Fixes: 7ff9ff039380 ("meson: mitigate against use of uninitialize stack for exploits") Cc: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20250610123709.835102-3-berrange@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The QEMU_UNINITIALIZED macro is to be used to skip the default compiler
variable initialization done by -ftrivial-auto-var-init=zero.
Use this in cases where there is a method in the device I/O path (or other
important hot paths) that has large variables on the stack. A rule of
thumb is that "large" means a method with 4kb data in the local stack
frame. Any variables which are KB in size should be annotated with this
attribute, to pre-emptively eliminate any potential overhead from the
compiler zero'ing memory.
Given that this turns off a security hardening feature, when using this
to flag variables, it is important that the code is double-checked to
ensure there is no possible use of uninitialized data in the method.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20250610123709.835102-2-berrange@redhat.com
[DB: split off patch & rewrite guidance on when to use the annotation] Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
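As a rough illustration of the guidance in the commit message above, the sketch below shows how such an annotation might be defined and applied. The macro body, the __has_attribute fallback, and the checksum_request() function are assumptions made for this example; they are not copied from the QEMU tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: compilers without __has_attribute get an always-false stub. */
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

/* Assumed definition: expands to nothing where the attribute is unknown. */
#if __has_attribute(uninitialized)
#define QEMU_UNINITIALIZED __attribute__((uninitialized))
#else
#define QEMU_UNINITIALIZED
#endif

/*
 * Hypothetical hot-path helper: copies a request into a 4k bounce buffer
 * and sums its bytes. The buffer is annotated so that
 * -ftrivial-auto-var-init=zero would not clear it on every call.
 */
static uint32_t checksum_request(const uint8_t *src, size_t len)
{
    QEMU_UNINITIALIZED uint8_t buf[4096];
    uint32_t sum = 0;

    assert(len <= sizeof(buf));
    memcpy(buf, src, len);              /* writes buf[0..len) */
    for (size_t i = 0; i < len; i++) {
        sum += buf[i];                  /* reads only bytes written above */
    }
    return sum;
}
```

The property to double-check before annotating is visible in the loop: every byte that is read has been written earlier in the same call, so skipping the automatic zero-init cannot expose stale stack contents.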
Stefan Hajnoczi [Thu, 12 Jun 2025 17:36:43 +0000 (13:36 -0400)]
Merge tag 'qga-pull-2025-06-12' of https://github.com/kostyanf14/qemu into staging
qga-pull-2025-06-12
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEwsLBCepDxjwUI+uE711egWG6hOcFAmhK3hkACgkQ711egWG6
# hOdZ9g//aObON4+a2fSuTWToJwj5i2fcplXDD4OUnxH+pc3qt4bc50cpD4mbH3VZ
# 2W854DWfrvPOv1beVYlmOLKztCTFk445BwtV5im4TBBcRmPt9GXyGqqax+3msziF
# gA0r3KrJ4mv6OUvx61Jmgz4pFkHhWda6BbnTZbFPgPSz/poLN78Ib9TpAvOWBIEg
# 6bdux8Ivh4gWO22OtY7O8XDU/NwkVwQNJQ1iv3Y4EUJ+Qv4prePrDiyNVn0jf1S0
# KxIx4tPYf6B4mYbcc3/lURuI+R8H2KxCt7GmGxBl1esqjGOEUj/fjp54+OqOf/2n
# a/ZIWFu0cN1SK279eluBOm4Y7IGRouaFALaBJQLdEhYQgJmrCaEnSzHQCTR4cZQr
# V2KkmGFXV7IdLvlLl38safp/G8cxvq21ijEx/RkoZ7Iklx8wWx5A/Cy0D52IViXD
# +gsBpqGsMia+7Rus9o4P2QjWA5hCvaN7XH2rVGtELyoQwwhBfxCmhtn8qi5Vjybz
# 7f3tr0BwdRm70KL//OhSL6DZHOGyRdqyiV27IP/2K5TVqKjkZNP0eIL97Y6xoGe6
# vXLbx6y+wUW0LXJGXe2+OtR/nFTu+VJ8IapfwQfd9JIR8Z25cNsFLhvfmWlPQiMc
# EkNUEbEez21PSKuKz9cVHlfLl/L4VSgzychKF9uQWm7rhbK+Roc=
# =6AwB
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 12 Jun 2025 10:03:05 EDT
# gpg: using RSA key C2C2C109EA43C63C1423EB84EF5D5E8161BA84E7
# gpg: Good signature from "Kostiantyn Kostiuk (Upstream PR sign) <kkostiuk@redhat.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: C2C2 C109 EA43 C63C 1423 EB84 EF5D 5E81 61BA 84E7
* tag 'qga-pull-2025-06-12' of https://github.com/kostyanf14/qemu:
qga: Add tests for guest-get-load command
qga-win: implement a 'guest-get-load' command
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Thomas Huth [Tue, 3 Jun 2025 10:15:26 +0000 (12:15 +0200)]
tests/functional: Add a test for the realview-eb-mpcore machine
Check that we can boot a Linux kernel here and that we can at
least send one ping network packet.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-id: 20250603101526.21217-1-thuth@redhat.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
hw/arm/virt: Check bypass iommu is not set for iommu-map DT property
default_bus_bypass_iommu tells us whether the bypass_iommu is set
for the default PCIe root bus. Make sure we check that before adding
the "iommu-map" DT property.
Cc: qemu-stable@nongnu.org Fixes: 6d7a85483a06 ("hw/arm/virt: Add default_bus_bypass_iommu machine option") Suggested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Reviewed-by: Donald Dutile <ddutile@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20250602114655.42920-1-shameerali.kolothum.thodi@huawei.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Windows has no native equivalent API, but it would be possible to
simulate it as illustrated here (BSD-3-Clause):
https://github.com/giampaolo/psutil/pull/1485
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Tested-by: Dehan Meng <demeng@redhat.com> Reviewed-by: Yan Vugenfirer <yvugenfi@redhat.com> Signed-off-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Stefan Hajnoczi [Wed, 11 Jun 2025 15:39:53 +0000 (11:39 -0400)]
Merge tag 'pull-vfio-20250611' of https://github.com/legoater/qemu into staging
vfio queue:
* Fixed newly added potential issues in vfio-pci
* Added support to report vfio-ap configuration changes
* Added prerequisite support for vfio-user
* Added first part for VFIO live update support
Stefan Hajnoczi [Wed, 11 Jun 2025 15:39:30 +0000 (11:39 -0400)]
Merge tag 'pull-request-2025-06-11' of https://gitlab.com/thuth/qemu into staging
* Remove aarch64 job from travis.yml
* Remove deprecated s390-ccw-virtio-4.1 machine
* Add memlock functional test
* Various other small updates and fixes
* tag 'pull-request-2025-06-11' of https://gitlab.com/thuth/qemu:
scripts/meson-buildoptions: Sort coroutine_backend choices lexicographically
MAINTAINERS: Update Akihiko Odaki's affiliation
MAINTAINERS: Update the paths to the testing documentation files
tests/vm/README: fix documentation path in tests/vm/README
tests/functional: add memlock tests
tests/functional: add skipLockedMemoryTest decorator
tests/functional: Speed up the avr_mega2560 test
tests/functional: Use the 'none' machine for the VNC test
hw/s390x/s390-virtio-ccw: Remove the deprecated 4.1 machine type
travis.yml: Remove the aarch64 job
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* tag 'hw-misc-20250610' of https://github.com/philmd/qemu: (24 commits)
hw/net/i82596: Factor configure function out
hw/net/i82596: Update datasheet URL
hw/misc/stm32_rcc: Fix stm32_rcc_write() arguments order
hw/riscv/riscv-iommu: Remove definition of RISCVIOMMU[Pci|Sys]Class
hw/gpio/aspeed: Fix definition of AspeedGPIOClass
hw/virtio/virtio-pmem: Fix definition of VirtIOPMEMClass
hw/virtio/virtio-mem: Fix definition of VirtIOMEMClass
tests/unit/test-char: Avoid using g_alloca()
backends/tpm: Avoid using g_alloca()
hw/gpio/pca9552: Avoid using g_newa()
hw/core/cpu: Move CacheType to general cpu.h
accel/hvf: Fix TYPE_HVF_ACCEL instance size
tests/functional: Add a test for the Arduino UNO machine
MAINTAINERS: Update Akihiko Odaki's affiliation
pc-bios: ensure installed ROMs don't have execute permissions
hw/ppc/e500: Use SysBusDevice API to access TYPE_CCSR's internal resources
hw/net/fsl_etsec: Set default MAC address
hw/ppc/e500: Move clock and TB frequency to machine class
hw/hyperv/balloon: Consolidate OBJECT_DEFINE_SIMPLE_TYPE_WITH_INTERFACES
hw/core/resetcontainer: Consolidate OBJECT_DECLARE_SIMPLE_TYPE
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Stefan Hajnoczi [Wed, 11 Jun 2025 15:37:13 +0000 (11:37 -0400)]
Merge tag 'pull-loongarch-20250610' of https://github.com/gaosong715/qemu into staging
pull-loongarch_20250610
# -----BEGIN PGP SIGNATURE-----
#
# iLMEAAEKAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCaEfZDQAKCRBAov/yOSY+
# 3z/XA/4vGGLAiCX6EN+t4E9sh7BWrt8fgbxBFSZapXVLGaeHDV3Y4IUHlLGy9RZT
# 3OtfE+5qvXPt1iz5l4IygmJh6wk7kN05Qw7XkV18hO5TqmYINdbmeuwvK0vmH6x+
# nTxSRke0CMmwYKg3bYDFVS1CRgfPX1zfRb1VKB1PnkKaZcHPNQ==
# =jC/2
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 10 Jun 2025 03:04:45 EDT
# gpg: using RSA key B8FF1DA0D2FDCB2DA09C6C2C40A2FFF239263EDF
# gpg: Good signature from "Song Gao <m17746591750@163.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B8FF 1DA0 D2FD CB2D A09C 6C2C 40A2 FFF2 3926 3EDF
* tag 'pull-loongarch-20250610' of https://github.com/gaosong715/qemu:
hw/loongarch/virt: Remove global variables about memmap tables
hw/loongarch/virt: Remove global variables about initrd
target/loongarch: add check for fcond
hw/loongarch/virt: inform guest of kvm
hw/intc/loongarch_extioi: Fix typo issue about register EXTIOI_COREISR_END
hw/intc/loongarch_pch: Convert to little endian with ID register
hw/loongarch/virt: Fix big endian support with MCFG table
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Steve Sistare [Tue, 10 Jun 2025 15:39:29 +0000 (08:39 -0700)]
vfio/pci: vfio_notifier_cleanup
Move event_notifier_cleanup calls to a helper vfio_notifier_cleanup.
This version is trivial, and does not yet use the vdev and nr parameters.
No functional change.
Steve Sistare [Tue, 10 Jun 2025 15:39:27 +0000 (08:39 -0700)]
vfio/pci: pass vector to virq functions
Pass the vector number to vfio_connect_kvm_msi_virq and
vfio_remove_kvm_msi_virq, so it can be passed to their subroutines in
a subsequent patch. No functional change.
Steve Sistare [Tue, 10 Jun 2025 15:39:26 +0000 (08:39 -0700)]
vfio/pci: vfio_notifier_init
Move event_notifier_init calls to a helper vfio_notifier_init.
This version is trivial, but it will be expanded to support CPR
in subsequent patches. No functional change.
Steve Sistare [Tue, 10 Jun 2025 15:39:24 +0000 (08:39 -0700)]
vfio-pci: skip reset during cpr
Do not reset a vfio-pci device during CPR, and do not complain if the
kernel's PCI config space changes for non-emulated bits between the
vmstate save and load, which can happen due to ongoing interrupt activity.
Steve Sistare [Tue, 10 Jun 2025 15:39:21 +0000 (08:39 -0700)]
vfio/container: recover from unmap-all-vaddr failure
If there are multiple containers and unmap-all fails for some container, we
need to remap vaddr for the other containers for which unmap-all succeeded.
Recover by walking all address ranges of all containers to restore the vaddr
for each. Do so by invoking the vfio listener callback, and passing a new
"remap" flag that tells it to restore a mapping without re-allocating new
userland data structures.
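The recovery idea described above can be sketched in a simplified model. The Container type, its fields, and the unmap_all_vaddr(), remap_vaddr() and discard_all() functions are hypothetical stand-ins; the real code works through the vfio listener callback and a "remap" flag rather than a plain loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a VFIO container; not the QEMU structure. */
typedef struct {
    bool vaddr_valid;   /* true while the mappings have a usable vaddr */
    bool fail_unmap;    /* knob for the example: force unmap-all to fail */
} Container;

/* Pretend to drop the vaddr of every mapping; returns 0 or -1. */
static int unmap_all_vaddr(Container *c)
{
    if (c->fail_unmap) {
        return -1;
    }
    c->vaddr_valid = false;
    return 0;
}

/* Pretend to restore the vaddr without allocating new bookkeeping. */
static void remap_vaddr(Container *c)
{
    c->vaddr_valid = true;
}

/* Unmap every container; on any failure, restore the ones already done. */
static int discard_all(Container *cs, size_t n)
{
    assert(cs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (unmap_all_vaddr(&cs[i]) < 0) {
            /* Walk all containers and restore any whose vaddr was dropped. */
            for (size_t j = 0; j < n; j++) {
                if (!cs[j].vaddr_valid) {
                    remap_vaddr(&cs[j]);
                }
            }
            return -1;
        }
    }
    return 0;
}
```

After a failed discard_all(), every container is back in the state it had before the call, which is the invariant the commit is after.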
Steve Sistare [Tue, 10 Jun 2025 15:39:20 +0000 (08:39 -0700)]
vfio/container: mdev cpr blocker
During CPR, after VFIO_DMA_UNMAP_FLAG_VADDR, the vaddr is temporarily
invalid, so mediated devices cannot be supported. Add a blocker for them.
This restriction will not apply to iommufd containers when CPR is added
for them in a future patch.
Steve Sistare [Tue, 10 Jun 2025 15:39:19 +0000 (08:39 -0700)]
vfio/container: restore DMA vaddr
In new QEMU, do not register the memory listener at device creation time.
Register it later, in the container post_load handler, after all vmstate
that may affect regions and mapping boundaries has been loaded. The
post_load registration will cause the listener to invoke its callback on
each flat section, and the calls will match the mappings remembered by the
kernel.
The listener calls a special dma_map handler that passes the new VA of each
section to the kernel using VFIO_DMA_MAP_FLAG_VADDR. Restore the normal
handler at the end.
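The handler swap described above might look roughly like the following. This is a minimal sketch under assumed names: the Container layout, DmaMapFn, replay_sections() and post_load() are invented for illustration, and the real handler passes VFIO_DMA_MAP_FLAG_VADDR to the kernel instead of bumping a counter.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Container Container;
typedef int (*DmaMapFn)(Container *c, size_t section);

/* Hypothetical container: one swappable handler plus call counters. */
struct Container {
    DmaMapFn dma_map;     /* handler invoked for each flat section */
    int normal_calls;     /* times the normal handler ran */
    int vaddr_calls;      /* times the vaddr-restoring handler ran */
};

/* Stand-in for the normal mapping path. */
static int normal_dma_map(Container *c, size_t section)
{
    (void)section;
    c->normal_calls++;
    return 0;
}

/* Stand-in for the special handler that only supplies the new VA. */
static int vaddr_dma_map(Container *c, size_t section)
{
    (void)section;
    c->vaddr_calls++;
    return 0;
}

/* Stand-in for listener registration: call back once per flat section. */
static void replay_sections(Container *c, size_t nsections)
{
    assert(c->dma_map != NULL);
    for (size_t i = 0; i < nsections; i++) {
        c->dma_map(c, i);
    }
}

/* Install the special handler, replay all sections, restore the original. */
static void post_load(Container *c, size_t nsections)
{
    DmaMapFn saved = c->dma_map;

    c->dma_map = vaddr_dma_map;
    replay_sections(c, nsections);
    c->dma_map = saved;
}
```

Because the swap is confined to post_load(), any mapping requested afterwards goes through the normal handler again.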
Steve Sistare [Tue, 10 Jun 2025 15:39:18 +0000 (08:39 -0700)]
vfio/container: discard old DMA vaddr
In the container pre_save handler, discard the virtual addresses in DMA
mappings with VFIO_DMA_UNMAP_FLAG_VADDR, because guest RAM will be
remapped at a different VA in new QEMU. DMA to already-mapped
pages continues.
Steve Sistare [Tue, 10 Jun 2025 15:39:17 +0000 (08:39 -0700)]
vfio/container: preserve descriptors
At vfio creation time, save the value of vfio container, group, and device
descriptors in CPR state. On qemu restart, vfio_realize() finds and uses
the saved descriptors.
During reuse, device and iommu state is already configured, so operations
in vfio_realize that would modify the configuration, such as vfio ioctls,
are skipped. The result is that vfio_realize constructs qemu data
structures that reflect the current state of the device.
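The save-then-reuse flow described above can be modelled with a small table keyed by name. This is only a sketch: SavedFd, CprState, saved_fd_store(), saved_fd_lookup() and realize_device() are hypothetical names, and the real CPR state is carried across the restart by QEMU rather than held in a local struct.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SAVED 8   /* hypothetical capacity for the example */

typedef struct {
    const char *name;   /* e.g. a container, group, or device label */
    int fd;             /* descriptor preserved across the restart */
} SavedFd;

typedef struct {
    SavedFd entries[MAX_SAVED];
    size_t count;
} CprState;

/* Record a descriptor under a name at creation time. */
static void saved_fd_store(CprState *s, const char *name, int fd)
{
    assert(s->count < MAX_SAVED);
    s->entries[s->count].name = name;
    s->entries[s->count].fd = fd;
    s->count++;
}

/* Return the saved descriptor for a name, or -1 if none was saved. */
static int saved_fd_lookup(const CprState *s, const char *name)
{
    for (size_t i = 0; i < s->count; i++) {
        if (strcmp(s->entries[i].name, name) == 0) {
            return s->entries[i].fd;
        }
    }
    return -1;
}

/*
 * Realize a device: reuse a saved descriptor when one exists and skip
 * configuration; otherwise use the fresh one, configure, and save it.
 */
static int realize_device(CprState *s, const char *name, int fresh_fd,
                          int *configured)
{
    int fd = saved_fd_lookup(s, name);

    if (fd >= 0) {
        *configured = 0;    /* state already set up before the restart */
        return fd;
    }
    saved_fd_store(s, name, fresh_fd);
    *configured = 1;        /* first creation: configuration is needed */
    return fresh_fd;
}
```

On the second call for the same name, the function returns the original descriptor and reports that no configuration was performed, mirroring the "operations that would modify the configuration are skipped" behaviour.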