Akihiko Odaki [Mon, 26 May 2025 05:29:13 +0000 (14:29 +0900)]
qemu-thread: Use futex for QemuEvent on Windows
Use the futex-based implementation of QemuEvent on Windows to
remove code duplication and remove the overhead of event object
construction and destruction.
Akihiko Odaki [Mon, 26 May 2025 05:29:13 +0000 (14:29 +0900)]
qemu-thread: Avoid futex abstraction for non-Linux
qemu-thread used to abstract pthread primitives into futex for the
QemuEvent implementation of POSIX systems other than Linux. However,
this abstraction has one key difference: unlike futex, pthread
primitives require an explicit destruction, and it must be ordered after
wait and wake operations.
It would be easier to perform destruction if a wait operation ensures
the corresponding wake operation finishes as POSIX semaphore does, but
that requires to protect state accesses in qemu_event_set() and
qemu_event_wait() with a mutex. On the other hand, real futex does not
need such a protection but needs complex barrier and atomic operations
to ensure ordering between the two functions.
Add special implementations of qemu_event_set() and qemu_event_wait()
using pthread primitives. qemu_event_wait() will ensure qemu_event_set()
finishes, and these functions will avoid complex barrier and atomic
operations to ensure ordering between them.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Tested-by: Phil Dennis-Jordan <phil@philjordan.eu> Reviewed-by: Phil Dennis-Jordan <phil@philjordan.eu> Link: https://lore.kernel.org/r/20250526-event-v4-5-5b784cc8e1de@daynix.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Akihiko Odaki [Thu, 29 May 2025 05:45:50 +0000 (14:45 +0900)]
futex: Check value after qemu_futex_wait()
futex(2) - Linux manual page
https://man7.org/linux/man-pages/man2/futex.2.html
> Note that a wake-up can also be caused by common futex usage patterns
> in unrelated code that happened to have previously used the futex
> word's memory location (e.g., typical futex-based implementations of
> Pthreads mutexes can cause this under some conditions). Therefore,
> callers should always conservatively assume that a return value of 0
> can mean a spurious wake-up, and use the futex word's value (i.e.,
> the user-space synchronization scheme) to decide whether to continue
> to block or not.
Tom Lendacky [Fri, 28 Mar 2025 20:30:24 +0000 (15:30 -0500)]
i386/kvm: Prefault memory on page state change
A page state change is typically followed by an access of the page(s) and
results in another VMEXIT in order to map the page into the nested page
table. Depending on the size of page state change request, this can
generate a number of additional VMEXITs. For example, under SNP, when
Linux is utilizing lazy memory acceptance, memory is typically accepted in
4M chunks. A page state change request is submitted to mark the pages as
private, followed by validation of the memory. Since the guest_memfd
currently only supports 4K pages, each page validation will result in
VMEXIT to map the page, resulting in 1024 additional exits.
When performing a page state change, invoke KVM_PRE_FAULT_MEMORY for the
size of the page state change in order to pre-map the pages and avoid the
additional VMEXITs. This helps speed up boot times.
Paolo Bonzini [Thu, 5 Jun 2025 09:12:15 +0000 (11:12 +0200)]
rust: make TryFrom macro more resilient
If the enum includes values such as "Ok", "Err", or "Error", the TryInto
macro can cause errors. Be careful and qualify identifiers with the full
path, or in the case of TryFrom<>::Error do not use the associated type
at all.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Zhao Liu [Mon, 26 May 2025 11:54:21 +0000 (13:54 +0200)]
rust/hpet: Drop BqlCell wrapper for num_timers
Now that the num_timers field is initialized as a property, someone may
change its default value using qdev_prop_set_uint8(), but the value is
fixed after the Rust code sees it first. Since there is no need to modify
it after realize(), it is not to be necessary to have a BqlCell wrapper.
Paolo Bonzini [Mon, 26 May 2025 11:52:11 +0000 (13:52 +0200)]
rust/hpet: change type of num_timers to usize
Remove the need to convert after every read of the BqlCell. Because the
vmstate uses a u8 as the size of the VARRAY, this requires switching
the VARRAY to use num_timers_save; which in turn requires ensuring that
the num_timers_save is always there. For simplicity do this by
removing support for version 1, which QEMU has not been producing for
~15 years.
Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Tue, 3 Jun 2025 15:45:39 +0000 (17:45 +0200)]
rust: qemu-api: add bindings to Error
Provide an implementation of std::error::Error that bridges the Rust
anyhow::Error and std::panic::Location types with QEMU's Error*.
It also has several utility methods, analogous to error_propagate(),
that convert a Result into a return value + Error** pair. One important
difference is that these propagation methods *panic* if *errp is NULL,
unlike error_propagate() which eats subsequent errors[1]. The reason
for this is that in C you have an error_set*() call at the site where
the error is created, and calls to error_propagate() are relatively rare.
In Rust instead, even though these functions do "propagate" a
qemu_api::Error into a C Error**, there is no error_setg() anywhere that
could check for non-NULL errp and call abort(). error_propagate()'s
behavior of ignoring subsequent errors is generally considered weird,
and there would be a bigger risk of triggering it from Rust code.
[1] This is actually a violation of the preconditions of error_propagate(),
so it should not happen. But you never know...
Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Mon, 26 May 2025 07:25:50 +0000 (09:25 +0200)]
util/error: allow non-NUL-terminated err->src
Rust makes the current file available as a statically-allocated string,
but without a NUL terminator. Allow this by storing an optional maximum
length in the Error.
Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Fri, 23 May 2025 15:59:52 +0000 (17:59 +0200)]
subprojects: add the foreign crate
This is a cleaned up and separated version of the patches at
https://lore.kernel.org/all/20240701145853.1394967-4-pbonzini@redhat.com/
https://lore.kernel.org/all/20240701145853.1394967-5-pbonzini@redhat.com/
Its first user will be the Error bindings; for example a QEMU Error ** can be
converted to a Rust Option using
unsafe { Option::<Error>::from_foreign(c_error) }
Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Mon, 26 May 2025 10:10:18 +0000 (12:10 +0200)]
subprojects: add the anyhow crate
This is a standard replacement for Box<dyn Error> which is more efficient (it only
occcupies one word) and provides a backtrace of the error. This could be plumbed
into &error_abort in the future.
Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Stefan Hajnoczi [Wed, 4 Jun 2025 15:43:30 +0000 (11:43 -0400)]
Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging
* rust: use native Meson support for clippy and rustdoc
* rust: add "bits", a custom bitflags implementation
* target/i386: Remove FRED dependency on WRMSRNS
* target/i386: Add the immediate form MSR access instruction support
* TDX fixes
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu:
rust: qemu-api-macros: add from_bits and into_bits to #[derive(TryInto)]
rust: pl011: use the bits macro
rust: add "bits", a custom bitflags implementation
i386/tdvf: Fix build on 32-bit host
i386/tdx: Fix build on 32-bit host
meson: use config_base_arch for target libraries
target/i386: Add the immediate form MSR access instruction support
target/i386: Add a new CPU feature word for CPUID.7.1.ECX
target/i386: Remove FRED dependency on WRMSRNS
rust: use native Meson support for clippy and rustdoc
rust: cell: remove support for running doctests with "cargo test --doc"
rust: add qemu-api doctests to "meson test"
build, dockerfiles: add support for detecting rustdoc
rust: use "objects" for Rust executables as well
meson: update to version 1.8.1
rust: bindings: allow ptr_offset_with_cast
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Paolo Bonzini [Tue, 3 Jun 2025 15:44:54 +0000 (17:44 +0200)]
rust: add "bits", a custom bitflags implementation
One common thing that device emulation does is manipulate bitmasks, for example
to check whether two bitmaps have common bits. One example in the pl011 crate
is the checks for pending interrupts, where an interrupt cause corresponds to
at least one interrupt source from a fixed set.
Unfortunately, this is one case where Rust *can* provide some kind of
abstraction but it does so with a rather Perl-ish There Is More Way To
Do It. It is not something where a crate like "bilge" helps, because
it only covers the packing of bits in a structure; operations like "are
all bits of Y set in X" almost never make sense for bit-packed structs;
you need something else, there are several crates that do it and of course
we're going to roll our own.
In particular I examined three:
- bitmask (https://docs.rs/bitmask/0.5.0/bitmask/) does not support const
at all. This is a showstopper because one of the ugly things in the
current pl011 code is the ugliness of code that defines interrupt masks
at compile time:
You would have to use roughly the same code---"bitmask" only helps with
defining the struct.
- bitmask_enum (https://docs.rs/bitmask-enum/2.2.5/bitmask_enum/) does not
have a good separation of "valid" and "invalid" bits, so for example "!x"
will invert all 16 bits if you choose u16 as the representation -- even if
you only defined 10 bits. This makes it easier to introduce subtle bugs
in comparisons.
- bitflags (https://docs.rs/bitflags/2.6.0/bitflags/) is generally the most
used such crate and is the one that I took most inspiration from with
respect to the syntax. It's a pretty sophisticated implementation,
with a lot of bells and whistles such as an implementation of "Iter"
that returns the bits one at a time.
The main thing that all of them lack, however, is a way to simplify the
ugly definitions like the above. "bitflags" includes const methods that
perform AND/OR/XOR of masks (these are necessary because Rust operator
overloading does not support const yet, and therefore overloaded operators
cannot be used in the definition of a "static" variable), but they become
even more verbose and unmanageable, like
This was the main reason to create "bits", which allows something like
bits!(Interrupt: E | MS | RT | TX | RX)
and expands it 1) add "Interrupt::" in front of all identifiers 2) convert
operators to the wordy const functions like "union". It supports boolean
operators "&", "|", "^", "!" and parentheses, with a relatively simple
recursive descent parser that's implemented in qemu_api_macros.
Since I don't remember exactly how the macro was developed, I cannot exclude
that it contains code from "bitflags". Therefore, I am conservatively leaving
in the MIT and Apache 2.0 licenses from bitflags. In fact, I think there
would be a benefit in being able to push code back to "bitflags" anyway
whenever applicable, so that the two libraries do not diverge too much,
so that's another reason to use this.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Fixed commit introduced common dependencies for target libraries. Alas,
it wrongly reused the 'target' variable, which was previously set from
another loop.
Thus, some dependencies were missing depending on order of target list,
as found here [1].
The fix is to use the correct config_base_arch instead.
Kudos to Thomas Huth who had this right, before I reimplement it, and
introduce this bug.
Xin Li (Intel) [Fri, 3 Jan 2025 08:48:27 +0000 (00:48 -0800)]
target/i386: Add the immediate form MSR access instruction support
The immediate form of MSR access instructions are primarily motivated by
performance, not code size: by having the MSR number in an immediate, it
is available *much* earlier in the pipeline, which allows the hardware
much more leeway about how a particular MSR is handled.
Paolo Bonzini [Sat, 5 Apr 2025 12:18:17 +0000 (14:18 +0200)]
rust: use native Meson support for clippy and rustdoc
Meson has support for invoking clippy and rustdoc on all crates (1.7.0 for
clippy, 1.8.0 for rustdoc). Use it instead of the homegrown version; this
requires disabling the multiple_crate_versions lint (the only one that was
enabled from the "cargo" group)---which was not particularly useful anyway
because all dependencies are converted by hand into Meson subprojects.
rustfmt is still not supported.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Sat, 5 Apr 2025 08:33:09 +0000 (10:33 +0200)]
rust: add qemu-api doctests to "meson test"
Doctests are weird. They are essentially integration tests, but they're
"ran" by executing rustdoc --test, which takes a compiler-ish
command line. This is supported by Meson 1.8.0.
Because they run the linker and need all the .o files, run them in the
build jobs rather than the test jobs.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Fri, 2 May 2025 17:06:43 +0000 (19:06 +0200)]
build, dockerfiles: add support for detecting rustdoc
rustdoc is effectively a custom version of rustc, and it is necessary to
specify it in order to run doctests from Meson. Add the relevant configure
option and environment variables.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Thu, 27 Feb 2025 12:36:49 +0000 (13:36 +0100)]
rust: use "objects" for Rust executables as well
libqemuutil is not meant be linked as a whole; if modules are enabled, doing
so results in undefined symbols (corresponding to QMP commands) in
rust/qemu-api/rust-qemu-api-integration.
Support for "objects" in Rust executables is available in Meson 1.8.0; use it
to switching to the same dependencies that C targets use: link_with for
libqemuutil, and objects for everything else.
Reported-by: Bernhard Beschow <shentey@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Wed, 11 Dec 2024 17:14:19 +0000 (18:14 +0100)]
rust: bindings: allow ptr_offset_with_cast
This is produced by recent versions of bindgen:
warning: use of `offset` with a `usize` casted to an `isize`
--> /builds/bonzini/qemu/rust/target/debug/build/qemu_api-35cb647f4db404b8/out/bindings.inc.rs:39:21
|
39 | let byte = *(core::ptr::addr_of!((*this).storage) as *const u8).offset(byte_index as isize);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: try: `(core::ptr::addr_of!((*this).storage) as *const u8).add(byte_index)`
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#ptr_offset_with_cast
= note: `#[warn(clippy::ptr_offset_with_cast)]` on by default
warning: use of `offset` with a `usize` casted to an `isize`
--> /builds/bonzini/qemu/rust/target/debug/build/qemu_api-35cb647f4db404b8/out/bindings.inc.rs:68:13
|
68 | (core::ptr::addr_of_mut!((*this).storage) as *mut u8).offset(byte_index as isize);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: try: `(core::ptr::addr_of_mut!((*this).storage) as *mut u8).add(byte_index)`
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#ptr_offset_with_cast
This seems to be new in bindgen 0.71.0, possibly related to bindgen
commit 33006185b7878 ("Add raw_ref_macros feature", 2024-11-22).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* tag 'pull-qapi-2025-06-03' of https://repo.or.cz/qemu/armbru:
qapi: Improve documentation around job state @concluded
qapi: Tidy up references to job state CONCLUDED
qapi: Mention both job-cancel and block-job-cancel in doc comments
qapi: Refer to job-FOO instead of deprecated block-job-FOO in docs
qapi: Spell JSON null correctly in blockdev-reopen documentation
qapi: Use proper markup instead of CAPS for emphasis in doc comments
qapi: Fix capitalization in doc comments
qapi: Correct spelling of QEMU in doc comments
qapi: Drop a problematic (Since: 2.11) from query-hotpluggable-cpus
qapi: Avoid breaking lines within (since X.Y)
qapi: Move (since X.Y) to end of description
qapi: Tidy up whitespace in doc comments
qapi: Tidy up run-together sentences in doc comments
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
qapi: Improve documentation around job state @concluded
We use "the query list" in a few places. It's not entirely obvious
what that means. It's actually the output of query-jobs or
query-block-jobs.
Documentation of @auto-dismiss talks about the job disappearing from
the query list when it reaches state @concluded. This is less than
precise. The job doesn't merely disappear from the query list, it
disappears, period.
Documentation of JobStatus @concluded explains "the job will remain in
the query list until it is dismissed". Again less than precise. It
remains in state @concluded until dismissed.
Rephrase without use of "the query list" for clarity and precision.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250527073916.1243024-14-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
When talking about the job state machine, we refer to the states like
READY, ABORTING, CONCLUDED, and so forth. Except in two places, where
we use JOB_STATUS_CONCLUDED. Replace by CONCLUDED for consistency.
We should arguably use the JobStatus enum values instead. Left for
another day.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250527073916.1243024-13-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
qapi: Mention both job-cancel and block-job-cancel in doc comments
Several doc comments mention block-job-cancel where the more generic
job-cancel would also work. Adjust them to mention both.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250527073916.1243024-12-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
qapi: Refer to job-FOO instead of deprecated block-job-FOO in docs
We deprecated several block-job-FOO commands in commit b836bf2ab68 (qapi/block-core: deprecate some block-job- APIs). Update
the doc comments to refer to their replacements instead.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250527073916.1243024-11-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
qapi: Spell JSON null correctly in blockdev-reopen documentation
The doc comment misspells JSON null as NULL. Fix that.
Cc: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250527073916.1243024-10-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
qapi: Use proper markup instead of CAPS for emphasis in doc comments
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250527073916.1243024-9-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250527073916.1243024-8-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
Improve awkward phrasing in migrate-incoming While there.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250527073916.1243024-7-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
qapi: Drop a problematic (Since: 2.11) from query-hotpluggable-cpus
There is a (Since: 2.11) in a query-hotpluggable-cpus example.
Versioning information ought to be in the command description, not
examples. The command description is basically empty (there is a TODO
about it).
What exactly didn't work before 2.11 is not quite clear from the
documentation. The example was added in commit 4dc3b151882 (s390x:
implement query-hotpluggable-cpus), which suggests the command failed
for the s390x target until then. This was almost eight years ago, and
I doubt anyone still cares about this detail. Simply delete
the problematic (Since: 2.11).
Cc: David Hildenbrand <david@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250527073916.1243024-6-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250527073916.1243024-5-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
By convention, we put (since X.Y) at the end of the description. Move
the ones that somehow ended up in the middle of the description to the
end.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250527073916.1243024-4-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250527073916.1243024-3-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
qapi: Tidy up run-together sentences in doc comments
Fixes: a937b6aa739f (qapi: Reformat doc comments to conform to current conventions) Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250527073916.1243024-2-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
Tanish Desai [Wed, 28 May 2025 19:25:28 +0000 (19:25 +0000)]
trace/simple: seperate hot paths of tracing fucntions
This change improves performance by moving the hot path of the trace_vhost_commit()(or any other trace function) logic to the header file.
Previously, even when the trace event was disabled, the function call chain:-
trace_vhost_commit()(Or any other trace function) → _nocheck__trace_vhost_commit() → _simple_trace_vhost_commit()
incurred a significant function prologue overhead before checking the trace state.
Disassembly of _simple_trace_vhost_commit() (from the .c file) showed that 11 out of the first 14 instructions were prologue-related, including:
0x10 stp x29, x30, [sp, #-64]! Prologue: allocates 64-byte frame and saves old FP (x29) & LR (x30)
0x14 adrp x3, trace_events_enabled_count Prologue: computes page-base of the trace-enable counter
0x18 adrp x2, __stack_chk_guard Important (maybe prolog don't know?)(stack-protector): starts up the stack-canary load
0x1c mov x29, sp Prologue: sets new frame pointer
0x20 ldr x3, [x3] Prologue: loads the actual trace-enabled count
0x24 stp x19, x20, [sp, #16] Prologue: spills callee-saved regs used by this function (x19, x20)
0x28 and w20, w0, #0xff Tracepoint setup: extracts the low-8 bits of arg0 as the “event boolean”
0x2c ldr x2, [x2] Prologue (cont’d): completes loading of the stack-canary value
0x30 and w19, w1, #0xff Tracepoint setup: extracts low-8 bits of arg1
0x34 ldr w0, [x3] Important: loads the current trace-enabled flag from memory
0x38 ldr x1, [x2] Prologue (cont’d): reads the canary
0x3c str x1, [sp, #56] Prologue (cont’d): writes the canary into the new frame
0x40 mov x1, #0 Prologue (cont’d): zeroes out x1 for the upcoming branch test
0x44 cbnz w0, 0x88 Important: if tracing is disabled (w0==0) skip the heavy path entirely
The trace-enabled check happens after the prologue. This is wasteful when tracing is disabled, which is often the case in production.
To optimize this:
_nocheck__trace_vhost_commit() is now fully inlined in the .h file with
the hot path.It checks trace_event_get_state() before calling into _simple_trace_vhost_commit(), which remains in .c.
This avoids calling into the .c function altogether when the tracepoint is disabled, thereby skipping unnecessary prologue instructions.
This results in better performance by removing redundant instructions in the tracing fast path.
Signed-off-by: Tanish Desai <tanishdesai37@gmail.com>
Message-id: 20250528192528.3968-1-tanishdesai37@gmail.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Stefan Hajnoczi [Mon, 2 Jun 2025 18:52:44 +0000 (14:52 -0400)]
Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging
virtio,pci,pc: features, fixes, tests
vhost will now no longer set a call notifier if unused
some work towards loongarch testing based on bios-tables-test
some core pci work for SVM support in vtd
vhost vdpa init has been optimized for response time to QMP
A couple more fixes
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (26 commits)
hw/i386/pc_piix: Fix RTC ISA IRQ wiring of isapc machine
vdpa: move memory listener register to vhost_vdpa_init
vdpa: move iova_tree allocation to net_vhost_vdpa_init
vdpa: reorder listener assignment
vdpa: add listener_registered
vdpa: set backend capabilities at vhost_vdpa_init
vdpa: reorder vhost_vdpa_set_backend_cap
vdpa: check for iova tree initialized at net_client_start
vhost: Don't set vring call if guest notifier is unused
tests/qtest/bios-tables-test: Use MiB macro rather hardcode value
tests/data/uefi-boot-images: Add ISO image for LoongArch system
uefi-test-tools:: Add LoongArch64 support
pci: Add a PCI-level API for PRI
pci: Add a pci-level API for ATS
pci: Add a pci-level initialization function for IOMMU notifiers
memory: Store user data pointer in the IOMMU notifiers
pci: Add an API to get IOMMU's min page size and virtual address width
pci: Cache the bus mastering status in the device
pcie: Helper functions to check to check if PRI is enabled
pcie: Add a helper to declare the PRI capability for a pcie device
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Bernhard Beschow [Mon, 26 May 2025 20:38:20 +0000 (22:38 +0200)]
hw/i386/pc_piix: Fix RTC ISA IRQ wiring of isapc machine
Commit 56b1f50e3c10 ("hw/i386/pc: Wire RTC ISA IRQs in south bridges")
attempted to refactor RTC IRQ wiring which was previously done in
pc_basic_device_init() but forgot about the isapc machine. Fix this by
wiring in the code section dedicated exclusively to the isapc machine.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2961 Fixes: 56b1f50e3c10 ("hw/i386/pc: Wire RTC ISA IRQs in south bridges")
cc: qemu-stable Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Mark Cave-Ayland <mark.caveayland@nutanix.com>
Message-Id: <20250526203820.1853-1-shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Eugenio Pérez [Thu, 22 May 2025 14:58:39 +0000 (10:58 -0400)]
vdpa: move memory listener register to vhost_vdpa_init
Current memory operations like pinning may take a lot of time at the
destination. Currently they are done after the source of the migration is
stopped, and before the workload is resumed at the destination. This is a
period where neigher traffic can flow, nor the VM workload can continue
(downtime).
We can do better as we know the memory layout of the guest RAM at the
destination from the moment that all devices are initializaed. So
moving that operation allows QEMU to communicate the kernel the maps
while the workload is still running in the source, so Linux can start
mapping them.
As a small drawback, there is a time in the initialization where QEMU
cannot respond to QMP etc. By some testing, this time is about
0.2seconds. This may be further reduced (or increased) depending on the
vdpa driver and the platform hardware, and it is dominated by the cost
of memory pinning.
This matches the time that we move out of the called downtime window.
The downtime is measured as the elapsed trace time between the last
vhost_vdpa_suspend on the source and the last vhost_vdpa_set_vring_enable_one
on the destination. In other words, from "guest CPUs freeze" to the
instant the final Rx/Tx queue-pair is able to start moving data.
Using ConnectX-6 Dx (MLX5) NICs in vhost-vDPA mode with 8 queue-pairs,
the series reduces guest-visible downtime during back-to-back live
migrations by more than half:
- 39G VM: 4.72s -> 2.09s (-2.63s, ~56% improvement)
- 128G VM: 14.72s -> 5.83s (-8.89s, ~60% improvement)
Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <20250522145839.59974-8-jonah.palmer@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Eugenio Pérez [Thu, 22 May 2025 14:58:38 +0000 (10:58 -0400)]
vdpa: move iova_tree allocation to net_vhost_vdpa_init
As we are moving to keep the mapping through all the vdpa device life
instead of resetting it at VirtIO reset, we need to move all its
dependencies to the initialization too. In particular devices with
x-svq=on need a valid iova_tree from the beginning.
Simplify the code also consolidating the two creation points: the first
data vq in case of SVQ active and CVQ start in case only CVQ uses it.
Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Suggested-by: Si-Wei Liu <si-wei.liu@oracle.com> Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <20250522145839.59974-7-jonah.palmer@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Eugenio Pérez [Thu, 22 May 2025 14:58:37 +0000 (10:58 -0400)]
vdpa: reorder listener assignment
Since commit f6fe3e333f ("vdpa: move memory listener to
vhost_vdpa_shared") this piece of code repeatedly assign
shared->listener members. This was not a problem as it was not used
until device start.
However next patches move the listener registration to this
vhost_vdpa_init function. When the listener is registered it is added
to an embedded linked list, so setting its members again will cause
memory corruption to the linked list node.
Do the right thing and only set it in the first vdpa device.
Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <20250522145839.59974-6-jonah.palmer@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Eugenio Pérez [Thu, 22 May 2025 14:58:36 +0000 (10:58 -0400)]
vdpa: add listener_registered
Check if the listener has been registered or not, so it needs to be
registered again at start.
Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <20250522145839.59974-5-jonah.palmer@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Eugenio Pérez [Thu, 22 May 2025 14:58:35 +0000 (10:58 -0400)]
vdpa: set backend capabilities at vhost_vdpa_init
The backend does not reset them until the vdpa file descriptor is closed
so there is no harm in doing it only once.
This allows the destination of a live migration to premap memory in
batches, using VHOST_BACKEND_F_IOTLB_BATCH.
Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <20250522145839.59974-4-jonah.palmer@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Eugenio Pérez [Thu, 22 May 2025 14:58:34 +0000 (10:58 -0400)]
vdpa: reorder vhost_vdpa_set_backend_cap
It will be used directly by vhost_vdpa_init.
Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <20250522145839.59974-3-jonah.palmer@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Eugenio Pérez [Thu, 22 May 2025 14:58:33 +0000 (10:58 -0400)]
vdpa: check for iova tree initialized at net_client_start
To map the guest memory while it is migrating we need to create the
iova_tree, as long as the destination uses x-svq=on. Checking to not
override it.
The function vhost_vdpa_net_client_stop clear it if the device is
stopped. If the guest starts the device again, the iova tree is
recreated by vhost_vdpa_net_data_start_first or vhost_vdpa_net_cvq_start
if needed, so old behavior is kept.
Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <20250522145839.59974-2-jonah.palmer@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Huaitong Han [Thu, 22 May 2025 10:05:48 +0000 (18:05 +0800)]
vhost: Don't set vring call if guest notifier is unused
The vring call fd is set even when the guest does not use MSI-X (e.g., in the
case of virtio PMD), leading to unnecessary CPU overhead for processing
interrupts.
The commit 96a3d98d2c("vhost: don't set vring call if no vector") optimized the
case where MSI-X is enabled but the queue vector is unset. However, there's an
additional case where the guest uses INTx and the INTx_DISABLED bit in the PCI
config is set, meaning that no interrupt notifier will actually be used.
In such cases, the vring call fd should also be cleared to avoid redundant
interrupt handling.
Fixes: 96a3d98d2c("vhost: don't set vring call if no vector") Reported-by: Zhiyuan Yuan <yuanzhiyuan@chinatelecom.cn> Signed-off-by: Jidong Xia <xiajd@chinatelecom.cn> Signed-off-by: Huaitong Han <hanht2@chinatelecom.cn>
Message-Id: <20250522100548.212740-1-hanht2@chinatelecom.cn> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Bibo Mao [Tue, 20 May 2025 13:01:53 +0000 (21:01 +0800)]
tests/qtest/bios-tables-test: Use MiB macro rather hardcode value
Replace 1024 * 1024 with MiB macro.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Message-Id: <20250520130158.767083-4-maobibo@loongson.cn> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Bibo Mao [Tue, 20 May 2025 13:01:52 +0000 (21:01 +0800)]
tests/data/uefi-boot-images: Add ISO image for LoongArch system
To test ACPI tables, edk2 needs to be booted with a disk image having
EFI partition. This image is created using UefiTestToolsPkg.
The image is generated with the following command:
make -f tests/uefi-test-tools/Makefile
Signed-off-by: Bibo Mao <maobibo@loongson.cn> Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Message-Id: <20250520130158.767083-3-maobibo@loongson.cn> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Bibo Mao [Tue, 20 May 2025 13:01:51 +0000 (21:01 +0800)]
uefi-test-tools:: Add LoongArch64 support
Add support to build bios-tables-test iso image for LoongArch system.
Signed-off-by: Bibo Mao <maobibo@loongson.cn> Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Message-Id: <20250520130158.767083-2-maobibo@loongson.cn> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
A device can send a PRI request to the IOMMU using pci_pri_request_page.
The PRI response is sent back using the notifier managed with
pci_pri_register_notifier and pci_pri_unregister_notifier.
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com> Co-authored-by: Ethan Milon <ethan.milon@eviden.com>
Message-Id: <20250520071823.764266-12-clement.mathieu--drif@eviden.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Devices implementing ATS can send translation requests using
pci_ats_request_translation. The invalidation events are sent
back to the device using the iommu notifier managed with
pci_iommu_register_iotlb_notifier / pci_iommu_unregister_iotlb_notifier.
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com> Co-authored-by: Ethan Milon <ethan.milon@eviden.com>
Message-Id: <20250520071823.764266-11-clement.mathieu--drif@eviden.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
pci: Add a pci-level initialization function for IOMMU notifiers
This is meant to be used by ATS-capable devices.
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Message-Id: <20250520071823.764266-10-clement.mathieu--drif@eviden.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
memory: Store user data pointer in the IOMMU notifiers
This will help developers of ATS-capable devices to track a state.
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Message-Id: <20250520071823.764266-9-clement.mathieu--drif@eviden.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
pci: Add an API to get IOMMU's min page size and virtual address width
This kind of information is needed by devices implementing ATS in order
to initialize their translation cache.
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Message-Id: <20250520071823.764266-8-clement.mathieu--drif@eviden.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The cached is_master value is necessary to know if a device is
allowed to issue ATS/PRI requests or not as these operations do not go
through the master_enable memory region.
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Message-Id: <20250520071823.764266-7-clement.mathieu--drif@eviden.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
pcie: Helper functions to check to check if PRI is enabled
pri_enabled can be used to check whether the capability is present and
enabled on a PCIe device
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Message-Id: <20250520071823.764266-6-clement.mathieu--drif@eviden.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
pcie: Add a helper to declare the PRI capability for a pcie device
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Message-Id: <20250520071823.764266-5-clement.mathieu--drif@eviden.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
ats_enabled checks whether the capability is
present or not. If so, we read the configuration space to get
the status of the feature (enabled or not).
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Message-Id: <20250520071823.764266-4-clement.mathieu--drif@eviden.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
pcie: Helper functions to check if PASID is enabled
pasid_enabled checks whether the capability is
present or not. If so, we read the configuration space to get
the status of the feature (enabled or not).
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Message-Id: <20250520071823.764266-3-clement.mathieu--drif@eviden.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
pcie: Add helper to declare PASID capability for a pcie device
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Message-Id: <20250520071823.764266-2-clement.mathieu--drif@eviden.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Vasant Hegde [Fri, 16 May 2025 10:05:35 +0000 (15:35 +0530)]
hw/i386/amd_iommu: Fix xtsup when vcpus < 255
If vCPUs > 255 then x86 common code (x86_cpus_init()) call kvm_enable_x2apic().
But if vCPUs <= 255 then the common code won't calls kvm_enable_x2apic().
This is because commit 8c6619f3e692 ("hw/i386/amd_iommu: Simplify non-KVM
checks on XTSup feature") removed the call to kvm_enable_x2apic when xtsup
is "on", which break things when guest is booted with x2apic mode and
there are <= 255 vCPUs.
Fix this by adding back kvm_enable_x2apic() call when xtsup=on.
Sairaj Kodilkar [Fri, 16 May 2025 10:05:34 +0000 (15:35 +0530)]
hw/i386/amd_iommu: Fix device setup failure when PT is on.
Commit c1f46999ef506 ("amd_iommu: Add support for pass though mode")
introduces the support for "pt" flag by enabling nodma memory when
"pt=off". This allowed VFIO devices to successfully register notifiers
by using nodma region.
But, This also broke things when guest is booted with the iommu=nopt
because, devices bypass the IOMMU and use untranslated addresses (IOVA) to
perform DMA reads/writes to the nodma memory region, ultimately resulting in
a failure to setup the devices in the guest.
Fix the above issue by always enabling the amdvi_dev_as->iommu memory region.
But this will once again cause VFIO devices to fail while registering the
notifiers with AMD IOMMU memory region.
Fixes: c1f46999ef506 ("amd_iommu: Add support for pass though mode") Signed-off-by: Sairaj Kodilkar <sarunkod@amd.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Message-Id: <20250516100535.4980-2-sarunkod@amd.com> Fixes: c1f46999ef506 ("amd_iommu: Add support for pass though mode") Signed-off-by: Sairaj Kodilkar <sarunkod@amd.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Yuri Benditovich [Thu, 15 May 2025 06:32:37 +0000 (09:32 +0300)]
virtio: check for validity of indirect descriptors
virtio processes indirect descriptors even if the respected
feature VIRTIO_RING_F_INDIRECT_DESC was not negotiated.
If qemu is used with reduced set of features to emulate the
hardware device that does not support indirect descriptors,
the will probably trigger problematic flows on the hardware
setup but do not reveal the mistake on qemu.
Add LOG_GUEST_ERROR for such case. This will issue logs with
'-d guest_errors' in the command line
Stefan Hajnoczi [Fri, 30 May 2025 15:41:21 +0000 (11:41 -0400)]
Merge tag 'pull-target-arm-20250530-2' of https://git.linaro.org/people/pmaydell/qemu-arm into staging
target-arm queue:
* hw/arm: Add GMAC devices to NPCM8XX SoC
* hw/arm: Add missing psci_conduit to NPCM8XX SoC boot info
* docs/interop: convert text files to restructuredText
* target/arm: Some minor refactorings
* tests/functional: Add a test for the Stellaris arm machines
* hw/block: Drop unused nand.c
* tag 'pull-target-arm-20250530-2' of https://git.linaro.org/people/pmaydell/qemu-arm:
hw/block: Drop unused nand.c
tests/functional: Add a test for the Stellaris arm machines
target/arm/hvf: Include missing 'cpu-qom.h' header
target/arm/kvm: Include missing 'cpu-qom.h' header
target/arm/qmp: Include missing 'cpu.h' header
target/arm/cpu-features: Include missing 'cpu.h' header
hw/arm/boot: Include missing 'system/memory.h' header
target/arm/cpregs: Include missing 'target/arm/cpu.h' header
target/arm: Only link with zlib when TCG is enabled
target/arm/hvf_arm: Avoid using poisoned CONFIG_HVF definition
target/arm/tcg-stubs: compile file once (system)
docs/interop: convert text files to restructuredText
hw/arm: Add missing psci_conduit to NPCM8XX SoC boot info
tests/qtest: Migrate GMAC test from 7xx to 8xx
hw/arm: Add GMAC devices to NPCM8XX SoC
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Stefan Hajnoczi [Fri, 30 May 2025 15:41:13 +0000 (11:41 -0400)]
Merge tag 'pull-request-2025-05-30' of https://gitlab.com/thuth/qemu into staging
* Functional tests improvements
* Endianness improvements/clean-ups for the Microblaze machines
* Remove obsolete -2.4 and -2.5 i440fx and q35 machine types and related code
Stefan Hajnoczi [Fri, 30 May 2025 15:41:07 +0000 (11:41 -0400)]
Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging
* target/i386/kvm: Intel TDX support
* target/i386/emulate: more lflags cleanups
* meson: remove need for explicit listing of dependencies in hw_common_arch and
target_common_arch
* rust: small fixes
* hpet: Reorganize register decoding to be more similar to Rust code
* target/i386: fixes for AMD models
* target/i386: new EPYC-Turin CPU model
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (77 commits)
target/i386/tcg/helper-tcg: fix file references in comments
target/i386: Add support for EPYC-Turin model
target/i386: Update EPYC-Genoa for Cache property, perfmon-v2, RAS and SVM feature bits
target/i386: Add couple of feature bits in CPUID_Fn80000021_EAX
target/i386: Update EPYC-Milan CPU model for Cache property, RAS, SVM feature bits
target/i386: Update EPYC-Rome CPU model for Cache property, RAS, SVM feature bits
target/i386: Update EPYC CPU model for Cache property, RAS, SVM feature bits
rust: make declaration of dependent crates more consistent
docs: Add TDX documentation
i386/tdx: Validate phys_bits against host value
i386/tdx: Make invtsc default on
i386/tdx: Don't treat SYSCALL as unavailable
i386/tdx: Fetch and validate CPUID of TD guest
target/i386: Print CPUID subleaf info for unsupported feature
i386: Remove unused parameter "uint32_t bit" in feature_word_description()
i386/cgs: Introduce x86_confidential_guest_check_features()
i386/tdx: Define supported KVM features for TDX
i386/tdx: Add XFD to supported bit of TDX
i386/tdx: Add supported CPUID bits relates to XFAM
i386/tdx: Add supported CPUID bits related to TD Attributes
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* tag 'pull-nbd-2025-05-29' of https://repo.or.cz/qemu/ericb:
iotests: Filter out ZFS in several tests
iotests: Improve mirror-sparse on ext4 and xfs
iotests: Use disk_usage in more places
nbd: Set unix socket send buffer on Linux
nbd: Set unix socket send buffer on macOS
io: Add helper for setting socket send buffer size
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
tests/unit/test-util-sockets: fix mem-leak on error object
The test fails with --enable-asan as the error struct is never freed.
In the case where the test expects a success but it fails, let's also
report the error for debugging (it will be freed internally).
Fixes 316e8ee8d6 ("util/qemu-sockets: Refactor inet_parse() to use QemuOpts")
Signed-off-by: Matheus Tavares Bernardino <matheus.bernardino@oss.qualcomm.com> Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Message-ID: <518d94c7db20060b2a086cf55ee9bffab992a907.1748280011.git.matheus.bernardino@oss.qualcomm.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
hw/net/vmxnet3: Merge DeviceRealize in InstanceInit
Simplify merging vmxnet3_realize() within vmxnet3_instance_init(),
removing the need for device_class_set_parent_realize().
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-20-philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
VMXNET3_COMPAT_FLAG_DISABLE_PCIE was only used by the
hw_compat_2_5[] array, via the 'x-disable-pcie=on' property.
We removed all machines using that array, lets remove all the
code around VMXNET3_COMPAT_FLAG_DISABLE_PCIE.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-19-philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
VMXNET3_COMPAT_FLAG_OLD_MSI_OFFSETS was only used by the
hw_compat_2_5[] array, via the 'x-old-msi-offsets=on' property.
We removed all machines using that array, lets remove all the
code around VMXNET3_COMPAT_FLAG_OLD_MSI_OFFSETS.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-18-philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
Simplify replacing pvscsi_realize() by pvscsi_instance_init(),
removing the need for device_class_set_parent_realize().
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-17-philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
PVSCSI_COMPAT_DISABLE_PCIE_BIT was only used by the
hw_compat_2_5[] array, via the 'x-disable-pcie=on' property.
We removed all machines using that array, lets remove all the
code around PVSCSI_COMPAT_DISABLE_PCIE_BIT, including the now
unused PVSCSIState::compat_flags field.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-16-philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
PVSCSI_COMPAT_OLD_PCI_CONFIGURATION was only used by the
hw_compat_2_5[] array, via the 'x-old-pci-configuration=on'
property. We removed all machines using that array, lets remove
all the code around PVSCSI_COMPAT_OLD_PCI_CONFIGURATION.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-15-philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
The hw_compat_2_5[] array was only used by the pc-q35-2.5 and
pc-i440fx-2.5 machines, which got removed. Remove it.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-13-philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-12-philmd@linaro.org>
[thuth: Fix error from check_patch.pl wrt to an empty "for" loop] Signed-off-by: Thomas Huth <thuth@redhat.com>
hw/i386/x86: Remove X86MachineClass::save_tsc_khz field
The X86MachineClass::save_tsc_khz boolean was only used
by the pc-q35-2.5 and pc-i440fx-2.5 machines, which got
removed. Remove it and simplify tsc_khz_needed().
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-11-philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
hw/i386/pc: Remove deprecated pc-q35-2.5 and pc-i440fx-2.5 machines
These machines has been supported for a period of more than 6 years.
According to our versioned machine support policy (see commit ce80c4fa6ff "docs: document special exception for machine type
deprecation & removal") they can now be removed.
Remove the now unused empty pc_compat_2_5[] array.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-10-philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
VIRTIO_PCI_FLAG_DISABLE_PCIE was only used by the hw_compat_2_4[]
array, via the 'x-disable-pcie=false' property. We removed all
machines using that array, lets remove all the code around
VIRTIO_PCI_FLAG_DISABLE_PCIE (see commit 9a4c0e220d8 for similar
VIRTIO_PCI_FLAG_* enum removal).
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-9-philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>