git.ipfire.org Git - thirdparty/qemu.git/log

i386/cpu: Add CPU model for Diamond Rapids

According to table 1-2 in Intel Architecture Instruction Set Extensions
and Future Features (rev 059), Diamond Rapids has the following new
features which have already been supported for guest:
* SM4 (EVEX)
* Intel Advanced Vector Extensions 10 Version 2 (Intel AVX10.2)
* MOVRS and the PREFETCHRST2 instruction
* AMX-MOVRS, AMX-AVX512, AMX-FP8, AMX-TF32
* Intel Advanced Performance Extensions

And FRED - Flexible Return and Event Delivery (FRED) and the LKGS
instruction (introduced since Clearwater Forest & Diamond Rapids) - is
included in Diamond Rapids CPU model.

In addition, the following features are added into Diamond Rapids CPU
model:
* CET: Control-flow Enforcement Technology (introduced since Sapphire
Rapids & Sierra Forest).

Tested-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251215073743.4055227-11-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/cpu: Define dependency for VMX_VM_ENTRY_LOAD_IA32_FRED

VMX_VM_ENTRY_LOAD_IA32_FRED depends on FRED. Define this dependency
relationship.

Tested-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251215073743.4055227-10-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/cpu: Add an option in X86CPUDefinition to control CPUID 0x1f

Many Intel CPUs enable CPUID 0x1f by default to encode CPU topology
information.

Add the "cpuid_0x1f" option to X86CPUDefinition to allow named CPU
models to configure CPUID 0x1f from the start, thereby forcing 0x1f
to be present for guest.

With this option, there's no need to explicitly add v1 model to an
unversioned CPU model for explicitly enabling the x-force-cpuid-0x1f
property.

Tested-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251215073743.4055227-9-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/cpu: Allow cache to be shared at thread level

In CPUID 0x4 leaf, it's possible to make the cache privated at thread
level when there's no HT within the core. In this case, while cache per
thread and cache per core are essentially identical, their topology
information differs in CPUID 0x4.

Diamond Rapids assigns the L1 i/d cache at the thread level. To allow
accurate emulation of DMR cache topology, remove the cache-per-thread
restriction in max_thread_ids_for_cache(), which enables CPUID 0x4 to
support cache per thread topology.

Given that after adding thread-level support, the topology offset
information required by max_thread_ids_for_cache() can be sufficiently
provided by apicid_offset_by_topo_level(), so it's straightforward to
re-implement max_thread_ids_for_cache() based on
apicid_offset_by_topo_level() to reduce redundant duplicate codes.

Tested-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251215073743.4055227-8-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/cpu: Allow unsupported avx10_version with x-force-features

The "force_features" ("x-force-features" property) forces setting
feature even if host doesn't support, but also reports the warning.

Given its function, it's useful for debug, so even if the AVX10
version is unsupported by host, force to set this AVX10 version if
x-force-features=on.

Tested-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251215073743.4055227-7-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/cpu: Add a helper to get host avx10 version

Factor out a helper to get host avx10 version, to reduce duplicate
codes.

Tested-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251215073743.4055227-6-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/cpu: Support AVX10.2 with AVX10 feature models

Intel AVX10 Version 2 (Intel AVX10.2) includes a suite of new
instructions delivering new AI features and performance, accelerated
media processing, expanded Web Assembly, and Cryptography support, along
with enhancements to existing legacy instructions for completeness and
efficiency, and it is enumerated as version 2 in CPUID 0x24.0x0.EBX[bits
0-7] [*].

Considerring "Intel CPUs which support Intel AVX10.2 will include an
enumeration for AVX10_VNNI_INT (CPUID.24H.01H:ECX.AVX10_VNNI_INT[2])"
[*] and EVEX VPDP* instructions for INT8/INT16 (AVX10_VNNI_INT) are
detected by either AVX10.2 OR AVX10_VNNI_INT, AVX10_VNNI_INT is part of
AVX10.2, so any Intel AVX10.2 implementation lacking the AVX10_VNNI_INT
enumeration should be considered buggy hardware.

Therefore, it's necessary to set AVX10_VNNI_INT enumeration for Guest
when the user specifies AVX10 version 2. For this, introduce AVX10
models to explicitly define the feature bits included in different AVX10
versions.

[*]: Intel Advanced Vector Extensions 10.2 Architecture Specification
(rev 5.0).

Tested-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251215073743.4055227-5-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/cpu: Add support for AVX10_VNNI_INT in CPUID enumeration

AVX10_VNNI_INT (0x24.0x1.ECX[bit 2]) is a discrete feature bit
introduced on Intel Diamond Rapids, which enumerates the support for
EVEX VPDP* instructions for INT8/INT16 [*].

Although Intel AVX10.2 has already included new VPDP* INT8/INT16 VNNI
instructions, a bit - AVX10_VNNI_INT - is still be separated. Relevant
new instructions can be checked by either CPUID AVX10.2 OR
AVX10_VNNI_INT (e.g., VPDPBSSD).

Support CPUID 0x24.0x1 subleaf with AVX10_VNNI_INT enumeration for
Guest.

[*]: Intel Advanced Vector Extensions 10.2 Architecture Specification
(rev 5.0).

Tested-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251215073743.4055227-4-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/cpu: Add CPUID.0x1E.0x1 subleaf for AMX instructions

Intel Diamond Rapids adds new AMX instructions to support new formats
and memory operations [*]. And it introduces the CPUID subleaf 0x1E.0x1
to centralize the discrete AMX feature bits within EAX.

For new feature bits (CPUID 0x1E.0x1.EAX[bits 4,6-8]), it's
straightforward to add their enurmeration support.

In addition to the new features, CPUID 0x1E.0x1.EAX[bits 0-3] are
mirrored positions of existing AMX feature bits distributing across
the 0x7 leaves. It's not flexible to make these mirror bits have the
same names as existing ones, because QEMU would try to set both original
bit and mirror bit which would cause warning if host doesn't support
0x1E.0x1 subleaf. Thus, name these mirror bits with "*-mirror" suffix.

[*]: Intel Architecture Instruction Set Extensions and Future Features
(rev.059).

Tested-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251215073743.4055227-3-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/cpu: Add support for MOVRS in CPUID enumeration

MOVRS is a new set of instructions introduced in the Intel platform
Diamond Rapids, to load instructions that carry a read-shared hint.

Functionally, MOVRS family is equivalent to existing load instructions,
but its read-shared hint indicates the source memory location is likely
to become read-shared by multiple processors, i.e., read in the future
by at least one other processor before it is written (assuming it is
ever written in the future). It could optimize the behavior of the
caches, especially shared caches, for this data for future reads by
multiple processors. Additionally, MOVRS family also includes a software
prefetch instruction, PREFETCHRST2, that carries the same read-shared
hint. [*]

MOVRS family is enumerated by CPUID single-bit (0x7.0x1.EAX[bit 31]).
Add its enumeration support.

[*]: Intel Architecture Instruction Set Extensions and Future Features
(rev.059).

Tested-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251215073743.4055227-2-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

run: introduce a script for running devel commands

Various aspects of the development workflow are complicated by the need
to set env variables ahead of time, or use specific paths. Meson
provides a 'devenv' command that can be used to launch a command with a
number of appropriate project specific environment variables preset.

By default it will modify $PATH to point to any build directory that
contains a binary built by the project.

This further augments that to replicate the venv 'activate' script:

* Add $BUILD_DIR/pyvenv/bin to $PATH
* Set VIRTUAL_ENV to $BUILD_DIR/pyvenv

And then makes functional tests more easily executable

* Add $SRC_DIR/tests/functional and $SRC_DIR/python to $PYTHONPATH

To see the benefits of this consider this command:

  $ source ./build/pyvenv/bin/activate
  $ ./scripts/qmp/qmp-shell-wrap ./build/qemu-system-x86_64

which is now simplified to

  $ ./build/run ./scripts/qmp/qmp-shell-wrap qemu-system-x86_64 [args..]

This avoids the need repeat './build' several times and avoids polluting
the current terminal's environment and/or avoids errors from forgetting
to source the venv settings.

As another example running functional tests

  $ export PYTHONPATH=./python:./tests/functional
  $ export QEMU_TEST_QEMU_BINARY=./build/qemu-system-x86_64
  $ build/pyvenv/bin/python3 ./tests/functional/x86_64/test_virtio_version.py

which is now simplified to

  $ export QEMU_TEST_QEMU_BINARY=qemu-system-x86_64
  $ ./build/run ./tests/functional/x86_64/test_virtio_version.py

This usefulness of this will be further enhanced with the pending
removal of the QEMU python APIs from git, as that will require the use
of the python venv in even more scenarios that today.

The 'run' script does not let 'meson devenv' directly launch the command
to be run because it always requires $BUILD_DIR as the current working
directory. It is desired that 'run' script always honour the current
working directory of the terminal that invokes is. Thus the '--dump'
flag is used to export the devenv variables into the 'run' script's
shell.

This takes the liberty to assign 'run.in' to the "Build system" section
in the MAINTAINERS file, given that it leverages meson's 'devenv'
feature.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Link: https://lore.kernel.org/r/20251222113859.182395-1-berrange@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

gitlab-ci: enable rust for msys2-64bit

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Link: https://lore.kernel.org/r/20250924120426.2158655-24-marcandre.lureau@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust: skip compilation if there are no system emulators

Otherwise, the Rust crates require the corresponding C code
(e.g. migration/ for rust/migration/) but the dependencies of
that C code, for example the trace files, have not been built.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386/tcg: commonize code to compute SF/ZF/PF

PF/ZF/SF are computed the same way for almost all CC_OP values (depending
only on the operand size in the case of ZF and SF). The only exception is
PF for CC_OP_BLSI* and CC_OP_BMILG*; but AMD documents that PF should
be computed normally (rather than being undefined) so that is a kind of
bug fix.

Put the common code at the end of helper_cc_compute_all, shaving
another kB from its text.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386/tcg: move fetch code out of translate.c

Let translate.c only concern itself with TCG code generation. Move everything
that uses CPUX86State*, as well as gen_lea_modrm_0 now that it is only used
to fill decode->mem, to decode-new.c.inc.

While at it also rename gen_lea_modrm_0.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386/tcg: add a CCOp for SBB x,x

This is more efficient both when generating code and when testing
flags.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386/tcg: kill tmp2_i32

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386/tcg: kill tmp1_i64

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386/tcg: unify more pop/no-pop x87 instructions

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386/tcg: reuse gen_helper_fp_arith_ST0_FT0 for undocumented fcom/fcomp variants

For 0x32 hack the op to be fcomp; for the others there isn't even anything special
to do.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386/tcg: reuse gen_helper_fp_arith_ST0_FT0 for fcom STn and fcomp STn

Treat specially the undocumented ops, instead of treating specially the
two d8/0 opcodes that have undocumented variants: just call
gen_helper_fp_arith_ST0_FT0 for all opcodes in the d8/0 encoding.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386/tcg: move fcom/fcomp differentiation to gen_helper_fp_arith_ST0_FT0

There is only one call site for gen_helper_fp_arith_ST0_FT0(), therefore
there is no need to check the op1 == 3 in the caller. Once this is done,
eliminate the goto to that call site.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386/tcg: unnest switch statements in disas_insn_x87

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386/tcg: simplify effective address calculation

Split gen_lea_v_seg_dest into three simple phases (extend from
16 bits, add, final extend), with optimization for known-zero bases
to avoid back-to-back extensions.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386/tcg: move and expand misplaced comment

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386/tcg: remove do_decode_0F

It is not needed anymore since all prefixes are handled by the
new decoder.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386/tcg: do not compute all flags for SAHF

Only OF is needed, the others are overwritten.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386/tcg: mark more instructions that are invalid in 64-bit mode

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386/tcg: update cc_op after PUSHF

PUSHF already needs to compute the full eflags; set the cc_op to
CC_OP_EFLAGS to avoid duplicate computation in subsequent code.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386/tcg: ignore V3 in 32-bit mode

From the manual: "In 64-bit mode all 4 bits may be used. [...]
In 32-bit and 16-bit modes bit 6 must be 1 (if bit 6 is not 1, the
2-byte VEX version will generate LDS instruction and the 3-byte VEX
version will ignore this bit)."

Cc: qemu-stable@nongnu.org
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: Fix #GP error code for INT instructions

While the (intno << shift) expression is correct for indexing the IDT based on
whether Long Mode is active, the error code itself was unchanged with AMD64,
and is still the index with 3 bits of metadata in the bottom.

Found when running a Xen unit test, all under QEMU.  The unit test objected to
being told there was an error with IDT index 256 when INT $0x80 (128) was the
problem instruction:

  ...
  Error: Unexpected fault 0x800d0802, #GP[IDT[256]]
  ...

Fixes: d2fd1af76777 ("x86_64 linux user emulation")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Link: https://lore.kernel.org/r/20250312000603.3666083-1-andrew.cooper3@citrix.com
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3160
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/cpu: Drop incorrect comment for CPUID 0x1E

The information (tmul_maxk and tmul_maxn) in CPUID 0x1E.0x0.EBX is
defined for architecture, not for SPR.

This is to say, these "hardcoded" values won't change in future. If
the TMUL component needs to be extended for new palettes, there'll
likely be the new TMUL instructions, or new types of AMX instructions
that are _parallel_ to TMUL that operate in particular palettes,
instead of changing current tmul_maxk and tmul_maxn fields in CPUID
0x1E.0x0.EBX.

Furthermore, the previous attempt [*] to make the 0x1E.0x0.EBX fields
user-configurable is incorrect and unnecessary.

Therefore, drop the incorrect and misleading comment.

[*]: https://lore.kernel.org/qemu-devel/20230106083826.5384-2-lei4.wang@intel.com/

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Link: https://lore.kernel.org/r/20251118080837.837505-3-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/cpu: Drop incorrect comment for CPUID 0x1D

The information in CPUID 0x1D.0x1 is for tile palette 1, and is not
SPR-specific.

This is to say, these "hardcoded" values won't change in future. If
the palette needs to be extended, a new tile palette (maybe in a new
subleaf) will be introduced instead of changing current information of
tile palette 1.

Furthermore, the previous attempt [*] to make the 0x1D.0x1 fields
user-configurable is incorrect and unnecessary.

Therefore, drop the incorrect and misleading comment.

[*]: https://lore.kernel.org/qemu-devel/20230106083826.5384-2-lei4.wang@intel.com/

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Link: https://lore.kernel.org/r/20251118080837.837505-2-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

meson: let Meson handle mixed-language linking of Rust and C objects

With the bump to Meson 1.10.0, C objects can be passed to rust targets.
This way, the Rust libstd will be added by rustc itself in its final
linker invocation. Use that to eliminate the staticlib and allow
dynamic linking with libstd (also introduced by Meson 1.9.0, but not
for staticlib crates due to lack of support in rustc).

The main() function is still provided by C, which is possible by
declaring the main source file of the Rust executable (which is
still created by scripts/rust/rust_root_crate.sh) as #![no_main].

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust: Meson now adds -Cdefault-linker-libraries

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

cirrus/macos: enable Rust

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Link: https://lore.kernel.org/r/20250924120426.2158655-25-marcandre.lureau@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

lcitool: enable Rust for Windows cross targets

The issue that is mentioned in the comment has been fixed.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust: only link the Rust part of the code into devices

Do not include libqemuutil in the device crates for the same
reason as in the previous commit. Static libraries like qemuutil
are sensitive to their position on the command line and rustc does not
always get it right.

If rustc places the library too early on the command line, the stubs
are included in the final link product, which results in duplicate
symbols.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust: Do not link qemuutil into Rust rlibs

Commit de037ab8d83d removed qemuutil dependency from chardev and util
rust crates.  However it stayed in the _util_rs static library.  The
dependency is also defined as `link_with`, which is fine for C targets,
where the resulting archive gets linked as another parameter on the
command line when it is a static library.

However, when a C library is linked into a Rust rlib, rustc remembers
the dependency into the metadata and adds the library to the linker
command line.

Unfortunately, static libraries are sensitive to their
position on the command line and rustc does not always get it right.
Fortunately, simply removing it from dependencies of any rust libraries
and instead adding them into the dependencies of executables and
doctests fixes the behaviour.

Without this patch the error I get is:

  FAILED: [code=1] rust/tests/rust-integration
  ...
  = note: rust-lld: error: unable to find library -l:libqemuutil.a
          rust-lld: error: unable to find library -l:libvhost-user-glib.a
          rust-lld: error: unable to find library -l:libvhost-user.a
          rust-lld: error: unable to find library -l:libqemuutil.a
          rust-lld: error: unable to find library -l:libvhost-user-glib.a
          rust-lld: error: unable to find library -l:libvhost-user.a
          rust-lld: error: unable to find library -l:libqemuutil.a
          rust-lld: error: unable to find library -l:libvhost-user-glib.a
          rust-lld: error: unable to find library -l:libvhost-user.a
          rust-lld: error: unable to find library -l:libqemuutil.a
          rust-lld: error: unable to find library -l:libvhost-user-glib.a
          rust-lld: error: unable to find library -l:libvhost-user.a
          rust-lld: error: unable to find library -l:libqemuutil.a
          rust-lld: error: unable to find library -l:libvhost-user-glib.a
          rust-lld: error: unable to find library -l:libvhost-user.a
          rust-lld: error: unable to find library -l:libqemuutil.a
          rust-lld: error: unable to find library -l:libvhost-user-glib.a
          rust-lld: error: unable to find library -l:libvhost-user.a
          collect2: error: ld returned 1 exit status

Meson could work around it itself by never adding these static libraries
to the rlibs (after all, Meson tracks the transitive dependencies already
and knows how to add them to dependents of those rlibs); at least for now,
do it in QEMU: never link C libraries into Rust rlibs, and add them to the
final build products only.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

bump meson wheel to 1.10.0

This version includes several improvements and bugfixes for Rust.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

tests/meson: do not reuse migration_files variable

The variable is defined in migration/meson.build, reusing it is confusing.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

build: do not include @block.syms/@qemu.sys with modules disabled

Including specific symbols used by modules is not necessary for
monolithic executables. This avoids a failure where emcc does not
support @file syntax inside a response file---which in turn breaks
the WebAssembly build if the command line is long enough that meson
decides to use a response file.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/cpu: Mark APX xstate as migratable

APX xstate is user xstate. The related registers are cached in
X86CPUState. And there's a vmsd "vmstate_apx" to migrate these
registers.

Thus, it's safe to mark it as migratable.

Tested-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211070942.3612547-10-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/cpu: Support APX CPUIDs

APX is enumerated by CPUID.(EAX=0x7, ECX=1).EDX[21]. And this feature
bit also indicates the existence of dedicated CPUID leaf 0x29, called
the Intel APX Advanced Performance Extensions Leaf.

This new CPUID leaf now is populated with enumerations for a select
set of Intel APX sub-features.

CPUID.(EAX=0x29, ECX=0)
- EAX
   * Maximum Subleaf CPUID.(EAX=0x29, ECX=0).EAX[31:0] = 0
- EBX
   * Reserved CPUID.(EAX=0x29, ECX=0).EBX[31:1] = 0
   * APX_NCI_NDD_NF CPUID.(EAX=0x29, ECX=0).EBX[0:0] = 1, which
     enumerates the presence of New Conditional Instructions (NCIs),
     explicit New Data Destination (NDD) controls, and explicit Flags
     Suppression (NF) controls for select sets of EVEX-encoded Intel
     APX instructions (present in EVEX map=4, and EVEX map=2 0x0F38).
- ECX
   * Reserved CPUID.(EAX=0x29, ECX=0).ECX[31:0] = 0
- EDX
   * Reserved CPUID.(EAX=0x29, ECX=0).EDX[31:0] = 0

Note, APX_NCI_NDD_NF is documented as always enabled for Intel
processors since APX spec (revision v7.0). Now any Intel processor
that enumerates support for APX_F (CPUID.(EAX=0x7, ECX=1).EDX[21])
will also enumerate support for APX_NCI_NDD_NF.

Tested-by: Xudong Hao <xudong.hao@intel.com>
Co-developed-by: Zide Chen <zide.chen@intel.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
Co-developed-by: Peter Fang <peter.fang@intel.com>
Signed-off-by: Peter Fang <peter.fang@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211070942.3612547-9-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/cpu: Add APX migration support

Add a VMStateDescription to migrate APX EGPRs.

Tested-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211070942.3612547-8-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/monitor: Support EGPRs in hmp_print

Add EGPRs in monitor_defs[] to allow HMP to access EGPRs.

For example,

(qemu) print $r16

Since monitor_defs[] is used for read-only case, no need to consider
xstate synchronization issues that might be caused by modifying EGPRs
(like what gdbstub did).

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211070942.3612547-7-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/cpu-dump: Dump entended GPRs for APX supported guest

Dump EGPRs when guest supports APX.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211070942.3612547-6-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/gdbstub: Add APX support for gdbstub

Add i386-64bit-apx.xml from gdb to allow QEMU gdbstub parse APX EGPRs,
and implement the callbacks to allow gdbstub access EGPRs of guest.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211070942.3612547-5-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/cpu: Cache EGPRs in CPUX86State

Expend general registers array "regs" of CPUX86State to cache entended
GPRs.

Tested-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211070942.3612547-4-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/machine: Use VMSTATE_UINTTL_SUB_ARRAY for vmstate of CPUX86State.regs

Before expanding the number of elements in the CPUX86State.regs array,
first use VMSTATE_UINTTL_SUB_ARRAY for the regs' vmstate to avoid the
type_check_array failure.

VMSTATE_UINTTL_SUB_ARRAY will also be used for subsequently added elements
in regs array.

Tested-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211070942.3612547-3-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/cpu: Add APX EGPRs into xsave area

APX feature bit is in CPUID_7_1_EDX[21], and APX has EGPR component with
index 19 in xstate area, EGPR component has 16 64bit regs. Add EGRP
component into xstate area.

Note, APX re-uses the 128-byte XSAVE area that had been previously
allocated by MPX which has been deprecated on Intel processors, so check
whether APX and MPX are set at the same for Guest, if this case happens,
mask off them both to avoid conflict for xsave area.

Tested-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211070942.3612547-2-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/tdx: Add CET SHSTK/IBT into the supported CPUID by XFAM

So that it can be configured in TD guest.

And considerring CET_U and CET_S bits are always same in supported
XFAM reported by TDX module, i.e., either 00 or 11. So, only need to
choose one of them.

Tested-by: Farrah Chen <farrah.chen@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-23-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/tdx: Fix missing spaces in tdx_xfam_deps[]

The checkpatch.pl always complains: "ERROR: space required after that
close brace '}'".

Fix this issue.

Tested-by: Farrah Chen <farrah.chen@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-22-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/cpu: Enable cet-ss & cet-ibt for supported CPU models

Add new versioned CPU models for Sapphire Rapids, Sierra Forest, Granite
Rapids and Clearwater Forest, to enable shadow stack and indirect branch
tracking.

Tested-by: Farrah Chen <farrah.chen@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-21-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/cpu: Advertise CET related flags in feature words

Add SHSTK and IBT flags in feature words with entry/exit
control flags.

CET SHSTK and IBT feature are enumerated via CPUID(EAX=7,ECX=0)
ECX[bit 7] and EDX[bit 20]. CET states load/restore at vmentry/
vmexit are controlled by VMX_ENTRY_CTLS[bit 20] and VMX_EXIT_CTLS[bit 28].
Enable these flags so that KVM can enumerate the features properly.

Tested-by: Farrah Chen <farrah.chen@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
Co-developed-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-20-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/cpu: Mark cet-u & cet-s xstates as migratable

Cet-u and cet-s are supervisor xstates. Their states are saved/loaded by
saving/loading related CET MSRs. And there're the "vmstate_cet" and
"vmstate_pl0_ssp" to migrate these MSRs.

Thus, it's safe to mark them as migratable.

Tested-by: Farrah Chen <farrah.chen@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-19-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/machine: Add vmstate for cet-shstk and cet-ibt

Add vmstates for cet-shstk and cet-ibt

Tested-by: Farrah Chen <farrah.chen@intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
Co-developed-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-18-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/cpu: Migrate MSR_IA32_PL0_SSP for FRED and CET-SHSTK

Both FRED and CET-SHSTK need MSR_IA32_PL0_SSP, so add the vmstate for
this MSR.

When CET-SHSTK is not supported, MSR_IA32_PL0_SSP keeps accessible, but
its value doesn't take effect. Therefore, treat this vmstate as a
subsection rather than a fix for the previous FRED vmstate.

Tested-by: Farrah Chen <farrah.chen@intel.com>
Signed-off-by: Xin Li (Intel) <xin@zytor.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-17-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/kvm: Add save/restore support for KVM_REG_GUEST_SSP

CET provides a new architectural register, shadow stack pointer (SSP),
which cannot be directly encoded as a source, destination or memory
operand in instructions. But Intel VMCS & VMCB provide fields to
save/load guest & host's ssp.

It's necessary to save & restore Guest's ssp before & after migration.
To support this, KVM implements Guest's SSP as a special KVM internal
register - KVM_REG_GUEST_SSP, and allows QEMU to save & load it via
KVM_GET_ONE_REG/KVM_SET_ONE_REG.

Cache KVM_REG_GUEST_SSP in X86CPUState.

Tested-by: Farrah Chen <farrah.chen@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
Co-developed-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-16-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/kvm: Add save/restore support for CET MSRs

CET (architectural) MSRs include:
MSR_IA32_U_CET - user mode CET control bits.
MSR_IA32_S_CET - supervisor mode CET control bits.
MSR_IA32_PL{0,1,2,3}_SSP - linear addresses of SSPs for user/kernel modes.
MSR_IA32_INT_SSP_TAB - linear address of interrupt SSP table

Since FRED also needs to save/restore MSR_IA32_PL0_SSP, to avoid duplicate
operations, make FRED only save/restore MSR_IA32_PL0_SSP when CET-SHSTK
is not enumerated.

And considerring MSR_IA32_SSP_TBL_ADDR is only presented on 64 bit
processor, wrap it with TARGET_X86_64 macro.

For other MSRs, add save/restore support directly.

Tested-by: Farrah Chen <farrah.chen@intel.com>
Suggested-by: Xin Li (Intel) <xin@zytor.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
Co-developed-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-15-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/cpu: Save/restore SSP0 MSR for FRED

Both FRED and CET shadow stack define the MSR MSR_IA32_PL0_SSP (aka
MSR_IA32_FRED_SSP0 in FRED spec).

MSR_IA32_PL0_SSP is a FRED SSP MSR, so that if a processor doesn't
support CET shadow stack, FRED transitions won't use MSR_IA32_PL0_SSP,
but this MSR would still be accessible using MSR-access instructions
(e.g., RDMSR, WRMSR).

Therefore, save/restore SSP0 MSR for FRED.

Tested-by: Farrah Chen <farrah.chen@intel.com>
Signed-off-by: Xin Li (Intel) <xin@zytor.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-14-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/cpu: Add CET support in CR4

CR4.CET bit (bit 23) is as master enable for CET.
Check and adjust CR4.CET bit based on CET CPUIDs.

Tested-by: Farrah Chen <farrah.chen@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-13-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/cpu: Enable xsave support for CET states

Add CET_U/S bits in xstate area and report support in xstate
feature mask.
MSR_XSS[bit 11] corresponds to CET user mode states.
MSR_XSS[bit 12] corresponds to CET supervisor mode states.

CET Shadow Stack(SHSTK) and Indirect Branch Tracking(IBT) features
are enumerated via CPUID.(EAX=07H,ECX=0H):ECX[7] and EDX[20]
respectively, two features share the same state bits in XSS, so
if either of the features is enabled, set CET_U and CET_S bits
together.

Tested-by: Farrah Chen <farrah.chen@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
Co-developed-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-12-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/cpu: Add missing migratable xsave features

Xtile-cfg & xtile-data are both user xstates. Their xstates are cached
in X86CPUState, and there's a related vmsd "vmstate_amx_xtile", so that
it's safe to mark them as migratable.

Arch lbr xstate is a supervisor xstate, and it is save & load by saving
& loading related arch lbr MSRs, which are cached in X86CPUState, and
there's a related vmsd "vmstate_arch_lbr". So it should be migratable.

PT is still unmigratable since KVM disabled it and there's no vmsd and
no other emulation/simulation support.

Note, though the migratable_flags get fixed,
x86_cpu_enable_xsave_components() still overrides supported xstates
bitmaps regardless the masking of migratable_flags. This is another
issue, and would be fixed in follow-up refactoring.

Tested-by: Farrah Chen <farrah.chen@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-11-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/cpu: Fix supervisor xstate initialization

Arch lbr is a supervisor xstate, but its area is not covered in
x86_cpu_init_xsave().

Fix it by checking supported xss bitmap.

In addition, drop the (uint64_t) type casts for supported_xcr0 since
x86_cpu_get_supported_feature_word() returns uint64_t so that the cast
is not needed. Then ensure line length is within 90 characters.

Tested-by: Farrah Chen <farrah.chen@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-10-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/cpu: Drop pmu check in CPUID 0x1C encoding

Since CPUID_7_0_EDX_ARCH_LBR will be masked off if pmu is disabled,
there's no need to check CPUID_7_0_EDX_ARCH_LBR feature with pmu.

Tested-by: Farrah Chen <farrah.chen@intel.com>
Reviewed-by: Zide Chen <zide.chen@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-9-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/cpu: Reorganize dependency check for arch lbr state

The arch lbr state has 2 dependencies:
* Arch lbr feature bit (CPUID 0x7.0x0:EDX[bit 19]):

   This bit also depends on pmu property. Mask it off if pmu is disabled
   in x86_cpu_expand_features(), so that it is not needed to repeatedly
   check whether this bit is set as well as pmu is enabled.

   Note this doesn't need compat option, since even KVM hasn't support
   arch lbr yet.

   The supported xstate is constructed based such dependency in
   cpuid_has_xsave_feature(), so if pmu is disabled and arch lbr bit is
   masked off, then arch lbr state won't be included in supported
   xstates.

   Thus it's safe to drop the check on arch lbr bit in CPUID 0xD
   encoding.

* XSAVES feature bit (CPUID 0xD.0x1.EAX[bit 3]):

   Arch lbr state is a supervisor state, which requires the XSAVES
   feature support. Enumerate supported supervisor state based on XSAVES
   feature bit in x86_cpu_enable_xsave_components().

   Then it's safe to drop the check on XSAVES feature support during
   CPUID 0XD encoding.

Suggested-by: Zide Chen <zide.chen@intel.com>
Tested-by: Farrah Chen <farrah.chen@intel.com>
Reviewed-by: Zide Chen <zide.chen@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-8-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/cpu: Use x86_ext_save_areas[] for CPUID.0XD subleaves

The x86_ext_save_areas[] is expected to be well initialized by
accelerators and its xstate detail information cannot be changed by
user. So use x86_ext_save_areas[] to encode CPUID.0XD subleaves directly
without other hardcoding & masking.

And for arch LBR, KVM fills its xstate in x86_ext_save_areas[] via
host_cpuid(). The info obtained this way matches what would be retrieved
from x86_cpu_get_supported_cpuid() (since KVM just fills CPUID with the
host xstate info directly anyway). So just use the initialized
x86_ext_save_areas[] instead of calling x86_cpu_get_supported_cpuid().

Tested-by: Farrah Chen <farrah.chen@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-7-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/cpu: Add avx10 dependency for Opmask/ZMM_Hi256/Hi16_ZMM

With feature array in ExtSaveArea, add avx10 as the second dependency
for Opmask/ZMM_Hi256/Hi16_ZMM xsave components, and drop the special
check in cpuid_has_xsave_feature().

Tested-by: Farrah Chen <farrah.chen@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-6-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/cpu: Make ExtSaveArea store an array of dependencies

Some XSAVE components depend on multiple features. For example, Opmask/
ZMM_Hi256/Hi16_ZMM depend on avx512f OR avx10, and for CET (which will
be supported later), cet_u/cet_s will depend on shstk OR ibt.

Although previously there's the special check for the dependencies of
AVX512F OR AVX10 on their respective XSAVE components (in
cpuid_has_xsave_feature()), to make the code more general and avoid
adding more special cases, make ExtSaveArea store a features array
instead of a single feature, so that it can describe multiple
dependencies.

Tested-by: Farrah Chen <farrah.chen@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-5-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/cpu: Reorganize arch lbr structure definitions

- Move ARCH_LBR_NR_ENTRIES macro and LBREntry definition before XSAVE
  areas definitions.
- Reorder XSavesArchLBR (area 15) between XSavePKRU (area 9) and
  XSaveXTILECFG (area 17), and reorder the related QEMU_BUILD_BUG_ON
  check to keep the same ordering.

This makes xsave structures to be organized together and makes them
clearer.

Tested-by: Farrah Chen <farrah.chen@intel.com>
Reviewed-by: Zide Chen <zide.chen@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-4-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/cpu: Clean up arch lbr xsave struct and comment

Arch LBR state is area 15, not 19. Fix this comment. And considerring
other areas don't mention user or supervisor state, for consistent
style, remove "Supervisor mode" from its comment.

Moreover, rename XSavesArchLBR to XSaveArchLBR since there's no need to
emphasize XSAVES in naming; the XSAVE related structure is mainly
used to represent memory layout.

In addition, arch lbr specifies its offset of xsave component as 0. But
this cannot help on anything. The offset of ExtSaveArea is initialized
by accelerators (e.g., hvf_cpu_xsave_init(), kvm_cpu_xsave_init() and
x86_tcg_cpu_xsave_init()), so explicitly setting the offset doesn't
work and CPUID 0xD encoding has already ensure supervisor states won't
have non-zero offsets. Drop the offset initialization and its comment
from the xsave area of arch lbr.

Tested-by: Farrah Chen <farrah.chen@intel.com>
Reviewed-by: Zide Chen <zide.chen@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-3-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/cpu: Clean up indent style of x86_ext_save_areas[]

The indentation style in `x86_ext_save_areas[]` is extremely
inconsistent. Clean it up to ensure a uniform style.

Tested-by: Farrah Chen <farrah.chen@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251211060801.3600039-2-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

qemu-options.hx: Document -M as -machine alias

-M is used heavily in documentation and scripts, but isn't actually
documented anywhere.

Document it as equivalent to -machine, also moving to -machine the sole
suboption that was documented under -M.

Reported-by: Julian Andres Klode <jak@jak-linux.org>
Signed-off-by: Dr. David Alan Gilbert <dave@treblig.org>
Link: https://lore.kernel.org/r/20251203131511.153460-1-dave@treblig.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

tracetool: add Rust DTrace/SystemTap SDT support

Implement DTrace/SystemTap SDT by emitting the following:
- The probe crate's probe!() macro is used to emit a DTrace/SystemTap
  SDT probe.
- Every trace event gets a corresponding trace_<name>_enabled() -> bool
  generated function that Rust code can use to avoid expensive
  computation when a trace event is disabled. This API works for other
  trace backends too.

`#[allow(dead_code)]` additions are necessary for QEMU's dstate in
generated trace-<dir>.rs files since they are unused by the dtrace
backend. `./configure --enable-trace-backends=` can enable multiple
backends, so keep it simple and just silence the warning instead of
trying to detect the condition when generating the dstate code can be
skipped.

The tracetool tests are updated. Take a look at
tests/tracetool/dtrace.rs to see what the new generated code looks like.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://lore.kernel.org/r/20251119205200.173170-5-stefanha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

subprojects: add probe crate

The probe crate (https://crates.io/crates/probe) provides a probe!()
macro that defines SystemTap SDT probes on Linux hosts or does nothing
on other host OSes.

This crate will be used to implement DTrace support for Rust.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://lore.kernel.org/r/20251119205200.173170-4-stefanha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust/hpet: change wrap_flag to a bool

This is more precise, and makes it possible to use bool::then_some().

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust/hpet: Apply Migratable<> wrapper and ToMigrationState

Before using Mutex<> to protect HPETRegisters, it's necessary to apply
Migratable<> wrapper and ToMigrationState first since there's no
pre-defined VMState for Mutex<>.

In addition, this allows to move data from HPETRegisters' vmstate
to HPETTimer's, so as to preserve the original migration format of the C
implementation. To do that, HPETTimer is wrapped with Migratable<>
as well but the implementation of ToMigrationStateShared is
hand-written.

Note that even though the HPETRegistersMigration struct is
generated by ToMigrationState macro, its VMState still needs to be
implemented by hand.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251113051937.4017675-21-zhao1.liu@intel.com
[Added HPETTimer implementation and restored compatible migration format. - Paolo]
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust: migration: implement ToMigrationState for Timer

Timer is a complex struct, allow adding it to a struct that
uses #[derive(ToMigrationState)]; similar to vmstate_timer, only
the expiration time has to be preserved.

In fact, because it is thread-safe, ToMigrationStateShared can
also be implemented without needing a cell or mutex that wraps
the timer.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust: timer: add bindings to timer_mod_ns and timer_expire_time_ns

These are needed to implement ToMigrationStateShared for timers,
and thus allow including them in Migratable<> structs.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust/hpet: move hpet_offset to HPETRegisters

Likewise, do not separate hpet_offset from the other registers.
However, because it is migrated in a subsection it is necessary
to copy it out of HPETRegisters and into a BqlCell<>.

Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust/hpet: remove BqlRefCell around HPETTimer

HPETTimer now has all of its state stored in HPETRegisters, so it does not
need its own BqlRefCell anymore.

Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust/hpet: move hidden registers to HPETTimerRegisters

Do not separate visible and hidden state; both of them are used in the
same circumstances and it's easiest to place both of them under the
same BqlRefCell.

Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust/hpet: Borrow BqlRefCell<HPETRegisters> at top level

Lockless IO requires to lock the registers during MMIO access. So it's
necessary to get (or borrow) registers data at top level, and not to
borrow again in child function calls.

Change the argument types from BqlRefCell<HPETRegisters> to
&HPETRegisters/&mut HPETRegisters in child methods, and do borrow the
data once at top level.

This allows BqlRefCell<HPETRegisters> to be directly replaced with
Mutex<HPETRegisters> in subsequent steps without causing lock reentrancy
issues.

Note, passing reference instead of BqlRef/BqlRefMut because BqlRefMut
cannot be re-borrowed as BqlRef, though BqlRef/BqlRefMut themselves play
as the "guard". Passing reference is directly and the extra
bql::is_locked check could help to consolidate safety guarantee.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251113051937.4017675-19-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust/hpet: Maintain HPETTimerRegisters in HPETRegisters

Lockless IO requires holding a single lock during MMIO access, so that
it's necessary to maintain timer N's registers (HPETTimerRegisters) with
global register in one place.

Therefore, move HPETTimerRegisters to HPETRegisters from HPETTimer, and
access timer registers from HPETRegisters struct for the whole HPET
code.

This changes HPETTimer and HPETRegisters, and the layout of VMState has
changed, which makes it incompatible to migrate with previous versions.
Thus, bump up the version IDs in VMStates of HPETState and HPETTimer.

The VMState version IDs of HPETRegisters doesn't need to change since
it's a newly added struct and its version IDs doesn't affect the
compatibility of HPETState's VMState.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251113051937.4017675-18-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust/hpet: Pass &BqlRefCell<HPETRegisters> as argument during MMIO access

Currently in HPETTimer context, the global registers are accessed by
dereferring a HPETState raw pointer stored in NonNull<>, and then
borrowing the BqlRefCel<>.

This blocks borrowing HPETRegisters once during MMIO access, and
furthermore prevents replacing BqlRefCell<> with Mutex<>.

Therefore, do not access global registers through NonNull<HPETState>
and instead passing &BqlRefCell<HPETRegisters> as argument in
function calls within MMIO access.

But there's one special case that is timer handler, which still needs
to access HPETRegistsers through NonNull<HPETState>. It's okay for now
since this case doesn't have any repeated borrow() or lock reentrancy
issues.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251113051937.4017675-17-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust/hpet: Explicitly initialize complex fields in init()

Explicitly initialize more fields which are complex structures.

For simple types (bool/u32/usize), they can be omitted since C has
already initialized memory to all zeros and this is the valid
initialization for those simple types.

Previously such complex fields (InterruptSource/BqlCell/BqlRefCell) were
not explicitly initialized in init() and it's fine, because simply
setting all memory to zero aligns with their default initialization
behavior. However, this behavior is not robust. When adding new complex
struct or modifying the initial values of existing structs, this default
behavior can easily be broken.

Thus, do explicit initialization for HPET to become a good example.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251113051937.4017675-16-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust/hpet: Borrow HPETState.regs once in HPETState::post_load()

Timers in post_load() access the same HPETState, which is the "self"
HPETState.

So there's no need to access HPETState from child HPETTimer again and
again. Instead, just cache and borrow HPETState.regs at the beginning,
and this could save some CPU cycles and reduce borrow() calls.

It's safe, because post_load() is called with BQL protection, so that
there's no other chance to modify the regs.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251113051937.4017675-15-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust/hpet: Make global register accessors as methods of HPETRegisters

Implement helper accessors as methods of HPETRegisters. Then
HPETRegisters can be accessed without going through HPETState.

In subsequent refactoring, coarser-grained BQL lock protection will be
implemented. Specifically, BqlRefCell<HPETRegisters> will be borrowed
only once during MMIO accesses, and the scope of borrowed `regs` will
be extended to cover the entire MMIO access. Consequently, repeated
borrow() attempts within function calls will no longer be allowed.

Therefore, refactor the accessors of HPETRegisters to bypass HPETState,
which help to reduce borrow() in deep function calls.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251113051937.4017675-14-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust/hpet: Abstract HPETRegisters struct

Place all HPET (global) timer block registers in a HPETRegisters struct,
and wrap the whole register struct with a BqlRefCell<>.

This allows to elevate the Bql check from individual register access to
register structure access, making the Bql check more coarse-grained. But
in current step, just treat BqlRefCell as BqlCell while maintaining
fine-grained BQL protection. This approach helps to use HPETRegisters
struct clearly without introducing the "already borrowed" around
BqlRefCell.

HPETRegisters struct makes it possible to take a Mutex<> to replace
BqlRefCell<>, like C side did.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251113051937.4017675-13-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust/hpet: Make timer register accessors as methods of HPETTimerRegisters

Implement helper accessors as methods of HPETTimerRegisters. Then
HPETTimerRegisters can be accessed without going through HPETTimer or
HPETState.

In subsequent refactoring, HPETTimerRegisters will be maintained at the
HPETState level. However, accessing it through HPETState requires the
lock (lock BQL or mutex), which would cause troublesome nested locks or
reentrancy issues.

Therefore, refactor the accessors of HPETTimerRegisters to bypass
HPETTimer or HPETState.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251113051937.4017675-12-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust/hpet: Abstract HPETTimerRegisters struct

Place all timer N's registers in a HPETTimerRegisters struct.

This allows all Timer N registers to be grouped together with global
registers and managed using a single lock (BqlRefCell or Mutex) in
future. And this makes it easier to apply ToMigrationState macro.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251113051937.4017675-11-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust/hpet: Rename decoded "reg" enumeration to "target"

HPETAddrDecode has a `reg` field so that there're many variables named
"reg" in MMIO read/write/decode functions.

In the future, there'll be other HPETRegisters/HPETTimerRegisters
structs containing values of HPET registers, and related variables or
arguments will be named as "regs".

To avoid potential confusion between many "reg" and "regs", rename
HPETAddrDecode::reg to HPETAddrDecode::target, and rename decoding
related variables from "reg" to "target".

"target" is picked as the name since this clearly reflects the field or
variable is the target decoded register.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251113051937.4017675-10-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust/hpet: Rename HPETRegister to DecodedRegister

HPETRegister represents the layout of register spaces of HPET timer
block and timer N, and is used to decode register address into register
enumeration.

To avoid confusion with the subsequently introduced HPETRegisters (that
is used to maintain values of HPET registers), rename HPETRegister to
DecodedRegister.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251113051937.4017675-9-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust/hpet: Reduce unnecessary mutable self argument

Methods of Timer and InterruptSource have been made as safe, so make the
related methods of HPETTimer to accept immutable self reference.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251113051937.4017675-8-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust/bql: Ensure BQL locked early at BqlRefCell borrowing

At present, BqlRefCell checks whether BQL is locked when it blocks BQL
unlock (in bql_block_unlock).

But the such check should be done earlier - at the beginning of
BqlRefCell borrowing.

So convert BqlRefCell::borrow field from Cell<> to BqlCell<>, to
guarantee BQL is locked from the beginning when someone is trying to
borrow BqlRefCell.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251113051937.4017675-6-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust/migration: Check name field in VMStateDescriptionBuilder

The name field is necessary for VMStateDescription, so that it's
necessary to check if it is set when build VMStateDescription.

Since is_null()/as_ref() become rustc v1.84 and pointer cannot cast to
integer in const, use Option<> to check name with a new field in
VMStateDescriptionBuilder instead.

This can be simplified in future when QEMU bumps up rustc version.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251113051937.4017675-4-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust/migration: Fix missing name in the VMSD of Migratable<>

The VMStateDescription of Migratable<T> missed the name field, and this
casused segmentation fault in vmstate_save_state_v() when it tries to
write name field by json_writer_str().

Due to the limitation of const, a custom name based on type would be
more difficult. Instead, a straightforward and simple approach is to
have all Migratable<T> instances use the same VMSD name -
"migratable-wrapper".

This is availiable because Migratable<T> is always a field within a
VMSD, and its parent VMSD should have a distinct name.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251113051937.4017675-3-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust/migration: Add Sync implementation for Migratable<>

It's common to define MemoryRegionOps<T> and VMStateDescription<T> as
static variables, and this requires T to implement Sync.

Migratable<T> is usually embedded in device state, so it's necessary to
implement Sync for Migratable<T>.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20251113051937.4017675-2-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust/hpet: add trace events

Implement the same trace events as the C implementation.

Notes:
- Keep order of hpet_ram_write_invalid_tn_cmp and hpet_ram_write_tn_cmp
the same as the C implementation.
- Put hpet_ram_write_timer_id in HPETTimer::write() instead of
HPETState::decode() so that reads can be excluded.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Link: https://lore.kernel.org/r/20251106215606.36598-3-stefanha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>