git.ipfire.org Git - thirdparty/qemu.git/log

gdbstub: Remove tb_flush uses

This hasn't been needed since d828b92b8a6
("accel/tcg: Introduce CF_BP_PAGE").

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tests/tcg/multiarch: Add tb-link test

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

accel/tcg: Properly unlink a TB linked to itself

When we remove dest from orig's links, we lose the link
that we rely on later to reset links. This can lead to
failure to release from spinlock with self-modifying code.

Cc: qemu-stable@nongnu.org
Reported-by: 李威威 <liweiwei@kubuds.cn>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Tested-by: Anton Johansson <anjo@rev.ng>

target/hppa: Adjust mmu indexes to begin with 0

This is a logical reversion of 2ad04500543, though
there have been additions to the set of mmu indexes
since then. The impetus to that original patch,
"9-15 will use shorter assembler instructions when
run on a x86-64 host" is now handled generically.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

include/hw/core/cpu: Invert the indexing into CPUTLBDescFast

This array is within CPUNegativeOffsetState, which means the
last element of the array has an offset from env with the
smallest magnitude. This can be encoded into fewer bits
when generating TCG fast path memory references.

When we changed the NB_MMU_MODES to be a global constant,
rather than a per-target value, we pessimized the code
generated for targets which use only a few mmu indexes.
By inverting the array index, we counteract that.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

include/hw/core/cpu: Introduce cpu_tlb_fast

Encapsulate access to cpu->neg.tlb.f[] in a function.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

include/hw/core/cpu: Introduce MMUIdxMap

Use a typedef instead of uint16_t directly when
describing sets of mmu indexes.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Fix folding of vector bitsel

It looks like a typo. When the false value (C) is the constant -1, the
correct fold should be: R = B | ~A

Reproducer (LoongArch64 assembly):

     .text
     .globl  _start
_start:
     vldi    $vr1, 3073
     vldi    $vr2, 1023
     vbitsel.v       $vr0, $vr2, $vr1, $vr2
     vpickve2gr.d    $a1, $vr0, 1
     xori    $a0, $a1, 1
     li.w    $a7, 93
     syscall 0

Fixes: e58b977238e3 ("tcg/optimize: Optimize bitsel_vec")
Link: https://github.com/llvm/llvm-project/issues/159610
Signed-off-by: WANG Rui <wangrui@loongson.cn>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20250919124901.2756538-1-wangrui@loongson.cn>

hw/pci-host/astro: Don't call pci_regsiter_root_bus() in init

In the astro PCI host bridge device, we call pci_register_root_bus()
in the device's instance_init. This is a problem for two reasons
* the PCI bridge is then available to the rest of the simulation
   (e.g. via pci_qdev_find_device()), even though it hasn't
   yet been realized
* we do not attempt to unregister in an instance_deinit,
   which means that if you go through an instance_init -> deinit
   lifecycle the freed memory for the host-bridge device is
   left on the pci_host_bridges list

ASAN reports the resulting use-after-free:

==1776584==ERROR: AddressSanitizer: heap-use-after-free on address 0x51f00000cb00 at pc 0x5b2d460a89b5 bp 0x7ffef7617f50 sp 0x7ffef7617f48
WRITE of size 8 at 0x51f00000cb00 thread T0
    #0 0x5b2d460a89b4 in pci_host_bus_register /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/hppa-asan/../../hw/pci/pci.c:608:5
    #1 0x5b2d46093566 in pci_root_bus_internal_init /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/hppa-asan/../../hw/pci/pci.c:677:5
    #2 0x5b2d460935e0 in pci_root_bus_new /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/hppa-asan/../../hw/pci/pci.c:706:5
    #3 0x5b2d46093fe5 in pci_register_root_bus /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/hppa-asan/../../hw/pci/pci.c:751:11
    #4 0x5b2d46fe2335 in elroy_pcihost_init /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/hppa-asan/../../hw/pci-host/astro.c:455:16

0x51f00000cb00 is located 1664 bytes inside of 3456-byte region [0x51f00000c480,0x51f00000d200)
freed by thread T0 here:
    #0 0x5b2d4582385a in free (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/hppa-asan/qemu-system-hppa+0x17ad85a) (BuildId: 692b49eedc6fb0ef618bbb6784a09311b3b7f1e8)
    #1 0x5b2d47160723 in object_finalize /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/hppa-asan/../../qom/object.c:734:9
    #2 0x5b2d471589db in object_unref /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/hppa-asan/../../qom/object.c:1232:9
    #3 0x5b2d477d373c in qmp_device_list_properties /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/hppa-asan/../../qom/qom-qmp-cmds.c:237:5

previously allocated by thread T0 here:
    #0 0x5b2d45823af3 in malloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/hppa-asan/qemu-system-hppa+0x17adaf3) (BuildId: 692b49eedc6fb0ef618bbb6784a09311b3b7f1e8)
    #1 0x79728fa08b09 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x62b09) (BuildId: 1eb6131419edb83b2178b682829a6913cf682d75)
    #2 0x5b2d471595fc in object_new_with_type /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/hppa-asan/../../qom/object.c:767:15
    #3 0x5b2d47159409 in object_new_with_class /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/hppa-asan/../../qom/object.c:782:12
    #4 0x5b2d477d29a5 in qmp_device_list_properties /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/hppa-asan/../../qom/qom-qmp-cmds.c:206:11

Cc: qemu-stable@nongnu.org
Fixes: e029bb00a79be ("hw/pci-host: Add Astro system bus adapter found on PA-RISC machines")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3118
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20250918114259.1802337-3-peter.maydell@linaro.org>

hw/pci-host/dino: Don't call pci_register_root_bus() in init

In the dino PCI host bridge device, we call pci_register_root_bus()
in the device's instance_init. This is a problem for two reasons
* the PCI bridge is then available to the rest of the simulation
   (e.g. via pci_qdev_find_device()), even though it hasn't
   yet been realized
* we do not attempt to unregister in an instance_deinit,
   which means that if you go through an instance_init -> deinit
   lifecycle the freed memory for the host-bridge device is
   left on the pci_host_bridges list

ASAN reports the resulting use-after-free:

==1771223==ERROR: AddressSanitizer: heap-use-after-free on address 0x527000018f80 at pc 0x5b4b9d3369b5 bp 0x7ffd01929980 sp 0x7ffd01929978
WRITE of size 8 at 0x527000018f80 thread T0
    #0 0x5b4b9d3369b4 in pci_host_bus_register /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/hppa-asan/../../hw/pci/pci.c:608:5
    #1 0x5b4b9d321566 in pci_root_bus_internal_init /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/hppa-asan/../../hw/pci/pci.c:677:5
    #2 0x5b4b9d3215e0 in pci_root_bus_new /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/hppa-asan/../../hw/pci/pci.c:706:5
    #3 0x5b4b9d321fe5 in pci_register_root_bus /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/hppa-asan/../../hw/pci/pci.c:751:11
    #4 0x5b4b9d390521 in dino_pcihost_init /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/hppa-asan/../../hw/pci-host/dino.c:473:16

0x527000018f80 is located 1664 bytes inside of 12384-byte region [0x527000018900,0x52700001b960)
freed by thread T0 here:
    #0 0x5b4b9cab185a in free (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/hppa-asan/qemu-system-hppa+0x17ad85a) (BuildId: ca496bb2e4fc750ebd289b448bad8d99c0ecd140)
    #1 0x5b4b9e3ee723 in object_finalize /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/hppa-asan/../../qom/object.c:734:9
    #2 0x5b4b9e3e69db in object_unref /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/hppa-asan/../../qom/object.c:1232:9
    #3 0x5b4b9ea6173c in qmp_device_list_properties /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/hppa-asan/../../qom/qom-qmp-cmds.c:237:5
    #4 0x5b4b9ec4e0f3 in qmp_marshal_device_list_properties /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/hppa-asan/qapi/qapi-commands-qdev.c:65:14

previously allocated by thread T0 here:
    #0 0x5b4b9cab1af3 in malloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/hppa-asan/qemu-system-hppa+0x17adaf3) (BuildId: ca496bb2e4fc750ebd289b448bad8d99c0ecd140)
    #1 0x799d8270eb09 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x62b09) (BuildId: 1eb6131419edb83b2178b682829a6913cf682d75)
    #2 0x5b4b9e3e75fc in object_new_with_type /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/hppa-asan/../../qom/object.c:767:15
    #3 0x5b4b9e3e7409 in object_new_with_class /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/hppa-asan/../../qom/object.c:782:12
    #4 0x5b4b9ea609a5 in qmp_device_list_properties /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/hppa-asan/../../qom/qom-qmp-cmds.c:206:11

where we allocated one instance of the dino device, put it on the
list, freed it, and then trying to allocate a second instance touches
the freed memory on the pci_host_bridges list.

Fix this by deferring all the setup of memory regions and registering
the PCI bridge to the device's realize method.  This brings it into
line with almost all other PCI host bridges, which call
pci_register_root_bus() in realize.

Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3118
Fixes: 63901b6cc4d8b4 ("dino: move PCI bus initialisation to dino_pcihost_init()")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20250918114259.1802337-2-peter.maydell@linaro.org>

target/sparc: Relax decode of rs2_or_imm for v7

For v7, bits [12:5] are ignored for !imm.
For v8, those same bits are reserved, but are not trapped.

Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/sparc: Loosen decode of RDTBR for v7

For v7, bits [18:0] are ignored.
For v8, bits [18:14] are reserved and bits [13:0] are ignored.

Fixes: e8325dc02d0 ("target/sparc: Move RDTBR, FLUSHW to decodetree")
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/sparc: Loosen decode of RDWIM for v7

For v7, bits [18:0] are ignored.
For v8, bits [18:14] are reserved and bits [13:0] are ignored.

Fixes: 5d617bfba07 ("target/sparc: Move RDWIM, RDPR to decodetree")
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/sparc: Loosen decode of RDPSR for v7

For v7, bits [18:0] are ignored.
For v8, bits [18:14] are reserved and bits [13:0] are ignored.

Fixes: 668bb9b755e ("target/sparc: Move RDPSR, RDHPR to decodetree")
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/sparc: Loosen decode of RDY for v7

Bits [18:0] are not decoded with v7, and for v8 unused values
of rs1 simply produce undefined results.

Fixes: af25071c1d ("target/sparc: Move RDASR, STBAR, MEMBAR to decodetree")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>

target/sparc: Loosen decode of STBAR for v8

Solaris 8 appears to have a bug whereby it executes v9 MEMBAR
instructions when booting a freshly installed image. According
to the SPARC v8 architecture manual, whilst bits 13 and bits 12-0
of the "Read State Register Instructions" are notionally zero,
they are marked as unused (i.e. ignored).

Fixes: af25071c1d ("target/sparc: Move RDASR, STBAR, MEMBAR to decodetree")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3097
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>

target/sparc: Allow TRANS macro with no extra arguments

Use ## to drop the preceding comma if __VA_ARGS__ is empty.

Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

linux-user: Add syscall dispatch support

This commit adds support for the `prctl(PR_SET_SYSCALL_USER_DISPATCH)`
function in the Linux userspace emulator.

It is implemented as a fully host-independent function, by forcing
a SIGSYS early during syscall handling, if the PC is outside the
allowed range.

Since disabled SUD is indistinguishable from enabled SUD with
always-allowed region length == ~0, this encoding is used
instead of introducing a new flag.

Tested on [uglendix][1], will probably also apply to software like
tiny-wine, rpcsx, limbo, lazypoline, vicar, sysfail and endokernel,
to name a few.

[1]: https://sr.ht/~arusekk/uglendix

Signed-off-by: Arusekk <floss@arusekk.pl>
Message-ID: <20250711225226.14652-1-floss@arusekk.pl>
[rth: Split out is_vdso_sigreturn region matching and other minor tweaks.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

linux-user: Populate vdso_sigreturn_region_{start,end} from sigtramp page

When a target does not support a vdso, we generate a sigtramp page.
The only thing on this page is a (set of) signal return syscalls.
We do not need to narrowly restrict the vdso_sigreturn_region;
simply record the entire page for all such targets.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

linux-user: Populate sigreturn_region_{start,end} in all vdso.S

Mark the regions which contain sigreturn syscalls within
each vdso. Rebuild the shared objects.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

linux-user: Create vdso_sigreturn_region_{start,end}

These variables will be populated from the vdso, and used
for detecting whether we are executing the sigreturn.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Merge tag 'pull-9p-20250918' of https://github.com/cschoenebeck/qemu into staging

9pfs changes:

* Add FreeBSD host support.

* Fix glib header inclusion.

# -----BEGIN PGP SIGNATURE-----
#
# iQJLBAABCgA1FiEEltjREM96+AhPiFkBNMK1h2Wkc5UFAmjMYKMXHHFlbXVfb3Nz
# QGNydWRlYnl0ZS5jb20ACgkQNMK1h2Wkc5VUGBAAiRVM6vTErPwccp+w8UrpAVo5
# oXdN2TIpQoILGg2vSuHc4mGUXjMmqnihCbNP9p3ZUVSYQwSwpXa2i47GSe100Mzi
# kiv2/SROopohE6ZiDok65GCj2hXShF0tZGauTBoE0WTZP9LG+rvftMeupbgrEKll
# To5hOdsQbPw2HtATpTjRufvVTtaeu8oGeh+BPmtiyu7Aiea4xht9YCAMa8AVG44P
# 97ZmnqYAq/5bolE6fTuVEWj484cPjMPC/sMBddhNV57HwzYdqGdOinR3GqRHspvN
# B0oCq07HXeAV55APGQtPWOWq1SonGqIhHj0Hdnugl3DWUWiQs0CVSMPlE7Aag7at
# /8JbGS2j7RuM5N9Zdf8Wlq78jgvRmbpYZunD0RLd8O+jESaHAoNpjrNHm4v92WLa
# bUePytsxCK9ozStPqRVB9zGOYyx36LKG/8E5J4t00GX2F0FRB9OxgSPFWCWFnqM5
# R4IvR2huW8/DvplgvVpPc0SM+lMV7GZhAC92z7KkQYBE85s09EdAobIIHguK3B0l
# 5hy9w6tZ6nnFloaL0fWccE3XU+X56KrDkX0G/AEdppsxYBYYhs1XNhR5AYuQCEd5
# gdKtLrEOr1F2snb8aLfS8MDwTUCkU1lfbipyzDaX3sr4Gg+7L/vV3OxQoGmwMjOe
# xnI3cMzk0j7prHT1oSc=
# =3YK2
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 18 Sep 2025 12:42:27 PM PDT
# gpg:                using RSA key 96D8D110CF7AF8084F88590134C2B58765A47395
# gpg:                issuer "qemu_oss@crudebyte.com"
# gpg: Good signature from "Christian Schoenebeck <qemu_oss@crudebyte.com>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: ECAB 1A45 4014 1413 BA38  4926 30DB 47C3 A012 D5F4
#      Subkey fingerprint: 96D8 D110 CF7A F808 4F88  5901 34C2 B587 65A4 7395

* tag 'pull-9p-20250918' of https://github.com/cschoenebeck/qemu:
  9pfs: Stop including gstrfuncs.h
  9pfs: Add FreeBSD support

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Merge tag 'misc-fixes-pull-request' of https://gitlab.com/berrange/qemu into staging

* Update security triage contact address
* Check and honour failures to the blocking flag on FDs
* Don't touch blocking flags on FDs received during migration

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE2vOm/bJrYpEtDo4/vobrtBUQT98FAmjNQuAACgkQvobrtBUQ
# T99xaBAAr6zQPii1tjzuzLovF6MIqtldXnmVO/yjcl5NgLWonIRDt2JsxnRxi3es
# 9uNDed5+ePNXmUAYd46k81gBEjBWbv465kt5FHAZZV6BRw/PPzkoh+jzGc8NVir8
# 3GZJ2kPr51PxGEl8md2vRthg4bMuhlS5ogCEqAMDYT4f6AVemfnNQ5NttGX353T2
# etxoMhEeMtTBKjMoTBv+SVhhO4nKwZ+6CFhvuGON423EfrGlkNTXyprKTdzpr4i0
# 4KDQLxxoANlmg/1W0PxfrLiBCmGpHweMR44Piv715VYa2YNPRq0G6EC6AFGbHZ51
# N+mKmWNE0CS5rP1TEacSCX4q6If5VxjSLLj+og8LmpIlJ6tiqdrisSqA6bzCJ1f/
# lMsfUsKoMqPhqat9ZGUkYu8REgKP+O+CSGJNftYTsEEY0oKZrAW4fsoN3E9qpfcG
# Xy6eSu0TTGDWE6CEe0vkHiQwlVHMtRcWMSPwlsvrgt2TO6k97reT3AoIBK2VfygC
# WzMv0P0nBvHFKeIbqmFOk3BEI5+JECgxVRc1WXWbSFLW0PBY/xd7g6ow8uaQsd9e
# pzMA1Pwh2EuM4DTlOy+m9zBOhm9YP9An188NLldOne3TFKFYe5QO1DQpvvEGvIGB
# +4XpmyOj3g2ycelZZ5XsDJk0LumCCOcbSPSiAvHZyWwLo24EABE=
# =rrMd
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 19 Sep 2025 04:47:44 AM PDT
# gpg:                using RSA key DAF3A6FDB26B62912D0E8E3FBE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>" [unknown]
# gpg:                 aka "Daniel P. Berrange <berrange@redhat.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E  8E3F BE86 EBB4 1510 4FDF

* tag 'misc-fixes-pull-request' of https://gitlab.com/berrange/qemu:
  util/vhost-user-server: vu_message_read(): improve error handling
  chardev: close an fd on failure path
  chardev: qemu_chr_open_fd(): add errp
  treewide: use qemu_set_blocking instead of g_unix_set_fd_nonblocking
  util: drop qemu_socket_set_block()
  io/channel-socket: rework qio_channel_socket_copy_fds()
  util: drop qemu_socket_try_set_nonblock()
  util: drop qemu_socket_set_nonblock()
  migration: qemu_file_set_blocking(): add errp parameter
  treewide: handle result of qio_channel_set_blocking()
  util: add qemu_set_blocking() function
  char-socket: tcp_chr_recv(): add comment
  char-socket: tcp_chr_recv(): drop extra _set_(block,cloexec)
  io/channel: document how qio_channel_readv_full() handles fds
  migration/qemu-file: don't make incoming fds blocking again
  MAINTAINERS: list qemu-security@nongnu.org as security contact

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

util/vhost-user-server: vu_message_read(): improve error handling

1. Drop extra error_report_err(NULL), it will just crash, if we get
here.

2. Get and report error of qemu_set_blocking(), instead of aborting.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>

chardev: close an fd on failure path

There are at least two failure paths, where we forget
to close an fd.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>

chardev: qemu_chr_open_fd(): add errp

Every caller already support errp, let's go further.

Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>

treewide: use qemu_set_blocking instead of g_unix_set_fd_nonblocking

Instead of open-coded g_unix_set_fd_nonblocking() calls, use
QEMU wrapper qemu_set_blocking().

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
[DB: fix missing closing ) in tap-bsd.c, remove now unused GError var]
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>

util: drop qemu_socket_set_block()

Now it's unused.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>

io/channel-socket: rework qio_channel_socket_copy_fds()

We want to switch from qemu_socket_set_block() to newer
qemu_set_blocking(), which provides return status of operation,
to handle errors.

Still, we want to keep qio_channel_socket_readv() interface clean,
as currently it allocate @fds only on success.

So, in case of error, we should close all incoming fds and keep
user's @fds untouched or zero.

Let's make separate functions qio_channel_handle_fds() and
qio_channel_cleanup_fds(), to achieve what we want.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>

util: drop qemu_socket_try_set_nonblock()

Now we can use qemu_set_blocking() in these cases.

Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>

util: drop qemu_socket_set_nonblock()

Use common qemu_set_blocking() instead.

Note that pre-patch the behavior of Win32 and Linux realizations
are inconsistent: we ignore failure for Win32, and assert success
for Linux.

How do we convert the callers?

1. Most of callers call qemu_socket_set_nonblock() on a
freshly created socket fd, in conditions when we may simply
report an error. Seems correct switching to error handling
both for Windows (pre-patch error is ignored) and Linux
(pre-patch we assert success). Anyway, we normally don't
expect errors in these cases.

Still in tests let's use &error_abort for simplicity.

What are exclusions?

2. hw/virtio/vhost-user.c - we are inside #ifdef CONFIG_LINUX,
so no damage in switching to error handling from assertion.

3. io/channel-socket.c: here we convert both old calls to
qemu_socket_set_nonblock() and qemu_socket_set_block() to
one new call. Pre-patch we assert success for Linux in
qemu_socket_set_nonblock(), and ignore all other errors here.
So, for Windows switch is a bit dangerous: we may get
new errors or crashes(when error_abort is passed) in
cases where we have silently ignored the error before
(was it correct in all such cases, if they were?) Still,
there is no other way to stricter API than take
this risk.

4. util/vhost-user-server - compiled only for Linux (see
util/meson.build), so we are safe, switching from assertion to
&error_abort.

Note: In qga/channel-posix.c we use g_warning(), where g_printerr()
would actually be a better choice. Still let's for now follow
common style of qga, where g_warning() is commonly used to print
such messages, and no call to g_printerr(). Converting everything
to use g_printerr() should better be another series.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>

migration: qemu_file_set_blocking(): add errp parameter

qemu_file_set_blocking() is a wrapper on qio_channel_set_blocking(),
so let's passthrough the errp.

Note the migration should not be using &error_abort in these calls,
however, this is done to expedite the API conversion.

The original code would have eventually ended up calling either
qemu_socket_set_nonblock which would asset on Linux, or
g_unix_set_fd_nonblocking which would propagate errors. We never
saw asserts in practice, and conceptually they should not happen,
but ideally this code will be later adapted to remove use of
&error_abort.

Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>

treewide: handle result of qio_channel_set_blocking()

Currently, we just always pass NULL as errp argument. That doesn't
look good.

Some realizations of interface may actually report errors.
Channel-socket realization actually either ignore or crash on
errors, but we are going to straighten it out to always reporting
an errp in further commits.

So, convert all callers to either handle the error (where environment
allows) or explicitly use &error_abort.

Take also a chance to change the return value to more convenient
bool (keeping also in mind, that underlying realizations may
return -1 on failure, not -errno).

Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
[DB: fix return type mismatch in TLS/websocket channel
impls for qio_channel_set_blocking]
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>

util: add qemu_set_blocking() function

In generic code we have qio_channel_set_blocking(), which takes
bool parameter, and qemu_file_set_blocking(), which as well takes
bool parameter.

At lower fd-layer we have a mess of functions:

- enough direct calls to Unix-specific g_unix_set_fd_nonblocking()
(of course, all calls are out of Windows-compatible code), which
is glib specific with GError, which we can't use, and have to
handle error-reporting by hand after the call.

and several platform-agnostic qemu_* helpers:

- qemu_socket_set_nonblock(), which asserts success for posix (still,
in most cases we can handle the error in better way) and ignores
error for win32 realization

- qemu_socket_try_set_nonblock(), providing and error, but not errp,
so we have to handle it after the call

- qemu_socket_set_block(), which simply ignores an error

Note, that *_socket_* word in original API, which we are going
to substitute was intended, because Windows support these operations
only for sockets. What leads to solution of dropping it again?

1. Having a QEMU-native wrapper with errp parameter
for g_unix_set_fd_nonblocking() for non-socket fds worth doing,
at least to unify error handling.

2. So, if try to keep _socket_ vs _file_ words, we'll have two
actually duplicated functions for Linux, which actually will
be executed successfully on any (good enough) fds, and nothing
prevent using them improperly except for the name. That doesn't
look good.

3. Naming helped us in the world where we crash on errors or
ignore them. Now, with errp parameter, callers are intended to
proper error checking. And for places where we really OK with
crash-on-error semantics (like tests), we have an explicit
&error_abort.

So, this commit starts a series, which will effectively revert
commit ff5927baa7ffb9 "util: rename qemu_*block() socket functions"
(which in turn was reverting f9e8cacc5557e43
"oslib-posix: rename socket_set_nonblock() to qemu_set_nonblock()",
so that's a long story).
Now we don't simply rename, instead we provide the new API and
update all the callers.

This commit only introduces a new fd-layer wrapper. Next commits
will replace old API calls with it, and finally remove old API.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>

char-socket: tcp_chr_recv(): add comment

Add comment, to stress that the order of operation (first drop old fds,
second check read status) is intended.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>

char-socket: tcp_chr_recv(): drop extra _set_(block,cloexec)

qio_channel_readv_full() guarantees BLOCKING and CLOEXEC states for
incoming descriptors, no reason to call extra ioctls.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>

io/channel: document how qio_channel_readv_full() handles fds

The only realization, which may have incoming fds is
qio_channel_socket_readv() (in io/channel-socket.c).
qio_channel_socket_readv() do call (through
qio_channel_socket_copy_fds()) qemu_socket_set_block() and
qemu_set_cloexec() for each fd.

Also, qio_channel_socket_copy_fds() is called at the end of
qio_channel_socket_readv(), on success path.

Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>

migration/qemu-file: don't make incoming fds blocking again

In migration we want to pass fd "as is", not changing its
blocking status.

The only current user of these fds is CPR state (through VMSTATE_FD),
which of-course doesn't want to modify fds on target when source is
still running and use these fds.

Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>

MAINTAINERS: list qemu-security@nongnu.org as security contact

The qemu-security@nongnu.org list is considered the authoritative
contact for reporting QEMU security issues. Remove the Red Hat
security team address in favour of QEMU's list, to ensure that
upstream gets first contact. There is a representative of the
Red Hat security team as a member of qemu-security@nongnu.org
whom requests CVE assignments on behalf of QEMU when needed.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Mauro Matteo Cascella <mcascell@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>

9pfs: Stop including gstrfuncs.h

gstrfuncs.h is not intended to be included directly.
In fact this only works because glib.h is already included by osdep.h.
Just remove the include.

Signed-off-by: Peter Foley <pefoley@google.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/qemu-devel/20250905-9p-v2-1-2ad31999684d@google.com
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>

9pfs: Add FreeBSD support

This is largely derived from existing Darwin support.  FreeBSD
apparently has better support for *at() system calls so doesn't require
workarounds for a missing mknodat().  The implementation has a couple of
warts however:
- The extattr(2) system calls don't support anything akin to
  XATTR_CREATE or XATTR_REPLACE, so a racy workaround is implemented.
- Attribute names cannot begin with "user." or "system." on ZFS.
  However FreeBSD's extattr(2) system calls support two dedicated
  namespaces for these two.  So "user." or "system." prefixes are
  trimmed off from attribute names and instead EXTATTR_NAMESPACE_USER or
  EXTATTR_NAMESPACE_SYSTEM are picked and passed to extattr system calls
  accordingly.

The 9pfs tests were verified to pass on the UFS, ZFS and tmpfs
filesystems.

Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Link: https://lore.kernel.org/qemu-devel/aJOWhHB2p-fbueAm@nuc
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>

Merge tag 'pull-loongarch-20250918' of https://github.com/gaosong715/qemu into staging

pull-loongarch-20250918

# -----BEGIN PGP SIGNATURE-----
#
# iLMEAAEIAB0WIQTKRzxE1qCcGJoZP81FK5aFKyaCFgUCaMvTpQAKCRBFK5aFKyaC
# Fkk0BACDkaQa6jDON8aLcTFcwpIlrnblqlYo6EK7TaGqpI866EhTX09BscRF5bvp
# 3JtGARKy5a6s5GJ64KItIl4n5Z6xvt4ME1KjyqeUTpD99c7J1krgxl6+W/NthK/K
# cLbSnlfvcw/L6KfIsGP6i2F6Y+riyZf6OYMc9IF/xFEAIMKJyA==
# =EgXn
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 18 Sep 2025 02:40:53 AM PDT
# gpg:                using RSA key CA473C44D6A09C189A193FCD452B96852B268216
# gpg: Good signature from "Song Gao <gaosong@loongson.cn>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: CA47 3C44 D6A0 9C18 9A19  3FCD 452B 9685 2B26 8216

* tag 'pull-loongarch-20250918' of https://github.com/gaosong715/qemu:
  hw/loongarch/virt: Register reset interface with cpu plug callback
  hw/loongarch/virt: Remove unnecessay pre-boot setting with BSP
  hw/loongarch/virt: Add BSP support with aux boot code

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging

* cpu-exec: more cleanups to CPU loop exits
* python: bump bundled Meson to 1.9.0
* rust: require Rust 1.83.0
* rust: temporarily remove from Ubuntu CI
* rust: vmstate: convert to use builder pattern
* rust: split "qemu-api" crate
* rust: rename qemu_api_macros -> qemu_macros
* rust: re-export qemu macros from other crates
* x86: fix functional test failure for Xen emulation
* x86: cleanups

# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCgAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmjK6ZsUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroNBKwf/aadInCT4vASOfpxbwZgYfYgR2m2m
# BJE9oYKxZJ6MlEOU/1Wfywf9fg4leMSh3XxkDKkEIL19yS6emwin8n3SNYrdAFn3
# 6u4IIWO4NI1Ht3NKytrqFk9wtbH9pAs/gVHLlnmpMxIqtOtZLumPAKNz8rlantmK
# UVDYL3Y0L4pD9i5FK1ObMNpk5AsWNr8Tr64fmb+nTkHutld3sBrEMCLI0+EByGyN
# lQ16sLn9PGqHOr210zuQP7wP2T3NCI3YokFSPQrUUL8LZGxRdXoNF4hI4uZDKGdn
# UbtRu9EkM052qzfsFMrEw5JSbdxEfIjKlPoFKseMv+aWvNAuximAraD3Vg==
# =Lr+x
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 17 Sep 2025 10:02:19 AM PDT
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [unknown]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (60 commits)
  accel/kvm: Set guest_memfd_offset to non-zero value only when guest_memfd is valid
  accel/kvm: Zero out mem explicitly in kvm_set_user_memory_region()
  accel/kvm: Switch to check KVM_CAP_GUEST_MEMFD and KVM_CAP_USER_MEMORY2 on VM
  i386/kvm: Drop KVM_CAP_X86_SMM check in kvm_arch_init()
  multiboot: Fix the split lock
  target/i386: Define enum X86ASIdx for x86's address spaces
  i386/cpu: Enable SMM cpu address space under KVM
  hpet: guard IRQ handling with BQL
  rust: do not inline do_init_io
  rust: meson: remove unnecessary complication in device crates
  docs: update rust.rst
  rust: re-export qemu macros from common/qom/hwcore
  rust: re-export qemu_macros internal helper in "bits"
  rust: repurpose qemu_api -> tests
  rust/pl011: drop dependency on qemu_api
  rust/hpet: drop now unneeded qemu_api dep
  rust: rename qemu_api_macros -> qemu_macros
  rust: split "hwcore" crate
  rust: split "system" crate
  rust: split "chardev" crate
  ...

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

hw/loongarch/virt: Register reset interface with cpu plug callback

With cpu hotplug is implemented on LoongArch virt machine, reset
interface with hot-added CPU should be registered. Otherwise there
will be problem if system reboots after cpu is hot-added.

Now register reset interface with CPU plug callback, so that all
cold/hot added CPUs let their reset interface registered. And remove
reset interface with CPU unplug callback.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Song Gao <gaosong@loongson.cn>
Message-ID: <20250906070200.3749326-4-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>

hw/loongarch/virt: Remove unnecessay pre-boot setting with BSP

With BSP core, it boots from aux boot code and loads data into register
A0-A2 and PC. Pre-boot setting is not unnecessary and can be removed.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-ID: <20250906070200.3749326-3-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>

hw/loongarch/virt: Add BSP support with aux boot code

If system boots directly from Linux kernel, BSP core jumps to kernel
entry of Linux kernel image and other APs jump to aux boot code. Instead
BSP and APs can all jump to aux boot code like UEFI bios.

With aux boot code, BSP core is judged from physical cpu id, whose
cpu id is 0. With BSP core, load data to register A0-A2 and PC.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-ID: <20250906070200.3749326-2-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>

Merge tag 'pull-target-arm-20250916' of https://gitlab.com/pm215/qemu into staging

target-arm queue:
* tests, scripts: Don't import print_function from __future__
* Implement FEAT_ATS1A
* Remove deprecated pxa CPU family
* arm/kvm: report registers we failed to set
* Expose SME registers to GDB via gdbstub
* linux-user/aarch64: Generate ESR signal records
* hw/arm/raspi4b: remove redundant check in raspi_add_memory_node
* hw/arm/virt: Allow user-creatable SMMUv3 dev instantiation
* system: drop the -old-param option

# -----BEGIN PGP SIGNATURE-----
#
# iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmjJpt8ZHHBldGVyLm1h
# eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3vRGEACO3VrePiMIA9N7egqlUiGn
# aRQVqIKeuPVj6TRVG7BSNWlAX8qvnOWOKg1yGVHDZv/nLvRje9UyfUAw7pf6jXod
# bzxWBCPJ0J0eOB64Tz87WRCLltKB5pEN+uIG00PtpBcXT1ixYCDgBZXyD3mwuJ4Q
# 5Yc5hEwQzpmh+EycLtfCHbmjKDw3x1ncpVlGceOG4h5fvzIvIhcNcZJXfAHhbhyO
# Y4c5PELrCkCLZaTtSSxd6VJ+vXQ9bNWyKaSZu2KRRnLcMeAqw2Ic7dLPlkzCVyxM
# PTOHy4TuDu+kqCbkxdnhpI6fvq5kcHyfTL6qX6tth8ZZS+qKGtvMEIXnYoy6q1kh
# 4jV5vizK8avx31fSiuTKVpttRv4dC+Aq5QrcgYtIVMeOwtkWHv610D8gcFPmXoG+
# uHX9WdzOjrYOzXVKzJaCZF6b7L31ptSEfOrx7asBC9k2wPRwonFXg4JGNq16Yann
# aAO5TM7NAUvM2IPgqS+Tf1Bk0iQqORxGfqzCyL76OO/QMMgfBy9elKH0UR0G+ePJ
# yjpub1oWIELSXsQGMrdFo1W4/NIpFMTu3DP9W+6XRPu1AvrAx/AsrTuvSvXoeFY9
# d/U3yWAXm5XxRzbCIUg7ke8I8zLwRz924M5PA8vophvSnfDLS3V8CJHLwbz/PqYc
# 0P2KCeI6d2NIhVik4mgEoQ==
# =5tK3
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 16 Sep 2025 11:05:19 AM PDT
# gpg:                using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg:                issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [unknown]
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>" [unknown]
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [unknown]
# gpg:                 aka "Peter Maydell <peter@archaic.org.uk>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* tag 'pull-target-arm-20250916' of https://gitlab.com/pm215/qemu: (36 commits)
  hw/usb/network: Remove hardcoded 0x40 prefix in STRING_ETHADDR response
  qtest/bios-tables-test: Update tables for smmuv3 tests
  qtest/bios-tables-test: Add tests for legacy smmuv3 and smmuv3 device
  bios-tables-test: Allow for smmuv3 test data.
  qemu-options.hx: Document the arm-smmuv3 device
  hw/arm/virt: Allow user-creatable SMMUv3 dev instantiation
  hw/pci: Introduce pci_setup_iommu_per_bus() for per-bus IOMMU ops retrieval
  hw/arm/virt: Add an SMMU_IO_LEN macro
  hw/arm/virt: Factor out common SMMUV3 dt bindings code
  hw/arm/virt-acpi-build: Update IORT for multiple smmuv3 devices
  hw/arm/virt-acpi-build: Re-arrange SMMUv3 IORT build
  hw/arm/smmu-common: Check SMMU has PCIe Root Complex association
  target/arm: Added test case for SME register exposure to GDB
  target/arm: Added support for SME register exposure to GDB
  target/arm: Increase MAX_PACKET_LENGTH for SME ZA remote gdb debugging
  arm/kvm: report registers we failed to set
  system: drop the -old-param option
  target/arm: Drop ARM_FEATURE_IWMMXT handling
  target/arm: Drop ARM_FEATURE_XSCALE handling
  target/arm: Remove iwmmxt helper functions
  ...

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

accel/kvm: Set guest_memfd_offset to non-zero value only when guest_memfd is valid

Current QEMU unconditionally sets the guest_memfd_offset of KVMSlot in
kvm_set_phys_mem(), which leads to the trace of kvm_set_user_memory looks:

kvm_set_user_memory AddrSpace#0 Slot#4 flags=0x2 gpa=0xe0000 size=0x20000 ua=0x7f5840de0000 guest_memfd=-1 guest_memfd_offset=0x3e0000 ret=0

It's confusing that the guest_memfd_offset has a non-zero value while
the guest_memfd is invalid (-1).

Change to only set guest_memfd_offset when guest_memfd is valid and
leave it as 0 when no valid guest_memfd.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Link: https://lore.kernel.org/r/20250728115707.1374614-4-xiaoyao.li@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

accel/kvm: Zero out mem explicitly in kvm_set_user_memory_region()

Zero out the entire mem explicitly before it's used, to ensure the unused
feilds (pad1, pad2) are all zeros. Otherwise, it might cause problem when
the pad fields are extended by future KVM.

Fixes: ce5a983233b4 ("kvm: Enable KVM_SET_USER_MEMORY_REGION2 for memslot")
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/r/20250728115707.1374614-3-xiaoyao.li@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

accel/kvm: Switch to check KVM_CAP_GUEST_MEMFD and KVM_CAP_USER_MEMORY2 on VM

It returns more accruate result on checking KVM_CAP_GUEST_MEMFD and
KVM_CAP_USER_MEMORY2 on VM instance instead of on KVM platform.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Link: https://lore.kernel.org/r/20250728115707.1374614-2-xiaoyao.li@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/kvm: Drop KVM_CAP_X86_SMM check in kvm_arch_init()

x86_machine_is_smm_enabled() checks the KVM_CAP_X86_SMM for KVM
case. No need to check KVM_CAP_X86_SMM in kvm_arch_init().

So just drop the check of KVM_CAP_X86_SMM to simplify the code.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Link: https://lore.kernel.org/r/20250729062014.1669578-3-xiaoyao.li@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

multiboot: Fix the split lock

While running the kvm-unit-tests on Intel platforms with "split lock
disable" feature, every test triggers a kernel warning of

  x86/split lock detection: #AC: qemu-system-x86_64/373232 took a split_lock trap at address: 0x1e3

Hack KVM by exiting to QEMU on split lock #AC, we get

KVM: exception 17 exit (error code 0x0)
EAX=00000001 EBX=00000000 ECX=00000014 EDX=0001fb80
ESI=00000000 EDI=000000a8 EBP=00000000 ESP=00006f10
EIP=000001e3 EFL=00010002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0900 00009000 0000ffff 00009300 DPL=0 DS16 [-WA]
CS =c000 000c0000 0000ffff 00009b00 DPL=0 CS16 [-RA]
SS =0000 00000000 0000ffff 00009300 DPL=0 DS16 [-WA]
DS =c000 000c0000 0000ffff 00009300 DPL=0 DS16 [-WA]
FS =0950 00009500 0000ffff 00009300 DPL=0 DS16 [-WA]
GS =06f2 00006f20 0000ffff 00009300 DPL=0 DS16 [-WA]
LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
GDT=     000c02b4 00000027
IDT=     00000000 000003ff
CR0=00000011 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
Code=89 16 08 00 65 66 0f 01 16 06 00 66 b8 01 00 00 00 0f 22 c0 <65> 66 ff 2e 00 00 b8 10 00 00 00 8e d0 8e d8 8e c0 8e e0 8e e8 66 b8 08 00 66 ba 10 05 66

And it matches with what disassembled from multiboo_dma.bin:

#objdump -b binary -m i386 -D pc-bios/multiboot_dma.bin

  1d1:   08 00                   or     %al,(%eax)
  1d3:   65 66 0f 01 16          lgdtw  %gs:(%esi)
  1d8:   06                      push   %es
  1d9:   00 66 b8                add    %ah,-0x48(%esi)
  1dc:   01 00                   add    %eax,(%eax)
  1de:   00 00                   add    %al,(%eax)
  1e0:   0f 22 c0                mov    %eax,%cr0
> 1e3:   65 66 ff 2e             ljmpw  *%gs:(%esi)
  1e7:   00 00                   add    %al,(%eax)
  1e9:   b8 10 00 00 00          mov    $0x10,%eax
  1ee:   8e d0                   mov    %eax,%ss
  1f0:   8e d8                   mov    %eax,%ds
  1f2:   8e c0                   mov    %eax,%es
  1f4:   8e e0                   mov    %eax,%fs
  1f6:   8e e8                   mov    %eax,%gs
  1f8:   66 b8 08 00             mov    $0x8,%ax
  1fc:   66 ba 10 05             mov    $0x510,%dx

We can see that the instruction at 0x1e3 is a far jmp through the GDT.
However, the GDT is not 8 byte aligned, the base is 0xc02b4.

Intel processors follow the LOCK semantics to set the accessed flag of the
segment descriptor when loading a segment descriptor. If the the segment
descriptor crosses two cache line, it causes split lock.

Fix it by aligning the GDT on 8 bytes, so that segment descriptor cannot
span two cache lines.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/r/20250808035027.2194673-1-xiaoyao.li@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: Define enum X86ASIdx for x86's address spaces

Define X86ASIdx as enum, like ARM's ARMASIdx, so that it's clear index 0
is for memory and index 1 is for SMM.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Tested-By: Kirill Martynov <stdcalllevi@yandex-team.ru>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Link: https://lore.kernel.org/r/20250730095253.1833411-3-xiaoyao.li@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/cpu: Enable SMM cpu address space under KVM

Kirill Martynov reported assertation in cpu_asidx_from_attrs() being hit
when x86_cpu_dump_state() is called to dump the CPU state[*]. It happens
when the CPU is in SMM and KVM emulation failure due to misbehaving
guest.

The root cause is that QEMU i386 never enables the SMM address space for
cpu since KVM SMM support has been added.

Enable the SMM cpu address space under KVM when the SMM is enabled for
the x86machine.

[*] https://lore.kernel.org/qemu-devel/20250523154431.506993-1-stdcalllevi@yandex-team.ru/

Reported-by: Kirill Martynov <stdcalllevi@yandex-team.ru>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Kirill Martynov <stdcalllevi@yandex-team.ru>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Link: https://lore.kernel.org/r/20250730095253.1833411-2-xiaoyao.li@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

hpet: guard IRQ handling with BQL

Commit [1] made qemu fail with abort:
xen_evtchn_set_gsi: Assertion `bql_locked()' failed.
when running ./tests/functional/x86_64/test_kvm_xen.py tests.

To fix it make sure that BQL is held when manipulating IRQs.

Fixes: 7defb58baf (hpet: switch to fine-grained device locking)
Reported-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Link: https://lore.kernel.org/r/20250910142506.86274-1-imammedo@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust: do not inline do_init_io

This is now possible since the hwcore integration tests do not
link the system crate anymore.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20250908105005.2119297-34-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust: meson: remove unnecessary complication in device crates

It is not necessary anymore to explicitly list procedural macro crates
when doing the final link using rustc.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20250908105005.2119297-33-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

docs: update rust.rst

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Link: https://lore.kernel.org/r/20250827104147.717203-23-marcandre.lureau@redhat.com
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust: re-export qemu macros from common/qom/hwcore

This is just a bit nicer.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Link: https://lore.kernel.org/r/20250827104147.717203-22-marcandre.lureau@redhat.com
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust: re-export qemu_macros internal helper in "bits"

Avoid the need to import "qemu_macros".

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Link: https://lore.kernel.org/r/20250827104147.717203-21-marcandre.lureau@redhat.com
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust: repurpose qemu_api -> tests

The crate purpose is only to provide integration tests at this point,
that can't easily be moved to a specific crate.

It's also often a good practice to have a single integration test crate
(see for ex https://github.com/rust-lang/cargo/issues/4867)

Drop README.md, use docs/devel/rust.rst instead.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Link: https://lore.kernel.org/r/20250827104147.717203-20-marcandre.lureau@redhat.com
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust/pl011: drop dependency on qemu_api

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Link: https://lore.kernel.org/r/20250827104147.717203-19-marcandre.lureau@redhat.com
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust/hpet: drop now unneeded qemu_api dep

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Link: https://lore.kernel.org/r/20250827104147.717203-18-marcandre.lureau@redhat.com
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust: rename qemu_api_macros -> qemu_macros

Since "qemu_api" is no longer the unique crate to provide APIs.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Link: https://lore.kernel.org/r/20250827104147.717203-17-marcandre.lureau@redhat.com
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust: split "hwcore" crate

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Link: https://lore.kernel.org/r/20250827104147.717203-16-marcandre.lureau@redhat.com
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust: split "system" crate

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Link: https://lore.kernel.org/r/20250827104147.717203-15-marcandre.lureau@redhat.com
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust: split "chardev" crate

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Link: https://lore.kernel.org/r/20250827104147.717203-14-marcandre.lureau@redhat.com
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust: split "qom" crate

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20250827104147.717203-13-marcandre.lureau@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust: split "bql" crate

Unfortunately, an example had to be compile-time disabled, since it
relies on higher level crates (qdev, irq etc). The alternative is
probably to move that code to an example in qemu-api or elsewere and
make a link to it, or include_str.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Link: https://lore.kernel.org/r/20250827104147.717203-12-marcandre.lureau@redhat.com
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust: split "migration" crate

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Link: https://lore.kernel.org/r/20250827104147.717203-11-marcandre.lureau@redhat.com
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust: split "util" crate

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Link: https://lore.kernel.org/r/20250827104147.717203-7-marcandre.lureau@redhat.com
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust: make build.rs generic over various ./rust/projects

Guess the name of the subdir from the manifest directory, instead of
hard-coding it. In the following commits, other crates can then link to
this file, instead of maintaining their own copy.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Link: https://lore.kernel.org/r/20250827104147.717203-5-marcandre.lureau@redhat.com
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust: split Rust-only "common" crate

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Link: https://lore.kernel.org/r/20250827104147.717203-6-marcandre.lureau@redhat.com
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust: move Cell vmstate impl

This will allow to split vmstate to a standalone crate next.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Link: https://lore.kernel.org/r/20250827104147.717203-10-marcandre.lureau@redhat.com
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust: move VMState handling to QOM module

This will allow to split vmstate to a standalone crate next.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Link: https://lore.kernel.org/r/20250827104147.717203-9-marcandre.lureau@redhat.com
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust: move vmstate_clock!() to qdev module

This will allow to split vmstate to a standalone crate next.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Link: https://lore.kernel.org/r/20250827104147.717203-8-marcandre.lureau@redhat.com
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust: add workspace authors

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Link: https://lore.kernel.org/r/20250827104147.717203-4-marcandre.lureau@redhat.com
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust: remove unused global qemu "allocator"

The global allocator has always been disabled. There is no clear reason
Rust and C should use the same allocator. Allocations made from Rust
must be freed by Rust, and same for C, otherwise we head into troubles.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Link: https://lore.kernel.org/r/20250827104147.717203-3-marcandre.lureau@redhat.com
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

docs/rust: update msrv

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Link: https://lore.kernel.org/r/20250827104147.717203-2-marcandre.lureau@redhat.com
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust: qdev: const_refs_to_static

Now that const_refs_static can be assumed, convert the members of
the DeviceImpl trait from functions to constants. This lets the
compiler know that they have a 'static lifetime, and removes the
need for the weird "Box::leak()".

Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20250908105005.2119297-10-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust: vmstate: use const_refs_to_static

The VMStateDescriptionBuilder already needs const_refs_static, so
use it to remove the need for vmstate_clock! and vmstate_struct!,
as well as to simplify the implementation for scalars.

If the consts in the VMState trait can reference to static
VMStateDescription, scalars do not need the info_enum_to_ref!
indirection and structs can implement the VMState trait themselves.

Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20250908105005.2119297-9-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust: vmstate: convert to use builder pattern

Similar to MemoryRegionOps, the builder pattern has two advantages:
1) it makes it possible to build a VMStateDescription that knows which
types it will be invoked on; 2) it provides a way to wrap the callbacks
and let devices avoid "unsafe".

Unfortunately, building a static VMStateDescription requires the
builder methods to be "const", and because the VMStateFields are
*also* static, this requires const_refs_static. So this requires
Rust 1.83.0.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20250908105005.2119297-8-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust: add qdev Device derive macro

Add derive macro for declaring qdev properties directly above the field
definitions. To do this, we split DeviceImpl::properties method on a
separate trait so we can implement only that part in the derive macro
expansion (we cannot partially implement the DeviceImpl trait).

Adding a `property` attribute above the field declaration will generate
a `qemu_api::bindings::Property` array member in the device's property
list.

Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Link: https://lore.kernel.org/r/20250711-rust-qdev-properties-v3-1-e198624416fb@linaro.org
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust: use inline const expressions

They were stabilized in Rust 1.79.0.

Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20250908105005.2119297-6-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust: add missing const markers for MSRV==1.83.0

Rust 1.83 allows more functions to be marked const.
Fix clippy with bumped minimum supported Rust version.

Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20250908105005.2119297-5-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

meson, cargo: require Rust 1.83.0

Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20250908105005.2119297-4-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

configure: bump Meson to 1.9.0 for use with Rust

Meson 1.9.0 provides mixed linking of Rust and C objects. As a side effect,
this also allows adding dependencies with "sources: ..." files to Rust crates
that use structured_sources().

It can also clean up up the meson.build files for Rust noticeably, but due
to an issue with doctests (see https://github.com/mesonbuild/meson/pull/14973)
that will have to wait for 1.9.1.

Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20250908105005.2119297-3-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

ci: temporarily remove rust from Ubuntu

This is for the purpose of getting an easy-to-use base for future
development. The plan is:
- that Debian will require trixie to enable Rust usage
- that Ubuntu will backport 1.83 to its 22.04 and 24.04 versions
(https://bugs.launchpad.net/ubuntu/+source/rustc-1.83/+bug/2120318)

Marc-André is working on adding Rust to other CI jobs.

Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20250908105005.2119297-2-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

tcg/user: do not set exit_request gratuitously

Whenever user-mode emulation needs to go all the way out of the cpu
exec loop, it uses cpu_exit(), which already sets cpu->exit_request.

Therefore, there is no need for tcg_kick_vcpu_thread() to set
cpu->exit_request again outside system emulation.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

accel: make all calls to qemu_process_cpu_events look the same

There is no reason for some accelerators to use qemu_process_cpu_events_common
(which is separated from qemu_process_cpu_events() specifically for round
robin TCG). They can also check for events directly on the first pass through
the loop, instead of setting cpu->exit_request to true.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

cpus: clear exit_request in qemu_process_cpu_events

Make the code common to all accelerators: after seeing cpu->exit_request
set to true, accelerator code needs to reach qemu_process_cpu_events_common().

So for the common cases where they use qemu_process_cpu_events(), go ahead and
clear it in there.  Note that the cheap qatomic_set() is enough because
at this point the thread has taken the BQL; qatomic_set_mb() is not needed.
In particular, this is the ordering of the communication between
I/O and vCPU threads is always the same.

In the I/O thread:

(a) store other memory locations that will be checked if cpu->exit_request
    or cpu->interrupt_request is 1 (for example cpu->stop or cpu->work_list
    for cpu->exit_request)

(b) cpu_exit(): store-release cpu->exit_request, or
(b) cpu_interrupt(): store-release cpu->interrupt_request

>>> at this point, cpu->halt_cond is broadcast and the BQL released

(c) do the accelerator-specific kick (e.g. write icount_decr for TCG,
    pthread_kill for KVM, etc.)

In the vCPU thread instead the opposite order is respected:

(c) the accelerator's execution loop exits thanks to the kick

(b) then the inner execution loop checks cpu->interrupt_request
    and cpu->exit_request.  If needed cpu->interrupt_request is
    converted into cpu->exit_request when work is needed outside
    the execution loop.

(a) then the other memory locations are checked.  Some may need to
    be read under the BQL, but the vCPU thread may also take other
    locks (e.g. for queued work items) or none at all.

qatomic_set_mb() would only be needed if the halt sleep was done
outside the BQL (though in that case, cpu->exit_request probably
would be replaced by a QemuEvent or something like that).

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

bsd-user, linux-user: introduce qemu_process_cpu_events

Add a user-mode emulation version of the function. More will be
added later, for now it is just process_queued_cpu_work.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

treewide: rename qemu_wait_io_event/qemu_wait_io_event_common

Do so before extending it to the user-mode emulators, where there is no
such thing as an "I/O thread".

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

cpus: properly kick CPUs out of inner execution loop

Now that cpu_exit() actually kicks all accelerators, use it whenever
the message to another thread is processed in qemu_wait_io_event().

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

cpus: remove TCG-ism from cpu_exit()

Now that TCG has its own kick function, make cpu_exit() do the right kick
for all accelerators.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

accel/tcg: inline cpu_exit()

Right now, cpu_exit() is not usable from all accelerators because it
includes a TCG-specific thread kick. In fact, cpu_exit() doubles as
the TCG thread-kick via tcg_kick_vcpu_thread().

In preparation for changing that, inline cpu_exit() into
tcg_kick_vcpu_thread(). The direction of the calls can then be
reversed, with an accelerator-independent cpu_exit() calling into
qemu_vcpu_kick() rather than the opposite.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

accel/tcg: create a thread-kick function for TCG

Round-robin TCG is calling into cpu_exit() directly. In preparation
for making cpu_exit() usable from all accelerators, define a generic
thread-kick function for TCG which is used directly in the multi-threaded
case, and through CPU_FOREACH in the round-robin case.

Use it also for user-mode emulation, and take the occasion to move
the implementation to accel/tcg/user-exec.c.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

accel: use atomic accesses for exit_request

CPU threads write exit_request as a "note to self" that they need to
go out to a slow path. This write happens out of the BQL and can be
a data race with another threads' cpu_exit(); use atomic accesses
consistently.

While at it, change the source argument from int ("1") to bool ("true").

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

accel: use store_release/load_acquire for cross-thread exit_request

Reads and writes cpu->exit_request do not use a load-acquire/store-release
pair right now, but this means that cpu_exit() may not write cpu->exit_request
after any flags that are read by the vCPU thread.

Probably everything is protected one way or the other by the BQL, because
cpu->exit_request leads to the slow path, where the CPU thread often takes
the BQL (for example, to go to sleep by waiting on the BQL-protected
cpu->halt_cond); but it's not clear, so use load-acquire/store-release
consistently.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

cpus: document that qemu_cpu_kick() can be used for BQL-less operation

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>