Arman Nabiev [Thu, 22 Aug 2024 16:56:53 +0000 (19:56 +0300)]
target/ppc: Fix migration of CPUs with TLB_EMB TLB type
In vmstate_tlbemb a cut-and-paste error meant we gave
this vmstate subsection the same "cpu/tlb6xx" name as
the vmstate_tlb6xx subsection. This breaks migration load
for any CPU using the TLB_EMB CPU type, because when we
see the "tlb6xx" name in the incoming data we try to
interpret it as a vmstate_tlb6xx subsection, which it
isn't the right format for:
$ qemu-system-ppc -drive
if=none,format=qcow2,file=/home/petmay01/test-images/virt/dummy.qcow2
-monitor stdio -M bamboo
QEMU 9.0.92 monitor - type 'help' for more information
(qemu) savevm foo
(qemu) loadvm foo
Missing section footer for cpu
Error: Error -22 while loading VM state
Correct the incorrect vmstate section name. Since migration
for these CPU types was completely broken before, we don't
need to care that this is a migration compatibility break.
This affects the PPC 405, 440, 460 and e200 CPU families.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2522 Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Arman Nabiev <nabiev.arman13@gmail.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>
(cherry picked from commit 203beb6f047467a4abfc8267c234393cea3f471c) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Alex Bennée [Fri, 26 Apr 2024 15:39:37 +0000 (16:39 +0100)]
gitlab: migrate the s390x custom machine to 22.04
20.04 is dead (from QEMU's point of view), long live 22.04!
Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240426153938.1707723-3-alex.bennee@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 108d99742af1fa6e977dcfac9d4151b7915e33a3) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
target/hppa: Fix PSW V-bit packaging in cpu_hppa_get for hppa64
While adding hppa64 support, the psw_v variable got extended from 32 to 64
bits. So, when packaging the PSW-V bit from the psw_v variable for interrupt
processing, check bit 31 instead the 63th (sign) bit.
This fixes a hard to find Linux kernel boot issue where the loss of the PSW-V
bit due to an ITLB interruption in the middle of a series of ds/addc
instructions (from the divU milicode library) generated the wrong division
result and thus triggered a Linux kernel crash.
Link: https://lore.kernel.org/lkml/718b8afe-222f-4b3a-96d3-93af0e4ceff1@roeck-us.net/ Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Helge Deller <deller@gmx.de> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Guenter Roeck <linux@roeck-us.net> Fixes: 931adff31478 ("target/hppa: Update cpu_hppa_get/put_psw for hppa64") Cc: qemu-stable@nongnu.org # v8.2+
(cherry picked from commit ead5078cf1a5f11d16e3e8462154c859620bcc7e) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: context fixup in target/hppa/helper.c due to lack of v9.0.0-688-gebc9401a4067 "target/hppa: Split PSW X and B into their own field")
Volker Rümelin [Fri, 2 Aug 2024 07:18:05 +0000 (09:18 +0200)]
hw/audio/virtio-snd: fix invalid param check
Commit 9b6083465f ("virtio-snd: check for invalid param shift
operands") tries to prevent invalid parameters specified by the
guest. However, the code is not correct.
Change the code so that the parameters format and rate, which are
a bit numbers, are compared with the bit size of the data type.
Fixes: 9b6083465f ("virtio-snd: check for invalid param shift operands") Signed-off-by: Volker Rümelin <vr_qemu@t-online.de>
Message-Id: <20240802071805.7123-1-vr_qemu@t-online.de> Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 7d14471a121878602cb4e748c4707f9ab9a9e3e2) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
That commit causes reverse_debugging.py test failures, and does
not seem to solve the root cause of the problem x86-64 still
hangs in record/replay tests.
The problem with short-cutting the iowait that was taken during
record phase is that related events will not get consumed at the
same points (e.g., reading the clock).
A hang with zero icount always seems to be a symptom of an earlier
problem that has caused the recording to become out of synch with
the execution and consumption of events by replay.
Acked-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20240813050638.446172-6-npiggin@gmail.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240813202329.1237572-14-alex.bennee@linaro.org>
(cherry picked from commit 94962ff00d09674047aed896e87ba09736cd6941) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Fixes: https://gitlab.com/qemu-project/qemu/-/issues/2524
Peter Maydell [Tue, 20 Aug 2024 14:44:29 +0000 (15:44 +0100)]
migration/multifd: Free MultiFDRecvParams::data
In multifd_recv_setup() we allocate (among other things)
* a MultiFDRecvData struct to multifd_recv_state::data
* a MultiFDRecvData struct to each multfd_recv_state->params[i].data
(Then during execution we might swap these pointers around.)
But in multifd_recv_cleanup() we free multifd_recv_state->data
in multifd_recv_cleanup_state() but we don't ever free the
multifd_recv_state->params[i].data. This results in a memory
leak reported by LeakSanitizer:
(cd build/asan && \
ASAN_OPTIONS="fast_unwind_on_malloc=0:strip_path_prefix=/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../" \
QTEST_QEMU_BINARY=./qemu-system-x86_64 \
./tests/qtest/migration-test --tap -k -p /x86_64/migration/multifd/file/mapped-ram )
[...]
Direct leak of 72 byte(s) in 3 object(s) allocated from:
#0 0x561cc0afcfd8 in __interceptor_calloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qemu-system-x86_64+0x218efd8) (BuildId: be72e086d4e47b172b0a72779972213fd9916466)
#1 0x7f89d37acc50 in g_malloc0 debian/build/deb/../../../glib/gmem.c:161:13
#2 0x561cc1e9c83c in multifd_recv_setup migration/multifd.c:1606:19
#3 0x561cc1e68618 in migration_ioc_process_incoming migration/migration.c:972:9
#4 0x561cc1e3ac59 in migration_channel_process_incoming migration/channel.c:45:9
#5 0x561cc1e4fa0b in file_accept_incoming_migration migration/file.c:132:5
#6 0x561cc30f2c0c in qio_channel_fd_source_dispatch io/channel-watch.c:84:12
#7 0x7f89d37a3c43 in g_main_dispatch debian/build/deb/../../../glib/gmain.c:3419:28
#8 0x7f89d37a3c43 in g_main_context_dispatch debian/build/deb/../../../glib/gmain.c:4137:7
#9 0x561cc3b21659 in glib_pollfds_poll util/main-loop.c:287:9
#10 0x561cc3b1ff93 in os_host_main_loop_wait util/main-loop.c:310:5
#11 0x561cc3b1fb5c in main_loop_wait util/main-loop.c:589:11
#12 0x561cc1da2917 in qemu_main_loop system/runstate.c:801:9
#13 0x561cc3796c1c in qemu_default_main system/main.c:37:14
#14 0x561cc3796c67 in main system/main.c:48:12
#15 0x7f89d163bd8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
#16 0x7f89d163be3f in __libc_start_main csu/../csu/libc-start.c:392:3
#17 0x561cc0a79fa4 in _start (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qemu-system-x86_64+0x210bfa4) (BuildId: be72e086d4e47b172b0a72779972213fd9916466)
Direct leak of 24 byte(s) in 1 object(s) allocated from:
#0 0x561cc0afcfd8 in __interceptor_calloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qemu-system-x86_64+0x218efd8) (BuildId: be72e086d4e47b172b0a72779972213fd9916466)
#1 0x7f89d37acc50 in g_malloc0 debian/build/deb/../../../glib/gmem.c:161:13
#2 0x561cc1e9bed9 in multifd_recv_setup migration/multifd.c:1588:32
#3 0x561cc1e68618 in migration_ioc_process_incoming migration/migration.c:972:9
#4 0x561cc1e3ac59 in migration_channel_process_incoming migration/channel.c:45:9
#5 0x561cc1e4fa0b in file_accept_incoming_migration migration/file.c:132:5
#6 0x561cc30f2c0c in qio_channel_fd_source_dispatch io/channel-watch.c:84:12
#7 0x7f89d37a3c43 in g_main_dispatch debian/build/deb/../../../glib/gmain.c:3419:28
#8 0x7f89d37a3c43 in g_main_context_dispatch debian/build/deb/../../../glib/gmain.c:4137:7
#9 0x561cc3b21659 in glib_pollfds_poll util/main-loop.c:287:9
#10 0x561cc3b1ff93 in os_host_main_loop_wait util/main-loop.c:310:5
#11 0x561cc3b1fb5c in main_loop_wait util/main-loop.c:589:11
#12 0x561cc1da2917 in qemu_main_loop system/runstate.c:801:9
#13 0x561cc3796c1c in qemu_default_main system/main.c:37:14
#14 0x561cc3796c67 in main system/main.c:48:12
#15 0x7f89d163bd8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
#16 0x7f89d163be3f in __libc_start_main csu/../csu/libc-start.c:392:3
#17 0x561cc0a79fa4 in _start (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qemu-system-x86_64+0x210bfa4) (BuildId: be72e086d4e47b172b0a72779972213fd9916466)
SUMMARY: AddressSanitizer: 96 byte(s) leaked in 4 allocation(s).
Free the params[i].data too.
Cc: qemu-stable@nongnu.org Fixes: d117ed0699d41 ("migration/multifd: Allow receiving pages without packets") Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Fabiano Rosas <farosas@suse.de>
(cherry picked from commit 4c107870e8b2ba3951ee0c46123f1c3b5d3a19d3) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Cindy Lu [Tue, 6 Aug 2024 09:37:12 +0000 (17:37 +0800)]
virtio-pci: Fix the use of an uninitialized irqfd
The crash was reported in MAC OS and NixOS, here is the link for this bug
https://gitlab.com/qemu-project/qemu/-/issues/2334
https://gitlab.com/qemu-project/qemu/-/issues/2321
In this bug, they are using the virtio_input device. The guest notifier was
not supported for this device, The function virtio_pci_set_guest_notifiers()
was not called, and the vector_irqfd was not initialized.
So the fix is adding the check for vector_irqfd in virtio_pci_get_notifier()
The function virtio_pci_get_notifier() can be used in various devices.
It could also be called when VIRTIO_CONFIG_S_DRIVER_OK is not set. In this situation,
the vector_irqfd being NULL is acceptable. We can allow the device continue to boot
If the vector_irqfd still hasn't been initialized after VIRTIO_CONFIG_S_DRIVER_OK
is set, it means that the function set_guest_notifiers was not called before the
driver started. This indicates that the device is not using the notifier.
At this point, we will let the check fail.
This fix is verified in vyatta,MacOS,NixOS,fedora system.
The bt tree for this bug is:
Thread 6 "CPU 0/KVM" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7c817be006c0 (LWP 1269146)]
kvm_virtio_pci_vq_vector_use () at ../qemu-9.0.0/hw/virtio/virtio-pci.c:817
817 if (irqfd->users == 0) {
(gdb) thread apply all bt
...
Thread 6 (Thread 0x7c817be006c0 (LWP 1269146) "CPU 0/KVM"):
0 kvm_virtio_pci_vq_vector_use () at ../qemu-9.0.0/hw/virtio/virtio-pci.c:817
1 kvm_virtio_pci_vector_use_one () at ../qemu-9.0.0/hw/virtio/virtio-pci.c:893
2 0x00005983657045e2 in memory_region_write_accessor () at ../qemu-9.0.0/system/memory.c:497
3 0x0000598365704ba6 in access_with_adjusted_size () at ../qemu-9.0.0/system/memory.c:573
4 0x0000598365705059 in memory_region_dispatch_write () at ../qemu-9.0.0/system/memory.c:1528
5 0x00005983659b8e1f in flatview_write_continue_step.isra.0 () at ../qemu-9.0.0/system/physmem.c:2713
6 0x000059836570ba7d in flatview_write_continue () at ../qemu-9.0.0/system/physmem.c:2743
7 flatview_write () at ../qemu-9.0.0/system/physmem.c:2774
8 0x000059836570bb76 in address_space_write () at ../qemu-9.0.0/system/physmem.c:2894
9 0x0000598365763afe in address_space_rw () at ../qemu-9.0.0/system/physmem.c:2904
10 kvm_cpu_exec () at ../qemu-9.0.0/accel/kvm/kvm-all.c:2917
11 0x000059836576656e in kvm_vcpu_thread_fn () at ../qemu-9.0.0/accel/kvm/kvm-accel-ops.c:50
12 0x0000598365926ca8 in qemu_thread_start () at ../qemu-9.0.0/util/qemu-thread-posix.c:541
13 0x00007c8185bcd1cf in ??? () at /usr/lib/libc.so.6
14 0x00007c8185c4e504 in clone () at /usr/lib/libc.so.6
Fixes: 2ce6cff94d ("virtio-pci: fix use of a released vector") Cc: qemu-stable@nongnu.org Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20240806093715.65105-1-lulu@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit a8e63ff289d137197ad7a701a587cc432872d798) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Klaus Jensen [Tue, 20 Aug 2024 04:16:48 +0000 (06:16 +0200)]
hw/nvme: fix leak of uninitialized memory in io_mgmt_recv
Yutaro Shimizu from the Cyber Defense Institute discovered a bug in the
NVMe emulation that leaks contents of an uninitialized heap buffer if
subsystem and FDP emulation are enabled.
Cc: qemu-stable@nongnu.org Reported-by: Yutaro Shimizu <shimizu@cyberdefense.jp> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
(cherry picked from commit 6a22121c4f25b181e99479f65958ecde65da1c92) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Peter Maydell [Mon, 19 Aug 2024 14:50:21 +0000 (15:50 +0100)]
crypto/tlscredspsk: Free username on finalize
When the creds->username property is set we allocate memory
for it in qcrypto_tls_creds_psk_prop_set_username(), but
we never free this when the QCryptoTLSCredsPSK is destroyed.
Free the memory in finalize.
This fixes a LeakSanitizer complaint in migration-test:
Direct leak of 5 byte(s) in 1 object(s) allocated from:
#0 0x5624e5c99dee in malloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qemu-system-x86_64+0x218edee) (BuildId: a9e623fa1009a9435c0142c037cd7b8c1ad04ce3)
#1 0x7fb199ae9738 in g_malloc debian/build/deb/../../../glib/gmem.c:128:13
#2 0x7fb199afe583 in g_strdup debian/build/deb/../../../glib/gstrfuncs.c:361:17
#3 0x5624e82ea919 in qcrypto_tls_creds_psk_prop_set_username /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../crypto/tlscredspsk.c:255:23
#4 0x5624e812c6b5 in property_set_str /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qom/object.c:2277:5
#5 0x5624e8125ce5 in object_property_set /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qom/object.c:1463:5
#6 0x5624e8136e7c in object_set_properties_from_qdict /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qom/object_interfaces.c:55:14
#7 0x5624e81372d2 in user_creatable_add_type /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qom/object_interfaces.c:112:5
#8 0x5624e8137964 in user_creatable_add_qapi /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qom/object_interfaces.c:157:11
#9 0x5624e891ba3c in qmp_object_add /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qom/qom-qmp-cmds.c:227:5
#10 0x5624e8af9118 in qmp_marshal_object_add /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qapi/qapi-commands-qom.c:337:5
#11 0x5624e8bd1d49 in do_qmp_dispatch_bh /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qapi/qmp-dispatch.c:128:5
#12 0x5624e8cb2531 in aio_bh_call /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/async.c:171:5
#13 0x5624e8cb340c in aio_bh_poll /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/async.c:218:13
#14 0x5624e8c0be98 in aio_dispatch /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/aio-posix.c:423:5
#15 0x5624e8cba3ce in aio_ctx_dispatch /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/async.c:360:5
#16 0x7fb199ae0d3a in g_main_dispatch debian/build/deb/../../../glib/gmain.c:3419:28
#17 0x7fb199ae0d3a in g_main_context_dispatch debian/build/deb/../../../glib/gmain.c:4137:7
#18 0x5624e8cbe1d9 in glib_pollfds_poll /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/main-loop.c:287:9
#19 0x5624e8cbcb13 in os_host_main_loop_wait /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/main-loop.c:310:5
#20 0x5624e8cbc6dc in main_loop_wait /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/main-loop.c:589:11
#21 0x5624e6f3f917 in qemu_main_loop /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../system/runstate.c:801:9
#22 0x5624e893379c in qemu_default_main /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../system/main.c:37:14
#23 0x5624e89337e7 in main /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../system/main.c:48:12
#24 0x7fb197972d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
#25 0x7fb197972e3f in __libc_start_main csu/../csu/libc-start.c:392:3
#26 0x5624e5c16fa4 in _start (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qemu-system-x86_64+0x210bfa4) (BuildId: a9e623fa1009a9435c0142c037cd7b8c1ad04ce3)
SUMMARY: AddressSanitizer: 5 byte(s) leaked in 1 allocation(s).
Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240819145021.38524-1-peter.maydell@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 87e012f29f2e47dcd8c385ff8bb8188f9e06d4ea) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Alyssa Ross [Mon, 5 Aug 2024 10:49:21 +0000 (12:49 +0200)]
target/hexagon: don't look for static glib
When cross compiling QEMU configured with --static, I've been getting
configure errors like the following:
Build-time dependency glib-2.0 found: NO
../target/hexagon/meson.build:303:15: ERROR: Dependency lookup for glib-2.0 with method 'pkgconfig' failed: Could not generate libs for glib-2.0:
Package libpcre2-8 was not found in the pkg-config search path.
Perhaps you should add the directory containing `libpcre2-8.pc'
to the PKG_CONFIG_PATH environment variable
Package 'libpcre2-8', required by 'glib-2.0', not found
This happens because --static sets the prefer_static Meson option, but
my build machine doesn't have a static libpcre2. I don't think it
makes sense to insist that native dependencies are static, just
because I want the non-native QEMU binaries to be static.
module: Prevent crash by resetting local_err in module_load_qom_all()
Set local_err to NULL after it has been freed in error_report_err(). This
avoids triggering assert(*errp == NULL) failure in error_setv() when
local_err is reused in the loop.
Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Claudio Fontana <cfontana@suse.de> Reviewed-by: Denis V. Lunev <den@openvz.org> Link: https://lore.kernel.org/r/20240809121340.992049-2-alexander.ivanov@virtuozzo.com
[Do the same by moving the declaration instead. - Paolo] Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 940d802b24e63650e0eacad3714e2ce171cba17c) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Peter Maydell [Fri, 9 Aug 2024 16:04:30 +0000 (17:04 +0100)]
target/arm: Fix usage of MMU indexes when EL3 is AArch32
Our current usage of MMU indexes when EL3 is AArch32 is confused.
Architecturally, when EL3 is AArch32, all Secure code runs under the
Secure PL1&0 translation regime:
* code at EL3, which might be Mon, or SVC, or any of the
other privileged modes (PL1)
* code at EL0 (Secure PL0)
This is different from when EL3 is AArch64, in which case EL3 is its
own translation regime, and EL1 and EL0 (whether AArch32 or AArch64)
have their own regime.
We claimed to be mapping Secure PL1 to our ARMMMUIdx_EL3, but didn't
do anything special about Secure PL0, which meant it used the same
ARMMMUIdx_EL10_0 that NonSecure PL0 does. This resulted in a bug
where arm_sctlr() incorrectly picked the NonSecure SCTLR as the
controlling register when in Secure PL0, which meant we were
spuriously generating alignment faults because we were looking at the
wrong SCTLR control bits.
The use of ARMMMUIdx_EL3 for Secure PL1 also resulted in the bug that
we wouldn't honour the PAN bit for Secure PL1, because there's no
equivalent _PAN mmu index for it.
We could fix this in one of two ways:
* The most straightforward is to add new MMU indexes EL30_0,
EL30_3, EL30_3_PAN to correspond to "Secure PL1&0 at PL0",
"Secure PL1&0 at PL1", and "Secure PL1&0 at PL1 with PAN".
This matches how we use indexes for the AArch64 regimes, and
preserves propirties like being able to determine the privilege
level from an MMU index without any other information. However
it would add two MMU indexes (we can share one with ARMMMUIdx_EL3),
and we are already using 14 of the 16 the core TLB code permits.
* The more complicated approach is the one we take here. We use
the same MMU indexes (E10_0, E10_1, E10_1_PAN) for Secure PL1&0
than we do for NonSecure PL1&0. This saves on MMU indexes, but
means we need to check in some places whether we're in the
Secure PL1&0 regime or not before we interpret an MMU index.
The changes in this commit were created by auditing all the places
where we use specific ARMMMUIdx_ values, and checking whether they
needed to be changed to handle the new index value usage.
Note for potential stable backports: taking also the previous
(comment-change-only) commit might make the backport easier.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2326 Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Tested-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240809160430.1144805-3-peter.maydell@linaro.org
(cherry picked from commit 4c2c0474693229c1f533239bb983495c5427784d) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Peter Maydell [Fri, 9 Aug 2024 16:04:29 +0000 (17:04 +0100)]
target/arm: Update translation regime comment for new features
We have a long comment describing the Arm architectural translation
regimes and how we map them to QEMU MMU indexes. This comment has
got a bit out of date:
* FEAT_SEL2 allows Secure EL2 and corresponding new regimes
* FEAT_RME introduces Realm state and its translation regimes
* We now model the Cortex-R52 so that is no longer a hypothetical
* We separated Secure Stage 2 and NonSecure Stage 2 MMU indexes
* We have an MMU index per physical address spacea
Add the missing pieces so that the list of architectural translation
regimes matches the Arm ARM, and the list and count of QEMU MMU
indexes in the comment matches the enum.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Tested-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240809160430.1144805-2-peter.maydell@linaro.org
(cherry picked from commit 150c24f34e9c3388c0f0ad04ddd997e5559db800) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: pick this one for stable-9.0 so the next commit applies cleanly)
target/arm: Clear high SVE elements in handle_vec_simd_wshli
AdvSIMD instructions are supposed to zero bits beyond 128.
Affects SSHLL, USHLL, SSHLL2, USHLL2.
Cc: qemu-stable@nongnu.org Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240717060903.205098-15-richard.henderson@linaro.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 8e0c9a9efa21a16190cbac288e414bbf1d80f639) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
block/blkio: use FUA flag on write zeroes only if supported
libblkio supports BLKIO_REQ_FUA with write zeros requests only since
version 1.4.0, so let's inform the block layer that the blkio driver
supports it only in this case. Otherwise we can have runtime errors
as reported in https://issues.redhat.com/browse/RHEL-32878
Fixes: fd66dbd424 ("blkio: add libblkio block driver") Cc: qemu-stable@nongnu.org Buglink: https://issues.redhat.com/browse/RHEL-32878 Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240808080545.40744-1-sgarzare@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 547c4e50929ec6c091d9c16a7b280e829b12b463) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Jianzhou Yue [Fri, 9 Aug 2024 16:37:56 +0000 (17:37 +0100)]
hw/core/ptimer: fix timer zero period condition for freq > 1GHz
The real period is zero when both period and period_frac are zero.
Check the method ptimer_set_freq, if freq is larger than 1000 MHz,
the period is zero, but the period_frac is not, in this case, the
ptimer will work but the current code incorrectly recognizes that
the ptimer is disabled.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2306 Signed-off-by: JianZhou Yue <JianZhou.Yue@verisilicon.com>
Message-id: 3DA024AEA8B57545AF1B3CAA37077D0FB75E82C8@SHASXM03.verisilicon.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 446e5e8b4515e9a7be69ef6a29852975289bb6f0) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
David Woodhouse [Tue, 6 Aug 2024 17:21:37 +0000 (18:21 +0100)]
net: Fix '-net nic,model=' for non-help arguments
Oops, don't *delete* the model option when checking for 'help'.
Fixes: 64f75f57f9d2 ("net: Reinstate '-net nic, model=help' output as documented in man page") Reported-by: Hans <sungdgdhtryrt@gmail.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Cc: qemu-stable@nongnu.org Reviewed-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit fa62cb989a9146c82f8f172715042852f5d36200) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Eric Blake [Thu, 22 Aug 2024 14:35:29 +0000 (09:35 -0500)]
nbd/server: CVE-2024-7409: Avoid use-after-free when closing server
Commit 3e7ef738 plugged the use-after-free of the global nbd_server
object, but overlooked a use-after-free of nbd_server->listener.
Although this race is harder to hit, notice that our shutdown path
first drops the reference count of nbd_server->listener, then triggers
actions that can result in a pending client reaching the
nbd_blockdev_client_closed() callback, which in turn calls
qio_net_listener_set_client_func on a potentially stale object.
If we know we don't want any more clients to connect, and have already
told the listener socket to shut down, then we should not be trying to
update the listener socket's associated function.
Eric Blake [Wed, 7 Aug 2024 17:23:13 +0000 (12:23 -0500)]
nbd/server: CVE-2024-7409: Close stray clients at server-stop
A malicious client can attempt to connect to an NBD server, and then
intentionally delay progress in the handshake, including if it does
not know the TLS secrets. Although the previous two patches reduce
this behavior by capping the default max-connections parameter and
killing slow clients, they did not eliminate the possibility of a
client waiting to close the socket until after the QMP nbd-server-stop
command is executed, at which point qemu would SEGV when trying to
dereference the NULL nbd_server global which is no longer present.
This amounts to a denial of service attack. Worse, if another NBD
server is started before the malicious client disconnects, I cannot
rule out additional adverse effects when the old client interferes
with the connection count of the new server (although the most likely
is a crash due to an assertion failure when checking
nbd_server->connections > 0).
For environments without this patch, the CVE can be mitigated by
ensuring (such as via a firewall) that only trusted clients can
connect to an NBD server. Note that using frameworks like libvirt
that ensure that TLS is used and that nbd-server-stop is not executed
while any trusted clients are still connected will only help if there
is also no possibility for an untrusted client to open a connection
but then stall on the NBD handshake.
Given the previous patches, it would be possible to guarantee that no
clients remain connected by having nbd-server-stop sleep for longer
than the default handshake deadline before finally freeing the global
nbd_server object, but that could make QMP non-responsive for a long
time. So intead, this patch fixes the problem by tracking all client
sockets opened while the server is running, and forcefully closing any
such sockets remaining without a completed handshake at the time of
nbd-server-stop, then waiting until the coroutines servicing those
sockets notice the state change. nbd-server-stop now has a second
AIO_WAIT_WHILE_UNLOCKED (the first is indirectly through the
blk_exp_close_all_type() that disconnects all clients that completed
handshakes), but forced socket shutdown is enough to progress the
coroutines and quickly tear down all clients before the server is
freed, thus finally fixing the CVE.
This patch relies heavily on the fact that nbd/server.c guarantees
that it only calls nbd_blockdev_client_closed() from the main loop
(see the assertion in nbd_client_put() and the hoops used in
nbd_client_put_nonzero() to achieve that); if we did not have that
guarantee, we would also need a mutex protecting our accesses of the
list of connections to survive re-entrancy from independent iothreads.
Although I did not actually try to test old builds, it looks like this
problem has existed since at least commit 862172f45c (v2.12.0, 2017) -
even back when that patch started using a QIONetListener to handle
listening on multiple sockets, nbd_server_free() was already unaware
that the nbd_blockdev_client_closed callback can be reached later by a
client thread that has not completed handshakes (and therefore the
client's socket never got added to the list closed in
nbd_export_close_all), despite that patch intentionally tearing down
the QIONetListener to prevent new clients.
Reported-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Fixes: CVE-2024-7409 CC: qemu-stable@nongnu.org Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20240807174943.771624-14-eblake@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
(cherry picked from commit 3e7ef738c8462c45043a1d39f702a0990406a3b3) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Eric Blake [Thu, 8 Aug 2024 21:05:08 +0000 (16:05 -0500)]
nbd/server: CVE-2024-7409: Drop non-negotiating clients
A client that opens a socket but does not negotiate is merely hogging
qemu's resources (an open fd and a small amount of memory); and a
malicious client that can access the port where NBD is listening can
attempt a denial of service attack by intentionally opening and
abandoning lots of unfinished connections. The previous patch put a
default bound on the number of such ongoing connections, but once that
limit is hit, no more clients can connect (including legitimate ones).
The solution is to insist that clients complete handshake within a
reasonable time limit, defaulting to 10 seconds. A client that has
not successfully completed NBD_OPT_GO by then (including the case of
where the client didn't know TLS credentials to even reach the point
of NBD_OPT_GO) is wasting our time and does not deserve to stay
connected. Later patches will allow fine-tuning the limit away from
the default value (including disabling it for doing integration
testing of the handshake process itself).
Note that this patch in isolation actually makes it more likely to see
qemu SEGV after nbd-server-stop, as any client socket still connected
when the server shuts down will now be closed after 10 seconds rather
than at the client's whims. That will be addressed in the next patch.
For a demo of this patch in action:
$ qemu-nbd -f raw -r -t -e 10 file &
$ nbdsh --opt-mode -c '
H = list()
for i in range(20):
print(i)
H.insert(i, nbd.NBD())
H[i].set_opt_mode(True)
H[i].connect_uri("nbd://localhost")
'
$ kill $!
where later connections get to start progressing once earlier ones are
forcefully dropped for taking too long, rather than hanging.
Suggested-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20240807174943.771624-13-eblake@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
[eblake: rebase to changes earlier in series, reduce scope of timer] Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit b9b72cb3ce15b693148bd09cef7e50110566d8a0) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Eric Blake [Tue, 6 Aug 2024 18:53:00 +0000 (13:53 -0500)]
nbd/server: CVE-2024-7409: Cap default max-connections to 100
Allowing an unlimited number of clients to any web service is a recipe
for a rudimentary denial of service attack: the client merely needs to
open lots of sockets without closing them, until qemu no longer has
any more fds available to allocate.
For qemu-nbd, we default to allowing only 1 connection unless more are
explicitly asked for (-e or --shared); this was historically picked as
a nice default (without an explicit -t, a non-persistent qemu-nbd goes
away after a client disconnects, without needing any additional
follow-up commands), and we are not going to change that interface now
(besides, someday we want to point people towards qemu-storage-daemon
instead of qemu-nbd).
But for qemu proper, and the newer qemu-storage-daemon, the QMP
nbd-server-start command has historically had a default of unlimited
number of connections, in part because unlike qemu-nbd it is
inherently persistent until nbd-server-stop. Allowing multiple client
sockets is particularly useful for clients that can take advantage of
MULTI_CONN (creating parallel sockets to increase throughput),
although known clients that do so (such as libnbd's nbdcopy) typically
use only 8 or 16 connections (the benefits of scaling diminish once
more sockets are competing for kernel attention). Picking a number
large enough for typical use cases, but not unlimited, makes it
slightly harder for a malicious client to perform a denial of service
merely by opening lots of connections withot progressing through the
handshake.
This change does not eliminate CVE-2024-7409 on its own, but reduces
the chance for fd exhaustion or unlimited memory usage as an attack
surface. On the other hand, by itself, it makes it more obvious that
with a finite limit, we have the problem of an unauthenticated client
holding 100 fds opened as a way to block out a legitimate client from
being able to connect; thus, later patches will further add timeouts
to reject clients that are not making progress.
This is an INTENTIONAL change in behavior, and will break any client
of nbd-server-start that was not passing an explicit max-connections
parameter, yet expects more than 100 simultaneous connections. We are
not aware of any such client (as stated above, most clients aware of
MULTI_CONN get by just fine on 8 or 16 connections, and probably cope
with later connections failing by relying on the earlier connections;
libvirt has not yet been passing max-connections, but generally
creates NBD servers with the intent for a single client for the sake
of live storage migration; meanwhile, the KubeSAN project anticipates
a large cluster sharing multiple clients [up to 8 per node, and up to
100 nodes in a cluster], but it currently uses qemu-nbd with an
explicit --shared=0 rather than qemu-storage-daemon with
nbd-server-start).
We considered using a deprecation period (declare that omitting
max-parameters is deprecated, and make it mandatory in 3 releases -
then we don't need to pick an arbitrary default); that has zero risk
of breaking any apps that accidentally depended on more than 100
connections, and where such breakage might not be noticed under unit
testing but only under the larger loads of production usage. But it
does not close the denial-of-service hole until far into the future,
and requires all apps to change to add the parameter even if 100 was
good enough. It also has a drawback that any app (like libvirt) that
is accidentally relying on an unlimited default should seriously
consider their own CVE now, at which point they are going to change to
pass explicit max-connections sooner than waiting for 3 qemu releases.
Finally, if our changed default breaks an app, that app can always
pass in an explicit max-parameters with a larger value.
It is also intentional that the HMP interface to nbd-server-start is
not changed to expose max-connections (any client needing to fine-tune
things should be using QMP).
Suggested-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20240807174943.771624-12-eblake@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
[ericb: Expand commit message to summarize Dan's argument for why we
break corner-case back-compat behavior without a deprecation period] Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit c8a76dbd90c2f48df89b75bef74917f90a59b623) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Eric Blake [Wed, 7 Aug 2024 13:50:01 +0000 (08:50 -0500)]
nbd/server: Plumb in new args to nbd_client_add()
Upcoming patches to fix a CVE need to track an opaque pointer passed
in by the owner of a client object, as well as request for a time
limit on how fast negotiation must complete. Prepare for that by
changing the signature of nbd_client_new() and adding an accessor to
get at the opaque pointer, although for now the two servers
(qemu-nbd.c and blockdev-nbd.c) do not change behavior even though
they pass in a new default timeout value.
Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20240807174943.771624-11-eblake@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
[eblake: s/LIMIT/MAX_SECS/ as suggested by Dan] Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit fb1c2aaa981e0a2fa6362c9985f1296b74f055ac) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Added several tests to verify the implementation of the vvfat driver.
We needed a way to interact with it, so created a basic `fat16.py` driver
that handled writing correct sectors for us.
Added `vvfat` to the non-generic formats, as its not a normal image format.
Signed-off-by: Amjad Alsharafi <amjadsharafi10@gmail.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Tested-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <bb8149c945301aefbdf470a0924c07f69f9c087d.1721470238.git.amjadsharafi10@gmail.com>
[kwolf: Made mypy and pylint happy to unbreak 297] Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit c8f60bfb4345ea8343a53eaefe88d47b44c53f24) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
vvfat: Fix reading files with non-continuous clusters
When reading with `read_cluster` we get the `mapping` with
`find_mapping_for_cluster` and then we call `open_file` for this
mapping.
The issue appear when its the same file, but a second cluster that is
not immediately after it, imagine clusters `500 -> 503`, this will give
us 2 mappings one has the range `500..501` and another `503..504`, both
point to the same file, but different offsets.
When we don't open the file since the path is the same, we won't assign
`s->current_mapping` and thus accessing way out of bound of the file.
From our example above, after `open_file` (that didn't open anything) we
will get the offset into the file with
`s->cluster_size*(cluster_num-s->current_mapping->begin)`, which will
give us `0x2000 * (504-500)`, which is out of bound for this mapping and
will produce some issues.
Signed-off-by: Amjad Alsharafi <amjadsharafi10@gmail.com>
Message-ID: <1f3ea115779abab62ba32c788073cdc99f9ad5dd.1721470238.git.amjadsharafi10@gmail.com>
[kwolf: Simplified the patch based on Amjad's analysis and input] Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 5eed3db336506b529b927ba221fe0d836e5b8819) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
vvfat: Fix wrong checks for cluster mappings invariant
How this `abort` was intended to check for was:
- if the `mapping->first_mapping_index` is not the same as
`first_mapping_index`, which **should** happen only in one case,
when we are handling the first mapping, in that case
`mapping->first_mapping_index == -1`, in all other cases, the other
mappings after the first should have the condition `true`.
- From above, we know that this is the first mapping, so if the offset
is not `0`, then abort, since this is an invalid state.
The issue was that `first_mapping_index` is not set if we are
checking from the middle, the variable `first_mapping_index` is
only set if we passed through the check `cluster_was_modified` with the
first mapping, and in the same function call we checked the other
mappings.
One approach is to go into the loop even if `cluster_was_modified`
is not true so that we will be able to set `first_mapping_index` for the
first mapping, but since `first_mapping_index` is only used here,
another approach is to just check manually for the
`mapping->first_mapping_index != -1` since we know that this is the
value for the only entry where `offset == 0` (i.e. first mapping).
The field is marked as "the offset in the file (in clusters)", but it
was being used like this
`cluster_size*(nums)+mapping->info.file.offset`, which is incorrect.
Before this commit, the behavior when calling `commit_one_file` for
example with `offset=0x2000` (second cluster), what will happen is that
we won't fetch the next cluster from the fat, and instead use the first
cluster for the read operation.
This is due to off-by-one error here, where `i=0x2000 !< offset=0x2000`,
thus not fetching the next cluster.
Signed-off-by: Amjad Alsharafi <amjadsharafi10@gmail.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Tested-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <b97c1e1f1bc2f776061ae914f95d799d124fcd73.1721470238.git.amjadsharafi10@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit b881cf00c99e03bc8a3648581f97736ff275b18b) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2) (theoretical) If channel n-1 already started creation it will
defeat the purpose of the channels_created logic which is in place
to avoid migrate_fd_cleanup() to run while channels are still being
created.
This cannot really happen today because the current failure cases
for multifd_new_send_channel_create() are all synchronous,
resulting from qio_channel_file_new_path() getting a bad
filename. This would hit all channels equally.
But I don't want to set a trap for future people, so have all
channels try to create (even if failing), and only fail after the
channels_created semaphore has been posted.
While here, remove the error_report_err call. There's one already at
migrate_fd_cleanup later on.
Cc: qemu-stable@nongnu.org Reported-by: Jim Fehlig <jfehlig@suse.com> Fixes: b7b03eb614 ("migration/multifd: Add outgoing QIOChannelFile support") Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>
(cherry picked from commit 0bd5b9284fa94a6242a0d27a46380d93e753488b) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
David Woodhouse [Tue, 9 Jul 2024 12:34:44 +0000 (13:34 +0100)]
net: Reinstate '-net nic, model=help' output as documented in man page
While refactoring the NIC initialization code, I broke '-net nic,model=help'
which no longer outputs a list of available NIC models.
Fixes: 2cdeca04adab ("net: report list of available models according to platform") Cc: qemu-stable@nongnu.org Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 64f75f57f9d2c8c12ac6d9355fa5d3a2af5879ca) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
thomas [Fri, 12 Jul 2024 03:10:53 +0000 (11:10 +0800)]
virtio-net: Fix network stall at the host side waiting for kick
Patch 06b12970174 ("virtio-net: fix network stall under load")
added double-check to test whether the available buffer size
can satisfy the request or not, in case the guest has added
some buffers to the avail ring simultaneously after the first
check. It will be lucky if the available buffer size becomes
okay after the double-check, then the host can send the packet
to the guest. If the buffer size still can't satisfy the request,
even if the guest has added some buffers, viritio-net would
stall at the host side forever.
The patch enables notification and checks whether the guest has
added some buffers since last check of available buffers when
the available buffers are insufficient. If no buffer is added,
return false, else recheck the available buffers in the loop.
If the available buffers are sufficient, disable notification
and return true.
Changes:
1. Change the return type of virtqueue_get_avail_bytes() from void
to int, it returns an opaque that represents the shadow_avail_idx
of the virtqueue on success, else -1 on error.
2. Add a new API: virtio_queue_enable_notification_and_check(),
it takes an opaque as input arg which is returned from
virtqueue_get_avail_bytes(). It enables notification firstly,
then checks whether the guest has added some buffers since
last check of available buffers or not by virtio_queue_poll(),
return ture if yes.
After some time, the host loses connection to the guest,
the guest can send packet to the host, but can't receive
packet from the host.
It's more likely to happen if SWIOTLB is enabled in the guest,
allocating and freeing bounce buffer takes some CPU ticks,
copying from/to bounce buffer takes more CPU ticks, compared
with that there is no bounce buffer in the guest.
Once the rate of producing packets from the host approximates
the rate of receiveing packets in the guest, the guest would
loop in NAPI.
receive packets ---
| |
v |
free buf virtnet_poll
| |
v |
add buf to avail ring ---
|
| need kick the host?
| NAPI continues
v
receive packets ---
| |
v |
free buf virtnet_poll
| |
v |
add buf to avail ring ---
|
v
... ...
On the other hand, the host fetches free buf from avail
ring, if the buf in the avail ring is not enough, the
host notifies the guest the event by writing the avail
idx read from avail ring to the event idx of used ring,
then the host goes to sleep, waiting for the kick signal
from the guest.
Once the guest finds the host is waiting for kick singal
(in virtqueue_kick_prepare_split()), it kicks the host.
The host may stall forever at the sequences below:
In the first loop of NAPI above, indicated in the range of
virtnet_poll above, the host is sending packets while the
guest is receiving packets and adding buffers.
step 1: The buf is not enough, for example, a big packet
needs 5 buf, but the available buf count is 3.
The host read current avail idx.
step 2: The guest adds some buf, then checks whether the
host is waiting for kick signal, not at this time.
The used ring is not empty, the guest continues
the second loop of NAPI.
step 3: The host writes the avail idx read from avail
ring to used ring as event idx via
virtio_queue_set_notification(q->rx_vq, 1).
step 4: At the end of the second loop of NAPI, recheck
whether kick is needed, as the event idx in the
used ring written by the host is beyound the
range of kick condition, the guest will not
send kick signal to the host.
Fixes: 06b12970174 ("virtio-net: fix network stall under load") Cc: qemu-stable@nongnu.org Signed-off-by: Wencheng Yang <east.moutain.yang@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit f937309fbdbb48c354220a3e7110c202ae4aa7fa) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: context fixup in include/hw/virtio/virtio.h)
Ensure the queue index points to a valid queue when software RSS
enabled. The new calculation matches with the behavior of Linux's TAP
device with the RSS eBPF program.
Fixes: 4474e37a5b3a ("virtio-net: implement RX RSS processing") Reported-by: Zhibin Hu <huzhibin5@huawei.com> Cc: qemu-stable@nongnu.org Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit f1595ceb9aad36a6c1da95bcb77ab9509b38822d) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Fixes: CVE-2024-6505
Peter Maydell [Thu, 1 Aug 2024 09:15:03 +0000 (10:15 +0100)]
target/arm: Handle denormals correctly for FMOPA (widening)
The FMOPA (widening) SME instruction takes pairs of half-precision
floating point values, widens them to single-precision, does a
two-way dot product and accumulates the results into a
single-precision destination. We don't quite correctly handle the
FPCR bits FZ and FZ16 which control flushing of denormal inputs and
outputs. This is because at the moment we pass a single float_status
value to the helper function, which then uses that configuration for
all the fp operations it does. However, because the inputs to this
operation are float16 and the outputs are float32 we need to use the
fp_status_f16 for the float16 input widening but the normal fp_status
for everything else. Otherwise we will apply the flushing control
FPCR.FZ16 to the 32-bit output rather than the FPCR.FZ control, and
incorrectly flush a denormal output to zero when we should not (or
vice-versa).
(In commit 207d30b5fdb5b we tried to fix the FZ handling but
didn't get it right, switching from "use FPCR.FZ for everything" to
"use FPCR.FZ16 for everything".)
(Mjt: it is commit 43929c818c4b in stable-9.0)
Pass the CPU env to the sme_fmopa_h helper instead of an fp_status
pointer, and have the helper pass an extra fp_status into the
f16_dotadd() function so that we can use the right status for the
right parts of this operation.
Cc: qemu-stable@nongnu.org Fixes: 207d30b5fdb5 ("target/arm: Use FPST_F16 for SME FMOPA (widening)")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2373 Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 55f9f4ee018c5ccea81d8c8c586756d7711ae46f) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Marco Palumbi [Thu, 1 Aug 2024 09:15:02 +0000 (10:15 +0100)]
hw/arm/mps2-tz.c: fix RX/TX interrupts order
The order of the RX and TX interrupts are swapped.
This commit fixes the order as per the following documents:
* https://developer.arm.com/documentation/dai0505/latest/
* https://developer.arm.com/documentation/dai0521/latest/
* https://developer.arm.com/documentation/dai0524/latest/
* https://developer.arm.com/documentation/dai0547/latest/
Cc: qemu-stable@nongnu.org Signed-off-by: Marco Palumbi <Marco.Palumbi@tii.ae>
Message-id: 20240730073123.72992-1-marco@palumbi.it Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 5a558be93ad628e5bed6e0ee062870f49251725c) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Peter Maydell [Wed, 31 Jul 2024 17:00:19 +0000 (18:00 +0100)]
hw/i386/amd_iommu: Don't leak memory in amdvi_update_iotlb()
In amdvi_update_iotlb() we will only put a new entry in the hash
table if to_cache.perm is not IOMMU_NONE. However we allocate the
memory for the new AMDVIIOTLBEntry and for the hash table key
regardless. This means that in the IOMMU_NONE case we will leak the
memory we alloacted.
Move the allocations into the if() to the point where we know we're
going to add the item to the hash table.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2452 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20240731170019.3590563-1-peter.maydell@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 9a45b0761628cc59267b3283a85d15294464ac31) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Peter Maydell [Mon, 29 Jul 2024 12:05:33 +0000 (13:05 +0100)]
docs/sphinx/depfile.py: Handle env.doc2path() returning a Path not a str
In newer versions of Sphinx the env.doc2path() API is going to change
to return a Path object rather than a str. This was originally visible
in Sphinx 8.0.0rc1, but has been rolled back for the final 8.0.0
release. However it will probably emit a deprecation warning and is
likely to change for good in 9.0:
https://github.com/sphinx-doc/sphinx/issues/12686
Our use in depfile.py assumes a str, and if it is passed a Path
it will fall over:
Handler <function write_depfile at 0x77a1775ff560> for event 'build-finished' threw an exception (exception: unsupported operand type(s) for +: 'PosixPath' and 'str')
Wrapping the env.doc2path() call in str() will coerce a Path object
to the str we expect, and have no effect in older Sphinx versions
that do return a str.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2458 Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240729120533.2486427-1-peter.maydell@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 48e5b5f994bccf161dd88a67fdd819d4bfb400f1) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Peter Maydell [Mon, 22 Jul 2024 17:29:57 +0000 (18:29 +0100)]
target/arm: Ignore SMCR_EL2.LEN and SVCR_EL2.LEN if EL2 is not enabled
When determining the current vector length, the SMCR_EL2.LEN and
SVCR_EL2.LEN settings should only be considered if EL2 is enabled
(compare the pseudocode CurrentSVL and CurrentNSVL which call
EL2Enabled()).
We were checking against ARM_FEATURE_EL2 rather than calling
arm_is_el2_enabled(), which meant that we would look at
SMCR_EL2/SVCR_EL2 when in Secure EL1 or Secure EL0 even if Secure EL2
was not enabled.
Use the correct check in sve_vqm1_for_el_sm().
Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240722172957.1041231-5-peter.maydell@linaro.org
(cherry picked from commit f573ac059ed060234fcef4299fae9e500d357c33) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Peter Maydell [Mon, 22 Jul 2024 17:29:56 +0000 (18:29 +0100)]
target/arm: Avoid shifts by -1 in tszimm_shr() and tszimm_shl()
The function tszimm_esz() returns a shift amount, or possibly -1 in
certain cases that correspond to unallocated encodings in the
instruction set. We catch these later in the trans_ functions
(generally with an "a-esz < 0" check), but before we do the
decodetree-generated code will also call tszimm_shr() or tszimm_sl(),
which will use the tszimm_esz() return value as a shift count without
checking that it is not negative, which is undefined behaviour.
Avoid the UB by checking the return value in tszimm_shr() and
tszimm_shl().
Cc: qemu-stable@nongnu.org
Resolves: Coverity CID 1547617, 1547694 Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240722172957.1041231-4-peter.maydell@linaro.org
(cherry picked from commit 76916dfa89e8900639c1055c07a295c06628a0bc) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Peter Maydell [Mon, 22 Jul 2024 17:29:55 +0000 (18:29 +0100)]
target/arm: Fix UMOPA/UMOPS of 16-bit values
The UMOPA/UMOPS instructions are supposed to multiply unsigned 8 or
16 bit elements and accumulate the products into a 64-bit element.
In the Arm ARM pseudocode, this is done with the usual
infinite-precision signed arithmetic. However our implementation
doesn't quite get it right, because in the DEF_IMOP_64() macro we do:
sum += (NTYPE)(n >> 0) * (MTYPE)(m >> 0);
where NTYPE and MTYPE are uint16_t or int16_t. In the uint16_t case,
the C usual arithmetic conversions mean the values are converted to
"int" type and the multiply is done as a 32-bit multiply. This means
that if the inputs are, for example, 0xffff and 0xffff then the
result is 0xFFFE0001 as an int, which is then promoted to uint64_t
for the accumulation into sum; this promotion incorrectly sign
extends the multiply.
Avoid the incorrect sign extension by casting to int64_t before
the multiply, so we do the multiply as 64-bit signed arithmetic,
which is a type large enough that the multiply can never
overflow into the sign bit.
(The equivalent 8-bit operations in DEF_IMOP_32() are fine, because
the 8-bit multiplies can never overflow into the sign bit of a
32-bit integer.)
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2372 Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240722172957.1041231-3-peter.maydell@linaro.org
(cherry picked from commit ea3f5a90f036734522e9af3bffd77e69e9f47355) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Peter Maydell [Mon, 22 Jul 2024 17:29:54 +0000 (18:29 +0100)]
target/arm: Don't assert for 128-bit tile accesses when SVL is 128
For an instruction which accesses a 128-bit element tile when
the SVL is also 128 (for example MOV z0.Q, p0/M, ZA0H.Q[w0,0]),
we will assert in get_tile_rowcol():
This happens because we calculate
len = ctz32(streaming_vec_reg_size(s)) - esz;$
but if the SVL and the element size are the same len is 0, and
the deposit operation asserts.
In this case the ZA storage contains exactly one 128 bit
element ZA tile, and the horizontal or vertical slice is just
that tile. This means that regardless of the index value in
the Ws register, we always access that tile. (In pseudocode terms,
we calculate (index + offset) MOD 1, which is 0.)
Special case the len == 0 case to avoid hitting the assertion
in tcg_gen_deposit_z_i32().
Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240722172957.1041231-2-peter.maydell@linaro.org
(cherry picked from commit 56f1c0db928aae0b83fd91c89ddb226b137e2b21) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Peter Maydell [Tue, 23 Jul 2024 13:10:26 +0000 (14:10 +0100)]
hw/misc/bcm2835_property: Fix handling of FRAMEBUFFER_SET_PALETTE
The documentation of the "Set palette" mailbox property at
https://github.com/raspberrypi/firmware/wiki/Mailbox-property-interface#set-palette
says it has the form:
Length: 24..1032
Value:
u32: offset: first palette index to set (0-255)
u32: length: number of palette entries to set (1-256)
u32...: RGBA palette values (offset to offset+length-1)
We get this wrong in a couple of ways:
* we aren't checking the offset and length are in range, so the guest
can make us spin for a long time by providing a large length
* the bounds check on our loop is wrong: we should iterate through
'length' palette entries, not 'length - offset' entries
Fix the loop to implement the bounds checks and get the loop
condition right. In the process, make the variables local to
this switch case, rather than function-global, so it's clearer
what type they are when reading the code.
Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240723131029.1159908-2-peter.maydell@linaro.org
(cherry picked from commit 0892fffc2abaadfb5d8b79bb0250ae1794862560) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: context fix due to lack of v9.0.0-1812-g5d5f1b60916a "hw/misc: Implement mailbox properties for customer OTP and device specific private keys"
also remove now-unused local `n' variable which gets removed in the next change in this file, v9.0.0-2720-g32f1c201eedf "hw/misc/bcm2835_property: Avoid overflow in OTP access properties")
hw/char/bcm2835_aux: Fix assert when receive FIFO fills up
When a bare-metal application on the raspi3 board reads the
AUX_MU_STAT_REG MMIO register while the device's buffer is
at full receive FIFO capacity
(i.e. `s->read_count == BCM2835_AUX_RX_FIFO_LEN`) the
assertion `assert(s->read_count < BCM2835_AUX_RX_FIFO_LEN)`
fails.
Reported-by: Cryptjar <cryptjar@junk.studio> Suggested-by: Cryptjar <cryptjar@junk.studio>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/459 Signed-off-by: Frederik van Hövell <frederik@fvhovell.nl> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
[PMM: commit message tweaks] Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 546d574b11e02bfd5b15cdf1564842c14516dfab) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Using int32_t meant that the address was sign-extended to uint64_t
when passing to translator_ld*, triggering an assert.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2453 Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 83340193b991e7a974f117baa86a04db1fd835a9) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Thomas Huth [Tue, 18 Jun 2024 12:19:58 +0000 (14:19 +0200)]
hw/virtio: Fix the de-initialization of vhost-user devices
The unrealize functions of the various vhost-user devices are
calling the corresponding vhost_*_set_status() functions with a
status of 0 to shut down the device correctly.
Now these vhost_*_set_status() functions all follow this scheme:
if (vhost_dev_is_started(&vvc->vhost_dev) == should_start) {
return;
}
if (should_start) {
/* ... do the initialization stuff ... */
} else {
/* ... do the cleanup stuff ... */
}
The problem here is virtio_device_should_start(vdev, 0) currently
always returns "true" since it internally only looks at vdev->started
instead of looking at the "status" parameter. Thus once the device
got started once, virtio_device_should_start() always returns true
and thus the vhost_*_set_status() functions return early, without
ever doing any clean-up when being called with status == 0. This
causes e.g. problems when trying to hot-plug and hot-unplug a vhost
user devices multiple times since the de-initialization step is
completely skipped during the unplug operation.
This bug has been introduced in commit 9f6bcfd99f ("hw/virtio: move
vm_running check to virtio_device_started") which replaced
should_start = status & VIRTIO_CONFIG_S_DRIVER_OK;
which later got replaced by virtio_device_should_start(). This blocked
the possibility to set should_start to false in case the status flag
VIRTIO_CONFIG_S_DRIVER_OK was not set.
Fix it by adjusting the virtio_device_should_start() function to
only consider the status flag instead of vdev->started. Since this
function is only used in the various vhost_*_set_status() functions
for exactly the same purpose, it should be fine to fix it in this
central place there without any risk to change the behavior of other
code.
Fixes: 9f6bcfd99f ("hw/virtio: move vm_running check to virtio_device_started") Buglink: https://issues.redhat.com/browse/RHEL-40708 Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20240618121958.88673-1-thuth@redhat.com> Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit d72479b11797c28893e1e3fc565497a9cae5ca16) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
After 038b4217884c ("Revert "chardev: use a child source for qio input
source"") we've been observing the "iwp->src == NULL" assertion
triggering periodically during the initial capabilities querying by
libvirtd. One of possible backtraces:
Thread 1 (Thread 0x7f16cd4f0700 (LWP 43858)):
0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
1 0x00007f16c6c21e65 in __GI_abort () at abort.c:79
2 0x00007f16c6c21d39 in __assert_fail_base at assert.c:92
3 0x00007f16c6c46e86 in __GI___assert_fail (assertion=assertion@entry=0x562e9bcdaadd "iwp->src == NULL", file=file@entry=0x562e9bcdaac8 "../chardev/char-io.c", line=line@entry=99, function=function@entry=0x562e9bcdab10 <__PRETTY_FUNCTION__.20549> "io_watch_poll_finalize") at assert.c:101
4 0x0000562e9ba20c2c in io_watch_poll_finalize (source=<optimized out>) at ../chardev/char-io.c:99
5 io_watch_poll_finalize (source=<optimized out>) at ../chardev/char-io.c:88
6 0x00007f16c904aae0 in g_source_unref_internal () from /lib64/libglib-2.0.so.0
7 0x00007f16c904baf9 in g_source_destroy_internal () from /lib64/libglib-2.0.so.0
8 0x0000562e9ba20db0 in io_remove_watch_poll (source=0x562e9d6720b0) at ../chardev/char-io.c:147
9 remove_fd_in_watch (chr=chr@entry=0x562e9d5f3800) at ../chardev/char-io.c:153
10 0x0000562e9ba23ffb in update_ioc_handlers (s=0x562e9d5f3800) at ../chardev/char-socket.c:592
11 0x0000562e9ba2072f in qemu_chr_fe_set_handlers_full at ../chardev/char-fe.c:279
12 0x0000562e9ba207a9 in qemu_chr_fe_set_handlers at ../chardev/char-fe.c:304
13 0x0000562e9ba2ca75 in monitor_qmp_setup_handlers_bh (opaque=0x562e9d4c2c60) at ../monitor/qmp.c:509
14 0x0000562e9bb6222e in aio_bh_poll (ctx=ctx@entry=0x562e9d4c2f20) at ../util/async.c:216
15 0x0000562e9bb4de0a in aio_poll (ctx=0x562e9d4c2f20, blocking=blocking@entry=true) at ../util/aio-posix.c:722
16 0x0000562e9b99dfaa in iothread_run (opaque=0x562e9d4c26f0) at ../iothread.c:63
17 0x0000562e9bb505a4 in qemu_thread_start (args=0x562e9d4c7ea0) at ../util/qemu-thread-posix.c:543
18 0x00007f16c70081ca in start_thread (arg=<optimized out>) at pthread_create.c:479
19 0x00007f16c6c398d3 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
io_remove_watch_poll(), which makes sure that iwp->src is NULL, calls
g_source_destroy() which finds that iwp->src is not NULL in the finalize
callback. This can only happen if another thread has managed to trigger
io_watch_poll_prepare() callback in the meantime.
Move iwp->src destruction back to the finalize callback to prevent the
described race, and also remove the stale comment. The deadlock glib bug
was fixed back in 2010 by b35820285668 ("gmain: move finalization of
GSource outside of context lock").
Peter Maydell [Tue, 23 Jul 2024 15:09:27 +0000 (16:09 +0100)]
util/async.c: Forbid negative min/max in aio_context_set_thread_pool_params()
aio_context_set_thread_pool_params() takes two int64_t arguments to
set the minimum and maximum number of threads in the pool. We do
some bounds checking on these, but we don't catch the case where the
inputs are negative. This means that later in the function when we
assign these inputs to the AioContext::thread_pool_min and
::thread_pool_max fields, which are of type int, the values might
overflow the smaller type.
A negative number of threads is meaningless, so make
aio_context_set_thread_pool_params() return an error if either min or
max are negative.
Resolves: Coverity CID 1547605 Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240723150927.1396456-1-peter.maydell@linaro.org Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 851495571d14fe2226c52b9d423f88a4f5460836) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Once initialised, QOM objects can be realized and
unrealized multiple times before being finalized.
Resources allocated in REALIZE must be deallocated
in an equivalent UNREALIZE handler.
Free the CPU array in loongson_ipi_unrealize()
instead of loongson_ipi_finalize().
Cc: qemu-stable@nongnu.org Fixes: 5e90b8db382 ("hw/loongarch: Set iocsr address space per-board rather than percpu") Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240723111405.14208-3-philmd@linaro.org>
(cherry picked from commit 0c2086bc7360565dfb9933181dafaefe2c94cddf) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: rename loongson back to longarch for 9.0 due to lack of v9.0.0-582-gb4a12dfc2132 "hw/intc/loongarch_ipi: Rename as loongson_ipi")
Bibo Mao [Tue, 16 Jul 2024 21:41:23 +0000 (23:41 +0200)]
hw/intc/loongson_ipi: Access memory in little endian
Loongson IPI is only available in little-endian,
so use that to access the guest memory (in case
we run on a big-endian host).
Cc: qemu-stable@nongnu.org Signed-off-by: Bibo Mao <maobibo@loongson.cn> Fixes: f6783e3438 ("hw/loongarch: Add LoongArch ipi interrupt support")
[PMD: Extracted from bigger commit, added commit description] Co-Developed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Bibo Mao <maobibo@loongson.cn> Tested-by: Bibo Mao <maobibo@loongson.cn> Acked-by: Song Gao <gaosong@loongson.cn> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Tested-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Message-Id: <20240718133312.10324-3-philmd@linaro.org>
(cherry picked from commit 2465c89fb983eed670007742bd68c7d91b6d6f85) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: fixups for 9.0, for lack of: v9.0.0-583-g91d0b151de4c "hw/intc/loongson_ipi: Implement IOCSR address space for MIPS" v9.0.0-582-gb4a12dfc2132 "hw/intc/loongarch_ipi: Rename as loongson_ipi")
chardev/char-win-stdio.c: restore old console mode
If I use `-serial stdio` on Windows, after QEMU exits, the terminal
could not handle arrow keys and tab any more. Because stdio backend
on Windows sets console mode to virtual terminal input when starts,
but does not restore the old mode when finalize.
This small patch saves the old console mode and set it back.
Signed-off-by: Ziming Song <s.ziming@hotmail.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <ME3P282MB25488BE7C39BF0C35CD0DA5D8CA82@ME3P282MB2548.AUSP282.PROD.OUTLOOK.COM>
(cherry picked from commit 903cc9e1173e0778caa50871e8275c898770c690) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
However, sgx_epc_get_section is called by CPUID regardless of whether
SGX state has been initialized or which platform is in use. Check
whether the machine has the right QOM class and if not behave as if
there are no EPC sections.
Fixes: 1dec2e1f19f ("i386: Update SGX CPUID info according to hardware/KVM/user input", 2021-09-30) Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2142 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 13be929aff804581b21e69087a9caf3698fd5c3c) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The constant must be unsigned, otherwise the two's complement
overrides the other fields when a PASID is present.
Fixes: 1b2b12376c8a ("intel-iommu: PASID support") Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Minwoo Im <minwoo.im@samsung.com>
Message-Id: <20240709142557.317271-2-clement.mathieu--drif@eviden.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit a3c8d7e38550c3d5a46e6fa94ffadfa625a4861d) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
virtio-snd: check for invalid param shift operands
When setting the parameters of a PCM stream, we compute the bit flag
with the format and rate values as shift operand to check if they are
set in supported_formats and supported_rates.
If the guest provides a format/rate value which when shifting 1 results
in a value bigger than the number of bits in
supported_formats/supported_rates, we must report an error.
Previously, this ended up triggering the not reached assertions later
when converting to internal QEMU values.
Reported-by: Zheyu Ma <zheyuma97@gmail.com>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2416 Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Message-Id: <virtio-snd-fuzz-2416-fix-v1-manos.pitsidianakis@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 9b6083465fb8311f2410615f8303a41f580a2a20) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When reading input audio in the virtio-snd input callback,
virtio_snd_pcm_in_cb(), we do not check whether the iov can actually fit
the data buffer. This is because we use the buffer->size field as a
total-so-far accumulator instead of byte-size-left like in TX buffers.
This triggers an out of bounds write if the size of the virtio queue
element is equal to virtio_snd_pcm_status, which makes the available
space for audio data zero. This commit adds a check for reaching the
maximum buffer size before attempting any writes.
Reported-by: Zheyu Ma <zheyuma97@gmail.com>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2427 Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Message-Id: <virtio-snd-fuzz-2427-fix-v1-manos.pitsidianakis@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 98e77e3dd8dd6e7aa9a7dffa60f49c8c8a49d4e3) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Zhao Liu [Fri, 5 Jul 2024 11:39:54 +0000 (12:39 +0100)]
hw/cxl/cxl-host: Fix segmentation fault when getting cxl-fmw property
QEMU crashes (Segmentation fault) when getting cxl-fmw property via
qmp:
(QEMU) qom-get path=machine property=cxl-fmw
This issue is caused by accessing wrong callback (opaque) type in
machine_get_cfmw().
cxl_machine_init() sets the callback as `CXLState *` type but
machine_get_cfmw() treats the callback as
`CXLFixedMemoryWindowOptionsList **`.
Fix this error by casting opaque to `CXLState *` type in
machine_get_cfmw().
Fixes: 03b39fcf64bc ("hw/cxl: Make the CXL fixed memory window setup a machine parameter.") Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Xingtao Yao <yaoxt.fnst@fujitsu.com> Link: https://lore.kernel.org/r/20240704093404.1848132-1-zhao1.liu@linux.intel.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20240705113956.941732-2-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit a207d5f87d66f7933b50677e047498fc4af63e1f) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Zheyu Ma [Tue, 2 Jul 2024 23:13:03 +0000 (01:13 +0200)]
hw/nvme: fix memory leak in nvme_dsm
The allocated memory to hold LBA ranges leaks in the nvme_dsm function. This
happens because the allocated memory for iocb->range is not freed in all
error handling paths.
Fix this by adding a free to ensure that the allocated memory is properly freed.
ASAN log:
==3075137==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 480 byte(s) in 6 object(s) allocated from:
#0 0x55f1f8a0eddd in malloc llvm/compiler-rt/lib/asan/asan_malloc_linux.cpp:129:3
#1 0x7f531e0f6738 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x5e738)
#2 0x55f1faf1f091 in blk_aio_get block/block-backend.c:2583:12
#3 0x55f1f945c74b in nvme_dsm hw/nvme/ctrl.c:2609:30
#4 0x55f1f945831b in nvme_io_cmd hw/nvme/ctrl.c:4470:16
#5 0x55f1f94561b7 in nvme_process_sq hw/nvme/ctrl.c:7039:29
Cc: qemu-stable@nongnu.org Fixes: d7d1474fd85d ("hw/nvme: reimplement dsm to allow cancellation") Signed-off-by: Zheyu Ma <zheyuma97@gmail.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
(cherry picked from commit c510fe78f1b7c966524489d6ba752107423b20c8) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
hvf: arm: Do not advance PC when raising an exception
hvf did not advance PC when raising an exception for most unhandled
system registers, but it mistakenly advanced PC when raising an
exception for GICv3 registers.
Cc: qemu-stable@nongnu.org Fixes: a2260983c655 ("hvf: arm: Add support for GICv3") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-id: 20240716-pmu-v3-4-8c7c1858a227@daynix.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 30a1690f2402e6c1582d5b3ebcf7940bfe2fad4b) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Peter Maydell [Tue, 16 Jul 2024 10:30:32 +0000 (11:30 +0100)]
target/arm: LDAPR should honour SCTLR_ELx.nAA
In commit c1a1f80518d360b when we added the FEAT_LSE2 relaxations to
the alignment requirements for atomic and ordered loads and stores,
we didn't quite get it right for LDAPR/LDAPRH/LDAPRB with no
immediate offset. These instructions were handled in the old decoder
as part of disas_ldst_atomic(), but unlike all the other insns that
function decoded (LDADD, LDCLR, etc) these insns are "ordered", not
"atomic", so they should be using check_ordered_align() rather than
check_atomic_align(). Commit c1a1f80518d360b used
check_atomic_align() regardless for everything in
disas_ldst_atomic(). We then carried that incorrect check over in
the decodetree conversion, where LDAPR/LDAPRH/LDAPRB are now handled
by trans_LDAPR().
The effect is that when FEAT_LSE2 is implemented, these instructions
don't honour the SCTLR_ELx.nAA bit and will generate alignment
faults when they should not.
(The LDAPR insns with an immediate offset were in disas_ldst_ldapr_stlr()
and then in trans_LDAPR_i() and trans_STLR_i(), and have always used
the correct check_ordered_align().)
Use check_ordered_align() in trans_LDAPR().
Cc: qemu-stable@nongnu.org Fixes: c1a1f80518d360b ("target/arm: Relax ordered/atomic alignment checks for LSE2") Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240709134504.3500007-3-peter.maydell@linaro.org
(cherry picked from commit 25489b521b61b874c4c6583956db0012a3674e3a) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Peter Maydell [Tue, 16 Jul 2024 10:30:32 +0000 (11:30 +0100)]
target/arm: Fix handling of LDAPR/STLR with negative offset
When we converted the LDAPR/STLR instructions to decodetree we
accidentally introduced a regression where the offset is negative.
The 9-bit immediate field is signed, and the old hand decoder
correctly used sextract32() to get it out of the insn word,
but the ldapr_stlr_i pattern in the decode file used "imm:9"
instead of "imm:s9", so it treated the field as unsigned.
Fix the pattern to treat the field as a signed immediate.
Cc: qemu-stable@nongnu.org Fixes: 2521b6073b7 ("target/arm: Convert LDAPR/STLR (imm) to decodetree")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2419 Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240709134504.3500007-2-peter.maydell@linaro.org
(cherry picked from commit 5669d26ec614b3f4c56cf1489b9095ed327938b1) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
scsi: fix regression and honor bootindex again for legacy drives
Commit 3089637461 ("scsi: Don't ignore most usb-storage properties")
removed the call to object_property_set_int() and thus the 'set'
method for the bootindex property was also not called anymore. Here
that method is device_set_bootindex() (as configured by
scsi_dev_instance_init() -> device_add_bootindex_property()) which as
a side effect registers the device via add_boot_device_path().
As reported by a downstream user [0], the bootindex property did not
have the desired effect anymore for legacy drives. Fix the regression
by explicitly calling the add_boot_device_path() function after
checking that the bootindex is not yet used (to avoid
add_boot_device_path() calling exit()).
hw/scsi/lsi53c895a: bump instruction limit in scripts processing to fix regression
Commit 9876359990 ("hw/scsi/lsi53c895a: add timer to scripts
processing") reduced the maximum allowed instruction count by
a factor of 100 all the way down to 100.
This causes the "Check Point R81.20 Gaia" appliance [0] to fail to
boot after fully finishing the installation via the appliance's web
interface (there is already one reboot before that).
With a limit of 150, the appliance still fails to boot, while with a
limit of 200, it works. Bump to 500 to fix the regression and be on
the safe side.
Originally reported in the Proxmox community forum[1].
Vincent Fu [Fri, 3 May 2024 17:50:04 +0000 (13:50 -0400)]
hw/nvme: fix number of PIDs for FDP RUH update
The number of PIDs is in the upper 16 bits of cdw10. So we need to
right-shift by 16 bits instead of only a single bit.
Fixes: 73064edfb864 ("hw/nvme: flexible data placement emulation") Cc: qemu-stable@nongnu.org Signed-off-by: Vincent Fu <vincent.fu@samsung.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
(cherry picked from commit 3936bbdf9a2e9233875f850c7576c79d06add261) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Trigger generation of broadcast RARP frames to update network
[...]
Arguments
~~~~~~~~~
The members of "AnnounceParameters"
Except when the command takes its arguments unboxed , i.e. it doesn't
have 'boxed': true, we generate *nothing*. A few commands have a
reference in their doc comment to compensate, but most don't.
Example:
##
# @blockdev-snapshot-sync:
#
# Takes a synchronous snapshot of a block device.
#
# For the arguments, see the documentation of BlockdevSnapshotSync.
[...]
##
{ 'command': 'blockdev-snapshot-sync',
'data': 'BlockdevSnapshotSync',
'allow-preconfig': true }
char-stdio: Restore blocking mode of stdout on exit
qemu_chr_open_fd() sets stdout into non-blocking mode. Restore the old
fd flags on exit to avoid breaking unsuspecting applications that run on
the same terminal after qemu and don't expect to get EAGAIN.
While at at, also ensure term_exit is called once (at the moment it's
called both from char_stdio_finalize() and as the atexit() hook.
virtio: remove virtio_tswap16s() call in vring_packed_event_read()
Commit d152cdd6f6 ("virtio: use virtio accessor to access packed event")
switched using of address_space_read_cached() to virito_lduw_phys_cached()
to access packed descriptor event.
When we used address_space_read_cached(), we needed to call
virtio_tswap16s() to handle the endianess of the field, but
virito_lduw_phys_cached() already handles it internally, so we no longer
need to call virtio_tswap16s() (as the commit had done for `off_wrap`,
but forgot for `flags`).
Fixes: d152cdd6f6 ("virtio: use virtio accessor to access packed event") Cc: jasowang@redhat.com Cc: qemu-stable@nongnu.org Reported-by: Xoykie <xoykie@gmail.com> Link: https://lore.kernel.org/qemu-devel/CAFU8RB_pjr77zMLsM0Unf9xPNxfr_--Tjr49F_eX32ZBc5o2zQ@mail.gmail.com Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20240701075208.19634-1-sgarzare@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 7aa6492401e95fb296dec7cda81e67d91f6037d7) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Cindy Lu [Tue, 28 May 2024 08:48:15 +0000 (16:48 +0800)]
virtio-pci: Fix the failure process in kvm_virtio_pci_vector_use_one()
In function kvm_virtio_pci_vector_use_one(), the function will only use
the irqfd/vector for itself. Therefore, in the undo label, the failing
process is incorrect.
To fix this, we can just remove this label.
Fixes: f9a09ca3ea ("vhost: add support for configure interrupt") Cc: qemu-stable@nongnu.org Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20240528084840.194538-1-lulu@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit a113d041e8d0b152d72a7c2bf47dd09aabf9ade2) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Kevin Wolf [Thu, 25 Apr 2024 12:56:02 +0000 (14:56 +0200)]
block: Parse filenames only when explicitly requested
When handling image filenames from legacy options such as -drive or from
tools, these filenames are parsed for protocol prefixes, including for
the json:{} pseudo-protocol.
This behaviour is intended for filenames that come directly from the
command line and for backing files, which may come from the image file
itself. Higher level management tools generally take care to verify that
untrusted images don't contain a bad (or any) backing file reference;
'qemu-img info' is a suitable tool for this.
However, for other files that can be referenced in images, such as
qcow2 data files or VMDK extents, the string from the image file is
usually not verified by management tools - and 'qemu-img info' wouldn't
be suitable because in contrast to backing files, it already opens these
other referenced files. So here the string should be interpreted as a
literal local filename. More complex configurations need to be specified
explicitly on the command line or in QMP.
This patch changes bdrv_open_inherit() so that it only parses filenames
if a new parameter parse_filename is true. It is set for the top level
in bdrv_open(), for the file child and for the backing file child. All
other callers pass false and disable filename parsing this way.
Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
(cherry picked from commit 7ead946998610657d38d1a505d5f25300d4ca613) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Kevin Wolf [Thu, 25 Apr 2024 12:49:40 +0000 (14:49 +0200)]
iotests/270: Don't store data-file with json: prefix in image
We want to disable filename parsing for data files because it's too easy
to abuse in malicious image files. Make the test ready for the change by
passing the data file explicitly in command line options.
Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
(cherry picked from commit 7e1110664ecbc4826f3c978ccb06b6c1bce823e6) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Kevin Wolf [Thu, 25 Apr 2024 12:49:40 +0000 (14:49 +0200)]
iotests/244: Don't store data-file with protocol in image
We want to disable filename parsing for data files because it's too easy
to abuse in malicious image files. Make the test ready for the change by
passing the data file explicitly in command line options.
Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
(cherry picked from commit 2eb42a728d27a43fdcad5f37d3f65706ce6deba5) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Kevin Wolf [Thu, 11 Apr 2024 13:06:01 +0000 (15:06 +0200)]
qcow2: Don't open data_file with BDRV_O_NO_IO
One use case for 'qemu-img info' is verifying that untrusted images
don't reference an unwanted external file, be it as a backing file or an
external data file. To make sure that calling 'qemu-img info' can't
already have undesired side effects with a malicious image, just don't
open the data file at all with BDRV_O_NO_IO. If nothing ever tries to do
I/O, we don't need to have it open.
This changes the output of iotests case 061, which used 'qemu-img info'
to show that opening an image with an invalid data file fails. After
this patch, it succeeds. Replace this part of the test with a qemu-io
call, but keep the final 'qemu-img info' to show that the invalid data
file is correctly displayed in the output.
Fixes: CVE-2024-4467 Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
(cherry picked from commit bd385a5298d7062668e804d73944d52aec9549f1) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
tests: add testing of parameter=1 for SMP topology
Validate that it is possible to pass 'parameter=1' for any SMP topology
parameter, since unsupported parameters are implicitly considered to
always have a value of 1.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Message-ID: <20240513123358.612355-3-berrange@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit e68dcbb07923df0886802727edc3b21a10b0d342) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
but is not done as a 'git revert' since the part of the changes to the
file hw/core/machine-smp.c which add 'has_XXX' checks remain desirable.
Furthermore, we have to tweak the subsequently added unit test to
account for differing warning message.
The rationale for the original deprecation was:
"Currently, it was allowed for users to specify the unsupported
topology parameter as "1". For example, x86 PC machine doesn't
support drawer/book/cluster topology levels, but user could specify
"-smp drawers=1,books=1,clusters=1".
This is meaningless and confusing, so that the support for this kind
of configurations is marked deprecated since 9.0."
There are varying POVs on the topic of 'unsupported' topology levels.
It is common to say that on a system without hyperthreading, that there
is always 1 thread. Likewise when new CPUs introduced a concept of
multiple "dies', it was reasonable to say that all historical CPUs
before that implicitly had 1 'die'. Likewise for the more recently
introduced 'modules' and 'clusters' parameter'. From this POV, it is
valid to set 'parameter=1' on the -smp command line for any machine,
only a value > 1 is strictly an error condition.
It doesn't cause any functional difficulty for QEMU, because internally
the QEMU code is itself assuming that all "unsupported" parameters
implicitly have a value of '1'.
At the libvirt level, we've allowed applications to set 'parameter=1'
when configuring a guest, and pass that through to QEMU.
Deprecating this creates extra difficulty for because there's no info
exposed from QEMU about which machine types "support" which parameters.
Thus, libvirt can't know whether it is valid to pass 'parameter=1' for
a given machine type, or whether it will trigger deprecation messages.
Since there's no apparent functional benefit to deleting this deprecated
behaviour from QEMU, and it creates problems for consumers of QEMU,
remove this deprecation.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Message-ID: <20240513123358.612355-2-berrange@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 9d7950edb0cdf8f4e5746e220e6e8a9e713bad16) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: remove hunk about modules in hw/core/machine-smp.c introduced in v9.0.0-155-g8ec0a4634798 "hw/core/machine: Support modules in -smp")
Chuang Xu [Tue, 11 Jun 2024 03:23:14 +0000 (11:23 +0800)]
i386/cpu: fixup number of addressable IDs for processor cores in the physical package
When QEMU is started with:
-cpu host,host-cache-info=on,l3-cache=off \
-smp 2,sockets=1,dies=1,cores=1,threads=2
Guest can't acquire maximum number of addressable IDs for processor cores in
the physical package from CPUID[04H].
When creating a CPU topology of 1 core per package, host-cache-info only
uses the Host's addressable core IDs field (CPUID.04H.EAX[bits 31-26]),
resulting in a conflict (on the multicore Host) between the Guest core
topology information in this field and the Guest's actual cores number.
Fix it by removing the unnecessary condition to cover 1 core per package
case. This is safe because cores_per_pkg will not be 0 and will be at
least 1.
Fixes: d7caf13b5fcf ("x86: cpu: fixup number of addressable IDs for logical processors sharing cache") Signed-off-by: Guixiong Wei <weiguixiong@bytedance.com> Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com> Signed-off-by: Chuang Xu <xuchuangxclwt@bytedance.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240611032314.64076-1-xuchuangxclwt@bytedance.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 903916f0a017fe4b7789f1c6c6982333a5a71876) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: fixup for 9.0 due to other changes in this area past 9.0)
Thomas Huth [Thu, 18 Apr 2024 10:10:47 +0000 (12:10 +0200)]
tests: Update our CI to use CentOS Stream 9 instead of 8
RHEL 9 (and thus also the derivatives) have been available since two
years now, so according to QEMU's support policy, we can drop the active
support for the previous major version 8 now.
Another reason for doing this is that Centos Stream 8 will go EOL soon:
"After May 31, 2024, CentOS Stream 8 will be archived
and no further updates will be provided."
Thus upgrade our CentOS Stream container to major version 9 now.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240418101056.302103-5-thuth@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 641b1efe01b2dd6e7ac92f23d392dcee73508746) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Fabiano Rosas [Mon, 17 Jun 2024 18:57:17 +0000 (15:57 -0300)]
migration: Fix file migration with fdset
When the "file:" migration support was added we missed the special
case in the qemu_open_old implementation that allows for a particular
file name format to be used to refer to a set of file descriptors that
have been previously provided to QEMU via the add-fd QMP command.
When using this fdset feature, we should not truncate the migration
file because being given an fd means that the management layer is in
control of the file and will likely already have some data written to
it. This is further indicated by the presence of the 'offset'
argument, which indicates the start of the region where QEMU is
allowed to write.
Fix the issue by replacing the O_TRUNC flag on open by an ftruncate
call, which will take the offset into consideration.
Fixes: 385f510df5 ("migration: file URI offset") Suggested-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Prasad Pandit <pjp@fedoraproject.org> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>
(cherry picked from commit 6d3279655ac49b806265f08415165f471d33e032) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
tcg/loongarch64: Fix tcg_out_movi vs some pcrel pointers
Simplify the logic for two-part, 32-bit pc-relative addresses.
Rather than assume all such fit in int32_t, do some arithmetic
and assert a result, do some arithmetic first and then check
to see if the pieces are in range.
Cc: qemu-stable@nongnu.org Fixes: dacc51720db ("tcg/loongarch64: Implement tcg_out_mov and tcg_out_movi") Reviewed-by: Song Gao <gaosong@loongson.cn> Reported-by: Song Gao <gaosong@loongson.cn> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 521d7fb3ebdf88112ed13556a93e3037742b9eb8) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Ilya Leoshkevich [Fri, 14 Jun 2024 15:46:40 +0000 (17:46 +0200)]
linux-user: Make TARGET_NR_setgroups affect only the current thread
Like TARGET_NR_setuid, TARGET_NR_setgroups should affect only the
calling thread, and not the entire process. Therefore, implement it
using a syscall, and not a libc call.
Cc: qemu-stable@nongnu.org Fixes: 19b84f3c35d7 ("added setgroups and getgroups syscalls") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20240614154710.1078766-1-iii@linux.ibm.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 54b27921026df384f67df86f04c39539df375c60) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Anton Johansson [Wed, 12 Jun 2024 13:30:31 +0000 (15:30 +0200)]
accel/tcg: Fix typo causing tb->page_addr[1] to not be recorded
For TBs crossing page boundaries, the 2nd page will never be
recorded/removed, as the index of the 2nd page is computed from the
address of the 1st page. This is due to a typo, fix it.
Cc: qemu-stable@nongnu.org Fixes: deba78709a ("accel/tcg: Always lock pages before translation") Signed-off-by: Anton Johansson <anjo@rev.ng> Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240612133031.15298-1-anjo@rev.ng> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 3b279f73fa37bec8d3ba04a15f5153d6491cffaf) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
hw/audio/virtio-snd: Always use little endian audio format
The VIRTIO Sound Device conforms with the Virtio spec v1.2,
thus only use little endianness.
Remove the suspicious target_words_bigendian() noticed during
code review.
Cc: qemu-stable@nongnu.org Fixes: eb9ad377bb ("virtio-sound: handle control messages and streams") Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20240422211830.25606-1-philmd@linaro.org>
(cherry picked from commit a276ec8e2632c9015d0f9b4e47194e4e91dfa8bb) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Stefan Hajnoczi [Mon, 6 May 2024 19:06:21 +0000 (15:06 -0400)]
Revert "monitor: use aio_co_reschedule_self()"
Commit 1f25c172f837 ("monitor: use aio_co_reschedule_self()") was a code
cleanup that uses aio_co_reschedule_self() instead of open coding
coroutine rescheduling.
Bug RHEL-34618 was reported and Kevin Wolf <kwolf@redhat.com> identified
the root cause. I missed that aio_co_reschedule_self() ->
qemu_get_current_aio_context() only knows about
qemu_aio_context/IOThread AioContexts and not about iohandler_ctx. It
does not function correctly when going back from the iohandler_ctx to
qemu_aio_context.
Go back to open coding the AioContext transitions to avoid this bug.
Cc: qemu-stable@nongnu.org Buglink: https://issues.redhat.com/browse/RHEL-34618 Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240506190622.56095-2-stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 719c6819ed9a9838520fa732f9861918dc693bda) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Dongwon Kim [Fri, 26 Apr 2024 22:50:59 +0000 (15:50 -0700)]
ui/gtk: Draw guest frame at refresh cycle
Draw routine needs to be manually invoked in the next refresh
if there is a scanout blob from the guest. This is to prevent
a situation where there is a scheduled draw event but it won't
happen bacause the window is currently in inactive state
(minimized or tabified). If draw is not done for a long time,
gl_block timeout and/or fence timeout (on the guest) will happen
eventually.
v2: Use gd_gl_area_draw(vc) in gtk-gl-area.c
Suggested-by: Vivek Kasireddy <vivek.kasireddy@intel.com> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Marc-André Lureau <marcandre.lureau@redhat.com> Cc: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dongwon Kim <dongwon.kim@intel.com> Acked-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20240426225059.3871283-1-dongwon.kim@intel.com>
(cherry picked from commit 77bf310084dad38b3a2badf01766c659056f1cf2) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reproducer from https://gitlab.com/qemu-project/qemu/-/issues/1451
creates small packet (1 segment, len = 10 == n->guest_hdr_len),
then destroys queue.
"if (n->host_hdr_len != n->guest_hdr_len)" is triggered, if body creates
zero length/zero segment packet as there is nothing after guest header.
qemu_sendv_packet_async() tries to send it.
slirp discards it because it is smaller than Ethernet header,
but returns 0 because tx hooks are supposed to return total length of data.
0 is propagated upwards and is interpreted as "packet has been sent"
which is terrible because queue is being destroyed, nobody is waiting for TX
to complete and assert it triggered.
Fix is discard such empty packets instead of sending them.
Signed-off-by: Alexey Dobriyan <adobriyan@yandex-team.ru> Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 2c3e4e2de699cd4d9f6c71f30a22d8f125cd6164) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
target/i386: fix size of EBP writeback in gen_enter()
The calculation of FrameTemp is done using the size indicated by mo_pushpop()
before being written back to EBP, but the final writeback to EBP is done using
the size indicated by mo_stacksize().
In the case where mo_pushpop() is MO_32 and mo_stacksize() is MO_16 then the
final writeback to EBP is done using MO_16 which can leave junk in the top
16-bits of EBP after executing ENTER.
Change the writeback of EBP to use the same size indicated by mo_pushpop() to
ensure that the full value is written back.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2198
Message-ID: <20240606095319.229650-5-mark.cave-ayland@ilande.co.uk> Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 3973615e7fbaeef1deeaa067577e373781ced70a) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>