git.ipfire.org Git - thirdparty/qemu.git/log

virtio-balloon: don't start free page hinting if postcopy is possible

Postcopy never worked properly with 'free-page-hint=on', as there are
at least two issues:

1) With postcopy, the guest will never receive a VIRTIO_BALLOON_CMD_ID_DONE
   and consequently won't release free pages back to the OS once
   migration finishes.

   The issue is that for postcopy, we won't do a final bitmap sync while
   the guest is stopped on the source and
   virtio_balloon_free_page_hint_notify() will only call
   virtio_balloon_free_page_done() on the source during
   PRECOPY_NOTIFY_CLEANUP, after the VM state was already migrated to
   the destination.

2) Once the VM touches a page on the destination that has been excluded
   from migration on the source via qemu_guest_free_page_hint() while
   postcopy is active, that thread will stall until postcopy finishes
   and all threads are woken up. (with older Linux kernels that won't
   retry faults when woken up via userfaultfd, we might actually get a
   SEGFAULT)

   The issue is that the source will refuse to migrate any pages that
   are not marked as dirty in the dirty bmap -- for example, because the
   page might just have been sent. Consequently, the faulting thread will
   stall, waiting for the page to be migrated -- which could take quite
   a while and result in guest OS issues.

While we could fix 1) comparatively easily, 2) is harder to get right and
might require more involved RAM migration changes on source and destination
[1].

As it never worked properly, let's not start free page hinting in the
precopy notifier if the postcopy migration capability was enabled to fix
it easily. Capabilities cannot be enabled once migration is already
running.

Note 1: in the future we might either adjust migration code on the source
        to track pages that have actually been sent or adjust
        migration code on source and destination  to eventually send
        pages multiple times from the source and and deal with pages
        that are sent multiple times on the destination.

Note 2: virtio-mem has similar issues, however, access to "unplugged"
        memory by the guest is very rare and we would have to be very
        lucky for it to happen during migration. The spec states
        "The driver SHOULD NOT read from unplugged memory blocks ..."
        and "The driver MUST NOT write to unplugged memory blocks".
        virtio-mem will move away from virtio_balloon_free_page_done()
        soon and handle this case explicitly on the destination.

[1] https://lkml.kernel.org/r/e79fd18c-aa62-c1d8-c7f3-ba3fc2c25fc8@redhat.com

Fixes: c13c4153f76d ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
Cc: qemu-stable@nongnu.org
Cc: Wei Wang <wei.w.wang@intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
Cc: Juan Quintela <quintela@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210708095339.20274-2-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
(cherry picked from commit fd51e54fa10221e5a8add894c38cc1cf199f4bc4)
Signed-off-by: Michael Roth <michael.roth@amd.com>

hw/nvme: fix pin-based interrupt behavior (again)

Jakub noticed[1] that, when using pin-based interrupts, the device will
unconditionally deasssert when any CQEs are acknowledged. However, the
pin should not be deasserted if other completion queues still holds
unacknowledged CQEs.

The bug is an artifact of commit ca247d35098d ("hw/block/nvme: fix
pin-based interrupt behavior") which fixed one bug but introduced
another. This is the third time someone tries to fix pin-based
interrupts (see commit 5e9aa92eb1a5 ("hw/block: Fix pin-based interrupt
behaviour of NVMe"))...

Third time's the charm, so fix it, again, by keeping track of how many
CQs have unacknowledged CQEs and only deassert when all are cleared.

[1]: <20210610114624.304681-1-jakub.jermar@kernkonzept.com>

Cc: qemu-stable@nongnu.org
Fixes: ca247d35098d ("hw/block/nvme: fix pin-based interrupt behavior")
Reported-by: Jakub Jermář <jakub.jermar@kernkonzept.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
(cherry picked from commit 83d7ed5c570d4c1d5163951b3057cac2ae7da4ff)
*avoid dependency on 88eea45c ("hw/nvme: move nvme emulation out of hw/block")
Signed-off-by: Michael Roth <michael.roth@amd.com>

hw/nvme: fix missing check for PMR capability

Qiang Liu reported that an access on an unknown address is triggered in
memory_region_set_enabled because a check on CAP.PMRS is missing for the
PMRCTL register write when no PMR is configured.

Cc: qemu-stable@nongnu.org
Fixes: 75c3c9de961d ("hw/block/nvme: disable PMR at boot up")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/362
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
(cherry picked from commit 2b02aabc9d02f9e95946cf639f546bb61f1721b7)
Signed-off-by: Michael Roth <michael.roth@amd.com>

hw/block/nvme: align with existing style

While QEMU coding style prefers lowercase hexadecimals in constants, the
NVMe subsystem uses the format from the NVMe specifications in comments,
i.e. 'h' suffix instead of '0x' prefix.

Fix this up across the code base.

Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
[k.jensen: updated message; added conversion in a couple of missing comments]
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
(cherry picked from commit 312c3531bba416e589f106db8c8241fc6e7e6332)
*context dependency for 2b02aabc9d
Signed-off-by: Michael Roth <michael.roth@amd.com>

tests: acpi: pc: update expected DSDT blobs

@@ -930,20 +930,20 @@ DefinitionBlock ("", "DSDT", 1, "BOCHS ", "BXPC    ", 0x00000001)
             Device (S00)
             {
                 Name (_ADR, Zero)  // _ADR: Address
-                Name (_SUN, Zero)  // _SUN: Slot User Number
+                Name (ASUN, Zero)
                 Method (_DSM, 4, Serialized)  // _DSM: Device-Specific Method
                 {
-                    Return (PDSM (Arg0, Arg1, Arg2, Arg3, BSEL, _SUN))
+                    Return (PDSM (Arg0, Arg1, Arg2, Arg3, BSEL, ASUN))
                 }
             }

             Device (S10)
             {
                 Name (_ADR, 0x00020000)  // _ADR: Address
-                Name (_SUN, 0x02)  // _SUN: Slot User Number
+                Name (ASUN, 0x02)
                 Method (_DSM, 4, Serialized)  // _DSM: Device-Specific Method
                 {
-                    Return (PDSM (Arg0, Arg1, Arg2, Arg3, BSEL, _SUN))
+                    Return (PDSM (Arg0, Arg1, Arg2, Arg3, BSEL, ASUN))
                 }

                 Method (_S1D, 0, NotSerialized)  // _S1D: S1 Device State

with a hank per bridge:

@@ -965,10 +965,10 @@ DefinitionBlock ("", "DSDT", 1, "BOCHS ", "BXPC    ", 0x00000001)
             Device (S18)
             {
                 Name (_ADR, 0x00030000)  // _ADR: Address
-                Name (_SUN, 0x03)  // _SUN: Slot User Number
+                Name (ASUN, 0x03)
                 Method (_DSM, 4, Serialized)  // _DSM: Device-Specific Method
                 {
-                    Return (PDSM (Arg0, Arg1, Arg2, Arg3, BSEL, _SUN))
+                    Return (PDSM (Arg0, Arg1, Arg2, Arg3, BSEL, ASUN))
                 }
             }

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20210624204229.998824-4-imammedo@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: John Sucaet <john.sucaet@ekinops.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 40f23e4e52f6188036062abea432560e5cdd239a)
Signed-off-by: Michael Roth <michael.roth@amd.com>

acpi: pc: revert back to v5.2 PCI slot enumeration

Commit [1] moved _SUN variable from only hot-pluggable to
all devices. This made linux kernel enumerate extra slots
that weren't present before. If extra slot happens to be
be enumerated first and there is a device in th same slot
but on other bridge, linux kernel will add -N suffix to
slot name of the later, thus changing NIC name compared to
QEMU 5.2. This in some case confuses systemd, if it is
using SLOT NIC naming scheme and interface name becomes
not the same as it was under QEMU-5.2.

Reproducer QEMU CLI:
  -M pc-i440fx-5.2 -nodefaults \
  -device pci-bridge,chassis_nr=1,id=pci.1,bus=pci.0,addr=0x3 \
  -device virtio-net-pci,id=nic1,bus=pci.1,addr=0x1 \
  -device virtio-net-pci,id=nic2,bus=pci.1,addr=0x2 \
  -device virtio-net-pci,id=nic3,bus=pci.1,addr=0x3

with RHEL8 guest produces following results:
  v5.2:
     kernel: virtio_net virtio0 ens1: renamed from eth0
     kernel: virtio_net virtio2 ens3: renamed from eth2
     kernel: virtio_net virtio1 enp1s2: renamed from eth1
      (slot 2 is assigned to empty bus 0 slot and virtio1
       is assigned to 2-2 slot, and renaming falls back,
       for some reason, to path based naming scheme)

  v6.0:
     kernel: virtio_net virtio0 ens1: renamed from eth0
     kernel: virtio_net virtio2 ens3: renamed from eth2
     systemd-udevd[299]: Error changing net interface name 'eth1' to 'ens3': File exists
     systemd-udevd[299]: could not rename interface '3' from 'eth1' to 'ens3': File exists
      (with commit [1] kernel assigns virtio2 to 3-2 slot
       since bridge advertises _SUN=0x3 and kernel assigns
       slot 3 to bridge. Still it manages to rename virtio2
       correctly to ens3, however systemd gets confused with virtio1
       where slot allocation exactly the same (2-2) as in 5.2 case
       and tries to rename it to ens3 which is rightfully taken by
       virtio2)

I'm not sure what breaks in systemd interface renaming (it probably
should be investigated), but on QEMU side we can safely revert
_SUN to 5.2 behavior (i.e. avoid cold-plugged bridges and non
hot-pluggable device classes), without breaking acpi-index, which uses
slot numbers but it doesn't have to use _SUN, it could use an arbitrary
variable name that has the same slot value).
It will help existing VMs to keep networking with non trivial
configs in working order since systemd will do its interface
renaming magic as it used to do.

1)
Fixes: b7f23f62e40 (pci: acpi: add _DSM method to PCI devices)
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20210624204229.998824-3-imammedo@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: John Sucaet <john.sucaet@ekinops.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 7193d7cdd93e50f0e5f09803b98d27d3f9b147ac)
Signed-off-by: Michael Roth <michael.roth@amd.com>

tests: acpi: prepare for changing DSDT tables

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20210624204229.998824-2-imammedo@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: John Sucaet <john.sucaet@ekinops.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit a4344574fd47336b6d8fc85ce1f66d4262e7dafd)
Signed-off-by: Michael Roth <michael.roth@amd.com>

yank: Unregister function when using TLS migration

After yank feature was introduced in migration, whenever migration
is started using TLS, the following error happens in both source and
destination hosts:

(qemu) qemu-kvm: ../util/yank.c:107: yank_unregister_instance:
Assertion `QLIST_EMPTY(&entry->yankfns)' failed.

This happens because of a missing yank_unregister_function() when using
qio-channel-tls.

Fix this by also allowing TYPE_QIO_CHANNEL_TLS object type to perform
yank_unregister_function() in channel_close() and multifd_load_cleanup().

Also, inside migration_channel_connect() and
migration_channel_process_incoming() move yank_register_function() so
it only runs once on a TLS migration.

Fixes: b5eea99ec2f ("migration: Add yank feature", 2021-01-13)
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1964326
Signed-off-by: Leonardo Bras <leobras.c@gmail.com>
Reviewed-by: Lukas Straub <lukasstraub2@web.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Changes since v2:
- Dropped all references to ioc->master
- yank_register_function() and yank_unregister_function() now only run
once in a TLS migration.

Changes since v1:
- Cast p->c to QIOChannelTLS into multifd_load_cleanup()
Message-Id: <20210601054030.1153249-1-leobras.c@gmail.com>

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
(cherry picked from commit 7de2e8565335c13fb3516cddbe2e40e366cce273)
Signed-off-by: Michael Roth <michael.roth@amd.com>

crypto: Make QCryptoTLSCreds* structures private

Code consuming the "crypto/tlscreds*.h" APIs doesn't need
to access its internals. Move the structure definitions to
the "tlscredspriv.h" private header (only accessible by
implementations). The public headers (in include/) still
forward-declare the structures typedef.

Note, tlscreds.c and 3 of the 5 modified source files already
include "tlscredspriv.h", so only add it to tls-cipher-suites.c
and tlssession.c.

Removing the internals from the public header solves a bug
introduced by commit 7de2e856533 ("yank: Unregister function
when using TLS migration") which made migration/qemu-file-channel.c
include "io/channel-tls.h", itself sometime depends on GNUTLS,
leading to a build failure on OSX:

  [2/35] Compiling C object libmigration.fa.p/migration_qemu-file-channel.c.o
  FAILED: libmigration.fa.p/migration_qemu-file-channel.c.o
  cc -Ilibmigration.fa.p -I. -I.. -Iqapi [ ... ] -o libmigration.fa.p/migration_qemu-file-channel.c.o -c ../migration/qemu-file-channel.c
  In file included from ../migration/qemu-file-channel.c:29:
  In file included from include/io/channel-tls.h:26:
  In file included from include/crypto/tlssession.h:24:
  include/crypto/tlscreds.h:28:10: fatal error: 'gnutls/gnutls.h' file not found
  #include <gnutls/gnutls.h>
           ^~~~~~~~~~~~~~~~~
  1 error generated.

Reported-by: Stefan Weil <sw@weilnetz.de>
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/407
Fixes: 7de2e856533 ("yank: Unregister function when using TLS migration")
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
(cherry picked from commit 678bcc3c2cf22262d0a72b52da57737c4a40e040)
Signed-off-by: Michael Roth <michael.roth@amd.com>

ui/vnc: Use qcrypto_tls_creds_check_endpoint()

Avoid accessing QCryptoTLSCreds internals by using
the qcrypto_tls_creds_check_endpoint() helper.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
(cherry picked from commit 3c52bf0c608419d7892fea95f2a0af8f2e99633e)
Signed-off-by: Michael Roth <michael.roth@amd.com>

migration/tls: Use qcrypto_tls_creds_check_endpoint()

Avoid accessing QCryptoTLSCreds internals by using
the qcrypto_tls_creds_check_endpoint() helper.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
(cherry picked from commit 5590f65facc508fbc38575f19a0ab2fdcdcf18a4)
Signed-off-by: Michael Roth <michael.roth@amd.com>

chardev/socket: Use qcrypto_tls_creds_check_endpoint()

Avoid accessing QCryptoTLSCreds internals by using
the qcrypto_tls_creds_check_endpoint() helper.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
(cherry picked from commit 8612df2ebef9ff1f880fe76d223a3369b9c98db2)
Signed-off-by: Michael Roth <michael.roth@amd.com>

qemu-nbd: Use qcrypto_tls_creds_check_endpoint()

Avoid accessing QCryptoTLSCreds internals by using
the qcrypto_tls_creds_check_endpoint() helper.

Tested-by: Akihiko Odaki <akihiko.odaki@gmail.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
(cherry picked from commit 0279cd9535283cf5711768ab6401b204e5697a81)
Signed-off-by: Michael Roth <michael.roth@amd.com>

block/nbd: Use qcrypto_tls_creds_check_endpoint()

Avoid accessing QCryptoTLSCreds internals by using
the qcrypto_tls_creds_check_endpoint() helper.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
(cherry picked from commit 7b3b616838c30830c004b746e7b1209297118318)
Signed-off-by: Michael Roth <michael.roth@amd.com>

crypto/tlscreds: Introduce qcrypto_tls_creds_check_endpoint() helper

Introduce the qcrypto_tls_creds_check_endpoint() helper
to access QCryptoTLSCreds internal 'endpoint' field.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
(cherry picked from commit e9ac68083f26759b85ef0d871ca2bbe897218f64)
Signed-off-by: Michael Roth <michael.roth@amd.com>

block/nvme: Fix VFIO_MAP_DMA failed: No space left on device

When the NVMe block driver was introduced (see commit bdd6a90a9e5,
January 2018), Linux VFIO_IOMMU_MAP_DMA ioctl was only returning
-ENOMEM in case of error. The driver was correctly handling the
error path to recycle its volatile IOVA mappings.

To fix CVE-2019-3882, Linux commit 492855939bdb ("vfio/type1: Limit
DMA mappings per container", April 2019) added the -ENOSPC error to
signal the user exhausted the DMA mappings available for a container.

The block driver started to mis-behave:

  qemu-system-x86_64: VFIO_MAP_DMA failed: No space left on device
  (qemu)
  (qemu) info status
  VM status: paused (io-error)
  (qemu) c
  VFIO_MAP_DMA failed: No space left on device
  (qemu) c
  VFIO_MAP_DMA failed: No space left on device

(The VM is not resumable from here, hence stuck.)

Fix by handling the new -ENOSPC error (when DMA mappings are
exhausted) without any distinction to the current -ENOMEM error,
so we don't change the behavior on old kernels where the CVE-2019-3882
fix is not present.

An easy way to reproduce this bug is to restrict the DMA mapping
limit (65535 by default) when loading the VFIO IOMMU module:

  # modprobe vfio_iommu_type1 dma_entry_limit=666

Cc: qemu-stable@nongnu.org
Cc: Fam Zheng <fam@euphon.net>
Cc: Maxim Levitsky <mlevitsk@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Reported-by: Michal Prívozník <mprivozn@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20210723195843.1032825-1-philmd@redhat.com
Fixes: bdd6a90a9e5 ("block: Add VFIO based NVMe driver")
Buglink: https://bugs.launchpad.net/qemu/+bug/1863333
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/65
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 15a730e7a3aaac180df72cd5730e0617bcf44a5a)
Signed-off-by: Michael Roth <michael.roth@amd.com>

hw/pci-host/q35: Ignore write of reserved PCIEXBAR LENGTH field

libFuzzer triggered the following assertion:

  cat << EOF | qemu-system-i386 -M pc-q35-5.0 \
    -nographic -monitor none -serial none \
    -qtest stdio -d guest_errors -trace pci\*
  outl 0xcf8 0xf2000060
  outl 0xcfc 0x8400056e
  EOF
  pci_cfg_write mch 00:0 @0x60 <- 0x8400056e
  Aborted (core dumped)

This is because guest wrote MCH_HOST_BRIDGE_PCIEXBAR_LENGTH_RVD
(reserved value) to the PCIE XBAR register.

There is no indication on the datasheet about what occurs when
this value is written. Simply ignore it on QEMU (and report an
guest error):

  pci_cfg_write mch 00:0 @0x60 <- 0x8400056e
  Q35: Reserved PCIEXBAR LENGTH
  pci_cfg_read mch 00:0 @0x0 -> 0x8086
  pci_cfg_read mch 00:0 @0x0 -> 0x29c08086
  ...

Cc: qemu-stable@nongnu.org
Reported-by: Alexander Bulekov <alxndr@bu.edu>
BugLink: https://bugs.launchpad.net/qemu/+bug/1878641
Fixes: df2d8b3ed4 ("q35: Introduce q35 pc based chipset emulator")
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210526142438.281477-1-f4bug@amsat.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 9b0ca75e0196a72523232063db1e07ae36a5077a)
Signed-off-by: Michael Roth <michael.roth@amd.com>

tcg: Allocate sufficient storage in temp_allocate_frame

This function should have been updated for vector types
when they were introduced.

Fixes: d2fd745fe8b
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/367
Cc: qemu-stable@nongnu.org
Tested-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit c1c091948ae4a73c1a80b5005f6204d0e665ce52)
Signed-off-by: Michael Roth <michael.roth@amd.com>

tcg/sparc: Fix temp_allocate_frame vs sparc stack bias

We should not be aligning the offset in temp_allocate_frame,
because the odd offset produces an aligned address in the end.
Instead, pass the logical offset into tcg_set_frame and add
the stack bias last.

Cc: qemu-stable@nongnu.org
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 9defd1bdfb0f2ddb3ca9863e39577f3a9929d58c)
Signed-off-by: Michael Roth <michael.roth@amd.com>

vl: Fix an assert failure in error path

Based on the description of error_setg(), the local variable err in
qemu_maybe_daemonize() should be initialized to NULL.

Without fix, the uninitialized *errp triggers assert failure which
doesn't show much valuable information.

Before the fix:
qemu-system-x86_64: ../util/error.c:59: error_setv: Assertion `*errp == NULL' failed.

After fix:
qemu-system-x86_64: cannot create PID file: Cannot open pid file: Permission denied

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-Id: <20210610084741.456260-1-zhenzhong.duan@intel.com>
Cc: qemu-stable@nongnu.org
Fixes: 0546c0609c ("vl: split various early command line options to a separate function", 2020-12-10)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 38f71349c7c4969bc14da4da1c70b8cc4078d596)
Signed-off-by: Michael Roth <michael.roth@amd.com>

vfio: Fix unregister SaveVMHandler in vfio_migration_finalize

In the vfio_migration_init(), the SaveVMHandler is registered for
VFIO device. But it lacks the operation of 'unregister'. It will
lead to 'Segmentation fault (core dumped)' in
qemu_savevm_state_setup(), if performing live migration after a
VFIO device is hot deleted.

Fixes: 7c2f5f75f94 (vfio: Register SaveVMHandlers for VFIO device)
Reported-by: Qixin Gan <ganqixin@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
Message-Id: <20210527123101.289-1-jiangkunkun@huawei.com>
Reviewed by: Kirti Wankhede <kwankhede@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
(cherry picked from commit 22fca190e25b10761925bb1eeadeda07aabf3c26)
Signed-off-by: Michael Roth <michael.roth@amd.com>

runstate: Initialize Error * to NULL

Based on the description of error_setg(), the local variable err in
qemu_init_subsystems() should be initialized to NULL.

Fixes: efd7ab22fb ("vl: extract qemu_init_subsystems")
Cc: qemu-stable@nongnu.org
Signed-off-by: Peng Liang <liangpeng10@huawei.com>
Message-Id: <20210610131729.3906565-1-liangpeng10@huawei.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 6e1da3d305499d3907f3c7f6638243e2e09b5085)
Signed-off-by: Michael Roth <michael.roth@amd.com>

esp: only set ESP_RSEQ at the start of the select sequence

When processing a command to select a target and send a CDB, the ESP device
maintains a sequence step register so that if an error occurs the host can
determine which part of the selection/CDB submission sequence failed.

The old Linux 2.6 driver is really pedantic here: it checks the sequence step
register even if a command succeeds and complains loudly on the console if the
sequence step register doesn't match the expected bus phase and interrupt flags.

This reason this mismatch occurs is because the ESP emulation currently doesn't
update the bus phase until the next TI (Transfer Information) command and so the
cleared sequence step register is considered invalid for the stale bus phase.

Normally this isn't an issue as the host only checks the sequence step register
if an error occurs but the old Linux 2.6 driver does this in several places
causing a large stream of "esp0: STEP_ASEL for tgt 0" messages to appear on the
console during the boot process.

Fix this by not clearing the sequence step register when reading the interrupt
register and clearing the DMA status, so the guest sees a valid sequence step
and bus phase combination at the end of the command phase. No other change is
required since the sequence step register is correctly updated throughout the
selection/CDB submission sequence once one of the select commands is issued.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Fixes: 1b9e48a5bd ("esp: implement non-DMA transfers in PDMA mode")
Message-Id: <20210518212511.21688-3-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit af947a3d853a235943681a00f07f3081f5143cc3)
Signed-off-by: Michael Roth <michael.roth@amd.com>

esp: only assert INTR_DC interrupt flag if selection fails

The datasheet sequence tables confirm that when a target selection fails, only
the INTR_DC interrupt flag should be asserted.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Fixes: cf47a41e05 ("esp: latch individual bits in ESP_RINTR register")
Message-Id: <20210518212511.21688-2-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit cf1a7a9b3721544aaa3e43d111eb383c30d71a62)
Signed-off-by: Michael Roth <michael.roth@amd.com>

vhost-vdpa: don't initialize backend_features

We used to initialize backend_features during vhost_vdpa_init()
regardless whether or not it was supported by vhost. This will lead
the unsupported features like VIRTIO_F_IN_ORDER to be included and set
to the vhost-vdpa during vhost_dev_start. Because the
VIRTIO_F_IN_ORDER is not supported by vhost-vdpa so it won't be
advertised to guest which will break the datapath.

Fix this by not initializing the backend_features, so the
acked_features could be built only from guest features via
vhost_net_ack_features().

Fixes: 108a64818e69b ("vhost-vdpa: introduce vhost-vdpa backend")
Cc: qemu-stable@nongnu.org
Cc: Gautam Dawar <gdawar@xilinx.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit c33f23a419f95da16ab4faaf08be635c89b96ff0)
Signed-off-by: Michael Roth <michael.roth@amd.com>

configure: fix detection of gdbus-codegen

"pkg-config --variable=gdbus_codegen gio-2.0" returns "gdbus-codegen",
and it does not pass test -x (which does not walk the path).

Meson 0.58.0 notices that something is iffy, as the dbus_vmstate1
assignment in tests/qtest/meson.build uses an empty string as the
command, and fails very eloquently:

../tests/qtest/meson.build:92:2: ERROR: No program name specified.

Use the "has" function instead of test -x, and fix the generation
of config-host.mak since meson.build expects that GDBUS_CODEGEN
is absent, rather than empty, if the tool is unavailable.

Reported-by: Sebastian Mitterle <smitterl@redhat.com>
Fixes: #178
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 5ecfb76ccc056eb6127e44268e475827ae73b9e0)
Signed-off-by: Michael Roth <michael.roth@amd.com>

hmp: Fix loadvm to resume the VM on success instead of failure

Commit f61fe11aa6f broke hmp_loadvm() by adding an incorrect negation
when converting from 0/-errno return values to a bool value. The result
is that loadvm resumes the VM now if it failed and keeps it stopped if
it failed. Fix it to restore the old behaviour and do it the other way
around.

Fixes: f61fe11aa6f7f8f0ffe4ddaa56a8108f3ab57854
Cc: qemu-stable@nongnu.org
Reported-by: Yanhui Ma <yama@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210511163151.45167-1-kwolf@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
(cherry picked from commit c53cd04e70641fdf9410aac40c617d074047b3e1)
Signed-off-by: Michael Roth <michael.roth@amd.com>

sockets: update SOCKET_ADDRESS_TYPE_FD listen(2) backlog

socket_get_fd() fails with the error "socket_get_fd: too many
connections" if the given listen backlog value is not 1.

Not all callers set the backlog to 1. For example, commit
582d4210eb2f2ab5baac328fe4b479cd86da1647 ("qemu-nbd: Use SOMAXCONN for
socket listen() backlog") uses SOMAXCONN. This will always fail with in
socket_get_fd().

This patch calls listen(2) on the fd to update the backlog value. The
socket may already be in the listen state. I have tested that this works
on Linux 5.10 and macOS Catalina.

As a bonus this allows us to detect when the fd cannot listen. Now we'll
be able to catch unbound or connected fds in socket_listen().

Drop the num argument from socket_get_fd() since this function is also
called by socket_connect() where a listen backlog value does not make
sense.

Fixes: e5b6353cf25c99c3f08bf51e29933352f7140e8f ("socket: Add backlog parameter to socket_listen")
Reported-by: Richard W.M. Jones <rjones@redhat.com>
Cc: Juan Quintela <quintela@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20210310173004.420190-1-stefanha@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 37179e9ea45d6428b29ae789209c119ac18c1d39)
Signed-off-by: Michael Roth <michael.roth@amd.com>

vl: plug -object back into -readconfig

Commit bc2f4fcb1d ("qom: move user_creatable_add_opts logic to vl.c
and QAPIfy it", 2021-03-19) switched the creation of objects from
qemu_opts_foreach to a bespoke QTAILQ in preparation for supporting JSON
syntax in -object.

Unfortunately in doing so it lost support for [object] stanzas in
configuration files and also for "-set object.ID.KEY=VAL". The latter
is hard to re-establish and probably best solved by deprecating -set.
This patch uses the infrastructure introduced by the previous two
patches in order to parse QOM objects correctly from configuration
files.

Cc: Markus Armbruster <armbru@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210524105752.3318299-4-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 49e987695a1873a769a823604f9065aa88e00c55)
Signed-off-by: Michael Roth <michael.roth@amd.com>

vl: plumb keyval-based options into -readconfig

Let -readconfig support parsing command line options into QDict or
QemuOpts. This will be used to add back support for objects in
-readconfig.

Cc: Markus Armbruster <armbru@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210524105752.3318299-3-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit c0d4aa82f895af67cbf7772324e05605e22b4162)
Signed-off-by: Michael Roth <michael.roth@amd.com>

qemu-config: parse configuration files to a QDict

Change the parser to put the values into a QDict and pass them
to a callback. qemu_config_parse's QemuOpts creation is
itself turned into a callback function.

This is useful for -readconfig to support keyval-based options;
getting a QDict from the parser removes a roundtrip from
QDict to QemuOpts and then back to QDict.

Unfortunately there is a disadvantage in that semantic errors will
point to the last line of the group, because the entries of the QDict
do not have a location attached.

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210524105752.3318299-2-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 37701411397c7b7d709ae92abd347cc593940ee5)
Signed-off-by: Michael Roth <michael.roth@amd.com>

qemu-config: load modules when instantiating option groups

Right now the SPICE module is special cased to be loaded when processing
of the -spice command line option. However, the spice option group
can also be brought in via -readconfig, in which case the module is
not loaded.

Add a generic hook to load modules that provide a QemuOpts group,
and use it for the "spice" and "iscsi" groups.

Fixes: #194
Fixes: https://bugs.launchpad.net/qemu/+bug/1910696
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 632a8873500d27022c584256afc11e57e2418b94)
Signed-off-by: Michael Roth <michael.roth@amd.com>

qemu-option: support accept-any QemuOptsList in qemu_opts_absorb_qdict

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 941a4736d2b465be1d6429415f8b1f26e2167585)
Signed-off-by: Michael Roth <michael.roth@amd.com>

target/xtensa: fix access ring in l32ex

l32ex does memory access as all regular load/store operations at CRING
level. Fix apparent pasto from l32e that caused it to use RING instead.

This is a correctness issue, not a security issue, because in the worst
case the privilege level of memory access may be lowered, resulting in
an exception when the correct implementation would've succeeded.
In no case it would allow memory access that would've raised an
exception in the correct implementation.

Cc: qemu-stable@nongnu.org
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
(cherry picked from commit 735aa900e4bf57b777ac620bed7c88234ec4b601)
Signed-off-by: Michael Roth <michael.roth@amd.com>

vl: allow not specifying size in -m when using -M memory-backend

Starting in QEMU 6.0's commit f5c9fcb82d ("vl: separate
qemu_create_machine", 2020-12-10), a function have_custom_ram_size()
replaced the return value of set_memory_options().

The purpose of the return value was to record the presence of
"-m size", and if it was not there, change the default RAM
size to the size of the memory backend passed with "-M
memory-backend".

With that commit, however, have_custom_ram_size() is now queried only
after set_memory_options has stored the fixed-up RAM size in QemuOpts for
"future use". This was actually the only future use of the fixed-up RAM
size, so remove that code and fix the bug.

Cc: qemu-stable@nongnu.org
Fixes: f5c9fcb82d ("vl: separate qemu_create_machine", 2020-12-10)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit d349f92f78d26db2805ca39a7745cc70affea021)
Signed-off-by: Michael Roth <michael.roth@amd.com>

target/ppc: Fix load endianness for lxvwsx/lxvdsx

TARGET_WORDS_BIGENDIAN may not match the machine endianness if that's a
runtime-configurable parameter.

Fixes: bcb0b7b1a1c05707304f80ca6f523d557816f85c
Fixes: afae37d98ae991c0792c867dbd9f32f988044318
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/212

Signed-off-by: Giuseppe Musacchio <thatlemon@gmail.com>
Message-Id: <20210518133020.58927-1-thatlemon@gmail.com>
Tested-by: Paul A. Clarke <pc@us.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 861f10fd528263a507476b8c4dda93a9588dfa5c)
Signed-off-by: Michael Roth <michael.roth@amd.com>

target/i386: Exit tb after wrmsr

At minimum, wrmsr can change efer, which affects HF_LMA.

Cc: qemu-stable@nongnu.org
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210514151342.384376-46-richard.henderson@linaro.org>
(cherry picked from commit 244843b757220c432e0e9ae8d2210218c034730d)
Signed-off-by: Michael Roth <michael.roth@amd.com>

migration/rdma: Fix cm_event used before being initialized

A segmentation fault was triggered when i try to abort a postcopy + rdma
migration.

since rdma_ack_cm_event releases a uninitialized cm_event in these case.

like below:
2496     ret = rdma_get_cm_event(rdma->channel, &cm_event);
2497     if (ret) {
2498         perror("rdma_get_cm_event after rdma_connect");
2499         ERROR(errp, "connecting to destination!");
2500         rdma_ack_cm_event(cm_event); <<<< cause segmentation fault
2501         goto err_rdma_source_connect;
2502     }

Refer to the rdma_get_cm_event() code, cm_event will be
updated/changed only if rdma_get_cm_event() returns 0. So it's okey to
remove the ack in error patch.

Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Message-Id: <20210519064740.10828-1-lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
(cherry picked from commit efb208dc9c3f1e881aecff21fb1c7a7b6b869480)
Signed-off-by: Michael Roth <michael.roth@amd.com>

monitor/qmp: fix race on CHR_EVENT_CLOSED without OOB

The QMP dispatcher coroutine holds the qmp_queue_lock over a yield
point, where it expects to be rescheduled from the main context. If a
CHR_EVENT_CLOSED event is received just then, it can race and block the
main thread on the mutex in monitor_qmp_cleanup_queue_and_resume.

monitor_resume does not need to be called from main context, so we can
call it immediately after popping a request from the queue, which allows
us to drop the qmp_queue_lock mutex before yielding.

Suggested-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Message-Id: <20210322154024.15011-1-s.reiter@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Markus Armbruster <armbru@redhat.com>
(cherry picked from commit a67b996e7894edfafbcd3fd007c9f58f26d25908)
Signed-off-by: Michael Roth <michael.roth@amd.com>

docs/system: Document the removal of "compat" property for POWER CPUs

This is just an oversight.

Fixes: f518be3aa35b ("target/ppc: Remove "compat" property of server class POWER CPUs")
Cc: groug@kaod.org
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <161399328834.51902.14269239378658110394.stgit@bahia.lan>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
(cherry picked from commit a058b895079348d0854a027a42ce3396a4a00bb7)
Signed-off-by: Michael Roth <michael.roth@amd.com>

linux-user/aarch64: Enable hwcap for RND, BTI, and MTE

These three features are already enabled by TCG, but are missing
their hwcap bits. Update HWCAP2 from linux v5.12.

Cc: qemu-stable@nongnu.org (for 6.0.1)
Buglink: https://bugs.launchpad.net/bugs/1926044
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210427214108.88503-1-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 68948d18224b93361e2880e2946ab268d0c650d7)
Signed-off-by: Michael Roth <michael.roth@amd.com>

multi-process: Initialize variables declared with g_auto*

Quote docs/devel/style.rst (section "Automatic memory deallocation"):

* Variables declared with g_auto* MUST always be initialized,
  otherwise the cleanup function will use uninitialized stack memory

Initialize @name properly to get rid of the compilation error (using
gcc-7.3.0 on CentOS):

../hw/remote/proxy.c: In function 'pci_proxy_dev_realize':
/usr/include/glib-2.0/glib/glib-autocleanups.h:28:3: error: 'name' may be used uninitialized in this function [-Werror=maybe-uninitialized]
   g_free (*pp);
   ^~~~~~~~~~~~
../hw/remote/proxy.c:350:30: note: 'name' was declared here
             g_autofree char *name;
                              ^~~~

Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Jagannathan Raman <jag.raman@oracle.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Miroslav Rezanina <mrezanin@redhat.com>
Message-id: 20210312112143.1369-1-yuzenghui@huawei.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit d90226808b5b1b30b07968e94d8f74bf8804fc89)
Signed-off-by: Michael Roth <michael.roth@amd.com>

Update version for v6.0.0 release

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Update version for v6.0.0-rc5 release

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

hw/block/nvme: fix invalid msix exclusive uninit

Commit 1901b4967c3f changed the nvme device from using a bar exclusive
for MSI-x to sharing it on bar0.

Unfortunately, the msix_uninit_exclusive_bar() call remains in
nvme_exit() which causes havoc when the device is removed with, say,
device_del. Fix this.

Additionally, a subregion is added but it is not removed on exit which
causes a reference to linger and the drive to never be unlocked.

Fixes: 1901b4967c3f ("hw/block/nvme: move msix table and pba to BAR 0")
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

target/s390x: fix s390_probe_access to check PAGE_WRITE_ORG for writeability

We can remove PAGE_WRITE when (internally) marking a page read-only
because it contains translated code. This can get confused when we are
executing signal return code on signal stacks.

Fixes: e56552cf07 ("target/s390x: Implement the MVPG condition-code-option bit")
Found-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-id: 20210422154427.13038-1-alex.bennee@linaro.org
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Thomas Huth <thuth@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

net: check the existence of peer before trying to pad

There could be case that peer is NULL. This can happen when during
network device hot-add where net device needs to be added first. So
the patch check the existence of peer before trying to do the pad.

Fixes: 969e50b61a285 ("net: Pad short frames to minimum size before sending from SLiRP/TAP")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Message-id: 20210423031803.1479-1-jasowang@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Update version for v6.0.0-rc4 release

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

qga: fix guest-get-disks regression

Commit 54aa3de72 ("qapi: Use QAPI_LIST_PREPEND() where possible")
inadvertently removed the has_dependencies from the partition disk
info, resulting in empty list being returned.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1950833
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 20210420125831.233092-1-marcandre.lureau@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

target/mips/rel6_translate: Change license to GNU LGPL v2.1 (or later)

When adding this file and its new content in commit 3f7a927847a
("target/mips: LSA/DLSA R6 decodetree helpers") I did 2 mistakes:

1: Listed authors who haven't been involved in its development,
2: Used an incorrect GNU GPLv2 license text (using 'and' instead
of 'or').

Instead of correcting the GNU GPLv2 license text, replace the license
by the 'GNU LGPL v2.1 or later' one, to be coherent with the other
translation files in the target/mips/ folder.

Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20210420100633.1752440-1-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

migration: Deprecate redundant query-migrate result @blocked

Result @blocked is true when and only when result @blocked-reasons is
present.  It's always non-empty when present.  @blocked is redundant.
It was introduced in commit 3af8554bd0 "migration: Add blocker
information", and has not been released.  This gives us a chance to
fix the interface with minimal fuss.

Unfortunately, we're already too close to the release to risk dropping
it.  Deprecate it instead.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-id: 20210420051907.891470-1-armbru@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

mptsas: Remove unused MPTSASState 'pending' field (CVE-2021-3392)

While processing SCSI i/o requests in mptsas_process_scsi_io_request(),
the Megaraid emulator appends new MPTSASRequest object 'req' to
the 's->pending' queue. In case of an error, this same object gets
dequeued in mptsas_free_request() only if SCSIRequest object
'req->sreq' is initialised. This may lead to a use-after-free issue.

Since s->pending is actually not used, simply remove it from
MPTSASState.

Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reported-by: Cheolwoo Myung <cwmyung@snu.ac.kr>
Message-id: 20210419134247.1467982-1-f4bug@amsat.org
Message-Id: <20210416102243.1293871-1-mjt@msgid.tls.msk.ru>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Reported-by: Cheolwoo Myung <cwmyung@snu.ac.kr>
BugLink: https://bugs.launchpad.net/qemu/+bug/1914236
Fixes: e351b826112 ("hw: Add support for LSI SAS1068 (mptsas) device")
[PMD: Reworded description, added more tags]
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20210417' into staging

Fixes for rc4:
* Fix compile failures of C++ files with new glib headers
* mps3-an547: Use correct Cortex-M55 CPU and don't disable its FPU
* accel/tcg: Fix assertion failure executing from non-RAM with -icount

# gpg: Signature made Sat 17 Apr 2021 20:39:58 BST
# gpg:                using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg:                issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [ultimate]
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>" [ultimate]
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [ultimate]
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* remotes/pmaydell/tags/pull-target-arm-20210417:
  accel/tcg: avoid re-translating one-shot instructions
  target/arm: drop CF_LAST_IO/dc->condjump check
  hw/arm/armsse: Make SSE-300 use Cortex-M55
  hw/arm/armsse: Give SSE-300 its own Property array
  include/qemu/osdep.h: Move system includes to top
  osdep: protect qemu/osdep.h with extern "C"
  osdep: include glib-compat.h before other QEMU headers

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

accel/tcg: avoid re-translating one-shot instructions

By definition a single instruction is capable of being an IO
instruction. This avoids a problem of triggering a cpu_io_recompile on
a non-recorded translation which then fails because it expects
tcg_tb_lookup() to succeed unconditionally. The normal use case
requires a TB to be able to resolve machine state.

The other users of tcg_tb_lookup() are able to tolerate a missing TB
if the machine state has been resolved by other means - which in the
single-shot case is always true because machine state is synced at the
start of a block.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210415162454.22056-1-alex.bennee@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

target/arm: drop CF_LAST_IO/dc->condjump check

This is a left over erroneous check from the days front-ends handled
io start/end themselves. Regardless just because IO could be performed
on the last instruction doesn't obligate the front end to do so.

This fixes an abort faced by the aspeed execute-in-place support which
will necessarily trigger this state (even before the one-shot
CF_LAST_IO fix). The test still seems to hang once it attempts to boot
the Linux kernel but I suspect this is an unrelated issue with icount
and the timer handling code.

The original intention of the cpu_abort (added in commit 2e70f6efa8b9
when the icount stuff was first added) seems to have been to act as
an assert() to catch an unhandled corner case where the generated code
would be something like:
    conditional branch to condlabel if its cc failed
    implementation of the insn (a conditional branch or trap)
    code emitted by gen_io_end()
condlabel:
    gen_goto_tb or equivalent thing to go to next insn

At runtime the cc-failed case would skip over the code emitted by
gen_io_end(), leaving the can_do_io flag incorrectly set.

In commit ba3e7926691ed33 we switched to an implementation which
always clears can_do_io at the start of the following TB instead
of trying to clear it at the end of a TB that did IO. So the corner
case that this cpu_abort() was trying to flag is no longer possible,
because the gen_io_end() call has been deleted. We can therefore
safely remove the no-longer-valid assertion.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20210416170207.12504-1-alex.bennee@linaro.org
Cc: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

hw/arm/armsse: Make SSE-300 use Cortex-M55

The SSE-300 has a Cortex-M55 (which was the whole reason for us
modelling it), but we forgot to actually update the code to let it
have a different CPU type from the IoTKit and SSE-200. Add CPU type
as a field for ARMSSEInfo instead of hardcoding it to always use a
Cortex-M33.

Buglink: https://bugs.launchpad.net/qemu/+bug/1923861
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210416104010.13228-1-peter.maydell@linaro.org

hw/arm/armsse: Give SSE-300 its own Property array

SSE-300 currently shares the SSE-200 Property array. This is
bad principally because the default values of the CPU0_FPU
and CPU0_DSP properties disable the FPU and DSP on the CPU.
That is correct for the SSE-200 but not the SSE-300.
Give the SSE-300 its own Property array with the correct
SSE-300 specific settings:
* SSE-300 has only one CPU, so no CPU1* properties
* SSE-300 CPU has FPU and DSP

Buglink: https://bugs.launchpad.net/qemu/+bug/1923861
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210415182353.8173-1-peter.maydell@linaro.org

include/qemu/osdep.h: Move system includes to top

Mostly osdep.h puts the system includes at the top of the file; but
there are a couple of exceptions where we include a system header
halfway through the file. Move these up to the top with the rest
so that all the system headers we include are included before
we include os-win32.h or os-posix.h.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210416135543.20382-4-peter.maydell@linaro.org
Message-id: 20210414184343.26235-1-peter.maydell@linaro.org

osdep: protect qemu/osdep.h with extern "C"

System headers may include templates if compiled with a C++ compiler,
which cause the compiler to complain if qemu/osdep.h is included
within a C++ source file's 'extern "C"' block. Add
an 'extern "C"' block directly to qemu/osdep.h, so that
system headers can be kept out of it.

There is a stray declaration early in qemu/osdep.h, which needs
to be special cased. Add a definition in qemu/compiler.h to
make it look nice.

config-host.h, CONFIG_TARGET, exec/poison.h and qemu/compiler.h
are included outside the 'extern "C"' block; that is not
an issue because they consist entirely of preprocessor directives.

This allows us to move the include of osdep.h in our two C++
source files outside the extern "C" block they were previously
using for it, which in turn means that they compile successfully
against newer versions of glib which insist that glib.h is
*not* inside an extern "C" block.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210416135543.20382-3-peter.maydell@linaro.org
[PMM: Moved disas/arm-a64.cc osdep.h include out of its extern "C" block;
explained in commit message why we're doing this]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

osdep: include glib-compat.h before other QEMU headers

glib-compat.h is sort of like a system header, and it needs to include
system headers (glib.h) that may dislike being included under
'extern "C"'. Move it right after all system headers and before
all other QEMU headers.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210416135543.20382-2-peter.maydell@linaro.org
[PMM: Added comment about why glib-compat.h is special]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Update version for v6.0.0-rc3 release

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

qapi/qom.json: Do not use CONFIG_VIRTIO_CRYPTO in common code

The ObjectType enum and ObjectOptions are included from qapi-types-qom.h
into common code. We should not use target-specific config switches like
CONFIG_VIRTIO_CRYPTO here, since this is not defined in common code and
thus the enum will look differently between common and target specific
code. For this case, it's hopefully enough to check for CONFIG_VHOST_CRYPTO
only (which is a host specific config switch, i.e. it's the same on all
targets).

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210412160710.639800-1-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/dgilbert-gitlab/tags/pull-virtiofs-20210413' into staging

virtiofs: Fix feature negotiation (for 6.0)

A 6.0 fix for feature negotiation on vhost-user.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
# gpg: Signature made Tue 13 Apr 2021 16:21:41 BST
# gpg:                using RSA key 45F5C71B4A0CB7FB977A9FA90516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>" [full]
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A  9FA9 0516 331E BC5B FDE7

* remotes/dgilbert-gitlab/tags/pull-virtiofs-20210413:
  vhost-user-fs: fix features handling

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2021-04-13' into staging

Block patches for 6.0-rc3:
- Use-after-free fix for block/nbd.c

# gpg: Signature made Tue 13 Apr 2021 14:35:48 BST
# gpg:                using RSA key 91BEB60A30DB3E8857D11829F407DB0061D5CF40
# gpg:                issuer "mreitz@redhat.com"
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>" [full]
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1  1829 F407 DB00 61D5 CF40

* remotes/maxreitz/tags/pull-block-2021-04-13:
  block/nbd: fix possible use after free of s->connect_thread

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20210413' into staging

target-arm queue:
* Fix MPC setting for AN524 SRAM block
* sphinx: qapidoc: Wrap "If" section body in a paragraph node

# gpg: Signature made Tue 13 Apr 2021 13:07:12 BST
# gpg:                using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg:                issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [ultimate]
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>" [ultimate]
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [ultimate]
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* remotes/pmaydell/tags/pull-target-arm-20210413:
  sphinx: qapidoc: Wrap "If" section body in a paragraph node
  hw/arm/mps2-tz: Assert if more than one RAM is attached to an MPC
  hw/arm/mps2-tz: Fix MPC setting for AN524 SRAM block

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

vhost-user-fs: fix features handling

Make virtio-fs take into account server capabilities.

Just returning requested features assumes they all of then are implemented
by server and results in setting unsupported configuration if some of them
are absent.

Signed-off-by: Anton Kuchin <antonkuchin@yandex-team.ru>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
With changes suggested by Stefan
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

block/nbd: fix possible use after free of s->connect_thread

If on nbd_close() we detach the thread (in
nbd_co_establish_connection_cancel() thr->state becomes
CONNECT_THREAD_RUNNING_DETACHED), after that point we should not use
s->connect_thread (which is set to NULL), as running thread may free it
at any time.

Still nbd_co_establish_connection() does exactly this: it saves
s->connect_thread to local variable (just for better code style) and
use it even after yield point, when thread may be already detached.

Fix that. Also check thr to be non-NULL on
nbd_co_establish_connection() start for safety.

After this patch "case CONNECT_THREAD_RUNNING_DETACHED" becomes
impossible in the second switch in nbd_co_establish_connection().
Still, don't add extra abort() just before the release. If it somehow
possible to reach this "case:" it won't hurt. Anyway, good refactoring
of all this reconnect mess will come soon.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210406155114.1057355-1-vsementsov@virtuozzo.com>
Reviewed-by: Roman Kagan <rvkagan@yandex-team.ru>
Signed-off-by: Max Reitz <mreitz@redhat.com>

Merge remote-tracking branch 'remotes/philmd/tags/mips-20210413' into staging

MIPS patches queue

- Fix invalid Kconfig dependency
- Fix missing migrated value
- Fix TCG temporary leak

# gpg: Signature made Tue 13 Apr 2021 11:10:11 BST
# gpg:                using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]
# Primary key fingerprint: FAAB E75E 1291 7221 DCFD  6BB2 E3E3 2C2C DEAD C0DE

* remotes/philmd/tags/mips-20210413:
  target/mips: Fix TCG temporary leak in gen_cache_operation()
  hw/isa/piix4: Migrate Reset Control Register
  hw/isa/Kconfig: Add missing dependency VIA VT82C686 -> APM

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/dg-gitlab/tags/ppc-for-6.0-20210412' into staging

ppc patch queue for 2021-04-21

Here's what I hope is the last ppc related pull request for qemu-6.0.

The 2 patches here revert a behavioural change that after further
discussion we concluded was a bad idea (adding a timeout for
possibly-failed hot unplug requests).  Instead it implements a
different approach to the original problem: we again let unplug
requests the guest doesn't respond to remain pending indefinitely, but
no longer allow those to block attempts to retry the same unplug
again.

The change is a bit more complex than I'd like for this late in the
freeze.  Nonetheless, I think it's important to merge this for 6.0, so
we don't allow a release which has the probably-a-bad-idea timeout
behaviour.

# gpg: Signature made Mon 12 Apr 2021 06:25:58 BST
# gpg:                using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full]
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full]
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full]
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown]
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dg-gitlab/tags/ppc-for-6.0-20210412:
  spapr.c: always pulse guest IRQ in spapr_core_unplug_request()
  spapr: rollback 'unplug timeout' for CPU hotunplugs

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

target/mips: Fix TCG temporary leak in gen_cache_operation()

Fix a TCG temporary leak when translating CACHE opcode.

Fixes: 0d74a222c27 ("make ITC Configuration Tags accessible to the CPU")
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210406202857.1440744-1-f4bug@amsat.org>

hw/isa/piix4: Migrate Reset Control Register

When adding the Reset register in commit 5790b757cfb we
forgot to migrate it.

While it is possible a VM using the PIIX4 is migrated just
after requesting a system shutdown, it is very unlikely.
However when restoring a migrated VM, we might have the
RCR bit #4 set on the stack and when the VM resume it
directly shutdowns.

Add a post_load() migration handler and set the default
RCR value to 0 for earlier versions, assuming the VM was
not going to shutdown before migration.

Fixes: 5790b757cfb ("piix4: Add the Reset Control Register")
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20210324200334.729899-1-f4bug@amsat.org>

hw/isa/Kconfig: Add missing dependency VIA VT82C686 -> APM

TYPE_VIA_PM calls apm_init() in via_pm_realize(), so
requires APM to be selected.

Reported-by: BALATON Zoltan <balaton@eik.bme.hu>
Fixes: dd0ff8191ab ("isa: express SuperIO dependencies with Kconfig")
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210302080531.913802-1-f4bug@amsat.org>

Merge remote-tracking branch 'remotes/mcayland/tags/qemu-sparc-20210412' into staging

qemu-sparc queue

# gpg: Signature made Mon 12 Apr 2021 23:13:12 BST
# gpg:                using RSA key CC621AB98E82200D915CC9C45BC2C56FAE0F321F
# gpg:                issuer "mark.cave-ayland@ilande.co.uk"
# gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>" [full]
# Primary key fingerprint: CC62 1AB9 8E82 200D 915C  C9C4 5BC2 C56F AE0F 321F

* remotes/mcayland/tags/qemu-sparc-20210412:
  tests/qtest: add tests for am53c974 device
  esp: ensure that do_cmd is set to zero before submitting an ESP select command
  esp: don't reset async_len directly in esp_select() if cancelling request
  esp: don't overflow cmdfifo if TC is larger than the cmdfifo size
  esp: don't overflow cmdfifo in get_cmd()
  esp: don't underflow cmdfifo in do_cmd()
  esp: ensure cmdfifo is not empty and current_dev is non-NULL
  esp: introduce esp_fifo_pop_buf() and use it instead of fifo8_pop_buf()
  esp: consolidate esp_cmdfifo_pop() into esp_fifo_pop()
  esp: consolidate esp_cmdfifo_push() into esp_fifo_push()
  esp: rework write_response() to avoid using the FIFO for DMA transactions
  esp: always check current_req is not NULL before use in DMA callbacks
  esp: fix setting of ESPState mig_version_id when launching QEMU with -S option

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

sphinx: qapidoc: Wrap "If" section body in a paragraph node

These sections need to be wrapped in a block-level element, such as
Paragraph in order for them to be rendered into Texinfo correctly.

Before (e.g.):

<section ids="qapidoc-713">
  <title>If</title>
  <literal>defined(CONFIG_REPLICATION)</literal>
</section>

became:

  .SS If
  \fBdefined(CONFIG_REPLICATION)\fP.SS \fBBlockdevOptionsReplication\fP (Object)
  ...

After:

<section ids="qapidoc-713">
  <title>If</title>
  <paragraph>
    <literal>defined(CONFIG_REPLICATION)</literal>
  </paragraph>
</section>

becomes:

  .SS If
  .sp
  \fBdefined(CONFIG_REPLICATION)\fP
  .SS \fBBlockdevOptionsReplication\fP (Object)
  ...

Reported-by: Markus Armbruster <armbru@redhat.com>
Tested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20210406141909.1992225-2-jsnow@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

tests/qtest: add tests for am53c974 device

Use the autogenerated fuzzer test cases as the basis for a set of am53c974
regression tests.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20210407195801.685-13-mark.cave-ayland@ilande.co.uk>

esp: ensure that do_cmd is set to zero before submitting an ESP select command

When a CDB has been received and is about to be submitted to the SCSI layer
via one of the ESP select commands, ensure that do_cmd is set to zero before
executing the command.

Otherwise a guest executing 2 valid CDBs in quick sequence can invoke the SCSI
.transfer_data callback again before do_cmd is set to zero by the callback
function triggering an assert at the start of esp_transfer_data().

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210407195801.685-12-mark.cave-ayland@ilande.co.uk>

esp: don't reset async_len directly in esp_select() if cancelling request

Instead let the SCSI layer invoke the .cancel callback itself to cancel and
reset the request state.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210407195801.685-11-mark.cave-ayland@ilande.co.uk>

esp: don't overflow cmdfifo if TC is larger than the cmdfifo size

If a guest transfers the message out/command phase data using DMA with a TC
that is larger than the cmdfifo size then the cmdfifo overflows triggering
an assert. Limit the size of the transfer to the free space available in
cmdfifo.

Buglink: https://bugs.launchpad.net/qemu/+bug/1919036
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20210407195801.685-10-mark.cave-ayland@ilande.co.uk>

esp: don't overflow cmdfifo in get_cmd()

If the guest tries to read a CDB using DMA and cmdfifo is not empty then it is
possible to overflow cmdfifo.

Since this can only occur by issuing deliberately incorrect instruction
sequences, ensure that the maximum length of the CDB transferred to cmdfifo is
limited to the available free space within cmdfifo.

Buglink: https://bugs.launchpad.net/qemu/+bug/1909247
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20210407195801.685-9-mark.cave-ayland@ilande.co.uk>

esp: don't underflow cmdfifo in do_cmd()

If the guest tries to execute a CDB when cmdfifo is not empty before the start
of the message out phase then clearing the message out phase data will cause
cmdfifo to underflow due to cmdfifo_cdb_offset being larger than the amount of
data within.

Since this can only occur by issuing deliberately incorrect instruction
sequences, ensure that the maximum length of esp_fifo_pop_buf() is limited to
the size of the data within cmdfifo.

Buglink: https://bugs.launchpad.net/qemu/+bug/1909247
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20210407195801.685-8-mark.cave-ayland@ilande.co.uk>

esp: ensure cmdfifo is not empty and current_dev is non-NULL

When about to execute a SCSI command, ensure that cmdfifo is not empty and
current_dev is non-NULL. This can happen if the guest tries to execute a TI
(Transfer Information) command without issuing one of the select commands
first.

Buglink: https://bugs.launchpad.net/qemu/+bug/1910723
Buglink: https://bugs.launchpad.net/qemu/+bug/1909247
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20210407195801.685-7-mark.cave-ayland@ilande.co.uk>

esp: introduce esp_fifo_pop_buf() and use it instead of fifo8_pop_buf()

The const pointer returned by fifo8_pop_buf() lies directly within the array used
to model the FIFO. Building with address sanitizers enabled shows that if the
caller expects a minimum number of bytes present then if the FIFO is nearly full,
the caller may unexpectedly access past the end of the array.

Introduce esp_fifo_pop_buf() which takes a destination buffer and performs a
memcpy() in it to guarantee that the caller cannot overwrite the FIFO array and
update all callers to use it. Similarly add underflow protection similar to
esp_fifo_push() and esp_fifo_pop() so that instead of triggering an assert()
the operation becomes a no-op.

Buglink: https://bugs.launchpad.net/qemu/+bug/1909247
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20210407195801.685-6-mark.cave-ayland@ilande.co.uk>

esp: consolidate esp_cmdfifo_pop() into esp_fifo_pop()

Each FIFO currently has its own pop functions with the only difference being
the capacity check. The original reason for this was that the fifo8
implementation doesn't have a formal API for retrieving the FIFO capacity,
however there are multiple examples within QEMU where the capacity field is
accessed directly.

Change esp_fifo_pop() to access the FIFO capacity directly and then consolidate
esp_cmdfifo_pop() into esp_fifo_pop().

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20210407195801.685-5-mark.cave-ayland@ilande.co.uk>

esp: consolidate esp_cmdfifo_push() into esp_fifo_push()

Each FIFO currently has its own push functions with the only difference being
the capacity check. The original reason for this was that the fifo8
implementation doesn't have a formal API for retrieving the FIFO capacity,
however there are multiple examples within QEMU where the capacity field is
accessed directly.

Change esp_fifo_push() to access the FIFO capacity directly and then consolidate
esp_cmdfifo_push() into esp_fifo_push().

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20210407195801.685-4-mark.cave-ayland@ilande.co.uk>

esp: rework write_response() to avoid using the FIFO for DMA transactions

The code for write_response() has always used the FIFO to store the data for
the status/message in phases, even for DMA transactions. Switch to using a
separate buffer that can be used directly for DMA transactions and restrict
the FIFO use to the non-DMA case.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210407195801.685-3-mark.cave-ayland@ilande.co.uk>

esp: always check current_req is not NULL before use in DMA callbacks

After issuing a SCSI command the SCSI layer can call the SCSIBusInfo .cancel
callback which resets both current_req and current_dev to NULL. If any data
is left in the transfer buffer (async_len != 0) then the next TI (Transfer
Information) command will attempt to reference the NULL pointer causing a
segfault.

Buglink: https://bugs.launchpad.net/qemu/+bug/1910723
Buglink: https://bugs.launchpad.net/qemu/+bug/1909247
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20210407195801.685-2-mark.cave-ayland@ilande.co.uk>

esp: fix setting of ESPState mig_version_id when launching QEMU with -S option

If QEMU is launched with the -S option then the ESPState mig_version_id property
is left unset due to the ordering of the VMState fields in the VMStateDescription
for sysbusespscsi and pciespscsi. If the VM is migrated and restored in this
stopped state, the version tests in the vmstate_esp VMStateDescription and
esp_post_load() become confused causing the migration to fail.

Fix the ordering problem by moving the setting of mig_version_id to a common
esp_pre_save() function which is invoked first by both sysbusespscsi and
pciespscsi rather than at the point where ESPState is itself serialised into the
migration stream.

Buglink: https://bugs.launchpad.net/qemu/+bug/1922611
Fixes: 0bd005be78 ("esp: add vmstate_esp version to embedded ESPState")
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210407124842.32695-1-mark.cave-ayland@ilande.co.uk>

hw/arm/mps2-tz: Assert if more than one RAM is attached to an MPC

Each board in mps2-tz.c specifies a RAMInfo[] array providing
information about each RAM in the board. The .mpc field of the
RAMInfo struct specifies which MPC, if any, the RAM is attached to.
We already assert if the array doesn't have any entry for an MPC, but
we don't diagnose the error of using the same MPC number twice (which
is quite easy to do by accident if copy-and-pasting structure
entries).

Enhance find_raminfo_for_mpc() so that it detects multiple entries
for the MPC as well as missing entries.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20210409150527.15053-3-peter.maydell@linaro.org

hw/arm/mps2-tz: Fix MPC setting for AN524 SRAM block

The AN524 has three MPCs: one for the BRAM, one for the QSPI flash,
and one for the DDR.  We incorrectly set the .mpc field in the
RAMInfo struct for the SRAM block to 1, giving it the same MPC we are
using for the QSPI.  The effect of this was that the QSPI didn't get
mapped into the system address space at all, via an MPC or otherwise,
and guest programs which tried to read from the QSPI would get a bus
error.  Correct the SRAM RAMInfo to indicate that it does not have an
associated MPC.

Fixes: 25ff112a8cc ("hw/arm/mps2-tz: Add new mps3-an524 board")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20210409150527.15053-2-peter.maydell@linaro.org

Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20210412' into staging

target-arm queue:
* hw/arm/virt-acpi-build: Fix GSIV values of the {GERR, Sync} interrupts
* hw/arm/smmuv3: Emulate CFGI_STE_RANGE for an aligned range of StreamIDs
* accel/tcg: Preserve PAGE_ANON when changing page permissions
* target/arm: Check PAGE_WRITE_ORG for MTE writeability
* exec: Fix overlap of PAGE_ANON and PAGE_TARGET_1

# gpg: Signature made Mon 12 Apr 2021 11:31:15 BST
# gpg:                using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg:                issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [ultimate]
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>" [ultimate]
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [ultimate]
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* remotes/pmaydell/tags/pull-target-arm-20210412:
  exec: Fix overlap of PAGE_ANON and PAGE_TARGET_1
  target/arm: Check PAGE_WRITE_ORG for MTE writeability
  accel/tcg: Preserve PAGE_ANON when changing page permissions
  hw/arm/smmuv3: Emulate CFGI_STE_RANGE for an aligned range of StreamIDs
  hw/arm/virt-acpi-build: Fix GSIV values of the {GERR, Sync} interrupts

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

exec: Fix overlap of PAGE_ANON and PAGE_TARGET_1

Unfortuately, the elements of PAGE_* were not in numerical
order and so PAGE_ANON was added to an "unused" bit.
As an arbitrary choice, move PAGE_TARGET_{1,2} together.

Cc: Laurent Vivier <laurent@vivier.eu>
Fixes: 26bab757d41b ("linux-user: Introduce PAGE_ANON")
Buglink: https://bugs.launchpad.net/bugs/1922617
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Tested-by: Laurent Vivier <laurent@vivier.eu>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

target/arm: Check PAGE_WRITE_ORG for MTE writeability

We can remove PAGE_WRITE when (internally) marking a page
read-only because it contains translated code.

This can be triggered by tests/tcg/aarch64/bti-2, after
having serviced SIGILL trampolines on the stack.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

accel/tcg: Preserve PAGE_ANON when changing page permissions

Using mprotect() to change PROT_* does not change the MAP_ANON
previously set with mmap(). Our linux-user version of MTE only
works with MAP_ANON pages, so losing PAGE_ANON caused MTE to
stop working.

Reported-by: Stephen Long <steplong@quicinc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

hw/arm/smmuv3: Emulate CFGI_STE_RANGE for an aligned range of StreamIDs

In emulation of the CFGI_STE_RANGE command, we now take StreamID as the
start of the invalidation range, regardless of whatever the Range is,
whilst the spec clearly states that

- "Invalidation is performed for an *aligned* range of 2^(Range+1)
StreamIDs."

- "The bottom Range+1 bits of the StreamID parameter are IGNORED,
aligning the range to its size."

Take CFGI_ALL (where Range == 31) as an example, if there are some random
bits in the StreamID field, we'll fail to perform the full invalidation but
get a strange range (e.g., SMMUSIDRange={.start=1, .end=0}) instead. Rework
the emulation a bit to get rid of the discrepancy with the spec.

Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Acked-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20210402100449.528-1-yuzenghui@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

hw/arm/virt-acpi-build: Fix GSIV values of the {GERR, Sync} interrupts

The GSIV values in SMMUv3 IORT node are not correct as they don't match
the SMMUIrq enumeration, which describes the IRQ<->PIN mapping used by
our emulated vSMMU.

Fixes: a703b4f6c1ee ("hw/arm/virt-acpi-build: Add smmuv3 node in IORT table")
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Acked-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20210402084731.93-1-yuzenghui@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/nvme/tags/nvme-fixes-20210412-pull-request' into staging

emulated nvme docs and fixes for -rc3

- documentation
- fixes

# gpg: Signature made Mon 12 Apr 2021 07:56:09 BST
# gpg:                using RSA key 522833AA75E2DCE6A24766C04DE1AF316D4F0DE9
# gpg: Good signature from "Klaus Jensen <its@irrelevant.dk>" [unknown]
# gpg:                 aka "Klaus Jensen <k.jensen@samsung.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: DDCA 4D9C 9EF9 31CC 3468  4272 63D5 6FC5 E55D A838
#      Subkey fingerprint: 5228 33AA 75E2 DCE6 A247  66C0 4DE1 AF31 6D4F 0DE9

* remotes/nvme/tags/nvme-fixes-20210412-pull-request:
  hw/block/nvme: drain namespaces on sq deletion
  hw/block/nvme: store aiocb in compare
  hw/block/nvme: map prp fix if prp2 contains non-zero offset
  docs: add nvme emulation documentation

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

hw/block/nvme: drain namespaces on sq deletion

For most commands, when issuing an AIO, the BlockAIOCB is stored in the
NvmeRequest aiocb pointer when the AIO is issued. The main use of this
is cancelling AIOs when deleting submission queues (it is currently not
used for Abort).

However, some commands like Dataset Management Zone Management Send
(zone reset) may involve more than one AIO and here the AIOs are issued
without saving a reference to the BlockAIOCB. This is a problem since
nvme_del_sq() will attempt to cancel outstanding AIOs, potentially with
an invalid BlockAIOCB since the aiocb pointer is not NULL'ed when the
request structure is recycled.

Fix this by

  1. making sure the aiocb pointer is NULL'ed when requests are recycled
  2. only attempt to cancel the AIO if the aiocb is non-NULL
  3. if any AIOs could not be cancelled, drain all aio as a last resort.

Fixes: dc04d25e2f3f ("hw/block/nvme: add support for the format nvm command")
Fixes: c94973288cd9 ("hw/block/nvme: add broadcast nsid support flush command")
Fixes: e4e430b3d6ba ("hw/block/nvme: add simple copy command")
Fixes: 5f5dc4c6a942 ("hw/block/nvme: zero out zones on reset")
Fixes: 2605257a26b8 ("hw/block/nvme: add the dataset management command")
Cc: Gollu Appalanaidu <anaidu.gollu@samsung.com>
Cc: Minwoo Im <minwoo.im@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>

hw/block/nvme: store aiocb in compare

nvme_compare() fails to store the aiocb from the blk_aio_preadv() call.
Fix this.

Fixes: 0a384f923f51 ("hw/block/nvme: add compare command")
Cc: Gollu Appalanaidu <anaidu.gollu@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>

hw/block/nvme: map prp fix if prp2 contains non-zero offset

nvme_map_prp needs to calculate the number of list entries based on the
offset value. For the subsequent PRP2 list, need to ensure the number of
entries is within the MAX number of PRP entries for a page.

Signed-off-by: Padmakar Kalghatgi <p.kalghatgi@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>

docs: add nvme emulation documentation

Remove the docs/specs/nvme.txt and replace it with proper documentation
in docs/system/nvme.rst.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>