git.ipfire.org Git - thirdparty/qemu.git/log

vpc: Return 0 from vpc_co_create() on success

blockdev_create_run() directly uses .bdrv_co_create()'s return value as
the job's return value. Jobs must return 0 on success, not just any
nonnegative value. Therefore, using blockdev-create for VPC images may
currently fail as the vpc driver may return a positive integer.

Because there is no point in returning a positive integer anywhere in
the block layer (all non-negative integers are generally treated as
complete success), we probably do not want to add more such cases.
Therefore, fix this problem by making the vpc driver always return 0 in
case of success.

Suggested-by: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 1a37e3124407b5a145d44478d3ecbdb89c63789f)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

iotests: add testing shim for script-style python tests

Because the new-style python tests don't use the iotests.main() test
launcher, we don't turn on the debugger logging for these scripts
when invoked via ./check -d.

Refactor the launcher shim into new and old style shims so that they
share environmental configuration.

Two cleanup notes: debug was not actually used as a global, and there
was no reason to create a class in an inner scope just to achieve
default variables; we can simply create an instance of the runner with
the values we want instead.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190709232550.10724-14-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit 456a2d5ac7641c7e75c76328a561b528a8607a8e)
*prereq for 1a37e31244/88d2aa533a
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

pr-manager: Fix invalid g_free() crash bug

pr_manager_worker() passes its @opaque argument to g_free().  Wrong;
it points to pr_manager_worker()'s automatic @data.  Broken when
commit 2f3a7ab39be converted @data from heap- to stack-allocated.  Fix
by deleting the g_free().

Fixes: 2f3a7ab39bec4ba8022dc4d42ea641165b004e3e
Cc: qemu-stable@nongnu.org
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 6b9d62c2a9e83bbad73fb61406f0ff69b46ff6f3)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

xen-bus: Fix backend state transition on device reset

When a frontend wants to reset its state and the backend one, it
starts with setting "Closing", then waits for the backend (QEMU) to do
the same.

But when QEMU is setting "Closing" to its state, it triggers an event
(xenstore watch) that re-execute xen_device_backend_changed() and set
the backend state to "Closed". QEMU should wait for the frontend to
set "Closed" before doing the same.

Before setting "Closed" to the backend_state, we are also going to
check if there is a frontend. If that the case, when the backend state
is set to "Closing" the frontend should react and sets its state to
"Closing" then "Closed". The backend should wait for that to happen.

Fixes: b6af8926fb858c4f1426e5acb2cfc1f0580ec98a
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Message-Id: <20190823101534.465-2-anthony.perard@citrix.com>
(cherry picked from commit cb3231460747552d70af9d546dc53d8195bcb796)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

target/arm: Don't abort on M-profile exception return in linux-user mode

An attempt to do an exception-return (branch to one of the magic
addresses) in linux-user mode for M-profile should behave like
a normal branch, because linux-user mode is always going to be
in 'handler' mode. This used to work, but we broke it when we added
support for the M-profile security extension in commit d02a8698d7ae2bfed.

In that commit we allowed even handler-mode calls to magic return
values to be checked for and dealt with by causing an
EXCP_EXCEPTION_EXIT exception to be taken, because this is
needed for the FNC_RETURN return-from-non-secure-function-call
handling. For system mode we added a check in do_v7m_exception_exit()
to make any spurious calls from Handler mode behave correctly, but
forgot that linux-user mode would also be affected.

How an attempted return-from-non-secure-function-call in linux-user
mode should be handled is not clear -- on real hardware it would
result in return to secure code (not to the Linux kernel) which
could then handle the error in any way it chose. For QEMU we take
the simple approach of treating this erroneous return the same way
it would be handled on a CPU without the security extensions --
treat it as a normal branch.

The upshot of all this is that for linux-user mode we should never
do any of the bx_excret magic, so the code change is simple.

This ought to be a weird corner case that only affects broken guest
code (because Linux user processes should never be attempting to do
exception returns or NS function returns), except that the code that
assigns addresses in RAM for the process and stack in our linux-user
code does not attempt to avoid this magic address range, so
legitimate code attempting to return to a trampoline routine on the
stack can fall into this case. This change fixes those programs,
but we should also look at restricting the range of memory we
use for M-profile linux-user guests to the area that would be
real RAM in hardware.

Cc: qemu-stable@nongnu.org
Reported-by: Christophe Lyon <christophe.lyon@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20190822131534.16602-1-peter.maydell@linaro.org
Fixes: https://bugs.launchpad.net/qemu/+bug/1840922
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 5e5584c89f36b302c666bc6db535fd3f7ff35ad2)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

dma-helpers: ensure AIO callback is invoked after cancellation

dma_aio_cancel unschedules the BH if there is one, which corresponds
to the reschedule_dma case of dma_blk_cb. This can stall the DMA
permanently, because dma_complete will never get invoked and therefore
nobody will ever invoke the original AIO callback in dbs->common.cb.

Fix this by invoking the callback (which is ensured to happen after
a bdrv_aio_cancel_async, or done manually in the dbs->bh case), and
add assertions to check that the DMA state machine is indeed waiting
for dma_complete or reschedule_dma, but never both.

Reported-by: John Snow <jsnow@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20190729213416.1972-1-pbonzini@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit 539343c0a47e19d5dd64d846d64d084d9793681f)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

qcow2: Fix the calculation of the maximum L2 cache size

The size of the qcow2 L2 cache defaults to 32 MB, which can be easily
larger than the maximum amount of L2 metadata that the image can have.
For example: with 64 KB clusters the user would need a qcow2 image
with a virtual size of 256 GB in order to have 32 MB of L2 metadata.

Because of that, since commit b749562d9822d14ef69c9eaa5f85903010b86c30
we forbid the L2 cache to become larger than the maximum amount of L2
metadata for the image, calculated using this formula:

uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8);

The problem with this formula is that the result should be rounded up
to the cluster size because an L2 table on disk always takes one full
cluster.

For example, a 1280 MB qcow2 image with 64 KB clusters needs exactly
160 KB of L2 metadata, but we need 192 KB on disk (3 clusters) even if
the last 32 KB of those are not going to be used.

However QEMU rounds the numbers down and only creates 2 cache tables
(128 KB), which is not enough for the image.

A quick test doing 4KB random writes on a 1280 MB image gives me
around 500 IOPS, while with the correct cache size I get 16K IOPS.

Cc: qemu-stable@nongnu.org
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit b70d08205b2e4044c529eefc21df2c8ab61b473b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

Revert "ide/ahci: Check for -ECANCELED in aio callbacks"

This reverts commit 0d910cfeaf2076b116b4517166d5deb0fea76394.

It's not correct to just ignore an error code in a callback; we need to
handle that error and possible report failure to the guest so that they
don't wait indefinitely for an operation that will now never finish.

This ought to help cases reported by Nutanix where iSCSI returns a
legitimate -ECANCELED for certain operations which should be propagated
normally.

Reported-by: Shaju Abraham <shaju.abraham@nutanix.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 20190729223605.7163-1-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit 8ec41c4265714255d5a138f8b538faf3583dcff6)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

block/backup: disable copy_range for compressed backup

Enabled by default copy_range ignores compress option. It's definitely
unexpected for user.

It's broken since introduction of copy_range usage in backup in
9ded4a011496.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20190730163251.755248-3-vsementsov@virtuozzo.com
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 110571be4e70ac015628e76d2731f96dd8d1998c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

iotests: Test unaligned blocking mirror write

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190805113526.20319-1-mreitz@redhat.com
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 19ba4651fe2d17cc49adae29acbb4a8cc29db8d1)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

mirror: Only mirror granularity-aligned chunks

In write-blocking mode, all writes to the top node directly go to the
target. We must only mirror chunks of data that are aligned to the
job's granularity, because that is how the dirty bitmap works.
Therefore, the request alignment for writes must be the job's
granularity (in write-blocking mode).

Unfortunately, this forces all reads and writes to have the same
granularity (we only need this alignment for writes to the target, not
the source), but that is something to be fixed another time.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190805153308.2657-1-mreitz@redhat.com
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Fixes: d06107ade0ce74dc39739bac80de84b51ec18546
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 9adc1cb49af8d4e54f57980b1eed5c0a4b2dafa6)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

iotests: Test incremental backup after truncation

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190805152840.32190-1-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 8a9cb864086269af14bbd13f395472703cf99f8c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

util/hbitmap: update orig_size on truncate

Without this, hbitmap_next_zero and hbitmap_next_dirty_area are broken
after truncate. So, orig_size is broken since it's introduction in
76d570dc495c56bb.

Fixes: 76d570dc495c56bb
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20190805120120.23585-1-vsementsov@virtuozzo.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 4e4de222799634d8159ee7b9303b9e1b45c6be2c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

iotests: Test backup job with two guest writes

Perform two guest writes to not yet backed up areas of an image, where
the former touches an inner area of the latter.

Before HEAD^, copy offloading broke this in two ways:
(1) The target image differs from the reference image (what the source
    was when the backup started).
(2) But you will not see that in the failing output, because the job
    offset is reported as being greater than the job length.  This is
    because one cluster is copied twice, and thus accounted for twice,
    but of course the job length does not increase.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190801173900.23851-3-mreitz@redhat.com
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Tested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 5f594a2e99f19ca0f7744d333bcd556f5976b78f)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

backup: Copy only dirty areas

The backup job must only copy areas that the copy_bitmap reports as
dirty. This is always the case when using traditional non-offloading
backup, because it copies each cluster separately. When offloading the
copy operation, we sometimes copy more than one cluster at a time, but
we only check whether the first one is dirty.

Therefore, whenever copy offloading is possible, the backup job
currently produces wrong output when the guest writes to an area of
which an inner part has already been backed up, because that inner part
will be re-copied.

Fixes: 9ded4a0114968e98b41494fc035ba14f84cdf700
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20190801173900.23851-2-mreitz@redhat.com
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 4a5b91ca024fc6fd87021c54655af76a35f2ef1e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

block/backup: refactor: split out backup_calculate_cluster_size

Split out cluster_size calculation. Move copy-bitmap creation above
block-job creation, as we are going to share it with upcoming
backup-top filter, which also should be created before actual block job
creation.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20190429090842.57910-6-vsementsov@virtuozzo.com
[mreitz: Dropped a paragraph from the commit message that was left over
from a previous version]
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit ae6b12fa4cf7d54add35531c790aaf2bd6d833f3)
*prereq for 110571be4e
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

block/backup: unify different modes code path

Do full, top and incremental mode copying all in one place. This
unifies the code path and helps further improvements.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190429090842.57910-5-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit c334e897d08eea1f5a3a95f6a2208afe6757c103)
*prereq for 110571be4e
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

block/backup: refactor and tolerate unallocated cluster skipping

Split allocation checking to separate function and reduce nesting.
Consider bdrv_is_allocated() fail as allocated area, as copying more
than needed is not wrong (and we do it anyway) and seems better than
fail the whole job. And, most probably we will fail on the next read,
if there are real problem with source.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190429090842.57910-4-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 9eb5a248f3e50c1f034bc6ff4b2f25c8c56515a5)
*prereq for 110571be4e
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

block/backup: move to copy_bitmap with granularity

We are going to share this bitmap between backup and backup-top filter
driver, so let's share something more meaningful. It also simplifies
some calculations.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190429090842.57910-3-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit a8389e315ef71913ae99cf8f3b1f89e84631f599)
*prereq for 4a5b91ca
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

block/backup: simplify backup_incremental_init_copy_bitmap

Simplify backup_incremental_init_copy_bitmap using the function
bdrv_dirty_bitmap_next_dirty_area.

Note: move to job->len instead of bitmap size: it should not matter but
less code.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190429090842.57910-2-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit c2da3413c021398152e98022261bb1643276a2fe)
*prereq for 4a5b91ca
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

tpm_emulator: Translate TPM error codes to strings

Implement a function to translate TPM error codes to strings so that
at least the most common error codes can be translated to human
readable strings.

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
(cherry picked from commit 7e095e84ba0b7c0a1ac45bc6824dace2fd352e56)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

tpm: Exit in reset when backend indicates failure

Exit() in the frontend reset function when the backend indicates
intialization failure.

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
(cherry picked from commit bcfd16fe26d6bb6eabfd2dfb46b9fda59d5493db)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

i386/acpi: fix gint overflow in crs_range_compare

When very large regions (32GB sized in our case, PCI pass-through of GPUs)
are compared substraction result does not fit into gint.

As a result crs_replace_with_free_ranges does not get sorted ranges and
incorrectly computes PCI64 free space regions. Which then makes linux
guest complain about device and PCI64 hole intersection and device
becomes unusable.

Fix that by returning exactly fitting ranges.

Also fix indentation of an entire crs_replace_with_free_ranges to make
checkpatch happy.

Cc: qemu-stable@nongnu.org
Signed-off-by: Evgeny Yakovlev <wrfsh@yandex-team.ru>
Message-Id: <1563466463-26012-1-git-send-email-wrfsh@yandex-team.ru>
Signed-off-by: Evgeny Yakovlev <wrfsh@yandex-team.ru>
(cherry picked from commit 21e2acd583126db94f6d881005cd58e835160582)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

virtio-balloon: free pbp more aggressively

Previous patches switched to a temporary pbp but that does not go far
enough: after device uses a buffer, guest is free to reuse it, so
tracking the page and freeing it later is wrong.

Free and reset the pbp after we push each element.

Fixes: ed48c59875b6 ("virtio-balloon: Safely handle BALLOON_PAGE_SIZE < host page size")
Cc: qemu-stable@nongnu.org #v4.0.0
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 1b47b37c33ec01ae1efc527f4c97f97f93723bc4)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

virtio-balloon: don't track subpages for the PBP

As ramblocks cannot get removed/readded while we are processing a bulk
of inflation requests, there is no more need to track the page size
in form of the number of subpages.

Suggested-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190725113638.4702-8-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 9a7ca8a7c920360db9dcaf616ca6f1440c025043)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

virtio-balloon: Use temporary PBP only

We still have multiple issues in the current code
- The PBP is not freed during unrealize()
- The PBP is not reset on device resets: After a reset, the PBP is stale.
- We are not indicating VIRTIO_BALLOON_F_MUST_TELL_HOST, therefore
guests (esp. legacy guests) will reuse pages without deflating,
turning the PBP stale. Adding that would require compat handling.

Instead, let's use the PBP only temporarily, when processing one bulk of
inflation requests. This will keep guest_page_size > 4k working (with
Linux guests). There is nothing to do for deflation requests anymore.
The pbp is only used for a limited amount of time.

Fixes: ed48c59875b6 ("virtio-balloon: Safely handle BALLOON_PAGE_SIZE < host page size")
Cc: qemu-stable@nongnu.org #v4.0.0
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190722134108.22151-7-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit a8cd64d488325f3be5c4ddec4bf07efb3b8c7330)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

virtio-balloon: Rework pbp tracking data

Using the address of a RAMBlock to test for a matching pbp is not really
safe. Instead, let's use the guest physical address of the base page
along with the page size (via the number of subpages).

Also, let's allocate the bitmap separately. This makes the code
easier to read and maintain - we can reuse bitmap_new().

Prepare the code to move the PBP out of the device.

Fixes: ed48c59875b6 ("virtio-balloon: Safely handle BALLOON_PAGE_SIZE < host page size")
Fixes: b27b32391404 ("virtio-balloon: Fix possible guest memory corruption with inflates & deflates")
Cc: qemu-stable@nongnu.org #v4.0.0
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190722134108.22151-6-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 1c5cfc2b7153dd72bf4b8ddc456408eb2b9b66d8)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

virtio-balloon: Better names for offset variables in inflate/deflate code

"host_page_base" is really confusing, let's make this clearer, also
rename the other offsets to indicate to which base they apply.

offset -> mr_offset
ram_offset -> rb_offset
host_page_base -> rb_aligned_offset

While at it, use QEMU_ALIGN_DOWN() instead of a handcrafted computation
and move the computation to the place where it is needed.

Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190722134108.22151-5-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit e6129b271b9dccca22c84870e313c315f2c70063)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

virtio-balloon: Simplify deflate with pbp

Let's simplify this - the case we are optimizing for is very hard to
trigger and not worth the effort. If we're switching from inflation to
deflation, let's reset the pbp.

Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190722134108.22151-4-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 2ffc49eea1bbd454913a88a0ad872c2649b36950)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

virtio-balloon: Fix QEMU crashes on pagesize > BALLOON_PAGE_SIZE

We are using the wrong functions to set/clear bits, effectively touching
multiple bits, writing out of range of the bitmap, resulting in memory
corruptions. We have to use set_bit()/clear_bit() instead.

Can easily be reproduced by starting a qemu guest on hugetlbfs memory,
inflating the balloon. QEMU crashes. This never could have worked
properly - especially, also pages would have been discarded when the
first sub-page would be inflated (the whole bitmap would be set).

While testing I realized, that on hugetlbfs it is pretty much impossible
to discard a page - the guest just frees the 4k sub-pages in random order
most of the time. I was only able to discard a hugepage a handful of
times - so I hope that now works correctly.

Fixes: ed48c59875b6 ("virtio-balloon: Safely handle BALLOON_PAGE_SIZE < host page size")
Fixes: b27b32391404 ("virtio-balloon: Fix possible guest memory corruption with inflates & deflates")
Cc: qemu-stable@nongnu.org #v4.0.0
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190722134108.22151-3-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 483f13524bb2a08b7ff6a7560b846564ed3b0c33)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

virtio-balloon: Fix wrong sign extension of PFNs

If we directly cast from int to uint64_t, we will first sign-extend to
an int64_t, which is wrong. We actually want to treat the PFNs like
unsigned values.

As far as I can see, this dates back to the initial virtio-balloon
commit, but wasn't triggered as fairly big guests would be required.

Cc: qemu-stable@nongnu.org
Reported-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190722134108.22151-2-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit ffa207d08253ffffb3993a1dbe09e40af4fc91f1)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

i386/acpi: show PCI Express bus on pxb-pcie expanders

Show PCIe host bridge PNP id with PCI host bridge as a compatible id
when expanding a pcie bus.

Cc: qemu-stable@nongnu.org
Signed-off-by: Evgeny Yakovlev <wrfsh@yandex-team.ru>
Message-Id: <1563526469-15588-1-git-send-email-wrfsh@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit ee4b0c8686f781987879508d7c6dd605b5435bac)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

ioapic: kvm: Skip route updates for masked pins

Masked entries will not generate interrupt messages, thus do no need to
be routed by KVM. This is a cosmetic cleanup, just avoiding warnings of
the kind

qemu-system-x86_64: vtd_irte_get: detected non-present IRTE (index=0, high=0xff00, low=0x100)

if the masked entry happens to reference a non-present IRTE.

Cc: qemu-stable@nongnu.org
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Message-Id: <a84b7e03-f9a8-b577-be27-4d93d1caa1c9@siemens.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
(cherry picked from commit be1927c97e564346cbd409cb17fe611df74b84e5)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

hw/ssi/xilinx_spips: Avoid out-of-bound access to lqspi_buf[]

Both lqspi_read() and lqspi_load_cache() expect a 32-bit
aligned address.

>From UG1085 datasheet [*] chapter on 'Quad-SPI Controller':

  Transfer Size Limitations

    Because of the 32-bit wide TX, RX, and generic FIFO, all
    APB/AXI transfers must be an integer multiple of 4-bytes.
    Shorter transfers are not possible.

Set MemoryRegionOps.impl values to force 32-bit accesses,
this way we are sure we do not access the lqspi_buf[] array
out of bound.

[*] https://www.xilinx.com/support/documentation/user_guides/ug1085-zynq-ultrascale-trm.pdf

Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Tested-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 526668c734e6a07f2fedfd378840a61b70c1cbab)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

hw/ssi/xilinx_spips: Avoid AXI writes to the LQSPI linear memory

Lei Sun found while auditing the code that a CPU write would
trigger a NULL pointer dereference.

>From UG1085 datasheet [*] AXI writes in this region are ignored
and generates an AXI Slave Error (SLVERR).

Fix by implementing the write_with_attrs() handler.
Return MEMTX_ERROR when the region is accessed (this error maps
to an AXI slave error).

[*] https://www.xilinx.com/support/documentation/user_guides/ug1085-zynq-ultrascale-trm.pdf

Reported-by: Lei Sun <slei.casper@gmail.com>
Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Tested-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 936a236c4e4b1068ade99220260cd04f68eb0212)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

hw/ssi/xilinx_spips: Convert lqspi_read() to read_with_attrs

In the next commit we will implement the write_with_attrs()
handler. To avoid using different APIs, convert the read()
handler first.

Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Tested-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 5937bd50d3841b6ab2592c1ff4233448762a8483)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

docs/bitmaps: use QMP lexer instead of json

The annotated style json we use in QMP documentation is not strict json
and depending on the version of Sphinx (2.0+) or Pygments installed,
might cause the build to fail.

Use the new QMP lexer.

Further, some versions of Sphinx can not apply custom lexers to "code"
directives and require the use of "code-block" directives instead, so
make that change at this time as well.

Tested under:
- Sphinx 1.3.6 and Pygments 2.4
- Sphinx 1.7.6 and Pygments 2.2 (Fedora 29 packages)
- Sphinx 2.0.1 and Pygments 2.4
- Sphinx 3.0.0+/f396b3a783 and Pygments 2.4 (From Sphinx git c4f44bdd)

Reported-by: Aarushi Mehta <mehta.aaru20@gmail.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-id: 20190603214653.29369-4-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit a7786bfb0effe0b4b0fc61d8a8cd307c0b739ed7)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

sphinx: add qmp_lexer

Sphinx, through Pygments, does not like annotated json examples very
much. In some versions of Sphinx (1.7), it will render the non-json
portions of code blocks in red, but in newer versions (2.0) it will
throw an exception and not highlight the block at all. Though we can
suppress this warning, it doesn't bring back highlighting on non-strict
json blocks.

We can alleviate this by creating a custom lexer for QMP examples that
allows us to properly highlight these examples in a robust way, keeping
our directionality and elision notations.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reported-by: Aarushi Mehta <mehta.aaru20@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20190603214653.29369-3-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit cd231e13bdcb8d686b014bef940c7d19c6f1e769)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

docs/interop/bitmaps.rst: Fix typos

Pygments and Sphinx get pickier all the time; Sphinx 2.1+ now catches
these errors.

Signed-off-by: John Snow <jsnow@redhat.com>
Reported-by: Aarushi Mehta <mehta.aaru20@gmail.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20190603214653.29369-2-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit 575e6226287072bd0d6eb85d9712d280eb29c392)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

virtio-balloon: fix QEMU 4.0 config size migration incompatibility

The virtio-balloon config size changed in QEMU 4.0 even for existing
machine types.  Migration from QEMU 3.1 to 4.0 can fail in some
circumstances with the following error:

  qemu-system-x86_64: get_pci_config_device: Bad config data: i=0x10 read: a1 device: 1 cmask: ff wmask: c0 w1cmask:0

This happens because the virtio-balloon config size affects the VIRTIO
Legacy I/O Memory PCI BAR size.

Introduce a qdev property called "qemu-4-0-config-size" and enable it
only for the QEMU 4.0 machine types.  This way <4.0 machine types use
the old size, 4.0 uses the larger size, and >4.0 machine types use the
appropriate size depending on enabled virtio-balloon features.

Live migration to and from old QEMUs to QEMU 4.1 works again as long as
a versioned machine type is specified (do not use just "pc"!).

Originally-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20190710141440.27635-1-stefanha@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Tested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Tested-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 2bbadb08ce272d65e1f78621002008b07d1e0f03)
Conflicts:
hw/core/machine.c
* drop context dep. on 0a71966253c
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

usbredir: fix buffer-overflow on vmload

If interface_count is NO_INTERFACE_INFO, let's not access the arrays
out-of-bounds.

==994==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x625000243930 at pc 0x5642068086a8 bp 0x7f0b6f9ffa50 sp 0x7f0b6f9ffa40
READ of size 1 at 0x625000243930 thread T0
    #0 0x5642068086a7 in usbredir_check_bulk_receiving /home/elmarco/src/qemu/hw/usb/redirect.c:1503
    #1 0x56420681301c in usbredir_post_load /home/elmarco/src/qemu/hw/usb/redirect.c:2154
    #2 0x5642068a56c2 in vmstate_load_state /home/elmarco/src/qemu/migration/vmstate.c:168
    #3 0x56420688e2ac in vmstate_load /home/elmarco/src/qemu/migration/savevm.c:829
    #4 0x5642068980cb in qemu_loadvm_section_start_full /home/elmarco/src/qemu/migration/savevm.c:2211
    #5 0x564206899645 in qemu_loadvm_state_main /home/elmarco/src/qemu/migration/savevm.c:2395
    #6 0x5642068998cf in qemu_loadvm_state /home/elmarco/src/qemu/migration/savevm.c:2467
    #7 0x56420685f3e9 in process_incoming_migration_co /home/elmarco/src/qemu/migration/migration.c:449
    #8 0x564207106c47 in coroutine_trampoline /home/elmarco/src/qemu/util/coroutine-ucontext.c:115
    #9 0x7f0c0604e37f  (/lib64/libc.so.6+0x4d37f)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190807084048.4258-1-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 7b84b90966568da0e05655ecaa78c209300aae6e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

virtio-pci: fix missing device properties

Since commit a4ee4c8baa37154 ("virtio: Helper for registering virtio
device types"), virtio-gpu-pci, virtio-vga, and virtio-crypto-pci lost
some properties: "ioeventfd" and "vectors". This may cause various
issues, such as failing migration or invalid properties.

Since those VirtioPCI devices do not have a base name, their class are
initialized with virtio_pci_generic_base_class_init(). However, if the
VirtioPCIDeviceTypeInfo provided a class_init which sets dc->props,
the properties were overwritten by virtio_pci_generic_class_init().

Instead, introduce an intermediary base-type to register the generic
properties.

Fixes: a4ee4c8baa37154f42b4dc6a13fee79268d15238
Cc: qemu-stable@nongnu.org
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20190625232333.30752-1-marcandre.lureau@redhat.com>
(cherry picked from commit 683c1d89efd1eeb111c129a9a91f629b94d90d45)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

docs: recommend use of md-clear feature on all Intel CPUs

Update x86 CPU model guidance to recommend that the md-clear feature is
manually enabled with all Intel CPU models, when supported by the host
microcode.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20190515141011.5315-3-berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit 2c7e82a30774730100da9dbe68d2360459030d91)
Signed-off-by: Oguz Bektas <o.bektas@proxmox.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

target/i386: define md-clear bit

md-clear is a new CPUID bit which is set when microcode provides the
mechanism to invoke a flush of various exploitable CPU buffers by invoking
the VERW instruction.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20190515141011.5315-2-berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit b2ae52101fca7f9547ac2f388085dbc58f8fe1c0)
Signed-off-by: Oguz Bektas <o.bektas@proxmox.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

target/i386: add MDS-NO feature

Microarchitectural Data Sampling is a hardware vulnerability which allows
unprivileged speculative access to data which is available in various CPU
internal buffers.

Some Intel processors use the ARCH_CAP_MDS_NO bit in the
IA32_ARCH_CAPABILITIES
MSR to report that they are not vulnerable, make it available to guests.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20190516185320.28340-1-pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit 20140a82c67467f53814ca197403d5e1b561a5e5)
Signed-off-by: Oguz Bektas <o.bektas@proxmox.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

vl: Fix -drive / -blockdev persistent reservation management

qemu-system-FOO's main() acts on command line arguments in its own
idiosyncratic order.  There's not much method to its madness.
Whenever we find a case where one kind of command line argument needs
to refer to something created for another kind later, we rejigger the
order.

Recent commit cda4aa9a5a "vl: Create block backends before setting
machine properties" was such a rejigger.  Block backends are now
created before "delayed" objects.  This broke persistent reservation
management.  Reproducer:

    $ qemu-system-x86_64 -object pr-manager-helper,id=pr-helper0,path=/tmp/pr-helper0.sock-drive -drive file=/dev/mapper/crypt,file.pr-manager=pr-helper0,format=raw,if=none,id=drive-scsi0-0-0-2
    qemu-system-x86_64: -drive file=/dev/mapper/crypt,file.pr-manager=pr-helper0,format=raw,if=none,id=drive-scsi0-0-0-2: No persistent reservation manager with id 'pr-helper0'

The delayed pr-manager-helper object is created too late for use by
-drive or -blockdev.  Normal objects are still created in time.

pr-manager-helper has always been a delayed object (commit 7c9e527659
"scsi, file-posix: add support for persistent reservation
management").  Turns out there's no real reason for that.  Make it a
normal object.

Fixes: cda4aa9a5a08777cf13e164c0543bd4888b8adce
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190604151251.9903-2-armbru@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 9ea18ed25a36527167e9676f25d983df5e7f76e6)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

q35: Revert to kernel irqchip

Backport of QEMU v4.1 commit for stable v4.0.1 release

commit c87759ce876a7a0b17c2bf4f0b964bd51f0ee871
Author: Alex Williamson <alex.williamson@redhat.com>
Date:   Tue May 14 14:14:41 2019 -0600

    q35: Revert to kernel irqchip

    Commit b2fc91db8447 ("q35: set split kernel irqchip as default") changed
    the default for the pc-q35-4.0 machine type to use split irqchip, which
    turned out to have disasterous effects on vfio-pci INTx support.  KVM
    resampling irqfds are registered for handling these interrupts, but
    these are non-functional in split irqchip mode.  We can't simply test
    for split irqchip in QEMU as userspace handling of this interrupt is a
    significant performance regression versus KVM handling (GeForce GPUs
    assigned to Windows VMs are non-functional without forcing MSI mode or
    re-enabling kernel irqchip).

    The resolution is to revert the change in default irqchip mode in the
    pc-q35-4.1 machine and create a pc-q35-4.0.1 machine for the 4.0-stable
    branch.  The qemu-q35-4.0 machine type should not be used in vfio-pci
    configurations for devices requiring legacy INTx support without
    explicitly modifying the VM configuration to use kernel irqchip.

Link: https://bugs.launchpad.net/qemu/+bug/1826422
Fixes: b2fc91db8447 ("q35: set split kernel irqchip as default")
Cc: qemu-stable@nongnu.org
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
(upstream commit c87759ce876a7a0b17c2bf4f0b964bd51f0ee871)
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
*add comments regarding AML mismatch warnings from
tests/bios-tables-test.c
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

target/ppc: Fix lxvw4x, lxvh8x and lxvb16x

During the conversion these instructions were incorrectly treated as
stores. We need to use set_cpu_vsr* and not get_cpu_vsr*.

Fixes: 8b3b2d75c7c0 ("introduce get_cpu_vsr{l,h}() and set_cpu_vsr{l,h}() helpers for VSR register access")
Signed-off-by: Anton Blanchard <anton@ozlabs.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <20190524065345.25591-1-mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(upstream commit 2a1224359008e23b051b7b45be4789afa0269f8c)
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

target/ppc: Fix vsum2sws

A recent cleanup changed the pre zeroing of the result from 64 bit
to 32 bit operations:

- result.u64[i] = 0;
+ result.VsrW(i) = 0;

This corrupts the result.

Fixes: 60594fea298d ("target/ppc: remove various HOST_WORDS_BIGENDIAN hacks in int_helper.c")
Signed-off-by: Anton Blanchard <anton@ozlabs.org>
Message-Id: <20190507004811.29968-9-anton@ozlabs.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(upstream commit 7fa0ddc1d63806769d1b6246a62708d3bde39037)
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

target/ppc: Fix xxbrq, xxbrw

Fix a typo in xxbrq and xxbrw where we put both results into the lower
doubleword.

Fixes: 8b3b2d75c7c0 ("introduce get_cpu_vsr{l,h}() and set_cpu_vsr{l,h}() helpers for VSR register access")
Signed-off-by: Anton Blanchard <anton@ozlabs.org>
Message-Id: <20190507004811.29968-3-anton@ozlabs.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(upstream commit d47a751adab7833e9831408376077bc8dba41d5d)
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

target/ppc: Fix xvxsigdp

Fix a typo in xvxsigdp where we put both results into the lower
doubleword.

Fixes: dd977e4f45cb ("target/ppc: Optimize x[sv]xsigdp using deposit_i64()")
Signed-off-by: Anton Blanchard <anton@ozlabs.org>
Message-Id: <20190507004811.29968-1-anton@ozlabs.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(upstream commit cf4e9363f7fd889d8d804c8f78e8927782c2aa48)
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

target/ppc: Fix xvabs[sd]p, xvnabs[sd]p, xvneg[sd]p, xvcpsgn[sd]p

We were using set_cpu_vsr*() when we should have used get_cpu_vsr*().

Fixes: 8b3b2d75c7c0 ("introduce get_cpu_vsr{l,h}() and set_cpu_vsr{l,h}() helpers for VSR register access")
Signed-off-by: Anton Blanchard <anton@ozlabs.org>
Message-Id: <20190509104912.6b754dff@kryten>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(upstream commit 77bd8937c03dd55e57cc257951ad07c185303c3e)
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

vhost: fix vhost_log size overflow during migration

When a guest which doesn't support multiqueue is migrated with a multi queues
vhost-user-blk deivce, a crash will occur like:

0 qemu_memfd_alloc (name=<value optimized out>, size=562949953421312, seals=<value optimized out>, fd=0x7f87171fe8b4, errp=0x7f87171fe8a8) at util/memfd.c:153
1 0x00007f883559d7cf in vhost_log_alloc (size=70368744177664, share=true) at hw/virtio/vhost.c:186
2 0x00007f88355a0758 in vhost_log_get (listener=0x7f8838bd7940, enable=1) at qemu-2-12/hw/virtio/vhost.c:211
3 vhost_dev_log_resize (listener=0x7f8838bd7940, enable=1) at hw/virtio/vhost.c:263
4 vhost_migration_log (listener=0x7f8838bd7940, enable=1) at hw/virtio/vhost.c:787
5 0x00007f88355463d6 in memory_global_dirty_log_start () at memory.c:2503
6 0x00007f8835550577 in ram_init_bitmaps (f=0x7f88384ce600, opaque=0x7f8836024098) at migration/ram.c:2173
7 ram_init_all (f=0x7f88384ce600, opaque=0x7f8836024098) at migration/ram.c:2192
8 ram_save_setup (f=0x7f88384ce600, opaque=0x7f8836024098) at migration/ram.c:2219
9 0x00007f88357a419d in qemu_savevm_state_setup (f=0x7f88384ce600) at migration/savevm.c:1002
10 0x00007f883579fc3e in migration_thread (opaque=0x7f8837530400) at migration/migration.c:2382
11 0x00007f8832447893 in start_thread () from /lib64/libpthread.so.0
12 0x00007f8832178bfd in clone () from /lib64/libc.so.6

This is because vhost_get_log_size() returns a overflowed vhost-log size.
In this function, it uses the uninitialized variable vqs->used_phys and
vqs->used_size to get the vhost-log size.

Signed-off-by: Li Hangjing <lihangjing@baidu.com>
Reviewed-by: Xie Yongji <xieyongji@baidu.com>
Reviewed-by: Chai Wen <chaiwen@baidu.com>
Message-Id: <20190603061524.24076-1-lihangjing@baidu.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 240e647a14df9677b3a501f7b8b870e40aac3fd5)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

migration/dirty-bitmaps: change bitmap enumeration method

Shift from looking at every root BDS to *every* BDS. This will migrate
bitmaps that are attached to blockdev created nodes instead of just ones
attached to emulated storage devices.

Note that this will not migrate anonymous or internal-use bitmaps, as
those are defined as having no name.

This will also fix the Coverity issues Peter Maydell has been asking
about for the past several releases, as well as fixing a real bug.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Reported-by: Coverity 😅
Reported-by: aihua liang <aliang@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 20190514201926.10407-1-jsnow@redhat.com
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1652490
Fixes: Coverity CID 1390625
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit 592203e7cfbd1ad261178431fcf390adfe8b16df)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

iotests: add iotest 256 for testing blockdev-backup across iothread contexts

Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 20190523170643.20794-6-jsnow@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
[mreitz: Moved from 250 to 256]
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit ba7704f2228f16ed61b9903801e28e17666c7e38)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

iotests.py: rewrite run_job to be pickier

Don't pull events out of the queue that don't belong to us;
be choosier so that we can use this method to drive jobs that
were launched by transactions that may have more jobs.

Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 20190523170643.20794-5-jsnow@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit d6a79af0e641806d6bd6a42a4920e294b5db179c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

iotests.py: Fix VM.run_job

log() is in the current module, there is no need to prefix it. In fact,
doing so may make VM.run_job() unusable in tests that never use
iotests.log() themselves.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 86a4f599a67b9b709109c7a7c8b7eb91d21c21fd)
*prereq for d6a79af0e6
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

QEMUMachine: add events_wait method

Instead of event_wait which looks for a single event, add an events_wait
which can look for any number of events simultaneously. However, it
will still only return one at a time, whichever happens first.

Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 20190523170643.20794-4-jsnow@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit f6f4b3f045ea18e3fa93a50cd0462236c428d62e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

iotests.py: do not use infinite waits

Cap waits to 60 seconds so that iotests can fail gracefully if something
goes wrong.

Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 20190523170643.20794-3-jsnow@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 8b6f5f8b9f3bec5cbeebefab34bae0102a2581b3)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

iotests: Test commit job start with concurrent I/O

This tests that concurrent requests are correctly drained before making
graph modifications instead of running into assertions in
bdrv_replace_node().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit ac6fb43eae1f5029b51e0a3d975fe2111cc8b976)
Conflicts:
tests/qemu-iotests/group
*prereq for d81e1efb tests
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

block: Drain source node in bdrv_replace_node()

Instead of just asserting that no requests are in flight in
bdrv_replace_node(), which is a requirement that most callers ignore, we
can just drain the source node right there. This fixes at least starting
a commit job while I/O is active on the backing chain, but probably
other callers, too.

Having requests in flight on the target node isn't a problem because the
target just gets new parents, but the call path of running requests
isn't modified. So we can just drop this assertion without a replacement.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1711643
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit f871abd60f4b67547e62c57c9bec19420052be39)
*prereq for d81e1efb tests
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

blockdev-backup: don't check aio_context too early

in blockdev_backup_prepare, we check to make sure that the target is
associated with a compatible aio context. However, do_blockdev_backup is
called later and has some logic to move the target to a compatible
aio_context. The transaction version will fail certain commands
needlessly early as a result.

Allow blockdev_backup_prepare to simply call do_blockdev_backup, which
will ultimately decide if the contexts are compatible or not.

Note: the transaction version has always disallowed this operation since
its initial commit bd8baecd (2014), whereas the version of
qmp_blockdev_backup at the time, from commit c29c1dd312f, tried to
enforce the aio_context switch instead. It's not clear, and I can't see
from the mailing list archives at the time, why the two functions take a
different approach. It wasn't until later in efd7556708b (2016) that the
standalone version tried to determine if it could set the context or
not.

Reported-by: aihua liang <aliang@redhat.com>
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1683498
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 20190523170643.20794-2-jsnow@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit d81e1efbea7d19c2f142d300df56538c73800590)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

s390x/cpumodel: ignore csske for expansion

csske will be removed in a future machine. Ignore it for expanding the
cpu model. Otherwise qemu falls back to z9.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190429090250.7648-3-borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
(cherry picked from commit eaf6f642abf1d4d24791b70728d4068428fc4658)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

iotests: Test unaligned raw images with O_DIRECT

We already have 221 for accesses through the page cache, but it is
better to create a new file for O_DIRECT instead of integrating those
test cases into 221. This way, we can make use of
_supported_cache_modes (and _default_cache_mode) so the test is
automatically skipped on filesystems that do not support O_DIRECT.

As part of the split, add _supported_cache_modes to 221. With that, it
no longer fails when run with -c none or -c directsync.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 2fab30c80b33cdc6157c7efe6207e54b6835cf92)
Conflicts:
tests/qemu-iotests/group
*fix context deps on test groups not in 4.0

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

block/file-posix: Unaligned O_DIRECT block-status

Currently, qemu crashes whenever someone queries the block status of an
unaligned image tail of an O_DIRECT image:
$ echo > foo
$ qemu-img map --image-opts driver=file,filename=foo,cache.direct=on
Offset          Length          Mapped to       File
qemu-img: block/io.c:2093: bdrv_co_block_status: Assertion `*pnum &&
QEMU_IS_ALIGNED(*pnum, align) && align > offset - aligned_offset'
failed.

This is because bdrv_co_block_status() checks that the result returned
by the driver's implementation is aligned to the request_alignment, but
file-posix can fail to do so, which is actually mentioned in a comment
there: "[...] possibly including a partial sector at EOF".

Fix this by rounding up those partial sectors.

There are two possible alternative fixes:
(1) We could refuse to open unaligned image files with O_DIRECT
    altogether.  That sounds reasonable until you realize that qcow2
    does necessarily not fill up its metadata clusters, and that nobody
    runs qemu-img create with O_DIRECT.  Therefore, unpreallocated qcow2
    files usually have an unaligned image tail.

(2) bdrv_co_block_status() could ignore unaligned tails.  It actually
    throws away everything past the EOF already, so that sounds
    reasonable.
    Unfortunately, the block layer knows file lengths only with a
    granularity of BDRV_SECTOR_SIZE, so bdrv_co_block_status() usually
    would have to guess whether its file length information is inexact
    or whether the driver is broken.

Fixing what raw_co_block_status() returns is the safest thing to do.

There seems to be no other block driver that sets request_alignment and
does not make sure that it always returns aligned values.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 9c3db310ff0b7473272ae8dce5e04e2f8a825390)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

usb-tablet: fix serial compat property

s/kbd/tablet/, fixes cut+paste bug.

Cc: qemu-stable@nongnu.org
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190520081805.15019-1-kraxel@redhat.com
(cherry picked from commit 442bac16a6cd708a9f87adb0a263f9d833f03ed5)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

kbd-state: fix autorepeat handling

When allowing multiple down-events in a row (key autorepeat) we can't
use change_bit() any more to update the state, because autorepeat events
don't change the key state. We have to explicitly use set_bit() and
clear_bit() instead.

Cc: qemu-stable@nongnu.org
Fixes: 35921860156e kbd-state: don't block auto-repeat events
Buglink: https://bugs.launchpad.net/qemu/+bug/1828272
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20190514042443.10735-1-kraxel@redhat.com
(cherry picked from commit 5fff13f245cddd3bc260dfe6ebe1b1f05b72116f)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

spapr/xive: fix EQ page addresses above 64GB

The high order bits of the address of the OS event queue is stored in
bits [4-31] of word2 of the XIVE END internal structures and the low
order bits in word3. This structure is using Big Endian ordering and
computing the value requires some simple arithmetic which happens to
be wrong. The mask removing bits [0-3] of word2 is applied to the
wrong value and the resulting address is bogus when above 64GB.

Guests with more than 64GB of RAM will allocate pages for the OS event
queues which will reside above the 64GB limit. In this case, the XIVE
device model will wake up the CPUs in case of a notification, such as
IPIs, but the update of the event queue will be written at the wrong
place in memory. The result is uncertain as the guest memory is
trashed and IPI are not delivered.

Introduce a helper xive_end_qaddr() to compute this value correctly in
all places where it is used.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190508171946.657-3-clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 13df93244efbd4bb8b4cf4e26104a26033178674)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

docs/interop/bitmaps: rewrite and modernize doc

This just about rewrites the entirety of the bitmaps.rst document to
make it consistent with the 4.0 release. I have added new features seen
in the 4.0 release, as well as tried to clarify some points that keep
coming up when discussing this feature both in-house and upstream.

It does not yet cover pull backups or migration details, but I intend to
keep extending this document to cover those cases.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20190426221528.30293-3-jsnow@redhat.com
[Adjusted commit message. --js]
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit 90edef80a0852cf8a3d2668898ee40e8970e4314)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

Makefile: add nit-picky mode to sphinx-build

If we add references that don't resolve (or accidentally remove them),
it will be helpful to have warning messages alerting us to that.

Further, turn those warnings into errors so we can be alerted to these
problems sooner rather than later.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20190426221528.30293-2-jsnow@redhat.com
[adjusted commit message. --js]
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit 9e5b6cb87db66dfb606604fe6cf40e5ddf1ef0e7)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

cutils: Fix size_to_str() on 32-bit platforms

When extracting a human-readable size formatter, we changed 'uint64_t
div' pre-patch to 'unsigned long div' post-patch. Which breaks on
32-bit platforms, resulting in 'inf' instead of intended values larger
than 999GB.

Fixes: 22951aaa
CC: qemu-stable@nongnu.org
Reported-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 754da86714d550c3f995f11a2587395081362e0a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

block: Fix AioContext switch for bs->drv == NULL

Even for block nodes with bs->drv == NULL, we can't just ignore a
bdrv_set_aio_context() call. Leaving the node in its old context can
mean that it's still in an iothread context in bdrv_close_all() during
shutdown, resulting in an attempted unlock of the AioContext lock which
we don't hold.

This is an example stack trace of a related crash:

#0  0x00007ffff59da57f in raise () at /lib64/libc.so.6
#1  0x00007ffff59c4895 in abort () at /lib64/libc.so.6
#2  0x0000555555b97b1e in error_exit (err=<optimized out>, msg=msg@entry=0x555555d386d0 <__func__.19059> "qemu_mutex_unlock_impl") at util/qemu-thread-posix.c:36
#3  0x0000555555b97f7f in qemu_mutex_unlock_impl (mutex=mutex@entry=0x5555568002f0, file=file@entry=0x555555d378df "util/async.c", line=line@entry=507) at util/qemu-thread-posix.c:97
#4  0x0000555555b92f55 in aio_context_release (ctx=ctx@entry=0x555556800290) at util/async.c:507
#5  0x0000555555b05cf8 in bdrv_prwv_co (child=child@entry=0x7fffc80012f0, offset=offset@entry=131072, qiov=qiov@entry=0x7fffffffd4f0, is_write=is_write@entry=true, flags=flags@entry=0)
         at block/io.c:833
#6  0x0000555555b060a9 in bdrv_pwritev (qiov=0x7fffffffd4f0, offset=131072, child=0x7fffc80012f0) at block/io.c:990
#7  0x0000555555b060a9 in bdrv_pwrite (child=0x7fffc80012f0, offset=131072, buf=<optimized out>, bytes=<optimized out>) at block/io.c:990
#8  0x0000555555ae172b in qcow2_cache_entry_flush (bs=bs@entry=0x555556810680, c=c@entry=0x5555568cc740, i=i@entry=0) at block/qcow2-cache.c:51
#9  0x0000555555ae18dd in qcow2_cache_write (bs=bs@entry=0x555556810680, c=0x5555568cc740) at block/qcow2-cache.c:248
#10 0x0000555555ae15de in qcow2_cache_flush (bs=0x555556810680, c=<optimized out>) at block/qcow2-cache.c:259
#11 0x0000555555ae16b1 in qcow2_cache_flush_dependency (c=0x5555568a1700, c=0x5555568a1700, bs=0x555556810680) at block/qcow2-cache.c:194
#12 0x0000555555ae16b1 in qcow2_cache_entry_flush (bs=bs@entry=0x555556810680, c=c@entry=0x5555568a1700, i=i@entry=0) at block/qcow2-cache.c:194
#13 0x0000555555ae18dd in qcow2_cache_write (bs=bs@entry=0x555556810680, c=0x5555568a1700) at block/qcow2-cache.c:248
#14 0x0000555555ae15de in qcow2_cache_flush (bs=bs@entry=0x555556810680, c=<optimized out>) at block/qcow2-cache.c:259
#15 0x0000555555ad242c in qcow2_inactivate (bs=bs@entry=0x555556810680) at block/qcow2.c:2124
#16 0x0000555555ad2590 in qcow2_close (bs=0x555556810680) at block/qcow2.c:2153
#17 0x0000555555ab0c62 in bdrv_close (bs=0x555556810680) at block.c:3358
#18 0x0000555555ab0c62 in bdrv_delete (bs=0x555556810680) at block.c:3542
#19 0x0000555555ab0c62 in bdrv_unref (bs=0x555556810680) at block.c:4598
#20 0x0000555555af4d72 in blk_remove_bs (blk=blk@entry=0x5555568103d0) at block/block-backend.c:785
#21 0x0000555555af4dbb in blk_remove_all_bs () at block/block-backend.c:483
#22 0x0000555555aae02f in bdrv_close_all () at block.c:3412
#23 0x00005555557f9796 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4776

The reproducer I used is a qcow2 image on gluster volume, where the
virtual disk size (4 GB) is larger than the gluster volume size (64M),
so we can easily trigger an ENOSPC. This backend is assigned to a
virtio-blk device using an iothread, and then from the guest a
'dd if=/dev/zero of=/dev/vda bs=1G count=1' causes the VM to stop
because of an I/O error. qemu_gluster_co_flush_to_disk() sets
bs->drv = NULL on error, so when virtio-blk stops the dataplane, the
block nodes stay in the iothread AioContext. A 'quit' monitor command
issued from this paused state crashes the process.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1631227
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
(cherry picked from commit 1bffe1ae7a7b707c3a14ea2ccd00d3609d3ce4d8)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

qcow2: Fix qcow2_make_empty() with external data file

make_completely_empty() is an optimisated path for bdrv_make_empty()
where completely new metadata is created inside the image file instead
of going through all clusters and discarding them. For an external data
file, however, we actually need to do discard operations on the data
file; just overwriting the qcow2 file doesn't get rid of the data.

The necessary slow path with an explicit discard operation already
exists for other cases. Use it for external data files, too.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit db04524f820582ebf1189223b6378de238511da1)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

megasas: fix mapped frame size

the current value of 1024 bytes (16 * MFI_FRAME_SIZE) we map is not enough to hold
the maximum number of scatter gather elements we advertise. We actually need a
maximum of 2048 bytes. This is 128 max sg elements * 16 bytes (sizeof (union mfi_sgl)).

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Message-Id: <20190404121015.28634-1-pl@kamp.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 2e56fbc87f6ec3cd56c37b01d313abd502b80d61)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

qcow2: Fix full preallocation with external data file

preallocate_co() already gave the data file the full size without
forwarding the requested preallocation mode to the protocol. When
bdrv_co_truncate() was called later with the preallocation mode, the
file didn't actually grow any more, so the data file stayed unallocated
even if full preallocation was requested.

Pass the right preallocation mode to preallocate_co() and remove the
second bdrv_co_truncate() to fix this. As a side effect, the ugly
one-byte write in preallocate_co() is replaced with a truncate call,
now leaving the last block unallocated on the protocol level as it
should be.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 718c0fce2f56755a8d8f737607779a98aa6e7cc4)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

qcow2: Add errp to preallocate_co()

We'll add a bdrv_co_truncate() call in the next patch which can return
an Error that we don't want to discard. So add an errp parameter to
preallocate_co().

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 360bd07471dfd1830246e8403ffdc9ba9d82f9d4)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

qcow2: Avoid COW during metadata preallocation

Limiting the allocation to INT_MAX bytes isn't particularly clever
because it means that the final cluster will be a partial cluster which
will be completed through a COW operation. This results in unnecessary
data read and write requests which lead to an unwanted non-sparse
filesystem block for metadata preallocation.

Align the maximum allocation size down to the cluster size to avoid this
situation.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit f29fbf7c6b1c9a84f6931c1c222716fbe073e6e4)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

Update version for v4.0.0 release

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Update version for v4.0.0-rc4 release

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

usb-mtp: fix bounds check for guest provided filename

The ObjectInfo struct has a variable length array containing the UTF-16
encoded filename. The number of characters of trailing data is given by
the 'length' field in the struct and this must be validated against the
size of the data packet received from the guest.

Since the data is UTF-16, we must convert the byte count we have to a
character count before validating. This must take care to truncate if
a malicious guest sent an odd number of bytes.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer patches:

- qcow2: Fix potential corruption for preallocated resize with external data file

# gpg: Signature made Tue 16 Apr 2019 15:23:35 BST
# gpg:                using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream:
  qcow2: Fix preallocation bdrv_pwrite to wrong file

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

qcow2: Fix preallocation bdrv_pwrite to wrong file

With an external data file, preallocate_co() must write the final byte
to the external data file, not to the qcow2 image file.

This is harmless for preallocation of newly created images (only the
qcow2 file size is increased to the virtual disk size while it should be
much smaller), but with preallocated resize, it could in theory cause
visible corruption if the metadata of the image is larger than the data
(e.g. lots of bitmaps).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

socket: allow wait=false for client socket

Commit 767abe7 ("chardev: forbid 'wait' option with client sockets")
is a bit too strict. Current libvirt always set wait=false, and will
thus fail to add client chardev.

Make the code more permissive, allowing wait=false with client socket
chardevs. Deprecate usage of 'wait' with client sockets.

Fixes: 767abe7f49e8be14d29da5db3527817b5d696a52
Cc: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20190415163337.2795-1-marcandre.lureau@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/thibault/tags/samuel-thibault' into staging

Slirp updates

Dr. David Alan Gilbert (1):
  slirp: Gcc 9 -O3 fix

# gpg: Signature made Mon 15 Apr 2019 19:05:39 BST
# gpg:                using RSA key E61DBB15D4172BDEC97E92D9DB550E89F0FA54F3
# gpg: Good signature from "Samuel Thibault <samuel.thibault@aquilenet.fr>" [unknown]
# gpg:                 aka "Samuel Thibault <sthibault@debian.org>" [marginal]
# gpg:                 aka "Samuel Thibault <samuel.thibault@gnu.org>" [unknown]
# gpg:                 aka "Samuel Thibault <samuel.thibault@inria.fr>" [marginal]
# gpg:                 aka "Samuel Thibault <samuel.thibault@labri.fr>" [marginal]
# gpg:                 aka "Samuel Thibault <samuel.thibault@ens-lyon.org>" [marginal]
# gpg:                 aka "Samuel Thibault <samuel.thibault@u-bordeaux.fr>" [unknown]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 900C B024 B679 31D4 0F82  304B D017 8C76 7D06 9EE6
#      Subkey fingerprint: E61D BB15 D417 2BDE C97E  92D9 DB55 0E89 F0FA 54F3

* remotes/thibault/tags/samuel-thibault:
  slirp: Gcc 9 -O3 fix

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

slirp: Gcc 9 -O3 fix

Gcc 9 needs some convincing that sopreprbuf really is going to fill
in iov in the call from soreadbuf, even though the failure case
shouldn't happen.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190415121740.9881-1-dgilbert@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer patches:

- iotests fixes

# gpg: Signature made Fri 12 Apr 2019 17:04:09 BST
# gpg:                using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream:
  iotest: Fix 241 to run in generic directory
  iotests: Let 245 pass on tmpfs

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

iotest: Fix 241 to run in generic directory

Filter the qemu-nbd server output to get rid of a direct reference
to my build directory.

Fixes: e9dce9cb
Reported-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

iotests: Let 245 pass on tmpfs

tmpfs does not support O_DIRECT. Detect this case, and skip flipping
@direct if the filesystem does not support it.

Fixes: bf3e50f6239090e63a8ffaaec971671e66d88e07
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qemu-img: fix .hx and .texi disparity

It turns out that having options listed in three places continues to be
a bad idea. I'm still toying with the idea of an improved infrastructure
here, but in the meantime, another bandaid.

There are three locations:
(1) .hx file, formatted as texi
(2) .hx file, formatted as human readable.
(3) .texi file, as section headers, formatted as texi.

You can compare the two summaries within the .hx file like so:

Human-readable command summaries:
`./qemu-img --help | grep 'Command syntax' -A14`
Detokenized texi command summaries:
`grep "@item" qemu-img-cmds.hx | sed -E 's|@var\{([^\}]*?)\}|\1|g'`

You can compare the two separate texi summaries like so:

Texi command summaries:
`grep "@item" qemu-img-cmds.hx"`
Texi command headers:
grep -E "@item.*@var" qemu-img.texi | tail -14

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 20190409210655.777-1-jsnow@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

curses: fix wchar_t printf warning

On some systems wchar_t is "long int", on others just "int".
So go cast to "long int" and adjust the printf format accordingly.

Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190402073018.17747-1-kraxel@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-4.0-20190412' into staging

ppc patch queue for 2018-04-12

Here's a last minute pull request for 4.0.  Turns out my last pull
request, to fix a regression in extended config space access for the
pseries machine didn't fix things hard enough.  This PR has a single
patch which improves the fix to work in more cases.

It's a ghastly, ghastly hack, but it's simple and localized.  I
already have patches almost ready to go in 4.1 that provides a simpler
and cleaner solution to all this.

# gpg: Signature made Fri 12 Apr 2019 06:34:16 BST
# gpg:                using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full]
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full]
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full]
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown]
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-4.0-20190412:
  spapr_pci: Fix broken naming of PCI bus

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

spapr_pci: Fix broken naming of PCI bus

Recent commit 5cf0d326a0fe fixed a regression which was preventing the
guest to access the extended config space of a PCIe device. This was
done by introducing a new PCI bus subtype for PAPR. The original fix
was causing PCI busses to be named "spapr-pci-host-bridge-root-bus.N"
instead of "pci.N", which was making upper layers unhappy of course.
This got worked around by hardcoding the PCI bus name to "pci.0", but
this only works for the default PHB. And we're now hitting:

# qemu-system-ppc64 \
             -device spapr-pci-host-bridge,index=1 \
             -device e1000e,bus=pci.0 \
             -device e1000e,bus=pci.1
qemu-system-ppc64: -device e1000e,bus=pci.1: Bus 'pci.1' not found

David already posted some patches [1] to control PCI extended config
space accesses with a new flag in the base PCI bus class instead of
subtyping. These patches are a bit more intrusive though, and
are targetted for 4.1.

When no name is passed to pci_register_bus(), the core device code
generates a lowercase name based on the QOM typename. The typename
for the base PCI bus class is "PCI", hence the "pci.0", "pci.1"
bus names. Rename the type of the PAPR PCI bus to "pci", so that
the QOM code can generate proper names. This is a hack but it is
enough to fix the regression. And all this will be reworked properly
in 4.1.

[1] https://patchwork.ozlabs.org/project/qemu-devel/list/?series=100486

Fixes: 5cf0d326a0fe
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155500034416.646888.1307366522340665522.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

Update version for v4.0.0-rc3 release

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/alistair/tags/pull-device-tree-20190409-1' into staging

Single device tree fix for 4.0

A single patch to avoid an overflow when loading device trees.

# gpg: Signature made Wed 10 Apr 2019 00:52:16 BST
# gpg:                using RSA key F6C4AC46D4934868D3B8CE8F21E10D29DF977054
# gpg: Good signature from "Alistair Francis <alistair@alistair23.me>" [full]
# Primary key fingerprint: F6C4 AC46 D493 4868 D3B8  CE8F 21E1 0D29 DF97 7054

* remotes/alistair/tags/pull-device-tree-20190409-1:
  device_tree: Fix integer overflowing in load_device_tree()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

device_tree: Fix integer overflowing in load_device_tree()

If the value of get_image_size() exceeds INT_MAX / 2 - 10000, the
computation of @dt_size overflows to a negative number, which then
gets converted to a very large size_t for g_malloc0() and
load_image_size(). In the (fortunately improbable) case g_malloc0()
succeeds and load_image_size() survives, we'd assign the negative
number to *sizep. What that would do to the callers I can't say, but
it's unlikely to be good.

Fix by rejecting images whose size would overflow.

Reported-by: Kurtis Miller <kurtis.miller@nccgroup.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190409174018.25798-1-armbru@redhat.com>

migration/ram.c: Fix use-after-free in multifd_recv_unfill_packet()

Coverity points out (CID 1400442) that in this code:

    if (packet->pages_alloc > p->pages->allocated) {
        multifd_pages_clear(p->pages);
        multifd_pages_init(packet->pages_alloc);
    }

we free p->pages in multifd_pages_clear() but continue to
use it in the following code. We also leak memory, because
multifd_pages_init() returns the pointer to a new MultiFDPages_t
struct but we are ignoring its return value.

Fix both of these bugs by adding the missing assignment of
the newly created struct to p->pages.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-id: 20190409151830.6024-1-peter.maydell@linaro.org
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* fixes for Alpine and SuSE
* fix crash when hot-plugging nvdimm on older machine types

# gpg: Signature made Tue 09 Apr 2019 17:34:27 BST
# gpg:                using RSA key BFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream:
  tests: Make check-block a phony target
  hw/i386/pc: Fix crash when hot-plugging nvdimm on older machine types
  include/qemu/bswap.h: Use __builtin_memcpy() in accessor functions
  roms: Allow passing configure options to the EDK2 build tools
  roms: Rename the EFIROM variable to avoid clashing with iPXE

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

tests: Make check-block a phony target

Fixes: b93b63f574c "test makefile overhaul"
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20190319072104.32591-1-armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

hw/i386/pc: Fix crash when hot-plugging nvdimm on older machine types

QEMU currently crashes when you try to hot-plug an "nvdimm" device
on older machine types:

$ qemu-system-x86_64 -monitor stdio -M pc-1.1
QEMU 3.1.92 monitor - type 'help' for more information
(qemu) device_add nvdimm,id=nvdimmn1
qemu-system-x86_64: /home/thuth/devel/qemu/util/error.c:57: error_setv:
Assertion `*errp == ((void *)0)' failed.
Aborted (core dumped)

The call to hotplug_handler_pre_plug() in pc_memory_pre_plug() has been
added recently before the check whether nvdimm is enabled. It should
be done after the check. And while we're at it, also check the errp
after the hotplug_handler_pre_plug(), otherwise errors are silently
ignored here.

Fixes: 9040e6dfa8c3fed87695a3de555d2c775727bb51
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20190407092314.11066-1-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>