Li Hangjing [Mon, 3 Jun 2019 06:15:24 +0000 (14:15 +0800)]
vhost: fix vhost_log size overflow during migration
When a guest which doesn't support multiqueue is migrated with a multi queues
vhost-user-blk deivce, a crash will occur like:
0 qemu_memfd_alloc (name=<value optimized out>, size=562949953421312, seals=<value optimized out>, fd=0x7f87171fe8b4, errp=0x7f87171fe8a8) at util/memfd.c:153
1 0x00007f883559d7cf in vhost_log_alloc (size=70368744177664, share=true) at hw/virtio/vhost.c:186
2 0x00007f88355a0758 in vhost_log_get (listener=0x7f8838bd7940, enable=1) at qemu-2-12/hw/virtio/vhost.c:211
3 vhost_dev_log_resize (listener=0x7f8838bd7940, enable=1) at hw/virtio/vhost.c:263
4 vhost_migration_log (listener=0x7f8838bd7940, enable=1) at hw/virtio/vhost.c:787
5 0x00007f88355463d6 in memory_global_dirty_log_start () at memory.c:2503
6 0x00007f8835550577 in ram_init_bitmaps (f=0x7f88384ce600, opaque=0x7f8836024098) at migration/ram.c:2173
7 ram_init_all (f=0x7f88384ce600, opaque=0x7f8836024098) at migration/ram.c:2192
8 ram_save_setup (f=0x7f88384ce600, opaque=0x7f8836024098) at migration/ram.c:2219
9 0x00007f88357a419d in qemu_savevm_state_setup (f=0x7f88384ce600) at migration/savevm.c:1002
10 0x00007f883579fc3e in migration_thread (opaque=0x7f8837530400) at migration/migration.c:2382
11 0x00007f8832447893 in start_thread () from /lib64/libpthread.so.0
12 0x00007f8832178bfd in clone () from /lib64/libc.so.6
This is because vhost_get_log_size() returns a overflowed vhost-log size.
In this function, it uses the uninitialized variable vqs->used_phys and
vqs->used_size to get the vhost-log size.
Signed-off-by: Li Hangjing <lihangjing@baidu.com> Reviewed-by: Xie Yongji <xieyongji@baidu.com> Reviewed-by: Chai Wen <chaiwen@baidu.com>
Message-Id: <20190603061524.24076-1-lihangjing@baidu.com> Cc: qemu-stable@nongnu.org Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 240e647a14df9677b3a501f7b8b870e40aac3fd5) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Shift from looking at every root BDS to *every* BDS. This will migrate
bitmaps that are attached to blockdev created nodes instead of just ones
attached to emulated storage devices.
Note that this will not migrate anonymous or internal-use bitmaps, as
those are defined as having no name.
This will also fix the Coverity issues Peter Maydell has been asking
about for the past several releases, as well as fixing a real bug.
Reported-by: Peter Maydell <peter.maydell@linaro.org> Reported-by: Coverity đ Reported-by: aihua liang <aliang@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 20190514201926.10407-1-jsnow@redhat.com Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1652490 Fixes: Coverity CID 1390625 CC: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit 592203e7cfbd1ad261178431fcf390adfe8b16df) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
John Snow [Thu, 23 May 2019 17:06:43 +0000 (13:06 -0400)]
iotests: add iotest 256 for testing blockdev-backup across iothread contexts
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 20190523170643.20794-6-jsnow@redhat.com Reviewed-by: Max Reitz <mreitz@redhat.com>
[mreitz: Moved from 250 to 256] Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit ba7704f2228f16ed61b9903801e28e17666c7e38) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
John Snow [Thu, 23 May 2019 17:06:42 +0000 (13:06 -0400)]
iotests.py: rewrite run_job to be pickier
Don't pull events out of the queue that don't belong to us;
be choosier so that we can use this method to drive jobs that
were launched by transactions that may have more jobs.
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 20190523170643.20794-5-jsnow@redhat.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit d6a79af0e641806d6bd6a42a4920e294b5db179c) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Max Reitz [Wed, 15 May 2019 20:15:02 +0000 (22:15 +0200)]
iotests.py: Fix VM.run_job
log() is in the current module, there is no need to prefix it. In fact,
doing so may make VM.run_job() unusable in tests that never use
iotests.log() themselves.
Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 86a4f599a67b9b709109c7a7c8b7eb91d21c21fd)
*prereq for d6a79af0e6 Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
John Snow [Thu, 23 May 2019 17:06:41 +0000 (13:06 -0400)]
QEMUMachine: add events_wait method
Instead of event_wait which looks for a single event, add an events_wait
which can look for any number of events simultaneously. However, it
will still only return one at a time, whichever happens first.
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 20190523170643.20794-4-jsnow@redhat.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit f6f4b3f045ea18e3fa93a50cd0462236c428d62e) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
John Snow [Thu, 23 May 2019 17:06:40 +0000 (13:06 -0400)]
iotests.py: do not use infinite waits
Cap waits to 60 seconds so that iotests can fail gracefully if something
goes wrong.
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 20190523170643.20794-3-jsnow@redhat.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 8b6f5f8b9f3bec5cbeebefab34bae0102a2581b3) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Kevin Wolf [Tue, 21 May 2019 18:35:52 +0000 (20:35 +0200)]
iotests: Test commit job start with concurrent I/O
This tests that concurrent requests are correctly drained before making
graph modifications instead of running into assertions in
bdrv_replace_node().
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit ac6fb43eae1f5029b51e0a3d975fe2111cc8b976)
Conflicts:
tests/qemu-iotests/group
*prereq for d81e1efb tests Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Kevin Wolf [Tue, 21 May 2019 17:00:25 +0000 (19:00 +0200)]
block: Drain source node in bdrv_replace_node()
Instead of just asserting that no requests are in flight in
bdrv_replace_node(), which is a requirement that most callers ignore, we
can just drain the source node right there. This fixes at least starting
a commit job while I/O is active on the backing chain, but probably
other callers, too.
Having requests in flight on the target node isn't a problem because the
target just gets new parents, but the call path of running requests
isn't modified. So we can just drop this assertion without a replacement.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1711643 Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit f871abd60f4b67547e62c57c9bec19420052be39)
*prereq for d81e1efb tests Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
John Snow [Thu, 23 May 2019 17:06:39 +0000 (13:06 -0400)]
blockdev-backup: don't check aio_context too early
in blockdev_backup_prepare, we check to make sure that the target is
associated with a compatible aio context. However, do_blockdev_backup is
called later and has some logic to move the target to a compatible
aio_context. The transaction version will fail certain commands
needlessly early as a result.
Allow blockdev_backup_prepare to simply call do_blockdev_backup, which
will ultimately decide if the contexts are compatible or not.
Note: the transaction version has always disallowed this operation since
its initial commit bd8baecd (2014), whereas the version of
qmp_blockdev_backup at the time, from commit c29c1dd312f, tried to
enforce the aio_context switch instead. It's not clear, and I can't see
from the mailing list archives at the time, why the two functions take a
different approach. It wasn't until later in efd7556708b (2016) that the
standalone version tried to determine if it could set the context or
not.
Reported-by: aihua liang <aliang@redhat.com> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1683498 Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 20190523170643.20794-2-jsnow@redhat.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit d81e1efbea7d19c2f142d300df56538c73800590) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Max Reitz [Wed, 15 May 2019 04:15:41 +0000 (06:15 +0200)]
iotests: Test unaligned raw images with O_DIRECT
We already have 221 for accesses through the page cache, but it is
better to create a new file for O_DIRECT instead of integrating those
test cases into 221. This way, we can make use of
_supported_cache_modes (and _default_cache_mode) so the test is
automatically skipped on filesystems that do not support O_DIRECT.
As part of the split, add _supported_cache_modes to 221. With that, it
no longer fails when run with -c none or -c directsync.
Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 2fab30c80b33cdc6157c7efe6207e54b6835cf92)
Conflicts:
tests/qemu-iotests/group
*fix context deps on test groups not in 4.0
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Max Reitz [Wed, 15 May 2019 04:15:40 +0000 (06:15 +0200)]
block/file-posix: Unaligned O_DIRECT block-status
Currently, qemu crashes whenever someone queries the block status of an
unaligned image tail of an O_DIRECT image:
$ echo > foo
$ qemu-img map --image-opts driver=file,filename=foo,cache.direct=on
Offset Length Mapped to File
qemu-img: block/io.c:2093: bdrv_co_block_status: Assertion `*pnum &&
QEMU_IS_ALIGNED(*pnum, align) && align > offset - aligned_offset'
failed.
This is because bdrv_co_block_status() checks that the result returned
by the driver's implementation is aligned to the request_alignment, but
file-posix can fail to do so, which is actually mentioned in a comment
there: "[...] possibly including a partial sector at EOF".
Fix this by rounding up those partial sectors.
There are two possible alternative fixes:
(1) We could refuse to open unaligned image files with O_DIRECT
altogether. That sounds reasonable until you realize that qcow2
does necessarily not fill up its metadata clusters, and that nobody
runs qemu-img create with O_DIRECT. Therefore, unpreallocated qcow2
files usually have an unaligned image tail.
(2) bdrv_co_block_status() could ignore unaligned tails. It actually
throws away everything past the EOF already, so that sounds
reasonable.
Unfortunately, the block layer knows file lengths only with a
granularity of BDRV_SECTOR_SIZE, so bdrv_co_block_status() usually
would have to guess whether its file length information is inexact
or whether the driver is broken.
Fixing what raw_co_block_status() returns is the safest thing to do.
There seems to be no other block driver that sets request_alignment and
does not make sure that it always returns aligned values.
Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 9c3db310ff0b7473272ae8dce5e04e2f8a825390) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Gerd Hoffmann [Tue, 14 May 2019 04:24:43 +0000 (06:24 +0200)]
kbd-state: fix autorepeat handling
When allowing multiple down-events in a row (key autorepeat) we can't
use change_bit() any more to update the state, because autorepeat events
don't change the key state. We have to explicitly use set_bit() and
clear_bit() instead.
The high order bits of the address of the OS event queue is stored in
bits [4-31] of word2 of the XIVE END internal structures and the low
order bits in word3. This structure is using Big Endian ordering and
computing the value requires some simple arithmetic which happens to
be wrong. The mask removing bits [0-3] of word2 is applied to the
wrong value and the resulting address is bogus when above 64GB.
Guests with more than 64GB of RAM will allocate pages for the OS event
queues which will reside above the 64GB limit. In this case, the XIVE
device model will wake up the CPUs in case of a notification, such as
IPIs, but the update of the event queue will be written at the wrong
place in memory. The result is uncertain as the guest memory is
trashed and IPI are not delivered.
Introduce a helper xive_end_qaddr() to compute this value correctly in
all places where it is used.
John Snow [Fri, 26 Apr 2019 22:15:28 +0000 (18:15 -0400)]
docs/interop/bitmaps: rewrite and modernize doc
This just about rewrites the entirety of the bitmaps.rst document to
make it consistent with the 4.0 release. I have added new features seen
in the 4.0 release, as well as tried to clarify some points that keep
coming up when discussing this feature both in-house and upstream.
It does not yet cover pull backups or migration details, but I intend to
keep extending this document to cover those cases.
Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20190426221528.30293-3-jsnow@redhat.com
[Adjusted commit message. --js] Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit 90edef80a0852cf8a3d2668898ee40e8970e4314) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Eric Blake [Wed, 17 Apr 2019 17:11:00 +0000 (12:11 -0500)]
cutils: Fix size_to_str() on 32-bit platforms
When extracting a human-readable size formatter, we changed 'uint64_t
div' pre-patch to 'unsigned long div' post-patch. Which breaks on
32-bit platforms, resulting in 'inf' instead of intended values larger
than 999GB.
Fixes: 22951aaa CC: qemu-stable@nongnu.org Reported-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 754da86714d550c3f995f11a2587395081362e0a) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Kevin Wolf [Wed, 17 Apr 2019 15:15:25 +0000 (17:15 +0200)]
block: Fix AioContext switch for bs->drv == NULL
Even for block nodes with bs->drv == NULL, we can't just ignore a
bdrv_set_aio_context() call. Leaving the node in its old context can
mean that it's still in an iothread context in bdrv_close_all() during
shutdown, resulting in an attempted unlock of the AioContext lock which
we don't hold.
This is an example stack trace of a related crash:
#0 0x00007ffff59da57f in raise () at /lib64/libc.so.6
#1 0x00007ffff59c4895 in abort () at /lib64/libc.so.6
#2 0x0000555555b97b1e in error_exit (err=<optimized out>, msg=msg@entry=0x555555d386d0 <__func__.19059> "qemu_mutex_unlock_impl") at util/qemu-thread-posix.c:36
#3 0x0000555555b97f7f in qemu_mutex_unlock_impl (mutex=mutex@entry=0x5555568002f0, file=file@entry=0x555555d378df "util/async.c", line=line@entry=507) at util/qemu-thread-posix.c:97
#4 0x0000555555b92f55 in aio_context_release (ctx=ctx@entry=0x555556800290) at util/async.c:507
#5 0x0000555555b05cf8 in bdrv_prwv_co (child=child@entry=0x7fffc80012f0, offset=offset@entry=131072, qiov=qiov@entry=0x7fffffffd4f0, is_write=is_write@entry=true, flags=flags@entry=0)
at block/io.c:833
#6 0x0000555555b060a9 in bdrv_pwritev (qiov=0x7fffffffd4f0, offset=131072, child=0x7fffc80012f0) at block/io.c:990
#7 0x0000555555b060a9 in bdrv_pwrite (child=0x7fffc80012f0, offset=131072, buf=<optimized out>, bytes=<optimized out>) at block/io.c:990
#8 0x0000555555ae172b in qcow2_cache_entry_flush (bs=bs@entry=0x555556810680, c=c@entry=0x5555568cc740, i=i@entry=0) at block/qcow2-cache.c:51
#9 0x0000555555ae18dd in qcow2_cache_write (bs=bs@entry=0x555556810680, c=0x5555568cc740) at block/qcow2-cache.c:248
#10 0x0000555555ae15de in qcow2_cache_flush (bs=0x555556810680, c=<optimized out>) at block/qcow2-cache.c:259
#11 0x0000555555ae16b1 in qcow2_cache_flush_dependency (c=0x5555568a1700, c=0x5555568a1700, bs=0x555556810680) at block/qcow2-cache.c:194
#12 0x0000555555ae16b1 in qcow2_cache_entry_flush (bs=bs@entry=0x555556810680, c=c@entry=0x5555568a1700, i=i@entry=0) at block/qcow2-cache.c:194
#13 0x0000555555ae18dd in qcow2_cache_write (bs=bs@entry=0x555556810680, c=0x5555568a1700) at block/qcow2-cache.c:248
#14 0x0000555555ae15de in qcow2_cache_flush (bs=bs@entry=0x555556810680, c=<optimized out>) at block/qcow2-cache.c:259
#15 0x0000555555ad242c in qcow2_inactivate (bs=bs@entry=0x555556810680) at block/qcow2.c:2124
#16 0x0000555555ad2590 in qcow2_close (bs=0x555556810680) at block/qcow2.c:2153
#17 0x0000555555ab0c62 in bdrv_close (bs=0x555556810680) at block.c:3358
#18 0x0000555555ab0c62 in bdrv_delete (bs=0x555556810680) at block.c:3542
#19 0x0000555555ab0c62 in bdrv_unref (bs=0x555556810680) at block.c:4598
#20 0x0000555555af4d72 in blk_remove_bs (blk=blk@entry=0x5555568103d0) at block/block-backend.c:785
#21 0x0000555555af4dbb in blk_remove_all_bs () at block/block-backend.c:483
#22 0x0000555555aae02f in bdrv_close_all () at block.c:3412
#23 0x00005555557f9796 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4776
The reproducer I used is a qcow2 image on gluster volume, where the
virtual disk size (4 GB) is larger than the gluster volume size (64M),
so we can easily trigger an ENOSPC. This backend is assigned to a
virtio-blk device using an iothread, and then from the guest a
'dd if=/dev/zero of=/dev/vda bs=1G count=1' causes the VM to stop
because of an I/O error. qemu_gluster_co_flush_to_disk() sets
bs->drv = NULL on error, so when virtio-blk stops the dataplane, the
block nodes stay in the iothread AioContext. A 'quit' monitor command
issued from this paused state crashes the process.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1631227 Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
(cherry picked from commit 1bffe1ae7a7b707c3a14ea2ccd00d3609d3ce4d8) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Kevin Wolf [Mon, 29 Apr 2019 10:52:21 +0000 (12:52 +0200)]
qcow2: Fix qcow2_make_empty() with external data file
make_completely_empty() is an optimisated path for bdrv_make_empty()
where completely new metadata is created inside the image file instead
of going through all clusters and discarding them. For an external data
file, however, we actually need to do discard operations on the data
file; just overwriting the qcow2 file doesn't get rid of the data.
The necessary slow path with an explicit discard operation already
exists for other cases. Use it for external data files, too.
Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit db04524f820582ebf1189223b6378de238511da1) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Peter Lieven [Thu, 4 Apr 2019 12:10:15 +0000 (14:10 +0200)]
megasas: fix mapped frame size
the current value of 1024 bytes (16 * MFI_FRAME_SIZE) we map is not enough to hold
the maximum number of scatter gather elements we advertise. We actually need a
maximum of 2048 bytes. This is 128 max sg elements * 16 bytes (sizeof (union mfi_sgl)).
Cc: qemu-stable@nongnu.org Signed-off-by: Peter Lieven <pl@kamp.de>
Message-Id: <20190404121015.28634-1-pl@kamp.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 2e56fbc87f6ec3cd56c37b01d313abd502b80d61) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Kevin Wolf [Mon, 15 Apr 2019 14:34:30 +0000 (16:34 +0200)]
qcow2: Fix full preallocation with external data file
preallocate_co() already gave the data file the full size without
forwarding the requested preallocation mode to the protocol. When
bdrv_co_truncate() was called later with the preallocation mode, the
file didn't actually grow any more, so the data file stayed unallocated
even if full preallocation was requested.
Pass the right preallocation mode to preallocate_co() and remove the
second bdrv_co_truncate() to fix this. As a side effect, the ugly
one-byte write in preallocate_co() is replaced with a truncate call,
now leaving the last block unallocated on the protocol level as it
should be.
Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 718c0fce2f56755a8d8f737607779a98aa6e7cc4) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Kevin Wolf [Mon, 15 Apr 2019 14:56:07 +0000 (16:56 +0200)]
qcow2: Add errp to preallocate_co()
We'll add a bdrv_co_truncate() call in the next patch which can return
an Error that we don't want to discard. So add an errp parameter to
preallocate_co().
Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 360bd07471dfd1830246e8403ffdc9ba9d82f9d4) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Kevin Wolf [Mon, 15 Apr 2019 14:25:01 +0000 (16:25 +0200)]
qcow2: Avoid COW during metadata preallocation
Limiting the allocation to INT_MAX bytes isn't particularly clever
because it means that the final cluster will be a partial cluster which
will be completed through a COW operation. This results in unnecessary
data read and write requests which lead to an unwanted non-sparse
filesystem block for metadata preallocation.
Align the maximum allocation size down to the cluster size to avoid this
situation.
Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit f29fbf7c6b1c9a84f6931c1c222716fbe073e6e4) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
usb-mtp: fix bounds check for guest provided filename
The ObjectInfo struct has a variable length array containing the UTF-16
encoded filename. The number of characters of trailing data is given by
the 'length' field in the struct and this must be validated against the
size of the data packet received from the guest.
Since the data is UTF-16, we must convert the byte count we have to a
character count before validating. This must take care to truncate if
a malicious guest sent an odd number of bytes.
Kevin Wolf [Mon, 15 Apr 2019 15:54:50 +0000 (17:54 +0200)]
qcow2: Fix preallocation bdrv_pwrite to wrong file
With an external data file, preallocate_co() must write the final byte
to the external data file, not to the qcow2 image file.
This is harmless for preallocation of newly created images (only the
qcow2 file size is increased to the virtual disk size while it should be
much smaller), but with preallocated resize, it could in theory cause
visible corruption if the metadata of the image is larger than the data
(e.g. lots of bitmaps).
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
Commit 767abe7 ("chardev: forbid 'wait' option with client sockets")
is a bit too strict. Current libvirt always set wait=false, and will
thus fail to add client chardev.
Make the code more permissive, allowing wait=false with client socket
chardevs. Deprecate usage of 'wait' with client sockets.
Gcc 9 needs some convincing that sopreprbuf really is going to fill
in iov in the call from soreadbuf, even though the failure case
shouldn't happen.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190415121740.9881-1-dgilbert@redhat.com> Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Max Reitz [Wed, 10 Apr 2019 16:29:18 +0000 (18:29 +0200)]
iotests: Let 245 pass on tmpfs
tmpfs does not support O_DIRECT. Detect this case, and skip flipping
@direct if the filesystem does not support it.
Fixes: bf3e50f6239090e63a8ffaaec971671e66d88e07 Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
John Snow [Tue, 9 Apr 2019 21:06:55 +0000 (17:06 -0400)]
qemu-img: fix .hx and .texi disparity
It turns out that having options listed in three places continues to be
a bad idea. I'm still toying with the idea of an improved infrastructure
here, but in the meantime, another bandaid.
There are three locations:
(1) .hx file, formatted as texi
(2) .hx file, formatted as human readable.
(3) .texi file, as section headers, formatted as texi.
You can compare the two summaries within the .hx file like so:
Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 20190409210655.777-1-jsnow@redhat.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Fri, 12 Apr 2019 10:23:14 +0000 (11:23 +0100)]
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-4.0-20190412' into staging
ppc patch queue for 2018-04-12
Here's a last minute pull request for 4.0. Turns out my last pull
request, to fix a regression in extended config space access for the
pseries machine didn't fix things hard enough. This PR has a single
patch which improves the fix to work in more cases.
It's a ghastly, ghastly hack, but it's simple and localized. I
already have patches almost ready to go in 4.1 that provides a simpler
and cleaner solution to all this.
Greg Kurz [Thu, 11 Apr 2019 16:32:24 +0000 (18:32 +0200)]
spapr_pci: Fix broken naming of PCI bus
Recent commit 5cf0d326a0fe fixed a regression which was preventing the
guest to access the extended config space of a PCIe device. This was
done by introducing a new PCI bus subtype for PAPR. The original fix
was causing PCI busses to be named "spapr-pci-host-bridge-root-bus.N"
instead of "pci.N", which was making upper layers unhappy of course.
This got worked around by hardcoding the PCI bus name to "pci.0", but
this only works for the default PHB. And we're now hitting:
# qemu-system-ppc64 \
-device spapr-pci-host-bridge,index=1 \
-device e1000e,bus=pci.0 \
-device e1000e,bus=pci.1
qemu-system-ppc64: -device e1000e,bus=pci.1: Bus 'pci.1' not found
David already posted some patches [1] to control PCI extended config
space accesses with a new flag in the base PCI bus class instead of
subtyping. These patches are a bit more intrusive though, and
are targetted for 4.1.
When no name is passed to pci_register_bus(), the core device code
generates a lowercase name based on the QOM typename. The typename
for the base PCI bus class is "PCI", hence the "pci.0", "pci.1"
bus names. Rename the type of the PAPR PCI bus to "pci", so that
the QOM code can generate proper names. This is a hack but it is
enough to fix the regression. And all this will be reworked properly
in 4.1.
device_tree: Fix integer overflowing in load_device_tree()
If the value of get_image_size() exceeds INT_MAX / 2 - 10000, the
computation of @dt_size overflows to a negative number, which then
gets converted to a very large size_t for g_malloc0() and
load_image_size(). In the (fortunately improbable) case g_malloc0()
succeeds and load_image_size() survives, we'd assign the negative
number to *sizep. What that would do to the callers I can't say, but
it's unlikely to be good.
Fix by rejecting images whose size would overflow.
Peter Maydell [Tue, 9 Apr 2019 15:18:30 +0000 (16:18 +0100)]
migration/ram.c: Fix use-after-free in multifd_recv_unfill_packet()
Coverity points out (CID 1400442) that in this code:
if (packet->pages_alloc > p->pages->allocated) {
multifd_pages_clear(p->pages);
multifd_pages_init(packet->pages_alloc);
}
we free p->pages in multifd_pages_clear() but continue to
use it in the following code. We also leak memory, because
multifd_pages_init() returns the pointer to a new MultiFDPages_t
struct but we are ignoring its return value.
Fix both of these bugs by adding the missing assignment of
the newly created struct to p->pages.
* remotes/bonzini/tags/for-upstream:
tests: Make check-block a phony target
hw/i386/pc: Fix crash when hot-plugging nvdimm on older machine types
include/qemu/bswap.h: Use __builtin_memcpy() in accessor functions
roms: Allow passing configure options to the EDK2 build tools
roms: Rename the EFIROM variable to avoid clashing with iPXE
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Thomas Huth [Sun, 7 Apr 2019 09:23:14 +0000 (11:23 +0200)]
hw/i386/pc: Fix crash when hot-plugging nvdimm on older machine types
QEMU currently crashes when you try to hot-plug an "nvdimm" device
on older machine types:
$ qemu-system-x86_64 -monitor stdio -M pc-1.1
QEMU 3.1.92 monitor - type 'help' for more information
(qemu) device_add nvdimm,id=nvdimmn1
qemu-system-x86_64: /home/thuth/devel/qemu/util/error.c:57: error_setv:
Assertion `*errp == ((void *)0)' failed.
Aborted (core dumped)
The call to hotplug_handler_pre_plug() in pc_memory_pre_plug() has been
added recently before the check whether nvdimm is enabled. It should
be done after the check. And while we're at it, also check the errp
after the hotplug_handler_pre_plug(), otherwise errors are silently
ignored here.
Fixes: 9040e6dfa8c3fed87695a3de555d2c775727bb51 Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20190407092314.11066-1-thuth@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Peter Maydell [Mon, 18 Mar 2019 11:29:38 +0000 (11:29 +0000)]
include/qemu/bswap.h: Use __builtin_memcpy() in accessor functions
In the accessor functions ld*_he_p() and st*_he_p() we use memcpy()
to perform a load or store to a pointer which might not be aligned
for the size of the type. We rely on the compiler to optimize this
memcpy() into an efficient load or store instruction where possible.
This is required for good performance, but at the moment it is also
required for correct operation, because some users of these functions
require that the access is atomic if the pointer is aligned, which
will only be the case if the compiler has optimized out the memcpy().
(The particular example where we discovered this is the virtio
vring_avail_idx() which calls virtio_lduw_phys_cached() which
eventually ends up calling lduw_he_p().)
Unfortunately some compile environments, such as the fortify-source
setup used in Alpine Linux, define memcpy() to a wrapper function
in a way that inhibits this compiler optimization.
The correct long-term fix here is to add a set of functions for
doing atomic accesses into AddressSpaces (and to other relevant
families of accessor functions like the virtio_*_phys_cached()
ones), and make sure that callsites which want atomic behaviour
use the correct functions.
In the meantime, switch to using __builtin_memcpy() in the
bswap.h accessor functions. This will make us robust against things
like this fortify library in the short term. In the longer term
it will mean that we don't end up with these functions being really
badly-performing even if the semantics of the out-of-line memcpy()
are correct.
Reported-by: Fernando Casas Schössow <casasfernando@outlook.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20190318112938.8298-1-peter.maydell@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Peter Maydell [Thu, 28 Mar 2019 10:47:50 +0000 (10:47 +0000)]
target/i386: Generate #UD for LOCK on a register increment
Fix a TCG crash due to attempting an atomic increment
operation without having set up the address first.
This is a similar case to that dealt with in commit e84fcd7f662a0d8198703, and we fix it in the same way.
Fixes: https://bugs.launchpad.net/qemu/+bug/1807675 Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20190328104750.25046-1-peter.maydell@linaro.org
* remotes/dgibson/tags/ppc-for-4.0-20190409:
spapr_pci: Fix extended config space accesses
pci: Allow PCI bus subtypes to support extended config space accesses
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Greg Kurz [Mon, 1 Apr 2019 17:55:08 +0000 (19:55 +0200)]
spapr_pci: Fix extended config space accesses
The PAPR PHB acts as a legacy PCI bus but it allows PCIe extended
config space accesses anyway (for pseries-2.9 and newer machine
types).
Introduce a specific PCI bus subtype to inform the common PCI code
about that.
Fixes: c2077e2ca0da7 Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155414130834.574858.16502276132110219890.stgit@bahia.lan>
[dwg: Apply fix so we don't rename the default pci bus, breaking everything] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Greg Kurz [Mon, 1 Apr 2019 17:55:02 +0000 (19:55 +0200)]
pci: Allow PCI bus subtypes to support extended config space accesses
Some PHB implementations, eg. PAPR used on pseries machine, act like
a regular PCI bus rather than a PCIe bus, but allow access to the
PCIe extended config space anyway.
Introduce a new PCI bus class method to modelize this behaviour and
use it when adjusting the config space size limit during accesses.
No behaviour change for existing PCI bus types.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155414130271.574858.4253514266378127489.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Eric Blake [Thu, 4 Apr 2019 14:52:26 +0000 (09:52 -0500)]
nbd/client: Fix error message for server with unusable sizing
Add a missing space to the error message used when giving up on a
server that insists on an alignment which renders the last few bytes
of the export unreadable.
Fixes: 3add3ab78 Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20190404145226.32649-1-eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Wed, 3 Apr 2019 03:05:22 +0000 (22:05 -0500)]
nbd/server: Don't fail NBD_OPT_INFO for byte-aligned sources
In commit 0c1d50bd, I added a couple of TODO comments about whether we
consult bl.request_alignment when responding to NBD_OPT_INFO. At the
time, qemu as server was hard-coding an advertised alignment of 512 to
clients that promised to obey constraints, and there was no function
for getting at a device's preferred alignment. But in hindsight,
advertising 512 when the block device prefers 1 caused other
compliance problems, and commit b0245d64 changed one of the two TODO
comments to advertise a more accurate alignment. Time to fix the other
TODO. Doesn't really impact qemu as client (our normal client doesn't
use NBD_OPT_INFO, and qemu-nbd --list promises to obey block sizes),
but it might prove useful to other clients.
Fixes: b0245d64 Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20190403030526.12258-4-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Eric Blake [Wed, 3 Apr 2019 03:05:21 +0000 (22:05 -0500)]
nbd/server: Trace client noncompliance on unaligned requests
We've recently added traces for clients to flag server non-compliance;
let's do the same for servers to flag client non-compliance. According
to the spec, if the client requests NBD_INFO_BLOCK_SIZE, it is
promising to send all requests aligned to those boundaries. Of
course, if the client does not request NBD_INFO_BLOCK_SIZE, then it
made no promises so we shouldn't flag anything; and because we are
willing to handle clients that made no promises (the spec allows us to
use NBD_REP_ERR_BLOCK_SIZE_REQD if we had been unwilling), we already
have to handle unaligned requests (which the block layer already does
on our behalf). So even though the spec allows us to return EINVAL
for clients that promised to behave, it's easier to always answer
unaligned requests. Still, flagging non-compliance can be useful in
debugging a client that is trying to be maximally portable.
Qemu as client used to have one spot where it sent non-compliant
requests: if the server sends an unaligned reply to
NBD_CMD_BLOCK_STATUS, and the client was iterating over the entire
disk, the next request would start at that unaligned point; this was
fixed in commit a39286dd when the client was taught to work around
server non-compliance; but is equally fixed if the server is patched
to not send unaligned replies in the first place (yes, qemu 4.0 as
server still has few such bugs, although they will be patched in
4.1). Fortunately, I did not find any more spots where qemu as client
was non-compliant. I was able to test the patch by using the following
hack to convince qemu-io to run various unaligned commands, coupled
with serving 512-byte alignment by intentionally omitting '-f raw' on
the server while viewing server traces.
| diff --git i/nbd/client.c w/nbd/client.c
| index 427980bdd22..1858b2aac35 100644
| --- i/nbd/client.c
| +++ w/nbd/client.c
| @@ -449,6 +449,7 @@ static int nbd_opt_info_or_go(QIOChannel *ioc, uint32_t opt,
| nbd_send_opt_abort(ioc);
| return -1;
| }
| + info->min_block = 1;//hack
| if (!is_power_of_2(info->min_block)) {
| error_setg(errp, "server minimum block size %" PRIu32
| " is not a power of two", info->min_block);
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20190403030526.12258-3-eblake@redhat.com>
[eblake: address minor review nits] Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Eric Blake [Wed, 3 Apr 2019 03:05:20 +0000 (22:05 -0500)]
nbd/server: Fix blockstatus trace
Don't increment remaining_bytes until we know that we will actually be
including the current block status extent in the reply; otherwise, the
value traced will include a bytes value that is oversized by the
length of the next block status extent which did not get sent because
it instead ended the loop.
Fixes: fb7afc79 Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20190403030526.12258-2-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
drive_new() returns null without setting an error when it provided
help. add_init_drive() assumes null means failure, and crashes trying
to report a null error.
Fixes: c4f26c9f37ce511e5fe629c21c180dc6eb7c5a25 Cc: qemu-stable@nongnu.org Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
linux-user: rename gettid() to sys_gettid() to avoid clash with glibc
The glibc-2.29.9000-6.fc31.x86_64 package finally includes the gettid()
function as part of unistd.h when __USE_GNU is defined. This clashes
with linux-user code which unconditionally defines this function name
itself.
/home/berrange/src/virt/qemu/linux-user/syscall.c:253:16: error: static declaration of âgettidâ follows non-static declaration
253 | _syscall0(int, gettid)
| ^~~~~~
/home/berrange/src/virt/qemu/linux-user/syscall.c:184:13: note: in definition of macro â_syscall0â
184 | static type name (void) \
| ^~~~
In file included from /usr/include/unistd.h:1170,
from /home/berrange/src/virt/qemu/include/qemu/osdep.h:107,
from /home/berrange/src/virt/qemu/linux-user/syscall.c:20:
/usr/include/bits/unistd_ext.h:34:16: note: previous declaration of âgettidâ was here
34 | extern __pid_t gettid (void) __THROW;
| ^~~~~~
CC aarch64-linux-user/linux-user/signal.o
make[1]: *** [/home/berrange/src/virt/qemu/rules.mak:69: linux-user/syscall.o] Error 1
make[1]: *** Waiting for unfinished jobs....
make: *** [Makefile:449: subdir-aarch64-linux-user] Error 2
While we could make our definition conditional and rely on glibc's impl,
this patch simply renames our definition to sys_gettid() which is a
common pattern in this file.
The gettid syscall was introduced in Linux 2.4.11. This is old enough
that we can assume it always exists and thus not bother with the
conditional backcompat logic.
Kevin Wolf [Thu, 4 Apr 2019 15:04:43 +0000 (17:04 +0200)]
block: Forward 'discard' to temporary overlay
When bdrv_temp_snapshot_options() is called for snapshot=on, the
'discard' option in the options QDict hasn't been parsed and merged into
the flags yet. So copy the dict entry to make sure that the temporary
overlay enables discard when it was requested for the drive.
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com>
* remotes/huth-gitlab/tags/pull-request-2019-04-08:
test qgraph.c: Fix segs due to out of scope default
tests/libqos: fix usage of bool in pci-spapr.c
tests/libqos: fix usage of bool in pci-pc.c
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
test qgraph.c: Fix segs due to out of scope default
The test uses the trick:
if (!opts) {
opts = &(QOSGraph...Options) { };
}
in a couple of places, however the temporary created
by the &() {} goes out of scope at the bottom of the if,
and results in a seg or assert when opts-> fields are
used (on fedora 30's gcc 9).
Fixes: fc281c802022cb3a73a5 Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20190405184037.16799-1-dgilbert@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
Jafar Abdi [Sat, 23 Mar 2019 14:26:36 +0000 (17:26 +0300)]
tests/libqos: fix usage of bool in pci-spapr.c
Clean up wrong usage of FALSE and TRUE in places that use "bool" from stdbool.h.
FALSE and TRUE (with capital letters) are the constants defined by glib for
being used with the "gboolean" type of glib. But some parts of the code also use
TRUE and FALSE for variables that are declared as "bool" (the type from <stdbool.h>).
Signed-off-by: Jafar Abdi <cafer.abdi@gmail.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <1553351197-14581-4-git-send-email-cafer.abdi@gmail.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
Jafar Abdi [Sat, 23 Mar 2019 14:26:35 +0000 (17:26 +0300)]
tests/libqos: fix usage of bool in pci-pc.c
Clean up wrong usage of FALSE and TRUE in places that use "bool" from stdbool.h.
FALSE and TRUE (with capital letters) are the constants defined by glib for
being used with the "gboolean" type of glib. But some parts of the code also use
TRUE and FALSE for variables that are declared as "bool" (the type from <stdbool.h>).
Signed-off-by: Jafar Abdi <cafer.abdi@gmail.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1553351197-14581-3-git-send-email-cafer.abdi@gmail.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
Juan Quintela [Wed, 3 Apr 2019 11:49:51 +0000 (13:49 +0200)]
migration: Fix migrate_set_parameter
Otherwise we are setting err twice, what is wrong and causes an abort.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20190403114958.3705-2-quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
migration: use bitmap_mutex in migration_bitmap_clear_dirty
My colleague Wei's patch add bitmap_mutex in migration_bitmap_clear_dirty,
but COLO didn't initialize the bitmap_mutex. So we always get an error
when COLO start up. like that:
qemu-system-x86_64: util/qemu-thread-posix.c:64: qemu_mutex_lock_impl: Assertion `mutex->initialized' failed.
This patch add the bitmap_mutex initialize and destroy in COLO
lifecycle.
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Message-Id: <20190329222951.28945-1-chen.zhang@intel.com> Reviewed-by: Wei Wang <wei.w.wang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Peter Maydell [Fri, 5 Apr 2019 03:50:30 +0000 (04:50 +0100)]
Merge remote-tracking branch 'remotes/palmer/tags/riscv-for-master-4.0-rc3-v2' into staging
RISC-V Patches for 4.0-rc3, v2
This patch set contains a pair of tightly coupled PLIC bug fixes:
* We were calculating the PLIC addresses incorrectly.
* We were installing the wrong number of PLIC interrupts.
The two bugs togther resulted in a mostly-working system, but they're
impossible to seperate because fixing one bug would result in
significant breakage. As a result they're in the same patch.
There is also a cleanup to use qemu_log_mask(LOG_GUEST_ERROR,...) for
error reporting.
As far as I know these are the last outstanding RISC-V patches for 4.0.
v2 no longer fails "make check" for me... sorry!
# gpg: Signature made Fri 05 Apr 2019 01:33:57 BST
# gpg: using RSA key 00CE76D1834960DFCE886DF8EF4CA1502CCBAB41
# gpg: issuer "palmer@dabbelt.com"
# gpg: Good signature from "Palmer Dabbelt <palmer@dabbelt.com>" [unknown]
# gpg: aka "Palmer Dabbelt <palmer@sifive.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 00CE 76D1 8349 60DF CE88 6DF8 EF4C A150 2CCB AB41
Peter Maydell [Fri, 5 Apr 2019 02:52:05 +0000 (03:52 +0100)]
Merge remote-tracking branch 'remotes/aperard/tags/pull-xen-20190404' into staging
Xen queue
xen-block fixes
# gpg: Signature made Thu 04 Apr 2019 18:04:38 BST
# gpg: using RSA key F80C006308E22CFD8A92E7980CF5572FD7FB55AF
# gpg: issuer "anthony.perard@citrix.com"
# gpg: Good signature from "Anthony PERARD <anthony.perard@gmail.com>" [marginal]
# gpg: aka "Anthony PERARD <anthony.perard@citrix.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 5379 2F71 024C 600F 778A 7161 D8D5 7199 DF83 42C8
# Subkey fingerprint: F80C 0063 08E2 2CFD 8A92 E798 0CF5 572F D7FB 55AF
* remotes/aperard/tags/pull-xen-20190404:
xen-block: scale sector based quantities correctly
xen-block: only advertize discard to the frontend when it is enabled...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This patch fixes four different things, to maintain bisectability they
have been merged into a single patch. The following fixes are below:
sifive_plic: Fix incorrect irq calculation
The irq is incorrectly calculated to be off by one. It has worked in the
past as the priority_base offset has also been set incorrectly. We are
about to fix the priority_base offset so first first the irq
calculation.
sifive_u: Fix PLIC priority base offset and numbering
According to the FU540 manual the PLIC source priority address starts at
an offset of 0x04 and not 0x00. The same manual also specifies that the
PLIC only has 53 source priorities. Fix these two incorrect header
files.
We also need to over extend the plic_gpios[] array as the PLIC sources
count from 1 and not 0.
riscv: sifive_e: Fix PLIC priority base offset
According to the FE31 manual the PLIC source priority address starts at
an offset of 0x04 and not 0x00.
riscv: virt: Fix PLIC priority base offset
Update the virt offsets based on the newly updated SiFive U and SiFive E
offsets.
Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Paul Durrant [Mon, 1 Apr 2019 12:17:19 +0000 (13:17 +0100)]
xen-block: scale sector based quantities correctly
The Xen blkif protocol requires that sector based quantities should be
interpreted strictly as multiples of 512 bytes. Specifically:
"first_sect and last_sect in blkif_request_segment, as well as
sector_number in blkif_request, are always expressed in 512-byte units."
Commit fcab2b464e06 "xen: add header and build dataplane/xen-block.c"
incorrectly modified behaviour to use the block device logical_block_size
property as the scale, instead of correctly shifting values by the
hardcoded BDRV_SECTOR_BITS (and hence scaling them to 512 byte units).
This patch undoes that change and restores compliance with the spec.
Furthermore, this patch also restores the original xen_disk behaviour
of advertizing a hardcoded 'sector-size' value of 512 in xenstore and
scaling 'sectors' accordingly. The realize() method is also modified to
fail if logical_block_size is set to anything other than 512.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Message-Id: <20190401121719.27208-1-paul.durrant@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Paul Durrant [Wed, 20 Mar 2019 14:28:25 +0000 (14:28 +0000)]
xen-block: only advertize discard to the frontend when it is enabled...
...and properly enable it when synthesizing a drive.
The Xen toolstack sets 'discard-enable' to '1' in xenstore when it wants
to enable discard on a specified image. The code in
xen_block_drive_create() correctly parses this and uses it to set
'discard' to 'unmap' for the file_layer, but fails to do the same for the
driver_layer (which effectively disables it). Meanwhile the code in
xen_block_realize() advertizes discard support to the frontend in the
default case (because conf->discard_granularity defaults to -1), even when
the underlying image may not handle it.
This patch adds the missing option to the driver_layer in
xen_block_driver_create() and checks whether BDRV_O_UNMAP is actually
set on the block device before advertizing discard to the frontend.
In the case that discard is supported it also makes sure that the
granularity is set to the physical block size.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Message-Id: <20190320142825.24565-1-paul.durrant@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
* remotes/cohuck/tags/s390x-20190403:
hw/s390x/3270-ccw: avoid taking address of fields in packed struct
hw/s390x/ipl: avoid taking address of fields in packed struct
hw/s390/css: avoid taking address members in packed structs
hw/vfio/ccw: avoid taking address members in packed structs
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
hw/s390x/3270-ccw: avoid taking address of fields in packed struct
Compiling with GCC 9 complains
hw/s390x/3270-ccw.c: In function âemulated_ccw_3270_cbâ:
hw/s390x/3270-ccw.c:81:19: error: taking address of packed member of âstruct SCHIBâ may result in an unaligned pointer value [-Werror=address-of-packed-member]
81 | SCSW *s = &sch->curr_status.scsw;
| ^~~~~~~~~~~~~~~~~~~~~~
This local variable is only present to save a little bit of
typing when setting the field later. Get rid of this to avoid
the warning about unaligned accesses.
hw/s390x/ipl: avoid taking address of fields in packed struct
Compiling with GCC 9 complains
hw/s390x/ipl.c: In function âs390_ipl_set_boot_menuâ:
hw/s390x/ipl.c:256:25: warning: taking address of packed member of âstruct QemuIplParametersâ may result in an unaligned pointer value [-Waddress-of-packed-member]
256 | uint32_t *timeout = &ipl->qipl.boot_menu_timeout;
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
This local variable is only present to save a little bit of
typing when setting the field later. Get rid of this to avoid
the warning about unaligned accesses.
hw/s390/css: avoid taking address members in packed structs
The GCC 9 compiler complains about many places in s390 code
that take the address of members of the 'struct SCHIB' which
is marked packed:
hw/s390x/css.c: In function âsch_handle_clear_funcâ:
hw/s390x/css.c:698:15: warning: taking address of packed member of âstruct SCHIBâ may result in an unaligned pointer val\
ue [-Waddress-of-packed-member]
698 | PMCW *p = &sch->curr_status.pmcw;
| ^~~~~~~~~~~~~~~~~~~~~~
hw/s390x/css.c:699:15: warning: taking address of packed member of âstruct SCHIBâ may result in an unaligned pointer val\
ue [-Waddress-of-packed-member]
699 | SCSW *s = &sch->curr_status.scsw;
| ^~~~~~~~~~~~~~~~~~~~~~
...snip many more...
Almost all of these are just done for convenience to avoid
typing out long variable/field names when referencing struct
members. We can get most of this convenience by taking the
address of the 'struct SCHIB' instead, avoiding triggering
the compiler warnings.
In a couple of places we copy via a local variable which is
a technique already applied elsewhere in s390 code for this
problem.
hw/vfio/ccw: avoid taking address members in packed structs
The GCC 9 compiler complains about many places in s390 code
that take the address of members of the 'struct SCHIB' which
is marked packed:
hw/vfio/ccw.c: In function âvfio_ccw_io_notifier_handlerâ:
hw/vfio/ccw.c:133:15: warning: taking address of packed member of âstruct SCHIBâ may result in an unaligned pointer value \
[-Waddress-of-packed-member]
133 | SCSW *s = &sch->curr_status.scsw;
| ^~~~~~~~~~~~~~~~~~~~~~
hw/vfio/ccw.c:134:15: warning: taking address of packed member of âstruct SCHIBâ may result in an unaligned pointer value \
[-Waddress-of-packed-member]
134 | PMCW *p = &sch->curr_status.pmcw;
| ^~~~~~~~~~~~~~~~~~~~~~
...snip many more...
Almost all of these are just done for convenience to avoid
typing out long variable/field names when referencing struct
members. We can get most of this convenience by taking the
address of the 'struct SCHIB' instead, avoiding triggering
the compiler warnings.
In a couple of places we copy via a local variable which is
a technique already applied elsewhere in s390 code for this
problem.
Peter Xu [Fri, 29 Mar 2019 06:14:22 +0000 (14:14 +0800)]
intel_iommu: Drop extended root field
VTD_RTADDR_RTT is dropped even by the VT-d spec, so QEMU should
probably do the same thing (after all we never really implemented it).
Since we've had a field for that in the migration stream, to keep
compatibility we need to fill the hole up.
Please refer to VT-d spec 10.4.6.
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20190329061422.7926-3-peterx@redhat.com> Reviewed-by: Liu, Yi L <yi.l.liu@intel.com> Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Peter Xu [Fri, 29 Mar 2019 06:14:21 +0000 (14:14 +0800)]
intel_iommu: Fix root_scalable migration breakage
When introducing the initial support for scalable mode we added a
new field into vmstate however we blindly migrate that field without
notice. That'll break migration no matter forward or backward.
The normal way should be that we use something like
VMSTATE_UINT32_TEST() or subsections for the new vmstate field however
for this case of vt-d we can even make it simpler because we've
already migrated all the registers and it'll be fairly simple that we
re-generate root_scalable field from the register values during post
load of the device.
Fixes: fb43cf739e ("intel_iommu: scalable mode emulation") Reviewed-by: Yi Sun <yi.y.sun@linux.intel.com> Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20190329061422.7926-2-peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Yuval Shaia [Thu, 21 Mar 2019 16:18:32 +0000 (18:18 +0200)]
virtio-net: Fix typo in comment
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Message-Id: <20190321161832.10533-1-yuval.shaia@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Liam Merwick [Thu, 21 Mar 2019 20:13:49 +0000 (20:13 +0000)]
acpi: verify file entries in bios_linker_loader_add_pointer()
The callers to bios_linker_find_file() assert that the file entry returned
is not NULL, except for those in bios_linker_loader_add_pointer(). Add two
asserts in that case for completeness and to facilitate static code analysis.
Signed-off-by: Liam Merwick <liam.merwick@oracle.com>
Message-Id: <1553199229-25318-1-git-send-email-liam.merwick@oracle.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Peter Maydell [Tue, 2 Apr 2019 13:52:17 +0000 (14:52 +0100)]
Merge remote-tracking branch 'remotes/berrange/tags/filemon-next-pull-request' into staging
filemon: various fixes / improvements to file monitor for USB MTP
Ensure watch IDs unique within a monitor and avoid integer wraparound
issues when many watches are set & unset over time.
# gpg: Signature made Tue 02 Apr 2019 13:53:40 BST
# gpg: using RSA key BE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>" [full]
# gpg: aka "Daniel P. Berrange <berrange@redhat.com>" [full]
# Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E 8E3F BE86 EBB4 1510 4FDF
* remotes/berrange/tags/filemon-next-pull-request:
filemon: fix watch IDs to avoid potential wraparound issues
filemon: ensure watch IDs are unique to QFileMonitor scope
tests: refactor file monitor test to make it more understandable
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Tue, 2 Apr 2019 13:03:11 +0000 (14:03 +0100)]
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches:
- file-posix: Ignore unlock failure instead of crashing
- gluster: Limit the transfer size to 512 MiB
- stream: Fix backing chain freezing
- qemu-img: Enable BDRV_REQ_MAY_UNMAP for zero writes in convert
- iotests fixes
# gpg: Signature made Tue 02 Apr 2019 13:47:43 BST
# gpg: using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream:
tests/qemu-iotests/235: Allow fallback to tcg
block: test block-stream with a base node that is used by block-commit
block: freeze the backing chain earlier in stream_start()
block: continue until base is found in bdrv_freeze_backing_chain() et al
block/file-posix: do not fail on unlock bytes
tests/qemu-iotests: Remove redundant COPYING file
block/gluster: limit the transfer size to 512 MiB
qemu-img: Enable BDRV_REQ_MAY_UNMAP in convert
iotests: Fix test 200 on s390x without virtio-pci
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
filemon: fix watch IDs to avoid potential wraparound issues
Watch IDs are allocated from incrementing a int counter against
the QFileMonitor object. In very long life QEMU processes with
a huge amount of USB MTP activity creating & deleting directories
it is just about conceivable that the int counter can wrap
around. This would result in incorrect behaviour of the file
monitor watch APIs due to clashing watch IDs.
Instead of trying to detect this situation, this patch changes
the way watch IDs are allocated. It is turned into an int64_t
variable where the high 32 bits are set from the underlying
inotify "int" ID. This gives an ID that is guaranteed unique
for the directory as a whole, and we can rely on the kernel
to enforce this. QFileMonitor then sets the low 32 bits from
a per-directory counter.
The USB MTP device only sets watches on the directory as a
whole, not files within, so there is no risk of guest
triggered wrap around on the low 32 bits.
filemon: ensure watch IDs are unique to QFileMonitor scope
The watch IDs are mistakenly only unique within the scope of the
directory being monitored. This is not useful for clients which are
monitoring multiple directories. They require watch IDs to be unique
globally within the QFileMonitor scope.
When the user specifies a list of accelerators, we pick the first one
that initializes successfully. Recent commit 1a3ec8c1564 broke that.
Reproducer:
$ qemu-system-x86_64 --machine accel=xen:tcg
xencall: error: Could not obtain handle on privileged command interface: No such file or directory
xen be core: xen be core: can't open xen interface
can't open xen interface
qemu-system-x86_64: failed to initialize Xen: Operation not permitted
qemu-system-x86_64: /home/armbru/work/qemu/qom/object.c:436: object_set_accelerator_compat_props: Assertion `!object_compat_props[0]' failed.
Root cause: we register accelerator compat properties even when the
accelerator fails. The failed assertion is
object_set_accelerator_compat_props() telling us off. Fix by calling
it only for the accelerator that succeeded.
vl: Document dependencies hiding in global and compat props
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190401090827.20793-5-armbru@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com>