git.ipfire.org Git - thirdparty/qemu.git/log

qemu-io: Allow reopen read-write

This allows qemu-iotests to test the switch between read-only and
read-write mode for block devices.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>

block: Set BDRV_O_ALLOW_RDWR during rw reopen

Reopening an image should be consistent with opening it, so we should
set BDRV_O_ALLOW_RDWR for any image that is reopened read-write like in
bdrv_open_inherit().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>

block: Allow reopen rw without BDRV_O_ALLOW_RDWR

BDRV_O_ALLOW_RDWR is a flag that tells whether qemu can internally
reopen a node read-write temporarily because the user requested
read-write for the top-level image, but qemu decided that read-only is
enough for this node (a backing file).

bdrv_reopen() is different, it is also used for cases where the user
changed their mind and wants to update the options. There is no reason
to forbid making a node read-write in that case.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>

block: Fix order in bdrv_replace_child()

Commit 8ee03995 refactored the code incorrectly and broke the release of
permissions on the old BDS. Instead of changing the permissions to the
new required values after removing the old BDS from the list of
children, it only re-obtains the permissions it already had.

Change the order of operations so that the old BDS is removed again
before calculating the new required permissions.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>

parallels: drop check that bdrv_truncate() is working

This would be actually strange and error prone. If truncate() nowadays
will fail, there is something fatally wrong. Let's check for that during
the actual work.

The only fallback case is when the file is not zero initialized. In this
case we should switch to preallocation via fallocate().

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Markus Armbruster <armbru@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

parallels: respect error code of bdrv_getlength() in allocate_clusters()

If we can not get the file length, the state of BDS is broken completely.
Return error to the caller.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Markus Armbruster <armbru@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

block: respect error code from bdrv_getlength in handle_aiocb_write_zeroes

Original idea beyond the code in question was the following: we have failed
to write zeroes with fallocate(FALLOC_FL_ZERO_RANGE) as the simplest
approach and via fallocate(FALLOC_FL_PUNCH_HOLE)/fallocate(0). We have the
only chance now: if the request comes beyond end of the file. Thus we
should calculate file length and respect the error code from that op.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Markus Armbruster <armbru@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

vmdk: Fix error handling/reporting of vmdk_check

Errors from the callees must be captured and propagated to our caller,
ensure this for both find_extent() and bdrv_getlength().

Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/null: Remove 'filename' option

This option was only added to allow 'null-co://' and 'null-aio://' as
filenames, its value never served any actual purpose and was ignored.
Nevertheless it was accepted as '-drive driver=null,filename=foo'.

The correct way to enable the protocol prefixes (and that without adding
a useless -drive option) is implementing .bdrv_parse_filename. This is
what this patch does.

Technically, this is an incompatible change, but the null block driver
is only used for benchmarking, testing and debugging, and an option
without effect isn't likely to be used by anyone anyway, so no bad
effects are to be expected.

Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

block: drop bdrv_set_key from BlockDriver

This is not used anymore since c01c214b69 ("block: remove all encryption
handling APIs", 2017-07-11).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/vhdx: check error return of bdrv_truncate()

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/vhdx: check error return of bdrv_flush()

Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/vhdx: check for offset overflow to bdrv_truncate()

VHDX uses uint64_t types for most offsets, following the VHDX spec.
However, bdrv_truncate() takes an int64_t value for the truncating
offset. Check for overflow before calling bdrv_truncate().

While we are here, replace the bit shifting with QEMU_ALIGN_UP as well.

N.B.: For a compliant image this is not an issue, as the maximum VHDX
image size is defined per the spec to be 64TB.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/vhdx: check error return of bdrv_getlength()

Calls to bdrv_getlength() were not checking for error. In vhdx.c, this
can lead to truncating an image file, so it is a definite bug. In
vhdx-log.c, the path for improper behavior is less clear, but it is best
to check in any case.

Some minor code movement of the log_guid intialization, as well.

Reported-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

quorum: Set sectors-count to 0 when reporting a flush error

The QUORUM_REPORT_BAD event has fields to report the sector in which
the error was detected and the number of affected sectors starting
from that one. This is important for read and write errors, but not
for flush errors.

For flush errors the current code reports the total size of the disk
image. That is however not useful information in this case. Moreover,
the bdrv_getlength() call can fail, and there's no good way of
handling that failure.

Since we're reporting useless information and we cannot even guarantee
to do it in a consistent way, this patch changes the code to report 0
instead in all cases.

Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qemu-iotests/109: Fix lock race condition

A race condition is currently present between the clean up attempt of
the QEMU process and the execution of qemu-img. The actual (bad)
output is:

-Warning: Image size mismatch!
-Images are identical.
+qemu-img: Could not open '<build_dir>/tests/qemu-iotests/scratch/t.raw': Failed to get "consistent read" lock
+Is another process using the image?

A KILL signal is sent to the QEMU process, but qemu-img may begin to
run before the QEMU process is really gone. qemu-img will then
attempt to open the TEST_IMG file before it can secure a lock on it.

This attempts a more graceful shutdown, and waits for the QEMU process
to exit.

Signed-off-by: Cleber Rosa <crosa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

virtio: fix for rc2

It turns out there's a way to setup SHPC on Q35: just put
a PCI to PCI bridge behind a DMI to PCI one. Our _OSC is
thus incorrect.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Mon 07 Aug 2017 22:39:20 BST
# gpg:                using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream:
  cpu: add APIs to allocate/free CPU environment
  hw/i386: allow SHPC for Q35 machine

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

cpu: add APIs to allocate/free CPU environment

These will be implemented and then used by follow-up patches.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

hw/i386: allow SHPC for Q35 machine

Unmask previously masked SHPC feature in _OSC method.

Signed-off-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

hw/arm/virt: Add 2.10 machine type

Add virt-2.10 machine type.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-id: 1502106581-11714-1-git-send-email-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging

# gpg: Signature made Mon 07 Aug 2017 12:03:54 BST
# gpg:                using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/tracing-pull-request:
  block: move trace probes into bdrv_co_preadv|pwritev

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

block: move trace probes into bdrv_co_preadv|pwritev

There are trace probes in bdrv_co_readv|writev, however, the
block drivers are being gradually moved over to using the
bdrv_co_preadv|pwritev functions instead. As a result some
block drivers miss the current probes. Move the probes
into bdrv_co_preadv|pwritev instead, so that they are triggered
by more (all?) I/O code paths.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170804105036.11879-1-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20170803' into staging

Queued misc tcg patches

# gpg: Signature made Thu 03 Aug 2017 19:07:18 BST
# gpg:                using RSA key 0xAD1270CC4DD0279B
# gpg: Good signature from "Richard Henderson <rth7680@gmail.com>"
# gpg:                 aka "Richard Henderson <rth@twiddle.net>"
# Primary key fingerprint: 9CB1 8DDA F8E8 49AD 2AFC  16A4 AD12 70CC 4DD0 279B

* remotes/rth/tags/pull-tcg-20170803:
  tcg: Increase minimum alignment from tcg_malloc to 8
  target/s390x: Fix CSST for 16-byte store
  tcg/arm: Fix runtime overalignment test

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/ehabkost/tags/machine-pull-request' into staging

cpu: crash fix (don't allow negative core id)

# gpg: Signature made Thu 03 Aug 2017 18:57:41 BST
# gpg:                using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost/tags/machine-pull-request:
  cpu: don't allow negative core id

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/yongbok/tags/mips-20170803' into staging

MIPS patches 2017-08-03

Changes:
KVM T&E segment support for TCG
malta: leave space for the bootmap after the initrd
Apply CP0.PageMask before writing into TLB entry
Fix fallout from indirect branch optimisation

# gpg: Signature made Thu 03 Aug 2017 15:32:59 BST
# gpg:                using RSA key 0x2238EB86D5F797C2
# gpg: Good signature from "Yongbok Kim <yongbok.kim@imgtec.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 8600 4CF5 3415 A5D9 4CFA  2B5C 2238 EB86 D5F7 97C2

* remotes/yongbok/tags/mips-20170803:
  target/mips: Fix RDHWR CC with icount
  target/mips: Drop redundant gen_io_start/stop()
  target/mips: Use BS_EXCP where interrupts are expected
  target-mips: apply CP0.PageMask before writing into TLB entry
  mips: Add KVM T&E segment support for TCG
  mips: Improve segment defs for KVM T&E guests
  mips/malta: leave space for the bootmap after the initrd
  target-mips: Don't stop on [d]mtc0 DESAVE/KScratch

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

virtio: fix for rc2

Looks like the constant stream of additions of vhost-user devices is a
problem for some people who are concerned about external connections
from qemu. A per-device flag seems like an overkill, but a single
configure flag seems like a sane way to support that, and it looks like
we need to do it before the release.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Thu 03 Aug 2017 13:57:57 BST
# gpg:                using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream:
  build-sys: add --disable-vhost-user

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/thibault/tags/samuel-thibault' into staging

slirp updates

# gpg: Signature made Wed 02 Aug 2017 23:27:41 BST
# gpg:                using RSA key 0x9E511E01C737F075
# gpg: Good signature from "Samuel Thibault <samuel.thibault@aquilenet.fr>"
# gpg:                 aka "Samuel Thibault <sthibault@debian.org>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@gnu.org>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@inria.fr>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@labri.fr>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@ens-lyon.org>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@u-bordeaux.fr>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 900C B024 B679 31D4 0F82  304B D017 8C76 7D06 9EE6
#      Subkey fingerprint: 9A37 3D36 64A8 DC62 DA0A  34FD 9E51 1E01 C737 F075

* remotes/thibault/tags/samuel-thibault:
  slirp: check len against dhcp options array end
  slirp: fill error when failing to initialize user network

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

tcg: Increase minimum alignment from tcg_malloc to 8

For a 64-bit ILP32 host, aligning to sizeof(long) is not enough.
Guess the minimum for any host is 8, as that covers uint64_t.
Qemu doesn't use a host long double or host vectors, except in
extremely limited circumstances.

Fixes a bus error for a sparc v8plus host.

Signed-off-by: Richard Henderson <rth@twiddle.net>

target/s390x: Fix CSST for 16-byte store

Found by Coverity (CID 1378273).

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>

tcg/arm: Fix runtime overalignment test

Patch 85aa80813dd changed the IF emitting the TST instruction,
but failed to change the ?: converting CMP to CMPEQ, so the
result of the TST is ignored.

Signed-off-by: Richard Henderson <rth@twiddle.net>

build-sys: add --disable-vhost-user

Learn to compile out vhost-user (net, scsi & upcoming users). Keep it
enabled by default on non-win32, that is assumed to be POSIX. Fail if
trying to enable it on win32.

When trying to make a vhost-user netdev, it gives the following error:

-netdev vhost-user,id=foo,chardev=chr-test: Parameter 'type' expects a netdev backend type

And similar error with the HMP/QMP monitors.

While at it, rename CONFIG_VHOST_NET_TEST CONFIG_VHOST_USER_NET_TEST
since it's a vhost-user specific variable.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

slirp: check len against dhcp options array end

While parsing dhcp options string in 'dhcp_decode', if an options'
length 'len' appeared towards the end of 'bp_vend' array, ensuing
read could lead to an OOB memory access issue. Add check to avoid it.

This is CVE-2017-11434.

Reported-by: Reno Robert <renorobert@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>

slirp: fill error when failing to initialize user network

With "-netdev user,id=net0,dns=1.2.3.4"
error was:
qemu-system-i386: -netdev user,id=net0,dns=1.2.3.4: Device 'user' could not be initialized

Error is now:
qemu-system-i386: -netdev user,id=net0,dns=1.2.3.4: DNS doesn't belong to network

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>

cpu: don't allow negative core id

With pseries machine type a negative core-id is not managed properly:
-1 gives an inaccurate error message ("core -1 already populated"),
-2 crashes QEMU (core dump)

As it seems a negative value is invalid for any architecture,
instead of checking this in spapr_core_pre_plug() I think it's better
to check this in the generic part, core_prop_set_core_id()

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20170802103259.25940-1-lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>

target/mips: Fix RDHWR CC with icount

RDHWR CC reads the CPU timer like MFC0 CP0_Count, so with icount enabled
it must set can_do_io while it calls the helper to avoid the "Bad icount
read" error. It should also break out of the translation loop to ensure
that timer interrupts are immediately handled.

Fixes: 2e70f6efa8b9 ("Add instruction counter.")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Yongbok Kim <yongbok.kim@imgtec.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>

target/mips: Drop redundant gen_io_start/stop()

DMTC0 CP0_Cause does a redundant gen_io_start() and gen_io_end() pair,
even though this is done for all DMTC0 operations outside of the switch
statement. Remove these redundant calls.

Fixes: 5dc5d9f055c5 ("mips: more fixes to the MIPS interrupt glue logic")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Yongbok Kim <yongbok.kim@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>

target/mips: Use BS_EXCP where interrupts are expected

Commit e350d8ca3ac7 ("target/mips: optimize indirect branches") made
indirect branches able to directly find the next TB and jump straight to
it without breaking out of translated code and going around the main
execution loop. This breaks the assumption in target/mips/translate.c
that BS_STOP is sufficient to cause pending interrupts to be handled,
since interrupts are only checked in the main loop.

Fix a few of these assumptions by using gen_save_pc to update the saved
PC and using BS_EXCP instead of BS_STOP:

- [D]MFC0 CP0_Count may trigger a timer interrupt which should be
   immediately handled.

- [D]MTC0 CP0_Cause may trigger an interrupt (but in fact translation
   was only even being stopped in the DMTC0 case).

- [D]MTC0 CP0_<any> when icount is used is assumed could potentially
   cause interrupts.

- EI may trigger an interrupt which was pending. I specifically hit
   this case when running KVM nested in mipsel-softmmu. A timer
   interrupt while the 2nd guest was executing is caught by KVM which
   switches back to the normal Linux exception base and re-enables
   interrupts with EI. Since the above commit QEMU doesn't leave
   translated code until the nested KVM has already restored the KVM
   exception base and returned to the 2nd guest, at which point it is
   too late to check for pending interrupts and it gets stuck in an
   infinite loop of unhandled interrupts.

Something similar was needed for ARM in commit b29fd33db578
("target/arm: use DISAS_EXIT for eret handling").

Fixes: e350d8ca3ac7 ("target/mips: optimize indirect branches")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Yongbok Kim <yongbok.kim@imgtec.com>
Cc: Richard Henderson <rth@twiddle.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>

target-mips: apply CP0.PageMask before writing into TLB entry

PFN0 and PFN1 have to be masked out with PageMask_Mask.

Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
[Yongbok Kim:
Added commit message]
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>

mips: Add KVM T&E segment support for TCG

MIPS KVM trap & emulate guest kernels have a different segment layout
compared with traditional MIPS kernels, to allow both the user and
kernel code to run from the user address segment without repeatedly
trapping to KVM.

QEMU currently supports this layout only for KVM, but its sometimes
useful to be able to run these kernels in QEMU on a PC, so enable it for
TCG too.

This also paves the way for MIPS KVM VZ support (which uses the normal
virtual memory layout) by abstracting whether user mode kernel segments
are in use.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Yongbok Kim <yongbok.kim@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org
Reviewed-by: Richard Henderson <rth@twiddle.net>
[Yongbok Kim:
minor change]
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>

mips: Improve segment defs for KVM T&E guests

Improve the segment definitions used by get_physical_address() to yield
target_ulong types, e.g. 0xffffffff80000000 instead of 0x80000000. This
is in preparation for enabling emulation of MIPS KVM T&E segments in TCG
MIPS targets, which unlike KVM could potentially have 64-bit
target_ulong. In such a case the offset guest KSEG0 address ends up at
e.g. 0x000000008xxxxxxx instead of 0xffffffff8xxxxxxx.

This also allows the casts to int32_t that force sign extension to be
removed, which removes any confusion due to relational comparison of
unsigned (target_ulong) and signed (int32_t) types.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Yongbok Kim <yongbok.kim@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>

mips/malta: leave space for the bootmap after the initrd

Since commit 9768e2abf7 the initrd is loaded at the end of the low
memory to avoid clash for the kernel relocation when kaslr is used.

However this in turn conflicts with the bootmap memory that the kernel
tries to place after initrd, but in low memory. The bootmap spans the
whole usable physical address space. The machine can have at most 2GiB
of memory, 256MiB of low memory mapped at 0x00000000, and 1792MiB of
high memory mapped at 0x90000000. The biggest bootmap therefore
corresponds to the adresses 0x00000000 -> 0xffffffff, which at 1 bit
per 4kiB page corresponds to 128kiB in memory.

Therefore reserve 128kiB after the initrd.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Tested-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>

target-mips: Don't stop on [d]mtc0 DESAVE/KScratch

Writing to the MIPS DESAVE register (and now the KScratch registers)
will stop translation, supposedly due to risk of execution mode
switches. However these registers are basically RW scratch registers
with no side effects so there is no risk of them triggering execution
mode changes.

Drop the bstate = BS_STOP for these registers for both mtc0 and dmtc0.

Fixes: 7a387fffce50 ("Add MIPS32R2 instructions, and generally straighten out the instruction decoding. This is also the first percent towards MIPS64 support.")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Yongbok Kim <yongbok.kim@imgtec.com>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>

Update version for v2.10.0-rc1 release

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

tests/hmp: Fix typo in the 'chardev-send-break' test

testchardev2 is not a valid chardev id here. Use testchardev1
instead which has been created with chardev-add right before
the 'chardev-send-break' line.
And while we're at it, add the test-hmp.c file to the MAINTAINERS
file, too.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-id: 1501149097-19071-1-git-send-email-thuth@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/dgilbert/tags/pull-migration-20170802a' into staging

Migration pull 2017-08-02

Just minor fixes for 2.10

# gpg: Signature made Wed 02 Aug 2017 14:55:21 BST
# gpg:                using RSA key 0x0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A  9FA9 0516 331E BC5B FDE7

* remotes/dgilbert/tags/pull-migration-20170802a:
  io: fix qio_channel_socket_accept err handling
  migration: fix comment disorder in RAMState
  migration: fix small leaks

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

io: fix qio_channel_socket_accept err handling

When accept failed, we should setup errp with the reason. More
importantly, the caller may assume errp be non-NULL when error happens,
and not setting the errp may crash QEMU.

At the same time, move the trace_qio_channel_socket_accept_fail() after
the if check on EINTR. Two reasons:

1. when EINTR happened, it's not really a fault (we should just try
   again), so we should not log with an "accept failure".

2. trace_*() functions may overwrite errno, then the old errno will be
   missing. We need to either check errno before trace_*() calls, or
   reserve the errno.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1501666880-10159-3-git-send-email-peterx@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

migration: fix comment disorder in RAMState

Comments for "migration_dirty_pages" and "bitmap_mutex" are switched.
Fix it.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1501666880-10159-2-git-send-email-peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

migration: fix small leaks

Spotted thanks to valgrind and tests/device-introspect-test:

==11711== 1 bytes in 1 blocks are definitely lost in loss record 6 of 14,537
==11711==    at 0x4C2EB6B: malloc (vg_replace_malloc.c:299)
==11711==    by 0x1E0CDBD8: g_malloc (gmem.c:94)
==11711==    by 0x1E0E696E: g_strdup (gstrfuncs.c:363)
==11711==    by 0x695693: migration_instance_init (migration.c:2226)
==11711==    by 0x717C4B: object_init_with_type (object.c:344)
==11711==    by 0x717E80: object_initialize_with_type (object.c:375)
==11711==    by 0x7182EB: object_new_with_type (object.c:483)
==11711==    by 0x718328: object_new (object.c:493)
==11711==    by 0x4B8A29: qmp_device_list_properties (qmp.c:542)
==11711==    by 0x4A9561: qmp_marshal_device_list_properties (qmp-marshal.c:1425)
==11711==    by 0x819D4A: do_qmp_dispatch (qmp-dispatch.c:104)
==11711==    by 0x819E82: qmp_dispatch (qmp-dispatch.c:131)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20170801160419.14180-1-marcandre.lureau@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

pc, acpi, virtio: fixes, test speedup for rc1

Some fixes all over the place. Notably vhost-user gained a new message
to set endian-ness. Borderline for 2.10 but seems to be the only way to
fix legacy guests.  Also pc tests are run on kvm now. Not a fix at all
but doesn't touch qemu itself, so I merged it since I had to run these a
lot and I just got tired of waiting for these to finish.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Tue 01 Aug 2017 22:36:47 BST
# gpg:                using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream:
  pc: acpi: force FADT rev1 for 440fx based machine types
  pc: make 'pc.rom' readonly when machine has PCI enabled
  vhost-user: fix watcher need be removed when vhost-user hotplug
  tests/bios-tables-test: Compiler warning fix
  accel: cleanup error output
  intel_iommu: use access_flags for iotlb
  intel_iommu: fix iova for pt
  vhost-user: fix legacy cross-endian configurations
  vhost: fix a memory leak
  tests: switch pxe and vm gen id tests to use kvm

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

pc: acpi: force FADT rev1 for 440fx based machine types

w2k used to boot on QEMU until revision of FADT has
been bumped to rev3
(commit 77af8a2b hw/i386: Use Rev3 FADT (ACPI 2.0) instead of Rev1 to improve guest OS support.)

Keep PC machine at rev1 to remain compatible and Q35
at rev3 where w2k isn't supported anyway so OSX could
run as well.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: John Arbuckle <programmingkidx@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

pc: make 'pc.rom' readonly when machine has PCI enabled

looking at bios ROM mapping in QEMU it seems that only isapc
(i.e. not PCI enabled machine) requires ROM being mapped as
RW in other cases BIOS is mapped as RO. Do the same for option
ROM 'pc.rom' when machine has PCI enabled.

As useful side-effect pc.rom MemoryRegion stops being
put in vhost memory map (filtered out by vhost_section()),
which reduces number of entries by 1.

Coincidentally it fixes migration failure reported in

"[PATCH V2]  vhost: fix a migration failed because of vhost region merge"

where following destination CLI with /sys/module/vhost/parameters/max_mem_regions = 8

export DIMMSCOUNT=6
QEMU -enable-kvm \
     -netdev type=tap,id=guest0,vhost=on,script=no,vhostforce \
     -device virtio-net-pci,netdev=guest0 \
     -m 256,slots=256,maxmem=2G \
     `i=0; while [ $i -lt $DIMMSCOUNT ]; do echo \
         "-object memory-backend-ram,id=m$i,size=128M \
          -device pc-dimm,id=d$i,memdev=m$i"; i=$(($i + 1)); \
     done`

will fail to startup with error:

"-device pc-dimm,id=d5,memdev=m5: a used vhost backend has no free memory slots left"

while it's possible to add the 6th DIMM during hotplug
on source.

Issue is caused by the fact that number of entries in vhost map
is bigger on 1 entry, when -device is processed, than
after guest boots up, and that offending entry belongs to
'pc.rom', it's not like vhost intends to do IO in ROM range
so making it RO hides region from vhost and makes number
of entries in vhost memory map at -device/machine_done time
match number of entries after guest boots.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reported-by: Peng Hao <peng.hao2@zte.com.cn>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

vhost-user: fix watcher need be removed when vhost-user hotplug

"nc" is freed after hotplug vhost-user, but the watcher is not removed.
The QEMU crash when the watcher access the "nc" when socket disconnects.

    Program received signal SIGSEGV, Segmentation fault.
    #0  object_get_class (obj=obj@entry=0x2) at qom/object.c:750
    #1  0x00007f9bb4180da1 in qemu_chr_fe_disconnect (be=<optimized out>) at chardev/char-fe.c:372
    #2  0x00007f9bb40d1100 in net_vhost_user_watch (chan=<optimized out>, cond=<optimized out>, opaque=<optimized out>) at net/vhost-user.c:188
    #3  0x00007f9baf97f99a in g_main_context_dispatch () from /usr/lib64/libglib-2.0.so.0
    #4  0x00007f9bb41d7ebc in glib_pollfds_poll () at util/main-loop.c:213
    #5  os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:261
    #6  main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:515
    #7  0x00007f9bb3e266a7 in main_loop () at vl.c:1917
    #8  main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4786

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

tests/bios-tables-test: Compiler warning fix

gcc 7.1.1 in fedora 26 moans about the:
tables = g_new0(uint32_t, tables_nr)

because it can't convince itself that tables_nr is positive.
This is fallout from g_assert_cmpint no longer necessarily being
no-return; replace it with a plain g_assert.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>

accel: cleanup error output

Only emit "XXX accelerator not found", if there are not
further accelerators listed. eg

accel=kvm:tcg

doesn't print a "KVM accelerator not found" warning
when it falls back to tcg, but a

accel=kvm

prints a warning, since no fallback is given.

Suggested-by: Daniel P. Berrange <berrange@redhat.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>

intel_iommu: use access_flags for iotlb

It was cached by read/write separately. Let's merge them.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

intel_iommu: fix iova for pt

IOMMUTLBEntry.iova is returned incorrectly on one PT path (though mostly
we cannot really trigger this path, even if we do, we are mostly
disgarding this value, so it didn't break anything). Fix it by
converting the VTD_PAGE_MASK into the correct definition
VTD_PAGE_MASK_4K, then remove VTD_PAGE_MASK.

Fixes: b93130 ("intel_iommu: cleanup vtd_{do_}iommu_translate()")
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

vhost-user: fix legacy cross-endian configurations

Currently, vhost-user does not implement any means for notifying the
backend about guest endianess. This commit introduces a new message
called VHOST_USER_SET_VRING_ENDIAN which is analogous to the ioctl()
called VHOST_SET_VRING_ENDIAN used for kernel vhost backends. Such
message is necessary for backends supporting legacy (pre-1.0) virtio
devices running in big-endian guests.

Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Signed-off-by: Mike Cui <cui@nutanix.com>

vhost: fix a memory leak

vhost exists a call for g_file_get_contents, but not call g_free.

Signed-off-by: Peng Hao<peng.hao2@zte.com.cn>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>

tests: switch pxe and vm gen id tests to use kvm

Speed up tests on host systems with kvm support.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Thomas Huth <thuth@redhat.com>
Cc: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* Xen fix (Anthony)
* chardev fixes (Anton, Marc-André)
* small dead code removal (Zhongyi)
* documentation (Dan)
* bugfixes (David)
* decrease migration downtime (Jay)
* improved error output (Laurent)
* RTC tests and bugfix (me)
* Bluetooth clang analyzer fix (me)
* KVM CPU hotplug race (Peng Hao)
* Two other patches from Philippe's clang analyzer series

# gpg: Signature made Tue 01 Aug 2017 16:56:21 BST
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream:
  mc146818rtc: implement UIP latching as intended
  mc146818rtc: simplify check_update_timer
  rtc-test: introduce more update tests
  rtc-test: cleanup register_b_set_flag test
  hw/scsi/vmw_pvscsi: Convert to realize
  hw/scsi/vmw_pvscsi: Remove the dead error handling
  migration: optimize the downtime
  qemu-options: document existance of versioned machine types
  bt: stop the sdp memory allocation craziness
  exec: Add lock parameter to qemu_ram_ptr_length
  target-i386: kvm_get/put_vcpu_events don't handle sipi_vector
  docs: document deprecation policy & deprecated features in appendix
  char: don't exit on hmp 'chardev-add help'
  char-fd: remove useless chr pointer
  accel: cleanup error output
  cpu_physical_memory_sync_dirty_bitmap: Fix alignment check
  vl.c/exit: pause cpus before closing block devices

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer patches for 2.10.0-rc1

# gpg: Signature made Tue 01 Aug 2017 17:10:52 BST
# gpg:                using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream:
  block/qapi: Remove redundant NULL check to silence Coverity
  qemu-iotests/059: Fix leaked image files
  qemu-iotests/063: Fix leaked image
  qemu-iotests/162: Fix leaked temporary files
  qemu-iotests/153: Fix leaked scratch images
  qemu-iotests/141: Fix image cleanup
  qemu-iotests: Remove blkdebug.conf after tests
  qemu-iotests/041: Fix leaked scratch images
  block: fix leaks in bdrv_open_driver()
  block: fix dangling bs->explicit_options in block.c
  iotests: Add test of recent fix to 'qemu-img measure'
  iotests: Check dirty bitmap statistics in 124
  iotests: Redirect stderr to stdout in 186
  iotests: Fix test 156

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

block/qapi: Remove redundant NULL check to silence Coverity

When skipping implicit nodes in bdrv_block_device_info(), we know that
bs0 is always non-NULL; initially, because it's taken from a BdrvChild
and a BdrvChild never has a NULL bs, and after the first iteration
because implicit nodes always have a backing file.

Remove the NULL check and add an assertion that the implicit node does
indeed have a backing file.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>

qemu-iotests/059: Fix leaked image files

qemu-iotests 059 left a whole lot of image files behind in the scratch
directory because VMDK creates additional files for extents and cleaning
them up requires the original image intact (it parses qemu-img info
output to find all extent files), but the image overwrote it many times
like it works for all other image formats.

In addition, _use_sample_img overwrites the TEST_IMG variable, causing
new images created afterwards to reuse the name of the sample file
rather than the usual t.IMGFMT.

This patch adds an intermediate _cleanup_test_img after each subtest
that created an image file with additional extent files, and also after
each use of a sample image. _cleanup_test_img is also changed so that it
resets TEST_IMG after a sample image is cleaned up.

Note that this test was failing before this commit and continues to do
so after it. This failure was introduced in commit 9877860 ('block/vmdk:
Report failures in vmdk_read_cid()') and needs to be dealt with
separately.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

qemu-iotests/063: Fix leaked image

qemu-iotests 063 left t.raw.raw1 behind in the scratch directory because
it used the wrong suffix. Make sure to clean it up after completing the
test.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

qemu-iotests/162: Fix leaked temporary files

qemu-iotests 162 left qemu-nbd.pid behind in the scratch directory, and
potentially a file called '42' in the current directory. Make sure to
clean it up after completing the tests.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

qemu-iotests/153: Fix leaked scratch images

qemu-iotests 153 left t.qcow2.c behind in the scratch directory. Make
sure to clean it up after completing the tests.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

qemu-iotests/141: Fix image cleanup

qemu-iotests 141 attempted to use brace expansion to remove all images
with a single command. However, for this to work, the braces shouldn't
be quoted.

With this fix, the tests correctly cleans up its scratch images.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

qemu-iotests: Remove blkdebug.conf after tests

qemu-iotests 074 and 179 left a blkdebug.conf behind in the scratch
directory. Make sure to clean up after completing the tests.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

qemu-iotests/041: Fix leaked scratch images

qemu-iotests 041 left quorum_snapshot.img and target.img behind in the
scratch directory. Make sure to clean up after completing the tests.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

block: fix leaks in bdrv_open_driver()

bdrv_open_driver() is called in two places, bdrv_new_open_driver() and
bdrv_open_common(). In the latter, failure cleanup in is in its caller,
bdrv_open_inherit(), which unrefs the bs->file of the failed driver open
if it exists.

Let's move the bs->file cleanup to bdrv_open_driver() to take care of
all callers and do not set bs->drv to NULL unless the driver's open
function failed. When bs is destroyed by removing its last reference, it
calls bdrv_close() which checks bs->drv to perform the needed cleanups
and also call the driver's close function. Since it cleans up options
and opaque we must take care not leave dangling pointers.

The error paths in bdrv_open_driver() are now two:
If open fails, drv->bdrv_close() should not be called. Unref the child
if it exists, free what we allocated and set bs->drv to NULL. Return the
error and let callers free their stuff.

If open succeeds but we fail after, return the error and let callers
unref and delete their bs, while cleaning up their allocations.

Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: fix dangling bs->explicit_options in block.c

In some error paths it is possible to QDECREF a freed dangling
explicit_options, resulting in a heap overflow crash. For example
bdrv_open_inherit()'s fail unrefs it, then calls bdrv_unref which calls
bdrv_close which also unrefs it.

Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

iotests: Add test of recent fix to 'qemu-img measure'

The new test 190 ensures we don't regress back to an infinite loop when
measuring the size of a 2T+ qcow2 image. I did not append to test 178,
because that test is also designed to run with format 'raw'; also, this
gives us some coverage of the measure command under the quick group.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

iotests: Check dirty bitmap statistics in 124

We had a bug for multiple releases where dirty-bitmap count was
documented in bytes but reported in sectors; enhance the testsuite
to add coverage of DirtyBitmapInfo to ensure we do not regress again.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

iotests: Redirect stderr to stdout in 186

Without redirecting qemu's stderr to stdout, _filter_qemu will not apply
to warnings. This results in $QEMU_PROG not being replaced by QEMU_PROG
which is not great if your qemu executable is not called
qemu-system-x86_64 (e.g. qemu-system-i386).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

iotests: Fix test 156

On one hand, the _make_test_img invocation for creating the target image
was missing a -u because its backing file is not supposed to exist at
that point.

On the other hand, nobody noticed probably because the backing file is
created later on and _cleanup failed to remove it: The quotation marks
were misplaced so bash tried to delete a file literally called
"$TEST_IMG{,.target}..." instead of performing brace expansion. Thus, the
files stayed around after the first run and qemu-img create did not
complain about a missing backing file on any run but the first.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

mc146818rtc: implement UIP latching as intended

In some cases, the guest can observe the wrong ordering of UIP and
interrupts.  This can happen if the VCPU exit is timed like this:

           iothread                 VCPU
                                  ... wait for interrupt ...
t-100ns                           read register A
t          wake up, take BQL
t+100ns                             update_in_progress
                                      return false
                                    return UIP=0
           trigger interrupt

The interrupt is late; the VCPU expected the falling edge of UIP to
happen after the interrupt.  update_in_progress is already trying to
cover this case by latching UIP if the timer is going to fire soon,
and the fix is documented in the commit message for commit 56038ef623
("RTC: Update the RTC clock only when reading it", 2012-09-10).  It
cannot be tested with qtest, because its timing of interrupts vs. reads
is exact.

However, the implementation was incorrect because UIP cmos_ioport_read
cleared register A instead of leaving that to rtc_update_timer.  Fixing
the implementation of cmos_ioport_read to match the commit message,
however, breaks the "uip-stuck" test case from the previous patch.
To fix it, skip update timer optimizations if UIP has been latched.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

mc146818rtc: simplify check_update_timer

Move all the optimized cases together, since they all have UF=1 in
common.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rtc-test: introduce more update tests

Test divider reset and UIP behavior.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rtc-test: cleanup register_b_set_flag test

Introduce set_datetime_bcd/assert_datetime_bcd, and handle
UIP correctly.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

hw/scsi/vmw_pvscsi: Convert to realize

Convert a device model where initialization obviously
can't fail, make it implement realize() rather than init().

Reviewed-by: Dmitry Fleytman <dmitry@daynix.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com>
Message-Id: <20170726084153.10121-2-maozy.fnst@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

hw/scsi/vmw_pvscsi: Remove the dead error handling

qemu_bh_new() is a wrapper around aio_bh_new(), which returns
null only when g_new() does. It doesn't. So remove the dead
error handling.

Reviewed-by: Dmitry Fleytman <dmitry@daynix.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com>
Message-Id: <20170726084153.10121-1-maozy.fnst@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

migration: optimize the downtime

Qemu_savevm_state_cleanup takes about 300ms in my ram migration tests
with a 8U24G vm(20G is really occupied), the main cost comes from
KVM_SET_USER_MEMORY_REGION ioctl when mem.memory_size = 0 in
kvm_set_user_memory_region. In kmod, the main cost is
kvm_zap_obsolete_pages, which traverses the active_mmu_pages list to
zap the unsync sptes.

It can be optimized by delaying memory_global_dirty_log_stop to the next
vm_start.

Changes v2->v3:
- NULL VMChangeStateHandler if it is deleted and protect the scenario
of nested invocations of memory_global_dirty_log_start/stop [Paolo]

Changes v1->v2:
- create a VMChangeStateHandler in memory.c to reduce the coupling [Paolo]

Signed-off-by: Jay Zhou <jianjay.zhou@huawei.com>
Message-Id: <1501237733-2736-1-git-send-email-jianjay.zhou@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

qemu-options: document existance of versioned machine types

The -machine docs did not explain what the versioned machine
types are for, nor that they'll be maintained across
releases.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <20170725141041.1195-1-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

bt: stop the sdp memory allocation craziness

Clang static analyzer reports a memory leak.  Actually, the allocated
memory escapes here:

        record->attribute_list[record->attributes].pair = data;

but clang is correct that the memory might leak if len is zero.  We
know it isn't; assert that it is the case.

The craziness doesn't end there.  The memory is freed by
bt_l2cap_sdp_close_ch:

       g_free(sdp->service_list[i].attribute_list->pair);

which actually should have been written like this:

       g_free(sdp->service_list[i].attribute_list[0].pair);

The attribute_list is sorted with qsort; but indeed the first
entry of attribute_list should point to "data" even after the qsort,
because the first record has id SDP_ATTR_RECORD_HANDLE, whose
numeric value is zero.

But hang on.  The qsort function is

    static int sdp_attributeid_compare(
                const struct sdp_service_attribute_s *a,
                const struct sdp_service_attribute_s *b)
    {
        return (int) b->attribute_id - a->attribute_id;
    }

but no one ever writes attribute_id.  So it only works if qsort is
stable, and who knows what else is broken, but we can fix it by
setting attribute_id in the while loop.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

exec: Add lock parameter to qemu_ram_ptr_length

Commit 04bf2526ce87f21b32c9acba1c5518708c243ad0 (exec: use
qemu_ram_ptr_length to access guest ram) start using qemu_ram_ptr_length
instead of qemu_map_ram_ptr, but when used with Xen, the behavior of
both function is different. They both call xen_map_cache, but one with
"lock", meaning the mapping of guest memory is never released
implicitly, and the second one without, which means, mapping can be
release later, when needed.

In the context of address_space_{read,write}_continue, the ptr to those
mapping should not be locked because it is used immediatly and never
used again.

The lock parameter make it explicit in which context qemu_ram_ptr_length
is called.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Message-Id: <20170726165326.10327-1-anthony.perard@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target-i386: kvm_get/put_vcpu_events don't handle sipi_vector

qemu call kvm_get_vcpu_events, and kernel return sipi_vector always
0, never valid when reporting to user space. But when qemu calls
kvm_put_vcpu_events will make sipi_vector in kernel be 0. This will
accidently modify sipi_vector when sipi_vector in kernel is not 0.

Signed-off-by: Peng Hao <peng.hao2@zte.com.cn>
Reviewed-by: Liu Yi <liu.yi24@zte.com.cn>
Message-Id: <1500047256-8911-1-git-send-email-peng.hao2@zte.com.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

docs: document deprecation policy & deprecated features in appendix

The deprecation of features in QEMU is totally adhoc currently,
with no way for the user to get a list of what is deprecated
in each release. This adds an appendix to the doc that records
when each deprecation was made and provides text explaining
what to use instead, if anything.

Since there has been no formal policy around removal of deprecated
features in the past, any deprecations prior to 2.10.0 are to be
treated as if they had been made at the 2.10.0 release. Thus the
earliest that existing deprecations will be deleted is the start
of the 2.12.0 cycle.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <20170725113638.7019-1-berrange@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

char: don't exit on hmp 'chardev-add help'

qemu_chr_new_from_opts() is used from both vl.c and hmp,
and it is quite confusing to see qemu suddenly exit after receiving a help
option in hmp.

Do exit(0) from vl.c instead.

Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
Message-Id: <1500977081-120929-1-git-send-email-anton.nefedov@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

char-fd: remove useless chr pointer

Apparently unused since it was introduced in commit
a29753f8aa79a34a324afebe340182a51a5aef11. Now, it can be trivially
accessed by CHARDEV() of self.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20170720100046.4424-1-marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

accel: cleanup error output

Only emit "XXX accelerator not found", if there are not
further accelerators listed. eg

accel=kvm:tcg

doesn't print a "KVM accelerator not found" warning
when it falls back to tcg, but a

accel=kvm

prints a warning, since no fallback is given.

Suggested-by: Daniel P. Berrange <berrange@redhat.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20170717144527.24534-1-lvivier@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

cpu_physical_memory_sync_dirty_bitmap: Fix alignment check

This code has an optimised, word aligned version, and a boring
unaligned version.  Recently 084140bd498909 fixed a missing offset
addition from the core of both versions.  However, the offset isn't
necessarily aligned and thus the choice between the two versions
needs fixing up to also include the offset.

Symptom:
  A few stuck unsent pages during migration; not normally noticed
unless under very low bandwidth in which case the migration may get
stuck never ending and never performing a 2nd sync; noticed by
a hanging postcopy-test on a very heavily loaded system.

Fixes: 084140bd498909
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reported-by: Alex BenneÃ© <alex.benee@linaro.org>
Tested-by: Alex BenneÃ© <alex.benee@linaro.org>
--
v2
  Move 'page' inside the if (Comment from Paolo)
Message-Id: <20170724165125.29887-1-dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

vl.c/exit: pause cpus before closing block devices

There's a rare exit seg if the guest is accessing
IO during exit.
It's always hitting the atomic_inc(&bs->in_flight) with a NULL
bs. This was added recently in 99723548 but I don't see it
as the cause.

Flip vl.c around so we pause the cpus before closing the block devices,
that way we shouldn't have anything trying to access them when
they're gone.

This was originally Red Hat bz https://bugzilla.redhat.com/show_bug.cgi?id=1451015

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reported-by: Cong Li <coli@redhat.com>
--
This is a very rare race, I'll leave it running in a loop to see if
we hit anything else and to check this really fixes it.

I do worry if there are other cases that can trigger this - e.g.
hot-unplug or ejecting a CD.

Message-Id: <20170713190116.21608-1-dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging

Pull request

Fixes for inconsistencies in the trace event format strings, broken
trace_event_get_state() usage, and handle_qmp_command() fix.

# gpg: Signature made Tue 01 Aug 2017 14:16:05 BST
# gpg:                using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/tracing-pull-request:
  monitor: Reduce handle_qmp_command() tracing overhead
  trace-events: fix code style: print 0x before hex numbers
  checkpatch: check trace-events code style
  trace-events: fix code style: %# -> 0x%
  coding_style: add point about 0x in trace-events
  trace: add trace_event_get_state_backends()
  trace: add TRACE_<event>_BACKEND_DSTATE()
  trace: ensure unique function / variable names per .stp file
  trace: ensure .stp files are rebuilt if trace tool source changes

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

monitor: Reduce handle_qmp_command() tracing overhead

We are malloc'ing a QString and spending CPU cycles on converting a
QObject to string, just for the sake of sticking the string in the trace
message. Wasted when we aren't tracing. Avoid that.

[Commit message and description suggested by Markus Armbruster to
provide more detail about the rationale for this patch.

Use trace_event_get_state_backends() instead of trace_event_get_state()
to honor DTrace/UST backend dstates.
--Stefan]

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20170725143923.11241-1-den@openvz.org
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Lluís Vilanova <vilanova@ac.upc.edu>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

trace-events: fix code style: print 0x before hex numbers

The only exception are groups of numers separated by symbols
'.', ' ', ':', '/', like 'ab.09.7d'.

This patch is made by the following:

> find . -name trace-events | xargs python script.py

where script.py is the following python script:
=========================
#!/usr/bin/env python

import sys
import re
import fileinput

rhex = '%[-+ *.0-9]*(?:[hljztL]|ll|hh)?(?:x|X|"\s*PRI[xX][^"]*"?)'
rgroup = re.compile('((?:' + rhex + '[.:/ ])+' + rhex + ')')
rbad = re.compile('(?<!0x)' + rhex)

files = sys.argv[1:]

for fname in files:
    for line in fileinput.input(fname, inplace=True):
        arr = re.split(rgroup, line)
        for i in range(0, len(arr), 2):
            arr[i] = re.sub(rbad, '0x\g<0>', arr[i])

        sys.stdout.write(''.join(arr))
=========================

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Message-id: 20170731160135.12101-5-vsementsov@virtuozzo.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

checkpatch: check trace-events code style

According to CODING_STYLE, check that in trace-events:
1. hex numbers are prefixed with '0x'
2. '#' flag of printf is not used
3. The exclusion from 1. are period-separated groups of numbers

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170731160135.12101-4-vsementsov@virtuozzo.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

trace-events: fix code style: %# -> 0x%

In trace format '#' flag of printf is forbidden. Fix it to '0x%'.

This patch is created by the following:

check that we have a problem
> find . -name trace-events | xargs grep '%#' | wc -l
56

check that there are no cases with additional printf flags before '#'
> find . -name trace-events | xargs grep "%[-+ 0'I]+#" | wc -l
0

check that there are no wrong usage of '#' and '0x' together
> find . -name trace-events | xargs grep '0x%#' | wc -l
0

fix the problem
> find . -name trace-events | xargs sed -i 's/%#/0x%/g'

[Eric Blake noted that xargs grep '%[-+ 0'I]+#' should be xargs grep
"%[-+ 0'I]+#" instead so the shell quoting is correct.
--Stefan]

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20170731160135.12101-3-vsementsov@virtuozzo.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

coding_style: add point about 0x in trace-events

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20170731160135.12101-2-vsementsov@virtuozzo.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

trace: add trace_event_get_state_backends()

Code that checks dstate is unaware of SystemTap and LTTng UST dstate, so
the following trace event will not fire when solely enabled by SystemTap
or LTTng UST:

  if (trace_event_get_state(TRACE_MY_EVENT)) {
      str = g_strdup_printf("Expensive string to generate ...",
                            ...);
      trace_my_event(str);
      g_free(str);
  }

Add trace_event_get_state_backends() to fetch backend dstate.  Those
backends that use QEMU dstate fetch it as part of
generate_h_backend_dstate().

Update existing trace_event_get_state() callers to use
trace_event_get_state_backends() instead.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170731140718.22010-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

trace: add TRACE_<event>_BACKEND_DSTATE()

QEMU keeps track of trace event enabled/disabled state and provides
monitor commands to inspect and modify the "dstate".  SystemTap and
LTTng UST maintain independent enabled/disabled states for each trace
event, the other backends rely on QEMU dstate.

Introduce a new per-event macro that combines backend-specific dstate
like this:

  #define TRACE_MY_EVENT_BACKEND_DSTATE() ( \
      QEMU_MY_EVENT_ENABLED() || /* SystemTap */ \
      tracepoint_enabled(qemu, my_event) /* LTTng UST */ || \
      false)

This will be used to extend trace_event_get_state() in the next patch.

[Daniel Berrange pointed out that QEMU_MY_EVENT_ENABLED() must be true
by default, not false.  This way events will fire even if the DTrace
implementation does not implement the SystemTap semaphores feature.

Ubuntu Precise uses lttng-ust-dev 2.0.2 which does not have
tracepoint_enabled(), so we need a compatibility wrapper to keep Travis
builds passing.
--Stefan]

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170731140718.22010-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
fixup! trace: add TRACE_<event>_BACKEND_DSTATE()