git.ipfire.org Git - thirdparty/qemu.git/log

Merge remote-tracking branch 'kwolf/for-anthony' into staging

# By Paolo Bonzini (14) and others
# Via Kevin Wolf
* kwolf/for-anthony: (24 commits)
  ide: Add fall through annotations
  block: Create proper size file for disk mirror
  ahci: Add migration support
  ahci: Change data types in preparation for migration
  ahci: Remove unused AHCIDevice fields
  hbitmap: add assertion on hbitmap_iter_init
  mirror: do nothing on zero-sized disk
  block/vdi: Check for bad signature
  block/vdi: Improved return values from vdi_open
  block/vdi: Improve debug output for signature
  block: Use error code EMEDIUMTYPE for wrong format in some block drivers
  block: Add special error code for wrong format
  mirror: support arbitrarily-sized iterations
  mirror: support more than one in-flight AIO operation
  mirror: add buf-size argument to drive-mirror
  mirror: switch mirror_iteration to AIO
  mirror: allow customizing the granularity
  block: allow customizing the granularity of the dirty bitmap
  block: return count of dirty sectors, not chunks
  mirror: perform COW if the cluster size is bigger than the granularity
  ...

Merge remote-tracking branch 'luiz/queue/qmp' into staging

# By Lei Li (3) and others
# Via Luiz Capitulino
* luiz/queue/qmp:
  QAPI: Introduce memchar-read QMP command
  QAPI: Introduce memchar-write QMP command
  qemu-char: Add new char backend CirMemCharDriver
  docs: document virtio-balloon stats
  balloon: re-enable balloon stats
  balloon: drop old stats code & API
  block: Monitor command commit neglects to report some errors

xilinx_ethlite: Avoid build warnings in debug code

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>

m25p80.c: Return state to IDLE after COLLECTING

Default to moving back to the IDLE state after the COLLECTING_DATA
state. For a well behaved guest this patch has no consequence, but
A bad guest could crash QEMU by using one of the erase commands
followed by a longer than 5 byte argument (undefined behaviour).

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>

xilinx_ethlite: Flush queued packets on SW service

Software services a received packet by clearing the CTRL_S bit in the RX_CTRLn
register. If this bit is cleared, flush any packets queued for the device.

Reported-by: John Williams <john.williams@xilinx.com>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>

xilinx_ethlite: fix eth_can_rx() for ping-pong

The eth_can_rx() function only checks the first buffers status ("ping"). The
controller should be able to receive into "pong" when ping-pong is enabled.
Checks the active buffer (either "ping" or "pong") when determining can_rx()
rather than just testing "ping".

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>

Merge branch 'ppc-for-upstream' of git://repo.or.cz/qemu/agraf

* 'ppc-for-upstream' of git://repo.or.cz/qemu/agraf:
  PPC: e500: Select MPIC v4.2 on ppce500 platform
  PPC: e500: fix mpic_iack address
  openpic: add basic support for MPIC v4.2
  openpic: fix timer address decoding
  openpic: fix remaining issues from idr-to-destmask conversion
  pseries: Adjust default VIO address allocations to play better with libvirt
  pseries: Improve handling of multiple PCI host bridges
  target-ppc: Give a meaningful error if too many threads are specified
  cuda: Move ADB bus into CUDA state
  adb: QOM'ify ADB devices
  adb: QOM'ify Apple Desktop Bus
  cuda: QOM'ify CUDA
  ide/macio: QOM'ify MacIO IDE
  mac_nvram: QOM'ify MacIO NVRAM
  mac_nvram: Mark as Big Endian
  mac_nvram: Clean up public API
  macio: Split MacIO in two
  macio: Delay qdev init until all fields are initialized
  macio: QOM'ify some more
  ppc: Move Mac machines to hw/ppc/

tests: Add gcov support for x86_64 qtest

Since x86_64 is a superset of i386 and reuses all its test cases, adopt
all the i386 gcov source files as well, substituting their paths
appropriately.

Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

tests: Add gcov support for sparc64 qtest

m48t59-test is individually being executed for sparc and sparc64, so add
the gcov source file for sparc64 as well.

Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

tests: Fix gcov typo for tmp105-test

Commit 6e9989034b176a8e4cfdccd85892abfa73977ba7 introduced a new qtest
test case but misspelled gcov, leading to no coverage analysis. Fix it.

Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

vmware_vga: fix out of bounds and invalid rects updating

This is a follow up for several attempts to fix this issue.

Previous incarnations:

1. http://thread.gmane.org/gmane.linux.ubuntu.bugs.general/3156089
https://bugs.launchpad.net/bugs/918791
"qemu-kvm dies when using vmvga driver and unity in the guest" bug.
Fix by Serge Hallyn:
https://launchpadlibrarian.net/94916786/qemu-vmware.debdiff
This fix is incomplete, since it does not check width and height
for being negative.  Serge weren't sure if that's the right place
to fix it, maybe the fix should be up the stack somewhere.

2. http://thread.gmane.org/gmane.comp.emulators.qemu/166064
by Marek Vasut: "vmware_vga: Redraw only visible area"

This one adds the (incomplete) check to vmsvga_update_rect_delayed(),
the routine just queues the rect updating but does no interesting
stuff.  It is also incomplete in the same way as patch by Serge,
but also does not touch width&height at all after adjusting x&y,
which is wrong.

As far as I can see, when processing guest requests, the device
places them into a queue (vmsvga_update_rect_delayed()) and
processes this queue in different place/time, namely, in
vmsvga_update_rect().  Sometimes, vmsvga_update_rect() is
called directly, without placing the request to the gueue.
This is the place this patch changes, which is the last
(deepest) in the stack.  I'm not sure if this is the right
place still, since it is possible we have some queue optimization
(or may have in the future) which will be upset by negative/wrong
values here, so maybe we should check for validity of input
right when receiving request from the guest (and maybe even
use unsigned types there).  But I don't know the protocol
and implementation enough to have a definitive answer.

But since vmsvga_update_rect() has other sanity checks already,
I'm adding the missing ones there as well.

Cc'ing BALATON Zoltan and Andrzej Zaborowski who shows in `git blame'
output and may know something in this area.

If this patch is accepted, it should be applied to all active
stable branches (at least since 1.1, maybe even before), with
minor context change (ds_get_*(s->vga.ds) => s->*).  I'm not
Cc'ing -stable yet, will do it explicitly once the patch is
accepted.

BTW, these checks use fprintf(stderr) -- it should be converted
to something more appropriate, since stderr will most likely
disappear somewhere.

Cc: Marek Vasut <marex@denx.de>
CC: Serge Hallyn <serge.hallyn@ubuntu.com>
Cc: BALATON Zoltan <balaton@eik.bme.hu>
Cc: Andrzej Zaborowski <balrogg@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Marek Vasut <marex@denx.de>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

tests: add fuzzing to visitor tests

Perform input tests on random data.

Improvement to code coverage for qapi/string-input-visitor.c
is about 3 percentage points.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

build: remove *.lo, *.a, *.la files from all subdirectories on make clean

.lo files in stubs/, util/ and libcacard/ were not cleaned.
Fix this.

Cc: Blue Swirl <blauwirbel@gmail.com>
Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

hw/arm_boot: Align device tree to 4KB boundary, not page

Align the device tree blob to a 4KB boundary, not to QEMU's
idea of a page boundary -- the latter is the smallest possible
page size for the architecture, which on ARM is 1KB.
The documentation for Linux does not impose separation
or alignment requirements on the device tree blob, but
in practice some kernels will happily trash the entire
page the initrd ends in after they have finished uncompressing
the initrd. So 4KB-align the DTB to ensure it does not get
trampled by these kernels.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

qemu-char: Avoid unused variable warning in some configs

Avoid unused variable warnings:
qemu-char.c: In function 'qmp_chardev_open_port':
qemu-char.c:3132: warning: unused variable 'fd'
qemu-char.c:3132: warning: unused variable 'flags'

in configurations with neither HAVE_CHARDEV_TTY nor
HAVE_CHARDEV_PARPORT set.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

make_device_config.sh: Fix target path in generated dependency file

config-devices.mak.d is included from Makefile.target, i.e. from inside
the *-softmmu/ directory. It included the directory path, so never
applied to the actual ./config-devices.mak. Symptoms were spurious
build failures due to missing dependency on default-configs/pci.mak.

Fix this by using `basename` to strip the directory path.

Reported-by: Gerhard Wiesinger <lists@wiesinger.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

fw_cfg: Drop a few superfluous initializers

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

fw_cfg: Splash image loader can overrun a stack variable, fix

read_splashfile() passes the address of an int variable as size_t *
parameter to g_file_get_contents(), with a cast to gag the compiler.

No problem on machines where sizeof(size_t) == sizeof(int).

Happens to work on my x86_64 box (64 bit little endian): the least
significant 32 bits of the file size end up in the right place
(caller's variable file_size), and the most significant 32 bits
clobber a place that gets assigned to before its next use (caller's
variable file_type).

I'd expect it to break on a 64 bit big-endian box.

Fix up the variable types and drop the problematic cast.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

softfloat: Handle float_muladd_negate_c when product is zero

Honour float_muladd_negate_c in the case where the product is zero and
c is nonzero. Previously we would fail to negate c.

Seen in (and tested against) the gfortran testsuite on MIPS.

Signed-off-by: Richard Sandiford <rdsandiford@googlemail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

hw/pxa2xx_timer: Explicitly mark fallthroughs

Explicitly mark the fallthroughs as intentional in the code
pattern where we gradually increment an index before falling
into the code to read/write that array entry:
    case THINGY_3: idx++;
    case THINGY_2: idx++;
    case THINGY_1: idx++;
    case THINGY_0: return s->thingy[idx];

This makes static analysers happy.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

hw/smc91c111: Add explicit 'return' rather than relying on fallthrough

Add an explicit 'return' statement to a case in smc91c111_readb
rather than relying on fallthrough to the following case's
return statement, for code clarity and to placate static analysers.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

hw/pflash_cfi02.c: Mark deliberate fallthrough

Mark the deliberate fallthrough where we treat the case of
an attempt to read flash when it is an unknown command
state as if it were a normal read.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

hw/omap_dma, hw/omap_spi: Explicitly mark fallthroughs

Explicitly mark the fallthroughs as intentional in the code
pattern where we gradually increment an index before falling
into the code to read/write that array entry:
  case THINGY_3: idx++;
  case THINGY_2: idx++;
  case THINGY_1: idx++;
  case THINGY_0: return s->thingy[idx];

This makes static analysers happy.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

hw/omap1.c: Add fallthrough markers and breaks

Explicitly mark cases where we are deliberately falling
through to the following code. In one case we insert a
'break' instead of falling through to a 'break', as this
seems slightly clearer.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

hw/arm_sysctl.c: Add missing 'break' statements

Add some break statements that were accidentally omitted
from some cases of arm_sysctl_write(). The omission was
harmless because in both cases the following case did
an immediate break, but adding the breaks explicitly
placates static analysers and avoids weird behaviour if
the following register is ever implemented as something
other than a no-op.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

link seccomp only with softmmu targets

Now, if seccomp is detected, it is linked into every executable,
but is used only by softmmu targets (from vl.c). So link it
only where it is actually needed.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

bsd-user: avoid conflict with qemu_vmalloc

Rename qemu_vmalloc() to bsd_vmalloc(), adjust the only user.

Remove #ifdeffery in oslib-posix.c.

Tested-by: Andreas Färber <andreas.faerber@web.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

build: remove extra-obj-y

extra-obj-y is somewhat complicated to understand. Replace it with a
special CONFIG_ALL symbol that is defined only at toplevel.
This limits the case of directories defining more than one
*-obj-y target.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

build: remove universal-obj-y

All of universal-obj-y, user-obj-y (right now unused) and common-obj-y can
be unified into common-obj-y if we take care of defining CONFIG_SOFTMMU
and CONFIG_USER_ONLY in the toplevel makefile. This is similar to how
we define symbols for hardware components.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

build: use -$(CONFIG_SECCOMP) instead of ifeq

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

build: move around libcacard-y definition

It is also needed if !CONFIG_SOFTMMU, unlike everything that surrounds it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

tests: adjust gcov variables for directory movement

I had missed the introduction of the gcov-files-* variables.

Cc: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

PPC: e500: Select MPIC v4.2 on ppce500 platform

The compatible string is changed to fsl,mpic on all e500 platforms, to
advertise the existence of BRR1. This matches what the device tree will
have on real hardware.

With MPIC v4.2 max_cpu can be increased from 15 to 32.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>

PPC: e500: fix mpic_iack address

MPIC+0xa0 is IACK for the current CPU. MPIC+0x200a0 is IACK for CPU 0.
This fix allows EPR to work with an SMP target.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>

openpic: add basic support for MPIC v4.2

Besides the new value in the version register, this provides:
- ILR support, which includes:
  - IDR becoming a pure CPU bitmap, allowing 32 CPUs
  - machine check output support (though other parts of QEMU need to
    be fixed for it to do something other than immediately reboot the
    guest)
- dummy error interrupt support (EISR0/EIMR0 read as zero)
  - actually all FSL MPICs get all summary registers returning zero for now,
    which includes EISR0/EIMR0

Various refactoring is done to support these changes and to ease
new functionality (e.g. a more flexible way of declaring regions).

Just as the code was already not a full implementation of MPIC v2.0,
this is not a full implementation of MPIC v4.2 -- e.g. it still has only
one bank of MSIs.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>

openpic: fix timer address decoding

The timer memory range begins at 0x10f0, so that address 0x1120 shows
up as 0x30, 0x1130 shows up as 0x40, etc. However, the address
decoding (other than TFRR) is not adjusted for this, causing the
wrong registers to be accessed.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>

openpic: fix remaining issues from idr-to-destmask conversion

openpic_update_irq() was checking idr rather than destmask, treating
it as if it were a simple bitmap of cpus. Changed to use destmask.

IPI delivery was removing bits directly from .idr, without calling
write_IRQreg_idr so that the change could be conveyed to destmask.
Changed to use destmask directly.

Save/restore destmask when serializing, as due to the IPI change it
cannot be reproduced from idr.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>

pseries: Adjust default VIO address allocations to play better with libvirt

Currently, if VIO devices for pseries don't have addresses explicitly
allocated, they get automatically numbered from 0x1000. This is in the
same general range that libvirt will typically assign VIO device addresses.

That means that if there is a device libvirt doesn't know about, and it
gets an address assigned before the libvirt assigned devices are processed,
we can end up with an address conflict (qemu will abort with an error).

While the real solution is to teach libvirt about the other devices, so it
can correctly manage the whole allocation, this patch reduces the interim
inconvenience by moving qemu allocations to a range that libvirt is less
likely to conflict with.

Because the guest gets the device addresses through the device tree, these
addresses are truly arbitrary and can be changed without breaking guests.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>

pseries: Improve handling of multiple PCI host bridges

Multiple - even many - PCI host bridges (i.e. PCI domains) are very
common on real PAPR compliant hardware.  For reasons related to the
PAPR specified IOMMU interfaces, PCI device assignment with VFIO will
generally require at least two (virtual) PHBs and possibly more
depending on which devices are assigned.

At the moment the qemu PAPR PCI code will not deal with this well,
leaving several crucial parameters of PHBs other than the default one
uninitialized.  This patch reworks the code to allow this.

Every PHB needs a unique BUID (Bus Unit Identifier, the id used for
the PAPR PCI related interfaces) and a unique LIOBN (Logical IO Bus
Number, the id used for the PAPR IOMMU related interfaces).  In
addition they need windows in CPU real address space to access PCI
memory space, PCI IO space and MSIs.  Properties are added to the PCI
host bridge qdevice to allow configuration of all these.

To simplify configuration of multiple PHBs for common cases, a
convenience "index" property is also added.  This can be set instead
of the low-level properties, and will generate suitable values for the
other parameters, different for each index value.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>

target-ppc: Give a meaningful error if too many threads are specified

Currently the target-ppc tcg code only supports a single thread. You can
specify more, but they're treated identically to multiple cores. On KVM
we obviously can't support more threads than the hardware; if more are
specified it will cause strange and cryptic errors.

This patch clarifies the situation by giving a simple meaningful error if
more threads are specified than we can support.

Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>

cuda: Move ADB bus into CUDA state

Replace the global adb_bus with a CUDA-internal one, accessed using
regular qdev child bus accessor.

Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>

adb: QOM'ify ADB devices

They were not qdev'ified before. Derive ADBDevice from DeviceState and
convert reset callbacks to DeviceClass::reset, ADBDevice::opaque pointer
to ADBDevice subtypes for mouse and keyboard and adb_{kbd,mouse}_init()
to regular qdev functions.

Fixing Coding Style issues and splitting keyboard and mouse off into
their own files is left for a later point in time.

Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>

adb: QOM'ify Apple Desktop Bus

It was not a qbus before, turn it into a first-class bus and initialize
it properly from CUDA. Leave it a global variable as long as devices are
not QOM'ified yet.

Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>

cuda: QOM'ify CUDA

It was not qdev'ified before. Turn it into a SysBusDevice and embed it
in MacIO.

Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>

ide/macio: QOM'ify MacIO IDE

It was not qdev'ified before. Turn it into a SysBusDevice.
Embed them into the MacIO devices.

Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>

mac_nvram: QOM'ify MacIO NVRAM

It was not qdev'ified before. Turn it into a SysBusDevice and
initialize it via static properties.

Prepare Old World specific MacIO state and embed the NVRAM state there.

Drop macio_nvram_setup_bar() in favor of sysbus_mmio_map() or
direct use of Memory API.

Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>

mac_nvram: Mark as Big Endian

Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>

mac_nvram: Clean up public API

The state data field is accessed in uint8_t quantities, so switch from
uint32_t argument and return value to uint8_t.

Fix debug format specifiers while at it.

Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>

macio: Split MacIO in two

Let the machines create two different types. This prepares to move
knowledge about sub-devices from the machines into the devices.

Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>

macio: Delay qdev init until all fields are initialized

This turns macio_bar_setup() into an implementation detail of the qdev
initfn, to be removed step by step.

Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>

macio: QOM'ify some more

Move bar MemoryRegion initialization to an instance_init.

Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>

ppc: Move Mac machines to hw/ppc/

Signed-off-by: Andreas Färber <afaerber@suse.de>
[agraf: squash in MAINTAINERS fix]
Signed-off-by: Alexander Graf <agraf@suse.de>

ide: Add fall through annotations

Add comments to help static analysers detect that these cases are
intentional, and clean up some whitespace in the environment of these
comments.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>

block: Create proper size file for disk mirror

The qmp monitor command to mirror a disk was passing -1 for size
along with the disk's backing file. This size of the resulting disk
is the size of the backing file, which is incorrect if the disk
has been resized. Therefore we should always pass in the size of
the current disk.

Signed-off-by: Vishvananda Ishaya <vishvananda@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

ahci: Add migration support

Jason tested these patches by migrating Windows 7 and Fedora 17 guests
(while under I/O) on both piix with ahci attached and on q35 (which has
a built-in AHCI controller).

Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

ahci: Change data types in preparation for migration

The size of an int depends on the host, so in order to be able to
migrate these fields, make them either int32_t or bool, depending on the
use.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>

ahci: Remove unused AHCIDevice fields

'dma_status' and 'dma_cb' are written to, but never read.
Remove these fields in preparation for AHCI migration bits.

Signed-off-by: Jason Baron <jbaron@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

hbitmap: add assertion on hbitmap_iter_init

hbitmap_iter_init causes an out-of-bounds access when the "first"
argument is or greater than or equal to the size of the bitmap.
Forbid this with an assertion, and remove the failing testcase.

Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

mirror: do nothing on zero-sized disk

On a zero-sized disk we need to break out of the job successfully
before bdrv_dirty_iter_init is called, otherwise you will get an
assertion failure with the next patch.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/vdi: Check for bad signature

vdi_open did not check for a bad signature.
This check was only in vdi_probe.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/vdi: Improved return values from vdi_open

vdi_open returned -1 in case of any error, but it should return an
error code (negative value of errno or -EMEDIUMTYPE).

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/vdi: Improve debug output for signature

The signature is a 32 bit value and needs up to 8 hex digits for printing.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: Use error code EMEDIUMTYPE for wrong format in some block drivers

This improves error reports for bochs, cow, qcow, qcow2, qed and vmdk
when a file with the wrong format is selected.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: Add special error code for wrong format

The block drivers need a special error code for "wrong format".
From the available error codes EMEDIUMTYPE fits best.
It is not available on all platforms, so a definition in
qemu-common.h and a specific error report are needed.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

mirror: support arbitrarily-sized iterations

Yet another optimization is to extend the mirroring iteration to include more
adjacent dirty blocks. This limits the number of I/O operations and makes
mirroring efficient even with a small granularity. Most of the infrastructure
is already in place; we only need to put a loop around the computation of
the origin and sector count of the iteration.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

mirror: support more than one in-flight AIO operation

With AIO support in place, we can start copying more than one chunk
in parallel. This patch introduces the required infrastructure for
this: the buffer is split into multiple granularity-sized chunks,
and there is a free list to access them.

Because of copy-on-write, a single operation may already require
multiple chunks to be available on the free list.

In addition, two different iterations on the HBitmap may want to
copy the same cluster. We avoid this by keeping a bitmap of in-flight
I/O operations, and blocking until the previous iteration completes.
This should be a pretty rare occurrence, though; as long as there is
no overlap the next iteration can start before the previous one finishes.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

mirror: add buf-size argument to drive-mirror

This makes sense when the next commit starts using the extra buffer space
to perform many I/O operations asynchronously.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

mirror: switch mirror_iteration to AIO

There is really no change in the behavior of the job here, since
there is still a maximum of one in-flight I/O operation between
the source and the target. However, this patch already introduces
the AIO callbacks (which are unmodified in the next patch)
and some of the logic to count in-flight operations and only
complete the job when there is none.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

mirror: allow customizing the granularity

The desired granularity may be very different depending on the kind of
operation (e.g. continuous replication vs. collapse-to-raw) and whether
the VM is expected to perform lots of I/O while mirroring is in progress.

Allow the user to customize it, while providing a sane default so that
in general there will be no extra allocated space in the target compared
to the source.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: allow customizing the granularity of the dirty bitmap

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: return count of dirty sectors, not chunks

Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

mirror: perform COW if the cluster size is bigger than the granularity

When mirroring runs, the backing files for the target may not yet be
ready.  However, this means that a copy-on-write operation on the target
would fill the missing sectors with zeros.  Copy-on-write only happens
if the granularity of the dirty bitmap is smaller than the cluster size
(and only for clusters that are allocated in the source after the job
has started copying).  So far, the granularity was fixed to 1MB; to avoid
the problem we detected the situation and required the backing files to
be available in that case only.

However, we want to lower the granularity for efficiency, so we need
a better solution.  The solution is to always copy a whole cluster the
first time it is touched.  The code keeps a bitmap of clusters that
have already been allocated by the mirroring job, and only does "manual"
copy-on-write if the chunk being copied is zero in the bitmap.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: make round_to_clusters public

This is needed in the following patch.

Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: implement dirty bitmap using HBitmap

This actually uses the dirty bitmap in the block layer, and converts
mirroring to use an HBitmapIter.

Reviewed-by: Laszlo Ersek <lersek@redhat.com> (except block/mirror.c parts)
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

add hierarchical bitmap data type and test cases

HBitmaps provides an array of bits.  The bits are stored as usual in an
array of unsigned longs, but HBitmap is also optimized to provide fast
iteration over set bits; going from one bit to the next is O(logB n)
worst case, with B = sizeof(long) * CHAR_BIT: the result is low enough
that the number of levels is in fact fixed.

In order to do this, it stacks multiple bitmaps with progressively coarser
granularity; in all levels except the last, bit N is set iff the N-th
unsigned long is nonzero in the immediately next level.  When iteration
completes on the last level it can examine the 2nd-last level to quickly
skip entire words, and even do so recursively to skip blocks of 64 words or
powers thereof (32 on 32-bit machines).

Given an index in the bitmap, it can be split in group of bits like
this (for the 64-bit case):

     bits 0-57 => word in the last bitmap     | bits 58-63 => bit in the word
     bits 0-51 => word in the 2nd-last bitmap | bits 52-57 => bit in the word
     bits 0-45 => word in the 3rd-last bitmap | bits 46-51 => bit in the word

So it is easy to move up simply by shifting the index right by
log2(BITS_PER_LONG) bits.  To move down, you shift the index left
similarly, and add the word index within the group.  Iteration uses
ffs (find first set bit) to find the next word to examine; this
operation can be done in constant time in most current architectures.

Setting or clearing a range of m bits on all levels, the work to perform
is O(m + m/W + m/W^2 + ...), which is O(m) like on a regular bitmap.

When iterating on a bitmap, each bit (on any level) is only visited
once.  Hence, The total cost of visiting a bitmap with m bits in it is
the number of bits that are set in all bitmaps.  Unless the bitmap is
extremely sparse, this is also O(m + m/W + m/W^2 + ...), so the amortized
cost of advancing from one bit to the next is usually constant.

Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

host-utils: add ffsl

We can provide fast versions based on the other functions defined
by host-utils.h. Some care is required on glibc, which provides
ffsl already.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

QAPI: Introduce memchar-read QMP command

Signed-off-by: Lei Li <lilei@linux.vnet.ibm.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>

QAPI: Introduce memchar-write QMP command

Signed-off-by: Lei Li <lilei@linux.vnet.ibm.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>

qemu-char: Add new char backend CirMemCharDriver

Signed-off-by: Lei Li <lilei@linux.vnet.ibm.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>

docs: document virtio-balloon stats

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

balloon: re-enable balloon stats

The statistics are now available through device properties via a
polling mechanism. First a client has to enable polling, then it
can query available stats.

Polling is enabled by setting an update interval (in seconds)
to a property named guest-stats-polling-interval, like this:

{ "execute": "qom-set",
  "arguments": { "path": "/machine/peripheral-anon/device[1]",
                 "property": "guest-stats-polling-interval", "value": 4 } }

Then the available stats can be retrieved by querying the
guest-stats property. The returned object is a dict containing
all available stats. Example:

{ "execute": "qom-get",
  "arguments": { "path": "/machine/peripheral-anon/device[1]",
  "property": "guest-stats" } }

{
    "return": {
        "stats": {
            "stat-swap-out": 0,
            "stat-free-memory": 844943360,
            "stat-minor-faults": 219028,
            "stat-major-faults": 235,
            "stat-total-memory": 1044406272,
            "stat-swap-in": 0
        },
        "last-update": 1358529861
    }
}

Please, check the next commit for full documentation.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

balloon: drop old stats code & API

Next commit will re-enable balloon stats with a different interface, but
this old code conflicts with it. Let's drop it.

It's important to note that the QMP and HMP interfaces are also dropped
by this commit. That shouldn't be a problem though, because:

1. All QMP fields are optional
2. This feature has always been disabled

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

block: Monitor command commit neglects to report some errors

The non-live bdrv_commit() function may return one of the following
errors: -ENOTSUP, -EBUSY, -EACCES, -EIO. The only error that is
checked in the HMP handler is -EBUSY, so the monitor command 'commit'
silently fails for all error cases other than 'Device is in use'.

Report error using monitor_printf() and strerror(), and convert existing
qerror_report() calls in do_commit() to monitor_printf().

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>

Merge remote-tracking branch 'bonzini/scsi-next' into staging

# By Paolo Bonzini (1) and Peter Lieven (1)
# Via Paolo Bonzini
* bonzini/scsi-next:
iscsi: add support for iovectors
iscsi: do not leak acb->buf when commands are aborted

Revert "serial: fix retry logic"

This reverts commit 67c5322d7000fd105a926eec44bc1765b7d70bdd:

    I'm not sure if the retry logic has ever worked when not using FIFO mode.  I
    found this while writing a test case although code inspection confirms it is
    definitely broken.

    The TSR retry logic will never actually happen because it is guarded by an
    'if (s->tsr_rety > 0)' but this is the only place that can ever make the
    variable greater than zero.  That effectively makes the retry logic an 'if (0)

    I believe this is a typo and the intention was >= 0.  Once this is fixed thoug
    I see double transmits with my test case.  This is because in the non FIFO
    case, serial_xmit may get invoked while LSR.THRE is still high because the
    character was processed but the retransmit timer was still active.

    We can handle this by simply checking for LSR.THRE and returning early.  It's
    possible that the FIFO paths also need some attention.

Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Even if the previous logic was never worked, new logic breaks stuff -
namely,

qemu -enable-kvm -nographic -kernel /boot/vmlinuz-$(uname -r) -append console=ttyS0 -serial pty

the above command will cause the virtual machine to stuck at startup
using 100% CPU till one connects to the pty and sends any char to it.

Note this is rather typical invocation for various headless virtual
machines by libvirt.

So revert this change for now, till a better solution will be found.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>

iscsi: add support for iovectors

This patch adds support for directly passing the iovec
array from QEMUIOVector if libiscsi supports it (1.8.0
or newer).

Signed-off-by: Peter Lieven <pl@kamp.de>
[Preserve the improvements from commit 4cc841b, iscsi: partly
avoid iovec linearization in iscsi_aio_writev, 2012-11-19 - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

iscsi: do not leak acb->buf when commands are aborted

acb->buf is freed in the WRITE(16) callback, but this may not
get called at all when commands are aborted. Add another
free in the ABORT TASK callback, which requires setting acb->buf
to NULL everywhere.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target-cris: Fix typo in D_LOG() macro

It's __VA_ARGS__. Fixes the build with CRIS_[OP_]HELPER_DEBUG defined.

Broken since r6338 / 93fcfe39a0383377e647b821c9f165fd927cd4e0 (Convert
references to logfile/loglevel to use qemu_log*() macros).

Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>

trivial: etraxfs_eth: Eliminate checkpatch errors

This is a trivial patch to harmonize the coding style on
hw/etraxfs_eth.c. This is in preparation to split off the bitbang mdio
code into a separate file.

Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Paul Brook <paul@codesourcery.com>
Cc: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Cc: Anthony Liguori <aliguori@us.ibm.com>
Cc: Andreas Färber <afaerber@suse.de>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>

Merge remote-tracking branch 'bonzini/scsi-next' into staging

# By Peter Lieven (3) and others
# Via Paolo Bonzini
* bonzini/scsi-next:
  scsi: Drop useless null test in scsi_unit_attention()
  lsi: use qbus_reset_all to reset SCSI bus
  scsi: fix segfault with 0-byte disk
  iscsi: add support for iSCSI NOPs [v2]
  iscsi: partly avoid iovec linearization in iscsi_aio_writev
  iscsi: add iscsi_create support

Merge remote-tracking branch 'kraxel/usb.77' into staging

# By Gerd Hoffmann
# Via Gerd Hoffmann
* kraxel/usb.77:
  usb: add usb-bot device (scsi bulk-only transport).
  ohci: add missing break
  Revert "usb-storage: Drop useless null test in usb_msd_handle_data()"

Merge remote-tracking branch 'spice/spice.v68' into staging

# By Alon Levy
# Via Gerd Hoffmann
* spice/spice.v68:
qxl: change rom size to 8192
qxl: stop using non revision 4 rom fields for revision < 4

scsi: Drop useless null test in scsi_unit_attention()

req was created by scsi_req_alloc(), which initializes req->dev to a
value it dereferences. req->dev isn't changed anywhere else.
Therefore, req->dev can't be null.

Drop the useless null test; it spooks Coverity.

Signed-off-by: Markus Armbruster <armbru@redhat.com>

lsi: use qbus_reset_all to reset SCSI bus

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

scsi: fix segfault with 0-byte disk

When a 0-sized disk is found, READ CAPACITY will return a
LUN NOT READY error. However, because it returns -1 instead
of zero, the HBA will call scsi_req_continue. This will
typically cause a segmentation fault or an assertion failure.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

iscsi: add support for iSCSI NOPs [v2]

This patch will send NOP-Out PDUs every 5 seconds to the iSCSI target.
If a consecutive number of NOP-In replies fail a reconnect is initiated.
iSCSI NOPs help to ensure that the connection to the target is still operational.
This should not, but in reality may be the case even if the TCP connection is still
alive if there are bugs in either the target or the initiator implementation.

v2:
- track the NOPs inside libiscsi so libiscsi can reset the counter
in case it initiates a reconnect.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

iscsi: partly avoid iovec linearization in iscsi_aio_writev

libiscsi expects all write16 data in a linear buffer. If the
iovec only contains one buffer we can skip the linearization
step as well as the additional malloc/free and pass the
buffer directly.

Reported-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

iscsi: add iscsi_create support

This patch adds support for bdrv_create. This allows e.g.
to use qemu-img to convert from any supported device to
an iscsi backed storage as destination.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

usb: add usb-bot device (scsi bulk-only transport).

Basically the same as usb-storage, but without automatic scsi
device setup. Also features support for up to 16 LUNs.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

ohci: add missing break

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>