Laine Stump [Mon, 27 Mar 2017 01:57:54 +0000 (21:57 -0400)]
conf: don't ignore <target dev='blah'/> for macvtap interfaces
The parser had been clearing out *all* suggested device names for
type='direct' (aka macvtap) interfaces. All of the code implementing
macvtap allows for a user-specified device name, so we should allow
it. In the case that an interface name starts with "macvtap" or
"macvlan" though, we do still clear it out, just as we do with "vnet"
(which is the prefix used for automatically generated tap device
names), since those are the prefixes for the names we autogenerate for
macvtap and macvlan devices.
Laine Stump [Tue, 25 Apr 2017 18:09:45 +0000 (14:09 -0400)]
util: make macvtap/macvlan generated name #defines available to other files
MACVTAP_NAME_PREFIX and MACVLAN_NAME_PREFIX could be useful to other
files if they were defined in virnetdevmacvlan.h instead of
virnetdevmacvlan.c, so do that (while slightly renaming them and also
adding yet another #define that chooses between macvlan/macvtap based
on flags).
This is a prerequisite to fix: https://bugzilla.redhat.com/1335798
Laine Stump [Tue, 25 Apr 2017 16:26:43 +0000 (12:26 -0400)]
network: better log message when network is inactive during reconnect
If the network isn't active during networkNotifyActualDevice(), we
would log an error message stating that the bridge device didn't
exist. This patch adds a check to see if the network is active, making
the logs more useful in the case that it isn't.
Laine Stump [Tue, 25 Apr 2017 16:20:30 +0000 (12:20 -0400)]
qemu: don't kill qemu process on restart if networkNotify fails
Nothing that could happen during networkNotifyActualDevice() could
justify unceremoniously killing the qemu process, but that's what we
were doing.
In particular, new code added in commit 85bcc022 (first appearred in
libvirt-3.2.0) attempts to reattach tap devices to their assigned
bridge devices when libvirtd restarts (to make it easier to recover
from a restart of a libvirt network). But if the network has been
stopped and *not* restarted, the bridge device won't exist and
networkNotifyActualDevice() will fail.
This patch changes networkNotifyActualDevice() and
qemuProcessNotifyNets() to return void, so that qemuProcessReconnect()
will soldier on regardless of what happens (any errors will still be
logged though).
After 1eb6647979f8c nobody calls the iohelper with 6 arguments.
Everybody uses the other mode. Well, the only user of iohelper
after the previous commit is virFileWrapperFd really.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com>
Currently we use iohelper for virFDStream implementation. This is
because UNIX I/O can lie sometimes: even though a FD for a
file/block device is set as unblocking, actual read()/write() can
block. To avoid this, a pipe is created and one end is kept for
read/write while the other is handed over to iohelper to
write/read the data for us. Thus it's iohelper which gets blocked
and not our event loop.
This approach has two problems:
1) we are spawning a new process.
2) any exchange of information between daemon and iohelper can be
done only through the pipe.
Therefore, iohelper is replaced with an implementation in thread
which is created just for the stream lifetime. The data are still
transferred through pipe (for now), but both problems described
above are solved.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com>
Pavel Hrdina [Thu, 27 Apr 2017 15:41:56 +0000 (17:41 +0200)]
qemu: change the logic of setting default USB controller
The new logic will set the piix3-uhci if available regardless of
any architecture and it will be updated to better model based on
architecture and device existence.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> Acked-by: Andrea Bolognani <abologna@redhat.com>
Peter Krempa [Thu, 20 Apr 2017 09:16:42 +0000 (11:16 +0200)]
qemu: Don't fail if physical size can't be updated in qemuDomainGetBlockInfo
Since commit c5f6151390 qemuDomainBlockInfo tries to update the
"physical" storage size for all network storage and not only block
devices.
Since the storage driver APIs to do this are not implemented for certain
storage types (RBD, iSCSI, ...) the code would fail to retrieve any data
since the failure of qemuDomainStorageUpdatePhysical is fatal.
Since it's desired to return data even if the total size can't be
updated we need to ignore errors from that function and return plausible
data.
Peter Krempa [Wed, 26 Apr 2017 07:57:39 +0000 (09:57 +0200)]
qemu: process: Don't leak priv->usbaddrs after VM restart
Since the private data structure is not freed upon stopping a VM, the
usbaddrs pointer would be leaked:
==15388== 136 (16 direct, 120 indirect) bytes in 1 blocks are definitely lost in loss record 893 of 1,019
==15388== at 0x4C2CF55: calloc (vg_replace_malloc.c:711)
==15388== by 0x54BF64A: virAlloc (viralloc.c:144)
==15388== by 0x5547588: virDomainUSBAddressSetCreate (domain_addr.c:1608)
==15388== by 0x144D38A2: qemuDomainAssignUSBAddresses (qemu_domain_address.c:2458)
==15388== by 0x144D38A2: qemuDomainAssignAddresses (qemu_domain_address.c:2515)
==15388== by 0x144ED1E3: qemuProcessPrepareDomain (qemu_process.c:5398)
==15388== by 0x144F51FF: qemuProcessStart (qemu_process.c:5979)
[...]
Peter Krempa [Tue, 25 Apr 2017 13:17:34 +0000 (15:17 +0200)]
qemu: process: Clean automatic NUMA/cpu pinning information on shutdown
Clean the stale data after shutting down the VM. Otherwise the data
would be leaked on next VM start. This happens due to the fact that the
private data object is not freed on destroy of the VM.
Wim ten Have [Mon, 24 Apr 2017 13:07:00 +0000 (15:07 +0200)]
xenconfig: add conversions for xen-xl
Per xen-xl conversions from and to native under host-passthrough
mode we take care for Xen (nestedhvm = mode) applied and inherited
settings generating or processing correct feature policy:
This patch maps /domain/cpu/cache element into -cpu parameters:
- <cache mode='passthrough'/> is translated to host-cache-info=on
- <cache level='3' mode='emulate'/> is transformed into l3-cache=on
- <cache mode='disable'/> is turned in host-cache-info=off,l3-cache=off
Any other <cache> element is forbidden.
The tricky part is detecting whether QEMU supports the CPU properties.
The 'host-cache-info' property is introduced in v2.4.0-1389-ge265e3e480,
earlier QEMU releases enabled host-cache-info by default and had no way
to disable it. If the property is present, it defaults to 'off' for any
QEMU until at least 2.9.0.
The 'l3-cache' property was introduced later by v2.7.0-200-g14c985cffa.
Earlier versions worked as if l3-cache=off was passed. For any QEMU
until at least 2.9.0 l3-cache is 'off' by default.
QEMU 2.9.0 was the first release which supports probing both properties
by running device-list-properties with typename=host-x86_64-cpu. Older
QEMU releases did not support device-list-properties command for CPU
devices. Thus we can't really rely on probing them and we can just use
query-cpu-model-expansion QMP command as a witness.
Because the cache property probing is only reliable for QEMU >= 2.9.0
when both are already supported for quite a few releases, we let QEMU
report an error if a specific cache mode is explicitly requested. The
other mode (or both if a user requested CPU cache to be disabled) is
explicitly turned off for QEMU >= 2.9.0 to avoid any surprises in case
the QEMU defaults change. Any older QEMU already turns them off so not
doing so explicitly does not make any harm.
Not all async jobs are visible via virDomainGetJobStats (either they are
too fast or getting the stats is not allowed during the job), but
forcing all of them to advertise the operation is easier than hunting
the jobs for which fetching statistics is allowed. And we won't need to
think about this when we add support for getting stats for more jobs.
The parameter is reported by virDomainGetJobStats API and
VIR_DOMAIN_EVENT_ID_JOB_COMPLETED event and it can be used to identify
the operation (migration, snapshot, ...) to which the reported
statistics belong.
Eric Farman [Wed, 26 Apr 2017 21:10:01 +0000 (17:10 -0400)]
qemu: Remove extra messages for vhost-scsi hotplug
As with virtio-scsi, the "internal error" messages after
preparing a vhost-scsi hostdev overwrites more meaningful
error messages deeper in the callchain. Remove it too.
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Eric Farman [Wed, 26 Apr 2017 21:10:00 +0000 (17:10 -0400)]
qemu: Remove extra messages from virtio-scsi hotplug
I tried to attach a SCSI LUN to two different guests, and forgot
to specify "shareable" in the hostdev XML. Attaching the device
to the second guest failed, but the message was not helpful in
telling me what I was doing wrong:
$ virsh attach-device dasd_fedora_0e1e scsi_scratch_disk.xml
error: Failed to attach device from scsi_scratch_disk.xml
error: internal error: Unable to prepare scsi hostdev: scsi_host3:0:15:1074151456
I eventually discovered my error, but thought it was weird that
Libvirt doesn't provide something more helpful in this case.
Looking over the code we had just gone through, I commented out
the "internal error" message, and got something more useful:
$ virsh attach-device dasd_fedora_0e1e scsi_scratch_disk.xml
error: Failed to attach device from scsi_scratch_disk.xml
error: Requested operation is not valid: SCSI device 3:0:15:1074151456 is already in use by other domain(s) as 'non-shareable'
Looking over the error paths here, we seem to issue better
messages deeper in the callchain so these "internal error"
messages overwrite any of them. Remove them, so that the
more detailed errors are seen.
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
qemu: migration: fix race on cancelling drive mirror
0feebab2 adds calling qemuBlockNodeNamesDetect for completed job
on updating block jobs. This affects cancelling drive mirror logic as
this function drops vm lock. Now we have to recheck all disks
before the disk with the completed block job before going
to wait for block job events.
Peter Krempa [Wed, 26 Apr 2017 07:01:30 +0000 (09:01 +0200)]
qemu: numa: Don't return automatic nodeset for inactive domain
qemuDomainGetNumaParameters would return the automatic nodeset even for
the persistent config if the domain was running. This is incorrect since
the automatic nodeset will be re-queried upon starting the vm.
While peer-to-peer migration enters the Confirm phase even if the
Perform phase fails, the client which initiated a non-p2p migration will
never call virDomainMigrateConfirm* API if the Perform phase failed.
Thus we need to explicitly reset migration before reporting a failure
from the Perform phase API.
Migration with old QEMU which does not support query-migrate-parameters
would fail because the QMP command is called unconditionally since the
introduction of TLS migration. Previously it was only called if the user
explicitly requested a feature which uses QEMU migration parameters. And
even then the situation was not ideal, instead of reporting an
unsupported feature we'd just complain about missing QMP command.
Trivially no migration parameters are supported when
query-migrate-parameters QMP command is missing. There's no need to
report an error if it is missing, the callers will report better error
if needed.
John Ferlan [Mon, 3 Apr 2017 13:03:47 +0000 (09:03 -0400)]
secret: Split apart NumOfSecrets and GetUUIDs callback function
Rather than overloading one function - split apart the logic to have
separate interfaces and local/private structures to manage the data
for which the helper is collecting.
John Ferlan [Wed, 19 Apr 2017 12:34:04 +0000 (08:34 -0400)]
secret: Change variable names for list traversals
Rather than 'nuuids' it should be 'maxuuids' and rather than 'got'
it should be 'nuuids'. Alter the logic of the list traversal to
utilize those names.
John Ferlan [Sat, 1 Apr 2017 15:46:36 +0000 (11:46 -0400)]
secret: Use consistent naming for variables
When processing a virSecretPtr use 'secret' as a variable name.
When processing a virSecretObjPtr use 'obj' as a variable name.
When processing a virSecretDefPtr use 'def' as a variable name,
unless a distinction needs to be made with a 'newdef' such as
virSecretObjListAddLocked (which also used the VIR_STEAL_PTR macro
for the configFile and base64File).
John Ferlan [Thu, 20 Apr 2017 15:51:22 +0000 (11:51 -0400)]
nwfilter: Move save of config until after successful assign
Only save the config when using a generated UUID if we were able to
create an object for the def. There could have been "other reasons"
for the assignment to fail, so saving the config could be incorrect.
Ján Tomko [Mon, 8 Aug 2016 13:38:20 +0000 (15:38 +0200)]
Use a separate buffer for <input> subelements
Instead of figuring out upfront whether <input> will be a single
or a pair element, format the subelements into a separate buffer
and close <input/> early if this buffer is empty.
Erik Skultety [Mon, 27 Mar 2017 07:03:02 +0000 (09:03 +0200)]
docs: Provide a nodedev driver stub documentation
There's lot more to document about the nodedev driver, besides PCI and
SR-IOV (even this might need to be extended), but let's start small-ish
and at least have a page for it linked from the drivers.html.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
During 'matrix' testing of all possible combinations I found that if
device is formated with "gpt" first, then an attempt is made to format
using "mac", a startup will fail.
Deeper analysis by Peter Krempa indicates that the "mac" table fits
into the first block on the disk. Since the GPT disklabel is stored
at LBA address 1 it is not overwritten at all. Thus it's apparent that
the (blkid) detection tool then prefers GPT over a older disklabel.
The GPT disklabel has also a secondary copy at the last LBA of the disk.
So, follow the same logic as the logical pool in clearing a 1MB swath
at the beginning and end of the device to avoid potential issues with
larger sector sizes for the device.
Also fixed a minor formatting nit in virStorageBackendDeviceIsEmpty call.
Create a wrapper/helper that can be used to call the storage backend
wipe helper - storageBackendVolWipeLocalFile for future use by logical
and disk backends to clear out the partition table rather than having
each open code the same algorithm.
John Ferlan [Fri, 7 Apr 2017 15:23:38 +0000 (11:23 -0400)]
storage: Modify storageBackendWipeLocal to allow zero from end of device
Add bool 'zero_end' and logic that would allow a caller to wipe specific
portions of a target device either from the beginning (the default) or
from the end when zero_end is true.
This will allow for this code to wipe out partition table information
from a device.
util: switch over to use keycodemapdb GIT submodule
A long time ago we imported the keymaps.csv file from GTK-VNC so we
can do conversions between keycode sets. Meanwhile lots of bug fixes
have gone into this CSV file and libvirt hasn't kept in sync. The
keymaps.csv file and associated generator script has been pulled out
of GTK-VNC into a dedicated GIT repo for use as a submodule. This
allows GTK-VNC, SPICE-GTK, QEMU and libvirt to share the same master
database and tools and pushing updates merely requires a submodule
commit update as with gnulib.
The test suite is updated to cover some extra boundary conditions.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
John Ferlan [Mon, 24 Apr 2017 16:50:12 +0000 (12:50 -0400)]
secret: Clean up virSecretObjListExport logic
Shorten the time needed to keep the list lock and alter the cleanup
path to be more of an error path.
Utilize the the virObjectListFree function to handle the calls for
virObjectUnref on each list element and the VIR_FREE of the list
instead of open coding it.
Change the name of the virHashForEach callback to match the name
of the Export function with the Callback added onto it.
John Ferlan [Fri, 14 Apr 2017 13:24:33 +0000 (09:24 -0400)]
test: Remove unnecessary unlocks in cleanup paths
Commit id '865f479da' altered the logic to use a common test*ObjFindByName
helpers which would lock/unlock the test driver; however, a few cleanup paths
in that cleanup missed removing the Unlock, so remove it now.
John Ferlan [Mon, 27 Mar 2017 16:47:37 +0000 (12:47 -0400)]
remote: Fix possible use-after-free when sending event message
Based upon an idea and some research by Wang King <king.wang@huawei.com>
and xinhua.Cao <caoxinhua@huawei.com>.
Since we're assigning the 'client' to our callback event lookaside list,
it's imperative that we grab a reference to the object; otherwise, when
the object is unref'd during virNetServerProcessClients when it's determined
that the virNetServerClientIsClosed and the memory is free'd before perhaps
the object event state callbacks are run. When a virObjectLock() is run,
before sending the message the following trace occurs;
#0 0x00007fda223d66d8 in virClassIsDerivedFrom
(klass=0xdeadbeef, parent=0x7fda24c81b40)
at util/virobject.c:169
#1 0x00007fda223d6a1e in virObjectIsClass
(anyobj=anyobj@entry=0x7fd9e575b400, klass=<optimized out>)
at util/virobject.c:365
#2 0x00007fda223d6a44 in virObjectLock
(anyobj=0x7fd9e575b400)
at util/virobject.c:317
#3 0x00007fda22507f71 in virNetServerClientSendMessage
(client=client@entry=0x7fd9e575b400, msg=msg@entry=0x7fd9ec30de90)
at rpc/virnetserverclient.c:1422
#4 0x00007fda230d714d in remoteDispatchObjectEventSend
(client=0x7fd9e575b400, program=0x7fda24c844e0, procnr=348,
proc=0x7fda2310e5e0 <xdr_remote_domain_event_callback_tunable_msg>,
data=0x7ffc3857fdb0)
at remote.c:3803
#5 0x00007fda230dd71b in remoteRelayDomainEventTunable
(conn=<optimized out>, dom=0x7fda27cd7660, params=0x7fda27f3aae0,
nparams=1,opaque=0x7fd9e6c99e00)
at remote.c:1033
#6 0x00007fda224484cb in virDomainEventDispatchDefaultFunc
(conn=0x7fda27cd0120, event=0x7fda2736ea00, cb=0x7fda230dd610
<remoteRelayDomainEventTunable>, cbopaque=0x7fd9e6c99e00)
at conf/domain_event.c:1910
#7 0x00007fda22446871 in virObjectEventStateDispatchCallbacks
(callbacks=<optimized out>, callbacks=<optimized out>,
event=0x7fda2736ea00,state=0x7fda24ca3960)
at conf/object_event.c:722
#8 virObjectEventStateQueueDispatch
(callbacks=0x7fda24c65800, queue=0x7ffc3857fe90, state=0x7fda24ca3960)
at conf/object_event.c:736
#9 virObjectEventStateFlush (state=0x7fda24ca3960)
at conf/object_event.c:814
#10 virObjectEventTimer (timer=<optimized out>, opaque=0x7fda24ca3960)
at conf/object_event.c:560
#11 0x00007fda223ae8b9 in virEventPollDispatchTimeouts ()
at util/vireventpoll.c:458
#12 virEventPollRunOnce ()
at util/vireventpoll.c:654
#13 0x00007fda223ad1d2 in virEventRunDefaultImpl ()
at util/virevent.c:314
#14 0x00007fda225046cd in virNetDaemonRun (dmn=0x7fda24c775c0)
at rpc/virnetdaemon.c:818
#15 0x00007fda230d6351 in main (argc=<optimized out>, argv=<optimized out>)
at libvirtd.c:1623
Andrea Bolognani [Wed, 12 Apr 2017 11:09:08 +0000 (13:09 +0200)]
autogen.sh: Improve and generalize
The goal is twofold: firstly, we want to extend the script so
that it can deal with more than a single git submodule, and
secondly we'd like to reduce the amount of duplicated code.
Moreover, since we're making heavy changes to the code anyway,
we might as well make sure it follows a somewhat consistent
coding style too.
To reduce code duplication, we introduce a new --dry-run
option, which can be used by third parties to figure out
whether calling autogen.sh is necessary or not: this allows
us to get rid of the reimplementation of part of the logic
in cfg.mk and guarantee they'll never get out of sync.
Other changes include: making dirty submodules checking and
cleaning entirely independent of other operations; removing
the use of 'set -e' and handling errors explicitly instead;
better parsing of command line arguments.
Currently, virNetDevSetCoalesce() stub is always returning error. As
it's used by virNetDevTapCreateInBridgePort(), it essentially breaks
bridged networking if coalesce is not supported.
To make it work, relax the stub to trigger error only when its
coalesce argument is not NULL, otherwise report success.