Ján Tomko [Fri, 14 Jun 2019 07:16:14 +0000 (09:16 +0200)]
api: disallow virConnectGetDomainCapabilities on read-only connections
This API can be used to execute arbitrary emulators.
Forbid it on read-only connections.
Fixes: CVE-2019-10167 Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
(cherry picked from commit 8afa68bac0cf99d1f8aaa6566685c43c22622f26) Signed-off-by: Ján Tomko <jtomko@redhat.com>
Ján Tomko [Fri, 14 Jun 2019 06:47:42 +0000 (08:47 +0200)]
api: disallow virDomainSaveImageGetXMLDesc on read-only connections
The virDomainSaveImageGetXMLDesc API is taking a path parameter,
which can point to any path on the system. This file will then be
read and parsed by libvirtd running with root privileges.
Forbid it on read-only connections.
Fixes: CVE-2019-10161 Reported-by: Matthias Gerstner <mgerstner@suse.de> Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
(cherry picked from commit aed6a032cead4386472afb24b16196579e239580) Signed-off-by: Ján Tomko <jtomko@redhat.com>
Conflicts:
src/libvirt-domain.c
src/remote/remote_protocol.x
Upstream commit 12a51f372 which introduced the VIR_DOMAIN_SAVE_IMAGE_XML_SECURE
alias for VIR_DOMAIN_XML_SECURE is not backported.
Just skip the commit since we now disallow the whole API on read-only
connections, regardless of the flag.
The debian etch distro was end-of-life a long time ago so we no
longer need the ULLONG_MAX hack. In any case gnulib now provides
an equivalent fix by default, and so our definition now triggers
syntax-check rule failure
src/internal.h:# define ULLONG_MAX ULONG_LONG_MAX
maint.mk: define the above via some gnulib .h file
maint.mk:843: recipe for target 'sc_prohibit_always-defined_macros' failed
Andrea Bolognani [Wed, 18 Jan 2017 17:30:18 +0000 (18:30 +0100)]
nss: Remove RES_USE_INET6 usage
The recent deprecation in glibc (commit b76e065991ec) means the
module will fail to build entirely:
nss/libvirt_nss.c: In function '_nss_libvirt_gethostbyname_r':
nss/libvirt_nss.c:363:13: error: RES_USE_INET6 is deprecated [-Werror]
int af = ((_res.options & RES_USE_INET6) ? AF_INET6 : AF_INET);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This resolver option was removed shortly after being introduced,
and application using it are already broken anyway.
GCC 7.1 introduces a new -Wformat-truncation warning
flag that reports if it thinks the maximum possible
size of the formatted output will exceed the provided
fixed buffer. This is enabled automatically by the
-Wformat warning flag. There are quite a few places
hit by this in libvirt which need rewriting. This is
non-trivial work in some places, so temporarily
disable the new warning until those fixes can be
implemented.
Depending on the platform/architecture, a number of conditionals
in libvirt code expand the same on both branches. This is expected
behaviour and harmless, so disable the warning to avoid creating
unexpected build failures
Two examples, mingw32:
../../src/util/vircommand.c: In function 'virCommandWait':
../../src/util/vircommand.c:2562:51: error: this condition has identical branches [-Werror=duplicated-branches]
*exitstatus = cmd->rawStatus ? status : WEXITSTATUS(status);
^
and gcc7.1
In file included from util/virobject.c:28:0:
util/virobject.c: In function 'virClassNew':
util/viratomic.h:176:46: error: this condition has identical branches [-Werror=duplicated-branches]
(void)(0 ? *(atomic) ^ *(atomic) : 0); \
^
util/virobject.c:144:20: note: in expansion of macro 'virAtomicIntInc'
klass->magic = virAtomicIntInc(&magicCounter);
^~~~~~~~~~~~~~~
util/viralloc.c: In function 'virReallocN':
util/viralloc.c:246:23: error: '*' in boolean context, suggest '&&' instead [-Werror=int-in-bool-context]
if (!tmp && (size * count)) {
~~~~~~^~~~~~~~
Keep it happy by adding != 0 to the right hand expression
so it realizes we really are wanting to treat the result
of the arithmetic expression as a boolean
Marc Hartmayer [Wed, 7 Jun 2017 08:46:41 +0000 (10:46 +0200)]
Use ATTRIBUTE_FALLTHROUGH
Use ATTRIBUTE_FALLTHROUGH, introduced by commit 5d84f5961b8e28e802f600bb2d2c6903e219092e, instead of comments to
indicate that the fall through is an intentional behavior.
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com> Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com> Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
(cherry picked from commit adf846d3c96ce5d45594c718003852684ddf0cb9)
Add ATTRIBUTE_FALLTHROUGH for switch cases without break
In GCC 7 there is a new warning triggered when a switch
case has a conditional statement (eg if ... else...) and
some of the code paths fallthrough to the next switch
statement. e.g.
conf/domain_conf.c: In function 'virDomainChrEquals':
conf/domain_conf.c:14926:12: error: this statement may fall through [-Werror=implicit-fallthrough=]
if (src->targetTypeAttr != tgt->targetTypeAttr)
^
conf/domain_conf.c:14928:5: note: here
case VIR_DOMAIN_CHR_DEVICE_TYPE_CONSOLE:
^~~~
conf/domain_conf.c: In function 'virDomainChrDefFormat':
conf/domain_conf.c:22143:12: error: this statement may fall through [-Werror=implicit-fallthrough=]
if (def->targetTypeAttr) {
^
conf/domain_conf.c:22151:5: note: here
default:
^~~~~~~
GCC introduced a __attribute__((fallthrough)) to let you
indicate that this is intentionale behaviour rather than
a bug.
Jim Fehlig [Tue, 18 Jul 2017 16:20:35 +0000 (10:20 -0600)]
docs: schema: make disk driver name attribute optional
/domain/devices/disk/driver/@name is not a required or mandatory
attribute according to formatdomain, and indeed it was agreed on
IRC that the attribute is "optional for input, recommended (but
not required) for output". Currently the schema requires the
attribute, causing virt-xml-validate to fail on disk config where
the driver name is not explicitly specified. E.g.
# virt-xml-validate test.xml
Relax-NG validity error : Extra element devices in interleave
test.xml:21: element devices: Relax-NG validity error : Element domain failed to validate content
test.xml fails to validate
Relaxing the name attribute to be optional fixes the validation
Juan Hernandez [Thu, 6 Jul 2017 15:03:31 +0000 (17:03 +0200)]
Avoid hidden cgroup mount points
Currently the scan of the /proc/mounts file used to find cgroup mount
points doesn't take into account that mount points may hidden by other
mount points. For, example in certain Kubernetes environments the
/proc/mounts contains the following lines:
In this particular environment the first mount point is hidden by the
second one. The correct mount point is the third one, but libvirt will
never process it because it only checks the first mount point for each
controller (net_cls in this case). So libvirt will try to use the first
mount point, which doesn't actually exist, and the complete detection
process will fail.
To avoid that issue this patch changes the virCgroupDetectMountsFromFile
function so that when there are duplicates it takes the information from
the last line in /proc/mounts. This requires removing the previous
explicit condition to skip duplicates, and adding code to free the
memory used by the processing of duplicated lines.
Related-To: https://bugzilla.redhat.com/1468214 Related-To: https://github.com/kubevirt/libvirt/issues/4 Signed-off-by: Juan Hernandez <jhernand@redhat.com>
(cherry picked from commit dacd160d7479e0ec2d8a63f102145fd30636a1c8)
If we are encoding a block of data that is 16 bytes in length,
we cannot leave it as 16 bytes, we must pad it out to the next
block boundary, 32 bytes. Without this padding, the decoder will
incorrectly treat the last byte of plain text as the padding
length, as it can't distinguish padded from non-padded data.
The problem exhibited itself when using a 16 byte passphrase
for a LUKS volume
$ virsh start demo
error: Failed to start domain demo
error: internal error: process exited while connecting to monitor: >>>>>>>>>>Len 16
2017-05-02T10:35:40.016390Z qemu-system-x86_64: -object \
secret,id=virtio-disk1-luks-secret0,data=SEtNi5vDUeyseMKHwc1c1Q==,\
keyid=masterKey0,iv=zm7apUB1A6dPcH53VW960Q==,format=base64: \
Incorrect number of padding bytes (56) found on decrypted data
Notice how the padding '56' corresponds to the ordinal value of
the character '8'.
spec: Avoid RPM verification errors on nwfilter XMLs
/etc/libvirt/nwfilter/*.xml files are installed with no UUID, which
means libvirtd will automatically alter all of them once it starts. Thus
RPM verification will always fail on them. Let's use a trick similar to
the default network XML and store nwfilter XMLs in /usr/share. They will
be copied into /etc in %post. Additionally the /etc files are marked as
%ghost so that they are uninstalled if the RPM package is removed.
Note that the %post script overwrites existing files with new ones on
upgrade, which is what has always been happening.
Rather than waiting until we've free'd up all the resources, cause the
'workerPool' thread pool to flush as soon as possible during stateCleanup.
Otherwise, it's possible something waiting to run will SEGV such as is the
case during race conditions of simultaneous exiting libvirtd and qemu process.
Resolves the following crash:
[1] crash backtrace: (bt is shortened a bit):
0 0x00007ffff7282f2b in virClassIsDerivedFrom
(klass=0xdeadbeef, parent=0x55555581d650) at util/virobject.c:169
1 0x00007ffff72835fd in virObjectIsClass
(anyobj=0x7fffd024f580, klass=0x55555581d650) at util/virobject.c:365
2 0x00007ffff7283498 in virObjectLock
(anyobj=0x7fffd024f580) at util/virobject.c:317
3 0x00007ffff722f0a3 in virCloseCallbacksUnset
(closeCallbacks=0x7fffd024f580, vm=0x7fffd0194db0,
cb=0x7fffdf1af765 <qemuProcessAutoDestroy>)
at util/virclosecallbacks.c:164
4 0x00007fffdf1afa7b in qemuProcessAutoDestroyRemove
(driver=0x7fffd00f3a60, vm=0x7fffd0194db0) at qemu/qemu_process.c:6365
5 0x00007fffdf1adff1 in qemuProcessStop
(driver=0x7fffd00f3a60, vm=0x7fffd0194db0, reason=VIR_DOMAIN_SHUTOFF_CRASHED,
asyncJob=QEMU_ASYNC_JOB_NONE, flags=0)
at qemu/qemu_process.c:5877
6 0x00007fffdf1f711c in processMonitorEOFEvent
(driver=0x7fffd00f3a60, vm=0x7fffd0194db0) at qemu/qemu_driver.c:4545
7 0x00007fffdf1f7313 in qemuProcessEventHandler
(data=0x555555832710, opaque=0x7fffd00f3a60) at qemu/qemu_driver.c:4589
8 0x00007ffff72a84c4 in virThreadPoolWorker
(opaque=0x555555805da0) at util/virthreadpool.c:167
Thread 1 (Thread 0x7ffff7fb1880 (LWP 494472)):
1 0x00007ffff72a7898 in virCondWait
(c=0x7fffd01c21f8, m=0x7fffd01c21a0) at util/virthread.c:154
2 0x00007ffff72a8a22 in virThreadPoolFree
(pool=0x7fffd01c2160) at util/virthreadpool.c:290
3 0x00007fffdf1edd44 in qemuStateCleanup ()
at qemu/qemu_driver.c:1102
4 0x00007ffff736570a in virStateCleanup ()
at libvirt.c:807
5 0x000055555556f991 in main (argc=1, argv=0x7fffffffe458) at libvirtd.c:1660
Do not dereference the 'dmn' until after the virStateCleanup is completed.
During initialization, virStateInitialize requires/uses the "dmn" as the
argument to/for the daemonInhibitCallback functions. Thus, cleanup cannot
dereference 'dmn' until after calling the virStateCleanup which calls the
the daemonInhibitCallback using 'dmn'; otherwise, the following crash occurs:
backtrace (shortened a bit)
1 0x00007fd3a791b2e6 in virCondWait (c=<optimized out>, m=<optimized out>)
at util/virthread.c:154
2 0x00007fd3a791bcb0 in virThreadPoolFree (pool=0x7fd38024ee00)
at util/virthreadpool.c:266
3 0x00007fd38edaa00e in qemuStateCleanup () at qemu/qemu_driver.c:1116
4 0x00007fd3a79abfeb in virStateCleanup () at libvirt.c:808
5 0x00007fd3a85f2c9e in main (argc=<optimized out>, argv=<optimized out>)
at libvirtd.c:1660
Thread 1 (Thread 0x7fd38722d700 (LWP 32256)):
0 0x00007fd3a7900910 in virClassIsDerivedFrom
(klass=0xdfd36058d4853, parent=0x7fd3a8f394d0) at util/virobject.c:169
1 0x00007fd3a7900c4e in virObjectIsClass
(anyobj=anyobj@entry=0x7fd3a8f2f850, klass=<optimized out>)
at util/virobject.c:365
2 0x00007fd3a7900c74 in virObjectLock (anyobj=0x7fd3a8f2f850)
at util/virobject.c:317
3 0x00007fd3a7a24d5d in virNetDaemonRemoveShutdownInhibition
(dmn=0x7fd3a8f2f850) at rpc/virnetdaemon.c:547
4 0x00007fd38ed722cf in qemuProcessStop
(driver=driver@entry=0x7fd380103810, vm=vm@entry=0x7fd38025b6d0,
reason=reason@entry=VIR_DOMAIN_SHUTOFF_SHUTDOWN,
asyncJob=asyncJob@entry=QEMU_ASYNC_JOB_NONE, flags=flags@entry=0)
at qemu/qemu_process.c:5786
5 0x00007fd38edd9428 in processMonitorEOFEvent
(vm=0x7fd38025b6d0, driver=0x7fd380103810) at qemu/qemu_driver.c:4588
6 qemuProcessEventHandler (data=<optimized out>, opaque=0x7fd380103810)
at qemu/qemu_driver.c:4632
7 0x00007fd3a791bb55 in virThreadPoolWorker
(opaque=opaque@entry=0x7fd3a8f1e4c0) at util/virthreadpool.c:145
Maxim Nestratov [Mon, 6 Jun 2016 14:42:16 +0000 (17:42 +0300)]
util: fix crash in virClassIsDerivedFrom for CloseCallbacks objects
There is a possibility that qemu driver frees by unreferencing its
closeCallbacks pointer as it has the only reference to the object,
while in fact not all users of CloseCallbacks called thier
virCloseCallbacksUnset.
Backtrace is the following:
Thread #1:
0 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
1 in virCondWait (c=<optimized out>, m=<optimized out>)
at util/virthread.c:154
2 in virThreadPoolFree (pool=0x7f0810110b50)
at util/virthreadpool.c:266
3 in qemuStateCleanup () at qemu/qemu_driver.c:1116
4 in virStateCleanup () at libvirt.c:808
5 in main (argc=<optimized out>, argv=<optimized out>)
at libvirtd.c:1660
Thread #2:
0 in virClassIsDerivedFrom (klass=0xdeadbeef, parent=0x7f0837c694d0) at util/virobject.c:169
1 in virObjectIsClass (anyobj=anyobj@entry=0x7f08101d4760, klass=<optimized out>) at util/virobject.c:365
2 in virObjectLock (anyobj=0x7f08101d4760) at util/virobject.c:317
3 in virCloseCallbacksUnset (closeCallbacks=0x7f08101d4760, vm=vm@entry=0x7f08101d47b0, cb=cb@entry=0x7f081d078fc0 <qemuProcessAutoDestroy>) at util/virclosecallbacks.c:163
4 in qemuProcessAutoDestroyRemove (driver=driver@entry=0x7f081018be50, vm=vm@entry=0x7f08101d47b0) at qemu/qemu_process.c:6368
5 in qemuProcessStop (driver=driver@entry=0x7f081018be50, vm=vm@entry=0x7f08101d47b0, reason=reason@entry=VIR_DOMAIN_SHUTOFF_SHUTDOWN, asyncJob=asyncJob@entry=QEMU_ASYNC_JOB_NONE, flags=flags@entry=0) at qemu/qemu_process.c:5854
6 in processMonitorEOFEvent (vm=0x7f08101d47b0, driver=0x7f081018be50) at qemu/qemu_driver.c:4585
7 qemuProcessEventHandler (data=<optimized out>, opaque=0x7f081018be50) at qemu/qemu_driver.c:4629
8 in virThreadPoolWorker (opaque=opaque@entry=0x7f0837c4f820) at util/virthreadpool.c:145
9 in virThreadHelper (data=<optimized out>) at util/virthread.c:206
10 in start_thread () from /lib64/libpthread.so.0
Let's reference CloseCallbacks object in virCloseCallbacksSet and
unreference in virCloseCallbacksUnset.
Peter Krempa [Thu, 30 Mar 2017 11:47:45 +0000 (13:47 +0200)]
storage: driver: Remove unavailable transient pools after restart
If a transient storage pool is deemed inactive after libvirtd restart it
would not be deleted from the list. Reuse virStoragePoolUpdateInactive
along with a refactor necessary to properly update the state.
Peter Krempa [Thu, 30 Mar 2017 11:45:45 +0000 (13:45 +0200)]
storage: driver: Split out code fixing pool state after deactivation
After a pool is made inactive the definition objects need to be updated
(if a new definition is prepared) and transient pools need to be
completely removed. Split out the code doing these steps into a separate
function for later reuse.
If a secret was not provided for what was determined to be a LUKS
encrypted disk (during virStorageFileGetMetadata processing when
called from qemuDomainDetermineDiskChain as a result of hotplug
attach qemuDomainAttachDeviceDiskLive), then do not attempt to
look it up (avoiding a libvirtd crash) and do not alter the format
to "luks" when adding the disk; otherwise, the device_add would
fail with a message such as:
"unable to execute QEMU command 'device_add': Property 'scsi-hd.drive'
can't find value 'drive-scsi0-0-0-0'"
because of assumptions that when the format=luks that libvirt would have
provided the secret to decrypt the volume.
Access to unlock the volume will thus be left to the application.
Ján Tomko [Tue, 28 Mar 2017 13:07:50 +0000 (15:07 +0200)]
schema: do not require name for certain pool types
Pool types that have the VIR_STORAGE_POOL_SOURCE_NAME flag set
allow omitting the <name> element and instead fill out the pool name
from the <source><name> element.
Relax the schema to make <name> optional for these pools.
Expressing that at least one of these is required is out of scope
of the schema.
Andrea Bolognani [Tue, 20 Sep 2016 13:22:04 +0000 (15:22 +0200)]
virtlogd: Don't stop or restart along with libvirtd
Commit 839a060 tied the lifecycle of virtlogd more
closely to that of libvirtd. Unfortunately, while starting
virtlogd when libvirtd is started is definitely a good idea,
restarting virtlogd or shutting it down at any time outside
of system poweroff is not.
Revert part of that commit by removing the PartOf= lines,
meaning that only startup requests will be propagated from
libvirtd to virtlogd.
virtlogd.socket: Tie lifecycle to libvirtd.service
We already guarantee that virtlogd.socket is enabled/disabled
along with libvirtd.service, but if libvirtd.service has just
been installed and is started before rebooting, then
virtlogd.socket will not be running and guest startup will
fail.
Add Requires=virtlogd.socket to libvirtd.service to make sure
virtlogd.socket is always started along with libvirtd.service,
and add Before=libvirtd.service to both virtlogd.socket and
virtlogd.service so that virtlogd never disappears before
libvirtd has exited.
Also add PartOf=libvirtd.service to both virtlogd.socket and
virtlogd.service, so that virtlogd can be shut down when not
needed.
Cole Robinson [Fri, 5 May 2017 00:08:55 +0000 (20:08 -0400)]
spec: Update version check for maint Source URL
New maint release version numbers of just A.B.C format, not the old
A.B.C.D format. Adjust the check that dynamically changes the Source
URL for maint releases
Signed-off-by: Cole Robinson <crobinso@redhat.com> Acked-by: Andrea Bolognani <abologna@redhat.com>
(cherry picked from commit 1d07a5bf3c03309642068d20698a1f55739dafa2)
Peter Krempa [Fri, 25 Nov 2016 16:08:25 +0000 (17:08 +0100)]
qemu: capabilities: Don't partially reprope caps on process reconnect
Thanks to the complex capability caching code virQEMUCapsProbeQMP was
never called when we were starting a new qemu VM. On the other hand,
when we are reconnecting to the qemu process we reload the capability
list from the status XML file. This means that the flag preventing the
function being called was not set and thus we partially reprobed some of
the capabilities.
The recent addition of CPU hotplug clears the
QEMU_CAPS_QUERY_HOTPLUGGABLE_CPUS if the machine does not support it.
The partial re-probe on reconnect results into attempting to call the
unsupported command and then killing the VM.
Remove the partial reprobe and depend on the stored capabilities. If it
will be necessary to reprobe the capabilities in the future, we should
do a full reprobe rather than this partial one.
Laine Stump [Fri, 28 Oct 2016 15:43:56 +0000 (11:43 -0400)]
network: fix endless loop when starting network with multiple IPs and no dhcp
commit 9065cfaa added the ability to disable DNS services for a
libvirt virtual network. If neither DNS nor DHCP is needed for a
network, then we don't need to start dnsmasq, so code was added to
check for this.
Unfortunately, it was written with a great lack of attention to detail
(I can say that, because I was the author), and the loop that checked
if DHCP is needed for the network would never end if the network had
multiple IP addresses and the first <ip> had no <dhcp> subelement
(which would have contained a <range> or <host> subelement, thus
requiring DHCP services).
This patch rewrites the check to be more compact and (more
importantly) finite.
This bug was present in release 2.2.0 and 2.3.0, so will need to be
backported to any relevant maintainence branches.
Laine Stump [Wed, 5 Oct 2016 15:26:07 +0000 (11:26 -0400)]
qemu: allow 32 slots on pcie-expander-bus, not just 1
When I added support for the pcie-expander-bus controller in commit bc07251f, I incorrectly thought that it only had a single slot
available. Actually it has 32 slots, just like the root complex aka
pcie-root (the part that I *did* get correct is that unlike pcie-root
a pcie-expander-bus doesn't allow any integrated endpoint devices -
only pcie-root-ports and dmi-to-pci-controllers are allowed).
There is a logic in place that if there is no real need for
memory-backend-file, qemuBuildMemoryBackendStr() returns 0. However
that wasn't the case with hugepage backing. The reason for that was
that we abused the 'pagesize' variable for storing that information, but
we should rather have a separate one that specifies whether we really
need the new object for hugepage backing. And that variable should be
set only if this particular NUMA cell needs special treatment WRT
hugepages.
Michal Privoznik [Wed, 31 Aug 2016 10:52:11 +0000 (12:52 +0200)]
tools: Don't list virsh-* under EXTRA_DIST
When we wanted to break huge and unmaintainable virsh into
smaller files first thing we did was to just move funcs into
virsh-.c files and then #include them from virsh. Having it done
this way we also needed to have them listed under EXTRA_DIST.
However, things got changed since then and now all the virsh-*.c
files are proper source files. Therefore they are listed under
virsh_SOURCES too. But for some reason we forgot to remove them
from EXTRA_DIST.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Jim Fehlig [Mon, 29 Aug 2016 16:08:01 +0000 (10:08 -0600)]
libxl: advertise support for migration V3
The libxl driver has long supported migration V3 but has never
indicated so in the connectSupportsFeature API. As a result, apps
such as virt-manager that use the more generic virDomainMigrate API
fail with
libvirtError: this function is not supported by the connection driver:
virDomainMigrate
Add VIR_DRV_FEATURE_MIGRATION_V3 to the list of features marked as
supported in the connectSupportsFeature API.
Test 12 from objecteventtest (createXML add event) segaults on FreeBSD
with bus error.
At some point it calls testNodeDeviceDestroy() from the test driver. And
it fails when it tries to unlock the device in the "out:" label of this
function.
Unlocking fails because the previous step was a call to
virNodeDeviceObjRemove from conf/node_device_conf.c. This function
removes the given device from the device list and cleans up the object,
including destroying of its mutex. However, it does not nullify the pointer
that was given to it.
As a result, we end up in testNodeDeviceDestroy() here:
out:
if (obj)
virNodeDeviceObjUnlock(obj);
And instead of skipping this, we try to do Unlock and fail because of
malformed mutex.
Change virNodeDeviceObjRemove to use double pointer and set pointer to
NULL.
Peter Krempa [Thu, 25 Aug 2016 19:30:21 +0000 (15:30 -0400)]
qemu: driver: Validate configuration when setting maximum vcpu count
Setting vcpu count when cpu topology is specified may result into an
invalid configuration. Since the topology can't be modified, reject the
setting if it doesn't match the requested topology. This will allow
fixing the topology in case it was broken.
Peter Krempa [Thu, 25 Aug 2016 18:53:06 +0000 (14:53 -0400)]
qemu: driver: Fix qemuDomainHelperGetVcpus for sparse vcpu topologies
ce43cca0e refactored the helper to prepare it for sparse topologies but
forgot to fix the iterator used to fill the structures. This would
result into a weirdly sparse populated array and possible out of bounds
access and crash once sparse vcpu topologies were allowed.
Peter Krempa [Thu, 25 Aug 2016 18:48:52 +0000 (14:48 -0400)]
virsh: vcpuinfo: Report vcpu number from the structure rather than it's position
virVcpuInfo contains the vcpu number that the data refers to. Report
what's returned by the daemon rather than the sequence number as with
sparse vcpu topologies they won't match.
Olga Krishtal [Thu, 18 Aug 2016 11:57:14 +0000 (14:57 +0300)]
vz: fixed race in vzDomainAttach/DettachDevice
While dettaching/attaching device in OpenStack, nova
calls vzDomainDettachDevice twice, because the update of the internal
configuration of the ct comes a bit latter than the update event.
As the result, we suffer from the second call to dettach the same device.
Signed-off-by: Olga Krishtal <okrishtal@virtuozzo.com>
libvirt-python passes parameter bandwidth = 0
by default. This means that bandwidth is unlimited.
VZ driver doesn't support bandwidth rate limiting,
but we still need to handle it and fail if bandwidth > 0.
Signed-off-by: Pavel Glushchak <pglushchak@virtuozzo.com>
Laine Stump [Thu, 25 Aug 2016 05:46:37 +0000 (01:46 -0400)]
qemu: set tap device online for type='ethernet'
When support for auto-creating tap devices was added to <interface
type='ethernet'> in commit 9c17d6, the code assumed that
virNetDevTapCreate() would honor the VIR_NETDEV_TAP__CREATE_IFUP flag
that is supported by virNetDevTapCreateInBridgePort(). That isn't the
case - the latter function performs several operations, and one of
them is setting the tap device online. But virNetDevTapCreate() *only*
creates the tap device, and relies on the caller to do everything
else, so qemuInterfaceEthernetConnect() needs to call
virNetDevSetOnline() after the device is successfully created.
Laine Stump [Thu, 25 Aug 2016 05:18:25 +0000 (01:18 -0400)]
qemu: remove unnecessary setting of tap device online state
The linkstate setting of an <interface> is only meant to change the
online status reported to the guest system by the emulated network
device driver in qemu, but when support for auto-creating tap devices
for <interface type='ethernet'> was added in commit 9717d6, a chunk of
code was also added to qemuDomainChangeNetLinkState() that sets the
online status of the tap device (i.e. the *host* side of the
interface) for type='ethernet'. This was never done for tap devices
used in type='bridge' or type='network' interfaces, nor was it done in
the past for tap devices created by external scripts for
type='ethernet', so we shouldn't be doing it now.
This patch removes the bit of code in qemuDomainChangeNetLinkState()
that modifies online status of the tap device.
Vasiliy Tolstov [Wed, 24 Aug 2016 16:09:22 +0000 (19:09 +0300)]
qemu: fix ethernet network type ip/route assign
The call to virNetDevIPInfoAddToDev() that sets up tap device IP
addresses and routes was somehow incorrectly placed in
qemuInterfaceStopDevice() instead of qemuInterfaceStartDevice() in
commit fe8567f6. This fixes that error by moving the call to
virNetDevIPInfoAddToDev() to qemuInterfaceStartDevice().
Peter Krempa [Tue, 16 Aug 2016 13:02:11 +0000 (15:02 +0200)]
qemu: hotplug: Add support for VCPU unplug
This patch removes the old vcpu unplug code completely and replaces it
with the new code using device_del. The old hotplug code basically never
worked with any recent qemu and thus is useless.
As the new code is using device_del all the implications of using it
are present. Contrary to the device deletion code, the vcpu deletion
code fails if the unplug request is not executed in time.
Peter Krempa [Tue, 16 Aug 2016 12:44:26 +0000 (14:44 +0200)]
qemu: Use modern vcpu hotplug approach if possible
To allow unplugging the vcpus, hotplugging of vcpus on platforms which
require to plug multiple logical vcpus at once or plugging them in an
arbitrary order it's necessary to use the new device_add interface for
vcpu hotplug.
This patch adds support for the device_add interface using the old
setvcpus API by implementing an algorithm to select the appropriate
entities to plug in.
Peter Krempa [Thu, 4 Aug 2016 12:36:24 +0000 (14:36 +0200)]
qemu: command: Add support for sparse vcpu topologies
Add support for using the new approach to hotplug vcpus using device_add
during startup of qemu to allow sparse vcpu topologies.
There are a few limitations imposed by qemu on the supported
configuration:
- vcpu0 needs to be always present and not hotpluggable
- non-hotpluggable cpus need to be ordered at the beginning
- order of the vcpus needs to be unique for every single hotpluggable
entity
Qemu also doesn't really allow to query the information necessary to
start a VM with the vcpus directly on the commandline. Fortunately they
can be hotplugged during startup.
The new hotplug code uses the following approach:
- non-hotpluggable vcpus are counted and put to the -smp option
- qemu is started
- qemu is queried for the necessary information
- the configuration is checked
- the hotpluggable vcpus are hotplugged
- vcpus are started
This patch adds a lot of checking code and enables the support to
specify the individual vcpu element with qemu.
Peter Krempa [Thu, 4 Aug 2016 12:23:25 +0000 (14:23 +0200)]
qemu: process: Copy final vcpu order information into the vcpu definition
The vcpu order information is extracted only for hotpluggable entities,
while vcpu definitions belonging to the same hotpluggable entity need
to all share the order information.
We also can't overwrite it right away in the vcpu info detection code as
the order is necessary to add the hotpluggable vcpus enabled on boot in
the correct order.
The helper will store the order information in places where we are
certain that it's necessary.
Peter Krempa [Thu, 4 Aug 2016 11:57:46 +0000 (13:57 +0200)]
qemu: migration: Prepare for non-contiguous vcpu configurations
Introduce a new migration cookie flag that will be used for any
configurations that are not compatible with libvirt that would not
support the specific vcpu hotplug approach. This will make sure that old
libvirt does not fail to reproduce the configuration correctly.
The 'enabled' attribute allows to control the state of the vcpu.
'hotpluggable' controls whether given vcpu can be hotplugged and 'order'
allows to specify the order to add the vcpus.
Peter Krempa [Fri, 5 Aug 2016 12:48:27 +0000 (14:48 +0200)]
qemu: domain: Prepare for VCPUs vanishing while libvirt is not running
Similarly to devices the guest may allow unplug of the VCPU if libvirt
is down. To avoid problems, refresh the vcpu state on reconnect. Don't
mess with the vcpu state otherwise.
Peter Krempa [Sun, 31 Jul 2016 12:05:04 +0000 (14:05 +0200)]
qemu: domain: Extract cpu-hotplug related data
Now that the monitor code gathers all the data we can extract it to
relevant places either in the definition or the private data of a vcpu.
As only thread id is broken for TCG guests we may extract the rest of
the data and just skip assigning of the thread id. In case where qemu
would allow cpu hotplug in TCG mode this will make it work eventually.
Peter Krempa [Fri, 29 Jul 2016 17:24:22 +0000 (19:24 +0200)]
tests: cpu-hotplug: Add data for ppc64 platform including hotplug
Power 8 platform's basic hotpluggable unit is a core rather than a
thread for x86_64 family. This introduces most of the complexity of the
matching code and thus needs to be tested.
The test data contain data captured from in-order cpu hotplug and
unplug operations.
Peter Krempa [Tue, 23 Aug 2016 21:05:52 +0000 (17:05 -0400)]
tests: cpu-hotplug: Add data for x86 hotplug with 11+ vcpus
During review it was reported that adding at least 11 vcpus creates a
collision of prefixes in the monitor matching algorithm. Add a test case
to verify that the problem won't happen.
Peter Krempa [Fri, 29 Jul 2016 16:08:06 +0000 (18:08 +0200)]
tests: Add test infrastructure for qemuMonitorGetCPUInfo
As the combination algorithm is rather complex and ugly it's necessary
to make sure it works properly. Add test suite infrastructure for
testing it along with a basic test based on x86_64 platform.
Peter Krempa [Mon, 1 Aug 2016 11:56:23 +0000 (13:56 +0200)]
qemu: monitor: Add algorithm for combining query-(hotpluggable-)-cpus data
For hotplug purposes it's necessary to retrieve data using
query-hotpluggable-cpus while the old query-cpus API report thread IDs
and order of hotplug.
This patch adds code that merges the data using a rather non-trivial
algorithm and fills the data to the qemuMonitorCPUInfo structure for
adding to appropriate place in the domain definition.
Peter Krempa [Fri, 8 Jul 2016 11:52:11 +0000 (13:52 +0200)]
qemu: monitor: Add support for calling query-hotpluggable-cpus
Add support for retrieving information regarding hotpluggable cpu units
supported by qemu. Data returned by the command carries information
needed to figure out the granularity of hotplug, the necessary cpu type
name and the topology information.
Note that qemu doesn't specify any particular order of the entries thus
it's necessary sort them by socket_id, core_id and thread_id to the
order libvirt expects.
Peter Krempa [Thu, 28 Jul 2016 08:33:10 +0000 (10:33 +0200)]
qemu: monitor: Extract QOM path from query-cpus reply
To allow matching up the data returned by query-cpus to entries in the
query-hotpluggable-cpus reply for CPU hotplug it's necessary to extract
the QOM path as it's the only link between the two.
Peter Krempa [Fri, 29 Jul 2016 07:45:19 +0000 (09:45 +0200)]
qemu: capabilities: Extract availability of new cpu hotplug for machine types
QEMU reports whether 'query-hotpluggable-cpus' is supported for a given
machine type. Extract and cache the information using the capability
cache.
When copying the capabilities for a new start of qemu, mask out the
presence of QEMU_CAPS_QUERY_HOTPLUGGABLE_CPUS if the machine type
doesn't support hotpluggable cpus.
configuration where the maximum CPU count doesn't match the topology is
rejected. Prior to that only configurations where the topology would
contain more cpus than the maximum count would be rejected.
Use QEMU_CAPS_QUERY_HOTPLUGGABLE_CPUS as a relevant recent enough
witness to avoid breaking old configs.
Peter Krempa [Mon, 1 Aug 2016 11:44:25 +0000 (13:44 +0200)]
qemu: monitor: Return struct from qemuMonitor(Text|Json)QueryCPUs
Prepare to extract more data by returning an array of structs rather than
just an array of thread ids. Additionally report fatal errors separately
from qemu not being able to produce data.
Pino Toscano [Wed, 24 Aug 2016 14:14:24 +0000 (16:14 +0200)]
virsh: avoid i18n puzzle
Use the full versions of the message, instead of composing a base
message with what was updated; the change makes the messages properly
translatable, since different parts of a sentence might need different
declensions for example.
Pino Toscano [Wed, 24 Aug 2016 14:14:23 +0000 (16:14 +0200)]
virsh: respect -q/--quiet more
Turn various vshPrint() informative messages into vshPrintExtra(), so
they are not printed when requesting the quiet mode; neither XML/info
outputs nor the results of commands are affected.
Also change the expected outputs of the virsh-undefine test, since virsh
is invoked in quiet mode there.
Some informative messages might still be converted (and thus silenced
when in quiet mode), but this is an improvements nonetheless.
vzDomainMigrateConfirm3Params is whitelisted. Otherwise we need to
move removing domain from domain list from perform to confirm
step. This would further imply adding a flag and check that migration
is in progress to prohibit mistakenly (maliciously) removing domains
on confirm step. vz version of p2p also need to be fixed to include confirm step.
One would also need to add means to cleanup pending migration
on client disconnect as now is has state across several API
calls.
On the other hand current version of confirm step is totaly
harmless thus it is easier to whitelist it at the moment.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com> Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
ACL check on perform step should be in API call itself to make ACL
checking script pass. Thus we need to reorganize code to obtain
domain object in perform API itself. Most of this is straight
forward, the only nuance is dropping locks on lengthy remote
operations.
The other motivation is to have only perform step ACL checks for
p2p migration instead of both begin in perform if we can leave
ACL check in vzDomainMigratePerformStep.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
The original motivation is to expand API calls like start/stop etc so that
the ACL checks could be added. But this patch has its own befenits.
1. functions like prlsdkStart/Stop use common routine to wait for
job without domain lock. They become more self contained and do
not return intermediate PRL_RESULT.
2. vzDomainManagedSave do not update cache twice.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com> Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
Upon entry to qemuDomainAttachVirtioDiskDevice, src->encryption points
at a valid 'secret' buffer w/ nsecrets == 1; however, the call to
qemuDomainDetermineDiskChain will call virStorageFileGetMetadata
and eventually virStorageFileGetMetadataInternal where the src->encryption
was overwritten when probing the volume.
Commit id 'a48c7141' added code to virStorageFileGetMetadataInternal
to determine if the disk/volume would use/need encryption and allocated
a meta->encryption. This overwrote an existing encryption buffer
already provided by the XML
This patch adds a check for meta->encryption already present before
just allocating and overwriting an existing buffer. It then checks the
existing encryption data to ensure the XML provided format for the
disk matches the expected format read from the disk and errors if there
is a mismatch.
Laine Stump [Fri, 12 Aug 2016 02:28:27 +0000 (22:28 -0400)]
network: allow limiting a <forwarder> element to certain domains
For some unknown reason the original implementation of the <forwarder>
element only took advantage of part of the functionality in the
dnsmasq feature it exposes - it allowed specifying the ip address of a
DNS server which *all* DNS requests would be forwarded to, like this:
<forwarder addr='192.168.123.25'/>
This is a frontend for dnsmasq's "server" option, which also allows
you to specify a domain that must be matched in order for a request to
be forwarded to a particular server. This patch adds support for
specifying the domain. For example:
would forward requests for bob.example.com, ftp.example.com and
joe.corp.example.com all to the DNS server at 192.168.1.1, but would
forward requests for travesty.org and www.travesty.org to
10.0.0.1. And due to the second line, requests for www.example.com,
and odd.www.example.com would be resolved by the libvirt network's own
DNS server (i.e. thery wouldn't be immediately forwarded) even though
they also match 'example.com' - the match is given to the entry with
the longest matching domain. DNS requests not matching any of the
entries would be resolved by the libvirt network's own DNS server.
Laine Stump [Thu, 11 Aug 2016 21:29:43 +0000 (17:29 -0400)]
network: allow disabling dnsmasq's DNS server
If you define a libvirt virtual network with one or more IP addresses,
it starts up an instance of dnsmasq. It's always been possible to
avoid dnsmasq's dhcp server (simply don't include a <dhcp> element),
but until now it wasn't possible to avoid having the DNS server
listening; even if the network has no <dns> element, it is started
using default settings.
This patch adds a new attribute to <dns>: enable='yes|no'. For
backward compatibility, it defaults to 'yes', but if you don't want a
DNS server created for the network, you can simply add:
<dns enable='no'/>
to the network configuration, and next time the network is started
there will be no dns server created (if there is dhcp configuration,
dnsmasq will be started with "port=0" which disables the DNS server;
if there is no dhcp configuration, dnsmasq won't be started at all).
Laine Stump [Wed, 10 Aug 2016 23:09:55 +0000 (19:09 -0400)]
network: new network forward mode 'open'
The new forward mode 'open' is just like mode='route', except that no
firewall rules are added to assure that any traffic does or doesn't
pass. It is assumed that either they aren't necessary, or they will be
setup outside the scope of libvirt.