Jiri Denemark [Tue, 28 Jun 2016 12:39:58 +0000 (14:39 +0200)]
qemu: Let empty default VNC password work as documented
CVE-2016-5008
Setting an empty graphics password is documented as a way to disable
VNC/SPICE access, but QEMU does not always behaves like that. VNC would
happily accept the empty password. Let's enforce the behavior by setting
password expiration to "now".
Eric Blake [Wed, 9 Dec 2015 00:46:31 +0000 (17:46 -0700)]
CVE-2015-5313: storage: don't allow '/' in filesystem volume names
The libvirt file system storage driver determines what file to
act on by concatenating the pool location with the volume name.
If a user is able to pick names like "../../../etc/passwd", then
they can escape the bounds of the pool. For that matter,
virStoragePoolListVolumes() doesn't descend into subdirectories,
so a user really shouldn't use a name with a slash.
Normally, only privileged users can coerce libvirt into creating
or opening existing files using the virStorageVol APIs; and such
users already have full privilege to create any domain XML (so it
is not an escalation of privilege). But in the case of
fine-grained ACLs, it is feasible that a user can be granted
storage_vol:create but not domain:write, and it violates
assumptions if such a user can abuse libvirt to access files
outside of the storage pool.
Therefore, prevent all use of volume names that contain "/",
whether or not such a name is actually attempting to escape the
pool.
Since commit 8eb55d782a2b9afacc7938694891cc6fad7b42a5 libxml2 removes
two slashes from the URI when there is no server part. This is fixed
with beb7281055dbf0ed4d041022a67c6c5cfd126f25, but only if the calling
application calls xmlSaveUri() on URI that xmlURIParse() parsed. And
that is not the case in virURIFormat(). virURIFormat() accepts
virURIPtr that can be created without parsing it and we do that when we
format network storage paths for gluster for example. Even though
virStorageSourceParseBackingURI() uses virURIParse(), it throws that data
structure right away.
Since we want to format URIs as URIs and not absolute URIs or opaque
URIs (see RFC 3986), we can specify that with a special hack thanks to
commit beb7281055dbf0ed4d041022a67c6c5cfd126f25, by setting port to -1.
This fixes qemuxml2argvtest test where the disk-drive-network-gluster
case was failing.
In systemd >= 218, the udev_set_log_fn method has been marked
deprecated and turned into a no-op. Nothing in the udev client
library will print to stderr by default anymore, so we can
just stop installing a logging hook for new enough udev.
Dario Faggioli [Mon, 30 Jun 2014 17:19:01 +0000 (19:19 +0200)]
libxl: don't break the build on Xen>=4.5 because of libxl_vcpu_setaffinity()
libxl interface for vcpu pinning is changing in Xen 4.5. Basically,
libxl_set_vcpuaffinity() now wants one more parameter. That is
representative of 'VCPU soft affinity', which libvirt does not use.
To mark such change, the macro LIBXL_HAVE_VCPUINFO_SOFT_AFFINITY is
defined. Use it as a gate and, if present, re-#define the calls from
the old to the new interface, to avoid breaking the build.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com> Cc: Jim Fehlig <jfehlig@suse.com> Cc: Ian Campbell <Ian.Campbell@citrix.com> Cc: Ian Jackson <Ian.Jackson@eu.citrix.com>
(cherry picked from commit bfc72e99920215c9b004a5380ca61fe6ff81ea6b)
Well, in 8ad126e6 we tried to fix a memory corruption problem.
However, the fix was not as good as it could be. I mean, the
commit has one line more than it should. I've noticed this output
just recently:
# ./run valgrind --leak-check=full --show-reachable=yes ./tools/virsh domblklist gentoo
==17019== Memcheck, a memory error detector
==17019== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==17019== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==17019== Command: /home/zippy/work/libvirt/libvirt.git/tools/.libs/virsh domblklist gentoo
==17019==
Target Source
------------------------------------------------
fda /var/lib/libvirt/images/fd.img
vda /var/lib/libvirt/images/gentoo.qcow2
hdc /home/zippy/tmp/install-amd64-minimal-20150402.iso
==17019== Thread 2:
==17019== Invalid read of size 4
==17019== at 0x4EFF5B4: virObjectUnref (virobject.c:258)
==17019== by 0x5038CFF: remoteClientCloseFunc (remote_driver.c:552)
==17019== by 0x5069D57: virNetClientCloseLocked (virnetclient.c:685)
==17019== by 0x506C848: virNetClientIncomingEvent (virnetclient.c:1852)
==17019== by 0x5082136: virNetSocketEventHandle (virnetsocket.c:1913)
==17019== by 0x4ECD64E: virEventPollDispatchHandles (vireventpoll.c:509)
==17019== by 0x4ECDE02: virEventPollRunOnce (vireventpoll.c:658)
==17019== by 0x4ECBF00: virEventRunDefaultImpl (virevent.c:308)
==17019== by 0x130386: vshEventLoop (vsh.c:1864)
==17019== by 0x4F1EB07: virThreadHelper (virthread.c:206)
==17019== by 0xA8462D3: start_thread (in /lib64/libpthread-2.20.so)
==17019== by 0xAB441FC: clone (in /lib64/libc-2.20.so)
==17019== Address 0x139023f4 is 4 bytes inside a block of size 240 free'd
==17019== at 0x4C2B1F0: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==17019== by 0x4EA8949: virFree (viralloc.c:582)
==17019== by 0x4EFF6D0: virObjectUnref (virobject.c:273)
==17019== by 0x4FE74D6: virConnectClose (libvirt.c:1390)
==17019== by 0x13342A: virshDeinit (virsh.c:406)
==17019== by 0x134A37: main (virsh.c:950)
The problem is, when registering remoteClientCloseFunc(), it's
conn->closeCallback which is ref'd. But in the function itself
it's conn->closeCallback->conn what is unref'd. This is causing
imbalance in reference counting. Moreover, there's no need for
the remote driver to increase/decrease conn refcount since it's
not used anywhere. It's just merely passed to client registered
callback. And for that purpose it's correctly ref'd in
virConnectRegisterCloseCallback() and then unref'd in
virConnectUnregisterCloseCallback().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
(cherry picked from commit e68930077034f786e219bdb015f8880dbc5a246f) Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Peter Krempa [Tue, 20 Jan 2015 16:01:01 +0000 (17:01 +0100)]
CVE-2015-0236: qemu: Check ACLs when dumping security info from snapshots
The ACL check didn't check the VIR_DOMAIN_XML_SECURE flag and the
appropriate permission for it. Found via code inspection while fixing
permissions for save images.
gnutls-3.3.0 and newer leaves 2 FDs open in order to be backwards
compatible when it comes to chrooted binaries [1]. Linking
commandhelper with gnutls then leaves these two FDs open and
commandtest fails thanks to that. This patch does not link
commandhelper with libvirt.la, but rather only the utilities making
the test pass.
wireshark: Honor API change coming with 1.12 release
https://bugs.gentoo.org/show_bug.cgi?id=508336
At wireshark, they have this promise to change public dissector APIs
only with minor version number change. Which they did when releasing
the version of 1.12.
Firstly, they've changed tvb_memdup() in a0c53ffaa1bb46d8c9db2ec739401aa411c9790e so now it takes four arguments
instead of three. The new argument is placed at the very beginning of
the list of arguments and basically says the scope where we'd like to
allocate the memory. According to the documentation NULL should be the
default value.
Then, the tcp_dissect_pdus() signature changed too. Well, the function
that actually dissects reassembled packets as tcp_dissect_pdus()
reorder TCP packets into one big chunk and then calls a user function
to dissect the PDU at once. The change is dated back to 8081cf1d90397cbbb4404f9720595e1537ed5e14.
The rationale is to not duplicate code which is done in
packet-libvirt.h for instance. Moreover, this way we can drop
__attribute_((unused)) used int packet-libvirt.c in favor of
ATTRIBUTE_UNUSED.
This patch changes the macro introduced in 292d3f2d to either be
empty in the case of newer libselinux, or contain 'const' in the
case of older libselinux. The macro is then used directly in
tests/securityselinuxhelper.c.
Cédric Bosdonnat [Wed, 28 May 2014 12:44:08 +0000 (14:44 +0200)]
build: fix build with libselinux 2.3
Several function signatures changed in libselinux 2.3, now taking
a 'const char *' instead of 'security_context_t'. The latter is
defined in selinux/selinux.h as
Laine Stump [Wed, 15 Oct 2014 22:49:01 +0000 (00:49 +0200)]
util: eliminate "use after free" in callers of virNetDevLinkDump
virNetDevLinkDump() gets a message from netlink into "resp", then
calls nlmsg_parse() to fill the table "tb" with pointers into resp. It
then returns tb to its caller, but not before freeing the buffer at
resp. That means that all the callers of virNetDevLinkDump() are
examining memory that has already been freed. This can be verified by
filling the buffer at resp with garbage prior to freeing it (or, I
suppose, just running libvirtd under valgrind) then performing some
operation that calls virNetDevLinkDump().
The upstream commit log incorrectly states that the code has been like
this ever since virNetDevLinkDump() was written. In reality, the
problem was introduced with commit e95de74d, first in libvirt-1.0.5,
which was attempting to eliminate a typecast that caused compiler
warnings. It has only been pure luck (or maybe a lack of heavy load,
and/or maybe an allocation algorithm in malloc() that delays re-use of
just-freed memory) that has kept this from causing errors, for example
when configuring a PCI passthrough or macvtap passthrough network
interface.
The solution taken in this patch is the simplest - just return resp to
the caller along with tb, then have the caller free it after they are
finished using the data (pointers) in tb. I alternately could have
made a cleaner interface by creating a new struct that put tb and resp
together along with a vir*Free() function for it, but this function is
only used in a couple places, and I'm not sure there will be
additional new uses of virNetDevLinkDump(), so the value of adding a
new type, extra APIs, etc. is dubious.
Eric Blake [Thu, 6 Nov 2014 08:42:24 +0000 (09:42 +0100)]
CVE-2014-7823: dumpxml: security hole with migratable flag
Commit 28f8dfd (v1.0.0) introduced a security hole: in at least
the qemu implementation of virDomainGetXMLDesc, the use of the
flag VIR_DOMAIN_XML_MIGRATABLE (which is usable from a read-only
connection) triggers the implicit use of VIR_DOMAIN_XML_SECURE
prior to calling qemuDomainFormatXML. However, the use of
VIR_DOMAIN_XML_SECURE is supposed to be restricted to read-write
clients only. This patch treats the migratable flag as requiring
the same permissions, rather than analyzing what might break if
migratable xml no longer includes secret information.
Fortunately, the information leak is low-risk: all that is gated
by the VIR_DOMAIN_XML_SECURE flag is the VNC connection password;
but VNC passwords are already weak (FIPS forbids their use, and
on a non-FIPS machine, anyone stupid enough to trust a max-8-byte
password sent in plaintext over the network deserves what they
get). SPICE offers better security than VNC, and all other
secrets are properly protected by use of virSecret associations
rather than direct output in domain XML.
* src/remote/remote_protocol.x (REMOTE_PROC_DOMAIN_GET_XML_DESC):
Tighten rules on use of migratable flag.
* src/libvirt-domain.c (virDomainGetXMLDesc): Likewise.
Pavel Hrdina [Mon, 22 Sep 2014 16:19:07 +0000 (18:19 +0200)]
domain_conf: fix domain deadlock
If you use public api virConnectListAllDomains() with second parameter
set to NULL to get only the number of domains you will lock out all
other operations with domains.
Peter Krempa [Thu, 11 Sep 2014 14:35:53 +0000 (16:35 +0200)]
CVE-2014-3633: qemu: blkiotune: Use correct definition when looking up disk
Live definition was used to look up the disk index while persistent one
was indexed leading to a crash in qemuDomainGetBlockIoTune. Use the
correct def and report a nice error.
Unfortunately it's accessible via read-only connection, though it can
only crash libvirtd in the cases where the guest is hot-plugging disks
without reflecting those changes to the persistent definition. So
avoiding hotplug, or doing hotplug where persistent is always modified
alongside live definition, will avoid the out-of-bounds access.
Peter Krempa [Tue, 1 Jul 2014 11:52:51 +0000 (13:52 +0200)]
qemu: copy: Accept 'format' parameter when copying to a non-existing img
We have the following matrix of possible arguments handled by the logic
statement touched by this patch:
| flags & _REUSE_EXT | !(flags & _REUSE_EXT)
-------+--------------------+----------------------
format| (1) | (2)
-------+--------------------+----------------------
!format| (3) | (4)
-------+--------------------+----------------------
In cases 1 and 2 the user provided a format, in cases 3 and 4 not. The
user requests to use a pre-existing image in 1 and 3 and libvirt will
create a new image in 2 and 4.
The difference between cases 3 and 4 is that for 3 the format is probed
from the user-provided image, whereas in 4 we just use the existing disk
format.
The current code would treat cases 1,3 and 4 correctly but in case 2 the
format provided by the user would be ignored.
The particular piece of code was broken in commit 35c7701c64508f975dfeb8
but since it was introduced a few commits before that it was never
released as working.
Eric Blake [Wed, 25 Jun 2014 20:54:36 +0000 (14:54 -0600)]
docs: publish correct enum values
We publish libvirt-api.xml for others to use, and in fact, the
libvirt-python bindings use it to generate python constants that
correspond to our enum values. However, we had an off-by-one bug
that any enum that relied on C's rules for implicit initialization
of the first enum member to 0 got listed in the xml as having a
value of 1 (and all later members of the enum were equally
botched).
The fix is simple - since we add one to the previous value when
encountering an enum without an initializer, the previous value
must start at -1 so that the first enum member is assigned 0.
The python generator code has had the off-by-one ever since DV
first wrote it years ago, but most of our public enums were immune
because they had an explicit = 0 initializer. The only affected
enums are:
- virDomainEventGraphicsAddressType (such as
VIR_DOMAIN_EVENT_GRAPHICS_ADDRESS_IPV4), since commit 987e31e
(libvirt v0.8.0)
- virDomainCoreDumpFormat (such as VIR_DOMAIN_CORE_DUMP_FORMAT_RAW),
since commit 9fbaff0 (libvirt v1.2.3)
- virIPAddrType (such as VIR_IP_ADDR_TYPE_IPV4), since commit 03e0e79 (not yet released)
Thanks to Nehal J Wani for reporting the problem on IRC, and
for helping me zero in on the culprit function.
Peter Krempa [Wed, 25 Jun 2014 16:11:17 +0000 (18:11 +0200)]
qemu: blockcopy: Don't remove existing disk mirror info
When creating a new disk mirror the new struct is stored in a separate
variable until everything went well. The removed hunk would actually
remove existing mirror information for example when the api would be run
if a mirror still exists.
LSN-2014-0003: Don't expand entities when parsing XML
If the XML_PARSE_NOENT flag is passed to libxml2, then any
entities in the input document will be fully expanded. This
allows the user to read arbitrary files on the host machine
by creating an entity pointing to a local file. Removing
the XML_PARSE_NOENT flag means that any entities are left
unchanged by the parser, or expanded to "" by the XPath
APIs.
Laine Stump [Thu, 1 May 2014 08:40:41 +0000 (11:40 +0300)]
qemu: fix crash when removing <filterref> from interface with update-device
If a domain network interface that contains a <filterref> is modified
"live" using "virsh update-device --live", libvirtd would crash. This
was because the code supporting live update of an interface's
filterref was assuming that a filterref might be added or modified,
but didn't account for removing the filterref, resulting in a null
dereference of the filter name.
Introduced with commit 258fb278, which was first in libvirt v1.0.1.
This addresses https://bugzilla.redhat.com/show_bug.cgi?id=1093301
qemu: make sure agent returns error when required data are missing
Commit 5b3492fa aimed to fix this and caught one error but exposed
another one. When agent command is being executed and the thread
waiting for the reply is woken up by an event (e.g. EOF in case of
shutdown), the command finishes with no data (rxObject == NULL), but
no error is reported, since this might be desired by the caller
(e.g. suspend through agent). However, in other situations, when the
data are required (e.g. getting vCPUs), we proceed to getting desired
data out of the reply, but none of the virJSON*() functions works well
with NULLs. I chose the way of a new parameter for qemuAgentCommand()
function that specifies whether reply is required and behaves
according to that.
On all the places where qemuAgentComand() was called, we did a check
for errors in the reply. Unfortunately, some of the places called
qemuAgentCheckError() without checking for non-null reply which might
have resulted in a crash.
So this patch makes the error-checking part of qemuAgentCommand()
itself, which:
a) makes it look better,
b) makes the check mandatory and, most importantly,
c) checks for the errors if and only if it is appropriate.
This actually fixes a potential crashers when qemuAgentComand()
returned 0, but reply was NULL. Having said that, it *should* fix the
following bug:
Michal Privoznik [Wed, 19 Mar 2014 17:10:34 +0000 (18:10 +0100)]
virNetClientSetTLSSession: Restore original signal mask
Currently, we use pthread_sigmask(SIG_BLOCK, ...) prior to calling
poll(). This is okay, as we don't want poll() to be interrupted.
However, then - immediately as we fall out from the poll() - we try to
restore the original sigmask - again using SIG_BLOCK. But as the man
page says, SIG_BLOCK adds signals to the signal mask:
SIG_BLOCK
The set of blocked signals is the union of the current set and the set argument.
Therefore, when restoring the original mask, we need to completely
overwrite the one we set earlier and hence we should be using:
SIG_SETMASK
The set of blocked signals is set to the argument set.
The nwfilter conf update mutex previously serialized
updates to the internal data structures for firewall
rules, and updates to the firewall itself. The latter
was recently turned into a read/write lock, and filter
instantiation allowed to proceed in parallel. It was
believed that this was ok, since each filter is created
on a separate iptables/ebtables chain.
It turns out that there is a subtle lock ordering problem
on virNWFilterObjPtr instances. __virNWFilterInstantiateFilter
will hold a lock on the virNWFilterObjPtr it is instantiating.
This in turn invokes virNWFilterInstantiate which then invokes
virNWFilterDetermineMissingVarsRec which then invokes
virNWFilterObjFindByName. This iterates over every single
virNWFilterObjPtr in the list, locking them and checking their
name. So if 2 or more threads try to instantiate a filter in
parallel, they'll all hold 1 lock at the top level in the
__virNWFilterInstantiateFilter method which will cause the
other thread to deadlock in virNWFilterObjFindByName.
The fix is to add an exclusive mutex to serialize the
execution of __virNWFilterInstantiateFilter.
Eric Blake [Mon, 24 Feb 2014 23:57:06 +0000 (16:57 -0700)]
virsh: add --all flag to 'event' command
Similar to our event-test demo program, it's nice to be able to
have a mode where we can sniff all events at once, rather than
having to spawn multiple virsh in parallel with one for each
event type.
(Can I just say our RegisterAny design is lousy? The fact that
the majority of our callback pointers have a function signature
with the opaque data in a different position, and that we have
to cast the function signature before registering it, makes it
hard to write a generic callback function; we have to write one
for every type of event id. Life would have been easier if we
had designed the callback as a fixed signature with a void*
and size parameter, and then allowed the caller to downcast
the void* to a particular struct for data specific to their
callback id, where we could have then had a single function
with a switch statement for each event id, and register that
one function for all types of events. It would also be nicer
if the callback functions knew which callbackID was being used
when invoking that callback, so that I could use a common data
structure among all registrations instead of having to create
an array of one data per callback. But I really don't want to
go add yet another event API design.)
* tools/virsh-domain.c (cmdEvent): Add --all parameter; convert
all callbacks to support shared counter.
* tools/virsh.pod (event): Document it.
Eric Blake [Mon, 24 Feb 2014 21:46:29 +0000 (14:46 -0700)]
virsh: support remaining domain events
Earlier, I added 'virsh event' for lifecycle events, to get the
concept approved; this patch finishes the support for all other
events, although the user still has to register for one event
type at a time. A future patch may add an --all parameter to
make it possible to register for all events through a single
call.
* tools/virsh-domain.c (vshDomainEventWatchdogToString)
(vshDomainEventIOErrorToString, vshGraphicsPhaseToString)
(vshGraphicsAddressToString, vshDomainBlockJobStatusToString)
(vshDomainEventDiskChangeToString)
(vshDomainEventTrayChangeToString, vshEventGenericPrint)
(vshEventRTCChangePrint, vshEventWatchdogPrint)
(vshEventIOErrorPrint, vshEventGraphicsPrint)
(vshEventIOErrorReasonPrint, vshEventBlockJobPrint)
(vshEventDiskChangePrint, vshEventTrayChangePrint)
(vshEventPMChangePrint, vshEventBalloonChangePrint)
(vshEventDeviceRemovedPrint): New helper routines.
(cmdEvent): Support full array of event callbacks.
The systemd journal expects log record PRIORITY values to
be encoded using the syslog compatible numbering scheme,
not libvirt's own native numbering scheme. We must therefore
apply a conversion.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The systemd journal accepts arbitrary user specified log
fields. These can be passed into virLogMessage via the
virLogMetadata structure. Allow up to 5 custom fields to
be reported by libvirt callers.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Oleg Strikov [Wed, 26 Feb 2014 16:40:12 +0000 (20:40 +0400)]
qemu: Enable 'host-passthrough' cpu mode for arm
This patch allows libvirt user to specify 'host-passthrough'
cpu mode while using qemu/kvm backend on arm (arm32).
It uses 'host' as a CPU model name instead of some other stub
(correct CPU detection is not implemented yet) to allow libvirt
user to specify 'host-model' cpu mode as well.
Michal Privoznik [Fri, 28 Feb 2014 08:50:01 +0000 (09:50 +0100)]
virDomainBlockStats(Flags): Produce saner error message on empty disk path
As of 0bd2ccdec an empty disk path for virDomainBlockStats (or the one
with Flags) is allowed meaning "get me overall summarized statistics".
However, running 'virsh domblkstat $dom' throws a misleading error:
# ./tools/virsh domblkstat dom
error: Failed to get block stats dom
error: invalid argument: invalid path:
while after this commit
# virsh domblkstat dom
error: Operation not supported: summary statistics are not supported yet
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Eric Blake [Fri, 28 Feb 2014 00:48:23 +0000 (17:48 -0700)]
tests: avoid littering /tmp
Running 'make -C tests check TESTS=qemuagenttest' left a directory
/tmp/libvirt_XXXXXX/ behind. The culprit was failure to cleanup
when short-circuiting an expensive test.
* tests/qemuagenttest.c (testQemuAgentTimeout): Free resources
when skipping expensive test.
Jiri Denemark [Thu, 27 Feb 2014 08:32:41 +0000 (09:32 +0100)]
sanlock: Truncate domain names longer than SANLK_NAME_LEN
Libvirt uses a domain name to fill in owner_name in sanlock_options in
virLockManagerSanlockAcquire. Unfortunately, owner_name is limited to
SANLK_NAME_LEN characters (including trailing '\0'), which means domains
with longer names fail to start when sanlock is enabled. However, we can
truncate the name when setting owner_name as explained by sanlock's
author:
Setting sanlk_options or the owner_name is unnecessary, and has very
little to no benefit. If you do provide something in owner_name, it can
be anything, sanlock doesn't care or use it.
If you run the command "sanlock status", the output will display a list
of clients connected to the sanlock daemon. This client list is
displayed as "pid owner_name" if the client has provided an owner_name
via sanlk_options. This debugging output is the only usage of
owner_name, so its only benefit is to potentially provide a more human
friendly output for debugging purposes.
Eric Blake [Wed, 26 Feb 2014 20:30:46 +0000 (13:30 -0700)]
build: skip virportallocatortest on cygwin
Cygwin supports <dlfcn.h> and even has limited LD_PRELOAD
capabilities; but because it does not use ELF binaries it
cannot support RTLD_NEXT lookups.
CC libvirportallocatormock_la-virportallocatortest.lo
virportallocatortest.c: In function 'init_syms':
virportallocatortest.c:47:24: error: 'RTLD_NEXT' undeclared (first use in this function)
realsocket = dlsym(RTLD_NEXT, "socket");
* tests/virportallocatortest.c: Also require RTLD_NEXT.
Eric Blake [Wed, 26 Feb 2014 19:56:06 +0000 (12:56 -0700)]
build: ignore cygwin toolchain droppings
The cygwin compiler automatically creates a '*.exe.manifest'
companion file for any .exe file that contains a substring
that would otherwise cause newer Windows to pester users about
needing admin rights (such as "update", "instal", "setup"...).
This means that compilation on cygwin left behind
tests/networkxml2xmlupdatetest.exe.manifest.
Peter Krempa [Wed, 26 Feb 2014 13:17:51 +0000 (14:17 +0100)]
spec: Fix braces around macros
In commit 72f7658ba24491672e6b81118f892400916e9404 I've added a few
macros with bad bracing. Although they work as expected fix them so that
we use uniform syntax.
Oops - the build is non-deterministic, where the final binary
depends on my build environment. The fix is to require
systemd-devel in all situations where the code base uses it.
Now ~/.rpmmacros can contain "%define _without_systemd_daemon 1"
to explicitly disable use of the library, but the library is now
a strict build requirement for normal builds; if systemd-devel
is not installed, the user now gets an up-front warning:
$ rpmbuild -ta libvirt-1.2.2.tar.gz
error: Failed build dependencies:
systemd-devel is needed by libvirt-1.2.2-1.fc20.x86_64
* libvirt.spec.in (with_systemd_daemon): New variable.
(BuildRequires): Require systemd-devel for more than just udev.
(%configure): Make choice of systemd_daemon explicit.
Eric Blake [Tue, 25 Feb 2014 19:28:53 +0000 (12:28 -0700)]
spec: require device-mapper-devel for storage-disk
On Fedora 20, with the following in my ~/.rpmmacros:
%_without_udev 1
%_without_storage_mpath 1
and with device-mapper-devel uninstalled, 'make rpm' fails with:
checking for libdevmapper.h... no
configure: error: You must install device-mapper-devel/libdevmapper >= 1.0.0 to compile libvirt
error: Bad exit status from /var/tmp/rpm-tmp.Wo9pOG (%build)
This is a rather late point to be issuing an error; better is
to flag missing packages up front. The fix is to match the logic
in configure.ac on when devmapper is required (for both mpath and
storage). While at it, rbd storage is not dependent on mpath.
With this patch applied, I now get:
$ rpmbuild -ta libvirt-1.2.2.tar.gz
error: Failed build dependencies:
device-mapper-devel is needed by libvirt-1.2.2-1.fc20.x86_64
until either installing the package or further modifying
~/.rpmmacros to add "%_without_storage_disk 1".
* libvirt.spec.in (BuildRequires): Fix build when mpath is
disabled.
Eric Blake [Tue, 25 Feb 2014 16:51:40 +0000 (09:51 -0700)]
spec: explicitly avoid bhyve on Linux
Generally, we try to make the spec file tweakable via user
variables, so that they can select a different subset of sub-rpms
to build. We also try to explicitly list all driver config
options, rather than leaving the chance that the rpm build may be
non-deterministic based on what the user had installed locally.
But in the case of the recent bhyve hypervisor driver, there is
no port of bhyve to Linux, so it is easier to just blindly
disable it for now. If someone ever does try to port bhyve to
Fedora, we can make the spec file conditional at that point.
* libvirt.spec.in (%configure): Don't try to build bhyve.
Eric Blake [Tue, 25 Feb 2014 16:42:29 +0000 (09:42 -0700)]
build: use --with-systemd-daemon as configure option
Commit 68954fb added a configure option --with-systemd_daemon,
which violates the conventions of configure files preferring
dash in all option names. This fixes it, before we hit a
release where the tarball is baked with an awkward name.
* m4/virt-lib.m4 (LIBVIRT_CHECK_LIB, LIBVIRT_CHECK_LIB_ALT)
(LIBVIRT_CHECK_PKG): Favor - over _ in configure option names.
Laine Stump [Tue, 25 Feb 2014 14:47:22 +0000 (16:47 +0200)]
network: unplug bandwidth and call networkRunHook only when appropriate
According to commit b4e0299d if networkAllocateActualDevice() was
successful, it will *always* allocate an iface->data.network.actual,
so we can use this during networkReleaseActualDevice() to know if
there is really anything to undo. We were properly using this
information to only decrement the network connections counter if it
had previously been incremented, but we were unconditionally
unplugging bandwidth and calling the "unplugged" network hook for
*all* interfaces (during qemuProcessStop()) whether they had been
previously plugged or not. This caused problems if a domain failed to
start at some time prior to all interfaces being allocated. (I
encountered this when an interface had a bandwidth floor set but no
inbound QoS).
This patch changes both the call to networkUnplugBandwidth() and the
call to networkRunHook() to only be called if there was a previous
call to "plug" for the same interface.
Laine Stump [Tue, 25 Feb 2014 14:35:59 +0000 (16:35 +0200)]
network: don't even call networkRunHook if there is no network
networkAllocateActualDevice() is called for *all* interfaces, not just
those with type='network'. In that case, it will jump down to its
validate: label immediately, without allocating anything. After
validation is done, two counters are potentially updated (one for the
network, and one for any particular physical device that is chosen),
and then networkRunHook() is called.
This patch refactors that code a slight bit so that networkRunHook()
doesn't get called if netdef is NULL (i.e. type != network) and to
place the conditional increment of dev->connections inside the "if
(netdef)" as well - dev can never be non-null if netdef is null
(because "dev" is the pointer to a device in a network's pool of
devices), so this doesn't have any functional effect, it just makes
the code clearer.
While running virscsitest, it was found that valgrind pointed out the following
memory leak:
==320== 5 bytes in 1 blocks are definitely lost in loss record 4 of 37
==320== at 0x4A069EE: malloc (vg_replace_malloc.c:270)
==320== by 0x3E6CE81171: strdup (strdup.c:43)
==320== by 0x4CB28DF: virStrdup (virstring.c:554)
==320== by 0x4CAC987: virSCSIDeviceSetUsedBy (virscsi.c:289)
==320== by 0x402321: test2 (virscsitest.c:100)
==320== by 0x403231: virtTestRun (testutils.c:199)
==320== by 0x402121: mymain (virscsitest.c:180)
==320== by 0x4039AD: virtTestMain (testutils.c:782)
==320== by 0x3E6CE1ED1C: (below main) (libc-start.c:226)
==320==
Introduced by commit fd243fc. Signed-off-by: Ján Tomko <jtomko@redhat.com>
When starting these domain in parallel, all workers may meet in
virNetDevVethCreate() where a race starts. Race over allocating veth
pairs because allocation requires two steps:
1) find first nonexistent '/sys/class/net/vnet%d/'
2) run 'ip link add ...' command
Now consider two threads. Both of them find N as the first unused veth
index but only one of them succeeds allocating it. The other one fails.
For such cases, we are running the allocation in a loop with 10 rounds.
However this is very flaky synchronization. It should be rather used
when libvirt is competing with other process than when libvirt threads
fight each other. Therefore, internally we should use mutex to serialize
callers, and do the allocation in loop (just in case we are competing
with a different process). By the way we have something similar already
since 1cf97c87.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Eric Blake [Wed, 26 Feb 2014 00:24:58 +0000 (17:24 -0700)]
build: avoid ld_preload tests on mingw
Running ./autobuild.sh complained during the mingw cross-compile:
CC libvirportallocatormock_la-virportallocatortest.lo
../../tests/virportallocatortest.c:32:20: fatal error: dlfcn.h: No such file or directory
# include <dlfcn.h>
^
compilation terminated. With that fixed, the next failure was:
CCLD qemuxml2argvmock.la
libtool: link: libtool library `qemuxml2argvmock.la' must begin with `lib'
libtool: link: Try `libtool --help --mode=link' for more information.
While we don't need to limit all LD_PRELOAD tests to just Linux, we
do need to limit them to platforms that actually support loading;
we also need to avoid building qemu tests when qemu is not enabled.
* tests/virportallocatortest.c: Make conditional on <dlfcn.h>.
* tests/Makefile.am (test_libraries): Only build qemu mock library
when building qemu tests.
Jim Fehlig [Wed, 19 Feb 2014 22:39:19 +0000 (15:39 -0700)]
libxl: queue domain event earlier in shutdown handler
The shutdown handler may restart a domain when handling a reboot
event or when <on_*> is set to 'restart'. Restarting consists of
calling libxlVmCleanup followed by libxlVmStart. libxlVmStart will
emit a VIR_DOMAIN_EVENT_STARTED event, but the SHUTDOWN event is
not emitted until exiting the shutdown handler, after the STARTED
event.
This patch changes the logic a bit to queue the event at the start
of the shutdown action, ensuring it is queued before any subsequent
events that may be generated while executing the shutdown action.
Laine Stump [Fri, 21 Feb 2014 12:12:00 +0000 (14:12 +0200)]
network: include plugged interface XML in "plugged" network hook
The network hook script gets called whenever an interface is plugged
into or unplugged from a network, but even though the full XML of both
the network and the domain is included, there is no reasonable way to
determine what exact resources the plugged interface is using:
1) Prior to a recent patch which modified the status XML of interfaces
to include the information about actual hardware resources used, it
would be possible to scan through the domain XML output sent to the
hook, and from there find the correct interface, but that interface
definition would not include any runtime info (e.g. bandwidth or vlan
taken from a portgroup, or which physdev was used in case of a macvtap
network).
2) After the patch modifying the status XML of interfaces, the network
name would no longer be included in the domain XML, so it would be
completely impossible to determine which interface was the one being
plugged.
To solve that problem, this patch includes a single <interface>
element at the beginning of the XML sent to the network hook for
"plugged" and "unplugged" (just inside <hookData>) that is the status
XML of the interface being plugged. This XML will include all info
gathered from the chosen network and portgroup.
NB: due to hardcoded spaces in all of the device *Format() functions,
the <interface> element inside the <hookData> will be indented by 6
spaces rather than 2. I had intended to fix this, but it turns out
that to make virDomainNetDefFormat() indentation relative, I would
have to do the same to virDomainDeviceInfoFormat(), and that function
is called from 19 places - making that a prerequisite of this patch
would cause too many merge difficulties if we needed to backport
network hooks, so I chose to ignore the problem here and fix the
problem for *all* devices in a followup later.
Laine Stump [Thu, 20 Feb 2014 12:03:49 +0000 (14:03 +0200)]
conf: output actual netdev status in <interface> XML
Until now, the "live" XML status of an <interface type='network'>
device would always show the network information, rather than the
exact hardware device that was used. It would also show the name of
any portgroup the interface belonged to, rather than providing the
configuration that was derived from that portgroup. As an example,
given the following network definition:
In order to learn the exact bandwidth information of the interface, a
management application would need to retrieve the XML for testnet,
then search for the portgroup named "admin". Even worse, there was no
simple and standard way to learn which host physdev the macvtap0
device is attached to.
Internally, libvirt has always kept this information in the
virDomainDef that is held in memory, as well as storing it in the
(libvirt-internal-only) domain status XML (in
/var/run/libvirt/qemu/$domain.xml). In order to not confuse the runtime
"actual state" with the config of the device, it's internally stored
like this:
This was never exposed outside of libvirt though, because I thought it
would be too awkward for a management application to need to look in
two places for the same information, but I also wasn't sure that it
would be okay to overwrite the config info (in this case "<source
network='testnet' portgroup='admin'/>") with the actual runtime info
(everything inside <actual> above).
Now we have a need for this information to be made available to
management applications (in particular, so that a network "plugged"
hook will have full information about the device that is being plugged
in), so it's time to take the leap and decide that it is acceptable
for the config info to be replaced with actual runtime state (but
*only* when reporting domain live status, *not* when saving state in
/var/run/libvirt/qemu/$domain.xml - that remains the same so that
there is no loss of information). That is what this patch does - once
applied, the output of "virsh dumpxml $domain" when the domain is
running will contain something like this:
In effect, everything that is internally stored within <actual> is
moved up a level to where a management application will expect
it. This means that the management application will only look in a
single place to learn - the type of interface in use, the name of the
physdev (if relevant), the <bandwidth>, <vlan>, and <virtualport>
settings in use.
The potential downside is that a management app looking at this output
will not see that the physdev 'p4p1_0' was actually allocated from the
network named 'testnet', or that the bandwidth numbers were taken from
the portgroup 'admin'. However, if they are interested in that info,
they can always get the "inactive" XML for the domain.
An example of where this could cause problems is in virt-manager's
network device display, which shows the status of the device, but
allows you to edit that status info and save it as the new
config. Previously virt-manager would always display the information
in example [C] above, and allow editing that. With this patch, it will
instead display what is in [E] and allow editing it directly, which
could lead to some confusion. I would suggest that virt-manager have
an "edit" button which would change the display from the "live" xml to
the "inactive" xml, so that editing would be done on that; such a
change would both handle the new situation, and also be compatible
with older releases.
Laine Stump [Wed, 19 Feb 2014 12:45:52 +0000 (14:45 +0200)]
conf: new function virDomainActualNetDefContentsFormat
This function is currently only called from one place, but in a
subsequent patch will be called from a 2nd place.
The new function exactly replicates the original behavior of the part
of virDomainActualNetDefFormat() that it replaces, but takes a
virDomainNetDefPtr instead of virDomainActualNetDefPtr, and uses the
virDomainNetGetActual*() functions whenever possible, rather than
reaching into def->data.network.actual - this is to be sure that we
are reporting exactly what is being used internally, just in case
there are any discrepancies (there shouldn't be).
Laine Stump [Wed, 19 Feb 2014 14:22:48 +0000 (16:22 +0200)]
conf: re-situate <bandwidth> element in <interface>
This moves the call to virNetDevBandwidthFormat() in
virDomainNetDefFormat() to be called right after the call to
virNetDevVPortProfileFormat(), so that a single chunk of that function
can be placed inside an if that conditionally calls
virDomainActualNetDefContentsFormat() instead (next patch). The
re-ordering necessitates modifying a couple of test data files.
Laine Stump [Thu, 20 Feb 2014 11:59:55 +0000 (13:59 +0200)]
conf: handle null pointer in virNetDevVlanFormat
Other *Format() functions (e.g. virNetDevBandwidthFormat()) return
with no action when called with a NULL *Def pointer. This makes
virNetDevVlanFormat() consistent with that behavior.
Laine Stump [Mon, 17 Feb 2014 12:36:18 +0000 (14:36 +0200)]
conf: clarify what is returned for actual bandwidth and vlan
In practice, if a virDomainNetDef has a virDomainActualNetDef
allocated, the ActualNetDef will *always* contain the bandwidth and
vlan data from the NetDef (unless there was also a portgroup involved
- see networkAllocateActualDevice()).
However, virDomainNetGetActual(Bandwidth|Vlan)() were coded to make it
appear as if it might be possible to have a valid bandwidth/vlan in
the NetDef, but a NULL in the ActualNetDef. Believing this un-truth
could lead to writing unnecessarily defensive code when dealing with
the virDomainGetActual*() functions, so this patch makes it more
obvious:
If there is an ActualNetDef, it will always have a copy of the
various appropriate bits from its parent NetDef, and the
virDomainGetActual* function will *always* return the data from the
ActualNetDef, not from the NetDef.
The reason for this effective-NOP patch is that a subsequent patch to
change virDomainNetDefFormat will rely on the above rule.
I noticed this while shortening switch statements via VIR_ENUM.
Basically, the only ways virAsprintf can fail are if we pass a
bogus format string (but we're not THAT bad) or if we run out
of memory (but it already warns on our behalf in that case).
Throw away the cruft that tries too hard to diagnose a printf
failure.
Eric Blake [Fri, 21 Feb 2014 19:55:06 +0000 (12:55 -0700)]
virsh: use more compact VIR_ENUM_IMPL
Dan Berrange suggested that using VIR_ENUM_IMPL is more compact
than open-coding switch statements, and still just as forceful
at making us remember to update lists if we add enum values
in the future. Make this change throughout virsh.
Sure enough, doing this change caught that we missed at least
VIR_STORAGE_VOL_NETDIR.
Ensure systemd cgroup ownership is delegated to container with userns
This function is needed for user namespaces, where we need to chmod()
the cgroup to the initial uid/gid such that systemd is allowed to
use the cgroup.
Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Michal Privoznik [Fri, 21 Feb 2014 12:06:42 +0000 (13:06 +0100)]
virNetServerRun: Notify systemd that we're accepting clients
Systemd does not forget about the cases, where client service needs to
wait for daemon service to initialize and start accepting new clients.
Setting a dependency in client is not enough as systemd doesn't know
when the daemon has initialized itself and started accepting new
clients. However, it offers a mechanism to solve this. The daemon needs
to call a special systemd function by which the daemon tells "I'm ready
to accept new clients". This is exactly what we need with
libvirtd-guests (client) and libvirtd (daemon). So now, with this
change, libvirt-guests.service is invoked not any sooner than
libvirtd.service calls the systemd notify function.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Michal Privoznik [Fri, 21 Feb 2014 11:46:08 +0000 (12:46 +0100)]
libvirt-guests: Wait for libvirtd to initialize
I've noticed that in some cases systemd was quick enough and even
if libvirt-guests.service is marked to be started after the
libvirtd.service my guests were not resumed as
libvirt-guests.sh failed to connect. This is because of a
simple fact: systemd correctly starts libvirt-guests after it
execs libvirtd. However, the daemon is not able to accept
connections right from the start. It's doing some
initialization which may take ages. This problem is not limited
to systemd only, indeed. Any init system that is able to startup
services in parallel (e.g. OpenRC) may run into this situation.
The fix is to try connecting not only once, but continuously a few
times with a small sleep in between tries.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When creating a new domain, we let systemd know about it by calling
CreateMachine() function via dbus. Systemd then creates a scope and
places domain into it. However, later when the host is shutting
down, systemd computes the shutdown order to see what processes can
be shut down in parallel. And since we were not setting
dependencies at all, the slices (and thus domains) were most likely
killed before libvirt-guests.service. So user domains that had to
be saved, shut off, whatever were in fact killed. This problem can
be solved by letting systemd know that scopes we're creating must
not be killed before libvirt-guests.service.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Ján Tomko [Fri, 21 Feb 2014 09:22:22 +0000 (10:22 +0100)]
Ignore additional fields in iscsiadm output
There has been a new field introduced in iscsiadm --mode session
output [1], but our regex only expects four fields. This breaks
startup of iscsi pools:
error: Failed to start pool iscsi
error: internal error: cannot find session
Fix this by ignoring anything after the fourth field.
Ján Tomko [Fri, 21 Feb 2014 08:10:48 +0000 (09:10 +0100)]
Add a stub for virCgroupGetDomainTotalCpuStats
Commit 6515889 broke the build on FreeBSD:
In function `qemuDomainGetCPUStats':
/../../src/qemu/qemu_driver.c:16102:
undefined reference to `virCgroupGetDomainTotalCpuStats'
Eric Blake [Fri, 14 Feb 2014 23:59:35 +0000 (16:59 -0700)]
virsh: add event command, for lifecycle events
Add 'virsh event --list' and 'virsh event [dom] --event=name
[--loop] [--timeout]'. Borrows somewhat from event-test.c,
but defaults to a one-shot notification, and takes advantage
of the event loop integration to allow Ctrl-C to interrupt the
wait for an event. For now, this just does lifecycle events.
* tools/virsh.pod (event): Document new command.
* tools/virsh-domain.c (vshDomainEventToString)
(vshDomainEventDetailToString, vshDomEventData)
(vshEventLifecyclePrint, cmdEvent): New struct and functions.
Eric Blake [Fri, 14 Feb 2014 22:30:23 +0000 (15:30 -0700)]
virsh: common code for waiting for an event
I plan to add 'virsh event' to virsh-domain.c and 'virsh
net-event' to virsh-network.c; but as they will share quite
a bit of common boilerplate, it's better to set that up now
in virsh.c.
* tools/virsh.h (_vshControl): Add fields.
(vshEventStart, vshEventWait, vshEventDone, vshEventCleanup): New
prototypes.
* tools/virsh.c (vshEventFd, vshEventOldAction, vshEventInt)
(vshEventTimeout): New helper variables and functions.
(vshEventStart, vshEventWait, vshEventDone, vshEventCleanup):
Implement new functions.
(vshInit, vshDeinit, main): Manage event timeout.
Eric Blake [Fri, 14 Feb 2014 00:08:42 +0000 (17:08 -0700)]
virsh: common code for parsing --seconds
Several virsh commands ask for a --timeout parameter in
seconds, then use it to control interfaces that operate on
millisecond limits; I also plan on adding a 'virsh event'
command that also does this. Factor this into a common
function.
* tools/virsh.h (vshCommandOptTimeoutToMs): New prototype.
* tools/virsh.c (vshCommandOptTimeoutToMs): New function.
* tools/virsh-domain.c (cmdBlockCommit, cmdBlockCopy)
(cmdBlockPull, cmdMigrate): Use it.
(vshWatchJob): Adjust timeout scale.