Jiri Denemark [Tue, 28 Jun 2016 12:39:58 +0000 (14:39 +0200)]
qemu: Let empty default VNC password work as documented
CVE-2016-5008
Setting an empty graphics password is documented as a way to disable
VNC/SPICE access, but QEMU does not always behaves like that. VNC would
happily accept the empty password. Let's enforce the behavior by setting
password expiration to "now".
Eric Blake [Wed, 9 Dec 2015 00:46:31 +0000 (17:46 -0700)]
CVE-2015-5313: storage: don't allow '/' in filesystem volume names
The libvirt file system storage driver determines what file to
act on by concatenating the pool location with the volume name.
If a user is able to pick names like "../../../etc/passwd", then
they can escape the bounds of the pool. For that matter,
virStoragePoolListVolumes() doesn't descend into subdirectories,
so a user really shouldn't use a name with a slash.
Normally, only privileged users can coerce libvirt into creating
or opening existing files using the virStorageVol APIs; and such
users already have full privilege to create any domain XML (so it
is not an escalation of privilege). But in the case of
fine-grained ACLs, it is feasible that a user can be granted
storage_vol:create but not domain:write, and it violates
assumptions if such a user can abuse libvirt to access files
outside of the storage pool.
Therefore, prevent all use of volume names that contain "/",
whether or not such a name is actually attempting to escape the
pool.
Since commit 8eb55d782a2b9afacc7938694891cc6fad7b42a5 libxml2 removes
two slashes from the URI when there is no server part. This is fixed
with beb7281055dbf0ed4d041022a67c6c5cfd126f25, but only if the calling
application calls xmlSaveUri() on URI that xmlURIParse() parsed. And
that is not the case in virURIFormat(). virURIFormat() accepts
virURIPtr that can be created without parsing it and we do that when we
format network storage paths for gluster for example. Even though
virStorageSourceParseBackingURI() uses virURIParse(), it throws that data
structure right away.
Since we want to format URIs as URIs and not absolute URIs or opaque
URIs (see RFC 3986), we can specify that with a special hack thanks to
commit beb7281055dbf0ed4d041022a67c6c5cfd126f25, by setting port to -1.
This fixes qemuxml2argvtest test where the disk-drive-network-gluster
case was failing.
In systemd >= 218, the udev_set_log_fn method has been marked
deprecated and turned into a no-op. Nothing in the udev client
library will print to stderr by default anymore, so we can
just stop installing a logging hook for new enough udev.
Dario Faggioli [Mon, 30 Jun 2014 17:19:01 +0000 (19:19 +0200)]
libxl: don't break the build on Xen>=4.5 because of libxl_vcpu_setaffinity()
libxl interface for vcpu pinning is changing in Xen 4.5. Basically,
libxl_set_vcpuaffinity() now wants one more parameter. That is
representative of 'VCPU soft affinity', which libvirt does not use.
To mark such change, the macro LIBXL_HAVE_VCPUINFO_SOFT_AFFINITY is
defined. Use it as a gate and, if present, re-#define the calls from
the old to the new interface, to avoid breaking the build.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com> Cc: Jim Fehlig <jfehlig@suse.com> Cc: Ian Campbell <Ian.Campbell@citrix.com> Cc: Ian Jackson <Ian.Jackson@eu.citrix.com>
(cherry picked from commit bfc72e99920215c9b004a5380ca61fe6ff81ea6b)
Well, in 8ad126e6 we tried to fix a memory corruption problem.
However, the fix was not as good as it could be. I mean, the
commit has one line more than it should. I've noticed this output
just recently:
# ./run valgrind --leak-check=full --show-reachable=yes ./tools/virsh domblklist gentoo
==17019== Memcheck, a memory error detector
==17019== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==17019== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==17019== Command: /home/zippy/work/libvirt/libvirt.git/tools/.libs/virsh domblklist gentoo
==17019==
Target Source
------------------------------------------------
fda /var/lib/libvirt/images/fd.img
vda /var/lib/libvirt/images/gentoo.qcow2
hdc /home/zippy/tmp/install-amd64-minimal-20150402.iso
==17019== Thread 2:
==17019== Invalid read of size 4
==17019== at 0x4EFF5B4: virObjectUnref (virobject.c:258)
==17019== by 0x5038CFF: remoteClientCloseFunc (remote_driver.c:552)
==17019== by 0x5069D57: virNetClientCloseLocked (virnetclient.c:685)
==17019== by 0x506C848: virNetClientIncomingEvent (virnetclient.c:1852)
==17019== by 0x5082136: virNetSocketEventHandle (virnetsocket.c:1913)
==17019== by 0x4ECD64E: virEventPollDispatchHandles (vireventpoll.c:509)
==17019== by 0x4ECDE02: virEventPollRunOnce (vireventpoll.c:658)
==17019== by 0x4ECBF00: virEventRunDefaultImpl (virevent.c:308)
==17019== by 0x130386: vshEventLoop (vsh.c:1864)
==17019== by 0x4F1EB07: virThreadHelper (virthread.c:206)
==17019== by 0xA8462D3: start_thread (in /lib64/libpthread-2.20.so)
==17019== by 0xAB441FC: clone (in /lib64/libc-2.20.so)
==17019== Address 0x139023f4 is 4 bytes inside a block of size 240 free'd
==17019== at 0x4C2B1F0: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==17019== by 0x4EA8949: virFree (viralloc.c:582)
==17019== by 0x4EFF6D0: virObjectUnref (virobject.c:273)
==17019== by 0x4FE74D6: virConnectClose (libvirt.c:1390)
==17019== by 0x13342A: virshDeinit (virsh.c:406)
==17019== by 0x134A37: main (virsh.c:950)
The problem is, when registering remoteClientCloseFunc(), it's
conn->closeCallback which is ref'd. But in the function itself
it's conn->closeCallback->conn what is unref'd. This is causing
imbalance in reference counting. Moreover, there's no need for
the remote driver to increase/decrease conn refcount since it's
not used anywhere. It's just merely passed to client registered
callback. And for that purpose it's correctly ref'd in
virConnectRegisterCloseCallback() and then unref'd in
virConnectUnregisterCloseCallback().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
(cherry picked from commit e68930077034f786e219bdb015f8880dbc5a246f) Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Michal Privoznik [Fri, 30 Jan 2015 09:37:10 +0000 (10:37 +0100)]
xend: Don't crash in virDomainXMLDevID
The function is called from all {Attach,Update,Detach}Device APIs to
create config strings that are later passed to the xend to perform the
desired action. The function is intended to handle all supported
devices. However, as of 5b05358abacb1029fa0d61f72decacf0d4fd8ffb we
are trying to get disk driver of the device without checking if the
device really is a disk. This leads to an segmentation fault:
#0 0x00007ffff7571815 in virDomainDiskGetDriver () from /usr/lib/libvirt.so.0
#1 0x00007fffeb9ad471 in ?? () from /usr/lib/libvirt/connection-driver/libvirt_driver_xen.so
#2 0x00007fffeb9b1062 in xenDaemonAttachDeviceFlags () from /usr/lib/libvirt/connection-driver/libvirt_driver_xen.so
#3 0x00007fffeb9a8a86 in ?? () from /usr/lib/libvirt/connection-driver/libvirt_driver_xen.so
#4 0x00007ffff7609266 in virDomainAttachDevice () from /usr/lib/libvirt.so.0
#5 0x0000555555593c9d in ?? ()
#6 0x00007ffff76743c9 in virNetServerProgramDispatch () from /usr/lib/libvirt.so.0
#7 0x00005555555a678d in ?? ()
#8 0x00007ffff755460e in ?? () from /usr/lib/libvirt.so.0
#9 0x00007ffff7553b06 in ?? () from /usr/lib/libvirt.so.0
#10 0x00007ffff4998b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#11 0x00007ffff46e30ed in clone () from /lib/x86_64-linux-gnu/libc.so.6
#12 0x0000000000000000 in ?? ()
Reported-by: Xiaolin Su <linxxnil@126.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
(cherry picked from commit cd7702d4561bc100f291be7a1f6fa8f358440558)
Peter Krempa [Tue, 20 Jan 2015 16:01:01 +0000 (17:01 +0100)]
CVE-2015-0236: qemu: Check ACLs when dumping security info from snapshots
The ACL check didn't check the VIR_DOMAIN_XML_SECURE flag and the
appropriate permission for it. Found via code inspection while fixing
permissions for save images.
wireshark: Honor API change coming with 1.12 release
https://bugs.gentoo.org/show_bug.cgi?id=508336
At wireshark, they have this promise to change public dissector APIs
only with minor version number change. Which they did when releasing
the version of 1.12.
Firstly, they've changed tvb_memdup() in a0c53ffaa1bb46d8c9db2ec739401aa411c9790e so now it takes four arguments
instead of three. The new argument is placed at the very beginning of
the list of arguments and basically says the scope where we'd like to
allocate the memory. According to the documentation NULL should be the
default value.
Then, the tcp_dissect_pdus() signature changed too. Well, the function
that actually dissects reassembled packets as tcp_dissect_pdus()
reorder TCP packets into one big chunk and then calls a user function
to dissect the PDU at once. The change is dated back to 8081cf1d90397cbbb4404f9720595e1537ed5e14.
The rationale is to not duplicate code which is done in
packet-libvirt.h for instance. Moreover, this way we can drop
__attribute_((unused)) used int packet-libvirt.c in favor of
ATTRIBUTE_UNUSED.
This patch changes the macro introduced in 292d3f2d to either be
empty in the case of newer libselinux, or contain 'const' in the
case of older libselinux. The macro is then used directly in
tests/securityselinuxhelper.c.
Cédric Bosdonnat [Wed, 28 May 2014 12:44:08 +0000 (14:44 +0200)]
build: fix build with libselinux 2.3
Several function signatures changed in libselinux 2.3, now taking
a 'const char *' instead of 'security_context_t'. The latter is
defined in selinux/selinux.h as
Laine Stump [Wed, 15 Oct 2014 22:49:01 +0000 (00:49 +0200)]
util: eliminate "use after free" in callers of virNetDevLinkDump
virNetDevLinkDump() gets a message from netlink into "resp", then
calls nlmsg_parse() to fill the table "tb" with pointers into resp. It
then returns tb to its caller, but not before freeing the buffer at
resp. That means that all the callers of virNetDevLinkDump() are
examining memory that has already been freed. This can be verified by
filling the buffer at resp with garbage prior to freeing it (or, I
suppose, just running libvirtd under valgrind) then performing some
operation that calls virNetDevLinkDump().
The upstream commit log incorrectly states that the code has been like
this ever since virNetDevLinkDump() was written. In reality, the
problem was introduced with commit e95de74d, first in libvirt-1.0.5,
which was attempting to eliminate a typecast that caused compiler
warnings. It has only been pure luck (or maybe a lack of heavy load,
and/or maybe an allocation algorithm in malloc() that delays re-use of
just-freed memory) that has kept this from causing errors, for example
when configuring a PCI passthrough or macvtap passthrough network
interface.
The solution taken in this patch is the simplest - just return resp to
the caller along with tb, then have the caller free it after they are
finished using the data (pointers) in tb. I alternately could have
made a cleaner interface by creating a new struct that put tb and resp
together along with a vir*Free() function for it, but this function is
only used in a couple places, and I'm not sure there will be
additional new uses of virNetDevLinkDump(), so the value of adding a
new type, extra APIs, etc. is dubious.
Eric Blake [Thu, 6 Nov 2014 08:42:24 +0000 (09:42 +0100)]
CVE-2014-7823: dumpxml: security hole with migratable flag
Commit 28f8dfd (v1.0.0) introduced a security hole: in at least
the qemu implementation of virDomainGetXMLDesc, the use of the
flag VIR_DOMAIN_XML_MIGRATABLE (which is usable from a read-only
connection) triggers the implicit use of VIR_DOMAIN_XML_SECURE
prior to calling qemuDomainFormatXML. However, the use of
VIR_DOMAIN_XML_SECURE is supposed to be restricted to read-write
clients only. This patch treats the migratable flag as requiring
the same permissions, rather than analyzing what might break if
migratable xml no longer includes secret information.
Fortunately, the information leak is low-risk: all that is gated
by the VIR_DOMAIN_XML_SECURE flag is the VNC connection password;
but VNC passwords are already weak (FIPS forbids their use, and
on a non-FIPS machine, anyone stupid enough to trust a max-8-byte
password sent in plaintext over the network deserves what they
get). SPICE offers better security than VNC, and all other
secrets are properly protected by use of virSecret associations
rather than direct output in domain XML.
* src/remote/remote_protocol.x (REMOTE_PROC_DOMAIN_GET_XML_DESC):
Tighten rules on use of migratable flag.
* src/libvirt-domain.c (virDomainGetXMLDesc): Likewise.
Pavel Hrdina [Mon, 22 Sep 2014 16:19:07 +0000 (18:19 +0200)]
domain_conf: fix domain deadlock
If you use public api virConnectListAllDomains() with second parameter
set to NULL to get only the number of domains you will lock out all
other operations with domains.
Peter Krempa [Thu, 11 Sep 2014 14:35:53 +0000 (16:35 +0200)]
CVE-2014-3633: qemu: blkiotune: Use correct definition when looking up disk
Live definition was used to look up the disk index while persistent one
was indexed leading to a crash in qemuDomainGetBlockIoTune. Use the
correct def and report a nice error.
Unfortunately it's accessible via read-only connection, though it can
only crash libvirtd in the cases where the guest is hot-plugging disks
without reflecting those changes to the persistent definition. So
avoiding hotplug, or doing hotplug where persistent is always modified
alongside live definition, will avoid the out-of-bounds access.
Peter Krempa [Tue, 1 Jul 2014 11:52:51 +0000 (13:52 +0200)]
qemu: copy: Accept 'format' parameter when copying to a non-existing img
We have the following matrix of possible arguments handled by the logic
statement touched by this patch:
| flags & _REUSE_EXT | !(flags & _REUSE_EXT)
-------+--------------------+----------------------
format| (1) | (2)
-------+--------------------+----------------------
!format| (3) | (4)
-------+--------------------+----------------------
In cases 1 and 2 the user provided a format, in cases 3 and 4 not. The
user requests to use a pre-existing image in 1 and 3 and libvirt will
create a new image in 2 and 4.
The difference between cases 3 and 4 is that for 3 the format is probed
from the user-provided image, whereas in 4 we just use the existing disk
format.
The current code would treat cases 1,3 and 4 correctly but in case 2 the
format provided by the user would be ignored.
The particular piece of code was broken in commit 35c7701c64508f975dfeb8
but since it was introduced a few commits before that it was never
released as working.
Eric Blake [Wed, 25 Jun 2014 20:54:36 +0000 (14:54 -0600)]
docs: publish correct enum values
We publish libvirt-api.xml for others to use, and in fact, the
libvirt-python bindings use it to generate python constants that
correspond to our enum values. However, we had an off-by-one bug
that any enum that relied on C's rules for implicit initialization
of the first enum member to 0 got listed in the xml as having a
value of 1 (and all later members of the enum were equally
botched).
The fix is simple - since we add one to the previous value when
encountering an enum without an initializer, the previous value
must start at -1 so that the first enum member is assigned 0.
The python generator code has had the off-by-one ever since DV
first wrote it years ago, but most of our public enums were immune
because they had an explicit = 0 initializer. The only affected
enums are:
- virDomainEventGraphicsAddressType (such as
VIR_DOMAIN_EVENT_GRAPHICS_ADDRESS_IPV4), since commit 987e31e
(libvirt v0.8.0)
- virDomainCoreDumpFormat (such as VIR_DOMAIN_CORE_DUMP_FORMAT_RAW),
since commit 9fbaff0 (libvirt v1.2.3)
- virIPAddrType (such as VIR_IP_ADDR_TYPE_IPV4), since commit 03e0e79 (not yet released)
Thanks to Nehal J Wani for reporting the problem on IRC, and
for helping me zero in on the culprit function.
Peter Krempa [Wed, 25 Jun 2014 16:11:17 +0000 (18:11 +0200)]
qemu: blockcopy: Don't remove existing disk mirror info
When creating a new disk mirror the new struct is stored in a separate
variable until everything went well. The removed hunk would actually
remove existing mirror information for example when the api would be run
if a mirror still exists.
Eric Blake [Mon, 9 Jun 2014 21:36:58 +0000 (15:36 -0600)]
storage: fix memory leak with encrypted images
Jim Fehlig reported a regression found by libvirt-TCK tests:
> ~ # perl /usr/share/libvirt-tck/tests/qemu/100-disk-encryption.t
...
> ok 4 - defined persistent domain config
> # Starting inactive domain config
> libvirt error code: 1, message: internal error: unable to execute QEMU command
> 'cont': 'drive-ide0-0-1'
> (/var/cache/libvirt-tck/300-disk-encryption/demo.qcow2) is encrypted
Commit 2279d560 converted a boolean into a pointer with the intent of
transferring that pointer out of a temporary object into the caller's
data structure. The temporary structure meant that meta->encryption
was always NULL on entry, so we could get away with blindly allocating
the pointer when the header said so. But later, commit 8823272d
tweaked things to do backing chain detection in-place, rather than via
a temporary object; this has the net result that meta->encryption can
be non-NULL on entry. Not only did this turn the latent behavior into
a memory leak, it is also a behavior regression: blindly allocating a
new pointer wipes out what secrets we already knew about the chain,
making it impossible to restart the domain.
Of course, no one in their right mind should be relying on qcow2
encryption - it is fundamentally flawed. And sadly, the TCK tests
don't get run often enough, and this shows that our virstoragetest
does not exercise encrypted images at all. Otherwise, we could
have avoided a release containing this regression.
* src/util/virstoragefile.c (virStorageFileGetMetadataInternal):
Don't nuke an already-existing encryption.
LSN-2014-0003: Don't expand entities when parsing XML
If the XML_PARSE_NOENT flag is passed to libxml2, then any
entities in the input document will be fully expanded. This
allows the user to read arbitrary files on the host machine
by creating an entity pointing to a local file. Removing
the XML_PARSE_NOENT flag means that any entities are left
unchanged by the parser, or expanded to "" by the XPath
APIs.
This resulted in a difference in how 'virsh vol-info --pool <poolName>
<volume>' or 'virsh vol-list vol-list --pool <poolName> --details' outputs
the capacity information for a directory pool with a qcow2 sparse file.
Results in listing a Capacity value. Prior to the commit, the value would
be '1.0 MiB' (1048576 bytes). However, after the commit the output would be
(for example) '192.50 KiB', which for my system was the size of the volume
in my file system (eg 'ls -l TestPool/temp_vol_1' results in '197120' bytes
or 192.50 KiB). While perhaps technically correct, it's not necessarily
what the user expected (certainly virt-test didn't expect it).
This patch restores the code to not update the target capacity for this path
gnutls-3.3.0 and newer leaves 2 FDs open in order to be backwards
compatible when it comes to chrooted binaries [1]. Linking
commandhelper with gnutls then leaves these two FDs open and
commandtest fails thanks to that. This patch does not link
commandhelper with libvirt.la, but rather only the utilities making
the test pass.
Ján Tomko [Fri, 2 May 2014 07:37:34 +0000 (09:37 +0200)]
fix build with older gcc
Older gcc (4.1.2-55.el5, 4.2.1 on FreeBSD) reports bogus warnings:
../../src/conf/nwfilter_conf.c:2111: warning: 'protocol' may be used
uninitialized in this function
../../src/conf/nwfilter_conf.c:2110: warning: 'dataProtocolID' may be
used uninitialized in this function
Initialize them to NULL to make the compiler happy.
Eric Blake [Thu, 1 May 2014 02:17:42 +0000 (20:17 -0600)]
storage: reject negative indices
Commit f22b7899 stumbled across a difference between 32-bit and
64-bit platforms when parsing "-1" as an int. Now that we've
fixed that difference, it's time to fix the testsuite.
* src/util/virstoragefile.c (virStorageFileParseChainIndex):
Require a positive index.
Eric Blake [Thu, 1 May 2014 02:11:09 +0000 (20:11 -0600)]
util: new stricter unsigned int parsing
strtoul() is required to parse negative numbers as their
twos-complement positive counterpart. But sometimes we want
to reject negative numbers. Add new functions to do this.
The 'p' suffix is a mnemonic for 'positive' (technically it
also parses 0, but 'non-negative' doesn't lend itself to a
nice one-letter suffix).
* src/util/virstring.h (virStrToLong_uip, virStrToLong_ulp)
(virStrToLong_ullp): New prototypes.
* src/util/virstring.c (virStrToLong_uip, virStrToLong_ulp)
(virStrToLong_ullp): New functions.
* src/libvirt_private.syms (virstring.h): Export them.
* tests/virstringtest.c (testStringToLong): Test them.
Eric Blake [Wed, 30 Apr 2014 20:46:18 +0000 (14:46 -0600)]
util: fix uint parsing on 64-bit platforms
Commit f22b7899 called to light a long-standing latent bug: the
behavior of virStrToLong_ui was different on 32-bit platforms
than on 64-bit platforms. Curse you, C type promotion and
narrowing rules, and strtoul specification. POSIX says that for
a 32-bit long, strtol handles only 2^32 values [LONG_MIN to
LONG_MAX] while strtoul handles 2^33 - 1 values [-ULONG_MAX to
ULONG_MAX] with twos-complement wraparound for negatives. Thus,
parsing -1 as unsigned long produces ULONG_MAX, rather than a
range error. We WANT[1] this same shortcut for turning -1 into
UINT_MAX when parsing to int; and get it for free with 32-bit
long. But with 64-bit long, ULONG_MAX is outside the range
of int and we were rejecting it as invalid; meanwhile, we were
silently treating -18446744073709551615 as 1 even though it
textually exceeds INT_MIN. Too bad there's not a strtoui() in
libc that does guaranteed parsing to int, regardless of the size
of long.
The bug has been latent since 2007, introduced by Jim Meyering
in commit 5d25419 in the attempt to eradicate unsafe use of
strto[u]l when parsing ints and longs. How embarrassing that we
are only discovering it now - so I'm adding a testsuite to ensure
that it covers all the corner cases we care about.
[1] Ideally, we really want the caller to be able to choose whether
to allow negative numbers to wrap around to their 2s-complement
counterpart, as in strtoul, or to force a stricter input range
of [0 to UINT_MAX] by rejecting negative signs; this will be added
in a later patch for all three int types.
This patch is tested on both 32- and 64-bit; the enhanced
virstringtest passes on both platforms, while virstoragetest now
reliably fails on both platforms instead of just 32-bit platforms.
That test will be fixed later.
* src/util/virstring.c (virStrToLong_ui): Ensure same behavior
regardless of platform long size.
* tests/virstringtest.c (testStringToLong): New function.
(mymain): Comprehensively test string to long parsing.
A couple of places in the QEMU XML -> ARGV conversion code
raised an error but then forgot to return an error status
due to missing gotos. While fixing this also tweak style
of a couple of other error reports
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Laine Stump [Thu, 1 May 2014 08:40:41 +0000 (11:40 +0300)]
qemu: fix crash when removing <filterref> from interface with update-device
If a domain network interface that contains a <filterref> is modified
"live" using "virsh update-device --live", libvirtd would crash. This
was because the code supporting live update of an interface's
filterref was assuming that a filterref might be added or modified,
but didn't account for removing the filterref, resulting in a null
dereference of the filter name.
Introduced with commit 258fb278, which was first in libvirt v1.0.1.
This addresses https://bugzilla.redhat.com/show_bug.cgi?id=1093301
Peter Krempa [Sat, 26 Apr 2014 06:27:58 +0000 (08:27 +0200)]
storage: Clear all data allocated about backing store before reparsing
To avoid memory leak of the "backingStoreRaw" field when reparsing
backing chains a new function is being introduced by this patch that
shall be used to clear backing store information.
Well, libvirt doesn't distinguish between domain poweroff and
hibernation (S4). It's hard to differentiate these two on a real
machine anyway. As a result, any device that is hot(un-)plugged is
lost (appears again) when domain is started again as from our POV
it is a fresh cold boot. Instead of doing anything wise here, we
should just document this as known limitation.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Stefan Berger [Wed, 30 Apr 2014 15:41:18 +0000 (11:41 -0400)]
nwfilter: Validate rule after parsing
An IP or IPv6 rule with port specification but without protocol
specification cannot be instantiated by ebtables. The documentation
points to 'protocol' being required but implementation does not
enforce it to be given.
Implement a rule validation function that checks whether the rule is
valid when it is defined. This for example prevents the definition
of rules like:
<ip dstportstart='53'>
where a protocol attribute would be required for it to be valid and for
ebtables to be able to instantiate it. A valid rule then is:
<ip protocol='udp' dstportstart='53'>
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Disable libvirtd by default when building on Win32
We don't support building libvirtd on Win32 since we lack the
fork/exec feature needed for the stateful drivers. Disable this
by default, so users can just do 'mingw32-configure' with no
special args required.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
SO_REUSEADDR on Windows is actually akin to SO_REUSEPORT
on Linux/BSD. ie it allows 2 apps to listen to the same
port at once. Thus we must not set it on Win32 platforms
See http://msdn.microsoft.com/en-us/library/windows/desktop/ms740621.aspx
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When EIO comes to qemu while it's replying to
qemuMigrationUpdateJobStatus(), qemu blocks, the migration of RAM can
complete in the meantime, and when qemu unblocks, it sends us
BLOCK_IO_ERROR plus migrations "status": "complete". Even though we
act upon the BLOCK_IO_ERROR by setting the proper state of the domain,
the call still waits for the proper reply on monitor for query_migrate
and after it gets it, it checks that migration is completed and the
migration is finished. This is what abort_on_error flag was meant for
(we can migrate with these errors, but this flag must inhibit such
behaviour). Changing the order of the steps guarantees the flag works
properly.
Steven McDonald [Tue, 29 Apr 2014 02:19:01 +0000 (12:19 +1000)]
storage_backend_rbd: Correct argument order to rbd_create3
The stripe_unit and stripe_count arguments are passed to rbd_create3 in
the wrong order, resulting in a stripe size of 1 byte with 4194304
stripes on newly created RBD volumes.
https://bugzilla.redhat.com/show_bug.cgi?id=1092208 Signed-off-by: Steven McDonald <steven.mcdonald@anchor.net.au>
Eric Blake [Thu, 24 Apr 2014 21:48:55 +0000 (15:48 -0600)]
storage: use virDirRead API
More instances of failure to report (unlikely) readdir errors.
In one case, I chose to ignore them, given that a readdir error
would be no different than timing out on the loop, where the
fallback path behaves correctly either way.
Eric Blake [Fri, 25 Apr 2014 20:45:49 +0000 (14:45 -0600)]
util: use virDirRead API
In making the conversion to the new API, I fixed a couple bugs:
virSCSIDeviceGetSgName would leak memory if a directory
unexpectedly contained multiple entries;
virNetDevTapGetRealDeviceName could report a spurious error
from a stale errno inherited before starting the readdir search.
The decision on whether to store the result of virDirRead into
a variable is based on whether the end of the loop falls through
to cleanup code automatically. In some cases, we have loops that
are documented to return NULL on failure, and which raise an
error on most failure paths but not in the case where the directory
was unexpectedly empty; it may be worth a followup patch to
explicitly report an error if readdir was successful but the
directory was empty, so that a NULL return always has an error set.
* src/util/vircgroup.c (virCgroupRemoveRecursively): Use new
interface.
(virCgroupKillRecursiveInternal, virCgroupSetOwner): Report
readdir failures.
* src/util/virfile.c (virFileLoopDeviceOpenSearch)
(virFileNBDDeviceFindUnused, virFileDeleteTree): Use new
interface.
* src/util/virnetdevtap.c (virNetDevTapGetRealDeviceName):
Properly check readdir errors.
* src/util/virpci.c (virPCIDeviceIterDevices)
(virPCIDeviceFileIterate, virPCIGetNetName): Report readdir
failures.
(virPCIDeviceAddressIOMMUGroupIterate): Use new interface.
* src/util/virscsi.c (virSCSIDeviceGetSgName): Report readdir
failures, and avoid memory leak.
(virSCSIDeviceGetDevName): Report readdir failures.
* src/util/virusb.c (virUSBDeviceSearch): Report readdir
failures.
* src/util/virutil.c (virGetFCHostNameByWWN)
(virFindFCHostCapableVport): Report readdir failures.
Natanael Copa [Sun, 20 Apr 2014 11:53:45 +0000 (13:53 +0200)]
util: introduce virDirRead wrapper for readdir
Introduce a wrapper for readdir. This helps us make sure that we always
set errno before calling readdir and it will make sure errors are
properly logged.
Signed-off-by: Natanael Copa <ncopa@alpinelinux.org> Signed-off-by: Eric Blake <eblake@redhat.com>
Remove bogus ATTRIBUTE_NONNULL from virFirewallAddRuleFull
The virFirewallAddRuleFull method originally had a single
compulsory virFirewallQueryCallback parameter. During dev
work though the ignoreErrors parameter was added and the
callback parameter made optional. The ATTRIBUTE_NONNULL
annotation was never removed though.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virnetsocket.c API is hardcoded to pass --timeout=30 to
any daemon it auto-starts. For inexplicable reasons the virtlockd
daemon did not implement the --timeout option, so it would
immediately exit on autostart with an error.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When a snapshot operation finishes we have to recheck the backing chain
of all disks involved in the snapshot. And we need to do that even if
the operation failed because some of the disks might have changed if
QEMU did not support transactions.
Laine Stump [Thu, 10 Apr 2014 11:44:07 +0000 (14:44 +0300)]
network: centralize check for active network during interface attach
The check for a network being active during interface attach was being
done individually in several places (by both the lxc driver and the
qemu driver), but those places were too specific, leading to it *not*
being checked when allocating a connection/device from a macvtap or
hostdev network.
This patch puts a single check in networkAllocateActualDevice(), which
is always called before the any network interface is attached to any
type of domain. It also removes all the other now-redundant checks
from the lxc and qemu drivers.
NB: the following patches are prerequisites for this patch, in the
case that it is backported to any branch:
440beeb network: fix virNetworkObjAssignDef and persistence 8aaa5b6 network: create statedir during driver initialization b9e9549 network: change location of network state xml files 411c548 network: set macvtap/hostdev networks active if their state
file exists
Laine Stump [Wed, 9 Apr 2014 14:16:45 +0000 (17:16 +0300)]
network: set macvtap/hostdev networks active if their state file exists
libvirt attempts to determine at startup time which networks are
already active, and set their active flags. Previously it has done
this by assuming that all networks are inactive, then setting the
active flag if the network has a bridge device associated with it and
that bridge device exists. This is not useful for macvtap and hostdev
based networks, since they do not use a bridge device.
Of course the reason that such a check had to be done was that the
presence of a status file in the network "stateDir" couldn't be
trusted as an indicator of whether or not a network was active. This
was due to the network driver mistakenly using
/var/lib/libvirt/network to store the status files, rather than
/var/run/libvirt/network (similar to what is done by every other
libvirt driver that stores status xml for its objects). The difference
is that /var/run is cleared out when the host reboots, so you can be
assured that the state file you are seeing isn't just left over from a
previous boot of the host.
Now that the network driver has been switched to using
/var/run/libvirt/network for status, we can also modify it to assume
that any network with an existing status file is by definition active
- we do this when reading the status file. To fine tune the results,
networkFindActiveConfigs() is changed to networkUpdateAllState(),
and only sets active = 0 if the conditions for particular network
types are *not* met.
The result is that during the first run of libvirtd after the host
boots, there are no status files, so no networks are active. Any time
libvirtd is restarted, any network with a status file will be marked
as active (unless the network uses a bridge device and that device for
some reason doesn't exist).
Laine Stump [Fri, 4 Apr 2014 13:48:54 +0000 (16:48 +0300)]
network: change location of network state xml files
For some reason these have been stored in /var/lib, although other
drivers (e.g. qemu and lxc) store their state files in /var/run.
It's much nicer to store state files in /var/run because it is
automatically cleared out when the system reboots. We can then use
existence of the state file as a convenient indicator of whether or
not a particular network is active.
Since changing the location of the state files by itself will cause
problems in the case of a *live* upgrade from an older libvirt that
uses /var/lib (because current status of active networks will be
lost), the network driver initialization has been modified to migrate
any network state files from /var/lib to /var/run.
This will not help those trying to *downgrade*, but in practice this
will only be problematic in two cases
1) If there are networks with network-wide bandwidth limits configured
*and in use* by a guest during a downgrade to "old" libvirt. In this
case, the class ID's used for that network's tc rules, as well as
the currently in-use bandwidth "floor" will be forgotten.
2) If someone does this: 1) upgrade libvirt, 2) downgrade libvirt, 3)
modify running state of network (e.g. add a static dhcp host, etc),
4) upgrade. In this case, the modifications to the running network
will be lost (but not any persistent changes to the network's
config).
Laine Stump [Fri, 4 Apr 2014 11:21:13 +0000 (14:21 +0300)]
network: create statedir during driver initialization
This directory should be created when the network driver is first
started up, not just when a dhcp daemon is run. This hasn't posed a
problem in the past, because the directory has always been
pre-existing.
Laine Stump [Tue, 22 Apr 2014 13:48:54 +0000 (16:48 +0300)]
network: fix virNetworkObjAssignDef and persistence
Experimentation showed that if virNetworkCreateXML() was called for a
network that was already defined, and then the network was
subsequently shutdown, the network would continue to be persistent
after the shutdown (expected/desired), but the original config would
be lost in favor of the transient config sent in with
virNetworkCreateXML() (which would then be the new persistent config)
(obviously unexpected/not desired).
To fix this, virNetworkObjAssignDef() has been changed to
1) properly save/free network->def and network->newDef for all the
various combinations of live/active/persistent, including some
combinations that were previously considered to be an error but didn't
need to be (e.g. setting a "live" config for a network that isn't yet
active but soon will be - that was previously considered an error,
even though in practice it can be very useful).
2) automatically set the persistent flag whenever a new non-live
config is assigned to the network (and clear it when the non-live
config is set to NULL). the libvirt network driver no longer directly
manipulates network->persistent, but instead relies entirely on
virNetworkObjAssignDef() to do the right thing automatically.
After this patch, the following sequence will behave as expected:
virNetworkDefineXML(X)
virNetworkCreateXML(X') (same name but some config different)
virNetworkDestroy(X)
At the end of these calls, the network config will remain as it was
after the initial virNetworkDefine(), whereas previously it would take
on the changes given during virNetworkCreateXML().
Another effect of this tighter coupling between a) setting a !live def
and b) setting/clearing the "persistent" flag, is that future patches
which change the details of network lifecycle management
(e.g. upcoming patches to fix detection of "active" networks when
libvirtd is restarted) will find it much more difficult to break
persistence functionality.
Ian Campbell [Fri, 25 Apr 2014 15:54:20 +0000 (16:54 +0100)]
libxl: Support PV consoles
Currently the driver only exposes the ability to connect to the serial console
of a Xen guest, which doesn't work for a PV guest. Since for an HVM guest the
serial devices are duplicated as consoles it is sufficient to just use the
console devices unconditionally.
Add a test suite for nwfilter ebiptables tech driver
Create a nwfilterxml2firewalltest to exercise the
ebiptables_driver.applyNewRules method with a variety of
different XML input files. The XML input files are taken
from the libvirt-tck nwfilter tests. While the nwfilter
tests verify the final state of the iptables chains, this
test verifies the set of commands invoked to create the
chains.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Remove last trace of direct firewall command exection
Remove all the left over code related to the direct invocation
of firewall-cmd/iptables/ip6tables/ebtables. This is all handled
by the virFirewallPtr APIs now.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert nwfilter ebiptablesApplyNewRules to virFirewall
Convert the nwfilter ebtablesApplyNewRules method to use the
virFirewall object APIs instead of creating shell scripts
using virBuffer APIs. This provides a performance improvement
through allowing direct use of firewalld dbus APIs and will
facilitate automated testing.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert nwfilter ebtablesApplyDropAllRules to virFirewall
Convert the nwfilter ebtablesApplyDropAllRules method to use the
virFirewall object APIs instead of creating shell scripts
using virBuffer APIs. This provides a performance improvement
through allowing direct use of firewalld dbus APIs and will
facilitate automated testing.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert nwfilter ebtablesApplyDHCPOnlyRules to virFirewall
Convert the nwfilter ebtablesApplyDHCPOnlyRules method to use the
virFirewall object APIs instead of creating shell scripts
using virBuffer APIs. This provides a performance improvement
through allowing direct use of firewalld dbus APIs and will
facilitate automated testing.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert nwfilter ebtablesApplyBasicRules to virFirewall
Convert the nwfilter ebtablesApplyBasicRules method to use the
virFirewall object APIs instead of creating shell scripts
using virBuffer APIs. This provides a performance improvement
through allowing direct use of firewalld dbus APIs and will
facilitate automated testing.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert nwfilter ebiptablesTearNewRules to virFirewall
Convert the nwfilter ebiptablesTearNewRules method to use the
virFirewall object APIs instead of creating shell scripts
using virBuffer APIs. This provides a performance improvement
through allowing direct use of firewalld dbus APIs and will
facilitate automated testing.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert nwfilter ebtablesRemoveBasicRules to virFirewall
Convert the nwfilter ebtablesRemoveBasicRules method to use the
virFirewall object APIs instead of creating shell scripts
using virBuffer APIs. This provides a performance improvement
through allowing direct use of firewalld dbus APIs and will
facilitate automated testing.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert nwfilter ebiptablesTearOldRules to virFirewall
Convert the nwfilter ebiptablesTearOldRules method to use the
virFirewall object APIs instead of creating shell scripts
using virBuffer APIs. This provides a performance improvement
through allowing direct use of firewalld dbus APIs and will
facilitate automated testing.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert nwfilter ebiptablesAllTeardown to virFirewall
Convert the nwfilter ebiptablesAllTeardown method to use the
virFirewall object APIs instead of creating shell scripts
using virBuffer APIs. This provides a performance improvement
through allowing direct use of firewalld dbus APIs and will
facilitate automated testing.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Replace virNetworkObjPtr with virNetworkDefPtr in network platform APIs
The networkCheckRouteCollision, networkAddFirewallRules and
networkRemoveFirewallRules APIs all take a virNetworkObjPtr
instance, but only ever access the 'def' member. It thus
simplifies testing if the APIs are changed to just take a
virNetworkDefPtr instead
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert bridge driver over to use new firewall APIs
Update the iptablesXXXX methods so that instead of directly
executing iptables commands, they populate rules in an
instance of virFirewallPtr. The bridge driver can thus
construct the ruleset and then invoke it in one operation
having rollback handled automatically.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Introduce an object for managing firewall rulesets
The network and nwfilter drivers both have a need to update
firewall rules. The currently share no code for interacting
with iptables / firewalld. The nwfilter driver is fairly
tied to the concept of creating shell scripts to execute
which makes it very hard to port to talk to firewalld via
DBus APIs.
This patch introduces a virFirewallPtr object which is able
to represent a complete sequence of rule changes, with the
ability to have multiple transactional checkpoints with
rollbacks. By formally separating the definition of the rules
to be applied from the mechanism used to apply them, it is
also possible to write a firewall engine that uses firewalld
DBus APIs natively instead of via the slow firewalld-cmd.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When a VM fails to launch due to error creating nwfilter
rules, we must avoid overwriting the original error when
tearing down the partially created rules.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Remove two-stage construction of commands in nwfilter
The nwfilter ebiptables driver will build up commands to run in
two phases. The first phase contains all of the command, except
for the '-A' part. Instead it has a '%c' placeholder, along with
a '%s' placeholder for a position arg. The second phase than
substitutes these placeholders. The only values ever used for
these substitutions though is '-A' and '', so it is entirely
pointless. Remove the second phase entirely, since it will make
it harder to convert to the new firewall APIs
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Merge nwfilter createRuleInstance driver into applyNewRules
The current nwfilter tech driver API has a 'createRuleInstance' method
which populates virNWFilterRuleInstPtr with a command line string
containing variable placeholders. The 'applyNewRules' method then
expands the variables and executes the commands. This split of
responsibility won't work when switching to the virFirewallPtr
APIs, since we can't just build up command line strings. This patch
this merges the functionality of 'createRuleInstance' into the
applyNewRules method.
The virNWFilterRuleInstPtr struct is changed from holding an array
of opaque pointers, into holding generic metadata about the rules
to be processed. In essence this is the result of taking a linked
set of virNWFilterDefPtr's and flattening the tree to get a list
of virNWFilterRuleDefPtr's. At the same time we must keep track of
any nested virNWFilterObjPtr instances, so that the locks are held
for the duration of the 'applyNewRules' method.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Push virNWFilterRuleInstPtr out of (eb|ip)tablesCreateRuleInstance
Later refactoring will change use of the virNWFilterRuleInstPtr struct.
Prepare for this by pushing use of the virNWFilterRuleInstPtr parameter
out of the ebtablesCreateRuleInstance and iptablesCreateRuleInstance
methods. Instead they simply string(s) with the constructed rule data.
The ebiptablesCreateRuleInstance method will make use of the
virNWFilterRuleInstPtr struct instead.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>