git.ipfire.org Git - thirdparty/libvirt.git/log

qemu: Let empty default VNC password work as documented

CVE-2016-5008

Setting an empty graphics password is documented as a way to disable
VNC/SPICE access, but QEMU does not always behave like that. VNC would
happily accept the empty password. Let's enforce the behavior by setting
password expiration to "now".

https://bugzilla.redhat.com/show_bug.cgi?id=1180092

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
(cherry picked from commit bb848feec0f3f10e92dd8e5231ae7aa89b5598f3)
(cherry picked from commit d933f68ee660566b52cd90330aee0d5f414636a4)
(cherry picked from commit 139a4265774b7aa194f8479a82188bc1337cd7a4)

domain_conf: fix domain deadlock

If you use the public API virConnectListAllDomains() with the second
parameter set to NULL to get only the number of domains, you will lock
out all other operations with domains.

Introduced by commit 2c680804.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
(cherry picked from commit fc22b2e74890873848b43fffae43025d22053669)

CVE-2014-3633: qemu: blkiotune: Use correct definition when looking up disk

The live definition was used to look up the disk index while the
persistent one was indexed, leading to a crash in
qemuDomainGetBlockIoTune. Use the correct def and report a nice error.

Unfortunately it's accessible via read-only connection, though it can
only crash libvirtd in the cases where the guest is hot-plugging disks
without reflecting those changes to the persistent definition. So
avoiding hotplug, or doing hotplug where persistent is always modified
alongside live definition, will avoid the out-of-bounds access.

Introduced in: eca96694a7f992be633d48d5ca03cedc9bbc3c9aa (v0.9.8)
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1140724
Reported-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
(cherry picked from commit 3e745e8f775dfe6f64f18b5c2fe4791b35d3546b)

Conflicts:
src/qemu/qemu_driver.c - context due to fewer functions

LSN-2014-0003: Don't expand entities when parsing XML

If the XML_PARSE_NOENT flag is passed to libxml2, then any
entities in the input document will be fully expanded. This
allows the user to read arbitrary files on the host machine
by creating an entity pointing to a local file. Removing
the XML_PARSE_NOENT flag means that any entities are left
unchanged by the parser, or expanded to "" by the XPath
APIs.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit d6b27d3e4c40946efa79e91d134616b41b1666c4)

build: fix 'make check' with newer git

Newer git doesn't like the maint.mk rule 'public-submodule-commit'
run during 'make check', as inherited from our checkout of gnulib.
I tracked down that libvirt commit 8531301 picked up a gnulib fix
that makes git happy. Rather than try and do a full .gnulib
submodule update to gnulib.git d18d1b802 (as used in that libvirt
commit), it was easier to just backport the fixed maint.mk from
gnulib on top of our existing submodule level. I did it as follows,
where these steps will have to be repeated when cherry-picking this
commit to any other maintenance branch:

mkdir -p gnulib/local/top
cd .gnulib
git checkout d18d1b802 top/maint.mk
git diff HEAD > ../gnulib/local/top/maint.mk.diff
git reset --hard
cd ..
git add gnulib/local/top

Signed-off-by: Eric Blake <eblake@redhat.com>

docs: publish correct enum values

We publish libvirt-api.xml for others to use, and in fact, the
libvirt-python bindings use it to generate python constants that
correspond to our enum values. However, we had an off-by-one bug
that any enum that relied on C's rules for implicit initialization
of the first enum member to 0 got listed in the xml as having a
value of 1 (and all later members of the enum were equally
botched).

The fix is simple - since we add one to the previous value when
encountering an enum without an initializer, the previous value
must start at -1 so that the first enum member is assigned 0.
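
The rule can be illustrated with a minimal Python sketch. This is not
the code from docs/apibuild.py; the function name assign_enum_values,
the (name, initializer) input shape and the EXAMPLE_* member names are
assumptions made purely for illustration.

```python
def assign_enum_values(members, start=-1):
    """Assign C-style values to enum members.

    members: list of (name, explicit_value_or_None) pairs.
    start:   the "previous value" seen before the first member.
             -1 matches C; 0 reproduces the off-by-one.
    """
    values = {}
    previous = start
    for name, explicit in members:
        # A member without an initializer gets previous + 1.
        value = explicit if explicit is not None else previous + 1
        values[name] = value
        previous = value
    return values


members = [
    ("EXAMPLE_A", None),   # implicit first member, must become 0
    ("EXAMPLE_B", None),
    ("EXAMPLE_C", 7),      # explicit initializer resets the sequence
    ("EXAMPLE_D", None),
]
fixed = assign_enum_values(members)           # previous starts at -1
buggy = assign_enum_values(members, start=0)  # previous starts at 0
```

With start=0 every implicitly numbered member before the first explicit
initializer is shifted by one, which is why enums with an explicit = 0
on their first member were immune.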

The python generator code has had the off-by-one ever since DV
first wrote it years ago, but most of our public enums were immune
because they had an explicit = 0 initializer. The only affected
enums are:
- virDomainEventGraphicsAddressType (such as
VIR_DOMAIN_EVENT_GRAPHICS_ADDRESS_IPV4), since commit 987e31e
(libvirt v0.8.0)
- virDomainCoreDumpFormat (such as VIR_DOMAIN_CORE_DUMP_FORMAT_RAW),
since commit 9fbaff0 (libvirt v1.2.3)
- virIPAddrType (such as VIR_IP_ADDR_TYPE_IPV4), since commit
03e0e79 (not yet released)

Thanks to Nehal J Wani for reporting the problem on IRC, and
for helping me zero in on the culprit function.

* docs/apibuild.py (CParser.parseEnumBlock): Fix implicit enum
values.

Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 9b291bbe20c36c0820c6e7cd2bf6229bf41807e8)

Conflicts:
docs/apibuild.py - context with 2a40951

virNetClientSetTLSSession: Restore original signal mask

Currently, we use pthread_sigmask(SIG_BLOCK, ...) prior to calling
poll(). This is okay, as we don't want poll() to be interrupted.
However, then - immediately as we fall out from the poll() - we try to
restore the original sigmask - again using SIG_BLOCK. But as the man
page says, SIG_BLOCK adds signals to the signal mask:

SIG_BLOCK
The set of blocked signals is the union of the current set and the set argument.

Therefore, when restoring the original mask, we need to completely
overwrite the one we set earlier and hence we should be using:

SIG_SETMASK
The set of blocked signals is set to the argument set.
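
The difference between the two modes can be shown with a small Python
sketch. Python's signal.pthread_sigmask() wraps the same call on Unix.
The choice of SIGUSR1/SIGUSR2 and the helper name current_mask are
assumptions for illustration; this is not the code in
virNetClientSetTLSSession.

```python
import signal


def current_mask():
    # Blocking an empty set changes nothing and returns the current mask.
    return signal.pthread_sigmask(signal.SIG_BLOCK, set())


original = current_mask()

# Block two signals around a (notional) poll(); the call returns the old mask.
old = signal.pthread_sigmask(signal.SIG_BLOCK,
                             {signal.SIGUSR1, signal.SIGUSR2})

# Buggy restore: SIG_BLOCK only adds to the mask, so both signals stay blocked.
signal.pthread_sigmask(signal.SIG_BLOCK, old)
after_buggy_restore = current_mask()

# Correct restore: SIG_SETMASK overwrites the mask with the saved one.
signal.pthread_sigmask(signal.SIG_SETMASK, old)
after_correct_restore = current_mask()
```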

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
(cherry picked from commit 3d4b4f5ac634c123af1981084add29d3a2ca6ab0)

Really don't crash if a connection closes early

https://bugzilla.redhat.com/show_bug.cgi?id=1047577

When writing commit 173c291, I missed the fact that virNetServerClientClose
unlocks the client object before actually clearing client->sock and thus
it is possible to hit a window when client->keepalive is NULL while
client->sock is not NULL. I was thinking client->sock == NULL was a
better check for a closed connection but apparently we have to go with
client->keepalive == NULL to actually fix the crash.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
(cherry picked from commit 066c8ef6c18bc1faf8b3e10787b39796a7a06cc0)

Don't crash if a connection closes early

https://bugzilla.redhat.com/show_bug.cgi?id=1047577

When a client closes its connection to libvirtd early during
virConnectOpen, more specifically just after making
REMOTE_PROC_CONNECT_SUPPORTS_FEATURE call to check if
VIR_DRV_FEATURE_PROGRAM_KEEPALIVE is supported without even waiting for
the result, libvirtd may crash due to a race in keep-alive
initialization. Once receiving the REMOTE_PROC_CONNECT_SUPPORTS_FEATURE
call, the daemon's event loop delegates it to a worker thread. In case
the event loop detects EOF on the connection and calls
virNetServerClientClose before the worker thread starts to handle
REMOTE_PROC_CONNECT_SUPPORTS_FEATURE call, client->keepalive will be
disposed by the time virNetServerClientStartKeepAlive gets called from
remoteDispatchConnectSupportsFeature. Because the flow is common for
both authenticated and read-only connections, even unprivileged clients
may cause the daemon to crash.

To avoid the crash, virNetServerClientStartKeepAlive needs to check if
the connection is still open before starting keep-alive protocol.

Every libvirt release since 0.9.8 is affected by this bug.

(cherry picked from commit 173c2914734eb5c32df6d35a82bf503e12261bcf)

Conflicts:
src/rpc/virnetserverclient.c - older locking style

qemu: Fix job usage in virDomainGetBlockIoTune

CVE-2013-6458

Every API that is going to begin a job should do that before fetching
data from vm->def.

(cherry picked from commit 3b56425938e2f97208d5918263efa0d6439e4ecd)

Conflicts:
src/qemu/qemu_driver.c - older BeginJobWithDriver

qemu: Fix job usage in qemuDomainBlockJobImpl

CVE-2013-6458

Every API that is going to begin a job should do that before fetching
data from vm->def.

(cherry picked from commit f93d2caa070f6197ab50d372d286018b0ba6bbd8)

Conflicts:
src/qemu/qemu_driver.c - older style BeginJobWithDriver, context

qemu: Avoid using stale data in virDomainGetBlockInfo

CVE-2013-6458

Generally, every API that is going to begin a job should do that before
fetching data from vm->def. However, qemuDomainGetBlockInfo does not
know whether it will have to start a job or not before checking vm->def.
To avoid using disk alias that might have been freed while we were
waiting for a job, we use its copy. In case the disk was removed in the
meantime, we will fail with "cannot find statistics for device '...'"
error message.

(cherry picked from commit b799259583bd65c0b2f5042e6c3ff19637ade881)

Conflicts:
src/qemu/qemu_driver.c - VIR_STRDUP not backported, context

qemu: Do not access stale data in virDomainBlockStats

CVE-2013-6458
https://bugzilla.redhat.com/show_bug.cgi?id=1043069

When virDomainDetachDeviceFlags is called concurrently with
virDomainBlockStats, libvirtd may crash because qemuDomainBlockStats
finds a disk in vm->def before getting a job on a domain and uses the
disk pointer after getting the job. However, the domain is unlocked
while waiting on a job condition and thus data behind the disk pointer
may disappear. This happens when thread 1 runs
virDomainDetachDeviceFlags and enters monitor to actually remove the
disk. Then another thread starts running virDomainBlockStats, finds the
disk in vm->def, and while it's waiting on the job condition (owned by
the first thread), the first thread finishes the disk removal. When the
second thread gets the job, the memory pointed to by the disk pointer is
already gone.

That said, every API that is going to begin a job should do that before
fetching data from vm->def.
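
The ordering problem can be sketched in Python with a deterministic toy
model. Everything here (the Domain class, begin_job, the while_waiting
callback standing in for the other thread) is hypothetical and is not
libvirt's data model. In C the stale pointer refers to freed memory;
Python keeps the object alive, so the sketch only shows that the early
lookup returns data that no longer belongs to the domain.

```python
class Domain:
    """Toy stand-in for a domain object."""

    def __init__(self, disks):
        self.disks = dict(disks)  # plays the role of vm->def

    def begin_job(self, while_waiting=None):
        # Waiting for a job drops the domain lock, so another thread may
        # modify self.disks here. The callback simulates that thread.
        if while_waiting is not None:
            while_waiting(self)


def detach_vda(domain):
    # Simulates the thread that finishes removing the disk.
    domain.disks.pop("vda", None)


def stats_lookup_first(domain, name):
    disk = domain.disks.get(name)               # lookup before the job
    domain.begin_job(while_waiting=detach_vda)
    return disk                                 # stale: disk was detached


def stats_job_first(domain, name):
    domain.begin_job(while_waiting=detach_vda)
    return domain.disks.get(name)               # lookup after the job


stale = stats_lookup_first(Domain({"vda": {"rd_bytes": 10}}), "vda")
fresh = stats_job_first(Domain({"vda": {"rd_bytes": 10}}), "vda")
```

Looking up the disk only after the job has been acquired means the
caller sees that the disk is gone and can report an error instead of
touching stale data.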

(cherry picked from commit db86da5ca2109e4006c286a09b6c75bfe10676ad)

Conflicts:
src/qemu/qemu_driver.c - context: no ACLs

build: use proper pod for nested bulleted VIRSH_DEBUG list

Newer pod (hello rawhide) complains if you attempt to mix bullets
and non-bullets in the same list:

virsh.pod around line 3177: Expected text after =item, not a bullet

As our intent was to nest an inner list, we make that explicit to
keep pod happy.

* tools/virsh.pod (ENVIRONMENT): Use correct pod syntax.

(cherry picked from commit 00d69b4af1215695ebfc8f1dbfa77804c2b293fd)

libxl: fix build with Xen4.3

Xen 4.3 fixes a mistake in the libxl event handler signature where the
event owned by the application was defined as const. Detect this and
define the libvirt libxl event handler signature appropriately.
(cherry picked from commit 43b0ff5b1eb7fbcdc348b2a53088a7db939d5e48)

Conflicts:
src/libxl/libxl_driver.c - context in formatting

Disable nwfilter driver when running unprivileged

When opening a new connection to the driver, nwfilterOpen
only succeeds if the driverState has been allocated.

Move the privilege check in driver initialization before
the state allocation to disable the driver.

This changes the nwfilter-define error from:
error: cannot create config directory (null): Bad address
To:
this function is not supported by the connection driver:
virNWFilterDefineXML

https://bugzilla.redhat.com/show_bug.cgi?id=1029266
(cherry picked from commit b7829f959b33c6e32422222a9ed745c0da7dc696)

Conflicts:
src/nwfilter/nwfilter_driver.c
(da77f04 Convert HAVE_DBUS to WITH_DBUS not backported)

remote: fix regression in event deregistration

Introduced by 7b87a3.
When I quit a process that had only registered VIR_DOMAIN_EVENT_ID_REBOOT,
I got an error like:
"libvirt: XML-RPC error : internal error: domain event 0 not registered".
Adding the following code fixed it.

Signed-off-by: Zhou Yimin <zhouyimin@huawei.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 9712c2510ec87a87578576a407768380e250a6a4)

Prep for release 0.10.2.8

virsh: fix change-media bug on disk block type

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=923053
When the cdrom is a block-type disk, virsh change-media failed to insert
the source info because virsh used "<source block='/dev/sdb'/>" while
the correct name of the attribute for block disks is "dev".

(cherry picked from commit 7729a16814d5bf3aebd248c9af00296ae2773818)

libvirt: lxc: don't mkdir when selinux is disabled

libvirt lxc will fail to start when selinux is disabled.
error: Failed to start domain noroot
error: internal error guest failed to start: PATH=/bin:/sbin TERM=linux container=lxc-libvirt container_uuid=b9873916-3516-c199-8112-1592ff694a9e LIBVIRT_LXC_UUID=b9873916-3516-c199-8112-1592ff694a9e LIBVIRT_LXC_NAME=noroot /bin/sh
2013-01-09 11:04:05.384+0000: 1: info : libvirt version: 1.0.1
2013-01-09 11:04:05.384+0000: 1: error : lxcContainerMountBasicFS:546 : Failed to mkdir /sys/fs/selinux: No such file or directory
2013-01-09 11:04:05.384+0000: 7536: info : libvirt version: 1.0.1
2013-01-09 11:04:05.384+0000: 7536: error : virLXCControllerRun:1466 : error receiving signal from container: Input/output error
2013-01-09 11:04:05.404+0000: 7536: error : virCommandWait:2287 : internal error Child process (ip link del veth1) unexpected exit status 1: Cannot find device "veth1"

Fix this problem by checking whether selinuxfs is mounted on the
host before we try to create the directory /sys/fs/selinux.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
(cherry picked from commit 8d63af22de880d34aa94bcaa7ed95a8eac856ac6)

Fix crash in remoteDispatchDomainMemoryStats (CVE-2013-4296)

The 'stats' variable was not initialized to NULL, so if some
early validation of the RPC call fails, it is possible to jump
to the 'cleanup' label and VIR_FREE an uninitialized pointer.
This is a security flaw, since the API can be called from a
readonly connection which can trigger the validation checks.

This was introduced in release v0.9.1 onwards by

  commit 158ba8730e44b7dd07a21ab90499996c5dec080a
  Author: Daniel P. Berrange <berrange@redhat.com>
  Date:   Wed Apr 13 16:21:35 2011 +0100

    Merge all returns paths from dispatcher into single path

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit e7f400a110e2e3673b96518170bfea0855dd82c0)

Conflicts:
daemon/remote.c - context

Add support for using 3-arg pkcheck syntax for process (CVE-2013-4311)

With the existing pkcheck (pid, start time) tuple for identifying
the process, there is a race condition, where a process can make
a libvirt RPC call and in another thread exec a setuid application,
causing it to change to effective UID 0. This in turn causes polkit
to do its permission check based on the wrong UID.

To address this, libvirt must get the UID the caller had at time
of connect() (from SO_PEERCRED) and pass a (pid, start time, uid)
triple to the pkcheck program.
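
A rough Python sketch of the two pieces involved, assuming Linux and a
pkcheck whose --process option accepts the three-field form. The
function names peer_credentials and pkcheck_process_arg are made up for
illustration, and the start time is taken as a parameter rather than
looked up; this is not libvirt's implementation.

```python
import socket
import struct


def peer_credentials(sock):
    """Return (pid, uid, gid) of the peer of a Unix socket (Linux only).

    The kernel records these when the connection is set up, so a later
    exec of a setuid binary by the client does not change them.
    """
    fmt = "iII"  # struct ucred: pid_t pid, uid_t uid, gid_t gid
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED,
                          struct.calcsize(fmt))
    return struct.unpack(fmt, raw)


def pkcheck_process_arg(pid, start_time, uid):
    """Format the (pid, start time, uid) triple as one comma-separated value."""
    return f"{pid},{start_time},{uid}"
```

Passing the uid explicitly means polkit does not need to look up the
process's current (possibly already changed) effective UID.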

This fix requires that libvirt is re-built against a version of
polkit that has the fix for its CVE-2013-4288, so that libvirt
can see 'pkg-config --variable pkcheck_supports_uid polkit-gobject-1'

Signed-off-by: Colin Walters <walters@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 922b7fda77b094dbf022d625238262ea05335666)
Signed-off-by: Eric Blake <eblake@redhat.com>
Conflicts:
configure.ac - context
libvirt.spec.in - context of indentation
src/access/viraccessdriverpolkit.c - not present on this branch

Include process start time when doing polkit checks

Since PIDs can be reused, polkit prefers to be given
a (PID,start time) pair. If given a PID on its own,
it will attempt to lookup the start time in /proc/pid/stat,
though this is subject to races.

It is safer if the client app resolves the PID start
time itself, because as long as the app has the client
socket open, the client PID won't be reused.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 979e9c56a7aadf2dcfbddd1abfbad594b78b4468)
Signed-off-by: Eric Blake <eblake@redhat.com>
Conflicts:
src/libvirt_private.syms - not backported
src/locking/lock_daemon.c - not backported
src/rpc/virnetserverclient.c
src/rpc/virnetsocket.c
src/rpc/virnetsocket.h
src/util/viridentity.h - not backported
src/util/virprocess.c
src/util/virprocess.h
src/util/virstring.c
src/util/virstring.h

Most conflicts were contextual (this patch adds new functions,
but upstream intermediate patches not backported here also added
new features, and the resolution was picking out just the portions
needed by this commit). virnetsocket.c also had slightly
different locking semantics.

win32: Pretend that close-on-exec works

Currently virNetSocketNew fails because virSetCloseExec fails as there
is no proper implementation for it on Windows at the moment. Work around
this by pretending that setting close-on-exec on the fd works. This can
be done because libvirt currently lacks the ability to create child
processes on Windows anyway. So there is no point in failing to set a
flag that isn't useful at the moment anyway.

(cherry picked from commit fcfa4bfb1660cdb083aa61e366ddc8e91a53e3ed)

virDomainDefParseXML: set the argument of virBitmapFree to NULL after calling virBitmapFree

After freeing the bitmap, the pointer must be set to NULL.
This prevents any later use of the freed bitmap memory.

https://bugzilla.redhat.com/show_bug.cgi?id=1006710

Signed-off-by: Liuji (Jeremy) <jeremy.liu@huawei.com>
(cherry picked from commit ef5d51d491356f1f4287aa3a8b908b183b6dd9aa)

Conflicts:
src/conf/domain_conf.c

security: provide supplemental groups even when parsing label (CVE-2013-4291)

Commit 29fe5d7 (released in 1.1.1) introduced a latent problem
for any caller of virSecurityManagerSetProcessLabel where
the domain already had a uid:gid label to be parsed.  Such a
setup would collect the list of supplementary groups during
virSecurityManagerPreFork, but then ignores that information,
and thus fails to call setgroups() to adjust the supplementary
groups of the process.

Upstream does not use virSecurityManagerSetProcessLabel for
qemu (it uses virSecurityManagerSetChildProcessLabel instead),
so this problem remained latent until backporting the initial
commit into v0.10.2-maint (commit c061ff5, released in 0.10.2.7),
where virSecurityManagerSetChildProcessLabel has not been
backported.  As a result of using a different code path in the
backport, attempts to start a qemu domain that runs as qemu:qemu
will end up with supplementary groups unchanged from the libvirtd
parent process, rather than the desired supplementary groups of
the qemu user.  This can lead to failure to start a domain
(typical Fedora setup assigns user 107 'qemu' to both group 107
'qemu' and group 36 'kvm', so a disk image that is only readable
under kvm group rights is locked out).  Worse, it is a security
hole (the qemu process will inherit supplemental group rights
from the parent libvirtd process, which means it has access
rights to files owned by group 0 even when such files should
not normally be visible to user qemu).

LXC does not use the DAC security driver, so it is not vulnerable
at this time.  Still, it is better to plug the latent hole on
the master branch first, before cherry-picking it to the only
vulnerable branch v0.10.2-maint.

* src/security/security_dac.c (virSecurityDACGetIds): Always populate
groups and ngroups, rather than only when no label is parsed.

Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 745aa55fbf3e076c4288d5ec3239f5a5d43508a6)

virbitmap: Refactor virBitmapParse to avoid access beyond bounds of array

The virBitmapParse function was calling virBitmapIsSet(), which
requires the caller to check the bounds of the bitmap, without checking
them. This resulted in crashes when parsing a bitmap string that
exceeded the bounds passed as an argument.

This patch refactors the function to use virBitmapSetBit without
checking if the bit is set (this function does the checks internally)
and then counts the bits in the bitmap afterwards (instead of keeping
track while parsing the string).
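
The shape of the refactor can be sketched in Python. The Bitmap class,
parse_bitmap and the simplified "a-b,c" input syntax are assumptions for
illustration only; the real parser accepts a richer format.

```python
class Bitmap:
    """Toy fixed-size bitmap."""

    def __init__(self, size):
        self.size = size
        self.bits = [False] * size

    def set_bit(self, index):
        # The setter validates the index itself, so callers cannot overrun.
        if not 0 <= index < self.size:
            raise ValueError(f"bit {index} out of range 0..{self.size - 1}")
        self.bits[index] = True

    def count_bits(self):
        return sum(self.bits)


def parse_bitmap(text, size):
    """Parse a simplified 'a-b,c' list; return (bitmap, number of set bits)."""
    bitmap = Bitmap(size)
    for part in text.split(","):
        first, _, last = part.partition("-")
        start = int(first)
        end = int(last) if last else start
        if end < start:
            raise ValueError(f"invalid range '{part}'")
        for index in range(start, end + 1):
            bitmap.set_bit(index)
    # Count once at the end instead of tracking while parsing; duplicates
    # in the input are then handled without an is-set check.
    return bitmap, bitmap.count_bits()
```

For example, parse_bitmap("0-2,2,5", 8) sets bits 0, 1, 2 and 5, while
parse_bitmap("6-9", 8) raises ValueError instead of writing past the end.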

This patch also changes the "parse_error" label to a more common
"error".

The refactor should also get rid of the need to call sa_assert on the
returned variable as the callpath should allow coverity to infer the
possible return values.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=997367

Thanks to Alex Jia for tracking down the issue. This issue was introduced
by commit 0fc8909.

(cherry picked from commit 47b9127e883677a0d60d767030a147450e919a25)

Conflicts:
src/util/bitmap.c - context, coverity fix not backported

bitmap: add virBitmapCountBits

Sometimes it's handy to know how many bits are set.

* src/util/bitmap.h (virBitmapCountBits): New prototype.
(virBitmapNextSetBit): Use correct type.
* src/util/bitmap.c (virBitmapNextSetBit): Likewise.
(virBitmapSetAll): Maintain invariant of clear tail bits.
(virBitmapCountBits): New function.
* src/libvirt_private.syms (bitmap.h): Export it.
* tests/virbitmaptest.c (test2): Test it.

(cherry picked from commit 0711c4b74d1f0e83b06c5b15a50f99d780478566)

Prep for release 0.10.2.7

udev: fix crash in libudev logging

Call virLogVMessage instead of virLogMessage, since libudev
called us with a va_list object, not a list of arguments.

Honor message priority and strip the trailing newline.

https://bugzilla.redhat.com/show_bug.cgi?id=969152
(cherry picked from commit f753dd62f951cc62e164421d0c6491f39e4c68ad)

Conflicts:
src/libvirt_private.syms
src/node_device/node_device_udev.c
src/util/logging.c
src/util/logging.h

security: fix deadlock with prefork

https://bugzilla.redhat.com/show_bug.cgi?id=964358

Attempts to start a domain with both SELinux and DAC security
modules loaded will deadlock; latent problem introduced in commit
fdb3bde and exposed in commit 29fe5d7. Basically, when recursing
into the security manager for other driver's prefork, we have to
undo the asymmetric lock taken at the manager level.

Reported by Jiri Denemark, with diagnosis help from Dan Berrange.

* src/security/security_stack.c (virSecurityStackPreFork): Undo
extra lock grabbed during recursion.

Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit bfc183c1e377b24cebf5cede4c00f3dc0d1b3486)

security_dac: compute supplemental groups before fork

https://bugzilla.redhat.com/show_bug.cgi?id=964358

Commit 75c1256 states that virGetGroupList must not be called
between fork and exec, then commit ee777e99 promptly violated
that for lxc's use of virSecurityManagerSetProcessLabel. Hoist
the supplemental group detection to the time that the security
manager needs to fork. Qemu is safe, as it uses
virSecurityManagerSetChildProcessLabel which in turn uses
virCommand to determine supplemental groups.

This does not fix the fact that virSecurityManagerSetProcessLabel
calls virSecurityDACParseIds, which calls parseIds, which eventually
calls getpwnam_r, which also violates the async-signal-safety rules
between fork and exec, but so far no one has complained of hitting
deadlock in that case.

* src/security/security_dac.c (_virSecurityDACData): Track groups
in private data.
(virSecurityDACPreFork): New function, to set them.
(virSecurityDACClose): Clean up new fields.
(virSecurityDACGetIds): Alter signature.
(virSecurityDACSetSecurityHostdevLabelHelper)
(virSecurityDACSetChardevLabel, virSecurityDACSetProcessLabel)
(virSecurityDACSetChildProcessLabel): Update callers.

Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 29fe5d745fbe207ec2415441d4807ae76be05974)

Conflicts:
src/security/security_dac.c - virSecurityDACSetSecurityUSBLabel needed similar treatment; no virSecurityDACSetChildProcessLabel

security: framework for driver PreFork handler

https://bugzilla.redhat.com/show_bug.cgi?id=964358

A future patch wants the DAC security manager to be able to safely
get the supplemental group list for a given uid, but at the time
of a fork rather than during initialization so as to pick up on
live changes to the system's group database.  This patch adds the
framework, including the possibility of a pre-fork callback
failing.

For now, any driver that implements a prefork callback must be
robust against the possibility of being part of a security stack
where a later element in the chain fails prefork.  This means
that drivers cannot do any action that requires a call to postfork
for proper cleanup (no grabbing a mutex, for example).  If this
is too prohibitive in the future, we would have to switch to a
transactioning sequence, where each driver has (up to) 3 callbacks:
PreForkPrepare, PreForkCommit, and PreForkAbort, to either clean
up or commit changes made during prepare.

* src/security/security_driver.h (virSecurityDriverPreFork): New
callback.
* src/security/security_manager.h (virSecurityManagerPreFork):
Change signature.
* src/security/security_manager.c (virSecurityManagerPreFork):
Optionally call into driver, and allow returning failure.
* src/security/security_stack.c (virSecurityDriverStack):
Wrap the handler for the stack driver.
* src/qemu/qemu_process.c (qemuProcessStart): Adjust caller.

Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit fdb3bde31ccf8ff172abf00ef5aa974b87af2794)

Conflicts:
src/security/security_manager.c - context from previous backport differences

Fix potential deadlock across fork() in QEMU driver

https://bugzilla.redhat.com/show_bug.cgi?id=964358

The hook scripts used by virCommand must be careful wrt
accessing any mutexes that may have been held by other
threads in the parent process. With the recent refactoring
there are 2 potential flaws lurking, which will become real
deadlock bugs once the global QEMU driver lock is removed.

Remove use of the QEMU driver lock from the hook function
by passing in the 'virQEMUDriverConfigPtr' instance directly.

Add functions to the virSecurityManager to be invoked before
and after fork, to ensure the mutex is held by the current
thread. This allows it to be safely used in the hook script
in the child process.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 61b52d2e3813cc8c9ff3ab67f232bd0c65f7318d)

Conflicts:
src/libvirt_private.syms - context
src/qemu/qemu_process.c - no backport of qemud_driver struct rename
src/security/security_manager.c - no backport of making the security driver self-locking; just expose the interface

util: make virSetUIDGID async-signal-safe

https://bugzilla.redhat.com/show_bug.cgi?id=964358

POSIX states that multi-threaded apps should not use functions
that are not async-signal-safe between fork and exec, yet we
were using getpwuid_r and initgroups.  Although rare, it is
possible to hit deadlock in the child, when it tries to grab
a mutex that was already held by another thread in the parent.
I actually hit this deadlock when testing multiple domains
being started in parallel with a command hook, with the following
backtrace in the child:

Thread 1 (Thread 0x7fd56bbf2700 (LWP 3212)):
#0  __lll_lock_wait ()
     at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1  0x00007fd5761e7388 in _L_lock_854 () from /lib64/libpthread.so.0
#2  0x00007fd5761e7257 in __pthread_mutex_lock (mutex=0x7fd56be00360)
     at pthread_mutex_lock.c:61
#3  0x00007fd56bbf9fc5 in _nss_files_getpwuid_r (uid=0, result=0x7fd56bbf0c70,
     buffer=0x7fd55c2a65f0 "", buflen=1024, errnop=0x7fd56bbf25b8)
     at nss_files/files-pwd.c:40
#4  0x00007fd575aeff1d in __getpwuid_r (uid=0, resbuf=0x7fd56bbf0c70,
     buffer=0x7fd55c2a65f0 "", buflen=1024, result=0x7fd56bbf0cb0)
     at ../nss/getXXbyYY_r.c:253
#5  0x00007fd578aebafc in virSetUIDGID (uid=0, gid=0) at util/virutil.c:1031
#6  0x00007fd578aebf43 in virSetUIDGIDWithCaps (uid=0, gid=0, capBits=0,
     clearExistingCaps=true) at util/virutil.c:1388
#7  0x00007fd578a9a20b in virExec (cmd=0x7fd55c231f10) at util/vircommand.c:654
#8  0x00007fd578a9dfa2 in virCommandRunAsync (cmd=0x7fd55c231f10, pid=0x0)
     at util/vircommand.c:2247
#9  0x00007fd578a9d74e in virCommandRun (cmd=0x7fd55c231f10, exitstatus=0x0)
     at util/vircommand.c:2100
#10 0x00007fd56326fde5 in qemuProcessStart (conn=0x7fd53c000df0,
     driver=0x7fd55c0dc4f0, vm=0x7fd54800b100, migrateFrom=0x0, stdin_fd=-1,
     stdin_path=0x0, snapshot=0x0, vmop=VIR_NETDEV_VPORT_PROFILE_OP_CREATE,
     flags=1) at qemu/qemu_process.c:3694
...

The solution is to split the work of getpwuid_r/initgroups into the
unsafe portions (getgrouplist, called pre-fork) and safe portions
(setgroups, called post-fork).

* src/util/virutil.h (virSetUIDGID, virSetUIDGIDWithCaps): Adjust
signature.
* src/util/virutil.c (virSetUIDGID): Add parameters.
(virSetUIDGIDWithCaps): Adjust clients.
* src/util/vircommand.c (virExec): Likewise.
* src/util/virfile.c (virFileAccessibleAs, virFileOpenForked)
(virDirCreate): Likewise.
* src/security/security_dac.c (virSecurityDACSetProcessLabel):
Likewise.
* src/lxc/lxc_container.c (lxcContainerSetID): Likewise.
* configure.ac (AC_CHECK_FUNCS_ONCE): Check for setgroups, not
initgroups.

Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit ee777e994927ed5f2d427fbc5a53cbe8b5969bda)

Conflicts:
src/lxc/lxc_container.c - did not use setUIDGID before 1.1.0
src/util/virutil.c - oom handling changes not backported; no virSetUIDGIDWithCaps
src/util/virfile.c - functions still lived in virutil.c this far back
configure.ac - context with previous commit
src/util/command.c - no UID/GID handling in vircommand.c...
src/storage/storage_backend.c - ...so do it in the one hook user instead

util: add virGetGroupList

https://bugzilla.redhat.com/show_bug.cgi?id=964358

Since neither getpwuid_r() nor initgroups() are safe to call in
between fork and exec (they obtain a mutex, but if some other
thread in the parent also held the mutex at the time of the fork,
the child will deadlock), we have to split out the functionality
that is unsafe. At least glibc's initgroups() uses getgrouplist
under the hood, so the ideal split is to expose getgrouplist for
use before a fork. Gnulib already gives us a nice wrapper via
mgetgroups; we wrap it once more to look up by uid instead of name.

* bootstrap.conf (gnulib_modules): Add mgetgroups.
* src/util/virutil.h (virGetGroupList): New declaration.
* src/util/virutil.c (virGetGroupList): New function.
* src/libvirt_private.syms (virutil.h): Export it.

Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 75c125641ac73473ba4b0542524d67a184769c8e)

Conflicts:
bootstrap.conf - not updating gnulib submodule...
configure.ac - ...so checking for getgrouplist by hand...
src/util/virutil.c - ...and copying only the getgrouplist implementation rather than calling the gnulib function; also, file still named util.c
src/libvirt_private.syms - context

util: improve user lookup helper

https://bugzilla.redhat.com/show_bug.cgi?id=964358

A future patch needs to look up pw_gid; but it is wasteful
to crawl through getpwuid_r twice for two separate pieces
of information, and annoying to copy that much boilerplate
code for doing the crawl. The current internal-only
virGetUserEnt is also a rather awkward interface; it's easier
to just design it to let callers request multiple pieces of
data as needed from one traversal.

And while at it, I noticed that virGetXDGDirectory could deref
NULL if the getpwuid_r lookup fails.

* src/util/virutil.c (virGetUserEnt): Alter signature.
(virGetUserDirectory, virGetXDGDirectory, virGetUserName): Adjust
callers.

Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit c1983ba4e3902308054e961fcae75cece73ef4ba)

Conflicts:
src/util/virutil.c - oom reporting/strdup changes not backported

storage: return -1 when fs pool can't be mounted

Don't reuse the return value of virStorageBackendFileSystemIsMounted.
If it's 0, we'd return it even if the mount command failed.

Also, don't report another error if it's -1, since one has already
been reported.
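
The return-value mix-up can be sketched in Python. The two callables
stand in for the is-mounted check (1 mounted, 0 not mounted, -1 error)
and the mount command (0 success, -1 failure). The names mount_buggy and
mount_fixed are made up for illustration and do not mirror the actual
function bodies.

```python
def mount_buggy(is_mounted, run_mount):
    ret = is_mounted()      # 0 means "not mounted yet"
    if ret != 0:
        return -1
    if run_mount() < 0:
        return ret          # still 0: the failure is reported as success
    return 0


def mount_fixed(is_mounted, run_mount):
    if is_mounted() != 0:
        return -1           # mounted already, or error already reported
    ret = -1                # assume failure until the mount succeeds
    if run_mount() == 0:
        ret = 0
    return ret
```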

Introduced by 258e06c.

https://bugzilla.redhat.com/show_bug.cgi?id=981251
(cherry picked from commit 13fde7ceab556804dc6cfb3e56938fb948ffe83d)

Fix invalid read in virCgroupGetValueStr

Don't check for '\n' at the end of file if zero bytes were read.

Found by valgrind:
==404== Invalid read of size 1
==404==    at 0x529B09F: virCgroupGetValueStr (vircgroup.c:540)
==404==    by 0x529AF64: virCgroupMoveTask (vircgroup.c:1079)
==404==    by 0x1EB475: qemuSetupCgroupForEmulator (qemu_cgroup.c:1061)
==404==    by 0x1D9489: qemuProcessStart (qemu_process.c:3801)
==404==    by 0x18557E: qemuDomainObjStart (qemu_driver.c:5787)
==404==    by 0x190FA4: qemuDomainCreateWithFlags (qemu_driver.c:5839)

Introduced by 0d0b409.

https://bugzilla.redhat.com/show_bug.cgi?id=978356
(cherry picked from commit 306c49ffd56a1c72b1892d50f2a75531c62f4a1d)
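
The guard described in the commit above can be sketched as follows. This is a minimal illustration, not the libvirt code: the helper name strip_trailing_newline and its signature are assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical helper (not libvirt's actual function): drop one trailing
 * '\n' from a buffer that holds len bytes of data. Returns the new length. */
size_t strip_trailing_newline(char *buf, size_t len)
{
    /* The len > 0 guard is the point of the fix: with len == 0,
     * buf[len - 1] would read one byte before the start of the data. */
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';
        len--;
    }
    return len;
}
```

With zero bytes read, the function returns 0 without touching the buffer.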

virsh: edit: don't leak XML string on reedit or redefine

Free the old XML strings before overwriting them if the user
has chosen to reedit the file or force the redefinition.

Found by Alex Jia trying to reproduce another bug:
https://bugzilla.redhat.com/show_bug.cgi?id=977430#c3
(cherry picked from commit 1e3a252974c8e5c650f1d84dc2b167f0ae8cee3c)

Prep for release 0.10.2.6

qemu: Don't report error on successful media eject

If we are just ejecting media, ret == -1 even after the retry loop
determines that the tray is open, as requested. This means media
disconnect always reports an error.

Fix it, and fix some other mini issues:

- Don't overwrite the 'eject' error message if the retry loop fails
- Move the retries decrement inside the loop, otherwise the final loop
might succeed, yet retries == 0 and we will raise an error
- Setting ret = -1 in the disk->src check is unneeded
- Fix comment typos

cc: mprivozn@redhat.com
(cherry picked from commit 406d8a980973cfd4caebbc886f5b283233409a64)
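
The retry-counter issue from the list above can be illustrated with a small sketch. All names here (wait_for_tray_open, the callback types, opens_after_countdown) are hypothetical and exist only for this example; the real code polls qemu rather than a callback.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback: returns true once the tray reports open. */
typedef bool (*tray_open_fn)(void *opaque);
/* Hypothetical callback: pauses between polls (e.g. 500ms); may be NULL. */
typedef void (*pause_fn)(void);

/* Poll up to max_retries times. Returns 0 if the tray opened, -1 if
 * every attempt failed. */
int wait_for_tray_open(tray_open_fn is_open, void *opaque,
                       int max_retries, pause_fn pause)
{
    int retries = max_retries;

    while (retries > 0) {
        if (is_open(opaque))
            return 0;       /* success is decided here, not by the counter */
        retries--;          /* decrement only after a failed attempt */
        if (retries > 0 && pause)
            pause();
    }
    return -1;
}

/* Demonstration predicate: answers "closed" *opaque times, then "open". */
bool opens_after_countdown(void *opaque)
{
    int *remaining = opaque;

    if (*remaining == 0)
        return true;
    (*remaining)--;
    return false;
}
```

Because success returns from inside the loop, a tray that opens on the tenth and final poll is still reported as success, while ten failed polls yield -1.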

qemuDomainChangeEjectableMedia: Unlock domain while waiting for event

In 84c59ffa I tried to fix the process of changing ejectable media. The
process should go like this:

1) we need to call 'eject' on the monitor
2) we should wait for 'DEVICE_TRAY_MOVED' event
3) now we can issue 'change' command

However, while waiting in step 2) the domain monitor was locked. So
even if qemu reported the desired event, the proper callback was not
called immediately. The monitor handling code needs to lock the
monitor in order to read the event. So that's the first lock we must
not hold while waiting. The second one is the domain lock. When the
monitor handling code reads an event, it calls the appropriate
callback, and the first thing each callback does is lock the
corresponding domain, as the domain or one of its devices is about to
change state. So we need to release both the monitor lock and the VM
lock. Well, holding any lock while sleep()-ing is not the best thing
to do anyway.
(cherry picked from commit 543af79a14f06cd16844c28887210bbb93a455fa)

Conflicts:
src/qemu/qemu_hotplug.c

qemu_hotplug: Rework media changing process

https://bugzilla.redhat.com/show_bug.cgi?id=892289

It seems like with new udev within guest OS, the tray is locked,
so we need to:
- 'eject'
- wait for tray to open
- 'change'

Moreover, even when doing a bare 'eject', we should check
'tray_open', as the guest may have locked the tray. However, the
wait shouldn't be unbounded, so I've chosen a maximum of 10
retries, 500ms apart. This should give the guest enough time
to eject the media and open the tray.
(cherry picked from commit 84c59ffaecc983083e2e885fa59c4f0ec1812656)

nwfilter: grab driver lock earlier during init (bz966449)

This patch is in relation to Bug 966449:

https://bugzilla.redhat.com/show_bug.cgi?id=966449

This is a patch addressing the coredump.

Thread 1 must be calling nwfilterDriverRemoveDBusMatches(). It does so with
nwfilterDriverLock held. In the patch below I am now moving the
nwfilterDriverLock(driverState) further up so that the initialization, which
seems to either take a long time or is entirely stuck, occurs with the lock
held and the shutdown cannot occur at the same time.

Remove the lock in virNWFilterDriverIsWatchingFirewallD to avoid
double-locking.

(cherry picked from commit 0ec376c20a42b9eb365c1f9a5596366023c20c35)

storage: Ensure 'qemu-img resize' size arg is a 512 multiple

qemu-img resize will fail with "The new size must be a multiple of 512"
if libvirt doesn't round it first.
This fixes rhbz#951495

Signed-off-by: Christophe Fergeau <cfergeau@redhat.com>
(cherry picked from commit 9a8f39d097448b2b43c4a05d0edc213eacfc9ea6)
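
A minimal sketch of the rounding described above, assuming that rounding up (rather than down) is the intended direction. The helper name round_up_512 is made up for this example.

```c
#include <stdint.h>

/* Hypothetical helper (not libvirt's actual code): round a byte count up
 * to the next multiple of 512, the granularity 'qemu-img resize' accepts. */
uint64_t round_up_512(uint64_t bytes)
{
    const uint64_t sector = 512;

    /* Adding sector - 1 before dividing rounds up instead of down.
     * Assumes bytes is small enough that the addition cannot overflow. */
    return (bytes + sector - 1) / sector * sector;
}
```

Values that are already a multiple of 512 are returned unchanged.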

Tweak EOF handling of streams

Typically when you get EOF on a stream, poll will return
POLLIN|POLLHUP at the same time. Thus when we deal with
stream reads, if we see EOF during the read, we can then
clear the VIR_STREAM_EVENT_HANGUP & VIR_STREAM_EVENT_ERROR
event bits.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit e7b0382945c9bf043f06ee4dd760c2051d63d0f0)
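
The event-bit handling described above can be sketched as below. The EV_* constants are local stand-ins for the VIR_STREAM_EVENT_* flags and the function name is an assumption for this example, not a libvirt API.

```c
/* Local stand-ins for the VIR_STREAM_EVENT_* bits (values chosen for
 * this sketch only). */
enum {
    EV_READABLE = 1 << 0,
    EV_WRITABLE = 1 << 1,
    EV_ERROR    = 1 << 2,
    EV_HANGUP   = 1 << 3
};

/* If the read that was just performed hit EOF, the hangup and error bits
 * reported alongside POLLIN carry no extra information, so clear them. */
int filter_events_after_read(int events, int saw_eof)
{
    if (saw_eof)
        events &= ~(EV_HANGUP | EV_ERROR);
    return events;
}
```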

smartcard: spell ccid-card-emulated qemu property correctly

Reported by Anthony Messina in
https://bugzilla.redhat.com/show_bug.cgi?id=904692
Present since introduction of smartcard support in commit f5fd9baa

* src/qemu/qemu_command.c (qemuBuildCommandLine): Match qemu spelling.
* tests/qemuxml2argvdata/qemuxml2argv-smartcard-host-certificates.args:
Fix broken test.
(cherry picked from commit 6f7e4ea359323f9bc413dfb738a5c544d4f9c4f8)

cgroup: be robust against cgroup movement races, part 2

The previous commit was an incomplete backport of commit 83e4c775,
and as a result made any attempt to start a domain when cgroups
are enabled go into an infinite loop. This fixes the botched
backport.

Signed-off-by: Eric Blake <eblake@redhat.com>

cgroup: be robust against cgroup movement races

https://bugzilla.redhat.com/show_bug.cgi?id=965169 documents a
problem starting domains when cgroups are enabled; I was able
to reliably reproduce the race about 5% of the time when I added
hooks to delay domain startup by 3 seconds (as that seemed to be about
the length of time that qemu created and then closed a temporary
thread, probably related to aio handling of initially opening
a disk image).  The problem has existed since we introduced
virCgroupMoveTask in commit 9102829 (v0.10.0).

There are some inherent TOCTTOU races when moving tasks between
kernel cgroups, precisely because threads can be created or
completed in the window between when we read a thread id from the
source and when we write to the destination.  As the goal of
virCgroupMoveTask is merely to move ALL tasks into the new
cgroup, it is sufficient to iterate until no more threads are
being created in the old group, ignoring any threads that
die before we can move them.

It would be nicer to start the threads in the right cgroup to
begin with, but by default, all child threads are created in
the same cgroup as their parent, and we don't want vcpu child
threads in the emulator cgroup, so I don't see any good way
of avoiding the move.  It would also be nice if the kernel were
to implement something like rename() as a way to atomically move
a group of threads from one cgroup to another, instead of forcing
a window where we have to read and parse the source, then format
and write back into the destination.

* src/util/vircgroup.c (virCgroupAddTaskStrController): Ignore
ESRCH, because a thread ended between read and write attempts.
(virCgroupMoveTask): Loop until all threads have moved.

Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 83e4c77547f5b721afad19a452f41c31daeee8c5)

Conflicts:
src/util/cgroup.c - refactoring in commit 56f27b3bb is too big
to take in entirety; but I did inline its changes to the cleanup label
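
The "loop until all threads have moved, ignoring ESRCH" approach from the cgroup movement commit above can be sketched with a small simulation. Everything here is hypothetical: the callback types, move_all_tasks, and the fake_cgroup model stand in for reading and writing the kernel's tasks files.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical callbacks: list the tids currently in the source group,
 * and add one tid to the destination (returns 0 or a negative errno). */
typedef size_t (*list_tasks_fn)(void *opaque, int *tids, size_t max);
typedef int (*add_task_fn)(void *opaque, int tid);

int move_all_tasks(list_tasks_fn list, add_task_fn add, void *opaque)
{
    int tids[64];
    size_t n;

    /* Re-read the source until it is empty: threads may have been
     * created since the previous pass. */
    while ((n = list(opaque, tids, 64)) > 0) {
        for (size_t i = 0; i < n; i++) {
            int rc = add(opaque, tids[i]);

            /* ESRCH: the thread exited between read and write; skip it. */
            if (rc < 0 && rc != -ESRCH)
                return rc;
        }
    }
    return 0;
}

/* Fake source group for demonstration purposes only. */
struct fake_cgroup {
    int src[8];
    size_t nsrc;
    int dead_tid;   /* tid that "exits" before it can be moved */
    int moved;      /* number of tids moved successfully */
};

size_t fake_list(void *opaque, int *tids, size_t max)
{
    struct fake_cgroup *cg = opaque;
    size_t n = cg->nsrc < max ? cg->nsrc : max;

    for (size_t i = 0; i < n; i++)
        tids[i] = cg->src[i];
    return n;
}

int fake_add(void *opaque, int tid)
{
    struct fake_cgroup *cg = opaque;

    /* Moved or dead, the tid disappears from the source group. */
    for (size_t i = 0; i < cg->nsrc; i++) {
        if (cg->src[i] == tid) {
            cg->src[i] = cg->src[--cg->nsrc];
            break;
        }
    }
    if (tid == cg->dead_tid)
        return -ESRCH;
    cg->moved++;
    return 0;
}
```

In the simulation, a group holding tids 101, 102 and 103, where 102 dies before the write, ends up empty with two tasks moved and no error reported.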

Avoid spamming logs with cgroups warnings

The code for putting the emulator threads in a separate cgroup
would spam the logs with warnings

2013-02-27 16:08:26.731+0000: 29624: warning : virCgroupMoveTask:887 : no vm cgroup in controller 3
2013-02-27 16:08:26.731+0000: 29624: warning : virCgroupMoveTask:887 : no vm cgroup in controller 4
2013-02-27 16:08:26.732+0000: 29624: warning : virCgroupMoveTask:887 : no vm cgroup in controller 6

This is because it has only created child cgroups for 3 of the
controllers, but was trying to move the processes from all the
controllers. The fix is to only try to move threads in the
controllers we actually created. Also remove the warning and
make it return a hard error to avoid such lazy callers in the
future.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 279336c5d8c44f28956b3427ab2bf207d06878e2)

Don't try to add non-existent devices to ACL

The QEMU driver has a list of device nodes that are whitelisted
for all guests. The kernel has recently started returning an
error if you try to whitelist a device which does not exist.
This causes a warning in libvirt logs and an audit error for
any missing devices, e.g.

2013-02-27 16:08:26.515+0000: 29625: warning : virDomainAuditCgroup:451 : success=no virt=kvm resrc=cgroup reason=allow vm="vm031714" uuid=9d8f1de0-44f4-a0b1-7d50-e41ee6cd897b cgroup="/sys/fs/cgroup/devices/libvirt/qemu/vm031714/" class=path path=/dev/kqemu rdev=? acl=rw

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 7f544a4c8f0353e4ff9ca08aafbb86ff8f60da0a)

Prep for release 0.10.2.5

Fix TLS tests with gnutls 3

When given a CA cert with basic constraints set to non-critical,
and key usage of 'key signing', this should be rejected. Versions
of GNUTLS < 3 do not reject it though, so we never noticed the
test case was broken.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 0204d6d7a0519377b2e6bc296b00328cd748f55d)

daemon: fix leak after listing all volumes

CVE-2013-1962

remoteDispatchStoragePoolListAllVolumes wasn't freeing the pool.
The pool also held a reference to the connection, preventing it from
getting freed and closing the netcf interface driver, which held two
sockets open.
(cherry picked from commit ca697e90d5bd6a6dfb94bfb6d4438bdf9a44b739)

spec: proper soft static allocation of qemu uid

https://bugzilla.redhat.com/show_bug.cgi?id=924501 tracks a
problem that occurs if uid 107 is already in use at the time
libvirt is first installed. In response to that problem, Fedora
packaging guidelines were recently updated. This fixes the
spec file to comply with the new guidelines:
https://fedoraproject.org/wiki/Packaging:UsersAndGroups

* libvirt.spec.in (daemon): Follow updated Fedora guidelines.

Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit a2584d58f6f7d941b960f996c8e26df8294b79b9)

Conflicts:
libvirt.spec.in - no backport of c8f79c9b %if reindents

spec: Fix minor changelog issues

When a changelog entry references an RPM macro, % needs to be escaped so
that it does not appear expanded in package changelog.

Fri Mar 4 2009 is incorrect since Mar 4 was Wednesday. Since
libvirt-0.6.1 was released on Mar 4 2009, we should change Fri to Wed.
(cherry picked from commit 53657a0abe273000a8aa2bfc0417a92dcd9dd5b7)

spec: Avoid using makeinstall relic

The macro was made to help installing broken packages that did not use
DESTDIR correctly by overriding individual path variables (prefix,
sysconfdir, ...). Newer rpm provides fixed make_install macro that calls
make install with just the correct DESTDIR, however it is not available
everywhere (e.g., RHEL 5 does not have it). On the other hand the
make_install macro is simple and straightforward enough for us to use
its expansion directly.
(cherry picked from commit d45066a55f866a793f346bde1ac6d0f552aa9e52)

audit: properly encode device path in cgroup audit

https://bugzilla.redhat.com/show_bug.cgi?id=922186

Commit d04916fa introduced a regression in audit quality - even
though the code was computing the proper escaped name for a
path, it wasn't feeding that escaped name on to the audit message.
As a result, /var/log/audit/audit.log would mention a pair of
fields class=path path=/dev/hpet instead of the intended
class=path path="/dev/hpet", which in turn caused ausearch to
format the audit log with path=(null).

* src/conf/domain_audit.c (virDomainAuditCgroupPath): Use
constructed encoding.

Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 31c6bf35b9d9de04158318658f4fbf6a9e54ff28)

storage: Fix lvcreate parameter for backingStore.

When virStorageBackendLogicalCreateVol() creates a snapshot for a
logical volume with backingStore element, it fails with the message
below:

  2013-01-17 03:10:18.869+0000: 1967: error : virCommandWait:2345 :
  internal error Child process (/sbin/lvcreate --name lvm-snapshot -L 51200K
  -s=/dev/lvm-pool/lvm-volume) unexpected exit status 3: /sbin/lvcreate:
  invalid option -- '='  Error during parsing of command line.

This is because virCommandAddArgPair() uses '=' to connect the two
parameters, which is unsuitable for the -s option of lvcreate.

Signed-off-by: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp>
(cherry picked from commit ffee627a4a52bf13d787cec883054fe2e7d9465c)
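
The difference between the two ways of passing the option can be sketched as below. Both helpers are hypothetical illustrations: join_arg_pair only mimics the "name=value" joining that the commit attributes to virCommandAddArgPair(), and add_separate_args shows the alternative of two argv entries.

```c
#include <stdio.h>

/* Hypothetical: join name and value with '=', producing a single
 * argument such as "-s=/dev/lvm-pool/lvm-volume". */
int join_arg_pair(char *out, size_t outlen,
                  const char *name, const char *value)
{
    return snprintf(out, outlen, "%s=%s", name, value);
}

/* Hypothetical: append name and value as two separate argv entries,
 * which is the form lvcreate's -s option accepts. Returns the new
 * argument count; the caller must provide room for two more entries. */
size_t add_separate_args(const char **argv, size_t n,
                         const char *name, const char *value)
{
    argv[n++] = name;
    argv[n++] = value;
    return n;
}
```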

Prep for release 0.10.2.4

esx: Fix and improve esxListAllDomains function

Avoid requesting information such as identity or power state when it
is not necessary.

Lookup virtual machine list with the required fields (configStatus,
name, and config.uuid) to make esxVI_GetVirtualMachineIdentity work.

No need to call esxVI_GetNumberOfSnapshotTrees. rootSnapshotTreeList
can be tested for emptiness by checking it for NULL.

esxVI_LookupRootSnapshotTreeList already does the error reporting,
don't overwrite it.

Check if autostart is enabled at all before looking up the individual
autostart setting of a virtual machine.

Reorder VIR_EXPAND_N(doms, ndoms, 1) to avoid leaking the result of
the call to virGetDomain if VIR_EXPAND_N fails.

Replace VIR_EXPAND_N by VIR_RESIZE_N to avoid quadratic scaling, as in
the Hyper-V version of the function.

If virGetDomain fails it already reports an error, don't overwrite it
with an OOM error.

All items in doms up to the count-th one are valid, no need to double
check before freeing them.

Finally, don't leak autoStartDefaults and powerInfoList.
(cherry picked from commit 5fc663d8bedc082585941e1453229cdcf5fe2880)

Fix parsing of SELinux ranges without a category

Normally libvirtd should run with a SELinux label

  system_u:system_r:virtd_t:s0-s0:c0.c1023

If a user manually runs libvirtd though, it is sometimes
possible to get into a situation where it is running

  system_u:system_r:init_t:s0

The SELinux security driver isn't expecting this and can't
parse the security label since it lacks the ':c0.c1023' part
causing it to complain

  internal error Cannot parse sensitivity level in s0

This updates the parser to cope with this, so if no category
is present, libvirtd will hardcode the equivalent of c0.c1023.

Now this won't work if SELinux is in Enforcing mode, but that's
not an issue, because the user can only get into this problem
if in Permissive mode. This means they can now start VMs in
Permissive mode without hitting that parsing error

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 1732c1c62997b9f5ce39e5eb4d1ef2f842af73e1)

Conflicts:
src/security/security_selinux.c
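
The fallback for a missing category part can be sketched as follows. This is a simplified illustration under the assumption that the range is the last field of the label; parse_category_range is a made-up name and the real parser in the SELinux security driver is more involved.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch: extract the category bounds from an MLS range
 * such as "s0-s0:c0.c1023" or a bare "s0". Returns 0 on success and
 * -1 if a category part is present but malformed. */
int parse_category_range(const char *range, int *cat_min, int *cat_max)
{
    const char *cats = strchr(range, ':');

    if (!cats) {
        /* No category part: assume the equivalent of c0.c1023. */
        *cat_min = 0;
        *cat_max = 1023;
        return 0;
    }
    if (sscanf(cats + 1, "c%d.c%d", cat_min, cat_max) != 2)
        return -1;
    return 0;
}
```

A bare "s0" therefore yields the same bounds as "s0-s0:c0.c1023".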

Separate MCS range parsing from MCS range checking

Pull the code which parses the current process MCS range
out of virSecuritySELinuxMCSFind and into a new method
virSecuritySELinuxMCSGetProcessRange.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 4a92fe4413d2a43cf1215e26be7551b653d9a992)

Conflicts:
src/security/security_selinux.c

Fix memory leak on OOM in virSecuritySELinuxMCSFind

The body of the loop in virSecuritySELinuxMCSFind would
directly 'return NULL' on OOM, instead of jumping to the
cleanup label. This caused a leak of several local vars.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit f2d8190cfb8e52b006a5cfd080b42d2a1755fd28)

qemu: Set migration FD blocking

Since we switched from the direct host migration scheme to the one
where we connect to the destination and then just pass an FD to
qemu, we have uncovered a qemu bug. Qemu expects the migration FD to
block. However, we are passing a nonblocking one, which results in
cryptic error messages like:

qemu: warning: error while loading state section id 2
load of migration failed

The bug is already known to the Qemu folks, but we should work around
it for already released Qemus. The patch was originally proposed by Stefan
Hajnoczi <stefanha@gmail.com>
(cherry picked from commit ceb31795af40f6127a541076b905935ff83e5b11)

build: further fixes for broken if_bridge.h

Commit c308a9ae was incomplete; it resolved the configure failure,
but not a later build failure.

* src/util/virnetdevbridge.c: Include pre-req header.
* configure.ac (AC_CHECK_HEADERS): Prefer standard in.h over
non-standard ip6.h.
(cherry picked from commit 1bf661caf4e926efcad6e85151a587cea5fd29f4)

build: work around broken kernel header

I got this scary warning during ./configure on rawhide:

checking linux/if_bridge.h usability... no
checking linux/if_bridge.h presence... yes
configure: WARNING: linux/if_bridge.h: present but cannot be compiled
configure: WARNING: linux/if_bridge.h:     check for missing prerequisite headers?
configure: WARNING: linux/if_bridge.h: see the Autoconf documentation
configure: WARNING: linux/if_bridge.h:     section "Present But Cannot Be Compiled"
configure: WARNING: linux/if_bridge.h: proceeding with the compiler's result
configure: WARNING:     ## ------------------------------------- ##
configure: WARNING:     ## Report this to libvir-list@redhat.com ##
configure: WARNING:     ## ------------------------------------- ##
checking for linux/if_bridge.h... no

* configure.ac (AC_CHECK_HEADERS): Provide struct in6_addr, since
linux/if_bridge.h uses it without declaring it.
(cherry picked from commit c308a9ae153db619fc0366bad9fd8f6c49cfac58)
(cherry picked from commit 7ae53f15936277dfc777539ce13970293fdb03d0)

Fix SELinux security label test

If securityselinuxtest was run on a system with newer SELinux
policy it would fail, due to using svirt_tcg_t instead of
svirt_t. Fixing the domain type to be KVM avoids this issue.
(cherry picked from commit 32df483f1d5916f00e3ab15158f099234909e9c2)

libxl: Fix setting of disk backend

The libxl driver was setting the backend field of libxl_device_disk
structure to LIBXL_DISK_BACKEND_TAP when the driver element of disk
configuration was not specified. This needlessly forces the use of
blktap driver, which may not be loaded in dom0

https://bugzilla.redhat.com/show_bug.cgi?id=912488

Ian Campbell suggested that LIBXL_DISK_BACKEND_UNKNOWN is a better
default in this case

https://www.redhat.com/archives/libvir-list/2013-February/msg01126.html
(cherry picked from commit 567779e51a7727b021dee095c9d75cf0cde0bd43)

util: Fix mask for 172.16.0.0 private address range

https://bugzilla.redhat.com/show_bug.cgi?id=905708

Only the first 12 bits should be set in the mask for this range. All
addresses between 172.16.0.0 and 172.31.255.255 are private.
(cherry picked from commit 6405713f2ab9243db7d856914aaefbd4f9747daa)
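
The 12-bit mask can be checked with a short sketch. The helper name is_private_172 is an assumption for this example, and the address is taken in host byte order.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: addr is an IPv4 address in host byte order. */
bool is_private_172(uint32_t addr)
{
    const uint32_t net  = 0xAC100000u;  /* 172.16.0.0 */
    const uint32_t mask = 0xFFF00000u;  /* /12: only the top 12 bits set */

    return (addr & mask) == net;
}
```

With this mask, 172.16.0.0 through 172.31.255.255 match, while 172.15.255.255 and 172.32.0.0 do not.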

conf: don't fail to parse <boot> when parsing a single device

This resolves:

https://bugzilla.redhat.com/show_bug.cgi?id=895294

The symptom was that attempts to modify a network device using
virDomainUpdateDeviceFlags() would fail if the original device had a
<boot> element (e.g. "<boot order='1'/>"), even if the updated device
had the same <boot> element. Instead, the following error would be logged:

cannot modify network device boot index setting

It's true that it's not possible to change boot order (internally
known as bootIndex) of a live device; qemuDomainChangeNet checks for
that, but the problem was that the information it was checking was
incorrect.

Explanation:

When a complete domain is parsed, a global (to the domain) "bootMap"
is passed down to the parse for each device; the bootMap is used to
make sure that devices don't have conflicting settings for their boot
orders.

When a single device is parsed by itself (as in the case of
virDomainUpdateDeviceFlags), there is no global bootMap that would be
appropriate to send, so NULL is sent instead. However, although the
lowest level function that parses just the boot order *does* simply
skip the sanity check in that case, the next higher level
"virDomainDeviceInfoParseXML" function refuses to call down to the
lower "virDomainDeviceBootParseXML" if bootMap is NULL. So, the boot
order is never set in the "new" device object, and when it is compared
to the original (which does have a boot order), they don't match.

The fix is to patch virDomainDeviceInfoParseXML to not care about
bootMap, and just always call virDomainDeviceBootParseXML whenever
there is a <boot> element. When we are only parsing a single device,
we don't care whether or not any specified boot order is consistent
with the rest of the domain; we will always do this check later (in
the current case, we do it by verifying that the net bootIndex exactly
matches the old bootIndex).

Support custom 'svirt_tcg_t' context for TCG based guests

The current SELinux policy only works for KVM guests, since
TCG requires the 'execmem' privilege. There is a 'virt_use_execmem'
boolean to turn this on globally, but that is unpleasant for users.
This changes libvirt to automatically use a new 'svirt_tcg_t'
context for TCG based guests. This obsoletes the previous
boolean tunable and makes things 'just work(tm)'

Since we can't assume we run with new enough policy, I also
make us log a warning message (once only) if we find the policy
lacks support. In this case we fallback to the normal label and
expect users to set the boolean tunable

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 77d3a8097480e388f1ce3129fe530f235b05f93b)

uml: Report error if inotify fails on driver startup
(cherry picked from commit 7b97030ad430eb76fcc333652411208fb702e962)

daemon: Preface polkit error output with 'polkit:'

There have been a few bugs about an expected error from polkit:

https://bugzilla.redhat.com/show_bug.cgi?id=873799
https://bugzilla.redhat.com/show_bug.cgi?id=872166

The error is:

Authorization requires authentication but no agent is available.

The error means that polkit needs a password, but there is no polkit
agent registered in your session. Polkit agents are the bit of UI that
pop up and actually ask for your password.

Preface the error with the string 'polkit:' so folks can hopefully
make more sense of it.
(cherry picked from commit 96a108c99398f56970a29c8bfb7da9df90d206ed)

spec: Fix script warning when uninstalling libvirt-client

https://bugzilla.redhat.com/show_bug.cgi?id=888071
(cherry picked from commit d60c7f75c2375fd1a2cfc7e3c4c2e7b030b4b886)

Prep for release 0.10.2.3

selinux: Only create the selabel_handle once.

According to Eric Paris this is slightly more efficient because it
only loads the regular expressions in libselinux once.
(cherry picked from commit 6159710ca1eecefa7c81335612c8141c88fc35a9)

Conflicts:
src/security/security_selinux.c

Skip bulk relabelling of resources in SELinux driver when used with LXC

The virSecurityManager{Set,Restore}AllLabel methods are invoked
at domain startup/shutdown to relabel resources associated with
a domain. This works fine with QEMU, but with LXC they are in
fact both currently no-ops since LXC does not support disks,
hostdevs, or kernel/initrd files. Worse, when LXC gains support
for disks/hostdevs, they will do the wrong thing, since they
run in host context, not container context. Thus this patch
turns them into a formal no-op when used with LXC. The LXC
controller will call out to specific security manager labelling
APIs as required during startup.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 89c5a9d0e83306eef0d73af5cfb32cb49d533afc)

selinux: Resolve resource leak using the default disk label

Commit id a994ef2d1 changed the mechanism to store/update the default
security label from using disk->seclabels[0] to allocating one on the
fly. That change allocated the label, but never saved it. This patch
will save the label. The new virDomainDiskDefAddSecurityLabelDef() is
a copy of the virDomainDefAddSecurityLabelDef().
(cherry picked from commit 05cc03518987fa0f8399930d14c1d635591ca49b)

Conflicts:
src/conf/domain_conf.h

rpc: Fix crash on error paths of message dispatching

This patch resolves CVE-2013-0170:
https://bugzilla.redhat.com/show_bug.cgi?id=893450

When reading and dispatching of a message failed the message was freed
but wasn't removed from the message queue.

After that when the connection was about to be closed the pointer for
the message was still present in the queue and it was passed to
virNetMessageFree which tried to call the callback function from an
uninitialized pointer.

This patch removes the message from the queue before it's freed.

* rpc/virnetserverclient.c: virNetServerClientDispatchRead:
- avoid use after free of RPC messages
(cherry picked from commit 46532e3e8ed5f5a736a02f67d6c805492f9ca720)
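
The ordering that the fix above establishes (unlink, then free) can be sketched with a plain singly linked queue. The struct and function names are hypothetical and much simpler than the real virNetMessage queue.

```c
#include <stdlib.h>

/* Hypothetical, minimal message type for this sketch. */
struct msg {
    struct msg *next;
};

/* Unlink target from the queue; returns 1 if it was found. */
int queue_remove(struct msg **head, struct msg *target)
{
    for (struct msg **link = head; *link; link = &(*link)->next) {
        if (*link == target) {
            *link = target->next;
            target->next = NULL;
            return 1;
        }
    }
    return 0;
}

/* Error path: remove the message from the queue before freeing it, so
 * that no stale pointer is left behind for a later cleanup pass. */
void drop_failed_msg(struct msg **head, struct msg *m)
{
    queue_remove(head, m);
    free(m);
}
```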

nwfilter: Remove unprivileged code path to set base

https://bugzilla.redhat.com/show_bug.cgi?id=903184

Commit id f8ab364c removed ability to run this driver unprivileged. Coverity
detected the check and flagged it.
(cherry picked from commit aafe41971cc3f4a189edf5b322f399aabd869d74)

Conflicts:
src/nwfilter/nwfilter_driver.c - whitespace changes in 1c04f99 not present

Fix nwfilter driver reload/shutdown handling when unprivileged

https://bugzilla.redhat.com/show_bug.cgi?id=903184

Although the nwfilter driver skips startup when running in a
session libvirtd, it did not skip reload or shutdown. This
caused errors to be reported when sending SIGHUP to libvirtd,
and caused an abort() in libdbus on shutdown due to trying
to remove a dbus filter that was never added
(cherry picked from commit abbec81bd0c9bf917f2c63045222734d7e4411fb)

Conflicts:
src/nwfilter/nwfilter_driver.c - earlier changes f4ea67f and
79b8a56 related to using bool and auto-shutdown of drivers are not backported

call virStateCleanup to do the cleanup before libvirtd exits

https://bugzilla.redhat.com/show_bug.cgi?id=903184

(cherry picked from commit 47e176772559f617297abb07855b8556c8e7b72e)

Fix race condition when destroying guests

When running virDomainDestroy, we need to make sure that no other
background thread cleans up the domain while we're doing our work.
This can happen if we release the domain object while in the
middle of work, because the monitor might detect EOF in this window.
For this reason we have a 'beingDestroyed' flag to stop the monitor
from doing its normal cleanup. Unfortunately this flag was only
being used to protect qemuDomainBeginJob, and not qemuProcessKill

This left open a race condition where either libvirtd could crash,
or alternatively report bogus error messages about the domain already
having been destroyed to the caller

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 81621f3e6e45e8681cc18ae49404736a0e772a11)

Conflicts:

  src/qemu/qemu_driver.c - virReportError had been removed from
      upstream in cases where qemuProcessKill failed, creating
      different context.

build: move file deleting action from %files list to %install

When building libvirt rpms on rhel5, I got the following error:

    File must begin with "/": rm
    File must begin with "/": -f
    File must begin with "/": $RPM_BUILD_ROOT/etc/sysctl.d/libvirtd
    Installed (but unpackaged) file(s) found:
   /etc/sysctl.d/libvirtd

It is triggered by the %files list of libvirt daemon:

    %if 0%{?fedora} >= 14 || 0%{?rhel} >= 6
    %config(noreplace) %{_prefix}/lib/sysctl.d/libvirtd.conf
    %else
    rm -f $RPM_BUILD_ROOT%{_prefix}/lib/sysctl.d/libvirtd.conf
    %endif

After checking the documentation of rpm spec files, I think it would be better
to move the file deleting line from the %files list to the %install script.

Bug introduced in commit a1fd56c.
(cherry picked from commit daef7c9e9c5abef65e77116a1cabad37c0c0a897)

build: libvirt-guests files misplaced in specfile

In a non-systemd environment the post and preun scripts of libvirt-client
fail, since the required files are in libvirt-daemon. Moved them to client.
Doing that I noticed %{_unitdir}/libvirt-guests.service was contained in
both libvirt-client and libvirt-daemon, which I don't think was intended.
Removed the extra copy from daemon.

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
(cherry picked from commit b7159dca8b1b653e342584fb33225190bb772fd8)

Conflicts:
libvirt.spec.in - no virtlockd service

qemu: Relax hard RSS limit

Currently, if there's no hard memory limit defined for a domain,
libvirt tries to calculate one, based on the domain definition and a
magic equation, and sets it upon domain startup. The rationale was
that, if there's a memory leak or exploit in qemu, we should prevent
the host system from thrashing. However, the equation was too tight,
as it didn't reflect what the kernel counts into the memory used by a
process. Since many hosts do have swap, nobody has noticed anything,
because if the hard memory limit is reached, the process can continue
allocating memory from swap. However, if there is no swap on the
host, the process gets killed by the OOM killer. In our case, that is
the qemu process.

To prevent this, we need to relax the hard RSS limit. Moreover, we
should reflect more precisely the kernel way of accounting the memory
for process. That is, even the kernel caches are counted within the
memory used by a process (within cgroups at least). Hence the magic
equation has to be changed:

limit = 1.5 * (domain memory + total video memory) + (32MB for cache
per each disk) + 200MB
(cherry picked from commit 3c83df679e8feab939c08b1f97c48f9290a5b8cd)
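
The equation above can be written out as a small function. This is only a sketch: the name hard_limit_kib, the KiB units, and the integer arithmetic are assumptions of the example.

```c
/* Hypothetical helper: all sizes in KiB.
 * limit = 1.5 * (memory + video) + 32MB per disk + 200MB */
unsigned long long hard_limit_kib(unsigned long long mem_kib,
                                  unsigned long long video_kib,
                                  unsigned int ndisks)
{
    unsigned long long base = mem_kib + video_kib;

    /* base + base / 2 is 1.5 * base in integer arithmetic
     * (rounded down when base is odd). */
    return base + base / 2
        + 32ULL * 1024 * ndisks
        + 200ULL * 1024;
}
```

For a guest with 1048576 KiB of memory, 16384 KiB of video memory and one disk, this gives 1835008 KiB.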

util: fix botched check for new netlink request filters

This is an adjustment to the fix for

https://bugzilla.redhat.com/show_bug.cgi?id=889319

to account for two bonehead mistakes I made.

commit ac2797cf2af2fd0e64c58a48409a8175d24d6f86 attempted to fix a
problem with netlink in newer kernels requiring an extra attribute
with a filter flag set in order to receive an IFLA_VFINFO_LIST from
netlink. Unfortunately, the #ifdef that protected against compiling it
in on systems without the new flag went a bit too far, assuring that
the new code would *never* be compiled, and even if it had, the code
was incorrect.

The first problem was that, while some IFLA_* enum values are also
#defined so that their existence can be checked at compile time,
IFLA_EXT_MASK *isn't* #defined, so
checking to see if it's #defined is not a valid method of determining
whether or not to add the attribute. Fortunately, the flag that is
being set (RTEXT_FILTER_VF) *is* #defined, and it is never present if
IFLA_EXT_MASK isn't, so it's sufficient to just check for that flag.

And to top it off, due to the code not actually compiling when I
thought it did, I didn't realize that I'd given the wrong arglist
to nla_put() - you can't just send a const value to nla_put, you have
to send it a pointer to memory containing what you want to add to the
message, along with the length of that memory.

This time I've actually sent the patch over to the other machine
that's experiencing the problem, applied it to the branch being used
(0.10.2) and verified that it works properly, i.e. it does fix the
problem it's supposed to fix. :-/
(cherry picked from commit 7c36650699f33e54361720f824efdf164bc6e65d)

util: add missing error log messages when failing to get netlink VFINFO

This patch fixes the lack of error messages when libvirt fails to find
VFINFO in a returned netlink response message.

https://bugzilla.redhat.com/show_bug.cgi?id=827519#c10 is an example
of the error message that was previously logged when the
IFLA_VFINFO_LIST object was missing from the netlink response. The
reason for this failure is detailed in

https://bugzilla.redhat.com/show_bug.cgi?id=889319

Even though that root problem has been fixed, the experience of
finding the root cause shows us how important it is to properly log an
error message in these cases. This patch *seems* to replace the entire
function, but really most of the changes are due to moving code that
was previously inside an if() statement out to the top level of the
function (the original if() was reversed and made to log an error and
return).
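
The restructuring described above follows the usual early-return pattern.
A minimal sketch, using a hypothetical response struct and function name
rather than the actual libvirt code, and plain fprintf() in place of
libvirt's error reporting:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical parsed response: vfinfo_list is NULL when the kernel
 * left the VF information out of its reply. */
struct sketch_response {
    const int *vfinfo_list;   /* one vlan tag per VF */
    size_t nvfs;
};

/* Instead of wrapping the whole body in "if (list present) { ... }",
 * the condition is reversed: log an error and return early, then do the
 * real work at the top level of the function. */
static int
get_vf_vlan(const struct sketch_response *resp, size_t vf, int *vlan)
{
    assert(resp != NULL && vlan != NULL);

    if (!resp->vfinfo_list) {
        fprintf(stderr, "missing VFINFO list in netlink response\n");
        return -1;
    }

    if (vf >= resp->nvfs) {
        fprintf(stderr, "VF %zu not found in VFINFO list\n", vf);
        return -1;
    }

    *vlan = resp->vfinfo_list[vf];
    return 0;
}
```

Every failure path now reports why it failed before returning -1.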
(cherry picked from commit 846770e5ff959f7819e2c32857598cb88e2e2f0e)

util: fix functions that retrieve SRIOV VF info

This patch resolves:

https://bugzilla.redhat.com/show_bug.cgi?id=889319

When assigning an SRIOV virtual function to a guest using "intelligent
PCI passthrough" (<interface type='hostdev'>, which sets the MAC
address and vlan tag of the VF before passing its info to qemu),
libvirt first learns the current MAC address and vlan tag by sending
an NLM_F_REQUEST message for the VF's PF (physical function) to the
kernel via a NETLINK_ROUTE socket (see virNetDevLinkDump()); the
response message's IFLA_VFINFO_LIST section is examined to extract the
info for the particular VF being assigned.

This worked fine with kernels up until kernel commit
115c9b81928360d769a76c632bae62d15206a94a (first appearing in upstream
kernel 3.3) which changed the ABI to not return IFLA_VFINFO_LIST in
the response until a newly introduced IFLA_EXT_MASK field was included
in the request, with the (newly introduced, of course) RTEXT_FILTER_VF
flag set.

The justification for this ABI change was that new fields had been
added to the VFINFO, causing NLM_F_REQUEST messages to fail on systems
with large numbers of VFs if the requesting application didn't have a
large enough buffer for all the info. The idea is that most
applications doing an NLM_F_REQUEST don't care about VFINFO anyway, so
eliminating it from the response would lower the requirements on
buffer size. Apparently, the people who pushed this patch made the
mistaken assumption that iproute2 (the "ip" command) was the only
package that used IFLA_VFINFO_LIST, so it wouldn't break anything else
(and they made sure that iproute2 was fixed).

The logic of this "fix" is debatable at best (one could claim that the
proper fix would be for the applications in question to be fixed so
that they properly sized the buffer, which is what libvirt does
(purely by virtue of using libnl)), but it is what it is and we have to
deal with it.

In order for <interface type='hostdev'> to work properly on systems
with a kernel 3.3 or later, libvirt needs to add the aforementioned
IFLA_EXT_MASK field with RTEXT_FILTER_VF set.

Of course we also need to continue working on systems with older
kernels, so that one bit of code is compiled conditionally. The one
time this could cause problems is if the libvirt binary was built on a
system without IFLA_EXT_MASK which was subsequently updated to a
kernel that *did* have it. That could be solved by manually providing
the values of IFLA_EXT_MASK and RTEXT_FILTER_VF and adding it to the
message anyway, but I'm uncertain what that might actually do on a
system that didn't support the message, so for the time being we'll
just fail in that case (which will very likely never happen anyway).
(cherry picked from commit ac2797cf2af2fd0e64c58a48409a8175d24d6f86)

virsh: Fix POD syntax

The first two hunks fix "Unterminated I<...> sequence" error and the
last one fixes "’=item’ outside of any ’=over’" error.
(cherry picked from commit 61299a1c983a64c7e0337b94232fdd2d42c1f4f2)

build: install libvirt sysctl file correctly

https://bugzilla.redhat.com/show_bug.cgi?id=887017 reports that
even though libvirt attempts to set fs.aio-max-nr via sysctl,
the file was installed with the wrong name and gets ignored by
sysctl. Furthermore, 'man sysctl.d' recommends that packages
install into hard-coded /usr/lib/sysctl.d (even when libdir is
/usr/lib64), so that sysadmins can use /etc/sysctl.d for overrides.

* daemon/Makefile.am (install-sysctl, uninstall-sysctl): Use
correct location.
* libvirt.spec.in (network_files): Reflect this.
(cherry picked from commit a1fd56cb3057c45cffbf5d41eaf70a26d2116b20)

build: .service files don't need to be executable

See also commit 66ff2dd, where we avoided installing these files
as executables.

* daemon/Makefile.am (libvirtd.service): Drop chmod.
* tools/Makefile.am (libvirt-guests.service): Likewise.
* src/Makefile.am (virtlockd.service, virtlockd.socket):
Likewise.
(cherry picked from commit 5ec4b22b777b4505d159c6e8d1631d4d774a7be7)

Conflicts:
src/Makefile.am - virtlockd.service not present in 0.10.2

build: use common .in replacement mechanism

We had several different styles of .in conversion in our Makefiles:
ALLCAPS, @ALLCAPS@, @lower@, ::lower::
Canonicalize on one form, to make it easier to copy and paste
between .in files.

Also, we were using some non-portable sed constructs: \@ is an
undefined escape sequence (it happens to be @ itself in GNU sed,
but POSIX allows it to mean something else), as well as risky
behavior (failure to consistently quote things means a space
in $(sysconfdir) could throw things off; also, Autoconf recommends
using | rather than , or ! in the s||| operator, because | has to
be quoted in shell and is therefore less likely to appear in file
names than , or !).

Fix all of these uses to follow the same syntax.

* daemon/libvirtd.8.in: Switch to @var@.
* tools/virt-xml-validate.in: Likewise.
* tools/virt-pki-validate.in: Likewise.
* src/locking/virtlockd.init.in: Likewise.
* daemon/Makefile.am: Prefer | over ! in sed.
(libvirtd.8): Prefer consistent substitution.
(libvirtd.init, libvirtd.service): Avoid non-portable sed.
* tools/Makefile.am (libvirt-guests.sh, libvirt-guests.init)
(libvirt-guests.service): Likewise.
(virt-xml-validate, virt-pki-validate, virt-sanlock-cleanup):
Prefer consistent capitalization.
* src/Makefile.am (virtlockd.init, virtlockd.service)
(virtlockd.socket): Prefer consistent substitution.
(cherry picked from commit 462a69621e232c83990dbe6a711326b671262d47)

Conflicts:
daemon/Makefile.am - drop files not present in 0.10.2
src/Makefile.am - likewise
src/locking/virtlockd.init.in - likewise

tools: Only install guests init script if --with-init-script=redhat

Most of this deals with moving the libvirt-guests.sh script which
does all the work to /usr/libexec, so it can be shared by both
systemd and traditional init. Previously systemd depended on
the script being in /etc/init.d.

Required to fix https://bugzilla.redhat.com/show_bug.cgi?id=789747
(cherry picked from commit d13155c20c0df9595e33c120a68b3544192d6740)

build: fix syntax-check tab violation

* tools/Makefile.am: Fix tab damage in previous patch.
(cherry picked from commit 07049e4c39186368d6851df9707a53525a3f06a0)

build: check for pod errors

Patch 61299a1c fixed a long-standing pod error in the man page.
But we should be preventing these up front.
See also https://bugzilla.redhat.com/show_bug.cgi?id=870273

* tools/Makefile.am (virt-xml-validate.1, virt-pki-validate.1)
(virt-host-validate.1, virt-sanlock-cleanup.8, virsh.1): Reject
pod conversion errors.
* daemon/Makefile.am ($(srcdir)/libvirtd.8.in): Likewise.
(cherry picked from commit 2639949abe732cf683bec48f87ff6c243b608b76)

daemon: Use $(AM_V_GEN) in a few more places

(cherry picked from commit 0801c149080b6c7760bc794bc6ab00a8cdbdaed2)

build: Add libxenctrl to LIBXL_LIBS

Commit dfa1e1dd removed libxenctrl from LIBXL_LIBS, but the libxl
driver uses a symbol from this library. Explicitly link with
libxenctrl instead of relying on the build system to support
implicit DSO linking.
(cherry picked from commit 68e7bc4561783d742d1e266b7f1f0e3516d5117e)