Mark Asselstine [Mon, 24 Sep 2018 15:11:35 +0000 (11:11 -0400)]
lxc_monitor: Avoid AB / BA lock race
A deadlock situation can occur when autostarting a LXC domain 'guest'
due to two threads attempting to take opposing locks while holding
opposing locks (AB BA problem). Thread A takes and holds the 'vm' lock
while attempting to take the 'client' lock, meanwhile, thread B takes
and holds the 'client' lock while attempting to take the 'vm' lock.
Since these threads are scheduled independently and are preemptible it
is possible for the deadlock scenario to occur where each thread locks
their first lock but both will fail to get their second lock and just
spin forever. You get something like:
virLXCProcessAutostartDomain (takes vm lock)
--> virLXCProcessStart
--> virLXCProcessConnectMonitor
--> virLXCMonitorNew
<...>
virNetClientIncomingEvent (takes client lock)
--> virNetClientIOHandleInput
--> virNetClientCallDispatch
--> virNetClientCallDispatchMessage
--> virNetClientProgramDispatch
--> virLXCMonitorHandleEventInit
--> virLXCProcessMonitorInitNotify (wants vm lock but spins)
<...>
--> virNetClientSetCloseCallback (wants client lock but spins)
Neither thread ever gets the lock it needs to be able to continue
while holding the lock that the other thread needs.
The actual window for preemption which can cause this deadlock is
rather small, between the calls to virNetClientProgramNew() and
execution of virNetClientSetCloseCallback(), both in
virLXCMonitorNew(). But it can be seen in real world use that this
small window is enough.
By moving the call to virNetClientSetCloseCallback() ahead of
virNetClientProgramNew() we can close any possible chance of the
deadlock taking place. There should be no other implications to the
move since the close callback (in the unlikely event was called) will
spin on the vm lock. The remaining work that takes place between the
old call location of virNetClientSetCloseCallback() and the new
location is unaffected by the move.
Signed-off-by: Mark Asselstine <mark.asselstine@windriver.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Pavel Hrdina [Wed, 15 Aug 2018 11:10:24 +0000 (13:10 +0200)]
util: introduce vircgroupbackend files
We will need to extract current cgroup v1 implementation into separate
backend because there will be new cgroup v2 implementation and both will
have to co-exist.
Reviewed-by: Fabiano Fidêncio <fidencio@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
This will be required once cgroup v2 is introduced. The cgroup
detection is not simple and we will have multiple backends so we
should not just jump into the middle of the detection code.
In order to use virCgroupNewSelf we need to create all the remaining
data files:
- {name}.cgroups represents /proc/cgroups, it is a list of cgroup
controllers compiled into kernel
- {name}.self.cgroup represents /proc/self/cgroup, it describes
cgroups to which the process belongs
For "no-cgroups" we need to modify the expected behavior because
virCgroupNewSelf() will fail if there are no controllers available.
Reviewed-by: Fabiano Fidêncio <fidencio@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Pavel Hrdina [Mon, 24 Sep 2018 15:17:00 +0000 (17:17 +0200)]
vircgroupmock: rewrite cgroup fopen mocking
Move all the cgroup data into separate files out of vircgroupmock.c
and rework the fopen function to load data from files. This will
make it easier to add more test cases.
Reviewed-by: Fabiano Fidêncio <fidencio@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
John Ferlan [Thu, 20 Sep 2018 20:50:37 +0000 (16:50 -0400)]
lxc: Resolve memory leak
Commit 40b5c99a modified the virConfGetValue callers to use
virConfGetValueString. However, using the virConfGetValueString
resulted in leaking the returned @value string in each case.
So, let's modify each instance to use the VIR_AUTOFREE(char *)
syntax. In some instances changing the variable name since
@value was used more than once.
Found by Coverity
Signed-off-by: John Ferlan <jferlan@redhat.com> Reviewed-by: Erik Skultety <eskultet@redhat.com>
John Ferlan [Fri, 21 Sep 2018 11:26:00 +0000 (07:26 -0400)]
lxc: Remove unnecessary error label
Since lxcConvertSize already creates an error message, there
is no need to use an error: label in lxcSetMemTune to just
overwrite or essentially rewrite the same error. So remove
the label.
Signed-off-by: John Ferlan <jferlan@redhat.com> Reviewed-by: Erik Skultety <eskultet@redhat.com>
John Ferlan [Thu, 20 Sep 2018 21:34:38 +0000 (17:34 -0400)]
tests: Resolve possible overrun
Coverity noted that each of the fmemopen called used the strlen value
in order to allocate space, but that neglected space for terminating
null string. So just add 1 to the strlen.
Signed-off-by: John Ferlan <jferlan@redhat.com> Reviewed-by: Erik Skultety <eskultet@redhat.com>
John Ferlan [Thu, 20 Sep 2018 21:34:36 +0000 (17:34 -0400)]
conf: Alter when ctxt->node is set
In virDomainMemoryDefParseXML and virDomainVideoDefParseXML if
the VIR_ALLOC fails and NULL is returned, then the alteration
to ctxt->node isn't reversed.
Found by Coverity
Signed-off-by: John Ferlan <jferlan@redhat.com> Reviewed-by: Erik Skultety <eskultet@redhat.com>
Shi Lei [Wed, 19 Sep 2018 08:38:19 +0000 (16:38 +0800)]
build-aux:check-spacing: Introduce a new rule to check misaligned stuff in parenthesises
This patch introduces a new rule to check misaligned stuff in parenthesis:
1. For misaligned arguments of function
2. For misaligned conditions of [if|while|switch|...]
There're too much misalignment, so it adds a temporary filter which
permits 'src/util' now. It _should_ be removed as soon as fixing all.
Simon Kobyda [Fri, 21 Sep 2018 14:17:23 +0000 (16:17 +0200)]
virsh: Implement vshTable API to vol-list
Local lengthy unicode-unreliable table formatting was replaced by new
API. Great example of how new API saves space and time.
Removed a lot of string lenght calculation used by the local table.
Simon Kobyda [Fri, 21 Sep 2018 14:17:22 +0000 (16:17 +0200)]
virsh: Implement vshTable API to pool-list
Local lengthy unicode-unreliable table formatting was replaced by new
API. Great example of how new API saves space and time.
Removed a lot of string lenght canculation used by the local table.
Simon Kobyda [Fri, 21 Sep 2018 14:17:17 +0000 (16:17 +0200)]
virsh: Set up cmdDomblkinfo() and cmdDomblkinfoPrint() for vshTable API implementation
I've moved all the printing from cmdDomblkinfoPrint() to cmdDomblkinfo(),
and renamed the cmdDomblkinfoPrint() to cmdDomblkinfoGet(), since nature of
that function changed from gathering and printing informations only to
gathering information. This I believe simplifies the functions and
makes the implementation of vshTable API simpler.
qemu: Update hostdevs device lists before connecting qemu monitor
In a following case:
virsh start $domain
service libvirtd stop
<shutdown> the guest from within the $domain
service libvirtd start
Notice that PCI devices which have been assigned to the $domain will
still be bound to stub drivers instead rebound to host drivers.
In that case the call stack is like below:
libvirtd start
qemuProcessReconnect
qemuProcessStop (because $domain was shutdown without
libvirtd event to process that)
qemuHostdevReAttachDomainDevices
qemuHostdevReAttachPCIDevices
virHostdevReAttachPCIDevices
However, because qemuHostdevUpdateActiveDomainDevices was called
after the qemuConnectMonitor, the setup of the tracking of each
host device in the $domain on either the activePCIHostdevs list
or inactivePCIHostdev list will not occur in an orderly manner.
Therefore, virHostdevReAttachPCIDevices just neglects these host PCI
devices which are bound to stub drivers and doesn't rebind them to
host drivers.
This patch fixs that by moving qemuHostdevUpdateActiveDomainDevices before
qemuConnectMonitor during libvirtd reconnection processing.
Signed-off-by: Wu Zongyong <cordius.wu@huawei.com> Reviewed-by: John Ferlan <jferlan@redhat.com>
Wang Yechao [Fri, 21 Sep 2018 04:35:09 +0000 (12:35 +0800)]
qemu: Introduce qemuDomainRemoveInactiveJobLocked
Create a qemuDomainRemoveInactiveJobLocked which copies
qemuDomainRemoveInactiveJob except of course calling
another new helper qemuDomainRemoveInactiveLocked.
The qemuDomainRemoveInactiveLocked is a copy of
qemuDomainRemoveInactive except that instead of calling
virDomainObjListRemove it calls virDomainObjListRemoveLocked.
Signed-off-by: Wang Yechao <wang.yechao255@zte.com.cn> Reviewed-by: John Ferlan <jferlan@redhat.com>
Wang Yechao [Fri, 21 Sep 2018 04:35:08 +0000 (12:35 +0800)]
qemu: Split up qemuDomainRemoveInactive
Introduce qemuDomainRemoveInactiveJobCommon to handle what will
be the common parts of the code with a new function that will
be used to call virDomainObjListRemoveLocked instead of the
unlocked variant.
Signed-off-by: Wang Yechao <wang.yechao255@zte.com.cn> Reviewed-by: John Ferlan <jferlan@redhat.com>
Andrea Bolognani [Tue, 18 Sep 2018 15:45:40 +0000 (17:45 +0200)]
qemu: Simplify QEMU binary search
Now that we have reduced the number of sensible options down
to either the native QEMU binary or RHEL's qemu-kvm, we can
make virQEMUCapsInitGuest() a bit simpler.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Andrea Bolognani [Tue, 18 Sep 2018 15:26:37 +0000 (17:26 +0200)]
qemu: Don't look for "qemu-kvm" and "kvm" binaries
Both Fedora's qemu-kvm and Debian's/Ubuntu's kvm are nothing
more than paper-thin wrappers around the native QEMU binary,
so we gain nothing by looking for them.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Andrea Bolognani [Tue, 18 Sep 2018 14:55:20 +0000 (16:55 +0200)]
qemu: Expect a single binary in virQEMUCapsInitGuest()
We're only ever passing a single binary when calling this
function, so we can remove all code dealing with the
possibility of a second binary being specified.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Andrea Bolognani [Tue, 18 Sep 2018 14:49:40 +0000 (16:49 +0200)]
qemu: Stop looking after finding the first binary
When the guest is native, we are currently looking at
potential KVM binaries regardless of whether or not we have
already located a QEMU binary suitable to run the guest.
This made sense back when KVM support was not part of QEMU
proper, but these days the KVM binaries are in most cases
just trivial wrapper scripts around the native QEMU binary
so it doesn't make sense to poke at them unless they're
the only binaries on the system, such as when running on
RHEL.
This will allow us to simplify both virQEMUCapsInitGuest()
and virQEMUCapsInitGuestFromBinary().
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Andrea Bolognani [Tue, 18 Sep 2018 14:27:42 +0000 (16:27 +0200)]
qemu: Move armv7l-on-aarch64 special case
When running an armv7l guest on an aarch64 hosts, the
qemu-system-aarch64 binary should be our first choice instead
of qemu-system-arm since the former can take advantage of KVM
acceleration.
Move the special case to virQEMUCapsFindBinaryForArch() so
that it's handled along with all other cases rather than on
its own later on.
Doing so will also make further refactoring easier.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Andrea Bolognani [Fri, 14 Sep 2018 11:10:27 +0000 (13:10 +0200)]
qemu: Don't duplicate binary name in capabilities
virCapabilitiesAddGuestDomain() takes an optional binary
name: this is intended for cases where a certain domain
type can't use the default one registered for the guest
architecture, but has to use a special binary instead.
The current code, however, will pass 'binary' again when
'kvmbin' is not defined, which is unnecessary as 'binary'
has been registered as default earlier, and will result
in capabilities output such as