Boris Fiuczynski [Thu, 17 Mar 2022 09:48:29 +0000 (10:48 +0100)]
nodedev: update mdevs on parent change
The parent of the mdev definition can change due to the existence of the
parent device. The parent's existence can e.g. depend on the device
driver load state.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Michal Privoznik [Thu, 17 Mar 2022 08:19:39 +0000 (09:19 +0100)]
virnetdev: Use VIR_WITH_MUTEX_LOCK_GUARD in virNetDevGenerateName()
The virNetDevGenerateName() function uses a global array of
virNetDevGenName structs to find the next unused name for a network
device. This obviously needs some locking and in fact each member
of the array has its own lock. However, these members are not
virObjects, they are just plain structs, therefore
VIR_WITH_MUTEX_LOCK_GUARD() must be used instead of
VIR_WITH_OBJECT_LOCK_GUARD() to lock individual mutexes.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
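As a rough illustration of the pattern described above, here is a minimal sketch of a scoped mutex guard over plain structs that each carry their own lock. It assumes pthreads and the GCC/Clang 'cleanup' attribute; lock_guard, WITH_MUTEX_LOCK_GUARD, gen_name and next_id are hypothetical names for this sketch and are not libvirt's actual definitions.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical guard type for this sketch; not libvirt's own definition. */
typedef struct {
    pthread_mutex_t *mutex;
} lock_guard;

static lock_guard lock_guard_lock(pthread_mutex_t *m)
{
    int rc = pthread_mutex_lock(m);
    assert(rc == 0);
    (void)rc;
    return (lock_guard){ .mutex = m };
}

/* Unlocks at most once; later calls are no-ops. */
static void lock_guard_unlock(lock_guard *g)
{
    if (g->mutex) {
        pthread_mutex_unlock(g->mutex);
        g->mutex = NULL;
    }
}

/* Run the following block once with the mutex held. The cleanup
 * attribute (a GCC/Clang extension) also unlocks on 'return' or 'break'. */
#define WITH_MUTEX_LOCK_GUARD(m) \
    for (lock_guard guard_ __attribute__((cleanup(lock_guard_unlock))) = \
             lock_guard_lock(m); \
         guard_.mutex; \
         lock_guard_unlock(&guard_))

/* Plain structs, each carrying its own mutex (illustrative only). */
typedef struct {
    pthread_mutex_t lock;
    int last_id;
} gen_name;

static gen_name gen_names[] = {
    { PTHREAD_MUTEX_INITIALIZER, -1 },
    { PTHREAD_MUTEX_INITIALIZER, -1 },
};

/* Hand out the next ID for one entry while holding only that entry's lock. */
static int next_id(size_t type)
{
    assert(type < sizeof(gen_names) / sizeof(gen_names[0]));
    WITH_MUTEX_LOCK_GUARD(&gen_names[type].lock) {
        return ++gen_names[type].last_id;
    }
    return -1; /* not reached */
}
```

Because the guard works on a bare mutex pointer, it does not require the locked thing to be a reference-counted object, which is the distinction the message above draws.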
qemu: domainjob: Allow InitJob if cb is not set in qemuDomainObjInitJob()
This allows initializing the job even if the cb structure is not
set. This patch also includes a slight rewrite of the function to
make it look cleaner when freeing resources, by allocating
privateData at the end.
Signed-off-by: Kristina Hanicova <khanicov@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
qemu: domainjob: Allow operations if cb is not set in job structure
We should allow resetting / freeing / restoring / parsing /
formatting qemuDomainJobObj even if 'cb' attribute is not set.
This is theoretical for now, but in the future the attribute will
not always be set. It is sufficient to check if 'cb' exists
before dereferencing it.
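The check described above could look roughly like the following sketch. The struct layout and all names (job_callbacks, job_obj, job_reset, count_reset) are hypothetical stand-ins and do not mirror the real qemuDomainJobObj.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback table; the real structure has different members. */
typedef struct {
    void (*reset_private)(void *priv);
} job_callbacks;

typedef struct {
    job_callbacks *cb;      /* may be NULL */
    void *private_data;
    int active;
} job_obj;

/* Reset the job; the private data hook runs only when callbacks exist. */
static void job_reset(job_obj *job)
{
    assert(job != NULL);
    job->active = 0;
    if (job->cb && job->cb->reset_private)
        job->cb->reset_private(job->private_data);
}

/* Example hook for the sketch: counts how often it was invoked. */
static void count_reset(void *priv)
{
    ++*(int *)priv;
}
```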
Michal Privoznik [Tue, 15 Mar 2022 11:45:54 +0000 (12:45 +0100)]
qemu_cgroup: Don't deny devices from cgroupDeviceACL
On domain startup a couple of devices are allowed in the devices
controller no matter the domain configuration. The aim is to
allow devices that are crucial for QEMU or one of its libraries,
or that the user passes through (e.g. through additional cmd line
arguments) and wants QEMU to access.
However, during unplug it may happen that a device is configured
to use one of such devices and since we deny /dev nodes on
hotunplug we would deny such device too. For example,
/dev/urandom is on the list of implicit devices and users
can hotplug and hotunplug an RNG device with /dev/urandom as
backend.
The fix is fortunately simple - just consult the list of implicit
devices before denying the device in the devices controller.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Michal Privoznik [Tue, 15 Mar 2022 11:37:44 +0000 (12:37 +0100)]
qemu_cgroup: Drop ENOENT special case for RNG devices
When allowing or denying an RNG device in CGroups there's a special
check for a missing backend device (errno == ENOENT) in which
case success is returned to caller. This is in contrast with the
rest of the functions and in fact wrong too - if the backend
device doesn't exist then QEMU will fail opening it. Might as
well signal error here.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Michal Privoznik [Mon, 14 Mar 2022 12:35:15 +0000 (13:35 +0100)]
qemu_namespace: Be less aggressive in removing /dev nodes from namespace
When creating /dev nodes in a QEMU domain's namespace the first
thing we simply do is unlink() the path and create it again. This
aims to solve the case when a file changed type/major/minor in
the host and thus we need to reflect this in the guest's
namespace. Fair enough, except we can be a bit more clever about
it: first check whether the path already exists with the correct
type/major/minor and do the unlink+creation only if it does not.
Currently, this is implemented only for symlinks and
block/character devices. For regular files/directories (which are
less common) this might be implemented one day, but not today.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Michal Privoznik [Mon, 14 Mar 2022 14:05:11 +0000 (15:05 +0100)]
qemu_namespace: Don't unlink paths from cgroupDeviceACL
When building the namespace for a domain there are a couple of
devices that are created independent of domain config (see
qemuDomainPopulateDevices()). The idea is that these devices are
crucial for QEMU or one of its libraries, or that the user is
passing through a device and wants us to create it in the
namespace too. That's the reason that these devices are allowed
in the devices CGroup controller as well.
However, during unplug it may happen that a device is configured
to use one of such devices and since we remove /dev nodes on
hotunplug we would remove such device too. For example,
/dev/urandom is on the list of implicit devices and users
can hotplug and hotunplug an RNG device with /dev/urandom as
backend.
The fix is fortunately simple - just consult the list of implicit
devices before removing the device from the namespace.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
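The "consult the list first" step can be sketched as below. The contents of implicit_devices are only an assumed example for illustration (the effective list depends on configuration), and is_implicit_device() and should_remove_on_unplug() are hypothetical helper names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed example list; the effective list depends on configuration. */
static const char *const implicit_devices[] = {
    "/dev/null", "/dev/zero", "/dev/random", "/dev/urandom", NULL
};

/* True when 'path' is one of the always-present devices. */
static bool is_implicit_device(const char *path)
{
    assert(path != NULL);
    for (size_t i = 0; implicit_devices[i]; i++) {
        if (strcmp(implicit_devices[i], path) == 0)
            return true;
    }
    return false;
}

/* On unplug, only paths that are not implicitly needed may be dropped. */
static bool should_remove_on_unplug(const char *path)
{
    return !is_implicit_device(path);
}
```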
Michal Privoznik [Sat, 12 Mar 2022 04:41:56 +0000 (05:41 +0100)]
virsh: Don't open code virshEnumComplete()
Now that we have a function that generates string list for given
enum, let's use that instead of open coding it.
Note, after this there are still some 'candidates' left (e.g.
virshNetworkEventNameCompleter(), or
virshNetworkUpdateCommandCompleter()). These are not converted
because either they don't have a convenient int2str function or
they don't start from the very beginning of the enum.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Michal Privoznik [Sat, 12 Mar 2022 04:37:50 +0000 (05:37 +0100)]
virsh: Introduce virshEnumComplete()
We have plenty of completers which iterate over all values of
given enum and do nothing more than translate every member into
string (using corresponding virXXXTypeToString()).
Introduce a convenience function so that callers can pass just
VIR_XXX_LAST and virXXXTypeToString and the rest is taken care
of.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
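A sketch of such a helper in plain C follows. The signature of enum_complete() is an assumption modelled on the description above, the demo_mode enum and its to-string function are invented for the example, and the sketch assumes the to-string function returns a non-NULL string for every value below 'last'. Note the one extra slot that keeps the returned list NULL terminated.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Invented enum standing in for any VIR_XXX type with a _LAST sentinel. */
typedef enum { DEMO_MODE_SERVER, DEMO_MODE_CLIENT, DEMO_MODE_LAST } demo_mode;

static const char *demo_mode_to_string(int mode)
{
    switch (mode) {
    case DEMO_MODE_SERVER: return "server";
    case DEMO_MODE_CLIENT: return "client";
    default: return NULL;
    }
}

/* Free a NULL terminated string list. */
static void string_list_free(char **list)
{
    if (!list)
        return;
    for (size_t i = 0; list[i]; i++)
        free(list[i]);
    free(list);
}

/* Translate values 0..last-1 into a NULL terminated string list. */
static char **enum_complete(unsigned int last, const char *(*to_string)(int))
{
    char **ret;

    assert(to_string != NULL);
    /* last + 1 slots: calloc() zeroes them, so the final one stays NULL. */
    ret = calloc((size_t)last + 1, sizeof(*ret));
    if (!ret)
        return NULL;

    for (unsigned int i = 0; i < last; i++) {
        const char *name = to_string((int)i);
        size_t len;

        assert(name != NULL);
        len = strlen(name) + 1;
        ret[i] = malloc(len);
        if (!ret[i]) {
            string_list_free(ret);
            return NULL;
        }
        memcpy(ret[i], name, len);
    }
    return ret;
}
```

A caller would then only pass the sentinel and the translation function, e.g. enum_complete(DEMO_MODE_LAST, demo_mode_to_string), and release the result with string_list_free().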
Michal Privoznik [Fri, 11 Mar 2022 08:13:56 +0000 (09:13 +0100)]
virsh: Properly terminate string list in virshDomainInterfaceSourceModeCompleter()
A completer must return a NULL terminated list of strings, which
means that when dealing with enums, it has to allocate one
pointer more than the value of VIR_XXX_LAST. But this is not
honoured in virshDomainInterfaceSourceModeCompleter() leading to
out of bounds read.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Peter Krempa [Thu, 10 Mar 2022 11:59:30 +0000 (12:59 +0100)]
qemu: migration: Use 'VIR_MIGRATE_PARAM_TLS_DESTINATION' for the NBD connection
The NBD connection for non-shared storage migration can have the same
issue regarding TLS certificate name match as the migration connection
itself.
Propagate the configured name also for the NBD connections.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1901394 Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Thu, 10 Mar 2022 09:05:53 +0000 (10:05 +0100)]
conf: Add support for setting expected TLS hostname for NBD disks
In cases when the hostname of the NBD server doesn't match the hostname
in the TLS certificate the new attribute 'tlsHostname' can be used to
override it.
Add the XML infrastructure and tests.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Notable changes:
- 'tls-hostname' field for NBD client to override local hostname
- machine types 'pc-i440fx-1.7' and older are now deprecated
- 'snapshot-access' block driver added
- The 'protocol' field of 'set_password' and 'expire_password'
parameter is now an enum instead of a pure string allowing 'vnc' and
'spice' as value and the arguments are also covered by the schema.
- 'copy-before-write' block driver now has a 'bitmap' property
- 'query-migrate' now reports 'precopy-bytes', 'downtime-bytes',
'postcopy-bytes' for 'ram' and 'disk' statistics
- RTC_CHANGE event now has a 'qom-path' property to identify the RTC
- 'umip' cpu feature is now migratable
- SGX property 'section-size' reinstated after regression
Changes in build setting:
- fuse block export support now enabled
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Wed, 9 Mar 2022 14:43:56 +0000 (15:43 +0100)]
virDomainSnapshotDefParse: Decouple parsing of memory snapshot config
Separate the steps of parsing the memory snapshot config from the
post-processing and validation code. The upcoming patch refactoring the
parsing will be simpler.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Tue, 8 Mar 2022 17:20:46 +0000 (18:20 +0100)]
conf: snapshot: Remove VIR_DOMAIN_SNAPSHOT_PARSE_DISKS flag
All callers except the one in the 'esx' driver pass the flag. The 'esx'
driver has a check that 'def->ndisks' is zero after parsing the
definition. This means that we can simply always parse the disks.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Tue, 8 Mar 2022 17:08:54 +0000 (18:08 +0100)]
qemuDomainSnapshotForEachQcow2Raw: Act only on internal snapshots
Similarly to the external snapshot code the internal inactive snapshot
creation helper should act only when an internal snapshot of the disk is
required. For now the callers ensure that it's either _INTERNAL or _NO
when control reaches this function.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Fri, 4 Mar 2022 13:34:02 +0000 (14:34 +0100)]
qemuSnapshotDiskPrepareActiveExternal: Handle only external snapshots
Preparation steps ensure that the 'snapshot' field can only be
'VIR_DOMAIN_SNAPSHOT_LOCATION_NONE' or
'VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL' at this point, but upcoming
patches will change that. Handle only external snapshots.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Haonan Wang [Thu, 10 Mar 2022 15:22:24 +0000 (23:22 +0800)]
virsh: Provide completer for vol-wipe algorithms
Related issue: https://gitlab.com/libvirt/libvirt/-/issues/9
Signed-off-by: Haonan Wang <hnwanga1@gmail.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The security label setting for the external images is part of the
'source' element and documented there. Remove the empty definition added
accidentally in commit ac88a8cfad1.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Wed, 9 Mar 2022 08:40:31 +0000 (09:40 +0100)]
docs: formatsnapshot: Remove explicit listing of supported snapshot formats
In blockdev mode we support creating snapshots on all kinds of storage
on which qemu allows us to format the image. Drop the part of the sentence
enumerating explicitly supported protocols.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Wed, 9 Mar 2022 08:34:40 +0000 (09:34 +0100)]
docs: formatsnapshot: Move paragraphs describing 'disk' element together
There was another paragraph describing the attribute 'type' of the
'disk' element under the description of the subelements. Move it to the
top to get all relevant information in one place.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
When using the 'run' script to launch a daemon, it is intended to
temporarily stop the systemd units and re-start them again afterwards.
When using this script over an SSH connection, it will get SIGHUP
if the connection goes away, and in this case it fails to re-start
the systemd units. We need to catch SIGHUP and turn it into a
normal python exception. For good measure we do the same for
SIGQUIT and SIGTERM too. SIGINT already gets turned into an
exception by default which we handle.
Reviewed-by: Ján Tomko <jtomko@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Currently the 'run' script modifies $PATH to add the 'tools'
directory to pick up client programs. It fails to add the 'src'
directory to pick up the daemons.
Reviewed-by: Ján Tomko <jtomko@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
conf: remove misleading comments about access being 'lockless'
For the various structs storing lists of objects, the access
to the hash tables is not lockless. The mutex on the object
owning the hash table must be held.
Reviewed-by: Ján Tomko <jtomko@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
We are not guaranteed that the string we are printing onto stdout
contains '\n' and thus that stdout is flushed. In fact, I ran into
this problem when virsh asked me whether I wanted to edit the
domain XML again (vshAskReedit()): the prompt wasn't displayed
(as it does not contain a newline character), so virsh just sat
there waiting for my input while I sat there waiting for virsh's
output. Flush stdout after all fputs()-s which do not flush
stdout.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
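The pattern behind this fix, shown as a small standalone sketch. print_prompt() is a hypothetical helper, not the function changed in virsh; it only illustrates pairing fputs() with an explicit fflush() when the text has no trailing newline.

```c
#include <assert.h>
#include <stdio.h>

/* Print 'msg' and flush, so a prompt without '\n' becomes visible
 * before the program blocks waiting for input. Returns 0 on success. */
static int print_prompt(FILE *out, const char *msg)
{
    assert(out != NULL && msg != NULL);
    if (fputs(msg, out) == EOF)
        return -1;
    return fflush(out) == EOF ? -1 : 0;
}
```

Without the fflush() call, line-buffered or fully buffered output can stay in the stdio buffer until a newline or buffer fill, which matches the hang described above.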
A bit of effort by me and Michal helped make this the case, and it helped us
uncover some potential issues. I am not documenting it as supported or adding
an Alpine container into the CI, but since there were some distribution bugs
mentioning libvirt issues I think it would be nice of us to notify those
distribution maintainers that read our release news.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Linux netfilter at some point (Linux 2.6.39) inverted the meaning of
'--ctdir reply' and newer netfilter implementations now expect
'--ctdir original' instead and vice-versa.
We check for the kernel version and assume that all Linux kernels with version
2.6.39 or newer have the inverted logic.
Any distro backporting the Linux kernel patch that inverts the --ctdir logic
(Linux commit 96120d86f) must also backport this patch for Linux and
adapt the kernel version being tested for.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
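The version test described above might be sketched as follows. Both helpers are hypothetical: parse_kernel_version() assumes a release string that starts with three dot-separated numbers and encodes it as major * 1000000 + minor * 1000 + micro, and ctdir_for_kernel() takes the direction in its pre-2.6.39 meaning and swaps it for newer kernels.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Encode "X.Y.Z..." as X * 1000000 + Y * 1000 + Z, or -1 on parse error. */
static long parse_kernel_version(const char *release)
{
    unsigned int major, minor, micro;

    assert(release != NULL);
    if (sscanf(release, "%u.%u.%u", &major, &minor, &micro) != 3)
        return -1;
    return (long)major * 1000000 + (long)minor * 1000 + (long)micro;
}

/* 'wanted' uses the pre-2.6.39 meaning; swap it on kernels >= 2.6.39. */
static const char *ctdir_for_kernel(const char *wanted, long version)
{
    bool inverted = version >= 2006039;   /* 2.6.39 */

    assert(wanted != NULL);
    if (!inverted)
        return wanted;
    return strcmp(wanted, "reply") == 0 ? "original" : "reply";
}
```

A distro that backports the kernel change, as mentioned above, would need a different threshold than the 2.6.39 constant used in this sketch.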
Given our supported platform targets, we no longer need to
consider a version of Linux before 2.6.39, so can drop
support for the old direction behaviour.
The test suite updates are triggered because it never
probed for the ctdir direction, and so the iptables syntax
generator unconditionally dropped the ctdir args.
Reviewed-by: Laine Stump <laine@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Since iptables version 1.4.16 '-m state --state NEW' is converted to
'-m conntrack --ctstate NEW'. Therefore, when encountering this or later
versions of iptables use '-m conntrack --ctstate'.
Given our supported platform targets, we no longer need to
consider a version of iptables before 1.4.16, so can drop
support for the old syntax.
The test suite updates are triggered because it never
probed for the new syntax, and so unconditionally
generated the old syntax.
Reviewed-by: Laine Stump <laine@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Peter Krempa [Mon, 7 Mar 2022 15:02:55 +0000 (16:02 +0100)]
docs: Convert 'governance' page to rST
Extra care is taken to preserve the 'codeofconduct' anchor which is used
in our page template. An upcoming patch will change that but we'll retain
the anchor.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
libvirt-qemu: Don't allow NULL cmd in virDomainQemuMonitorCommandWithFiles()
Nothing in daemon code is prepared for the command in
virDomainQemuMonitorCommandWithFiles() to be NULL. In fact, the
client side doesn't expect this either as our RPC describes the
argument as:
remote_nonnull_string cmd;
Validate the argument in the public API implementation.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
nwfilter: hold filter update lock when creating/deleting bindings
The nwfilter update lock is historically acquired by the virt
drivers in order to achieve serialization between nwfilter
define/undefine, and instantiation/teardown of filters.
When running in the modular daemons, however, the mutex that
the virt drivers are locking is in a completely different
process from the mutex that the nwfilter driver is locking.
Serialization is lost and thus a call from the virt driver to
virNWFilterBindingCreateXML can deadlock with a concurrent
call to the virNWFilterDefineXML method.
The solution is surprisingly easy: the update lock simply
needs to be acquired in the virNWFilterBindingCreateXML method
and virNWFilterBindingUndefine method instead of in the
virt drivers.
The only semantic difference here is that when a virtual
machine has multiple NICs, the instantiation and teardown
of filters is no longer serialized for the whole VM, but
rather for each NIC. This should not be a problem since
the virt drivers already need to cope with tearing down
a partially created VM where only some of the NICs are
set up.
Reviewed-by: Laine Stump <laine@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Michal Privoznik [Tue, 22 Feb 2022 14:21:49 +0000 (15:21 +0100)]
meson: Detect newer fuse
Now that we have support for fuse-3 we can detect it during the
configure phase. Even better, we can detect fuse-3 first and
fall back to the old fuse only if the newer version doesn't exist.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Michal Privoznik [Mon, 13 Apr 2020 11:17:47 +0000 (13:17 +0200)]
lxc_fuse: Implement support for FUSE3
Plenty of projects are switching from FUSE to FUSE3. This commit
enables libvirt to compile with the newer fuse-3.1, which allows
users to have just one fuse package on their systems and allows us
to set O_CLOEXEC on the fuse session FD. In general, FUSE3 offers
more features, but apparently we don't need them right now. There
is a rewrite guide at [1] but I took most inspiration from sshfs
[2].