Stefan Berger [Tue, 13 Jul 2021 18:38:32 +0000 (14:38 -0400)]
virt-aa-helper: Allow swtpm to fsync on dir
Allow swtpm (0.7.0 or later) to fsync on the directory where it writes
its state files into so that "the entry in the directory containing the
file has also reached disk" (fsync(2)).
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Neal Gompa <ngompa13@gmail.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Tim Wiederhake [Tue, 6 Jul 2021 12:37:58 +0000 (14:37 +0200)]
qemuMonitorGetChardevInfo: Use automatic memory management
Signed-off-by: Tim Wiederhake <twiederh@redhat.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Jim Fehlig [Wed, 30 Jun 2021 00:11:29 +0000 (18:11 -0600)]
libxl: Add helper function for running the hook script
The same pattern of retrieving the domXML, running the hook script, and
checking for error is used throughout the libxl driver. Remove some
repetitive code by adding a helper function to perform these tasks.
Signed-off-by: Jim Fehlig <jfehlig@suse.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Jim Fehlig [Tue, 29 Jun 2021 23:47:41 +0000 (17:47 -0600)]
libxl: Introduce libxlDomainStartPerform
Introduce libxlDomainStartPerform as part of decomposing libxlDomainStart.
Perform all operations that are part of starting a domain. On error the
domain is destroyed from libxl's perspective, but the operations perfomed
in libxlDomainStartPrepare must be unwound by libxlDomainStart.
Signed-off-by: Jim Fehlig <jfehlig@suse.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Jim Fehlig [Tue, 29 Jun 2021 23:32:37 +0000 (17:32 -0600)]
libxl: Introduce libxlDomainStartPrepare
Introduce libxlDomainStartPrepare as part of decomposing libxlDomainStart.
Perform all prepratory operations such as hostdevs, network devs, etc.
Also ensure all such operations are properly unwound on error.
Signed-off-by: Jim Fehlig <jfehlig@suse.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Jim Fehlig [Wed, 17 Feb 2021 21:13:24 +0000 (14:13 -0700)]
libxl: Move managed save logic to libxlDomainStartNew
the logic to check for existence of a managed save image and use it to
start the VM can be moved to libxlDomainStartNew. libxlDomainStart has
become unwieldy and this is a small step to make it more manageable.
Signed-off-by: Jim Fehlig <jfehlig@suse.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Fri, 2 Jul 2021 12:17:57 +0000 (14:17 +0200)]
qemu: migration: Use correct flag constant for enabling storage migration
The 'storageMigration' flag is supposed to be true if storage migration
is requested, which is based on VIR_MIGRATE_NON_SHARED_DISK or
VIR_MIGRATE_NON_SHARED_INC flags. The assignment to the variable used
QEMU_MONITOR_MIGRATE_NON_SHARED_INC (0x04) instead of
VIR_MIGRATE_NON_SHARED_INC (0x80), caused libvirtd to skip the actual
copy of data.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1978526 Fixes: da69f4b2084bff140238e450e264d6036ebef898 Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Thu, 1 Jul 2021 14:03:56 +0000 (16:03 +0200)]
storage_source: Add flag storing whether threshold event was registered with index
When users register the threshold event for the top level image with an
explicit index (e.g. vda[3]) they are clearly expecting the index in the
event.
This flag will help avoiding emission of the second event without the
index when the client clearly requested one with the index.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
qemu: interface: check and use ovs command to set qos of ovs managed port
When qos is set or delete, we have to check if the port is an ovs managed
port. If true, call the virNetDevOpenvswitchInterfaceSetQos function when qos
is set, and call the virNetDevOpenvswitchInterfaceClearQos function when
the interface is to be destroyed.
Signed-off-by: Jinsheng Zhang <zhangjl02@inspur.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
virDomain: interface: add virNetDevOpenvswitchInterfaceSetQos and virNetDevOpenvswitchInterfaceClearQos
Introduce qos setting and cleaning method. Use ovs command to set qos
parameters on specific interface of qemu virtual machine.
When an ovs port is created, we add 'ifname' to external-ids. When setting
qos on an ovs port, query its qos and queue. If found, change qos on queried
queue and qos, otherwise create new queue and qos. When cleaning qos, query
and clean queues and qos in ovs table record by 'ifname' and 'vmid'.
Signed-off-by: Jinsheng Zhang <zhangjl02@inspur.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
qemu: remove default audio backend for migratable XML
When seeing a guest with a sound device, and no audio backend, we
automatically add an audio backend XML element based on the historical
QEMU driver behaviour. Unfortunately when we live migrate back to an
old libvirt, it may not understand the audio driver type we configured.
We thus need to strip the default audio backend when migrating.
Fixes https://gitlab.com/libvirt/libvirt/-/issues/179 Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
vircgroup: Improve virCgroupControllerAvailable wrt to CGroupsV2
It all started as a simple bug: trying to move domain memory
between NUMA nodes (e.g. via virsh numatune) did not work. I've
traced the problem to qemuProcessHook() because that's where we
decide whether to rely on CGroups or use numactl APIs to satisfy
<numatune/>. The problem was that virCgroupControllerAvailable()
was telling us that cpuset controller is unavailable. This is
CGroupsV2, and pretty weird because CGroupsV2 definitely do
support cpuset controller and I had them mounted in a standard
way. What I found out (with Pavel's help) was that
virCgroupNewSelf() was looking into the following path to detect
supported controllers:
/sys/fs/cgroup/system.slice/cgroup.controllers
However, if there's no other VM running then the system.slice
only has 'memory' and 'pids' controllers. Therefore, we saw
'cpuset' as not available. The fix is to look at the top most
path, which has the full set of controllers:
/sys/fs/cgroup/cgroup.controllers
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1976690 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
schemas: Allow cache attribute for bandwidth element for HMAT
Turns out, when introducing HMAT support in v6.6.0-rc1~249
I've forgot to allow "cache" attribute for <bandwidth/> element
in RNG. It's parsed and formatted, but schema does not allow it.
Fixes: a89bbbac86383a10be0cec5a93feb7ed820871eb
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1980162 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Jim Fehlig [Wed, 30 Jun 2021 22:36:42 +0000 (16:36 -0600)]
virtlockd: Don't report error if lockspace exists
When the qemu or libxl driver is configured to use lockd and
file_lockspace_dir is set, virtlockd emits an error when libvirtd
is retarted
May 25 15:44:31 virt81 virtlockd[7723]: Requested operation is not
valid: Lockspace for path /data/libvirtd/lockspace already exists
There is really no need to fail when the lockspace already exists,
paricularly since the user is expected to create the lockspace
specified in file_lockspace_dir. Failure to do so will prevent
starting any domains
Also, virLockManagerLockDaemonSetupLockspace already has logic to ignore
the error. Since callers are not interested in the error, change
virtlockd to not report or return an error when the specified lockspace
already exists.
Signed-off-by: Jim Fehlig <jfehlig@suse.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Michal Privoznik [Mon, 21 Jun 2021 15:14:15 +0000 (17:14 +0200)]
qemu: Don't use memory-backend-memfd for NVDIMMs
If guest is configured to use memfd then the function that build
memory-backend-* part of command line will put
memory-backend-memfd, always. Even for NVDIMMs. This is not
correct, because NVDIMMs need a backing path (usually to a real
host NVDIMM device). Therefore, regardless of memfd being
requested, we have to stick with memory-backend-file.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Michal Privoznik [Fri, 25 Jun 2021 13:51:52 +0000 (15:51 +0200)]
virDomainMachineNameAppendValid: Handle special characters better
When constructing guest name for machined we have to be very
cautious as machined expects a name that's basically a valid URI.
Therefore, if there's a dot it has to be followed by a letter or
a number. And if there's a sequence of two or more dashes they
should be joined into a single dash. These rules are implemented
in virDomainMachineNameAppendValid(). There's the @skip variable
which is supposed to track whether it is safe to append a dot or
a dash into name. However, the variable is set to false (meaning
it is safe to append a dot or a dash) even if the current
character we are processing is not in the set of allowed
characters (and thus skipped over).
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1948433 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Vinayak Kale [Fri, 2 Jul 2021 07:23:15 +0000 (12:53 +0530)]
virresctrl: Fix updating the mask for a cache resource
In 'virResctrlAllocUpdateMask', mask is updated only if 'previous mask' is NULL.
By default, the bitmask for a cache resource for a VM is initialized with
'default-resctrl-group' bitmask. So the 'previous mask' would not be NULL and
mask won't get updated if cachetune is configured for a VM. This causes libvirt
to use same bitmask as 'default-resctrl-group' bitmask for a cache resource for
a VM. This patch fixes the issue.
Fixes: d8a354954aba9cd45ab0317915a0a2be27c04767 Signed-off-by: Vinayak Kale <vkale@nvidia.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: Ján Tomko <jtomko@redhat.com>
Michal Privoznik [Thu, 24 Jun 2021 14:58:53 +0000 (16:58 +0200)]
virSetUIDGIDWithCaps: Assume PR_CAPBSET_DROP is always defined
Bounding set capabilities were introduced in kernel commit of
v2.6.25-rc1~912. I guess it is safe to assume that all Linux
hosts we ran on have at least that version or newer.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Jonathon Jongsma [Tue, 22 Jun 2021 19:53:36 +0000 (14:53 -0500)]
nodedev: improve error message when destroying an inactive device
When trying to destroy a node device that is not active, we end up with
a confusing error message:
# nodedev-destroy mdev_88a6b868_46bd_4015_8e5b_26107f82da38
error: Failed to destroy node device 'mdev_88a6b868_46bd_4015_8e5b_26107f82da38'
error: failed to access '/sys/bus/mdev/devices/88a6b868-46bd-4015-8e5b-26107f82da38/iommu_group': No such file or directory
With this patch, the error is more clear:
# nodedev-destroy mdev_88a6b868_46bd_4015_8e5b_26107f82da38
error: Failed to destroy node device 'mdev_88a6b868_46bd_4015_8e5b_26107f82da38'
error: Requested operation is not valid: Device 'mdev_88a6b868_46bd_4015_8e5b_26107f82da38' is not active
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Jonathon Jongsma [Tue, 22 Jun 2021 19:53:35 +0000 (14:53 -0500)]
nodedev: handle mdevctl errors consistently
Currently, we have three different types of mdevctl errors:
1. the command cannot be constructed ecause of unsatisfied
preconditions
2. the command cannot be executed due to some error
3. the command is executed, but returns an error status
These different failures are handled differently. Some cases set an
error and return and error status, and some return a error message but
do not set an error.
This means that the caller has to check both whether the return value is
negative and whether the errmsg parameter is non-NULL before deciding
whether to report the error or not. The situation is further complicated
by the fact that there are occasional instances where mdevctl exits with
an error status but does not print an error message. This results in
errmsg being an empty string "" (i.e. non-NULL).
Simplify the situation by ensuring that virReportError() is called for
all error conditions rather than returning an error message back to the
calling function.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Jonathon Jongsma [Tue, 22 Jun 2021 19:53:34 +0000 (14:53 -0500)]
nodedev: add macro to handle command errors
This macro will be utilized in the following patch. Since mdevctl
commands can fail with or without an error message, this macro makes it
easy to print a fallback error in the case that the error message is not
set.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Jonathon Jongsma [Tue, 22 Jun 2021 19:53:33 +0000 (14:53 -0500)]
nodedev: Handle NULL command variable
In commit 68580a51, I removed the checks for NULL cmd variables because
virCommandRun() already handles the case where it is called with a NULL
cmd. Unfortunately, it handles this case by raising a generic error
which is both unhelpful and overwrites our existing error message. So
for example, when I attempt to create a mediated device with an invalid
parent, I get the following output:
virsh # nodedev-create mdev-test.xml
error: Failed to create node device from mdev-test.xml
error: internal error: invalid use of command API
With this patch, I now get a useful error message again:
virsh # nodedev-create mdev-test.xml
error: Failed to create node device from mdev-test.xml
error: internal error: unable to find parent device 'pci_0000_00_03_0'
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Jonathon Jongsma [Thu, 10 Jun 2021 18:15:37 +0000 (13:15 -0500)]
nodedev: handle mdevs from multiple parents
Due to a rather unfortunate misunderstanding, we were parsing the list
of defined devices from mdevctl incorrectly. Since my primary
development machine only has a single device capable of mdevs, I
apparently neglected to test multiple parent devices and made some
assumptions based on reading the mdevctl code. These assumptions turned
out to be incorrect, so the parsing failed when devices from more than
one parent device were returned.
The details: mdevctl returns an array of objects representing the
defined devices. But instead of an array of multiple objects (with each
object representing a parent device), the array always contains only a
single object. That object has a separate property for each parent
device.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
It is possible to define/edit(in shut off state) a domain XML with
same hostdev device repeated more than once, as shown below. This
behavior is not expected. So, this patch fixes it.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Tested-by: Liu Yiding <liuyd.fnst@fujitsu.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Andrea Bolognani [Fri, 25 Jun 2021 13:57:50 +0000 (15:57 +0200)]
tests: Test the defaults for TPM on ARM virt guests
Instead of providing the configuration explicitly, let libvirt
fill in the blanks. After the recent changes, this results in a
working configuration without the need for user input.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Tested-by: Liu Yiding <liuyd.fnst@fujitsu.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Andrea Bolognani [Fri, 25 Jun 2021 13:57:38 +0000 (15:57 +0200)]
qemu: Default to TPM 2.0 for ARM virt guests
The TPM 2.0 specification predates ARM virtualization, and so
implementing TPM 1.2 support on ARM was not considered a useful
endeavor.
This is technically a breaking change, but TPM support on ARM was
only introduced fairly recently (libvirt 7.1.0) and the previous
default resulted in non working TPM devices; anyone who has a
working configuration is not going to be affected.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Tested-by: Liu Yiding <liuyd.fnst@fujitsu.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Andrea Bolognani [Fri, 25 Jun 2021 13:31:35 +0000 (15:31 +0200)]
tests: Add aarch64-tpm test to qemuxml2xml
We're going to change the input file later, and having this
additional coverage will demonstrate that such a change does not
alter the behavior.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Tested-by: Liu Yiding <liuyd.fnst@fujitsu.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Andrea Bolognani [Fri, 25 Jun 2021 13:16:30 +0000 (15:16 +0200)]
docs: Fix information for default TPM version
The current information is not accurate, because the default
is 2.0 instead of 1.2 for the tpm-crb and tpm-spapr models.
Any detailed list will surely become obsolete and out of sync
with reality over time, so let's just document that the default
model depends on a number of factors and avoid getting any more
specific than that.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Tested-by: Liu Yiding <liuyd.fnst@fujitsu.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
A process can access a file if the set of MCS categories
for the file is equal-to *or* a subset-of, the set of
MCS categories for the process.
If there are two VMs:
a) svirt_t:s0:c117
b) svirt_t:s0:c117,c720
Then VM (b) is able to access files labelled for VM (a).
IOW, we must discard case where the categories are equal
because that is a subset of many other valid category pairs.
Fixes: https://gitlab.com/libvirt/libvirt/-/issues/153
CVE-2021-3631 Reviewed-by: Peter Krempa <pkrempa@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Michal Privoznik [Thu, 24 Jun 2021 14:58:09 +0000 (16:58 +0200)]
virSetUIDGIDWithCaps: Don't drop CAP_SETPCAP right away
There are few cases where we execute a virCommand with all caps
cleared (virCommandClearCaps()). For instance
dnsmasqCapsRefreshInternal() does just that. This means, that
after fork() and before exec() the virSetUIDGIDWithCaps() is
called. But since the caller did not want to change anything,
just drop capabilities, these are the values of arguments:
This means that indeed all capabilities will be dropped,
including CAP_SETPCAP. But this capability controls whether
capabilities can be set, IOW whether capng_apply() succeeds.
There are two calls of capng_apply() in the function. The
CAP_SETPCAP is dropped after the first call and thus the other
call (capng_apply(CAPNG_SELECT_BOUNDS);) fails.
The solution is to keep the capability for as long as needed
(just like CAP_SETGID and CAP_SETUID) and drop it only at the
very end (just like CAP_SETGID and CAP_SETUID).
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1949388 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
spec: drop/update dependencies on systemd-{units,sysv}
-sysv was probably a left-over, and the -units deps was outdated and not
necessary, see
https://docs.fedoraproject.org/en-US/packaging-guidelines/Scriptlets/#_dependencies_on_the_systemd_package.
Only for 'systemctl mask' which is executed in %post, we want to make
sure that /usr/bin/systemctl is installed, so keep that dependency.
(A file dep is used to avoid issues if the systemd package is further
split later on.)
Ferried over from https://src.fedoraproject.org/rpms/libvirt/pull-request/7.
Signed-off-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl>
It is possible for site admins to assign names to classids in this file,
which are then used by all libnl tools, possibly those used by libvirt.
To be on the safe side, allow read access to the file in the virt-aa-helper
profile and the libvirt-qemu abstraction.
Signed-off-by: Jim Fehlig <jfehlig@suse.com> Reviewed-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Peter Krempa [Mon, 21 Jun 2021 08:52:32 +0000 (10:52 +0200)]
tests: qemucapabilities: Bump test data for qemu-6.1 on x86_64
Update the caps data for the upcoming qemu version.
Notable changes are:
- 'query-sev-attestation-report' command added
- 'sample-pages' members for dirty rate calculation added
- 'qtest' device added
- 'share' member added to query-memdev and 'reserve' members added to
query-memdev/memory-backend-[file,memfd,ram]
- 'qemu-vdagent' chardev added
- 'mptcp' toggle added to inet servers
- 'zstd' compression for qcow2
- new cpu models: - "Snowridge-v3"
- "Skylake-Server-v5"
- "Skylake-Client-v4"
- "Icelake-Server-v5"
- "Icelake-Client-v3"
- "Dhyana-v2"
- "Denverton-v3"
- "Cooperlake-v2"
- "Cascadelake-Server-v5"
- 'avx-vnni' added to some existing cpu models
- 'model-id' is now being reported as the host cpu again rather than
QEMU TCG as I've noted in previous bump
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
ci: Also perform package upgrades on macOS and FreeBSD
The base OS image might include outdated contents, and we don't
want to get spurious failures caused by bugs that have already been
fixed in the respective packages.
This is particularly important on macOS, because 'brew install foo'
will fail if 'foo' is already installed but outdated: upgrading all
packages first ensures we never run into this scenario.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Andrea Bolognani <abologna@redhat.com>
rpc: prefer SHA256 host key fingerprint with new libssh
The host key fingerprint for SSH servers is used in a scenario where
cryptographic strength is important. We should thus be defaulting to
use of SHA256 where available. We only need SHA1 for Ubuntu 18.04
which does not have libssh >= 0.8.1
Reviewed-by: Pavel Hrdina <phrdina@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>