Peter Krempa [Wed, 12 Nov 2025 16:52:05 +0000 (17:52 +0100)]
qemu: snapshot: Set umask for 'qemu-img' when creating external inactive snapshots
External inactive snapshots are created by invoking 'qemu-img' which
creates the file. Currently qemu-img creates the image with mode 644,
based on the default umask, as libvirt doesn't set one.
Having a world-readable image is obviously wrong so set the umask to
077 to have the file readable only by the owner.
Resolves: https://bugs.debian.org/1120119 Signed-off-by: Peter Krempa <pkrempa@redhat.com>
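A minimal Python sketch of the effect described above. It is not libvirt's C code and does not invoke 'qemu-img'; the helper name create_with_umask is hypothetical, and it assumes a creator that requests mode 0666 and leaves the final permissions to the umask.

```python
import os
import stat
import tempfile


def create_with_umask(path, mask):
    """Create 'path' requesting mode 0666 while 'mask' is the process umask.

    Returns the permission bits the new file ended up with.
    """
    old = os.umask(mask)
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o666)
        os.close(fd)
    finally:
        os.umask(old)  # always restore the caller's umask
    return stat.S_IMODE(os.stat(path).st_mode)


with tempfile.TemporaryDirectory() as tmp:
    # Stand-ins for the image file; the names are arbitrary.
    default_mode = create_with_umask(os.path.join(tmp, "a.img"), 0o022)
    private_mode = create_with_umask(os.path.join(tmp, "b.img"), 0o077)
```

With a umask of 022 the file is world-readable (644); with 077 only the owner keeps access (600).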
qemu: Check ACLs before parsing the whole domain XML
Utilise the new virDomainDefIDsParseString() for that.
This is one of the more complex ones since there is also a function that
reads relevant metadata from a save image XML. In order _not_ to extract
the parsing out of the function (and make the function basically trivial
and all callers more complex) add a callback to the function which will
be used to check the ACLs.
Fixes: CVE-2025-12748 Reported-by: Святослав Терешин <s.tereshin@fobos-nt.ru> Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
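The callback approach can be sketched roughly as follows. This is a Python analogue under stated assumptions, not the libvirt implementation: parse_ids, load_definition and AccessDenied are hypothetical names, and the assumption is that the ACL check only needs the domain's top-level <name> and <uuid>.

```python
import io
import xml.etree.ElementTree as ET


class AccessDenied(Exception):
    """Raised by an ACL callback to reject the caller."""


def parse_ids(xml_text):
    """Collect only the top-level <name> and <uuid> of a domain XML."""
    ids = {}
    depth = 0
    events = ET.iterparse(io.StringIO(xml_text), events=("start", "end"))
    for event, elem in events:
        if event == "start":
            depth += 1
            continue
        depth -= 1
        # depth == 1 here means elem is a direct child of the root element
        if depth == 1 and elem.tag in ("name", "uuid"):
            ids[elem.tag] = (elem.text or "").strip()
            if len(ids) == 2:
                break
    return ids


def load_definition(xml_text, check_acl, full_parse):
    """Run the ACL callback on the IDs before the expensive full parse."""
    check_acl(parse_ids(xml_text))  # may raise AccessDenied
    return full_parse(xml_text)
```

Passing the check in as a callback keeps the parsing inside one function while letting each caller supply its own ACL rule.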
ch: Check ACLs before parsing the whole domain XML
Utilise the new virDomainDefIDsParseString() for that.
This is one of the more complex ones since there is also a function that
reads relevant metadata from a save image XML. In order not to extract
the parsing out of the function (and make the function basically trivial
and all callers more complex) add a callback to the function which will
be used to check the ACLs. And since this function is called in APIs
that perform ACL checks both with and without flags, add two of them for
good measure.
Fixes: CVE-2025-12748 Reported-by: Святослав Терешин <s.tereshin@fobos-nt.ru> Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This function performs only parsing with the underlying
virDomainDefParseIDs() function to get needed metadata for any ACL
checks, but nothing else to avoid extraneous allocations and any
parser-induced DoS over ACL-forbidden connections.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
'libxml2' deprecated the 'xmlIndentTreeOutput' thread-local variable as
well as the 'xmlThrDefIndentTreeOutput' function for setting the global
default, which we use in our code for formatting the metadata sub-XML.
Since 'libxml2' for now also doesn't provide a way to set the target
indentation level in 'xmlSaveCtxt', which would allow us to use the
modern output APIs, we can't replace our use of 'xmlDumpNode'. (See
https://gitlab.gnome.org/GNOME/libxml2/-/issues/989 )
Since the indentation is enabled by default in libxml2 and our most
commonly used code which calls xmlDumpNode lives in a standalone
process, where we don't override the setting, just removing the override
will result in identical behaviour.
For the use cases which do live in a process we don't fully control and
thus the default could have been overridden, the result would be that the
<metadata> element would be un-indented, but that is still valid XML.
Thus to fix the deprecated use just stop setting 'xmlIndentTreeOutput'.
Closes: https://gitlab.com/libvirt/libvirt/-/issues/816 Signed-off-by: Peter Krempa <pkrempa@redhat.com>
'xmlIndentTreeOutput' is now deprecated by libxml2.
The default value set by libxml2 is '1', and the vbox driver resides
only inside the standalone daemon where the value will not be changed by
us thus there's no observable change in behaviour.
conf: domain_validate: make disk queue configuration driver specific
Currently, virDomainDiskDefValidate() allows configuring the number of
queues and the queue size for virtio disks only. However, the bhyve
driver also allows configuring these for NVMe disks, so make this
check driver-specific.
Signed-off-by: Roman Bogorodskiy <bogorodskiy@gmail.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
meson: default to system crypto policies where available
In RHEL and Fedora, the built-in GNUTLS default priority is changed
from "NORMAL" to "@SYSTEM", but because libvirt sets an explicit
policy with gnutls we don't honour that. Instead we force "NORMAL"
unless the 'tls_priority' meson option is changed.
In RPM builds, meanwhile, we ask for "@LIBVIRT,SYSTEM" to make it
look for a libvirt specific profile first, falling back to "@SYSTEM".
This changes the meson option to default to "@LIBVIRT,SYSTEM" if the
crypto-policies config is present on the local machine and the meson
option -Dsystem=true is given.
This gives developers more appropriate default behaviour, matching
that seen in package builds.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
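The resulting default can be summarised with a small Python sketch. The function and parameter names are hypothetical; the real logic lives in the meson build files, and how the crypto-policies config is detected is not shown here.

```python
def default_tls_priority(system, crypto_policies_present):
    """Pick the default TLS priority string as described in the commit.

    system: True when the meson option -Dsystem=true was given.
    crypto_policies_present: True when the crypto-policies config exists
    on the build machine.
    """
    if system and crypto_policies_present:
        return "@LIBVIRT,SYSTEM"
    return "NORMAL"
```

An explicitly set 'tls_priority' option would still take precedence over this default.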
Michal Privoznik [Fri, 24 Oct 2025 08:11:04 +0000 (10:11 +0200)]
ch: Sort driver sources and drop header files
Firstly, there's no need to list header files in
ch_driver_sources (we don't do that anywhere else, and meson is
smart enough to figure them out). And secondly, the list of
source files is not sorted which means new source files are added
in random order.
Thus, drop header files from the list and sort it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Michal Privoznik [Thu, 23 Oct 2025 13:53:20 +0000 (15:53 +0200)]
ch: Assign device alias early
Assigning device aliases should happen from ch_hotplug.c (just like it's
done for disks currently) not in ch_process.c. Move alias
assignment out of chProcessAddNetworkDevice(). And while at it,
mimic what's done with disks and have net hotplug handling done
from a function.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Michal Privoznik [Fri, 24 Oct 2025 13:42:53 +0000 (15:42 +0200)]
ch: Set transient domain definition
Libvirt's philosophy is that for a running domain there are two
(in general distinct) definitions: live definition (reflects the
running state) and inactive definition (used to seed the live
definition when domain is being created). That's why we have
VIR_DOMAIN_AFFECT_LIVE and VIR_DOMAIN_AFFECT_CONFIG flags to APIs
that modify domain definitions.
Well, the CH driver doesn't make this distinction. Fix this by
making the domain definition transient when it's being created.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
ch: Use correct domain definition in chDomainGetXMLDesc()
The chDomainGetXMLDesc() function claims to support
VIR_DOMAIN_XML_INACTIVE to obtain the persistent definition of a
running domain (in its call to virCheckFlags()) but in fact, it's
always passing vm->def to virDomainDefFormat().
So far, there's no harm done because CH driver never sets domain
def as transient. But that'll change.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
ch_process: Avoid memleak in chProcessAddNetworkDevice()
The 'payload' variable inside of chProcessAddNetworkDevice() is
reused and thus the memory it points to just before its
repurpose is not freed. Avoid reusing g_autofree variables.
128 bytes in 1 blocks are definitely lost in loss record 1,828 of 2,026
at 0x491A120: realloc (vg_replace_malloc.c:1801)
by 0x4FEC251: g_realloc (in /usr/lib64/libglib-2.0.so.0.8400.4)
by 0x500BB7E: g_string_expand (in /usr/lib64/libglib-2.0.so.0.8400.4)
by 0x500BBF0: g_string_sized_new (in /usr/lib64/libglib-2.0.so.0.8400.4)
by 0x4A114C0: virBufferInitialize (virbuffer.c:121)
by 0x4A11890: virBufferAdd (virbuffer.c:160)
by 0x4A67344: virJSONValueToBuffer (virjson.c:1562)
by 0x4A673DB: virJSONValueToString (virjson.c:1599)
by 0xBC878AB: virCHMonitorBuildNetJson (ch_monitor.c:466)
by 0xBC8D4A9: chProcessAddNetworkDevice (ch_process.c:688)
by 0xBC8FCE2: chDomainAttachDeviceLive (ch_hotplug.c:78)
by 0xBC900CA: chDomainAttachDeviceLiveAndUpdateConfig (ch_hotplug.c:174)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
domain_capabilities: Use virXMLFormatElement() in FORMAT_PROLOGUE and FORMAT_EPILOGUE macros
Domain capabilities XML is formatted (mostly) using
FORMAT_PROLOGUE and FORMAT_EPILOGUE macros. These format opening
and closing stanzas for given element. The FORMAT_PROLOGUE macro
even tries to be clever and format element onto one line (if the
element isn't supported), but that's not enough. Fortunately, we
have virXMLFormatElement() which formats elements properly, so
let's switch macros into using that.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
domain_capabilities: Check NULL in FORMAT_PROLOGUE
In the virDomainCaps struct there are some pointers that might be
NULL (for instance 'sev', 'sgx', 'hyperv'). Teach FORMAT_PROLOGUE
macro to check for NULL argument so that format functions (like
virDomainCapsFeatureHypervFormat()) don't need to.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
domain_capabilities: Move indentation adjustment out of virDomainCapsCPUCustomFormat()
The aim of virDomainCapsCPUCustomFormat() is to format CPU models
into given buffer. But it starts by adjusting indentation. Move
this one level up into the caller so that another buffer can be
used. This also makes the pattern match in the caller
(virDomainCapsCPUFormat()) with the rest of CPU related domcaps
formatting.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
bhyve: Support passing the 'passthru' command line option
Bhyve supports PCI device passthrough using the following syntax:
bhyve ... -s 4:0,passthru,5/2/0 ...
Where 5/2/0 is PCI address of the device in the host, and "4:0" is the
address in the guest.
Currently, the user is responsible for reserving the device for passthrough,
i.e. by configuring pptdevs in loader.conf(5), or using devctl(8) to
detach the device.
Co-authored-by: Roman Bogorodskiy <bogorodskiy@gmail.com> Signed-off-by: Alexander Shursha <kekek2@ya.ru> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
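For illustration only, the argument shown above could be assembled like this in Python. The helper name passthru_slot_args and its parameters are hypothetical; the driver itself builds the command line in C.

```python
def passthru_slot_args(guest_slot, guest_function, host_selector):
    """Build the bhyve '-s' argument pair for a passthrough device.

    guest_slot, guest_function: address of the device in the guest.
    host_selector: the three numbers identifying the device in the host,
    in the order they appear on the command line.
    """
    guest = f"{guest_slot}:{guest_function}"
    host = "/".join(str(part) for part in host_selector)
    return ["-s", f"{guest},passthru,{host}"]
```

Calling passthru_slot_args(4, 0, (5, 2, 0)) reproduces the "-s 4:0,passthru,5/2/0" form from the commit message.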
After executing the bhyve binary, it might happen that it fails very
early due to configuration issues (missing/inaccessible files, incorrect
custom args), bugs, etc. In this case it'll look like the domain has
started normally, but quickly turned off.
Improve that by waiting for the domain's vmm entity to appear in
/dev/vmm.
Signed-off-by: Roman Bogorodskiy <bogorodskiy@gmail.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
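Waiting for the vmm entry can be pictured as a bounded polling loop. The sketch below is an assumption about the general technique, not the driver's code: the helper name, the timeout and the polling interval are all made up for illustration.

```python
import os
import time


def wait_for_path(path, timeout=1.0, interval=0.01):
    """Poll until 'path' exists or 'timeout' seconds have elapsed.

    Returns True if the path appeared in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller would pass the domain's entry under /dev/vmm and treat a False result as a failed start.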
Jiri Denemark [Thu, 6 Nov 2025 13:10:06 +0000 (14:10 +0100)]
cputest: Read more MSRs in cpu-data.py
The features defined in our CPU map use quite a bit more than just the
two MSRs the script is currently trying to read. Let's read all of them
to get complete host CPU data.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Jiri Denemark [Thu, 6 Nov 2025 12:40:14 +0000 (13:40 +0100)]
cputest: Ignore missing MSRs in cpu-data.py
The current code made sense when we were reading only one MSR, but since
we started reading more MSRs, the host CPU would have to support all of
them otherwise the function would just return an empty dict.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
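A rough sketch of the intended behaviour, assuming that reading an unsupported MSR raises OSError. The function below is hypothetical and is not the code in cpu-data.py; the reader is passed in so the logic can be exercised without access to real MSRs.

```python
def read_msrs(indexes, read_one):
    """Read every MSR in 'indexes', skipping those the host lacks.

    read_one(index) is expected to return the MSR value or raise OSError
    when the MSR cannot be read.
    """
    msrs = {}
    for index in indexes:
        try:
            msrs[index] = read_one(index)
        except OSError:
            continue  # a missing MSR must not discard the ones already read
    return msrs
```

With this shape, one unsupported MSR no longer turns the whole result into an empty dict.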
Jiri Denemark [Thu, 6 Nov 2025 09:55:10 +0000 (10:55 +0100)]
sync_qemu_models_i386: Support adding models to an empty group
When adding a new CPU vendor, we create a new empty group in
src/cpu_map/index.xml and want to use the sync_qemu_models_i386.py
script to add models there.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Jiri Denemark [Wed, 5 Nov 2025 14:49:22 +0000 (15:49 +0100)]
sync_qemu_models_i386: Print current model for unknown features
This way one can just grep for all warnings in the script output and
still be able to see which CPU model is defined using features the
script doesn't know about.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
We've already checked the upper bound of the array, but we should
nonetheless sanity check that the requested array element is
not NULL before dereferencing it.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
"enabled" here indicates "kvm" is the enabled accelerator.
If the query-accelerators command is not available, fall back to the
existing mechanisms for querying kvm and hvf capabilities.
Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
src: report error from failing to add timer/FD watches
The virEventAddHandle/Timeout APIs are unusual in that they do not
report errors on failure, because they call through to function
callbacks which might be provided externally to libvirt and thus
won't be using libvirt's error reporting APIs.
This is a rather unfortunate design characteristic as we can see
most callers forgot about this special behaviour and so we are
lacking error reporting in many cases.
Reviewed-by: Peter Krempa <pkrempa@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Peter Krempa [Thu, 11 Sep 2025 15:07:11 +0000 (17:07 +0200)]
qemu_monitor: Extract 'timed_stats' of block devices
The 'timed_stats' block is a set of statistics gathered in configurable
time intervals. The stats include latency timings of reads/writes as
well as the depth of the request queues.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Peter Krempa [Mon, 3 Nov 2025 12:23:48 +0000 (13:23 +0100)]
qemu: capabilities: Fix logic for formatting 'reconnect' parameter
In commit e4d058866e9 I've converted the code to use the modern
'reconnect-ms' parameter instead of 'reconnect' but messed up the logic
for the time when 'reconnect' will be removed.
We need to check QEMU_CAPS_NETDEV_STREAM_RECONNECT_MILISECONDS
individually and not based on QEMU_CAPS_NETDEV_STREAM_RECONNECT.
Fix the logic as upstream qemu now removed 'reconnect'.
Fixes: e4d058866e9563756349de6b3f451a53e64ca872 Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
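The corrected decision can be sketched as below. This is a simplified Python model, not the libvirt code: the capability strings stand in for the two QEMU_CAPS_* flags, and the unit conversion assumes 'reconnect' takes seconds while 'reconnect-ms' takes milliseconds.

```python
def reconnect_param(caps, seconds):
    """Choose which reconnect parameter to format, if any.

    caps: set of capability names detected for the QEMU binary.
    seconds: requested reconnect interval in seconds.
    """
    # Check the modern parameter on its own, so it is still used once
    # QEMU no longer advertises the legacy 'reconnect'.
    if "reconnect-ms" in caps:
        return ("reconnect-ms", seconds * 1000)
    if "reconnect" in caps:
        return ("reconnect", seconds)
    return None
```

The key point is that the first check does not depend on the second capability being present.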
Jiri Denemark [Fri, 24 Oct 2025 12:17:56 +0000 (14:17 +0200)]
qemu: Ignore "ht" CPU feature
The feature does not do anything: QEMU will always set it according to
the CPU topology completely ignoring what we asked for. Unfortunately,
the way the state of "ht" is reported changed in QEMU 10.0.0 (commit c6bd2dd634208).
QEMU older than 10.0.0 would just report whatever was specified on the
command line totally ignoring the actual state of the feature visible to
a guest. But after the change QEMU reports ht=on in case it enabled "ht"
based on the CPU topology. In all other cases QEMU still reports the
state requested on the command line.
As a result of this change a domain with multiple CPU threads started on
QEMU < 10.0.0 could not be migrated to QEMU >= 10.0.0 unless "ht" was
explicitly enabled in the domain XML because libvirt would see "ht"
enabled on the destination, but disabled on the source (the guest would
see "ht" enabled in both cases anyway). Outgoing migration of domains
started on QEMU >= 10.0.0 is not affected.
To fix this issue we can completely ignore "ht" both in the domain XML
and in the CPU properties reported by QEMU. With this fix incoming
migration to QEMU >= 10.0.0 works again.
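The effect of ignoring "ht" on a migration check can be modelled like this. The sketch is a simplification with hypothetical names; libvirt's real comparison works on CPU definitions in C, and the ignore list here contains only "ht" for brevity.

```python
IGNORED_FEATURES = frozenset({"ht"})


def filter_features(features):
    """Drop ignored features from a name -> enabled mapping."""
    return {name: state for name, state in features.items()
            if name not in IGNORED_FEATURES}


def cpus_compatible(source, destination):
    """Compare two feature mappings after removing ignored features."""
    return filter_features(source) == filter_features(destination)
```

A source reporting ht=off and a destination reporting ht=on then compare as equal, as long as every other feature matches.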
Jiri Denemark [Fri, 24 Oct 2025 15:16:32 +0000 (17:16 +0200)]
qemu_monitor: Filter CPU features reported by QEMU
Some features may be on our ignore list because they do nothing even
though QEMU still supports them and reports their state. But as the
features do nothing, the state reported by QEMU may not correspond to
what the guest sees. To avoid possible confusion we may just pretend
QEMU did not report any of the features on our ignore list.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Jiri Denemark [Fri, 24 Oct 2025 13:36:18 +0000 (15:36 +0200)]
qemu_process: Always fix CPUs on reconnect
We fix CPUs (i.e., remove ignored CPU features) only when libvirt/QEMU
combo used to start the domain is very old and doesn't support
query-cpu-model-expansion, in which case the CPU definition may contain
features that are unknown to QEMU. But even if both libvirt and QEMU are
new enough, we still want to remove features that do nothing to minimize
confusion or to avoid false migration issues.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Jiri Denemark [Fri, 24 Oct 2025 13:27:03 +0000 (15:27 +0200)]
qemu_domain: Fix qemuDomainFixupCPUs
The function was apparently created when the list of ignored CPU
features contained just cmt and related features. The list grew quite a
bit since then and this function stopped making sense as it would remove
all ignored features from CPU definitions but only if cmt was present.
The issue with cmt is long gone and this function was not really doing
anything. Surprisingly this didn't cause any real issues as we don't
update CPU definitions with features unknown to QEMU. But we may still
want to remove ignored features even though QEMU knows about them for
compatibility reasons.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Jiri Denemark [Fri, 24 Oct 2025 13:07:45 +0000 (15:07 +0200)]
cpu_conf: Make virCPUDefFilterFeatures return void
The only thing that can fail inside virCPUDefFilterFeatures is
VIR_DELETE_ELEMENT_INPLACE macro. The macro just calls
virDeleteElementsN, which reports a warning when all elements to be
removed are not within the array bounds and returns -1. The function
succeeds otherwise. But since VIR_DELETE_ELEMENT_INPLACE sets the number
of elements to be removed to 1 and we call it with i < cpu->nfeatures,
the safety check in virDeleteElementsN will never fail. And even if we
theoretically called it with wrong arguments, it just wouldn't do
anything.
Thus we can safely assume VIR_DELETE_ELEMENT_INPLACE always succeeds in
virCPUDefFilterFeatures and avoid reporting any errors to simplify
callers.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
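The reasoning above can be illustrated with a Python analogue of an in-place filter. The function name and the 'keep' predicate are hypothetical; the point is only that the index is always within bounds when an element is deleted, so the function has nothing to report.

```python
def drop_features_inplace(features, keep):
    """Remove every feature for which keep(feature) is false, in place.

    Returns nothing: the deletion cannot fail because the loop condition
    guarantees i is a valid index whenever 'del' runs.
    """
    i = 0
    while i < len(features):
        if keep(features[i]):
            i += 1
        else:
            del features[i]  # exactly one element, always in bounds
```

Callers no longer need an error path for the filtering step.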