By default, pfifo_fast queueing discipline (qdisc) is set on
newly created interfaces (including TAPs). This qdisc has three
queues and packets that want to be sent through given NIC are
placed into one of the queues based on TOS field. Queues are then
emptied based on their priority allowing interactive sessions
stay interactive whilst something else is downloading a large
file.
Obviously, this means that kernel has to be involved and some
locking has to happen (when placing packets into queues). If
virtualization is taken into account then the above algorithm
happens twice - once in the guest and the second time in the
host.
This is arguably not optimal as it burns host CPU cycles
needlessly. Guest already made it choice and sent packets in the
order it wants.
To resolve this, Linux kernel offers 'noqueue' qdisc which can be
applied on virtual interfaces and in fact for 'lo' it is by
default:
lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue
Set it for other TAP devices we create for domains too. With this
change I was able to squeeze 1Mbps more from a macvtap attached
to a guest and to my 1Gbps LAN (as measured by iperf3).
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1329644 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
This helper changes the root qdisc on given interface.
Ideally, it would be written using netlink but my attempts to
write the code were not successful and thus I've fallen back to
virCommand() + tc.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Currently setting max_len=0 causes virtlogd to spin in a busy loop. It
is natural to allow this to disable log rollover which can be useful for
developers debugging things.
Note disabling rollover exposes the host to denial of service from a
malicious guest, so must be used with care.
Closes https://gitlab.com/libvirt/libvirt/-/issues/85 Reviewed-by: Peter Krempa <pkrempa@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Cole Robinson [Wed, 7 Oct 2020 19:20:30 +0000 (15:20 -0400)]
qemu: migration: don't open storage driver too early
If storage migration is requested, and the destination storage does
not exist on the remote host, qemu's migration support will call
into the libvirt storage driver to precreate the destination storage.
The storage driver virConnectPtr is opened too early though, adding
an unnecessary dependency on the storage driver for several cases
that don't require it. This currently requires kubevirt to install
the storage driver even though they aren't actually using it.
Push the virGetConnectStorage calls to right before the cases they are
actually needed.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Cole Robinson <crobinso@redhat.com>
virsocketaddr: Zero @netmask in virSocketAddrPrefixToNetmask()
The aim of virSocketAddrPrefixToNetmask() is to initialize passed
virSocketAddr structure based on prefix length and family.
However, it doesn't set all members in the struct which may lead
to reads of uninitialized values:
==15421== Use of uninitialised value of size 8
==15421== at 0x50F297A: _itoa_word (in /lib64/libc-2.31.so)
==15421== by 0x510C8FE: __vfprintf_internal (in /lib64/libc-2.31.so)
==15421== by 0x5120295: __vsnprintf_internal (in /lib64/libc-2.31.so)
==15421== by 0x50F8969: snprintf (in /lib64/libc-2.31.so)
==15421== by 0x51BB602: getnameinfo (in /lib64/libc-2.31.so)
==15421== by 0x496DEE0: virSocketAddrFormatFull (virsocketaddr.c:486)
==15421== by 0x496DD9F: virSocketAddrFormat (virsocketaddr.c:444)
==15421== by 0x11871F: networkDnsmasqConfContents (bridge_driver.c:1404)
==15421== by 0x1118F5: testCompareXMLToConfFiles (networkxml2conftest.c:48)
==15421== by 0x111BAF: testCompareXMLToConfHelper (networkxml2conftest.c:112)
==15421== by 0x112679: virTestRun (testutils.c:142)
==15421== by 0x111D09: mymain (networkxml2conftest.c:144)
==15421== Uninitialised value was created by a stack allocation
==15421== at 0x1175D2: networkDnsmasqConfContents (bridge_driver.c:1056)
All callers expect the function to initialize the structure
fully.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Laine Stump <laine@redhat.com>
It could also have ",ro" suffix to make share read-only.
In the Linux guest, this share is mounted with:
mount -t 9p sharename /mnt/sharename
In the guest user will see the same permissions and ownership
information for this directory as on the host. No uid/gid remapping is
supported, so those could resolve to wrong user or group names.
The same applies to the other side: chowning/chmodding in the guest will
set specified ownership and permissions on the host.
In libvirt domain XML it's modeled using the 'filesystem' element:
With this commit, all architecture lists that we base feature
enablement decisions on are defined within a few lines of each
other, increasing maintainability.
Additionally, generic architecture lists that appear in the
conditions for multiple features are defined, so that repetition
is reduced.
Note that a few checks (numactl, zfs, ceph) have been changed
from %ifarch to %ifnarch for consistency: while doing so, the
corresponding list of architectures has also been replaced with
the complement of the original one to ensure the overall behavior
would be preserved.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Neal Gompa <ngompa13@gmail.com>
There's no need to set a default for it if we're going to override
it immediately afterwards anyway, and setting with_qemu_tcg at the
same time only makes things more confusing.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Neal Gompa <ngompa13@gmail.com>
Ján Tomko [Fri, 9 Oct 2020 11:03:23 +0000 (13:03 +0200)]
remote: remove leftover goto
Signed-off-by: Ján Tomko <jtomko@redhat.com> Reported-by: John Ferlan <jferlan@redhat.com> Fixes: 8487595bee0c04e56b0d8e866c5c71318faf1689 Signed-off-by: Ján Tomko <jtomko@redhat.com>
Fangge Jin [Fri, 21 Aug 2020 10:59:01 +0000 (18:59 +0800)]
qemu.conf: Re-word the description for *_tls_x509_verify
The original descirption for *_tls_x509_verify is a little misleading
by saying that "Enabling this option will reject any client who does
not have a ca-cert.pem certificate".
Signed-off-by: Fangge Jin <fjin@redhat.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Instead of having an ad-hoc build script for CentOS 7, follow the
pattern established in other repositories under the libvirt group
and allow selectively disabling that specific part of the build.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Matt Coleman [Fri, 9 Oct 2020 07:46:09 +0000 (03:46 -0400)]
hyperv: remove openwsman.h
This header's main purpose was to work around bugs in older versions of
openwsman. Most of the files using it only needed wsman-api.h, which
they now include directly.
Signed-off-by: Matt Coleman <matt@datto.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Matt Coleman [Mon, 5 Oct 2020 16:20:13 +0000 (12:20 -0400)]
hyperv: implement connectGetVersion
Hyper-V version numbers are not compatible with the encoding in
virParseVersionString():
https://gitlab.com/libvirt/libvirt/-/blob/master/src/util/virutil.c#L246
For example, the Windows Server 2016 Hyper-V version is 10.0.14393: its
micro is over 14 times larger than the encoding allows.
This commit repacks the Hyper-V version number in order to preserve all
of the digits. The major and minor are concatenated (with minor zero-
padded to two digits) to form the repacked major value. This works
because Microsoft's major and minor versions numbers are unlikely to
exceed 99. The repacked minor value is derived from the digits in the
thousands, ten-thousands, and hundred-thousands places of Hyper-V's
micro. The repacked micro is derived from the digits in the ones, tens,
and hundreds places of Hyper-V's micro.
Co-authored-by: Sri Ramanujam <sramanujam@datto.com> Signed-off-by: Matt Coleman <matt@datto.com> Reviewed-by: Neal Gompa <ngompa13@gmail.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Matt Coleman [Mon, 5 Oct 2020 16:20:07 +0000 (12:20 -0400)]
hyperv: fix nodeGetInfo failures caused by long CPU names
Some CPU model names were too long for _virNodeInfo.model.
For example: Intel Xeon CPU E5-2620 v2 @ 2.10GHz
This commit removes the clock frequency suffix.
Signed-off-by: Matt Coleman <matt@datto.com> Reviewed-by: Neal Gompa <ngompa13@gmail.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Matt Coleman [Mon, 5 Oct 2020 16:20:06 +0000 (12:20 -0400)]
hyperv: make Msvm_ComputerSystem WQL queries locale agnostic
There are two specific WQL queries we're using to get either a list of
virtual machines or the hypervisor host itself from Msvm_ComputerSystem.
Those queries rely on filtering results based on the "Description"
field. Since the "Description" field is locale sensitive, the queries
will fail if the Windows host is using a language pack. While the WSMAN
spec allows the client to set the requested locale (and it is supported
since openwsman 2.6.x), the Windows WinRM service does not respect this
setting: it returns non-English strings despite the WSMAN request
properly setting the locale to 'en-US'. Therefore, this patch changes
the WQL query to make use of the "__SERVER" field to stop relying on
English strings in queries and side step the issue.
Co-authored-by: Dawid Zamirski <dzamirski@datto.com> Signed-off-by: Matt Coleman <matt@datto.com> Reviewed-by: Neal Gompa <ngompa13@gmail.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Roman Bolshakov [Thu, 8 Oct 2020 14:02:08 +0000 (17:02 +0300)]
tests: commandhelper: Accept POLLNVAL on macOS
commandhelper hangs indefinitely in poll() on macOS on commandtest test2
and later because POLLNVAL is returned on revents for input file
descriptor opened from /dev/null, i.e this hangs:
$ tests/commandhelper < /dev/null
BEGIN STDOUT
BEGIN STDERR
^C
But it works fine with regular stdin:
$ tests/commandhelper <<< test
BEGIN STDOUT
BEGIN STDERR
test
test
END STDOUT
END STDERR
The issue is mentioned in poll(2):
BUGS
The poll() system call currently does not support devices.
With the change all 28 cases in commandtest pass.
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Reviewed-by: Andrea Bolognani <abologna@redhat.com>
warning: Macro expanded in comment on line 402: %firewalld_reload macro
error: line 402: Unknown tag: test -f /usr/bin/firewall-cmd && firewall-cmd --reload --quiet || :
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
- Wrap long lines in "domxml-to-native" example so it fits
content width,
- For changeset revision links, use "FreeBSD changeset rN" or
"changeset rN" instead of just "rN" to make it more readable.
Signed-off-by: Roman Bogorodskiy <bogorodskiy@gmail.com> Reviewed-by: Andrea Bolognani <abologna@redhat.com>
qemu: Don't generate '-machine memory-backend' and '-numa memdev'
In 88957116c9 I've switched to -machine memory-backend=ID and
-object memory-backend-* because QEMU is obsoleting -mem-path
and -mem-prealloc. However, what I did not foresee was that using
-machine memory-backend in combination with -numa is not allowed
in QEMU. This was reported upstream and fortunately not released
yet.
The problem is that if domain has NUMA nodes then we will
generate memory-backend-* objects for NUMA nodes (because if QEMU
is new enough to expose default RAM ID it also supports -numa
memdev=) and adding non-NUMA memory backend is wrong.
Reported-by: Masayoshi Mizuma <msys.mizuma@gmail.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Ján Tomko [Mon, 5 Oct 2020 20:35:34 +0000 (22:35 +0200)]
qemuAgentGetInterfaceAddresses: turn ifname into char*
We only care about the first part of the 'ifname' string,
splitting it is overkill.
Instead, just replace the ':' with a '\0' in a copy of the string.
This reduces the count of the varaibles containing some form
of the interface name to two.
Ján Tomko [Mon, 5 Oct 2020 19:48:12 +0000 (21:48 +0200)]
qemu: agent: split out qemuAgentGetInterfaceAddresses
Convert one interface from the "return" array returned by
"guest-network-get-interfaces" to virDomainInterface.
Due to the functionality of squashing interface aliases together,
this is not a pure function - it either:
* Adds the interface to ifaces_ret, incrementing ifaces_count
and adds a pointer to it into the ifaces_store hash table.
* Adds the additional IP addresses from the interface alias
to the existing interface entry, found through the hash table.
This does not increment ifaces_count or extend the array.
Ján Tomko [Mon, 5 Oct 2020 19:34:52 +0000 (21:34 +0200)]
qemu: agent: use virHashNew
We're passing 'ifaces_count' to virHashCreate as the initial
hash table size just after we've initialized it to zero.
This translates to a default of 256 inside virHashCreateFull.
Instead of this obfuscation, use virHashNew (default of 32),
to make it obvious we don't care about the initial hash size.
Also remove the error handling, since neither of the functions
return any errors after switching to g_new0.