]> git.ipfire.org Git - thirdparty/qemu.git/commit
migration/postcopy: Report fault latencies in blocktime
authorPeter Xu <peterx@redhat.com>
Fri, 13 Jun 2025 14:12:11 +0000 (10:12 -0400)
committerFabiano Rosas <farosas@suse.de>
Fri, 11 Jul 2025 13:37:38 +0000 (10:37 -0300)
commitb4c82b428828c0ffff273a49f24a22cb4e18d485
tree5a441e52278601d8c6ff2f0eaca621d6b6e59858
parent271a1940e91a32fab6165841279f250204f53ae4
migration/postcopy: Report fault latencies in blocktime

Blocktime so far only cares about the time one vcpu (or the whole system)
got blocked.  It would be also be helpful if it can also report the latency
of page requests, which could be very sensitive during postcopy.

Blocktime itself is sometimes not very important, especially when one
thinks about KVM async PF support, which means vCPUs are literally almost
not blocked at all because the guest OS is smart enough to switch to
another task when a remote fault is needed.

However, latency is still sensitive and important because even if the guest
vCPU is running on threads that do not need a remote fault, the workload
that accesses some missing page is still affected.

Add two entries to the report, showing how long it takes to resolve a
remote fault.  Mention in the QAPI doc that this is not the real average
fault latency, but only the ones that was requested for a remote fault.

Unwrap get_vcpu_blocktime_list() so we don't need to walk the list twice,
meanwhile add the entry checks in qtests for all postcopy tests.

Cc: Markus Armbruster <armbru@redhat.com>
Cc: Dr. David Alan Gilbert <dave@treblig.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Link: https://lore.kernel.org/r/20250613141217.474825-9-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
migration/migration-hmp-cmds.c
migration/postcopy-ram.c
qapi/migration.json
tests/qtest/migration/migration-qmp.c