From: Jiri Denemark
Date: Fri, 10 Jan 2025 17:25:20 +0000 (+0100)
Subject: qemu_migration: Do not consider post-copy active in postcopy-recover
X-Git-Tag: v11.0.0-rc2~1
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=a71985f6f3fec52e2a6f4e3991ed9bf6280bdb4b;p=thirdparty%2Flibvirt.git

The postcopy-recover migration state in QEMU means a connection for the
migration stream was established. Depending on the schedulers on both
hosts, the relative timing of the corresponding MIGRATION event on the
source host and the destination host may differ. Specifically, it's
possible that the source sees postcopy-recover while the destination is
still in postcopy-paused.

Currently the Perform phase on the source host ends when we get the
postcopy-recover event, and the Finish phase on the destination host is
then called. If this happens fast enough, we can still see the
postcopy-paused state when the Finish phase starts waiting for migration
to complete. This is interpreted as a failure and reported back to the
caller, even though the recovery may actually start just a few moments
later.

To avoid this race we no longer consider post-copy migration active in
the postcopy-recover state and keep waiting for the postcopy-active event
(in the success path). Thus the Finish phase is entered only after the
migration switches to postcopy-active. In this state QEMU guarantees the
destination has already switched at least to postcopy-recover, and we
won't be confused by seeing an old postcopy-paused state.

https://issues.redhat.com/browse/RHEL-73085

Signed-off-by: Jiri Denemark
Reviewed-by: Michal Privoznik

 stats.mig.status) {
     case QEMU_MONITOR_MIGRATION_STATUS_POSTCOPY:
-    case QEMU_MONITOR_MIGRATION_STATUS_POSTCOPY_RECOVER:
         jobData->status = VIR_DOMAIN_JOB_STATUS_POSTCOPY;
         break;
     case QEMU_MONITOR_MIGRATION_STATUS_POSTCOPY_RECOVER_SETUP:
+    case QEMU_MONITOR_MIGRATION_STATUS_POSTCOPY_RECOVER:
         jobData->status = VIR_DOMAIN_JOB_STATUS_POSTCOPY_RECOVER;
         break;
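
For illustration only, the effect of moving the case label can be sketched
in a standalone form. The enums and function names below (MigStatus,
JobStatus, job_status_from_mig, perform_phase_may_finish) are hypothetical,
simplified stand-ins and are not libvirt's actual definitions. The sketch
assumes the Perform phase treats only the "post-copy active" job status as
the point where it may hand over to Finish, as the commit message describes.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the QEMU monitor migration states. */
typedef enum {
    MIG_STATUS_POSTCOPY_ACTIVE,
    MIG_STATUS_POSTCOPY_PAUSED,
    MIG_STATUS_POSTCOPY_RECOVER_SETUP,
    MIG_STATUS_POSTCOPY_RECOVER,
} MigStatus;

/* Hypothetical stand-ins for the job statuses derived from them. */
typedef enum {
    JOB_STATUS_POSTCOPY,
    JOB_STATUS_POSTCOPY_PAUSED,
    JOB_STATUS_POSTCOPY_RECOVER,
} JobStatus;

/* Mapping after the change: postcopy-recover is grouped with
 * postcopy-recover-setup instead of with postcopy-active. */
JobStatus job_status_from_mig(MigStatus status)
{
    switch (status) {
    case MIG_STATUS_POSTCOPY_ACTIVE:
        return JOB_STATUS_POSTCOPY;
    case MIG_STATUS_POSTCOPY_PAUSED:
        return JOB_STATUS_POSTCOPY_PAUSED;
    case MIG_STATUS_POSTCOPY_RECOVER_SETUP:
    case MIG_STATUS_POSTCOPY_RECOVER:
        return JOB_STATUS_POSTCOPY_RECOVER;
    }
    /* Unreachable for valid input; treat unknown values as paused. */
    return JOB_STATUS_POSTCOPY_PAUSED;
}

/* Assumption: the source side keeps waiting until post-copy is
 * reported as active again, and only then lets Finish run. */
bool perform_phase_may_finish(JobStatus status)
{
    return status == JOB_STATUS_POSTCOPY;
}
```

With this mapping, perform_phase_may_finish(job_status_from_mig(MIG_STATUS_POSTCOPY_RECOVER))
is false, so a source that has only seen postcopy-recover keeps waiting
rather than triggering the Finish phase on the destination.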