git.ipfire.org Git - thirdparty/libvirt.git/commitdiff
qemu_migration: Do not consider post-copy active in postcopy-recover
author Jiri Denemark <jdenemar@redhat.com>
Fri, 10 Jan 2025 17:25:20 +0000 (18:25 +0100)
committer Jiri Denemark <jdenemar@redhat.com>
Mon, 13 Jan 2025 11:18:20 +0000 (12:18 +0100)
The postcopy-recover migration state in QEMU means a connection for the
migration stream was established. Depending on the schedulers on both
hosts, the relative timing of the corresponding MIGRATION event on the
source host and the destination host may differ. Specifically, it's
possible that the source sees postcopy-recover while the destination is
still in postcopy-paused.

Currently the Perform phase on the source host ends when we get a
postcopy-recover event and the Finish phase on the destination host is
called. If this is fast enough we can still see the postcopy-paused
state when the Finish phase starts waiting for migration to complete.
This is interpreted as a failure and reported back to the caller, even
though the recovery may actually start just a few moments later.

To avoid this race we now don't consider post-copy migration active in
the postcopy-recover state and keep waiting for a postcopy-active event
(in the success path). Thus the Finish phase is entered only after the
migration switches to postcopy-active. In this state QEMU guarantees the
destination already switched at least to postcopy-recover and we won't
be confused by seeing an old postcopy-paused state.
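
The rule described above can be sketched as a small standalone C model.
This is an illustration only, not libvirt code: every type and function
name below is hypothetical, and the paused job state is an assumption
added for completeness. It shows postcopy-recover grouped with
postcopy-recover-setup, so the source does not treat post-copy as active
(and does not move on to Finish) until it sees postcopy-active.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of QEMU migration states relevant to recovery. */
typedef enum {
    MIG_POSTCOPY_ACTIVE,
    MIG_POSTCOPY_PAUSED,
    MIG_POSTCOPY_RECOVER_SETUP,
    MIG_POSTCOPY_RECOVER,
} MigStatus;

/* Hypothetical job states, loosely modelled on VIR_DOMAIN_JOB_STATUS_*. */
typedef enum {
    JOB_POSTCOPY,
    JOB_POSTCOPY_PAUSED,   /* assumption: not shown in the diff */
    JOB_POSTCOPY_RECOVER,
} JobStatus;

/* Mapping after the change: postcopy-recover is no longer "active". */
JobStatus
job_status_from_mig(MigStatus mig)
{
    switch (mig) {
    case MIG_POSTCOPY_ACTIVE:
        return JOB_POSTCOPY;

    case MIG_POSTCOPY_RECOVER_SETUP:
    case MIG_POSTCOPY_RECOVER:
        return JOB_POSTCOPY_RECOVER;

    case MIG_POSTCOPY_PAUSED:
        break;
    }
    return JOB_POSTCOPY_PAUSED;
}

/* The source keeps waiting until post-copy is reported as active again. */
bool
source_may_enter_finish(MigStatus source_mig)
{
    return job_status_from_mig(source_mig) == JOB_POSTCOPY;
}

/* Usage example: walk the source through a successful recovery. */
void
sketch_usage(void)
{
    assert(!source_may_enter_finish(MIG_POSTCOPY_PAUSED));
    assert(!source_may_enter_finish(MIG_POSTCOPY_RECOVER_SETUP));
    assert(!source_may_enter_finish(MIG_POSTCOPY_RECOVER));
    assert(source_may_enter_finish(MIG_POSTCOPY_ACTIVE));
}
```

Under the old mapping, the third check in sketch_usage() would have
passed the other way round: postcopy-recover counted as active, which is
what let Finish start while the destination was still postcopy-paused.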

https://issues.redhat.com/browse/RHEL-73085

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
src/qemu/qemu_migration.c

index 50e350b0c4da78d4377b6c547d2f3901651f4a84..1582a738a3d881ff750de62fed56aaf77b803566 100644
@@ -1872,11 +1872,11 @@ qemuMigrationUpdateJobType(virDomainJobData *jobData)
 
     switch ((qemuMonitorMigrationStatus) priv->stats.mig.status) {
     case QEMU_MONITOR_MIGRATION_STATUS_POSTCOPY:
-    case QEMU_MONITOR_MIGRATION_STATUS_POSTCOPY_RECOVER:
         jobData->status = VIR_DOMAIN_JOB_STATUS_POSTCOPY;
         break;
 
     case QEMU_MONITOR_MIGRATION_STATUS_POSTCOPY_RECOVER_SETUP:
+    case QEMU_MONITOR_MIGRATION_STATUS_POSTCOPY_RECOVER:
         jobData->status = VIR_DOMAIN_JOB_STATUS_POSTCOPY_RECOVER;
         break;