During VF_RESTORE or VF_RESUME, the GuC sends a migration interrupt and
clears the RESFIX_START marker. If migration or resume occurs before the
VF issues its own RESFIX_START, the VF KMD may receive two back-to-back
migration interrupts. The VF then sends RESFIX_START to indicate the
beginning of fixups and RESFIX_DONE to mark completion. However, the
second RESFIX_START fails because the GuC is already in the RUNNING state.

Clear the recovery_queued flag after sending the RESFIX_START message, so
that duplicated IRQs seen before the actual recovery starts are ignored.

This ensures the flag is reset only after the fixup process begins,
avoiding redundant work item queuing.
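
As a rough illustration only (not driver code), the effect of the ordering
change can be modelled in plain C. The names struct vf_model,
model_migration_irq() and model_recovery() are hypothetical and exist only
in this sketch; it assumes the IRQ path skips queuing while
recovery_queued is set, and that the duplicate IRQ arrives before
RESFIX_START is sent.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the VF migration state, not the xe types. */
struct vf_model {
	bool recovery_queued;	/* mirrors the recovery_queued flag */
	int work_items_queued;	/* recovery work items queued so far */
};

/* Migration IRQ: queue recovery unless one is already pending. */
static void model_migration_irq(struct vf_model *vf)
{
	if (vf->recovery_queued)
		return;		/* duplicate IRQ, ignored */

	vf->recovery_queued = true;
	vf->work_items_queued++;
}

/*
 * One recovery pass with a duplicate IRQ injected before RESFIX_START.
 * clear_after_start == false models the old ordering (flag cleared at
 * the beginning of recovery), true models the ordering in this patch.
 * Returns the number of work items that ended up queued.
 */
static int model_recovery(bool clear_after_start)
{
	struct vf_model vf = { 0 };

	model_migration_irq(&vf);		/* first IRQ queues the worker */

	if (!clear_after_start)
		vf.recovery_queued = false;	/* old: cleared too early */

	model_migration_irq(&vf);		/* back-to-back duplicate IRQ */

	/* RESFIX_START would be sent at this point. */

	if (clear_after_start)
		vf.recovery_queued = false;	/* new: cleared after start */

	return vf.work_items_queued;
}
```

In this model, model_recovery(false) queues two work items, while
model_recovery(true) queues one, since the duplicate IRQ still sees the
flag set.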
Fixes: b5fbb94341a2 ("drm/xe/vf: Introduce RESFIX start marker support")
Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20251210052546.622809-6-satyanarayana.k.v.p@intel.com
return true;
}
- spin_lock_irq(&gt->sriov.vf.migration.lock);
- gt->sriov.vf.migration.recovery_queued = false;
- spin_unlock_irq(&gt->sriov.vf.migration.lock);
-
xe_guc_ct_flush_and_stop(&gt->uc.guc.ct);
xe_guc_submit_pause_vf(&gt->uc.guc);
xe_tlb_inval_reset(&gt->tlb_inval);
static int vf_post_migration_resfix_start(struct xe_gt *gt, u16 marker)
{
- return vf_resfix_start(gt, marker);
+ int err;
+
+ err = vf_resfix_start(gt, marker);
+
+ guard(spinlock_irq) (&gt->sriov.vf.migration.lock);
+ gt->sriov.vf.migration.recovery_queued = false;
+
+ return err;
}
static u16 vf_post_migration_next_resfix_marker(struct xe_gt *gt)