drm/xe/guc: Set upper limit of H2G retries over CTB
author Michal Wajdeczko <michal.wajdeczko@intel.com>
Wed, 3 Sep 2025 22:33:30 +0000 (00:33 +0200)
committer Michal Wajdeczko <michal.wajdeczko@intel.com>
Thu, 4 Sep 2025 20:24:51 +0000 (22:24 +0200)
commit 2506af5f8109a387a5e8e9e3d7c498480b8033db
tree c5b04b885065f6eaece0975d22b96f952e6dfa51
parent a85ead6d7f74438bc779927fe7b78b0c86addb6b

The GuC communication protocol allows the GuC to send a
NO_RESPONSE_RETRY reply message to indicate that, due to some interim
condition, it cannot handle the incoming H2G request and the host
shall resend it.

But in some cases, due to errors, this unsatisfied condition might
be permanent, which could lead to endless retries, as was recently
seen on the CI:

 [drm] GT0: PF: VF1 FLR didn't finish in 5000 ms (-ETIMEDOUT)
 [drm] GT0: PF: VF1 resource sanitizing failed (-ETIMEDOUT)
 [drm] GT0: PF: VF1 FLR failed!
 [drm:guc_ct_send_recv [xe]] GT0: H2G action 0x5503 retrying: reason 0x0
 [drm:guc_ct_send_recv [xe]] GT0: H2G action 0x5503 retrying: reason 0x0
 [drm:guc_ct_send_recv [xe]] GT0: H2G action 0x5503 retrying: reason 0x0
 [drm:guc_ct_send_recv [xe]] GT0: H2G action 0x5503 retrying: reason 0x0

To avoid such dangerous loops, allow only a limited number of retries
(for now 50) and add some delays (n * 5ms) to slow down the rate of
resending this repeated request.
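
The retry policy described above can be sketched in plain userspace C.
This is only an illustration of a bounded retry loop with a linearly
growing delay, not the code from xe_guc_ct.c: every name below
(H2G_MAX_RETRIES, H2G_RETRY_STEP_MS, h2g_send_with_retry, the fake_guc
peer) is hypothetical, and the -ELOOP return value is an assumption
chosen for the example. Only the limit of 50 and the n * 5ms delay come
from the description above.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical constants mirroring the numbers in the commit message. */
#define H2G_MAX_RETRIES   50	/* upper limit of resends */
#define H2G_RETRY_STEP_MS 5	/* delay before the n-th resend is n * 5 ms */

/* Hypothetical reply codes: the peer either handles the request or asks for a resend. */
enum h2g_reply { H2G_REPLY_SUCCESS, H2G_REPLY_RETRY };

typedef enum h2g_reply (*h2g_send_fn)(void *ctx);
typedef void (*h2g_sleep_fn)(unsigned int ms, void *ctx);

/*
 * Send a request and resend it while the peer replies "retry", sleeping
 * n * H2G_RETRY_STEP_MS before the n-th resend. Give up once more than
 * H2G_MAX_RETRIES retry replies were received.
 * Returns 0 on success or -ELOOP (illustrative choice) when giving up.
 */
static int h2g_send_with_retry(h2g_send_fn send, h2g_sleep_fn sleep_ms, void *ctx)
{
	unsigned int retries = 0;

	assert(send && sleep_ms);

	for (;;) {
		if (send(ctx) != H2G_REPLY_RETRY)
			return 0;
		if (++retries > H2G_MAX_RETRIES)
			return -ELOOP;
		sleep_ms(retries * H2G_RETRY_STEP_MS, ctx);
	}
}

/* Simulated peer: answers "retry" busy_replies times, then succeeds. */
struct fake_guc {
	unsigned int busy_replies;	/* remaining "retry" answers */
	unsigned int sends;		/* how many times the request was sent */
	unsigned int slept_ms;		/* total delay requested by the sender */
};

static enum h2g_reply fake_send(void *ctx)
{
	struct fake_guc *g = ctx;

	g->sends++;
	if (g->busy_replies) {
		g->busy_replies--;
		return H2G_REPLY_RETRY;
	}
	return H2G_REPLY_SUCCESS;
}

/* Records the requested delay instead of actually sleeping. */
static void fake_sleep(unsigned int ms, void *ctx)
{
	struct fake_guc *g = ctx;

	g->slept_ms += ms;
}
```

With these numbers a request that never succeeds is sent 51 times (the
original plus 50 resends) and the accumulated delay is
5ms * (1 + 2 + ... + 50) = 6375ms before the sender gives up.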

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com>
Link: https://lore.kernel.org/r/20250903223330.6408-1-michal.wajdeczko@intel.com
drivers/gpu/drm/xe/xe_guc_ct.c