From: Harish Kasiviswanathan
Date: Tue, 26 Mar 2024 19:32:46 +0000 (-0400)
Subject: drm/amdkfd: Reset GPU on queue preemption failure
X-Git-Tag: v6.6.28~33
X-Git-Url: http://git.ipfire.org/gitweb.cgi?a=commitdiff_plain;h=4d87f08eb75513334a85458306373d7560af1017;p=thirdparty%2Fkernel%2Fstable.git

drm/amdkfd: Reset GPU on queue preemption failure

commit 8bdfb4ea95ca738d33ef71376c21eba20130f2eb upstream.

Currently, with F32 HWS GPU reset is only when unmap queue fails.
However, if compute queue doesn't respond to preemption request in time
unmap will return without any error. In this case, only preemption error
is logged and Reset is not triggered. Call GPU reset in this case also.

Reviewed-by: Alex Deucher
Signed-off-by: Harish Kasiviswanathan
Reviewed-by: Mukul Joshi
Signed-off-by: Alex Deucher
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman
---

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index e07652e724965..60d98301ef041 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -1980,6 +1980,7 @@ static int unmap_queues_cpsch(struct device_queue_manager *dqm,
 		pr_err("HIQ MQD's queue_doorbell_id0 is not 0, Queue preemption time out\n");
 		while (halt_if_hws_hang)
 			schedule();
+		kfd_hws_hang(dqm);
 		return -ETIME;
 	}
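
Note (illustration only, not part of the patch): the control flow described
in the commit message can be sketched as a small userspace model. Everything
named model_* below is hypothetical and exists only for this sketch. The real
driver works on struct device_queue_manager and calls kfd_hws_hang(). The
model assumes that "unmap queue fails" shows up as a nonzero wait result and
that a nonzero queue_doorbell_id0 marks the preemption timeout. It leaves out
the halt_if_hws_hang debug loop.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device_queue_manager (model only). */
struct model_dqm {
	int unmap_wait_result;     /* nonzero models a failed queue unmap */
	unsigned int doorbell_id0; /* nonzero models a queue that ignored preemption */
	int reset_requests;        /* number of GPU reset requests recorded */
};

/* Hypothetical stand-in for kfd_hws_hang(): only records the request. */
static void model_hws_hang(struct model_dqm *dqm)
{
	dqm->reset_requests++;
}

/* Models the tail of the unmap path as described in the commit message. */
static int model_unmap_queues(struct model_dqm *dqm)
{
	assert(dqm != NULL);

	/* Unmap failure: a reset was already requested here before the patch. */
	if (dqm->unmap_wait_result) {
		model_hws_hang(dqm);
		return dqm->unmap_wait_result;
	}

	/*
	 * Unmap reported no error, but a queue did not respond to the
	 * preemption request. With the patch applied, this path also
	 * requests a reset before returning -ETIME.
	 */
	if (dqm->doorbell_id0 != 0) {
		model_hws_hang(dqm);
		return -ETIME;
	}

	return 0;
}
```

In this model, a state with doorbell_id0 != 0 returns -ETIME and records
exactly one reset request. Without the added call it would return -ETIME
with no reset recorded, which is the gap the commit message describes.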