git.ipfire.org Git - thirdparty/kernel/linux.git/commit
accel/ivpu: Perform engine reset instead of device recovery on TDR
author Karol Wachowski <karol.wachowski@linux.intel.com>
Wed, 18 Mar 2026 09:39:27 +0000 (10:39 +0100)
committer Karol Wachowski <karol.wachowski@linux.intel.com>
Fri, 20 Mar 2026 07:03:11 +0000 (08:03 +0100)
commit ade00a6c903f85031061b4e1a45e789b210f9055
tree 3bbea463688cb853612beba00c57b3fac84235f5
parent d51f217957ca1fa3a151000e86a192231284595b

Replace full device recovery on TDR timeout with per-context abort,
allowing individual context handling instead of resetting the entire
device.

Extend ivpu_jsm_reset_engine() to return the list of contexts impacted
by the engine reset and use that information to abort only the affected
contexts.

Only check for potentially faulty contexts when the engine reset was not
triggered by an MMU fault or a job completion error status. This prevents
misidentifying non-guilty contexts that happened to be running at the
time of the fault.

If the engine reset was triggered by a job completion timeout and it
marked no contexts, trigger full device recovery, as there is no way to
identify the guilty context.
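
The decision flow described above can be sketched roughly as follows.
This is an illustration only, not code from the patch: every name in it
(reset_trigger, tdr_action, choose_tdr_action, impacted_count) is
hypothetical, and it assumes that for MMU faults and job error statuses
the offending context is already known to the caller.

```c
#include <stddef.h>

/* All names here are hypothetical; none are taken from the ivpu driver. */
enum reset_trigger {
	TRIGGER_JOB_TIMEOUT,	/* job did not complete in time (TDR) */
	TRIGGER_MMU_FAULT,	/* assumed: faulting context already known */
	TRIGGER_JOB_ERROR,	/* assumed: failing context already known */
};

enum tdr_action {
	ACTION_ABORT_KNOWN_CONTEXT,	/* abort the context tied to the fault */
	ACTION_ABORT_IMPACTED_CONTEXTS,	/* abort contexts reported by the reset */
	ACTION_FULL_DEVICE_RECOVERY,	/* no guilty context can be identified */
};

/*
 * Pick a follow-up action after an engine reset. impacted_count stands in
 * for the number of contexts the engine reset reported as impacted.
 */
enum tdr_action choose_tdr_action(enum reset_trigger trigger,
				  size_t impacted_count)
{
	/*
	 * For MMU faults and job error statuses, do not treat the contexts
	 * reported by the reset as faulty: they may simply have been
	 * running when the fault occurred.
	 */
	if (trigger != TRIGGER_JOB_TIMEOUT)
		return ACTION_ABORT_KNOWN_CONTEXT;

	/* Timeout with nothing reported: fall back to full recovery. */
	if (impacted_count == 0)
		return ACTION_FULL_DEVICE_RECOVERY;

	return ACTION_ABORT_IMPACTED_CONTEXTS;
}
```

For example, under these assumptions a timeout with zero reported
contexts maps to ACTION_FULL_DEVICE_RECOVERY, while a timeout with two
reported contexts maps to ACTION_ABORT_IMPACTED_CONTEXTS.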

Add an engine reset counter to debugfs to keep track of engine resets
for debugging and testing purposes.

Reviewed-by: Lizhi Hou <lizhi.hou@amd.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://patch.msgid.link/20260318093927.4080303-1-karol.wachowski@linux.intel.com
drivers/accel/ivpu/ivpu_debugfs.c
drivers/accel/ivpu/ivpu_drv.c
drivers/accel/ivpu/ivpu_drv.h
drivers/accel/ivpu/ivpu_job.c
drivers/accel/ivpu/ivpu_jsm_msg.c
drivers/accel/ivpu/ivpu_jsm_msg.h
drivers/accel/ivpu/ivpu_mmu.c
drivers/accel/ivpu/ivpu_pm.c
drivers/accel/ivpu/ivpu_pm.h