From: Ketil Johnsen Date: Fri, 19 Dec 2025 09:35:44 +0000 (+0100) Subject: drm/panthor: Evict groups before VM termination X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=565ed40b5fc1242f7538a016fce5a85f802d4fb5;p=thirdparty%2Fkernel%2Flinux.git drm/panthor: Evict groups before VM termination Ensure all related groups are evicted and suspended before VM destruction takes place. This fixes an issue where panthor_vm_destroy() destroys and unmaps the heap context while there are still on slot groups using this. The FW will do a write out to the heap context when a CSG (group) is suspended, so a premature unmap of the heap context will cause a GPU page fault. This page fault is quite harmless, and do not affect the continued operation of the GPU. Fixes: 647810ec2476 ("drm/panthor: Add the MMU/VM logical block") Reviewed-by: Boris Brezillon Signed-off-by: Ketil Johnsen Reviewed-by: Liviu Dudau Reviewed-by: Steven Price Link: https://patch.msgid.link/20251219093546.1227697-1-ketil.johnsen@arm.com Co-developed-by: Boris Brezillon Signed-off-by: Boris Brezillon --- diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c index 473a8bebd61e..b44753f91e40 100644 --- a/drivers/gpu/drm/panthor/panthor_mmu.c +++ b/drivers/gpu/drm/panthor/panthor_mmu.c @@ -1504,6 +1504,10 @@ static void panthor_vm_destroy(struct panthor_vm *vm) vm->destroyed = true; + /* Tell scheduler to stop all GPU work related to this VM */ + if (refcount_read(&vm->as.active_cnt) > 0) + panthor_sched_prepare_for_vm_destruction(vm->ptdev); + mutex_lock(&vm->heaps.lock); panthor_heap_pool_destroy(vm->heaps.pool); vm->heaps.pool = NULL; diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c index 0f83e778d89a..ca272dbae14d 100644 --- a/drivers/gpu/drm/panthor/panthor_sched.c +++ b/drivers/gpu/drm/panthor/panthor_sched.c @@ -2786,6 +2786,20 @@ void panthor_sched_report_mmu_fault(struct panthor_device *ptdev) sched_queue_delayed_work(ptdev->scheduler, tick, 0); } +void panthor_sched_prepare_for_vm_destruction(struct panthor_device *ptdev) +{ + /* FW can write out internal state, like the heap context, during CSG + * suspend. It is therefore important that the scheduler has fully + * evicted any pending and related groups before VM destruction can + * safely continue. Failure to do so can lead to GPU page faults. + * A controlled termination of a Panthor instance involves destroying + * the group(s) before the VM. This means any relevant group eviction + * has already been initiated by this point, and we just need to + * ensure that any pending tick_work() has been completed. + */ + flush_work(&ptdev->scheduler->tick_work.work); +} + void panthor_sched_resume(struct panthor_device *ptdev) { /* Force a tick to re-evaluate after a resume. */ diff --git a/drivers/gpu/drm/panthor/panthor_sched.h b/drivers/gpu/drm/panthor/panthor_sched.h index f4a475aa34c0..9a8692de8ade 100644 --- a/drivers/gpu/drm/panthor/panthor_sched.h +++ b/drivers/gpu/drm/panthor/panthor_sched.h @@ -50,6 +50,7 @@ void panthor_sched_suspend(struct panthor_device *ptdev); void panthor_sched_resume(struct panthor_device *ptdev); void panthor_sched_report_mmu_fault(struct panthor_device *ptdev); +void panthor_sched_prepare_for_vm_destruction(struct panthor_device *ptdev); void panthor_sched_report_fw_events(struct panthor_device *ptdev, u32 events); void panthor_fdinfo_gather_group_samples(struct panthor_file *pfile);