From: Dave Airlie
Date: Wed, 2 Jul 2025 23:27:07 +0000 (+1000)
Subject: nouveau/gsp: add a 50ms delay between fbsr and driver unload rpcs
X-Git-Tag: v6.16-rc6~10^2^2~9
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=e79d0ba605d54dd47f3d8a487d00f264b896966c;p=thirdparty%2Flinux.git

nouveau/gsp: add a 50ms delay between fbsr and driver unload rpcs

This fixes a bunch of command hangs after runtime suspend/resume.

This fixes a regression caused by code movement in the commit below,
the commit seems to just change timings enough to cause this to happen
now, and adding the sleep seems to avoid it.

I've spent some time trying to root cause it to no great avail,
it seems like a bug on the firmware side, but it could be a bug
in our rpc handling that I can't find.

Either way, we should land the workaround to fix the problem,
while we continue to work out the root cause.

Signed-off-by: Dave Airlie
Cc: Ben Skeggs
Cc: Danilo Krummrich
Fixes: c21b039715ce ("drm/nouveau/gsp: add hals for fbsr.suspend/resume()")
Signed-off-by: Danilo Krummrich
Link: https://lore.kernel.org/r/20250702232707.175679-1-airlied@gmail.com
---

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/gsp.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/gsp.c
index baf42339f93ea..23f80e1677058 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/gsp.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/gsp.c
@@ -1744,6 +1744,13 @@ r535_gsp_fini(struct nvkm_gsp *gsp, bool suspend)
 			nvkm_gsp_sg_free(gsp->subdev.device, &gsp->sr.sgt);
 			return ret;
 		}
+
+		/*
+		 * TODO: Debug the GSP firmware / RPC handling to find out why
+		 * without this Turing (but none of the other architectures)
+		 * ends up resetting all channels after resume.
+		 */
+		msleep(50);
 	}
 
 	ret = r535_gsp_rpc_unloading_guest_driver(gsp, suspend);