From: Greg Kroah-Hartman Date: Wed, 28 Jun 2023 19:01:30 +0000 (+0200) Subject: 5.15-stable patches X-Git-Tag: v6.4.1~46 X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=d1def2de912e96bac9179373dba25adba3e1735d;p=thirdparty%2Fkernel%2Fstable-queue.git 5.15-stable patches added patches: drm-amdgpu-set-vmbo-destroy-after-pt-bo-is-created.patch --- diff --git a/queue-5.15/drm-amdgpu-set-vmbo-destroy-after-pt-bo-is-created.patch b/queue-5.15/drm-amdgpu-set-vmbo-destroy-after-pt-bo-is-created.patch new file mode 100644 index 00000000000..702943ea1d3 --- /dev/null +++ b/queue-5.15/drm-amdgpu-set-vmbo-destroy-after-pt-bo-is-created.patch @@ -0,0 +1,68 @@ +From 9a3c6067bd2ee2ca2652fbb0679f422f3c9109f9 Mon Sep 17 00:00:00 2001 +From: Philip Yang +Date: Mon, 3 Oct 2022 13:03:26 -0400 +Subject: drm/amdgpu: Set vmbo destroy after pt bo is created +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +From: Philip Yang + +commit 9a3c6067bd2ee2ca2652fbb0679f422f3c9109f9 upstream. + +Under VRAM usage pression, map to GPU may fail to create pt bo and +vmbo->shadow_list is not initialized, then ttm_bo_release calling +amdgpu_bo_vm_destroy to access vmbo->shadow_list generates below +dmesg and NULL pointer access backtrace: + +Set vmbo destroy callback to amdgpu_bo_vm_destroy only after creating pt +bo successfully, otherwise use default callback amdgpu_bo_destroy. + +amdgpu: amdgpu_vm_bo_update failed +amdgpu: update_gpuvm_pte() failed +amdgpu: Failed to map bo to gpuvm +amdgpu 0000:43:00.0: amdgpu: Failed to map peer:0000:43:00.0 mem_domain:2 +BUG: kernel NULL pointer dereference, address: + RIP: 0010:amdgpu_bo_vm_destroy+0x4d/0x80 [amdgpu] + Call Trace: + + ttm_bo_release+0x207/0x320 [amdttm] + amdttm_bo_init_reserved+0x1d6/0x210 [amdttm] + amdgpu_bo_create+0x1ba/0x520 [amdgpu] + amdgpu_bo_create_vm+0x3a/0x80 [amdgpu] + amdgpu_vm_pt_create+0xde/0x270 [amdgpu] + amdgpu_vm_ptes_update+0x63b/0x710 [amdgpu] + amdgpu_vm_update_range+0x2e7/0x6e0 [amdgpu] + amdgpu_vm_bo_update+0x2bd/0x600 [amdgpu] + update_gpuvm_pte+0x160/0x420 [amdgpu] + amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x313/0x1130 [amdgpu] + kfd_ioctl_map_memory_to_gpu+0x115/0x390 [amdgpu] + kfd_ioctl+0x24a/0x5b0 [amdgpu] + +Signed-off-by: Philip Yang +Reviewed-by: Christian König +Signed-off-by: Alex Deucher +[ This fixes a regression introduced by commit 1cc40dccad76 ("drm/amdgpu: + fix Null pointer dereference error in amdgpu_device_recover_vram") in + 5.15.118. It's a hand modified cherry-pick because that commit that + introduced the regression touched nearby code and the context is now + incorrect. ] +Cc: Linux Regressions +Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2650 +Fixes: 1cc40dccad76 ("drm/amdgpu: fix Null pointer dereference error in amdgpu_device_recover_vram") +Signed-off-by: Mario Limonciello +Signed-off-by: Greg Kroah-Hartman +--- + drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 1 - + 1 file changed, 1 deletion(-) + +--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c ++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c +@@ -685,7 +685,6 @@ int amdgpu_bo_create_vm(struct amdgpu_de + * num of amdgpu_vm_pt entries. + */ + BUG_ON(bp->bo_ptr_size < sizeof(struct amdgpu_bo_vm)); +- bp->destroy = &amdgpu_bo_vm_destroy; + r = amdgpu_bo_create(adev, bp, &bo_ptr); + if (r) + return r; diff --git a/queue-5.15/series b/queue-5.15/series index f74c86f7040..deaadceeabc 100644 --- a/queue-5.15/series +++ b/queue-5.15/series @@ -2,3 +2,4 @@ mptcp-fix-possible-divide-by-zero-in-recvmsg.patch mptcp-consolidate-fallback-and-non-fallback-state-machine.patch mm-hwpoison-try-to-recover-from-copy-on-write-faults.patch mm-hwpoison-when-copy-on-write-hits-poison-take-page-offline.patch +drm-amdgpu-set-vmbo-destroy-after-pt-bo-is-created.patch