]> git.ipfire.org Git - thirdparty/linux.git/commitdiff
drm/amdkfd: Clear VRAM on allocation to prevent stale data exposure
authorAmir Shetaia <Amir.Shetaia@amd.com>
Fri, 10 Apr 2026 14:38:13 +0000 (10:38 -0400)
committerAlex Deucher <alexander.deucher@amd.com>
Fri, 17 Apr 2026 18:47:48 +0000 (14:47 -0400)
KFD VRAM allocations set AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE
but not AMDGPU_GEM_CREATE_VRAM_CLEARED, leaving freshly allocated
VRAM with stale data from prior use observable by compute kernels.

The GEM ioctl path already sets VRAM_CLEARED for all userspace
allocations via amdgpu_gem_create_ioctl() and
amdgpu_mode_dumb_create(). The KFD path was missing this flag,
allowing stale page table remnants to leak into user buffers.

This causes crashes in RCCL P2P transport where non-zero data in
ptrExchange/head/tail fields corrupts the protocol handshake.

Signed-off-by: Amir Shetaia <Amir.Shetaia@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c

index 29b400cdd6d5f97237556988f3c9f7aa222f562b..72a5a29e63f6da4008c92dc13c17a65d897e7177 100644 (file)
@@ -1735,7 +1735,8 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
                        alloc_domain = AMDGPU_GEM_DOMAIN_GTT;
                        alloc_flags = 0;
                } else {
-                       alloc_flags = AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE;
+                       alloc_flags = AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE |
+                               AMDGPU_GEM_CREATE_VRAM_CLEARED;
                        alloc_flags |= (flags & KFD_IOC_ALLOC_MEM_FLAGS_PUBLIC) ?
                        AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED : 0;