git.ipfire.org Git - thirdparty/kernel/linux.git/commitdiff
drm/pagemap: Disable device-to-device migration
author Matthew Brost <matthew.brost@intel.com>
Wed, 7 Jan 2026 18:27:16 +0000 (10:27 -0800)
committer Matthew Brost <matthew.brost@intel.com>
Thu, 8 Jan 2026 05:29:40 +0000 (21:29 -0800)
Device-to-device migration is causing
'xe_exec_system_allocator --r *race*no*' to intermittently fail with
engine resets and a kernel hang on a page lock. This should work but is
clearly buggy somewhere. Disable device-to-device migration in the
interim until the issue can be root-caused.

The only downside of disabling device-to-device migration is that memory
will bounce through system memory during migration. However, this path
should be rare, as it only occurs when madvise attributes are changed or
atomics are used.

Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Fixes: ec265e1f1cfc ("drm/pagemap: Support source migration over interconnect")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Francois Dugast <francois.dugast@intel.com>
Link: https://patch.msgid.link/20260107182716.2236607-3-matthew.brost@intel.com
drivers/gpu/drm/drm_pagemap.c

index aa43a8475100fca159c4d823dd2c3fb05de01490..03ee39a761a41cee8155ec522891daf3eea9e4d8 100644
@@ -480,8 +480,18 @@ int drm_pagemap_migrate_to_devmem(struct drm_pagemap_devmem *devmem_allocation,
                .start          = start,
                .end            = end,
                .pgmap_owner    = pagemap->owner,
-               .flags          = MIGRATE_VMA_SELECT_SYSTEM | MIGRATE_VMA_SELECT_DEVICE_COHERENT |
-               MIGRATE_VMA_SELECT_DEVICE_PRIVATE,
+               /*
+                * FIXME: MIGRATE_VMA_SELECT_DEVICE_PRIVATE intermittently
+                * causes 'xe_exec_system_allocator --r *race*no*' to trigger an
+                * engine reset and a hard hang due to getting stuck on a folio
+                * lock. This should work and needs to be root-caused. The only
+                * downside of not selecting MIGRATE_VMA_SELECT_DEVICE_PRIVATE
+                * is that device-to-device migrations won’t work; instead,
+                * memory will bounce through system memory. This path should be
+                * rare and only occur when the madvise attributes of memory are
+                * changed or atomics are being used.
+                */
+               .flags          = MIGRATE_VMA_SELECT_SYSTEM | MIGRATE_VMA_SELECT_DEVICE_COHERENT,
        };
        unsigned long i, npages = npages_in_range(start, end);
        unsigned long own_pages = 0, migrated_pages = 0;
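
The effect of the flag change above can be sketched in user-space C. This is a simplified illustration, not kernel code: the flag values are assumed to mirror include/linux/migrate.h, while `enum page_kind`, `page_selected()`, `FLAGS_BEFORE` and `FLAGS_AFTER` are hypothetical names introduced only for this sketch. The real collection path also compares the pgmap owner for device pages, which is omitted here.

```c
#include <stdbool.h>

/*
 * Illustrative copies of the source-selection flags. The values are
 * assumed to match include/linux/migrate.h; verify against your tree.
 */
enum migrate_vma_select {
	MIGRATE_VMA_SELECT_SYSTEM          = 1 << 0,
	MIGRATE_VMA_SELECT_DEVICE_PRIVATE  = 1 << 1,
	MIGRATE_VMA_SELECT_DEVICE_COHERENT = 1 << 2,
};

/* Hypothetical classification of where a source page currently lives. */
enum page_kind {
	PAGE_SYSTEM,
	PAGE_DEVICE_PRIVATE,
	PAGE_DEVICE_COHERENT,
};

/* Flags before this patch: device-private pages are valid sources. */
#define FLAGS_BEFORE (MIGRATE_VMA_SELECT_SYSTEM | \
		      MIGRATE_VMA_SELECT_DEVICE_COHERENT | \
		      MIGRATE_VMA_SELECT_DEVICE_PRIVATE)

/* Flags after this patch: device-private pages are skipped. */
#define FLAGS_AFTER  (MIGRATE_VMA_SELECT_SYSTEM | \
		      MIGRATE_VMA_SELECT_DEVICE_COHERENT)

/*
 * Return true if a page of the given kind would be picked up as a
 * migration source under the given selection flags (owner checks
 * omitted for brevity).
 */
static bool page_selected(unsigned int flags, enum page_kind kind)
{
	switch (kind) {
	case PAGE_SYSTEM:
		return (flags & MIGRATE_VMA_SELECT_SYSTEM) != 0;
	case PAGE_DEVICE_PRIVATE:
		return (flags & MIGRATE_VMA_SELECT_DEVICE_PRIVATE) != 0;
	case PAGE_DEVICE_COHERENT:
		return (flags & MIGRATE_VMA_SELECT_DEVICE_COHERENT) != 0;
	}
	return false;
}
```

With `FLAGS_AFTER`, `page_selected(FLAGS_AFTER, PAGE_DEVICE_PRIVATE)` is false, so a page resident in another device's private memory is not collected for direct migration. As the commit message describes, such memory instead bounces through system memory, where it is selected again via MIGRATE_VMA_SELECT_SYSTEM.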