When running on a Cherryview, or on a Broxton with VTD enabled, pinning of
a VMA to GGTT is now committed asynchronously to avoid lock inversion
among reservation_ww and cpu_hotplug locks, the latter acquired from
stop_machine(). That may defer further processing of resources that
depend on that VMA. As a consequence, a 10ms delay in a multithreaded
migrate test case may occur too short and still incomplete threads may be
interrupted, and the test case may fail with -ERESTARTSYS or -EINTR error
code returned by any of those threads.
Extend the delay to empirically determined 100ms on affected platforms.
v3: Add an in-line comment that explains why 100ms (Andi).
v2: Fix spelling (Sebastian, Krzysztof),
- explain why VMA pinning is commited asynchronously on CHV/BXT+VTD
(Krzysztof).
Cc: Andi Shyti <andi.shyti@kernel.org>
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Reviewed-by: Sebastian Brzezinka <sebastian.brzezinka@intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Acked-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://lore.kernel.org/r/20251023082925.351307-7-janusz.krzysztofik@linux.intel.com
thread[i].tsk = tsk;
}
- msleep(10 * n_cpus); /* start all threads before we kthread_stop() */
+ /*
+ * Start all threads before we kthread_stop().
+ * In CHV / BXT+VTD environments, where VMA pinning is committed
+ * asynchronously, empirically determined 100ms delay is needed
+ * to avoid stopping threads that may still wait for completion of
+ * intel_ggtt_bind_vma and fail with -ERESTARTSYS when interrupted.
+ */
+ msleep((intel_vm_no_concurrent_access_wa(migrate->context->vm->i915) ? 100 : 10) * n_cpus);
for (i = 0; i < n_cpus; ++i) {
struct task_struct *tsk = thread[i].tsk;