drm/amdgpu: move devcoredump generation to a worker

Update how drm_coredump_printer is used, following its documentation
and Xe's code: the main idea is to generate the complete dump in one go
and then use memcpy to return the chunks requested by the caller of
amdgpu_devcoredump_read.

The generation is moved to a separate worker thread.

This cuts the time to copy the dump from 40s to ~0s on my machine.
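
For illustration only, the generate-once-then-memcpy idea can be
sketched in plain userspace C. This is not the driver code: struct
dump_state, dump_generate() and dump_read() are hypothetical names, the
register lines are made-up sample content, and the worker and its
synchronisation are not modelled here.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the per-dump state kept by the driver. */
struct dump_state {
	char *formatted;      /* complete dump text, built exactly once */
	size_t formatted_len; /* number of bytes in formatted */
};

/* Format the whole dump in one pass (the worker's job in this patch). */
static int dump_generate(struct dump_state *d, int nregs)
{
	size_t cap = (size_t)nregs * 32 + 1; /* 32 bytes per line is enough */
	size_t off = 0;
	char *buf = malloc(cap);

	if (!buf)
		return -1;
	buf[0] = '\0';

	for (int i = 0; i < nregs; i++)
		off += (size_t)snprintf(buf + off, cap - off,
					"reg[%d] = 0x%08x\n", i, (unsigned)i * 4u);

	d->formatted = buf;
	d->formatted_len = off;
	return 0;
}

/*
 * Serve one chunk of the already formatted dump. No formatting happens
 * here: each call is a bounds check plus a memcpy. Returns the number
 * of bytes copied, 0 at end of dump, -1 if nothing was generated.
 */
static long dump_read(const struct dump_state *d, char *out,
		      size_t offset, size_t count)
{
	size_t n;

	assert(out != NULL);

	if (!d->formatted)
		return -1;
	if (offset >= d->formatted_len)
		return 0;

	n = d->formatted_len - offset;
	if (n > count)
		n = count;

	memcpy(out, d->formatted + offset, n);
	return (long)n;
}
```

With this layout the cost of a read no longer depends on how many
chunks the caller asks for, since the text is never regenerated per
call.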
---
v3:
- remove adev->coredump_in_progress and use the work item as the
  synchronisation mechanism
- use kvfree instead of kfree
---
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>