Jonathan Kim [Mon, 26 May 2025 19:39:45 +0000 (15:39 -0400)]
drm/amdkfd: disable shader message vgpr deallocation on gc 12.1
Shader messages to deallocate VGPRs prior to shader end can prevent
the trap handler from saving context, making debugging and core dumps
unreliable.
VGPR deallocations for performance gain is negligible.
GC 12.1 will NOP shader VGPR deallocation messages via HW
settings on driver boot.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Acked-by: Harish Kasiviswanathan <harish.kasiviswanathan@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Unbinding amdgpu has no problems, but binding it again leads to an
error of sysfs file already existing. This is because it wasn't
actually cleaned up on unbind. Add the missing cleanup step.
Fixes: 547aad32edac ("drm/amdgpu: add VCN4 ip block support") Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
amdgpu_acpi_detect() calls some helper functions it calls have large
local structures. When the compiler inlines these helpers, their local
data adds to the amdgpu_acpi_detect() stack frame.
Mark the helpers with noinline_for_stack:
- amdgpu_atif_verify_interface()
- amdgpu_atif_get_notification_params()
- amdgpu_atif_query_backlight_caps()
- amdgpu_atcs_verify_interface()
- amdgpu_acpi_enumerate_xcc()
This keeps the large temporary objects inside the helper’s own stack
frame instead of being inlined into the caller, preventing the caller
from growing beyond the stack limit.
Fixes the below with gcc W=1:
drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:1403:6: warning: stack frame size (1688) exceeds limit (1024) in 'amdgpu_acpi_detect' [-Wframe-larger-than]
Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amdgpu: pass the entity to use to ttm public functions
This way the caller can select the one it wants to use.
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amdgpu: pass the entity to use to amdgpu_ttm_map_buffer
This way the caller can select the one it wants to use.
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amdgpu: fix error handling in amdgpu_copy_buffer
drm_sched_job_add_resv_dependencies can fail in amdgpu_ttm_prepare_job.
In this case we need to use amdgpu_job_free to release memory.
---
v4: moved job pointer clearing to a different patchset
---
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Deduplicate the IB padding code and will also be used
later to check locking.
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
No functional change for now, but this struct will have more
fields added in the next commit.
This change would introduce synchronisation issue, because
dependencies between successive jobs are not taken care of
properly. For instance, amdgpu_ttm_clear_buffer uses
amdgpu_ttm_map_buffer then amdgpu_ttm_fill_mem which should
use different entities (default_entity then move/clear entity).
To prevent failures for this commit, we limit ourselves to
2 entities: default_entity (which replaces high_pr usages) and
clear_entity (which replaces low_pr usages).
The next commits will deal with these dependencies correctly,
and then we'll be able to use move_entity.
---
v2: renamed amdgpu_ttm_buffer_entity
v4: don't use move_entity in ttm yet
---
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> (v3) Acked-by: Felix Kuehling <felix.kuehling@amd.com> (v3) Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jonathan Kim [Fri, 5 Dec 2025 19:41:08 +0000 (14:41 -0500)]
drm/amdkfd: bump minimum vgpr size for gfx1151
GFX1151 has 1.5x the number of available physical VGPRs per SIMD.
Bump total memory availability for acquire checks on queue creation.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mukul Joshi [Thu, 15 May 2025 02:07:47 +0000 (22:07 -0400)]
drm/amdgpu: Revert retry based thrashing prevention on GFX 12.1.0
Revert the change to enable retry based thrashing prevention on GFX 12.1.0
for now as its causing data mismatch and slowness issues with multiple HIP
tests.
Alex Sierra [Thu, 3 Apr 2025 21:49:54 +0000 (16:49 -0500)]
drm/amdgpu: update sh mem base offsets for gfx 12.1
Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/ras: Reduce stack usage in ras_umc_handle_bad_pages()
ras_umc_handle_bad_pages() function used a large local array:
struct eeprom_umc_record records[MAX_ECC_NUM_PER_RETIREMENT];
Move this array off the stack by allocating it with kcalloc()
and freeing it before return.
This reduces the stack frame size of ras_umc_handle_bad_pages()
and avoids the frame size warning.
Fixes the below:
drivers/gpu/drm/amd/amdgpu/../ras/rascore/ras_umc.c:498:5: warning: stack frame size (1208) exceeds limit (1024) in 'ras_umc_handle_bad_pages' [-Wframe-larger-than]
v2: Removed the duplicate ras_umc_get_new_records() invocation. (Lijo)
Cc: Tao Zhou <tao.zhou1@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mukul Joshi [Thu, 27 Mar 2025 21:17:06 +0000 (17:17 -0400)]
drm/amdgpu: Fix SHMEM alignment mode for GFX 12.1.0
Alignment mode in SHMEM config register is only a single bit
value on GFX 12.1.0 instead of 2 bits in previous asics.
Add a new enum and use the correct value of SHMEM alignment mode
when programming the SHMEM config register.
drm/amdkfd: Export the cwsr_size and ctl_stack_size to userspace
This is important for userspace to avoid hardcoding VGPR size.
Reviewed-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
amdkfd: Bump ABI to indicate presence of Trap handler support for expert scheduling
commit 0f0c8a6983db ("drm/amdkfd: Trap handler support for expert
scheduling mode") introduced support for a trap handler when expert
scheduling mode. However userspace needs to know whether or not a trap
handler support is present.
Bump the KFD IOCTL API so that userspace can key off this to decide.
Suggested-by: Stella Laurenzo <stella.laurenzo@amd.com> Fixes: 423888879412 ("drm/amdkfd: Trap handler support for expert scheduling mode") Reviewed-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Taimur Hassan [Sat, 29 Nov 2025 01:15:43 +0000 (20:15 -0500)]
drm/amd/display: Promote DC to 3.2.362
This version brings along the following updates:
- Defer transitions from minimal state to final state
- Remove periodic detection callbacks from dcn35+
- Fixes for S0i3 exit
- Refactor dml_core_mode_support to reduce stack frame
- Add additional info from DML for DMU
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com> Signed-off-by: Chenyu Chen <chen-yu.chen@amd.com> Acked-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Joshua Aberback [Fri, 21 Nov 2025 20:46:17 +0000 (15:46 -0500)]
drm/amd/display: Defer transitions from minimal state to final state
[Why]
In non-seamless pipe transitions, it can take several frames to process
a single flip. One of the reasons is the 2-step transition implementation
where first the minimal transition state is applied, then the final state
is applied, all within the same flip. This delay is noticeable to the user
in some video playback scenarios, which makes for a bad user experience.
[How]
- in applicable non-seamless cases, complete the flip with the minimal
state applied, start a counter, and create all new contexts as minimal
- if another pipe transition occurs while counting, reset the counter
- when the counter finishes, promote the current flip to a full update
and restore creation of optimized contexts
- when creating minimal states from new context, apply stream updates
Reviewed-by: Aric Cyr <aric.cyr@amd.com> Signed-off-by: Joshua Aberback <joshua.aberback@amd.com> Signed-off-by: Chenyu Chen <chen-yu.chen@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 2 Dec 2025 19:24:03 +0000 (14:24 -0500)]
drm/amdgpu: don't attach the tlb fence for SI
SI hardware doesn't support pasids, user mode queues, or
KIQ/MES so there is no need for this. Doing so results in
a segfault as these callbacks are non-existent for SI.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4744 Fixes: f3854e04b708 ("drm/amdgpu: attach tlb fence to the PTs update") Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 11 Nov 2025 16:17:22 +0000 (11:17 -0500)]
drm/amd/display: Use GFP_ATOMIC in dc_create_plane_state()
This can get called from an atomic context.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4470 Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jay Cornwall [Fri, 14 Nov 2025 20:32:42 +0000 (14:32 -0600)]
drm/amdkfd: Trap handler support for expert scheduling mode
The trap may be entered with dependency checking disabled.
Wait for dependency counters and save/restore scheduling mode.
v2:
Use ttmp1 instead of ttmp11. ttmp11 is not zero-initialized.
While the trap handler does zero this field before use, a user-mode
second-level trap handler could not rely on this being zero when
using an older kernel mode driver.
v3:
Use ttmp11 primarily but copy to ttmp1 before jumping to the
second level trap handler. ttmp1 is inspectable by a debugger.
Unexpected bits in the unused space may regress existing software.
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com> Reviewed-by: Lancelot Six <lancelot.six@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amdgpu/sdma_v7_1: Add missing inst_mask entry in sdma_v7_1_inst_gfx_resume()
The comment for sdma_v7_1_inst_gfx_resume() did not include the
inst_mask parameter, even though the function takes it as an argument.
Update the comment to document inst_mask as the mask of SDMA engine
instances to be enabled.
Fixes the below with gcc W=1:
drivers/gpu/drm/amd/amdgpu/sdma_v7_1.c:644 function parameter 'inst_mask' not described in 'sdma_v7_1_inst_gfx_resume'
Cc: Likun Gao <Likun.Gao@amd.com> Cc: Le Ma <le.ma@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Zhu Lingshan [Thu, 21 Aug 2025 09:22:44 +0000 (17:22 +0800)]
amdkfd: introduce new ioctl AMDKFD_IOC_CREATE_PROCESS
This commit implemetns a new ioctl AMDKFD_IOC_CREATE_PROCESS
that creates a new secondary kfd_progress on the FD.
To keep backward compatibility, userspace programs need to invoke
this ioctl explicitly on a FD to create a secondary
kfd_process which replacing its primary kfd_process.
This commit bumps ioctl minor version.
Signed-off-by: Zhu Lingshan <lingshan.zhu@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Xiaogang Chen [Mon, 1 Dec 2025 20:12:29 +0000 (14:12 -0600)]
drm/amdkfd: Use huge page size to check split svm range alignment
When split svm ranges that have been mapped using huge page should use huge
page size(2MB) to check split range alignment, not prange->granularity that
means migration granularity.
Fixes: 7ef6b2d4b7e5 ("drm/amdkfd: remap unaligned svm ranges that have split") Signed-off-by: Xiaogang Chen <xiaogang.chen@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Likun Gao [Fri, 21 Mar 2025 03:23:44 +0000 (11:23 +0800)]
drm/amdgpu: add soc config init for GC v12_1
Add function to initialize soc configuration information
for GC 12.1.0 ASICs.
Use it to map IPs and other SOC related information once IP
configuration information is available through discovery.
Zhu Lingshan [Tue, 5 Aug 2025 03:31:32 +0000 (11:31 +0800)]
amdkfd: fence handler evict and restore a kfd process by its context id
In fence enable signaling handler, kfd evicts
and restores the corresponding kfd_process,
this commit helps find the kfd_process by
both its mm and context id.
Signed-off-by: Zhu Lingshan <lingshan.zhu@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jack Xiao [Mon, 10 Mar 2025 09:02:36 +0000 (17:02 +0800)]
drm/amdgpu/mes12_1: add cooperative dispatch support
Add initial cooperative dispatch MES support.
a. set shared cmd buffer for the group of cooperative dispatch mes.
b. automatically dispatch add_hw_queue/remove_hw_queue to master mes.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mukul Joshi [Thu, 27 Feb 2025 03:46:54 +0000 (22:46 -0500)]
drm/amdkfd: Add interrupt handling for GFX 12.1.0
Add interrupt handling for GFX 12.1.0 similar to what is done
for GFX 9.4.3.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mukul Joshi [Thu, 27 Feb 2025 03:22:00 +0000 (22:22 -0500)]
drm/amdkfd: Add MQD manager for GFX 12.1.0
This patch adds the following functionality for GFX 12.1.0:
1. Add a new MQD manager for GFX v12.1.0.
2. Add a new 12.1.0 specific device queue manager file.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mukul Joshi [Thu, 27 Feb 2025 03:09:56 +0000 (22:09 -0500)]
drm/amdkfd: Add GFX 12.1.0 support in KFD
Add support for GFX 12.1.0 in KFD.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mukul Joshi [Wed, 26 Feb 2025 20:45:46 +0000 (15:45 -0500)]
drm/amdgpu: Setup PCIe atomics bit in PTE on GFX 12.1.0
To enable atomic access to memory, setup the new PCIe atomics bit
in PTE on GFX 12.1.0.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mukul Joshi [Wed, 26 Feb 2025 20:35:23 +0000 (15:35 -0500)]
drm/amdgpu: Setup Atomics enable in TCP UTCL0 for GFX 12.1.0
We need to explicitly setup atomics enable in TCP UTCL0 to enable
PCIe atomics to host memory.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mukul Joshi [Wed, 26 Feb 2025 20:29:09 +0000 (15:29 -0500)]
drm/amdgpu: Fix golden register init for GFX 12.1.0
TCP_UTCL0 registers are not per XCD so don't init them on a per
XCD basis.
Fixes: ad5f1ee0a9b0 ("drm/amdgpu: Add initial support for gfx v12_1") Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Sierra [Fri, 7 Mar 2025 18:17:34 +0000 (12:17 -0600)]
drm/amd: include rrmt mode for mes_v12_1
Implement rrmt for misc read/write regs ops in mes_v12.
This covers LOCAL/REMOTE XCD and LOCAL/REMOTE AID.
v2: fix comments (Alex)
Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Michael Chen <michael.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Shaoyun Liu [Wed, 5 Feb 2025 17:06:50 +0000 (12:06 -0500)]
drm/amd/include : Update MES v12 API header
1. Add RRMT option support which will be used for remote die
register access
2. Update set_hw_resource1 for cooperative mode support
3. Add full_sh_mem_config_data for xnack support
v2: squash in compilation fix
Signed-off-by: Shaoyun Liu <shaoyun.liu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Zhu Lingshan [Thu, 21 Aug 2025 06:41:31 +0000 (14:41 +0800)]
amdkfd: introduce new helper kfd_lookup_process_by_id
This commit introduces a new helper function
kfd_lookup_process_by_id which can find a
kfd process that identified by its context id from
the kfd process table
Signed-off-by: Zhu Lingshan <lingshan.zhu@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Randy Dunlap [Sat, 29 Nov 2025 07:37:39 +0000 (23:37 -0800)]
drm/amd/display: correct kernel-doc in dml21_wrapper.h
Fix all kernel-doc warnings in dml21_wrapper.h:
- add missing @dml_ctx entries (2 places)
- fix function prototype typo for dml21_create()
- change a blank kernel-doc line to " *"
Fixes these warnings:
Warning: drivers/gpu/drm/amd/display/dc/dml2_0/dml21/dml21_wrapper.h:30
function parameter 'dml_ctx' not described in 'dml21_create'
Warning: drivers/gpu/drm/amd/display/dc/dml2_0/dml21/dml21_wrapper.h:30
expecting prototype for dml2_create(). Prototype was for dml21_create()
instead
Warning: drivers/gpu/drm/amd/display/dc/dml2_0/dml21/dml21_wrapper.h:55
bad line:
Warning: drivers/gpu/drm/amd/display/dc/dml2_0/dml21/dml21_wrapper.h:61
function parameter 'dml_ctx' not described in 'dml21_validate'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>