drm/amd/ras: Reduce stack usage in amdgpu_virt_ras_get_cper_records()
amdgpu_virt_ras_get_cper_records() was using a large stack array
of ras_log_info pointers. This contributed to the frame size
warning on this function.
Replace the fixed-size stack array:
struct ras_log_info *trace[MAX_RECORD_PER_BATCH];
with a heap-allocated array using kcalloc().
We free the trace buffer together with out_buf on all exit paths.
If allocation of trace or out_buf fails, we return a generic RAS
error code.
This reduces stack usage and keeps the runtime behaviour
unchanged.
Fixes:
stack frame size: 1112 bytes (limit: 1024)
Cc: Tao Zhou <tao.zhou1@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Philip Yang [Wed, 19 Nov 2025 21:32:45 +0000 (16:32 -0500)]
drm/amdkfd: Handle GPU reset and drain retry fault race
Only check and drain IH1 ring if CAM is not enabled.
If GPU is under reset, don't access IH to drain retry fault.
Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Revert "drm/amd/display: Fix pbn to kbps Conversion"
Deeply daisy chained DP/MST displays are no longer able to light
up. This reverts commit e0dec00f3d05 ("drm/amd/display: Fix pbn
to kbps Conversion")
Cc: Jerry Zuo <jerry.zuo@amd.com> Reported-by: nat@nullable.se Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4756 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Likun Gao [Wed, 21 May 2025 00:52:36 +0000 (08:52 +0800)]
drm/amdgpu: revision doorbel range for gfx v12_1
Revision doorbell range on muti-XCC mode for gfx v12_1.
Clean up doorbell range set for graphics engine.
V2: Remove doorbell range set from gfx_v12_1_xcc_kiq_init_register.
Jonathan Kim [Mon, 26 May 2025 19:39:45 +0000 (15:39 -0400)]
drm/amdkfd: disable shader message vgpr deallocation on gc 12.1
Shader messages to deallocate VGPRs prior to shader end can prevent
the trap handler from saving context, making debugging and core dumps
unreliable.
VGPR deallocations for performance gain is negligible.
GC 12.1 will NOP shader VGPR deallocation messages via HW
settings on driver boot.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Acked-by: Harish Kasiviswanathan <harish.kasiviswanathan@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Unbinding amdgpu has no problems, but binding it again leads to an
error of sysfs file already existing. This is because it wasn't
actually cleaned up on unbind. Add the missing cleanup step.
Fixes: 547aad32edac ("drm/amdgpu: add VCN4 ip block support") Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
amdgpu_acpi_detect() calls some helper functions it calls have large
local structures. When the compiler inlines these helpers, their local
data adds to the amdgpu_acpi_detect() stack frame.
Mark the helpers with noinline_for_stack:
- amdgpu_atif_verify_interface()
- amdgpu_atif_get_notification_params()
- amdgpu_atif_query_backlight_caps()
- amdgpu_atcs_verify_interface()
- amdgpu_acpi_enumerate_xcc()
This keeps the large temporary objects inside the helper’s own stack
frame instead of being inlined into the caller, preventing the caller
from growing beyond the stack limit.
Fixes the below with gcc W=1:
drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:1403:6: warning: stack frame size (1688) exceeds limit (1024) in 'amdgpu_acpi_detect' [-Wframe-larger-than]
Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amdgpu: pass the entity to use to ttm public functions
This way the caller can select the one it wants to use.
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amdgpu: pass the entity to use to amdgpu_ttm_map_buffer
This way the caller can select the one it wants to use.
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amdgpu: fix error handling in amdgpu_copy_buffer
drm_sched_job_add_resv_dependencies can fail in amdgpu_ttm_prepare_job.
In this case we need to use amdgpu_job_free to release memory.
---
v4: moved job pointer clearing to a different patchset
---
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Deduplicate the IB padding code and will also be used
later to check locking.
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
No functional change for now, but this struct will have more
fields added in the next commit.
This change would introduce synchronisation issue, because
dependencies between successive jobs are not taken care of
properly. For instance, amdgpu_ttm_clear_buffer uses
amdgpu_ttm_map_buffer then amdgpu_ttm_fill_mem which should
use different entities (default_entity then move/clear entity).
To prevent failures for this commit, we limit ourselves to
2 entities: default_entity (which replaces high_pr usages) and
clear_entity (which replaces low_pr usages).
The next commits will deal with these dependencies correctly,
and then we'll be able to use move_entity.
---
v2: renamed amdgpu_ttm_buffer_entity
v4: don't use move_entity in ttm yet
---
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> (v3) Acked-by: Felix Kuehling <felix.kuehling@amd.com> (v3) Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jonathan Kim [Fri, 5 Dec 2025 19:41:08 +0000 (14:41 -0500)]
drm/amdkfd: bump minimum vgpr size for gfx1151
GFX1151 has 1.5x the number of available physical VGPRs per SIMD.
Bump total memory availability for acquire checks on queue creation.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mukul Joshi [Thu, 15 May 2025 02:07:47 +0000 (22:07 -0400)]
drm/amdgpu: Revert retry based thrashing prevention on GFX 12.1.0
Revert the change to enable retry based thrashing prevention on GFX 12.1.0
for now as its causing data mismatch and slowness issues with multiple HIP
tests.
Alex Sierra [Thu, 3 Apr 2025 21:49:54 +0000 (16:49 -0500)]
drm/amdgpu: update sh mem base offsets for gfx 12.1
Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/ras: Reduce stack usage in ras_umc_handle_bad_pages()
ras_umc_handle_bad_pages() function used a large local array:
struct eeprom_umc_record records[MAX_ECC_NUM_PER_RETIREMENT];
Move this array off the stack by allocating it with kcalloc()
and freeing it before return.
This reduces the stack frame size of ras_umc_handle_bad_pages()
and avoids the frame size warning.
Fixes the below:
drivers/gpu/drm/amd/amdgpu/../ras/rascore/ras_umc.c:498:5: warning: stack frame size (1208) exceeds limit (1024) in 'ras_umc_handle_bad_pages' [-Wframe-larger-than]
v2: Removed the duplicate ras_umc_get_new_records() invocation. (Lijo)
Cc: Tao Zhou <tao.zhou1@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mukul Joshi [Thu, 27 Mar 2025 21:17:06 +0000 (17:17 -0400)]
drm/amdgpu: Fix SHMEM alignment mode for GFX 12.1.0
Alignment mode in SHMEM config register is only a single bit
value on GFX 12.1.0 instead of 2 bits in previous asics.
Add a new enum and use the correct value of SHMEM alignment mode
when programming the SHMEM config register.
drm/amdkfd: Export the cwsr_size and ctl_stack_size to userspace
This is important for userspace to avoid hardcoding VGPR size.
Reviewed-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
amdkfd: Bump ABI to indicate presence of Trap handler support for expert scheduling
commit 0f0c8a6983db ("drm/amdkfd: Trap handler support for expert
scheduling mode") introduced support for a trap handler when expert
scheduling mode. However userspace needs to know whether or not a trap
handler support is present.
Bump the KFD IOCTL API so that userspace can key off this to decide.
Suggested-by: Stella Laurenzo <stella.laurenzo@amd.com> Fixes: 423888879412 ("drm/amdkfd: Trap handler support for expert scheduling mode") Reviewed-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Taimur Hassan [Sat, 29 Nov 2025 01:15:43 +0000 (20:15 -0500)]
drm/amd/display: Promote DC to 3.2.362
This version brings along the following updates:
- Defer transitions from minimal state to final state
- Remove periodic detection callbacks from dcn35+
- Fixes for S0i3 exit
- Refactor dml_core_mode_support to reduce stack frame
- Add additional info from DML for DMU
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com> Signed-off-by: Chenyu Chen <chen-yu.chen@amd.com> Acked-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Joshua Aberback [Fri, 21 Nov 2025 20:46:17 +0000 (15:46 -0500)]
drm/amd/display: Defer transitions from minimal state to final state
[Why]
In non-seamless pipe transitions, it can take several frames to process
a single flip. One of the reasons is the 2-step transition implementation
where first the minimal transition state is applied, then the final state
is applied, all within the same flip. This delay is noticeable to the user
in some video playback scenarios, which makes for a bad user experience.
[How]
- in applicable non-seamless cases, complete the flip with the minimal
state applied, start a counter, and create all new contexts as minimal
- if another pipe transition occurs while counting, reset the counter
- when the counter finishes, promote the current flip to a full update
and restore creation of optimized contexts
- when creating minimal states from new context, apply stream updates
Reviewed-by: Aric Cyr <aric.cyr@amd.com> Signed-off-by: Joshua Aberback <joshua.aberback@amd.com> Signed-off-by: Chenyu Chen <chen-yu.chen@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 2 Dec 2025 19:24:03 +0000 (14:24 -0500)]
drm/amdgpu: don't attach the tlb fence for SI
SI hardware doesn't support pasids, user mode queues, or
KIQ/MES so there is no need for this. Doing so results in
a segfault as these callbacks are non-existent for SI.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4744 Fixes: f3854e04b708 ("drm/amdgpu: attach tlb fence to the PTs update") Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 11 Nov 2025 16:17:22 +0000 (11:17 -0500)]
drm/amd/display: Use GFP_ATOMIC in dc_create_plane_state()
This can get called from an atomic context.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4470 Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jay Cornwall [Fri, 14 Nov 2025 20:32:42 +0000 (14:32 -0600)]
drm/amdkfd: Trap handler support for expert scheduling mode
The trap may be entered with dependency checking disabled.
Wait for dependency counters and save/restore scheduling mode.
v2:
Use ttmp1 instead of ttmp11. ttmp11 is not zero-initialized.
While the trap handler does zero this field before use, a user-mode
second-level trap handler could not rely on this being zero when
using an older kernel mode driver.
v3:
Use ttmp11 primarily but copy to ttmp1 before jumping to the
second level trap handler. ttmp1 is inspectable by a debugger.
Unexpected bits in the unused space may regress existing software.
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com> Reviewed-by: Lancelot Six <lancelot.six@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amdgpu/sdma_v7_1: Add missing inst_mask entry in sdma_v7_1_inst_gfx_resume()
The comment for sdma_v7_1_inst_gfx_resume() did not include the
inst_mask parameter, even though the function takes it as an argument.
Update the comment to document inst_mask as the mask of SDMA engine
instances to be enabled.
Fixes the below with gcc W=1:
drivers/gpu/drm/amd/amdgpu/sdma_v7_1.c:644 function parameter 'inst_mask' not described in 'sdma_v7_1_inst_gfx_resume'
Cc: Likun Gao <Likun.Gao@amd.com> Cc: Le Ma <le.ma@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Zhu Lingshan [Thu, 21 Aug 2025 09:22:44 +0000 (17:22 +0800)]
amdkfd: introduce new ioctl AMDKFD_IOC_CREATE_PROCESS
This commit implemetns a new ioctl AMDKFD_IOC_CREATE_PROCESS
that creates a new secondary kfd_progress on the FD.
To keep backward compatibility, userspace programs need to invoke
this ioctl explicitly on a FD to create a secondary
kfd_process which replacing its primary kfd_process.
This commit bumps ioctl minor version.
Signed-off-by: Zhu Lingshan <lingshan.zhu@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Xiaogang Chen [Mon, 1 Dec 2025 20:12:29 +0000 (14:12 -0600)]
drm/amdkfd: Use huge page size to check split svm range alignment
When split svm ranges that have been mapped using huge page should use huge
page size(2MB) to check split range alignment, not prange->granularity that
means migration granularity.
Fixes: 7ef6b2d4b7e5 ("drm/amdkfd: remap unaligned svm ranges that have split") Signed-off-by: Xiaogang Chen <xiaogang.chen@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Likun Gao [Fri, 21 Mar 2025 03:23:44 +0000 (11:23 +0800)]
drm/amdgpu: add soc config init for GC v12_1
Add function to initialize soc configuration information
for GC 12.1.0 ASICs.
Use it to map IPs and other SOC related information once IP
configuration information is available through discovery.
Zhu Lingshan [Tue, 5 Aug 2025 03:31:32 +0000 (11:31 +0800)]
amdkfd: fence handler evict and restore a kfd process by its context id
In fence enable signaling handler, kfd evicts
and restores the corresponding kfd_process,
this commit helps find the kfd_process by
both its mm and context id.
Signed-off-by: Zhu Lingshan <lingshan.zhu@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jack Xiao [Mon, 10 Mar 2025 09:02:36 +0000 (17:02 +0800)]
drm/amdgpu/mes12_1: add cooperative dispatch support
Add initial cooperative dispatch MES support.
a. set shared cmd buffer for the group of cooperative dispatch mes.
b. automatically dispatch add_hw_queue/remove_hw_queue to master mes.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mukul Joshi [Thu, 27 Feb 2025 03:46:54 +0000 (22:46 -0500)]
drm/amdkfd: Add interrupt handling for GFX 12.1.0
Add interrupt handling for GFX 12.1.0 similar to what is done
for GFX 9.4.3.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mukul Joshi [Thu, 27 Feb 2025 03:22:00 +0000 (22:22 -0500)]
drm/amdkfd: Add MQD manager for GFX 12.1.0
This patch adds the following functionality for GFX 12.1.0:
1. Add a new MQD manager for GFX v12.1.0.
2. Add a new 12.1.0 specific device queue manager file.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mukul Joshi [Thu, 27 Feb 2025 03:09:56 +0000 (22:09 -0500)]
drm/amdkfd: Add GFX 12.1.0 support in KFD
Add support for GFX 12.1.0 in KFD.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mukul Joshi [Wed, 26 Feb 2025 20:45:46 +0000 (15:45 -0500)]
drm/amdgpu: Setup PCIe atomics bit in PTE on GFX 12.1.0
To enable atomic access to memory, setup the new PCIe atomics bit
in PTE on GFX 12.1.0.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mukul Joshi [Wed, 26 Feb 2025 20:35:23 +0000 (15:35 -0500)]
drm/amdgpu: Setup Atomics enable in TCP UTCL0 for GFX 12.1.0
We need to explicitly setup atomics enable in TCP UTCL0 to enable
PCIe atomics to host memory.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mukul Joshi [Wed, 26 Feb 2025 20:29:09 +0000 (15:29 -0500)]
drm/amdgpu: Fix golden register init for GFX 12.1.0
TCP_UTCL0 registers are not per XCD so don't init them on a per
XCD basis.
Fixes: ad5f1ee0a9b0 ("drm/amdgpu: Add initial support for gfx v12_1") Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Sierra [Fri, 7 Mar 2025 18:17:34 +0000 (12:17 -0600)]
drm/amd: include rrmt mode for mes_v12_1
Implement rrmt for misc read/write regs ops in mes_v12.
This covers LOCAL/REMOTE XCD and LOCAL/REMOTE AID.
v2: fix comments (Alex)
Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Michael Chen <michael.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>