Alex Deucher [Thu, 20 Mar 2025 18:10:17 +0000 (14:10 -0400)]
drm/amdgpu/gfx: decouple the number of kgqs from the hw
The driver currently sets up one kgq per pipe. As such
adev->gfx.me.num_queue_per_pipe is hardcoded to 1 everywhere.
This is fine for kernel queues, but when we enable user queues
we need to know that actual number of queues per pipe. Decouple
the kgq setup from the actual hardware count. For dev core
dumps and user queues, we want to know the actual number
of queues per pipe.
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amdgpu/gfx10: Add Cleaner Shader Support for GFX10.3.x GPUs
Enable the cleaner shader for other GFX10.3.x series of GPUs to provide
data isolation between GPU workloads. The cleaner shader is responsible
for clearing the Local Data Store (LDS), Vector General Purpose
Registers (VGPRs), and Scalar General Purpose Registers (SGPRs), which
helps prevent data leakage and ensures accurate computation results.
This update extends cleaner shader support to GFX10.3.x GPUs, previously
available for GFX10.3.0. It enhances security by clearing GPU memory
between processes and maintains a consistent GPU state across KGD and
KFD workloads.
Cc: Mario Sopena-Novales <mario.novales@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Thu, 27 Feb 2025 17:31:28 +0000 (12:31 -0500)]
drm/amdgpu: add rebar parameter
Add a new parameter to disable BAR resizing. Note that this
only disables the driver from attempting to resize the BAR,
The BIOS may have resized the BAR at boot.
Some teams have found this useful in debugging P2P DMA
issues on systems where the available MMIO space did not allow
for all of the GPUs present to resize their BARs.
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alexandre Demers [Sat, 22 Mar 2025 01:46:48 +0000 (21:46 -0400)]
drm/amdgpu: use GRPH_SECONDARY_SURFACE_ADDRESS_MASK with GRPH_SECONDARY_SURFACE_ADDRESS in DCE6
It seems a copy-paste error: since we are working with
mmGRPH_SECONDARY_SURFACE_ADDRESS,
GRPH_SECONDARY_SURFACE_ADDRESS__GRPH_SECONDARY_SURFACE_ADDRESS_MASK
should be used.
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alexandre Demers [Sat, 22 Mar 2025 18:39:00 +0000 (14:39 -0400)]
drm/radeon: fix MAX_POWER_SHIFT value
While I don't think it is being used anywhere, if it were used, it would
be wrong. We can base this assumption on MAX_POWER_MASK, where the shift is
by 16 bits.
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ce Sun [Thu, 20 Mar 2025 10:12:40 +0000 (18:12 +0800)]
drm/amdgpu: refactor amdgpu_device_gpu_recover
Split amdgpu_device_gpu_recover into the following stages:
halt activities,asic reset,schedule resume and amdgpu resume.
The reason is that the subsequent addition of dpc recover
code will have a high similarity with gpu reset
Signed-off-by: Ce Sun <cesun102@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Aric Cyr <aric.cyr@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Leo Li <sunpeng.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
Read the histogram for VariBright validation
[How]
Add dc/dmub functions to read histogram and ACE
Reviewed-by: Jun Lei <jun.lei@amd.com> Signed-off-by: Chun-Liang Chang <Chun-Liang.Chang@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Paul Hsieh [Tue, 11 Mar 2025 09:16:57 +0000 (17:16 +0800)]
drm/amd/display: Skip to enable dsc if it has been off
[Why]
It makes DSC enable when we commit the stream which need
keep power off.And then it will skip to disable DSC if
pipe reset at this situation as power has been off. It may
cause the DSC unexpected enable on the pipe with the
next new stream which doesn't support DSC.
[HOW]
Check the DSC used on current pipe status when update stream.
Skip to enable if it has been off. The operation enable
DSC should happen when set power on.
Reviewed-by: Wenjing Liu <wenjing.liu@amd.com> Signed-off-by: Paul Hsieh <Paul.Hsieh@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Leo Zeng [Wed, 26 Feb 2025 19:35:05 +0000 (14:35 -0500)]
drm/amd/display: Get visual confirm color for stream
[WHY]
We want to output visual confirm color based on stream.
[HOW]
If visual confirm is for DMUB, use DMUB to get color.
Otherwise, find plane with highest layer index, output visual confirm color
of pipe that contains plane with highest index.
Reviewed-by: Aric Cyr <aric.cyr@amd.com> Signed-off-by: Leo Zeng <Leo.Zeng@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Thu, 9 Jan 2025 16:57:56 +0000 (11:57 -0500)]
drm/amdgpu: allow pinning DMA-bufs into VRAM if all importers can do P2P
Try pinning into VRAM to allow P2P with RDMA NICs without ODP
support if all attachments can do P2P. If any attachment can't do
P2P just pin into GTT instead.
Acked-by: Simona Vetter <simona.vetter@ffwll.ch> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Tested-by: Pak Nin Lui <pak.lui@amd.com> Cc: Simona Vetter <simona.vetter@ffwll.ch> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Similar to xe, enable some simple management of VRAM only.
Reviewed-by: Christian König <christian.koenig@amd.com> Co-developed-by: Maxime Ripard <mripard@kernel.org> Signed-off-by: Maxime Ripard <mripard@kernel.org> Signed-off-by: Maarten Lankhorst <dev@lankhorst.se> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jay Cornwall [Fri, 21 Mar 2025 18:19:05 +0000 (13:19 -0500)]
drm/amdgpu: Increase KIQ invalidate_tlbs timeout
KIQ invalidate_tlbs request has been seen to marginally exceed the
configured 100 ms timeout on systems under load.
All other KIQ requests in the driver use a 10 second timeout. Use a
similar timeout implementation on the invalidate_tlbs path.
v2: Poll once before msleep
v3: Fix return value
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com> Cc: Kent Russell <kent.russell@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jonathan Kim [Thu, 27 Mar 2025 15:50:42 +0000 (11:50 -0400)]
drm/amdkfd: limit sdma queue reset caps flagging for gfx9
ASICs post GFX 9 are being flagged as SDMA per queue reset supported
in the KGD but KFD and scheduler FW currently have no support.
Limit SDMA queue reset capabilities to GFX 9.
Fixes: ceb7114c961b ("drm/amdkfd: flag per-sdma queue reset supported to user space") Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: David Belanger <david.belanger@amd.com> Reviewed-by: Harish Kasiviswanathan <harish.kasiviswanathan@amd.com> Reviewed-by: Jesse Zhang <jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
drm/amd/display: Add HP Elitebook 645 to the quirk list for eDP on DP1
[Why]
HP Elitebook 645 has DP0 and DP1 swapped.
[How]
Add HP Elitebook 645 to DP0/DP1 swap quirk list.
Cc: stable@vger.kernel.org Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3701 Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Add HP Probook 445 and 465 to the quirk list for eDP on DP1
[Why]
HP Probook 445 and 465 has DP0 and DP1 swapped.
[How]
Add HP Probook 445 and 465 to DP0/DP1 swap quirk list.
Cc: stable@vger.kernel.org Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3995 Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Anson Tsao <anson.tsao@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Huacai Chen [Thu, 27 Mar 2025 09:53:34 +0000 (17:53 +0800)]
drm/amd/display: Protect FPU in dml2_validate()/dml21_validate()
Commit 7da55c27e76749b9 ("drm/amd/display: Remove incorrect FP context
start") removes the FP context protection of dml2_create(), and it said
"All the DC_FP_START/END should be used before call anything from DML2".
However, dml2_validate()/dml21_validate() are not protected from their
callers, causing such errors:
Unfortunately, protecting dml2_validate()/dml21_validate() out of DML2
causes "sleeping function called from invalid context", so protect them
with DC_FP_START() and DC_FP_END() inside.
Huacai Chen [Thu, 27 Mar 2025 09:53:33 +0000 (17:53 +0800)]
drm/amd/display: Protect FPU in dml2_init()/dml21_init()
Commit 7da55c27e76749b9 ("drm/amd/display: Remove incorrect FP context
start") removes the FP context protection of dml2_create(), and it said
"All the DC_FP_START/END should be used before call anything from DML2".
However, dml2_init()/dml21_init() are not protected from their callers,
causing such errors:
Unfortunately, protecting dml2_init()/dml21_init() out of DML2 causes
"sleeping function called from invalid context", so protect them with
DC_FP_START() and DC_FP_END() inside.
Huacai Chen [Thu, 27 Mar 2025 09:53:32 +0000 (17:53 +0800)]
drm/amd/display: Protect FPU in dml21_copy()
Commit 7da55c27e76749b9 ("drm/amd/display: Remove incorrect FP context
start") removes the FP context protection of dml2_create(), and it said
"All the DC_FP_START/END should be used before call anything from DML2".
However, dml21_copy() are not protected from their callers, causing such
errors:
Unfortunately, protecting dml21_copy() out of DML2 causes "sleeping
function called from invalid context", so protect them with DC_FP_START()
and DC_FP_END() inside.
Tom Chung [Wed, 19 Mar 2025 08:31:31 +0000 (16:31 +0800)]
drm/amd/display: Do not enable Replay and PSR while VRR is on in amdgpu_dm_commit_planes()
[Why]
Replay and PSR will cause some video corruption while VRR is enabled.
[How]
Do not enable the Replay and PSR while VRR is active in
amdgpu_dm_enable_self_refresh().
Fixes: 67edb81d6e9a ("drm/amd/display: Disable replay and psr while VRR is enabled") Reviewed-by: Sun peng Li <sunpeng.li@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
Emily Deng [Fri, 28 Mar 2025 10:14:17 +0000 (18:14 +0800)]
drm/amdkfd: sriov doesn't support per queue reset
Disable per queue reset for sriov.
Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: Jonathan Kim <jonathan.kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Flora Cui <flora.cui@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Matthew Auld [Mon, 7 Apr 2025 14:18:25 +0000 (15:18 +0100)]
drm/amdgpu/dma_buf: fix page_link check
The page_link lower bits of the first sg could contain something like
SG_END, if we are mapping a single VRAM page or contiguous blob which
fits into one sg entry. Rather pull out the struct page, and use that in
our check to know if we mapped struct pages vs VRAM.
Fixes: f44ffd677fb3 ("drm/amdgpu: add support for exporting VRAM using DMA-buf v3") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Christian König <christian.koenig@amd.com> Cc: amd-gfx@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v5.8+ Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com> Fixes: 216c1282dde3 ("drm/amdgpu: use GTT only as fallback for VRAM|GTT") Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
Asad Kamal [Mon, 17 Mar 2025 06:16:04 +0000 (14:16 +0800)]
drm/amd/pm: Remove host limit metrics support
Firmware algorithm changed and the values in this version
are not accurate thereby remove host limit metric support
for smu_v13_0_6, smu_v13_0_12 & smu_v13_0_14
Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Tue, 18 Mar 2025 15:15:12 +0000 (16:15 +0100)]
drm/amdgpu: stop unmapping MQD for kernel queues v3
This looks unnecessary and actually extremely harmful since using kmap()
is not possible while inside the ring reset.
Remove all the extra mapping and unmapping of the MQDs.
v2: also fix debugfs
v3: fix coding style typo
Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Xiang Liu [Fri, 21 Mar 2025 12:47:23 +0000 (20:47 +0800)]
drm/amdgpu: Use correct gfx deferred error count
In the case of parsing GFX deferred error from SMU corrected error
channel, the error count should be set to 1 instead of parsing from
MISC0 register, which is 0.
Signed-off-by: Xiang Liu <xiang.liu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Leo Li [Tue, 11 Mar 2025 13:43:38 +0000 (09:43 -0400)]
drm/amd/display: Actually do immediate vblank disable
[Why]
The `vblank_config.offdelay` field follows the same semantics as the
`drm_vblank_offdelay` parameter. Setting it to 0 will never disable
vblank.
[How]
Set `offdelay` to a positive number.
Fixes: e45b6716de4b ("drm/amd/display: use a more lax vblank enable policy for DCN35+") Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
Brendan Tam [Fri, 14 Mar 2025 17:09:13 +0000 (13:09 -0400)]
drm/amd/display: prevent hang on link training fail
[Why]
When link training fails, the phy clock will be disabled. However, in
enable_streams, it is assumed that link training succeeded and the
mux selects the phy clock, causing a hang when a register write is made.
[How]
When enable_stream is hit, check if link training failed. If it did, fall
back to the ref clock to avoid a hang and keep the system in a recoverable
state.
Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Brendan Tam <Brendan.Tam@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
Leo Li [Mon, 10 Mar 2025 16:20:39 +0000 (12:20 -0400)]
drm/amd/display: Increase vblank offdelay for PSR panels
[Why]
Depending on when the HW latching event (vupdate) of double-buffered
registers happen relative to the PSR SDP (signals panel psr enter/exit)
deadline, and how bad the Panel clock has drifted since the last ALPM
off event, there can be up to 3 frames of delay between sending the PSR
exit cmd to DMUB fw, and when the panel starts displaying live frames.
This can manifest as micro-stuttering when userspace commit patterns
cause rapid toggling of the DRM vblank counter, since PSR enter/exit is
hooked up to DRM vblank disable/enable respectively.
In the ideal world, the panel should present the live frame immediately
on PSR exit cmd. But due to HW design and PSR limitations, immediate
exit can only happen by chance, when:
1. PSR exit cmd is ack'd by FW before HW latching (vupdate) event, and
2. Panel's SDP deadline -- determined by it's PSR Start Delay in DPCD
71h -- is after the vupdate event. The PSR exit SDP can then be sent
immediately after HW latches. Otherwise, we have to wait 1 frame. And
3. There is negligible drift between the panel's clock and source clock.
Otherwise, there can be up to 1 frame of drift.
Note that this delay is not expected with Panel Replay.
[How]
Since PSR power savings can be quite substantial, and there are a lot of
systems in the wild with PSR panels, It'll be nice to have a middle
ground that balances user experience with power savings.
A simple way to achieve this is by extending the vblank offdelay, such
that additional PSR exit delays will be less perceivable.
This ensures that `3_frames_ms` will only be experienced as a 20% delay
on top how long the panel has been static, and thus make the delay
less perceivable.
If this ends up being too high of a percentage, it can be dropped
further in a future change.
Fixes: 537ef0f88897 ("drm/amd/display: use new vblank enable policy for DCN35+") Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
Denis Arefev [Fri, 21 Mar 2025 10:52:32 +0000 (13:52 +0300)]
drm/amd/pm: Prevent division by zero
The user can set any speed value.
If speed is greater than UINT_MAX/8, division by zero is possible.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: b64625a303de ("drm/amd/pm: correct the address of Arcturus fan related registers") Signed-off-by: Denis Arefev <arefev@swemel.ru> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
Asad Kamal [Mon, 17 Mar 2025 12:07:06 +0000 (20:07 +0800)]
drm/amd/pm: Update feature list for smu_v13_0_6
Update feature list for smu_v13_0_6 to show vcn & smu deep
sleep feature enable status
Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amdgpu: Add parameter documentation for amdgpu_sync_fence
The 'flags' parameter, which specifies memory allocation behavior while
creating a sync entry,
Fixes the below with gcc W=1:
drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c:162: warning: Function parameter or struct member 'flags' not described in 'amdgpu_sync_fence'
Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 11 Mar 2025 22:00:57 +0000 (18:00 -0400)]
drm/amdgpu/discovery: optionally use fw based ip discovery
On chips without native IP discovery support, use the fw binary
if available, otherwise we can continue without it.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Flora Cui <flora.cui@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Flora Cui <flora.cui@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Flora Cui [Thu, 27 Feb 2025 02:39:27 +0000 (10:39 +0800)]
drm/amdgpu/discovery: check ip_discovery fw file available
Signed-off-by: Flora Cui <flora.cui@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amdgpu/sdma_v4_4_2: update VM flush implementation for SDMA
This commit updates the VM flush implementation for the SDMA engine.
- Added a new function `sdma_v4_4_2_get_invalidate_req` to construct the VM_INVALIDATE_ENG0_REQ
register value for the specified VMID and flush type. This function ensures that all relevant
page table cache levels (L1 PTEs, L2 PTEs, and L2 PDEs) are invalidated.
- Modified the `sdma_v4_4_2_ring_emit_vm_flush` function to use the new `sdma_v4_4_2_get_invalidate_req`
function. The updated function emits the necessary register writes and waits to perform a VM flush
for the specified VMID. It updates the PTB address registers and issues a VM invalidation request
using the specified VM invalidation engine.
- Included the necessary header file `gc/gc_9_0_sh_mask.h` to provide access to the required register
definitions.
v2: vm flush by the vm inalidation packet (Lijo)
v3: code stle and define thh macro for the vm invalidation packet (Christian)
v4: Format definition sdma vm invalidate packet (Lijo)
drm/amdgpu: Optimize VM invalidation engine allocation and synchronize GPU TLB flush
- Modify the VM invalidation engine allocation logic to handle SDMA page rings.
SDMA page rings now share the VM invalidation engine with SDMA gfx rings instead of
allocating a separate engine. This change ensures efficient resource management and
avoids the issue of insufficient VM invalidation engines.
- Add synchronization for GPU TLB flush operations in gmc_v9_0.c.
Use spin_lock and spin_unlock to ensure thread safety and prevent race conditions
during TLB flush operations. This improves the stability and reliability of the driver,
especially in multi-threaded environments.
v2: replace the sdma ring check with a function `amdgpu_sdma_is_page_queue`
to check if a ring is an SDMA page queue.(Lijo)
v3: Add GC version check, only enabled on GC9.4.3/9.4.4/9.5.0
v4: Fix code style and add more detailed description (Christian)
v5: Remove dependency on vm_inv_eng loop order, explicitly lookup shared inv_eng(Christian/Lijo)
v6: Added search shared ring function amdgpu_sdma_get_shared_ring (Lijo)
drm/amd/amdgpu: Increase max rings to enable SDMA page ring
Increase the maximum number of rings supported by the AMDGPU driver from 133 to 149.
This change is necessary to enable support for the SDMA page ring.
Xiang Liu [Wed, 19 Mar 2025 09:02:49 +0000 (17:02 +0800)]
drm/amdgpu: Decode deferred error type in gfx aca bank parser
In the case of injecting uncorrected error with background workload,
the deferred error among uncorrected errors need to be specified
by checking the deferred and poison bits of status register.
v2: refine checking for deferred error
v2: log possiable DEs among CEs
v2: generate CPER records for DEs among UEs
Signed-off-by: Xiang Liu <xiang.liu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amdgpu/gfx11: Add Cleaner Shader Support for GFX11.5 GPUs
Enable the cleaner shader for GFX11.5.0/11.5.1 GPUs to provide data
isolation between GPU workloads. The cleaner shader is responsible for
clearing the Local Data Store (LDS), Vector General Purpose Registers
(VGPRs), and Scalar General Purpose Registers (SGPRs), which helps
prevent data leakage and ensures accurate computation results.
This update extends cleaner shader support to GFX11.5.0/11.5.1 GPUs,
previously available for GFX11.0.3. It enhances security by clearing GPU
memory between processes and maintains a consistent GPU state across KGD
and KFD workloads.
Cc: Mario Sopena-Novales <mario.novales@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>