Dillon Varone [Wed, 19 Mar 2025 17:53:25 +0000 (13:53 -0400)]
drm/amd/display: Fix VUpdate offset calculations for dcn401
[WHY&HOW]
DCN401 uses a different structure to store the VStartup offset used to
calculate the VUpdate position, so adjust the calculations to use this
value.
Reviewed-by: Aric Cyr <aric.cyr@amd.com> Signed-off-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fangzhi Zuo [Thu, 20 Mar 2025 17:58:24 +0000 (13:58 -0400)]
drm/amd/display: Do Not Consider DSC if Valid Config Not Found
[why]
In mode validation, MST DSC is considered for the bandwidth calculation after
the common DSC config is determined. Currently, the common DSC config is treated
as found if the max and min target bpp are nonzero, which is not accurate. Invalid
max and min target bpp values would not get max_kbps and min_kbps calculated,
so a mode that has no valid DSC parameters available can falsely pass
validation.
[how]
Use the return value of decide_dsc_bandwidth_range() to determine whether a valid
common DSC config was found. Prune modes for which no valid common DSC config
could be determined.
Reviewed-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dillon Varone [Wed, 19 Mar 2025 16:45:13 +0000 (12:45 -0400)]
drm/amd/display: Add Support for reg inbox0 for host->DMUB CMDs
[WHY]
DCN4+ supports a new register based mailbox for sending messages
from host to DMCUB. This mailbox supports 64 byte commands, which makes
it compatible with the same structure as the frame buffer based mailbox.
[HOW]
The intention is for reg_inbox0 to be a drop-in replacement for the frame
buffer based mailbox (Inbox1). It supports all of the required features:
- Supports all messages handled by FB Inbox1
- Supports multi command batching
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Dillon Varone <Dillon.Varone@amd.com> Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Joshua Aberback [Fri, 14 Mar 2025 22:33:43 +0000 (18:33 -0400)]
drm/amd/display: Use meaningful size for block_sequence array
[Why]
This array was initially defined as size 50. There were array overflow
issues so the size was increased to 100. To ensure such issues are
avoided in the future, the size should be set based on the possible
contents instead of an arbitrary value.
[How]
- upper bound, assume every update occurs on max number of pipes
- define array sizes for function parameters, for static analysis
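
The two points above can be sketched in C as follows. This is a minimal illustration, not the driver's code: every name and every numeric limit here (SKETCH_MAX_PIPES, SKETCH_STEPS_PER_PIPE, SKETCH_GLOBAL_STEPS, struct sketch_step, sketch_add_step) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical limits, chosen only for illustration. */
#define SKETCH_MAX_PIPES        6
#define SKETCH_STEPS_PER_PIPE   8   /* worst case: every update hits every pipe */
#define SKETCH_GLOBAL_STEPS     4   /* steps that run once, not per pipe */

/* Upper bound derived from the possible contents, not an arbitrary constant. */
#define SKETCH_MAX_BLOCK_SEQUENCE \
	(SKETCH_MAX_PIPES * SKETCH_STEPS_PER_PIPE + SKETCH_GLOBAL_STEPS)

struct sketch_step {
	int func_id;
	int pipe_idx;
};

/*
 * Spelling out the array size in the parameter documents the contract and
 * gives static analysis a bound to check callers against.
 */
int sketch_add_step(struct sketch_step seq[SKETCH_MAX_BLOCK_SEQUENCE],
		    size_t *num_steps, int func_id, int pipe_idx)
{
	assert(seq != NULL && num_steps != NULL);

	if (*num_steps >= SKETCH_MAX_BLOCK_SEQUENCE)
		return -1; /* would overflow: refuse instead of writing */

	seq[*num_steps].func_id = func_id;
	seq[*num_steps].pipe_idx = pipe_idx;
	(*num_steps)++;
	return 0;
}
```

If one of the hypothetical limits grows, the array bound follows automatically, which is the property the commit is after.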
Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Joshua Aberback <joshua.aberback@amd.com> Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rodrigo Siqueira [Tue, 25 Mar 2025 17:18:46 +0000 (11:18 -0600)]
Documentation/gpu: Create a GC entry in the amdgpu documentation
GC is a large block that plays a vital role for amdgpu; for this reason,
this commit creates one specific page for GC and adds extra information
about the CP component.
Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rodrigo Siqueira [Tue, 25 Mar 2025 17:18:45 +0000 (11:18 -0600)]
Documentation/gpu: Add explanation about AMD Pipes and Queues
Pipes and Queues are two common terms that pervade discussions
around amdgpu core features. The definition and explanation of those
components are spread around multiple places in the code, mailing list,
and Gitlab, which sometimes leads to the wrong interpretation of these
concepts. This commit attempts to centralize the definition and
explanation of Pipe and Queue from amdgpu perspective in a kernel doc.
Most of the information in this doc was derived from:
Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rodrigo Siqueira [Tue, 25 Mar 2025 17:18:44 +0000 (11:18 -0600)]
Documentation/gpu: Create a documentation entry just for hardware info
The APU and dGPU tables are hidden in the driver misc info, which makes
it hard to find specific hardware info when users need it. This commit
creates a single page for this information and adds it to the top of the
amdgpu list to improve searchability.
Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rodrigo Siqueira [Tue, 25 Mar 2025 17:18:43 +0000 (11:18 -0600)]
Documentation/gpu: Change index order to show driver core first
Since driver-core has an overview of the AMD GPU hardware structure, it
makes more sense to keep it first. This commit moves driver-core up in
the index list.
Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rodrigo Siqueira [Tue, 25 Mar 2025 17:18:42 +0000 (11:18 -0600)]
Documentation/gpu: Add new acronyms
This commit introduces some new acronyms extracted from the source code
and found on some web pages around the internet (most of them came from
ArchLinux, Gentoo, and Wikipedia links).
Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Thu, 20 Mar 2025 18:10:17 +0000 (14:10 -0400)]
drm/amdgpu/gfx: decouple the number of kgqs from the hw
The driver currently sets up one kgq per pipe. As such
adev->gfx.me.num_queue_per_pipe is hardcoded to 1 everywhere.
This is fine for kernel queues, but when we enable user queues
we need to know the actual number of queues per pipe. Decouple
the kgq setup from the actual hardware count. For dev core
dumps and user queues, we want to know the actual number
of queues per pipe.
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amdgpu/gfx10: Add Cleaner Shader Support for GFX10.3.x GPUs
Enable the cleaner shader for other GFX10.3.x series of GPUs to provide
data isolation between GPU workloads. The cleaner shader is responsible
for clearing the Local Data Store (LDS), Vector General Purpose
Registers (VGPRs), and Scalar General Purpose Registers (SGPRs), which
helps prevent data leakage and ensures accurate computation results.
This update extends cleaner shader support to GFX10.3.x GPUs, previously
available for GFX10.3.0. It enhances security by clearing GPU memory
between processes and maintains a consistent GPU state across KGD and
KFD workloads.
Cc: Mario Sopena-Novales <mario.novales@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Thu, 27 Feb 2025 17:31:28 +0000 (12:31 -0500)]
drm/amdgpu: add rebar parameter
Add a new parameter to disable BAR resizing. Note that this
only stops the driver from attempting to resize the BAR;
the BIOS may have resized the BAR at boot.
Some teams have found this useful in debugging P2P DMA
issues on systems where the available MMIO space did not allow
for all of the GPUs present to resize their BARs.
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alexandre Demers [Sat, 22 Mar 2025 01:46:48 +0000 (21:46 -0400)]
drm/amdgpu: use GRPH_SECONDARY_SURFACE_ADDRESS_MASK with GRPH_SECONDARY_SURFACE_ADDRESS in DCE6
This appears to be a copy-paste error: since we are working with
mmGRPH_SECONDARY_SURFACE_ADDRESS,
GRPH_SECONDARY_SURFACE_ADDRESS__GRPH_SECONDARY_SURFACE_ADDRESS_MASK
should be used.
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alexandre Demers [Sat, 22 Mar 2025 18:39:00 +0000 (14:39 -0400)]
drm/radeon: fix MAX_POWER_SHIFT value
While I don't think it is being used anywhere, if it were used, it would
be wrong. We can base this assumption on MAX_POWER_MASK, where the shift is
by 16 bits.
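
Why the shift has to agree with the mask can be shown with a small C sketch. The macro and function names are hypothetical, and the 14-bit field width is an assumption made for illustration; only the 16-bit shift comes from the message above.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative field layout; the 14-bit width is an assumption. */
#define SKETCH_MAX_POWER_MASK   (0x3fffu << 16)
#define SKETCH_MAX_POWER_SHIFT  16   /* must match the shift used in the mask */

/* Extract the field: mask first, then shift down by the same amount. */
uint32_t sketch_get_max_power(uint32_t reg)
{
	return (reg & SKETCH_MAX_POWER_MASK) >> SKETCH_MAX_POWER_SHIFT;
}

/* Insert a value into the field, leaving the other bits untouched. */
uint32_t sketch_set_max_power(uint32_t reg, uint32_t val)
{
	assert(val <= 0x3fffu); /* value must fit in the assumed field width */

	reg &= ~SKETCH_MAX_POWER_MASK;
	reg |= (val << SKETCH_MAX_POWER_SHIFT) & SKETCH_MAX_POWER_MASK;
	return reg;
}
```

With a shift of 0 and the same mask, the getter would return the field still positioned in the upper half of the word, and the setter would mask every written value down to zero.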
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ce Sun [Thu, 20 Mar 2025 10:12:40 +0000 (18:12 +0800)]
drm/amdgpu: refactor amdgpu_device_gpu_recover
Split amdgpu_device_gpu_recover into the following stages:
halt activities, asic reset, schedule resume and amdgpu resume.
The reason is that the dpc recover code to be added later
will be very similar to the gpu reset code.
Signed-off-by: Ce Sun <cesun102@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Aric Cyr <aric.cyr@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Leo Li <sunpeng.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
Read the histogram for VariBright validation
[How]
Add dc/dmub functions to read histogram and ACE
Reviewed-by: Jun Lei <jun.lei@amd.com> Signed-off-by: Chun-Liang Chang <Chun-Liang.Chang@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Paul Hsieh [Tue, 11 Mar 2025 09:16:57 +0000 (17:16 +0800)]
drm/amd/display: Skip to enable dsc if it has been off
[Why]
DSC gets enabled when we commit a stream that needs to
stay powered off. Disabling DSC is then skipped if the
pipe is reset in this situation, because power is already
off. This may leave DSC unexpectedly enabled on the pipe for
the next new stream, which doesn't support DSC.
[HOW]
When updating a stream, check the status of the DSC used on the
current pipe. Skip enabling it if it has been off. Enabling
DSC should happen when power is turned on.
Reviewed-by: Wenjing Liu <wenjing.liu@amd.com> Signed-off-by: Paul Hsieh <Paul.Hsieh@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Leo Zeng [Wed, 26 Feb 2025 19:35:05 +0000 (14:35 -0500)]
drm/amd/display: Get visual confirm color for stream
[WHY]
We want to output visual confirm color based on stream.
[HOW]
If visual confirm is for DMUB, use DMUB to get the color.
Otherwise, find the plane with the highest layer index and output the
visual confirm color of the pipe that contains that plane.
Reviewed-by: Aric Cyr <aric.cyr@amd.com> Signed-off-by: Leo Zeng <Leo.Zeng@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Thu, 9 Jan 2025 16:57:56 +0000 (11:57 -0500)]
drm/amdgpu: allow pinning DMA-bufs into VRAM if all importers can do P2P
Try pinning into VRAM to allow P2P with RDMA NICs without ODP
support if all attachments can do P2P. If any attachment can't do
P2P just pin into GTT instead.
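
The placement rule described above can be sketched in C as below. The types and names (sketch_domain, sketch_attachment, sketch_pick_pin_domain) are hypothetical stand-ins, not the dma-buf or amdgpu API, and the sketch leaves out what happens if the VRAM pin itself fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_domain { SKETCH_DOMAIN_GTT, SKETCH_DOMAIN_VRAM };

struct sketch_attachment {
	bool peer2peer; /* importer says it can access peer device memory */
};

/*
 * Pick VRAM only if every attachment can do P2P; one attachment without
 * P2P forces the GTT fallback. With no attachments this sketch returns
 * VRAM, which is a choice of the sketch and not a statement about the driver.
 */
enum sketch_domain sketch_pick_pin_domain(const struct sketch_attachment *att,
					  size_t count)
{
	size_t i;

	assert(att != NULL || count == 0);

	for (i = 0; i < count; i++) {
		if (!att[i].peer2peer)
			return SKETCH_DOMAIN_GTT;
	}
	return SKETCH_DOMAIN_VRAM;
}
```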
Acked-by: Simona Vetter <simona.vetter@ffwll.ch> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Tested-by: Pak Nin Lui <pak.lui@amd.com> Cc: Simona Vetter <simona.vetter@ffwll.ch> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Similar to xe, enable some simple management of VRAM only.
Reviewed-by: Christian König <christian.koenig@amd.com> Co-developed-by: Maxime Ripard <mripard@kernel.org> Signed-off-by: Maxime Ripard <mripard@kernel.org> Signed-off-by: Maarten Lankhorst <dev@lankhorst.se> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jay Cornwall [Fri, 21 Mar 2025 18:19:05 +0000 (13:19 -0500)]
drm/amdgpu: Increase KIQ invalidate_tlbs timeout
KIQ invalidate_tlbs request has been seen to marginally exceed the
configured 100 ms timeout on systems under load.
All other KIQ requests in the driver use a 10 second timeout. Use a
similar timeout implementation on the invalidate_tlbs path.
v2: Poll once before msleep
v3: Fix return value
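
A poll-then-sleep wait of the kind the v2 note describes might look like the following userspace C sketch. Nothing here is the driver's code: the callback types, sketch_wait_done, the fake callbacks, and SKETCH_ETIME (a stand-in for the kernel's ETIME) are all hypothetical, and the sleep is injected so the example runs without real delays.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_ETIME 62 /* stand-in for the kernel's ETIME */

typedef bool (*sketch_done_fn)(void *ctx);
typedef void (*sketch_sleep_fn)(void *ctx, unsigned int ms);

/*
 * Wait for completion with an overall budget of timeout_ms, sleeping
 * interval_ms between polls. Polls once before the first sleep, so a
 * request that has already completed returns without any delay.
 * Returns 0 on completion, -SKETCH_ETIME on timeout.
 */
int sketch_wait_done(sketch_done_fn done, sketch_sleep_fn sleep_ms, void *ctx,
		     unsigned int timeout_ms, unsigned int interval_ms)
{
	unsigned int waited = 0;

	assert(done != NULL && sleep_ms != NULL && interval_ms > 0);

	if (done(ctx))
		return 0;

	while (waited < timeout_ms) {
		sleep_ms(ctx, interval_ms);
		waited += interval_ms;
		if (done(ctx))
			return 0;
	}
	return -SKETCH_ETIME;
}

/* Fake callbacks for trying the loop: completes after a set number of polls. */
struct sketch_fake {
	int polls_left;        /* polls that still report "not done" */
	unsigned int slept_ms; /* total time the loop asked to sleep */
};

bool sketch_fake_done(void *ctx)
{
	struct sketch_fake *f = ctx;

	return f->polls_left-- <= 0;
}

void sketch_fake_sleep(void *ctx, unsigned int ms)
{
	struct sketch_fake *f = ctx;

	f->slept_ms += ms; /* record instead of actually sleeping */
}
```

The explicit return code on the timeout path matters as much as the loop itself: the caller has to be able to tell a completed request from an expired wait.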
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com> Cc: Kent Russell <kent.russell@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jonathan Kim [Thu, 27 Mar 2025 15:50:42 +0000 (11:50 -0400)]
drm/amdkfd: limit sdma queue reset caps flagging for gfx9
ASICs after GFX 9 are being flagged in the KGD as supporting per-queue SDMA
reset, but the KFD and scheduler FW currently have no support for it.
Limit SDMA queue reset capabilities to GFX 9.
Fixes: ceb7114c961b ("drm/amdkfd: flag per-sdma queue reset supported to user space") Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: David Belanger <david.belanger@amd.com> Reviewed-by: Harish Kasiviswanathan <harish.kasiviswanathan@amd.com> Reviewed-by: Jesse Zhang <jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
drm/amd/display: Add HP Elitebook 645 to the quirk list for eDP on DP1
[Why]
HP Elitebook 645 has DP0 and DP1 swapped.
[How]
Add HP Elitebook 645 to DP0/DP1 swap quirk list.
Cc: stable@vger.kernel.org Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3701 Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Add HP Probook 445 and 465 to the quirk list for eDP on DP1
[Why]
HP Probook 445 and 465 have DP0 and DP1 swapped.
[How]
Add HP Probook 445 and 465 to DP0/DP1 swap quirk list.
Cc: stable@vger.kernel.org Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3995 Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Anson Tsao <anson.tsao@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Huacai Chen [Thu, 27 Mar 2025 09:53:34 +0000 (17:53 +0800)]
drm/amd/display: Protect FPU in dml2_validate()/dml21_validate()
Commit 7da55c27e76749b9 ("drm/amd/display: Remove incorrect FP context
start") removes the FP context protection of dml2_create(), and it said
"All the DC_FP_START/END should be used before call anything from DML2".
However, dml2_validate()/dml21_validate() are not protected from their
callers, causing such errors:
Unfortunately, protecting dml2_validate()/dml21_validate() from outside DML2
causes "sleeping function called from invalid context", so protect them
with DC_FP_START() and DC_FP_END() inside.
Huacai Chen [Thu, 27 Mar 2025 09:53:33 +0000 (17:53 +0800)]
drm/amd/display: Protect FPU in dml2_init()/dml21_init()
Commit 7da55c27e76749b9 ("drm/amd/display: Remove incorrect FP context
start") removes the FP context protection of dml2_create(), and it said
"All the DC_FP_START/END should be used before call anything from DML2".
However, dml2_init()/dml21_init() are not protected from their callers,
causing such errors:
Unfortunately, protecting dml2_init()/dml21_init() from outside DML2 causes
"sleeping function called from invalid context", so protect them with
DC_FP_START() and DC_FP_END() inside.
Huacai Chen [Thu, 27 Mar 2025 09:53:32 +0000 (17:53 +0800)]
drm/amd/display: Protect FPU in dml21_copy()
Commit 7da55c27e76749b9 ("drm/amd/display: Remove incorrect FP context
start") removes the FP context protection of dml2_create(), and it said
"All the DC_FP_START/END should be used before call anything from DML2".
However, dml21_copy() is not protected from its callers, causing such
errors:
Unfortunately, protecting dml21_copy() from outside DML2 causes "sleeping
function called from invalid context", so protect it with DC_FP_START()
and DC_FP_END() inside.
Tom Chung [Wed, 19 Mar 2025 08:31:31 +0000 (16:31 +0800)]
drm/amd/display: Do not enable Replay and PSR while VRR is on in amdgpu_dm_commit_planes()
[Why]
Replay and PSR will cause some video corruption while VRR is enabled.
[How]
Do not enable the Replay and PSR while VRR is active in
amdgpu_dm_enable_self_refresh().
Fixes: 67edb81d6e9a ("drm/amd/display: Disable replay and psr while VRR is enabled") Reviewed-by: Sun peng Li <sunpeng.li@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
Emily Deng [Fri, 28 Mar 2025 10:14:17 +0000 (18:14 +0800)]
drm/amdkfd: sriov doesn't support per queue reset
Disable per queue reset for sriov.
Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: Jonathan Kim <jonathan.kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Flora Cui <flora.cui@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Matthew Auld [Mon, 7 Apr 2025 14:18:25 +0000 (15:18 +0100)]
drm/amdgpu/dma_buf: fix page_link check
The page_link lower bits of the first sg could contain something like
SG_END, if we are mapping a single VRAM page or contiguous blob which
fits into one sg entry. Rather pull out the struct page, and use that in
our check to know if we mapped struct pages vs VRAM.
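
The difference between the two checks can be sketched in plain C. This is a userspace model, not kernel code: struct sketch_sg and the sketch_* helpers are hypothetical, and the flag values are restated here as assumptions mirroring the kernel's SG_CHAIN and SG_END bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flag bits stored in the low bits of page_link (assumed values). */
#define SKETCH_SG_CHAIN 0x01u
#define SKETCH_SG_END   0x02u
#define SKETCH_SG_MASK  ((uintptr_t)(SKETCH_SG_CHAIN | SKETCH_SG_END))

struct sketch_sg {
	uintptr_t page_link; /* page pointer with flag bits in the low bits */
};

/* Same idea as the kernel's sg_page(): strip the flags, keep the pointer. */
void *sketch_sg_page(const struct sketch_sg *sg)
{
	assert(sg != NULL);
	return (void *)(sg->page_link & ~SKETCH_SG_MASK);
}

/* Fragile check: a lone VRAM entry still carries the END flag. */
bool sketch_has_pages_raw(const struct sketch_sg *sg)
{
	return sg->page_link != 0;
}

/* Robust check: only the pointer part decides. */
bool sketch_has_pages(const struct sketch_sg *sg)
{
	return sketch_sg_page(sg) != NULL;
}
```

For a single-entry table with no struct page behind it, the raw check sees the END flag and reports pages, while the pointer-based check correctly reports none.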
Fixes: f44ffd677fb3 ("drm/amdgpu: add support for exporting VRAM using DMA-buf v3") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Christian König <christian.koenig@amd.com> Cc: amd-gfx@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v5.8+ Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com> Fixes: 216c1282dde3 ("drm/amdgpu: use GTT only as fallback for VRAM|GTT") Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
Asad Kamal [Mon, 17 Mar 2025 06:16:04 +0000 (14:16 +0800)]
drm/amd/pm: Remove host limit metrics support
The firmware algorithm changed and the values in this version
are not accurate, so remove host limit metrics support
for smu_v13_0_6, smu_v13_0_12 and smu_v13_0_14.
Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Tue, 18 Mar 2025 15:15:12 +0000 (16:15 +0100)]
drm/amdgpu: stop unmapping MQD for kernel queues v3
This looks unnecessary and actually extremely harmful since using kmap()
is not possible while inside the ring reset.
Remove all the extra mapping and unmapping of the MQDs.
v2: also fix debugfs
v3: fix coding style typo
Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>