Jesse.Zhang [Wed, 13 Aug 2025 02:36:58 +0000 (10:36 +0800)]
drm/amd/pm: Add VCN reset support check capability
This change introduces infrastructure to check whether VCN reset
is supported by the SMU firmware. Key changes include:
1. Added new functions to query VCN reset support:
- amdgpu_dpm_reset_vcn_is_supported()
- smu_reset_vcn_is_supported()
- pptable_funcs.reset_vcn_is_supported callback
2. Implemented proper locking in the DPM layer with mutex protection
3. Maintained consistency with existing SDMA reset support checks
The new capability allows callers to check for VCN reset support
before attempting the operation, preventing unnecessary attempts
on unsupported platforms.
v2: clean up debug info(Alex)
Suggested-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Ruili Ji <ruiliji2@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Heng Zhou [Wed, 13 Aug 2025 03:18:04 +0000 (11:18 +0800)]
drm/amdgpu: fix nullptr err of vm_handle_moved
If a amdgpu_bo_va is fpriv->prt_va, the bo of this one is always NULL.
So, such kind of amdgpu_bo_va should be updated separately before
amdgpu_vm_handle_moved.
Eric Huang [Thu, 7 Aug 2025 18:23:11 +0000 (14:23 -0400)]
drm/amdkfd: set uuid for each partition in topology
Currently each kfd compute partition/node is sharing
the same uuid of AID, which doen't meet the CUDA spec
for visible device, so corresponding XCD id for each
partition in smu has been assigned to xcp, and exposed
to kfd topology.
v2: add NULL check (Lijo)
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Xichao Zhao [Tue, 12 Aug 2025 03:16:03 +0000 (11:16 +0800)]
drm/radeon: replace min/max nesting with clamp()
The clamp() macro explicitly expresses the intent of constraining
a value within bounds.Therefore, replacing min(max(a, b), c) and
max(min(a,b),c) with clamp(val, lo, hi) can improve code readability.
Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Xichao Zhao <zhao.xichao@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Liu01 Tong [Mon, 11 Aug 2025 06:52:37 +0000 (14:52 +0800)]
drm/amdgpu: fix task hang from failed job submission during process kill
During process kill, drm_sched_entity_flush() will kill the vm
entities. The following job submissions of this process will fail, and
the resources of these jobs have not been released, nor have the fences
been signalled, causing tasks to hang and timeout.
Fix by check entity status in amdgpu_vm_ready() and avoid submit jobs to
stopped entity.
v2: add amdgpu_vm_ready() check before amdgpu_vm_clear_freed() in
function amdgpu_cs_vm_handling().
Signed-off-by: Liu01 Tong <Tong.Liu01@amd.com> Signed-off-by: Lin.Cao <lincao12@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jack Xiao [Mon, 11 Aug 2025 07:20:55 +0000 (15:20 +0800)]
drm/amdgpu: fix incorrect vm flags to map bo
It should use vm flags instead of pte flags
to specify bo vm attributes.
Fixes: 7946340fa389 ("drm/amdgpu: Move csa related code to separate file") Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
YiPeng Chai [Tue, 12 Aug 2025 01:17:58 +0000 (09:17 +0800)]
drm/amdgpu: fix vram reservation issue
The vram block allocation flag must be cleared
before making vram reservation, otherwise reserving
addresses within the currently freed memory range
will always fail.
Fixes: c9cad937c0c5 ("drm/amdgpu: add drm buddy support to amdgpu") Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Some kfd ioctls may not be available depending on the kernel version the
user is running, as such we need to report -ENOTTY so userland can
determine the cause of the ioctl failure.
Signed-off-by: Geoffrey McRae <geoffrey.mcrae@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Brahmajit Das [Mon, 11 Aug 2025 09:21:25 +0000 (14:51 +0530)]
drm/radeon/r600_cs: clean up of dead code in r600_cs
GCC 16 enables -Werror=unused-but-set-variable= which results in build
error with the following message.
drivers/gpu/drm/radeon/r600_cs.c: In function ‘r600_texture_size’:
drivers/gpu/drm/radeon/r600_cs.c:1411:29: error: variable ‘level’ set but not used [-Werror=unused-but-set-variable=]
1411 | unsigned offset, i, level;
| ^~~~~
cc1: all warnings being treated as errors
make[6]: *** [scripts/Makefile.build:287: drivers/gpu/drm/radeon/r600_cs.o] Error 1
level although is set, but in never used in the function
r600_texture_size. Thus resulting in dead code and this error getting
triggered.
Fixes: 60b212f8ddcd ("drm/radeon: overhaul texture checking. (v3)") Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Brahmajit Das <listout@listout.xyz> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Cryolitia PukNgae <cryolitia@uniontech.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Taimur Hassan [Sun, 3 Aug 2025 23:38:31 +0000 (18:38 -0500)]
drm/amd/display: Promote DC to 3.2.345
This version brings along following update:
-Fix close and open lid may cause eDP remaining blank
-Fix frequently disabling/enabling OTG may cause incorrect
configuration of OTG
Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Taimur Hassan [Sun, 3 Aug 2025 20:12:50 +0000 (16:12 -0400)]
drm/amd/display: [FW Promotion] Release 0.1.22.0
Add a new command for Panel Replay.
Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Danny Wang [Thu, 24 Jul 2025 05:58:21 +0000 (13:58 +0800)]
drm/amd/display: Reset apply_eamless_boot_optimization when dpms_off
[WHY&HOW]
The user closed the lid while the system was powering on and opened it
again before the “apply_seamless_boot_optimization” was set to false,
resulting in the eDP remaining blank.
Reset the “apply_seamless_boot_optimization” to false when dpms off.
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Danny Wang <Danny.Wang@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
TungYu Lu [Tue, 15 Jul 2025 08:56:59 +0000 (16:56 +0800)]
drm/amd/display: Wait until OTG enable state is cleared
[Why]
Customer reported an issue that OS starts and stops device multiple times
during driver installation. Frequently disabling and enabling OTG may
prevent OTG from being safely disabled and cause incorrect configuration
upon the next enablement.
[How]
Add a wait until OTG_CURRENT_MASTER_EN_STATE is cleared as a short term
solution.
Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: TungYu Lu <tungyu.lu@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Vitaly Prosyak [Thu, 7 Aug 2025 20:37:25 +0000 (16:37 -0400)]
drm/amdgpu: add to custom amdgpu_drm_release drm_dev_enter/exit
User queues are disabled before GEM objects are released
(protecting against user app crashes).
No races with PCI hot-unplug (because drm_dev_enter prevents cleanup
if iewdevice is being removed).
Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Lijo Lazar [Sat, 28 Jun 2025 05:20:25 +0000 (10:50 +0530)]
drm/amdgpu: Save and restore switch state
During a DPC error kernel waits for the link to be active before
notifying downstream devices. On certain platforms with Broadcom switch
in synthetiic mode, switch responds with values even though the link is
not fully ready. The config space restoration done by pcie port driver
for SWUS/DS of dGPU is thus not effective as the switch is still doing
internal enumeration.
As a workaround, save state of SWUS/DS device in driver. Add additional
check to see if link is active and restore the values during DPC error
callbacks.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sathishkumar S [Tue, 5 Aug 2025 16:05:10 +0000 (21:35 +0530)]
drm/amdgpu/vcn: Hold pg_lock before vcn power off
Acquire vcn_pg_lock before changes to vcn power state
and release it after power off in idle work handler.
Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sathishkumar S [Tue, 5 Aug 2025 15:58:25 +0000 (21:28 +0530)]
drm/amdgpu/jpeg: Hold pg_lock before jpeg poweroff
Acquire jpeg_pg_lock before changes to jpeg power state
and release it after power off from idle work handler.
Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Wed, 30 Jul 2025 15:16:05 +0000 (11:16 -0400)]
drm/amdgpu/discovery: fix fw based ip discovery
We only need the fw based discovery table for sysfs. No
need to parse it. Additionally parsing some of the board
specific tables may result in incorrect data on some boards.
just load the binary and don't parse it on those boards.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4441 Fixes: 80a0e8282933 ("drm/amdgpu/discovery: optionally use fw based ip discovery") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Add NULL check for stream before dereference in 'dm_vupdate_high_irq'
Add a NULL check for acrtc->dm_irq_params.stream before
accessing its members.
Fixes below:
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:623
dm_vupdate_high_irq() warn: variable dereferenced before check
'acrtc->dm_irq_params.stream' (see line 615)
614 if (vrr_active) {
615 bool replay_en = acrtc->dm_irq_params.stream->link->replay_settings.replay_feature_enabled;
^^^^^^^^^^^^^^^^^^^^^^^^^^^
616 bool psr_en = acrtc->dm_irq_params.stream->link->psr_settings.psr_feature_enabled;
^^^^^^^^^^^^^^^^^^^^^^^^^^^ New dereferences
617 bool fs_active_var_en = acrtc->dm_irq_params.freesync_config.state
618 == VRR_STATE_ACTIVE_VARIABLE;
619
620 amdgpu_dm_crtc_handle_vblank(acrtc);
621
622 /* BTR processing for pre-DCE12 ASICs */
623 if (acrtc->dm_irq_params.stream &&
^^^^^^^^^^^^^^^^^^^^^^^^^^^ But the existing code assumed it could be NULL. Someone is wrong.
Fixes: 6d31602a9f57 ("drm/amd/display: more liberal vmin/vmax update for freesync") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Cc: Alex Hung <alex.hung@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: ChiaHsuan Chung <chiahsuan.chung@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Ray Wu <ray.wu@amd.com> Cc: Daniel Wheeler <daniel.wheeler@amd.com> Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fix the following warning in struct documentation:
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:168: warning: expecting prototype for struct dm_vupdate_work. Prototype was for struct vupdate_offload_work instead
Fixes: c210b757b400 ("drm/amd/display: fix dmub access race condition") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
James Zhu [Wed, 28 May 2025 16:38:58 +0000 (12:38 -0400)]
drm/amdkfd: return migration pages from copy function
dst MIGRATE_PFN_VALID bit and src MIGRATE_PFN_MIGRATE bit
should always be set when migration success. cpage includes
src MIGRATE_PFN_MIGRATE bit set and MIGRATE_PFN_VALID bit
unset pages for both ram and vram when memory is only allocated
without being populated before migration, those ram pages should
be counted as migrate pages and those vram pages should not be
counted as migrate pages. Here migration pages refer to how many
vram pages involved.
-v2 use dst to check MIGRATE_PFN_VALID bit (suggested-by Philip)
-v3 add warning when vram pages is less than migration pages
return migration pages directly from copy function
-v4 correct comments and copy function return mpage (suggested-by Felix)
Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Certain messages will processed with high priority by PMFW even if it
hasn't responded to a previous message. Send the priority message
regardless of the success/fail status of the previous message. Add
support on SMUv13.0.6 and SMUv13.0.12
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Amber Lin [Fri, 1 Aug 2025 00:45:00 +0000 (20:45 -0400)]
drm/amdkfd: Destroy KFD debugfs after destroy KFD wq
Since KFD proc content was moved to kernel debugfs, we can't destroy KFD
debugfs before kfd_process_destroy_wq. Move kfd_process_destroy_wq prior
to kfd_debugfs_fini to fix a kernel NULL pointer problem. It happens
when /sys/kernel/debug/kfd was already destroyed in kfd_debugfs_fini but
kfd_process_destroy_wq calls kfd_debugfs_remove_process. This line
debugfs_remove_recursive(entry->proc_dentry);
tries to remove /sys/kernel/debug/kfd/proc/<pid> while
/sys/kernel/debug/kfd is already gone. It hangs the kernel by kernel
NULL pointer.
Signed-off-by: Amber Lin <Amber.Lin@amd.com> Reviewed-by: Eric Huang <jinhuieric.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amdgpu: Wait for bootloader after PSPv11 reset
Some PSPv11 SOCs take a longer time for PSP based mode-1 reset. Instead
of checking for C2PMSG_33 status, add the callback wait_for_bootloader.
Wait for bootloader to be back to steady state is already part of the
generic mode-1 reset flow. Increase the retry count for bootloader wait
and also fix the mask to prevent fake pass.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jesse.Zhang [Mon, 4 Aug 2025 00:43:15 +0000 (08:43 +0800)]
drm/amdgpu: Update SDMA firmware version check for user queue support
This commit fixes a firmware version check for enabling user queue
support in SDMA v7.0. The previous version check (7836028) was
incorrect and could lead to issues with PROTECTED_FENCE_SIGNAL
commands causing register conflicts between MCU_DBG0 and MCU_DBG1.
Fixes: 8c011408ed84 ("drm/amdgpu/sdma7: add ucode version checks for userq support") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cached metrics data validity is 1ms on arcturus. It's not reasonable for
any client to query gpu_metrics at a faster rate and constantly
interrupt PMFW.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cached metrics data validity is 1ms on aldebaran. It's not reasonable
for any client to query gpu_metrics at a faster rate and constantly
interrupt PMFW.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Summary:
* Add interface to log hw state when underflow happens
* Fix hubp programming of 3dlut fast load
* Avoid Read Remote DPCD Many Times
* More liberal vmin/vmax update for freesync
* Fix dmub access race condition
Acked-by: Sun peng (Leo) Li <sunpeng.li@amd.com> Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Muhammad Ahmed [Fri, 25 Jul 2025 01:50:25 +0000 (21:50 -0400)]
drm/amd/display: Adding interface to log hw state when underflow happens
[why]
Will help us better debug underflow issues.
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Muhammad Ahmed <Muhammad.Ahmed@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ryan Seto [Thu, 24 Jul 2025 18:57:52 +0000 (14:57 -0400)]
drm/amd/display: Toggle for Disable Force Pstate Allow on Disable
[Why & How]
In theory, driver should be able to support disabling force pstate allow
after hardware release however this behavior is not tested yet.
Introducing a new toggle to disable the force on the fly.
Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Ryan Seto <ryanseto@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Fixing hubp programming of 3dlut fast load
[why]
HUBP needs to know the size of the lut's destination in MPC.
This is currently defaulted to 17, and needs to be set for specific
lut size.
[how]
Define and apply the missing hubp field. Taking this opportunity
to consolidate the programming of 3dlut into a hubp and mpc function.
Reviewed-by: Krunoslav Kovac <krunoslav.kovac@amd.com> Signed-off-by: Reza Amini <reza.amini@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why/How]
The w/a will cause reboot black screen issue.
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Jingwen Zhu <Jingwen.Zhu@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Avoid Read Remote DPCD Many Times
Reading remote dpcd is time consuming. Instead of reading each byte
one by one, read 16 bytes together.
Reviewed-by: ChiaHsuan (Tom) Chung <chiahsuan.chung@amd.com> Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts commit 66abb996999de0d440a02583a6e70c2c24deab45.
This broke custom brightness curves but it wasn't obvious because
of other related changes. Custom brightness curves are always
from a 0-255 input signal. The correct fix was to fix the default
value which was done by [1].
Paul Hsieh [Wed, 23 Jul 2025 03:51:42 +0000 (11:51 +0800)]
drm/amd/display: update dpp/disp clock from smu clock table
[Why]
The reason some high-resolution monitors fail to display properly
is that this platform does not support sufficiently high DPP and
DISP clock frequencies
[How]
Update DISP and DPP clocks from the smu clock table then DML can
filter these mode if not support.
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Paul Hsieh <Paul.Hsieh@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: more liberal vmin/vmax update for freesync
[Why]
FAMS2 expects vmin/vmax to be updated in the case when freesync is
off, but supported. But we only update it when freesync is enabled.
[How]
Change the vsync handler such that dc_stream_adjust_vmin_vmax() its called
irrespective of whether freesync is enabled. If freesync is supported,
then there is no harm in updating vmin/vmax registers.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3546 Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Reviewed-by: ChiaHsuan Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Accessing DC from amdgpu_dm is usually preceded by acquisition of
dc_lock mutex. Most of the DC API that DM calls are under a DC lock.
However, there are a few that are not. Some DC API called from interrupt
context end up sending DMUB commands via a DC API, while other threads were
using DMUB. This was apparent from a race between calls for setting idle
optimization enable/disable and the DC API to set vmin/vmax.
Offload the call to dc_stream_adjust_vmin_vmax() to a thread instead
of directly calling them from the interrupt handler such that it waits
for dc_lock.
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Duncan Ma [Tue, 22 Jul 2025 16:22:15 +0000 (12:22 -0400)]
drm/amd/display: Adjust AUX-less ALPM setting
[Why & How]
Change ACDS period to support LTTPR.
Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Duncan Ma <Duncan.Ma@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Siyang Liu [Fri, 4 Jul 2025 03:16:22 +0000 (11:16 +0800)]
drm/amd/display: fix a Null pointer dereference vulnerability
[Why]
A null pointer dereference vulnerability exists in the AMD display driver's
(DC module) cleanup function dc_destruct().
When display control context (dc->ctx) construction fails
(due to memory allocation failure), this pointer remains NULL.
During subsequent error handling when dc_destruct() is called,
there's no NULL check before dereferencing the perf_trace member
(dc->ctx->perf_trace), causing a kernel null pointer dereference crash.
[How]
Check if dc->ctx is non-NULL before dereferencing.
Link: https://lore.kernel.org/r/tencent_54FF4252EDFB6533090A491A25EEF3EDBF06@qq.com Co-developed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
(Updated commit text and removed unnecessary error message) Signed-off-by: Siyang Liu <Security@tencent.com> Signed-off-by: Roman Li <roman.li@amd.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Michel Dänzer [Wed, 30 Jul 2025 08:09:02 +0000 (10:09 +0200)]
drm/amd/display: Add primary plane to commits for correct VRR handling
amdgpu_dm_commit_planes calls update_freesync_state_on_stream only for
the primary plane. If a commit affects a CRTC but not its primary plane,
it would previously not trigger a refresh cycle or affect LFC, violating
current UAPI semantics.
Fixes e.g. atomic commits affecting only the cursor plane being limited
to the minimum refresh rate.
Don't do this for the legacy cursor ioctls though, it would break the
UAPI semantics for those.
Suggested-by: Xaver Hugl <xaver.hugl@kde.org> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3034 Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Yunxiang Li [Fri, 25 Jul 2025 16:56:35 +0000 (12:56 -0400)]
drm/amdgpu: skip mgpu fan boost for multi-vf
On multi-vf setup if the VM have two vf assigned, perhaps from two
different gpus, mgpu fan boost will fail.
Signed-off-by: Yunxiang Li <Yunxiang.Li@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When power management is not enabled in the kernel build, the newly
added hibernation changes cause a link failure:
arm-linux-gnueabi-ld: drivers/gpu/drm/amd/amdgpu/amdgpu_drv.o: in function `amdgpu_pmops_thaw':
amdgpu_drv.c:(.text+0x1514): undefined reference to `pm_hibernate_is_recovering'
Make the power management code in this driver conditional on
CONFIG_PM and CONFIG_PM_SLEEP
Fixes: 530694f54dd5 ("drm/amdgpu: do not resume device in thaw for normal hibernation") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20250714081635.4071570-1-arnd@kernel.org Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sathishkumar S [Fri, 18 Jul 2025 07:53:53 +0000 (13:23 +0530)]
drm/amdgpu/vcn: Register dump cleanup in VCN2_5
Use generic vcn devcoredump helper functions for VCN2_5 and VCN2_6
Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sathishkumar S [Fri, 18 Jul 2025 07:45:00 +0000 (13:15 +0530)]
drm/amdgpu/vcn: Register dump cleanup in VCN2_0_0
Use generic vcn devcoredump helper functions for VCN2_0_0
Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sathishkumar S [Fri, 18 Jul 2025 07:38:00 +0000 (13:08 +0530)]
drm/amdgpu/vcn: Register dump cleanup in VCN3_0
Use generic vcn devcoredump helper functions for VCN3_0
Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sathishkumar S [Fri, 18 Jul 2025 07:27:04 +0000 (12:57 +0530)]
drm/amdgpu/vcn: Register dump cleanup in VCN4_0_3
Use generic vcn devcoredump helper functions for VCN4_0_3
Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sathishkumar S [Fri, 18 Jul 2025 07:08:49 +0000 (12:38 +0530)]
drm/amdgpu/vcn: Register dump cleanup in VCN4_0_5
Use generic vcn devcoredump helper functions for VCN4_0_5
Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sathishkumar S [Fri, 18 Jul 2025 06:48:30 +0000 (12:18 +0530)]
drm/amdgpu/vcn: Register dump cleanup in VCN4_0_0
Use generic vcn devcoredump helper functions for VCN4_0_0
Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sathishkumar S [Thu, 17 Jul 2025 18:57:50 +0000 (00:27 +0530)]
drm/amdgpu/vcn: Register dump cleanup in VCN5
Use generic vcn devcoredump helper functions for VCN5
Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sathishkumar S [Thu, 17 Jul 2025 06:00:52 +0000 (11:30 +0530)]
drm/amdgpu/vcn: Add regdump helper functions
Add generic helper functions for vcn devcoredump support
which can be re-used for all vcn versions.
Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Meng Li [Fri, 9 May 2025 05:44:24 +0000 (13:44 +0800)]
drm/amd/amdgpu: Release xcp drm memory after unplug
Add a new API amdgpu_xcp_drm_dev_free().
After unplug xcp device, need to release xcp drm memory etc.
Co-developed-by: Jiang Liu <gerry@linux.alibaba.com> Signed-off-by: Jiang Liu <gerry@linux.alibaba.com> Signed-off-by: Meng Li <li.meng@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amdgpu: Retain job->vm in amdgpu_job_prepare_job
The field job->vm is used in function amdgpu_job_run to get the page
table re-generation counter and decide whether the job should be skipped.
Specifically, function amdgpu_vm_generation checks if the VM is valid for this job to use.
For instance, if a gfx job depends on a cancelled sdma job from entity vm->delayed,
then the gfx job should be skipped.
Fixes: 26c95e838e63 ("drm/amdgpu: set the VM pointer to NULL in amdgpu_job_prepare") Signed-off-by: YuanShang <YuanShang.Mao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd: Use drm_*() macros instead of DRM_*() for amdgpu_cs
Some of the IOCTL messages can be called for different GPUs and it might
not be obvious which one called them from a problem. Using the drm_*()
macros the correct device will be shown in the messages.
Timur Kristóf [Tue, 22 Jul 2025 15:58:30 +0000 (17:58 +0200)]
drm/amd/display: Fix DCE 6.0 and 6.4 PLL programming.
Apparently, both DCE 6.0 and 6.4 have 3 PLLs, but PLL0 can only
be used for DP. Make sure to initialize the correct amount of PLLs
in DC for these DCE versions and use PLL0 only for DP.
Also, on DCE 6.0 and 6.4, the PLL0 needs to be powered on at
initialization as opposed to DCE 6.1 and 7.x which use a different
clock source for DFS.
The following functions were used as reference from the old
radeon driver implementation of DCE 6.x:
- radeon_atom_pick_pll
- atombios_crtc_set_disp_eng_pll
Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>