Jesse.Zhang [Tue, 26 Aug 2025 09:30:58 +0000 (17:30 +0800)]
drm/amdgpu: update firmware version checks for user queue support
The minimum firmware versions required for user queue functionality
have been increased to address an issue where the queue privilege
state was lost during queue connect operations.
The problem occurred because the privilege state was being restored
to its initial value at the beginning of the function, overwriting
the state that was properly set during the queue connect case.
This commit updates the minimum version requirements:
- ME firmware from 2390 to 2420
- PFP firmware from 2530 to 2580
- MEC firmware from 2600 to 2650
- MES firmware remains at 120
These updated firmware versions contain the necessary fixes to
properly maintain queue privilege state throughout connect operations.
Fixes: 61ca97e9590c ("drm/amdgpu: Add fw minimum version check for usermode queue") Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Yang Wang [Mon, 25 Aug 2025 04:54:01 +0000 (12:54 +0800)]
drm/amd/amdgpu: disable hwmon power1_cap* for gfx 11.0.3 on vf mode
the PPSMC_MSG_GetPptLimit msg is not valid for gfx 11.0.3 on vf mode,
so skiped to create power1_cap* hwmon sysfs node.
Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Shaoyun Liu [Fri, 11 Jul 2025 01:42:16 +0000 (21:42 -0400)]
drm/amd/amdgpu : Use the MES INV_TLBS API for tlb invalidation on gfx12
From MES version 0x81, it provide the new API INV_TLBS that support
invalidate tlbs with PASID.
Signed-off-by: Shaoyun Liu <shaoyun.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Shaoyun Liu [Fri, 4 Jul 2025 16:30:10 +0000 (12:30 -0400)]
drm/amd/include : Update MES v12 API header(INV_TLBS)
The requirement from driver side is to have an API that can do the
tlb invalidation on dedicate pasid since driver don't know the vmid
and process mapping.
Make the API generic to support different tlb invalidation related
request. Driver can specify pasid, vmid, hub_id and vm address range
need to be invalidated.
With this API the old INV_GART in MISC Op can be deprecated.
Signed-off-by: Shaoyun Liu <shaoyun.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jesse.Zhang [Sat, 23 Aug 2025 07:00:06 +0000 (15:00 +0800)]
drm/amdgpu: fix shift-out-of-bounds in amdgpu_debugfs_jpeg_sched_mask_set
Fix a UBSAN shift-out-of-bounds warning in amdgpu_debugfs_jpeg_sched_mask_set
when the shift exponent reaches or exceeds 32 bits. The issue occurred because
a 32-bit integer '1' was being shifted by up to 32 bits, which is undefined
behavior.
Replace '1' with '1ULL' to ensure 64-bit arithmetic, matching the u64 type of
'val' and preventing the shift overflow. This is consistent with the existing
mask calculation that already uses 1ULL.
The error manifested as:
UBSAN: shift-out-of-bounds in drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c:373:17
shift exponent 32 is too large for 32-bit type 'int'
v2: remove debug log
* Firmware releases for multiple asics
* CodeQL fixes
* Fix for double cursor with 180 degree rotation on large resolutions
* Misc bug fixes for DSC, PSR/Replay, DPIA etc.
Signed-off-by: Nicholas Carbones <ncarbone@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Leo Li <sunpeng.li@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Taimur Hassan [Fri, 15 Aug 2025 23:23:50 +0000 (19:23 -0400)]
drm/amd/display: [FW Promotion] Release 0.1.24.0
Add two new IPS residency data modes.
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Leo Li <sunpeng.li@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dillon Varone [Thu, 14 Aug 2025 16:01:15 +0000 (12:01 -0400)]
drm/amd/display: Consider sink max slice width limitation for dsc
[WHY&HOW]
The sink max slice width limitation should be considered for DSC, but
was removed in "refactor DSC cap calculations".
This patch adds it back and takes the valid minimum between the sink and
source.
Signed-off-by: Dillon Varone <Dillon.Varone@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ivan Lipski [Thu, 7 Aug 2025 13:45:26 +0000 (09:45 -0400)]
drm/amd/display: Support HW cursor 180 rot for any number of pipe splits
[Why]
For the HW cursor, its current position in the pipe_ctx->stream struct is
not affected by the 180 rotation, i. e. the top left corner is still at
0,0. However, the DPP & HUBP set_cursor_position functions require rotated
position.
The current approach is hard-coded for ODM 2:1, thus it's failing for
ODM 4:1, resulting in a double cursor.
[How]
Instead of calculating the new cursor position relatively to the
viewports, we calculate it using a viewavable clip_rect of each plane.
The clip_rects are first offset and scaled to the same space as the
src_rect, i. e. Stream space -> Plane space.
In case of a pipe split, which divides the plane into 2 or more viewports,
the clip_rect is the union of all the viewports of the given plane.
With the assumption that the viewports in HUBP's set_cursor_position are
in the Plane space as well, it should produce a correct cursor position
for any number of pipe splits.
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Signed-off-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Amber Lin [Fri, 15 Aug 2025 18:04:15 +0000 (14:04 -0400)]
drm/amdkfd: Tie UNMAP_LATENCY to queue_preemption
When KFD asks CP to preempt queues, other than preempt CP queues, CP
also requests SDMA to preempt SDMA queues with UNMAP_LATENCY timeout.
Currently queue_preemption_timeout_ms is 9000 ms by default but can be
configured via module parameter. KFD_UNMAP_LATENCY_MS is hard coded as
4000 ms though. This patch ties KFD_UNMAP_LATENCY_MS to
queue_preemption_timeout_ms so in a slow system such as emulator, both
CP and SDMA slowness are taken into account.
Signed-off-by: Amber Lin <Amber.Lin@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Update the conditions for setting the SMU vcn reset caps in the SMU v13.0.6 PPT
initialization function. Specifically:
- Add support for VCN reset capability for firmware versions 0x00558200 and
above when the program version is 0.
- Add support for VCN reset capability for firmware versions 0x05551800 and
above when the program version is 5.
v2: correct the smu mp1 version for program 5 (Lijo)
Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Eric Huang [Mon, 18 Aug 2025 18:22:53 +0000 (14:22 -0400)]
drm/amdkfd: fix vram allocation failure for a special case
When it only allocates vram without va, which is 0, and a
SVM range allocated stays in this range, the vram allocation
returns failure. It should be skipped for this case from
SVM usage check.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sunday Clement [Fri, 30 May 2025 14:58:08 +0000 (10:58 -0400)]
drm/amdkfd: Allow device error to be logged
The addition of a WARN_ON() check in order to return early in the
kq_initialize function retroactively causes the default case in the
following switch statement to never be executed, preventing dev_err
from logging device errors in the kernel. Both logs are now checked
in the default case.
Rakuram Eswaran [Thu, 21 Aug 2025 02:59:55 +0000 (08:29 +0530)]
docs: gpu: amdgpu: Fix spelling in amdgpu documentation
Fixed following typos reported by Codespell
1. propogated ==> propagated
aperatures ==> apertures
In Documentation/gpu/amdgpu/debugfs.rst
2. parition ==> partition
In Documentation/gpu/amdgpu/process-isolation.rst
3. conections ==> connections
In Documentation/gpu/amdgpu/display/programming-model-dcn.rst
In addition to above,
Fixed wrong bit-partition naming in gpu/amdgpu/process-isolation.rst
from "fourth" partition to "third" partition.
Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Suggested-by: Randy Dunlap <rdunlap@infradead.org> Suggested-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Rakuram Eswaran <rakuram.e96@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Brahmajit Das [Thu, 21 Aug 2025 12:01:33 +0000 (17:31 +0530)]
drm/amd/display: clean-up dead code in dml2_mall_phantom
pipe_idx in funtion dml2_svp_validate_static_schedulabilit, although set
is never actually used. While building with GCC 16 this gives a warning:
drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml2_mall_phantom.c: In function ‘set_phantom_stream_timing’:
drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml2_mall_phantom.c:657:25: warning: variable ‘pipe_idx’ set but not used [-Wunused-but-set-variable=]
657 | unsigned int i, pipe_idx;
| ^~~~~~~~
Signed-off-by: Brahmajit Das <listout@listout.xyz> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Yifan Zhang [Wed, 20 Aug 2025 06:55:41 +0000 (14:55 +0800)]
drm/amdgpu: remove redundant AMDGPU_HAS_VRAM
AMDGPU_HAS_VRAM is redundant with is_app_apu, as both refer to
APUs with no carve-out. Since AMDGPU_HAS_VRAM only occurs once,
remove AMDGPU_HAS_VRAM definition. The tmr allocation can be covered
with AMDGPU_GEM_DOMAIN_GTT | AMDGPU_GEM_DOMAIN_VRAM in both vram and
non vram ASICs.
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ce Sun [Wed, 20 Aug 2025 09:18:57 +0000 (17:18 +0800)]
drm/amdgpu: Correct the loss of aca bank reg info
By polling, poll ACA bank count to ensure that valid
ACA bank reg info can be obtained
v2: add corresponding delay before send msg to SMU to query mca bank info
(Stanley)
v3: the loop cannot exit. (Thomas)
v4: remove amdgpu_aca_clear_bank_count. (Kevin)
v5: continuously inject ce. If a creation interruption
occurs at this time, bank reg info will be lost. (Thomas)
v5: each cycle is delayed by 100ms. (Tao)
Signed-off-by: Ce Sun <cesun102@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ce Sun [Tue, 19 Aug 2025 06:47:05 +0000 (14:47 +0800)]
drm/amdgpu: Add a mutex lock to protect poison injection
When poison is triggered multiple times, competition will occur.
Add a mutex lock to protect poison injection
Signed-off-by: Ce Sun <cesun102@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ce Sun [Thu, 7 Aug 2025 04:36:05 +0000 (12:36 +0800)]
drm/amdgpu: Correct the counts of nr_banks and nr_errors
Correct the counts of nr_banks and nr_errors
Signed-off-by: Ce Sun <cesun102@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Liao Yuanhong [Tue, 19 Aug 2025 14:24:50 +0000 (22:24 +0800)]
drm/amd/display: Remove redundant header files
The header file "dc_stream.h" is already included on line 1507. Remove the
redundant include.
This is because the header file was initially included towards the latter
part of the code. Subsequent commits had to include the header file again
earlier in the code. In my opinion, this doesn't count as a fix; it just
requires removing the redundant header inclusion.
Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Liao Yuanhong [Tue, 19 Aug 2025 08:25:21 +0000 (16:25 +0800)]
drm/amdgpu/fence: Remove redundant 0 value initialization
The amdgpu_fence struct is already zeroed by kzalloc(). It's redundant to
initialize am_fence->context to 0.
Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 24 Jun 2025 15:38:14 +0000 (11:38 -0400)]
drm/amdgpu/gfx12: set MQD as appriopriate for queue types
Set the MQD as appropriate for the kernel vs user queues.
Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 24 Jun 2025 15:37:16 +0000 (11:37 -0400)]
drm/amdgpu/gfx11: set MQD as appriopriate for queue types
Set the MQD as appropriate for the kernel vs user queues.
Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Timur Kristóf [Sat, 2 Aug 2025 15:51:53 +0000 (17:51 +0200)]
drm/amd/display: Fix DP audio DTO1 clock source on DCE 6.
On DCE 6, DP audio was not working. However, it worked when an
HDMI monitor was also plugged in.
Looking at dce_aud_wall_dto_setup it seems that the main
difference is that we use DTO1 when only DP is plugged in.
When programming DTO1, it uses audio_dto_source_clock_in_khz
which is set from get_dp_ref_freq_khz
The dce60_get_dp_ref_freq_khz implementation looks incorrect,
because DENTIST_DISPCLK_CNTL seems to be always zero on DCE 6,
so it isn't usable.
I compared dce60_get_dp_ref_freq_khz to the legacy display code,
specifically dce_v6_0_audio_set_dto, and it turns out that in
case of DCE 6, it needs to use the display clock. With that,
DP audio started working on Pitcairn, Oland and Cape Verde.
However, it still didn't work on Tahiti. Despite having the
same DCE version, Tahiti seems to have a different audio device.
After some trial and error I realized that it works with the
default display clock as reported by the VBIOS, not the current
display clock.
Rodrigo Siqueira [Sat, 16 Aug 2025 16:27:27 +0000 (10:27 -0600)]
drm/amdgpu/vcn: Remove unnecessary check
The function amdgpu_vcn_sysfs_reset_mask_init already returns 0, which
makes the check of the result unnecessary in the vcn_v4_0_3_sw_init().
Just return the amdgpu_vcn_sysfs_reset_mask_init directly.
Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Timur Kristóf [Thu, 31 Jul 2025 09:43:52 +0000 (11:43 +0200)]
drm/amd/display: Fix fractional fb divider in set_pixel_clock_v3
For later VBIOS versions, the fractional feedback divider is
calculated as the remainder of dividing the feedback divider by
a factor, which is set to 1000000. For reference, see:
- calculate_fb_and_fractional_fb_divider
- calc_pll_max_vco_construct
However, in case of old VBIOS versions that have
set_pixel_clock_v3, they only have 1 byte available for the
fractional feedback divider, and it's expected to be set to the
remainder from dividing the feedback divider by 10.
For reference see the legacy display code:
- amdgpu_pll_compute
- amdgpu_atombios_crtc_program_pll
This commit fixes set_pixel_clock_v3 by dividing the fractional
feedback divider passed to the function by 100000.
Fixes: 4562236b3bc0 ("drm/amd/dc: Add dc display driver (v2)") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Timur Kristóf [Thu, 31 Jul 2025 09:43:51 +0000 (11:43 +0200)]
drm/amd/display: Don't print errors for nonexistent connectors
When getting the number of connectors, the VBIOS reports
the number of valid indices, but it doesn't say which indices
are valid, and not every valid index has an actual connector.
If we don't find a connector on an index, that is not an error.
Considering these are not actual errors, don't litter the logs.
Fixes: 60df5628144b ("drm/amd/display: handle invalid connector indices") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Timur Kristóf [Thu, 31 Jul 2025 09:43:50 +0000 (11:43 +0200)]
drm/amd/display: Don't warn when missing DCE encoder caps
On some GPUs the VBIOS just doesn't have encoder caps,
or maybe not for every encoder.
This isn't really a problem and it's handled well,
so let's not litter the logs with it.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Timur Kristóf [Thu, 31 Jul 2025 09:43:49 +0000 (11:43 +0200)]
drm/amd/display: Fill display clock and vblank time in dce110_fill_display_configs
Also needed by DCE 6.
This way the code that gathers this info can be shared between
different DCE versions and doesn't have to be repeated.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Timur Kristóf [Thu, 31 Jul 2025 09:43:48 +0000 (11:43 +0200)]
drm/amd/display: Find first CRTC and its line time in dce110_fill_display_configs
dce110_fill_display_configs is shared between DCE 6-11, and
finding the first CRTC and its line time is relevant to DCE 6 too.
Move the code to find it from DCE 11 specific code.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Timur Kristóf [Thu, 31 Jul 2025 09:43:47 +0000 (11:43 +0200)]
drm/amd/display: Adjust DCE 8-10 clock, don't overclock by 15%
Adjust the nominal (and performance) clocks for DCE 8-10,
and set them to 625 MHz, which is the value used by the legacy
display code in amdgpu_atombios_get_clock_info.
This was tested with Hawaii, Tonga and Fiji.
These GPUs can output 4K 60Hz (10-bit depth) at 625 MHz.
The extra 15% clock was added as a workaround for a Polaris issue
which uses DCE 11, and should not have been used on DCE 8-10 which
are already hardcoded to the highest possible display clock.
Unfortunately, the extra 15% was mistakenly copied and kept
even on code paths which don't affect Polaris.
This commit fixes that and also adds a check to make sure
not to exceed the maximum DCE 8-10 display clock.
Fixes: 8cd61c313d8b ("drm/amd/display: Raise dispclk value for Polaris") Fixes: dc88b4a684d2 ("drm/amd/display: make clk mgr soc specific") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Timur Kristóf [Thu, 31 Jul 2025 09:43:46 +0000 (11:43 +0200)]
drm/amd/display: Don't overclock DCE 6 by 15%
The extra 15% clock was added as a workaround for a Polaris issue
which uses DCE 11, and should not have been used on DCE 6 which
is already hardcoded to the highest possible display clock.
Unfortunately, the extra 15% was mistakenly copied and kept
even on code paths which don't affect Polaris.
This commit fixes that and also adds a check to make sure
not to exceed the maximum DCE 6 display clock.
Fixes: 8cd61c313d8b ("drm/amd/display: Raise dispclk value for Polaris") Fixes: dc88b4a684d2 ("drm/amd/display: make clk mgr soc specific") Fixes: 3ecb3b794e2c ("drm/amd/display: dc/clk_mgr: add support for SI parts (v2)") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Xichao Zhao [Fri, 8 Aug 2025 02:52:09 +0000 (10:52 +0800)]
drm/amd/display: replace min/max nesting with clamp()
The clamp() macro explicitly expresses the intent of constraining
a value within bounds.Therefore, replacing min(max(a, b), c) with
clamp(val, lo, hi) can improve code readability.
Signed-off-by: Xichao Zhao <zhao.xichao@vivo.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Chenyuan Yang [Thu, 24 Jul 2025 02:36:41 +0000 (21:36 -0500)]
drm/amd/display: Add null pointer check in mod_hdcp_hdcp1_create_session()
The function mod_hdcp_hdcp1_create_session() calls the function
get_first_active_display(), but does not check its return value.
The return value is a null pointer if the display list is empty.
This will lead to a null pointer dereference.
Add a null pointer check for get_first_active_display() and return
MOD_HDCP_STATUS_DISPLAY_NOT_FOUND if the function return null.
This is similar to the commit c3e9826a2202
("drm/amd/display: Add null pointer check for get_first_active_display()").
Fixes: 2deade5ede56 ("drm/amd/display: Remove hdcp display state with mst fix") Signed-off-by: Chenyuan Yang <chenyuan0y@gmail.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Taimur Hassan [Fri, 8 Aug 2025 22:25:32 +0000 (17:25 -0500)]
drm/amd/display: Promote DC to 3.2.346
This version brings along following updates:
- Fix Xorg desktop unresponsive on Replay panel
- [FW Promotion] Release 0.1.23.0
- Avoid a NULL pointer dereference
- Attach privacy screen to DRM connector
- Setup Second Stutter Watermark Implementation
- Align LSDMA commands fields
- Delete unused functions
- Optimize amdgpu_dm_atomic_commit_tail()
- Add primary plane to commits for correct VRR handling
- Refactor DPP enum for backwards compatibility.
- Add LSDMA Linear Sub Window Copy support
Acked-by: Sun peng (Leo) Li <sunpeng.li@amd.com> Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Tom Chung [Fri, 18 Jul 2025 10:25:08 +0000 (18:25 +0800)]
drm/amd/display: Fix Xorg desktop unresponsive on Replay panel
[WHY & HOW]
IPS & self-fresh feature can cause vblank counter resets between
vblank disable and enable.
It may cause system stuck due to wait the vblank counter.
Call the drm_crtc_vblank_restore() during vblank enable to estimate
missed vblanks by using timestamps and update the vblank counter in
DRM.
It can make the vblank counter increase smoothly and resolve this issue.
Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Sun peng (Leo) Li <sunpeng.li@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Taimur Hassan [Fri, 8 Aug 2025 21:25:15 +0000 (17:25 -0400)]
drm/amd/display: [FW Promotion] Release 0.1.23.0
1. Fix loop counter.
2. Check whether rb->capacity is 0.
Acked-by: Sun peng (Leo) Li <sunpeng.li@amd.com> Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[WHY]
Although unlikely drm_atomic_get_new_connector_state() or
drm_atomic_get_old_connector_state() can return NULL.
[HOW]
Check returns before dereference.
Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Attach privacy screen to DRM connector
[WHY]
If a system has a privacy screen advertised by a driver it should
be included in the DRM connector for the eDP panel.
[HOW]
Detect statically declared privacy screens when creating eDP connector
and attach privacy screen DRM properties.
Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Austin Zheng [Tue, 5 Aug 2025 19:18:02 +0000 (15:18 -0400)]
drm/amd/display: Setup Second Stutter Watermark Implementation
[WHY & HOW]
Setup initial changes required to program another set of watermarks
for a 2nd stutter mode. The 2nd stutter mode will be lower power but
have higher enter/exit latencies.
PMFW to choose which stutter mode to use based on stutter efficiences
to see if original stutter (LP1) or low power stutter (LP2) will result
in better power savings.
Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Austin Zheng <Austin.Zheng@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rafal Ostrowski [Tue, 5 Aug 2025 12:53:37 +0000 (14:53 +0200)]
drm/amd/display: Align LSDMA commands fields
[WHY]
DC LSDMA functions had to remember to extract 1 from several fields
to be compliant with DMUB LSDMA commands interface.
Now this logic is moved to DMUB.
[HOW]
Moved extraction by 1 in several fields of LSDMA commands to DMUB.
Changed DC to not do it.
Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Rafal Ostrowski <rostrows@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Clay King [Wed, 30 Jul 2025 14:23:19 +0000 (10:23 -0400)]
drm/amd/display: Delete unused functions
[WHAT]
Removing unused code
Reviewed-by: Joshua Aberback <joshua.aberback@amd.com> Signed-off-by: Clay King <clayking@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[WHY]
The first two loops of for_each_oldnew_connector_in_state() both operate
on an HDCP queue. If one isn't setup then each connector is iterated but
skipped TWICE. This is wasteful for the majority of cases.
[HOW]
Combine the two HDCP related loops of for_each_oldnew_connector_in_state()
and check for the HDCP workqueue before even running either of them. This
should avoid running the functions in most cases, and if HDCP is setup only
run once.
Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Add LSDMA Linear Sub Window Copy support
[WHAT]
Add support for LSDMA Linear Sub Window Copy command.
Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Rafal Ostrowski <rostrows@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Chenglei Xie [Thu, 7 Aug 2025 20:52:34 +0000 (16:52 -0400)]
drm/amdgpu: refactor bad_page_work for corner case handling
When a poison is consumed on the guest before the guest receives the host's poison creation msg, a corner case may occur to have poison_handler complete processing earlier than it should to cause the guest to hang waiting for the req_bad_pages reply during a VF FLR, resulting in the VM becoming inaccessible in stress tests.
To fix this issue, this patch refactored the mailbox sequence by seperating the bad_page_work into two parts req_bad_pages_work and handle_bad_pages_work.
Old sequence:
1.Stop data exchange work
2.Guest sends MB_REQ_RAS_BAD_PAGES to host and keep polling for IDH_RAS_BAD_PAGES_READY
3.If the IDH_RAS_BAD_PAGES_READY arrives within timeout limit, re-init the data exchange region for updated bad page info
else timeout with error message
New sequence:
req_bad_pages_work:
1.Stop data exhange work
2.Guest sends MB_REQ_RAS_BAD_PAGES to host
Once Guest receives IDH_RAS_BAD_PAGES_READY event
handle_bad_pages_work:
3.re-init the data exchange region for updated bad page info
drm/amd/display: Add NULL pointer checks in dc_stream cursor attribute functions
The function dc_stream_set_cursor_attributes() currently dereferences
the `stream` pointer and nested members `stream->ctx->dc->current_state`
without checking for NULL.
All callers of these functions, such as in
`dcn30_apply_idle_power_optimizations()` and
`amdgpu_dm_plane_handle_cursor_update()`, already perform NULL checks
before calling these functions.
Fixes below:
drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_stream.c:336 dc_stream_program_cursor_attributes()
error: we previously assumed 'stream' could be null (see line 334)
drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_stream.c
327 bool dc_stream_program_cursor_attributes(
328 struct dc_stream_state *stream,
329 const struct dc_cursor_attributes *attributes)
330 {
331 struct dc *dc;
332 bool reset_idle_optimizations = false;
333
334 dc = stream ? stream->ctx->dc : NULL;
^^^^^^
The old code assumed stream could be NULL.
335
--> 336 if (dc_stream_set_cursor_attributes(stream, attributes)) {
^^^^^^
The refactor added an unchecked dereference.
drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_stream.c
313 bool dc_stream_set_cursor_attributes(
314 struct dc_stream_state *stream,
315 const struct dc_cursor_attributes *attributes)
316 {
317 bool result = false;
318
319 if (dc_stream_check_cursor_attributes(stream, stream->ctx->dc->current_state, attributes)) {
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Here.
This function used to check for if stream as NULL and return false at
the start. Probably we should add that back.
Fixes: 4465dd0e41e8 ("drm/amd/display: Refactor SubVP cursor limiting logic") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Cc: Alex Hung <alex.hung@amd.com> Cc: Alvin Lee <alvin.lee2@amd.com> Cc: Ray Wu <ray.wu@amd.com> Cc: Dillon Varone <dillon.varone@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: ChiaHsuan Chung <chiahsuan.chung@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Daniel Wheeler <daniel.wheeler@amd.com> Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Wenjing Liu <wenjing.liu@amd.com> Cc: Jun Lei <Jun.Lei@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Dillon Varone <Dillon.varone@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jesse.Zhang [Wed, 13 Aug 2025 02:55:44 +0000 (10:55 +0800)]
drm/amd/vcn: Add late_init callback for VCN v4.0.3 reset handling
This change reorganizes VCN reset capability detection by:
1. Moving reset mask configuration from sw_init to new late_init phase
2. Adding vcn_v4_0_3_late_init() to properly check for per-queue reset support
3. Only setting soft full reset mask as fallback when per-queue reset isn't supported
4. Removing TODO comment now that queue reset support is implemented
V2: Removed unrelated changes. Keep amdgpu_get_soft_full_reset_mask in place
and remove TODO comment. (Alex)
v3: set the flags at one place (all in late_init) (Lijo)
Suggested-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Ruili Ji <ruiliji2@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Kent Russell [Mon, 21 Jul 2025 18:06:36 +0000 (14:06 -0400)]
drm/amdkfd: Handle lack of READ permissions in SVM mapping
HMM assumes that pages have READ permissions by default. Inside
svm_range_validate_and_map, we add READ permissions then add WRITE
permissions if the VMA isn't read-only. This will conflict with regions
that only have PROT_WRITE or have PROT_NONE. When that happens,
svm_range_restore_work will continue to retry, silently, giving the
impression of a hang if pr_debug isn't enabled to show the retries..
If pages don't have READ permissions, simply unmap them and continue. If
they weren't mapped in the first place, this would be a no-op. Since x86
doesn't support write-only, and PROT_NONE doesn't allow reads or writes
anyways, this will allow the svm range validation to continue without
getting stuck in a loop forever on mappings we can't use with HMM.
Signed-off-by: Kent Russell <kent.russell@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jesse.Zhang [Wed, 13 Aug 2025 02:40:18 +0000 (10:40 +0800)]
drm/amd/pm: Add VCN reset support for SMU v13.0.6
This commit implements VCN reset capability for SMU v13.0.6 with the following changes:
1. Added new PPSMC message ID (0x5B) for VCN reset in SMU firmware interface
2. Extended SMU capabilities to include VCN_RESET support
3. Implemented VCN reset support check:
- Added smu_v13_0_6_reset_vcn_is_supported() function
4. Updated SMU v13.0.6 PPT functions to include VCN reset operations
v2: clean up debug info (Alex)
v3: remove unsupported message and split smu v13.0.6 changes to a separate patch (Lijo)
v4: simply the function (smu_v13_0_6_reset_vcn_is_supported) (Lijo)
Suggested-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Ruili Ji <ruiliji2@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jesse.Zhang [Wed, 13 Aug 2025 02:36:58 +0000 (10:36 +0800)]
drm/amd/pm: Add VCN reset support check capability
This change introduces infrastructure to check whether VCN reset
is supported by the SMU firmware. Key changes include:
1. Added new functions to query VCN reset support:
- amdgpu_dpm_reset_vcn_is_supported()
- smu_reset_vcn_is_supported()
- pptable_funcs.reset_vcn_is_supported callback
2. Implemented proper locking in the DPM layer with mutex protection
3. Maintained consistency with existing SDMA reset support checks
The new capability allows callers to check for VCN reset support
before attempting the operation, preventing unnecessary attempts
on unsupported platforms.
v2: clean up debug info(Alex)
Suggested-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Ruili Ji <ruiliji2@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Heng Zhou [Wed, 13 Aug 2025 03:18:04 +0000 (11:18 +0800)]
drm/amdgpu: fix nullptr err of vm_handle_moved
If a amdgpu_bo_va is fpriv->prt_va, the bo of this one is always NULL.
So, such kind of amdgpu_bo_va should be updated separately before
amdgpu_vm_handle_moved.
Eric Huang [Thu, 7 Aug 2025 18:23:11 +0000 (14:23 -0400)]
drm/amdkfd: set uuid for each partition in topology
Currently each kfd compute partition/node is sharing
the same uuid of AID, which doen't meet the CUDA spec
for visible device, so corresponding XCD id for each
partition in smu has been assigned to xcp, and exposed
to kfd topology.
v2: add NULL check (Lijo)
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Xichao Zhao [Tue, 12 Aug 2025 03:16:03 +0000 (11:16 +0800)]
drm/radeon: replace min/max nesting with clamp()
The clamp() macro explicitly expresses the intent of constraining
a value within bounds.Therefore, replacing min(max(a, b), c) and
max(min(a,b),c) with clamp(val, lo, hi) can improve code readability.
Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Xichao Zhao <zhao.xichao@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Liu01 Tong [Mon, 11 Aug 2025 06:52:37 +0000 (14:52 +0800)]
drm/amdgpu: fix task hang from failed job submission during process kill
During process kill, drm_sched_entity_flush() will kill the vm
entities. The following job submissions of this process will fail, and
the resources of these jobs have not been released, nor have the fences
been signalled, causing tasks to hang and timeout.
Fix by check entity status in amdgpu_vm_ready() and avoid submit jobs to
stopped entity.
v2: add amdgpu_vm_ready() check before amdgpu_vm_clear_freed() in
function amdgpu_cs_vm_handling().
Signed-off-by: Liu01 Tong <Tong.Liu01@amd.com> Signed-off-by: Lin.Cao <lincao12@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jack Xiao [Mon, 11 Aug 2025 07:20:55 +0000 (15:20 +0800)]
drm/amdgpu: fix incorrect vm flags to map bo
It should use vm flags instead of pte flags
to specify bo vm attributes.
Fixes: 7946340fa389 ("drm/amdgpu: Move csa related code to separate file") Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
YiPeng Chai [Tue, 12 Aug 2025 01:17:58 +0000 (09:17 +0800)]
drm/amdgpu: fix vram reservation issue
The vram block allocation flag must be cleared
before making vram reservation, otherwise reserving
addresses within the currently freed memory range
will always fail.
Fixes: c9cad937c0c5 ("drm/amdgpu: add drm buddy support to amdgpu") Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Some kfd ioctls may not be available depending on the kernel version the
user is running, as such we need to report -ENOTTY so userland can
determine the cause of the ioctl failure.
Signed-off-by: Geoffrey McRae <geoffrey.mcrae@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Brahmajit Das [Mon, 11 Aug 2025 09:21:25 +0000 (14:51 +0530)]
drm/radeon/r600_cs: clean up of dead code in r600_cs
GCC 16 enables -Werror=unused-but-set-variable= which results in build
error with the following message.
drivers/gpu/drm/radeon/r600_cs.c: In function ‘r600_texture_size’:
drivers/gpu/drm/radeon/r600_cs.c:1411:29: error: variable ‘level’ set but not used [-Werror=unused-but-set-variable=]
1411 | unsigned offset, i, level;
| ^~~~~
cc1: all warnings being treated as errors
make[6]: *** [scripts/Makefile.build:287: drivers/gpu/drm/radeon/r600_cs.o] Error 1
level although is set, but in never used in the function
r600_texture_size. Thus resulting in dead code and this error getting
triggered.
Fixes: 60b212f8ddcd ("drm/radeon: overhaul texture checking. (v3)") Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Brahmajit Das <listout@listout.xyz> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Cryolitia PukNgae <cryolitia@uniontech.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Taimur Hassan [Sun, 3 Aug 2025 23:38:31 +0000 (18:38 -0500)]
drm/amd/display: Promote DC to 3.2.345
This version brings along following update:
-Fix close and open lid may cause eDP remaining blank
-Fix frequently disabling/enabling OTG may cause incorrect
configuration of OTG
Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Taimur Hassan [Sun, 3 Aug 2025 20:12:50 +0000 (16:12 -0400)]
drm/amd/display: [FW Promotion] Release 0.1.22.0
Add a new command for Panel Replay.
Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Danny Wang [Thu, 24 Jul 2025 05:58:21 +0000 (13:58 +0800)]
drm/amd/display: Reset apply_eamless_boot_optimization when dpms_off
[WHY&HOW]
The user closed the lid while the system was powering on and opened it
again before the “apply_seamless_boot_optimization” was set to false,
resulting in the eDP remaining blank.
Reset the “apply_seamless_boot_optimization” to false when dpms off.
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Danny Wang <Danny.Wang@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
TungYu Lu [Tue, 15 Jul 2025 08:56:59 +0000 (16:56 +0800)]
drm/amd/display: Wait until OTG enable state is cleared
[Why]
Customer reported an issue that OS starts and stops device multiple times
during driver installation. Frequently disabling and enabling OTG may
prevent OTG from being safely disabled and cause incorrect configuration
upon the next enablement.
[How]
Add a wait until OTG_CURRENT_MASTER_EN_STATE is cleared as a short term
solution.
Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: TungYu Lu <tungyu.lu@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Vitaly Prosyak [Thu, 7 Aug 2025 20:37:25 +0000 (16:37 -0400)]
drm/amdgpu: add to custom amdgpu_drm_release drm_dev_enter/exit
User queues are disabled before GEM objects are released
(protecting against user app crashes).
No races with PCI hot-unplug (because drm_dev_enter prevents cleanup
if iewdevice is being removed).
Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Lijo Lazar [Sat, 28 Jun 2025 05:20:25 +0000 (10:50 +0530)]
drm/amdgpu: Save and restore switch state
During a DPC error kernel waits for the link to be active before
notifying downstream devices. On certain platforms with Broadcom switch
in synthetiic mode, switch responds with values even though the link is
not fully ready. The config space restoration done by pcie port driver
for SWUS/DS of dGPU is thus not effective as the switch is still doing
internal enumeration.
As a workaround, save state of SWUS/DS device in driver. Add additional
check to see if link is active and restore the values during DPC error
callbacks.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sathishkumar S [Tue, 5 Aug 2025 16:05:10 +0000 (21:35 +0530)]
drm/amdgpu/vcn: Hold pg_lock before vcn power off
Acquire vcn_pg_lock before changes to vcn power state
and release it after power off in idle work handler.
Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sathishkumar S [Tue, 5 Aug 2025 15:58:25 +0000 (21:28 +0530)]
drm/amdgpu/jpeg: Hold pg_lock before jpeg poweroff
Acquire jpeg_pg_lock before changes to jpeg power state
and release it after power off from idle work handler.
Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>