git.ipfire.org Git - thirdparty/linux.git/log
3 months ago drm/amdgpu/vcn2.5: use generic set_power_gating_state helper
Alex Deucher [Tue, 26 Nov 2024 17:35:42 +0000 (12:35 -0500)] 
drm/amdgpu/vcn2.5: use generic set_power_gating_state helper

No need for an IP specific version.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdgpu/vcn2.0: use generic set_power_gating_state helper
Alex Deucher [Tue, 26 Nov 2024 17:33:58 +0000 (12:33 -0500)] 
drm/amdgpu/vcn2.0: use generic set_power_gating_state helper

No need for an IP specific version.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdgpu/vcn1.0: use generic set_power_gating_state helper
Alex Deucher [Tue, 26 Nov 2024 17:32:12 +0000 (12:32 -0500)] 
drm/amdgpu/vcn1.0: use generic set_power_gating_state helper

No need for an IP specific version.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdgpu/vcn: add a generic helper for set_power_gating_state
Alex Deucher [Tue, 26 Nov 2024 17:30:30 +0000 (12:30 -0500)] 
drm/amdgpu/vcn: add a generic helper for set_power_gating_state

It's common for all VCN variants.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdgpu/vcn: use per instance callbacks for idle work handler
Alex Deucher [Tue, 26 Nov 2024 17:26:32 +0000 (12:26 -0500)] 
drm/amdgpu/vcn: use per instance callbacks for idle work handler

Use the vcn instance power gating callbacks rather than
the IP powergating callback.  This limits power gating to
only the instance in use rather than all of the instances.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
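
The per-instance idle handling described in the commit above can be sketched in plain C. This is only an illustration under stated assumptions: the struct, enum, and function names are hypothetical stand-ins, not the amdgpu definitions. The point it shows is that the idle handler calls a callback stored on one instance, so a busy instance is left powered while an idle one is gated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the amdgpu definitions. */
enum pg_state_sketch { PG_UNGATE, PG_GATE };

struct vcn_inst_sketch {
    unsigned int pending_jobs;      /* outstanding work on this instance */
    enum pg_state_sketch cur_state;
    /* per-instance power gating callback */
    int (*set_pg_state)(struct vcn_inst_sketch *inst,
                        enum pg_state_sketch state);
};

static int sketch_set_pg_state(struct vcn_inst_sketch *inst,
                               enum pg_state_sketch state)
{
    inst->cur_state = state;
    return 0;
}

/* Idle handler for one instance: gate it only if it has no pending work. */
static bool sketch_idle_work(struct vcn_inst_sketch *inst)
{
    assert(inst != NULL && inst->set_pg_state != NULL);
    if (inst->pending_jobs != 0)
        return false;               /* still busy, leave it powered */
    return inst->set_pg_state(inst, PG_GATE) == 0;
}
```

Because the handler only touches the instance it was given, gating one instance has no effect on the others.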
3 months ago drm/amdgpu/vcn5.0.1: add set_pg_state callback
Alex Deucher [Tue, 10 Dec 2024 19:15:34 +0000 (14:15 -0500)] 
drm/amdgpu/vcn5.0.1: add set_pg_state callback

Rework the code as a vcn instance callback.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdgpu/vcn5.0.0: add set_pg_state callback
Alex Deucher [Tue, 26 Nov 2024 17:20:38 +0000 (12:20 -0500)] 
drm/amdgpu/vcn5.0.0: add set_pg_state callback

Rework the code as a vcn instance callback.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdgpu/vcn4.0.5: add set_pg_state callback
Alex Deucher [Tue, 26 Nov 2024 17:20:19 +0000 (12:20 -0500)] 
drm/amdgpu/vcn4.0.5: add set_pg_state callback

Rework the code as a vcn instance callback.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdgpu/vcn4.0.3: add set_pg_state callback
Alex Deucher [Tue, 26 Nov 2024 17:20:04 +0000 (12:20 -0500)] 
drm/amdgpu/vcn4.0.3: add set_pg_state callback

Rework the code as a vcn instance callback.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdgpu/vcn4.0: add set_pg_state callback
Alex Deucher [Tue, 26 Nov 2024 17:19:49 +0000 (12:19 -0500)] 
drm/amdgpu/vcn4.0: add set_pg_state callback

Rework the code as a vcn instance callback.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdgpu/vcn3.0: add set_pg_state callback
Alex Deucher [Tue, 26 Nov 2024 17:19:33 +0000 (12:19 -0500)] 
drm/amdgpu/vcn3.0: add set_pg_state callback

Rework the code as a vcn instance callback.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdgpu/vcn2.5: add set_pg_state callback
Alex Deucher [Tue, 26 Nov 2024 17:19:15 +0000 (12:19 -0500)] 
drm/amdgpu/vcn2.5: add set_pg_state callback

Rework the code as a vcn instance callback.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdgpu/vcn2.0: add set_pg_state callback
Alex Deucher [Tue, 26 Nov 2024 15:57:48 +0000 (10:57 -0500)] 
drm/amdgpu/vcn2.0: add set_pg_state callback

Rework the code as a vcn instance callback.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdgpu/vcn1.0: add set_pg_state callback
Alex Deucher [Tue, 26 Nov 2024 15:52:15 +0000 (10:52 -0500)] 
drm/amdgpu/vcn1.0: add set_pg_state callback

Rework the code as a vcn instance callback.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdgpu/vcn: add new per instance callback for powergating
Alex Deucher [Tue, 26 Nov 2024 16:27:06 +0000 (11:27 -0500)] 
drm/amdgpu/vcn: add new per instance callback for powergating

This is per instance so add a new function pointer for it.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdgpu/vcn: adjust pause_dpg_mode function signature
Alex Deucher [Tue, 26 Nov 2024 16:14:58 +0000 (11:14 -0500)] 
drm/amdgpu/vcn: adjust pause_dpg_mode function signature

Change it to take a vcn instance rather than adev to align
with the vcn instance changes.

TODO: clean up the function internals to use the vinst state
directly rather than accessing it indirectly via adev->vcn.inst[].

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
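
The signature change in the commit above, and the TODO it leaves behind, can be sketched as follows. All names here are hypothetical; the sketch assumes only that each instance carries a back-pointer to its device and its own index. The new-style function takes the instance alone but still reaches its state indirectly through the device array, which is the indirection the TODO proposes to remove later.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_INST 2

struct dev_sketch;

/* Hypothetical instance structure with a back-pointer and its own index. */
struct vinst_sketch {
    struct dev_sketch *adev;    /* owning device */
    int inst;                   /* index into adev->vcn_inst[] */
    int pause_state;
};

struct dev_sketch {
    struct vinst_sketch vcn_inst[SKETCH_MAX_INST];
};

/* Old shape: device plus instance number. */
static int sketch_pause_old(struct dev_sketch *adev, int inst_idx, int new_state)
{
    assert(adev != NULL && inst_idx >= 0 && inst_idx < SKETCH_MAX_INST);
    adev->vcn_inst[inst_idx].pause_state = new_state;
    return 0;
}

/* New shape: instance only. The body still goes through adev, as the TODO notes. */
static int sketch_pause_new(struct vinst_sketch *vinst, int new_state)
{
    struct dev_sketch *adev;

    assert(vinst != NULL && vinst->adev != NULL);
    adev = vinst->adev;
    adev->vcn_inst[vinst->inst].pause_state = new_state;
    return 0;
}
```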
3 months ago drm/amdgpu/vcn5.0.1: convert internal functions to use vcn_inst
Alex Deucher [Tue, 10 Dec 2024 19:00:57 +0000 (14:00 -0500)] 
drm/amdgpu/vcn5.0.1: convert internal functions to use vcn_inst

Pass the vcn instance structure to these functions rather
than adev and the instance number.

TODO: clean up the function internals to use the vinst state
directly rather than accessing it indirectly via adev->vcn.inst[].

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdgpu/vcn5.0.0: convert internal functions to use vcn_inst
Alex Deucher [Fri, 22 Nov 2024 23:07:03 +0000 (18:07 -0500)] 
drm/amdgpu/vcn5.0.0: convert internal functions to use vcn_inst

Pass the vcn instance structure to these functions rather
than adev and the instance number.

TODO: clean up the function internals to use the vinst state
directly rather than accessing it indirectly via adev->vcn.inst[].

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdgpu/vcn4.0.5: convert internal functions to use vcn_inst
Alex Deucher [Fri, 22 Nov 2024 22:38:48 +0000 (17:38 -0500)] 
drm/amdgpu/vcn4.0.5: convert internal functions to use vcn_inst

Pass the vcn instance structure to these functions rather
than adev and the instance number.

TODO: clean up the function internals to use the vinst state
directly rather than accessing it indirectly via adev->vcn.inst[].

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdgpu/vcn4.0.3: convert internal functions to use vcn_inst
Alex Deucher [Fri, 22 Nov 2024 22:01:49 +0000 (17:01 -0500)] 
drm/amdgpu/vcn4.0.3: convert internal functions to use vcn_inst

Pass the vcn instance structure to these functions rather
than adev and the instance number.

TODO: clean up the function internals to use the vinst state
directly rather than accessing it indirectly via adev->vcn.inst[].

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdgpu/vcn4.0: convert internal functions to use vcn_inst
Alex Deucher [Fri, 22 Nov 2024 20:38:37 +0000 (15:38 -0500)] 
drm/amdgpu/vcn4.0: convert internal functions to use vcn_inst

Pass the vcn instance structure to these functions rather
than adev and the instance number.

TODO: clean up the function internals to use the vinst state
directly rather than accessing it indirectly via adev->vcn.inst[].

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdgpu/vcn2.5: convert internal functions to use vcn_inst
Alex Deucher [Fri, 22 Nov 2024 18:53:40 +0000 (13:53 -0500)] 
drm/amdgpu/vcn2.5: convert internal functions to use vcn_inst

Pass the vcn instance structure to these functions rather
than adev and the instance number.

TODO: clean up the function internals to use the vinst state
directly rather than accessing it indirectly via adev->vcn.inst[].

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdgpu/vcn2.0: convert internal functions to use vcn_inst
Alex Deucher [Fri, 22 Nov 2024 18:30:16 +0000 (13:30 -0500)] 
drm/amdgpu/vcn2.0: convert internal functions to use vcn_inst

Pass the vcn instance structure to these functions rather
than adev and the instance number.

TODO: clean up the function internals to use the vinst state
directly rather than accessing it indirectly via adev->vcn.inst[].

v2: index instances directly on vcn1.0 and 2.0 to make
it clear that they only support a single instance (Lijo)

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdgpu/vcn1.0: convert internal functions to use vcn_inst
Alex Deucher [Tue, 19 Nov 2024 21:51:36 +0000 (16:51 -0500)] 
drm/amdgpu/vcn1.0: convert internal functions to use vcn_inst

Pass the vcn instance structure to these functions rather
than adev and the instance number.

TODO: clean up the function internals to use the vinst state
directly rather than accessing it indirectly via adev->vcn.inst[].

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdgpu/vcn3.0: convert internal functions to use vcn_inst
Alex Deucher [Tue, 19 Nov 2024 21:10:46 +0000 (16:10 -0500)] 
drm/amdgpu/vcn3.0: convert internal functions to use vcn_inst

Pass the vcn instance structure to these functions rather
than adev and the instance number.

TODO: clean up the function internals to use the vinst state
directly rather than accessing it indirectly via adev->vcn.inst[].

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdgpu/vcn: switch vcn helpers to be instance based
Alex Deucher [Fri, 15 Nov 2024 22:44:01 +0000 (17:44 -0500)] 
drm/amdgpu/vcn: switch vcn helpers to be instance based

Pass the instance to the helpers.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdgpu/vcn: move more instanced data to vcn_instance
Alex Deucher [Fri, 15 Nov 2024 21:19:23 +0000 (16:19 -0500)] 
drm/amdgpu/vcn: move more instanced data to vcn_instance

Move more per instance data into the per instance structure.

v2: index instances directly on vcn1.0 and 2.0 to make
it clear that they only support a single instance (Lijo)
v3: fix typo on vcn 2.5

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com> (v2)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdgpu/vcn: make powergating status per instance
Alex Deucher [Wed, 13 Nov 2024 20:28:41 +0000 (15:28 -0500)] 
drm/amdgpu/vcn: make powergating status per instance

Store it per instance so we can track it per instance.

v2: index instances directly on vcn1.0 and 2.0 to make
it clear that they only support a single instance (Lijo)

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdgpu/vcn: switch work handler to be per instance
Alex Deucher [Wed, 13 Nov 2024 19:43:15 +0000 (14:43 -0500)] 
drm/amdgpu/vcn: switch work handler to be per instance

Have a separate work handler for each VCN instance. This
paves the way for per instance VCN power gating at runtime.

v2: index instances directly on vcn1.0 and 2.0 to make
it clear that they only support a single instance (Lijo)

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdgpu/vcn5.0.1: split code along instances
Alex Deucher [Tue, 10 Dec 2024 17:34:54 +0000 (12:34 -0500)] 
drm/amdgpu/vcn5.0.1: split code along instances

Split the code on a per instance basis.  This will allow
us to use the per instance functions in the future to
handle more things per instance.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdgpu/vcn5.0.0: split code along instances
Alex Deucher [Wed, 13 Nov 2024 17:27:45 +0000 (12:27 -0500)] 
drm/amdgpu/vcn5.0.0: split code along instances

Split the code on a per instance basis.  This will allow
us to use the per instance functions in the future to
handle more things per instance.

v2: squash in fix for stop() from Boyuan

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdgpu/vcn4.0.5: split code along instances
Alex Deucher [Wed, 13 Nov 2024 17:21:18 +0000 (12:21 -0500)] 
drm/amdgpu/vcn4.0.5: split code along instances

Split the code on a per instance basis.  This will allow
us to use the per instance functions in the future to
handle more things per instance.

v2: squash in fix for stop() from Boyuan

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdgpu/vcn4.0.3: split code along instances
Alex Deucher [Wed, 13 Nov 2024 17:13:15 +0000 (12:13 -0500)] 
drm/amdgpu/vcn4.0.3: split code along instances

Split the code on a per instance basis.  This will allow
us to use the per instance functions in the future to
handle more things per instance.

v2: squash in fix for stop() from Boyuan

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdgpu/vcn4.0: split code along instances
Alex Deucher [Wed, 13 Nov 2024 17:01:44 +0000 (12:01 -0500)] 
drm/amdgpu/vcn4.0: split code along instances

Split the code on a per instance basis.  This will allow
us to use the per instance functions in the future to
handle more things per instance.

v2: squash in fix for stop() from Boyuan

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdgpu/vcn3.0: split code along instances
Alex Deucher [Wed, 13 Nov 2024 16:47:49 +0000 (11:47 -0500)] 
drm/amdgpu/vcn3.0: split code along instances

Split the code on a per instance basis.  This will allow
us to use the per instance functions in the future to
handle more things per instance.

v2: squash in fix for stop() from Boyuan

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdgpu/vcn2.5: fix VCN stop logic
Alex Deucher [Mon, 24 Feb 2025 16:13:27 +0000 (11:13 -0500)] 
drm/amdgpu/vcn2.5: fix VCN stop logic

Need to make sure we call amdgpu_dpm_enable_vcn()
in vcn_v2_5_stop() at the end if there are errors
or DPG is enabled.

Fixes: ebc25499de12 ("drm/amdgpu/vcn2.5: split code along instances")
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Suggested-by: Boyuan Zhang <boyuan.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
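
The control flow that the fix above asks for can be sketched with a single exit label. This is a minimal illustration, not the driver code: the context struct and the sketch_dpm_enable_vcn() helper are hypothetical stand-ins for the real state and for the amdgpu_dpm_enable_vcn() call named in the commit. It shows the DPG path and the error path both reaching the same final power-management step as the normal path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct stop_ctx_sketch {
    bool dpg_enabled;
    bool hw_error;
    int dpm_off_calls;      /* times the "disable VCN" DPM step ran */
};

/* Hypothetical stand-in for the DPM call named in the commit. */
static void sketch_dpm_enable_vcn(struct stop_ctx_sketch *c, bool enable)
{
    if (!enable)
        c->dpm_off_calls++;
}

static int sketch_vcn_stop(struct stop_ctx_sketch *c)
{
    int r = 0;

    assert(c != NULL);
    if (c->dpg_enabled)
        goto done;          /* DPG stop path finishes early */
    if (c->hw_error) {
        r = -1;             /* error path */
        goto done;
    }
    /* normal teardown would go here */
done:
    sketch_dpm_enable_vcn(c, false);    /* reached on every path */
    return r;
}
```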
3 months ago drm/amdgpu: increase AMDGPU_MAX_RINGS
Tao Zhou [Tue, 25 Feb 2025 11:18:12 +0000 (19:18 +0800)] 
drm/amdgpu: increase AMDGPU_MAX_RINGS

Increase it since a cper ring is introduced.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdgpu: Fix correct parameter desc for VCN idle check functions
Srinivasan Shanmugam [Mon, 24 Feb 2025 11:56:22 +0000 (17:26 +0530)] 
drm/amdgpu: Fix correct parameter desc for VCN idle check functions

Fixes the kdoc for the following VCN idle check functions by updating
the parameter description from 'handle' to 'ip_block':

- vcn_v4_0_is_idle
- vcn_v4_0_3_is_idle
- vcn_v4_0_5_is_idle
- vcn_v5_0_1_is_idle

Fixes the below with gcc W=1:
drivers/gpu/drm/amd/amdgpu/vcn_v5_0_1.c:935: warning: Function parameter or struct member 'ip_block' not described in 'vcn_v5_0_1_is_idle'
drivers/gpu/drm/amd/amdgpu/vcn_v5_0_1.c:935: warning: Excess function parameter 'handle' description in 'vcn_v5_0_1_is_idle'
drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c:1972: warning: Function parameter or struct member 'ip_block' not described in 'vcn_v4_0_is_idle'
drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c:1972: warning: Excess function parameter 'handle' description in 'vcn_v4_0_is_idle'
drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c:1583: warning: Function parameter or struct member 'ip_block' not described in 'vcn_v4_0_3_is_idle'
drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c:1583: warning: Excess function parameter 'handle' description in 'vcn_v4_0_3_is_idle'
drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c:1200: warning: Function parameter or struct member 'ip_block' not described in 'vcn_v5_0_0_is_idle'
drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c:1200: warning: Excess function parameter 'handle' description in 'vcn_v5_0_0_is_idle'
drivers/gpu/drm/amd/amdgpu/vcn_v4_0_5.c:1460: warning: Function parameter or struct member 'ip_block' not described in 'vcn_v4_0_5_is_idle'
drivers/gpu/drm/amd/amdgpu/vcn_v4_0_5.c:1460: warning: Excess function parameter 'handle' description in 'vcn_v4_0_5_is_idle'

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
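
For readers unfamiliar with the warnings quoted above, here is a small sketch of a kernel-doc comment whose parameter tag matches the function signature. The struct and function are hypothetical; only the comment layout follows the kernel-doc convention. A comment that still described a parameter named handle while the function took ip_block would trigger both the "not described" and the "Excess function parameter" warnings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the IP block structure. */
struct ip_block_sketch {
    unsigned int busy_mask;     /* non-zero while any engine is busy */
};

/**
 * sketch_is_idle - check whether the block is idle
 *
 * @ip_block: pointer to the IP block being queried
 *
 * The tag name above must match the parameter name in the signature.
 *
 * Return: true if no engine is busy.
 */
static bool sketch_is_idle(const struct ip_block_sketch *ip_block)
{
    assert(ip_block != NULL);
    return ip_block->busy_mask == 0;
}
```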
3 months ago drm/amdgpu: init return value in amdgpu_ttm_clear_buffer
Pierre-Eric Pelloux-Prayer [Thu, 20 Feb 2025 13:41:59 +0000 (14:41 +0100)] 
drm/amdgpu: init return value in amdgpu_ttm_clear_buffer

Otherwise an uninitialized value can be returned if
amdgpu_res_cleared returns true for all regions.

Possibly closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3812

Fixes: a68c7eaa7a8f ("drm/amdgpu: Enable clear page functionality")
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
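
The class of bug fixed above can be illustrated with a loop that may skip every iteration. The sketch below is hypothetical (the helper names and the region array are invented for illustration) and shows only the pattern: if every region is already cleared, the loop never assigns the result, so it must be initialised up front.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for submitting one clear operation. */
static int sketch_issue_clear(size_t region, size_t *clears_issued)
{
    (void)region;
    (*clears_issued)++;
    return 0;
}

static int sketch_clear_buffer(const bool *already_cleared, size_t n,
                               size_t *clears_issued)
{
    int r = 0;      /* without "= 0", r is indeterminate if every region is skipped */
    size_t i;

    assert(already_cleared != NULL && clears_issued != NULL);
    *clears_issued = 0;
    for (i = 0; i < n; i++) {
        if (already_cleared[i])
            continue;       /* nothing to do for this region */
        r = sketch_issue_clear(i, clears_issued);
        if (r)
            break;
    }
    return r;
}
```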
3 months ago drm/amdgpu: Change page/record number calculation based on nps
ganglxie [Mon, 24 Feb 2025 07:06:51 +0000 (15:06 +0800)] 
drm/amdgpu: Change page/record number calculation based on nps

Save only one record to conserve EEPROM space, and compute
bad_page_num = pa_rec_num + mca_rec_num*16

Signed-off-by: ganglxie <ganglxie@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
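
The formula in the commit above works out as follows. The function and macro names are hypothetical; the factor of 16 is taken directly from the commit message, and reading it as "pages covered by one MCA record" is an assumption drawn from the formula, not something the message states.

```c
#include <assert.h>

/* Factor taken from the formula in the commit message. */
#define SKETCH_PAGES_PER_MCA_REC 16u

static unsigned int sketch_bad_page_num(unsigned int pa_rec_num,
                                        unsigned int mca_rec_num)
{
    return pa_rec_num + mca_rec_num * SKETCH_PAGES_PER_MCA_REC;
}
```

For example, 3 PA records and 2 MCA records give 3 + 2*16 = 35 bad pages.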
3 months ago drm/amdgpu: Refine bad page adding
ganglxie [Mon, 24 Feb 2025 07:03:05 +0000 (15:03 +0800)] 
drm/amdgpu: Refine bad page adding

Bad page adding can be simpler with NPS info.

Signed-off-by: ganglxie <ganglxie@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amd/pm: Get metrics table version for smu_v13_0_12
Asad Kamal [Sat, 22 Feb 2025 10:11:35 +0000 (18:11 +0800)] 
drm/amd/pm: Get metrics table version for smu_v13_0_12

Get metrics table version for smu_v13_0_12 and populate pm_metrics

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdgpu: update SDMA sysfs reset mask in late_init
Jesse.zhang@amd.com [Fri, 21 Feb 2025 10:18:07 +0000 (18:18 +0800)] 
drm/amdgpu: update SDMA sysfs reset mask in late_init

- Add the `sdma_v4_4_2_update_reset_mask` function to update the reset mask.
- Move the sysfs reset mask update to the `late_init` stage so that SMU initialization
  and capability setup are complete before the SDMA reset capability is checked.
- For IP versions 9.4.3 and 9.4.4, enable per-queue reset if the MEC firmware version is at least 0xb0 and PMFW supports queue reset.
- Add a TODO comment for future support of per-queue reset for IP version 9.5.0.

This change ensures that per-queue reset is only enabled when the MEC and PMFW support it.

v2: fix ip version (9.5.4 -> 9.5.0)(Lijo)

Suggested-by: Jonathan Kim <Jonathan.Kim@amd.com>
Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
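
The enable condition listed in the commit above can be written as one predicate. This is a sketch under stated assumptions: the version-encoding macro and the function name are hypothetical and do not reproduce the kernel's own IP_VERSION() macro; only the version numbers and the 0xb0 threshold come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical encoding of an IP version; not the kernel's IP_VERSION(). */
#define SKETCH_IP_VER(maj, min, rev) (((maj) << 16) | ((min) << 8) | (rev))
#define SKETCH_MIN_MEC_FW 0xb0u     /* threshold quoted in the commit */

static bool sketch_per_queue_reset_ok(uint32_t ip_ver, uint32_t mec_fw_ver,
                                      bool pmfw_queue_reset)
{
    switch (ip_ver) {
    case SKETCH_IP_VER(9, 4, 3):
    case SKETCH_IP_VER(9, 4, 4):
        /* both firmware conditions must hold */
        return mec_fw_ver >= SKETCH_MIN_MEC_FW && pmfw_queue_reset;
    case SKETCH_IP_VER(9, 5, 0):
        return false;   /* left as a TODO in the commit */
    default:
        return false;
    }
}
```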
3 months ago drm/amdgpu: Set CPER enabled flag after ring initiailized
Xiang Liu [Mon, 24 Feb 2025 15:01:06 +0000 (23:01 +0800)] 
drm/amdgpu: Set CPER enabled flag after ring initiailized

Set cper.enabled to true only after the CPER ring is successfully
created.

Signed-off-by: Xiang Liu <xiang.liu@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdgpu: Save nps to eeprom
ganglxie [Mon, 24 Feb 2025 03:17:33 +0000 (11:17 +0800)] 
drm/amdgpu: Save nps to eeprom

Saving NPS info together with the bad page makes bad page parsing more efficient.

Signed-off-by: ganglxie <ganglxie@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdgpu: Check if CPER enabled when generating CPER
Xiang Liu [Mon, 24 Feb 2025 13:10:24 +0000 (21:10 +0800)] 
drm/amdgpu: Check if CPER enabled when generating CPER

When CPER is disabled, generating a CPER record without checking the
enabled flag causes a kernel NULL pointer dereference.

Signed-off-by: Xiang Liu <xiang.liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
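
The guard described in the commit above can be sketched like this. Everything here is a hypothetical stand-in: the struct, the single-slot "ring", and the error value are invented for illustration. It assumes, as the commit implies, that the ring pointer is NULL whenever the enabled flag is false, so checking the flag first avoids the dereference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct cper_sketch {
    bool enabled;
    int *ring;      /* NULL until the ring is created */
};

static int sketch_cper_init(struct cper_sketch *c, int *ring_storage)
{
    assert(c != NULL);
    if (ring_storage == NULL)
        return -1;          /* ring creation failed: enabled stays false */
    c->ring = ring_storage;
    c->enabled = true;      /* set only after the ring exists */
    return 0;
}

static int sketch_cper_generate(struct cper_sketch *c, int record)
{
    assert(c != NULL);
    if (!c->enabled)
        return -1;          /* bail out instead of touching a NULL ring */
    *c->ring = record;
    return 0;
}
```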
3 months ago drm/amd/pm: handling of set performance level
Mangesh Gadre [Fri, 21 Feb 2025 09:38:21 +0000 (17:38 +0800)] 
drm/amd/pm: handling of set performance level

Display the performance level when setting it is not supported.

Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdgpu: simplify xgmi peer info calls
Jonathan Kim [Mon, 10 Feb 2025 18:15:48 +0000 (13:15 -0500)] 
drm/amdgpu: simplify xgmi peer info calls

Deprecate KFD XGMI peer info calls in favour of calling directly from
simplified XGMI peer info functions.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amdkfd: enable cooperative launch on gfx12
Jonathan Kim [Fri, 21 Feb 2025 14:39:27 +0000 (09:39 -0500)] 
drm/amdkfd: enable cooperative launch on gfx12

Even though GWS no longer exists, SW sets the legacy GWS size to
maintain runtime usage for cooperative launch.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Acked-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Harish Kasiviswanathan <harish.kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amd/display: Promote DAL to 3.2.322
Taimur Hassan [Sun, 16 Feb 2025 21:27:26 +0000 (16:27 -0500)] 
drm/amd/display: Promote DAL to 3.2.322

- Disable PSR-SU on eDP panels
- Fix HPD after GPU reset
- Fixes on dcn4x init, DML2 state policy on DCN36
- Various minor logic fixes

Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amd/display: [FW Promotion] Release 0.0.255.0
Taimur Hassan [Sun, 16 Feb 2025 20:06:06 +0000 (15:06 -0500)] 
drm/amd/display: [FW Promotion] Release 0.0.255.0

Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amd/display: Fix HPD after gpu reset
Roman Li [Wed, 12 Feb 2025 19:49:36 +0000 (14:49 -0500)] 
drm/amd/display: Fix HPD after gpu reset

[Why]
DC is not using amdgpu_irq_get/put to manage the HPD interrupt refcounts.
So when amdgpu_irq_gpu_reset_resume_helper() reprograms all of the IRQs,
HPD gets disabled.

[How]
Use amdgpu_irq_get/put() for HPD init/fini in DM in order to sync refcounts

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Roman Li <Roman.Li@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
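
The refcount problem described in the commit above can be modelled in a few lines. The structure and function names are hypothetical and the model is deliberately simplified: it assumes only that the reset-resume step rebuilds the hardware enable state from the refcount. An interrupt that was enabled without taking a reference is then turned off by the resume, while one enabled through get/put survives.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct irq_src_sketch {
    unsigned int refcount;
    bool hw_enabled;
};

static void sketch_irq_get(struct irq_src_sketch *s)
{
    assert(s != NULL);
    if (s->refcount++ == 0)
        s->hw_enabled = true;
}

static void sketch_irq_put(struct irq_src_sketch *s)
{
    assert(s != NULL && s->refcount > 0);
    if (--s->refcount == 0)
        s->hw_enabled = false;
}

/* After a reset, hardware state is rebuilt purely from the refcounts. */
static void sketch_reset_resume(struct irq_src_sketch *s)
{
    assert(s != NULL);
    s->hw_enabled = s->refcount > 0;
}
```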
3 months ago drm/amd/display: stop DML2 from removing pipes based on planes
Mike Katsnelson [Thu, 13 Feb 2025 16:52:32 +0000 (11:52 -0500)] 
drm/amd/display: stop DML2 from removing pipes based on planes

[Why]
Transitioning from low to high resolutions at high refresh rates caused grey corruption.
During the transition state, there is a period where plane size is based on low resolution
state and ODM slices are based on high resolution state, causing the entire plane to be
contained in one ODM slice. DML2 would turn off the pipe for the ODM slice with no plane,
causing an underflow since the pixel rate for the higher resolution cannot be supported on
one pipe. This change stops DML2 from turning off pipes that are mapped to an ODM slice
with no plane. This is possible to do without negative consequences because pipes can now
take the minimum viewport and draw with zero recout size, removing the need to have the
pipe turned off.

[How]
In map_pipes_from_plane(), remove "check" that skips ODM slices that are not covered by
the plane. This prevents the pipes for those ODM slices from being freed.

Reviewed-by: Ovidiu Bunea <ovidiu.bunea@amd.com>
Signed-off-by: Mike Katsnelson <mike.katsnelson@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amd/display: Increase halt timeout for DMCUB to 1s
Nicholas Kazlauskas [Thu, 13 Feb 2025 22:40:29 +0000 (17:40 -0500)] 
drm/amd/display: Increase halt timeout for DMCUB to 1s

[Why]
If we soft reset before halt finishes and there are outstanding
memory transactions then the memory interface may produce unexpected
results, such as out of order transactions when the firmware next runs.

These can manifest as random or unexpected load/store violations.

[How]
Increase the timeout before soft reset to ensure the DMCUB has quiesced.

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
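
The effect of the longer timeout can be shown with a toy polling loop. This is a simulation under stated assumptions, not the DMCUB code: the struct, the one-microsecond countdown, and the shorter timeout used for comparison are all hypothetical, and the sketch assumes the soft reset proceeds whether or not the halt completed in time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct dmcub_sketch {
    unsigned int us_until_halted;   /* simulated time left until halt completes */
    bool reset_while_busy;          /* true if reset hit an unfinished halt */
};

static bool sketch_wait_for_halt(struct dmcub_sketch *d, unsigned int timeout_us)
{
    unsigned int waited;

    assert(d != NULL);
    for (waited = 0; waited < timeout_us; waited++) {
        if (d->us_until_halted == 0)
            return true;
        d->us_until_halted--;       /* stand-in for a 1 us delay */
    }
    return d->us_until_halted == 0;
}

static bool sketch_halt_then_reset(struct dmcub_sketch *d, unsigned int timeout_us)
{
    bool quiesced = sketch_wait_for_halt(d, timeout_us);

    /* assumption: the soft reset is issued either way */
    d->reset_while_busy = !quiesced;
    return quiesced;
}
```

With a simulated halt that needs 500000 us, a 100000 us timeout resets a still-busy device, while a 1000000 us (1 s) timeout lets it quiesce first.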
3 months ago drm/amd/display: Remove unused header
Krunoslav Kovac [Fri, 14 Feb 2025 00:14:59 +0000 (19:14 -0500)] 
drm/amd/display: Remove unused header

[Why]
Removes unused header

Reviewed-by: Samson Tam <samson.tam@amd.com>
Signed-off-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amd/display: handle max_downscale_src_width fail check
Yihan Zhu [Wed, 12 Feb 2025 20:17:56 +0000 (15:17 -0500)] 
drm/amd/display: handle max_downscale_src_width fail check

[WHY]
If the max_downscale_src_width check fails, we exit early from the TAP calculation and leave
a NULL value in the scaling data structure, which causes a divide-by-zero in DML validation.

[HOW]
Set the default TAP values before the early exit in get_optimal_number_of_taps that is taken
when the max downscale limit is exceeded.

Reviewed-by: Samson Tam <samson.tam@amd.com>
Signed-off-by: Yihan Zhu <Yihan.Zhu@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
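
The early-exit fix above follows a common pattern: populate safe defaults before returning failure, so later code never divides by a zeroed field. The sketch below is hypothetical throughout (the struct, the default value of 1, and the placeholder result of 4 are invented); only the shape of the control flow reflects the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct taps_sketch {
    unsigned int h_taps;
    unsigned int v_taps;
};

/* Hypothetical defaults; the real values are not given in the commit. */
static void sketch_set_default_taps(struct taps_sketch *t)
{
    t->h_taps = 1;
    t->v_taps = 1;
}

static bool sketch_get_optimal_taps(unsigned int src_width,
                                    unsigned int max_downscale_src_width,
                                    struct taps_sketch *t)
{
    assert(t != NULL);
    if (max_downscale_src_width != 0 && src_width > max_downscale_src_width) {
        sketch_set_default_taps(t);     /* never leave the taps zeroed */
        return false;
    }
    t->h_taps = 4;      /* placeholder for the real calculation */
    t->v_taps = 4;
    return true;
}
```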
3 months ago drm/amd/display: Update FIXED_VS Link Rate Toggle Workaround Usage
Michael Strauss [Fri, 24 Jan 2025 20:02:27 +0000 (15:02 -0500)] 
drm/amd/display: Update FIXED_VS Link Rate Toggle Workaround Usage

[WHY]
Previously the 128b/132b LTTPR support DPCD field was used to decide if
FIXED_VS training sequence required a rate toggle before initiating LT.

When running DP2.1 4.9.x.x compliance tests, emulated LTTPRs can report
no-128b/132b support which is then forwarded by the FIXED_VS retimer.
As a result this test exposes the rate toggle again, erroneously causing
failures as certain compliance sinks don't expect this behaviour.

[HOW]
Add new DPCD register defines/reads to read LTTPR IEEE OUI and device ID.

Decide whether to perform the rate toggle based on the LTTPR's IEEE OUI
which guarantees that we only perform the toggle on affected retimers.

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Michael Strauss <michael.strauss@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
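
Keying the workaround on the retimer's identity rather than on a capability bit can be sketched as a three-byte comparison. The OUI bytes below are a placeholder and not a real vendor identifier, and the function name is hypothetical; the sketch only shows that a device reporting any other OUI is never toggled, regardless of what it reports about 128b/132b support.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Placeholder bytes for illustration; not a real IEEE OUI. */
static const unsigned char sketch_affected_oui[3] = { 0xAA, 0xBB, 0xCC };

static bool sketch_needs_rate_toggle(const unsigned char lttpr_oui[3])
{
    assert(lttpr_oui != NULL);
    return memcmp(lttpr_oui, sketch_affected_oui, 3) == 0;
}
```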
3 months ago drm/amd/display: fix dcn4x init failed
Charlene Liu [Thu, 13 Feb 2025 17:37:10 +0000 (12:37 -0500)] 
drm/amd/display: fix dcn4x init failed

[why]
Init failed because the cmdtable was not created.
Switch to the atombios cmdtable as the default.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amd/display: Temporarily disable hostvm on DCN31
Aurabindo Pillai [Mon, 20 Jan 2025 20:27:23 +0000 (15:27 -0500)] 
drm/amd/display: Temporarily disable hostvm on DCN31

With HostVM enabled, DCN31 fails to pass validation for 3x4k60. Some Linux
userspace does not downgrade one of the monitors to 4k30, and the result
is that the monitor does not light up. Disable it until the bandwidth
calculation failure is resolved.

Reviewed-by: Sun peng Li <sunpeng.li@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months ago drm/amd/display: ACPI Re-timer Programming
Rafal Ostrowski [Wed, 12 Feb 2025 07:08:07 +0000 (08:08 +0100)] 
drm/amd/display: ACPI Re-timer Programming

[Why]
We must implement an ACPI re-timer programming interface and notify
ACPI driver whenever a PHY transition is about to take place.

Because some trace lengths on certain platforms are very long,
a re-timer may need to be programmed whenever a PHY transition
takes place. The implementation of this re-timer programming interface
will notify ACPI driver that PHY transition is taking place and it
will trigger the re-timer as needed.

First we need to gather retimer information from ACPI interface.

Then, in the PRE case, the re-timer interface needs to be called before we call
transmitter ENABLE.
In the POST case, it has to be called after we call transmitter DISABLE.

[How]
Implemented ACPI retimer programming interface.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Rafal Ostrowski <rostrows@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
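
The PRE and POST ordering described in the commit above can be made concrete with a small call-order log. All names are hypothetical and the "steps" are just strings appended to a buffer; the sketch shows only the sequencing, with the re-timer notification placed before transmitter ENABLE and after transmitter DISABLE.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct seq_log_sketch {
    char buf[64];
};

static void sketch_log_step(struct seq_log_sketch *l, const char *step)
{
    assert(l != NULL && step != NULL);
    /* append without overflowing the fixed buffer */
    strncat(l->buf, step, sizeof(l->buf) - strlen(l->buf) - 1);
}

static void sketch_link_enable(struct seq_log_sketch *l)
{
    sketch_log_step(l, "retimer_pre,");     /* PRE: before transmitter ENABLE */
    sketch_log_step(l, "tx_enable,");
}

static void sketch_link_disable(struct seq_log_sketch *l)
{
    sketch_log_step(l, "tx_disable,");
    sketch_log_step(l, "retimer_post,");    /* POST: after transmitter DISABLE */
}
```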
3 months agodrm/amd/display: Refactor DCN4x and related code
Patel, Swapnil [Sun, 9 Feb 2025 16:42:23 +0000 (11:42 -0500)] 
drm/amd/display: Refactor DCN4x and related code

[why & how]
Refactor existing code related to DCN4x for better code sharing with
other modules.

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Signed-off-by: Swapnil Patel <Swapnil.Patel@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agodrm/amd/display: add a quirk to enable eDP0 on DP1
Yilin Chen [Fri, 7 Feb 2025 20:26:19 +0000 (15:26 -0500)] 
drm/amd/display: add a quirk to enable eDP0 on DP1

[why]
Some board designs have eDP0 connected to DP1 and need a way to enable
the support_edp0_on_dp1 flag; otherwise eDP-related features cannot work.

[how]
Do a DMI check during dm initialization to identify systems that
require support_edp0_on_dp1. Optimize the quirk table with callback
functions that set quirk entries, so that retrieve_dmi_info can set
quirks according to those entries.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Yilin Chen <Yilin.Chen@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agodrm/amd/display: replace dio encoder access
Peichen Huang [Fri, 17 Jan 2025 02:48:11 +0000 (10:48 +0800)] 
drm/amd/display: replace dio encoder access

[WHY]
replace dio encoder access to work with new dio encoder
assignment.

[HOW]
1. before validation, access dio encoder by get_temp_dio_link_enc()
2. after validation, access dio encoder through pipe_ctx->link_res

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Reviewed-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com>
Signed-off-by: Peichen Huang <PeiChen.Huang@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agodrm/amd/display: Add SPL namespace
Navid Assadian [Mon, 20 Jan 2025 17:35:07 +0000 (12:35 -0500)] 
drm/amd/display: Add SPL namespace

[Why]
In order to avoid component conflicts, spl namespace is needed.

[How]
Add an SPL namespace to the public API so that each user of SPL can have
their own namespace.

Signed-off-by: Navid Assadian <Navid.Assadian@amd.com>
Reviewed-by: Samson Tam <Samson.Tam@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agodrm/amd/display: Fix unit test failure
Samson Tam [Tue, 7 Jan 2025 19:17:15 +0000 (14:17 -0500)] 
drm/amd/display: Fix unit test failure

[Why]
Some unit tests use a scaling ratio so large that, when we
 calculate the optimal number of taps, max_taps is negative.
 A recent change made max_taps a uint instead of an int, so
 max_taps now wraps and is positive.  That changed the
 behaviour from returning false to returning true, which
 breaks the unit test check.

[How]
Add a check to prevent max_taps from wrapping and set it to 0
 instead
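
A minimal sketch of the wrap and of the guard, not the SPL code itself. The helper names and the `available`/`required` parameters are hypothetical; the real calculation of max_taps is not shown in this message.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Unguarded variant: with unsigned arithmetic, a result that would be
 * negative wraps around to a very large positive value.
 */
static uint32_t max_taps_unchecked(uint32_t available, uint32_t required)
{
    return available - required; /* wraps when required > available */
}

/* Guarded variant: clamp to 0 instead of wrapping. */
static uint32_t max_taps_clamped(uint32_t available, uint32_t required)
{
    uint32_t result = (required > available) ? 0 : available - required;

    assert(result <= available); /* a wrapped value would violate this */
    return result;
}
```

With the clamp in place, a caller that rejects a configuration when max_taps is too small sees 0 rather than a huge value, which matches the behaviour from before the type change.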

Signed-off-by: Samson Tam <Samson.Tam@amd.com>
Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agodrm/amd/display: fix check for identity ratio
Samson Tam [Tue, 21 Jan 2025 16:01:47 +0000 (11:01 -0500)] 
drm/amd/display: fix check for identity ratio

[Why]
IDENTITY_RATIO check uses 2 bits for integer, which only allows
 checking downscale ratios up to 3.  But we support up to 6x
 downscale

[How]
Update IDENTITY_RATIO to check 3 bits for integer
Add ASSERT to catch if we downscale more than 6x
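
To show why 2 integer bits are not enough, here is a sketch under an assumed fixed-point layout with 19 fractional bits. The helpers (`truncate_ratio`, `is_identity`, `ratio_from_int`) and the `FRAC_BITS` value are assumptions for illustration, not the actual IDENTITY_RATIO macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FRAC_BITS 19u /* assumed number of fractional bits */

/* Keep only `int_bits` integer bits plus the fractional bits. */
static uint32_t truncate_ratio(uint32_t ratio_fixed, unsigned int int_bits)
{
    uint32_t mask = (1u << (int_bits + FRAC_BITS)) - 1u;

    return ratio_fixed & mask;
}

/* A ratio counts as identity when its truncated value is exactly 1.0. */
static bool is_identity(uint32_t ratio_fixed, unsigned int int_bits)
{
    return truncate_ratio(ratio_fixed, int_bits) == (1u << FRAC_BITS);
}

/* Build a fixed-point value from a whole-number downscale ratio. */
static uint32_t ratio_from_int(uint32_t whole)
{
    assert(whole <= 6); /* mirrors the added ASSERT: at most 6x downscale */
    return whole << FRAC_BITS;
}
```

In this layout a 5x ratio is binary 101 in the integer part. Truncating to 2 integer bits leaves 01, so the value is mistaken for 1.0; with 3 integer bits the ratio is kept intact and the check correctly fails.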

Signed-off-by: Samson Tam <Samson.Tam@amd.com>
Reviewed-by: Jun Lei <jun.lei@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agodrm/amd/display: Fix mismatch type comparison
Assadian, Navid [Thu, 19 Dec 2024 22:19:09 +0000 (17:19 -0500)] 
drm/amd/display: Fix mismatch type comparison

The mismatch type comparison/assignment may cause data loss. Since the
values are always non-negative, it is safe to use unsigned variables to
resolve the mismatch.

Signed-off-by: Navid Assadian <navid.assadian@amd.com>
Reviewed-by: Joshua Aberback <joshua.aberback@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agodrm/amd/display: Add opp recout adjustment
Navid Assadian [Mon, 20 Jan 2025 17:35:23 +0000 (12:35 -0500)] 
drm/amd/display: Add opp recout adjustment

[Why]
For subsampled YUV output formats, more pixels can get fetched and be
used for scaling.

[How]
Add the adjustment to the calculated recout, so the viewport covers the
corresponding pixels on the source plane.

Signed-off-by: Navid Assadian <Navid.Assadian@amd.com>
Reviewed-by: Samson Tam <Samson.Tam@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agodrm/amd/display: Fix mismatch type comparison in custom_float
Samson Tam [Tue, 7 Jan 2025 19:16:04 +0000 (14:16 -0500)] 
drm/amd/display: Fix mismatch type comparison in custom_float

[Why & How]
Passing uint into uchar function param.  Pass uint instead

Signed-off-by: Samson Tam <Samson.Tam@amd.com>
Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agodrm/amd/display: Apply DCN35 DML2 state policy for DCN36 too
Nicholas Kazlauskas [Fri, 24 Jan 2025 14:59:37 +0000 (09:59 -0500)] 
drm/amd/display: Apply DCN35 DML2 state policy for DCN36 too

[Why]
DCN36 should inherit the same policy as DCN35 for DML2.

[How]
Add it to the list of checks in translation helper.

Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agodrm/amd/display: update incorrect cursor buffer size
Alex Hung [Tue, 11 Feb 2025 20:43:48 +0000 (13:43 -0700)] 
drm/amd/display: update incorrect cursor buffer size

[WHAT & HOW]
Fix the incorrect value of the cursor_buffer_size.

Signed-off-by: Alex Hung <alex.hung@amd.com>
Reviewed-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agodrm/amd/display: Disable PSR-SU on eDP panels
Tom Chung [Thu, 6 Feb 2025 03:31:23 +0000 (11:31 +0800)] 
drm/amd/display: Disable PSR-SU on eDP panels

[Why]
PSR-SU may cause some glitching randomly on several panels.

[How]
Temporarily disable PSR-SU and fall back to PSR1 for
all eDP panels.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3388
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Sun peng Li <sunpeng.li@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agodrm/amd/display: Revert "Disable PSR-SU on some OLED panel"
Tom Chung [Thu, 6 Feb 2025 03:30:17 +0000 (11:30 +0800)] 
drm/amd/display: Revert "Disable PSR-SU on some OLED panel"

This reverts commit c31b41f1cb32450d8ac176eef9bda979760040e7.

We are planning to temporarily disable PSR-SU and fall back to PSR1 for
all eDP panels, not only for specific eDP panels.

Reviewed-by: Sun peng Li <sunpeng.li@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agodrm/amd/display: Fix spelling mistake "oustanding" -> "outstanding"
Colin Ian King [Mon, 17 Feb 2025 09:53:25 +0000 (09:53 +0000)] 
drm/amd/display: Fix spelling mistake "oustanding" -> "outstanding"

There is a spelling mistake in max_oustanding_when_urgent_expected,
fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agoMAINTAINERS: Update AMDGPU DML maintainers info
Aurabindo Pillai [Fri, 21 Feb 2025 19:19:12 +0000 (14:19 -0500)] 
MAINTAINERS: Update AMDGPU DML maintainers info

Chaitanya is no longer with AMD, and the responsibility has been
taken over by Austin.

Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agodrm/amd/display: restore edid reading from a given i2c adapter
Melissa Wen [Sat, 15 Feb 2025 21:15:47 +0000 (18:15 -0300)] 
drm/amd/display: restore edid reading from a given i2c adapter

When switching to drm_edid, we slightly changed how to get edid by
removing the possibility of getting them from dc_link when in aux
transaction mode. As MST doesn't initialize the connector with
`drm_connector_init_with_ddc()`, restore the original behavior to avoid
functional changes.

v2:
- Fix build warning of unchecked dereference (kernel test bot)

CC: Alex Hung <alex.hung@amd.com>
CC: Mario Limonciello <mario.limonciello@amd.com>
CC: Roman Li <Roman.Li@amd.com>
CC: Aurabindo Pillai <Aurabindo.Pillai@amd.com>
Fixes: 48edb2a4256e ("drm/amd/display: switch amdgpu_dm_connector to use struct drm_edid")
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agodrm/amdgpu: Remove unused nbif_v6_3_1_sriov_funcs
Dr. David Alan Gilbert [Wed, 19 Feb 2025 21:23:18 +0000 (21:23 +0000)] 
drm/amdgpu: Remove unused nbif_v6_3_1_sriov_funcs

The nbif_v6_3_1_sriov_funcs instance of amdgpu_nbio_funcs was added in
commit 894c6d3522d1 ("drm/amdgpu: Add nbif v6_3_1 ip block support")
but has remained unused.

Alex has confirmed it wasn't needed.

Remove it, together with the four unused stub functions:
  nbif_v6_3_1_sriov_ih_doorbell_range
  nbif_v6_3_1_sriov_gc_doorbell_init
  nbif_v6_3_1_sriov_vcn_doorbell_range
  nbif_v6_3_1_sriov_sdma_doorbell_range

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agomailmap: Add entry for Rodrigo Siqueira
Rodrigo Siqueira [Wed, 19 Feb 2025 18:46:20 +0000 (11:46 -0700)] 
mailmap: Add entry for Rodrigo Siqueira

Map all of my previously used email addresses to my @igalia.com address.

Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agodrm/amdgpu: Add ring reset callback for JPEG5_0_1
Sathishkumar S [Tue, 18 Feb 2025 18:06:48 +0000 (23:36 +0530)] 
drm/amdgpu: Add ring reset callback for JPEG5_0_1

Add ring reset function callback for JPEG5_0_1 to
recover from job timeouts without a full gpu reset.

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agoMAINTAINERS: Change my role from Maintainer to Reviewer
Rodrigo Siqueira [Wed, 19 Feb 2025 18:46:19 +0000 (11:46 -0700)] 
MAINTAINERS: Change my role from Maintainer to Reviewer

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agodrm/amdgpu: Log after a successful ring reset
André Almeida [Thu, 20 Feb 2025 16:27:49 +0000 (13:27 -0300)] 
drm/amdgpu: Log after a successful ring reset

When a ring reset happens, the kernel log shows only "amdgpu: Starting
<ring name> ring reset", but when it finishes nothing appears in the
log. Explicitly write in the log that the reset has finished correctly.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: André Almeida <andrealmeid@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agodrm/amdgpu: Log the creation of a coredump file
André Almeida [Thu, 20 Feb 2025 16:27:48 +0000 (13:27 -0300)] 
drm/amdgpu: Log the creation of a coredump file

After a GPU reset happens, the driver creates a coredump file. However,
the user might not be aware of it. Log the file creation so the user can
find more information about the device and add the file to bug reports.
This is similar to what the xe driver does.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: André Almeida <andrealmeid@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agodrm/amdgpu/mes: keep enforce isolation up to date
Alex Deucher [Fri, 14 Feb 2025 17:32:30 +0000 (12:32 -0500)] 
drm/amdgpu/mes: keep enforce isolation up to date

Re-send the mes message on resume to make sure the
mes state is up to date.

Fixes: 8521e3c5f058 ("drm/amd/amdgpu: limit single process inside MES")
Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Shaoyun Liu <shaoyun.liu@amd.com>
Cc: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agodrm/amd/pm: Use separate metrics table for smu_v13_0_12
Asad Kamal [Wed, 12 Feb 2025 08:34:03 +0000 (16:34 +0800)] 
drm/amd/pm: Use separate metrics table for smu_v13_0_12

Use separate metrics table for smu_v13_0_12 and fetch metrics data using
that.

v2: Fix jpeg busy indexing (Lijo)

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agodrm/amdgpu: Add core reset registers for JPEG5_0_1
Sathishkumar S [Tue, 18 Feb 2025 18:05:42 +0000 (23:35 +0530)] 
drm/amdgpu: Add core reset registers for JPEG5_0_1

Add core reset control register definitions and align
all prior register definitions to end at 100 column
length for uniformity.

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agodrm/amdgpu: Per-instance init func for JPEG5_0_1
Sathishkumar S [Tue, 18 Feb 2025 17:56:37 +0000 (23:26 +0530)] 
drm/amdgpu: Per-instance init func for JPEG5_0_1

Add helper functions to handle per-instance and per-core
initialization and deinitialization in JPEG5_0_1.

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agodrm/amd/display: fix an indent issue in DML21
Aurabindo Pillai [Fri, 21 Feb 2025 14:45:12 +0000 (09:45 -0500)] 
drm/amd/display: fix an indent issue in DML21

Remove extraneous tab and newline in dml2_core_dcn4.c that was
reported by the bot

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202502211920.txUfwtSj-lkp@intel.com/
Fixes: 70839da6360 ("drm/amd/display: Add new DCN401 sources")
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agoMAINTAINERS: update amdgpu maintainers list
Alex Deucher [Tue, 11 Feb 2025 20:38:20 +0000 (15:38 -0500)] 
MAINTAINERS: update amdgpu maintainers list

Xinhui's email is no longer valid.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agodrm/amdgpu: disable BAR resize on Dell G5 SE
Alex Deucher [Mon, 17 Feb 2025 15:55:05 +0000 (10:55 -0500)] 
drm/amdgpu: disable BAR resize on Dell G5 SE

A quirk was added to work around a Sapphire
RX 5600 XT Pulse that didn't allow BAR resizing.  However,
the quirk caused a regression with runtime pm on Dell laptops
using those chips.  Rather than narrowing the scope of the
resizing quirk, add a quirk to prevent amdgpu from resizing
the BAR on those Dell platforms unless runtime pm is disabled.

v2: update commit message, add runpm check

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1707
Fixes: 907830b0fc9e ("PCI: Add a REBAR size quirk for Sapphire RX 5600 XT Pulse")
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agodrm/amd/pm: Fetch fru product info for smu_v13_0_12
Asad Kamal [Wed, 12 Feb 2025 08:00:41 +0000 (16:00 +0800)] 
drm/amd/pm: Fetch fru product info for smu_v13_0_12

Fetch fru product info for smu_v13_0_12 from static metrics table

v2: Field by field copy for fru info(Lijo)

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agodrm/amd/pm: Fetch static metrics table
Asad Kamal [Mon, 10 Feb 2025 16:03:18 +0000 (00:03 +0800)] 
drm/amd/pm: Fetch static metrics table

Fetch clock frequency table from static metrics table for
smu_v13_0_12

v2: Move PPTable definition, remove unnecessary checks for getting
static metrics table(Lijo)

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agodrm/amd/pm: Add GetStaticMetricTable message
Asad Kamal [Mon, 10 Feb 2025 16:17:37 +0000 (00:17 +0800)] 
drm/amd/pm: Add GetStaticMetricTable message

Add GetStaticMetricTable message for smu_v13_0_12

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agodrm/amd/pm: Update pmfw headers for smu_v13_0_12
Asad Kamal [Mon, 10 Feb 2025 07:56:51 +0000 (15:56 +0800)] 
drm/amd/pm: Update pmfw headers for smu_v13_0_12

Update pmfw headers for smu_v13_0_12 new messages & metrics table.
Add a static metrics table for frequency and a separate metrics table
for smu_v13_0_12.

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agodrm/amdgpu: Update amdgpu_job_timedout to check if the ring is guilty
Jesse.zhang@amd.com [Fri, 21 Feb 2025 02:26:52 +0000 (10:26 +0800)] 
drm/amdgpu: Update amdgpu_job_timedout to check if the ring is guilty

This patch updates the `amdgpu_job_timedout` function to check if
the ring is actually guilty of causing the timeout. If not, it
skips error handling and fence completion.
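
A rough sketch of the flow described above, using simplified stand-in types rather than the real amdgpu structures. All names here (`struct ring`, `handle_timeout`, the `demo_*` callbacks, the counters) are hypothetical, and treating a ring without the callback as guilty is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct ring;

struct ring_funcs {
    bool (*is_guilty)(struct ring *ring); /* optional, may be NULL */
    int (*reset)(struct ring *ring);
};

struct ring {
    const struct ring_funcs *funcs;
    bool hung;         /* stand-in for real hardware state */
    int resets_done;
    int fences_forced; /* counts error handling + fence completion */
};

static bool ring_is_guilty(struct ring *ring)
{
    /* Assumption: rings without the callback are treated as guilty. */
    if (!ring->funcs->is_guilty)
        return true;
    return ring->funcs->is_guilty(ring);
}

static void handle_timeout(struct ring *ring)
{
    bool guilty;

    assert(ring && ring->funcs && ring->funcs->reset);
    guilty = ring_is_guilty(ring); /* sampled before the reset */
    if (ring->funcs->reset(ring) != 0)
        return; /* a real driver would fall back to a heavier reset */
    if (guilty)
        ring->fences_forced++; /* skipped for an innocent ring */
}

/* Demo callbacks: the reset clears the state that is_guilty inspects. */
static bool demo_is_guilty(struct ring *ring)
{
    return ring->hung;
}

static int demo_reset(struct ring *ring)
{
    ring->hung = false;
    ring->resets_done++;
    return 0;
}

static const struct ring_funcs demo_funcs = {
    .is_guilty = demo_is_guilty,
    .reset = demo_reset,
};
```

Because the demo reset clears the hung state, querying guilt after the reset would always report an innocent ring, which is why the sketch samples it first.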

v2: move the is_guilty check down into the queue reset area (Alex)
v3: need to call is_guilty before reset (Alex)
v4: squash in is_guilty logic fixes (Alex)

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agodrm/amd/pm: add support for checking SDMA reset capability
Jesse.zhang@amd.com [Fri, 21 Feb 2025 06:02:05 +0000 (14:02 +0800)] 
drm/amd/pm: add support for checking SDMA reset capability

This patch introduces a new function to check if the SMU supports resetting the SDMA engine.
This capability check ensures that the driver does not attempt to reset the SDMA engine
on hardware that does not support it.

The following changes are included:
- New function `amdgpu_dpm_reset_sdma_is_supported` to check SDMA reset
  support at the AMDGPU driver level.
- New function `smu_reset_sdma_is_supported` to check SDMA reset support
  at the SMU level.
- Implementation of `smu_v13_0_6_reset_sdma_is_supported` for the specific
  SMU version v13.0.6.
- Updated `smu_v13_0_6_reset_sdma` to use the new capability check before
  attempting to reset the SDMA engine.
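
The layering listed above could look roughly like the following sketch. The types and names (`struct smu_ctx`, `smu_sdma_reset_supported`, the `demo_*` functions, `fw_can_reset_sdma`) are hypothetical stand-ins; how the real v13.0.6 code decides on support is not described in this message.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct smu_ctx;

struct smu_funcs {
    bool (*reset_sdma_is_supported)(struct smu_ctx *smu); /* optional */
    int (*reset_sdma)(struct smu_ctx *smu, unsigned int inst_mask);
};

struct smu_ctx {
    const struct smu_funcs *funcs;
    bool fw_can_reset_sdma;       /* hypothetical capability flag */
    unsigned int last_reset_mask; /* records the last accepted reset */
};

/* Generic query: report "unsupported" when no callback is provided. */
static bool smu_sdma_reset_supported(struct smu_ctx *smu)
{
    assert(smu && smu->funcs);
    if (!smu->funcs->reset_sdma_is_supported)
        return false;
    return smu->funcs->reset_sdma_is_supported(smu);
}

/* Version-specific capability check (stand-in logic). */
static bool demo_reset_sdma_is_supported(struct smu_ctx *smu)
{
    return smu->fw_can_reset_sdma;
}

/* Version-specific reset: refuse early when the capability is missing. */
static int demo_reset_sdma(struct smu_ctx *smu, unsigned int inst_mask)
{
    if (!smu_sdma_reset_supported(smu))
        return -EOPNOTSUPP;
    smu->last_reset_mask = inst_mask;
    return 0;
}

static const struct smu_funcs demo_funcs = {
    .reset_sdma_is_supported = demo_reset_sdma_is_supported,
    .reset_sdma = demo_reset_sdma,
};
```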

v2: change smu_reset_sdma_is_supported type to bool (Tim)

Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Tim Huang <tim.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agodrm/amdgpu: Add reset function pointer for SDMA v4.4.2 page ring
Jesse.zhang@amd.com [Thu, 13 Feb 2025 05:28:51 +0000 (13:28 +0800)] 
drm/amdgpu: Add reset function pointer for SDMA v4.4.2 page ring

This patch adds a reset function pointer to the SDMA v4.4.2 page ring
functionality. The new function pointer `reset` is set to
`sdma_v4_4_2_reset_queue`, which is responsible for resetting the SDMA queue.

Changes:
- Add `reset` function pointer to `sdma_v4_4_2_page_ring_funcs`.

Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agodrm/amdgpu: Improve SDMA reset logic with guilty queue tracking
Jesse.zhang@amd.com [Thu, 20 Feb 2025 06:43:59 +0000 (14:43 +0800)] 
drm/amdgpu: Improve SDMA reset logic with guilty queue tracking

This patch includes the remaining improvements to the SDMA reset logic:
- Added `gfx_guilty` and `page_guilty` flags to track guilty queues.
- Updated the reset and resume functions to handle the guilty state.
- Cached the `rptr` before reset.

v2:
   1. Replace the caller with a guilty bool.
   If the queue is the guilty one, set the rptr and wptr to the saved wptr value;
   otherwise, set the rptr and wptr to the saved rptr value. (Alex)
   2. Cache the rptr before the reset. (Alex)

v3: Keeping intermediate variables like u64 rwptr simplifies restoring rptr/wptr. (Lijo)
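
The pointer restore rule from the v2 and v3 notes can be sketched as follows. `struct queue_state` and `restore_queue_pointers` are hypothetical names; only the selection rule and the intermediate `rwptr` variable come from the notes above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical view of a queue's read and write pointers. */
struct queue_state {
    uint64_t rptr;
    uint64_t wptr;
};

/*
 * After a reset, both pointers are set to one value: the saved wptr for
 * the guilty queue, or the cached rptr for an innocent queue.
 */
static void restore_queue_pointers(struct queue_state *q,
                                   uint64_t cached_rptr,
                                   uint64_t saved_wptr,
                                   bool guilty)
{
    uint64_t rwptr = guilty ? saved_wptr : cached_rptr;

    assert(q);
    q->rptr = rwptr;
    q->wptr = rwptr;
}
```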

Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Suggested-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agodrm/amdgpu/sdma: Introduce is_guilty callbacks for sdma GFX and PAGE rings
Jesse.zhang@amd.com [Thu, 13 Feb 2025 02:51:38 +0000 (10:51 +0800)] 
drm/amdgpu/sdma: Introduce is_guilty callbacks for sdma GFX and PAGE rings

This patch introduces the `is_guilty` callbacks for the GFX and PAGE rings.
These callbacks check if a ring is guilty of causing a timeout or error.

Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agodrm/amdgpu: Introduce cached_rptr and is_guilty callback in amdgpu_ring
Jesse.zhang@amd.com [Thu, 13 Feb 2025 02:30:07 +0000 (10:30 +0800)] 
drm/amdgpu: Introduce cached_rptr and is_guilty callback in amdgpu_ring

This patch introduces the following changes:
- Add `cached_rptr` to the `amdgpu_ring` structure to store the read pointer before a reset.
- Add `is_guilty` callback to the `amdgpu_ring_funcs` structure to check if a ring is guilty of causing a timeout.

Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 months agodrm/amdgpu: Introduce conditional user queue suspension for SDMA resets
Jesse.zhang@amd.com [Thu, 20 Feb 2025 06:25:47 +0000 (14:25 +0800)] 
drm/amdgpu: Introduce conditional user queue suspension for SDMA resets

- Modify the `amdgpu_sdma_reset_engine` function to accept a `suspend_user_queues` parameter.
- This parameter allows the function to conditionally suspend and resume user queues during SDMA resets.
- Ensure that user queues are suspended only when necessary to avoid unnecessary overhead and potential deadlocks.
- Restart the scheduler's work queue for the GFX and page rings after the reset to allow new tasks to be submitted.
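
A simplified sketch of the sequence in the list above. It uses counters in place of real driver calls; `struct sdma_engine`, `reset_engine`, and `hw_reset` are hypothetical names and this is not the body of `amdgpu_sdma_reset_engine`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical engine state; counters stand in for real driver calls. */
struct sdma_engine {
    int user_queue_suspends;
    int user_queue_resumes;
    int sched_stops;
    int sched_starts;
    bool hw_reset_fails; /* test knob: make the hardware reset fail */
};

static int hw_reset(struct sdma_engine *e)
{
    return e->hw_reset_fails ? -1 : 0;
}

static int reset_engine(struct sdma_engine *e, bool suspend_user_queues)
{
    int ret;

    assert(e);
    if (suspend_user_queues)
        e->user_queue_suspends++; /* suspend user queues first */

    e->sched_stops++;             /* stop the GFX/page ring schedulers */
    ret = hw_reset(e);
    if (ret == 0)
        e->sched_starts++;        /* restart only after a good reset */

    if (suspend_user_queues)
        e->user_queue_resumes++;  /* resume user queues last */

    return ret;                   /* non-zero: caller escalates */
}
```

Passing `false` skips the suspend and resume steps entirely, which is the point of making the suspension conditional.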

This change improves synchronization between the KGD and the KFD during SDMA resets,
ensuring proper handling of user queues and avoiding race conditions.

v2: replace the ring_lock with the existing scheduler
    locks for the queues (ring->sched) on the sdma engine. (Alex)

v3: call drm_sched_wqueue_stop() rather than job_list_lock.
    If a GPU ring reset was already initiated for one ring at amdgpu_job_timedout,
    skip resetting that ring and call drm_sched_wqueue_stop()
    for the other rings (Alex)

   replace the common lock (sdma_reset_lock) with the DQM lock
   to resolve reset races between the two driver sections during KFD eviction. (Jon)

   Rename the caller to Reset_src and
   Change AMDGPU_RESET_SRC_SDMA_KGD/KFD to AMDGPU_RESET_SRC_SDMA_HWS/RING (Jon)

v4: restart the wqueue if the reset was successful,
    or fall back to a full adapter reset. (Alex)

   move definition of reset source to enumeration AMDGPU_RESET_SRCS, and
   check reset src in amdgpu_sdma_reset_instance (Jon)

v5: Call amdgpu_amdkfd_suspend/resume at the start/end of reset function respectively under !SRC_HWS
    conditions only (Jon)

v6: replace the parameter src with a bool suspend_user_queues,
    remove the parameter src in pre/post func. (Jon)

Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Suggested-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Suggested-by: Jonathan Kim <Jonathan.Kim@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Acked-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>