git.ipfire.org Git - thirdparty/kernel/linux.git/log

drm/amd: Require CONFIG_HOTPLUG_PCI_PCIE for BOCO

If the kernel hasn't been compiled with PCIe hotplug support this
can lead to problems with dGPUs that use BOCO because they effectively
drop off the bus.

To prevent issues, disable BOCO support when compiled without PCIe hotplug.

Reported-by: Gabriel Marcano <gabemarcano@yahoo.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1707#note_2696862
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20241211155601.3585256-1-superm1@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/vcn: reset fw_shared under SRIOV

- The previous patch only considered the case for baremetal
and is not applicable for SRIOV code path. We also need to
init fw_share for SRIOV VF

Fixes: 928cd772e18f ("drm/amdgpu/vcn: reset fw_shared when VCPU buffers corrupted on vcn v4.0.3")
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Bokun Zhang <bokun.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display/dc: add helper for panic updates

Add a DC helper for panic updates.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Lu Yao <yaolu@kylinos.cn>
Cc: Jocelyn Falempe <jfalempe@redhat.com>
Cc: Harry Wentland <harry.wentland@amd.com>

drm/amd/display: add clear_tiling mi callbacks

This adds clear_tiling callbacks to the mi structure that
will be used for drm panic support to clear the tiling on
a display. Mem input (mi) is used on DCE based display
IPs.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Lu Yao <yaolu@kylinos.cn>
Cc: Jocelyn Falempe <jfalempe@redhat.com>
Cc: Harry Wentland <harry.wentland@amd.com>

drm/amd/display: add clear_tiling hubp callbacks

This adds clear_tiling callbacks to the hubp structure that
will be used for drm panic support to clear the tiling on
a display. hubp3 support from Jocelyn's original patch
and the rest from me.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Lu Yao <yaolu@kylinos.cn>
Cc: Jocelyn Falempe <jfalempe@redhat.com>
Cc: Harry Wentland <harry.wentland@amd.com>

drm/amdgpu: add generic display panic helper code

Pull this out of Jocelyn's patch and make it generic.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Lu Yao <yaolu@kylinos.cn>
Cc: Jocelyn Falempe <jfalempe@redhat.com>
Cc: Harry Wentland <harry.wentland@amd.com>

drm/amdgpu/jpeg5.0.1: use num_jpeg_inst for SR-IOV

They should be the same, but use the proper variable.

Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/jpeg4.0.3: use num_jpeg_inst for SR-IOV

They should be the same, but use the proper variable.

Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add sysfs reset mask for vcn 5.0.1

Add the calls to the vcn 5.0.1 code.

Reviewed-by: David (Ming Qiang) Wu <David.Wu3@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add ip_dump support for vcn 5.0.1

Shared with vcn 5.0.0.

Reviewed-by: David (Ming Qiang) Wu <David.Wu3@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd: Update strapping for NBIO 2.5.0

This helps to avoid a spurious PME event on hotplug to Azalia.

Cc: Vijendar Mukunda <Vijendar.Mukunda@amd.com>
Reported-and-tested-by: ionut_n2001@yahoo.com
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=215884
Tested-by: Gabriel Marcano <gabemarcano@yahoo.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20241211024414.7840-1-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gfx11: clean up kcq reset code

Replace kcq queue reset with existing function amdgpu_mes_reset_legacy_queue.

Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gfx12: clean up kcq reset code

Replace kcq queue reset with existing function amdgpu_mes_reset_legacy_queue.

Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/sdma7: Add queue reset sysfs for sdmav7

sdmv7 queue reset already supports by mmio, add its sys file.

Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/mes12: Implement reset gfx/compute queue function by mmio

Reset gfx/compute queue through mmio based on me_id and queue_id.

Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/mes12: Implement reset sdmav7 queue function by mmio

Reset sdma queue through mmio based on me_id and queue_id.

Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Avoid VF for RAS recovery source check

VF device sets the RAS flag when mailbox data can't be read properly.
There is no conclusive way to tell if the real source is RAS error.
Therefore VF schedules a KFD based reset which doesn't set RAS source.
SKip checking RAS source for any VF scheduled recovery.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reported-by: Vojislav Tomasevic <vojislav.tomasevic@amd.com>
Reviewed-by: Yiqing Yao <yiqing.yao@amd.com>
Tested-by: Yiqing Yao <yiqing.yao@amd.com>
Fixes: e1ee2111ca48 ("drm/amdgpu: Prefer RAS recovery for scheduler hang")
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/sdma7: implement queue reset callback for sdma7

Implement sdma queue reset callback by mes_reset_queue_mmio.

Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/sdma7: Implement resume function for each instance

Extracts the resume sequence for per sdma instance from sdma_v7_0_gfx_resume.
This function can be used in start or restart scenarios of specific instances.

Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd: Uninitialized pointer read

This a pointer that is being passed into other functions, so it is best to
initialize it to NULL prior.

Signed-off-by: Andrew Martin <Andrew.Martin@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Remove unused dcn_find_dcfclk_suits_all

dcn_find_dcfclk_suits_all() last use was removed by 2018's
commit 4fd994c448a3 ("drm/amd/display: Start using the new pp_smu
interface")

Remove it, and the dcn_find_normalized_clock_vdd_Level helper it used.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Remove unused mmhubbub_warmup field

mmhubbub_warmup is a field that was only read by the just removed
dc_stream_warmup_writeback() function.

Remove the field and it's initialisers.

It was only ever initialised to a single function value
(dcn30_mmhubbub_warmup) which is called explicitly elsewhere.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Remove unused dc_stream_warmup_writeback

dc_stream_warmup_writeback() is unused since it was added in 2019 by
commit 6a652f6d127d ("drm/amd/display: Add warmup escape call support")

Remove it.

Note there is a dcn30 version that's called directly which is kept.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Remove unused dwb3_set_host_read_rate_control

dwb3_set_host_read_rate_control() has been unused since it was added by
commit 8993dee0de2a ("drm/amd/display: Add DCN3 DWB")

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Remove unused enable_surface_flip_reporting

enable_surface_flip_reporting() has been unused since it was added by
commit 4562236b3bc0 ("drm/amd/dc: Add dc display driver (v2)")

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: rename register headers to dcn_2_0_1

They were named with the incorrect dcn version.

Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Sun peng Li <sunpeng.li@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: 3.2.313

* Fix some regressions related to IPS2 and PSR Panel Replay
* Bug fixes in DML
* DMCUB debug improvements
* Other refactors and improvements across multiple components

Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: [FW Promotion] Release 0.0.246.0

Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: update dcn351 used clock offset

[why]
hw register offset delta

Reviewed-by: Martin Leung <martin.leung@amd.com>
Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: remove clearance code of force_ffu_mode flag in dmub_psr_copy_settings()

[Why/How]
The force_ffu_mode flag could be initialized at other place.

Reviewed-by: Robin Chen <robin.chen@amd.com>
Signed-off-by: Zhongwei <Zhongwei.Zhang@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Revert "drm/amd/display: Don't allow IPS2 in D0 for RCG Dynamic"

This reverts commit 8488646966fe.

In some test environments causes reporting failures for S0i3/S4.

It shouldn't actually block entry provided there's no race with the
last state being updated, but currently suspecting there's an IPS2
check that's no longer being met.

Fixes: 8488646966fe ("drm/amd/display: Don't allow IPS2 in D0 for RCG Dynamic")
Reviewed-by: Ovidiu Bunea <ovidiu.bunea@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Revert "drm/amd/display: Revised for Replay Pseudo vblank"

This reverts commit 0f5ac8c8e275
Due to a replay regression.

Fixes: 0f5ac8c8e275 ("drm/amd/display: Revised for Replay Pseudo vblank control")
Reviewed-by: Dennis Chan <dennis.chan@amd.com>
Signed-off-by: Gabe Teeger <Gabe.Teeger@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Update color space, bias and scale programming sequence

[Why]
DMColor inaccurately updates color space, bias and scale
destructively in dc_plane_state. This can be resolved by
accurately populating the infos on dc_plane_info where then
translation to plane state can happen as a whole surface update sequence.

[How]
Remove dc_plane_state update in DMColor and update color space,
bias and scale on dc_plane_info.

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Chris Park <chris.park@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Use resource_build_scaling_params for dcn20

[WHY]
When using upscaling on certain gpus, some incorrect scaling
calculations would be made causing hangs.

[HOW]
This was fixed by using the resource_build_scaling_params function on these
gpus.

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Peterson Guo <peterson.guo@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Overwriting dualDPP UBF values before usage

[WHY]
Right now in dml2 mode validation we are calculating UBF parameters for
prefetch calculation for single and dual DPP scenarios. Data structure
to store such values are just 1D arrays, the single DPP values are
overwritten by the dualDPP values, and we end up using dualDPP for
prefetch calculations twice (once in place of singleDPP support check
and again for dual).

This naturally leads to many problems, one of which validating a mode in
"singleDPP" (when we used dual DPP parameters) and sending the singleDPP
parameters to mode programming, if we cannot support then we observe the
corruption as described in the ticket.

[HOW]
UBF values need to have 2d arrays to store values specific to single and
dual DPP states to avoid single DPP values being overwritten. Other
parameters are recorded on a per state basis such as prefetch UBF values
but they are in the same loop used for calculation and at that point its
fine to overwrite them, its not the case for plain UBF values.

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Ausef Yousof <Ausef.Yousof@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Make DMCUB tracebuffer debugfs chronological

[Why]

Previously, the debugfs did a simple dump of the tracebuffer region.
Because the tracebuffer is a ring, it meant that the entries printed may
not be in chronological order if the ring rolled over. This makes
parsing the tracelog cumbersome.

[How]

Since dmcub provides the current entry count, use that to determine
the latest tracelog entry and output the log chronologically.

Also, the fb region size is not accurate of the actual tracebuffer size;
it has been padded to alignment requirements. Use the tracebuffer size
reported by the fw meta_info, if available. If not, a fallback to the
hardcoded default is needed. To make this value available to other .c
files, its define was moved to dmub_srv.h.

Also, print a indicator at the start of the log if rollover occurred.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: clean up SPL code

[Why & How]
Add check for invalid pixel format, remove unused pixel formats
and clean up some names

Reviewed-by: Navid Assadian <navid.assadian@amd.com>
Signed-off-by: Samson Tam <Samson.Tam@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: DML21 Update Prefetch Calculations

[Why/How]
Mismatch between mode support and mode programming occurs.
Mode support would calculate higher row vblank than mode programming.
As a result, mode programming fails and hardware isn't properly programmed.

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Austin Zheng <Austin.Zheng@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Adjust secure_display_context data structure

[Why]
Variables relates to secure display are spreading out within struct
amdgpu_display_manager.

[How]
Encapsulate relevant variables into struct secure_display_context and
adjust relevant affected codes.

Reviewed-by: HaoPing Liu <haoping.liu@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Fix phy id mapping issue for secure display

[Why]
Under mst scenario, mst streams are from the same link_enc_hw_inst.
As the result, can't utilize that as the phy index for distinguising
different stream sinks.

[How]
Sort the connectors by:
link_enc_hw_instance->mst tree depth->mst RAD

After sorting the phy index assignment, store connector's relevant info
into dm mapping array. Once need the index, just look up the static
array.

Reviewed-by: HaoPing Liu <haoping.liu@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Adjust dc_stream_forward_crc_window to accept assignment of phy_id

[Why]
For mst streams under same topology, stream->link->link_enc_hw_inst are the same and
hence can't distinguish the crc window setting.

[How]
Firstly adjust dc_stream_forward_crc_window to accept assignment of phy_id. Follow up
another patch to determine the phy_id at dm layer.

Reviewed-by: HaoPing Liu <haoping.liu@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Refactor dcn31_panel_construct to avoid assert

[Why]
We want to avoid unnecessary asserts, one of which is hit in
dcn31_panel_construct when booting on a DCN32 asic that has an eDP
connector on a different DIG than A or B. The DIG-based mapping only
applies when edp0_on_dp1 is supported, therefore the check for valid
eng_id can be moved within the appropriate section of the if statement.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Joshua Aberback <joshua.aberback@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: expose DCN401 HUBP functions

[Why]
Expose DCN401 HUBP functions for use across other platforms.

[Description]
This change aims to make the DCN401 HUBP functions accessible for
enabling their use in future platform developments.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Karthi Kandasamy <karthi.kandasamy@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: populate VABC support in DMCUB

[HOW&WHY]
Stores DMUB support for enablement of Varibright over VABC in DCN32

Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Reviewed-by: Iswara Nagulendran <iswara.nagulendran@amd.com>
Signed-off-by: Harry VanZyllDeJong <hvanzyll@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Support nbif v6_3_1 fatal error handling

Add nbif v6_3_1 fatal error handling support.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: Revert state if force level fails

Before forcing level, CG/PG is disabled or enabled depending on the new
level. However if the force level operation fails, CG/PG state remains
modified. Revert the state change on failure. Also, move invalid
operation checks to the beginning before any logic that could change SOC
state.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Enable VCN_5_0_1 IP block

Add VCN_5_0_1 IP block to kernel boot

Signed-off-by: Sonny Jiang <sonjiang@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Add VCN_5_0_1 support

Add vcn support for VCN_5_0_1

v2: rebase, squash in fixes (Alex)

Signed-off-by: Sonny Jiang <sonjiang@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: enable JPEG5_0_1 ip block

enable JPEG5_0_1 ip block

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Add JPEG5_0_1 support

add support for JPEG5_0_1

v2: squash in updates, rebase on IP instance changes

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Add VCN_5_0_1 codec query

Support VCN_5_0_1 codec query

v2: squash in updates

Signed-off-by: Sonny Jiang <sonjiang@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Add VCN_5_0_1 firmware

Add vcn_5_0_1 firmware support

Signed-off-by: Sonny Jiang <sonjiang@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: update macro for maximum jpeg rings

Update the macro to accomdate more rings.

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Sonny Jiang <sonjiang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd: Update atomfirmware: add new retimer definition

Add some new retimer definitions and also fix a incorrect definition

Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Michael Strauss <michael.strauss@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: update irq sec header for vcn 5.0.0

No functional change.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: update irq sec header for jpeg 5.0.0

No functional change.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add irq source ids for VCN5_0/JPEG5_0

Add interrupt source id macros for VCN5 and JPEG5

V2: Update copyright year (Sonny)

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Sonny Jiang <sonjiang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Add umc v8_14 ras functions

Add umc v8_14 ras functions.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Add umc v8_14_0 ip headers

Add umc v8_14_0 ip headers.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Add psp v14_0_3 ras support

Add psp v14_0_3 ras support.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/amdgpu: Add Annotations to Process Isolation functions

This update adds explanations to key functions that manage how the
Kernel Fusion Driver (KFD) and Kernel Graphics Driver (KGD) share the
GPU.

amdgpu_gfx_enforce_isolation_wait_for_kfd: Controls the waiting period
for KFD to ensure it takes turns with KGD in using the GPU. It uses a
mutex to safely manage shared data, like timing and state, and tracks
when KFD starts and stops waiting.

amdgpu_gfx_enforce_isolation_ring_begin_use: Ensures KFD has enough time
to run before new tasks are submitted to the GPU ring. It uses a mutex
to synchronize access and may adjust the KFD scheduler.

amdgpu_gfx_enforce_isolation_ring_end_use: Handles cleanup and state
updates when finishing the use of a GPU ring. It may also adjust the KFD
scheduler, using a mutex to manage shared data access.

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Init mmhub v1_8_1 ras func

reuse mmhub v1_8 ras functuion

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Enable xgmi for gfx v9_5_0

Enable xgmi for gfx v9_5_0

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Fetch refclock for SMU v13.0.12

Add support to fetch refclock value for SMU v13.0.12

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: Add mode2 support for SMU v13.0.12

Add mode2 reset support for smu version 13.0.12

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: Add smu_v13_0_12 support

Add support for new smu 13_0_12 version

v2: Updated subject & moved skipping p2s init to a separate patch

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/amdgpu: Add Descriptions to Process Isolation and Cleaner Shader Sysfs Functions

This update adds explanations to key functions related to process
isolation and cleaner shader execution sysfs interfaces.

- `amdgpu_gfx_set_run_cleaner_shader`: Describes how to manually run a
  cleaner shader, which clears the Local Data Store (LDS) and General
  Purpose Registers (GPRs) to ensure data isolation between GPU workloads.

- `amdgpu_gfx_get_enforce_isolation`: Describes how to query the current
  settings of the 'enforce_isolation' feature for each GPU partition.

- `amdgpu_gfx_set_enforce_isolation`: Describes how to enable or disable
  process isolation for GPU partitions through the sysfs interface.

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Enable RAS for psp v13_0_12

Enable RAS Cap check and initialize RAS funcs
for psp v13_0_12

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Load spdm_drv for psp v13_0_12

spdm_drv is a firmware that needs to be loaded
in driver initialization phase.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Add psp v13_0_12 firmware specifiers

Add psp v13_0_12 firmware specifiers for sos and ta

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Shiwu Zhang <shiwu.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add psp 13_0_12 version support

Add support for new psp 13_0_12 version

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd: Show an info message about optional firmware missing

With the warning from the core about missing firmware gone,
users still may be notified of missing optional firmware by
a more friendly message to clarify it's optional.

Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add ACA support for jpeg v4.0.3

Add ACA support for jpeg v4.0.3.

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add ACA support for vcn v4.0.3

v1:
Add ACA support for vcn v4.0.3.

v2:
- split VCN ACA(v1) to 2 parts: vcn and jpeg.
- move mmSMNAID_AID0_MCA_SMU to amdgpu_aca.h file.

v3:
- split JPEG ACA to another patch.

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: move common ACA ipid defines into amdgpu_aca.h

move common ACA ipid defines into amdgpu_aca.h file.

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add ih cam support for IH 4.4.4

Same as IH 4.4.2.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add initial support for sdma444

add sdma444 basic support

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Increase FRU File Id buffer size

Some boards use longer File Ids.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: correct the calculation of RAS bad page

After the introduction of NPS RAS, one bad page record on eeprom may be
related to 1 or 16 bad pages, so the bad page record and bad page are
two different concepts, define a new variable to store bad page number.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: split ras_eeprom_init into init and check functions

Init function is for ras table header read and check function is
responsible for the validation of the header. Call them in different
stages.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd: Add the capability to mark certain firmware as "required"

Some of the firmware that is loaded by amdgpu is not actually required.
For example the ISP firmware on some SoCs is optional, and if it's not
present the ISP IP block just won't be initialized.

The firmware loader core however will show a warning when this happens
like this:
```
Direct firmware load for amdgpu/isp_4_1_0.bin failed with error -2
```

To avoid confusion for non-required firmware, adjust the amd-ucode helper
to take an extra argument indicating if the firmware is required or
optional.

On optional firmware use firmware_request_nowarn() instead of
request_firmware() to avoid the warnings.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/amd-gfx/df71d375-7abd-4b32-97ce-15e57846eed8@amd.com/T/#t
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd: update the cwsr area size for gfx950

Update cwsr area size for gfx950 to fit the new user queue buffer validation.
The size of LDS calculation is referred from gfx950 thunk implementation.

Signed-off-by: Le Ma <le.ma@amd.com>
Acked-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd: Handle save/restore of lds allocated in 1280B blocks

The gfx-9 trap handler is reading LDS allocation size in 256 bytes
granularity (from SQ_WAVE_LDS_ALLOC), but it using the assumption that
this value is always even (i.e. the LDS allocation is really done in
multiple of 512 bytes).  This was true so far, but gfx-950 allocates LDS
in chunks of 1280 bytes, making this assumption invalid.  This can cause
the trap handler to try to save / restore past the end of LDS, and past
the LDS allocated slot in the save are, overriding data from the
following wave.

This patch updates the trap handler to support LDS allocated in 1280
bytes blocks:
- During restore, copy from main memory directly to LDS in batch of 1280
  bytes.
- During save, continue to use 512 bytes blocks (we only have 2 VGPRs we
  can use to hold data), making sure to mask the upper half of the wave
  when handling when the LDS size is not a multiple of 512 bytes.

Signed-off-by: Lancelot SIX <lancelot.six@amd.com>
Co-authored-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Jay Cornwall <jay.cornwall@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd: Adjust CWSR trap handler for gfx950

In gfx950, the SQ_WAVE_LDS_ALLOC.LDS_SIZE field is extended to bits 12
to 22. The LDS_SIZE granularity remains unchanged (units of 64 dwords,
or 256 bytes). This patch adjusts the CWSR trap handler to read the
full extent of LDS_SIZE.

Signed-off-by: Lancelot SIX <lancelot.six@amd.com>
Reviewed-by: Jay Cornwall <jay.cornwall@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd: update buffer_{store,load}_* modifiers for gfx940

Instruction modifiers of the untyped vector memory buffer instructions
(MUBUF encoded) changed in gfx940. The slc, scc and glc modifiers have
been replaced with sc0, sc1 and nt.

The current CWSR trap handler is written using pre-gfx940 modifier
names, making the source incompatible with a strict gfx940 assembler.

This patch updates the cwsr_trap_handler_gfx9.s source file to be
compatible with all gfx9 variants of the ISA. The binary assembled code
is unchanged (so the behaviour is unchanged as well), only the source
representation is updated.

Signed-off-by: Lancelot SIX <lancelot.six@amd.com>
Reviewed-by: Jay Cornwall <jay.cornwall@amd.com>
Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd: add gc 9.5.0 support on kfd

Initial support for GC 9.5.0.

v2: squash in pqm_clean_queue_resource() fix from Lijo

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Apply gc v9_5_0 golden settings

Apply gc v9_5_0 golden settings.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd: update mtype flags for gfx 9.5.0

Update mtype flags to meet gfx 9.5.0 requirements for remote GPU
memory and system memory.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Harish Kasiviswanathan <harish.kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Set proper MTYPE for GC 9.5.0

GC 9.5.0 local memory MTYPE default should be set as RW.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add initial support for gfx950

add gfx950 basic support

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gfx: add gfx950 microcode

Add firmware declarations.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd: define gc ip version local variable

For better readability. Also leftover orphaned code.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Harish Kasiviswanathan <harish.kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Remove gfxoff usage

GFXOFF is not valid for these IP versions. Also, SDMA v4.4.2 is not in
GFX domain.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Avoid to release the FW twice in the validated error

There will to release the FW twice when the FW validated error.
Even if the release_firmware() will further validate the FW whether
is empty, but that will be redundant and inefficient.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: device: fix spellos and punctuation

Make spelling and punctuation changes to ease reading of the comments.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Xinhui Pan <Xinhui.Pan@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: David Airlie <airlied@gmail.com>
Cc: Simona Vetter <simona@ffwll.ch>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Fix potential NULL pointer dereference in atomctrl_get_smc_sclk_range_table

The function atomctrl_get_smc_sclk_range_table() does not check the return
value of smu_atom_get_data_table(). If smu_atom_get_data_table() fails to
retrieve SMU_Info table, it returns NULL which is later dereferenced.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

In practice this should never happen as this code only gets called
on polaris chips and the vbios data table will always be present on
those chips.

Fixes: a23eefa2f461 ("drm/amd/powerplay: enable dpm for baffin.")
Signed-off-by: Ivan Stepchenko <sid@itb.spb.ru>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Add amdgpu_vcn_sched_mask debugfs

Add debugfs entry to enable or disable job submission to
specific vcn instances. The entry is created only when
there is more than an instance and is unified queue type.

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: return error when eeprom checksum failed

Return eeprom table checksum error result, otherwise
it might be overwritten by next call.

V2: replace DRM_ERROR with dev_err

Signed-off-by: Jinzhou Su <jinzhou.su@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd: Add Suspend/Hibernate notification callback support

As part of the suspend sequence VRAM needs to be evicted on dGPUs.
In order to make suspend/resume more reliable we moved this into
the pmops prepare() callback so that the suspend sequence would fail
but the system could remain operational under high memory usage suspend.

Another class of issues exist though where due to memory fragementation
there isn't a large enough contiguous space and swap isn't accessible.

Add support for a suspend/hibernate notification callback that could
evict VRAM before tasks are frozen. This should allow paging out to swap
if necessary.

Link: https://github.com/ROCm/ROCK-Kernel-Driver/issues/174
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3476
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2362
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3781
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Link: https://lore.kernel.org/r/20241128032656.2090059-2-superm1@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: 3.2.312

DC 3.2.312 contains some improvements as summarized below:
* Fix dcn401 S3 resume sequence
* Fix dcn351 clk table
* Bug fix on IP2, reply, DP tunneling

Reviewed-by: Fangzhi Zuo <jerry.zuo@amd.com>
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>