Tao Zhou [Fri, 29 Nov 2024 08:52:41 +0000 (16:52 +0800)]
drm/amdgpu: correct the calculation of RAS bad page
After the introduction of NPS RAS, one bad page record on eeprom may be
related to 1 or 16 bad pages, so the bad page record and bad page are
two different concepts, define a new variable to store bad page number.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd: Add the capability to mark certain firmware as "required"
Some of the firmware that is loaded by amdgpu is not actually required.
For example the ISP firmware on some SoCs is optional, and if it's not
present the ISP IP block just won't be initialized.
The firmware loader core however will show a warning when this happens
like this:
```
Direct firmware load for amdgpu/isp_4_1_0.bin failed with error -2
```
To avoid confusion for non-required firmware, adjust the amd-ucode helper
to take an extra argument indicating if the firmware is required or
optional.
On optional firmware use firmware_request_nowarn() instead of
request_firmware() to avoid the warnings.
Le Ma [Wed, 7 Aug 2024 09:33:00 +0000 (17:33 +0800)]
drm/amdkfd: update the cwsr area size for gfx950
Update cwsr area size for gfx950 to fit the new user queue buffer validation.
The size of LDS calculation is referred from gfx950 thunk implementation.
Signed-off-by: Le Ma <le.ma@amd.com> Acked-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Lancelot SIX [Fri, 12 Jul 2024 22:22:29 +0000 (23:22 +0100)]
drm/amdkfd: Handle save/restore of lds allocated in 1280B blocks
The gfx-9 trap handler is reading LDS allocation size in 256 bytes
granularity (from SQ_WAVE_LDS_ALLOC), but it using the assumption that
this value is always even (i.e. the LDS allocation is really done in
multiple of 512 bytes). This was true so far, but gfx-950 allocates LDS
in chunks of 1280 bytes, making this assumption invalid. This can cause
the trap handler to try to save / restore past the end of LDS, and past
the LDS allocated slot in the save are, overriding data from the
following wave.
This patch updates the trap handler to support LDS allocated in 1280
bytes blocks:
- During restore, copy from main memory directly to LDS in batch of 1280
bytes.
- During save, continue to use 512 bytes blocks (we only have 2 VGPRs we
can use to hold data), making sure to mask the upper half of the wave
when handling when the LDS size is not a multiple of 512 bytes.
Signed-off-by: Lancelot SIX <lancelot.six@amd.com> Co-authored-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Jay Cornwall <jay.cornwall@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Lancelot SIX [Mon, 17 Jun 2024 11:10:56 +0000 (12:10 +0100)]
drm/amdkfd: Adjust CWSR trap handler for gfx950
In gfx950, the SQ_WAVE_LDS_ALLOC.LDS_SIZE field is extended to bits 12
to 22. The LDS_SIZE granularity remains unchanged (units of 64 dwords,
or 256 bytes). This patch adjusts the CWSR trap handler to read the
full extent of LDS_SIZE.
Signed-off-by: Lancelot SIX <lancelot.six@amd.com> Reviewed-by: Jay Cornwall <jay.cornwall@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Lancelot SIX [Fri, 12 Apr 2024 07:41:53 +0000 (08:41 +0100)]
drm/amdkfd: update buffer_{store,load}_* modifiers for gfx940
Instruction modifiers of the untyped vector memory buffer instructions
(MUBUF encoded) changed in gfx940. The slc, scc and glc modifiers have
been replaced with sc0, sc1 and nt.
The current CWSR trap handler is written using pre-gfx940 modifier
names, making the source incompatible with a strict gfx940 assembler.
This patch updates the cwsr_trap_handler_gfx9.s source file to be
compatible with all gfx9 variants of the ISA. The binary assembled code
is unchanged (so the behaviour is unchanged as well), only the source
representation is updated.
Signed-off-by: Lancelot SIX <lancelot.six@amd.com> Reviewed-by: Jay Cornwall <jay.cornwall@amd.com> Acked-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Sierra [Wed, 21 Feb 2024 21:02:15 +0000 (15:02 -0600)]
drm/amdkfd: add gc 9.5.0 support on kfd
Initial support for GC 9.5.0.
v2: squash in pqm_clean_queue_resource() fix from Lijo
Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Sierra [Wed, 24 Jul 2024 00:29:02 +0000 (19:29 -0500)]
drm/amd: update mtype flags for gfx 9.5.0
Update mtype flags to meet gfx 9.5.0 requirements for remote GPU
memory and system memory.
Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Harish Kasiviswanathan <harish.kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Sierra [Thu, 29 Feb 2024 20:55:58 +0000 (14:55 -0600)]
drm/amdgpu: Set proper MTYPE for GC 9.5.0
GC 9.5.0 local memory MTYPE default should be set as RW.
Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Sierra [Wed, 24 Jul 2024 00:19:42 +0000 (19:19 -0500)]
drm/amd: define gc ip version local variable
For better readability. Also leftover orphaned code.
Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Harish Kasiviswanathan <harish.kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Prike Liang [Mon, 2 Dec 2024 06:13:02 +0000 (14:13 +0800)]
drm/amdgpu: Avoid to release the FW twice in the validated error
There will to release the FW twice when the FW validated error.
Even if the release_firmware() will further validate the FW whether
is empty, but that will be redundant and inefficient.
Randy Dunlap [Thu, 28 Nov 2024 03:20:53 +0000 (19:20 -0800)]
drm/amdgpu: device: fix spellos and punctuation
Make spelling and punctuation changes to ease reading of the comments.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Xinhui Pan <Xinhui.Pan@amd.com> Cc: amd-gfx@lists.freedesktop.org Cc: David Airlie <airlied@gmail.com> Cc: Simona Vetter <simona@ffwll.ch> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ivan Stepchenko [Mon, 2 Dec 2024 08:00:43 +0000 (11:00 +0300)]
drm/amdgpu: Fix potential NULL pointer dereference in atomctrl_get_smc_sclk_range_table
The function atomctrl_get_smc_sclk_range_table() does not check the return
value of smu_atom_get_data_table(). If smu_atom_get_data_table() fails to
retrieve SMU_Info table, it returns NULL which is later dereferenced.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
In practice this should never happen as this code only gets called
on polaris chips and the vbios data table will always be present on
those chips.
Fixes: a23eefa2f461 ("drm/amd/powerplay: enable dpm for baffin.") Signed-off-by: Ivan Stepchenko <sid@itb.spb.ru> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sathishkumar S [Fri, 29 Nov 2024 04:17:47 +0000 (09:47 +0530)]
drm/amdgpu: Add amdgpu_vcn_sched_mask debugfs
Add debugfs entry to enable or disable job submission to
specific vcn instances. The entry is created only when
there is more than an instance and is unified queue type.
Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Reviewed-by: Jesse Zhang <jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd: Add Suspend/Hibernate notification callback support
As part of the suspend sequence VRAM needs to be evicted on dGPUs.
In order to make suspend/resume more reliable we moved this into
the pmops prepare() callback so that the suspend sequence would fail
but the system could remain operational under high memory usage suspend.
Another class of issues exist though where due to memory fragementation
there isn't a large enough contiguous space and swap isn't accessible.
Add support for a suspend/hibernate notification callback that could
evict VRAM before tasks are frozen. This should allow paging out to swap
if necessary.
Peterson [Thu, 21 Nov 2024 20:21:23 +0000 (15:21 -0500)]
drm/amd/display: Check that hw cursor is not required when falling back to subvp sw cursor
[WHY]
When using a sw cursor and flip immediate, the plane that is flipping
immediately will do partial updates causing tearing.
When on certain displays, subvp is expected based on
timings but should be disabled in specific use cases that are not
accounted for.
[HOW]
This was fixed by improving the timings check by using the hw cursor
required flag to cover the unaccounted use cases.
Reviewed-by: Austin Zheng <austin.zheng@amd.com> Signed-off-by: Peterson <peterson.guo@amd.com> Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ausef Yousof [Wed, 20 Nov 2024 17:38:11 +0000 (12:38 -0500)]
drm/amd/display: Populate chroma prefetch parameters, DET buffer fix
[WHY]
Soft hang/lag observed during 10bit playback + moving cursor, corruption
observed in other tickets for same reason, also failing MPO.
1. Currently, we are always running
calculate_lowest_supported_state_for_temp_read which is only
necessary on dGPU
2. Fast validate path does not apply DET buffer allocation policy
3. Prefetch UrgBFactor chroma parameter not populated in prefetch
calculation
[HOW]
1. Add a check to see if we are on APU, if so, skip the code
2. Add det buffer alloc policy checks to fast validate path
3. Populate UrgentBurstChroma param in call to calculate
UrgBChroma prefetch values
-revision commits: small formatting/brackets/null check addition + remove test change + dGPU code
Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Ausef Yousof <Ausef.Yousof@amd.com> Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Charlene Liu [Mon, 18 Nov 2024 21:48:48 +0000 (16:48 -0500)]
drm/amd/display: correct dcn351 dpm clk table based on pmfw_drv_if
[why]
driver got wrong clock table due to miss match dtm_table headers.
correct the dtn_clock table based on pmfw header.
Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Reviewed-by: Sung joon Kim <sungjoon.kim@amd.com> Signed-off-by: Charlene Liu <Charlene.Liu@amd.com> Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jinzhou Su [Thu, 28 Nov 2024 02:58:45 +0000 (10:58 +0800)]
drm/amdgpu: Add secure display v2 command
Add secure display v2 command to support multiple ROI instances per
display.
v2: fix typo and coding style issue
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Signed-off-by: Jinzhou Su <jinzhou.su@amd.com> Reviewed-by: Lang Yu <lang.yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd: Invert APU check for amdgpu_device_evict_resources()
Resource eviction isn't needed for s3 or s2idle on APUs, but should
be run for S4. As amdgpu_device_evict_resources() will be called
by prepare notifier adjust logic so that APUs only cover S4.
Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Link: https://lore.kernel.org/r/20241128032656.2090059-1-superm1@kernel.org Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
add "restore" missing variable in the fucntions
sdma_v4_4_2_page_resume and sdma_v4_4_2_inst_start.
This fixes the warning:
warning: Function parameter or struct member 'restore' not described in 'sdma_v4_4_2_page_resume'
warning: Function parameter or struct member 'restore' not described in 'sdma_v4_4_2_inst_start'
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sunil Khatri [Thu, 21 Nov 2024 17:32:11 +0000 (23:02 +0530)]
drm/amdgpu: Update the variable name to dma_buf
Instead of fixing the warning for missing variable
its better to update the variable name to match
with the style followed in the code.
This will fix the below mentioned warning:
warning: Function parameter or struct member 'dbuf' not described in 'amdgpu_bo_create_isp_user'
warning: Excess function parameter 'dma_buf' description in 'amdgpu_bo_create_isp_user'
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Tao Zhou [Thu, 31 Oct 2024 07:48:10 +0000 (15:48 +0800)]
drm/amdgpu: parse legacy RAS bad page mixed with new data in various NPS modes
All legacy RAS bad pages are generated in NPS1 mode, but new bad page
can be generated in any NPS mode, so we can't use retired_page stored
on eeprom directly in non-nps1 mode even for legacy data. We need to
take different actions for different data, new data can be identified
from old data by UMC_CHANNEL_IDX_V2 flag.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Shikang Fan [Thu, 21 Nov 2024 09:06:30 +0000 (17:06 +0800)]
drm/amdgpu: Check fence emitted count to identify bad jobs
In SRIOV, when host driver performs MODE 1 reset and notifies FLR to
guest driver, there is a small chance that there is no job running on hw
but the driver has not updated the pending list yet, causing the driver
not respond the FLR request. Modify the has_job_running function to
make sure if there is still running job.
v2: Use amdgpu_fence_count_emitted to determine job running status.
v3: Remove the timeout wait in has_job_running
Signed-off-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Shikang Fan <shikang.fan@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Aric Cyr [Mon, 18 Nov 2024 02:46:13 +0000 (21:46 -0500)]
drm/amd/display: 3.2.311
This version brings along following fixes:
- Add hblank borrowing support
- Limit VTotal range to max hw cap minus fp
- Correct prefetch calculation
- Add option to retrieve detile buffer size
- Add support for custom recout_width in SPL
- Add disable_ips_in_dpms_off flag for IPS
- Enable EASF based on luma taps only
- Add a left edge pixel if in YCbCr422 or YCbCr420 and odm
Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Aric Cyr <aric.cyr@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Samson Tam [Thu, 14 Nov 2024 08:10:05 +0000 (03:10 -0500)]
drm/amd/display: Add support for custom recout_width in SPL
[WHY]
Add support for custom recout_width for mpc combine in SPL
[HOW]
1. Rename mpc_combine_h and mpc_combine_v
2. Add flag use_recout_width_aligned to use custom recout_width
3. Create union to use either mpc_num_h_slices or mpc_recout_width_align
Reviewed-by: Navid Assadian <navid.assadian@amd.com> Signed-off-by: Samson Tam <Samson.Tam@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Add disable_ips_in_dpms_off flag for IPS
[WHY]
It's possible we still allow IPS2 when all streams are DPMS off but this
is unexpected.
[HOW]
Pass the DM config value into DC so it can use the pure stream count
to decide. We will be in 0 streams for S0i3 so this will still allow
it for D3.
Reviewed-by: Ovidiu Bunea <ovidiu.bunea@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Samson Tam [Tue, 12 Nov 2024 17:14:55 +0000 (12:14 -0500)]
drm/amd/display: Enable EASF based on luma taps only
[WHY]
EASF only applies to luma. Previously both luma and chroma taps
were checked to determine whether to enable EASF.
[HOW]
Only check if luma taps are supported before determining whether
to enable EASF or not.
Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Samson Tam <Samson.Tam@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/amdgpu/vcn: Fix kdoc entries for VCN clock/power gating functions
This commit corrects the descriptors for the
vcn_v4_0/v4_0_3/v4_0_5/v5_0_0 _set_clockgating_state and
vcn_v4_0/v4_0_3/v4_0_5/v5_0_0 _set_powergating_state functions in the
amdgpu driver.
The parameter descriptors in the comments were mismatched with the
actual function parameters. The non-existent 'handle' parameter has been
replaced with the correct 'ip_block' parameter in the comments to
accurately reflect the function signatures and to resolving the below
with gcc W=1:
drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c:1232: warning: Function parameter or struct member 'ip_block' not described in 'vcn_v5_0_0_set_clockgating_state'
drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c:1232: warning: Excess function parameter 'handle' description in 'vcn_v5_0_0_set_clockgating_state'
drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c:1263: warning: Function parameter or struct member 'ip_block' not described in 'vcn_v5_0_0_set_powergating_state'
drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c:1263: warning: Excess function parameter 'handle' description in 'vcn_v5_0_0_set_powergating_state'
drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c:2012: warning: Function parameter or struct member 'ip_block' not described in 'vcn_v4_0_set_clockgating_state'
drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c:2012: warning: Excess function parameter 'handle' description in 'vcn_v4_0_set_clockgating_state'
drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c:2043: warning: Function parameter or struct member 'ip_block' not described in 'vcn_v4_0_set_powergating_state'
drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c:2043: warning: Excess function parameter 'handle' description in 'vcn_v4_0_set_powergating_state'
drivers/gpu/drm/amd/amdgpu/vcn_v4_0_5.c:1505: warning: Function parameter or struct member 'ip_block' not described in 'vcn_v4_0_5_set_clockgating_state'
drivers/gpu/drm/amd/amdgpu/vcn_v4_0_5.c:1505: warning: Excess function parameter 'handle' description in 'vcn_v4_0_5_set_clockgating_state'
drivers/gpu/drm/amd/amdgpu/vcn_v4_0_5.c:1536: warning: Function parameter or struct member 'ip_block' not described in 'vcn_v4_0_5_set_powergating_state'
drivers/gpu/drm/amd/amdgpu/vcn_v4_0_5.c:1536: warning: Excess function parameter 'handle' description in 'vcn_v4_0_5_set_powergating_state'
drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c:1629: warning: Function parameter or struct member 'ip_block' not described in 'vcn_v4_0_3_set_powergating_state'
drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c:1629: warning: Excess function parameter 'handle' description in 'vcn_v4_0_3_set_powergating_state'
Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/amdgpu: Add missing kdoc 'inst' parameter in 'smu_dpm_set_power_gate' function
This commit adds the missing kdoc parameter descriptor for 'inst' in the
smu_dpm_set_power_gate function.
The 'inst' parameter, which specifies the instance of the IP block to
power gate/ungate.
Fixes the below with gcc W=1:
drivers/gpu/drm/amd/amdgpu/../pm/swsmu/amdgpu_smu.c:359: warning: Function parameter or struct member 'inst' not described in 'smu_dpm_set_power_gate'
Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Boyuan Zhang [Mon, 7 Oct 2024 17:35:33 +0000 (13:35 -0400)]
drm/amdgpu: move per inst variables to amdgpu_vcn_inst
Move all per instance variables from amdgpu_vcn to amdgpu_vcn_inst.
Move adev->vcn.fw[i] from amdgpu_vcn to amdgpu_vcn_inst.
Move adev->vcn.vcn_config[i] from amdgpu_vcn to amdgpu_vcn_inst.
Move adev->vcn.vcn_codec_disable_mask[i] from amdgpu_vcn to amdgpu_vcn_inst.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amdgpu: pass ip_block in set_powergating_state
Pass ip_block instead of adev in set_powergating_state callback function.
Modify set_powergating_state ip functions for all correspoding ip blocks.
v2: fix a ip block index error.
v3: remove type casting
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Boyuan Zhang [Thu, 3 Oct 2024 18:47:29 +0000 (14:47 -0400)]
drm/amdgpu: add inst to amdgpu_dpm_enable_vcn
Add an instance parameter to amdgpu_dpm_enable_vcn() function, and change
all calls from vcn ip functions to add instance argument. vcn generations
with only one instance (v1.0, v2.0) always use 0 as instance number. vcn
generations with multiple instances (v2.5, v3.0, v4.0, v4.0.3, v4.0.5,
v5.0.0) use the actual instance number.
v2: remove for-loop in amdgpu_dpm_enable_vcn(), and temporarily move it
to vcn ip with multiple instances, in order to keep the exact same logic
as before, until further separation in next patch.
v3: fix missing prefix
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Boyuan Zhang [Thu, 3 Oct 2024 03:52:01 +0000 (23:52 -0400)]
drm/amd/pm: add inst to dpm_set_powergating_by_smu
Add an instance parameter to amdgpu_dpm_set_powergating_by_smu() function,
and use the instance to call set_powergating_by_smu().
v2: remove duplicated functions.
remove for-loop in amdgpu_dpm_set_powergating_by_smu(), and temporarily
move it to amdgpu_dpm_enable_vcn(), in order to keep the exact same logic
as before, until further separation in next patch.
v3: drop SI logic in amdgpu_dpm_enable_vcn().
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Boyuan Zhang [Thu, 3 Oct 2024 03:25:45 +0000 (23:25 -0400)]
drm/amd/pm: add inst to set_powergating_by_smu
Add an instance parameter to set_powergating_by_smu() function, and
re-write all amd_pm functions accordingly. Then use the instance to
call smu_dpm_set_vcn_enable().
v2: remove duplicated functions.
remove for-loop in smu_dpm_set_power_gate(), and temporarily move it to
to amdgpu_dpm_set_powergating_by_smu(), in order to keep the exact same
logic as before, until further separation in next patch.
v3: add instance number in error message.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Boyuan Zhang [Thu, 3 Oct 2024 01:59:33 +0000 (21:59 -0400)]
drm/amd/pm: add inst to smu_dpm_set_vcn_enable
First, add an instance parameter to smu_dpm_set_vcn_enable() function,
and calling dpm_set_vcn_enable() with this given instance.
Second, modify vcn_gated to be an array, to track the gating status
for each vcn instance separately.
With these 2 changes, smu_dpm_set_vcn_enable() will check and set the
gating status for the given vcn instance ONLY.
v2: remove duplicated functions.
remove for-loop in dpm_set_vcn_enable(), and temporarily move it to
to smu_dpm_set_power_gate(), in order to keep the exact same logic as
before, until further separation in next patch.
v3: add instance number in error message.
v4: declaring i at the top of the function.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For smu ip with multiple vcn instances (smu 11/13/14), remove all the
for loop in dpm_set_vcn_enable() functions. And use the instance
argument to power up/down vcn for the given instance only, instead
of powering up/down for all vcn instances.
v2: remove all duplicated functions in v1.
remove for-loop from each ip, and temporarily move to dpm_set_vcn_enable,
in order to keep the exact same logic as before, until further separation
in the next patch.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Tao Zhou [Wed, 30 Oct 2024 08:42:42 +0000 (16:42 +0800)]
drm/amdgpu: add interface to get die id from memory address
And implement it for UMC v12_0. The die id is calculated from IPID
register in bad page retirement flow, but we don't store it on eeprom
and it can be also gotten from physical address.
v2: get PA_C4 and PA_R13 from MCA address since they may be cleared in
retired page.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Tao Zhou [Thu, 24 Oct 2024 07:34:27 +0000 (15:34 +0800)]
drm/amdgpu: support to find RAS bad pages via old TA
Old version of RAS TA doesn't support to convert MCA address stored on
eeprom to physical address (PA), support to find all bad pages in one
memory row by PA with old RAS TA. This approach is only suitable for
nps1 mode.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Lijo Lazar [Thu, 24 Oct 2024 05:31:57 +0000 (11:01 +0530)]
drm/amdgpu: Prefer RAS recovery for scheduler hang
Before scheduling a recovery due to scheduler/job hang, check if a RAS
error is detected. If so, choose RAS recovery to handle the situation. A
scheduler/job hang could be the side effect of a RAS error. In such
cases, it is required to go through the RAS error recovery process. A
RAS error recovery process in certains cases also could avoid a full
device device reset.
An error state is maintained in RAS context to detect the block
affected. Fatal Error state uses unused block id. Set the block id when
error is detected. If the interrupt handler detected a poison error,
it's not required to look for a fatal error. Skip fatal error checking
in such cases.
Tao Zhou [Fri, 18 Oct 2024 06:49:00 +0000 (14:49 +0800)]
drm/amdgpu: do RAS MCA2PA conversion in device init phase
NPS mode is introduced, the value of memory physical address (PA)
related to a MCA address varies per nps mode. We need to rely on
MCA address and convert it into PA accroding to nps mode.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Tao Zhou [Fri, 18 Oct 2024 06:43:04 +0000 (14:43 +0800)]
drm/amdgpu: add TA_RAS_INV_NODE value
We can set UMC node instance to invalid state if we use global channel
index, and RAS TA can choose UMC address conversion approach by checking
node_inst value.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Prike Liang [Fri, 15 Nov 2024 08:04:48 +0000 (16:04 +0800)]
drm/amdgpu: reduce the mmio writes in kiq setting
There's no need to perform the two MMIO writes in the KIQ
Setting registers programmed period, and reducing the MMIO
writes will save the driver loading time.
Signed-off-by: Prike Liang <Prike.Liang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Pratap Nirujogi [Tue, 19 Nov 2024 23:03:15 +0000 (18:03 -0500)]
drm/amd/amdgpu: Add support for isp buffers
Add support to create user BOs with MC address for isp using the dma-buf handle
exported for the buffers allocated from system memory in isp driver.
Export amdgpu_bo_create_kernel() and amdgpu_bo_free_kernel() as well for isp to
allocate GTT internal buffers required for fw to run.
Signed-off-by: Pratap Nirujogi <pratap.nirujogi@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When using MES creating a pdd will require talking to the GPU to
setup the relevant context. The code here forgot to wake up the GPU
in case it was in suspend, this causes KVM to EFAULT for passthrough
GPU for example. This issue can be masked if the GPU was woken up by
other things (e.g. opening the KMS node) first and have not yet gone to sleep.
v4: do the allocation of proc_ctx_bo in a lazy fashion
when the first queue is created in a process (Felix)
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Yunxiang Li <Yunxiang.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
Christian König [Fri, 6 Dec 2024 13:46:06 +0000 (14:46 +0100)]
drm/amdgpu: fix when the cleaner shader is emitted
Emitting the cleaner shader must come after the check if a VM switch is
necessary or not.
Otherwise we will emit the cleaner shader every time and not just when it is
necessary because we switched between applications.
This can otherwise crash on gang submit and probably decreases performance
quite a bit.
v2: squash in fix from Srini (Alex)
Signed-off-by: Christian König <christian.koenig@amd.com> Fixes: ee7a846ea27b ("drm/amdgpu: Emit cleaner shader at end of IB submission") Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
Pratap Nirujogi [Thu, 5 Dec 2024 16:27:36 +0000 (11:27 -0500)]
drm/amdgpu: Fix ISP HW init issue
ISP hw_init is not called with the recent changes related
to hw init levels. AMDGPU_INIT_LEVEL_DEFAULT is ignoring
the ISP IP block as AMDGPU_IP_BLK_MASK_ALL is derived using
incorrect max number of IP blocks.
Update AMDGPU_IP_BLK_MASK_ALL to use AMD_IP_BLOCK_TYPE_NUM
instead of AMDGPU_MAX_IP_NUM to fix the issue.
Andrew Martin [Tue, 26 Nov 2024 17:10:59 +0000 (12:10 -0500)]
drm/amdkfd: Dereference null return value
In the function pqm_uninit there is a call-assignment of "pdd =
kfd_get_process_device_data" which could be null, and this value was
later dereferenced without checking.
Fixes: fb91065851cd ("drm/amdkfd: Refactor queue wptr_bo GART mapping") Signed-off-by: Andrew Martin <Andrew.Martin@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
Prike Liang [Tue, 5 Nov 2024 01:57:42 +0000 (09:57 +0800)]
drm/amdkfd: Correct the migration DMA map direction
The SVM DMA device map direction should be set the same as
the DMA unmap setting, otherwise the DMA core will report
the following warning.
Before finialize this solution, there're some discussion on
the DMA mapping type(stream-based or coherent) in this KFD
migration case, followed by https://lore.kernel.org/all/04d4ab32
-45a1-4b88-86ee-fb0f35a0ca40@amd.com/T/.
As there's no dma_sync_single_for_*() in the DMA buffer accessed
that because this migration operation should be sync properly and
automatically. Give that there's might not be a performance problem
in various cache sync policy of DMA sync. Therefore, in order to
simplify the DMA direction setting alignment, let's set the DMA map
direction as BIDIRECTIONAL.
Kenneth Feng [Wed, 4 Dec 2024 07:52:10 +0000 (13:22 +0530)]
drm/amd/pm: Set SMU v13.0.7 default workload type
Set the default workload type to bootup type on smu v13.0.7.
This is because of the constraint on smu v13.0.7.
Gfx activity has an even higher set point on 3D fullscreen
mode than the one on bootup mode. This causes the 3D fullscreen
mode's performance is worse than the bootup mode's performance
for the lightweighted/medium workload. For the high workload,
the performance is the same between 3D fullscreen mode and bootup
mode.
v2: set the default workload in ASIC specific file
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.11.x
Lijo Lazar [Wed, 4 Dec 2024 07:41:28 +0000 (13:11 +0530)]
drm/amd/pm: Initialize power profile mode
Refactor such that individual SMU IP versions can choose the startup
power profile mode. If no preference, then use the generic default power
profile selection logic.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.11.x
base.sched may not be set for each instance and should not
be used for cases such as non-IB tests.
Fixes: 2320c9e6a768 ("drm/sched: memset() 'job' in drm_sched_job_init()") Signed-off-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Victor Zhao [Sat, 23 Nov 2024 10:37:43 +0000 (18:37 +0800)]
drm/amdgpu: use sjt mec fw on gfx943 for sriov
Use second jump table in sriov for live migration or mulitple VF
support so different VF can load different version of MEC as long
as they support sjt
Signed-off-by: Victor Zhao <Victor.Zhao@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Mon, 25 Nov 2024 18:59:09 +0000 (13:59 -0500)]
drm/amdgpu: rework resume handling for display (v2)
Split resume into a 3rd step to handle displays when DCC is
enabled on DCN 4.0.1. Move display after the buffer funcs
have been re-enabled so that the GPU will do the move and
properly set the DCC metadata for DCN.
v2: fix fence irq resume ordering
Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.11.x
Alex Deucher [Sat, 16 Nov 2024 13:20:59 +0000 (08:20 -0500)]
drm/amd/pm: fix and simplify workload handling
smu->workload_mask is IP specific and should not be messed with in
the common code. The mask bits vary across SMU versions.
Move all handling of smu->workload_mask in to the backends and
simplify the code. Store the user's preference in smu->power_profile_mode
which will be reflected in sysfs. For internal driver profile
switches for KFD or VCN, just update the workload mask so that the
user's preference is retained. Remove all of the extra now unused
workload related elements in the smu structure.
v2: use refcounts for workload profiles
v3: rework based on feedback from Lijo
v4: fix the refcount on failure, drop backend mask
v5: rework custom handling
v6: handle failure cleanup with custom profile
v7: Update documentation
Yiqing Yao [Tue, 26 Nov 2024 10:36:11 +0000 (18:36 +0800)]
drm/amdgpu: fix sriov reinit late orders
Use found block to call correct init/resume function on the block.
Set status.hw for resume and init.
Print re-init result again. Change to use dev_info.
Use amdgpu_device_ip_get_ip_block to get target block instead of
loop.
Fixes: 502d76308d45 ("drm/amdgpu: validate resume before function call") Signed-off-by: Yiqing Yao <YiQing.Yao@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Pratap Nirujogi [Fri, 29 Nov 2024 19:52:08 +0000 (14:52 -0500)]
drm/amdgpu: Fix ISP hw init issue
ISP hw_init is not called with the recent changes related
to hw init levels. AMDGPU_INIT_LEVEL_DEFAULT is ignoring
the ISP IP block as AMDGPU_IP_BLK_MASK_ALL is derived using
incorrect max number of IP blocks.
Update AMDGPU_IP_BLK_MASK_ALL to use AMDGPU_MAX_IP_NUM
instead of (AMDGPU_MAX_IP_NUM - 1) to fix the issue.
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Fixes: 14f2fe34f5c6 ("drm/amdgpu: Add init levels") Signed-off-by: Pratap Nirujogi <pratap.nirujogi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Chris Park [Fri, 15 Nov 2024 20:44:52 +0000 (15:44 -0500)]
drm/amd/display: Add hblank borrowing support
[WHY]
Some DSC timing failed at bandwidth validation due to hactive
can't be evenly divided on each ODM segment.
[HOW]
Borrow from hblank to increase hactive to support these timing.
Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Wenjing Liu <wenjing.liu@amd.com> Signed-off-by: Chris Park <chris.park@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dillon Varone [Wed, 13 Nov 2024 21:44:15 +0000 (16:44 -0500)]
drm/amd/display: Limit VTotal range to max hw cap minus fp
[WHY & HOW]
Hardware does not support the VTotal to be between fp2 lines of the
maximum possible VTotal, so add a capability flag to track it and apply
where necessary.
Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Jun Lei <jun.lei@amd.com> Reviewed-by: Anthony Koo <anthony.koo@amd.com> Signed-off-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Lo-an Chen [Thu, 14 Nov 2024 09:53:41 +0000 (17:53 +0800)]
drm/amd/display: Correct prefetch calculation
[WHY]
The minimum value of the dst_y_prefetch_equ was not correct
in prefetch calculation whice causes OPTC underflow.
[HOW]
Add the min operation of dst_y_prefetch_equ in prefetch calculation.
Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Lo-an Chen <lo-an.chen@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>