Ce Sun [Tue, 12 May 2026 01:53:21 +0000 (09:53 +0800)]
drm/amdgpu: Fix memory leak of i2s_pdata in ACP initialization
Currently, the i2s_pdata structure is dynamically allocated in
acp_hw_init() but is freed neither in the error handling path nor in
the acp_hw_fini() cleanup path, causing a permanent memory leak.
Signed-off-by: Ce Sun <cesun102@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/radeon/evergreen_cs: Add missing NULL prefix check in surface check
'evergreen_surface_check' is called with a NULL warning prefix when
handling potentially recoverable issues or just to compute the alignment
requirements, and it is called again in case of failure (with the correct
prefix, as opposed to NULL). The initial check must therefore not print a
warning, because the surface may be accepted successfully after having
been corrected. If it isn't, the final check will print the warning
anyway. The surface check functions specific to array modes already
implement this behavior, but the 'evergreen_surface_check' function
itself doesn't.
This is also supposed to fix the "'%s' directive argument is null
[-Werror=format-overflow=]" compiler warning.
Fixes: 285484e2d55e ("drm/radeon: add support for evergreen/ni tiling informations v11") Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Vitaliy Triang3l Kuzmin <ml@triang3l.ru> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Xiang Liu [Mon, 11 May 2026 08:45:08 +0000 (16:45 +0800)]
drm/amd/ras: Fix SMU EEPROM record field decoding
The SMU EEPROM read paths pass byte-sized record field addresses
to mca_ipid_parse(), whose outputs are u32 pointers.
Writing through those widened pointers can clobber adjacent fields
and bytes beyond the record storage.
Parse the IPID values into local u32 temporaries instead, then
explicitly narrow the values when storing them in the EEPROM record.
Signed-off-by: Xiang Liu <xiang.liu@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
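A minimal user-space sketch of the pattern described in the commit above: parse into u32 temporaries, then narrow explicitly into the byte-sized record fields. The struct layout, the field names and the parser below are hypothetical stand-ins, not the driver's actual definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical record layout: byte-sized fields stored back to back. */
struct eeprom_record_sketch {
	uint8_t channel;
	uint8_t unit;
	uint8_t neighbour;	/* must not be clobbered by the parser */
};

/* Stand-in for a parser whose outputs are 32 bits wide. */
static void ipid_parse_sketch(uint64_t ipid, uint32_t *channel, uint32_t *unit)
{
	*channel = (uint32_t)(ipid & 0xf);
	*unit = (uint32_t)((ipid >> 4) & 0xf);
}

static void fill_record(struct eeprom_record_sketch *rec, uint64_t ipid)
{
	uint32_t channel = 0, unit = 0;

	/* Parse into correctly sized temporaries first ... */
	ipid_parse_sketch(ipid, &channel, &unit);

	/* ... then narrow explicitly when storing into the byte fields. */
	rec->channel = (uint8_t)channel;
	rec->unit = (uint8_t)unit;
}
```

Passing &rec->channel cast to a u32 pointer instead would let the parser write four bytes through a one-byte field, which is the clobbering the commit describes.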
Xiang Liu [Mon, 11 May 2026 07:48:55 +0000 (15:48 +0800)]
drm/amd/ras: reset CPER ring on corrupt entry size
When CPER ring overflow handling advances the read pointer, it trusts the
parsed entry size from the current ring contents. Corrupt CPER data can
produce an entry size that does not advance rptr after dword conversion
and pointer masking.
In that case the recovery loop keeps testing the same location while
holding the CPER ring mutex. This can hang the worker that is writing the
next CPER record.
Detect a no-progress rptr update and reset the CPER ring to an empty
state instead. This drops the corrupt contents and lets the writer leave
the recovery path without spinning.
Signed-off-by: Xiang Liu <xiang.liu@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
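The no-progress detection in the commit above could look roughly like the following user-space sketch. The ring structure and the function name are assumptions made for illustration; the real CPER ring code is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical ring state, with pointers counted in dwords. */
struct ring_sketch {
	uint32_t rptr;		/* read pointer */
	uint32_t wptr;		/* write pointer */
	uint32_t ptr_mask;	/* ring size in dwords minus one */
};

/*
 * Advance rptr past one entry. If the parsed size does not move rptr
 * (size rounds down to zero dwords, or wraps exactly once around the
 * ring), reset the ring to empty and return false.
 */
static bool ring_skip_entry(struct ring_sketch *ring, uint32_t entry_size_bytes)
{
	uint32_t old_rptr = ring->rptr;
	uint32_t new_rptr = (old_rptr + (entry_size_bytes >> 2)) & ring->ptr_mask;

	if (new_rptr == old_rptr) {
		/* Corrupt entry size: drop the contents instead of spinning. */
		ring->rptr = 0;
		ring->wptr = 0;
		return false;
	}

	ring->rptr = new_rptr;
	return true;
}
```

A caller looping on this helper terminates either because rptr moved or because the ring was emptied, so it cannot keep testing the same location while holding the mutex.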
Sunil Khatri [Wed, 13 May 2026 07:59:35 +0000 (13:29 +0530)]
drm/amdgpu: userq_va_mapped should remain true once done
Multiple queues need these bo_va objects belonging to
the same uq_mgr. So once they are mapped, let's not unmap
them, as at any point in time any of the queues might be
using them.
Also, userq_va_mapped should be a boolean rather than an atomic.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ce Sun [Mon, 11 May 2026 10:04:57 +0000 (18:04 +0800)]
drm/amdgpu: avoid integer overflow in VA range check
The addition in the VA range check is done in a 64-bit unsigned type
and can overflow. Use check_add_overflow() to detect the wrap and
safely reject invalid inputs.
Signed-off-by: Ce Sun <cesun102@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
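A user-space sketch of the overflow-safe range check from the commit above. The kernel's check_add_overflow() is not available outside the kernel, so this uses the GCC/Clang builtin __builtin_add_overflow() as a stand-in; the function name and the limit parameter are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true only if [start, start + size) fits below limit. */
static bool va_range_valid(uint64_t start, uint64_t size, uint64_t limit)
{
	uint64_t end;

	/* The builtin returns true when start + size wraps around. */
	if (__builtin_add_overflow(start, size, &end))
		return false;

	return end <= limit;
}
```

With a plain addition, a start close to UINT64_MAX would wrap to a small end value and slip past the comparison against the limit.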
amdgpu_umc_handle_bad_pages() allocates err_data->err_addr before
querying UMC error information. In the direct and firmware query paths,
the pointer is reassigned to a fresh allocation before the original
buffer is released, so the initial allocation is leaked on each handled
event.
Free the existing buffer before replacing it in those query paths so the
function exit cleanup only owns the active allocation.
Signed-off-by: Xiang Liu <xiang.liu@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Yifan Zhang [Mon, 11 May 2026 14:14:23 +0000 (22:14 +0800)]
drm/amdgpu: unmap all user mappings of framebuffer and doorbell before mode1 reset
During Mode 1 reset, the ASIC undergoes a reset cycle and becomes temporarily
inaccessible via PCIe. Any attempt to access framebuffer or MMIO registers during
this window can result in uncompleted PCIe transactions, leading to NMI panics or
system hangs.
To prevent this, unmap all application mappings of the framebuffer
and doorbell BARs before mode1 reset. Also prevent new mappings from coming in
during the reset process.
v2: remove inode in kfd_dev (Christian)
v3: correct unmap offset (Felix), remove prevent new mappings part
to avoid deadlock (Christian)
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Taimur Hassan [Sat, 9 May 2026 20:51:05 +0000 (15:51 -0500)]
drm/amd/display: Promote DC to 3.2.383
This version brings along the following updates:
- Add amdgpu_dm KUnit test for:
* CRC function
* HDCP process_output
* colorop TF bitmasks
* color helpers
* PSR and Replay functions
* ISM functions
- Fix eDP receiver ready status check in T7 sequence
- Enable dcn42 pstate pmo
- Refactor PSR, Replay and ABM functionality into dedicated power modules
- Fix assertion due to disable/enable CM blocks
- Enable additional wait for pipe pending checks
- Fix ISM dc_lock deadlock during suspend
- Use lockdep_assert_held() for dc_lock check
- Fix clear PSR config flow
- Exclude the MST overhead from BW deallocation
- Allow power up even w/ powergating disabled on DCN42
- Fix integer overflow in bios_get_image()
- Validate GPIO pin LUT table size before iterating
- Add Auxless-ALPM support in VESA Panel Replay
- Add debug option for replay ESD recovery
- Validate payload length and link_index in dc_process_dmub_aux_transfer_async
- Add ADDR3 swizzle modes
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Wenxian Wang [Sat, 9 May 2026 02:46:23 +0000 (10:46 +0800)]
drm/amd/display: Add ADDR3 swizzle modes
[Why]
New swizzle modes are needed for ADDR3 block support.
[How]
Add DC_ADDR3_SW_64KB_2D_Z and DC_ADDR3_SW_256KB_2D_Z enum
values to dc_hw_types.h.
Reviewed-by: Ilya Bakoulin <ilya.bakoulin@amd.com> Signed-off-by: Wenxian Wang <wenxian.wang@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Harry Wentland [Thu, 7 May 2026 20:26:31 +0000 (16:26 -0400)]
drm/amd/display: Validate payload length and link_index in dc_process_dmub_aux_transfer_async
[Why&How]
dc_process_dmub_aux_transfer_async() copies payload->length bytes into a
16-byte stack buffer (dpaux.data[16]) guarded only by an ASSERT(), which
is a no-op in release builds. If a caller ever passes length > 16 this
results in a stack buffer overflow via memcpy.
Additionally, link_index is used to dereference dc->links[] without
bounds checking against dc->link_count, risking an out-of-bounds access.
Replace the ASSERT with a hard runtime check that returns false when
payload->length exceeds the destination buffer size, and add a bounds
check for link_index before it is used.
Assisted-by: GitHub Copilot:Claude claude-4-opus Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
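The two runtime checks described in the commit above can be sketched in user space as follows. The structures, the function name and the way link_count is passed in are hypothetical; only the 16-byte destination size comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define AUX_DATA_SIZE_SKETCH 16

/* Hypothetical stand-ins for the payload and the command buffer. */
struct aux_payload_sketch {
	const uint8_t *data;
	uint32_t length;
};

struct aux_cmd_sketch {
	uint8_t data[AUX_DATA_SIZE_SKETCH];
	uint32_t length;
};

static bool build_aux_cmd(struct aux_cmd_sketch *cmd,
			  const struct aux_payload_sketch *payload,
			  uint32_t link_index, uint32_t link_count)
{
	/* Bounds-check the index before it is used to pick a link. */
	if (link_index >= link_count)
		return false;

	/* Hard runtime check; an ASSERT() would vanish in release builds. */
	if (payload->length > sizeof(cmd->data))
		return false;

	memcpy(cmd->data, payload->data, payload->length);
	cmd->length = payload->length;
	return true;
}
```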
Wei-Guang Li [Wed, 6 May 2026 12:32:33 +0000 (20:32 +0800)]
drm/amd/display: Add debug option for replay ESD recovery
[Why&How]
Add a new debug option "enable_replay_esd_recovery" to control whether
to enable the replay ESD recovery feature.
Reviewed-by: Robin Chen <robin.chen@amd.com> Reviewed-by: Wenjing Liu <wenjing.liu@amd.com> Signed-off-by: Wei-Guang Li <wei-guang.li@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Leon Huang [Thu, 30 Apr 2026 06:53:21 +0000 (14:53 +0800)]
drm/amd/display: Add Auxless-ALPM support in VESA Panel Replay
[How]
Add Auxless-ALPM data in VESA PR initialization
Reviewed-by: Wenjing Liu <wenjing.liu@amd.com> Signed-off-by: Leon Huang <Leon.Huang1@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Harry Wentland [Mon, 4 May 2026 20:14:11 +0000 (16:14 -0400)]
drm/amd/display: Validate GPIO pin LUT table size before iterating
[Why&How]
The GPIO pin table parsers in get_gpio_i2c_info() and
bios_parser_get_gpio_pin_info() derive an element count from the VBIOS
table_header.structuresize field, then iterate over gpio_pin[] entries.
However, GET_IMAGE() only validates that the table header itself fits
within the BIOS image. If the VBIOS reports a structuresize larger than
the actual mapped data, the loop reads past the end of the BIOS image,
causing an out-of-bounds read.
Fix this by calling bios_get_image() to validate that the full claimed
structuresize is accessible within the BIOS image before entering the
loop in both functions.
Assisted-by: GitHub Copilot:claude-opus-4-6 Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Harry Wentland [Mon, 4 May 2026 15:14:45 +0000 (11:14 -0400)]
drm/amd/display: Fix integer overflow in bios_get_image()
[Why&How]
The bounds check in bios_get_image() computes 'offset + size' using
unsigned 32-bit arithmetic before comparing against bios_size. If a
VBIOS image contains a near-UINT32_MAX offset the addition wraps to a
small value, the comparison passes, and the function returns a wild
pointer past the VBIOS mapping.
Additionally, the comparison uses '<' (strict), which incorrectly
rejects the valid exact-fit case where offset + size == bios_size.
Fix both issues by restructuring the check to avoid the addition
entirely: first reject if offset alone exceeds bios_size, then check
size against the remaining space (bios_size - offset). This eliminates
the overflow and correctly permits exact-fit accesses.
Assisted-by: GitHub Copilot:claude-opus-4.6 Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
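The restructured bounds check from the commit above, as a user-space sketch. The function name and signature are assumptions; the logic follows the description (reject an offset beyond the image, then compare size against the remaining space, with no addition).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns a pointer into the image, or NULL if the access does not fit. */
static const uint8_t *image_get(const uint8_t *bios, uint32_t bios_size,
				uint32_t offset, uint32_t size)
{
	/* Reject offsets past the end; this also keeps the subtraction safe. */
	if (offset > bios_size)
		return NULL;

	/* Compare against the remaining space instead of computing offset + size. */
	if (size > bios_size - offset)
		return NULL;

	return bios + offset;
}
```

Because no sum is formed, a near-UINT32_MAX offset cannot wrap, and the exact-fit case where offset + size equals bios_size is accepted.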
drm/amd/display: Refactor Replay functionality into dedicated power_replay module
[Why]
Extract all Replay related functions from power.c and
power_helpers.c into a new power_replay.c module for
better code organization and maintainability.
[How]
Create new power_replay.c file containing
Replay-related functions moved from power.c
and power_helpers.c. Update mod_power.h with
function declarations. Maintain forward
declaration for type compatibility.
Reviewed-by: Robin Chen <robin.chen@amd.com> Signed-off-by: Lohita Mudimela <lohita.mudimela@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Charlene Liu [Mon, 27 Apr 2026 23:09:02 +0000 (19:09 -0400)]
drm/amd/display: Allow power up when PG disallowed in driver
[Why]
Do not exit the dcn42 PG control functions early on power up, as a
pipe PG failsafe.
Reviewed-by: Leo Chen <leo.chen@amd.com> Signed-off-by: Charlene Liu <Charlene.Liu@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ray Wu [Mon, 4 May 2026 06:32:13 +0000 (14:32 +0800)]
drm/amd/display: Use lockdep_assert_held() for dc_lock check
[Why]
mutex_is_locked() only tells whether *some* task holds the mutex, not
the current one, so the existing ASSERT can silently pass when the
caller violates the contract.
[How]
Use the kernel's lockdep debugging utility (include/linux/lockdep.h)
and replace ASSERT(mutex_is_locked(&dm->dc_lock)) with
lockdep_assert_held(&dm->dc_lock), which checks the current task's
held-lock stack.
Reviewed-by: Sun peng (Leo) Li <sunpeng.li@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ray Wu [Thu, 30 Apr 2026 02:08:16 +0000 (10:08 +0800)]
drm/amd/display: Fix ISM dc_lock deadlock during suspend
[Why]
System hang observed during suspend/resume while video is playing.
amdgpu_dm_ism_disable() is called under dc_lock and waits for ISM
delayed work via disable_delayed_work_sync(). The work handlers
themselves take dc_lock, producing an ABBA deadlock when a worker is
in flight at suspend time.
[How]
Split the disable path into two phases with opposite locking
contracts:
1. amdgpu_dm_ism_disable() -- quiesces workers, must NOT hold
dc_lock.
2. amdgpu_dm_ism_force_full_power() (new) -- drives the ISM FSM
back to FULL_POWER_RUNNING, must hold dc_lock.
Reviewed-by: Sun peng (Leo) Li <sunpeng.li@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Aric Cyr [Tue, 5 May 2026 20:49:47 +0000 (16:49 -0400)]
drm/amd/display: Enable additional wait for pipe pending checks
[why]
In cases where there are two FULL updates within the same display frame,
it's possible for some blocks to be programmed a second time without having
been latched completely from the first programming.
DCN 3.5 and up already work around this with additional validation checks
for frame count and defer as needed via fsleep.
[how]
Enabled existing pipe checks generically for all DCN versions to avoid HW
programming hazards.
Also removed redundant max_frame_count which can be determined by the
register mask and shift.
Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Aric Cyr <Aric.Cyr@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Aric Cyr [Tue, 5 May 2026 20:27:29 +0000 (16:27 -0400)]
drm/amd/display: Fix assertion due to disable/enable CM blocks
[why]
Some dc state transitions can result in CM blocks being disabled, then
re-enabled. The disable will set a defer bit, but re-enable will not
clear it. When optimizing later, an assert will be hit due to incorrect
expected HW state.
[how]
Clear defer bits if the block is re-enabled before optimization is
executed.
Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Aric Cyr <Aric.Cyr@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Refactor PSR functionality into dedicated power_psr module
[Why]
Extract all PSR (Panel Self Refresh) related functions from power.c
into a new power_psr.c module for better code organization and
maintainability.
[How]
Create new power_psr.c file containing all PSR-related functions
moved from power.c. Remove static qualifier from shared functions
to enable cross-file access:
- psr_context_to_mod_power_psr_context: Convert PSR context to
module power PSR context
- map_index_from_stream: Map stream to power entity index
- delay_two_frames: Wait for two frame periods
Add function declarations to header. Maintain forward declaration of struct
core_power for type compatibility.
Reviewed-by: Anthony Koo <anthony.koo@amd.com> Signed-off-by: Lohita Mudimela <lohita.mudimela@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sung-huai Wang [Tue, 21 Apr 2026 04:53:56 +0000 (12:53 +0800)]
drm/amd/display: Fix eDP receiver ready status check in T7 sequence
[Why]
Some eDP panels return sinkstatus as 0x5, causing the original sinkstatus == 1
check to never match and resulting in unnecessary polling delay. The
equality check is too restrictive and doesn't properly validate the
specific status bit that indicates receiver readiness.
[How]
Replace direct value comparison with proper bitmask check using
DP_RECEIVE_PORT_0_STATUS constant.
Reviewed-by: Wenjing Liu <wenjing.liu@amd.com> Signed-off-by: Sung-huai Wang <Danny.Wang@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
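The difference between the old equality check and a bitmask check, for the sink status value 0x5 mentioned in the commit above, can be shown with a small sketch. The macro below is a local stand-in that assumes the receiver-ready bit is bit 0; it is not the DP_RECEIVE_PORT_0_STATUS definition itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption for illustration: receiver-ready is reported in bit 0. */
#define RECEIVE_PORT_0_STATUS_SKETCH (1u << 0)

/* Old style: only matches when no other status bit is set. */
static bool sink_ready_equality(uint8_t sinkstatus)
{
	return sinkstatus == 1;
}

/* New style: tests only the bit that indicates receiver readiness. */
static bool sink_ready_bitmask(uint8_t sinkstatus)
{
	return (sinkstatus & RECEIVE_PORT_0_STATUS_SKETCH) != 0;
}
```

A panel reporting 0x5 fails the equality test on every poll but passes the bitmask test, which avoids the unnecessary polling delay.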
1. Auto-generated Header: The file 'dmub_cmd.h' is an auto-generated header
managed in an external repository (dmu_stg). Manual changes made directly in
this repository will be overwritten and lost during the next automated weekly
synchronization.
2. Tooling Compatibility: This header is governed by internal AMD firmware
standards which require Doxygen formatting for cross-team documentation.
Moving to kernel-doc syntax may break internal documentation pipelines.
3. Suppressing Warnings: Current 'make htmldocs' and 'make W=1' builds
do not actively scan 'dmub_cmd.h' for kernel-doc compliance, thus no warnings
are triggered during standard compilation. To address warnings generated when
manually running './scripts/kernel-doc', we have added a notice at the file
header indicating that this is an auto-generated file that does not strictly
follow kernel-doc formatting. This ensures that any future linting tools or
manual checks recognize the formatting as intentional.
Acked-by: Harry Wentland <harry.wentland@amd.com> Acked-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: James Lin <pinglei.lin@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
James Lin [Thu, 7 May 2026 00:01:51 +0000 (08:01 +0800)]
drm/amd/display: Add some missing code for dcn42
[why & how]
Some DCN4.2 related code is missing from upstream
Fixes: e56e3cff2a1b ("drm/amd/display: Sync dcn42 with DC 3.2.373") Acked-by: ChiaHsuan Chung <ChiaHsuan.Chung@amd.com> Reviewed-by: Roman Li <Roman.Li@amd.com> Signed-off-by: James Lin <pinglei.lin@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
amdgpu_pmops_runtime_suspend() runs almost the same code that
amdgpu_pmops_runtime_idle() runs. That is, there is pointless code
duplication.
[How]
Move amdgpu_pmops_runtime_idle() up, extract common code and then
call from both functions. No intended functional changes.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
David Francis [Tue, 12 May 2026 19:18:18 +0000 (15:18 -0400)]
drm/amdkfd: Check bounds for allocate_sdma_queue restore_sdma_id
allocate_sdma_queue has an option where the sdma queue id can be
specified (used by CRIU). We weren't bounds-checking that
value.
Confirm it's less than the maximum number of queues.
Signed-off-by: David Francis <David.Francis@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sunil Khatri [Thu, 14 May 2026 07:01:00 +0000 (12:31 +0530)]
drm/amdgpu: use atomic operation to achieve lockless serialization
In amdgpu_seq64_alloc there is a possibility that two different cores
on two separate nodes allocate at the same time and get the same free
slot. Fix that race by using atomic test-and-set and clear operations.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
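A user-space analogue of lockless slot allocation with an atomic test-and-set, as described in the commit above. This sketch uses C11 atomic_fetch_or() and atomic_fetch_and() in place of the kernel's bit operations; the bitmap, the slot count and the function names are assumptions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NUM_SLOTS_SKETCH 64

/* One bit per slot; a set bit means the slot is taken. */
static _Atomic uint64_t slot_bitmap;

/* Returns the claimed slot index, or -1 if every slot is in use. */
static int slot_alloc(void)
{
	for (int i = 0; i < NUM_SLOTS_SKETCH; i++) {
		uint64_t bit = UINT64_C(1) << i;

		/*
		 * fetch_or returns the previous value. If the bit was clear
		 * before, this caller is the one that set it and owns the
		 * slot; a racing caller sees it set and moves on.
		 */
		if (!(atomic_fetch_or(&slot_bitmap, bit) & bit))
			return i;
	}
	return -1;
}

static void slot_free(int slot)
{
	atomic_fetch_and(&slot_bitmap, ~(UINT64_C(1) << slot));
}
```

A separate "find a free bit, then set it" sequence has a window in which two callers can pick the same bit; folding the test and the set into one atomic operation closes that window.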
Timur Kristóf [Wed, 13 May 2026 20:04:16 +0000 (22:04 +0200)]
drm/amdgpu/vce3: Fix VCE 3 firmware size and offsets
The VCPU BO contains the actual FW at an offset, but
it was not calculated into the VCPU BO size.
Subtract this from the FW size to make sure there is
no out of bounds access.
This may fix VM faults when using VCE 3.
Cc: John Olender <john.olender@gmail.com> Fixes: e98226221467 ("drm/amdgpu: recalculate VCE firmware BO size") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
David Francis [Tue, 12 May 2026 19:15:33 +0000 (15:15 -0400)]
drm/amdkfd: Check bounds on allocate_doorbell
allocate_doorbell has an option to set the doorbell id
to a specific value (used by CRIU). This value was not
bounds checked.
Check to confirm it's less than KFD_MAX_NUM_OF_QUEUES_PER_PROCESS.
Signed-off-by: David Francis <David.Francis@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Timur Kristóf [Wed, 13 May 2026 20:04:15 +0000 (22:04 +0200)]
drm/amdgpu/vce2: Fix VCE 2 firmware size and offsets
The VCPU BO contains the actual FW at an offset, but
it was not calculated into the VCPU BO size.
Subtract this from the FW size to make sure there is
no out of bounds access.
Additionally, increase the VCE_V2_0_DATA_SIZE to
have extra space after the VCE handles.
Also increase the data size used for each VCE handle.
The FW needs 23744 bytes, use 24K to be safe.
This fixes VM faults when using VCE 2.
Cc: John Olender <john.olender@gmail.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/4802 Fixes: e98226221467 ("drm/amdgpu: recalculate VCE firmware BO size") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Timur Kristóf [Wed, 13 May 2026 20:04:14 +0000 (22:04 +0200)]
drm/amdgpu/vce1: Stop using amdgpu_vce_resume
The VCE1 firmware works slightly differently and is already
loaded by vce_v1_0_load_fw(). It doesn't actually need to
call amdgpu_vce_resume().
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Timur Kristóf [Wed, 13 May 2026 20:04:13 +0000 (22:04 +0200)]
drm/amdgpu/vce1: Fix VCE 1 firmware size and offsets
The VCPU BO contains the actual FW at an offset, but
it was not calculated into the VCPU BO size.
Subtract this from the FW size to make sure there is
no out of bounds access.
Make sure the stack and data offsets are aligned to
the 32K TLB size.
Check that the FW microcode actually fits in the
space that is reserved for it.
Fixes: d4a640d4b9f3 ("drm/amdgpu/vce1: Implement VCE1 IP block (v2)") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Only allocate entries from the GTT manager when the
VCE GTT node is not allocated yet. This prevents the
possibility of allocating them multiple times, which
causes issues during GPU reset and suspend/resume.
Fixes: 71aec08f80e7 ("amdgpu/vce: use amdgpu_gtt_mgr_alloc_entries") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Timur Kristóf [Wed, 13 May 2026 20:04:11 +0000 (22:04 +0200)]
drm/amdgpu/vce1: Check if VRAM address is lower than GART.
Previously, I had assumed this was not possible
so it was OK to not handle it, but now we got a report
from a user who has a board that is configured this way.
When the VCPU BO is already located in a low 32-bit address
in VRAM (e.g. when VRAM is mapped to the low address space),
don't do the workaround.
Fixes: 71aec08f80e7 ("amdgpu/vce: use amdgpu_gtt_mgr_alloc_entries") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Timur Kristóf [Wed, 13 May 2026 20:04:10 +0000 (22:04 +0200)]
drm/amdgpu/vce1: Remove superfluous address check
The same thing is already checked a few lines above.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Timur Kristóf [Wed, 13 May 2026 20:04:09 +0000 (22:04 +0200)]
drm/amdgpu/vce1: Check that the GPU address is < 128 MiB
When ensuring the low 32-bit address, make sure it is
less than 128 MiB, otherwise the VCE seems to fail to initialize.
This seems to be an undocumented limitation of the firmware
validation mechanism. Note that in the case of VCE1 the BAR
address is zero and, again because of the firmware validator,
we can't change it.
When programming the mmVCE_VCPU_CACHE_OFFSETn registers,
don't AND them with a mask. This is incorrect because
the register mask is actually 0x0fffffff and useless because
we already ensure the addresses are below the limit.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Timur Kristóf [Wed, 13 May 2026 20:04:08 +0000 (22:04 +0200)]
drm/amdgpu: Align amdgpu_gtt_mgr entries to TLB size on Tahiti (v2)
The TLB is organized in groups of 8 entries, each one is 4K.
On Tahiti, the HW requires these GART entries to be 32K-aligned.
This fixes a VCE 1 firmware validation failure that can happen
after suspend/resume since we use amdgpu_gtt_mgr for VCE 1.
v2:
- Change variable declaration order
- Add comment about "V bit HW bug"
Fixes: 698fa62f56aa ("drm/amdgpu: Add helper to alloc GART entries") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
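The alignment arithmetic in the commit above (8 entries of 4K each give a 32K group) can be sketched as follows. The macro and function names are made up for the example, and the helper only rounds an entry index up to the next group boundary; it is not the amdgpu_gtt_mgr allocation code.

```c
#include <assert.h>
#include <stdint.h>

#define GART_PAGE_SIZE_SKETCH		4096u
#define TLB_GROUP_ENTRIES_SKETCH	8u

/* Round value up to a multiple of align; align must be a power of two. */
static uint64_t align_up(uint64_t value, uint64_t align)
{
	return (value + align - 1) & ~(align - 1);
}

/* First entry index at or after first_free that starts a TLB group. */
static uint64_t gart_first_aligned_entry(uint64_t first_free)
{
	return align_up(first_free, TLB_GROUP_ENTRIES_SKETCH);
}
```

Aligning the entry index to 8 is equivalent to aligning the resulting GART address to 32K, since each entry covers 4K.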
Sunday Clement [Wed, 13 May 2026 15:22:19 +0000 (11:22 -0400)]
drm/amdkfd: Fix OOB memory exposure in get_wave_state()
The get_wave_state() function for v9 trusts cp_hqd_cntl_stack_size and
cp_hqd_cntl_stack_offset values read directly from the MQD, which are
written by GPU microcode and fully attacker-controlled on the
CRIU-restore path (via AMDKFD_IOC_RESTORE_PROCESS with H3).
This leads to an unbounded copy_to_user() that can leak adjacent
GTT/kernel memory. If offset > size, integer underflow produces a ~4 GiB
read length; if size is set to 1 MiB against a 4 KiB allocation, we leak
1 MiB of adjacent kernel memory (other queues' MQDs, ring buffers, KASLR
pointers).
Fix by clamping both cp_hqd_cntl_stack_size to the actual allocated
buffer size (q->ctl_stack_size) and cp_hqd_cntl_stack_offset to the
clamped size before performing arithmetic and copy_to_user().
This ensures we never read beyond the allocated kernel BO regardless of
attacker-supplied MQD field values.
Signed-off-by: Sunday Clement <Sunday.Clement@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
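The clamping described in the commit above can be sketched in user space like this. The function name and parameters are hypothetical; the point is only the order of operations: clamp the size to the allocation, clamp the offset to the clamped size, then subtract.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of bytes that may be copied out, given untrusted size and
 * offset values and the size of the buffer that was really allocated.
 */
static uint32_t ctl_stack_used(uint32_t mqd_size, uint32_t mqd_offset,
			       uint32_t allocated_size)
{
	/* Never trust a size larger than the real allocation. */
	uint32_t size = mqd_size < allocated_size ? mqd_size : allocated_size;

	/* Clamp the offset to the clamped size so size - offset cannot underflow. */
	uint32_t offset = mqd_offset < size ? mqd_offset : size;

	return size - offset;
}
```

With both values clamped, the result is always between 0 and allocated_size, whatever the MQD fields contain.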
Yang Wang [Sat, 9 May 2026 07:20:39 +0000 (15:20 +0800)]
drm/amd/pm: fix memleak of dpm_policies on smu v15
In smu_v15_0_fini_smc_tables, dpm_policies was not freed or NULLed, causing a memory leak.
Add kfree() and NULL assignment to properly release memory and avoid dangling pointers.
Fixes: 2beedc3a92b7 ("drm/amd/pm: Add initial support for smu v15_0_8") Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Amber Lin [Mon, 23 Mar 2026 18:19:04 +0000 (14:19 -0400)]
drm/amdgpu: Support MES suspend_all_sdma_gangs
suspend_all_sdma_gangs is supported in new MES firmware for gfx 12.1
Signed-off-by: Amber Lin <Amber.Lin@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Michael Chen <michael.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Chenglei Xie [Thu, 7 May 2026 20:16:58 +0000 (16:16 -0400)]
drm/amdgpu: fix OOB risk parsing virt RAS batch trace replies on the VF
amdgpu_virt_ras_get_batch_records() indexed batchs[] and records[]
from ras_cmd_batch_trace_record_rsp copied out of shared memory without
fully bounding the cache window or per-batch offset/trace_num. A
tampered or corrupted buffer could set real_batch_num past the array,
make a naive start_batch_id + real_batch_num comparison wrap in
uint64_t, or point offset+trace_num past records[].
Add amdgpu_virt_ras_check_batch_cached() for a subtraction-based window
with a real_batch_num cap, re-run it after GET_BATCH_TRACE_RECORD, and
use an explicit batch index into batchs[]. Consolidate batch_id,
trace_num, and offset+trace_num checks; on any failure memset the cache
and return -EIO so the next call refetches.
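A subtraction-based window check with a count cap, in the spirit of the commit above, might look like the sketch below. The function name, parameters and cap are assumptions; the actual amdgpu_virt_ras_check_batch_cached() is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True if batch_id lies in [start, start + count), without ever
 * computing start + count, which could wrap in uint64_t.
 */
static bool batch_in_window(uint64_t batch_id, uint64_t start,
			    uint64_t count, uint64_t max_count)
{
	/* Cap the count so it cannot index past the cached array. */
	if (count > max_count)
		return false;

	if (batch_id < start)
		return false;

	/* batch_id >= start, so the subtraction cannot underflow. */
	return batch_id - start < count;
}
```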
chong li [Wed, 6 May 2026 09:21:23 +0000 (17:21 +0800)]
drm/amdgpu: Add guest driver CUID support
v3:
improve the coding style.
v2:
use debugfs_create_x64 and debugfs_create_x8 to create node.
v1:
1. Add guest driver CUID support
2. Do not expose the VF index (variable "fcn_idx") to customers,
replace fcn_idx with pad.
Only expose the unitid to customers.
background:
Change fcn_idx to pad so that the VF index is not exposed to the guest VM.
Introduce a new unitid field as the VF identifier to replace the VF index:
1). unitid is assigned by the host driver
2). It is delivered to the guest via the pf2vf message
3). The application or UMD can retrieve unitid from the sysfs node
Signed-off-by: chong li <chongli2@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Lijo Lazar [Tue, 12 May 2026 14:59:52 +0000 (20:29 +0530)]
drm/amdgpu: Fix discovery offset check under VF
Discovery table may be kept at offset 0 by host driver. Remove the
validation check.
Fixes: 01bdc7e219c4 ("drm/amdgpu: New interface to get IP discovery binary v3") Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Ellen Pan <yunru.pan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sunil Khatri [Tue, 12 May 2026 16:59:48 +0000 (22:29 +0530)]
drm/amdgpu: remove va cursors for all mappings
The va_cursor struct needs to be cleaned up even if the mapping
has already been removed.
Also simplify the function by making it return void, since a return
value check isn't needed when it is called during teardown.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Amir Shetaia [Thu, 7 May 2026 17:24:55 +0000 (13:24 -0400)]
drm/amdgpu: reject non-user addresses early in GEM_USERPTR ioctl
amdgpu_gem_userptr_ioctl() currently accepts any value of args->addr
and only discovers an out-of-range pointer much later, inside
amdgpu_gem_object_create() and the HMM mirror registration path.
Userspace can drive that path with kernel-side virtual addresses;
the get_user_pages() layer rejects them, but only after the driver
has already allocated a GEM object and started wiring up notifier
state that then has to be torn down on failure.
Add an access_ok() guard at the top of the ioctl, right after the
existing page-alignment check and before flag validation, so any
address that does not lie within the calling task's user address
range is rejected with -EFAULT before any allocation occurs. No
legitimate ROCm/HSA userspace passes kernel-mode pointers through
this interface, so this is defense-in-depth rather than a behaviour
change for valid callers; -EFAULT matches the convention already
used by other uaccess-style rejections in the kernel.
Also add an explicit #include <linux/uaccess.h>; access_ok() is
otherwise only available transitively through other headers in
this translation unit.
Signed-off-by: Amir Shetaia <Amir.Shetaia@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
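The check ordering described in the GEM_USERPTR change above can be modelled in plain userspace C. This is only a sketch under stated assumptions: userptr_precheck(), MODEL_PAGE_SIZE and the user_limit parameter are hypothetical stand-ins (the driver uses access_ok() and the task's real address range), and returning -EINVAL for misaligned input is an assumption about the existing alignment check.

```c
#include <errno.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u /* assumed 4 KiB pages */

/*
 * Userspace model of the early ioctl checks; not the driver code.
 * user_limit stands in for the top of the caller's user address range.
 * Returns 0 when the request may proceed to allocation.
 */
static int userptr_precheck(uint64_t addr, uint64_t size, uint64_t user_limit)
{
	/* Existing check: address and size must both be page aligned. */
	if ((addr | size) & (MODEL_PAGE_SIZE - 1))
		return -EINVAL;

	/* New guard: overflow-safe test that [addr, addr + size) fits below the limit. */
	if (size > user_limit || addr > user_limit - size)
		return -EFAULT;

	return 0; /* only at this point would a GEM object be allocated */
}
```

The range test is written as a subtraction so that addr + size cannot wrap around; a kernel-side address fails it before any allocation or notifier setup would take place.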
Alan Liu [Fri, 1 May 2026 04:35:48 +0000 (12:35 +0800)]
drm/amdgpu/vpe: Force collaborate sync after TRAP
VPE1 could possibly hang and fail to power off at the end of commands in
collaboration mode. This workaround adds a COLLAB_SYNC after TRAP to
force the instances to synchronize so that VPE1 does not fail to power off.
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Alan liu <haoping.liu@amd.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/5171 Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sunil Khatri [Tue, 12 May 2026 10:30:18 +0000 (16:00 +0530)]
drm/amdgpu/userq: update the vm task info during signal ioctl
Page faults do not have process information correctly populated
because vm->task is not set during vm_init and should instead be updated
at real submission. Set it up during signal_ioctl to get
the correct submission process details.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sunil Khatri [Tue, 12 May 2026 09:22:40 +0000 (14:52 +0530)]
drm/amdgpu/userq: cancel reset work while tear down in progress
When a userq_mgr is being torn down and all the queues
have been freed, cancel any pending reset work before exiting.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Harry Wentland [Tue, 5 May 2026 16:20:18 +0000 (10:20 -0600)]
drm/amdgpu: Remove UML build exclusion from Kconfig
The depends on !UML was added in commit dffe68131707 ("amdgpu: Avoid
building on UML") to work around build failures with allyesconfig on
UML. The original errors were:
- smu7_hwmgr.c: incompatible pointer type 'struct cpuinfo_um *' vs
'struct cpuinfo_x86 *' in intel_core_rkl_chk()
- kfd_topology.c: 'struct cpuinfo_um' has no member named 'apicid'
Both issues have since been resolved independently:
- intel_core_rkl_chk() has been removed entirely.
- kfd_topology.c now uses a proper #ifdef CONFIG_X86_64 guard.
- All other cpuinfo_x86/cpu_data() references in the driver are
guarded by #if IS_ENABLED(CONFIG_X86) or #ifdef CONFIG_X86_64.
Removing this exclusion allows CONFIG_DRM_AMDGPU to be selected on UML,
which in turn enables running KUnit tests (such as amdgpu_dm_crc_test)
under UML without needing a full hardware-capable kernel build.
Reviewed-by: Alex Hung <alex.hung@amd.com> Assisted-by: Claude:claude-opus-4.6 Signed-off-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sunil Khatri [Fri, 8 May 2026 10:28:09 +0000 (15:58 +0530)]
drm/amdgpu/userq: pin mqd and fw object bo to avoid eviction
mqd and fw objects are queue core objects which should remain
valid and never be unmapped and evicted for user queues to work
properly.
During eviction, if these buffers are evicted, the hw continues to
use the invalid addresses, causing page faults and a system hang.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sunil Khatri [Fri, 8 May 2026 06:51:20 +0000 (12:21 +0530)]
drm/amdgpu/userq: use drm_exec in amdgpu_userq_fence_read_wptr
To access the bo from vm mapping first lock the root bo and
then the object bo of the mapping to make sure both locks
are taken safely.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dave Airlie [Fri, 15 May 2026 05:36:59 +0000 (15:36 +1000)]
Merge tag 'drm-intel-next-2026-05-14' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next
- A Revert of a Kconfig patch that broke some builds (Jani)
- New fb_pin abstraction for xe and i915 fb transparent handling (Ville, Tvrtko)
- Skip inactive MST connectors on HDCP cases (Suraj)
- Reduce redundant intel_panel_fixed_mode (Ankit)
- Some general fixes (Imre, Chaitanya)
- Reorganize display documentation (Jani)
- Start switching to display specific reg types (Jani)
Core Changes:
- Bugfixes and cleanups to pagemap, dp/mst.
- Add lockdep annotations to gpu buddy manager.
- Updates to drm/dp for PR + VRR.
- Improve documentation's table of contents.
- Bump fpfn and lpfn in ttm to 64-bits.
Driver Changes:
- Assorted bugfixes, cleanups and updates to panthor, nouveau, qaic,
hisilicon.
- Add support for CMN N116BCN-EA1, CMN N140HCA-EEK, IVO M140NWFQ R5, IVO
R140NWFW R0, BOE NT140*, BOE NV133FHM-N4F, AUO B140*, AUO B133HAN06.6 and AUO B116XTN02.3 eDP panels.
- More implementation of AIE4 in amdxdna.
- Update panels to use refcounts instead of devm_kzalloc to make
drm_panel_init static.
- Add support for the RCade Display Adapter to gud.
Dmitry Osipenko [Fri, 1 May 2026 00:00:43 +0000 (03:00 +0300)]
drm/virtio: Extend blob UAPI with deferred-mapping hinting
If userspace never maps a GEM object, the BO wastes hostmem space
because the VirtIO-GPU driver maps a VRAM BO at BO creation time.
Make mappings on-demand by adding a new RESOURCE_CREATE_BLOB IOCTL/UAPI
hinting flag. When userspace sets the flag, host mapping is deferred
until the first mapping is made.
Michal Wajdeczko [Mon, 11 May 2026 17:28:38 +0000 (19:28 +0200)]
drm/xe/memirq: Enable GT_MI_USER_INTERRUPT only
We only expect and handle the GT_MI_USER_INTERRUPT from the
engines, so there is no point in enabling other interrupts, like
GT_CONTEXT_SWITCH_INTERRUPT, if we don't intend to handle them.
Michal Wajdeczko [Mon, 11 May 2026 17:28:37 +0000 (19:28 +0200)]
drm/xe/memirq: Update interrupt handler logic
To work around some corner-case hardware limitations, a new programming
note for the memory based interrupt handler suggests assuming that
some status bytes, like GT_MI_USER_INTERRUPT and GUC_INTR_GUC2HOST,
are always set. Update our interrupt handler to follow the new rules.
Bspec: 53672 Fixes: a6581ebe7685 ("drm/xe/vf: Introduce Memory Based Interrupts Handler") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Link: https://patch.msgid.link/20260511172838.2299-2-michal.wajdeczko@intel.com
Felix Kuehling [Wed, 13 May 2026 14:12:53 +0000 (09:12 -0500)]
drm/ttm: Support 52-bit PAs in ttm_place
fpfn and lpfn in struct ttm_place are 32-bit page numbers. With 4KB page
size this can support up to 44-bit physical addressing. Grow these to
64-bit (uint64_t) to support larger physical addresses.
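The arithmetic behind the 44-bit limit can be checked with a small standalone sketch. The helper names max_pa_bits() and pfn_fits_u32() are hypothetical and exist only for this illustration; they are not part of TTM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Physical address bits reachable with a page frame number of pfn_bits bits. */
static unsigned int max_pa_bits(unsigned int pfn_bits, unsigned int page_shift)
{
	return pfn_bits + page_shift;
}

/* True when the page frame number of pa still fits in a 32-bit field. */
static bool pfn_fits_u32(uint64_t pa, unsigned int page_shift)
{
	return (pa >> page_shift) <= UINT32_MAX;
}
```

With 4KB pages the page shift is 12, so a 32-bit page number covers 32 + 12 = 44 address bits, and the first address whose page number no longer fits is 1 << 44. A 52-bit physical address needs a 40-bit page number, which is why the fields have to grow.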
Jani Nikula [Tue, 5 May 2026 09:16:48 +0000 (12:16 +0300)]
drm/i915/display: define and use intel_reg_{offset, equal, valid}() helpers
Add display specific helpers for getting the register offset, checking
for equality and validity. Add them as static inlines for increased type
safety.
Jani Nikula [Tue, 5 May 2026 09:16:47 +0000 (12:16 +0300)]
drm/i915/display: add struct intel_error_regs and use it
Add struct intel_error_regs, a display version of struct
i915_error_regs, and use it. The goal is to reduce the dependency on
i915 core types and headers.
Jani Nikula [Tue, 5 May 2026 09:16:45 +0000 (12:16 +0300)]
drm/i915/display: add typedef for intel_reg_t and use it
Add a typedef alias intel_reg_t for i915_reg_t, and use it exclusively
in display code. The goal is to eventually define a distinct type for
display, but for now just use an alias.
In a handful of places include intel_display_reg_defs.h instead of
i915_reg_defs.h to get the definition, and isolate the i915_reg_defs.h
include there.
Jani Nikula [Fri, 8 May 2026 11:12:08 +0000 (14:12 +0300)]
Documentation/gpu: add some tables of contents to large documents
Some of the GPU documentation pages are quite long, with various levels
of details. Add document internal tables of contents to the larger
documents to make them easier to navigate.
The index.rst in the sub-directories have toctrees, which provide
similar overviews.
Fix one missing newline at the end of drm-uapi.rst while at it,
primarily because rst should have it, and secondarily because my editor
rst mode refuses to save the file without it.
Jani Nikula [Fri, 8 May 2026 11:12:07 +0000 (14:12 +0300)]
Documentation/gpu: limit main toctree depth to 2
The main GPU documentation toctree has no limit to the toctree depth,
which means the main GPU index page recursively includes all the
headings in all of GPU documentation in the single table of
contents. This makes getting any kind of overview of the documentation
really difficult.
Limit the main toctree depth to 2 i.e. show at most two levels of
headings.
Lin He [Sat, 9 May 2026 03:23:02 +0000 (11:23 +0800)]
drm/hisilicon/hibmc: use clock to look up the PLL value
In the past, we used width and height to look up our PLL value.
However, checking the actual clock is also necessary: some
resolutions share the same width and height but differ in clock.
Add the clock check when using pll_table to determine the PLL value.
Fixes: da52605eea8f ("drm/hisilicon/hibmc: Add support for display engine") Signed-off-by: Lin He <helin52@huawei.com> Signed-off-by: Yongbang Shi <shiyongbang@huawei.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patch.msgid.link/20260509032302.2057227-5-shiyongbang@huawei.com
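A lookup keyed on width, height and clock, as the hibmc PLL change above describes, might look like the following standalone sketch. Everything here is an assumption made for illustration: struct pll_entry, pll_table_model and pll_lookup() are hypothetical names, and the pll_val numbers are placeholders rather than real register values.

```c
#include <stddef.h>
#include <stdint.h>

struct pll_entry {
	uint32_t width;
	uint32_t height;
	uint32_t clock_khz;
	uint32_t pll_val; /* placeholder, not a real register value */
};

/* Illustrative table: the first two modes share a resolution but differ in clock. */
static const struct pll_entry pll_table_model[] = {
	{ 1920, 1080, 148500, 0x1 },
	{ 1920, 1080, 138500, 0x2 },
	{ 1280,  720,  74250, 0x3 },
};

/* Returns the entry matching width, height and clock, or NULL if none matches. */
static const struct pll_entry *pll_lookup(uint32_t width, uint32_t height,
					  uint32_t clock_khz)
{
	size_t i;

	for (i = 0; i < sizeof(pll_table_model) / sizeof(pll_table_model[0]); i++) {
		const struct pll_entry *e = &pll_table_model[i];

		if (e->width == width && e->height == height &&
		    e->clock_khz == clock_khz)
			return e;
	}
	return NULL;
}
```

Matching on width and height alone would always return the first 1920x1080 entry; including the clock in the key selects the second entry when its clock is requested.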
Lin He [Sat, 9 May 2026 03:23:01 +0000 (11:23 +0800)]
drm/hisilicon/hibmc: move display contrl config to hibmc_probe()
If there's no VGA output, this encoder modeset won't be called, which
causes display data from the GPU to be cut off. It's actually a
common display config for DP and VGA, so move the vdac encoder modeset
to the driver load stage.
Remove invalid bit configurations from `hibmc_display_ctrl`.
Fixes: 5294967f4ae4 ("drm/hisilicon/hibmc: Add support for VDAC") Signed-off-by: Lin He <helin52@huawei.com> Signed-off-by: Yongbang Shi <shiyongbang@huawei.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patch.msgid.link/20260509032302.2057227-4-shiyongbang@huawei.com
Lin He [Sat, 9 May 2026 03:23:00 +0000 (11:23 +0800)]
drm/hisilicon/hibmc: fix no showing when no connectors connected
Our chip supports the KVM over IP feature, so the hibmc driver needs to support
displaying without any connectors plugged in. If no connectors are
connected, the vdac connector status should be set to 'connected' to
ensure proper KVM display functionality. Additionally, for
previous-generation products that may lack hardware link support and
thus cannot detect the monitor, the same approach should be applied
to ensure VGA display functionality.
* Add phys_state in the struct of dp and vdac to check physical outputs.
* The 'epoch_counter' of the vdac connector is incremented when the
physical status changes.
For get_modes: use BMC modes for the connector if no display is attached to
the physical VGA cable, because KVM doesn't provide EDID reads; otherwise
use EDID modes via drm_connector_helper_get_modes.
The polling mechanism for the KMS helper is enabled.
Fixes: 4c962bc929f1 ("drm/hisilicon/hibmc: Add vga connector detect functions") Reported-by: Thomas Zimmermann <tzimmermann@suse.de> Closes: https://lore.kernel.org/all/0eb5c509-2724-4c57-87ad-74e4270d5a5a@suse.de/ Signed-off-by: Lin He <helin52@huawei.com> Signed-off-by: Yongbang Shi <shiyongbang@huawei.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patch.msgid.link/20260509032302.2057227-3-shiyongbang@huawei.com
Lin He [Sat, 9 May 2026 03:22:59 +0000 (11:22 +0800)]
drm/hisilicon/hibmc: add updating link cap in DP detect()
In the past, the link cap was updated during link training at the encoder
enable stage, but hibmc_dp_mode_valid() is called before that and uses the
DP link's rate and lanes. So call hibmc_dp_update_caps() in
DP detect() to avoid some potential risks.
Fixes: 607805abfb74 ("drm/hisilicon/hibmc: add dp mode valid check") Signed-off-by: Lin He <helin52@huawei.com> Signed-off-by: Yongbang Shi <shiyongbang@huawei.com> Acked-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patch.msgid.link/20260509032302.2057227-2-shiyongbang@huawei.com
drm/xe: Refactor emit_xy_fast_copy and emit_mem_copy functions
To perform a copy, either the MEM_COPY or the XY_FAST_COPY_BLT instruction
is used, depending on whether the platform supports service copy engines.
Both instructions have the same length today, so they share a common
define, EMIT_COPY_DW.
This will not hold on future platforms. Implement separate functions
that return the length of each instruction to prepare for that.
Implement a function to return the length of the MEM_SET instruction.
This is to prepare for future platforms where the length of MEM_SET
instruction is expected to change.
Implement a function which returns the length of XY_FAST_COLOR_BLT
instruction instead of hardcoding it inside the emit_clear_main_copy.
In future platforms, the length of this instruction is expected to
change and this patch helps in preparing for it.
Shekhar Chauhan [Tue, 12 May 2026 05:55:08 +0000 (11:25 +0530)]
drm/xe/devcoredump: Drop a FIXME in devcoredump
The FIXME claims that xe_engine_snapshot_print.. is accessing persistent
driver data, but the function does not do what the FIXME says. Drop the FIXME
since the current code is not going to access the hardware while
dumping.
More details about this patch:
https://patchwork.freedesktop.org/patch/703884/?series=161407&rev=1
The first two pieces of feedback make sense, and the original patch was wrong
to add those changes, but the last piece of feedback is the one that
highlights the point.
Ashutosh Dixit [Thu, 30 Apr 2026 16:14:58 +0000 (09:14 -0700)]
drm/xe/oa: Add val arg to xe_oa_is_valid_config_reg
Add val arg to xe_oa_is_valid_config_reg so that register values can also
be verified, in addition to register address. Value verification is needed
to implement MERTOA Wa_14026779378.
Consolidate the two-element allocation into a single allocation using a
flexible array member. This reduces memory fragmentation and simplifies
the error path by eliminating the need to check for allocation failure
between the two allocations.
Add __counted_by for runtime bounds checking.
Signed-off-by: Rosen Penev <rosenp@gmail.com> Tested-by: Youssef Samir <youssef.abdulrahman@oss.qualcomm.com> Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com> Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com> Link: https://patch.msgid.link/20260401220643.12802-1-rosenp@gmail.com
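The single-allocation pattern with a flexible array member, as described in the change above, can be sketched in userspace C. This is an illustration under assumptions: struct elem_list, elem_list_alloc() and the MODEL_COUNTED_BY macro are hypothetical names, calloc() stands in for the kernel allocator, and the counted_by attribute is applied only when the compiler reports support for it.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Use the counted_by attribute when the compiler has it; otherwise expand to nothing. */
#ifdef __has_attribute
# if __has_attribute(counted_by)
#  define MODEL_COUNTED_BY(member) __attribute__((counted_by(member)))
# endif
#endif
#ifndef MODEL_COUNTED_BY
# define MODEL_COUNTED_BY(member)
#endif

struct elem_list {
	size_t count;
	int elems[] MODEL_COUNTED_BY(count); /* flexible array member */
};

/* One allocation covers the header and all elements; returns NULL on failure. */
static struct elem_list *elem_list_alloc(size_t count)
{
	struct elem_list *l;

	/* Reject sizes whose byte count would overflow size_t. */
	if (count > (SIZE_MAX - sizeof(*l)) / sizeof(l->elems[0]))
		return NULL;

	l = calloc(1, sizeof(*l) + count * sizeof(l->elems[0]));
	if (!l)
		return NULL;

	l->count = count; /* set the counter before touching elems[] */
	return l;
}
```

Because the header and the array live in one block, there is a single failure point and a single free() on the cleanup path.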
drm/i915/display: Copy color pipeline from plane in the primary joiner pipe
When copying plane color state in a joiner configuration, use the plane in
the primary joiner pipe since it carries the pipeline number selected by
userspace.
This assumes that all pipes in the joiner are symmetric in their plane
color capabilities.
Sophie D [Sat, 9 May 2026 02:54:05 +0000 (22:54 -0400)]
drm/gud: Add RCade Display Adapter VID/PID pair
The RCade Display Adapter is a hardware device that allows driving an
Arcade CRT display via the GUD protocol. Currently it spoofs an
existing GUD VID/PID pair. However, now that it has its own pair
assigned, it makes sense to add this to the list of pairs that GUD
supports natively.
More information can be found in the project repositories:
https://gitlab.scd31.com/stephen/stm32-usb-vga-adapter-hardware
https://gitlab.scd31.com/stephen/stm32-usb-vga-rcade-adapter
Yang Wang [Fri, 8 May 2026 02:31:22 +0000 (10:31 +0800)]
drm/amd/pm: update dpm clock pm attributes for aldebaran (gc 9.4.2)
v1:
Separate DPM clock attribute constraints for Arcturus (9.4.1) and
Aldebaran (9.4.2) ASICs.
- For Aldebaran:
* mclk/socclk: Disable write, only voltage control supported
* fclk/pcie: Mark as unsupported
- Remove 9.4.2 from global pcie check and handle it in ASIC specific case
- Update comments to reflect correct hardware names
v2:
fix some coding logic issue (by asad)
Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amdgpu/gfx_v12_0: set gfx.rs64_enable from PFP header on GFX12
gfx_v12_0_init_microcode() always loads RS64 CP ucode but never sets
adev->gfx.rs64_enable, so it stayed false and code that branches on it
(e.g. MEC pipe reset) used the legacy CP_MEC_CNTL path incorrectly.
Match GFX11: derive RS64 mode from the PFP firmware header (v2.0) via
amdgpu_ucode_hdr_version(). Log at debug when RS64 is enabled.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Thu, 1 Jan 2026 22:20:18 +0000 (17:20 -0500)]
drm/amdgpu: plumb timedout fence through to force completion
When we do a full adapter reset and we know which fence timed out,
mark that fence with -ETIME rather than -ECANCELED so it
gets properly handled by userspace.
v2: rebase
Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
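The error selection described in the change above reduces to a single decision, sketched here in standalone C. The helper reset_fence_error() is a hypothetical name used only for illustration; it is not a function from the driver.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper: error code to set on a fence that is force-completed
 * during a full adapter reset.
 */
static int reset_fence_error(bool is_timedout_fence)
{
	/* The fence known to have timed out gets -ETIME; the rest are cancelled. */
	return is_timedout_fence ? -ETIME : -ECANCELED;
}
```

Distinguishing the two codes lets userspace tell the job that actually timed out apart from jobs that were merely cancelled by the reset.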