]> git.ipfire.org Git - thirdparty/kernel/linux.git/log
thirdparty/kernel/linux.git
5 weeks agoMerge tag 'drm-intel-gt-next-2025-07-02' of https://gitlab.freedesktop.org/drm/i915...
Dave Airlie [Fri, 4 Jul 2025 01:43:29 +0000 (11:43 +1000)] 
Merge tag 'drm-intel-gt-next-2025-07-02' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next

Driver Changes:

Fixes/improvements/new stuff:

- Avoid GuC scheduling stalls [guc] (Julia Filipchuk)
- Remove force_probe requirement for DG1 (Ville Syrjälä)
- Handle errors correctly to avoid losing GuC-2-Host messages [guc] (Jesus Narvaez)
- Avoid double wakeref put if GuC context deregister failed [guc] (Jesus Narvaez)
- Avoid timeline memory leak with signals and legacy platforms [ringbuf] (Janusz Krzysztofik)
- Fix MEI (discrete) interrupt handler on RT kernels [gsc] (Junxiao Chang)

Miscellaneous:

- Allow larger memory allocation [selftest] (Mikolaj Wasiak)
- Use provided dma_fence_is_chain (Tvrtko Ursulin)
- Fix build error with GCOV and AutoFDO enabled [pmu] (Tzung-Bi Shih)
- Fix build error some more (Arnd Bergmann)
- Reduce stack usage in igt_vma_pin1() (Arnd Bergmann)
- Move out engine related macros from i915_drv.h (Krzysztof Karas)
- Move GEM_QUIRK_PIN_SWIZZLED_PAGES to i915_gem.h (Krzysztof Karas)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tursulin@igalia.com>
Link: https://lore.kernel.org/r/aGTjUBeOQFw26bRT@linux
5 weeks agoMerge tag 'amd-drm-next-6.17-2025-07-01' of https://gitlab.freedesktop.org/agd5f...
Dave Airlie [Fri, 4 Jul 2025 00:06:22 +0000 (10:06 +1000)] 
Merge tag 'amd-drm-next-6.17-2025-07-01' of https://gitlab.freedesktop.org/agd5f/linux into drm-next

amd-drm-next-6.17-2025-07-01:

amdgpu:
- FAMS2 fixes
- OLED fixes
- Misc cleanups
- AUX fixes
- DMCUB updates
- SR-IOV hibernation support
- RAS updates
- DP tunneling fixes
- DML2 fixes
- Backlight improvements
- Suspend improvements
- Use scaling for non-native modes on eDP
- SDMA 4.4.x fixes
- PCIe DPM fixes
- SDMA 5.x fixes
- Cleaner shader updates for GC 9.x
- Remove fence slab
- ISP genpd support
- Parition handling rework
- SDMA FW checks for userq support
- Add missing firmware declaration
- Fix leak in amdgpu_ctx_mgr_entity_fini()
- Freesync fix
- Ring reset refactoring
- Legacy dpm verbosity changes

amdkfd:
- GWS fix
- mtype fix for ext coherent system memory
- MMU notifier fix
- gfx7/8 fix

radeon:
- CS validation support for additional GL extensions
- Bump driver version for new CS validation checks

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20250701194707.32905-1-alexander.deucher@amd.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
6 weeks agodrm/i915/gsc: mei interrupt top half should be in irq disabled context
Junxiao Chang [Fri, 25 Apr 2025 15:11:07 +0000 (23:11 +0800)] 
drm/i915/gsc: mei interrupt top half should be in irq disabled context

MEI GSC interrupt comes from i915. It has top half and bottom half.
Top half is called from i915 interrupt handler. It should be in
irq disabled context.

With RT kernel, by default i915 IRQ handler is in threaded IRQ. MEI GSC
top half might be in threaded IRQ context. generic_handle_irq_safe API
could be called from either IRQ or process context, it disables local
IRQ then calls MEI GSC interrupt top half.

This change fixes A380/A770 GPU boot hang issue with RT kernel.

Fixes: 1e3dc1d8622b ("drm/i915/gsc: add gsc as a mei auxiliary device")
Tested-by: Furong Zhou <furong.zhou@intel.com>
Suggested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Junxiao Chang <junxiao.chang@intel.com>
Link: https://lore.kernel.org/r/20250425151108.643649-1-junxiao.chang@intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
6 weeks agodrm/amdgpu/sdma6: add more ucode version checks for userq support
Alex Deucher [Mon, 23 Jun 2025 19:47:45 +0000 (15:47 -0400)] 
drm/amdgpu/sdma6: add more ucode version checks for userq support

Fill in the SDMA ucode version checks for more SDMA 6.x parts.

v2: squash in fixes (Alex)

Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/radeon: bump version to 2.51.0
Patrick Lerda [Sat, 28 Jun 2025 18:54:23 +0000 (20:54 +0200)] 
drm/radeon: bump version to 2.51.0

The version 2.51.0 adds OpenGL 4.6 compatibility to
evergreen and cayman.

Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu: Remove useless timeout error message
YiPeng Chai [Fri, 27 Jun 2025 08:49:24 +0000 (16:49 +0800)] 
drm/amdgpu: Remove useless timeout error message

The timeout is only used to interrupt polling and
not need to print a error message.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu: Fix code style issue
Ce Sun [Mon, 30 Jun 2025 02:29:21 +0000 (10:29 +0800)] 
drm/amdgpu: Fix code style issue

cocci warnings: (new ones prefixed by >>)
>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:6088:8-9:
Unneeded variable: "r". Return "0" on line 6141

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202506281925.HHIpXiO7-lkp@intel.com/
Signed-off-by: Ce Sun <cesun102@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu: refine ras error injection when eeprom initialization failed
ganglxie [Thu, 26 Jun 2025 09:00:45 +0000 (17:00 +0800)] 
drm/amdgpu: refine ras error injection when eeprom initialization failed

when eeprom initialization failed, we still support ras error injection,
and reserve bad pages, but do not save bad pages to eeprom

Signed-off-by: ganglxie <ganglxie@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu: Fix error with dev_info_once usage
Lijo Lazar [Sat, 28 Jun 2025 07:22:01 +0000 (12:52 +0530)] 
drm/amdgpu: Fix error with dev_info_once usage

Fixes error with dev_info_once usage in amdgpu_device_asic_has_dc_support.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202506281140.mXfWT3EN-lkp@intel.com/
Fixes: a3e510fd69c3 ("drm/amdgpu: Convert from DRM_* to dev_*")
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu: Use correct severity for BP threshold exceed event
Xiang Liu [Fri, 27 Jun 2025 15:14:08 +0000 (23:14 +0800)] 
drm/amdgpu: Use correct severity for BP threshold exceed event

The severity of CPER for BP threshold exceed event should be set as
CPER_SEV_FATAL to match the OOB implementation.

Signed-off-by: Xiang Liu <xiang.liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd: Change kv-dpm DRM_*() macros to drm_*()
Mario Limonciello [Fri, 27 Jun 2025 14:34:32 +0000 (09:34 -0500)] 
drm/amd: Change kv-dpm DRM_*() macros to drm_*()

drm_*() macros can show the device a message came from.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Cc: Alexandre Demers <alexandre.f.demers@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20250627143432.3222843-3-superm1@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd: Change legacy-dpm DRM_*() macros to drm_*()
Mario Limonciello [Fri, 27 Jun 2025 14:34:31 +0000 (09:34 -0500)] 
drm/amd: Change legacy-dpm DRM_*() macros to drm_*()

drm_*() macros can show the device a message came from.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Cc: Alexandre Demers <alexandre.f.demers@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20250627143432.3222843-2-superm1@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd: Decrease message level for legacy-pm, kv-dpm and si-dpm
Mario Limonciello [Fri, 27 Jun 2025 14:34:30 +0000 (09:34 -0500)] 
drm/amd: Decrease message level for legacy-pm, kv-dpm and si-dpm

legacy-pm, kv-dpm and si-dpm have prints while changing power states
that don't have a level and thus are printed by default. These are
not useful at runtime for most people, so decrease them to debug.

Reported-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4322
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20250627143432.3222843-1-superm1@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/display: Promote DAL to 3.2.340
Taimur Hassan [Sun, 22 Jun 2025 22:27:48 +0000 (17:27 -0500)] 
drm/amd/display: Promote DAL to 3.2.340

Summary:
* Remove unused tunnel BW validation
* Refactor DML21 initialization and configuration
* Fix link override sequencing when switching between DIO/HPO
* Ensure OLED minimum luminance

Acked-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/display: [FW Promotion] Release 0.1.17.0
Taimur Hassan [Sun, 22 Jun 2025 20:33:48 +0000 (16:33 -0400)] 
drm/amd/display: [FW Promotion] Release 0.1.17.0

Summary for changes in firmware:
* Add AMD brightness adjustment feature for edp
* Fix BL enable
* Revise low power init sequence
* Fix brightness delta after IPS1 entry
* Adjusted DP blanking sequence

Acked-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/display: Add DPP & HUBP reset if power gate enabled on DCN314
Ivan Lipski [Thu, 19 Jun 2025 22:14:52 +0000 (18:14 -0400)] 
drm/amd/display: Add DPP & HUBP reset if power gate enabled on DCN314

[WHY]
On DCN314, using full screen application with enabled scaling like 150%,
175%, with overlay cursor, causes a second cursor to appear when changing
planes. Dpp cache is used to track the HW cursor enable. Since power gate
is disabled for hubp & dpp in DCN314, dpp_reset() zero'ed the dpp struct,
while the dpp hardware was not power gated.

So, when plane is changed in a full screen app, and the overlay cursor is
enabled, the cache is cleared, so the cache does not represent the actual
cursor state.

[HOW]
Added conditionals for dpp & hubp reset and their pg_control functions
only if according power_gate flags are enabled.

Reviewed-by: Sun peng Li <sunpeng.li@amd.com>
Signed-off-by: Ivan Lipski <ivlipski@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/display: Fix Link Override Sequencing When Switching Between DIO/HPO
Michael Strauss [Wed, 12 Feb 2025 18:52:42 +0000 (13:52 -0500)] 
drm/amd/display: Fix Link Override Sequencing When Switching Between DIO/HPO

[WHY]
When performing certain link maintenance compliance tests or forcing link
settings, changing between 128b/132b and 8b/10b rates no longer works on
some ASICs. Some rate divider updates only occur when we set
timings or validate state, which is not performed currently when toggling
DPMS to change rates.

[HOW]
Re-calculate dividers and reprogram audio when switching between DIO
and HPO through DP compliance/escape code path.
Add OTG disable/re-enable so we don't touch the clock while OTG is active.
Acquire dcLock before forcing link settings to avoid thread synchronization
errors due to added programming in escape code path and potential HPD
interrupts.

Reviewed-by: George Shen <george.shen@amd.com>
Signed-off-by: Michael Strauss <michael.strauss@amd.com>
Signed-off-by: Mike Katsnelson <mike.katsnelson@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/display: Don't allow OLED to go down to fully off
Mario Limonciello [Thu, 19 Jun 2025 14:29:13 +0000 (09:29 -0500)] 
drm/amd/display: Don't allow OLED to go down to fully off

[Why]
OLED panels can be fully off, but this behavior is unexpected.

[How]
Ensure that minimum luminance is at least 1.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4338
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/display: Added case for when RR equals panel's max RR using freesync
Harold Sun [Thu, 19 Jun 2025 18:52:54 +0000 (14:52 -0400)] 
drm/amd/display: Added case for when RR equals panel's max RR using freesync

[WHY]
Rounding error sometimes occurs when the refresh rate is equal to a panel's
max refresh rate, causing HDMI compliance failures.

[HOW]
Added a case so that we round up to avoid v_total_min to be below a panel's
minimum bound.

Reviewed-by: Jun Lei <jun.lei@amd.com>
Signed-off-by: Harold Sun <Harold.Sun@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/display: Separate set_gsl from set_gsl_source_select
Ilya Bakoulin [Wed, 18 Jun 2025 17:07:14 +0000 (13:07 -0400)] 
drm/amd/display: Separate set_gsl from set_gsl_source_select

[Why/How]
Separate the checks for set_gsl and set_gsl_source_select, since
source_select may not be implemented/necessary.

Reviewed-by: Nevenko Stupar <nevenko.stupar@amd.com>
Signed-off-by: Ilya Bakoulin <Ilya.Bakoulin@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/display: Refactor DML21 Initialization and Configuration
Wenjing Liu [Wed, 11 Jun 2025 21:31:40 +0000 (17:31 -0400)] 
drm/amd/display: Refactor DML21 Initialization and Configuration

[Why & How]
- Consolidated the initialization of DML21 parameters into a single
function `dml21_populate_dml_init_params` to streamline the process
and improve code readability.
- Updated the function signatures in the header files to reflect changes
in parameter passing for DML context.
- Removed redundant debug option handling and integrated it into the new
configuration population function.
- Adjusted the DML21 initialization logic in the wrapper to accommodate
the new structure, ensuring compatibility with different DCN versions.
- Enhanced the handling of clock parameters and bounding box configurations
from various sources, including hardware defaults and software policies.
- Improved the clarity of the code by renaming functions and variables for
better understanding of their purposes.

Reviewed-by: Austin Zheng <austin.zheng@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/display: prepare for new platform
Karthi Kandasamy [Thu, 5 Jun 2025 13:10:44 +0000 (15:10 +0200)] 
drm/amd/display: prepare for new platform

[Why & How]
Expose some function for new platform use

Reviewed-by: Nevenko Stupar <nevenko.stupar@amd.com>
Signed-off-by: Karthi Kandasamy <karthi.kandasamy@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/display: Remove unused tunnel BW validation
Cruise Hung [Fri, 13 Jun 2025 09:41:50 +0000 (17:41 +0800)] 
drm/amd/display: Remove unused tunnel BW validation

[Why & How]
The tunnel BW validation code has changed to the new one.
Remove the unused code.
The DP tunneling overhead is not updated in SST.
Move updating DP tunneling overhead for both SST and MST.

Reviewed-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com>
Signed-off-by: Cruise Hung <Cruise.Hung@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/display: add null check
Peichen Huang [Thu, 12 Jun 2025 08:06:41 +0000 (16:06 +0800)] 
drm/amd/display: add null check

[WHY]
Prevents null pointer dereferences to enhance function robustness

[HOW]
Adds early null check and return false if invalid.

Reviewed-by: Cruise Hung <cruise.hung@amd.com>
Signed-off-by: Peichen Huang <PeiChen.Huang@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu: move scheduler wqueue handling into callbacks
Alex Deucher [Mon, 16 Jun 2025 21:45:05 +0000 (17:45 -0400)] 
drm/amdgpu: move scheduler wqueue handling into callbacks

Move the scheduler wqueue stopping and starting into
the ring reset callbacks.  On some IPs we have to reset
an engine which may have multiple queues.  Move the wqueue
handling into the backend so we can handle them as needed
based on the type of reset available.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu: move guilty handling into ring resets
Alex Deucher [Fri, 6 Jun 2025 03:34:48 +0000 (23:34 -0400)] 
drm/amdgpu: move guilty handling into ring resets

Move guilty logic into the ring reset callbacks.  This
allows each ring reset callback to better handle fence
errors and force completions in line with the reset
behavior for each IP.  It also allows us to remove
the ring guilty callback since that logic now lives
in the reset callback.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu: move force completion into ring resets
Alex Deucher [Thu, 29 May 2025 16:58:53 +0000 (12:58 -0400)] 
drm/amdgpu: move force completion into ring resets

Move the force completion handling into each ring
reset function so that each engine can determine
whether or not it needs to force completion on the
jobs in the ring.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu: rework queue reset scheduler interaction
Christian König [Fri, 2 May 2025 16:17:16 +0000 (18:17 +0200)] 
drm/amdgpu: rework queue reset scheduler interaction

Stopping the scheduler for queue reset is generally a good idea because
it prevents any worker from touching the ring buffer.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu: update ring reset function signature
Alex Deucher [Thu, 5 Jun 2025 20:12:08 +0000 (16:12 -0400)] 
drm/amdgpu: update ring reset function signature

Going forward, we'll need more than just the vmid.  Add the
guilty amdgpu_fence.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu: remove job parameter from amdgpu_fence_emit()
Alex Deucher [Mon, 16 Jun 2025 15:56:23 +0000 (11:56 -0400)] 
drm/amdgpu: remove job parameter from amdgpu_fence_emit()

What we actually care about is the amdgpu_fence object
so pass that in explicitly to avoid possible mistakes
in the future.

The job_run_counter handling can be safely removed at this
point as we no longer support job resubmission.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdkfd: add hqd_sdma_get_doorbell callbacks for gfx7/8
Alex Deucher [Wed, 25 Jun 2025 22:15:37 +0000 (18:15 -0400)] 
drm/amdkfd: add hqd_sdma_get_doorbell callbacks for gfx7/8

These were missed when support was added for other generations.
The callbacks are called unconditionally so we need to make
sure all generations have them.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4304
Link: https://github.com/ROCm/ROCm/issues/4965
Fixes: bac38ca8c475 ("drm/amdkfd: implement per queue sdma reset for gfx 9.4+")
Cc: Jonathan Kim <jonathan.kim@amd.com>
Reported-by: Johl Brown <johlbrown@gmail.com>
Reviewed-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu: Fix memory leak in amdgpu_ctx_mgr_entity_fini
Lin.Cao [Tue, 24 Jun 2025 09:05:34 +0000 (17:05 +0800)] 
drm/amdgpu: Fix memory leak in amdgpu_ctx_mgr_entity_fini

patch dd64956685fa ("drm/amdgpu: Remove duplicated "context still
alive" check") removed ctx put, which will cause amdgpu_ctx_fini()
cannot be called and then cause some finished fence that added by
amdgpu_ctx_add_fence() cannot be released and cause memleak.

Fixes: dd64956685fa ("drm/amdgpu: Remove duplicated "context still alive" check")
Signed-off-by: Lin.Cao <lincao12@amd.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu: indent an if statement
Dan Carpenter [Wed, 25 Jun 2025 15:22:27 +0000 (10:22 -0500)] 
drm/amdgpu: indent an if statement

The "return true;" line wasn't indented.  Also checkpatch likes when
we align the && conditions.

Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu: Use dma_buf from GEM object instance
Thomas Zimmermann [Wed, 25 Jun 2025 08:42:18 +0000 (10:42 +0200)] 
drm/amdgpu: Use dma_buf from GEM object instance

Avoid dereferencing struct drm_gem_object.import_attach for the
imported dma-buf. The dma_buf field in the GEM object instance refers
to the same buffer. Prepares to make import_attach an implementation
detail of PRIME.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu: Test for imported buffers with drm_gem_is_imported()
Thomas Zimmermann [Wed, 25 Jun 2025 08:42:17 +0000 (10:42 +0200)] 
drm/amdgpu: Test for imported buffers with drm_gem_is_imported()

Instead of testing import_attach for imported GEM buffers, invoke
drm_gem_is_imported() to do the test.

v2:
- keep amdgpu_bo_print_info() as-is (Christian)

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu: Convert from DRM_* to dev_*
Lijo Lazar [Tue, 24 Jun 2025 05:50:56 +0000 (11:20 +0530)] 
drm/amdgpu: Convert from DRM_* to dev_*

Convert from generic DRM_* to dev_* calls to have device context info.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/pm: Fetch SMUv13.0.12 xgmi max speed/width
Lijo Lazar [Fri, 13 Jun 2025 12:38:22 +0000 (18:08 +0530)] 
drm/amd/pm: Fetch SMUv13.0.12 xgmi max speed/width

On SMU v13.0.12 SOCs, fetch the max values of xgmi speed/width from
firmware.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdkfd: Don't call mmput from MMU notifier callback
Philip Yang [Fri, 20 Jun 2025 22:32:32 +0000 (18:32 -0400)] 
drm/amdkfd: Don't call mmput from MMU notifier callback

If the process is exiting, the mmput inside mmu notifier callback from
compactd or fork or numa balancing could release the last reference
of mm struct to call exit_mmap and free_pgtable, this triggers deadlock
with below backtrace.

The deadlock will leak kfd process as mmu notifier release is not called
and cause VRAM leaking.

The fix is to take mm reference mmget_non_zero when adding prange to the
deferred list to pair with mmput in deferred list work.

If prange split and add into pchild list, the pchild work_item.mm is not
used, so remove the mm parameter from svm_range_unmap_split and
svm_range_add_child.

The backtrace of hung task:

 INFO: task python:348105 blocked for more than 64512 seconds.
 Call Trace:
  __schedule+0x1c3/0x550
  schedule+0x46/0xb0
  rwsem_down_write_slowpath+0x24b/0x4c0
  unlink_anon_vmas+0xb1/0x1c0
  free_pgtables+0xa9/0x130
  exit_mmap+0xbc/0x1a0
  mmput+0x5a/0x140
  svm_range_cpu_invalidate_pagetables+0x2b/0x40 [amdgpu]
  mn_itree_invalidate+0x72/0xc0
  __mmu_notifier_invalidate_range_start+0x48/0x60
  try_to_unmap_one+0x10fa/0x1400
  rmap_walk_anon+0x196/0x460
  try_to_unmap+0xbb/0x210
  migrate_page_unmap+0x54d/0x7e0
  migrate_pages_batch+0x1c3/0xae0
  migrate_pages_sync+0x98/0x240
  migrate_pages+0x25c/0x520
  compact_zone+0x29d/0x590
  compact_zone_order+0xb6/0xf0
  try_to_compact_pages+0xbe/0x220
  __alloc_pages_direct_compact+0x96/0x1a0
  __alloc_pages_slowpath+0x410/0x930
  __alloc_pages_nodemask+0x3a9/0x3e0
  do_huge_pmd_anonymous_page+0xd7/0x3e0
  __handle_mm_fault+0x5e3/0x5f0
  handle_mm_fault+0xf7/0x2e0
  hmm_vma_fault.isra.0+0x4d/0xa0
  walk_pmd_range.isra.0+0xa8/0x310
  walk_pud_range+0x167/0x240
  walk_pgd_range+0x55/0x100
  __walk_page_range+0x87/0x90
  walk_page_range+0xf6/0x160
  hmm_range_fault+0x4f/0x90
  amdgpu_hmm_range_get_pages+0x123/0x230 [amdgpu]
  amdgpu_ttm_tt_get_user_pages+0xb1/0x150 [amdgpu]
  init_user_pages+0xb1/0x2a0 [amdgpu]
  amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x543/0x7d0 [amdgpu]
  kfd_ioctl_alloc_memory_of_gpu+0x24c/0x4e0 [amdgpu]
  kfd_ioctl+0x29d/0x500 [amdgpu]

Fixes: fa582c6f3684 ("drm/amdkfd: Use mmget_not_zero in MMU notifier")
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu: Include sdma_4_4_4.bin
Kent Russell [Tue, 24 Jun 2025 15:42:06 +0000 (11:42 -0400)] 
drm/amdgpu: Include sdma_4_4_4.bin

This got missed during SDMA 4.4.4 support.

Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd: Include <linux/export.h> when needed
André Almeida [Fri, 13 Jun 2025 18:26:51 +0000 (15:26 -0300)] 
drm/amd: Include <linux/export.h> when needed

Fix the following compile time warning when building with W=1:

warning: EXPORT_SYMBOL() is used, but #include <linux/export.h> is missing

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: André Almeida <andrealmeid@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd: Do not include <linux/export.h> when unused
André Almeida [Fri, 13 Jun 2025 18:26:50 +0000 (15:26 -0300)] 
drm/amd: Do not include <linux/export.h> when unused

Fix the following compile time warning when building with W=1:

warning: EXPORT_SYMBOL() is not used, but #include <linux/export.h> is present

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: André Almeida <andrealmeid@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agoamdkfd: MTYPE_UC for ext-coherent system memory
David Yat Sin [Thu, 19 Jun 2025 17:51:13 +0000 (17:51 +0000)] 
amdkfd: MTYPE_UC for ext-coherent system memory

Set memory mtype to UC host memory when ext-coherent
flag is set and memory is registered as a SVM allocation.

Reviewed-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: David Yat Sin <David.YatSin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu/sdma5.x: suspend KFD queues in ring reset
Alex Deucher [Mon, 16 Jun 2025 18:04:54 +0000 (14:04 -0400)] 
drm/amdgpu/sdma5.x: suspend KFD queues in ring reset

SDMA 5.x only supports engine soft reset which resets
all queues on the engine.  As such, we need to suspend
KFD queues around resets like we do for SDMA 4.x.

Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/i915/gt: Fix timeline left held on VMA alloc error
Janusz Krzysztofik [Wed, 11 Jun 2025 10:42:13 +0000 (12:42 +0200)] 
drm/i915/gt: Fix timeline left held on VMA alloc error

The following error has been reported sporadically by CI when a test
unbinds the i915 driver on a ring submission platform:

<4> [239.330153] ------------[ cut here ]------------
<4> [239.330166] i915 0000:00:02.0: [drm] drm_WARN_ON(dev_priv->mm.shrink_count)
<4> [239.330196] WARNING: CPU: 1 PID: 18570 at drivers/gpu/drm/i915/i915_gem.c:1309 i915_gem_cleanup_early+0x13e/0x150 [i915]
...
<4> [239.330640] RIP: 0010:i915_gem_cleanup_early+0x13e/0x150 [i915]
...
<4> [239.330942] Call Trace:
<4> [239.330944]  <TASK>
<4> [239.330949]  i915_driver_late_release+0x2b/0xa0 [i915]
<4> [239.331202]  i915_driver_release+0x86/0xa0 [i915]
<4> [239.331482]  devm_drm_dev_init_release+0x61/0x90
<4> [239.331494]  devm_action_release+0x15/0x30
<4> [239.331504]  release_nodes+0x3d/0x120
<4> [239.331517]  devres_release_all+0x96/0xd0
<4> [239.331533]  device_unbind_cleanup+0x12/0x80
<4> [239.331543]  device_release_driver_internal+0x23a/0x280
<4> [239.331550]  ? bus_find_device+0xa5/0xe0
<4> [239.331563]  device_driver_detach+0x14/0x20
...
<4> [357.719679] ---[ end trace 0000000000000000 ]---

If the test also unloads the i915 module then that's followed with:

<3> [357.787478] =============================================================================
<3> [357.788006] BUG i915_vma (Tainted: G     U  W        N ): Objects remaining on __kmem_cache_shutdown()
<3> [357.788031] -----------------------------------------------------------------------------
<3> [357.788204] Object 0xffff888109e7f480 @offset=29824
<3> [357.788670] Allocated in i915_vma_instance+0xee/0xc10 [i915] age=292729 cpu=4 pid=2244
<4> [357.788994]  i915_vma_instance+0xee/0xc10 [i915]
<4> [357.789290]  init_status_page+0x7b/0x420 [i915]
<4> [357.789532]  intel_engines_init+0x1d8/0x980 [i915]
<4> [357.789772]  intel_gt_init+0x175/0x450 [i915]
<4> [357.790014]  i915_gem_init+0x113/0x340 [i915]
<4> [357.790281]  i915_driver_probe+0x847/0xed0 [i915]
<4> [357.790504]  i915_pci_probe+0xe6/0x220 [i915]
...

Closer analysis of CI results history has revealed a dependency of the
error on a few IGT tests, namely:
- igt@api_intel_allocator@fork-simple-stress-signal,
- igt@api_intel_allocator@two-level-inception-interruptible,
- igt@gem_linear_blits@interruptible,
- igt@prime_mmap_coherency@ioctl-errors,
which invisibly trigger the issue, then exhibited with first driver unbind
attempt.

All of the above tests perform actions which are actively interrupted with
signals.  Further debugging has allowed to narrow that scope down to
DRM_IOCTL_I915_GEM_EXECBUFFER2, and ring_context_alloc(), specific to ring
submission, in particular.

If successful then that function, or its execlists or GuC submission
equivalent, is supposed to be called only once per GEM context engine,
followed by raise of a flag that prevents the function from being called
again.  The function is expected to unwind its internal errors itself, so
it may be safely called once more after it returns an error.

In case of ring submission, the function first gets a reference to the
engine's legacy timeline and then allocates a VMA.  If the VMA allocation
fails, e.g. when i915_vma_instance() called from inside is interrupted
with a signal, then ring_context_alloc() fails, leaving the timeline held
referenced.  On next I915_GEM_EXECBUFFER2 IOCTL, another reference to the
timeline is got, and only that last one is put on successful completion.
As a consequence, the legacy timeline, with its underlying engine status
page's VMA object, is still held and not released on driver unbind.

Get the legacy timeline only after successful allocation of the context
engine's VMA.

v2: Add a note on other submission methods (Krzysztof Karas):
    Both execlists and GuC submission use lrc_alloc() which seems free
    from a similar issue.

Fixes: 75d0a7f31eec ("drm/i915: Lift timeline into intel_context")
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12061
Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Sebastian Brzezinka <sebastian.brzezinka@intel.com>
Reviewed-by: Krzysztof Niemiec <krzysztof.niemiec@intel.com>
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Reviewed-by: Nitin Gote <nitin.r.gote@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://lore.kernel.org/r/20250611104352.1014011-2-janusz.krzysztofik@linux.intel.com
6 weeks agodrm/i915: move GEM_QUIRK_PIN_SWIZZLED_PAGES to i915_gem.h
Krzysztof Karas [Wed, 18 Jun 2025 13:52:22 +0000 (13:52 +0000)] 
drm/i915: move GEM_QUIRK_PIN_SWIZZLED_PAGES to i915_gem.h

Move this macro where other GEM_* definitions live.
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://lore.kernel.org/all/ca83a9d8aa86bb92de84c31fd075e92a61f78895.1750251040.git.krzysztof.karas@intel.com
6 weeks agodrm/i915: Move out engine related macros from i915_drv.h
Krzysztof Karas [Wed, 18 Jun 2025 13:51:30 +0000 (13:51 +0000)] 
drm/i915: Move out engine related macros from i915_drv.h

Move macros related to engines out of i915_drv.h header and
place them in intel_engine.h.

Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://lore.kernel.org/r/9b9ed5bbdb37470fa679c5baf961424c9cfbad11.1750251040.git.krzysztof.karas@intel.com
6 weeks agoMerge tag 'drm-misc-next-2025-06-26' of https://gitlab.freedesktop.org/drm/misc/kerne...
Dave Airlie [Thu, 26 Jun 2025 23:57:43 +0000 (09:57 +1000)] 
Merge tag 'drm-misc-next-2025-06-26' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next

drm-misc-next for 6.17:

UAPI Changes:

Cross-subsystem Changes:

Core Changes:
- ci: Add Device tree validation and kunit
- connector: Move HDR sink metadat to drm_display_info

Driver Changes:
- bochs: drm_panic Support
- panfrost: MT8370 Support

- bridge:
  - tc358767: Convert to devm_drm_bridge_alloc()

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <mripard@redhat.com>
Link: https://lore.kernel.org/r/20250626-sincere-loon-of-effort-6dbdf9@houat
6 weeks agodrm/nouveau/disp: Use dev->dev to get the device
Sakari Ailus [Wed, 9 Apr 2025 10:33:44 +0000 (13:33 +0300)] 
drm/nouveau/disp: Use dev->dev to get the device

The local variable dev points to drm->dev already, use dev directly.

Link: https://lore.kernel.org/r/20250409103344.3661603-1-sakari.ailus@linux.intel.com
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
6 weeks agodrm/i915: reduce stack usage in igt_vma_pin1()
Arnd Bergmann [Fri, 20 Jun 2025 11:36:38 +0000 (13:36 +0200)] 
drm/i915: reduce stack usage in igt_vma_pin1()

The igt_vma_pin1() function has a rather high stack usage, which gets
in the way of reducing the default warning limit:

In file included from drivers/gpu/drm/i915/i915_vma.c:2285:
drivers/gpu/drm/i915/selftests/i915_vma.c:257:12: error: stack frame size (1288) exceeds limit (1280) in 'igt_vma_pin1' [-Werror,-Wframe-larger-than]

There are two things going on here:

 - The on-stack modes[] array is really large itself and gets constructed
   for every call, using around 1000 bytes itself depending on the configuration.

 - The call to i915_vma_pin() gets inlined and adds another 200 bytes for
   the i915_gem_ww_ctx structure since commit 7d1c2618eac5 ("drm/i915: Take
   reservation lock around i915_vma_pin.")

The second one is easy enough to change, by moving the function into the
appropriate .c file. Since it is already large enough to not always be
inlined, this seems like a good idea regardless, reducing both the code size
and the internal stack usage of each of its 67 callers.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20250620113644.3844552-1-arnd@kernel.org
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
6 weeks agodrm/i915: fix build error some more
Arnd Bergmann [Fri, 20 Jun 2025 11:18:18 +0000 (13:18 +0200)] 
drm/i915: fix build error some more

An earlier patch fixed a build failure with clang, but I still see the
same problem with some configurations using gcc:

drivers/gpu/drm/i915/i915_pmu.c: In function 'config_mask':
include/linux/compiler_types.h:568:38: error: call to '__compiletime_assert_462' declared with attribute error: BUILD_BUG_ON failed: bit > BITS_PER_TYPE(typeof_member(struct i915_pmu, enable)) - 1
drivers/gpu/drm/i915/i915_pmu.c:116:3: note: in expansion of macro 'BUILD_BUG_ON'
  116 |   BUILD_BUG_ON(bit >

As I understand it, the problem is that the function is not always fully
inlined, but the __builtin_constant_p() can still evaluate the argument
as being constant.

Marking it as __always_inline so far works for me in all configurations.

Fixes: 686d773186bf ("drm/i915/pmu: Fix build error with GCOV and AutoFDO enabled")
Fixes: a644fde77ff7 ("drm/i915/pmu: Change bitmask of enabled events to u32")
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20250620111824.3395007-1-arnd@kernel.org
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
6 weeks agodrm/amd/display: Add sanity checks for drm_edid_raw()
Takashi Iwai [Mon, 16 Jun 2025 16:08:41 +0000 (18:08 +0200)] 
drm/amd/display: Add sanity checks for drm_edid_raw()

When EDID is retrieved via drm_edid_raw(), it doesn't guarantee to
return proper EDID bytes the caller wants: it may be either NULL (that
leads to an Oops) or with too long bytes over the fixed size raw_edid
array (that may lead to memory corruption).  The latter was reported
actually when connected with a bad adapter.

Add sanity checks for drm_edid_raw() to address the above corner
cases, and return EDID_BAD_INPUT accordingly.

Fixes: 48edb2a4256e ("drm/amd/display: switch amdgpu_dm_connector to use struct drm_edid")
Link: https://bugzilla.suse.com/show_bug.cgi?id=1236415
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/pm: revise the pcie dpm parameters
Kenneth Feng [Mon, 23 Jun 2025 09:02:51 +0000 (17:02 +0800)] 
drm/amd/pm: revise the pcie dpm parameters

revise the pcie dpm parameters

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/display: Add a trace event for brightness programming
Mario Limonciello [Mon, 23 Jun 2025 17:11:14 +0000 (12:11 -0500)] 
drm/amd/display: Add a trace event for brightness programming

[Why]
Brightness programming may involve a conversion of a user requested
brightness against what was in a custom brightness curve. The values
might not match what a user programmed.

[How]
Add a new trace event to show specific converted brightness values.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Link: https://lore.kernel.org/r/20250623171114.1156451-2-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/display: Fix AMDGPU_MAX_BL_LEVEL value
Mario Limonciello [Mon, 23 Jun 2025 17:11:13 +0000 (12:11 -0500)] 
drm/amd/display: Fix AMDGPU_MAX_BL_LEVEL value

[Why]
commit 16dc8bc27c2a ("drm/amd/display: Export full brightness range to
userspace") adjusted the brightness range to scale to larger values, but
missed updating AMDGPU_MAX_BL_LEVEL which is needed to make sure that
scaling works properly with custom brightness curves.

[How]
As the change for max brightness of 0xFFFF only applies to devices
supporting DC, use existing DC define MAX_BACKLIGHT_LEVEL.

Fixes: 16dc8bc27c2a ("drm/amd/display: Export full brightness range to userspace")
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20250623171114.1156451-1-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd: Fix spelling mistake "correctalbe" -> "correctable"
Colin Ian King [Mon, 23 Jun 2025 08:41:08 +0000 (09:41 +0100)] 
drm/amd: Fix spelling mistake "correctalbe" -> "correctable"

There is a spelling mistake in a pr_info message. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu/sdma7: add ucode version checks for userq support
Alex Deucher [Fri, 20 Jun 2025 15:39:22 +0000 (11:39 -0400)] 
drm/amdgpu/sdma7: add ucode version checks for userq support

SDMA 7.0.0/1: 7836028

Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu/sdma6: add ucode version checks for userq support
Alex Deucher [Thu, 19 Jun 2025 21:56:29 +0000 (17:56 -0400)] 
drm/amdgpu/sdma6: add ucode version checks for userq support

SDMA 6.0.0 version 24
SDMA 6.0.2 version 21
SDMA 6.0.3 version 25

Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu: Add more checks to PSP mailbox
Lijo Lazar [Mon, 2 Jun 2025 07:25:14 +0000 (12:55 +0530)] 
drm/amdgpu: Add more checks to PSP mailbox

Instead of checking the response flag, use status mask also to check
against any unexpected failures like a device drop. Also, log error if
waiting on a psp response fails/times out.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu: Convert init_mem_ranges into common helpers
Hawking Zhang [Sat, 21 Jun 2025 13:27:22 +0000 (21:27 +0800)] 
drm/amdgpu: Convert init_mem_ranges into common helpers

They can be shared across multiple products

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu: Generalize is_multi_chiplet with a common helper v2
Hawking Zhang [Mon, 16 Jun 2025 09:17:30 +0000 (17:17 +0800)] 
drm/amdgpu: Generalize is_multi_chiplet with a common helper v2

It is not necessary to be ip generation specific

v2: rename the helper to is_multi_aid (Lijo)

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu: Convert query_memory_partition into common helpers
Hawking Zhang [Sat, 21 Jun 2025 13:11:30 +0000 (21:11 +0800)] 
drm/amdgpu: Convert query_memory_partition into common helpers

The query_memory_partition does not need to remain
as soc specific callbacks. They can be shared across
multiple products

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu: Move MAX_MEM_RANGES to amdgpu_gmc.h
Hawking Zhang [Fri, 13 Jun 2025 11:58:49 +0000 (19:58 +0800)] 
drm/amdgpu: Move MAX_MEM_RANGES to amdgpu_gmc.h

This relocation allows MAX_MEM_RANGES to be shared
across multiple products

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu: Convert pre|post_partition_switch into common helpers
Hawking Zhang [Thu, 5 Jun 2025 08:39:24 +0000 (16:39 +0800)] 
drm/amdgpu: Convert pre|post_partition_switch into common helpers

So they can be reused for future products

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu: Convert update_supported_modes into a common helper
Hawking Zhang [Thu, 5 Jun 2025 07:36:17 +0000 (15:36 +0800)] 
drm/amdgpu: Convert update_supported_modes into a common helper

So it can be used for future products

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu: Convert update_partition_sched_list into a common helper v3
Hawking Zhang [Mon, 16 Jun 2025 09:07:26 +0000 (17:07 +0800)] 
drm/amdgpu: Convert update_partition_sched_list into a common helper v3

The update_partition_sched_list function does not
need to remain as a soc specific callback. It can
be reused for future products.

v2: bypass the function if xcp_mgr is not available (Likun)

v3: Let caller check the availability of xcp_mgr (Lijo)

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu: Convert select_sched into a common helper v3
Hawking Zhang [Mon, 16 Jun 2025 09:05:05 +0000 (17:05 +0800)] 
drm/amdgpu: Convert select_sched into a common helper v3

The xcp select_sched function does not need to
remain as a soc specific callback. It can be reused
for future products

v2: bypass the function if xcp_mgr is not available (Likun)

v3: Let caller check the availability of xcp mgr (Lijo)

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu: use common function to map ip for aqua_vanjaram
Likun Gao [Fri, 21 Mar 2025 06:14:06 +0000 (14:14 +0800)] 
drm/amdgpu: use common function to map ip for aqua_vanjaram

Transfer to use function amdgpu_ip_map_init to map ip
instance for aqua_vanjaram instead of operation on
different ASIC.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu: make ip map init to common function
Likun Gao [Fri, 13 Jun 2025 14:30:34 +0000 (22:30 +0800)] 
drm/amdgpu: make ip map init to common function

IP instance map init function can be an common function
instead of operation on different ASIC.
V2: Create amdgpu_ip.[ch] file for ip related functions.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/radeon/evergreen_cs: lower evergreen_surface_check_linear_aligned restriction
Patrick Lerda [Tue, 10 Jun 2025 19:12:23 +0000 (21:12 +0200)] 
drm/radeon/evergreen_cs: lower evergreen_surface_check_linear_aligned restriction

This change removes the restriction when palign=64 and nbx=32.
This makes two piglit tests working. This is discussed on the
thread linked below.

Link: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9056
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/radeon/evergreen_cs: implement cond_exec and cond_write
Patrick Lerda [Tue, 10 Jun 2025 19:12:22 +0000 (21:12 +0200)] 
drm/radeon/evergreen_cs: implement cond_exec and cond_write

This change implements the support of PACKET3_COND_EXEC and
PACKET3_COND_WRITE which are required to implement
ARB_indirect_parameters. ARB_indirect_parameters is
part of OpenGL 4.6.

Link: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34726
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/amdgpu: Refine isp_v4_1_1 logging
Pratap Nirujogi [Tue, 10 Jun 2025 20:08:15 +0000 (16:08 -0400)] 
drm/amd/amdgpu: Refine isp_v4_1_1 logging

Replace DRM_ERROR with drm_err function and update log
messages to drop __func__ and print return value.

Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Pratap Nirujogi <pratap.nirujogi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/amdgpu: Add ISP Generic PM Domain (genpd) support
Pratap Nirujogi [Thu, 29 May 2025 19:21:48 +0000 (15:21 -0400)] 
drm/amd/amdgpu: Add ISP Generic PM Domain (genpd) support

AMDISP I2C device requires to power on ISP HW to probe the sensor
device. Instead of using the exported symbols from ISP driver to
control the power and clocks remotely,added Generic PM Domain (genpd)
support in amdgpu_isp device for its child devices (amd_isp_capture,
amd_isp_i2c_designware) to set power and clocks using PM methods.

Co-developed-by: Bin Du <bin.du@amd.com>
Signed-off-by: Bin Du <bin.du@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Pratap Nirujogi <pratap.nirujogi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/pm: Add support to set min ISP clocks
Pratap Nirujogi [Mon, 16 Jun 2025 18:16:43 +0000 (14:16 -0400)] 
drm/amd/pm: Add support to set min ISP clocks

Add support to set ISP clocks for SMU v14.0.0. ISP driver
uses amdgpu_dpm_set_soft_freq_range() API to set clocks via
SMU interface than communicating with PMFW directly.

amdgpu_dpm_set_soft_freq_range() is updated to take in any
pp_clock_type than limiting to support only PP_SCLK to allow
ISP and other driver modules to set the min/max clocks. Any
clock specific restrictions are expected to be taken care in
SOC specific SMU implementations instead of generic amdgpu_dpm
and amdgpu_smu interfaces.

Reviewed-by: Xiaojian Du <xiaojian.du@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Pratap Nirujogi <pratap.nirujogi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/pm: Add support to set ISP Power
Pratap Nirujogi [Mon, 16 Jun 2025 19:07:57 +0000 (15:07 -0400)] 
drm/amd/pm: Add support to set ISP Power

Add support to set ISP power for SMU v14.0.0. ISP driver
uses amdgpu_dpm_set_powergating_by_smu() API to
enable / disable power via SMU interface than communicating
with PMFW directly.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Pratap Nirujogi <pratap.nirujogi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu: fix slab-use-after-free in amdgpu_userq_mgr_fini+0x70c
Vitaly Prosyak [Wed, 18 Jun 2025 23:35:21 +0000 (19:35 -0400)] 
drm/amdgpu: fix slab-use-after-free in amdgpu_userq_mgr_fini+0x70c

The issue was reproduced on NV10 using IGT pci_unplug test.
It is expected that `amdgpu_driver_postclose_kms()` is called prior to `amdgpu_drm_release()`.
However, the bug is that `amdgpu_fpriv` was freed in `amdgpu_driver_postclose_kms()`, and then
later accessed in `amdgpu_drm_release()` via a call to `amdgpu_userq_mgr_fini()`.
As a result, KASAN detected a use-after-free condition, as shown in the log below.
The proposed fix is to move the calls to `amdgpu_eviction_fence_destroy()` and
`amdgpu_userq_mgr_fini()` into `amdgpu_driver_postclose_kms()`, so they are invoked before
`amdgpu_fpriv` is freed.

This also ensures symmetry with the initialization path in `amdgpu_driver_open_kms()`,
where the following components are initialized:
- `amdgpu_userq_mgr_init()`
- `amdgpu_eviction_fence_init()`
- `amdgpu_ctx_mgr_init()`

Correspondingly, in `amdgpu_driver_postclose_kms()` we should clean up using:
- `amdgpu_userq_mgr_fini()`
- `amdgpu_eviction_fence_destroy()`
- `amdgpu_ctx_mgr_fini()`

This change eliminates the use-after-free and improves consistency in resource management between open and close paths.

[  +0.094367] ==================================================================
[  +0.000026] BUG: KASAN: slab-use-after-free in amdgpu_userq_mgr_fini+0x70c/0x730 [amdgpu]
[  +0.000866] Write of size 8 at addr ffff88811c068c60 by task amd_pci_unplug/1737
[  +0.000026] CPU: 3 UID: 0 PID: 1737 Comm: amd_pci_unplug Not tainted 6.14.0+ #2
[  +0.000008] Hardware name: ASUS System Product Name/ROG STRIX B550-F GAMING (WI-FI), BIOS 1401 12/03/2020
[  +0.000004] Call Trace:
[  +0.000004]  <TASK>
[  +0.000003]  dump_stack_lvl+0x76/0xa0
[  +0.000010]  print_report+0xce/0x600
[  +0.000009]  ? amdgpu_userq_mgr_fini+0x70c/0x730 [amdgpu]
[  +0.000790]  ? srso_return_thunk+0x5/0x5f
[  +0.000007]  ? kasan_complete_mode_report_info+0x76/0x200
[  +0.000008]  ? amdgpu_userq_mgr_fini+0x70c/0x730 [amdgpu]
[  +0.000684]  kasan_report+0xbe/0x110
[  +0.000007]  ? amdgpu_userq_mgr_fini+0x70c/0x730 [amdgpu]
[  +0.000601]  __asan_report_store8_noabort+0x17/0x30
[  +0.000007]  amdgpu_userq_mgr_fini+0x70c/0x730 [amdgpu]
[  +0.000801]  ? __pfx_amdgpu_userq_mgr_fini+0x10/0x10 [amdgpu]
[  +0.000819]  ? srso_return_thunk+0x5/0x5f
[  +0.000008]  amdgpu_drm_release+0xa3/0xe0 [amdgpu]
[  +0.000604]  __fput+0x354/0xa90
[  +0.000010]  __fput_sync+0x59/0x80
[  +0.000005]  __x64_sys_close+0x7d/0xe0
[  +0.000006]  x64_sys_call+0x2505/0x26f0
[  +0.000006]  do_syscall_64+0x7c/0x170
[  +0.000004]  ? kasan_record_aux_stack+0xae/0xd0
[  +0.000005]  ? srso_return_thunk+0x5/0x5f
[  +0.000004]  ? kmem_cache_free+0x398/0x580
[  +0.000006]  ? __fput+0x543/0xa90
[  +0.000006]  ? srso_return_thunk+0x5/0x5f
[  +0.000004]  ? __fput+0x543/0xa90
[  +0.000004]  ? __kasan_check_read+0x11/0x20
[  +0.000007]  ? srso_return_thunk+0x5/0x5f
[  +0.000004]  ? __kasan_check_read+0x11/0x20
[  +0.000003]  ? srso_return_thunk+0x5/0x5f
[  +0.000004]  ? fpregs_assert_state_consistent+0x21/0xb0
[  +0.000006]  ? srso_return_thunk+0x5/0x5f
[  +0.000004]  ? syscall_exit_to_user_mode+0x4e/0x240
[  +0.000005]  ? srso_return_thunk+0x5/0x5f
[  +0.000004]  ? do_syscall_64+0x88/0x170
[  +0.000003]  ? srso_return_thunk+0x5/0x5f
[  +0.000004]  ? do_syscall_64+0x88/0x170
[  +0.000004]  ? srso_return_thunk+0x5/0x5f
[  +0.000004]  ? irqentry_exit+0x43/0x50
[  +0.000004]  ? srso_return_thunk+0x5/0x5f
[  +0.000004]  ? exc_page_fault+0x7c/0x110
[  +0.000006]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  +0.000005] RIP: 0033:0x7ffff7b14f67
[  +0.000005] Code: ff e8 0d 16 02 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 73 ba f7 ff
[  +0.000004] RSP: 002b:00007fffffffe358 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
[  +0.000006] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ffff7b14f67
[  +0.000003] RDX: 0000000000000000 RSI: 00007ffff7f5755a RDI: 0000000000000003
[  +0.000003] RBP: 00007fffffffe380 R08: 0000555555568170 R09: 0000000000000000
[  +0.000003] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fffffffe5c8
[  +0.000003] R13: 00005555555552a9 R14: 0000555555557d48 R15: 00007ffff7ffd040
[  +0.000007]  </TASK>

[  +0.000286] Allocated by task 425 on cpu 11 at 29.751192s:
[  +0.000013]  kasan_save_stack+0x28/0x60
[  +0.000008]  kasan_save_track+0x18/0x70
[  +0.000006]  kasan_save_alloc_info+0x38/0x60
[  +0.000006]  __kasan_kmalloc+0xc1/0xd0
[  +0.000005]  __kmalloc_cache_noprof+0x1bd/0x430
[  +0.000006]  amdgpu_driver_open_kms+0x172/0x760 [amdgpu]
[  +0.000521]  drm_file_alloc+0x569/0x9a0
[  +0.000008]  drm_client_init+0x1b7/0x410
[  +0.000007]  drm_fbdev_client_setup+0x174/0x470
[  +0.000007]  drm_client_setup+0x8a/0xf0
[  +0.000006]  amdgpu_pci_probe+0x50b/0x10d0 [amdgpu]
[  +0.000482]  local_pci_probe+0xe7/0x1b0
[  +0.000008]  pci_device_probe+0x5bf/0x890
[  +0.000005]  really_probe+0x1fd/0x950
[  +0.000007]  __driver_probe_device+0x307/0x410
[  +0.000005]  driver_probe_device+0x4e/0x150
[  +0.000006]  __driver_attach+0x223/0x510
[  +0.000005]  bus_for_each_dev+0x102/0x1a0
[  +0.000006]  driver_attach+0x3d/0x60
[  +0.000005]  bus_add_driver+0x309/0x650
[  +0.000005]  driver_register+0x13d/0x490
[  +0.000006]  __pci_register_driver+0x1ee/0x2b0
[  +0.000006]  xfrm_ealg_get_byidx+0x43/0x50 [xfrm_algo]
[  +0.000008]  do_one_initcall+0x9c/0x3e0
[  +0.000007]  do_init_module+0x29e/0x7f0
[  +0.000006]  load_module+0x5c75/0x7c80
[  +0.000006]  init_module_from_file+0x106/0x180
[  +0.000007]  idempotent_init_module+0x377/0x740
[  +0.000006]  __x64_sys_finit_module+0xd7/0x180
[  +0.000006]  x64_sys_call+0x1f0b/0x26f0
[  +0.000006]  do_syscall_64+0x7c/0x170
[  +0.000005]  entry_SYSCALL_64_after_hwframe+0x76/0x7e

[  +0.000013] Freed by task 1737 on cpu 9 at 76.455063s:
[  +0.000010]  kasan_save_stack+0x28/0x60
[  +0.000006]  kasan_save_track+0x18/0x70
[  +0.000005]  kasan_save_free_info+0x3b/0x60
[  +0.000006]  __kasan_slab_free+0x54/0x80
[  +0.000005]  kfree+0x127/0x470
[  +0.000006]  amdgpu_driver_postclose_kms+0x455/0x760 [amdgpu]
[  +0.000485]  drm_file_free.part.0+0x5b1/0xba0
[  +0.000007]  drm_file_free+0x13/0x30
[  +0.000006]  drm_client_release+0x1c4/0x2b0
[  +0.000006]  drm_fbdev_ttm_fb_destroy+0xd2/0x120 [drm_ttm_helper]
[  +0.000007]  put_fb_info+0x97/0xe0
[  +0.000006]  unregister_framebuffer+0x197/0x380
[  +0.000005]  drm_fb_helper_unregister_info+0x94/0x100
[  +0.000005]  drm_fbdev_client_unregister+0x3c/0x80
[  +0.000007]  drm_client_dev_unregister+0x144/0x330
[  +0.000006]  drm_dev_unregister+0x49/0x1b0
[  +0.000006]  drm_dev_unplug+0x4c/0xd0
[  +0.000006]  amdgpu_pci_remove+0x58/0x130 [amdgpu]
[  +0.000482]  pci_device_remove+0xae/0x1e0
[  +0.000006]  device_remove+0xc7/0x180
[  +0.000006]  device_release_driver_internal+0x3d4/0x5a0
[  +0.000007]  device_release_driver+0x12/0x20
[  +0.000006]  pci_stop_bus_device+0x104/0x150
[  +0.000006]  pci_stop_and_remove_bus_device_locked+0x1b/0x40
[  +0.000005]  remove_store+0xd7/0xf0
[  +0.000007]  dev_attr_store+0x3f/0x80
[  +0.000006]  sysfs_kf_write+0x125/0x1d0
[  +0.000005]  kernfs_fop_write_iter+0x2ea/0x490
[  +0.000007]  vfs_write+0x90d/0xe70
[  +0.000006]  ksys_write+0x119/0x220
[  +0.000006]  __x64_sys_write+0x72/0xc0
[  +0.000006]  x64_sys_call+0x18ab/0x26f0
[  +0.000005]  do_syscall_64+0x7c/0x170
[  +0.000005]  entry_SYSCALL_64_after_hwframe+0x76/0x7e

[  +0.000013] The buggy address belongs to the object at ffff88811c068000
               which belongs to the cache kmalloc-rnd-01-4k of size 4096
[  +0.000016] The buggy address is located 3168 bytes inside of
               freed 4096-byte region [ffff88811c068000ffff88811c069000)

[  +0.000022] The buggy address belongs to the physical page:
[  +0.000010] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0xffff88811c06e000 pfn:0x11c068
[  +0.000006] head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
[  +0.000006] flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff)
[  +0.000007] page_type: f5(slab)
[  +0.000007] raw: 0017ffffc0000040 ffff88810004c140 dead000000000122 0000000000000000
[  +0.000005] raw: ffff88811c06e000 0000000080040002 00000000f5000000 0000000000000000
[  +0.000006] head: 0017ffffc0000040 ffff88810004c140 dead000000000122 0000000000000000
[  +0.000005] head: ffff88811c06e000 0000000080040002 00000000f5000000 0000000000000000
[  +0.000006] head: 0017ffffc0000003 ffffea0004701a01 ffffffffffffffff 0000000000000000
[  +0.000005] head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000
[  +0.000004] page dumped because: kasan: bad access detected

[  +0.000011] Memory state around the buggy address:
[  +0.000009]  ffff88811c068b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  +0.000012]  ffff88811c068b80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  +0.000011] >ffff88811c068c00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  +0.000011]                                                        ^
[  +0.000010]  ffff88811c068c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  +0.000011]  ffff88811c068d00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  +0.000011] ==================================================================

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Lijo Lazar <lijo.lazar@amd.com>
Cc: Jesse Zhang <Jesse.Zhang@amd.com>
Cc: Arvind Yadav <arvind.yadav@amd.com>
v2: drop amdgpu_drm_release() and assign drm_release()
    as the callback directly.(Alex)

Fixes: adba0929736a ("drm/amdgpu: Fix Illegal opcode in command stream Error")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd: Add missing kdoc for amd_ip_funcs `complete` callback
Mario Limonciello [Fri, 20 Jun 2025 04:14:20 +0000 (23:14 -0500)] 
drm/amd: Add missing kdoc for amd_ip_funcs `complete` callback

The `complete` callback should be described in kernel doc.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Closes: https://lore.kernel.org/linux-next/20250619205931.41cf9332@canb.auug.org.au/
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20250620041420.3585005-1-superm1@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu: remove fence slab
Alex Deucher [Mon, 16 Jun 2025 18:28:32 +0000 (14:28 -0400)] 
drm/amdgpu: remove fence slab

Just use kmalloc for the fences in the rare case we need
an independent fence.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd: Adjust output for discovery error handling
Mario Limonciello [Tue, 17 Jun 2025 18:30:52 +0000 (13:30 -0500)] 
drm/amd: Adjust output for discovery error handling

commit 017fbb6690c2 ("drm/amdgpu/discovery: check ip_discovery fw file
available") added support for reading an amdgpu IP discovery bin file
for some specific products. If it's not found then it will fallback to
hardcoded values. However if it's not found there is also a lot of noise
about missing files and errors.

Adjust the error handling to decrease most messages to DEBUG and to show
users less about missing files.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reported-by: Marcus Seyfarth <m.seyfarth@gmail.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4312
Tested-by: Marcus Seyfarth <m.seyfarth@gmail.com>
Fixes: 017fbb6690c2 ("drm/amdgpu/discovery: check ip_discovery fw file available")
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20250617183052.1692059-1-superm1@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/display: Promote DAL to 3.2.339
Taimur Hassan [Mon, 16 Jun 2025 02:38:55 +0000 (21:38 -0500)] 
drm/amd/display: Promote DAL to 3.2.339

Summary:

* Improve USB4 bandwidth validation
* dml clock calcuation with EQU Prefetch included
* Tweaking udelay time to fix "failed to blank crtc!" error
* Add LSDMA support to DMUB
* Fix Coverity issue

Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/display: [FW Promotion] Release 0.1.16.0
Taimur Hassan [Sun, 15 Jun 2025 15:36:32 +0000 (11:36 -0400)] 
drm/amd/display: [FW Promotion] Release 0.1.16.0

Summary for changes in firmware:

* Add DMCUB IPS commands and command parser support
* use OTG count to disable interrupts
* Fix dmub_cmd header data boundary issue
* remove the HW register override

Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/display: Add DMUB IPS command support for IPS residency tools
Ovidiu Bunea [Fri, 6 Jun 2025 19:49:22 +0000 (15:49 -0400)] 
drm/amd/display: Add DMUB IPS command support for IPS residency tools

[why & how]
Add DMUB IPS CMD interface for driver and
DMU to communicate for IPS residency tools.

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Ovidiu Bunea <Ovidiu.Bunea@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/display: Add num_slices_h to set_dto_dscclk signature
Ilya Bakoulin [Thu, 5 Jun 2025 15:48:23 +0000 (11:48 -0400)] 
drm/amd/display: Add num_slices_h to set_dto_dscclk signature

Add the number of horizontal slices argument to allow configuring clock
based on slice number.

Reviewed-by: Nevenko Stupar <nevenko.stupar@amd.com>
Signed-off-by: Ilya Bakoulin <Ilya.Bakoulin@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/display: DML21 Reintegration
Austin Zheng [Wed, 11 Jun 2025 14:09:51 +0000 (10:09 -0400)] 
drm/amd/display: DML21 Reintegration

Update logging macros for detailed debugging
Update structs to contain more detailed information
Add HDMI 16 and 20 Gbps rates

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Austin Zheng <Austin.Zheng@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/display: Rewording Mode Validation Result
Fangzhi Zuo [Thu, 12 Jun 2025 15:25:03 +0000 (11:25 -0400)] 
drm/amd/display: Rewording Mode Validation Result

It is normal to prune resolutions that exceed hw or bw limitation.
Use error oriented wordings could cause misunderstanding.

Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/display: LSDMA support
Ostrowski Rafal [Thu, 12 Jun 2025 07:08:51 +0000 (09:08 +0200)] 
drm/amd/display: LSDMA support

[Why]
Driver should be able to send LSDMA commands to DMCUB

[How]
Driver can now send LSDMA commands to DMCUB.
DMCUB should process them and send to LSDMA controller.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Ostrowski Rafal <rostrows@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/display: Remove redundant macro of refresh rate
Weiguang Li [Thu, 12 Jun 2025 03:10:12 +0000 (11:10 +0800)] 
drm/amd/display: Remove redundant macro of refresh rate

[Why&How]
Found that we add redundant macro on refresh rate when calculating vtotal,
so we remove it.

Reviewed-by: Robin Chen <robin.chen@amd.com>
Signed-off-by: Weiguang Li <wei-guang.li@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/display: Fix 'failed to blank crtc!'
Wen Chen [Mon, 2 Jun 2025 20:37:08 +0000 (16:37 -0400)] 
drm/amd/display: Fix 'failed to blank crtc!'

[why]
DCN35 is having “DC: failed to blank crtc!” when running HPO
test cases. It's caused by not having sufficient udelay time.

[how]
Replace the old wait_for_blank_complete function with fsleep function to
sleep just until the next frame should come up. This way it doesn't poll
in case the pixel clock or other clock was bugged or until vactive and
the vblank are hit again.

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Wen Chen <Wen.Chen3@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/display: Initialize mode_select to 0
Alex Hung [Tue, 10 Jun 2025 21:40:18 +0000 (15:40 -0600)] 
drm/amd/display: Initialize mode_select to 0

[WHAT]
mode_select was supposed to be initialized in mpc_read_gamut_remap but
is not set in default case. This can cause indeterminate
behaviors.

This is reported as an UNINIT error by Coverity.

Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/display: Add new DP tunnel bandwidth validation
Cruise Hung [Tue, 3 Jun 2025 06:30:59 +0000 (14:30 +0800)] 
drm/amd/display: Add new DP tunnel bandwidth validation

[Why & How]
Add new function for DP tunnel bandwidth validation.
It uses the estimated BW and allocated BW to validate the timings.

Reviewed-by: PeiChen Huang <peichen.huang@amd.com>
Reviewed-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com>
Signed-off-by: Cruise Hung <Cruise.Hung@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/display: Removed unnecessary comment
Alvin Lee [Wed, 14 May 2025 21:16:50 +0000 (17:16 -0400)] 
drm/amd/display: Removed unnecessary comment

Reviewed-by: Sridevi Arvindekar <sridevi.arvindekar@amd.com>
Signed-off-by: Alvin Lee <Alvin.Lee2@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/display: Include EQU Prefetch Bandwidth For Bandwidth Calculations
Austin Zheng [Thu, 5 Jun 2025 20:37:36 +0000 (16:37 -0400)] 
drm/amd/display: Include EQU Prefetch Bandwidth For Bandwidth Calculations

[Why]
Pixel data bandwidth required in mode programming (MP) ends up being
higher than what was calculated in mode support (MS) even though
the prefetch bandwidths calculated in MP are lower than the MS ones.

MP used a different equ prefetch schedule than MS which lead a
slight difference in parameters. This resulted in the pixel data
bandwidth in MP to be higher than MS.

[How]
Rename the RequiredPrefetchBWOTO term so it can be applied generically.
Update the value with the EQU bandwidth if the EQU schedule is used.
Get the max prefetch bandwidth that MS calculated and use it
as part of the calculations for required bandwidth.

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Austin Zheng <Austin.Zheng@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/pm: Fetch SMUv13.0.6 xgmi max speed/width
Lijo Lazar [Fri, 13 Jun 2025 12:34:51 +0000 (18:04 +0530)] 
drm/amd/pm: Fetch SMUv13.0.6 xgmi max speed/width

On SMUv13.0.6 SOCs, fetch the max values of xgmi speed/width from
firmware.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu/mes: add compatibility checks for set_hw_resource_1
Alex Deucher [Tue, 20 May 2025 14:02:14 +0000 (10:02 -0400)] 
drm/amdgpu/mes: add compatibility checks for set_hw_resource_1

Seems some older MES firmware versions do not properly support
this packet.  Add back some the compatibility checks.

v2: switch to fw version check (Shaoyun)

Fixes: f81cd793119e ("drm/amd/amdgpu: Fix MES init sequence")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4295
Cc: Shaoyun Liu <shaoyun.liu@amd.com>
Reviewed-by: shaoyun.liu <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu/gfx9: Add Cleaner Shader Support for GFX9.x GPUs
Srinivasan Shanmugam [Thu, 12 Jun 2025 14:41:14 +0000 (20:11 +0530)] 
drm/amdgpu/gfx9: Add Cleaner Shader Support for GFX9.x GPUs

Enable the cleaner shader for other GFX9.x series of GPUs to provide
data isolation between GPU workloads. The cleaner shader is responsible
for clearing the Local Data Store (LDS), Vector General Purpose
Registers (VGPRs), and Scalar General Purpose Registers (SGPRs), which
helps prevent data leakage and ensures accurate computation results.

This update extends cleaner shader support to GFX9.x GPUs, previously
available for GFX9.4.2. It enhances security by clearing GPU memory
between processes and maintains a consistent GPU state across KGD and
KFD workloads.

Cc: Manu Rastogi <manu.rastogi@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 weeks agodrm/connector: move HDR sink metadata to display info
Jani Nikula [Mon, 19 May 2025 11:29:00 +0000 (14:29 +0300)] 
drm/connector: move HDR sink metadata to display info

Information parsed from the display EDID should be stored in display
info. Move HDR sink metadata there.

Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20250519112900.1383997-1-jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
7 weeks agodrm/ci: Add jobs to run KUnit tests
Vignesh Raman [Mon, 23 Jun 2025 08:50:28 +0000 (14:20 +0530)] 
drm/ci: Add jobs to run KUnit tests

Add jobs to run KUnit tests using tools/testing/kunit/kunit.py tool.

Signed-off-by: Vignesh Raman <vignesh.raman@collabora.com>
Acked-by: Helen Koike <helen.fornazier@gmail.com>
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Link: https://lore.kernel.org/r/20250623085033.39680-3-vignesh.raman@collabora.com
Signed-off-by: Maxime Ripard <mripard@kernel.org>
7 weeks agodrm/ci: Add jobs to validate devicetrees
Vignesh Raman [Mon, 23 Jun 2025 08:50:27 +0000 (14:20 +0530)] 
drm/ci: Add jobs to validate devicetrees

Add jobs to run dt_binding_check and dtbs_check. If warnings are seen,
exit with a non-zero error code while configuring them as warning in
the GitLab CI pipeline.

Signed-off-by: Vignesh Raman <vignesh.raman@collabora.com>
Acked-by: Helen Koike <helen.fornazier@gmail.com>
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Acked-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20250623085033.39680-2-vignesh.raman@collabora.com
Signed-off-by: Maxime Ripard <mripard@kernel.org>
7 weeks agodrm/bochs: Add support for drm_panic
Ryosuke Yasuoka [Fri, 13 Jun 2025 13:20:14 +0000 (22:20 +0900)] 
drm/bochs: Add support for drm_panic

Add drm_panic module for bochs drm so that panic screen can be displayed
on panic.

Signed-off-by: Ryosuke Yasuoka <ryasuoka@redhat.com>
Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com>
Link: https://lore.kernel.org/r/20250613132023.106946-1-ryasuoka@redhat.com
Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
7 weeks agoMerge tag 'drm-intel-next-2025-06-18' of https://gitlab.freedesktop.org/drm/i915...
Dave Airlie [Mon, 23 Jun 2025 00:49:25 +0000 (10:49 +1000)] 
Merge tag 'drm-intel-next-2025-06-18' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next

drm/i915 feature pull for v6.17:

Features and functionality:
- Add support for DSC fractional link bpp on DP MST (Imre)
- Add support for simultaneous Panel Replay and Adaptive Sync (Jouni)
- Add support for PTL+ double buffered LUT registers (Chaitanya, Ville)
- Add PIPEDMC event handling in preparation for flip queue (Ville)

Refactoring and cleanups:
- Rename lots of DPLL interfaces to unify them (Suraj)
- Allocate struct intel_display dynamically (Jani)
- Abstract VLV IOSF sideband better (Jani)
- Use str_true_false() helper (Yumeng Fang)
- Refactor DSB code in preparation for flip queue (Ville)
- Use drm_modeset_lock_assert_held() instead of open coding (Luca)
- Remove unused arg from skl_scaler_get_filter_select() (Luca)
- Split out a separate display register header (Jani)
- Abstract DRAM detection better (Jani)
- Convert LPT/WPT SBI sideband to struct intel_display (Jani)

Fixes:
- Fix DSI HS command dispatch with forced pipeline flush (Gareth Yu)
- Fix BMG and LNL+ DP adaptive sync SDP programming (Ankit)
- Fix error path for xe display workqueue allocation (Haoxiang Li)
- Disable DP AUX access probe where not required (Imre)
- Fix DKL PHY access if the port is invalid (Luca)
- Fix PSR2_SU_STATUS access on ADL+ (Jouni)
- Add sanity checks for porch and sync on BXT/GLK DSI (Ville)

DRM core changes:
- Change AUX DPCD access probe address (Imre)
- Refactor EDID quirks, amd make them available to drivers (Imre)
- Add quirk for DPCD access probe (Imre)
- Add DPCD definitions for Panel Replay capabilities (Jouni)

Merges:
- Backmerges to sync with v6.15-rcs and v6.16-rc1 (Jani)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://lore.kernel.org/r/fff9f231850ed410bd81b53de43eff0b98240d31@intel.com
7 weeks agoarm64: dts: mediatek: mt8370: Enable gpu support
Louis-Alexis Eyraud [Fri, 9 May 2025 10:12:51 +0000 (12:12 +0200)] 
arm64: dts: mediatek: mt8370: Enable gpu support

Add a new gpu node in mt8370.dtsi to enable support for the
ARM Mali G57 MC2 GPU (Valhall-JM) found on the MT8370 SoC, using the
Panfrost driver.

On a Mediatek Genio 510 EVK board, the panfrost driver probed with the
following message:
```
panfrost 13000000.gpu: clock rate = 390000000
panfrost 13000000.gpu: mali-g57 id 0x9093 major 0x0 minor 0x0 status 0x0
panfrost 13000000.gpu: features: 00000000,000019f7, issues: 00000003,
   80000400
panfrost 13000000.gpu: Features: L2:0x08130206 Shader:0x00000000
   Tiler:0x00000809 Mem:0x1 MMU:0x00002830 AS:0xff JS:0x7
panfrost 13000000.gpu: shader_present=0x5 l2_present=0x1
[drm] Initialized panfrost 1.3.0 for 13000000.gpu on minor 0
```

Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Louis-Alexis Eyraud <louisalexis.eyraud@collabora.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Link: https://lore.kernel.org/r/20250509-mt8370-enable-gpu-v6-5-2833888cb1d3@collabora.com