]> git.ipfire.org Git - thirdparty/linux.git/log
thirdparty/linux.git
2 months agoMerge tag 'drm-xe-next-2026-03-25' of https://gitlab.freedesktop.org/drm/xe/kernel...
Dave Airlie [Fri, 27 Mar 2026 01:01:44 +0000 (11:01 +1000)] 
Merge tag 'drm-xe-next-2026-03-25' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next

Hi Dave and Sima,

Here goes our third, perhaps, final drm-xe-next PR towards 7.1.

In the big things we have:
- THP support in drm_pagemap
- xe_vm_get_property_ioctl

Thanks,
Matt

UAPI Changes:
- Implement xe_vm_get_property_ioctl (Jonathan)

Cross-subsystem Changes:
- Enable THP support in drm_pagemap (Francois, Brost)

Core Changes:
- Improve VF FLR synchronization for Xe VFIO (Piotr)

Driver Changes:
- Fix confusion with locals on context creation (Tomasz, Fixes)
- Add new SVM copy GT stats per size (Francois)
- always keep track of remap prev/next (Auld, Fixes)
- AuxCCS handling and render compression modifiers (Tvrtko)
- Implement recent spec updates to Wa_16025250150 (Roper)
- xe3p_lpg: L2 flush optimization (Tejas)
- vf: Improve getting clean NULL context (Wajdeczko)
- pf: Fix use-after-free in migration restore (Winiarski. Fixes)
- Fix format specifier for printing pointer differences (Nathan Chancellor, Fixes)
- Extend Wa_14026781792 for xe3lpg (Niton)
- xe3p_lpg: Add Wa_16029437861 (Varun)
- Fix spelling mistakes and comment style in ttm_resource.c (Varun)
- Merge drm/drm-next into drm-xe-next (Thomas)
- Fix missing runtime PM reference in ccs_mode_store (Sanjay, Fixes)
- Fix uninitialized new_ts when capturing context timestamp (Umesh)
- Allow reading after disabling OA stream (Ashutosh)
- Page Reclamation Fixes (Brian Nguyen, Fixes)
- Include running dword offset in default_lrc dumps (Roper)
- Assert/Deassert I2C IRQ (Raag)
- Fixup reset, wedge, unload corner cases (Zhanjun, Brost)
- Fail immediately on GuC load error (Daniele)
- Fix kernel-doc for DRM_XE_VM_BIND_FLAG_DECOMPRESS (Niton, Fixes)
- Drop redundant entries for Wa_16021867713 & Wa_14019449301 (Roper, Fixes)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/acS5xmWC3ivPTmyV@gsse-cloud1.jf.intel.com
2 months agoMerge tag 'amd-drm-next-7.1-2026-03-25' of https://gitlab.freedesktop.org/agd5f/linux...
Dave Airlie [Thu, 26 Mar 2026 23:30:34 +0000 (09:30 +1000)] 
Merge tag 'amd-drm-next-7.1-2026-03-25' of https://gitlab.freedesktop.org/agd5f/linux into drm-next

amd-drm-next-7.1-2026-03-25:

amdgpu:
- DSC fix
- Module parameter parsing fix
- PASID reuse fix
- drm_edid leak fix
- SMU 13.x fixes
- SMU 14.x fix
- Fence fix in amdgpu_amdkfd_submit_ib()
- LVDS fixes
- GPU page fault fix for non-4K pages
- Misc cleanups
- UserQ fixes
- SMU 15.0.8 support
- RAS updates
- Devcoredump fixes
- GFX queue priority fixes
- DPIA fixes
- DCN 4.2 updates
- Add debugfs interface for pcie64 registers
- SMU 15.x fixes
- VCN reset fixes
- Documentation fixes

amdkfd:
- Ordering fix in kfd_ioctl_create_process()

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patch.msgid.link/20260325175012.4185721-1-alexander.deucher@amd.com
2 months agodrm/xe: Fix confusion with locals on context creation
Tomasz Lis [Fri, 20 Mar 2026 14:57:33 +0000 (15:57 +0100)] 
drm/xe: Fix confusion with locals on context creation

After setting a local variable, check that local value rather that
checking destination at which the value will be stored later.

This fixes the obvious mistake in error path; without it,
allocation fail would lead to NULL dereference during context
creation.

Fixes: 89340099c6a4 ("drm/xe/lrc: Refactor context init into xe_lrc_ctx_init()")
Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Cc: Raag Jadav <raag.jadav@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260320145733.1337682-1-tomasz.lis@intel.com
2 months agodrm/xe: Add new SVM copy GT stats per size
Francois Dugast [Wed, 25 Mar 2026 16:01:52 +0000 (17:01 +0100)] 
drm/xe: Add new SVM copy GT stats per size

Breakdown the GT stats for copy to host and copy to device per size (4K,
64K 2M) to make it easier for user space to track memory migrations.
This is helpful to verify allocation alignment is correct when porting
applications to SVM.

Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260325160152.1057556-1-francois.dugast@intel.com
2 months agodrm/xe/xe_vm: Implement xe_vm_get_property_ioctl
Jonathan Cavitt [Tue, 24 Mar 2026 15:29:40 +0000 (15:29 +0000)] 
drm/xe/xe_vm: Implement xe_vm_get_property_ioctl

Add support for userspace to request a list of observed faults
from a specified VM.

v2:
- Only allow querying of failed pagefaults (Matt Brost)

v3:
- Remove unnecessary size parameter from helper function, as it
  is a property of the arguments. (jcavitt)
- Remove unnecessary copy_from_user (Jainxun)
- Set address_precision to 1 (Jainxun)
- Report max size instead of dynamic size for memory allocation
  purposes.  Total memory usage is reported separately.

v4:
- Return int from xe_vm_get_property_size (Shuicheng)
- Fix memory leak (Shuicheng)
- Remove unnecessary size variable (jcavitt)

v5:
- Rename ioctl to xe_vm_get_faults_ioctl (jcavitt)
- Update fill_property_pfs to eliminate need for kzalloc (Jianxun)

v6:
- Repair and move fill_faults break condition (Dan Carpenter)
- Free vm after use (jcavitt)
- Combine assertions (jcavitt)
- Expand size check in xe_vm_get_faults_ioctl (jcavitt)
- Remove return mask from fill_faults, as return is already -EFAULT or 0
  (jcavitt)

v7:
- Revert back to using xe_vm_get_property_ioctl
- Apply better copy_to_user logic (jcavitt)

v8:
- Fix and clean up error value handling in ioctl (jcavitt)
- Reapply return mask for fill_faults (jcavitt)

v9:
- Future-proof size logic for zero-size properties (jcavitt)
- Add access and fault types (Jianxun)
- Remove address type (Jianxun)

v10:
- Remove unnecessary switch case logic (Raag)
- Compress size get, size validation, and property fill functions into a
  single helper function (jcavitt)
- Assert valid size (jcavitt)

v11:
- Remove unnecessary else condition
- Correct backwards helper function size logic (jcavitt)

v12:
- Use size_t instead of int (Raag)

v13:
- Remove engine class and instance (Ivan)

v14:
- Map access type, fault type, and fault level to user macros (Matt
  Brost, Ivan)

v15:
- Remove unnecessary size assertion (jcavitt)

v16:
- Nit fixes (Matt Brost)

v17:
- Rebase and refactor (jcavitt)

v18:
- Do not copy_to_user in critical section (Matt Brost)
- Assert args->size is multiple of sizeof(struct xe_vm_fault) (Matt
  Brost)

v19:
- Remove unnecessary memset (Matt Brost)

v20:
- Report canonicalized address (Jose)
- Mask out prefetch data from access type (Jose, jcavitt)

v21:
- s/uAPI/Link in the commit log links
- Align debug parameters

Link: https://github.com/intel/compute-runtime/pull/878
Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Michal Mrozek <michal.mrozek@intel.com>
Cc: Jainxun Zhang <jianxun.zhang@intel.com>
Cc: Shuicheng Lin <shuicheng.lin@intel.com>
Cc: Raag Jadav <raag.jadav@intel.com>
Cc: Ivan Briano <ivan.briano@intel.com>
Cc: Jose Souza <jose.souza@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260324152935.72444-10-jonathan.cavitt@intel.com
2 months agodrm/xe/xe_vm: Add per VM fault info
Jonathan Cavitt [Tue, 24 Mar 2026 15:29:39 +0000 (15:29 +0000)] 
drm/xe/xe_vm: Add per VM fault info

Add additional information to each VM so they can report up to the first
50 seen faults.  Only pagefaults are saved this way currently, though in
the future, all faults should be tracked by the VM for future reporting.

Additionally, of the pagefaults reported, only failed pagefaults are
saved this way, as successful pagefaults should recover silently and not
need to be reported to userspace.

v2:
- Free vm after use (Shuicheng)
- Compress pf copy logic (Shuicheng)
- Update fault_unsuccessful before storing (Shuicheng)
- Fix old struct name in comments (Shuicheng)
- Keep first 50 pagefaults instead of last 50 (Jianxun)

v3:
- Avoid unnecessary execution by checking MAX_PFS earlier (jcavitt)
- Fix double-locking error (jcavitt)
- Assert kmemdump is successful (Shuicheng)

v4:
- Rename xe_vm.pfs to xe_vm.faults (jcavitt)
- Store fault data and not pagefault in xe_vm faults list (jcavitt)
- Store address, address type, and address precision per fault (jcavitt)
- Store engine class and instance data per fault (Jianxun)
- Add and fix kernel docs (Michal W)
- Properly handle kzalloc error (Michal W)
- s/MAX_PFS/MAX_FAULTS_SAVED_PER_VM (Michal W)
- Store fault level per fault (Micahl M)

v5:
- Store fault and access type instead of address type (Jianxun)

v6:
- Store pagefaults in non-fault-mode VMs as well (Jianxun)

v7:
- Fix kernel docs and comments (Michal W)

v8:
- Fix double-locking issue (Jianxun)

v9:
- Do not report faults from reserved engines (Jianxun)

v10:
- Remove engine class and instance (Ivan)

v11:
- Perform kzalloc outside of lock (Auld)

v12:
- Fix xe_vm_fault_entry kernel docs (Shuicheng)

v13:
- Rebase and refactor (jcavitt)

v14:
- Correctly ignore fault mode in save_pagefault_to_vm (jcavitt)

v15:
- s/save_pagefault_to_vm/xe_pagefault_save_to_vm (Matt Brost)
- Use guard instead of spin_lock/unlock (Matt Brost)
- GT was added to xe_pagefault struct.  Use xe_gt_hw_engine
  instead of creating a new helper function (Matt Brost)

v16:
- Set address precision programmatically (Matt Brost)

v17:
- Set address precision to fixed value (Matt Brost)

v18:
- s/uAPI/Link in commit log links
- Use kzalloc_obj

Link: https://github.com/intel/compute-runtime/pull/878
Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Cc: Shuicheng Lin <shuicheng.lin@intel.com>
Cc: Jianxun Zhang <jianxun.zhang@intel.com>
Cc: Michal Wajdeczko <Michal.Wajdeczko@intel.com>
Cc: Michal Mzorek <michal.mzorek@intel.com>
Cc: Ivan Briano <ivan.briano@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260324152935.72444-9-jonathan.cavitt@intel.com
2 months agodrm/xe/uapi: Define drm_xe_vm_get_property
Jonathan Cavitt [Tue, 24 Mar 2026 15:29:38 +0000 (15:29 +0000)] 
drm/xe/uapi: Define drm_xe_vm_get_property

Add initial declarations for the drm_xe_vm_get_property ioctl.

v2:
- Expand kernel docs for drm_xe_vm_get_property (Jianxun)

v3:
- Remove address type external definitions (Jianxun)
- Add fault type to xe_drm_fault struct (Jianxun)

v4:
- Remove engine class and instance (Ivan)

v5:
- Add declares for fault type, access type, and fault level (Matt Brost,
  Ivan)

v6:
- Fix inconsistent use of whitespace in defines

v7:
- Rebase and refactor (jcavitt)

v8:
- Rebase (jcavitt)

v9:
- Clarify address is canonical (José)

v10:
- s/uAPI/Link in the commit log links

Link: https://github.com/intel/compute-runtime/pull/878
Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
Acked-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Ivan Briano <ivan.briano@intel.com>
Acked-by: José Roberto de Souza <jose.souza@intel.com>
Cc: Zhang Jianxun <jianxun.zhang@intel.com>
Cc: Ivan Briano <ivan.briano@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260324152935.72444-8-jonathan.cavitt@intel.com
2 months agodrm/xe/xe_pagefault: Disallow writes to read-only VMAs
Jonathan Cavitt [Tue, 24 Mar 2026 15:29:37 +0000 (15:29 +0000)] 
drm/xe/xe_pagefault: Disallow writes to read-only VMAs

The page fault handler should reject write/atomic access to read only
VMAs.  Add code to handle this in xe_pagefault_service after the VMA
lookup.

v2:
- Apply max line length (Matthew)

Fixes: fb544b844508 ("drm/xe: Implement xe_pagefault_queue_work")
Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Cc: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260324152935.72444-7-jonathan.cavitt@intel.com
2 months agoBackMerge tag 'v7.0-rc4' into drm-next
Dave Airlie [Wed, 25 Mar 2026 23:41:26 +0000 (09:41 +1000)] 
BackMerge tag 'v7.0-rc4' into drm-next

Linux 7.0-rc4

Needed for rust tree.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2 months agodrm/xe: always keep track of remap prev/next
Matthew Auld [Wed, 18 Mar 2026 10:02:09 +0000 (10:02 +0000)] 
drm/xe: always keep track of remap prev/next

During 3D workload, user is reporting hitting:

[  413.361679] WARNING: drivers/gpu/drm/xe/xe_vm.c:1217 at vm_bind_ioctl_ops_unwind+0x1e2/0x2e0 [xe], CPU#7: vkd3d_queue/9925
[  413.361944] CPU: 7 UID: 1000 PID: 9925 Comm: vkd3d_queue Kdump: loaded Not tainted 7.0.0-070000rc3-generic #202603090038 PREEMPT(lazy)
[  413.361949] RIP: 0010:vm_bind_ioctl_ops_unwind+0x1e2/0x2e0 [xe]
[  413.362074] RSP: 0018:ffffd4c25c3df930 EFLAGS: 00010282
[  413.362077] RAX: 0000000000000000 RBX: ffff8f3ee817ed10 RCX: 0000000000000000
[  413.362078] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[  413.362079] RBP: ffffd4c25c3df980 R08: 0000000000000000 R09: 0000000000000000
[  413.362081] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8f41fbf99380
[  413.362082] R13: ffff8f3ee817e968 R14: 00000000ffffffef R15: ffff8f43d00bd380
[  413.362083] FS:  00000001040ff6c0(0000) GS:ffff8f4696d89000(0000) knlGS:00000000330b0000
[  413.362085] CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
[  413.362086] CR2: 00007ddfc4747000 CR3: 00000002e6262005 CR4: 0000000000f72ef0
[  413.362088] PKRU: 55555554
[  413.362089] Call Trace:
[  413.362092]  <TASK>
[  413.362096]  xe_vm_bind_ioctl+0xa9a/0xc60 [xe]

Which seems to hint that the vma we are re-inserting for the ops unwind
is either invalid or overlapping with something already inserted in the
vm. It shouldn't be invalid since this is a re-insertion, so must have
worked before. Leaving the likely culprit as something already placed
where we want to insert the vma.

Following from that, for the case where we do something like a rebind in
the middle of a vma, and one or both mapped ends are already compatible,
we skip doing the rebind of those vma and set next/prev to NULL. As well
as then adjust the original unmap va range, to avoid unmapping the ends.
However, if we trigger the unwind path, we end up with three va, with
the two ends never being removed and the original va range in the middle
still being the shrunken size.

If this occurs, one failure mode is when another unwind op needs to
interact with that range, which can happen with a vector of binds. For
example, if we need to re-insert something in place of the original va.
In this case the va is still the shrunken version, so when removing it
and then doing a re-insert it can overlap with the ends, which were
never removed, triggering a warning like above, plus leaving the vm in a
bad state.

With that, we need two things here:

 1) Stop nuking the prev/next tracking for the skip cases. Instead
    relying on checking for skip prev/next, where needed. That way on the
    unwind path, we now correctly remove both ends.

 2) Undo the unmap va shrinkage, on the unwind path. With the two ends
    now removed the unmap va should expand back to the original size again,
    before re-insertion.

v2:
  - Update the explanation in the commit message, based on an actual IGT of
    triggering this issue, rather than conjecture.
  - Also undo the unmap shrinkage, for the skip case. With the two ends
    now removed, the original unmap va range should expand back to the
    original range.
v3:
  - Track the old start/range separately. vma_size/start() uses the va
    info directly.

Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/7602
Fixes: 8f33b4f054fc ("drm/xe: Avoid doing rebinds")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260318100208.78097-2-matthew.auld@intel.com
2 months agodrm/amd/display: add a no_hpd link_encoder_funcs variant
Alex Deucher [Fri, 20 Mar 2026 16:56:20 +0000 (12:56 -0400)] 
drm/amd/display: add a no_hpd link_encoder_funcs variant

For link encoders without HPD (analog or LVDS), add a
link_encoder_funcs structure with no hpd enable callbacks.

The enable and disable hpd callbacks are currently not used
outside of a special case in debugfs which checks if the hpd
is valid before using it, but this will protect us if they
ever are.

Reviewed-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amdgpu/userq: schedule_delayed_work should be after fence signalled
Sunil Khatri [Tue, 24 Mar 2026 15:16:34 +0000 (20:46 +0530)] 
drm/amdgpu/userq: schedule_delayed_work should be after fence signalled

Reorganise the amdgpu_eviction_fence_suspend_worker code so
schedule_delayed_work is the last thing we do after amdgpu_userq_evict
is complete and the eviction fence is signalled.

Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amdgpu/mes12_1: emove extra ; from declaration statement
Colin Ian King [Mon, 23 Mar 2026 22:43:48 +0000 (22:43 +0000)] 
drm/amdgpu/mes12_1: emove extra ; from declaration statement

There is a declaration statement that has a ;; at the end, remove the
extraneous ;

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/display: Fix DCE LVDS handling
Alex Deucher [Thu, 26 Feb 2026 22:12:08 +0000 (17:12 -0500)] 
drm/amd/display: Fix DCE LVDS handling

LVDS does not use an HPD pin so it may be invalid.  Handle
this case correctly in link encoder creation.

Fixes: 7c8fb3b8e9ba ("drm/amd/display: Add hpd_source index check for DCE60/80/100/110/112/120 link encoders")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/5012
Cc: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Cc: Roman Li <roman.li@amd.com>
Reviewed-by: Roman Li <roman.li@amd.com>
Reviewed-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amdgpu: update outdated comment for renamed amdgpu_fence_driver_init()
Kexin Sun [Sat, 21 Mar 2026 10:57:28 +0000 (18:57 +0800)] 
drm/amdgpu: update outdated comment for renamed amdgpu_fence_driver_init()

The function amdgpu_fence_driver_init() was renamed to
amdgpu_fence_driver_sw_init() by commit 067f44c8b459
("drm/amdgpu: avoid over-handle of fence driver fini in s3
test (v2)").  Update the stale reference in the
amdgpu_fence_driver_init_ring() kdoc.

Assisted-by: unnamed:deepseek-v3.2 coccinelle
Signed-off-by: Kexin Sun <kexinsun@smail.nju.edu.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amdgpu/userq: convert comma to semicolon
Chen Ni [Fri, 20 Mar 2026 08:32:47 +0000 (16:32 +0800)] 
drm/amdgpu/userq: convert comma to semicolon

Using a ',' in place of a ';' can have unintended side effects.
Although that is not the case here, it seems best to use ';'
unless ',' is intended.

Found by inspection.
No functional change intended.
Compile tested only.

Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amdgpu: Handle GPU page faults correctly on non-4K page systems
Donet Tom [Mon, 23 Mar 2026 04:28:36 +0000 (09:58 +0530)] 
drm/amdgpu: Handle GPU page faults correctly on non-4K page systems

During a GPU page fault, the driver restores the SVM range and then maps it
into the GPU page tables. The current implementation passes a GPU-page-size
(4K-based) PFN to svm_range_restore_pages() to restore the range.

SVM ranges are tracked using system-page-size PFNs. On systems where the
system page size is larger than 4K, using GPU-page-size PFNs to restore the
range causes two problems:

Range lookup fails:
Because the restore function receives PFNs in GPU (4K) units, the SVM
range lookup does not find the existing range. This will result in a
duplicate SVM range being created.

VMA lookup failure:
The restore function also tries to locate the VMA for the faulting address.
It converts the GPU-page-size PFN into an address using the system page
size, which results in an incorrect address on non-4K page-size systems.
As a result, the VMA lookup fails with the message: "address 0xxxx VMA is
removed".

This patch passes the system-page-size PFN to svm_range_restore_pages() so
that the SVM range is restored correctly on non-4K page systems.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/display: update outdated comments for renamed vblank_control_worker()
Kexin Sun [Sat, 21 Mar 2026 10:57:17 +0000 (18:57 +0800)] 
drm/amd/display: update outdated comments for renamed vblank_control_worker()

The function vblank_control_worker() was renamed
to amdgpu_dm_crtc_vblank_control_worker() by commit
6ce4f9ee25ff ("drm/amd/display: Add prefix to amdgpu crtc
functions").  Update the two stale references in
amdgpu_dm.c.

Assisted-by: unnamed:deepseek-v3.2 coccinelle
Signed-off-by: Kexin Sun <kexinsun@smail.nju.edu.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/display: clean up typecasts and constants in dcn4_calcs
Adriano Vero [Tue, 17 Mar 2026 20:36:13 +0000 (21:36 +0100)] 
drm/amd/display: clean up typecasts and constants in dcn4_calcs

Signed-off-by: Adriano Vero <litaliano00.contact@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amdgpu/userq: dont use goto to jump when at end of function
Sunil Khatri [Tue, 24 Mar 2026 07:48:28 +0000 (13:18 +0530)] 
drm/amdgpu/userq: dont use goto to jump when at end of function

In function amdgpu_userq_restore_worker we dont need to use
goto as we already in the end of function and it will exit
naturally.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amdgpu: fix syncobj leak for amdgpu_gem_va_ioctl()
Prike Liang [Thu, 19 Mar 2026 06:47:11 +0000 (14:47 +0800)] 
drm/amdgpu: fix syncobj leak for amdgpu_gem_va_ioctl()

It requires freeing the syncobj and chain
alloction resource.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amdgpu/vcn4.0.3: gate per-queue reset by PSP SOS program version
Jesse Zhang [Fri, 20 Mar 2026 08:16:16 +0000 (16:16 +0800)] 
drm/amdgpu/vcn4.0.3: gate per-queue reset by PSP SOS program version

Add a PSP SOS firmware compatibility check before enabling VCN per-queue
reset on vcn_v4_0_3.

Per review, program check is sufficient: when PSP SOS program is 0x01,
require fw version >= 0x0036015f; otherwise allow per-queue reset.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Jesse Zhang <Jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/pm: Enable VCN reset for pgm=4 with appropriate FW version
Jesse.Zhang [Thu, 19 Mar 2026 07:54:38 +0000 (15:54 +0800)] 
drm/amd/pm: Enable VCN reset for pgm=4 with appropriate FW version

Extend the VCN reset capability to include pgm=4 variants when the
firmware version meets the required threshold (>= 0x04557100). This
follows the existing pattern for pgm=0 and pgm=7, ensuring that VCN
reset is enabled only on configurations where it is supported by the
firmware.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amdgpu: use DISCOVERY_TMR_SIZE in ACPI TMR fallback
Jesse.Zhang [Mon, 23 Mar 2026 05:31:54 +0000 (13:31 +0800)] 
drm/amdgpu: use DISCOVERY_TMR_SIZE in ACPI TMR fallback

amdgpu_acpi_get_tmr_info() returns the full TMR region size, not the IP
discovery table size. Using tmr_size as discovery.size can lead to oversized
allocations and probe failure.

In the ACPI fallback path, keep discovery.size as DISCOVERY_TMR_SIZE and only
use ACPI data for offset calculation.

Fixes: 01bdc7e219c4 ("drm/amdgpu: New interface to get IP discovery binary v3")
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/pm: add dedicated dram addr msg for smu v15
Yang Wang [Mon, 23 Mar 2026 01:48:39 +0000 (21:48 -0400)] 
drm/amd/pm: add dedicated dram addr msg for smu v15

Add dedicated SMU Dram MSG mapping to avoid conflicts
in SMU IP v15 common code for upcoming ASICs.

add new smu msg:
- SMU_MSG_SetDriverDramAddr
- SMU_MSG_SetToolsDramAddr

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/pm: disable OD_FAN_CURVE if temp or pwm range invalid for smu v14
Yang Wang [Thu, 19 Mar 2026 07:36:50 +0000 (03:36 -0400)] 
drm/amd/pm: disable OD_FAN_CURVE if temp or pwm range invalid for smu v14

Forcibly disable the OD_FAN_CURVE feature when temperature or PWM range is invalid,
otherwise PMFW will reject this configuration on smu v14.0.2/14.0.3.

example:
$ sudo cat /sys/bus/pci/devices/<BDF>/gpu_od/fan_ctrl/fan_curve

OD_FAN_CURVE:
0: 0C 0%
1: 0C 0%
2: 0C 0%
3: 0C 0%
4: 0C 0%
OD_RANGE:
FAN_CURVE(hotspot temp): 0C 0C
FAN_CURVE(fan speed): 0% 0%

$ echo "0 50 40" | sudo tee fan_curve

kernel log:
[  969.761627] amdgpu 0000:03:00.0: amdgpu: Fan curve temp setting(50) must be within [0, 0]!
[ 1010.897800] amdgpu 0000:03:00.0: amdgpu: Fan curve temp setting(50) must be within [0, 0]!

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amdkfd: Fix NULL pointer check order in kfd_ioctl_create_process
Srinivasan Shanmugam [Mon, 23 Mar 2026 08:58:57 +0000 (14:28 +0530)] 
drm/amdkfd: Fix NULL pointer check order in kfd_ioctl_create_process

In kfd_ioctl_create_process(), the pointer 'p' is used before checking
if it is NULL.

The code accesses p->context_id before validating 'p'. This can lead
to a possible NULL pointer dereference.

Move the NULL check before using 'p' so that the pointer is validated
before access.

Fixes the below:
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_chardev.c:3177 kfd_ioctl_create_process() warn: variable dereferenced before check 'p' (see line 3174)

Fixes: cc6b66d661fd ("amdkfd: introduce new ioctl AMDKFD_IOC_CREATE_PROCESS")
Cc: Zhu Lingshan <lingshan.zhu@amd.com>
Cc: Felix Kuehling <felix.kuehling@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/display: check if ext_caps is valid in BL setup
Alex Deucher [Fri, 20 Mar 2026 16:33:48 +0000 (12:33 -0400)] 
drm/amd/display: check if ext_caps is valid in BL setup

LVDS connectors don't have extended backlight caps so check
if the pointer is valid before accessing it.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/5012
Fixes: 1454642960b0 ("drm/amd: Re-introduce property to control adaptive backlight modulation")
Cc: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amdgpu: Fix fence put before wait in amdgpu_amdkfd_submit_ib
Srinivasan Shanmugam [Mon, 23 Mar 2026 08:11:18 +0000 (13:41 +0530)] 
drm/amdgpu: Fix fence put before wait in amdgpu_amdkfd_submit_ib

amdgpu_amdkfd_submit_ib() submits a GPU job and gets a fence
from amdgpu_ib_schedule(). This fence is used to wait for job
completion.

Currently, the code drops the fence reference using dma_fence_put()
before calling dma_fence_wait().

If dma_fence_put() releases the last reference, the fence may be
freed before dma_fence_wait() is called. This can lead to a
use-after-free.

Fix this by waiting on the fence first and releasing the reference
only after dma_fence_wait() completes.

Fixes the below:
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c:697 amdgpu_amdkfd_submit_ib() warn: passing freed memory 'f' (line 696)

Fixes: 9ae55f030dc5 ("drm/amdgpu: Follow up change to previous drm scheduler change.")
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/xe/xelp: Expose AuxCCS frame buffer modifiers on Alderlake-P
Tvrtko Ursulin [Tue, 24 Mar 2026 08:40:18 +0000 (08:40 +0000)] 
drm/xe/xelp: Expose AuxCCS frame buffer modifiers on Alderlake-P

Now that we have implemented all the related missing bits we can enable
the AuxCCS compressed modifiers which were disabled in
cf48bddd31de ("drm/i915/display: Disable AuxCCS framebuffers if built for Xe").

Tested with KDE Wayland, on Lenovo Carbon X1 ADL-P:

        [PLANE:32:plane 1A]: type=PRI
                uapi: [FB:242] AR30 little-endian (0x30335241),0x100000000000008,2880x1800, visible=visible, src=28
                hw: [FB:242] AR30 little-endian (0x30335241),0x100000000000008,2880x1800, visible=yes, src=2880.000

Display is working fine - no artefacts, no DMAR/PIPE faults.

v2:
 * Adjust patch title. (Rodrigo)

v3:
 * Complete rewrite based on the display parent interface.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
References: cf48bddd31de ("drm/i915/display: Disable AuxCCS framebuffers if built for Xe")
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/20260324084018.20353-13-tvrtko.ursulin@igalia.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2 months agodrm/xe/display: Add support for AuxCCS
Tvrtko Ursulin [Tue, 24 Mar 2026 08:40:17 +0000 (08:40 +0000)] 
drm/xe/display: Add support for AuxCCS

Add support for mapping the auxiliary CCS buffer into the DPT page tables.

This will allow for better power efficiency by enabling the render
compression frame buffer modifiers such as
I915_FORMAT_MOD_Y_TILED_GEN12_RC_CCS in a following patch.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Cc: Michael J. Ruhl <michael.j.ruhl@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patch.msgid.link/20260324084018.20353-12-tvrtko.ursulin@igalia.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2 months agodrm/xe/display: Respect remapped plane alignment
Tvrtko Ursulin [Tue, 24 Mar 2026 08:40:16 +0000 (08:40 +0000)] 
drm/xe/display: Respect remapped plane alignment

Instead of assuming PAGE_SIZE alignment between the remapped planes
respect the value set in the struct intel_remapped_info.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Cc: Michael J. Ruhl <michael.j.ruhl@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patch.msgid.link/20260324084018.20353-11-tvrtko.ursulin@igalia.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2 months agodrm/xe/display: Change write_dpt_remapped_tiled function signature
Tvrtko Ursulin [Tue, 24 Mar 2026 08:40:15 +0000 (08:40 +0000)] 
drm/xe/display: Change write_dpt_remapped_tiled function signature

In preparation for adding support for the auxccs plane lets change the
function signature of write_dpt_remapped_tiled(). This will enable a
tidier way of extending it subsequent patches.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Cc: Michael J. Ruhl <michael.j.ruhl@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Uma Shankar <uma.shankar@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patch.msgid.link/20260324084018.20353-10-tvrtko.ursulin@igalia.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2 months agodrm/xe/display: Move remapped plane loop out of __xe_pin_fb_vma_dpt
Tvrtko Ursulin [Tue, 24 Mar 2026 08:40:14 +0000 (08:40 +0000)] 
drm/xe/display: Move remapped plane loop out of __xe_pin_fb_vma_dpt

In preparation for adding support for the auxccs plane lets move the
plane iteration loop to its own function.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Cc: Michael J. Ruhl <michael.j.ruhl@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patch.msgid.link/20260324084018.20353-9-tvrtko.ursulin@igalia.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2 months agodrm/xe/xelp: Add AuxCCS invalidation to the indirect context workarounds
Tvrtko Ursulin [Tue, 24 Mar 2026 08:40:13 +0000 (08:40 +0000)] 
drm/xe/xelp: Add AuxCCS invalidation to the indirect context workarounds

Following from the i915 reference implementation, we add the AuxCCS
invalidation to the indirect context workarounds page.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/20260324084018.20353-8-tvrtko.ursulin@igalia.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2 months agodrm/xe: Move aux table invalidation to ring ops
Tvrtko Ursulin [Tue, 24 Mar 2026 08:40:12 +0000 (08:40 +0000)] 
drm/xe: Move aux table invalidation to ring ops

Implement the suggestion of moving the aux invalidation from a helper to a
ring ops vfunc, together with the suggestion to split the vfunc table of
video decode and video enhance engines.

With this done the LRC code will be able to access the functionality via
the newly added ring ops vfunc.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260324084018.20353-7-tvrtko.ursulin@igalia.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2 months agodrm/xe/xelp: Wait for AuxCCS invalidation to complete
Tvrtko Ursulin [Tue, 24 Mar 2026 08:40:11 +0000 (08:40 +0000)] 
drm/xe/xelp: Wait for AuxCCS invalidation to complete

On AuxCCS platforms we need to wait for AuxCCS invalidations to complete.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/20260324084018.20353-6-tvrtko.ursulin@igalia.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2 months agodrm/xe/xelp: Quiesce memory traffic before invalidating AuxCCS
Tvrtko Ursulin [Tue, 24 Mar 2026 08:40:10 +0000 (08:40 +0000)] 
drm/xe/xelp: Quiesce memory traffic before invalidating AuxCCS

According to i915 commit
ad8ebf12217e ("drm/i915/gt: Ensure memory quiesced before invalidation")
quiescing of the memory traffic is required before invalidating the AuxCCS
tables.

Add an extra pipe control flush to achieve that.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/20260324084018.20353-5-tvrtko.ursulin@igalia.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2 months agodrm/xe/xelpg: Limit AuxCCS ring buffer programming to Alderlake
Tvrtko Ursulin [Tue, 24 Mar 2026 08:40:09 +0000 (08:40 +0000)] 
drm/xe/xelpg: Limit AuxCCS ring buffer programming to Alderlake

At the moment the driver does not support AuxCCS at all due respective
modifiers being hidden from userspace.

As we are about to start enabling them, starting with Alderlake, let us
begin by limiting the ring buffer support to just that initial platform.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/20260324084018.20353-4-tvrtko.ursulin@igalia.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2 months agodrm/xe: Use write-combine mapping when populating DPT
Tvrtko Ursulin [Tue, 24 Mar 2026 08:40:08 +0000 (08:40 +0000)] 
drm/xe: Use write-combine mapping when populating DPT

The fallback case for DPT backing store is a buffer object in system
memory buffer, which by default use a write-back CPU caching policy.

If this fallback gets triggered, and since there is currently no flushing,
the DPT writes made when pinning a buffer to display are not guaranteed to
be seen by the display engine.

To fix this, since both the local memory and the stolen memory DPT
placements already use write-combine, let us make the system memory option
follow suit by passing down the appropriate flag.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/20260324084018.20353-3-tvrtko.ursulin@igalia.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2 months agodrm/xe: Rename XE_BO_FLAG_SCANOUT to XE_BO_FLAG_FORCE_WC
Tvrtko Ursulin [Tue, 24 Mar 2026 08:40:07 +0000 (08:40 +0000)] 
drm/xe: Rename XE_BO_FLAG_SCANOUT to XE_BO_FLAG_FORCE_WC

Rename XE_BO_FLAG_SCANOUT to XE_BO_FLAG_FORCE_WC so that the usage of the
flag can legitimately be expanded to more than just the actual frame-
buffer objects.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Suggested-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/20260324084018.20353-2-tvrtko.ursulin@igalia.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2 months agovfio/xe: Notify PF about VF FLR in reset_prepare
Piotr Piórkowski [Mon, 9 Mar 2026 15:24:49 +0000 (16:24 +0100)] 
vfio/xe: Notify PF about VF FLR in reset_prepare

Hook into the PCI error handler reset_prepare() callback to notify
the PF about an upcoming VF FLR before reset_done() is executed.
This enables early FLR_PREPARE signaling and ensures that the PF is
aware of the reset before the completion wait begins.

Signed-off-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Alex Williamson <alex@shazbot.org>
Link: https://patch.msgid.link/20260309152449.910636-3-piotr.piorkowski@intel.com
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2 months agodrm/xe/pf: Add FLR_PREPARE state to VF control flow
Piotr Piórkowski [Mon, 9 Mar 2026 15:24:48 +0000 (16:24 +0100)] 
drm/xe/pf: Add FLR_PREPARE state to VF control flow

Our xe-vfio-pci component relies on the confirmation from the PF
that VF FLR processing has finished, but due to the notification
latency on the HW/FW side, PF might be unaware yet of the already
triggered VF FLR.

Update VF state machine with new FLR_PREPARE state that indicate
imminent VF FLR notification and treat that as a begin of the FLR
sequence. Also introduce function that xe-vfio-pci should call to
guarantee correct synchronization.

v2: move PREPARE into WIP, update commit msg (Michal)

Signed-off-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Co-developed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patch.msgid.link/20260309152449.910636-2-piotr.piorkowski@intel.com
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2 months agodrm/xe: Implement recent spec updates to Wa_16025250150
Matt Roper [Thu, 19 Mar 2026 22:30:34 +0000 (15:30 -0700)] 
drm/xe: Implement recent spec updates to Wa_16025250150

The hardware teams noticed that the originally documented workaround
steps for Wa_16025250150 may not be sufficient to fully avoid a hardware
issue.  The workaround documentation has been augmented to suggest
programming one additional register; make the corresponding change in
the driver.

Fixes: 7654d51f1fd8 ("drm/xe/xe2hpg: Add Wa_16025250150")
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Link: https://patch.msgid.link/20260319-wa_16025250150_part2-v1-1-46b1de1a31b2@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2 months agodrm/amd/pm: disable OD_FAN_CURVE if temp or pwm range invalid for smu v13
Yang Wang [Fri, 20 Mar 2026 01:17:38 +0000 (21:17 -0400)] 
drm/amd/pm: disable OD_FAN_CURVE if temp or pwm range invalid for smu v13

Forcibly disable the OD_FAN_CURVE feature when temperature or PWM range is invalid,
otherwise PMFW will reject this configuration on smu v13.0.x

example:
$ sudo cat /sys/bus/pci/devices/<BDF>/gpu_od/fan_ctrl/fan_curve

OD_FAN_CURVE:
0: 0C 0%
1: 0C 0%
2: 0C 0%
3: 0C 0%
4: 0C 0%
OD_RANGE:
FAN_CURVE(hotspot temp): 0C 0C
FAN_CURVE(fan speed): 0% 0%

$ echo "0 50 40" | sudo tee fan_curve

kernel log:
[  756.442527] amdgpu 0000:03:00.0: amdgpu: Fan curve temp setting(50) must be within [0, 0]!
[  777.345800] amdgpu 0000:03:00.0: amdgpu: Fan curve temp setting(50) must be within [0, 0]!

Closes: https://github.com/ROCm/amdgpu/issues/208
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amdgpu: Use stack variable to fetch nps info
Lijo Lazar [Mon, 23 Feb 2026 10:58:22 +0000 (16:28 +0530)] 
drm/amdgpu: Use stack variable to fetch nps info

Instead of a dynamic allocation, use stack variable and let the caller
pass the maximum ranges that can be held in the buffer.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/display: Rename enum 'pixel_format' to 'dc_pixel_format'
Hou Wenlong [Mon, 16 Mar 2026 03:46:29 +0000 (11:46 +0800)] 
drm/amd/display: Rename enum 'pixel_format' to 'dc_pixel_format'

Rename the enum 'pixel_format' to 'dc_pixel_format' to avoid potential
name conflicts with the pixel_format struct defined in
include/video/pixel_format.h.

Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/pm: Enable user specified gfx clock ranges
Asad Kamal [Wed, 7 Jan 2026 03:50:21 +0000 (11:50 +0800)] 
drm/amd/pm: Enable user specified gfx clock ranges

Enable user specified gfx clock ranges for smu_15_0_8

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amdgpu: Add smu v15_0_8 ip block
Hawking Zhang [Sun, 2 Nov 2025 11:55:53 +0000 (19:55 +0800)] 
drm/amdgpu: Add smu v15_0_8 ip block

Add smu v15_0_8 ip block

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/pm: Add NPM support for smu_v15_0_8
Asad Kamal [Fri, 30 Jan 2026 13:20:58 +0000 (21:20 +0800)] 
drm/amd/pm: Add NPM support for smu_v15_0_8

Add node power management support for smu_v15_0_8

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/pm: Add baseboard temperature metrics support
Asad Kamal [Fri, 30 Jan 2026 13:18:29 +0000 (21:18 +0800)] 
drm/amd/pm: Add baseboard temperature metrics support

Add baseboard temperature metrics support via system metrics table for
smu_v15_0_8

v4: Add separate function to fill baseboard temperature, use 16, remove
casting

v5: Optimize to use single switch case (Lijo)

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/pm: Add gpuboard temperature metrics support
Asad Kamal [Fri, 30 Jan 2026 13:10:47 +0000 (21:10 +0800)] 
drm/amd/pm: Add gpuboard temperature metrics support

Add gpuboard temperature metrics support via system metrics table for
smu_v15_0_8

v3: Use per sensor attr id (Lijo)

v4: Use s16 for temp, remove cast, use separate function to fill
gpuboard temperature metrics data (Lijo)

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/pm: Add read sensor support
Asad Kamal [Fri, 26 Dec 2025 05:12:56 +0000 (13:12 +0800)] 
drm/amd/pm: Add read sensor support

Add read sensor support for smu_v15_0_8

v2: Remove gfx voltage support (Lijo)

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/pm: Add ppt1 support
Asad Kamal [Fri, 26 Dec 2025 05:00:48 +0000 (13:00 +0800)] 
drm/amd/pm: Add ppt1 support

Add ppt1 support for smu_v15_0_8

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/pm: Add get_thermal_temperature_range support
Asad Kamal [Thu, 25 Dec 2025 16:30:41 +0000 (00:30 +0800)] 
drm/amd/pm: Add get_thermal_temperature_range support

Add get_thermal_temperature_range support smu_v15_0_8

v2: Remove sriov check (Lijo)

v3: Restrict to 1VF mode(Lijo)

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/pm: Add od_edit_dpm_table support
Asad Kamal [Thu, 25 Dec 2025 16:13:13 +0000 (00:13 +0800)] 
drm/amd/pm: Add od_edit_dpm_table support

Add od_edit_dpm_table support for smu_v15_0_8

v2: Skip Gl2clk/Fclk (Lijo)

v3: sqaush in set_performance_support (Asad)

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/pm: add populate_umd_state_clk support
Asad Kamal [Thu, 25 Dec 2025 15:44:20 +0000 (23:44 +0800)] 
drm/amd/pm: add populate_umd_state_clk support

add populate_umd_state_clk support for smu 15.0.8

v2: remove gl2clk/socclk/fclk, restrict to only current min/max (Lijo)

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/display: Fix NULL pointer assumptions in dcn42_init_hw()
Srinivasan Shanmugam [Tue, 17 Mar 2026 03:08:38 +0000 (08:38 +0530)] 
drm/amd/display: Fix NULL pointer assumptions in dcn42_init_hw()

dcn42_init_hw() calls update_bw_bounding_box() when FAMS2 is disabled or
when the dchub reference clock changes. However the existing condition
mixes the callback pointer check with only one side of the || expression:

  ((!fams2_enable && update_bw_bounding_box) || freq_changed)

This allows the block to be entered through the freq_changed path even
when update_bw_bounding_box() is NULL. The function is then called
unconditionally inside the block, which can lead to a NULL pointer
dereference.

Additionally, the code dereferences dc->clk_mgr->bw_params without
verifying that dc->clk_mgr and bw_params are valid.

Restructure the condition so that the update trigger remains the same
(FAMS2 disabled or dchub ref clock changed), but guard the call with
explicit checks for:

  - update_bw_bounding_box callback
  - dc->clk_mgr
  - dc->clk_mgr->bw_params

Also introduce a helper boolean (dchub_ref_freq_changed) to improve
readability of the clock-change condition.

This fixes Smatch warnings about inconsistent NULL assumptions in
dcn42_init_hw().

drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn42/dcn42_hwseq.c:264 dcn42_init_hw() error: we previously assumed 'dc->clk_mgr' could be null (see line 253)
drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn42/dcn42_hwseq.c:278 dcn42_init_hw() error: we previously assumed 'dc->res_pool->funcs->update_bw_bounding_box' could be null (see line 274)

Cc: Roman Li <roman.li@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Jerry Zuo <jerry.zuo@amd.com>
Cc: Sun peng Li <sunpeng.li@amd.com>
Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/display: Add clk_mgr NULL checks in dcn32_initialize_min_clocks()
Srinivasan Shanmugam [Sun, 15 Mar 2026 14:41:54 +0000 (20:11 +0530)] 
drm/amd/display: Add clk_mgr NULL checks in dcn32_initialize_min_clocks()

dcn32_init_hw() checks dc->clk_mgr before calling init_clocks(), so the
clock manager is not treated as unconditionally present on this path.
However, dcn32_initialize_min_clocks() later dereferences dc->clk_mgr,
bw_params, and clk_mgr callbacks without validating them.

Add the required guards in dcn32_initialize_min_clocks() before
accessing clk_mgr-dependent state, and check callback presence before
calling get_dispclk_from_dentist() and update_clocks().

Also guard the later update_bw_bounding_box() call in the FAMS2-disabled
path since it also dereferences dc->clk_mgr->bw_params.

This keeps clk_mgr handling consistent in the DCN32 HW init flow and
avoids possible NULL pointer dereferences reported by Smatch.

Fixes the below:
drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn32/dcn32_hwseq.c:1012 dcn32_init_hw() error: we previously assumed 'dc->clk_mgr' could be null (see line 978)

Cc: Roman Li <roman.li@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Jerry Zuo <jerry.zuo@amd.com>
Cc: Sun peng Li <sunpeng.li@amd.com>
Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/pm: Add emit clock support
Asad Kamal [Tue, 23 Dec 2025 19:54:11 +0000 (03:54 +0800)] 
drm/amd/pm: Add emit clock support

Add emit clock support and fetching other metrics data like temperature,
clock for smu_v15_0_8

v2: Use umc count for hbm stack temperature (Lijo)

v3: Use correct logic for hbm stacks (Lijo)

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/pm: add set{get}_power_limit support for smu 15.0.8
Yang Wang [Wed, 22 Oct 2025 13:32:55 +0000 (21:32 +0800)] 
drm/amd/pm: add set{get}_power_limit support for smu 15.0.8

export .set_power_limit & .get_power_limit interface for smu 15.0.8

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/pm: add get_unique_id support for smu 15.0.8
Yang Wang [Wed, 22 Oct 2025 13:08:13 +0000 (21:08 +0800)] 
drm/amd/pm: add get_unique_id support for smu 15.0.8

export .get_unique_id interface for smu 15.0.8

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/pm: add get_gpu_metrics support for 15.0.8
Yang Wang [Thu, 23 Oct 2025 03:04:40 +0000 (11:04 +0800)] 
drm/amd/pm: add get_gpu_metrics support for 15.0.8

export .get_gpu_metrics interface for 15.0.8

v2: Remove members already exposed by other interfaces, use mask,
logical conversion (Lijo)

v3: Use correct logic for hbm stacks loop (Lijo)
Remove buffer allocation

v4: Make out of bound check outside loop (Lijo)

v5: fix locking in error case (Alex)

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/pm: Add get_pm_metrics support for smu 15.0.8
Asad Kamal [Wed, 26 Nov 2025 10:41:34 +0000 (18:41 +0800)] 
drm/amd/pm: Add get_pm_metrics support for smu 15.0.8

export .get_pm_metrics interface for smu 15.0.8.

v2: Make tmo as unsigned (Lijo)

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/pm: Add default dpm table support for smu 15.0.8
Asad Kamal [Tue, 23 Dec 2025 16:33:49 +0000 (00:33 +0800)] 
drm/amd/pm: Add default dpm table support for smu 15.0.8

Add default dpm table support for smu 15.0.8

v2: Remove lclk, move pptable check up, add missing clk (Lijo)

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/pm: Update dpm table structs for smu_v15_0
Asad Kamal [Tue, 23 Dec 2025 15:35:32 +0000 (23:35 +0800)] 
drm/amd/pm: Update dpm table structs for smu_v15_0

Update dpm table structs to use common definitions for smu_15_0

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/pm: Setup driver pptable for smu 15.0.8
Yang Wang [Wed, 26 Nov 2025 08:02:41 +0000 (16:02 +0800)] 
drm/amd/pm: Setup driver pptable for smu 15.0.8

Setup driver pptable and initialize data from static metrics table for
smu_v15_0_8

v2: Remove unrelated changes and update description (Lijo)

v3: Use ARRAY_SIZE (Lijo)

v4: Move structure to header file

v5: squash in static metrics support (Asad)

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/pm: Add mode2 support for smu_v15_0_8
Asad Kamal [Mon, 24 Nov 2025 17:19:13 +0000 (01:19 +0800)] 
drm/amd/pm: Add mode2 support for smu_v15_0_8

Add initial mode2 support for smu_v15_0_8

v2: Move out non smu code, remove pci save/restore logic (Lijo)
v3: squash in updated msg (Alex)

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/pm: Add initial support for smu v15_0_8
Hawking Zhang [Mon, 3 Nov 2025 05:39:38 +0000 (13:39 +0800)] 
drm/amd/pm: Add initial support for smu v15_0_8

smu v15_0_8 is the new generation of smu ip block

v2: Squash in rebase changes (Alex)
v3: Squash in fw version check changes (Alex)

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amdgpu/userq: cleanup amdgpu_userq_get/put where not needed
Sunil Khatri [Fri, 20 Mar 2026 11:59:01 +0000 (17:29 +0530)] 
drm/amdgpu/userq: cleanup amdgpu_userq_get/put where not needed

amdgpu_userq_put/get are not needed in case we already holding
the userq_mutex and reference is valid already from queue create
time or from signal ioctl. These additional get/put could be a
potential reason for deadlock in case the ref count reaches zero
and destroy is called which again try to take the userq_mutex.

Due to the above change we avoid deadlock between suspend/restore
calling destroy queues trying to take userq_mutex again.

Cc: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/pm: Add custom fclk setting support
Asad Kamal [Fri, 13 Mar 2026 09:11:03 +0000 (17:11 +0800)] 
drm/amd/pm: Add custom fclk setting support

Add custom fclk setting support for smu_v13_x_x

v2: Move uclk fix to separate patch, return EOPNOTSUPP in case of dpm
disabled (Lijo)

v3: remove dpm check for filling fclk pstate table (Lijo)

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/pm: Add OD_FCLK interface
Asad Kamal [Fri, 13 Mar 2026 05:39:21 +0000 (13:39 +0800)] 
drm/amd/pm: Add OD_FCLK interface

Add OD_FCLK interface to set customa fclk max

v2: Merge patch1 & 3, check EOPNOTSUPP for all clks (Lijo)

v3: Remove redundant check (Lijo)

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/pm: Return -EOPNOTSUPP for unsupported OD_MCLK on smu_v13_0_6
Asad Kamal [Wed, 18 Mar 2026 05:52:57 +0000 (13:52 +0800)] 
drm/amd/pm: Return -EOPNOTSUPP for unsupported OD_MCLK on smu_v13_0_6

When SET_UCLK_MAX capability is absent, return -EOPNOTSUPP from
smu_v13_0_6_emit_clk_levels() for OD_MCLK instead of 0. This makes
unsupported OD_MCLK reporting consistent with other clock types
and allows callers to skip the entry cleanly.

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/pm: Skip redundant UCLK restore in smu_v13_0_6
Asad Kamal [Wed, 18 Mar 2026 05:48:30 +0000 (13:48 +0800)] 
drm/amd/pm: Skip redundant UCLK restore in smu_v13_0_6

Only reapply UCLK soft limits during PP_OD_RESTORE_DEFAULT when the
current max differs from the DPM table max. This avoids redundant
SMC updates and prevents -EINVAL on restore when no change is needed.

Fixes: b7a900344546 ("drm/amd/pm: Allow setting max UCLK on SMU v13.0.6")
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amdgpu: Add amdgpu_regs_pcie64 debugfs node
Stanley.Yang [Thu, 26 Feb 2026 03:41:48 +0000 (11:41 +0800)] 
drm/amdgpu: Add amdgpu_regs_pcie64 debugfs node

Add amdgpu_regs_pcie64 debugfs node to
read/write 64bit PCIE registers.

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/display: Promote DC to 3.2.375
Taimur Hassan [Fri, 13 Mar 2026 22:42:59 +0000 (17:42 -0500)] 
drm/amd/display: Promote DC to 3.2.375

This version brings along following fixes:

- Rework YCbCr422 DSC policy
- Restore full update for tiling change to linear
- add dccg FGCG mask init
- Remove unnecessary completion flag for secure display
- Agument live + capture with CVT case.
- remove dc_clock_limit for apu
- Fix Signed/Unsigned Int Usage Compiler Warning
- Hardcode dtbclk value in bw_params
- Revert inbox0 lock for cursor due to deadlock
- Add 3DLUT DMA broadcast support
- Fix Silence warnings
- export get_power_profile interface for later use
- pg cntl update based on previous asic.
- remove disable_sutter touch pstate debug code
- Refactor DC update checks
- Fix drm_edid leak in amdgpu_dm
- Add Extra SMU Log for dtbclk
- Clamp min DS DCFCLK value to DCN limit
- Update dpia supported configuration
- Multiple DCN42 updates

Acked-by: ChiaHsuan Chung <ChiaHsuan.Chung@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/display: Rework YCbCr422 DSC policy
Relja Vojvodic [Wed, 11 Mar 2026 21:02:24 +0000 (17:02 -0400)] 
drm/amd/display: Rework YCbCr422 DSC policy

- Reworked YCbCr4:2:2 Native/Simple policy decision making with DSC
enabled based on DSC caps and stream signal type

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Relja Vojvodic <Relja.Vojvodic@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/display: Restore full update for tiling change to linear
Joshua Aberback [Thu, 12 Mar 2026 22:33:49 +0000 (18:33 -0400)] 
drm/amd/display: Restore full update for tiling change to linear

[Why]
There was previously a dc debug flag to indicate that tiling
changes should only be a medium update instead of full. The
function get_plane_info_type was refactored to not rely on dc
state, but in the process the logic was unintentionally changed,
which leads to screen corruption in some cases.

[How]
 - add flag to tiling struct to avoid full update when necessary

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Joshua Aberback <joshua.aberback@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/display: add dccg FGCG mask init
Charlene Liu [Thu, 12 Mar 2026 23:33:33 +0000 (19:33 -0400)] 
drm/amd/display: add dccg FGCG mask init

[why]
missing DCCG_GLOBAL_FGCG_REP_DIS mask macro init

Reviewed-by: Swapnil Patel <swapnil.patel@amd.com>
Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/display: Remove unnecessary completion flag for secure display
Wayne Lin [Fri, 13 Mar 2026 04:05:40 +0000 (12:05 +0800)] 
drm/amd/display: Remove unnecessary completion flag for secure display

The completion flag is not used in secure display today.
Remove unnecessary code.

Reviewed-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/display: Agument live + capture with CVT case.
ChunTao Tso [Wed, 4 Mar 2026 08:37:58 +0000 (16:37 +0800)] 
drm/amd/display: Agument live + capture with CVT case.

1. Add LIVE_CAPTURE_WITH_CVT bit (bit[2]) in union replay_optimization
   to control this feature via DalRegKey_ReplayOptimization.
2. Check the bit in mod_power_set_live_capture_with_cvt_activate function
   before enabling live capture with CVT.
3. Use LIVE_CAPTURE_WITH_CVT to control if Replay want to send CVT in
   live + capture or not.

Reviewed-by: Leon Huang <leon.huang1@amd.com>
Signed-off-by: ChunTao Tso <chuntao.tso@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/display: remove dc_clock_limit for apu
Charlene Liu [Wed, 11 Mar 2026 21:05:19 +0000 (17:05 -0400)] 
drm/amd/display: remove dc_clock_limit for apu

[why]
current apu pmfw does not support dc_clock_limit

Reviewed-by: Roman Li <roman.li@amd.com>
Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/display: Fix Signed/Unsigned Int Usage Compiler Warning
Gaghik Khachatrian [Thu, 12 Mar 2026 14:21:43 +0000 (10:21 -0400)] 
drm/amd/display: Fix Signed/Unsigned Int Usage Compiler Warning

[Why] Compiler generates compiler warnings when signed enum
constants or literal -1 are implicitly converted to unsigned
integer types, cluttering build output and masking genuine issues.

[How] Use UINT_MAX as the invalid sentinel for unsigned IDs and align
loop/index types to unsigned where appropriate to remove implicit
signed-to-unsigned conversions, with no functional behavior change.

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Gaghik Khachatrian <gaghik.khachatrian@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/display: Hardcode dtbclk value in bw_params
Matthew Stewart [Wed, 11 Mar 2026 19:16:00 +0000 (15:16 -0400)] 
drm/amd/display: Hardcode dtbclk value in bw_params

[why&how]

dtbclk should always be 600MHz. Previous logic was to get the real value
from SMU, but this returns 0 when dtbclk is off. Not a problem during
boot when pre-OS enables dtbclk, but PnP was broken due to this.

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Matthew Stewart <Matthew.Stewart2@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/display: Move DPM clk read to clk_mgr_construct in DCN42
Ivan Lipski [Wed, 4 Mar 2026 01:07:58 +0000 (20:07 -0500)] 
drm/amd/display: Move DPM clk read to clk_mgr_construct in DCN42

[Why&How]
The DPM clocks on DCN42 are currently read on every dm_resume, which can
cause in gpu memory freeing while the device is still in suspend.

Move the DPM clock read functionality to clk_mgr_construct() so it
completes once on driver enablement.

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Signed-off-by: Dmytro Laktyushkin <dmytro.laktyushkin@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/display: Revert inbox0 lock for cursor due to deadlock
Nicholas Kazlauskas [Tue, 10 Mar 2026 20:33:44 +0000 (16:33 -0400)] 
drm/amd/display: Revert inbox0 lock for cursor due to deadlock

[Why]
A deadlock occurs when using inbox0 lock for cursor operations on
PSR-SU and Replays that does not when using the inbox1 locking path.

This is because of a priority inversion issue where inbox1 work
cannot be serviced while holding the HW lock from driver and sending
cursor notifications to DMUB.

Typically the lower priority of inbox1 for the lock command would
allow the PSR and Replay FSMs to complete their transition prior
to giving driver the lock but this is no longer the case with inbox0
having the highest priority in servicing.

[How]
This will reintroduce any synchronization bugs that were there
with Replay or PSR-SU touching the cursor at the same time as driver.

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/display: Add 3DLUT DMA broadcast support
Dillon Varone [Sat, 7 Mar 2026 05:53:03 +0000 (05:53 +0000)] 
drm/amd/display: Add 3DLUT DMA broadcast support

[WHY&HOW]
A single HUBP can be used to fetch 3DLUT and broadcast to a
single HUBP.  Add logic to select the top pipe for a given
plane and use it's HUBP as the broadcast source for multiple
MPC's.

Reviewed-by: Ilya Bakoulin <ilya.bakoulin@amd.com>
Signed-off-by: Dillon Varone <Dillon.Varone@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/display: Fix Silence warnings
Gaghik Khachatrian [Sat, 7 Mar 2026 23:57:57 +0000 (18:57 -0500)] 
drm/amd/display: Fix Silence warnings

Also affects: freesync, hdcp, info_packet, power

[Why] Resolve compiler warnings by marking unused parameters explicitly.

[How] In .c/.h keep parameter names in signatures and add a line with
      `(void)param;`  inside the function body

Preserved function signatures and avoids breaking code paths that
may reference the parameter under conditional compilation.

Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Gaghik Khachatrian <gaghik.khachatrian@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/display: export get_power_profile interface for later use
Charlene Liu [Fri, 6 Mar 2026 15:40:07 +0000 (10:40 -0500)] 
drm/amd/display: export get_power_profile interface for later use

[why]
export dcn401 get_power_profile for later asic.

Reviewed-by: Roman Li <roman.li@amd.com>
Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/display: pg cntl update based on previous asic.
Charlene Liu [Tue, 10 Mar 2026 14:53:08 +0000 (10:53 -0400)] 
drm/amd/display: pg cntl update based on previous asic.

[why]
switch to well tested sequence.

Reviewed-by: Leo Chen <leo.chen@amd.com>
Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/display: remove disable_sutter touch pstate debug code
Charlene Liu [Tue, 10 Mar 2026 17:33:55 +0000 (13:33 -0400)] 
drm/amd/display: remove disable_sutter touch pstate debug code

[why]
diags is using disable_stutter, this will cause issue when pstate switch
enabled

Reviewed-by: Ilya Bakoulin <ilya.bakoulin@amd.com>
Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/display: Refactor DC update checks
Dillon Varone [Thu, 5 Mar 2026 21:42:29 +0000 (16:42 -0500)] 
drm/amd/display: Refactor DC update checks

[WHY&HOW]
DC currently has fragmented definitions of update types.  This changes
consolidates them into a single interface, and adds expanded
functionality to accommodate all use cases.
- adds `dc_check_update_state_and_surfaces_for_stream` to determine
  update type including state, surface, and stream changes.
- adds missing surface/stream update checks to
  `dc_check_update_surfaces_for_stream`
- adds new update type `UPDATE_TYPE_ADDR_ONLY` to accomodate flows where
further distinction from `UPDATE_TYPE_FAST` was needed
- removes caller reliance on `enable_legacy_fast_update` to determine
  which commit function to use, instead embedding it in the update type

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Dillon Varone <Dillon.Varone@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/display: Fix drm_edid leak in amdgpu_dm
Alex Hung [Mon, 9 Mar 2026 17:16:08 +0000 (11:16 -0600)] 
drm/amd/display: Fix drm_edid leak in amdgpu_dm

[WHAT]
When a sink is connected, aconnector->drm_edid was overwritten without
freeing the previous allocation, causing a memory leak on resume.

[HOW]
Free the previous drm_edid before updating it.

Reviewed-by: Roman Li <roman.li@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/display: Fix DCN42 memory clock table using MemClk instead of UClk
Alexander Chechik [Mon, 9 Mar 2026 17:15:24 +0000 (13:15 -0400)] 
drm/amd/display: Fix DCN42 memory clock table using MemClk instead of UClk

[Why]
DCN42 was using UClk values instead of MemClk from MemPstateTable, causing
DML to see half the actual DRAM bandwidth on DDR5 systems and reject high
refresh rate modes.

[How]
Change dcn42_init_clocks() to use MemPstateTable[i].MemClk instead of
MemPstateTable[i].UClk for memclk_mhz initialization.

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Signed-off-by: Alexander Chechik <alexander.chechik@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/display: Add Extra SMU Log for dtbclk
Charlene Liu [Fri, 6 Mar 2026 01:23:25 +0000 (20:23 -0500)] 
drm/amd/display: Add Extra SMU Log for dtbclk

[why]
need to check dtbclk in log for confirmation

Reviewed-by: Gabe Teeger <gabe.teeger@amd.com>
Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/display: Update underflow detection for DCN42
Roman Li [Tue, 17 Mar 2026 00:17:57 +0000 (20:17 -0400)] 
drm/amd/display: Update underflow detection for DCN42

[Why]
The DCN42 underflow detection functions in dcn42_optc.c use
OPTC_RSMU_UNDERFLOW register but the register offset definitions
were missing from dcn_4_2_0_offset.h and dcn42_resource.h.

[How]
Add missing register definitions.

Fixes: e56e3cff2a1b ("drm/amd/display: Sync dcn42 with DC 3.2.373")
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Roman Li <Roman.Li@amd.com>
Signed-off-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amdgpu: fix some more bug in amdgpu_gem_va_ioctl
Christian König [Tue, 3 Feb 2026 16:30:57 +0000 (17:30 +0100)] 
drm/amdgpu: fix some more bug in amdgpu_gem_va_ioctl

Some illegal combination of input flags were not checked and we need to
take the PDEs into account when returning the fence as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/display: Clamp min DS DCFCLK value to DCN limit
Roman Li [Sat, 14 Mar 2026 00:34:48 +0000 (20:34 -0400)] 
drm/amd/display: Clamp min DS DCFCLK value to DCN limit

[why & how]
DCN has a global limit for minimum DS DCFCLK during any operation.

Adhere to that limit and add a debug flag.

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Signed-off-by: Ovidiu Bunea <ovidiu.bunea@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Signed-off-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/display: Split arbiter programming for DCN42
Nicholas Kazlauskas [Mon, 2 Mar 2026 15:47:34 +0000 (10:47 -0500)] 
drm/amd/display: Split arbiter programming for DCN42

[Why]
We don't want to update the timeout threshold for stall recovery in
firmware dynamically for DCN42 as we're not using FAMS.

Firmware should own programming of this register since the recovery
can be broken if driver updates the value to 0.

[How]
Split program_arbiter for dcn42 and skip the part that updates the
timeout threshold.

Reviewed-by: Leo Chen <leo.chen@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Signed-off-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agodrm/amd/display: Add missing dcn42 hubbub function pointers
Roman Li [Sat, 14 Mar 2026 00:27:54 +0000 (20:27 -0400)] 
drm/amd/display: Add missing dcn42 hubbub function pointers

This aligning commit combines:
- fix dcn42 det programming)
- fix missing dcn42 pointers
- fix SDPIF_Request_Rate_Limit programming value

V2: Add back dchvm_init for DCN42

Reviewed-by: Alex Hung <alex.hung@amd.com>
Reviewed-by: Leo Chen <leo.chen@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Signed-off-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>