Jonathan Cavitt [Tue, 24 Mar 2026 15:29:39 +0000 (15:29 +0000)]
drm/xe/xe_vm: Add per VM fault info
Add additional information to each VM so they can report up to the first
50 seen faults. Only pagefaults are saved this way currently, though in
the future, all faults should be tracked by the VM for future reporting.
Additionally, of the pagefaults reported, only failed pagefaults are
saved this way, as successful pagefaults should recover silently and not
need to be reported to userspace.
v2:
- Free vm after use (Shuicheng)
- Compress pf copy logic (Shuicheng)
- Update fault_unsuccessful before storing (Shuicheng)
- Fix old struct name in comments (Shuicheng)
- Keep first 50 pagefaults instead of last 50 (Jianxun)
v4:
- Rename xe_vm.pfs to xe_vm.faults (jcavitt)
- Store fault data and not pagefault in xe_vm faults list (jcavitt)
- Store address, address type, and address precision per fault (jcavitt)
- Store engine class and instance data per fault (Jianxun)
- Add and fix kernel docs (Michal W)
- Properly handle kzalloc error (Michal W)
- s/MAX_PFS/MAX_FAULTS_SAVED_PER_VM (Michal W)
- Store fault level per fault (Micahl M)
v5:
- Store fault and access type instead of address type (Jianxun)
v6:
- Store pagefaults in non-fault-mode VMs as well (Jianxun)
v7:
- Fix kernel docs and comments (Michal W)
v8:
- Fix double-locking issue (Jianxun)
v9:
- Do not report faults from reserved engines (Jianxun)
v14:
- Correctly ignore fault mode in save_pagefault_to_vm (jcavitt)
v15:
- s/save_pagefault_to_vm/xe_pagefault_save_to_vm (Matt Brost)
- Use guard instead of spin_lock/unlock (Matt Brost)
- GT was added to xe_pagefault struct. Use xe_gt_hw_engine
instead of creating a new helper function (Matt Brost)
v16:
- Set address precision programmatically (Matt Brost)
v17:
- Set address precision to fixed value (Matt Brost)
v18:
- s/uAPI/Link in commit log links
- Use kzalloc_obj
Link: https://github.com/intel/compute-runtime/pull/878 Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Suggested-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Cc: Shuicheng Lin <shuicheng.lin@intel.com> Cc: Jianxun Zhang <jianxun.zhang@intel.com> Cc: Michal Wajdeczko <Michal.Wajdeczko@intel.com> Cc: Michal Mzorek <michal.mzorek@intel.com> Cc: Ivan Briano <ivan.briano@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260324152935.72444-9-jonathan.cavitt@intel.com
Jonathan Cavitt [Tue, 24 Mar 2026 15:29:38 +0000 (15:29 +0000)]
drm/xe/uapi: Define drm_xe_vm_get_property
Add initial declarations for the drm_xe_vm_get_property ioctl.
v2:
- Expand kernel docs for drm_xe_vm_get_property (Jianxun)
v3:
- Remove address type external definitions (Jianxun)
- Add fault type to xe_drm_fault struct (Jianxun)
v4:
- Remove engine class and instance (Ivan)
v5:
- Add declares for fault type, access type, and fault level (Matt Brost,
Ivan)
v6:
- Fix inconsistent use of whitespace in defines
v7:
- Rebase and refactor (jcavitt)
v8:
- Rebase (jcavitt)
v9:
- Clarify address is canonical (José)
v10:
- s/uAPI/Link in the commit log links
Link: https://github.com/intel/compute-runtime/pull/878 Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com> Acked-by: Matthew Brost <matthew.brost@intel.com> Acked-by: Ivan Briano <ivan.briano@intel.com> Acked-by: José Roberto de Souza <jose.souza@intel.com> Cc: Zhang Jianxun <jianxun.zhang@intel.com> Cc: Ivan Briano <ivan.briano@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260324152935.72444-8-jonathan.cavitt@intel.com
Which seems to hint that the vma we are re-inserting for the ops unwind
is either invalid or overlapping with something already inserted in the
vm. It shouldn't be invalid since this is a re-insertion, so must have
worked before. Leaving the likely culprit as something already placed
where we want to insert the vma.
Following from that, for the case where we do something like a rebind in
the middle of a vma, and one or both mapped ends are already compatible,
we skip doing the rebind of those vma and set next/prev to NULL. As well
as then adjust the original unmap va range, to avoid unmapping the ends.
However, if we trigger the unwind path, we end up with three va, with
the two ends never being removed and the original va range in the middle
still being the shrunken size.
If this occurs, one failure mode is when another unwind op needs to
interact with that range, which can happen with a vector of binds. For
example, if we need to re-insert something in place of the original va.
In this case the va is still the shrunken version, so when removing it
and then doing a re-insert it can overlap with the ends, which were
never removed, triggering a warning like above, plus leaving the vm in a
bad state.
With that, we need two things here:
1) Stop nuking the prev/next tracking for the skip cases. Instead
relying on checking for skip prev/next, where needed. That way on the
unwind path, we now correctly remove both ends.
2) Undo the unmap va shrinkage, on the unwind path. With the two ends
now removed the unmap va should expand back to the original size again,
before re-insertion.
v2:
- Update the explanation in the commit message, based on an actual IGT of
triggering this issue, rather than conjecture.
- Also undo the unmap shrinkage, for the skip case. With the two ends
now removed, the original unmap va range should expand back to the
original range.
v3:
- Track the old start/range separately. vma_size/start() uses the va
info directly.
Tvrtko Ursulin [Tue, 24 Mar 2026 08:40:18 +0000 (08:40 +0000)]
drm/xe/xelp: Expose AuxCCS frame buffer modifiers on Alderlake-P
Now that we have implemented all the related missing bits we can enable
the AuxCCS compressed modifiers which were disabled in cf48bddd31de ("drm/i915/display: Disable AuxCCS framebuffers if built for Xe").
Tested with KDE Wayland, on Lenovo Carbon X1 ADL-P:
Tvrtko Ursulin [Tue, 24 Mar 2026 08:40:17 +0000 (08:40 +0000)]
drm/xe/display: Add support for AuxCCS
Add support for mapping the auxiliary CCS buffer into the DPT page tables.
This will allow for better power efficiency by enabling the render
compression frame buffer modifiers such as
I915_FORMAT_MOD_Y_TILED_GEN12_RC_CCS in a following patch.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Cc: Michael J. Ruhl <michael.j.ruhl@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Uma Shankar <uma.shankar@intel.com> Link: https://patch.msgid.link/20260324084018.20353-12-tvrtko.ursulin@igalia.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Tvrtko Ursulin [Tue, 24 Mar 2026 08:40:15 +0000 (08:40 +0000)]
drm/xe/display: Change write_dpt_remapped_tiled function signature
In preparation for adding support for the auxccs plane lets change the
function signature of write_dpt_remapped_tiled(). This will enable a
tidier way of extending it subsequent patches.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Cc: Michael J. Ruhl <michael.j.ruhl@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Uma Shankar <uma.shankar@intel.com> Reviewed-by: Uma Shankar <uma.shankar@intel.com> Link: https://patch.msgid.link/20260324084018.20353-10-tvrtko.ursulin@igalia.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Tvrtko Ursulin [Tue, 24 Mar 2026 08:40:12 +0000 (08:40 +0000)]
drm/xe: Move aux table invalidation to ring ops
Implement the suggestion of moving the aux invalidation from a helper to a
ring ops vfunc, together with the suggestion to split the vfunc table of
video decode and video enhance engines.
With this done the LRC code will be able to access the functionality via
the newly added ring ops vfunc.
Tvrtko Ursulin [Tue, 24 Mar 2026 08:40:10 +0000 (08:40 +0000)]
drm/xe/xelp: Quiesce memory traffic before invalidating AuxCCS
According to i915 commit ad8ebf12217e ("drm/i915/gt: Ensure memory quiesced before invalidation")
quiescing of the memory traffic is required before invalidating the AuxCCS
tables.
Tvrtko Ursulin [Tue, 24 Mar 2026 08:40:08 +0000 (08:40 +0000)]
drm/xe: Use write-combine mapping when populating DPT
The fallback case for DPT backing store is a buffer object in system
memory buffer, which by default use a write-back CPU caching policy.
If this fallback gets triggered, and since there is currently no flushing,
the DPT writes made when pinning a buffer to display are not guaranteed to
be seen by the display engine.
To fix this, since both the local memory and the stolen memory DPT
placements already use write-combine, let us make the system memory option
follow suit by passing down the appropriate flag.
Tvrtko Ursulin [Tue, 24 Mar 2026 08:40:07 +0000 (08:40 +0000)]
drm/xe: Rename XE_BO_FLAG_SCANOUT to XE_BO_FLAG_FORCE_WC
Rename XE_BO_FLAG_SCANOUT to XE_BO_FLAG_FORCE_WC so that the usage of the
flag can legitimately be expanded to more than just the actual frame-
buffer objects.
Hook into the PCI error handler reset_prepare() callback to notify
the PF about an upcoming VF FLR before reset_done() is executed.
This enables early FLR_PREPARE signaling and ensures that the PF is
aware of the reset before the completion wait begins.
Signed-off-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Alex Williamson <alex@shazbot.org> Link: https://patch.msgid.link/20260309152449.910636-3-piotr.piorkowski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
drm/xe/pf: Add FLR_PREPARE state to VF control flow
Our xe-vfio-pci component relies on the confirmation from the PF
that VF FLR processing has finished, but due to the notification
latency on the HW/FW side, PF might be unaware yet of the already
triggered VF FLR.
Update VF state machine with new FLR_PREPARE state that indicate
imminent VF FLR notification and treat that as a begin of the FLR
sequence. Also introduce function that xe-vfio-pci should call to
guarantee correct synchronization.
v2: move PREPARE into WIP, update commit msg (Michal)
Signed-off-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Co-developed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Link: https://patch.msgid.link/20260309152449.910636-2-piotr.piorkowski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Matt Roper [Thu, 19 Mar 2026 22:30:34 +0000 (15:30 -0700)]
drm/xe: Implement recent spec updates to Wa_16025250150
The hardware teams noticed that the originally documented workaround
steps for Wa_16025250150 may not be sufficient to fully avoid a hardware
issue. The workaround documentation has been augmented to suggest
programming one additional register; make the corresponding change in
the driver.
Tejas Upadhyay [Thu, 5 Mar 2026 12:19:06 +0000 (17:49 +0530)]
drm/xe/xe3p_lpg: Restrict UAPI to enable L2 flush optimization
When set, starting xe3p_lpg, the L2 flush optimization
feature will control whether L2 is in Persistent or
Transient mode through monitoring of media activity.
To enable L2 flush optimization include new feature flag
GUC_CTL_ENABLE_L2FLUSH_OPT for Novalake platforms when
media type is detected.
Tighten UAPI validation to restrict userptr, svm and
dmabuf mappings to be either 2WAY or XA+1WAY
V5(Thomas): logic correction
V4(MattA): Modify uapi doc and commit
V3(MattA): check valid op and pat_index value
V2(MattA): validate dma-buf bos and madvise pat-index
Acked-by: José Roberto de Souza <jose.souza@intel.com> Acked-by: Michal Mrozek <michal.mrozek@intel.com> Acked-by: Carl Zhang <carl.zhang@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patch.msgid.link/20260305121902.1892593-9-tejas.upadhyay@intel.com Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Tejas Upadhyay [Thu, 5 Mar 2026 12:19:05 +0000 (17:49 +0530)]
drm/xe/pat: define coh_mode 2way
Defining 2way (two-way coherency) is critical for
Xe3p_LPG (Nova Lake P) platforms to support L2 flush
optimization safely.
This mode allows the driver to skip certain manual cache
flushes (L2 flush optimization) without risking memory
corruption because the hardware ensures the most recent
data is visible to both entities.
Tejas Upadhyay [Thu, 5 Mar 2026 12:19:04 +0000 (17:49 +0530)]
drm/xe/xe3p_lpg: flush shrinker bo cachelines manually
XA, new pat_index introduced post xe3p_lpg, is memory shared between the
CPU and GPU is treated differently from other GPU memory when the Media
engine is power-gated.
XA is *always* flushed, like at the end-of-submssion (and maybe other
places), just that internally as an optimisation hw doesn't need to make
that a full flush (which will also include XA) when Media is
off/powergated, since it doesn't need to worry about GT caches vs Media
coherency, and only CPU vs GPU coherency, so can make that flush a
targeted XA flush, since stuff tagged with XA now means it's shared with
the CPU. The main implication is that we now need to somehow flush non-XA
before freeing system memory pages, otherwise dirty cachelines could be
flushed after the free (like if Media suddenly turns on and does a full
flush)
V4: Add comments for L2 flush path
V3(Thomas/MattA/MattR): Restrict userptr with non-xa, then no need to
flush manually
V2(MattA): Expand commit description
There is a small risk that when fetching a NULL context image the
VF may get a tweaked context image prepared by another VF that was
previously running on the engine before the GuC scheduler switched
the VFs.
To avoid that risk, without forcing GuC scheduler to trigger costly
engine reset on every VF switch, use a watchdog mechanism that when
configured with impossible condition, triggers an interrupt, which
GuC will handle by doing an engine reset. Also adjust job size to
account for additional dwords with watchdog setup.
This command supports memory based Semaphore WAIT. Memory based
semaphores will be used for synchronization between the Producer
and the Consumer contexts. Producer and Consumer Contexts could
be running on different engines or on the same engine inside GT.
The Watchdog Counter Control and Watchdog Counter Threshold
registers are needed for watchdog programming. This watchdog
will generate the "Media Hang Notify" interrupt.
Michał Winiarski [Tue, 17 Feb 2026 15:41:18 +0000 (16:41 +0100)]
drm/xe/pf: Fix use-after-free in migration restore
When an error is returned from xe_sriov_pf_migration_restore_produce(),
the data pointer is not set to NULL, which can trigger use-after-free
in subsequent .write() calls.
Set the pointer to NULL upon error to fix the problem.
Fixes: 1ed30397c0b92 ("drm/xe/pf: Add support for encap/decap of bitstream to/from packet") Reported-by: Sebastian Österlund <sebastian.osterlund@intel.com> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/7230 Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com> Link: https://patch.msgid.link/20260217154118.176902-1-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
drm/xe: Fix format specifier for printing pointer differences
GCC and clang warn (or error with CONFIG_WERROR=y / W=e) several times
when targeting 32-bit platforms along the lines of
drivers/gpu/drm/xe/xe_lrc.c: In function 'dump_mi_command':
drivers/gpu/drm/xe/xe_lrc.c:1921:40: error: format '%lx' expects argument of type 'long unsigned int', but argument 3 has type 'int' [-Werror=format=]
1921 | drm_printf(p, "LRC[%#5lx] = [%#010x] MI_NOOP (%d dwords)\n",
| ~~~~^
| |
| long unsigned int
| %#5x
1922 | dw - num_noop - start, inst_header, num_noop);
| ~~~~~~~~~~~~~~~~~~~~~
| |
| int
drivers/gpu/drm/xe/xe_lrc.c:1922:7: error: format specifies type 'unsigned long' but the argument has type '__ptrdiff_t' (aka 'int') [-Werror,-Wformat]
1921 | drm_printf(p, "LRC[%#5lx] = [%#010x] MI_NOOP (%d dwords)\n",
| ~~~~~
| %#5tx
1922 | dw - num_noop - start, inst_header, num_noop);
| ^~~~~~~~~~~~~~~~~~~~~
Use the '%tx' specifier for printing pointer differences, which clears
up the warnings for 32-bit platforms while introducing no regressions
for 64-bit platforms.
Nitin Gote [Tue, 17 Mar 2026 08:00:59 +0000 (13:30 +0530)]
drm/xe: Extend Wa_14026781792 for xe3lpg
Wa_14026781792 applies to all graphics versions from 30.00
through 35.10 (inclusive). Since there are no IPs between
30.05 and 35.10, consolidate the RTP rules into a single
GRAPHICS_VERSION_RANGE(3000, 3510).
v2: (Matt)
- There are no IPs between 30.05 and 35.10 either,
So, consolidate this into a single GRAPHICS_VERSION_RANGE(3000, 3510)
- Also move it up to the top part of the table
Varun Gupta [Tue, 17 Mar 2026 04:04:47 +0000 (09:34 +0530)]
drm/xe/xe3p_lpg: Add Wa_16029437861
Wa_16029437861 requires disabling COAMA atomics by setting bit 22
(SQ_DISABLE_COAMA) of L3SQCREG2 (0xb104) for Xe3p_LPG graphics
version 35.10 stepping A0..B0. This bit is already set by the existing
Wa_14026144927 entry, so add the new WA ID to the same implementation.
Sanjay Yadav [Fri, 13 Mar 2026 07:16:09 +0000 (12:46 +0530)]
drm/xe: Fix missing runtime PM reference in ccs_mode_store
ccs_mode_store() calls xe_gt_reset() which internally invokes
xe_pm_runtime_get_noresume(). That function requires the caller
to already hold an outer runtime PM reference and warns if none
is held:
Fix this by protecting xe_gt_reset() with the scope-based
guard(xe_pm_runtime)(xe), which is the preferred form when
the reference lifetime matches a single scope.
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/7593 Fixes: 480b358e7d8e ("drm/xe: Do not wake device during a GT reset") Cc: <stable@vger.kernel.org> # v6.19+ Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Shuicheng Lin <shuicheng.lin@intel.com> Suggested-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Sanjay Yadav <sanjay.kumar.yadav@intel.com> Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patch.msgid.link/20260313071608.3459480-2-sanjay.kumar.yadav@intel.com
drm/xe/lrc: Fix uninitialized new_ts when capturing context timestamp
Getting engine specific CTX TIMESTAMP register can fail. In that case,
if the context is active, new_ts is uninitialized. Fix that case by
initializing new_ts to the last value that was sampled in SW -
lrc->ctx_timestamp.
Ashutosh Dixit [Fri, 13 Mar 2026 05:36:30 +0000 (22:36 -0700)]
drm/xe/oa: Allow reading after disabling OA stream
Some OA data might be present in the OA buffer when OA stream is
disabled. Allow UMD's to retrieve this data, so that all data till the
point when OA stream is disabled can be retrieved.
Dave Airlie [Tue, 17 Mar 2026 01:27:01 +0000 (11:27 +1000)]
Merge tag 'drm-intel-next-2026-03-16' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next
[airlied: fixed conflict with xe tree]
drm/i915 feature pull for v7.1:
Features and functionality:
- C10/C20/LT PHY PLL divider verification (Mika)
- Use trans push mechanism to generate PSR frame change event on LNL+ (Jouni)
- Account for DSC bubble overhead for horizontal slices (Ankit, Chaitanya)
Refactoring and cleanups:
- Refactor DP DSC slice config computation (Imre)
- Use GVT versions of register helper macros for GVT MMIO table (Ankit)
- C10/C20/LT PHY PLL computation refactoring (Mika)
- VGA decode refactoring and related fixes/cleanups (Ville)
- Move DSB buffer buffer implementation to display parent interface (Jani)
- Move error interrupt capture to display irq snapshot (Jani)
- Move pcode calls to display parent interface (Jani)
- Reduce GVT dependency on display headers (Jani)
- Compute config and mode valid refactoring for DSC (Ankit)
- Stop using i915 core register headers in display (Uma)
- Refactor DPT, move i915 parts to display parent interface (Jani)
- Refactor gen2-4 overlay, move to display parent interface (Ville)
- Refactor masked field register macro helpers, move to shared headers (Jani)
- Convert a number of workaround checks to the new workaround framework (Luca)
- Refactor and move frontbuffer calls to display parent interface (Jani)
- Add VMA calls to display parent interface (Jani)
- Refactor stolen memory allocation decisions (Vinod, Ville)
- Clean up and unify workqueue usage (Marco Crivellari)
- Preparation for UHBR DP tunnels (Imre)
- Allow DSC passthrough modes during DP MST mode validation (Imre)
- Move framebuffer bo interface to display parent interface (Jani)
Fixes:
- Plenty of DP SST HPD IRQ handling fixes (Imre)
- DP AUX backlight and luminance control fixes (Suraj)
- Respect VBT pipe joiner disable for eDP (Ankit)
- Do not use CASF with joiner (Nemesa)
- Clear C10/C20 PHY response read and error bit to avoid PHY hangs (Suraj)
- Xe3p_LPD DMG clock gating, CDCLK, port sync workarounds (Suraj, Gustavo, Mitul)
- Fix GVT error path (Michał)
- Handle errors on DP DSC receiver cap reads (Suraj)
- DSS clock gating workaround on MTL+ to avoid DSC corruption (Mika)
- Skip state verification for LT PHY in TBT mode (Suraj)
- Fix NULL pointer dereference on suspend when uc firmware not loaded (Rahul Bukte)
- Fix an unlikely DMC state related NULL pointer dereference at probe (Imre)
- Handle error returns from vga_get_uninterruptible() (Simon Richter)
- Increase C10/C20/LT PHY timeouts to include SOC/OS turnaround (Arun)
- Fix BIOS FB vs. stolen memory size check (Ville)
- Fix LOBF to use computed guardband and set context latency (Ankit)
- Handle modeset WW mutex lock failures due to contention properly (Imre)
- Fix pipe BPP clamping due to HDR (Imre)
- Fix stale state usage in DSC state computation (Imre)
- Take HDCP 1.4 vs 2.x into account during link check (Suraj)
- Fix forced link retrain handling in MST HPD IRQ handler (Imre)
- Remove redundant warning on vcpi < 0 (Jonathan)
Core changes:
- iopoll: fix function parameter names in read_poll_timeout_atomic() (Randy Dunlap)
Brian Nguyen [Thu, 5 Mar 2026 17:15:49 +0000 (17:15 +0000)]
drm/xe: Move page reclaim done_handler to own func
Originally, page reclamation is handled by the same fence as tlb
invalidation and uses its seqno, so there was no reason to separate out
the handlers. However in hindsight, for readability, and possible
future changes, it seems more beneficial to move this all out to its own
function.
Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://patch.msgid.link/20260305171546.67691-7-brian3.nguyen@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Brian Nguyen [Thu, 5 Mar 2026 17:15:48 +0000 (17:15 +0000)]
drm/xe: Skip over non leaf pte for PRL generation
The check using xe_child->base.children was insufficient in determining
if a pte was a leaf node. So explicitly skip over every non-leaf pt and
conditionally abort if there is a scenario where a non-leaf pt is
interleaved between leaf pt, which results in the page walker skipping
over some leaf pt.
Note that the behavior being targeted for abort is
PD[0] = 2M PTE
PD[1] = PT -> 512 4K PTEs
PD[2] = 2M PTE
results in abort, page walker won't descend PD[1].
With new abort, ensuring valid PRL before handling a second abort.
v2:
- Revert to previous assert.
- Revised non-leaf handling for interleaf child pt and leaf pte.
- Update comments to specifications. (Stuart)
- Remove unnecessary XE_PTE_PS64. (Matthew B)
v3:
- Modify secondary abort to only check non-leaf PTEs. (Matthew B)
Fixes: b912138df299 ("drm/xe: Create page reclaim list on unbind") Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Cc: Stuart Summers <stuart.summers@intel.com> Link: https://patch.msgid.link/20260305171546.67691-6-brian3.nguyen@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Matt Roper [Wed, 11 Mar 2026 23:04:58 +0000 (16:04 -0700)]
drm/xe: Include running dword offset in default_lrc dumps
Printing a running dword offset in the default_lrc_* debugfs entries
makes it easier for developers to find the right offsets to use in
regs/xe_lrc_layout.h and/or compare the default LRC contents against the
bspec-documented LRC layout.
Jani Nikula [Wed, 11 Mar 2026 14:18:18 +0000 (16:18 +0200)]
drm/{i915,xe}: move framebuffer bo to parent interface
Add .framebuffer_init, .framebuffer_fini and .framebuffer_lookup to the
bo parent interface. While they're about framebuffers, they're
specifically about framebuffer objects, so the bo interface is a good
enough fit, and there's no need to add another interface struct.
Jani Nikula [Wed, 11 Mar 2026 14:18:16 +0000 (16:18 +0200)]
drm/{i915, xe}/bo: move display bo calls to parent interface
Continue i915 and xe separation from display by moving the bo calls to
the display parent interface. Instead of adding all these functions to
intel_parent.[ch], reuse the now vacated intel_bo.[ch], and avoid mass
renames to calls of these functions. This is similar to
intel_display_rpm.[ch].
Make many of the hooks optional to avoid having to implement dummy
functions in xe. Indeed now we can remove many of the existing dummy
functions.
Dave Airlie [Mon, 16 Mar 2026 06:50:47 +0000 (16:50 +1000)]
Merge tag 'amd-drm-next-7.1-2026-03-12' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-7.1-2026-03-12:
amdgpu:
- SMU13 fix
- SMU14 fix
- Fixes for bring up hw testing
- Kerneldoc fix
- GC12 idle power fix for compute workloads
- DCCG fixes
- UserQ fixes
- Move test for fbdev object to a generic helper
- GC 12.1 updates
- Use struct drm_edid in non-DC code
- Include IP discovery data in devcoredump
- SMU 13.x updates
- Misc cleanups
- DML 2.1 fixes
- Enable NV12/P010 support on primary planes
- Enable color encoding and color range on overlay planes
- DC underflow fixes
- HWSS fast path fixes
- Replay fixes
- DCN 4.2 updates
- Support newer IP discovery tables
- LSDMA 7.1 support
- IH 7.1 fixes
- SoC v1 updates
- GC12.1 updates
- PSP 15 updates
- XGMI fixes
- GPUVM locking fix
amdkfd:
- Fix missing BO unreserve in an error path
radeon:
- Move test for fbdev object to a generic helper
Dave Airlie [Mon, 16 Mar 2026 02:21:06 +0000 (12:21 +1000)]
Merge tag 'drm-xe-next-2026-03-12' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next
UAPI Changes:
- add VM_BIND DECOMPRESS support and on-demand decompression (Nitin)
- Allow per queue programming of COMMON_SLICE_CHICKEN3 bit13 (Lionel)
Cross-subsystem Changes:
- Introduce the DRM RAS infrastructure over generic netlink (Riana, Rodrigo)
Dave Airlie [Sun, 15 Mar 2026 23:10:13 +0000 (09:10 +1000)]
Merge tag 'drm-intel-gt-next-2026-03-12' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next
Driver Changes:
Fixes/improvements/new stuff:
- Fix potential overflow of shmem scatterlist length (Janusz Krzysztofik)
Miscellaneous:
- Keep mock file open during unfaultable migrate with fill [selftests] (Krzysztof Karas)
- Test for imported buffers with drm_gem_is_imported() (Thomas Zimmermann)
- Fix corrupted copyright symbols in selftest files [guc] (Konstantin Khorenko)
gem-shmem:
- Track page accessed/dirty status across mmap/vmap
ttm:
- fix fence signalling
Driver Changes:
amdxdna:
- provide NPU power estimate
- support sensor for column utilization
bridge:
- anx7625: Fix USB Type-C handling
- cdns-mhdp8546-core: Handle HDCP state in bridge atomic_check
ivpu:
- fixes
loongson:
- replace custom code with drm_gem_ttm_dumb_map_offset()
mxsfb:
- lcdif: report probing errors with dev_err_probe()
panel:
- ilitek-ili9882t: Allow GPIO calls to sleep
- jadard: Support TAIGUAN XTI05101-01A
- lxd: Support LXD M9189A plus DT bindings
- mantix: Fix pixel clock; Clean up
- motorola: Support Motorola Atrix 4G and Droid X2 plus DT bindings
- novatek: Support Novatek/Tianma NT37700F plus DT bindings
- renesas: Clean up
- simple: Support EDT ET057023UDBA plus DT bindings; Support Powertip
PH800480T032-ZHC19 plus DT bindings; Support Waveshare 13.3"
- clean up DT bindings of various drivers
panthor:
- fix fence handling
vc4:
- check return value of platform_get_irq_byname()
Francois Dugast [Thu, 12 Mar 2026 19:20:14 +0000 (20:20 +0100)]
drm/pagemap: Enable THP support for GPU memory migration
This enables support for Transparent Huge Pages (THP) for device pages by
using MIGRATE_VMA_SELECT_COMPOUND during migration. It removes the need to
split folios and loop multiple times over all pages to perform required
operations at page level. Instead, we rely on newly introduced support for
higher orders in drm_pagemap and folio-level API.
In Xe, this drastically improves performance when using SVM. The GT stats
below collected after a 2MB page fault show overall servicing is more than
7 times faster, and thanks to reduced CPU overhead the time spent on the
actual copy goes from 23% without THP to 80% with THP:
v2:
- Fix one occurrence of drm_pagemap_get_devmem_page() (Matthew Brost)
v3:
- Remove migrate_device_split_page() and folio_split_lock, instead rely on
free_zone_device_folio() to split folios before freeing (Matthew Brost)
- Assert folio order is HPAGE_PMD_ORDER (Matthew Brost)
- Always use folio_set_zone_device_data() in split (Matthew Brost)
Matthew Brost [Thu, 12 Mar 2026 19:20:13 +0000 (20:20 +0100)]
drm/pagemap: Correct cpages calculation for migrate_vma_setup
cpages returned from migrate_vma_setup represents the total number of
individual pages found, not the number of 4K pages. The math in
drm_pagemap_migrate_to_devmem for npages is based on the number of 4K
pages, so cpages != npages can fail even if the entire memory range is
found in migrate_vma_setup (e.g., when a single 2M page is found).
Add drm_pagemap_cpages, which converts cpages to the number of 4K pages
found.
Cc: Andrew Morton <akpm@linux-foundation.org> Cc: David Hildenbrand <david@kernel.org> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Mike Rapoport <rppt@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Balbir Singh <balbirs@nvidia.com> Cc: linux-mm@kvack.org Reviewed-by: Francois Dugast <francois.dugast@intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Balbir Singh <balbirs@nvidia.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260312192126.2024853-4-francois.dugast@intel.com
Francois Dugast [Thu, 12 Mar 2026 19:20:11 +0000 (20:20 +0100)]
drm/pagemap: Unlock and put folios when possible
If the page is part of a folio, unlock and put the whole folio at once
instead of individual pages one after the other. This will reduce the
amount of operations once device THP are in use.
Cc: Andrew Morton <akpm@linux-foundation.org> Cc: David Hildenbrand <david@kernel.org> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Mike Rapoport <rppt@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Balbir Singh <balbirs@nvidia.com> Cc: linux-mm@kvack.org Suggested-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Balbir Singh <balbirs@nvidia.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260312192126.2024853-2-francois.dugast@intel.com
Matthew Brost [Tue, 10 Mar 2026 22:50:39 +0000 (18:50 -0400)]
drm/xe: Open-code GGTT MMIO access protection
GGTT MMIO access is currently protected by hotplug (drm_dev_enter),
which works correctly when the driver loads successfully and is later
unbound or unloaded. However, if driver load fails, this protection is
insufficient because drm_dev_unplug() is never called.
Additionally, devm release functions cannot guarantee that all BOs with
GGTT mappings are destroyed before the GGTT MMIO region is removed, as
some BOs may be freed asynchronously by worker threads.
To address this, introduce an open-coded flag, protected by the GGTT
lock, that guards GGTT MMIO access. The flag is cleared during the
dev_fini_ggtt devm release function to ensure MMIO access is disabled
once teardown begins.
Zhanjun Dong [Tue, 10 Mar 2026 22:50:37 +0000 (18:50 -0400)]
drm/xe/guc: Ensure CT state transitions via STOP before DISABLED
The GuC CT state transition requires moving to the STOP state before
entering the DISABLED state. Update the driver teardown sequence to make
the proper state machine transitions.
Fixes: ee4b32220a6b ("drm/xe/guc: Add devm release action to safely tear down CT") Cc: stable@vger.kernel.org Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260310225039.1320161-6-zhanjun.dong@intel.com
Matthew Brost [Tue, 10 Mar 2026 22:50:35 +0000 (18:50 -0400)]
drm/xe: Trigger queue cleanup if not in wedged mode 2
The intent of wedging a device is to allow queues to continue running
only in wedged mode 2. In other modes, queues should initiate cleanup
and signal all remaining fences. Fix xe_guc_submit_wedge to correctly
clean up queues when wedge mode != 2.
Matthew Brost [Tue, 10 Mar 2026 22:50:34 +0000 (18:50 -0400)]
drm/xe: Forcefully tear down exec queues in GuC submit fini
In GuC submit fini, forcefully tear down any exec queues by disabling
CTs, stopping the scheduler (which cleans up lost G2H), killing all
remaining queues, and resuming scheduling to allow any remaining cleanup
actions to complete and signal any remaining fences.
Split guc_submit_fini into device related and software only part. Using
device-managed and drm-managed action guarantees the correct ordering of
cleanup.
Matthew Brost [Tue, 10 Mar 2026 22:50:33 +0000 (18:50 -0400)]
drm/xe: Always kill exec queues in xe_guc_submit_pause_abort
xe_guc_submit_pause_abort is intended to be called after something
disastrous occurs (e.g., VF migration fails, device wedging, or driver
unload) and should immediately trigger the teardown of remaining
submission state. With that, kill any remaining queues in this function.
Fixes: 7c4b7e34c83b ("drm/xe/vf: Abort VF post migration recovery on failure") Cc: stable@vger.kernel.org Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260310225039.1320161-2-zhanjun.dong@intel.com
By using the same variable for both the return of poll_timeout_us and
the return of the polled function guc_wait_ucode, the return value of
the latter is overwritten and lost after exiting the polling loop. Since
guc_wait_ucode returns -1 on GuC load failure, we lose that information
and always continue as if the GuC had been loaded correctly.
This is fixed by simply using 2 separate variables.
Fixes: a4916b4da448 ("drm/xe/guc: Refactor GuC load to use poll_timeout_us()") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://patch.msgid.link/20260303001732.2540493-2-daniele.ceraolospurio@intel.com
Imre Deak [Wed, 11 Mar 2026 15:31:52 +0000 (17:31 +0200)]
drm/i915/dp: Simplify forcing a link retraining
Since both the DP SST and MST HPD IRQ handlers call
intel_dp_handle_link_service_irq() with LINK_STATUS_CHANGED set in
irq_mask if intel_dp->link.force_retrain is set, checking for the former
flag is sufficient to determine if the link status needs to be checked
(which includes retraining the link if this is forced); remove checking
for the latter flag.
Since LINK_STATUS_CHANGED is currently set unconditionally for DP SST,
extend the related comment to note that it must be set if
intel_dp->link.force_retrain is set (in case setting LINK_STATUS_CHANGED
becomes conditional on DPCD_REV).
Imre Deak [Wed, 11 Mar 2026 15:31:51 +0000 (17:31 +0200)]
drm/i915/dp_mst: Fix forced link retrain handling in MST HPD IRQ handler
Handling of a forced link retraining debugfs request via the DP MST HPD
IRQ handler is incorrectly skipped, if the IRQ handler doesn't see any
HPD IRQs raised by the sink. Fix this by ensuring that the request is
always handled (in the Fixes: commit below by directly calling
intel_dp_check_link_state(), later by the same call moved to
intel_dp_handle_link_service_irq()).
Cc: Luca Coelho <luciano.coelho@intel.com> Fixes: db4855d90363 ("drm/i915/dp_mst: Reuse intel_dp_check_link_state() in the HPD IRQ handler") Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patch.msgid.link/20260311153152.133744-1-imre.deak@intel.com
Suraj Kandpal [Wed, 25 Feb 2026 06:50:45 +0000 (12:20 +0530)]
drm/i915/hdcp: Take force_hdcp14 into account during check_link
During intel_hdcp_check_link phase we need to take into account
if we are currently forcing HDCP 1.4 or not. This is because
we check for HDCP 2.x Link first and only if HDCP 2.x is not being
used check for HDCP 1.4. With force_hdcp14 in picture we should not
be going into intel_hdcp2_check_link because of which we may end
up trying to disable HDCP2.x even if HDCP 1.4 was enabled causing
a lot of issues while IGT tests this.
Matt Roper [Thu, 12 Mar 2026 16:29:23 +0000 (09:29 -0700)]
drm/xe/wa: Drop redundant entries for Wa_16021867713 & Wa_14019449301
The Xe2_HPM-specific RTP table entries for Wa_16021867713 and
Wa_14019449301 were removed by commit 941f538b0af8 ("drm/xe: Consolidate
workaround entries for Wa_16021867713") and commit aa0f0a678370
("drm/xe: Consolidate workaround entries for Wa_14019449301") in favor
of alternate entries earlier in the table that cover a wider range of IP
versions. However these Xe2_HPM-specific entries were accidentally
resurrected during a backmerge, which causes the Xe driver to complain
on probe about two entries trying to program the same registers+bits:
Mika Kuoppala [Wed, 4 Mar 2026 21:17:28 +0000 (23:17 +0200)]
drm/xe: Fix overflow in guc_ct_snapshot_capture
snapshot->ctb is u32*, so pointer arithmetic on it scales
the byte offset from xe_bo_size() by 4, overshooting the
intended start of the g2h portion and writing past the
allocated buffer.
Fix this by using void * to get the arithmetic right and
prevent future mishaps.
v2: s/u8/void for memcpy and iosys_map consistency (Matt)
Fixes: af3de6cf06f9 ("drm/xe: Split H2G and G2H into separate buffer objects") Cc: Matthew Brost <matthew.brost@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: intel-xe@lists.freedesktop.org Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260304211728.249104-1-mika.kuoppala@linux.intel.com
Nitin Gote [Wed, 4 Mar 2026 12:38:02 +0000 (18:08 +0530)]
drm/xe: implement VM_BIND decompression in vm_bind_ioctl
Implement handling of VM_BIND(..., DECOMPRESS) in xe_vm_bind_ioctl.
Key changes:
- Parse and record per-op intent (op->map.request_decompress) when the
DECOMPRESS flag is present.
- Use xe_pat_index_get_comp_en() helper to check if a PAT index
has compression enabled via the XE2_COMP_EN bit.
- Validate DECOMPRESS preconditions in the ioctl path:
- Only valid for MAP ops.
- The provided pat_index must select the device's "no-compression" PAT.
- Only meaningful on devices with flat CCS and the required XE2+
otherwise return -EOPNOTSUPP.
- Use XE_IOCTL_DBG for uAPI sanity checks.
- Implement xe_bo_decompress():
For VRAM BOs run xe_bo_move_notify(), reserve one fence slot,
schedule xe_migrate_resolve(), and attach the returned fence
with DMA_RESV_USAGE_KERNEL. Non-VRAM cases are silent no-ops.
- Wire scheduling into vma_lock_and_validate() so VM_BIND will schedule
decompression when request_decompress is set.
- Handle fault-mode VMs by performing decompression synchronously during
the bind process, ensuring that the resolve is completed before the bind
finishes.
This schedules an in-place GPU resolve (xe_migrate_resolve) for
decompression.
v7: Rebase on latest drm-tip and add compute and igt pr info
v6: (Matt Auld)
- Rebase as xe_pat_index_get_comp_en() is added in separate
patch
- Drop vm param from xe_bo_decompress(), instead of it
extract tile from bo
- Reject decompression on igpu instead of silent skipping
to avoid any failure on Xe2+igpu as xe_device_has_flat_ccs()
can sometimes be false on igpu due some setting in the BIOS
to turn off compression on igpu.
- Nits
v5: (Matt)
- Correct the condition check of xe_pat_index_get_comp_en
v4: (Matt)
- Introduce xe_pat_index_get_comp_en(), which checks
XE2_COMP_EN for the pat_index
- .interruptible should be true, everything else false
v3: (Matt)
- s/xe_bo_schedule_decompress/xe_bo_decompress
- skip the decrompress step if the BO isn't in VRAM
- start/size not required in xe_bo_schedule_decompress
- Use xe_bo_move_notify instead of xe_vm_invalidate_vma
with respect to invalidation.
- Nits
v2:
- Move decompression work out of vm_bind ioctl. (Matt)
- Put that work in a small helper at the BO/migrate layer invoke it
from vma_lock_and_validate which already runs under drm_exec.
- Move lightweight checks to vm_bind_ioctl_check_args (Matthew Auld)
Cc: Matthew Brost <matthew.brost@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Acked-by: Michal Mrozek <michal.mrozek@intel.com> Signed-off-by: Nitin Gote <nitin.r.gote@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patch.msgid.link/20260304123758.3050386-8-nitin.r.gote@intel.com
Nitin Gote [Wed, 4 Mar 2026 12:38:01 +0000 (18:08 +0530)]
drm/xe: add xe_migrate_resolve wrapper and is_vram_resolve support
Introduce an internal __xe_migrate_copy(..., is_vram_resolve) path and
expose a small wrapper xe_migrate_resolve() that calls it with
is_vram_resolve=true.
For resolve/decompression operations we must ensure the copy code uses
the compression PAT index when appropriate; this change centralizes that
behavior and allows callers to schedule a resolve (decompress) operation
via the migrate API.
v3: Fix kernel-doc warnings
v2: (Matt)
- Simplify xe_migrate_resolve(), use single BO/resource;
remove copy_only_ccs argument as it's always false.
Cc: Matthew Brost <matthew.brost@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Nitin Gote <nitin.r.gote@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patch.msgid.link/20260304123758.3050386-7-nitin.r.gote@intel.com
Nitin Gote [Wed, 4 Mar 2026 12:38:00 +0000 (18:08 +0530)]
drm/xe: add VM_BIND DECOMPRESS uapi flag
Add a new VM_BIND flag, DRM_XE_VM_BIND_FLAG_DECOMPRESS, that lets userspace
express intent for the driver to perform on-device in-place decompression
for the GPU mapping created by a MAP bind operation.
This flag is used by subsequent driver changes to trigger scheduling of
GPU work that resolves compressed VRAM pages into an uncompressed PAT
VM mapping.
Behavior and semantics:
- Valid only for DRM_XE_VM_BIND_OP_MAP. IOCTLs using this flag on other ops
are rejected (-EINVAL).
- The bind's pat_index must select the device "no-compression" PAT entry;
otherwise the ioctl is rejected (-EINVAL).
- Only meaningful for VRAM-backed BOs on devices that support Flat CCS and
the required hardware generation (driver will return -EOPNOTSUPP if not).
- On success the driver schedules a migrate/resolve and installs the
returned dma_fence into the BO's kernel reservation
(DMA_RESV_USAGE_KERNEL).
v3: Rebase on latest drm-tip and add compute pr info
v2: Add kernel doc (Matt)
Cc: Matthew Brost <matthew.brost@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Mrozek, Michal <michal.mrozek@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Nitin Gote <nitin.r.gote@intel.com> Acked-by: Michal Mrozek <michal.mrozek@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patch.msgid.link/20260304123758.3050386-6-nitin.r.gote@intel.com
accel/amdxdna: Support sensors for column utilization
The AMD PMF driver provides realtime column utilization (npu_busy)
metrics for the NPU. Extend the DRM_IOCTL_AMDXDNA_GET_INFO sensor
query to expose these metrics to userspace.
Add AMDXDNA_SENSOR_TYPE_COLUMN_UTILIZATION to the sensor type enum
and update aie2_get_sensors() to return both the total power and up
to 8 column utilization sensors if the user buffer permits.
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Lizhi Hou <lizhi.hou@amd.com>
[lizhi: support legacy tool which uses small buffer. checkpatch cleanup] Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260311171842.473453-1-lizhi.hou@amd.com
Lizhi Hou [Sat, 28 Feb 2026 06:10:57 +0000 (00:10 -0600)]
accel/amdxdna: Add IOCTL to retrieve realtime NPU power estimate
The AMD PMF driver provides an interface to obtain realtime power
estimates for the NPU. Expose this information to userspace through a
new DRM_IOCTL_AMDXDNA_GET_INFO parameter, allowing applications to query
the current NPU power level.
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
(Update comment to indicate power and utilization) Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260228061109.361239-2-superm1@kernel.org
Gustavo Sousa [Tue, 3 Mar 2026 20:46:17 +0000 (17:46 -0300)]
drm/xe/pat: Extract gt_pta_entry()
Avoid code duplication by extracting the logic for selection of the
correct PAT_PTA entry for a GT into function gt_pta_entry() and using
that function whenever necessary.
Christian König [Tue, 20 Jan 2026 12:09:52 +0000 (13:09 +0100)]
drm/amdgpu: revert to old status lock handling v4
It turned out that protecting the status of each bo_va with a
spinlock was just hiding problems instead of solving them.
Revert the whole approach, add a separate stats_lock and lockdep
assertions that the correct reservation lock is held all over the place.
This not only allows for better checks if a state transition is properly
protected by a lock, but also switching back to using list macros to
iterate over the state of lists protected by the dma_resv lock of the
root PD.
v2: re-add missing check
v3: split into two patches
v4: re-apply by fixing holding the VM lock at the right places.
Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd: Set num IP blocks to 0 if discovery fails
If discovery has failed for any reason (such as no support for a block)
then there is no need to unwind all the IP blocks in fini. In this
condition there can actually be failures during the unwind too.
Reset num_ip_blocks to zero during failure path and skip the unnecessary
cleanup path.
Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Hawking Zhang [Mon, 9 Feb 2026 20:08:54 +0000 (04:08 +0800)]
drm/amdgpu: New interface to get IP discovery binary v3
Implement a driver path to read the IP discovery
binary offset and size from DRIVER_SCRATCH registers
BIOS signals usage by setting a feature flag that
instructs the driver to use this method. Otherwise,
fallback to legacy approach.
v2: Simplify discovery offset/size retrieval in
get_tmr_info
v3: Update get_tmr_info to cover discovery offset
and size retrieval for both bare-metal and sriov
Philip Yang [Tue, 9 Dec 2025 20:13:23 +0000 (15:13 -0500)]
drm/amdkfd: Unreserve bo if queue update failed
Error handling path should unreserve bo then return failed.
Fixes: 305cd109b761 ("drm/amdkfd: Validate user queue update") Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>