git.ipfire.org Git - thirdparty/kernel/linux.git/log

drm/xe/xe_vm: Add per VM fault info

Add additional information to each VM so they can report up to the first
50 seen faults.  Only pagefaults are saved this way currently, though in
the future, all faults should be tracked by the VM for future reporting.

Additionally, of the pagefaults reported, only failed pagefaults are
saved this way, as successful pagefaults should recover silently and not
need to be reported to userspace.

v2:
- Free vm after use (Shuicheng)
- Compress pf copy logic (Shuicheng)
- Update fault_unsuccessful before storing (Shuicheng)
- Fix old struct name in comments (Shuicheng)
- Keep first 50 pagefaults instead of last 50 (Jianxun)

v3:
- Avoid unnecessary execution by checking MAX_PFS earlier (jcavitt)
- Fix double-locking error (jcavitt)
- Assert kmemdump is successful (Shuicheng)

v4:
- Rename xe_vm.pfs to xe_vm.faults (jcavitt)
- Store fault data and not pagefault in xe_vm faults list (jcavitt)
- Store address, address type, and address precision per fault (jcavitt)
- Store engine class and instance data per fault (Jianxun)
- Add and fix kernel docs (Michal W)
- Properly handle kzalloc error (Michal W)
- s/MAX_PFS/MAX_FAULTS_SAVED_PER_VM (Michal W)
- Store fault level per fault (Micahl M)

v5:
- Store fault and access type instead of address type (Jianxun)

v6:
- Store pagefaults in non-fault-mode VMs as well (Jianxun)

v7:
- Fix kernel docs and comments (Michal W)

v8:
- Fix double-locking issue (Jianxun)

v9:
- Do not report faults from reserved engines (Jianxun)

v10:
- Remove engine class and instance (Ivan)

v11:
- Perform kzalloc outside of lock (Auld)

v12:
- Fix xe_vm_fault_entry kernel docs (Shuicheng)

v13:
- Rebase and refactor (jcavitt)

v14:
- Correctly ignore fault mode in save_pagefault_to_vm (jcavitt)

v15:
- s/save_pagefault_to_vm/xe_pagefault_save_to_vm (Matt Brost)
- Use guard instead of spin_lock/unlock (Matt Brost)
- GT was added to xe_pagefault struct.  Use xe_gt_hw_engine
  instead of creating a new helper function (Matt Brost)

v16:
- Set address precision programmatically (Matt Brost)

v17:
- Set address precision to fixed value (Matt Brost)

v18:
- s/uAPI/Link in commit log links
- Use kzalloc_obj

Link: https://github.com/intel/compute-runtime/pull/878
Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Cc: Shuicheng Lin <shuicheng.lin@intel.com>
Cc: Jianxun Zhang <jianxun.zhang@intel.com>
Cc: Michal Wajdeczko <Michal.Wajdeczko@intel.com>
Cc: Michal Mzorek <michal.mzorek@intel.com>
Cc: Ivan Briano <ivan.briano@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260324152935.72444-9-jonathan.cavitt@intel.com

drm/xe/uapi: Define drm_xe_vm_get_property

Add initial declarations for the drm_xe_vm_get_property ioctl.

v2:
- Expand kernel docs for drm_xe_vm_get_property (Jianxun)

v3:
- Remove address type external definitions (Jianxun)
- Add fault type to xe_drm_fault struct (Jianxun)

v4:
- Remove engine class and instance (Ivan)

v5:
- Add declares for fault type, access type, and fault level (Matt Brost,
Ivan)

v6:
- Fix inconsistent use of whitespace in defines

v7:
- Rebase and refactor (jcavitt)

v8:
- Rebase (jcavitt)

v9:
- Clarify address is canonical (José)

v10:
- s/uAPI/Link in the commit log links

Link: https://github.com/intel/compute-runtime/pull/878
Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
Acked-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Ivan Briano <ivan.briano@intel.com>
Acked-by: José Roberto de Souza <jose.souza@intel.com>
Cc: Zhang Jianxun <jianxun.zhang@intel.com>
Cc: Ivan Briano <ivan.briano@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260324152935.72444-8-jonathan.cavitt@intel.com

drm/xe/xe_pagefault: Disallow writes to read-only VMAs

The page fault handler should reject write/atomic access to read only
VMAs. Add code to handle this in xe_pagefault_service after the VMA
lookup.

v2:
- Apply max line length (Matthew)

Fixes: fb544b844508 ("drm/xe: Implement xe_pagefault_queue_work")
Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Cc: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260324152935.72444-7-jonathan.cavitt@intel.com

drm/xe: always keep track of remap prev/next

During 3D workload, user is reporting hitting:

[  413.361679] WARNING: drivers/gpu/drm/xe/xe_vm.c:1217 at vm_bind_ioctl_ops_unwind+0x1e2/0x2e0 [xe], CPU#7: vkd3d_queue/9925
[  413.361944] CPU: 7 UID: 1000 PID: 9925 Comm: vkd3d_queue Kdump: loaded Not tainted 7.0.0-070000rc3-generic #202603090038 PREEMPT(lazy)
[  413.361949] RIP: 0010:vm_bind_ioctl_ops_unwind+0x1e2/0x2e0 [xe]
[  413.362074] RSP: 0018:ffffd4c25c3df930 EFLAGS: 00010282
[  413.362077] RAX: 0000000000000000 RBX: ffff8f3ee817ed10 RCX: 0000000000000000
[  413.362078] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[  413.362079] RBP: ffffd4c25c3df980 R08: 0000000000000000 R09: 0000000000000000
[  413.362081] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8f41fbf99380
[  413.362082] R13: ffff8f3ee817e968 R14: 00000000ffffffef R15: ffff8f43d00bd380
[  413.362083] FS:  00000001040ff6c0(0000) GS:ffff8f4696d89000(0000) knlGS:00000000330b0000
[  413.362085] CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
[  413.362086] CR2: 00007ddfc4747000 CR3: 00000002e6262005 CR4: 0000000000f72ef0
[  413.362088] PKRU: 55555554
[  413.362089] Call Trace:
[  413.362092]  <TASK>
[  413.362096]  xe_vm_bind_ioctl+0xa9a/0xc60 [xe]

Which seems to hint that the vma we are re-inserting for the ops unwind
is either invalid or overlapping with something already inserted in the
vm. It shouldn't be invalid since this is a re-insertion, so must have
worked before. Leaving the likely culprit as something already placed
where we want to insert the vma.

Following from that, for the case where we do something like a rebind in
the middle of a vma, and one or both mapped ends are already compatible,
we skip doing the rebind of those vma and set next/prev to NULL. As well
as then adjust the original unmap va range, to avoid unmapping the ends.
However, if we trigger the unwind path, we end up with three va, with
the two ends never being removed and the original va range in the middle
still being the shrunken size.

If this occurs, one failure mode is when another unwind op needs to
interact with that range, which can happen with a vector of binds. For
example, if we need to re-insert something in place of the original va.
In this case the va is still the shrunken version, so when removing it
and then doing a re-insert it can overlap with the ends, which were
never removed, triggering a warning like above, plus leaving the vm in a
bad state.

With that, we need two things here:

1) Stop nuking the prev/next tracking for the skip cases. Instead
    relying on checking for skip prev/next, where needed. That way on the
    unwind path, we now correctly remove both ends.

2) Undo the unmap va shrinkage, on the unwind path. With the two ends
    now removed the unmap va should expand back to the original size again,
    before re-insertion.

v2:
  - Update the explanation in the commit message, based on an actual IGT of
    triggering this issue, rather than conjecture.
  - Also undo the unmap shrinkage, for the skip case. With the two ends
    now removed, the original unmap va range should expand back to the
    original range.
v3:
  - Track the old start/range separately. vma_size/start() uses the va
    info directly.

Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/7602
Fixes: 8f33b4f054fc ("drm/xe: Avoid doing rebinds")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260318100208.78097-2-matthew.auld@intel.com

drm/xe/xelp: Expose AuxCCS frame buffer modifiers on Alderlake-P

Now that we have implemented all the related missing bits we can enable
the AuxCCS compressed modifiers which were disabled in
cf48bddd31de ("drm/i915/display: Disable AuxCCS framebuffers if built for Xe").

Tested with KDE Wayland, on Lenovo Carbon X1 ADL-P:

        [PLANE:32:plane 1A]: type=PRI
                uapi: [FB:242] AR30 little-endian (0x30335241),0x100000000000008,2880x1800, visible=visible, src=28
                hw: [FB:242] AR30 little-endian (0x30335241),0x100000000000008,2880x1800, visible=yes, src=2880.000

Display is working fine - no artefacts, no DMAR/PIPE faults.

v2:
* Adjust patch title. (Rodrigo)

v3:
* Complete rewrite based on the display parent interface.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
References: cf48bddd31de ("drm/i915/display: Disable AuxCCS framebuffers if built for Xe")
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/20260324084018.20353-13-tvrtko.ursulin@igalia.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/xe/display: Add support for AuxCCS

Add support for mapping the auxiliary CCS buffer into the DPT page tables.

This will allow for better power efficiency by enabling the render
compression frame buffer modifiers such as
I915_FORMAT_MOD_Y_TILED_GEN12_RC_CCS in a following patch.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Cc: Michael J. Ruhl <michael.j.ruhl@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patch.msgid.link/20260324084018.20353-12-tvrtko.ursulin@igalia.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/xe/display: Respect remapped plane alignment

Instead of assuming PAGE_SIZE alignment between the remapped planes
respect the value set in the struct intel_remapped_info.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Cc: Michael J. Ruhl <michael.j.ruhl@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patch.msgid.link/20260324084018.20353-11-tvrtko.ursulin@igalia.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/xe/display: Change write_dpt_remapped_tiled function signature

In preparation for adding support for the auxccs plane lets change the
function signature of write_dpt_remapped_tiled(). This will enable a
tidier way of extending it subsequent patches.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Cc: Michael J. Ruhl <michael.j.ruhl@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Uma Shankar <uma.shankar@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patch.msgid.link/20260324084018.20353-10-tvrtko.ursulin@igalia.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/xe/display: Move remapped plane loop out of __xe_pin_fb_vma_dpt

In preparation for adding support for the auxccs plane lets move the
plane iteration loop to its own function.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Cc: Michael J. Ruhl <michael.j.ruhl@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patch.msgid.link/20260324084018.20353-9-tvrtko.ursulin@igalia.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/xe/xelp: Add AuxCCS invalidation to the indirect context workarounds

Following from the i915 reference implementation, we add the AuxCCS
invalidation to the indirect context workarounds page.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/20260324084018.20353-8-tvrtko.ursulin@igalia.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/xe: Move aux table invalidation to ring ops

Implement the suggestion of moving the aux invalidation from a helper to a
ring ops vfunc, together with the suggestion to split the vfunc table of
video decode and video enhance engines.

With this done the LRC code will be able to access the functionality via
the newly added ring ops vfunc.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260324084018.20353-7-tvrtko.ursulin@igalia.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/xe/xelp: Wait for AuxCCS invalidation to complete

On AuxCCS platforms we need to wait for AuxCCS invalidations to complete.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/20260324084018.20353-6-tvrtko.ursulin@igalia.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/xe/xelp: Quiesce memory traffic before invalidating AuxCCS

According to i915 commit
ad8ebf12217e ("drm/i915/gt: Ensure memory quiesced before invalidation")
quiescing of the memory traffic is required before invalidating the AuxCCS
tables.

Add an extra pipe control flush to achieve that.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/20260324084018.20353-5-tvrtko.ursulin@igalia.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/xe/xelpg: Limit AuxCCS ring buffer programming to Alderlake

At the moment the driver does not support AuxCCS at all due respective
modifiers being hidden from userspace.

As we are about to start enabling them, starting with Alderlake, let us
begin by limiting the ring buffer support to just that initial platform.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/20260324084018.20353-4-tvrtko.ursulin@igalia.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/xe: Use write-combine mapping when populating DPT

The fallback case for DPT backing store is a buffer object in system
memory buffer, which by default use a write-back CPU caching policy.

If this fallback gets triggered, and since there is currently no flushing,
the DPT writes made when pinning a buffer to display are not guaranteed to
be seen by the display engine.

To fix this, since both the local memory and the stolen memory DPT
placements already use write-combine, let us make the system memory option
follow suit by passing down the appropriate flag.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/20260324084018.20353-3-tvrtko.ursulin@igalia.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/xe: Rename XE_BO_FLAG_SCANOUT to XE_BO_FLAG_FORCE_WC

Rename XE_BO_FLAG_SCANOUT to XE_BO_FLAG_FORCE_WC so that the usage of the
flag can legitimately be expanded to more than just the actual frame-
buffer objects.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Suggested-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/20260324084018.20353-2-tvrtko.ursulin@igalia.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

vfio/xe: Notify PF about VF FLR in reset_prepare

Hook into the PCI error handler reset_prepare() callback to notify
the PF about an upcoming VF FLR before reset_done() is executed.
This enables early FLR_PREPARE signaling and ensures that the PF is
aware of the reset before the completion wait begins.

Signed-off-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Alex Williamson <alex@shazbot.org>
Link: https://patch.msgid.link/20260309152449.910636-3-piotr.piorkowski@intel.com
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>

drm/xe/pf: Add FLR_PREPARE state to VF control flow

Our xe-vfio-pci component relies on the confirmation from the PF
that VF FLR processing has finished, but due to the notification
latency on the HW/FW side, PF might be unaware yet of the already
triggered VF FLR.

Update VF state machine with new FLR_PREPARE state that indicate
imminent VF FLR notification and treat that as a begin of the FLR
sequence. Also introduce function that xe-vfio-pci should call to
guarantee correct synchronization.

v2: move PREPARE into WIP, update commit msg (Michal)

Signed-off-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Co-developed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patch.msgid.link/20260309152449.910636-2-piotr.piorkowski@intel.com
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>

drm/xe: Implement recent spec updates to Wa_16025250150

The hardware teams noticed that the originally documented workaround
steps for Wa_16025250150 may not be sufficient to fully avoid a hardware
issue. The workaround documentation has been augmented to suggest
programming one additional register; make the corresponding change in
the driver.

Fixes: 7654d51f1fd8 ("drm/xe/xe2hpg: Add Wa_16025250150")
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Link: https://patch.msgid.link/20260319-wa_16025250150_part2-v1-1-46b1de1a31b2@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>

drm/xe/xe3p: Skip TD flush

Xe3p has HW ability to do transient display flush so the xe driver can
enable this HW feature by default and skip the software TD flush.

Bspec: 60002
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Shekhar Chauhan <shekhar.chauhan@intel.com>
Link: https://patch.msgid.link/20260305121902.1892593-10-tejas.upadhyay@intel.com
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>

drm/xe/xe3p_lpg: Restrict UAPI to enable L2 flush optimization

When set, starting xe3p_lpg, the L2 flush optimization
feature will control whether L2 is in Persistent or
Transient mode through monitoring of media activity.

To enable L2 flush optimization include new feature flag
GUC_CTL_ENABLE_L2FLUSH_OPT for Novalake platforms when
media type is detected.

Tighten UAPI validation to restrict userptr, svm and
dmabuf mappings to be either 2WAY or XA+1WAY

V5(Thomas): logic correction
V4(MattA): Modify uapi doc and commit
V3(MattA): check valid op and pat_index value
V2(MattA): validate dma-buf bos and madvise pat-index

Acked-by: José Roberto de Souza <jose.souza@intel.com>
Acked-by: Michal Mrozek <michal.mrozek@intel.com>
Acked-by: Carl Zhang <carl.zhang@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patch.msgid.link/20260305121902.1892593-9-tejas.upadhyay@intel.com
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>

drm/xe/pat: define coh_mode 2way

Defining 2way (two-way coherency) is critical for
Xe3p_LPG (Nova Lake P) platforms to support L2 flush
optimization safely.

This mode allows the driver to skip certain manual cache
flushes (L2 flush optimization) without risking memory
corruption because the hardware ensures the most recent
data is visible to both entities.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patch.msgid.link/20260305121902.1892593-8-tejas.upadhyay@intel.com
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>

drm/xe/xe3p_lpg: flush shrinker bo cachelines manually

XA, new pat_index introduced post xe3p_lpg, is memory shared between the
CPU and GPU is treated differently from other GPU memory when the Media
engine is power-gated.

XA is *always* flushed, like at the end-of-submssion (and maybe other
places), just that internally as an optimisation hw doesn't need to make
that a full flush (which will also include XA) when Media is
off/powergated, since it doesn't need to worry about GT caches vs Media
coherency, and only CPU vs GPU coherency, so can make that flush a
targeted XA flush, since stuff tagged with XA now means it's shared with
the CPU. The main implication is that we now need to somehow flush non-XA
before freeing system memory pages, otherwise dirty cachelines could be
flushed after the free (like if Media suddenly turns on and does a full
flush)

V4: Add comments for L2 flush path
V3(Thomas/MattA/MattR): Restrict userptr with non-xa, then no need to
flush manually
V2(MattA): Expand commit description

Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patch.msgid.link/20260305121902.1892593-7-tejas.upadhyay@intel.com
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>

drm/xe/vf: Improve getting clean NULL context

There is a small risk that when fetching a NULL context image the
VF may get a tweaked context image prepared by another VF that was
previously running on the engine before the GuC scheduler switched
the VFs.

To avoid that risk, without forcing GuC scheduler to trigger costly
engine reset on every VF switch, use a watchdog mechanism that when
configured with impossible condition, triggers an interrupt, which
GuC will handle by doing an engine reset. Also adjust job size to
account for additional dwords with watchdog setup.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patch.msgid.link/20260303201354.17948-4-michal.wajdeczko@intel.com

drm/xe: Add MI_SEMAPHORE_WAIT command definition

This command supports memory based Semaphore WAIT. Memory based
semaphores will be used for synchronization between the Producer
and the Consumer contexts. Producer and Consumer Contexts could
be running on different engines or on the same engine inside GT.

Bspec: 45749, 60244
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patch.msgid.link/20260303201354.17948-3-michal.wajdeczko@intel.com

drm/xe: Add PR_CTR_CTRL/THRSH register definitions

The Watchdog Counter Control and Watchdog Counter Threshold
registers are needed for watchdog programming. This watchdog
will generate the "Media Hang Notify" interrupt.

Bspec: 45999, 46000
Bspec: 60373, 60374
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patch.msgid.link/20260303201354.17948-2-michal.wajdeczko@intel.com

drm/xe/pf: Fix use-after-free in migration restore

When an error is returned from xe_sriov_pf_migration_restore_produce(),
the data pointer is not set to NULL, which can trigger use-after-free
in subsequent .write() calls.
Set the pointer to NULL upon error to fix the problem.

Fixes: 1ed30397c0b92 ("drm/xe/pf: Add support for encap/decap of bitstream to/from packet")
Reported-by: Sebastian Österlund <sebastian.osterlund@intel.com>
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/7230
Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
Link: https://patch.msgid.link/20260217154118.176902-1-michal.winiarski@intel.com
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>

drm/xe: Fix format specifier for printing pointer differences

GCC and clang warn (or error with CONFIG_WERROR=y / W=e) several times
when targeting 32-bit platforms along the lines of

  drivers/gpu/drm/xe/xe_lrc.c: In function 'dump_mi_command':
  drivers/gpu/drm/xe/xe_lrc.c:1921:40: error: format '%lx' expects argument of type 'long unsigned int', but argument 3 has type 'int' [-Werror=format=]
   1921 |                 drm_printf(p, "LRC[%#5lx]  =  [%#010x] MI_NOOP (%d dwords)\n",
        |                                    ~~~~^
        |                                        |
        |                                        long unsigned int
        |                                    %#5x
   1922 |                            dw - num_noop - start, inst_header, num_noop);
        |                            ~~~~~~~~~~~~~~~~~~~~~
        |                                          |
        |                                          int

  drivers/gpu/drm/xe/xe_lrc.c:1922:7: error: format specifies type 'unsigned long' but the argument has type '__ptrdiff_t' (aka 'int') [-Werror,-Wformat]
   1921 |                 drm_printf(p, "LRC[%#5lx]  =  [%#010x] MI_NOOP (%d dwords)\n",
        |                                    ~~~~~
        |                                    %#5tx
   1922 |                            dw - num_noop - start, inst_header, num_noop);
        |                            ^~~~~~~~~~~~~~~~~~~~~

Use the '%tx' specifier for printing pointer differences, which clears
up the warnings for 32-bit platforms while introducing no regressions
for 64-bit platforms.

Fixes: 65fcf19cb36b ("drm/xe: Include running dword offset in default_lrc dumps")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260316-drm-xe-fix-32-bit-wformat-ptrdiff-v1-1-0108b10b2b6b@kernel.org
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>

drm/xe: Extend Wa_14026781792 for xe3lpg

Wa_14026781792 applies to all graphics versions from 30.00
through 35.10 (inclusive). Since there are no IPs between
30.05 and 35.10, consolidate the RTP rules into a single
GRAPHICS_VERSION_RANGE(3000, 3510).

v2: (Matt)
- There are no IPs between 30.05 and 35.10 either,
So, consolidate this into a single GRAPHICS_VERSION_RANGE(3000, 3510)
- Also move it up to the top part of the table

Signed-off-by: Nitin Gote <nitin.r.gote@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260317080059.1275116-2-nitin.r.gote@intel.com
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>

drm/xe/xe3p_lpg: Add Wa_16029437861

Wa_16029437861 requires disabling COAMA atomics by setting bit 22
(SQ_DISABLE_COAMA) of L3SQCREG2 (0xb104) for Xe3p_LPG graphics
version 35.10 stepping A0..B0. This bit is already set by the existing
Wa_14026144927 entry, so add the new WA ID to the same implementation.

Signed-off-by: Varun Gupta <varun.gupta@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Link: https://patch.msgid.link/20260317040447.1792687-1-varun.gupta@intel.com
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>

drm/ttm: Fix spelling mistakes and comment style in ttm_resource.c

Correct several spelling mistakes and textual inconsistencies in
kdoc comments and inline comments.

Suggested-by: Sanjay Yadav <sanjay.kumar.yadav@intel.com>
Signed-off-by: Varun Gupta <varun.gupta@intel.com>
Reviewed-by: Nitin Gote <nitin.r.gote@intel.com>
Link: https://patch.msgid.link/20260316035915.1403424-1-varun.gupta@intel.com
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>

Merge drm/drm-next into drm-xe-next

Bring in series "drm/{i915,xe}: sort out step enums between the drivers"
that was merged through i915.

Link: https://lore.kernel.org/all/cover.1772635152.git.jani.nikula@intel.com
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/xe: Fix missing runtime PM reference in ccs_mode_store

ccs_mode_store() calls xe_gt_reset() which internally invokes
xe_pm_runtime_get_noresume(). That function requires the caller
to already hold an outer runtime PM reference and warns if none
is held:

  [46.891177] xe 0000:03:00.0: [drm] Missing outer runtime PM protection
  [46.891178] WARNING: drivers/gpu/drm/xe/xe_pm.c:885 at
  xe_pm_runtime_get_noresume+0x8b/0xc0

Fix this by protecting xe_gt_reset() with the scope-based
guard(xe_pm_runtime)(xe), which is the preferred form when
the reference lifetime matches a single scope.

v2:
- Use scope-based guard(xe_pm_runtime)(xe) (Shuicheng)
- Update commit message accordingly

Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/7593
Fixes: 480b358e7d8e ("drm/xe: Do not wake device during a GT reset")
Cc: <stable@vger.kernel.org> # v6.19+
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Shuicheng Lin <shuicheng.lin@intel.com>
Suggested-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Sanjay Yadav <sanjay.kumar.yadav@intel.com>
Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patch.msgid.link/20260313071608.3459480-2-sanjay.kumar.yadav@intel.com

drm/xe/lrc: Fix uninitialized new_ts when capturing context timestamp

Getting engine specific CTX TIMESTAMP register can fail. In that case,
if the context is active, new_ts is uninitialized. Fix that case by
initializing new_ts to the last value that was sampled in SW -
lrc->ctx_timestamp.

Flagged by static analysis.

v2: Fix new_ts initialization (Ashutosh)

Fixes: bb63e7257e63 ("drm/xe: Avoid toggling schedule state to check LRC timestamp in TDR")
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Link: https://patch.msgid.link/20260312125308.3126607-2-umesh.nerlige.ramappa@intel.com

drm/xe/oa: Allow reading after disabling OA stream

Some OA data might be present in the OA buffer when OA stream is
disabled. Allow UMD's to retrieve this data, so that all data till the
point when OA stream is disabled can be retrieved.

v2: Update tail pointer after disable (Umesh)

Fixes: efb315d0a013 ("drm/xe/oa/uapi: Read file_operation")
Cc: stable@vger.kernel.org
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Umesh Nerlige Ramappa<umesh.nerlige.ramappa@intel.com>
Link: https://patch.msgid.link/20260313053630.3176100-1-ashutosh.dixit@intel.com

Merge tag 'drm-intel-next-2026-03-16' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next

[airlied: fixed conflict with xe tree]
drm/i915 feature pull for v7.1:

Features and functionality:
- C10/C20/LT PHY PLL divider verification (Mika)
- Use trans push mechanism to generate PSR frame change event on LNL+ (Jouni)
- Account for DSC bubble overhead for horizontal slices (Ankit, Chaitanya)

Refactoring and cleanups:
- Refactor DP DSC slice config computation (Imre)
- Use GVT versions of register helper macros for GVT MMIO table (Ankit)
- C10/C20/LT PHY PLL computation refactoring (Mika)
- VGA decode refactoring and related fixes/cleanups (Ville)
- Move DSB buffer buffer implementation to display parent interface (Jani)
- Move error interrupt capture to display irq snapshot (Jani)
- Move pcode calls to display parent interface (Jani)
- Reduce GVT dependency on display headers (Jani)
- Compute config and mode valid refactoring for DSC (Ankit)
- Stop using i915 core register headers in display (Uma)
- Refactor DPT, move i915 parts to display parent interface (Jani)
- Refactor gen2-4 overlay, move to display parent interface (Ville)
- Refactor masked field register macro helpers, move to shared headers (Jani)
- Convert a number of workaround checks to the new workaround framework (Luca)
- Refactor and move frontbuffer calls to display parent interface (Jani)
- Add VMA calls to display parent interface (Jani)
- Refactor stolen memory allocation decisions (Vinod, Ville)
- Clean up and unify workqueue usage (Marco Crivellari)
- Preparation for UHBR DP tunnels (Imre)
- Allow DSC passthrough modes during DP MST mode validation (Imre)
- Move framebuffer bo interface to display parent interface (Jani)

Fixes:
- Plenty of DP SST HPD IRQ handling fixes (Imre)
- DP AUX backlight and luminance control fixes (Suraj)
- Respect VBT pipe joiner disable for eDP (Ankit)
- Do not use CASF with joiner (Nemesa)
- Clear C10/C20 PHY response read and error bit to avoid PHY hangs (Suraj)
- Xe3p_LPD DMG clock gating, CDCLK, port sync workarounds (Suraj, Gustavo, Mitul)
- Fix GVT error path (Michał)
- Handle errors on DP DSC receiver cap reads (Suraj)
- DSS clock gating workaround on MTL+ to avoid DSC corruption (Mika)
- Skip state verification for LT PHY in TBT mode (Suraj)
- Fix NULL pointer dereference on suspend when uc firmware not loaded (Rahul Bukte)
- Fix an unlikely DMC state related NULL pointer dereference at probe (Imre)
- Handle error returns from vga_get_uninterruptible() (Simon Richter)
- Increase C10/C20/LT PHY timeouts to include SOC/OS turnaround (Arun)
- Fix BIOS FB vs. stolen memory size check (Ville)
- Fix LOBF to use computed guardband and set context latency (Ankit)
- Handle modeset WW mutex lock failures due to contention properly (Imre)
- Fix pipe BPP clamping due to HDR (Imre)
- Fix stale state usage in DSC state computation (Imre)
- Take HDCP 1.4 vs 2.x into account during link check (Suraj)
- Fix forced link retrain handling in MST HPD IRQ handler (Imre)
- Remove redundant warning on vcpi < 0 (Jonathan)

Core changes:
- iopoll: fix function parameter names in read_poll_timeout_atomic() (Randy Dunlap)

Merges:
- Backmerge drm-next for v7.0-rc1 (Jani)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patch.msgid.link/b14bb0f297b1750816cf5f342bde608e435655fa@intel.com

drm/xe: Skip adding PRL entry to NULL VMA

NULL VMAs have no corresponding PTE, so skip adding a PRL entry to avoid
an unnecessary PRL abort during unbind.

Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260305171546.67691-8-brian3.nguyen@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>

drm/xe: Move page reclaim done_handler to own func

Originally, page reclamation is handled by the same fence as tlb
invalidation and uses its seqno, so there was no reason to separate out
the handlers. However in hindsight, for readability, and possible
future changes, it seems more beneficial to move this all out to its own
function.

Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Link: https://patch.msgid.link/20260305171546.67691-7-brian3.nguyen@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>

drm/xe: Skip over non leaf pte for PRL generation

The check using xe_child->base.children was insufficient in determining
if a pte was a leaf node. So explicitly skip over every non-leaf pt and
conditionally abort if there is a scenario where a non-leaf pt is
interleaved between leaf pt, which results in the page walker skipping
over some leaf pt.

Note that the behavior being targeted for abort is
PD[0] = 2M PTE
PD[1] = PT -> 512 4K PTEs
PD[2] = 2M PTE

results in abort, page walker won't descend PD[1].

With new abort, ensuring valid PRL before handling a second abort.

v2:
- Revert to previous assert.
- Revised non-leaf handling for interleaf child pt and leaf pte.
- Update comments to specifications. (Stuart)
- Remove unnecessary XE_PTE_PS64. (Matthew B)

v3:
- Modify secondary abort to only check non-leaf PTEs. (Matthew B)

Fixes: b912138df299 ("drm/xe: Create page reclaim list on unbind")
Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Cc: Stuart Summers <stuart.summers@intel.com>
Link: https://patch.msgid.link/20260305171546.67691-6-brian3.nguyen@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>

drm/xe: Include running dword offset in default_lrc dumps

Printing a running dword offset in the default_lrc_* debugfs entries
makes it easier for developers to find the right offsets to use in
regs/xe_lrc_layout.h and/or compare the default LRC contents against the
bspec-documented LRC layout.

Reviewed-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com>
Link: https://patch.msgid.link/20260311-default_lrc_offsets-v1-1-58d8ed3aa081@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>

drm/xe/i2c: Assert/Deassert I2C IRQ

I2C IRQ is triggered using virtual wire. Assert/Deassert it in IRQ
handler to allow subsequent interrupt generation.

Signed-off-by: Raag Jadav <raag.jadav@intel.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Link: https://patch.msgid.link/20260313080438.4166251-1-raag.jadav@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>

drm/{i915,xe}: move framebuffer bo to parent interface

Add .framebuffer_init, .framebuffer_fini and .framebuffer_lookup to the
bo parent interface. While they're about framebuffers, they're
specifically about framebuffer objects, so the bo interface is a good
enough fit, and there's no need to add another interface struct.

Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
Link: https://patch.msgid.link/848d32a44bf844cba3d66e44ba9f20bea4a8352d.1773238670.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

drm/i915/fb: make intel_fb_bo.c less dependent on display

intel_fb_bo.c is i915 core specific code, and should use struct
drm_i915_private instead of struct intel_display.

Switch one DISPLAY_VER() to GRAPHICS_VER(). The check is for < 4, where
they're effectively the same thing.

Reviewed-by: Suraj Kandpal@intel.com>
Link: https://patch.msgid.link/13087bd24bd5af5265ca6af67f086b93e26e311f.1773238670.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

drm/{i915, xe}/bo: move display bo calls to parent interface

Continue i915 and xe separation from display by moving the bo calls to
the display parent interface. Instead of adding all these functions to
intel_parent.[ch], reuse the now vacated intel_bo.[ch], and avoid mass
renames to calls of these functions. This is similar to
intel_display_rpm.[ch].

Make many of the hooks optional to avoid having to implement dummy
functions in xe. Indeed now we can remove many of the existing dummy
functions.

Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
Link: https://patch.msgid.link/7899eef2ccf0cd603df69099df065226a0df917b.1773238670.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

drm/xe: rename intel_bo.c to xe_display_bo.c

Follow the xe_ prefixed file naming in xe. With xe_bo.[ch] already being
a thing in xe core, use xe_display_bo.c.

Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
Link: https://patch.msgid.link/2f73eda5117462407f12113ce096496282ee3fcc.1773238670.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

drm/i915: move i915 specific bo implementation to i915

The bo interface implementation is different for both i915 and xe. Move
the i915 specific implementation from display to i915 core.

Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
Link: https://patch.msgid.link/e159166d623899996a51a577365ca7ab9b1a0974.1773238670.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

Merge tag 'amd-drm-next-7.1-2026-03-12' of https://gitlab.freedesktop.org/agd5f/linux into drm-next

amd-drm-next-7.1-2026-03-12:

amdgpu:
- SMU13 fix
- SMU14 fix
- Fixes for bring up hw testing
- Kerneldoc fix
- GC12 idle power fix for compute workloads
- DCCG fixes
- UserQ fixes
- Move test for fbdev object to a generic helper
- GC 12.1 updates
- Use struct drm_edid in non-DC code
- Include IP discovery data in devcoredump
- SMU 13.x updates
- Misc cleanups
- DML 2.1 fixes
- Enable NV12/P010 support on primary planes
- Enable color encoding and color range on overlay planes
- DC underflow fixes
- HWSS fast path fixes
- Replay fixes
- DCN 4.2 updates
- Support newer IP discovery tables
- LSDMA 7.1 support
- IH 7.1 fixes
- SoC v1 updates
- GC12.1 updates
- PSP 15 updates
- XGMI fixes
- GPUVM locking fix

amdkfd:
- Fix missing BO unreserve in an error path

radeon:
- Move test for fbdev object to a generic helper

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patch.msgid.link/20260312184425.3875669-1-alexander.deucher@amd.com
Signed-off-by: Dave Airlie <airlied@redhat.com>

Merge tag 'drm-xe-next-2026-03-12' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next

UAPI Changes:
- add VM_BIND DECOMPRESS support and on-demand decompression (Nitin)
- Allow per queue programming of COMMON_SLICE_CHICKEN3 bit13 (Lionel)

Cross-subsystem Changes:
- Introduce the DRM RAS infrastructure over generic netlink (Riana, Rodrigo)

Core Changes:
- Two-pass MMU interval notifiers (Thomas)

Driver Changes:
- Merge drm/drm-next into drm-xe-next (Brost)
- Fix overflow in guc_ct_snapshot_capture (Mika, Fixes)
- Extract gt_pta_entry (Gustavo)
- Extra enabling patches for NVL-P (Gustavo)
- Add Wa_14026578760 (Varun)
- Add type-specific GT loop iterator (Roper)
- Refactor xe_migrate_prepare_vm (Raag)
- Don't disable GuCRC in suspend path (Vinay, Fixes)
- Add missing kernel docs in xe_exec_queue.c (Niranjana)
- Change TEST_VRAM to work with 32-bit resource_size_t (Wajdeczko)
- Fix memory leak in xe_vm_madvise_ioctl (Varun, Fixes)
- Skip access counter queue init for unsupported platforms (Himal)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/abLUVfSHu8EHRF9q@lstrano-desk.jf.intel.com

Merge tag 'drm-intel-gt-next-2026-03-12' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next

Driver Changes:

Fixes/improvements/new stuff:

- Fix potential overflow of shmem scatterlist length (Janusz Krzysztofik)

Miscellaneous:

- Keep mock file open during unfaultable migrate with fill [selftests] (Krzysztof Karas)
- Test for imported buffers with drm_gem_is_imported() (Thomas Zimmermann)
- Fix corrupted copyright symbols in selftest files [guc] (Konstantin Khorenko)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tursulin@igalia.com>
Link: https://patch.msgid.link/abKBHNFsBQCv2h3e@linux

Merge tag 'drm-misc-next-2026-03-12' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next

drm-misc-next for v7.1:

UAPI Changes:

amdxdna:
- Add sensors ioctls

Cross-subsystem Changes:

dma-buf:
- clean pages with helpers

Documenatation:
- devicetree: Add lxd vendor prefix

Core Changes:

buddy:
- improve aligned allocations

gem-shmem:
- Track page accessed/dirty status across mmap/vmap

ttm:
- fix fence signalling

Driver Changes:

amdxdna:
- provide NPU power estimate
- support sensor for column utilization

bridge:
- anx7625: Fix USB Type-C handling
- cdns-mhdp8546-core: Handle HDCP state in bridge atomic_check

ivpu:
- fixes

loongson:
- replace custom code with drm_gem_ttm_dumb_map_offset()

mxsfb:
- lcdif: report probing errors with dev_err_probe()

panel:
- ilitek-ili9882t: Allow GPIO calls to sleep
- jadard: Support TAIGUAN XTI05101-01A
- lxd: Support LXD M9189A plus DT bindings
- mantix: Fix pixel clock; Clean up
- motorola: Support Motorola Atrix 4G and Droid X2 plus DT bindings
- novatek: Support Novatek/Tianma NT37700F plus DT bindings
- renesas: Clean up
- simple: Support EDT ET057023UDBA plus DT bindings; Support Powertip
PH800480T032-ZHC19 plus DT bindings; Support Waveshare 13.3"
- clean up DT bindings of various drivers

panthor:
- fix fence handling

vc4:
- check return value of platform_get_irq_byname()

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patch.msgid.link/20260312075629.GA21234@linux.fritz.box

drm/pagemap: Enable THP support for GPU memory migration

This enables support for Transparent Huge Pages (THP) for device pages by
using MIGRATE_VMA_SELECT_COMPOUND during migration. It removes the need to
split folios and loop multiple times over all pages to perform required
operations at page level. Instead, we rely on newly introduced support for
higher orders in drm_pagemap and folio-level API.

In Xe, this drastically improves performance when using SVM. The GT stats
below collected after a 2MB page fault show overall servicing is more than
7 times faster, and thanks to reduced CPU overhead the time spent on the
actual copy goes from 23% without THP to 80% with THP:

Without THP:

    svm_2M_pagefault_us: 966
    svm_2M_migrate_us: 942
    svm_2M_device_copy_us: 223
    svm_2M_get_pages_us: 9
    svm_2M_bind_us: 10

With THP:

    svm_2M_pagefault_us: 132
    svm_2M_migrate_us: 128
    svm_2M_device_copy_us: 106
    svm_2M_get_pages_us: 1
    svm_2M_bind_us: 2

v2:
- Fix one occurrence of drm_pagemap_get_devmem_page() (Matthew Brost)

v3:
- Remove migrate_device_split_page() and folio_split_lock, instead rely on
  free_zone_device_folio() to split folios before freeing (Matthew Brost)
- Assert folio order is HPAGE_PMD_ORDER (Matthew Brost)
- Always use folio_set_zone_device_data() in split (Matthew Brost)

v4:
- Warn on compound device page, s/continue/goto next/ (Matthew Brost)

v5:
- Revert warn on compound device page
- s/zone_device_page_init()/zone_device_folio_init() (Matthew Brost)

Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Michal Mrozek <michal.mrozek@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Balbir Singh <balbirs@nvidia.com>
Cc: linux-mm@kvack.org
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260312192126.2024853-5-francois.dugast@intel.com

drm/pagemap: Correct cpages calculation for migrate_vma_setup

cpages returned from migrate_vma_setup represents the total number of
individual pages found, not the number of 4K pages. The math in
drm_pagemap_migrate_to_devmem for npages is based on the number of 4K
pages, so cpages != npages can fail even if the entire memory range is
found in migrate_vma_setup (e.g., when a single 2M page is found).
Add drm_pagemap_cpages, which converts cpages to the number of 4K pages
found.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Balbir Singh <balbirs@nvidia.com>
Cc: linux-mm@kvack.org
Reviewed-by: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Balbir Singh <balbirs@nvidia.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260312192126.2024853-4-francois.dugast@intel.com

drm/pagemap: Add helper to access zone_device_data

This new helper helps ensure all accesses to zone_device_data use the
correct API whether the page is part of a folio or not.

v2:
- Move to drm_pagemap.h, stick to folio_zone_device_data (Matthew Brost)
- Return struct drm_pagemap_zdd * (Matthew Brost)

v3:
- Add stub for !CONFIG_ZONE_DEVICE (CI)

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Balbir Singh <balbirs@nvidia.com>
Cc: linux-mm@kvack.org
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260312192126.2024853-3-francois.dugast@intel.com

drm/pagemap: Unlock and put folios when possible

If the page is part of a folio, unlock and put the whole folio at once
instead of individual pages one after the other. This will reduce the
amount of operations once device THP are in use.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Balbir Singh <balbirs@nvidia.com>
Cc: linux-mm@kvack.org
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Balbir Singh <balbirs@nvidia.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260312192126.2024853-2-francois.dugast@intel.com

drm/xe: Open-code GGTT MMIO access protection

GGTT MMIO access is currently protected by hotplug (drm_dev_enter),
which works correctly when the driver loads successfully and is later
unbound or unloaded. However, if driver load fails, this protection is
insufficient because drm_dev_unplug() is never called.

Additionally, devm release functions cannot guarantee that all BOs with
GGTT mappings are destroyed before the GGTT MMIO region is removed, as
some BOs may be freed asynchronously by worker threads.

To address this, introduce an open-coded flag, protected by the GGTT
lock, that guards GGTT MMIO access. The flag is cleared during the
dev_fini_ggtt devm release function to ensure MMIO access is disabled
once teardown begins.

Cc: stable@vger.kernel.org
Fixes: 919bb54e989c ("drm/xe: Fix missing runtime outer protection for ggtt_remove_node")
Reviewed-by: Zhanjun Dong <zhanjun.dong@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260310225039.1320161-8-zhanjun.dong@intel.com

drm/xe/uc: Drop xe_guc_sanitize in favor of managed cleanup

If the firmware fails to load in GT resets the device is wedged also
initiating a GuC state cleanup.

Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260310225039.1320161-7-zhanjun.dong@intel.com

drm/xe/guc: Ensure CT state transitions via STOP before DISABLED

The GuC CT state transition requires moving to the STOP state before
entering the DISABLED state. Update the driver teardown sequence to make
the proper state machine transitions.

Fixes: ee4b32220a6b ("drm/xe/guc: Add devm release action to safely tear down CT")
Cc: stable@vger.kernel.org
Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260310225039.1320161-6-zhanjun.dong@intel.com

drm/xe: Use XE_WEDGED_MODE_UPON_ANY_HANG_NO_RESET enum instead of magic number

Replace the magic number 2 with the proper enum value
XE_WEDGED_MODE_UPON_ANY_HANG_NO_RESET for better code readability
and maintainability.

Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260310225039.1320161-5-zhanjun.dong@intel.com

drm/xe: Trigger queue cleanup if not in wedged mode 2

The intent of wedging a device is to allow queues to continue running
only in wedged mode 2. In other modes, queues should initiate cleanup
and signal all remaining fences. Fix xe_guc_submit_wedge to correctly
clean up queues when wedge mode != 2.

Fixes: 7dbe8af13c18 ("drm/xe: Wedge the entire device")
Cc: stable@vger.kernel.org
Reviewed-by: Zhanjun Dong <zhanjun.dong@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260310225039.1320161-4-zhanjun.dong@intel.com

drm/xe: Forcefully tear down exec queues in GuC submit fini

In GuC submit fini, forcefully tear down any exec queues by disabling
CTs, stopping the scheduler (which cleans up lost G2H), killing all
remaining queues, and resuming scheduling to allow any remaining cleanup
actions to complete and signal any remaining fences.

Split guc_submit_fini into device related and software only part. Using
device-managed and drm-managed action guarantees the correct ordering of
cleanup.

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: stable@vger.kernel.org
Reviewed-by: Zhanjun Dong <zhanjun.dong@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260310225039.1320161-3-zhanjun.dong@intel.com

drm/xe: Always kill exec queues in xe_guc_submit_pause_abort

xe_guc_submit_pause_abort is intended to be called after something
disastrous occurs (e.g., VF migration fails, device wedging, or driver
unload) and should immediately trigger the teardown of remaining
submission state. With that, kill any remaining queues in this function.

Fixes: 7c4b7e34c83b ("drm/xe/vf: Abort VF post migration recovery on failure")
Cc: stable@vger.kernel.org
Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260310225039.1320161-2-zhanjun.dong@intel.com

drm/xe/guc: Fail immediately on GuC load error

By using the same variable for both the return of poll_timeout_us and
the return of the polled function guc_wait_ucode, the return value of
the latter is overwritten and lost after exiting the polling loop. Since
guc_wait_ucode returns -1 on GuC load failure, we lose that information
and always continue as if the GuC had been loaded correctly.

This is fixed by simply using 2 separate variables.

Fixes: a4916b4da448 ("drm/xe/guc: Refactor GuC load to use poll_timeout_us()")
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Link: https://patch.msgid.link/20260303001732.2540493-2-daniele.ceraolospurio@intel.com

drm/i915/dp: Simplify forcing a link retraining

Since both the DP SST and MST HPD IRQ handlers call
intel_dp_handle_link_service_irq() with LINK_STATUS_CHANGED set in
irq_mask if intel_dp->link.force_retrain is set, checking for the former
flag is sufficient to determine if the link status needs to be checked
(which includes retraining the link if this is forced); remove checking
for the latter flag.

Since LINK_STATUS_CHANGED is currently set unconditionally for DP SST,
extend the related comment to note that it must be set if
intel_dp->link.force_retrain is set (in case setting LINK_STATUS_CHANGED
becomes conditional on DPCD_REV).

Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patch.msgid.link/20260311153152.133744-2-imre.deak@intel.com

drm/i915/dp_mst: Fix forced link retrain handling in MST HPD IRQ handler

Handling of a forced link retraining debugfs request via the DP MST HPD
IRQ handler is incorrectly skipped, if the IRQ handler doesn't see any
HPD IRQs raised by the sink. Fix this by ensuring that the request is
always handled (in the Fixes: commit below by directly calling
intel_dp_check_link_state(), later by the same call moved to
intel_dp_handle_link_service_irq()).

Cc: Luca Coelho <luciano.coelho@intel.com>
Fixes: db4855d90363 ("drm/i915/dp_mst: Reuse intel_dp_check_link_state() in the HPD IRQ handler")
Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patch.msgid.link/20260311153152.133744-1-imre.deak@intel.com

drm/xe/uapi: Fix kernel-doc for DRM_XE_VM_BIND_FLAG_DECOMPRESS

There is kernel-doc warning for DRM_XE_VM_BIND_FLAG_DECOMPRESS:

./include/uapi/drm/xe_drm.h:1060: WARNING: Block quote ends without
a blank line; unexpected unindent.

Fix the warning by adding the missing '%' prefix to
DRM_XE_VM_BIND_FLAG_DECOMPRESS in the kernel-doc list entry for
struct drm_xe_vm_bind_op.

Fixes: 2270bd7124f4 ("drm/xe: add VM_BIND DECOMPRESS uapi flag")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202603121515.gEMrFlTL-lkp@intel.com/
Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Nitin Gote <nitin.r.gote@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patch.msgid.link/20260312160244.809849-2-nitin.r.gote@intel.com
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>

drm/i915/hdcp: Take force_hdcp14 into account during check_link

During intel_hdcp_check_link phase we need to take into account
if we are currently forcing HDCP 1.4 or not. This is because
we check for HDCP 2.x Link first and only if HDCP 2.x is not being
used check for HDCP 1.4. With force_hdcp14 in picture we should not
be going into intel_hdcp2_check_link because of which we may end
up trying to disable HDCP2.x even if HDCP 1.4 was enabled causing
a lot of issues while IGT tests this.

Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com>
Reviewed-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Link: https://patch.msgid.link/20260225065045.3040787-1-suraj.kandpal@intel.com

drm/xe/wa: Drop redundant entries for Wa_16021867713 & Wa_14019449301

The Xe2_HPM-specific RTP table entries for Wa_16021867713 and
Wa_14019449301 were removed by commit 941f538b0af8 ("drm/xe: Consolidate
workaround entries for Wa_16021867713") and commit aa0f0a678370
("drm/xe: Consolidate workaround entries for Wa_14019449301") in favor
of alternate entries earlier in the table that cover a wider range of IP
versions.  However these Xe2_HPM-specific entries were accidentally
resurrected during a backmerge, which causes the Xe driver to complain
on probe about two entries trying to program the same registers+bits:

  <3> [48.491155] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT1: discarding save-restore reg 1c3f1c (clear: 00000008, set: 00000008, masked: no, mcr: no): ret=-22
  <3> [48.491211] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT1: discarding save-restore reg 1d3f1c (clear: 00000008, set: 00000008, masked: no, mcr: no): ret=-22
  <3> [48.491225] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT1: discarding save-restore reg 1c3f08 (clear: 00000020, set: 00000020, masked: no, mcr: no): ret=-22
  <3> [48.491238] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT1: discarding save-restore reg 1d3f08 (clear: 00000020, set: 00000020, masked: no, mcr: no): ret=-22

Re-drop the redundant Xe2_HPM-specific entries to eliminate the dmesg
errors.

Fixes: 58351f46de26 ("Merge v7.0-rc3 into drm-next")
Cc: Simona Vetter <simona.vetter@ffwll.ch>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/7608
Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch>
Link: https://patch.msgid.link/20260312-wa_merge_fix-v1-1-2ec6607f1e0c@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>

Merge drm/drm-next into drm-xe-next

Backmerging to bring in 7.00-rc3. Important ahead GPU SVM merging THP
support.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>

drm/xe: Fix overflow in guc_ct_snapshot_capture

snapshot->ctb is u32*, so pointer arithmetic on it scales
the byte offset from xe_bo_size() by 4, overshooting the
intended start of the g2h portion and writing past the
allocated buffer.

Fix this by using void * to get the arithmetic right and
prevent future mishaps.

v2: s/u8/void for memcpy and iosys_map consistency (Matt)

Fixes: af3de6cf06f9 ("drm/xe: Split H2G and G2H into separate buffer objects")
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: intel-xe@lists.freedesktop.org
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260304211728.249104-1-mika.kuoppala@linux.intel.com

drm/xe: implement VM_BIND decompression in vm_bind_ioctl

Implement handling of VM_BIND(..., DECOMPRESS) in xe_vm_bind_ioctl.

Key changes:
- Parse and record per-op intent (op->map.request_decompress) when the
  DECOMPRESS flag is present.
- Use xe_pat_index_get_comp_en() helper to check if a PAT index
  has compression enabled via the XE2_COMP_EN bit.
- Validate DECOMPRESS preconditions in the ioctl path:
  - Only valid for MAP ops.
  - The provided pat_index must select the device's "no-compression" PAT.
  - Only meaningful on devices with flat CCS and the required XE2+
    otherwise return -EOPNOTSUPP.
  - Use XE_IOCTL_DBG for uAPI sanity checks.
- Implement xe_bo_decompress():
  For VRAM BOs run xe_bo_move_notify(), reserve one fence slot,
  schedule xe_migrate_resolve(), and attach the returned fence
  with DMA_RESV_USAGE_KERNEL. Non-VRAM cases are silent no-ops.
- Wire scheduling into vma_lock_and_validate() so VM_BIND will schedule
  decompression when request_decompress is set.
- Handle fault-mode VMs by performing decompression synchronously during
  the bind process, ensuring that the resolve is completed before the bind
  finishes.

This schedules an in-place GPU resolve (xe_migrate_resolve) for
decompression.

Compute PR: https://github.com/intel/compute-runtime/pull/898
IGT PR: https://patchwork.freedesktop.org/series/157553/

v7: Rebase on latest drm-tip and add compute and igt pr info

v6: (Matt Auld)
   - Rebase as xe_pat_index_get_comp_en() is added in separate
     patch
   - Drop vm param from xe_bo_decompress(), instead of it
     extract tile from bo
   - Reject decompression on igpu instead of silent skipping
     to avoid any failure on Xe2+igpu as xe_device_has_flat_ccs()
     can sometimes be false on igpu due some setting in the BIOS
     to turn off compression on igpu.
   - Nits

v5: (Matt)
   - Correct the condition check of xe_pat_index_get_comp_en

v4: (Matt)
   - Introduce xe_pat_index_get_comp_en(), which checks
     XE2_COMP_EN for the pat_index
   - .interruptible should be true, everything else false

v3: (Matt)
   - s/xe_bo_schedule_decompress/xe_bo_decompress
   - skip the decrompress step if the BO isn't in VRAM
   - start/size not required in xe_bo_schedule_decompress
   - Use xe_bo_move_notify instead of xe_vm_invalidate_vma
     with respect to invalidation.
   - Nits

v2:
   - Move decompression work out of vm_bind ioctl. (Matt)
   - Put that work in a small helper at the BO/migrate layer invoke it
     from vma_lock_and_validate which already runs under drm_exec.
   - Move lightweight checks to vm_bind_ioctl_check_args (Matthew Auld)

Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Acked-by: Michal Mrozek <michal.mrozek@intel.com>
Signed-off-by: Nitin Gote <nitin.r.gote@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patch.msgid.link/20260304123758.3050386-8-nitin.r.gote@intel.com

drm/xe: add xe_migrate_resolve wrapper and is_vram_resolve support

Introduce an internal __xe_migrate_copy(..., is_vram_resolve) path and
expose a small wrapper xe_migrate_resolve() that calls it with
is_vram_resolve=true.

For resolve/decompression operations we must ensure the copy code uses
the compression PAT index when appropriate; this change centralizes that
behavior and allows callers to schedule a resolve (decompress) operation
via the migrate API.

v3: Fix kernel-doc warnings

v2: (Matt)
- Simplify xe_migrate_resolve(), use single BO/resource;
remove copy_only_ccs argument as it's always false.

Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Nitin Gote <nitin.r.gote@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patch.msgid.link/20260304123758.3050386-7-nitin.r.gote@intel.com

drm/xe: add VM_BIND DECOMPRESS uapi flag

Add a new VM_BIND flag, DRM_XE_VM_BIND_FLAG_DECOMPRESS, that lets userspace
express intent for the driver to perform on-device in-place decompression
for the GPU mapping created by a MAP bind operation.

This flag is used by subsequent driver changes to trigger scheduling of
GPU work that resolves compressed VRAM pages into an uncompressed PAT
VM mapping.

Behavior and semantics:
- Valid only for DRM_XE_VM_BIND_OP_MAP. IOCTLs using this flag on other ops
  are rejected (-EINVAL).
- The bind's pat_index must select the device "no-compression" PAT entry;
  otherwise the ioctl is rejected (-EINVAL).
- Only meaningful for VRAM-backed BOs on devices that support Flat CCS and
  the required hardware generation (driver will return -EOPNOTSUPP if not).
- On success the driver schedules a migrate/resolve and installs the
  returned dma_fence into the BO's kernel reservation
  (DMA_RESV_USAGE_KERNEL).

Compute PR: https://github.com/intel/compute-runtime/pull/898

v3: Rebase on latest drm-tip and add compute pr info

v2: Add kernel doc (Matt)

Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mrozek, Michal <michal.mrozek@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Nitin Gote <nitin.r.gote@intel.com>
Acked-by: Michal Mrozek <michal.mrozek@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patch.msgid.link/20260304123758.3050386-6-nitin.r.gote@intel.com

Merge drm/drm-next into drm-misc-next

Biju Das needs a patch for rz-du merged in 7.0-rc3

Signed-off-by: Maxime Ripard <mripard@kernel.org>

accel/amdxdna: Support sensors for column utilization

The AMD PMF driver provides realtime column utilization (npu_busy)
metrics for the NPU. Extend the DRM_IOCTL_AMDXDNA_GET_INFO sensor
query to expose these metrics to userspace.

Add AMDXDNA_SENSOR_TYPE_COLUMN_UTILIZATION to the sensor type enum
and update aie2_get_sensors() to return both the total power and up
to 8 column utilization sensors if the user buffer permits.

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Reviewed-by: Lizhi Hou <lizhi.hou@amd.com>
[lizhi: support legacy tool which uses small buffer. checkpatch cleanup]
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20260311171842.473453-1-lizhi.hou@amd.com

accel/amdxdna: Add IOCTL to retrieve realtime NPU power estimate

The AMD PMF driver provides an interface to obtain realtime power
estimates for the NPU. Expose this information to userspace through a
new DRM_IOCTL_AMDXDNA_GET_INFO parameter, allowing applications to query
the current NPU power level.

Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Reviewed-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
(Update comment to indicate power and utilization)
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20260228061109.361239-2-superm1@kernel.org

accel/amdxdna: Import AMD_PMF namespace

The amdxdna driver uses amd_pmf_get_npu_data() which is exported in the
AMD_PMF namespace. Import the AMD_PMF namespace.

Reviewed-by: Lizhi Hou <lizhi.hou@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20260301005028.367618-1-superm1@kernel.org

drm/xe/pat: Extract gt_pta_entry()

Avoid code duplication by extracting the logic for selection of the
correct PAT_PTA entry for a GT into function gt_pta_entry() and using
that function whenever necessary.

Reviewed-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com>
Link: https://patch.msgid.link/20260303-pat-gt_pta_entry-v1-1-0dee8e1e7bd9@intel.com
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>

drm/amdgpu: revert to old status lock handling v4

It turned out that protecting the status of each bo_va with a
spinlock was just hiding problems instead of solving them.

Revert the whole approach, add a separate stats_lock and lockdep
assertions that the correct reservation lock is held all over the place.

This not only allows for better checks if a state transition is properly
protected by a lock, but also switching back to using list macros to
iterate over the state of lists protected by the dma_resv lock of the
root PD.

v2: re-add missing check
v3: split into two patches
v4: re-apply by fixing holding the VM lock at the right places.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd: Set num IP blocks to 0 if discovery fails

If discovery has failed for any reason (such as no support for a block)
then there is no need to unwind all the IP blocks in fini. In this
condition there can actually be failures during the unwind too.

Reset num_ip_blocks to zero during failure path and skip the unnecessary
cleanup path.

Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: Fix xgmi max speed reporting

Fix XGMI max bitrate/width reporting on SMUv13.0.12 SOCs. The data
format got changed when moved to static table from dynamic metrics
table.

Fixes: 1bec2f270766 ("drm/amd/pm: Fetch SMUv13.0.12 xgmi max speed/width")
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: fix array out of bounds accesses for mes sw_fini

The mes.fw[] is per-pipe resource shared accross xcc inst.
And enlarge hung_queue array to max inst_pipes.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: fix sysfs ip base addr with 64bit

Correct the base addr value shown on sysfs with ignore reg_base_64,
since the base_addr value have been over write when discovery_init.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: disable rlc fw info print

Disable to print RLC v2_5 related firmware information by default.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: change sdma doorbell size for soc v1

Change SDMA doorbel size to 14 per SDMA engine.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: correct SDMA instance number for soc v1_0

Calculate sdma instance number according to xcc_mask and
num_inst_per_xcc, and correct adev->sdma.sdma_mask according
to totally sdma instance number.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: fix shift-out-of-bounds when updating umc active mask

UMC node_inst_num can exceed 32, causing
(1 << node_inst_num) to shift a 32-bit int
out of bounds

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: bypass IMU ucode loading for MP0 15.0.8

For MP0 15.0.8, IMU ucode is part of IFWI and ASP would load it by default.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: update GFX CGCG/LS flags for gfx 12.1

Update GFX CGCG flags and fix num_xcc assignment

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/amdgpu: Disable reset on init for soc_v1_0

Return false from soc_v1_0_reset_on_init as psp is loaded with ifwi and
sol register will be non zero on first load itself

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Add default reset method for soc_v1_0

Add mode2 as default reset method for soc_v1_0

v2: Remove unnecessary overrides while selecting reset method (Lijo)
v4: Add dev_warn_once (Lijo)

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: use common defines for GMC 12.1 HUB faults

Use proper definitions rather than a number.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Handle IH v7_1 reg offset differences

IH v7_1 changes the offsets of some registers relative to
IH v7_0. Introduce IH v7_1-specific register access

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: fix DF NULL pointer issue for soc24

If DF function not initialized, NULL pointer issue
will happen on soc24.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add support for lsdma v7_1

Add support for LSDMA v7_1_0.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Add lsdma v7_1_0 ip headers

Add header files for lsdma v7_1_0 register offsets
and shift masks

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Use memcpy to update IPD table for sriov guest

On some hardware configuration, sriov guests
cannot access mm_index and mm_data. Update the
IPD table via memcpy in these cases

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: New interface to get IP discovery binary v3

Implement a driver path to read the IP discovery
binary offset and size from DRIVER_SCRATCH registers
BIOS signals usage by setting a feature flag that
instructs the driver to use this method. Otherwise,
fallback to legacy approach.

v2: Simplify discovery offset/size retrieval in
get_tmr_info

v3: Update get_tmr_info to cover discovery offset
and size retrieval for both bare-metal and sriov

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/discovery: use common function to check discovery table

Use an common function to check the validation of discovery table.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/discovery: support new discovery binary header

Support for new IP discovery binary header version 2.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd: Unreserve bo if queue update failed

Error handling path should unreserve bo then return failed.

Fixes: 305cd109b761 ("drm/amdkfd: Validate user queue update")
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Alex Sierra <alex.sierra@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>