git.ipfire.org Git - thirdparty/kernel/linux.git/log

mm: Fix a hmm_range_fault() livelock / starvation problem

If hmm_range_fault() fails a folio_trylock() in do_swap_page,
trying to acquire the lock of a device-private folio for migration,
to ram, the function will spin until it succeeds grabbing the lock.

However, if the process holding the lock is depending on a work
item to be completed, which is scheduled on the same CPU as the
spinning hmm_range_fault(), that work item might be starved and
we end up in a livelock / starvation situation which is never
resolved.

This can happen, for example if the process holding the
device-private folio lock is stuck in
   migrate_device_unmap()->lru_add_drain_all()
sinc lru_add_drain_all() requires a short work-item
to be run on all online cpus to complete.

A prerequisite for this to happen is:
a) Both zone device and system memory folios are considered in
   migrate_device_unmap(), so that there is a reason to call
   lru_add_drain_all() for a system memory folio while a
   folio lock is held on a zone device folio.
b) The zone device folio has an initial mapcount > 1 which causes
   at least one migration PTE entry insertion to be deferred to
   try_to_migrate(), which can happen after the call to
   lru_add_drain_all().
c) No or voluntary only preemption.

This all seems pretty unlikely to happen, but indeed is hit by
the "xe_exec_system_allocator" igt test.

Resolve this by waiting for the folio to be unlocked if the
folio_trylock() fails in do_swap_page().

Rename migration_entry_wait_on_locked() to
softleaf_entry_wait_unlock() and update its documentation to
indicate the new use-case.

Future code improvements might consider moving
the lru_add_drain_all() call in migrate_device_unmap() to be
called *after* all pages have migration entries inserted.
That would eliminate also b) above.

v2:
- Instead of a cond_resched() in hmm_range_fault(),
  eliminate the problem by waiting for the folio to be unlocked
  in do_swap_page() (Alistair Popple, Andrew Morton)
v3:
- Add a stub migration_entry_wait_on_locked() for the
  !CONFIG_MIGRATION case. (Kernel Test Robot)
v4:
- Rename migrate_entry_wait_on_locked() to
  softleaf_entry_wait_on_locked() and update docs (Alistair Popple)
v5:
- Add a WARN_ON_ONCE() for the !CONFIG_MIGRATION
  version of softleaf_entry_wait_on_locked().
- Modify wording around function names in the commit message
  (Andrew Morton)

Suggested-by: Alistair Popple <apopple@nvidia.com>
Fixes: 1afaeb8293c9 ("mm/migrate: Trylock device page in do_swap_page")
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: linux-mm@kvack.org
Cc: <dri-devel@lists.freedesktop.org>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: <stable@vger.kernel.org> # v6.15+
Reviewed-by: John Hubbard <jhubbard@nvidia.com> #v3
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Link: https://patch.msgid.link/20260210115653.92413-1-thomas.hellstrom@linux.intel.com

drm/gpusvm: Fix unbalanced unlock in drm_gpusvm_scan_mm()

There is a unbalanced lock/unlock to gpusvm notifier lock:
[  931.045868] =====================================
[  931.046509] WARNING: bad unlock balance detected!
[  931.047149] 6.19.0-rc6+xe-**************** #9 Tainted: G     U
[  931.048150] -------------------------------------
[  931.048790] kworker/u5:0/51 is trying to release lock (&gpusvm->notifier_lock) at:
[  931.049801] [<ffffffffa090c0d8>] drm_gpusvm_scan_mm+0x188/0x460 [drm_gpusvm_helper]
[  931.050802] but there are no more locks to release!
[  931.051463]

The drm_gpusvm_notifier_unlock() sits under err_free label and the
first jump to err_free is just before calling the
drm_gpusvm_notifier_lock() causing unbalanced unlock.

Fixes: f1d08a586482 ("drm/gpusvm: Introduce a function to scan the current migration state")
Signed-off-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260209123433.1271053-1-maciej.patelczyk@intel.com

drm/xe/xe2_hpg: Fix handling of Wa_14019988906 & Wa_14019877138

The PSS_CHICKEN register has been part of the RCS engine's LRC since it
was first introduced in Xe_LP. That means that any workarounds that
adjust its value (such as Wa_14019988906 and Wa_14019877138) need to be
implemented in the lrc_was[] table so that they become part of the
default LRC from which all subsequent LRCs are copied. Although these
workarounds were implemented correctly on most platforms, they were
incorrectly placed on the engine_was[] table for Xe2_HPG.

Move the workarounds to the proper lrc_was[] table and switch the
'xe_rtp_match_first_render_or_compute' rule to specifically match the
RCS since that's the engine whose LRC manages the register.

Bspec: 65182
Fixes: 7f3ee7d88058 ("drm/xe/xe2hpg: Add initial GT workarounds")
Reviewed-by: Shekhar Chauhan <shekhar.chauhan@intel.com>
Link: https://patch.msgid.link/20260205220508.51905-2-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>

drm/xe/nvlp: Bump maximum WOPCM size

On NVL-P, the primary GT's WOPCM gained an extra 8MiB for the Memory
URB.  As such, we need to bump the maximum size in the driver so that
the driver is able to load without erroring out thinking that the WOPCM
is too small.

FIXME: The wopcm code in xe driver is a bit confusing.  For the case
where the offsets for GUC WOPCM are already locked, it appears we are
using the maximum overall WOPCM size instead of the sizes relative to
each type of GT.  The function __check_layout() should be checking
against the latter.

Bspec: 67090
Reviewed-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com>
Link: https://patch.msgid.link/20260206-nvl-p-upstreaming-v3-15-636e1ad32688@intel.com
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>

drm/i915/nvlp: Hook up display support

Although NVL-S and NVL-P are quite different on the GT side, they use
identical Xe3p_LPD display IP and should take all the same codepaths.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patch.msgid.link/20260206-nvl-p-upstreaming-v3-14-636e1ad32688@intel.com
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>

drm/xe/nvlp: Attach MOCS table for nvlp

The MOCS table for NVL-P is same as for Xe2/Xe3 platforms.

Signed-off-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260206-nvl-p-upstreaming-v3-13-636e1ad32688@intel.com
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>

drm/xe/nvlp: Add NVL-P platform definition

Add platform definition along with device IDs for NVL-P.  Here is the
list of device descriptor fields and associated Bspec references:

  .dma_mask_size (Bspec 74198)
  .has_cached_pt (Bspec 71582)
  .has_display (Bspec 74196)
  .has_flat_ccs (Bspec 74110)
  .has_page_reclaim_hw_assist (Bspec 73451)
  .max_gt_per_tile (Bspec 74196)
  .va_bits (Bspec 74198)
  .vm_max_level (Bspec 59507)

v2:
  - Add list of descriptor fields and Bspec references. (Matt)

Signed-off-by: Shekhar Chauhan <shekhar.chauhan@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260206-nvl-p-upstreaming-v3-12-636e1ad32688@intel.com
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>

drm/xe/xe3p_lpg: Set STLB bank hash mode to 4KB

Since the dominant size of the pages referred in an i-gpu, such as
Xe3p_LPG, will be 4KB, the HW default of mix of 64K and 2M for STLB bank
hash mode does not make sense.

Allow the SW to change it to 4KB Mode, for Xe3p_LPG.

v2:
- Add Bspec reference. (Matt)

Bspec: 78248
Signed-off-by: Aradhya Bhatia <aradhya.bhatia@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260206-nvl-p-upstreaming-v3-11-636e1ad32688@intel.com
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>

drm/xe/xe3p_lpg: Update LRC sizes

Like with previous generations, the engine context images for of both
RCS and CCS in Xe3p_LPG contain a common layout at the end for the
context related to the "Compute Pipeline".

The size of the memory area written to such section varies; it depends
on the type of preemption has taken place during the execution and type
of command streamer instruction that was used on the pipeline. For
Xe3p_LPG, the maximum possible size, including NOOPs for cache line
alignment, is 4368 dwords, which would be the case of a mid-thread
preemption during the execution of a COMPUTE_WALKER_2 instruction.

The maximum size has increased in such a way that we need to update
xe_gt_lrc_size() to match the new sizing requirement. When we add that
to the engine-specific parts, we have:

- RCS context image: 6672 dwords = 26688 bytes -> 7 pages
- CCS context image: 5024 dwords = 20096 bytes -> 5 pages

Bspec: 65182, 55793, 73590
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260206-nvl-p-upstreaming-v3-10-636e1ad32688@intel.com
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>

drm/xe/xe3p_lpg: Extend 'group ID' mask size

Xe3p_LPG extends the 'group ID' register mask by one bit. Since the new
upper bit (12) was unused on previous platforms, we can safely extend
the existing mask size without worrying about adding conditional version
checks to the register programming.

Bspec: 67175
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com>
Link: https://patch.msgid.link/20260206-nvl-p-upstreaming-v3-9-636e1ad32688@intel.com
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>

drm/xe/xe3p_lpg: Drop unnecessary tuning settings

From Xe3p onward, the desired settings are now the hardware's
default values and the driver does not need to program them explicitly.

Since 35.xx seems to be the starting point for "Xe3p" version numbers;
we'll adjust the bounds of the old programming to stop at 34.99. Even
though there's no platform with version 35.00 at the moment, this is
simplest in case one does show up in the future.

Bspec: 72161, 59928, 59930
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Link: https://patch.msgid.link/20260206-nvl-p-upstreaming-v3-8-636e1ad32688@intel.com
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>

drm/xe/xe3p_lpg: Disable reporting of context switch status to GHWSP

By default the hardware reports context switch status into the global
hardware status page. The Xe driver doesn't use this information for
anything, and as of Xe3p, leaving this setting enabled will prevent
other hardware optimizations from being enabled. Disable this reporting
as suggested by the tuning guide.

Bspec: 72161
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Link: https://patch.msgid.link/20260206-nvl-p-upstreaming-v3-7-636e1ad32688@intel.com
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>

drm/xe/xe3p_lpg: Add LRC parsing for additional RCS engine state

Xe3p_LPG adds some additional state instructions to the RCS engine's
LRC. Add support for these to the debugfs LRC parser.

Note that the bspec's LRC description page seems to have a few mistakes
in the name/spelling of these new instructions (e.g.,
"3DSTATE_TASK_DATA_EXT" instead of "3DSTATE_TASK_SHADER_DATA_EXT" or
"3DSTATE_VIEWPORT_STATE_POINTERS_CL_SF_2" instead of
"3DSTATE_VIEWPORT_STATE_POINTERS_SF_CLIP_2").

Bspec: 65182
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Link: https://patch.msgid.link/20260206-nvl-p-upstreaming-v3-6-636e1ad32688@intel.com
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>

drm/xe/xe3p_lpg: Add MCR steering

Xe3p_LPG has nearly identical steering to Xe2 and Xe3.  The only
DSS/XeCore change from those IPs is an additional range from
0xDE00-0xDE7F that was previously reserved, so we can simply grow one of
the existing ranges in the Xe2 table to include it.  Similarly, the
"instance0" table is also almost identical, but gains one additional
PSMI range and requires a separate table.

v2:
  - Drop reserved range from MEMPIPE range. (Dnyaneshwar)

Bspec: 75242
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com>
Link: https://patch.msgid.link/20260206-nvl-p-upstreaming-v3-5-636e1ad32688@intel.com
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>

drm/xe/xe3p_lpg: Add new PAT table

PAT programming for Xe3p_LPG is more similar to Xe2 and Xe3 than it is
to Xe3p_XPC.  Compared to Xe2/Xe3 we have:

* There's a slight update to the PAT table, where two new indices (18
  and 19) are added to expose a new "WB - Transient App" L3 caching
  mode.

* The PTA_MODE entry must be programmed differently according to the
  media type, and both differ from Xe2.

There are no changes to the underlying registers, so the Xe2 ops can be
re-used for Xe3p.

Bspec: 71582, 74160
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Link: https://patch.msgid.link/20260206-nvl-p-upstreaming-v3-4-636e1ad32688@intel.com
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>

drm/xe/pat: Differentiate between primary and media for PTA

Differently from currently supported platforms, in upcoming changes we
will need to have different PAT entries for PTA based on the GT type. As
such, let's prepare the code to support that by having two separate
PTA-specific members in the pat struct, one for each type of GT.

While at it, also fix the kerneldoc for pat_ats.

Co-developed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260206-nvl-p-upstreaming-v3-3-636e1ad32688@intel.com
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>

drm/xe/xe3p_lpg: Add initial workarounds for graphics version 35.10

Add the initial set of workarounds for Xe3p_LPG graphics version 35.10.

v2:
- Fix spacing style for field LOCALITYDIS. (Matt)
- Drop unnecessary Wa_14025780377. (Matt)

Signed-off-by: Shekhar Chauhan <shekhar.chauhan@intel.com>
Co-developed-by: Nitin Gote <nitin.r.gote@intel.com>
Signed-off-by: Nitin Gote <nitin.r.gote@intel.com>
Co-developed-by: Tangudu Tilak Tirumalesh <tilak.tirumalesh.tangudu@intel.com>
Signed-off-by: Tangudu Tilak Tirumalesh <tilak.tirumalesh.tangudu@intel.com>
Co-developed-by: Mallesh Koujalagi <mallesh.koujalagi@intel.com>
Signed-off-by: Mallesh Koujalagi <mallesh.koujalagi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260206-nvl-p-upstreaming-v3-2-636e1ad32688@intel.com
Co-developed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>

drm/xe/xe3p_lpg: Add support for graphics IP 35.10

Add Xe3p_LPG graphics IP version 35.10. Xe3p_LPG supports all features
described by XE2_GFX_FEATURES and also multi-queue feature on BCS and
CCS engines.  As such, create a new struct xe_graphics_desc named
graphics_xe3p_lpg that inherits from XE2_GFX_FEATURES and also includes
the necessary .multi_queue_engine_class_mask.

Here is a list of fields and associated Bspec references for the members
of the IP descriptor:

.hw_engine_mask (Bspec 60149)
.multi_queue_engine_class_mask (Bspec 74110)
.has_asid (Bspec 71132)
.has_atomic_enable_pte_bit (Bspec 59510, 74675)
.has_indirect_ring_state (Bspec 67296)
.has_range_tlb_inval (Bspec 71126)
.has_usm (Bspec 59651)
.has_64bit_timestamp (Bspec 60318)
.num_geometry_xecore_fuse_regs (Bspec 62566, 67401, 67536)
.num_compute_xecore_fuse_regs (Bspec 62565, 62561, 67537)

v2:
  - Drop non-existing fields from the list in the commit message. (Matt)
  - Squash patch adding .multi_queue_engine_class_mask here. (Matt)
  - Rename graphics_xe3p to graphics_xe3p_lpg. (Matt)
  - Add fields .num_geometry_xecore_fuse_regs and
    .num_compute_xecore_fuse_regs after rebasing and inheriting
    commit 6acf3d3ed6c1 ("drm/xe: Move number of XeCore fuse registers to
    graphics descriptor"). (Gustavo)

Signed-off-by: Shekhar Chauhan <shekhar.chauhan@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260206-nvl-p-upstreaming-v3-1-636e1ad32688@intel.com
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>

drm/xe/mmio: Avoid double-adjust in 64-bit reads

xe_mmio_read64_2x32() was adjusting register addresses and then
calling xe_mmio_read32(), which applies the adjustment again.
This may shift accesses twice if adj_offset < adj_limit. There is
no issue currently, as for media gt, adj_offset > adj_limit, so
the 2nd adjust will be a no-op. But it may not work in future.

To fix it, replace the adjusted-address comparison with a direct
sanity check that ensures the MMIO address adjustment cutoff never
falls within the 8-byte range of a 64-bit register. And let
xe_mmio_read32() handle address translation.

v2: rewrite the sanity check in a more natural way. (Matt)
v3: Add Fixes tag. (Jani)

Fixes: 07431945d8ae ("drm/xe: Avoid 64-bit register reads")
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Link: https://patch.msgid.link/20260130165621.471408-2-shuicheng.lin@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>

drm/xe/vf: Allow VF to initialize MCR tables

While VFs can't access MCR registers, it's still safe to initialize
our per-platform MCR tables, as we might need them later in the LRC
programming, as engines itself may access MCR steer registers and
thanks to all our past fixes to the VF probe initialization order,
VFs are able to use values of the fuse registers needed here.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260207214428.5205-1-michal.wajdeczko@intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>

drm/xe/guc: Add Wa_14025883347 for GuC DMA failure on reset

Prevent GuC firmware DMA failures during GuC-only reset by disabling
idle flow and verifying SRAM handling completion. Without this, reset
can be issued while SRAM handler is copying WOPCM to SRAM,
causing GuC HW to get stuck.

v2: Modify error message (Badal)
    Rename reg bit name (Daniele)
    Update WA skip condition (Daniele)
    Update SRAM handling logic (Daniele)
v3: Reorder WA call (Badal)
    Wait for GuC ready status (Daniele)
v4: Update reg name (Badal)
    Add comment (Daniele)
    Add extended graphics version (Daniele)
    Modify rules

Signed-off-by: Sk Anirban <sk.anirban@intel.com>
Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
Acked-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patch.msgid.link/20260202105313.3338094-4-sk.anirban@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>

drm/xe/uapi: update used tracking kernel-doc

In commit 4d0b035fd6da ("drm/xe/uapi: loosen used tracking restriction")
we dropped the CAP_PERMON restriction but missed updating the
corresponding kernel-doc. Fix that.

v2 (Sanjay):
- Don't drop the note around the extra cpu_visible_used expectations.

Reported-by: Ulisses Furquim <ulisses.furquim@intel.com>
Fixes: 4d0b035fd6da ("drm/xe/uapi: loosen used tracking restriction")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Sanjay Yadav <sanjay.kumar.yadav@intel.com>
Reviewed-by: Sanjay Yadav <sanjay.kumar.yadav@intel.com>
Link: https://patch.msgid.link/20260130125105.451229-2-matthew.auld@intel.com

drm/xe: Add bounds check on pat_index to prevent OOB kernel read in madvise

When user provides a bogus pat_index value through the madvise IOCTL, the
xe_pat_index_get_coh_mode() function performs an array access without
validating bounds. This allows a malicious user to trigger an out-of-bounds
kernel read from the xe->pat.table array.

The vulnerability exists because the validation in madvise_args_are_sane()
directly calls xe_pat_index_get_coh_mode(xe, args->pat_index.val) without
first checking if pat_index is within [0, xe->pat.n_entries).

Although xe_pat_index_get_coh_mode() has a WARN_ON to catch this in debug
builds, it still performs the unsafe array access in production kernels.

v2(Matthew Auld)
- Using array_index_nospec() to mitigate spectre attacks when the value
is used

v3(Matthew Auld)
- Put the declarations at the start of the block

Fixes: ada7486c5668 ("drm/xe: Implement madvise ioctl for xe")
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Cc: <stable@vger.kernel.org> # v6.18+
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Shuicheng Lin <shuicheng.lin@intel.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Jia Yao <jia.yao@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patch.msgid.link/20260205161529.1819276-1-jia.yao@intel.com

drm/xe/pf: Fix the address range assert in ggtt_get_pte helper

The ggtt_get_pte helper used for saving VF GGTT incorrectly assumes that
ggtt_size == ggtt_end.
Fix it to avoid triggering spurious asserts if VF GGTT object lands in
high GGTT range.

Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20260130215624.556099-1-michal.winiarski@intel.com
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>

drm/xe/xe3p_xpc: XeCore mask spans four registers

On Xe3p_XPC, there are now four registers reserved to express the XeCore
mask rather than just three. Define the new registers and update the IP
descriptor accordingly.

Note that this only applies to Xe3p_XPC for now; Xe3p_LPG still only
uses three registers to express the mask.

Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patch.msgid.link/20260205214139.48515-4-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>

drm/xe: Move number of XeCore fuse registers to graphics descriptor

The number of registers used to express the XeCore mask has some
"special cases" that don't always get inherited by later IP versions so
it's cleaner and simpler to record the numbers in the IP descriptor
rather than adding extra conditions to the standalone get_num_dss_regs()
function.

Note that a minor change here is that we now always treat the number of
registers as 0 for the media GT. Technically a copy of these fuse
registers does exist in the media GT as well (at the usual
0x380000+$offset location), but the value of those is always supposed to
read back as 0 because media GTs never have any XeCores or EUs.

v2:
- Add a kunit assertion to catch descriptors that forget to initialize
either count. (Gustavo)

Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patch.msgid.link/20260205214139.48515-3-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>

drm/xe: Add forcewake status to powergate_info

Dump forcewake status and ref counts for all domains as part
of this debugfs. This is the sample output from gt1-

$ cat /sys/kernel/debug/dri//0/gt1/powergate_info
Media Power Gating Enabled: yes
Media Slice0 Power Gate Status: down
GSC Power Gate Status: down
GT.ref_count=0, GT.forcewake=0x10000
VDBox0.ref_count=0, VDBox0.forcewake=0x10000
VEBox0.ref_count=0, VEBox0.forcewake=0x10000
GSC.ref_count=0, GSC.forcewake=0x10000

v2: Fix checkpatch issues

Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
Signed-off-by: Vinay Belgaumkar<vinay.belgaumkar@intel.com>
Link: https://patch.msgid.link/20260204190314.2904009-3-vinay.belgaumkar@intel.com

drm/xe: Add GSC to powergate_info

Add GSC powergate status to the existing debugfs.

Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Link: https://patch.msgid.link/20260204190314.2904009-2-vinay.belgaumkar@intel.com

drm/xe: Add a wrapper for SLPC set/unset params

Also, extract out the GuC RC related set/unset param functions
into xe_guc_rc file. GuC still allows us to override GuC RC mode
using an SLPC H2G interface. Continue to use that interface, but
move the related code to the newly created xe_guc_rc file.

Cc: Riana Tauro <riana.tauro@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Riana Tauro <riana.tauro@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Link: https://patch.msgid.link/20260204014234.2867763-4-vinay.belgaumkar@intel.com

drm/xe: Use FORCEWAKE_GT in xe_guc_pc_fini_hw()

No need to use FORCEWAKE_ALL since the registers being written are in
GT domain.

Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20260204014234.2867763-3-vinay.belgaumkar@intel.com

drm/xe: Decouple GuC RC code from xe_guc_pc

Move enable/disable GuC RC logic into the new file. This will
allow us to independently enable/disable GuC RC and not rely
on SLPC related functions. GuC already provides separate H2G
interfaces to setup GuC RC and SLPC.

Cc: Riana Tauro <riana.tauro@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Riana Tauro <riana.tauro@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Link: https://patch.msgid.link/20260204014234.2867763-2-vinay.belgaumkar@intel.com

drm/xe/tests: Fix g2g_test_array indexing

The G2G KUnit test allocates a compact N×N
matrix sized by gt_count and verifies entries
using dense indices: idx = (j * gt_count) + i

The producer path currently computes idx using
gt->info.id. However, gt->info.id values
are not guaranteed to be contiguous.
For example, with gt_count=2 and IDs {0,3},
this formula produces indices beyond the
allocated range, causing mismatches and
potential out-of-bounds access.

Update the producer to map each GT to a dense
index in [0..gt_count-1] and compute:
idx = (tx_dense * gt_count) + rx_dense

Additionally, introduce an event-based delay
in g2g_test_in_order() to ensure ordering
between sends.

v2: Add single helper function (Daniele)

v3: Modify comment (Daniele)

Signed-off-by: Pallavi Mishra <pallavi.mishra@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Link: https://patch.msgid.link/20260129054722.2150674-1-pallavi.mishra@intel.com

drm/xe: Mutual exclusivity between CCS-mode and PF

Due to SLA agreement between PF and VFs, currently we block CCS
mode changes if driver is running as PF, even if there are no VFs
enabled yet. Use lockdown mechanism provided by the PF to relax
that limitation and still enforce above VFs related requirements.

Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Nareshkumar Gollakoti <naresh.kumar.g@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20260202170810.1393147-6-naresh.kumar.g@intel.com

drm/xe: Prevent VFs from exposing the CCS mode sysfs file

Skip creating CCS sysfs files in VF mode to ensure VFs do not
try to change CCS mode, as it is predefined and immutable in
the SR-IOV mode.

Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Nareshkumar Gollakoti <naresh.kumar.g@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20260202170810.1393147-5-naresh.kumar.g@intel.com

drm/xe/configfs: Fix 'parameter name omitted' errors

On some configs and old compilers we can get following build errors:

  ../drivers/gpu/drm/xe/xe_configfs.h: In function 'xe_configfs_get_ctx_restore_mid_bb':
  ../drivers/gpu/drm/xe/xe_configfs.h:40:76: error: parameter name omitted
   static inline u32 xe_configfs_get_ctx_restore_mid_bb(struct pci_dev *pdev, enum xe_engine_class,
                                                                            ^~~~~~~~~~~~~~~~~~~~
  ../drivers/gpu/drm/xe/xe_configfs.h: In function 'xe_configfs_get_ctx_restore_post_bb':
  ../drivers/gpu/drm/xe/xe_configfs.h:42:77: error: parameter name omitted
   static inline u32 xe_configfs_get_ctx_restore_post_bb(struct pci_dev *pdev, enum xe_engine_class,
                                                                             ^~~~~~~~~~~~~~~~~~~~
when trying to define our configfs stub functions. Fix that.

Fixes: 7a4756b2fd04 ("drm/xe/lrc: Allow to add user commands mid context switch")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
Link: https://patch.msgid.link/20260203193745.576-1-michal.wajdeczko@intel.com

drm/xe: Drop unnecessary include from xe_tile.h

We don't need to include xe_device_types.h there.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Link: https://patch.msgid.link/20260203211240.745-5-michal.wajdeczko@intel.com

drm/xe: Promote struct xe_tile definition to own file

We already have separate .c and .h files for xe_tile functions,
time to introduce _types.h to follow what other components do.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260203211240.745-4-michal.wajdeczko@intel.com

drm/xe: Promote struct xe_mmio definition to own file

We already have separate .c and .h files for xe_mmio functions,
time to introduce _types.h to follow what other components do.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com> #v1
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260203211240.745-3-michal.wajdeczko@intel.com

drm/xe: Move xe_root_tile_mmio() to xe_device.h

It seems to be a better place for this helper function, where
we already have other 'root' oriented helpers.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260203211240.745-2-michal.wajdeczko@intel.com

drm/xe/pf: Fix sysfs initialization

In case of devm_add_action_or_reset() failure the provided cleanup
action will be run immediately on the not yet initialized kobject.
This may lead to errors like:

[ ] kobject: '(null)' (ff110001393608e0): is not initialized, yet kobject_put() is being called.
[ ] WARNING: lib/kobject.c:734 at kobject_put+0xd9/0x250, CPU#0: kworker/0:0/9
[ ] RIP: 0010:kobject_put+0xdf/0x250
[ ] Call Trace:
[ ]  xe_sriov_pf_sysfs_init+0x21/0x100 [xe]
[ ]  xe_sriov_pf_init_late+0x87/0x2b0 [xe]
[ ]  xe_sriov_init_late+0x5f/0x2c0 [xe]
[ ]  xe_device_probe+0x5f2/0xc20 [xe]
[ ]  xe_pci_probe+0x396/0x610 [xe]
[ ]  local_pci_probe+0x47/0xb0

[ ] refcount_t: underflow; use-after-free.
[ ] WARNING: lib/refcount.c:28 at refcount_warn_saturate+0x68/0xb0, CPU#0: kworker/0:0/9
[ ] RIP: 0010:refcount_warn_saturate+0x68/0xb0
[ ] Call Trace:
[ ]  kobject_put+0x174/0x250
[ ]  xe_sriov_pf_sysfs_init+0x21/0x100 [xe]
[ ]  xe_sriov_pf_init_late+0x87/0x2b0 [xe]
[ ]  xe_sriov_init_late+0x5f/0x2c0 [xe]
[ ]  xe_device_probe+0x5f2/0xc20 [xe]
[ ]  xe_pci_probe+0x396/0x610 [xe]
[ ]  local_pci_probe+0x47/0xb0

Fix that by calling kobject_init() and kobject_add() separately
and register cleanup action after the kobject is initialized.

Also make this cleanup registration a part of the create helper to
fix another mistake, as in the loop we were wrongly passing parent
kobject while registering cleanup action, and this resulted in some
undetected leaks.

Fixes: 5c170a4d9c53 ("drm/xe/pf: Prepare sysfs for SR-IOV admin attributes")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
Link: https://patch.msgid.link/20260203235332.1350-1-michal.wajdeczko@intel.com

drm/xe: Drop unnecessary goto in xe_device_create

The error label in this function just does an immediate return without
any further cleanup or processing. Replace the goto statements with
returns.

Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260204191025.3957211-2-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>

drm/gpusvm: Allow device pages to be mapped in mixed mappings after system pages

The current code rejects device mappings whenever system pages have
already been encountered. This is not the intended behavior when
allow_mixed is set.

Relax the restriction by permitting a single pagemap to be selected when
allow_mixed is enabled, even if system pages were found earlier.

Fixes: bce13d6ecd6c ("drm/gpusvm, drm/xe: Allow mixed mappings for userptr")
Cc: stable@vger.kernel.org
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Francois Dugast <francois.dugast@intel.com>
Link: https://patch.msgid.link/20260130194928.3255613-3-matthew.brost@intel.com

drm/gpusvm: Force unmapping on error in drm_gpusvm_get_pages

drm_gpusvm_get_pages() only sets the local flags prior to committing the
pages. If an error occurs mid-mapping, has_dma_mapping will be clear,
causing the unmap function to skip unmapping pages that were
successfully mapped before the error. Fix this by forcibly setting
has_dma_mapping in the error path to ensure all previously mapped pages
are properly unmapped.

Fixes: 99624bdff867 ("drm/gpusvm: Add support for GPU Shared Virtual Memory")
Cc: stable@vger.kernel.org
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Francois Dugast <francois.dugast@intel.com>
Link: https://patch.msgid.link/20260130194928.3255613-2-matthew.brost@intel.com

drm/xe/pm: Disable D3Cold for BMG only on specific platforms

Restrict D3Cold disablement for BMG to unsupported NUC platforms,
instead of disabling it on all platforms.

Signed-off-by: Karthik Poosa <karthik.poosa@intel.com>
Fixes: 3e331a6715ee ("drm/xe/pm: Temporarily disable D3Cold on BMG")
Link: https://patch.msgid.link/20260123173238.1642383-1-karthik.poosa@intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/xe/configfs: Add sriov.admin_only_pf attribute

Instead of relying on fixed relation to the display probe flag,
add configfs attribute to allow an administrator to configure
desired PF operation mode in a more flexible way.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/20260121214218.2817-6-michal.wajdeczko@intel.com

drm/xe/pf: Define admin_only as real flag

Instead of doing guesses each time during the runtime, set flag
admin_only once during PF's initialization.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/20260127210501.794-1-michal.wajdeczko@intel.com

drm/xe/configfs: Always return consistent max_vfs value

The max_vfs parameter used by the Xe driver has its default value
definition, but it could be altered by the module parameter or by
the device specific configfs attribute.

To avoid mistakes or code duplication, always rely on the configfs
helper (or stub), which will provide necessary fallback if needed.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/20260121214218.2817-4-michal.wajdeczko@intel.com

drm/xe/configfs: Use proper notation for local include

For local includes we should use "" notation, not <>.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/20260121214218.2817-3-michal.wajdeczko@intel.com

drm/xe: Keep all defaults in single header

We already have most of Xe defaults defined in xe_module.c,
where we use them for the modparam initializations, but some
were defined elsewhere, which breaks the consistency.

Introduce xe_defaults.h file, that will act as a placeholder
for all our default values, and can be used from other places.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/20260121214218.2817-2-michal.wajdeczko@intel.com

drm/xe: add WQ_PERCPU to alloc_workqueue users

This continues the effort to refactor workqueue APIs, which began with
the introduction of new workqueues and a new alloc_workqueue flag in:

commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq")
commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag")

The refactoring is going to alter the default behavior of
alloc_workqueue() to be unbound by default.

With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU. For more details see the Link tag below.

In order to keep alloc_workqueue() behavior identical, explicitly request
WQ_PERCPU.

Link: https://lore.kernel.org/all/20250221112
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/20260202103756.62138-3-marco.crivellari@suse.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/xe: replace use of system_unbound_wq with system_dfl_wq

This patch continues the effort to refactor workqueue APIs, which has begun
with the changes introducing new workqueues and a new alloc_workqueue flag:

   commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq")
   commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag")

The point of the refactoring is to eventually alter the default behavior of
workqueues to become unbound by default so that their workload placement is
optimized by the scheduler.

Before that to happen, workqueue users must be converted to the better named
new workqueues with no intended behaviour changes:

   system_wq -> system_percpu_wq
   system_unbound_wq -> system_dfl_wq

This way the old obsolete workqueues (system_wq, system_unbound_wq) can be
removed in the future.

Link: https://lore.kernel.org/all/20250221112003.1dSuoGyc@linutronix.de/
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/20260202103756.62138-2-marco.crivellari@suse.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/xe/guc: Allow second H2G retry on FLR

During VF FLR the scratch registers could be cleared both by the
GuC and by the PF driver. Allow to retry more times once we find
out that the HXG header was cleared and wait at least 256ms before
resending the same message again to the GuC.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260127193727.601-7-michal.wajdeczko@intel.com

drm/xe/guc: Wait before retrying sending H2G

We shall resend H2G message after receiving NO_RESPONSE_RETRY reply,
but since GuC dropped that H2G due to some interim state, we should
give it a little time to stabilize. Wait before sending the same H2G
again, start with 1ms delay, then increase exponentially to 256ms.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260127193727.601-6-michal.wajdeczko@intel.com

drm/xe/guc: Drop redundant register read

The xe_mmio_wait32() already returns the last value of the register
for which we were waiting, there is no need read it again.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260127193727.601-5-michal.wajdeczko@intel.com

drm/xe/guc: Limit sleep while waiting for H2G credits

Instead of endlessly increasing the sleep timeout while waiting
for the H2G credits, use exponential increase only up to the given
limit, like it was initially done in the GuC submission code.

While here, fix the actual timeout to the 1s as it was documented.

Suggested-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260127193727.601-4-michal.wajdeczko@intel.com

drm/xe: Move exponential sleep logic to helper

We want to reuse the same increased sleep logic in other places.
To avoid code duplication, move it to the helper.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260127193727.601-3-michal.wajdeczko@intel.com

drm/xe: Promote relaxed_ms_sleep

We want to have single place with sleep related helpers for better
code reuse. Create xe_sleep.h and move relaxed_ms_sleep() there.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260127193727.601-2-michal.wajdeczko@intel.com

drm/xe/pf: Simplify IS_SRIOV_PF macro

Instead of two having variants of the IS_SRIOV_PF macro, move the
CONFIG_PCI_IOV check to the xe_device_is_sriov_pf() function and
let the compiler optimize that. This will help us drop poor man's
type check of the macro parameter that fails on const xe pointer.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
Link: https://patch.msgid.link/20260128222714.3056-1-michal.wajdeczko@intel.com

drm/xe: Fix kerneldoc for xe_tlb_inval_job_alloc_dep

Correct the function name in the kerneldoc.
It is for below warning:
"Warning: drivers/gpu/drm/xe/xe_tlb_inval_job.c:210 expecting prototype for
xe_tlb_inval_alloc_dep(). Prototype was for xe_tlb_inval_job_alloc_dep()
instead"

Fixes: 15366239e2130 ("drm/xe: Decouple TLB invalidations from GT")
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20260129233834.419977-8-shuicheng.lin@intel.com

drm/xe: Fix kerneldoc for xe_gt_tlb_inval_init_early

Correct the function name in the kerneldoc.
It is for below warning:
"Warning: drivers/gpu/drm/xe/xe_tlb_inval.c:136 expecting prototype for
xe_gt_tlb_inval_init(). Prototype was for xe_gt_tlb_inval_init_early()
instead"

v2: add () for the function. (Michal)

Fixes: db16f9d90c1d9 ("drm/xe: Split TLB invalidation code in frontend and backend")
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20260129233834.419977-7-shuicheng.lin@intel.com

drm/xe: Fix kerneldoc for xe_migrate_exec_queue

Correct the function name in the kerneldoc.
It is for below warning:
"Warning: drivers/gpu/drm/xe/xe_migrate.c:1262 expecting prototype for
xe_get_migrate_exec_queue(). Prototype was for xe_migrate_exec_queue()
instead"

Fixes: 916ee4704a865 ("drm/xe/vf: Register CCS read/write contexts with Guc")
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20260129233834.419977-6-shuicheng.lin@intel.com

drm/xe/query: Fix topology query pointer advance

The topology query helper advanced the user pointer by the size
of the pointer, not the size of the structure. This can misalign
the output blob and corrupt the following mask. Fix the increment
to use sizeof(*topo).
There is no issue currently, as sizeof(*topo) happens to be equal
to sizeof(topo) on 64-bit systems (both evaluate to 8 bytes).

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260130043907.465128-2-shuicheng.lin@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>

drm/xe: use entry_dump callbacks for xe2+ PAT dumps

Move xe2+ PAT entry printing into the entry_dump op so platform
specific logic stays localized, simplifying future maintenance.

v2:
- Do not null xe->pat.ops for VFs.
- Skip PAT init and dump on VFs (-EOPNOTSUPP), avoiding NULL ops use.

v3:
- fixed typo

v4: (Matt)
- Switch xe2_dump() to use the new ops->entry_dump() vfunc.
- Remove xe3p_xpc_dump() and reuse the common xe2_dump() for Xe3p XPC.
- This also fixes Xe3p_HPM media PAT dumping by using the proper
non-MCR access for the PAT register range (bspec 76445).

Cc: Matt Roper <matthew.d.roper@intel.com>
Suggested-by: Brian Nguyen <brian3.nguyen@intel.com>
Signed-off-by: Xin Wang <x.wang@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260130175349.2249033-1-x.wang@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>

drm/xe/guc: Fix kernel-doc warning in GuC scheduler ABI header

The GuC scheduler ABI header contains a file-level comment that is not
intended to document a kernel-doc symbol. Using kernel-doc comment
syntax (/** */) triggers kernel-doc warnings.

With "-Werror", this causes the build to fail. Convert the comment to a
regular block comment.

HDRTEST drivers/gpu/drm/xe/abi/guc_scheduler_abi.h
Warning: drivers/gpu/drm/xe/abi/guc_scheduler_abi.h:11 This comment starts with '/**', but isn't a kernel-doc comment. Refer to Documentation/doc-guide/kernel-doc.rst
* Generic defines required for registration with and submissions to the GuC
1 warnings as errors
make[6]: *** [drivers/gpu/drm/xe/Makefile:377: drivers/gpu/drm/xe/abi/guc_scheduler_abi.hdrtest] Error 3
make[5]: *** [scripts/Makefile.build:544: drivers/gpu/drm/xe] Error 2
make[4]: *** [scripts/Makefile.build:544: drivers/gpu/drm] Error 2
make[3]: *** [scripts/Makefile.build:544: drivers/gpu] Error 2
make[2]: *** [scripts/Makefile.build:544: drivers] Error 2
make[1]: *** [/home/kbuild2/kernel/Makefile:2088: .] Error 2
make: *** [Makefile:248: __sub-make] Error 2

v2:
- Add Fixes tag (Daniele)

Fixes: b0c5cf4f5917 ("drm/gt/guc: extract scheduler-related defines from guc_fwif.h")
Signed-off-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com>
Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patch.msgid.link/20260130135210.2659200-1-chaitanya.kumar.borah@intel.com

drm/xe/guc: Fix CFI violation in debugfs access.

xe_guc_print_info is void-returning, but the function pointer it is
assigned to expects an int-returning function, leading to the following
CFI error:

[ 206.873690] CFI failure at guc_debugfs_show+0xa1/0xf0 [xe]
(target: xe_guc_print_info+0x0/0x370 [xe]; expected type: 0xbe3bc66a)

Fix this by updating xe_guc_print_info to return an integer.

Fixes: e15826bb3c2c ("drm/xe/guc: Refactor GuC debugfs initialization")
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: George D Sworo <george.d.sworo@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20260129182547.32899-2-daniele.ceraolospurio@intel.com

drm/xe: Apply WA_16028005424 to Media

Apply WA_16028005424 to following IPs:
Xe2_LPM, Xe2_HPM, Xe3_LPM, Xe3p_LPM

While doing this move the same WA defined for Xe3_LPG under the comment
for Xe3_LPG. It was wrongly placed under Xe3_LPM.

Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260128062911.1456539-2-balasubramani.vivekanandan@intel.com

drm/xe/pf: Fix typo in function kernel-doc

The function name is missing an underscore, which results in:

  Warning: ../drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c:1261
  This comment starts with '/**', but isn't a kernel-doc comment.
  Refer to Documentation/doc-guide/kernel-doc.rst
  * xe_gt_sriov_pf_control_trigger restore_vf() - Start ...

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
Link: https://patch.msgid.link/20251217150702.2669-1-michal.wajdeczko@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/xe/multi_queue: Protect priority against concurrent access

Use a spinlock to protect multi-queue priority being concurrently
updated by multiple set_priority ioctls and to protect against
concurrent read and write to this field.

v2: Update documentation, remove WRITE/READ_LOCK() (Thomas)
Use scoped_guard, reduced lock scope (Matt Brost)
v3: Fix author (checkpatch)

Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patch.msgid.link/20260126174241.3470390-2-niranjana.vishwanathapura@intel.com

drm/xe/nvm: Defer xe->nvm assignment until init succeeds

Allocate and initialize the NVM structure using a local pointer and
assign it to xe->nvm only after all initialization steps succeed.

This avoids exposing a partially initialized xe->nvm and removes the
need to explicitly clear xe->nvm on error paths, simplifying error
handling and making the lifetime rules clearer.

Cc: Alexander Usyskin <alexander.usyskin@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Brian Nguyen <brian3.nguyen@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Brian Nguyen <brian3.nguyen@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Link: https://patch.msgid.link/20260120183239.2966782-8-shuicheng.lin@intel.com

drm/xe/nvm: Fix double-free on aux add failure

After a successful auxiliary_device_init(), aux_dev->dev.release
(xe_nvm_release_dev()) is responsible for the kfree(nvm). When
there is failure with auxiliary_device_add(), driver will call
auxiliary_device_uninit(), which call put_device(). So that the
.release callback will be triggered to free the memory associated
with the auxiliary_device.

Move the kfree(nvm) into the auxiliary_device_init() failure path
and remove the err goto path to fix below error.

"
[   13.232905] ==================================================================
[   13.232911] BUG: KASAN: double-free in xe_nvm_init+0x751/0xf10 [xe]
[   13.233112] Free of addr ffff888120635000 by task systemd-udevd/273

[   13.233120] CPU: 8 UID: 0 PID: 273 Comm: systemd-udevd Not tainted 6.19.0-rc2-lgci-xe-kernel+ #225 PREEMPT(voluntary)
...
[   13.233125] Call Trace:
[   13.233126]  <TASK>
[   13.233127]  dump_stack_lvl+0x7f/0xc0
[   13.233132]  print_report+0xce/0x610
[   13.233136]  ? kasan_complete_mode_report_info+0x5d/0x1e0
[   13.233139]  ? xe_nvm_init+0x751/0xf10 [xe]
...
"

v2: drop err goto path. (Alexander)

Fixes: d4c3ed963e41 ("drm/xe: defer free of NVM auxiliary container to device release callback")
Reviewed-by: Nitin Gote <nitin.r.gote@intel.com>
Reviewed-by: Brian Nguyen <brian3.nguyen@intel.com>
Cc: Alexander Usyskin <alexander.usyskin@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Suggested-by: Brian Nguyen <brian3.nguyen@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Link: https://patch.msgid.link/20260120183239.2966782-7-shuicheng.lin@intel.com

drm/xe/nvm: Manage nvm aux cleanup with devres

Move nvm teardown to a devm-managed action registered from xe_nvm_init().
This ensures the auxiliary NVM device is deleted on probe failure and
device detach without requiring explicit calls from remove paths.

As part of this, drop xe_nvm_fini() from xe_device_remove() and from the
survivability sysfs teardown, and remove the public xe_nvm_fini() API from
the header.

This is to fix below warn message when there is probe failure after
xe_nvm_init(), then xe_device_probe() is called again:
"
[  207.318152] sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:01.0/0000:03:00.0/xe.nvm.768'
[  207.318157] CPU: 5 UID: 0 PID: 10261 Comm: modprobe Tainted: G    B   W           6.19.0-rc2-lgci-xe-kernel+ #223 PREEMPT(voluntary)
[  207.318160] Tainted: [B]=BAD_PAGE, [W]=WARN
[  207.318161] Hardware name: ASUS System Product Name/PRIME Z790-P WIFI, BIOS 0812 02/24/2023
[  207.318163] Call Trace:
[  207.318163]  <TASK>
[  207.318165]  dump_stack_lvl+0xa0/0xc0
[  207.318170]  dump_stack+0x10/0x20
[  207.318171]  sysfs_warn_dup+0xd5/0x110
[  207.318175]  sysfs_create_dir_ns+0x1f6/0x280
[  207.318177]  ? __pfx_sysfs_create_dir_ns+0x10/0x10
[  207.318179]  ? lock_acquire+0x1a4/0x2e0
[  207.318182]  ? __kasan_check_read+0x11/0x20
[  207.318185]  ? do_raw_spin_unlock+0x5c/0x240
[  207.318187]  kobject_add_internal+0x28d/0x8e0
[  207.318189]  kobject_add+0x11f/0x1f0
[  207.318191]  ? __pfx_kobject_add+0x10/0x10
[  207.318193]  ? lockdep_init_map_type+0x4b/0x230
[  207.318195]  ? get_device_parent.isra.0+0x43/0x4c0
[  207.318197]  ? kobject_get+0x55/0xf0
[  207.318199]  device_add+0x2d7/0x1500
[  207.318201]  ? __pfx_device_add+0x10/0x10
[  207.318203]  ? lockdep_init_map_type+0x4b/0x230
[  207.318205]  __auxiliary_device_add+0x99/0x140
[  207.318208]  xe_nvm_init+0x7a2/0xef0 [xe]
[  207.318333]  ? xe_devcoredump_init+0x80/0x110 [xe]
[  207.318452]  ? __devm_add_action+0x82/0xc0
[  207.318454]  ? fs_reclaim_release+0xc0/0x110
[  207.318457]  xe_device_probe+0x17dd/0x2c40 [xe]
[  207.318574]  ? __pfx___drm_dev_dbg+0x10/0x10
[  207.318576]  ? add_dr+0x180/0x220
[  207.318579]  ? __pfx___drmm_mutex_release+0x10/0x10
[  207.318582]  ? __pfx_xe_device_probe+0x10/0x10 [xe]
[  207.318697]  ? xe_pm_init_early+0x33a/0x410 [xe]
[  207.318850]  xe_pci_probe+0x936/0x1250 [xe]
[  207.318999]  ? lock_acquire+0x1a4/0x2e0
[  207.319003]  ? __pfx_xe_pci_probe+0x10/0x10 [xe]
[  207.319151]  local_pci_probe+0xe6/0x1a0
[  207.319154]  pci_device_probe+0x523/0x840
[  207.319157]  ? __pfx_pci_device_probe+0x10/0x10
[  207.319159]  ? sysfs_do_create_link_sd.isra.0+0x8c/0x110
[  207.319162]  ? sysfs_create_link+0x48/0xc0
...
"

Fixes: c28bfb107dac ("drm/xe/nvm: add on-die non-volatile memory device")
Reviewed-by: Alexander Usyskin <alexander.usyskin@intel.com>
Reviewed-by: Brian Nguyen <brian3.nguyen@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Riana Tauro <riana.tauro@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Link: https://patch.msgid.link/20260120183239.2966782-6-shuicheng.lin@intel.com

drm/xe/configfs: Fix is_bound() pci_dev lifetime

Move pci_dev_put() after pci_dbg() to avoid using pdev after dropping its
reference.

Fixes: 2674f1ef29f46 ("drm/xe/configfs: Block runtime attribute changes")
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Link: https://patch.msgid.link/20260121173750.3090907-2-shuicheng.lin@intel.com

drm/xe/gt: Use CLASS() for forcewake in xe_gt_enable_comp_1wcoh

Adopt the scoped forcewake management using CLASS(xe_force_wake, ...)
to simplify the code and ensure proper resource release.

Cc: Xin Wang <x.wang@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Xin Wang <x.wang@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Link: https://patch.msgid.link/20260123180425.3262944-2-shuicheng.lin@intel.com

drm/xe/vf: Reset VF GuC state on fini

Unlike native/PF driver, which was explicitly triggering full GuC
reset during driver unwind, the VF driver was not notifying GuC that
it is about to unwind, and this could lead GuC to access stale data,
which in turn could be interpreted as VF's malicious activity.

Add managed action to send to GuC VF_RESET message during GT unwind.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patch.msgid.link/20260122151924.3726-1-michal.wajdeczko@intel.com

drm/xe: Move _THIS_IP_ usage from xe_vm_create() to dedicated function

After commit a3866ce7b122 ("drm/xe: Add vm to exec queues association"),
building for an architecture other than x86 (which defines its own
_THIS_IP_) with clang fails with:

  drivers/gpu/drm/xe/xe_vm.c:1586:3: error: cannot jump from this indirect goto statement to one of its possible targets
   1586 |                 drm_exec_retry_on_contention(&exec);
        |                 ^
  include/drm/drm_exec.h:123:4: note: expanded from macro 'drm_exec_retry_on_contention'
    123 |                         goto *__drm_exec_retry_ptr;             \
        |                         ^
  drivers/gpu/drm/xe/xe_vm.c:1542:3: note: possible target of indirect goto statement
   1542 |                 might_lock(&vm->exec_queues.lock);
        |                 ^
  include/linux/lockdep.h:553:33: note: expanded from macro 'might_lock'
    553 |         lock_release(&(lock)->dep_map, _THIS_IP_);                      \
        |                                        ^
  include/linux/instruction_pointer.h:10:41: note: expanded from macro '_THIS_IP_'
     10 | #define _THIS_IP_  ({ __label__ __here; __here: (unsigned long)&&__here; })
        |                                         ^
  drivers/gpu/drm/xe/xe_vm.c:1583:2: note: jump exits scope of variable with __attribute__((cleanup))
   1583 |         xe_validation_guard(&ctx, &xe->val, &exec, (struct xe_val_flags) {.interruptible = true},
        |         ^
  drivers/gpu/drm/xe/xe_validation.h:189:2: note: expanded from macro 'xe_validation_guard'
    189 |         scoped_guard(xe_validation, _ctx, _val, _exec, _flags, &_ret) \
        |         ^
  include/linux/cleanup.h:442:2: note: expanded from macro 'scoped_guard'
    442 |         __scoped_guard(_name, __UNIQUE_ID(label), args)
        |         ^
  include/linux/cleanup.h:433:20: note: expanded from macro '__scoped_guard'
    433 |         for (CLASS(_name, scope)(args);                                 \
        |                           ^
  drivers/gpu/drm/xe/xe_vm.c:1542:3: note: jump enters a statement expression
   1542 |                 might_lock(&vm->exec_queues.lock);
        |                 ^
  include/linux/lockdep.h:553:33: note: expanded from macro 'might_lock'
    553 |         lock_release(&(lock)->dep_map, _THIS_IP_);                      \
        |                                        ^
  include/linux/instruction_pointer.h:10:20: note: expanded from macro '_THIS_IP_'
     10 | #define _THIS_IP_  ({ __label__ __here; __here: (unsigned long)&&__here; })
        |                    ^

While this is a false positive error because __drm_exec_retry_ptr is
only ever assigned the label in drm_exec_until_all_locked() (thus it can
never jump over the cleanup variable), this error is not unreasonable in
general because the only supported use case for taking the address of a
label is computed gotos [1]. The kernel's use of the address of a label
in _THIS_IP_ is considered problematic by both GCC [2][3] and clang [4]
but they need to provide something equivalent before they can break this
use case.

Hide the usage of _THIS_IP_ by moving the CONFIG_PROVE_LOCKING if
statement to its own function, avoiding the error. This is similar to
commit 187e16f69de2 ("drm/xe: Work around clang multiple goto-label
error") but with the sources of _THIS_IP_.

Fixes: a3866ce7b122 ("drm/xe: Add vm to exec queues association")
Link: https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44298
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120071
Link: https://github.com/llvm/llvm-project/issues/138272
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260121-xe-vm-fix-clang-goto-error-v1-1-7e121d81512e@kernel.org

drm/xe: Unregister drm device on probe error

Call drm_dev_unregister() when xe_device_probe() fails after successful
drm_dev_register(). This ensures the DRM device is promptly unregistered
before returning an error, avoiding leaving it registered on the failure
path.
Otherwise, there is warn message if xe_device_probe() is called again:
"
[  207.322365] [drm:drm_minor_register]
[  207.322381] debugfs: '128' already exists in 'dri'
[  207.322432] sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:01.0/0000:03:00.0/drm/renderD128'
[  207.322435] CPU: 5 UID: 0 PID: 10261 Comm: modprobe Tainted: G    B   W           6.19.0-rc2-lgci-xe-kernel+ #223 PREEMPT(voluntary)
[  207.322439] Tainted: [B]=BAD_PAGE, [W]=WARN
[  207.322440] Hardware name: ASUS System Product Name/PRIME Z790-P WIFI, BIOS 0812 02/24/2023
[  207.322441] Call Trace:
[  207.322442]  <TASK>
[  207.322443]  dump_stack_lvl+0xa0/0xc0
[  207.322446]  dump_stack+0x10/0x20
[  207.322448]  sysfs_warn_dup+0xd5/0x110
[  207.322451]  sysfs_create_dir_ns+0x1f6/0x280
[  207.322453]  ? __pfx_sysfs_create_dir_ns+0x10/0x10
[  207.322455]  ? lock_acquire+0x1a4/0x2e0
[  207.322458]  ? __kasan_check_read+0x11/0x20
[  207.322461]  kobject_add_internal+0x28d/0x8e0
[  207.322464]  kobject_add+0x11f/0x1f0
[  207.322465]  ? lock_acquire+0x1a4/0x2e0
[  207.322467]  ? __pfx_kobject_add+0x10/0x10
[  207.322469]  ? __kasan_check_write+0x14/0x20
[  207.322471]  ? kobject_put+0x62/0x4a0
[  207.322473]  ? get_device_parent.isra.0+0x1bb/0x4c0
[  207.322475]  ? kobject_put+0x62/0x4a0
[  207.322477]  device_add+0x2d7/0x1500
[  207.322479]  ? __pfx_device_add+0x10/0x10
[  207.322481]  ? drm_debugfs_add_file+0xfa/0x170
[  207.322483]  ? drm_debugfs_add_files+0x82/0xd0
[  207.322485]  ? drm_debugfs_add_files+0x82/0xd0
[  207.322487]  drm_minor_register+0x10a/0x2d0
[  207.322489]  drm_dev_register+0x143/0x860
[  207.322491]  ? xe_configfs_get_psmi_enabled+0x12/0x90 [xe]
[  207.322667]  xe_device_probe+0x185b/0x2c40 [xe]
[  207.322812]  ? __pfx___drm_dev_dbg+0x10/0x10
[  207.322815]  ? add_dr+0x180/0x220
[  207.322818]  ? __pfx___drmm_mutex_release+0x10/0x10
[  207.322821]  ? __pfx_xe_device_probe+0x10/0x10 [xe]
[  207.322966]  ? xe_pm_init_early+0x33a/0x410 [xe]
[  207.323136]  xe_pci_probe+0x936/0x1250 [xe]
[  207.323298]  ? lock_acquire+0x1a4/0x2e0
[  207.323302]  ? __pfx_xe_pci_probe+0x10/0x10 [xe]
[  207.323464]  local_pci_probe+0xe6/0x1a0
[  207.323468]  pci_device_probe+0x523/0x840
[  207.323470]  ? __pfx_pci_device_probe+0x10/0x10
[  207.323473]  ? sysfs_do_create_link_sd.isra.0+0x8c/0x110
[  207.323476]  ? sysfs_create_link+0x48/0xc0
[  207.323479]  really_probe+0x1fd/0x8a0
...
"

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patch.msgid.link/20260109211041.2446012-2-shuicheng.lin@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>

drm/xe/ptl: Disable DCC on PTL

On PTL, the recommendation is to disable DCC(Duty Cycle Control) as
it may cause some regressions due to added latencies. Upcoming GuC
releases will disable DCC on PTL as well, but we need to force it in
KMD so that this behavior is propagated to older kernels.

v2: Update commit message (Rodrigo)
v3: Rebase
v4: Fix typo: s/propagted/propagated

Fixes: 5cdb71d3b0db ("drm/xe/ptl: Add GuC FW definition for PTL")
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Link: https://patch.msgid.link/20260124005917.398522-1-vinay.belgaumkar@intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/xe/xelp: Fix Wa_18022495364

It looks I mistyped CS_DEBUG_MODE2 as CS_DEBUG_MODE1 when adding the
workaround. Fix it.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Fixes: ca33cd271ef9 ("drm/xe/xelp: Add Wa_18022495364")
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: <stable@vger.kernel.org> # v6.18+
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patch.msgid.link/20260116095040.49335-1-tvrtko.ursulin@igalia.com

drm/xe: Skip address copy for sync-only execs

For parallel exec queues, xe_exec_ioctl() copied the batch buffer address
array from userspace without checking num_batch_buffer.
If user creates a sync-only exec that doesn't use the address field, the
exec will fail with -EFAULT.
Add num_batch_buffer check to skip the copy, and the exec could be executed
successfully.

Here is the sync-only exec:
struct drm_xe_exec exec = {
    .extensions = 0,
    .exec_queue_id = qid,
    .num_syncs = 1,
    .syncs = (uintptr_t)&sync,
    .address = 0,            /* ignored for sync-only */
    .num_batch_buffer = 0,   /* sync-only */
};

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260122214053.3189366-2-shuicheng.lin@intel.com

drm/xe: derive mem copy capability from graphics version

Drop .has_mem_copy_instr from the platform descriptors and set it
in xe_info_init() after handle_gmdid() populates graphics_verx100.
Centralizing the GRAPHICS_VER(xe) >= 20 check keeps MEM_COPY enabled
on Xe2+ and removes redundant per-platform plumbing.

Bspec: 57561

Fixes: 1e12dbae9d72 ("drm/xe/migrate: support MEM_COPY instruction")
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Suggested-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Nitin Gote <nitin.r.gote@intel.com>
Link: https://patch.msgid.link/20260120054724.1982608-2-nitin.r.gote@intel.com
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>

drm/xe: Use DRM_BUDDY_CONTIGUOUS_ALLOCATION for contiguous allocations

The VRAM/stolen memory managers do not currently set
DRM_BUDDY_CONTIGUOUS_ALLOCATION for contiguous allocations. Enabling
this flag activates the buddy allocator's try_harder path, which helps
handle fragmented memory scenarios.

This enables the __alloc_contig_try_harder fallback in the buddy
allocator, allowing contiguous allocation requests to succeed even when
memory is fragmented by combining allocations from both(RHS and LHS)
sides of a large free block.

v2: (Matt B)
- Remove redundant logic for rounding allocation size and trimming when
TTM_PL_FLAG_CONTIGUOUS is set, since drm_buddy now handles this when
DRM_BUDDY_CONTIGUOUS_ALLOCATION is enabled

Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6713
Suggested-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Sanjay Yadav <sanjay.kumar.yadav@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patch.msgid.link/20260121111416.3104399-2-sanjay.kumar.yadav@intel.com

drm/xe: Select CONFIG_DEVICE_PRIVATE when DRM_XE_GPUSVM is selected

CONFIG_DEVICE_PRIVATE is a prerequisite for DRM_XE_GPUSVM.
Explicitly select it so that DRM_XE_GPUSVM is not unintentionally
left out from distro configs not explicitly enabling
CONFIG_DEVICE_PRIVATE.

v2:
- Select also CONFIG_ZONE_DEVICE since it's needed by
CONFIG_DEVICE_PRIVATE.
v3:
- Depend on CONFIG_ZONE_DEVICE rather than selecting it.

Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: <dri-devel@lists.freedesktop.org>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patch.msgid.link/20260121091048.41371-3-thomas.hellstrom@linux.intel.com

drm, drm/xe: Fix xe userptr in the absence of CONFIG_DEVICE_PRIVATE

CONFIG_DEVICE_PRIVATE is not selected by default by some distros,
for example Fedora, and that leads to a regression in the xe driver
since userptr support gets compiled out.

It turns out that DRM_GPUSVM, which is needed for xe userptr support
compiles also without CONFIG_DEVICE_PRIVATE, but doesn't compile
without CONFIG_ZONE_DEVICE.
Exclude the drm_pagemap files from compilation with !CONFIG_ZONE_DEVICE,
and remove the CONFIG_DEVICE_PRIVATE dependency from CONFIG_DRM_GPUSVM and
the xe driver's selection of it, re-enabling xe userptr for those configs.

v2:
- Don't compile the drm_pagemap files unless CONFIG_ZONE_DEVICE is set.
- Adjust the drm_pagemap.h header accordingly.

Fixes: 9e9787414882 ("drm/xe/userptr: replace xe_hmm with gpusvm")
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: dri-devel@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v6.18+
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patch.msgid.link/20260121091048.41371-2-thomas.hellstrom@linux.intel.com

drm/xe/migrate: fix job lock assert

We are meant to be checking the user vm for the bind queue, but actually
we are checking the migrate vm. For various reasons this is not
currently firing but this will likely change in the future.

Now that we have the user_vm attached to the bind queue, we can fix this
by directly checking that here.

Fixes: dba89840a920 ("drm/xe: Add GT TLB invalidation jobs")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Arvind Yadav <arvind.yadav@intel.com>
Link: https://patch.msgid.link/20260120110609.77958-4-matthew.auld@intel.com

drm/xe/uapi: disallow bind queue sharing

Currently this is very broken if someone attempts to create a bind
queue and share it across multiple VMs. For example currently we assume
it is safe to acquire the user VM lock to protect some of the bind queue
state, but if allow sharing the bind queue with multiple VMs then this
quickly breaks down.

To fix this reject using a bind queue with any VM that is not the same
VM that was originally passed when creating the bind queue. This a uAPI
change, however this was more of an oversight on kernel side that we
didn't reject this, and expectation is that userspace shouldn't be using
bind queues in this way, so in theory this change should go unnoticed.

Based on a patch from Matt Brost.

v2 (Matt B):
  - Hold the vm lock over queue create, to ensure it can't be closed as
    we attach the user_vm to the queue.
  - Make sure we actually check for NULL user_vm in destruction path.
v3:
  - Fix error path handling.

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Reported-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Michal Mrozek <michal.mrozek@intel.com>
Cc: Carl Zhang <carl.zhang@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Acked-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Arvind Yadav <arvind.yadav@intel.com>
Acked-by: Michal Mrozek <michal.mrozek@intel.com>
Link: https://patch.msgid.link/20260120110609.77958-3-matthew.auld@intel.com

drm/xe: Add context-based invalidation to GuC TLB invalidation backend

Introduce context-based invalidation support to the GuC TLB invalidation
backend. This is implemented by iterating over each exec queue per GT
within a VM, skipping inactive queues, and issuing a context-based (GuC
ID) H2G TLB invalidation. All H2G messages, except the final one, are
sent with an invalid seqno, which the G2H handler drops to ensure the
TLB invalidation fence is only signaled once all H2G messages are
completed.

A watermark mechanism is also added to switch between context-based TLB
invalidations and full device-wide invalidations, as the return on
investment for context-based invalidation diminishes when many exec
queues are mapped.

v2:
- Fix checkpatch warnings
v3:
- Rebase on PRL
- Use ref counting to avoid racing with deregisters
v4:
- Extra braces (Stuart)
- Use per GT list (CI)
- Reorder put

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Tested-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Link: https://patch.msgid.link/20260116221731.868657-12-matthew.brost@intel.com

drm/xe: Add exec queue active vfunc

If an exec queue is inactive (e.g., not registered or scheduling is
disabled), TLB invalidations are not issued for that queue. Add a
virtual function to determine the active state, which TLB invalidation
logic can hook into.

v5:
- Operate on primary in active function

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Tested-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Link: https://patch.msgid.link/20260116221731.868657-11-matthew.brost@intel.com

drm/xe: Add xe_tlb_inval_idle helper

Introduce the xe_tlb_inval_idle helper to detect whether any TLB
invalidations are currently in flight. This is used in context-based TLB
invalidations to determine whether dummy TLB invalidations need to be
sent to maintain proper TLB invalidation fence ordering..

v2:
- Implement xe_tlb_inval_idle based on pending list

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Tested-by: Stuart Summers <stuart.summers@intel.com>
Link: https://patch.msgid.link/20260116221731.868657-10-matthew.brost@intel.com

drm/xe: Add send_tlb_inval_ppgtt helper

Extract the common code that issues a TLB invalidation H2G for PPGTTs
into a helper function. This helper can be reused for both ASID-based
and context-based TLB invalidations.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Tested-by: Stuart Summers <stuart.summers@intel.com>
Link: https://patch.msgid.link/20260116221731.868657-9-matthew.brost@intel.com

drm/xe: Rename send_tlb_inval_ppgtt to send_tlb_inval_asid_ppgtt

Context-based TLB invalidations have their own set of GuC TLB
invalidation operations. Rename the current PPGTT invalidation function,
which operates on ASIDs, to a more descriptive name that reflects its
purpose.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Tested-by: Stuart Summers <stuart.summers@intel.com>
Link: https://patch.msgid.link/20260116221731.868657-8-matthew.brost@intel.com

drm/xe: Taint TLB invalidation seqno lock with GFP_KERNEL

Taint TLB invalidation seqno lock with GFP_KERNEL as TLB invalidations
can be in the path of reclaim (e.g., MMU notifiers).

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Tested-by: Stuart Summers <stuart.summers@intel.com>
Link: https://patch.msgid.link/20260116221731.868657-7-matthew.brost@intel.com

drm/xe: Add vm to exec queues association

Maintain a list of exec queues per vm which will be used by TLB
invalidation code to do context-ID based tlb invalidations.

v4:
- More asserts (Stuart)
- Per GT list (CI)
- Skip adding / removal if context TLB invalidatiions not supported
(Stuart)

Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Tested-by: Stuart Summers <stuart.summers@intel.com>
Link: https://patch.msgid.link/20260116221731.868657-6-matthew.brost@intel.com

drm/xe: Add xe_device_asid_to_vm helper

Introduce the xe_device_asid_to_vm helper, which can be used throughout
the driver to resolve the VM from a given ASID.

v4:
- Move forward declare after includes (Stuart)

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Tested-by: Stuart Summers <stuart.summers@intel.com>
Link: https://patch.msgid.link/20260116221731.868657-5-matthew.brost@intel.com

drm/xe: Add has_ctx_tlb_inval to device info

Add has_ctx_tlb_inval to device info indicating a device has context
basd TLB invalidation.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Tested-by: Stuart Summers <stuart.summers@intel.com>
Link: https://patch.msgid.link/20260116221731.868657-4-matthew.brost@intel.com

drm/xe: Make usm.asid_to_vm allocation use GFP_NOWAIT

Ensure the asid_to_vm lookup is reclaim-safe so it can be performed
during TLB invalidations, which is necessary for context-based TLB
invalidation support.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Tested-by: Stuart Summers <stuart.summers@intel.com>
Link: https://patch.msgid.link/20260116221731.868657-3-matthew.brost@intel.com

drm/xe: Add normalize_invalidation_range

Extract the code that determines the alignment of TLB invalidation into
a helper function — normalize_invalidation_range. This will be useful
when adding context-based invalidations to the GuC TLB invalidation
backend.

Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Tested-by: Stuart Summers <stuart.summers@intel.com>
Link: https://patch.msgid.link/20260116221731.868657-2-matthew.brost@intel.com

drm/xe/multi_queue: Enable multi_queue on xe3p_xpc

xe3p_xpc supports multi_queue, enable it.

v2: Rename multi_queue_enable_mask to multi_queue_engine_class_mask
(Matt Brost)

Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260116220333.861850-3-matthew.brost@intel.com

drm/xe: Ban entire multi-queue group on any job timeout

In multi-queue mode, we only have control over the entire group, so we
cannot ban individual queues or signal fences until the whole group is
removed from hardware. Implement banning of the entire group if any job
within it times out.

v2:
- Fix lock inversion (Niranjana)
- Initialize new queues in group to stopped
v3:
- Blindly call xe_exec_queue_multi_queue_primary (Niranjana)
- More comments around temporary list when stopping (Niranjana)
- Restart group on false timeout (Niranjana)

Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Link: https://patch.msgid.link/20260116220333.861850-2-matthew.brost@intel.com

drm/xe/xe_query: Remove check for gt

There's no need to check a userspace-provided GT ID (which may come from
any tile) against the number of GTs that can be present on a single
tile. The xe_device_get_gt() lookup already checks that the GT ID passed
is valid for the current device.(Matt Roper)

Signed-off-by: Nakshtra Goyal <nakshtra.goyal@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260113091928.67446-1-nakshtra.goyal@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>

drm/xe: Reduce LRC timestamp stuck message on VFs to notice

An LRC timestamp getting stuck is a somewhat normal occurrence. If a
single VF submits a job that does not get timesliced, the LRC timestamp
will not increment. Reduce the LRC timestamp stuck message on VFs to
notice (same log level as job timeout) to avoid false CI bugs in tests
where a VF submits a job that does not get timesliced.

Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/7032
Fixes: bb63e7257e63 ("drm/xe: Avoid toggling schedule state to check LRC timestamp in TDR")
Suggested-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Link: https://patch.msgid.link/20260114184905.4189026-1-matthew.brost@intel.com