Matt Roper [Mon, 10 Nov 2025 23:20:20 +0000 (15:20 -0800)]
drm/xe/eustall: Store forcewake reference in stream structure
Calls to xe_force_wake_put() should generally pass the exact reference
returned by xe_force_wake_get(). Since EU stall grabs and releases
forcewake in different functions, xe_eu_stall_disable_locked() is
currently calling put with a hardcoded RENDER domain. Although this
works for now, it's somewhat fragile in case the power domain(s)
required by stall sampling change in the future, or if workarounds show
up that require us to obtain additional domains.
Stash the original reference obtained during stream enable inside the
stream structure so that we can use it directly when the stream is
disabled.
Michal Wajdeczko [Wed, 12 Nov 2025 12:44:08 +0000 (13:44 +0100)]
drm/xe/pf: Use migration-friendly GGTT auto-provisioning
Instead of trying very hard to find the largest fair GGTT size that
could be allocated for VFs on the current tile, pick some smaller
rounded down to power-of-two value that is more likely to be
provisioned in the same manner by the other PF instance:
Note that due to FW/HW limitations we can't share all 4GiB GGTT
address space with VFs, so for the larger (>7) number of the VFs
the change in the outcome is happening at different points than
we have in case of GuC contexts/doorbells IDs.
Michał Winiarski [Wed, 12 Nov 2025 13:22:20 +0000 (14:22 +0100)]
drm/intel/bmg: Allow device ID usage with single-argument macros
When INTEL_BMG_G21_IDS were added as a subplatform, token concatenation
operator usage was omitted, making INTEL_BMG_IDS not usable with
single-argument macros.
Fix that by adding the missing operator.
Michał Winiarski [Wed, 12 Nov 2025 13:22:19 +0000 (14:22 +0100)]
drm/xe/pf: Add wait helper for VF FLR
VF FLR requires additional processing done by PF driver.
The processing is done after FLR is already finished from PCIe
perspective.
In order to avoid a scenario where migration state transitions while
PF processing is still in progress, additional synchronization
point is needed.
Add a helper that will be used as part of VF driver struct
pci_error_handlers .reset_done() callback.
Lukasz Laguna [Wed, 12 Nov 2025 13:22:17 +0000 (14:22 +0100)]
drm/xe/migrate: Add function to copy of VRAM data in chunks
Introduce a new function to copy data between VRAM and sysmem objects.
The existing xe_migrate_copy() is tailored for eviction and restore
operations, which involves additional logic and operates on entire
objects.
The xe_migrate_vram_copy_chunk() allows copying chunks of data to or
from a dedicated buffer object, which is essential in case of VF
migration.
Michał Winiarski [Wed, 12 Nov 2025 13:22:13 +0000 (14:22 +0100)]
drm/xe/pf: Add helpers for VF GGTT migration data handling
In an upcoming change, the VF GGTT migration data will be handled as
part of VF control state machine. Add the necessary helpers to allow the
migration data transfer to/from the HW GGTT resource.
Michał Winiarski [Wed, 12 Nov 2025 13:22:11 +0000 (14:22 +0100)]
drm/xe/pf: Switch VF migration GuC save/restore to struct migration data
In upcoming changes, the GuC VF migration data will be handled as part
of separate SAVE/RESTORE states in VF control state machine.
Now that the data is decoupled from both guc_state debugfs and PAUSE
state, we can safely remove the struct xe_gt_sriov_state_snapshot and
modify the GuC save/restore functions to operate on struct
xe_sriov_migration_data.
Michał Winiarski [Wed, 12 Nov 2025 13:22:10 +0000 (14:22 +0100)]
drm/xe/pf: Don't save GuC VF migration data on pause
In upcoming changes, the GuC VF migration data will be handled as part
of separate SAVE/RESTORE states in VF control state machine.
Remove it from PAUSE state.
Michał Winiarski [Wed, 12 Nov 2025 13:22:09 +0000 (14:22 +0100)]
drm/xe/pf: Remove GuC migration data save/restore from GT debugfs
In upcoming changes, SR-IOV VF migration data will be extended beyond
GuC data and exported to userspace using VFIO interface (with a
vendor-specific variant driver) and a device-level debugfs interface.
Remove the GT-level debugfs.
Michał Winiarski [Wed, 12 Nov 2025 13:22:08 +0000 (14:22 +0100)]
drm/xe/pf: Increase PF GuC Buffer Cache size and use it for VF migration
Contiguous PF GGTT VMAs can be scarce after creating VFs.
Increase the GuC buffer cache size to 8M for PF so that we can fit GuC
migration data (which currently maxes out at just over 4M) and use the
cache instead of allocating fresh BOs.
Michał Winiarski [Wed, 12 Nov 2025 13:22:07 +0000 (14:22 +0100)]
drm/xe: Allow the caller to pass guc_buf_cache size
An upcoming change will use GuC buffer cache as a place where GuC
migration data will be stored, and the memory requirement for that is
larger than indirect data.
Allow the caller to pass the size based on the intended usecase.
Michał Winiarski [Wed, 12 Nov 2025 13:22:06 +0000 (14:22 +0100)]
drm/xe: Add sa/guc_buf_cache sync interface
In upcoming changes the cached buffers are going to be used to read data
produced by the GuC. Add a counterpart to flush, which synchronizes the
CPU-side of suballocation with the GPU data and propagate the interface
to GuC Buffer Cache.
Michał Winiarski [Wed, 12 Nov 2025 13:22:04 +0000 (14:22 +0100)]
drm/xe/pf: Add minimalistic migration descriptor
The descriptor reuses the KLV format used by GuC and contains metadata
that can be used to quickly fail migration when source is incompatible
with destination.
Michał Winiarski [Wed, 12 Nov 2025 13:22:03 +0000 (14:22 +0100)]
drm/xe/pf: Add support for encap/decap of bitstream to/from packet
Add debugfs handlers for migration state and handle bitstream
.read()/.write() to convert from bitstream to/from migration data
packets.
As descriptor/trailer are handled at this layer - add handling for both
save and restore side.
Michał Winiarski [Wed, 12 Nov 2025 13:22:02 +0000 (14:22 +0100)]
drm/xe/pf: Add helpers for migration data packet allocation / free
Now that it's possible to free the packets - connect the restore
handling logic with the ring.
The helpers will also be used in upcoming changes that will start
producing migration data packets.
Michał Winiarski [Wed, 12 Nov 2025 13:22:01 +0000 (14:22 +0100)]
drm/xe/pf: Add data structures and handlers for migration rings
Migration data is queued in a per-GT ptr_ring to decouple the worker
responsible for handling the data transfer from the .read() and .write()
syscalls.
Add the data structures and handlers that will be used in future
commits.
Michał Winiarski [Wed, 12 Nov 2025 13:21:59 +0000 (14:21 +0100)]
drm/xe/pf: Convert control state to bitmap
In upcoming changes, the number of states will increase as a result of
introducing SAVE and RESTORE states.
This means that using unsigned long as underlying storage won't work on
32-bit architectures, as we'll run out of bits.
Use bitmap instead.
Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202510231918.XlOqymLC-lkp@intel.com/ Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251112132220.516975-4-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Michał Winiarski [Wed, 12 Nov 2025 13:21:58 +0000 (14:21 +0100)]
drm/xe: Move migration support to device-level struct
Upcoming changes will allow users to control VF state and obtain its
migration data with a device-level granularity (not tile/gt).
Change the data structures to reflect that and move the GT-level
migration init to happen after device-level init.
Michał Winiarski [Wed, 12 Nov 2025 13:21:57 +0000 (14:21 +0100)]
drm/xe/pf: Remove GuC version check for migration support
Since commit 4eb0aab6e4434 ("drm/xe/guc: Bump minimum required GuC
version to v70.29.2"), the minimum GuC version required by the driver
is v70.29.2, which should already include everything that we need for
migration.
Remove the version check.
Sk Anirban [Wed, 12 Nov 2025 18:51:55 +0000 (00:21 +0530)]
drm/xe/guc: Eliminate RPe caching for SLPC parameter handling
RPe is runtime-determined by PCODE and caching it caused stale values,
leading to incorrect GuC SLPC parameter settings.
Drop the cached rpe_freq field and query fresh values from hardware
on each use to ensure GuC SLPC parameters reflect current RPe.
v2: Remove cached RPe frequency field (Rodrigo)
v3: Remove extra variable (Vinay)
Modify function name (Vinay)
v4: Maintain a separate function for PVC (Rodrigo)
v5: Avoid RPn update while fetching RPe frequency (Rodrigo)
v6: Split platform-specific RPe comments (Vinay)
drm/xe/pf: Allow to lockdown the PF using custom guard
Some driver components, like eudebug or ccs-mode, can't be used
when VFs are enabled. Add functions to allow those components
to block the PF from enabling VFs for the requested duration.
Introduce trivial counter to allow lockdown or exclusive access
that can be used in the scenarios where we can't follow the strict
owner semantics as required by the rw_semaphore implementation.
Before enabling VFs, the PF will try to arm the "vfs_enabling"
guard for the exclusive access. This will fail if there are
some lockdown requests already initiated by the other components.
For testing purposes, add debugfs file which will call these new
functions from the file's open/close hooks.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Christoph Manszewski <christoph.manszewski@intel.com> Reviewed-by: Christoph Manszewski <christoph.manszewski@intel.com> Link: https://patch.msgid.link/20251109162451.4779-1-michal.wajdeczko@intel.com
Lucas De Marchi [Mon, 10 Nov 2025 16:41:08 +0000 (08:41 -0800)]
drm/xe/pcode: Rework error mapping
The sparse array used for error decoding from is unnecessarily big. It
should be better handled by a switch statement that will also allow us
to more easily improve this code.
Add a CASE_ERR() macro to keep the table compact and use it instead of
the 256-entries array, which saves some space:
Kriish Sharma [Mon, 10 Nov 2025 18:42:06 +0000 (18:42 +0000)]
drm/xe: fix kernel-doc function name mismatch in xe_pm.c
Documentation build reported:
WARNING: ./drivers/gpu/drm/xe/xe_pm.c:131 expecting prototype for xe_pm_might_block_on_suspend(). Prototype was for xe_pm_block_on_suspend() instead
The kernel-doc comment for xe_pm_block_on_suspend() incorrectly used
the function name xe_pm_might_block_on_suspend(). Fix the header to
match the actual function prototype.
drm/xe/pf: Add runtime registers for GFX ver >= 35
Add a dedicated runtime register list for GFX ver >= 35.
Compared to the list for GFX >= 30, this variant drops
HUC_KERNEL_LOAD_INFO, MIRROR_FUSE1 and adds SERVICE_COPY_ENABLE.
v2:
- drop MIRROR_FUSE1 register
- update commit message
Fixes: 5e0de2dfbc1b ("drm/xe/cri: Add CRI platform definition") Signed-off-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251107211845.3633633-1-piotr.piorkowski@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Lucas De Marchi [Fri, 7 Nov 2025 18:23:45 +0000 (10:23 -0800)]
drm/xe/vram: Move forcewake down to get_flat_ccs_offset()
With SG_TILE_ADDR_RANGE use, the only thing requiring GT forcewake while
probing for vram size is the get_flat_ccs_offset(). Move the forcewake
down where it's needed.
Fei Yang [Fri, 7 Nov 2025 18:23:44 +0000 (10:23 -0800)]
drm/xe: Use SG_TILE_ADDR_RANGE instead of TILE_ADDR_RANGE
The TILE_ADDR_RANGE register is not available on all platforms going
forward as it was deprecated and is being replaced by equivalent
registers within SoC MMIO space. While that doesn't happen, the
SG_TILE_ADDR_RANGE (base 0x1083a0) is still valid for all platforms
supported by xe. Use that instead.
Fixes: 50292f9af8ec ("drm/xe: Move 'vm_max_level' flag back to platform descriptor") Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Gustavo Sousa <gustavo.sousa@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patch.msgid.link/20251108040634.6376-2-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
drm/xe/pf: Use migration-friendly doorbells auto-provisioning
Instead of trying very hard to find the largest fair number of GuC
doorbell IDs that could be allocated for VFs on the current GT, pick
some smaller rounded down to power-of-two value that is more likely
to be provisioned in the same manner by the other PF instance:
drm/xe/pf: Use migration-friendly context IDs auto-provisioning
Instead of trying very hard to find the largest fair number of GuC
context IDs that could be allocated for VFs on the current GT, pick
some smaller rounded down to power-of-two value that is more likely
to be provisioned in the same manner by the other PF instance:
Lucas De Marchi [Tue, 4 Nov 2025 22:20:51 +0000 (14:20 -0800)]
drm/xe/gt_throttle: Avoid TOCTOU when monitoring reasons
It's currently not possible to safely monitor if there's throttling
happening and what are the reasons. The approach of reading the status
and then reading the reasons is not reliable as by the time sysadmin
reads the reason, the throttling could not be happening anymore.
Previous tentative to fix that[1] was breaking the ABI and potentially
sysadmin's scripts. This takes a different approach of adding and
documenting the additional attribute. It's still valuable, though
redundant, to provide the simpler 0/1 interface.
In order to avoid userspace knowledge on the bitmask meaning and to be
able to maintain the kernel side in sync with possible changes in
future, just walk the attribute group and check what are the masks that
match the value read.
Gwan-gyeong Mun [Wed, 5 Nov 2025 01:13:11 +0000 (03:13 +0200)]
drm/xe: Remove never used code in xe_vm_create()
Clang is not happy with set but unused variable (this is visible
with `make LLVM=1` build:
drivers/gpu/drm/xe/xe_vm.c:1462:11: error: variable 'number_tiles' set
but not used [-Werror,-Wunused-but-set-variable]
The use of this variable was removed in the commit mentioned below as
"Fixes:" but only its declaration and update remain.
It seems like the variable is not used along with the assignment that
does not have side effects as far as I can see.
Remove those altogether.
Fixes: cb99e12ba8cb ("drm/xe: Decouple bind queue last fence from TLB invalidations") Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Link: https://patch.msgid.link/20251105011311.3177875-1-gwan-gyeong.mun@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Matthew Brost [Fri, 31 Oct 2025 16:54:15 +0000 (09:54 -0700)]
drm/xe: Add xe_guc_pagefault layer
Add xe_guc_pagefault layer (producer) which parses G2H fault messages
messages into struct xe_pagefault, forwards them to the page fault layer
(consumer) for servicing, and provides a vfunc to acknowledge faults to
the GuC upon completion. Replace the old (and incorrect) GT page fault
layer with this new layer throughout the driver.
As part of this change, the ACC handling code has been removed, as it is
dead code that is currently unused.
Matthew Brost [Fri, 31 Oct 2025 16:54:14 +0000 (09:54 -0700)]
drm/xe: Implement xe_pagefault_queue_work
Implement a worker that services page faults, using the same
implementation as in xe_gt_pagefault.c.
v2:
- Rebase on exhaustive eviction changes
- Include engine instance in debug prints (Stuart)
Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Tested-by: Francois Dugast <francois.dugast@intel.com> Link: https://patch.msgid.link/20251031165416.2871503-6-matthew.brost@intel.com
Matthew Brost [Fri, 31 Oct 2025 16:54:10 +0000 (09:54 -0700)]
drm/xe: Stub out new pagefault layer
Stub out the new page fault layer and add kernel documentation. This is
intended as a replacement for the GT page fault layer, enabling multiple
producers to hook into a shared page fault consumer interface.
v2:
- Fix kernel doc typo (checkpatch)
- Remove comment around GT (Stuart)
- Add explaination around reclaim (Francois)
- Add comment around u8 vs enum (Francois)
- Include engine instance (Stuart)
v3:
- Fix XE_PAGEFAULT_TYPE_ATOMIC_ACCESS_VIOLATION kernel doc (Stuart)
Matthew Brost [Fri, 31 Oct 2025 23:40:50 +0000 (16:40 -0700)]
drm/xe: Remove last fence dependency check from binds and execs
Eliminate redundant last fence dependency checks in exec and bind jobs,
as they are now equivalent to xe_exec_queue_is_idle. Simplify the code
by removing this dead logic.
Matthew Brost [Fri, 31 Oct 2025 23:40:49 +0000 (16:40 -0700)]
drm/xe: Disallow input fences on zero batch execs and zero binds
Prevent input fences from being installed on zero batch execs or zero
binds, which were originally added to support queue idling in Mesa via
output fences. Although input fence support was introduced for interface
consistency, it leads to incorrect behavior due to chained composite
fences, which are disallowed.
Avoid the complexity of fixing this by removing support, as input fences
for these cases are not used in practice.
Matthew Brost [Fri, 31 Oct 2025 23:40:48 +0000 (16:40 -0700)]
drm/xe: Skip TLB invalidation waits in page fault binds
Avoid waiting on unrelated TLB invalidations when servicing page fault
binds. Since the migrate queue is shared across processes, TLB
invalidations triggered by other processes may occur concurrently but
are not relevant to the current bind. Teach the bind pipeline to skip
waits on such invalidations to prevent unnecessary serialization.
Matthew Brost [Fri, 31 Oct 2025 23:40:47 +0000 (16:40 -0700)]
drm/xe: Decouple bind queue last fence from TLB invalidations
Separate the bind queue’s last fence to apply exclusively to the bind
job, avoiding unnecessary serialization on prior TLB invalidations.
Preserve correct user fence signaling by merging bind and TLB
invalidation fences later in the pipeline.
v3:
- Fix lockdep assert for migrate queues (CI)
- Use individual dma fence contexts for array out fences (Testing)
- Don't set last fence with arrays (Testing)
- Move TLB invalid last fence under migrate lock (Testing)
- Don't set queue last for migrate queues (Testing)
Matthew Brost [Fri, 31 Oct 2025 23:40:46 +0000 (16:40 -0700)]
drm/xe: Attach last fence to TLB invalidation job queues
Add support for attaching the last fence to TLB invalidation job queues
to address serialization issues during bursts of unbind jobs. Ensure
that user fence signaling for a bind job reflects both the bind job
itself and the last fences of all related TLB invalidations. Maintain
submission order based solely on the state of the bind and TLB
invalidation queues.
Introduce support functions for last fence attachment to TLB
invalidation queues.
v3:
- Fix assert in xe_exec_queue_tlb_inval_last_fence_set (CI)
- Ensure migrate lock held for migrate queues (Testing)
v5:
- Style nits (Thomas)
- Rewrite commit message (Thomas)
Matthew Brost [Fri, 31 Oct 2025 23:40:45 +0000 (16:40 -0700)]
drm/xe: Enforce correct user fence signaling order using
Prevent application hangs caused by out-of-order fence signaling when
user fences are attached. Use drm_syncobj (via dma-fence-chain) to
guarantee that each user fence signals in order, regardless of the
signaling order of the attached fences. Ensure user fence writebacks to
user space occur in the correct sequence.
Jouni Högander [Fri, 31 Oct 2025 12:23:11 +0000 (14:23 +0200)]
drm/xe: Do clean shutdown also when using flr
Currently Xe driver is triggering flr without any clean-up on
shutdown. This is causing random warnings from pending related works as the
underlying hardware is reset in the middle of their execution.
Fix this by performing clean shutdown also when using flr.
drm/xe/guc: Synchronize Dead CT worker with unbind
Cancel and wait for any Dead CT worker to complete before continuing
with device unbinding. Else the worker will end up using resources freed
by the undind operation.
Cc: Zhanjun Dong <zhanjun.dong@intel.com> Fixes: d2c5a5a926f4 ("drm/xe/guc: Dead CT helper") Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://patch.msgid.link/20251103123144.3231829-6-balasubramani.vivekanandan@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
drm/xe/gt: Synchronize GT reset with device unbind
When unbinding wait for any GT reset in progress to complete. Unbinding
will release the mmio mapping but mmio operations are performed during
GT reset causing Kernel panic.
Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patch.msgid.link/20251103123144.3231829-5-balasubramani.vivekanandan@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Lucas De Marchi [Fri, 31 Oct 2025 22:22:45 +0000 (15:22 -0700)]
drm/xe: Inline gt_reset in the worker
gt_reset() doesn't make sense by itself: it can only be called as part
of the worker. Inline it there to avoid it being called from elsewhere
and clarify the gt_reset() vs do_gt_reset() paths. Note that the error
return from gt_reset() was just being ignored.
Also add a comment to the xe_pm_runtime_put() to make sure the
get()/put() pair is clear.
Michal Wajdeczko [Thu, 30 Oct 2025 22:23:47 +0000 (23:23 +0100)]
drm/xe/pf: Allow to stop the VF using sysfs
It is expected that VFs activity will be monitored and in some
cases admin might want to silence specific VF without killing
the VM where it was attached.
Add write-only attribute to stop GuC scheduling at VFs level.
Writing "1" or "y" (or whatever is recognized by the strtobool()
function) to this file will trigger the change of the VF state
to STOP (GuC will stop servicing the VF). To go back to a READY
state (to allow GuC to service this VF again) the VF FLR must be
triggered (which can be done by writing 1 to device/reset file).
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patch.msgid.link/20251030222348.186658-17-michal.wajdeczko@intel.com
Michal Wajdeczko [Thu, 30 Oct 2025 22:23:46 +0000 (23:23 +0100)]
drm/xe/pf: Add sysfs device symlinks to enabled VFs
For convenience, for every enabled VF add 'device' symlink from
our SR-IOV admin VF folder to enabled sysfs PCI VF device entry.
Remove all those links when disabling PCI VFs.
For completeness, add static 'device' symlink for the PF itself.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patch.msgid.link/20251030222348.186658-16-michal.wajdeczko@intel.com
Writing "high" to the PF read-write attribute will change PF
priority on all tiles/GTs to HIGH (schedule function in the next
time-slice after current one completes and it has work). Writing
"low" or "normal" to change priority to LOW/NORMAL is supported.
When read, those files will display the current and available
scheduling priorities. The currently active priority level will
be enclosed in square brackets, default output will be like:
$ grep . -h sriov_admin/{pf,vf1,vf2}/profile/sched_priority
[low] normal high
[low] normal
[low] normal
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patch.msgid.link/20251030222348.186658-14-michal.wajdeczko@intel.com
Michal Wajdeczko [Thu, 30 Oct 2025 22:23:43 +0000 (23:23 +0100)]
drm/xe/pf: Allow bulk change all VFs priority using sysfs
It is expected to be a common practice to configure the same level
of scheduling priority across all VFs and PF (at least as starting
point). Due to current GuC FW limitations it is also the only way
to change VFs priority.
Add write-only sysfs attribute that will apply required priority
level to all VFs and PF at once.
Writing "low" to this write-only attribute will change PF and
VFs scheduling priority on all tiles/GTs to LOW (function will
be scheduled only if it has work submitted). Similarly, writing
"normal" will change functions priority to NORMAL (functions will
be scheduled irrespective of whether there is a work or not).
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patch.msgid.link/20251030222348.186658-13-michal.wajdeczko@intel.com
Michal Wajdeczko [Thu, 30 Oct 2025 22:23:42 +0000 (23:23 +0100)]
drm/xe/pf: Add functions to provision scheduling priority
We already have function to configure PF (or VF) scheduling priority
on a single GT, but we also need function that will cover all tiles
and GTs.
However, due to the current GuC FW limitation, we can't always rely
on per-GT function as it actually only works for the PF case. The
only way to change VFs scheduling priority is to use 'sched_if_idle'
policy KLV that will change priorities for all VFs (and the PF).
We will use these new functions in the upcoming patches.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patch.msgid.link/20251030222348.186658-11-michal.wajdeczko@intel.com
Michal Wajdeczko [Thu, 30 Oct 2025 22:23:40 +0000 (23:23 +0100)]
drm/xe/pf: Add functions to bulk provision EQ/PT
We already have functions to configure EQ/PT for single VF across
all tiles/GTs. Now add helper functions that will do that for all
VFs (and the PF) at once.
Michal Wajdeczko [Thu, 30 Oct 2025 22:23:39 +0000 (23:23 +0100)]
drm/xe/pf: Add functions to bulk configure EQ/PT on GT
We already have functions to bulk configure 'hard' resources like
GGTT, LMEM or GuC context/doorbells IDs. Now add functions for the
'soft' scheduling parameters, as we will need them soon in the
upcoming patches.
Michal Wajdeczko [Thu, 30 Oct 2025 22:23:37 +0000 (23:23 +0100)]
drm/xe/pf: Relax report helper to accept PF in bulk configs
Our current bulk configuration requests are only about VFs, but
we want to add new functions that will also include PF configs.
Update our bulk report helper to accept also PFID as first VFID.
Michal Wajdeczko [Thu, 30 Oct 2025 22:23:36 +0000 (23:23 +0100)]
drm/xe/pf: Allow change PF and VFs EQ/PT using sysfs
On current platforms, in SR-IOV virtualization, the GPU is shared
between VFs on the time-slice basis. The 'execution quantum' (EQ)
and 'preemption timeout' (PT) are two main scheduling parameters
that could be set individually per each VF.
Add EQ/PT read-write attributes for the PF and all VFs.
By exposing those two parameters over sysfs, the admin can change
their default values (infinity) and let the GuC scheduler enforce
that settings.
Writing 0 to these files will set infinity EQ/PT for the VF on all
tiles/GTs. This is a default value. Writing non-zero integers to
these files will change EQ/PT to new value (in their respective
units: msec or usec).
Reading from these files will return EQ/PT as previously set on
all tiles/GTs. In case of inconsistent values detected, due to
errors or low-level configuration done using debugfs, -EUCLEAN
error will be returned.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patch.msgid.link/20251030222348.186658-6-michal.wajdeczko@intel.com
Michal Wajdeczko [Thu, 30 Oct 2025 22:23:35 +0000 (23:23 +0100)]
drm/xe/pf: Add _locked variants of the VF PT config functions
In upcoming patches we will want to configure VF's preemption
timeout (PT) on all GTs under single lock to avoid potential
races due to parallel GT configuration attempts.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patch.msgid.link/20251030222348.186658-5-michal.wajdeczko@intel.com
Michal Wajdeczko [Thu, 30 Oct 2025 22:23:34 +0000 (23:23 +0100)]
drm/xe/pf: Add _locked variants of the VF EQ config functions
In upcoming patches we will want to configure VF's execution
quantum (EQ) on all GTs under single lock to avoid potential
races in parallel GT configuration attempts.
Michal Wajdeczko [Thu, 30 Oct 2025 22:23:33 +0000 (23:23 +0100)]
drm/xe/pf: Take RPM during calls to SR-IOV attr.store()
We expect that all SR-IOV attr.store() handlers will require active
runtime PM reference. To simplify implementation of those handlers,
take an implicit RPM reference on their behalf. Also wait until PF
completes its restart.
Michal Wajdeczko [Thu, 30 Oct 2025 22:23:32 +0000 (23:23 +0100)]
drm/xe/pf: Prepare sysfs for SR-IOV admin attributes
We already have some SR-IOV specific knobs exposed as debugfs
files to allow low level tuning of the SR-IOV configurations,
but those files are mainly for the use by the developers and
debugfs might not be available on the production builds.
Start building dedicated sysfs sub-tree under xe device, where
in upcoming patches we will add selected attributes that will
help provision and manage PF and all VFs:
Add all required data types and helper macros that will be used
by upcoming patches to define actual attributes.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patch.msgid.link/20251030222348.186658-2-michal.wajdeczko@intel.com
Xin Wang [Thu, 30 Oct 2025 22:17:34 +0000 (22:17 +0000)]
drm/xe: highlight reserved PAT entries in dump output
Enhance the PAT table dump by marking reserved entries with an
asterisk (*) for improved readability and debugging.
V2:
Added a note in the "PAT table" header explaining the meaning of
the asterisk(*) to improve clarity for readers. (Matt Roper)
V3:
Introduced a valid field in struct xe_pat_table_entry to
explicitly track whether an entry is valid or reserved, avoiding
reliance on coh_mode == 0. (Matt Roper)
Lucas De Marchi [Wed, 29 Oct 2025 23:45:09 +0000 (16:45 -0700)]
drm/xe/gt_throttle: Drop individual show functions
They are all doing the same thing with the mask being the param. Just
declare our own attribute to store the mask and provide a single
function.
Another common pattern is to define the show function in the macro,
however on follow up work the mask may be used for returning more
information, so it'd need to be stored in any case.
Lucas De Marchi [Wed, 29 Oct 2025 23:45:08 +0000 (16:45 -0700)]
drm/xe: Improve freq and throttle documentation
Add xe_gt_throttle under the "GT Frequency Management" and improve the
narrative making sure the documentation for both *_freq and throttle/*
attributes follow the same style.
Lucas De Marchi [Wed, 29 Oct 2025 23:45:07 +0000 (16:45 -0700)]
drm/xe/gt_throttle: Tidy up attribute definition
Move the attribute definitions to be grouped together rather than near
the show() function: checkpatch keeps complaining about the missing
newline when defining new attributes and it reads better to group
everything, which should match e.g. the xe_pmu.c style.
While grouping them, also define a THROTTLE_ATTR_RO(), similar to
DEVICE_ATTR_RO(), and use it to define all attributes. This makes it
shorter and with a familiar syntax.
Finally, during the cri_throttle_attrs[] array definition, also
highlight what's coming from common attributes and what is CRI-specific.
These 3 things could be done as separate commits, but they are all about
the same thing: reduce the attribute definition verbosity and are very
simple and mechanical.
Lucas De Marchi [Wed, 29 Oct 2025 23:45:06 +0000 (16:45 -0700)]
drm/xe/gt_throttle: Add throttle_to_gt()
Reduce boilerplate code by adding a helper to go directly from the
throttle kobject to the gt. Note that there's already a kobj_to_gt(),
but that actually converts our kobj_gt object to gt.
Lucas De Marchi [Wed, 29 Oct 2025 23:45:05 +0000 (16:45 -0700)]
drm/xe/gt_throttle: Always read and mask
Use a single function to read and mask the value the callers will be
interested in. This reduces the risk of a caller using a plain call to
xe_gt_throttle_get_limit_reasons() without applying any mask, which can
return unexpected bits for future platforms.
Select which reg and mask it's going to be used according to the
platform and gt type and always use that one function.
There was an odd xe_gt_dbg() when reading the status, which is not done
for any other throttle/* sysfs file, so just make the status be as
special as everybody else.
Lucas De Marchi [Wed, 29 Oct 2025 23:45:04 +0000 (16:45 -0700)]
drm/xe/gt_throttle: Tidy up perf reasons reading
There's no need to be so verbose with two functions per bit:
read_reason_xxxxx() and reason_xxxxx_show(). Drop the former and just
use a new is_throttled_by() that receives the mask as parameter.
Thomas Hellström [Mon, 27 Oct 2025 13:12:28 +0000 (14:12 +0100)]
drm/xe: Fix uninitialized return value from xe_validation_guard()
the DEFINE_CLASS() macro creates an inline function and
the init args are passed down to it; since _ret is passed as an int,
whatever value is set inside the function is not visible to the caller.
Pass _ret as a pointer so its value propagates to the caller.
Fixes: c460bc2311df ("drm/xe: Introduce an xe_validation wrapper around drm_exec") Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6220 Cc: Maarten Lankhorst <maarten.lankhorst@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: intel-xe@lists.freedesktop.org Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251027131228.12098-1-thomas.hellstrom@linux.intel.com
Shuicheng Lin [Mon, 27 Oct 2025 20:21:19 +0000 (20:21 +0000)]
drm/xe: Limit number of jobs per exec queue
Add a limit to the number of jobs that can be queued in a single
exec queue to avoid potential resource exhaustion.
A new field `job_cnt` is introduced in `struct xe_exec_queue` to
track the number of active DRM jobs, along with a maximum limit
`XE_MAX_JOB_COUNT_PER_EXEC_QUEUE` set to 1000.
If the job count exceeds this threshold, `xe_exec_ioctl()` now
returns `-EAGAIN` to signal that the caller should retry later.
A trace event is added to track when the limit is reached:
"xe_exec_queue_reach_max_job_count: dev=0000:03:00.0, job count
exceeded the maximum limit (1000) per exec queue. engine_class=0x3,
logical_mask=0x1, guc_id=2"
v3: add assert in xe_exec_queue_destroy that q->job_cnt is zero. (Matt)
v2 (Matt):
- add log to trace the limit is hit.
- Change max count from 0x1000 to 1000.
- Use atomic_t for job_cnt.
Suggested-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251027202118.3339905-2-shuicheng.lin@intel.com
Michal Wajdeczko [Fri, 24 Oct 2025 20:58:25 +0000 (22:58 +0200)]
drm/xe/pf: Access VF's register using dedicated MMIO view
Instead of creating ad-hoc new register definitions with altered
register addresses to mimic the VF's access to these registers,
prepare new MMIO instance per required VF, with shifted internal
location of the register map. This will allow to use unmodified
register definitions in all calls to xe_mmio() functions.
Nitin Gote [Mon, 27 Oct 2025 09:26:43 +0000 (14:56 +0530)]
drm/xe/xe3: Add WA_14024681466 for Xe3_LPG
Apply WA_14024681466 to Xe3_LPG graphics IP versions from 30.00 to 30.05.
v2: (Matthew Roper)
- Remove stepping filter as workaround applies to all steppings.
- Add an engine class filter so it only applies to the RENDER engine.
Michal Wajdeczko [Sat, 25 Oct 2025 12:49:05 +0000 (14:49 +0200)]
drm/xe/pf: Fix VF FLR synchronization between all GTs
If subsequent VF FLR request is triggered when previous VF FLR
sequence is still being processed, we ignore it as not needed.
But in case of the multi-GT platforms, one GT may already finish
its VF FLR processing and will start a new sequence, which includes
new cross-GT synchronization point. However, since other GT may
be still busy with post-sync cleanup steps, this will put on hold
this new FLR sequence, which might never finish due to lack of any
future synchronization checkouts.
Add additional cross-GT FLR synchronization point when each GT
ends processing its own FLR sequence. This should also help to
cover the case when one GT fails FLR processing before reaching
the first synchronization point.
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6287 Fixes: 2a8fcf7cc950 ("drm/xe/pf: Synchronize VF FLR between all GTs") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patch.msgid.link/20251025124906.5264-1-michal.wajdeczko@intel.com
Sanjay Yadav [Thu, 23 Oct 2025 12:14:54 +0000 (17:44 +0530)]
drm/xe: Fix spelling and typos across Xe driver files
Corrected various spelling mistakes and typos in multiple
files under the Xe directory. These fixes improve clarity
and maintain consistency in documentation.
v2
- Replaced all instances of "XE" with "Xe" where it referred
to the driver name
- of -> for
- Typical -> Typically
v3
- Revert "Xe" to "XE" for macro prefix reference
Matt Roper [Fri, 24 Oct 2025 20:08:35 +0000 (13:08 -0700)]
drm/xe/configfs: Drop MAX_GT_TYPE_CHARS constant
Early revisions of commit 7abd69278bb5 ("drm/xe/configfs: Add attribute
to disable GT types") used MAX_GT_TYPE_CHARS not only to size the
constant name field, but also for some of the string matching logic. By
the time the patch finally landed, the constant was no longer needed for
parsing. Stop using it for the string field definition as well; this
eliminates the risk that we forget to update the constant if we ever add
a GT type name longer than seven characters.
Matt Roper [Tue, 21 Oct 2025 22:45:56 +0000 (15:45 -0700)]
drm/xe/xe3p_xpc: Add MCR steering for NODE and L3BANK ranges
The bspec was originally missing the information related to steering of
L3-related ranges. Now that a late-breaking spec update has added the
necessary information, implement the steering rules in the code. Note
that the sole L3BANK range is the same as the one used on Xe_LPG, so we
can re-use the existing table for that MCR type.
Matt Roper [Tue, 21 Oct 2025 22:45:55 +0000 (15:45 -0700)]
drm/xe/xe3p_xpc: Treat all PSMI MCR ranges as "INSTANCE0"
Early versions of the B-spec originally indicated that Xe3p_XPC had two
ranges of PSMI registers requiring MCR steering (one starting at 0xB500,
one starting at 0xB600), and that reads of registers in these ranges
required different grpid values to ensure that a non-terminated value is
obtained. A late-breaking spec update has simplified this; both ranges
can be safely steered to grpid=0 for reads.
Drop the "PSMI19" replication type and related code, and consolidate
both register ranges into a single entry in the "INSTANCE0" steering
table.
Add platform definition and PCI IDs for Crescent Island.
Other platforms use INTEL_VGA_DEVICE since they have a
PCI_BASE_CLASS_DISPLAY class. This is not the case for CRI, so just
match on devid, which should be sufficient.
Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Reviewed-by: Shekhar Chauhan <shekhar.chauhan@intel.com> Link: https://lore.kernel.org/r/20251021-cri-v1-1-bf11e61d9f49@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>