Michal Wajdeczko [Fri, 11 Jul 2025 19:33:12 +0000 (21:33 +0200)]
drm/xe/pf: Resend PF provisioning after GT reset
If we reload the GuC due to suspend/resume or GT reset then we
have to resend not only any VFs provisioning data, but also PF
configuration, like scheduling parameters (EQ, PT), as otherwise
GuC will continue to use default values.
Michal Wajdeczko [Fri, 11 Jul 2025 19:33:11 +0000 (21:33 +0200)]
drm/xe/pf: Prepare to stop SR-IOV support prior GT reset
As part of the resume or GT reset, the PF driver schedules work
which is then used to complete restarting of the SR-IOV support,
including resending to the GuC configurations of provisioned VFs.
However, in case of short delay between those two actions, which
could be seen by triggering a GT reset on the suspened device:
While this VFs reprovisioning will be successful during next spin
of the worker, to avoid those errors, make sure to cancel restart
worker if we are about to trigger next reset.
I was debugging some unrelated issue and noticed the current code was
very verbose. We can improve it easily by using the more common batch
buffer building pattern.
add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-348 (-348)
Function old new delta
xe_gt_record_default_lrcs.cold 304 296 -8
xe_gt_record_default_lrcs 2200 1860 -340
Total: Before=13554, After=13206, chg -2.57%
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250710-lrc-refactors-v2-7-a5e2ca03f6bd@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Lucas De Marchi [Thu, 10 Jul 2025 20:33:51 +0000 (13:33 -0700)]
drm/xe/gt: Drop third submission for default context
There's no need to submit the nop job again on the first queue. Any
state needed is already saved when the first LRC is switched out. The
comment is a little misleading regarding indirect W/A: first of all
there's still no indirect W/A enabled and secondly, even after they are,
there's no need to submit this job again for having their state
propagated: the indirect W/A will actually run on every LRC switch.
Lucas De Marchi [Thu, 10 Jul 2025 20:33:50 +0000 (13:33 -0700)]
drm/xe/lrc: Remove leftover TODO/FIXME
There isn't anything to set for CTX_TIMESTAMP handling in the empty
LRC: that is set on every LRC init since it should always start from 0
rather than the value saved in the image after first submission.
The FIXME about perma-pinning also doesn't make much sense as we will
always going to pin the lrc and the GGTT mapping has nothing to do with
VM bind.
Lucas De Marchi [Thu, 10 Jul 2025 20:33:48 +0000 (13:33 -0700)]
drm/xe/gt: Extract emit_job_sync()
Both the nop and wa jobs are going through the same boiler plate calls
to emit the job with a timeout and handling error for both bb and job.
Extract emit_job_sync() so those functions create the bb, handling
possible errors and delegate the part about really emitting the job
and waiting for its completion.
Lucas De Marchi [Thu, 10 Jul 2025 20:33:47 +0000 (13:33 -0700)]
drm/xe: Count dwords before allocating
The bb allocation in emit_wa_job() is wrong in 2 ways: first it's
allocating enough space for the 3DSTATE or hardcoding 4k depending on
the engine. In the first case it doesn't account for the WAs and in the
former it may not be sufficient. Secondly it's using the size instead of
number of dwords, causing the buffer to be 4x bigger than needed:
xe_bb_new() receives number of dwords as parameter and its declaration
was also not following its implementation.
Lastly, reword the debug message since it's not only about the LRC WAs
anymore as it also include the 3DSTATE for render.
While it's unlikely this is causing any real issue, let's calculate the
needed space and allocate just enough.
Lucas De Marchi [Thu, 10 Jul 2025 20:33:46 +0000 (13:33 -0700)]
drm/xe/lrc: Reduce scope of empty lrc data
The only case in which new lrc data is created from scratch is when it's
called prior to recording the default lrc. There's no need to check for
NULL init_data since in that case the function already failed: just move
the allocation where it's needed.
Michal Wajdeczko [Sun, 13 Jul 2025 10:36:24 +0000 (12:36 +0200)]
drm/xe/pf: Stop requiring VF/PF version negotiation on every GT
While some VF/PF relay actions must be handled on the GT level,
like query for runtime registers, it was clarified by the arch
team that initial version negotiation can be done by the VF just
once, by using any available GuC/GT.
Move handling of the VF/PF ABI version negotiation on the PF side
from the GT level functions to the device level functions.
Michal Wajdeczko [Sun, 13 Jul 2025 10:36:19 +0000 (12:36 +0200)]
drm/xe: Combine PF and VF device data into union
There is no need to keep PF and VF data fields fully separate
since we can be only in one mode at the time. Move them into
a anonymous union to save few bytes.
Xin Wang [Fri, 11 Jul 2025 06:09:24 +0000 (06:09 +0000)]
drm/xe: Update register definitions in LRC layout header
Update the register definitions in xe_lrc_layout.h to align with the
official hardware specification (Bspec) terminology. Specifically:
- rename PVC_CTX_ACC_CTR_THOLD to CTX_ACC_CTR_THOLD
- rename PVC_CTX_ASID to CTX_ASID
Signed-off-by: Xin Wang <x.wang@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250711060924.7373-1-x.wang@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
drm/xe: Add plumbing for indirect context workarounds
Some upcoming workarounds need to be emitted from the indirect workaround
context so lets add some plumbing where they will be able to easily slot
in.
No functional changes for now since everything is still deactivated.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Bspec: 45954 Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250711160153.49833-7-tvrtko.ursulin@igalia.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
drm/xe: Allow specifying number of extra dwords at the end of wa bb emission
Indirect context setup will need more than one.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250711160153.49833-6-tvrtko.ursulin@igalia.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
drm/xe: Track number of written dwords from workaround batch buffer emission
Indirect context setup will need to get to the number of written dwords.
Lets add it as an output parameter so it can be accessed from the finish
helper regardless of whether code is writing directly or via an shadow
buffer.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250711160153.49833-5-tvrtko.ursulin@igalia.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
drm/xe: Rename utilization workaround emission function
Lucas suggested to consolidate to a slightly different naming scheme which
will align with the upcoming additions better.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Suggested-by: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250711160153.49833-4-tvrtko.ursulin@igalia.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Group the function arguments in a struct for more readable code and easier
extending.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250711160153.49833-3-tvrtko.ursulin@igalia.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Generalize the wa bb emission by splitting it into three phases - setup,
emit and finish, and extract setup and finish steps into helpers.
This will enable using the same infrastructure for emitting the indirect
context workarounds.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250711160153.49833-2-tvrtko.ursulin@igalia.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Lucas De Marchi [Fri, 11 Jul 2025 21:49:12 +0000 (14:49 -0700)]
drm/xe: Fix missing kernel-doc
Fix warning:
Warning: drivers/gpu/drm/xe/xe_device_types.h:658 struct member 'wa_active' not described in 'xe_device'
Fixes: 661a6950e061 ("drm/xe: Add infrastructure for Device OOB workarounds") Cc: Matt Atwood <matthew.s.atwood@intel.com> Reviewed-by: Jonathan Cavitt <joanthan.cavitt@intel.com> Link: https://lore.kernel.org/r/20250711214911.2009714-2-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
xe_bo_create_from_data() last use was removed in 2023 by
commit 0e1a47fcabc8 ("drm/xe: Add a helper for DRM device-lifetime BO
create")
xe_rtp_match_first_gslice_fused_off() last use was removed in 2023 by
commit 4e124151fcfc ("drm/xe/dg2: Drop pre-production workarounds")
Remove them, and xe_dss_mask_empty whose last use was by
xe_rtp_match_first_gslice_fused_off().
(Xe has a bunch ofother symbols that have been added but not used,
given how new it is, I've left those, as opposed to these that
had the code that used them removed).
Lucas De Marchi [Thu, 26 Jun 2025 21:25:53 +0000 (14:25 -0700)]
drm/xe: Normalize default param values
Document xe module params with the default values following a similar
strategy for all of them:
1) Define a DEFAULT_* macro with the default value. When the
value can't be directly stringified, also define a *_STR
variant
2) Use __stringify() or the _STR variant to make sure the
default value shows up in the param description
This allows us to show the correct default according to the
configuration. max_vfs for example was wrongly documented for
CONFIG_DRM_XE_DEBUG and svm_notifier_size didn't have its default
documented.
Lucas De Marchi [Thu, 10 Jul 2025 21:34:41 +0000 (14:34 -0700)]
drm/xe/migrate: Fix alignment check
The check would fail if the address is unaligned, but not when
accounting the offset. Instead of `buf | offset` it should have
been `buf + offset`. To make it more readable and also drop the
uintptr_t, just use the IS_ALIGNED() macro.
Fixes: 270172f64b11 ("drm/xe: Update xe_ttm_access_memory to use GPU for non-visible access") Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://lore.kernel.org/r/20250710-migrate-aligned-v1-1-44003ef3c078@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Matthew Brost [Thu, 10 Jul 2025 20:54:13 +0000 (13:54 -0700)]
drm/xe: Remove references to CONFIG_DRM_XE_DEVMEM_MIRROR
The prefetch code was referencing CONFIG_DRM_XE_DEVMEM_MIRROR, which has
been replaced by CONFIG_DRM_XE_PAGEMAP. As a result, prefetches were
limited to SRAM. Update the code to use CONFIG_DRM_XE_PAGEMAP instead of
the deprecated option.
Fixes: f86ad0ed620c ("drm/gpusvm, drm/pagemap: Move migration functionality to drm_pagemap") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250710205413.1105595-1-matthew.brost@intel.com
Matthew Auld [Thu, 10 Jul 2025 13:41:29 +0000 (14:41 +0100)]
drm/xe/migrate: fix copy direction in access_memory
After we do the modification on the host side, ensure we write the
result back to VRAM and not the other way around, otherwise the
modification will be lost if treated like a read.
Fixes: 270172f64b11 ("drm/xe: Update xe_ttm_access_memory to use GPU for non-visible access") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250710134128.800756-2-matthew.auld@intel.com
Skipping TLB invalidations on VF causing unrecoverable
faults. Probable reason for skipping TLB invalidations
on SRIOV could be lack of support for instruction
MI_FLUSH_DW_STORE_INDEX. Add back TLB flush with some
additional handling.
Michal Wajdeczko [Thu, 10 Jul 2025 10:30:40 +0000 (10:30 +0000)]
drm/xe/sriov: Mark BMG as SR-IOV capable
Enable SR-IOV support for BMG platforms. Note that as other flags from
the platform descriptor, it only means it may have that capability: it
still depends on runtime checks for the proper support in HW and
firmware.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Tested-by: Jakub Kolakowski <jakub1.kolakowski@intel.com> Signed-off-by: Jakub Kolakowski <jakub1.kolakowski@intel.com> Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Link: https://lore.kernel.org/r/20250710103040.375610-3-jakub1.kolakowski@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Matt Atwood [Wed, 9 Jul 2025 22:16:05 +0000 (15:16 -0700)]
drm/xe: extend Wa_15015404425 to apply to PTL
Wa_15015404425 only needs to be applied on PTL platforms with an A step
compute die. There is no way to map PCI revid to the compute die
stepping. The easiest way to figure out compute die stepping our end is
to map the media IP's stepping to the compute die. For PTL, compute die
has an A stepping if and only if the media IP's stepping is also A-step
(This relationship is determined on a per platform basis and just
happens to be this way on PTL).
In addition this workaround is a chicken-and-egg problem. Wa_15015404425
requires that all register reads be preceded by four dummy MMIO writes
(including during early driver init and even pre-OS firmware). The
driver needs to perform some MMIO reads during init which include the
GMD_ID register that contains the Media IPs stepping. To handle this in
the safest manner assume the workaround applies to all of PTL during
driver probe and deactivate the workaround after.
The overall solution becomes a set of two workarounds:
* 15015404425 - a Device OOB workaround that's always active for PTL
* 15015404425_disable - a GT OOB workaround that applies to PTL
platfroms with a B0 or later stepping
The first of these workarounds issues dummy MMIO writes we do when
reading registers. The second guards logic that disables the first once
we have the necessary information later in the probe process.
v2: rename SoC to device, avoid null pointer dereference, update commit
message.
v3: rebase
v5: move disable check into xe_device_probe to avoid linking in xe_wa
into xe_pci, reword commit message
v6: squash extension and b0 support into 1 patch
Matt Atwood [Wed, 9 Jul 2025 22:16:03 +0000 (15:16 -0700)]
drm/xe: Add infrastructure for Device OOB workarounds
Some workarounds need to be able to be applied ahead of any GT
initialization for example 15015404425. This patch creates XE_DEVICE_WA
macro, in the same vein as XE_WA. This macro can be used ahead of GT
initialization, and can be tracked in sysfs. This should alleviate some
of the complexities that exist in i915.
v2: name change SoC to Device, address style issues
v5: split into separate patch from RTP changes, put oob within a struct,
move the initiation of oob workarounds into xe_device_probe_early(),
clean up the comments around XE_WA.
Matt Atwood [Wed, 9 Jul 2025 22:16:01 +0000 (15:16 -0700)]
drm/xe: add xe_device_wa infrastructure
There are some workarounds that must be appplied before gt init,
wa_15015404425 for example. Instead of sprinking them conditionally
throughout the driver as we did for i915 generate an oob.rules file
reusing the RTP infrastructure to make these easier to track.
v2: rename xe_soc_wa to xe_device_wa
v5: derive prefix from argument rather than hard coding the values.
v6: split out xe_gen-wa_oob changes
drm/xe/guc: Cancel ongoing H2G requests when stopping CT
Once we have started a GT reset sequence, which includes stopping
GuC CTB communication, we should also cancel all ongoing H2G send-
recv requests, as either GuC is already dead, or due to imminent
reset GuC will not be able to reply, or due to internal cleanup
we will lose pending fences. With this we will report dedicated
-ECANCELED error instead of misleading -ETIME.
These workarounds are not applicable for use by the VFs.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Tested-by: Jakub Kolakowski <jakub1.kolakowski@intel.com> Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Signed-off-by: Jakub Kolakowski <jakub1.kolakowski@intel.com> Link: https://lore.kernel.org/r/20250710103040.375610-2-jakub1.kolakowski@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Juston Li [Wed, 9 Jul 2025 19:23:14 +0000 (12:23 -0700)]
drm/xe/bo: add GPU memory trace points
Add TRACE_GPU_MEM tracepoints for tracking global GPU memory usage.
These are required by VSR on Android 12+ for reporting GPU driver memory
allocations.
v5:
- Drop process_mem tracking
- Set the gpu_id field to dev->primary->index (Lucas, Tvrtko)
- Formatting cleanup under 80 columns
v3:
- Use now configurable CONFIG_TRACE_GPU_MEM instead of adding a
per-driver Kconfig (Lucas)
v2:
- Use u64 as preferred by checkpatch (Tvrtko)
- Fix errors in comments/Kconfig description (Tvrtko)
- drop redundant "CONFIG" in Kconfig
Signed-off-by: Juston Li <justonli@chromium.org> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250709192313.479336-2-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Heikki Krogerus [Tue, 1 Jul 2025 12:22:50 +0000 (15:22 +0300)]
drm/xe: Support for I2C attached MCUs
Adding adaption/glue layer where the I2C host adapter
(Synopsys DesignWare I2C adapter) and the I2C clients (the
microcontroller units) are enumerated.
The microcontroller units (MCU) that are attached to the GPU
depend on the OEM. The initially supported MCU will be the
Add-In Management Controller (AMC).
Co-developed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://lore.kernel.org/r/20250701122252.2590230-4-heikki.krogerus@linux.intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Rodrigo fixed the co-developed tags and SPDX format in the .c file]
Heikki Krogerus [Tue, 1 Jul 2025 12:22:49 +0000 (15:22 +0300)]
i2c: designware: Add quirk for Intel Xe
The regmap is coming from the parent also in case of Xe
GPUs. Reusing the Wangxun quirk for that.
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Co-developed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Reviewed-by: Andi Shyti <andi.shyti@kernel.org> Link: https://lore.kernel.org/r/20250701122252.2590230-3-heikki.krogerus@linux.intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Rodrigo fixed the co-developed tags while merging]
Heikki Krogerus [Tue, 1 Jul 2025 12:22:48 +0000 (15:22 +0300)]
i2c: designware: Use polling by default when there is no irq resource
The irq resource itself can be used as a generic way to
determine when polling is needed.
This not only removes the need for special additional device
properties that would soon be needed when the platform may
or may not have the irq, but it also removes the need to
check the platform in the first place in order to determine
is polling needed or not.
Since we are already using reusable buffer objects from the GuC
buffer cache, we can directly write into their CPU pointers and
spare unnecessary temporary allocation.
While around, also make sure to clear obtained buffer, to avoid
sending some stale data.
Shuicheng Lin [Mon, 7 Jul 2025 00:49:14 +0000 (00:49 +0000)]
drm/xe: Release runtime pm for error path of xe_devcoredump_read()
xe_pm_runtime_put() is missed to be called for the error path in
xe_devcoredump_read().
Add function description comments for xe_devcoredump_read() to help
understand it.
v2: more detail function comments and refine goto logic (Matt)
Fixes: c4a2e5f865b7 ("drm/xe: Add devcoredump chunking") Cc: stable@vger.kernel.org Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250707004911.3502904-6-shuicheng.lin@intel.com
Shuicheng Lin [Mon, 7 Jul 2025 00:49:13 +0000 (00:49 +0000)]
drm/xe: Remove unused code in devcoredump_snapshot()
The deleted code is no longer needed because patch "drm/xe/guc: Plumb
GuC-capture into dev coredump" has removed the related usage code.
Remove the code to tidy up the function.
v2: s/bacause/because
Reviewed-by: Zhanjun Dong <zhanjun.dong@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250707004911.3502904-5-shuicheng.lin@intel.com
Matthew Auld [Tue, 1 Jul 2025 10:39:50 +0000 (11:39 +0100)]
drm/xe/bmg: fix compressed VRAM handling
There looks to be an issue in our compression handling when the BO pages
are very fragmented, where we choose to skip the identity map and
instead fall back to emitting the PTEs by hand when migrating memory,
such that we can hopefully do more work per blit operation. However in
such a case we need to ensure the src PTEs are correctly tagged with a
compression enabled PAT index on dgpu xe2+, otherwise the copy will
simply treat the src memory as uncompressed, leading to corruption if
the memory was compressed by the user.
To fix this pass along use_comp_pat into emit_pte() on the src side, to
indicate that compression should be considered.
v2 (Jonathan): tweak the commit message
Fixes: 523f191cc0c7 ("drm/xe/xe_migrate: Handle migration logic for xe2+ dgfx") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Akshata Jahagirdar <akshata.jahagirdar@intel.com> Cc: <stable@vger.kernel.org> # v6.12+ Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://lore.kernel.org/r/20250701103949.83116-2-matthew.auld@intel.com
Seeing some unexplained random failures during LRC context switches with
indirect ring state enabled. The failures were always there, but the
repro rate increased with the addition of WA BB as a separate BO.
Commit 3a1edef8f4b5 ("drm/xe: Make WA BB part of LRC BO") helped to
reduce the issues in the context switches, but didn't eliminate them
completely.
Indirect ring state is not required for any current features, so disable
for now until failures can be root caused.
Cc: stable@vger.kernel.org Fixes: fe0154cf8222 ("drm/xe/xe2: Enable Indirect Ring State support for Xe2") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250702035846.3178344-1-matthew.brost@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Tomasz Lis [Mon, 30 Jun 2025 15:21:55 +0000 (17:21 +0200)]
drm/xe/vf: Make multi-GT migration less error prone
There is a remote chance that after migration, some GTs will not
send the MIGRATED interrupt, or due to current VF KMD state the
interrupt will not lead to marking the GT for recovery.
Requiring IRQs from all GTs before starting migration introduces
the possibility that the process will get stalled due to one GuC.
One could argue it is also waste of time to wait for all IRQs,
but we should get them all IRQs as soon as VGPU starts, so that's
not really an impactful argument.
Still, not waiting for all GTs makes it easier to handle situations:
* where one GuC IRQ is missing
* where state before probe is unclean - getting MIGRATED IRQ as soon
as interrupts are enabled
* where multiple migrations happen close to each other
To help with these cases, this patch alters the post-migration
recovery so that recovery task is started as soon as one GuC IRQ
is handled, and other GTs are included in recovery later as the
subsequent IRQs are serviced.
The post-migration recovery can now be called for any selection of
GTs, and it will perform recovery on all GTs for which IRQs have
arrived, even multiple times if necessary.
v2: Typos and style fixes
v3: Transferring gt_flags by value rather than reference to last
function where it is used
Signed-off-by: Tomasz Lis <tomasz.lis@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Acked-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Reviewed-by: Michal Winiarski <michal.winiarski@intel.com> Link: https://lore.kernel.org/r/20250630152155.195648-1-tomasz.lis@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Our LMEM buffer objects are not cleared by default on alloc
and during VF provisioning we only setup LMTT PTEs for the
actually provisioned LMEM range. But beyond that valid range
we might leave some stale data that could either point to some
other VFs allocations or even to the PF pages.
Explicitly clear all new LMTT page to avoid the risk that a
malicious VF would try to exploit that gap.
While around add asserts to catch any undesired PTE overwrites
and low-level debug traces to track LMTT PT life-cycle.
Fixes: b1d204058218 ("drm/xe/pf: Introduce Local Memory Translation Table") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Lukasz Laguna <lukasz.laguna@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://lore.kernel.org/r/20250701220052.1612-1-michal.wajdeczko@intel.com
Riana Tauro [Mon, 30 Jun 2025 09:37:41 +0000 (15:07 +0530)]
drm/xe/xe_pmu: Validate gt in event supported
Validate gt instead of checking gt_id is lesser
than max gts per tile
Signed-off-by: Riana Tauro <riana.tauro@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20250630093741.2435281-1-riana.tauro@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Matt Roper [Tue, 1 Jul 2025 20:13:28 +0000 (13:13 -0700)]
drm/xe/xe_query: Use separate iterator while filling GT list
The 'id' value updated by for_each_gt() is the uapi GT ID of the GTs
being iterated over, and may skip over values if a GT is not present on
the device. Use a separate iterator for GT list array assignments to
ensure that the array will be filled properly on future platforms where
index in the GT query list may not match the uapi ID.
v2:
- Include the missing increment of the iterator. (Jonathan)
Matt Roper [Tue, 1 Jul 2025 20:13:27 +0000 (13:13 -0700)]
drm/xe: Don't compare GT ID to GT count when determining valid GTs
On current platforms with multiple GTs, all of the GT IDs are
consecutive; as a result we know that the GT IDs range from 0 to
gt_count-1 and can determine if a GT ID is valid by comparing against
the count. The consecutive nature of GT IDs may not hold true on future
platforms if/when we have platforms that are both multi-tile and have
multiple GTs within each tile. Once such platforms exist, it's quite
possible that we could wind up with something like a GT list composed of
IDs 0, 2, and 3 with no GT 1 (which would be a 2-tile platform with
media only on the second tile).
To future-proof the code we should stop comparing against the GT count
to determine whether a GT ID is valid or not. Instead we should do an
actual lookup of the ID to determine whether the GT exists. This also
means that our GT loop macro should not end at the GT count, but should
rather examine the entire space up to (# of tiles) * (max GT per tile)
to ensure it doesn't stop prematurely.
Matt Roper [Tue, 1 Jul 2025 20:13:26 +0000 (13:13 -0700)]
drm/xe: Assign GT IDs properly on multi-tile + multi-GT platforms
Although "multi-tile" and "multiple GTs per tile" are mutually-exclusive
characteristics on all of our platforms today, this may not always be
true. Assign GT IDs according to xe->info.max_gt_per_tile in a way that
should work even if future platforms have different configurations.
This patch should not change the behavior of current platforms; it only
future-proofs for potential future designs.
v2:
- Re-calculate gt_count if tile count gets reduced by MTCFG. (PVC CI)
Matt Roper [Tue, 1 Jul 2025 20:13:24 +0000 (13:13 -0700)]
drm/xe: Track maximum GTs per tile on a per-platform basis
Today all of our platforms fall into one of three cases:
* Single tile platforms with a single (primary) GT
* Single tile platforms with two GTs (primary + media)
* Two-tile platforms with a single GT (primary) in each
Our numbering of GTs has been a bit inconsistent between platforms
(e.g., GT1 is the media GT on some platforms, but the second tile's
primary GT on others). In the future we'll likely have platforms that
are both multi-tile and multi-GT, which will make the situation more
confusing. We could also wind up with more than just two types of GTs
at some point in the future.
Going forward we should standardize the way we assign uapi GT IDs to
internal GT structures. Let's declare that for userspace GT ID n,
GT[n]'s tile = n / (max gt per tile)
GT[n]'s slot within tile = n % (max gt per tile)
We don't want the GT numbering to change for any of our current
platforms since the current IDs are part of our ABI contract with
userspace so this means we should track the 'max gt per tile' value on a
per-platform basis rather than just using a single value across the
driver. Encode this into device descriptors in xe_pci.c and use the
per-platform number for various checks in the code. Constant
XE_MAX_GT_PER_TILE will remain just as the maximum across all platforms
for easy of sizing array allocations.
Michal Wajdeczko [Fri, 27 Jun 2025 18:41:43 +0000 (20:41 +0200)]
drm/xe/hw_engine_group: Fix potential leak
If we fail to allocate a workqueue we will leak kzalloc'ed group
object since it was designed to be kfree'ed in the drmm cleanup
action, but we didn't have a chance to register this action yet.
To avoid this leak allocate a group object using drmm_kzalloc()
and start using predefined drmm action to release the workqueue.
Tvrtko Ursulin [Mon, 30 Jun 2025 12:46:47 +0000 (13:46 +0100)]
drm/xe: Consolidate LRC offset calculations
Attempt to consolidate the LRC offsets calculations by aligning the
recently added wa_bb_offset with the naming scheme in the file and
also change the size stored in struct xe_lrc to not include the ring
buffer.
The former makes it somewhat visually easier to follow the layout of the
various logical blocks stored in the LRC bo, while the latter reduces the
number of sprinkled around calculations.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250630124711.8209-2-tvrtko.ursulin@igalia.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Harry Austen [Fri, 27 Jun 2025 20:30:35 +0000 (13:30 -0700)]
drm/xe: Allow dropping kunit dependency as built-in
Fix Kconfig symbol dependency on KUNIT, which isn't actually required
for XE to be built-in. However, if KUNIT is enabled, it must be built-in
too.
Fixes: 08987a8b6820 ("drm/xe: Fix build with KUNIT=m") Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Harry Austen <hpausten@protonmail.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20250627-xe-kunit-v2-2-756fe5cd56cf@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Matthew Brost [Wed, 25 Jun 2025 14:41:28 +0000 (07:41 -0700)]
drm/xe: Drop bo->size
bo->size is redundant because the base GEM object already has a size
field with the same value. Drop bo->size and use the base GEM object’s
size instead. While at it, introduce xe_bo_size() to abstract the BO
size.
drm/xe/guc: Enable the Dynamic Inhibit Context Switch optimization
The Dynamic Inhibit Context Switch is an optimization aimed at reducing
the amount of time the HW is stuck waiting on an unsatisfied semaphore.
When this optimization is enabled, the GuC will dynamically modify the
CTX_CTRL_INHIBIT_SYN_CTX_SWITCH in the CTX_CONTEXT_CONTROL register of
LRCs to enable immediate switching out on an unsatisfied semaphore wait
when multiple contexts are competing for time on the same engine.
This feature is available on recent HW from GuC 70.40.1 onwards and it
is enabled via a per-VF feature opt-in.
v2: rebase
v3: switch to using guc_buf_cache instead of dedicated alloc
v4: add helper to check for feature availability (Michal), don't enable
if multi-lrc is possible.
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Julia Filipchuk <julia.filipchuk@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://lore.kernel.org/r/20250625205405.1653212-4-daniele.ceraolospurio@intel.com
On newer HW (Xe2 onwards + PVC) it is possible to get extra information
when a CAT error occurs, specifically a dword reporting the error type.
To enable this extra reporting, we need to opt-in with the GuC, which is
done via a specific per-VF feature opt-in H2G.
On platforms where the HW does not support the extra reporting, the GuC
will set the type to 0xdeadbeef, so we can keep the code simple and
opt-in to the feature on every platform and then just discard the data
if it is invalid.
Note that on native/PF we're guaranteed that the opt in is available
because we don't support any GuC old enough to not have it, but if we're
a VF we might be running on a non-XE PF with an older GuC, so we need to
handle that case. We can re-use the invalid type above to handle this
scenario the same way as if the feature was not supported in HW.
Given that this patch is the first user of the guc_buf_cache on native
and VF, it also extends that feature to non-PF use-cases.
v2: simpler print for the error type (John), rebase
v3: use guc_buf_cache instead of new alloc, simpler doc (Michal)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> #v1 Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://lore.kernel.org/r/20250625205405.1653212-3-daniele.ceraolospurio@intel.com
Jia Yao [Thu, 12 Jun 2025 22:46:20 +0000 (22:46 +0000)]
drm/xe: Fix out-of-bounds field write in MI_STORE_DATA_IMM
According to Bspec, bits 0~9 of MI_STORE_DATA_IMM must not exceed 0x3FE.
The macro MI_SDI_NUM_QW(x) evaluates to 2 * x + 1, which means the
condition 2 * x + 1 <= 0x3FE must be satisfied. Therefore, the maximum
valid value for x is 0x1FE, not 0x1FF.
v2
- Replace 0x1fe with macro MAX_PTE_PER_SDI (Auld, Matthew & Patelczyk, Maciej)
v3
- Change macro MAX_PTE_PER_SDI from 0x1fe to 0x1feU (De Marchi, Lucas)
Bspec: 60246
Fixes: 9c44fd5f6e8a ("drm/xe: Add migrate layer functions for SVM support") Cc: Matthew Brost <matthew.brost@intel.com> Cc: Brian3 Nguyen <brian3.nguyen@intel.com> Cc: Alex Zuo <alex.zuo@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Maciej Patelczyk <maciej.patelczyk@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Suggested-by: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: Jia Yao <jia.yao@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Maciej Patelczyk <maciej.patelczyk@intel.com> Link: https://lore.kernel.org/r/20250612224620.161105-1-jia.yao@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Now that all previous allocations are gone, ensure no new allocations
will ever be done before xe_display_init_early(), by moving the call
that allows allocations downwards.
drm/xe: Split init of xe_gt_init_hwconfig to xe_gt_init and *_early
Now that we added the separate step of initialising GUC in
xe_gt_init_early, it should be ok to initialise the minimum during early
init, and the rest after allocations are allowed.
Clarify that the functions are the part of gt_init() that are called
with the respective power domains held. all_domain() of course only
works after discovering and initialisation of force_wake on all engines,
that's why the split is needed in the first place.
drm/xe: Make it possible to read instance0 MCR registers after xe_gt_mcr_init_early
After mcr_init_early, we need to be able to do VRAM and CCS probing
without hwconfig probe. Fortunately the relevant registers are all
instance 0, which fortunately means no dependencies on further initialization
is required.
Thomas Hellström [Thu, 19 Jun 2025 13:40:35 +0000 (15:40 +0200)]
drm/xe: Implement and use the drm_pagemap populate_mm op
Add runtime PM since we might call populate_mm on a foreign device.
v3:
- Fix a kerneldoc failure (Matt Brost)
- Revert the bo type change from device to kernel (Matt Brost)
v4:
- Add an assert in xe_svm_alloc_vram (Matt Brost)
Matthew Brost [Thu, 19 Jun 2025 13:40:33 +0000 (15:40 +0200)]
drm/gpusvm, drm/pagemap: Move migration functionality to drm_pagemap
The migration functionality and track-keeping of per-pagemap VRAM
mapped to the CPU mm is not per GPU_vm, but rather per pagemap.
This is also reflected by the functions not needing the drm_gpusvm
structures. So move to drm_pagemap.
With this, drm_gpusvm shouldn't really access the page zone-device-data
since its meaning is internal to drm_pagemap. Currently it's used to
reject mapping ranges backed by multiple drm_pagemap allocations.
For now, make the zone-device-data a void pointer.
Alter the interface of drm_gpusvm_migrate_to_devmem() to ensure we don't
pass a gpusvm pointer.
Rename CONFIG_DRM_XE_DEVMEM_MIRROR to CONFIG_DRM_XE_PAGEMAP.
Matt is listed as author of this commit since he wrote most of the code,
and it makes sense to retain his git authorship.
Thomas mostly moved the code around.
v3:
- Kerneldoc fixes (CI)
- Don't update documentation about how the drm_pagemap
migration should be interpreted until upcoming
patches where the functionality is implemented.
(Matt Brost)
v4:
- More kerneldoc fixes around timeslice_ms
(Himal Ghimiray, Matt Brost)
v6:
- Fix an uninitialized pagemap pointer (CI)
Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250619134035.170086-2-thomas.hellstrom@linux.intel.com
There are several things wrong with the way this WA was implemented:
- The KLV is only supported on GuC 70.47.0 or newer, so we shouldn't
apply it unconditionally.
- The KLV requires 2 DWs of data, which are not currently provided.
The GuC currently ignores any unknown KLVs, so on versions older that
70.47.0 nothing happens. However, starting on 70.47.0 the GuC attempts
to parse the KLV and fails due to the missing data, causing a GuC load
abort.
Given that 70.47.0 is the first GuC version approved for public release
for PTL, let's revert this patch so it doesn't cause the GuC load to
fail with that blob. We can then re-apply it properly fixed after the
GuC definition is merged, which will also have the added benefit of
running the KLV addition through CI with the right GuC version.
Shuicheng Lin [Sun, 8 Jun 2025 23:01:33 +0000 (23:01 +0000)]
drm/xe/uapi: Correct sync type definition in comments
Commit 37d078e51b4c ("drm/xe/uapi: Split xe_sync types from flags") renamed some DRM_XE_SYNC_*
defines but later commits kept using the old names. Correct them with the new definition.
v2: correct fixes tag and update commit message to explain why (Lucas)
Fixes: 9329f0667215 ("drm/xe/uapi: Use LR abbrev for long-running vms") Fixes: 4b437893a826 ("drm/xe/uapi: More uAPI documentation additions and cosmetic updates") Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Francois Dugast <francois.dugast@intel.com> Cc: Zongyao Bai <zongyao.bai@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Link: https://lore.kernel.org/r/20250608230133.1250849-1-shuicheng.lin@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Michal Wajdeczko [Thu, 12 Jun 2025 22:09:37 +0000 (00:09 +0200)]
drm/xe/guc: Explicitly exit CT safe mode on unwind
During driver probe we might be briefly using CT safe mode, which
is based on a delayed work, but usually we are able to stop this
once we have IRQ fully operational. However, if we abort the probe
quite early then during unwind we might try to destroy the workqueue
while there is still a pending delayed work that attempts to restart
itself which triggers a WARN.
This was recently observed during unsuccessful VF initialization: