Lucas De Marchi [Thu, 10 Apr 2025 04:59:34 +0000 (21:59 -0700)]
drm/xe: Set LRC addresses before guc load
The metadata saved in the ADS is read by GuC when it's initialized.
Saving the addresses to the LRCs when they are populated is too late as
GuC will keep using the old ones.
This was causing GuC to use the RCS LRC for any engine class. It's not a
big problem in a Linux-only scenario since they are used by GuC only
on media engines when the watchdog is triggered. However, in a
virtualization scenario with Windows as the VF, it causes the wrong LRCs
to be loaded as the watchdog is used for all engines.
Fix it by letting guc_golden_lrc_init() initialize the metadata, like
other *_init() functions, and later guc_golden_lrc_populate() to copy
the LRCs to the right places. The former is called before the second GuC
load, while the latter is called after LRCs have been recorded.
Cc: Chee Yin Wong <chee.yin.wong@intel.com>
Cc: John Harrison <john.c.harrison@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: <stable@vger.kernel.org> # v6.11+
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Tested-by: Chee Yin Wong <chee.yin.wong@intel.com>
Link: https://lore.kernel.org/r/20250409-fix-guc-ads-v1-1-494135f7a5d0@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Michal Wajdeczko [Fri, 11 Apr 2025 19:30:30 +0000 (21:30 +0200)]
drm/xe/pf: Don't show GGTT/LMEM debugfs files under media GT
Most of the PF's debugfs files (and their implementations) are
based on the GT hierarchy, even if the files relate to GGTT or
LMEM data, which belong to the tile.
While we could reach the tile data from any GT, to avoid potential
misuse, some functions may be used on the primary GT only,
and may use asserts to enforce that.
In our case, the following assert could be seen when reading the
/sys/kernel/debug/dri/0000:00:02.0/gt1/pf/ggtt_available
drm/xe/vf: Don't expose privileged GT debugfs files if VF
Some of the debugfs files require access to the registers that are
not accessible to the VFs. Don't expose those files on VF drivers.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Marcin Bernatowicz <marcin.bernatowicz@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Tested-by: Marcin Bernatowicz <marcin.bernatowicz@linux.intel.com>
Reviewed-by: Marcin Bernatowicz <marcin.bernatowicz@linux.intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250403142635.1821-4-michal.wajdeczko@intel.com
We don't have to drmm_kmalloc() local copy of debugfs_list to
write there our pointer to the struct xe_guc as we can extract
pointer to the struct xe_gt from the grandparent debugfs entry,
in similar way to what we did for GT debugfs files.
Note that there is no change in file/directory structure, just
refactored how files are created and how functions are called.
Lucas De Marchi [Wed, 9 Apr 2025 14:09:56 +0000 (07:09 -0700)]
drm/xe: Allow to drop vram resizing
The default behavior if the LMEMBAR doesn't match the maximum possible
size is to try to resize it. However, the user might want to keep
whatever size was set via sysfs, even if only to test the behavior with
a small BAR. Change the module parameter to int and check for a negative
value.
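
As a rough illustration of the signed-parameter check described above, here
is a minimal sketch. The helper name, its arguments and the meaning given to
zero and positive values are assumptions made for the example, not the
driver's actual code; only "negative means leave the BAR alone" comes from
the commit message.

```c
#include <stdint.h>

/*
 * Hypothetical helper: returns the BAR size (in MiB) to resize to,
 * or 0 when the BAR should be left untouched.
 *
 * force_size_mib < 0  : skip resizing, keep whatever size is already set
 * force_size_mib == 0 : default, resize to the maximum if not already there
 * force_size_mib > 0  : assumed here to request that specific size
 */
static uint64_t pick_bar_size_mib(int force_size_mib, uint64_t current_mib,
				  uint64_t max_mib)
{
	if (force_size_mib < 0)
		return 0;

	if (force_size_mib > 0)
		return (uint64_t)force_size_mib;

	return current_mib == max_mib ? 0 : max_mib;
}
```

Using a signed type is what makes the "do nothing" case expressible without
adding a second parameter.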
Matthew Brost [Tue, 8 Apr 2025 15:59:15 +0000 (08:59 -0700)]
drm/xe: Add page queue multiplier
For an unknown reason the math to determine the PF queue size is
not correct - compute UMD applications are overflowing the PF queue,
which is fatal. A multiplier of 8 fixes the problem.
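
A minimal sketch of what such a sizing rule could look like, assuming the
queue is sized from an EU count, an engine count and a fixed message length.
All names and the shape of the formula are hypothetical; only the factor of
8 is taken from the commit message.

```c
#include <stdint.h>

/* The only value taken from the commit message. */
#define PF_MULTIPLIER_SKETCH 8u

/*
 * Hypothetical sizing rule: one message slot per potential fault source,
 * scaled by the safety multiplier. Returns the queue size in dwords.
 */
static uint32_t pf_queue_num_dw(uint32_t num_eus, uint32_t num_engines,
				uint32_t msg_len_dw)
{
	return (num_eus + num_engines) * msg_len_dw * PF_MULTIPLIER_SKETCH;
}
```

For example, 512 EUs, 20 engines and 4-dword messages would give
(512 + 20) * 4 * 8 = 17024 dwords under these assumptions.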
Fixes: 3338e4f90c14 ("drm/xe: Use topology to determine page fault queue size")
Cc: stable@vger.kernel.org
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jagmeet Randhawa <jagmeet.randhawa@intel.com>
Link: https://lore.kernel.org/r/20250408155915.78770-1-matthew.brost@intel.com
Shuicheng Lin [Sat, 5 Apr 2025 17:15:39 +0000 (17:15 +0000)]
drm/xe: remove unused LE_COS
The LE_COS definition missed passing the value parameter to
REG_FIELD_PREP. This didn't cause build errors because the entire
macro was unused.
The value for this field is universally "0" for every MOCS entry on
the old Xe_LP platforms, and the whole field has been removed from
Xe_HP onward. Just delete the line so that we don't have an unused
definition.
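
To illustrate why the broken definition went unnoticed: the preprocessor
does not check a function-like macro's body until the macro is expanded.
The sketch below uses a simplified stand-in for REG_FIELD_PREP and a
hypothetical mask value; neither is copied from the driver.

```c
#include <stdint.h>

/*
 * Simplified stand-in for a field-prep helper: multiply the value by the
 * lowest set bit of the mask (equivalent to shifting it into place), then
 * clamp it to the mask.
 */
#define FIELD_PREP_SKETCH(mask, val) \
	(((uint32_t)(val) * ((mask) & (~(mask) + 1u))) & (mask))

/* Hypothetical field position, for illustration only. */
#define LE_COS_MASK_SKETCH 0x00018000u

/*
 * Shape of the bug described above: the argument never reaches the helper.
 * Left commented out; it would only fail to build once something used it.
 *
 *   #define LE_COS_BROKEN(value) FIELD_PREP_SKETCH(LE_COS_MASK_SKETCH)
 */

/* What a correct definition would have looked like. */
#define LE_COS_FIXED(value) FIELD_PREP_SKETCH(LE_COS_MASK_SKETCH, value)
```

Since nothing used the macro and the field is always zero, deleting it is
simpler than fixing it.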
Suggested-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Link: https://lore.kernel.org/r/20250405171539.599850-1-shuicheng.lin@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
drm/xe: Enable configfs support for survivability mode
Enable survivability mode if supported and configfs attribute is set.
Enabling survivability mode manually is useful in cases where pcode does
not detect failure, for validation, and for IFR (in-field-repair).
To set configfs survivability mode attribute for a device
Registers a configfs subsystem called 'xe' that creates a
directory in the mounted configfs directory (/sys/kernel/config)
Userspace can then create the device that has to be configured
under the xe directory
mkdir /sys/kernel/config/xe/0000:03:00.0
The device created will have the following attributes to be
configured
Lucas De Marchi [Thu, 3 Apr 2025 05:38:05 +0000 (22:38 -0700)]
drm/xe: Fix taking invalid lock on wedge
If device wedges on e.g. GuC upload, the submission is not yet enabled
and the state is not even initialized. Protect the wedge call so it does
nothing in this case. It fixes the following splat:
Matt Roper [Fri, 4 Apr 2025 22:00:54 +0000 (15:00 -0700)]
drm/xe: Ensure XE_BO_FLAG_CPU_ADDR_MIRROR has a unique value
When XE_BO_FLAG_PINNED_NORESTORE and XE_BO_FLAG_PINNED_LATE_RESTORE were
added, they were assigned BO flag values in the middle of the flag
range, requiring renumbering of the higher flags. In both cases,
XE_BO_FLAG_CPU_ADDR_MIRROR was overlooked during renumbering because it
was defined below XE_BO_FLAG_GGTT_ALL and thus was not immediately
visible in code diffs changing this area of the code; this resulted in
XE_BO_FLAG_CPU_ADDR_MIRROR clashing with another flag.
Assign XE_BO_FLAG_CPU_ADDR_MIRROR a unique value, and also move the
definition of XE_BO_FLAG_GGTT_ALL down below all of the individual flags
so that this kind of mistake is less likely in the future. Also, while
we're at it, fix up some space vs tab whitespace inconsistency in these
flag definitions.
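
One way to catch this class of mistake is to check that every flag is a
single bit and that no two flags share one. The sketch below is a generic
userspace illustration; the helper and macro names are made up for the
example and are not part of the driver.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BIT_U32(n) ((uint32_t)1 << (n))

/*
 * Returns true when every entry is exactly one bit and no bit is used
 * twice. A combined mask such as a "GGTT_ALL" style definition must not
 * be passed in, which is one reason to keep it below the individual flags.
 */
static bool flags_are_unique(const uint32_t *flags, size_t n)
{
	uint32_t seen = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		uint32_t f = flags[i];

		if (f == 0 || (f & (f - 1)))	/* zero or more than one bit */
			return false;
		if (seen & f)			/* clashes with an earlier flag */
			return false;
		seen |= f;
	}

	return true;
}
```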
Fixes: 7f387e6012b6 ("drm/xe: add XE_BO_FLAG_PINNED_LATE_RESTORE")
Fixes: 045448da87bf ("drm/xe: Add XE_BO_FLAG_PINNED_NORESTORE")
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://lore.kernel.org/r/20250404220053.1758356-2-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Oak Zeng [Thu, 3 Apr 2025 16:53:28 +0000 (12:53 -0400)]
drm/xe: Allow scratch page under fault mode for certain platform
Normally a scratch page is not allowed when a vm operates under page
fault mode, i.e., in the existing code, DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE
and DRM_XE_VM_CREATE_FLAG_FAULT_MODE are mutually exclusive. The reason
is that fault mode relies on recoverable page faults to work, while a
scratch page can mute recoverable page faults.
On xe2 and xe3, out of bound prefetch can cause page fault and further
system hang because xekmd can't resolve such page fault. SYCL and OCL
language runtime requires out of bound prefetch to be silently dropped
without causing any functional problem, thus the existing behavior
doesn't meet language runtime requirement.
At the same time, HW prefetching can cause page fault interrupt. Due to
page fault interrupt overhead (i.e., need GuC and KMD involved to fix
the page fault), HW prefetching can be slowed by many orders of magnitude.
Fix those problems by allowing scratch page under fault mode for xe2 and
xe3. With scratch page in place, HW prefetching could always hit scratch
page instead of causing interrupt.
A side effect is that the scratch page could hide application program
errors. Application out-of-bounds accesses are hidden by the scratch page
mapping instead of being reported to the user.
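
The resulting rule can be sketched as a small flag check. The bit values
and the helper below are illustrative assumptions for a standalone example;
the real flag values live in the xe uAPI header and the real check is part
of VM creation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions only, not copied from the uAPI header. */
#define SKETCH_FLAG_SCRATCH_PAGE (1u << 0)
#define SKETCH_FLAG_FAULT_MODE   (1u << 2)

/*
 * Scratch page and fault mode may only be combined on devices that need
 * a scratch page for out-of-bounds prefetch (xe2 and xe3 per the text
 * above). Every other combination is left alone by this sketch.
 */
static bool vm_create_flags_valid(uint32_t flags, bool needs_scratch)
{
	bool scratch = flags & SKETCH_FLAG_SCRATCH_PAGE;
	bool fault = flags & SKETCH_FLAG_FAULT_MODE;

	if (scratch && fault)
		return needs_scratch;

	return true;
}
```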
v2: Refine commit message (Thomas)
v3: Move the scratch page flag check to after scratch page wa (Thomas)
v4: drop NEEDS_SCRATCH macro (Matt)
Add a comment to DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE
Signed-off-by: Oak Zeng <oak.zeng@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://lore.kernel.org/r/20250403165328.2438690-4-oak.zeng@intel.com
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Oak Zeng [Thu, 3 Apr 2025 16:53:27 +0000 (12:53 -0400)]
drm/xe: Clear scratch page on vm_bind
When a vm runs under fault mode, if scratch page is enabled, we need
to clear the scratch page mapping on vm_bind for the vm_bind address
range. Under fault mode, we depend on recoverable page fault to
establish mapping in page table. If scratch page is not cleared, GPU
access of address won't cause page fault because it always hits the
existing scratch page mapping.
When vm_bind with IMMEDIATE flag, there is no need of clearing as
immediate bind can overwrite the scratch page mapping.
So far only xe2 and xe3 products are allowed to enable scratch page
under fault mode. On other platforms we don't allow scratch page under
fault mode, so no such clearing is needed.
v2: Rework vm_bind pipeline to clear scratch page mapping. This is similar
to a map operation, with the exception that PTEs are cleared instead of
pointing to valid physical pages. (Matt, Thomas)
TLB invalidation is needed after clear scratch page mapping as larger
scratch page mapping could be backed by physical page and cached in
TLB. (Matt, Thomas)
v3: Fix the case of clearing huge pte (Thomas)
Improve commit message (Thomas)
v4: TLB invalidation on all LR cases, not only the clear on bind
cases (Thomas)
v5: Misc cosmetic changes (Matt)
Drop pt_update_ops.invalidate_on_bind. Directly wire
xe_vma_op.map.invalidate_on_bind to bind_op_prepare/commit (Matt)
v6: checkpatch fix (Matt)
v7: No need to check platform needs_scratch deciding invalidate_on_bind
(Matt)
v8: rebase
v9: rebase
v10: fix an error in xe_pt_stage_bind_entry, introduced in v9 rebase
Signed-off-by: Oak Zeng <oak.zeng@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://lore.kernel.org/r/20250403165328.2438690-3-oak.zeng@intel.com
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Oak Zeng [Thu, 3 Apr 2025 16:53:26 +0000 (12:53 -0400)]
drm/xe: Introduced needs_scratch bit in device descriptor
On some platforms, a scratch page is needed for out of bound prefetch
to work. Introduce a bit in the device descriptor to specify whether
this device needs a scratch page to work.
v2: introduce a needs_scratch bit in device info (Thomas, Jonathan)
v3: drop NEEDS_SCRATCH macro (Matt)
Signed-off-by: Oak Zeng <oak.zeng@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://lore.kernel.org/r/20250403165328.2438690-2-oak.zeng@intel.com
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Matthew Auld [Thu, 3 Apr 2025 10:24:48 +0000 (11:24 +0100)]
drm/xe/sriov: support non-contig VRAM provisioning
Currently we can run into issues with provisioning VRAM region, due to
requiring contig VRAM BO underneath. We sometimes see that allocation
(multiple GB) can fail even when there is enough free space. We don't
need CPU access to the buffer in the first place, so can forgo pin_map
and therefore also the contig requirement. Keep the same behavior with
save and restore during suspend/resume (which can now be done with
blitter). We also need the VRAM to occupy the same pages so we don't
need to re-program the LMTT, so it should still remain pinned (also we
don't want something to try to evict it). With that, convert over to a
plain pinned kernel object.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Link: https://lore.kernel.org/r/20250403102440.266113-16-matthew.auld@intel.com
Matthew Auld [Thu, 3 Apr 2025 10:24:47 +0000 (11:24 +0100)]
drm/xe: allow non-contig VRAM kernel BO
If the kernel bo doesn't care about vmap(), either directly or
indirectly with save/restore then we don't need to force contig for such
buffers.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Link: https://lore.kernel.org/r/20250403102440.266113-15-matthew.auld@intel.com
Matthew Auld [Thu, 3 Apr 2025 10:24:46 +0000 (11:24 +0100)]
drm/xe: unconditionally apply PINNED for pin_map()
Some users apply PINNED and some don't when using pin_map(). The pin in
pin_map() should imply PINNED so just unconditionally apply it and clean
up all users.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Link: https://lore.kernel.org/r/20250403102440.266113-14-matthew.auld@intel.com
Matthew Auld [Thu, 3 Apr 2025 10:24:45 +0000 (11:24 +0100)]
drm/xe: add XE_BO_FLAG_PINNED_LATE_RESTORE
With the idea of having more pinned objects using the blitter engine
where possible, during suspend/resume, mark the pinned objects which
can be done during the late phase once submission/migration has been
setup. Start out simple with lrc and page-tables from userspace.
v2:
- s/early_restore/late_restore; early restore was way too bold with too
many places being impacted at once.
v3:
- Split late vs early into separate lists, to align with newly added
apply-to-pinned infra.
v4:
- Rebase.
v5:
- Make sure we restore the late phase kernel_bo_present in igpu.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Link: https://lore.kernel.org/r/20250403102440.266113-13-matthew.auld@intel.com
Matthew Auld [Thu, 3 Apr 2025 10:24:44 +0000 (11:24 +0100)]
drm/xe/migrate: ignore CCS for kernel objects
For kernel BOs we don't clear the CCS state on creation, therefore we
should be careful to ignore it when copying pages. In a future patch we
opt for using the copy path here for kernel BOs, so this now needs to be
considered.
v2:
- Drop bogus asserts (CI)
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Link: https://lore.kernel.org/r/20250403102440.266113-12-matthew.auld@intel.com
Matthew Brost [Thu, 3 Apr 2025 10:24:43 +0000 (11:24 +0100)]
drm/xe: Add XE_BO_FLAG_PINNED_NORESTORE
Not all BOs need to be restored on resume / d3cold exit. Add
XE_BO_FLAG_PINNED_NORESTORE, which skips restoring the BO and just
allocates VRAM for it. This should slightly speed up resume / d3cold
exit flows.
Marking GuC ADS, GuC CT, GuC log, GuC PC, and SA as NORESTORE.
v2:
- s/WONTNEED/NORESTORE (Vivi)
- Rebase on newly added g2g and backup object flow
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20250403102440.266113-11-matthew.auld@intel.com
Matthew Auld [Thu, 3 Apr 2025 10:24:42 +0000 (11:24 +0100)]
drm/xe: use backup object for pinned save/restore
Currently we move pinned objects, relying on the fact that the lpfn/fpfn
will force the placement to occupy the same pages when restoring.
However this then limits all such pinned objects to be contig
underneath. In addition it is likely a little fragile moving pinned
objects in the first place. Rather than moving such objects, copy
the page contents to a secondary system memory object, that way the VRAM
pages never move and remain pinned. This also opens the door for
eventually having non-contig pinned objects that can also be
saved/restored using blitter.
v2:
- Make sure to drop the fence ref.
- Handle NULL bo->migrate.
v3:
- Ensure we add the copy fence to the BOs, otherwise backup_obj can
be freed before pipelined copy finishes.
v4:
- Rebase on newly added apply-to-pinned infra.
The structure was missing a proper kerneldoc header and once
that was added a number of typos and errors became obvious.
Fix those.
Reported-by: Lucas De Marchi <lucas.demarchi@intel.com>
Closes: https://lore.kernel.org/intel-xe/x53tcs5bjldw6lcorjemuheklxcmepdvr2u7lvt3hpqrzqoc4h@nsu6hs25taqj/
Fixes: b2d4b03b03a7 ("drm/xe: Make the PT code handle placement per PTE rather than per vma / range")
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250402122924.25526-1-thomas.hellstrom@linux.intel.com
Vinay Belgaumkar [Mon, 31 Mar 2025 20:48:27 +0000 (13:48 -0700)]
drm/xe/pmu: Add GT frequency events
Define PMU events for GT frequency (actual and requested). The
instantaneous values for these frequencies will be displayed.
Following PMU events are being added:
xe_0000_00_02.0/gt-actual-frequency/ [Kernel PMU event]
xe_0000_00_02.0/gt-requested-frequency/ [Kernel PMU event]
Standard perf commands can be used to monitor GT frequency:
$ perf stat -e xe_0000_00_02.0/gt-requested-frequency,gt=0/ -I1000
v2: Use locks while storing/reading samples, keep track of multiple
clients (Lucas) and other general cleanup.
v3: Review comments (Lucas) and use event counts instead of mask for
active events.
v4: Add freq events to event_param_valid method (Riana)
v5: Use instantaneous values instead of aggregating (Lucas)
v6: Obtain fwake at init for freq events as well and use non fwake
variant method for reading requested freq to avoid lockdep issues (Lucas)
v7: Review comments (Rodrigo, Lucas)
Cc: Riana Tauro <riana.tauro@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://lore.kernel.org/r/20250331204827.2535393-1-vinay.belgaumkar@intel.com
Gustavo Sousa [Fri, 28 Mar 2025 16:50:50 +0000 (13:50 -0300)]
drm/xe: Make PPHWSP size explicit in xe_gt_lrc_size()
The context of each engine starts with a 4k memory space for the
"Per-process HW status page" (PPHWSP). In xe_gt_lrc_size(), we have been
implicitly accounting for that page in the switch statement on the
engine class.
Since the PPHWSP is common to all engines, let's extract that into its
own assignment. That makes the context structure more explicit in the
code and aligns better with the descriptions in Bspec.
Another advantage of keeping it separate is that now the sizes used in
the switch statement match the sizes we calculate for engine-specific
context images, which have their own Bspec pages.
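
The structure described above can be sketched as follows. The enum, the
function name and in particular the per-class page counts are placeholders
invented for the example; they are not the Bspec values and not the
driver's code.

```c
#include <stddef.h>

#define SKETCH_SZ_4K 4096u

/* Hypothetical engine classes for the example. */
enum sketch_engine_class {
	SKETCH_CLASS_RENDER,
	SKETCH_CLASS_COPY,
};

/*
 * Context size = common PPHWSP page + engine-specific context image.
 * The page counts in the switch are placeholders, not real values.
 */
static size_t sketch_lrc_size(enum sketch_engine_class class)
{
	size_t size = SKETCH_SZ_4K;	/* PPHWSP, common to all engines */

	switch (class) {
	case SKETCH_CLASS_RENDER:
		size += 13 * SKETCH_SZ_4K;	/* placeholder image size */
		break;
	case SKETCH_CLASS_COPY:
	default:
		size += 1 * SKETCH_SZ_4K;	/* placeholder image size */
		break;
	}

	return size;
}
```

With the PPHWSP accounted for once, each case only has to state the size
of its own context image.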
Kenneth Graunke [Sun, 30 Mar 2025 16:59:23 +0000 (12:59 -0400)]
drm/xe: Invalidate L3 read-only cachelines for geometry streams too
Historically, the Vertex Fetcher unit has not been an L3 client. That
meant that, when a buffer containing vertex data was written to, it was
necessary to issue a PIPE_CONTROL::VF Cache Invalidate to invalidate any
VF L2 cachelines associated with that buffer, so the new value would be
properly read from memory.
Since Tigerlake and later, VERTEX_BUFFER_STATE and 3DSTATE_INDEX_BUFFER
have included an "L3 Bypass Enable" bit which userspace drivers can set
to request that the vertex fetcher unit snoop L3. However, unlike most
true L3 clients, the "VF Cache Invalidate" bit continues to only
invalidate the VF L2 cache - and not any associated L3 lines.
To handle that, PIPE_CONTROL has a new "L3 Read Only Cache Invalidation
Bit", which according to the docs, "controls the invalidation of the
Geometry streams cached in L3 cache at the top of the pipe." In other
words, the vertex and index buffer data that gets cached in L3 when
"L3 Bypass Disable" is set.
Mesa always sets L3 Bypass Disable so that the VF unit snoops L3, and
whenever it issues a VF Cache Invalidate, it also issues a L3 Read Only
Cache Invalidate so that both L2 and L3 vertex data is invalidated.
xe is issuing VF cache invalidates too (which handles cases like CPU
writes to a buffer between GPU batches). Because userspace may enable
L3 snooping, it needs to issue an L3 Read Only Cache Invalidate as well.
Fixes significant flickering in Firefox on Meteorlake, which was writing
to vertex buffers via the CPU between batches; the missing L3 Read Only
invalidates were causing the vertex fetcher to read stale data from L3.
Fixes: b4b05e53b550 ("drm/xe/guc_pc: Retry and wait longer for GuC PC start")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/intel-xe/1454a5f1-ee18-4df1-a6b2-a4a3dddcd1cb@stanley.mountain/
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250328181752.26677-1-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Stuart Summers [Fri, 28 Mar 2025 15:42:36 +0000 (15:42 +0000)]
drm/xe: Don't print error about hwconfig when using execlists
This error message is only applicable for platforms using
GuC submission - to warn the user if the GuC they are using
or the platform they are running doesn't have this information
to provide to userspace about the platform. When forcing
execlist submission, which is something only used for debug,
the user is running at their own risk and should understand
the limitations of running without GuC.
v2 (John/Lucas): Don't print an info message with execlists
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Jagmeet Randhawa <jagmeet.randhawa@intel.com>
Link: https://lore.kernel.org/r/20250328154236.9216-1-stuart.summers@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
John Harrison [Tue, 25 Mar 2025 20:32:11 +0000 (13:32 -0700)]
drm/xe/guc: Re-word message about ADS size changes
The error capture list in the ADS is initially allocated using a
placeholder size. When the actual size is determined later on, there
is a debug print about the new size. However, the wording is such that
some people see it as an unexpected thing and therefore a potential
problem. So re-word it to be a little less concerning.
Arnd Bergmann [Mon, 24 Mar 2025 21:06:02 +0000 (22:06 +0100)]
drm/xe: avoid plain 64-bit division
Building the xe driver for i386 results in a link time warning:
x86_64-linux-ld: drivers/gpu/drm/xe/xe_migrate.o: in function `xe_migrate_vram':
xe_migrate.c:(.text+0x1e15): undefined reference to `__udivdi3'
Avoid this by using DIV_U64_ROUND_UP() instead of DIV_ROUND_UP(). The driver
is unlikely to be used on 32-bit hardware, so the extra cost here is not
too important.
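
For context: on 32-bit targets the compiler turns a plain 64-bit '/' into a
call to a libgcc helper such as __udivdi3, which the kernel does not link
against, so kernel code goes through dedicated helpers instead. The
arithmetic itself is just a round-up division, sketched here in userspace
with a hypothetical function name. It is written as quotient plus a
remainder check, which is an assumption of this example rather than the
kernel macro's form, so that it cannot overflow for large dividends.

```c
#include <stdint.h>

/*
 * Round-up division of a 64-bit value by a 32-bit divisor.
 * The caller must pass a non-zero divisor.
 */
static uint64_t sketch_div_u64_round_up(uint64_t n, uint32_t d)
{
	return n / d + (n % d != 0);
}
```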
John Harrison [Tue, 25 Mar 2025 20:31:11 +0000 (13:31 -0700)]
drm/xe/guc: Reformat dead CT reason string to be devcoredump compatible
The dump on a dead CT tries to emulate the devcoredump formatting (it
would use devcoredump code directly but that requires more re-work to
happen - work in progress). So update the print of the dead CT reason
code to match the format of the 'reason' string that was added to the
actual devcoredump a little while ago.
Badal Nilawar [Thu, 27 Mar 2025 16:19:14 +0000 (21:49 +0530)]
drm/xe/d3cold: Set power state to D3Cold during s2idle/s3
According to pci core guidelines, pci_save_config is recommended when the
driver explicitly needs to set the pci power state. As of now xe kmd is
only doing pci_save_config while entering to s2idle/s3 state, which makes
pci core think that device driver has already applied required pci power
state. This leads to the GPU remaining in D0 state. Fix the issue by
setting the pci power state to D3Cold.
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Tejas Upadhyay [Thu, 27 Mar 2025 12:26:47 +0000 (17:56 +0530)]
drm/xe/hw_engine: define sysfs_ops on all directories
Sysfs_ops needs to be defined on all directories which
can have attr files with set/get method. Add sysfs_ops
even to those directories which are currently empty but
would have attr files with set/get method in future.
Leave .default with default sysfs_ops as it will never
have setter method.
V2(Himal/Rodrigo):
- use single sysfs_ops for all dir and attr with set/get
- add default ops as ./default does not need runtime pm at all
Matthew Brost [Tue, 11 Mar 2025 18:29:15 +0000 (11:29 -0700)]
drm/xe: Use local fence in error path of xe_migrate_clear
The intent of the error path in xe_migrate_clear is to wait on locally
generated fence and then return. The code is waiting on m->fence which
could be the local fence but this is only stable under the job mutex
leading to a possible UAF. Fix code to wait on local fence.
drm/xe: Ensure fixed_slice_mode gets set after ccs_mode change
The RCU_MODE_FIXED_SLICE_CCS_MODE setting is not getting invoked
in the gt reset path after the ccs_mode setting by the user.
Add it to engine register update list (in hw_engine_setup_default_state())
which ensures it gets set in the gt reset and engine reset paths.
v2: Add register update to engine list to ensure it gets updated
after engine reset also.
Thomas Hellström [Wed, 26 Mar 2025 15:16:34 +0000 (16:16 +0100)]
drm/xe: Fix an out-of-bounds shift when invalidating TLB
When the size of the range invalidated is larger than
rounddown_pow_of_two(ULONG_MAX),
the function macro roundup_pow_of_two(length) will hit an out-of-bounds
shift [1].
Use a full TLB invalidation for such cases.
v2:
- Use a define for the range size limit over which we use a full
TLB invalidation. (Lucas)
- Use a better calculation of the limit.
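
A minimal userspace sketch of the guard described in the commit message:
any length above the largest power of two that fits in an unsigned long
cannot be rounded up, so it is routed to a full invalidation instead. The
names are hypothetical, and the limit used here is the one from the first
paragraph, not the refined limit mentioned in v2.

```c
#include <limits.h>
#include <stdbool.h>

/* Largest power of two representable in an unsigned long. */
#define SKETCH_MAX_POW2 (1UL << (sizeof(unsigned long) * CHAR_BIT - 1))

/* True when rounding up would need a shift by the full type width. */
static bool sketch_needs_full_invalidation(unsigned long length)
{
	return length > SKETCH_MAX_POW2;
}

/*
 * Round up to a power of two. Only valid when
 * sketch_needs_full_invalidation(length) is false, so the loop stops
 * before the left shift can overflow.
 */
static unsigned long sketch_roundup_pow2(unsigned long length)
{
	unsigned long p = 1;

	while (p < length)
		p <<= 1;

	return p;
}
```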
Aradhya Bhatia [Wed, 26 Mar 2025 15:19:29 +0000 (20:49 +0530)]
drm/xe/migrate: Switch from drm to dev managed actions
Change the scope of the migrate subsystem to be dev managed instead of
drm managed.
The parent pci struct &device, that the xe struct &drm_device is a part
of, gets removed when a hot unplug is triggered, which causes the
underlying iommu group to get destroyed as well.
The migrate subsystem, which handles the lifetime of the page-table tree
(pt) BO, doesn't get a chance to keep the BO back during the hot unplug,
as all the references to DRM haven't been put back.
When all the references to DRM are indeed put back later, the migrate
subsystem tries to put back the pt BO. Since the underlying iommu group
has been already destroyed, a kernel NULL ptr dereference takes place
while attempting to keep back the pt BO.
Thomas Hellström [Wed, 26 Mar 2025 08:05:51 +0000 (09:05 +0100)]
drm/xe: Make the PT code handle placement per PTE rather than per vma / range
With SVM, ranges forwarded to the PT code for binding can, mostly
due to races when migrating, point to both VRAM and system / foreign
device memory. Make the PT code able to handle that by checking,
for each PTE set up, whether it points to local VRAM or to system
memory.
v2:
- Fix system memory GPU atomic access.
v3:
- Avoid the UAPI change. It needs more thought.
Thomas Hellström [Wed, 26 Mar 2025 08:05:50 +0000 (09:05 +0100)]
drm/xe/migrate: Allow xe_migrate_vram() also on non-pagefault capable devices
The drm_pagemap functionality does not depend on the device having
recoverable pagefaults available. So allow xe_migrate_vram() also for
such devices. Even if this will have little use in practice, it's
beneficial for testing multi-device SVM, since a memory provider could
be a non-pagefault capable gpu.
Thomas Hellström [Wed, 26 Mar 2025 08:05:49 +0000 (09:05 +0100)]
drm/xe/bo: Add a bo remove callback
On device unbind, migrate exported bos, including pagemap bos to
system. This allows importers to take proper action without
disruption. In particular, SVM clients on remote devices may
continue as if nothing happened, and can choose a different
placement.
The evict_flags() placement is chosen in such a way that bos that
aren't exported are purged.
For pinned bos, we unmap DMA, but their pages are not freed yet
since we can't be 100% sure they are not accessed.
All pinned external bos (not just the VRAM ones) are put on the
pinned.external list with this patch. But this only affects the
xe_bo_pci_dev_remove_pinned() function since !VRAM bos are
ignored by the suspend / resume functionality. As a follow-up we
could look at removing the suspend / resume iteration over
pinned external bos since we currently don't allow pinning
external bos in VRAM, and other external bos don't need any
special treatment at suspend / resume.
v2:
- Address review comments. (Matthew Auld).
v3:
- Don't introduce an external_evicted list (Matthew Auld)
- Add a discussion around suspend / resume behaviour to the
commit message.
- Formatting fixes.
v4:
- Move dma-unmaps of pinned kernel bos to a dev managed
callback to give subsystems using these bos a chance to
clean them up. (Matthew Auld)
Thomas Hellström [Wed, 26 Mar 2025 08:05:48 +0000 (09:05 +0100)]
drm/xe/svm: Fix a potential bo UAF
If drm_gpusvm_migrate_to_devmem() succeeds and a cpu access happens to the
range, the bo may be freed before xe_bo_unlock(), causing a UAF.
Since the reference is transferred, use xe_svm_devmem_release() to
release the reference on drm_gpusvm_migrate_to_devmem() failure,
and hold a local reference to protect the UAF.
Nakshtra Goyal [Thu, 27 Feb 2025 10:23:39 +0000 (15:53 +0530)]
drm/xe: Add fault injection for xe_oa_alloc_regs
Add fault injection for xe_oa_alloc_regs to allow it to fail while
executing xe_oa_add_config_ioctl().
This needs to be added as it cannot be reached by injecting error through
IOCTL arguments.
Yue Haibing [Sun, 23 Mar 2025 11:41:03 +0000 (19:41 +0800)]
drm/xe: Fix unmet direct dependencies warning
WARNING: unmet direct dependencies detected for FB_IOMEM_HELPERS
Depends on [n]: HAS_IOMEM [=y] && FB_CORE [=n]
Selected by [m]:
- DRM_XE_DISPLAY [=y] && HAS_IOMEM [=y] && DRM [=m] && DRM_XE [=m] && DRM_XE [=m]=m [=m] && HAS_IOPORT [=y]
DRM_XE_DISPLAY requires FB_IOMEM_HELPERS, but the dependency FB_CORE is
missing. Select FB_IOMEM_HELPERS only if DRM_FBDEV_EMULATION is set, as
other drm drivers do.
Lucas De Marchi [Fri, 14 Mar 2025 13:54:27 +0000 (06:54 -0700)]
drm/xe: Allow to inject error in early probe
Allow to test if driver behaves correctly when xe_pcode_probe_early()
fails. Note that this is not sufficient for testing survivability mode
as it's still required to read the hw to check for errors, which doesn't
happen on an injected failure.
To complete the early probe coverage, allow injection in the other
functions as well: xe_mmio_probe_early() and xe_device_probe_early().
Lucas De Marchi [Fri, 14 Mar 2025 13:54:26 +0000 (06:54 -0700)]
drm/xe: Set survivability mode before heci init
Commit d40f275d96e8 ("drm/xe: Move survivability entirely to xe_pci")
tried to follow the logic: initialize everything needed and if
everything succeeds, set the flag that it's enabled. While it fixed some
corner cases of those calls failing, it was wrong for setting the flag
after the call to xe_heci_gsc_init(): that function does a different
initialization for survivability mode.
Fix that and add comments about this being done on purpose.
Lucas De Marchi [Fri, 14 Mar 2025 13:48:58 +0000 (06:48 -0700)]
drm/xe: Move survivability back to xe
Commit d40f275d96e8 ("drm/xe: Move survivability entirely to xe_pci")
moved the survivability handling to be done entirely in the xe_pci
layer. However there are some issues with that approach:
1) Survivability mode needs at least the mmio initialized, otherwise it
can't really read a register to decide if it should enter that state
2) SR-IOV mode should be initialized, otherwise it's not possible to
check if it's VF
Besides, as pointed out by Riana, the check for
xe_survivability_mode_enable() was wrong in xe_pci_probe() since it's
not a bool return.
Fix that by moving the initialization to be entirely in the xe_device
layer, with the correct dependencies handled: only after mmio and sriov
initialization, and not triggering it on error from
wait_for_lmem_ready(). This restores the trigger behavior before that
commit. The xe_pci layer now only checks for "is it enabled?",
like it's doing in xe_pci_suspend()/xe_pci_remove(), etc.
Lucas De Marchi [Fri, 7 Mar 2025 18:13:22 +0000 (10:13 -0800)]
drm/xe/uc: Add support for different firmware files on each GT
The different GTs on a device can be very different. Right now for all
platforms the same firmware is loaded in each GT, however future
platforms may benefit from loading a different file depending on the GT
type.
Based on previous patch by John Harrison <John.C.Harrison@Intel.com>.
Vinay Belgaumkar [Thu, 20 Mar 2025 17:51:23 +0000 (10:51 -0700)]
drm/xe: Apply Wa_16023105232
The WA requires KMD to disable DOP clock gating during a semaphore
wait and also ensure that idle delay for every CS is lower than the
idle wait time in the PWRCTX_MAXCNT register. Default values for these
registers already comply with this restriction.
v2: Store timestamp_base in gt info and other comments (Daniele)
v3: Skip WA check for VF
v4: Review comments (Matt Roper)
v5: Cleanup the clock functions and use reg_field_get (Matt Roper)
v6: Fix checkpatch issue
v7: Fix CI issue
The `struct ttm_resource->placement` contains TTM_PL_FLAG_* flags, but
it was incorrectly tested for XE_PL_* flags.
This caused xe_dma_buf_pin() to always fail when invoked for
the second time. Fix this by checking the `mem_type` field instead.
Fixes: 7764222d54b7 ("drm/xe: Disallow pinning dma-bufs in VRAM") Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: intel-xe@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v6.8+ Signed-off-by: Tomasz Rusinowicz <tomasz.rusinowicz@intel.com> Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250218100353.2137964-1-jacek.lawrynowicz@linux.intel.com Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Raag Jadav [Wed, 12 Mar 2025 08:59:09 +0000 (14:29 +0530)]
drm/xe/hwmon: expose fan speed
Add hwmon support for fan1_input, fan2_input and fan3_input attributes,
which will expose fan speed of respective channels in RPM when supported
by hardware. With this in place we can monitor fan speed using lm-sensors
tool.
v2: Rely on platform checks instead of mailbox error (Aravind, Rodrigo)
v3: Introduce has_fan_control flag (Rodrigo)
Dave Airlie [Fri, 14 Mar 2025 04:28:39 +0000 (14:28 +1000)]
Merge tag 'mediatek-drm-next-6.15-v2' of https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-next
Mediatek DRM Next for Linux 6.15
1. HDMI fixup and refinement
2. Move to devm_platform_ioremap_resource() usage
3. Add MT8188 dsc compatible
4. Fix config_updating flag never false when no mbox channel
5. dp: drm_err => dev_err in HPD path to avoid NULL ptr
6. Add dpi power-domains example
7. Add MT8365 SoC support
8. Fix error codes in mtk_dsi_host_transfer()
Harish Chegondi [Wed, 12 Mar 2025 17:31:20 +0000 (10:31 -0700)]
drm/xe/eustall: Fix a possible pointer dereference after free
If devm_add_action_or_reset() isn't successful, xe_eu_stall_fini()
is invoked. So, unsuccessful return from devm_add_action_or_reset()
shouldn't dereference gt->eu_stall as xe_eu_stall_fini() already
frees it. Fix this issue.
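The failure path described above can be sketched in userspace C. This is only an illustration: struct gt, struct eu_stall, stall_init(), stall_fini() and mock_add_action_or_reset() are hypothetical stand-ins, not the driver's code. The mock only mimics the behaviour the commit message relies on, namely that the action runs immediately when registration fails.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver structures. */
struct eu_stall { int enabled; };
struct gt { struct eu_stall *eu_stall; };

static int fini_calls;

/* Cleanup action: frees the state, like a *_fini() callback would. */
static void stall_fini(void *arg)
{
	struct gt *gt = arg;

	free(gt->eu_stall);
	gt->eu_stall = NULL;
	fini_calls++;
}

/*
 * Mock with the "or_reset" semantics: if registering the action fails,
 * the action is called right away and an error is returned.
 * The 'fail' parameter is an assumption used to force that path.
 */
static int mock_add_action_or_reset(void (*action)(void *), void *data, int fail)
{
	if (fail) {
		action(data);
		return -ENOMEM;
	}
	return 0;
}

static int stall_init(struct gt *gt, int fail)
{
	int ret;

	gt->eu_stall = calloc(1, sizeof(*gt->eu_stall));
	if (!gt->eu_stall)
		return -ENOMEM;

	ret = mock_add_action_or_reset(stall_fini, gt, fail);
	if (ret)
		return ret;	/* state already freed: do not touch gt->eu_stall */

	gt->eu_stall->enabled = 1;
	return 0;
}
```

The point of the sketch is the early return: after a failed registration the only safe thing to do with the pointer is nothing.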
Tvrtko Ursulin [Fri, 7 Mar 2025 11:13:59 +0000 (11:13 +0000)]
drm/xe: Fix MOCS debugfs LNCF readout
With only XE_FW_GT taken LNCF registers read back as all zeroes, leading
to a wild goose chase trying to figure out why register programming is
incorrect.
Fix it by grabbing XE_FORCEWAKE_ALL for affected platforms.
Lucas De Marchi [Fri, 7 Mar 2025 04:00:05 +0000 (20:00 -0800)]
drm/xe/rtp: Drop sentinels from arg to xe_rtp_process_to_sr()
There's a mismatch on API: while xe_rtp_process_to_sr() processes
entries until an entry without name, the active tracking with
xe_rtp_process_ctx_enable_active_tracking() needs to use the number of
elements. The number of elements is taken everywhere using ARRAY_SIZE(),
but that will have one entry too many. This leads to the following
warning, as reported by lkp:
drivers/gpu/drm/xe/xe_tuning.c: In function 'xe_tuning_dump':
>> include/drm/drm_print.h:228:31: warning: '%s' directive argument is null [-Wformat-overflow=]
228 | drm_printf((printer), "%.*s" fmt, (indent), "\t\t\t\t\tX", ##__VA_ARGS__)
| ^~~~~~
drivers/gpu/drm/xe/xe_tuning.c:226:17: note: in expansion of macro 'drm_printf_indent'
226 | drm_printf_indent(p, 1, "%s\n", engine_tunings[idx].name);
| ^~~~~~~~~~~~~~~~~
That's because it will still process the last entry when tracking the
active tunings. The same issue exists in the WAs. Change
xe_rtp_process_to_sr() to also take the number of elements so the empty
entry can be removed and the warning should go away. Fixing on the
active-tracking side would be more fragile as it would need a `- 1`
everywhere and continue to use a different approach for number of
elements.
Aside from the warning, it's a non-issue as there would always be enough
bits allocated and the last entry would never be active since
xe_rtp_process_to_sr() stops on the sentinel.
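The mismatch between the two counting conventions can be shown with a small C sketch. The struct and table names below are hypothetical and only the name field matters; this is not the driver's RTP table layout.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical stand-in for a table entry. */
struct entry {
	const char *name;
};

/* Old style: the table ends with an empty sentinel entry. */
static const struct entry with_sentinel[] = {
	{ .name = "tuning-a" },
	{ .name = "tuning-b" },
	{ 0 }	/* sentinel: name == NULL */
};

/* New style: no sentinel, the caller passes the element count. */
static const struct entry without_sentinel[] = {
	{ .name = "tuning-a" },
	{ .name = "tuning-b" },
};

/* Walk until the first entry without a name, as a sentinel-based API does. */
static size_t count_until_sentinel(const struct entry *e)
{
	size_t n = 0;

	while (e[n].name)
		n++;
	return n;
}

/* Walk exactly n entries and count those that have a name. */
static size_t count_named(const struct entry *e, size_t n)
{
	size_t i, named = 0;

	for (i = 0; i < n; i++)
		if (e[i].name)
			named++;
	return named;
}
```

With the sentinel table, the sentinel walk sees two entries while ARRAY_SIZE() reports three, so any code indexing by ARRAY_SIZE() can reach the entry whose name is NULL. Without the sentinel both conventions agree.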
Lucas De Marchi [Sat, 8 Mar 2025 01:14:28 +0000 (17:14 -0800)]
drm/gpusvm: Fix kernel-doc
Due to wrong `.. kernel-doc` directive in Documentation/gpu/rfc/gpusvm.rst
the documentation was actually not parsing anything from
drivers/gpu/drm/drm_gpusvm.c. This fixes the kernel-doc include and all
warnings/errors created when doing so.
Cc: Simona Vetter <simona.vetter@ffwll.ch> Cc: Dave Airlie <airlied@redhat.com> Cc: Christian König <christian.koenig@amd.com> Cc: dri-devel@lists.freedesktop.org Cc: Matthew Brost <matthew.brost@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Closes: https://lore.kernel.org/intel-xe/20250307195239.57abcd2d@canb.auug.org.au/ Fixes: 99624bdff867 ("drm/gpusvm: Add support for GPU Shared Virtual Memory") Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250307-fix-svm-kerneldoc-v2-1-03c74b199620@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 4da1fb61e02a783fdd7eb725ea03d897b8ef19ea) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Rodrigo Vivi [Thu, 6 Mar 2025 22:06:43 +0000 (17:06 -0500)]
drm/xe/guc_pc: Remove duplicated pc_start call
xe_guc_pc_start() was getting called from both
xe_uc_init_hw() and from xe_guc_start().
But both are called from do_gt_restart() and only
xe_uc_init_hw() is called at initialization.
So, let's remove the duplication in the regular gt_restart
path.
The only place where xe_guc_pc_start() won't get called now
is on the gt_reset failure path. However, if gt_reset has
failed, it is really unlikely that the PC start will work
or is desired.
Dan Carpenter [Wed, 8 Jan 2025 09:35:57 +0000 (12:35 +0300)]
drm/mediatek: dsi: fix error codes in mtk_dsi_host_transfer()
There is a type bug in the return statement:
return ret < 0 ? ret : recv_cnt;
The issue is that ret is an int, recv_cnt is a u32 and the function
returns ssize_t, which is a signed long. The way that the type promotion
works is that the negative error codes are first cast to u32 and then
to signed long. The error codes end up being positive instead of
negative and the callers treat them as success.
Fixes: 81cc7e51c4f1 ("drm/mediatek: Allow commands to be sent during video mode") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/r/202412210801.iADw0oIH-lkp@intel.com/ Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Mattijs Korpershoek <mkorpershoek@baylibre.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: CK Hu <ck.hu@mediatek.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/b754a408-4f39-4e37-b52d-7706c132e27f@stanley.mountain/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
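The promotion described in this commit can be reproduced in plain C. This is a sketch under assumptions: int64_t stands in for ssize_t on a 64-bit build, uint32_t for u32, and the function names are made up for the illustration. The second function shows one possible way to avoid mixing the types, not necessarily the exact change that was applied.

```c
#include <stdint.h>

/*
 * Buggy form: int and uint32_t meet in one conditional expression. The
 * usual arithmetic conversions make the result uint32_t, so a negative
 * ret becomes a large unsigned value before being widened to 64 bits.
 */
static int64_t transfer_result_buggy(int ret, uint32_t recv_cnt)
{
	return ret < 0 ? ret : recv_cnt;
}

/* One possible fix: return the error before the types are mixed. */
static int64_t transfer_result_fixed(int ret, uint32_t recv_cnt)
{
	if (ret < 0)
		return ret;
	return recv_cnt;
}
```

With a 32-bit int, transfer_result_buggy(-22, 0) yields 4294967274, a positive value that a caller checking for "< 0" treats as success, while transfer_result_fixed(-22, 0) yields -22.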
Fabien Parent [Fri, 10 Jan 2025 13:31:11 +0000 (14:31 +0100)]
dt-bindings: display: mediatek: dpi: add power-domains example
DPI is part of the display / multimedia block in MediaTek SoCs, and
always has a power-domain (at least in the upstream device-trees).
Add the power-domains property to the binding example.
Fixes: 9273cf7d3942 ("dt-bindings: display: mediatek: convert the dpi bindings to yaml") Signed-off-by: Fabien Parent <fparent@baylibre.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Acked-by: Rob Herring (Arm) <robh@kernel.org> Reviewed-by: CK Hu <ck.hu@mediatek.com> Signed-off-by: Alexandre Mergnat <amergnat@baylibre.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20231023-display-support-v7-1-6703f3e26831@baylibre.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Douglas Anderson [Thu, 16 Jan 2025 17:42:50 +0000 (09:42 -0800)]
drm/mediatek: dp: drm_err => dev_err in HPD path to avoid NULL ptr
The function mtk_dp_wait_hpd_asserted() may be called before the
`mtk_dp->drm_dev` pointer is assigned in mtk_dp_bridge_attach().
Specifically it can be called via this callpath:
- mtk_edp_wait_hpd_asserted
- [panel probe]
- dp_aux_ep_probe
Using "drm" level prints anywhere in this callpath causes a NULL
pointer dereference. Change the error message directly in
mtk_dp_wait_hpd_asserted() to dev_err() to avoid this. Also change the
error messages in mtk_dp_parse_capabilities(), which is called by
mtk_dp_wait_hpd_asserted().
While touching these prints, also add the error code to them to make
future debugging easier.
Jason-JH Lin [Mon, 24 Feb 2025 05:12:21 +0000 (13:12 +0800)]
drm/mediatek: Fix config_updating flag never false when no mbox channel
When CONFIG_MTK_CMDQ is enabled, if the display is controlled by the CPU
while other hardware is controlled by the GCE, the display will encounter
a mbox request channel failure.
However, it will still enter the CONFIG_MTK_CMDQ statement, causing the
config_updating flag to never be set to false. As a result, no page flip
event is sent back to user space, and the screen does not update.
Dave Airlie [Wed, 12 Mar 2025 21:54:40 +0000 (07:54 +1000)]
Merge tag 'drm-intel-gt-next-2025-02-26' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next
UAPI Changes:
- Add sysfs for SLPC power profiles [slpc] (Vinay Belgaumkar)
Driver Changes:
Fixes/improvements/new stuff:
- Fix zero delta busyness issue [pmu] (Umesh Nerlige Ramappa)
- Fix page cleanup on DMA remap failure (Brian Geffon)
- Debug print LRC state entries only if the context is pinned [guc] (Daniele Ceraolo Spurio)
- Drop custom hotplug code [pmu] (Lucas De Marchi)
- Use spin_lock_irqsave() in interruptible context [guc] (Krzysztof Karas)
- Add wait on depth stall done bit handling [gen12] (Juha-Pekka Heikkila)
Miscellaneous:
- Change throttle criteria for rps [selftest] (Raag Jadav)
- Add debug print about hw config table size (John Harrison)
- Include requested frequency in slow firmware load messages [uc] (John Harrison)
- Remove i915_pmu_event_event_idx() [pmu] (Lucas De Marchi)
- Remove unused live_context_for_engine (Dr. David Alan Gilbert)
- Add Wa_22010465259 in its respective WA list (Ranu Maurya)
- Correct frequency handling in RPS power measurement [selftests] (Sk Anirban)
- Add helper function slpc_measure_power [guc/slpc] (Sk Anirban)
- Revert "drm/i915/gt: Log reason for setting TAINT_WARN at reset" [gt] (Sebastian Brzezinka)
- Avoid using uninitialized context [selftests] (Krzysztof Karas)
- Use struct_size() helper in kmalloc() (luoqing)
- Use prandom in selftest [selftests] (Markus Theil)
- Replace kmap with its safer kmap_local_page counterpart [gt] (Andi Shyti)
Merges:
- Merge drm/drm-next into drm-intel-gt-next (Tvrtko Ursulin)
Michal Wajdeczko [Tue, 11 Mar 2025 11:40:41 +0000 (12:40 +0100)]
drm/xe/vf: Don't check CTC_MODE[0] if VF
Starting from commit 18778b5fdd01 ("drm/xe: Eliminate usage of
TIMESTAMP_OVERRIDE") we access the CTC_MODE register only to warn
if it has an undocumented value. There is no point in doing that on
the VF driver. While here, move this check to a helper function.
Michal Wajdeczko [Tue, 11 Mar 2025 13:57:26 +0000 (14:57 +0100)]
drm/xe/vf: Catch all unexpected register reads
While we can only mimic read32 for a few GT registers for which
the PF shared the values, we shouldn't skip calling the helper code
when we try to access a non-GT register, as we would then fail to
trigger a debug warning. For cases where sriov_vf_gt is not set, just
use primary_gt instead.
Michal Wajdeczko [Tue, 11 Mar 2025 13:57:25 +0000 (14:57 +0100)]
drm/xe/vf: Don't try Driver-FLR if VF
Driver-FLR can't be triggered from the VF driver, so treat it
as disabled if VF. While at it, also fix the message, as it
shouldn't be printed just 'once' as we may have many devices.
Michal Wajdeczko [Tue, 11 Mar 2025 10:52:21 +0000 (11:52 +0100)]
drm/xe/vf: Unblock xe_rtp_process_to_sr for VFs
In commit 9632dfb0def4 ("drm/xe/vf: Don't run any save-restore
RTP actions if VF") we disabled processing of all RTP rules if
we were running as a VF, since many of the RTP actions were
trying to access registers inaccessible to VFs.
This also included all of the LRC WA rules, since some of them were
implemented in a way that required an RMW pattern.
Now, as we can program LRC WAs without accessing such registers
from the driver, relying on the MI_MATH instruction instead, we
can unblock xe_rtp_process_to_sr() for VFs.
Currently we are blocking processing of all save-restore rules
by the VFs inside the xe_rtp_process_to_sr() function, but we
want to unblock that to allow processing of the LRC WA rules.
To avoid hitting WARNs about VFs reading inaccessible registers,
stop applying save-restore MMIO actions if VF, without relying on
the SR list always being empty for the VF.
drm/xe: Avoid reading RMW registers in emit_wa_job
To allow VFs to properly handle LRC WAs, we should postpone
all RMW register operations and let them be run by the engine
itself, since any attempt to read registers from within the
driver will fail on the VF. Use MI_MATH and ALU for that.
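For reference, the arithmetic behind a read-modify-write can be written as a one-line C helper. This is only a sketch of the value the engine would have to compute on its own; the function and parameter names are hypothetical, and the actual MI_MATH command sequence is not shown in this log.

```c
#include <stdint.h>

/*
 * Read-modify-write arithmetic: clear the bits in clr_bits from the
 * current register value, then OR in set_bits. A CPU-side
 * implementation needs 'old' from a register read, which is exactly
 * the step a VF driver cannot perform.
 */
static uint32_t rmw_value(uint32_t old, uint32_t clr_bits, uint32_t set_bits)
{
	return (old & ~clr_bits) | set_bits;
}
```

Moving this computation to the engine means the driver only supplies the masks and never needs to know the old value.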
drm/xe: Add MI_MATH and ALU instruction definitions
The command streamer implements an Arithmetic Logic Unit (ALU)
which supports basic arithmetic and logical operations on two
64-bit operands. Access to this ALU is through the MI_MATH command
and sixteen 64-bit General Purpose Registers (GPRs),
which are used as temporary storage.
Dave Airlie [Tue, 11 Mar 2025 02:15:48 +0000 (12:15 +1000)]
Merge tag 'drm-intel-next-2025-03-10' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next
drm/i915 feature pull #2 for v6.15:
Features and functionality:
- FBC dirty rectangle support for display version 30+ (Vinod)
- Update plane scalers via DSB based commits (Ville)
- Move runtime power status info to display power debugfs (Jani)
Refactoring and cleanups:
- Convert i915 and xe to DRM client setup (Thomas)
- Refactor and clean up CDCLK/bw/dbuf readout/sanitation (Ville)
- Conversions from drm_i915_private to struct intel_display (Jani, Suraj)
- Refactor display reset for better separation between display and core (Jani)
- Move panel fitter code together (Jani)
- Add mst and hdcp sub-structs to display structs for clarity (Jani)
- Header refactoring to clarify separation between display and i915 core (Jani)
Fixes:
- Fix DP MST max stream count to match number of pipes (Jani)
- Fix encoder HW state readout of DP MST UHBR (Imre)
- Fix ICL+ combo PHY cursor and coeff polarity programming (Ville)
- Fix pipeDMC and ATS fault handling (Ville)
- Display workarounds (Gustavo)
- Remove duplicate forward declaration (Vinod)
- Improve POWER_DOMAIN_*() macro type safety (Gustavo)
- Move CDCLK post plane programming later (Ville)
Dave Airlie [Tue, 11 Mar 2025 00:26:08 +0000 (10:26 +1000)]
Merge tag 'drm-xe-next-2025-03-07' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next
UAPI Changes:
- Expose per-engine activity via perf pmu (Riana, Lucas, Umesh)
- Add support for EU stall sampling (Harish, Ashutosh)
- Allow userspace to provide low latency hint for submission (Tejas)
- GPU SVM and Xe SVM implementation (Matthew Brost)
Cross-subsystem Changes:
- devres handling for component drivers (Lucas)
- Backmerge drm-next to allow cross dependent change with i915
- GPU SVM and Xe SVM implementation (Matthew Brost)
Core Changes:
Driver Changes:
- Fixes to userptr and missing validations (Matthew Auld, Thomas
Hellström, Matthew Brost)
- devcoredump typos and error handling improvement (Shuicheng)
- Allow oa_exponent value of 0 (Umesh)
- Finish moving device probe to devm (Lucas)
- Fix race between submission restart and scheduled being freed (Tejas)
- Fix counter overflows in gt_stats (Francois)
- Refactor and add missing workarounds and tunings for pre-Xe2 platforms
(Aradhya, Tvrtko)
- Fix PXP locks interaction with exec queues being killed (Daniele)
- Eliminate TIMESTAMP_OVERRIDE from xe (Matt Roper)
- Change xe_gen_wa_oob to allow building on MacOS (Daniel Gomez)
- New workarounds for Panther Lake (Tejas)
- Fix VF resume errors (Satyanarayana)
- Fix workaround infra skipping some workarounds dependent on engine
initialization (Tvrtko)
- Improve per-IP descriptors (Gustavo)
- Add more error injections to probe sequence (Francois)
Dave Airlie [Tue, 11 Mar 2025 00:19:06 +0000 (10:19 +1000)]
Merge tag 'drm-msm-next-2025-03-09' of https://gitlab.freedesktop.org/drm/msm into drm-next
Updates for v6.15
GPU:
- Fix obscure GMU suspend failure
- Expose syncobj timeline support
- Extend GPU devcoredump with pagetable info
- a623 support
- Fix a6xx gen1/gen2 indexed-register blocks in gpu snapshot / devcoredump
Display:
- Add cpu-cfg interconnect paths on SM8560 and SM8650
- Introduce KMS OMMU fault handler, causing devcoredump snapshot
- Fixed error pointer dereference in msm_kms_init_aspace()
DPU:
- Fix mode_changing handling
- Add writeback support on SM6150 (QCS615)
- Fix DSC programming in 1:1:1 topology
- Reworked hardware resource allocation, moving it to the CRTC code
- Enabled support for Concurrent WriteBack (CWB) on SM8650
- Enabled CDM blocks on all relevant platforms
- Reworked debugfs interface for BW/clocks debugging
- Clear perf params before calculating bw
- Support YUV formats on writeback
- Fixed double inclusion
- Fixed writeback in YUV formats when using cloned output, Dropped
wb2_formats_rgb
- Corrected dpu_crtc_check_mode_changed and struct dpu_encoder_virt
kerneldocs
- Fixed uninitialized variable in dpu_crtc_kickoff_clone_mode()
Matt Roper [Fri, 7 Mar 2025 19:07:55 +0000 (11:07 -0800)]
drm/xe/xe3: Recognize 3DSTATE_COARSE_PIXEL in LRC dumps
Xe3 adds a new 3DSTATE_COARSE_PIXEL state instruction as part of the
render engine LRC. Ensure we can recognize and report this properly in
the LRC dumps.