Suraj Kandpal [Wed, 6 Mar 2024 02:42:48 +0000 (08:12 +0530)]
drm/xe/hdcp: Enable HDCP for XE
Enable HDCP for Xe by defining functions which take care of
interaction of HDCP as a client with the GSC CS interface.
Add intel_hdcp_gsc_message to Makefile and add corresponding
changes to xe_hdcp_gsc.c to make it build.
--v2
-add kfree at appropriate place [Daniele]
-remove useless define [Daniele]
-move host session logic to xe_gsc_submit.c [Daniele]
-call xe_gsc_check_and_update_pending directly in an if condition
[Daniele]
-use xe_device instead of drm_i915_private [Daniele]
--v3
-use xe prefix for newly exposed function [Daniele]
-remove client specific defines from intel_gsc_mtl_header [Daniele]
-add missing kfree() [Daniele]
-have NULL check for hdcp_message in finish function [Daniele]
-dont have too many variable declarations in the same line [Daniele]
--v4
-don't point the hdcp_message structure in xe_device to anything
until it properly gets initialized [Daniele]
Suraj Kandpal [Wed, 6 Mar 2024 02:42:46 +0000 (08:12 +0530)]
drm/xe/hdcp: Use xe_device struct
Use xe_device struct instead of drm_i915_private so as to not
cause confusion and comply with Xe standards as drm_i915_private is
xe_device under the hood.
Matthew Brost [Thu, 29 Feb 2024 19:45:20 +0000 (11:45 -0800)]
drm/xe: Do not grab forcewakes when issuing GGTT TLB invalidation via GuC
Forcewakes are not required for communication with the GuC via CTB
as it is a memory based interfaced. Acquring forcewakes takes
considerable time. With that, do not grab a forcewake when issuing a
GGTT TLB invalidation via the GuC.
Matt Roper [Wed, 6 Mar 2024 00:40:49 +0000 (16:40 -0800)]
drm/xe/arl: Add Arrow Lake H support
ARL-H uses the same media and display IP as MTL, and a version 12.74
graphics IP (referred to as Xe_LPG+). From a driver point of view, we
should be able to just treat the whole platform as MTL and rely on
GRAPHICS_VERx100 checks to handle any spots where ARL's Xe_LPG+ needs
different handling from MTL's Xe_LPG (i.e., workarounds).
v2: Resolve conflict and Reorder PCI ids in sorted order
v3: Append signed-off-by commiter to this commit
Matt Roper [Thu, 29 Feb 2024 07:08:05 +0000 (12:38 +0530)]
drm/xe/xelpg: Extend some workarounds to graphics version 12.74
A handful of Xe_LPG workarounds are also relevant to graphics version
12.74 as well. Extend the graphics version range for these workarounds
accordingly.
Matt Roper [Thu, 29 Feb 2024 07:08:04 +0000 (12:38 +0530)]
drm/xe/xelpg: Recognize graphics version 12.74 as Xe_LPG
Graphics version 12.74 (which is technically called "Xe_LPG+") should be
handled the same as versions Xe_LPG 12.70/12.71 by the KMD. Only the
workaround lists (handled in the next patch) will be a bit different.
Rodrigo Vivi [Fri, 1 Mar 2024 18:05:25 +0000 (13:05 -0500)]
drm/xe: Convert xe_pm_runtime_{get, put} to void and protect from recursion
With mem_access going away and pm_runtime getting called instead,
we need to protect these against recursions.
The put is asynchronous so there's no need to block it. However, for a
proper balance, we need to ensure that the references are taken and
restored regardless of the flow. So, let's convert them all to void and
use some direct linux/pm_runtime functions.
Rodrigo Vivi [Fri, 1 Mar 2024 18:05:24 +0000 (13:05 -0500)]
drm/xe: Create a xe_pm_runtime_resume_and_get variant for display
Introduce the resume and get to fulfill the display need for checking
if the device was actually resumed (or it is awake) and the reference
was taken.
Then we can convert the remaining cases to a void function and have
individual functions for individual cases.
Also, already start this new function protected from the runtime
recursion, since runtime_pm will need to call for display functions
for a proper D3Cold flow.
Rodrigo Vivi [Fri, 1 Mar 2024 18:05:23 +0000 (13:05 -0500)]
drm/xe: Fix display runtime_pm handling
i915's intel_runtime_pm_get_if_in_use actually calls the
pm_runtime_get_if_active() with ign_usage_count = false, but Xe
was erroneously calling it with true because of the mem_access cases.
This can lead to unnecessary references getting hold here and device
never getting into the runtime suspended state.
Let's use directly the 'if_in_use' function provided by linux/pm_runtime.
Also, already start this new function protected from the runtime
recursion, since runtime_pm will need to call for display functions
for a proper D3Cold flow.
v2: Update commit message based on Matt's feedback.
Fix return condition of pm_runtime_get_if_in_use (Matt)
Lucas De Marchi [Wed, 28 Feb 2024 06:10:48 +0000 (22:10 -0800)]
drm/xe/mocs: Fix DG2 kunit
LNCFCMOCS31[31:16] is read-only for DG2 and MTL, so it's not possible to
check set it. While trying to set doesn't cause any issue, later when
it's read back to check if the value got correctly recorded causes the
test to fail. Now that test is reliable for an odd number of entries,
reduce it so the last entry is ignored.
Bspec: 55267 Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1253 Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1233 Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240228061048.3661978-6-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Lucas De Marchi [Wed, 28 Feb 2024 06:10:47 +0000 (22:10 -0800)]
drm/xe/mocs: Allow odd number of entries on test
Refactor the mocs/l3cc kunit test to support odd number of entries. This
switches out from the "check the register value" approach to check the
entry value if it makes sense from the register read. This provides an
easier output to reason about and cross check with bspec.
Some code reordering and variable re-use was also done so the 2
functions follow more or less the same logic.
Lucas De Marchi [Wed, 28 Feb 2024 06:10:46 +0000 (22:10 -0800)]
drm/xe/mocs: Move warn/assertion up
The warn-once in __init_mocs_table() to make sure there's an index set
for unused entries is more a sanity check that should be done as the
first thing in that function. The kunit test replicates the same check,
so also move it up and turn it into a failure condition for the test.
Lucas De Marchi [Wed, 28 Feb 2024 06:10:44 +0000 (22:10 -0800)]
drm/xe/mocs: Refactor mocs/l3cc loop
There's no reason to keep the assignment an condition in the same
statement, particularly making use of the comma operator. Improve
readability by doing each step on its own statement. This will make
supporting odd number of entries more easily.
Matthew Brost [Sun, 25 Feb 2024 00:14:48 +0000 (16:14 -0800)]
drm/xe: Fix build error in xe_ggtt.c
Need to include io-64-nonatomic-lo-hi.h for writeq function.
Commit 3121fed0c51b ("drm/xe: Cleanup some layering in GGTT")
removed the xe_mmio.h include so lost the indirect include. Add it
where it's needed.
Fixes: 3121fed0c51b ("drm/xe: Cleanup some layering in GGTT") Closes: https://lore.kernel.org/oe-kbuild-all/202402241903.R5J8hKVI-lkp@intel.com/ Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240225001448.81513-1-matthew.brost@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Matt Roper [Thu, 22 Feb 2024 18:40:08 +0000 (10:40 -0800)]
drm/xe: Add LRC parsing for more GPU instructions
The LRCs on some of our newer platforms appear to contain a few GPU
instructions that weren't handled in our LRC parser. Add the relevant
instruction names and opcodes so that our debugfs LRC dumps will
properly indicate what these are.
Arnd Bergmann [Mon, 26 Feb 2024 12:46:38 +0000 (13:46 +0100)]
drm/xe/xe2: fix 64-bit division in pte_update_size
This function does not build on 32-bit targets when the compiler
fails to reduce DIV_ROUND_UP() into a shift:
ld.lld: error: undefined symbol: __aeabi_uldivmod
>>> referenced by xe_migrate.c
>>> drivers/gpu/drm/xe/xe_migrate.o:(pte_update_size) in archive vmlinux.a
There are two instances in this function. Change the first to
use an open-coded shift with the same behavior, and the second
one to a 32-bit calculation, which is sufficient here as the size
is never more than 2^32 pages (16TB).
Arnd Bergmann [Mon, 26 Feb 2024 12:46:37 +0000 (13:46 +0100)]
drm/xe/mmio: fix build warning for BAR resize on 32-bit
clang complains about a nonsensical test on builds with a 32-bit phys_addr_t,
which means resizing will always fail:
drivers/gpu/drm/xe/xe_mmio.c:109:23: error: result of comparison of constant 4294967296 with expression of type 'resource_size_t' (aka 'unsigned int') is always false [-Werror,-Wtautological-constant-out-of-range-compare]
109 | root_res->start > 0x100000000ull)
| ~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~
Previously, BAR resize was always disallowed on 32-bit kernels, but
this apparently changed recently. Since 32-bit machines can in theory
support PAE/LPAE for large address spaces, this may end up useful,
so change the driver to shut up the warning but still work when
phys_addr_t/resource_size_t is 64 bit wide.
Fixes: 9a6e6c14bfde ("drm/xe/mmio: Use non-atomic writeq/readq variant for 32b") Fixes: 237412e45390 ("drm/xe: Enable 32bits build") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240226124736.1272949-2-arnd@kernel.org Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Mika Kuoppala [Thu, 15 Feb 2024 18:11:52 +0000 (20:11 +0200)]
drm/xe: Deny unbinds if uapi ufence pending
If user fence was provided for MAP in vm_bind_ioctl
and it has still not been signalled, deny UNMAP of said
vma with EBUSY as long as unsignalled fence exists.
This guarantees that MAP vs UNMAP sequences won't
escape under the radar if we ever want to track the
client's state wrt to completed and accessible MAPs.
By means of intercepting the ufence release signalling.
v2: find ufence with num_fences > 1 (Matt)
v3: careful on clearing vma ufence (Matt)
Mika Kuoppala [Thu, 15 Feb 2024 18:11:51 +0000 (20:11 +0200)]
drm/xe: Expose user fence from xe_sync_entry
By allowing getting reference to user fence, we can
control the lifetime outside of sync entries.
This is needed to allow vma to track the associated
user fence that was provided with bind ioctl.
v2: xe_user_fence can be kept opaque (Jani, Matt)
v3: indent fix (Matt)
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240215181152.450082-2-mika.kuoppala@linux.intel.com
Matthew Brost [Fri, 23 Feb 2024 20:46:59 +0000 (12:46 -0800)]
drm/xe/guc: Handle timing out of signaled jobs gracefully
Timing out of signaled jobs can happen during regular operations (e.g.
an exec queue closed immediately after last fence signaled). The TDR can
pass the worker which free jobs. Rather than running through the TDR if
signaled job is found, simply free it without any debug messages.
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reported-by: José Roberto de Souza <jose.souza@intel.com> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1271 Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Tested-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240223204659.40750-1-matthew.brost@intel.com
Paulo Zanoni [Thu, 15 Feb 2024 00:53:53 +0000 (16:53 -0800)]
drm/xe: get rid of MAX_BINDS
Mesa has been issuing a single bind operation per ioctl since xe.ko
changed to GPUVA due xe.ko bug #746. If I change Mesa to try again to
issue every single bind operation it can in the same ioctl, it hits
the MAX_BINDS assertion when running Vulkan conformance tests.
Test dEQP-VK.sparse_resources.transfer_queue.3d.rgba32i.1024_128_8
issues 960 bind operations in a single ioctl, it's the most I could
find in the conformance suite.
I don't see a reason to keep the MAX_BINDS restriction: it doesn't
seem to be preventing any specific issue. If the number is too big for
the memory allocations, then those will fail. Nothing related to
num_binds seems to be using the stack. Let's just get rid of it.
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Testcase: dEQP-VK.sparse_resources.transfer_queue.3d.rgba32i.1024_128_8
References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/746 Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240215005353.1295420-1-paulo.r.zanoni@intel.com
Matthew Brost [Mon, 26 Feb 2024 15:55:54 +0000 (07:55 -0800)]
drm/xe: Use vmalloc for array of bind allocation in bind IOCTL
Use vmalloc in effort to allow a user pass in a large number of binds in
an IOCTL (mesa use case). Also use array allocations rather open coding
the size calculation.
Francois Dugast [Thu, 8 Feb 2024 18:35:39 +0000 (10:35 -0800)]
drm/xe: Extend uAPI to query HuC micro-controler firmware version
The infrastructure to query GuC firmware version is already in place. It
is extended with a new micro-controller type to query the HuC firmware
version. It can be used from user space to know if HuC is running.
Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Francois Dugast <francois.dugast@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240208183539.185095-2-jose.souza@intel.com
Rodrigo Vivi [Thu, 22 Feb 2024 16:39:36 +0000 (11:39 -0500)]
drm/xe: Convert gt_reset from mem_access to xe_pm_runtime
We need to ensure that device is in D0 on any kind of GT reset.
We are likely already protected by outer bounds like exec,
but if exec/sched ref gets dropped on a hang, we might transition
to D3 before we are able to perform the gt_reset and recover.
Rodrigo Vivi [Thu, 22 Feb 2024 16:39:35 +0000 (11:39 -0500)]
drm/xe: Remove mem_access from suspend and resume functions
At these points, we are sure that device is awake in D0.
Likely in the middle of the transition, but awake. So,
these extra protections are useless. Let's remove it and
continue with the killing of xe_device_mem_access.
Rodrigo Vivi [Thu, 22 Feb 2024 16:39:31 +0000 (11:39 -0500)]
drm/xe: Replace dma_buf mem_access per direct xe_pm_runtime calls
Continue on the path to entirely remove mem_access helpers in
favour of the direct xe_pm_runtime calls. This item is one of
the direct outer bounds of the protection.
Rodrigo Vivi [Thu, 22 Feb 2024 16:39:28 +0000 (11:39 -0500)]
drm/xe: Runtime PM wake on every sysfs call
Let's ensure our PCI device is awaken on every sysfs call.
Let's increase the runtime_pm protection and start moving
that to the outer bounds.
For now, for the files with small number of attr functions,
let's only call the runtime pm functions directly.
For the hw_engines entries with many files, let's add
the sysfs_ops wrapper.
Rodrigo Vivi [Thu, 22 Feb 2024 16:39:27 +0000 (11:39 -0500)]
drm/xe: Convert kunit tests from mem_access to xe_pm_runtime
Let's convert the kunit tests that are currently relying on
xe_device_mem_access_{get,put} towards the direct xe_pm_runtime_{get,put}.
While doing this we need to move the get/put calls towards the outer
bounds of the tests to ensure consistency with the other usages of
pm_runtime on the regular paths.
v2: include xe_pm.h in tests/xe_mocs.c and sort the include block
while at it.
Rodrigo Vivi [Thu, 22 Feb 2024 16:39:26 +0000 (11:39 -0500)]
drm/xe: Runtime PM wake on every IOCTL
Let's ensure our PCI device is awaken on every IOCTL entry.
Let's increase the runtime_pm protection and start moving
that to the outer bounds.
v2: minor typo fix and renaming function to make it clear
that is intended to be used by ioctl only. (Matt)
v3: Make it NULL if CONFIG_COMPAT is not selected.
Rodrigo Vivi [Thu, 22 Feb 2024 16:39:25 +0000 (11:39 -0500)]
drm/xe: Convert mem_access assertion towards the runtime_pm state
The mem_access helpers are going away and getting replaced by
direct calls of the xe_pm_runtime_{get,put} functions. However, an
assertion with a warning splat is desired when we hit the worst
case of a memory access with the device really in the 'suspended'
state.
Also, this needs to be the first step. Otherwise, the upcoming
conversion would be really noise with warn splats of missing mem_access
gets.
Matthew Brost [Thu, 22 Feb 2024 23:20:21 +0000 (15:20 -0800)]
drm/xe: Don't support execlists in xe_gt_tlb_invalidation layer
The xe_gt_tlb_invalidation layer implements TLB invalidations for a GuC
backend. Simply return if in execlists mode. A follow up may properly
implement the xe_gt_tlb_invalidation layer for both GuC and execlists.
Matthew Brost [Thu, 22 Feb 2024 23:20:19 +0000 (15:20 -0800)]
drm/xe: Fix execlist splat
Although execlist submission is not supported it should be kept in a
basic working state as it can be used for very early hardware bring up.
Fix the below splat.
Francois Dugast [Thu, 22 Feb 2024 23:23:56 +0000 (18:23 -0500)]
drm/xe/uapi: Remove unused flags
Those cases missed in previous uAPI cleanups were mostly accidentally
brought in from i915 or created to exercise the possibilities of gpuvm
but they are not used by userspace yet, so let's remove them. They can
still be brought back later if needed.
v2:
- Fix XE_VM_FLAG_FAULT_MODE support in xe_lrc.c (Brian Welty)
- Leave DRM_XE_VM_BIND_OP_UNMAP_ALL (José Roberto de Souza)
- Ensure invalid flag values are rejected (Rodrigo Vivi)
v3: Rebase after removal of persistent exec_queues (Francois Dugast)
v4: Rodrigo: Rebase after the new dumpable flag.
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240222232356.175431-1-rodrigo.vivi@intel.com
the preferred way in the kernel is to use the struct_size() helper to
do the arithmetic instead of the argument "size + size * count" in the
kzalloc() function.
This way, the code is more readable and more safer.
Lucas De Marchi [Thu, 22 Feb 2024 14:41:24 +0000 (06:41 -0800)]
drm/xe: Use pointers in trace events
Commit a0df2cc858c3 ("drm/xe/xe_bo_move: Enhance xe_bo_move trace")
inadvertently reverted commit 8d038f49c1f3 ("drm/xe: Fix cast on trace
variable"), breaking the build on 32bits.
As noted by Ville, there's no point in converting the pointers to u64
and add casts everywhere. In fact, it's better to just use %p and let
the address be hashed. Convert all the cases in xe_trace.h to use
pointers.
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Priyanka Dandamudi <priyanka.dandamudi@intel.com> Cc: Oak Zeng <oak.zeng@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240222144125.2862546-1-lucas.demarchi@intel.com
Dafna Hirschfeld [Wed, 21 Feb 2024 08:36:22 +0000 (10:36 +0200)]
drm/xe: Do not include current dir for generated/xe_wa_oob.h
The generated file 'generated/xe_wa_oob.h' is included using:
"generated/xe_wa_oob.h"
which first look inside the source code. But the file resides
in the build directory and should therefore be included using:
<generated/xe_wa_oob.h>
drm/xe: Implement VM snapshot support for BO's and userptr
Since we cannot immediately capture the BO's and userptr, perform it in
2 stages. The immediate stage takes a reference to each BO and userptr,
while a delayed worker captures the contents and then frees the
reference.
This is required because in signaling context, no locks can be taken, no
memory can be allocated, and no waits on userspace can be performed.
With the delayed worker, all of this can be performed very easily,
without having to resort to hacks.
Changes since v1:
- Fix crash on NULL captured vm.
- Use ascii85_encode to capture BO contents and save some space.
- Add length to coredump output for each captured area.
Changes since v2:
- Dump each mapping on their own line, to simplify tooling.
- Fix null pointer deref in xe_vm_snapshot_free.
Changes since v3:
- Don't add uninitialized value to snap->ofs. (Souza)
- Use kernel types for u32 and u64.
- Move snap_mutex destruction to final vm destruction. (Souza)
Changes since v4:
- Remove extra memset. (Souza)
Add the flag XE_VM_BIND_FLAG_DUMPABLE to notify devcoredump that this
mapping should be dumped.
This is not hooked up, but the uapi should be ready before merging.
It's likely easier to dump the contents of the bo's at devcoredump
readout time, so it's better if the bos will stay unmodified after
a hang. The NEEDS_CPU_MAPPING flag is removed as requirement.
Ashutosh Dixit [Tue, 6 Feb 2024 19:27:31 +0000 (11:27 -0800)]
drm/xe/xe_gt_idle: Drop redundant newline in name
Newline in name is redunant and produces an unnecessary empty line during
'cat name'. Newline is added during sysfs_emit. See '27a1a1e2e47d ("drm/xe:
stringify the argument to avoid potential vulnerability")'.
v2: Add Fixes tag (Riana)
Fixes: 7b076d14f21a ("drm/xe/mtl: Add support to get C6 residency/status of MTL") Reviewed-by: Riana Tauro <riana.tauro@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Michał Winiarski [Mon, 19 Feb 2024 13:05:30 +0000 (14:05 +0100)]
drm/xe: Initialize GuC earlier during probe
SR-IOV VF has limited access to MMIO registers. Fortunately, it is able
to access a curated subset that is needed to initialize the driver by
communicating with SR-IOV PF using GuC CT.
Initialize GuC earlier in order to keep the unified probe ordering
between VF and PF modes.
Michał Winiarski [Mon, 19 Feb 2024 13:05:29 +0000 (14:05 +0100)]
drm/xe/guc: Move GuC power control init to "post-hwconfig"
SLPC is not used at "hwconfig" stage. Move the initialization of data
structures used for SLPC to a later point in probe.
Also - move the xe_guc_pc_init_early to happen just prior to initial
"hwconfig" load.
Michał Winiarski [Mon, 19 Feb 2024 13:05:28 +0000 (14:05 +0100)]
drm/xe/huc: Realloc HuC FW in vram for post-hwconfig
Similar to GuC, we're using system memory for the initial stage, and
move the image to vram when it's available for subsequent loads (e.g.
after reset).
Michał Winiarski [Mon, 19 Feb 2024 13:05:27 +0000 (14:05 +0100)]
drm/xe/guc: Allocate GuC data structures in system memory for initial load
GuC load will need to happen at an earlier point in probe, where local
memory is not yet available. Use system memory for GuC data structures
used for initial "hwconfig" load, and realloc at a later,
"post-hwconfig" load if needed, when local memory is available.
Matthew Brost [Mon, 19 Feb 2024 21:19:42 +0000 (13:19 -0800)]
drm/xe: Return 2MB page size for compact 64k PTEs
Compact 64k PTEs are only intended to be used within a single VMA which
covers the entire 2MB range of the compact 64k PTEs. Add
XE_VMA_PTE_COMPACT VMA flag to indicate compact 64k PTEs are used and
update xe_vma_max_pte_size to return at least 2MB if set.
v2: Include missing changes
Fixes: 8f33b4f054fc ("drm/xe: Avoid doing rebinds") Fixes: c47794bdd63d ("drm/xe: Set max pte size when skipping rebinds") Reported-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/758 Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240219211942.3633795-4-matthew.brost@intel.com
Enhanced xe_bo_move trace to be more readable.
It will help to show the migration details.
Src and dst details.
v2: Modify trace_xe_bo_move(), it takes the integer mem_type
rather than a string.
Make mem_type_to_name() extern, it will be used by trace.(Thomas)
v3: Move mem_type_to_name() to xe_bo.[ch] (Thomas, Matt)
v4: Add device details to reduce ambiquity related to vram0/vram1. (Oak)
v5: Rename mem_type_to_name to xe_mem_type_to_name. (Thomas)
v6: Optimised code to use xe_bo_device(__entry->bo). (Thomas)
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Oak Zeng <oak.zeng@intel.com> Cc: Kempczynski Zbigniew <Zbigniew.Kempczynski@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Brian Welty <brian.welty@intel.com> Signed-off-by: Priyanka Dandamudi <priyanka.dandamudi@intel.com> Reviewed-by: Oak Zeng <oak.zeng@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240220044748.948496-1-priyanka.dandamudi@intel.com
drm/xe/uapi: Remove support for persistent exec_queues
Persistent exec_queues delays explicit destruction of exec_queues
until they are done executing, but destruction on process exit
is still immediate. It turns out no UMD is relying on this
functionality, so remove it. If there turns out to be a use-case
in the future, let's re-add.
Persistent exec_queues were never used for LR VMs
v2:
- Don't add an "UNUSED" define for the missing property
(Lucas, Rodrigo)
v3:
- Remove the remaining struct xe_exec_queue::persistent state
(Niranjana, Lucas)
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Francois Dugast <francois.dugast@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240209113444.8396-1-thomas.hellstrom@linux.intel.com
Dave Airlie [Fri, 16 Feb 2024 01:19:14 +0000 (11:19 +1000)]
Merge tag 'drm-intel-gt-next-2024-02-15' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
UAPI Changes:
- Add GuC submission interface version query (Tvrtko Ursulin)
Driver Changes:
Fixes/improvements/new stuff:
- Atomically invalidate userptr on mmu-notifier (Jonathan Cavitt)
- Update handling of MMIO triggered reports (Umesh Nerlige Ramappa)
- Don't make assumptions about intel_wakeref_t type (Jani Nikula)
- Add workaround 14019877138 [xelpg] (Tejas Upadhyay)
- Allow for very slow HuC loading [huc] (John Harrison)
- Flush context destruction worker at suspend [guc] (Alan Previn)
- Close deregister-context race against CT-loss [guc] (Alan Previn)
- Avoid circular locking issue on busyness flush [guc] (John Harrison)
- Use rc6.supported flag from intel_gt for rc6_enable sysfs (Juan Escamilla)
- Reflect the true and current status of rc6_enable (Juan Escamilla)
- Wake GT before sending H2G message [mtl] (Vinay Belgaumkar)
- Restart the heartbeat timer when forcing a pulse (John Harrison)
Future platform enablement:
- Extend driver code of Xe_LPG to Xe_LPG+ [xelpg] (Harish Chegondi)
- Extend some workarounds/tuning to gfx version 12.74 [xelpg] (Matt Roper)
Miscellaneous:
- Reconcile Excess struct member kernel-doc warnings (Randy Dunlap)
- Change wa and EU_PERF_CNTL registers to MCR type [guc] (Shuicheng Lin)
- Add flex arrays to struct i915_syncmap (Erick Archer)
- Increasing the sleep time for live_rc6_manual [selftests] (Anirban Sk)
Dave Airlie [Thu, 15 Feb 2024 20:52:03 +0000 (06:52 +1000)]
Merge tag 'drm-intel-next-2024-02-07' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
drm/i915 feature pull for v6.9:
Features and functionality:
- Early transport for panel replay and PSR (Jouni)
- New ARL PCI IDs (Matt)
- DP TPS4 PHY test pattern support (Khaled)
Refactoring and cleanups:
- Unify and improve VSC SDP for PSR and non-PSR cases (Jouni)
- Refactor memory regions and improve debug logging (Ville)
- Rework global state serialization (Ville)
- Remove unused CDCLK divider fields (Gustavo)
- Unify HDCP connector logging format (Jani)
- Use display instead of graphics version in display code (Jani)
- Move VBT and opregion debugfs next to the implementation (Jani)
- Abstract opregion interface, use opaque type (Jani)
Fixes:
- Fix MTL stolen memory access (Ville)
- Fix initial display plane readout for MTL (Ville)
- Fix HPD handling during driver init/shutdown (Imre)
- Cursor vblank evasion fixes (Ville)
- Various VSC SDP fixes (Jouni)
- Allow PSR mode changes without full modeset (Jouni)
- Fix CDCLK sanitization on module load for Xe2_LPD (Gustavo)
- Fix the max DSC bpc supported by the source (Ankit)
- Add missing LNL ALPM AUX wake configuration (Jouni)
- Cx0 PHY state readout and verify fixes (Mika)
- Fix PSR (panel replay) debugfs for MST connectors (Imre)
- Fail HDCP repeater authentication if Type1 device not present (Suraj)
- Ratelimit debug logging in vm_fault_ttm (Nirmoy)
- Use a fake PCH for MTL because south display is not on the PCH (Haridhar)
- Disable DSB for Xe driver for now (José)
- Fix some LNL display register changes (Lucas)
- Fix build on ChromeOS (Paz Zcharya)
- Preserve current shared DPLL for fastsets on Type-C ports (Ville)
- Fix state checker warnings for MG/TC/TBT PLLs (Ville)
- Fix HDCP repeater ctl register value on errors (Jani)
- Allow FBC with CCS modifiers on SKL+ (Ville)
- Fix HDCP GGTT pinning (Ville)
DRM core changes:
- Add ratelimited drm dbg print (Nirmoy)
- DPCD PSR early transport macro (Jouni)
Merges:
- Backmerge drm-next to bring Xe driver to drm-intel-next (Jani)
John Harrison [Wed, 10 Jan 2024 21:02:16 +0000 (13:02 -0800)]
drm/i915/gt: Restart the heartbeat timer when forcing a pulse
The context persistence code does things like send super high priority
heartbeat pulses to ensure any leaked context can still be pre-empted
and thus isn't a total denial of service but only a minor denial of
service. Unfortunately, it wasn't bothering to restart the heartbeat
worker with a fresh timeout. Thus, if a persistent context happened to
be closed just before the heartbeat was going to go ping anyway then
the forced pulse would get a negligble execution time. And as the
forced pulse is super high priority, the worker thread's next step is
a reset. Which means a potentially innocent system randomly goes boom
when attempting to close a context. So, force a re-schedule of the
worker thread with the appropriate timeout.
Tvrtko Ursulin [Thu, 8 Feb 2024 08:25:10 +0000 (08:25 +0000)]
drm/i915: Add GuC submission interface version query
Add a new query to the GuC submission interface version.
Mesa intends to use this information to check for old firmware versions
with a known bug where using the render and compute command streamers
simultaneously can cause GPU hangs due issues in firmware scheduling.
Based on patches from Vivaik and Joonas.
Compile tested only.
v2:
* Added branch version.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: Jose Souza <jose.souza@intel.com> Cc: Sagar Ghuge <sagar.ghuge@intel.com> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Vivaik Balasubrawmanian <vivaik.balasubrawmanian@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Tested-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240208082510.1363268-1-tvrtko.ursulin@linux.intel.com
Dmitry Baryshkov [Wed, 14 Feb 2024 08:37:08 +0000 (10:37 +0200)]
drm: ci: use clk_ignore_unused for apq8016
If the ADV7511 bridge driver is compiled as a module, while DRM_MSM is
built-in, the clk_disable_unused congests with the runtime PM handling
of the DSI PHY for the clk_prepare_lock(). This causes apq8016 runner to
fail without completing any jobs ([1]). Drop the BM_CMDLINE which
duplicate the command line from the .baremetal-igt-arm64 clause and
enforce the clk_ignore_unused kernelarg instead to make apq8016 runner
work.
Randy Dunlap [Tue, 13 Feb 2024 06:17:33 +0000 (22:17 -0800)]
drm: drm_crtc: correct some comments
Fix some typos and punctuation.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Alex Deucher <alexdeucher@gmail.com> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: dri-devel@lists.freedesktop.org Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20240213061733.8068-1-rdunlap@infradead.org
If the firmware framebuffer has been reloacted, the sysfb code
fixes the screen_info state before it creates the framebuffer's
platform device. Efifb will automatically receive a screen_info
with updated values. Hence remove the tracking from efifb.
firmware/sysfb: Update screen_info for relocated EFI framebuffers
On ARM PCI systems, the PCI hierarchy might be reconfigured during
boot and the firmware framebuffer might move as a result of that.
The values in screen_info will then be invalid.
Work around this problem by tracking the framebuffer's initial
location before it get relocated; then fix the screen_info state
between reloaction and creating the firmware framebuffer's device.
This functionality has been lifted from efifb. See the commit message
of commit 55d728a40d36 ("efi/fb: Avoid reconfiguration of BAR that
covers the framebuffer") for more information.
There will be no EFI framebuffer device for disabled parent devices
and thus we never probe efifb in that case. Hence remove the tracking
code from efifb.
firmware/sysfb: Create firmware device only for enabled PCI devices
Test if the firmware framebuffer's parent PCI device, if any, has
been enabled. If not, the firmware framebuffer is most likely not
working. Hence, do not create a device for the firmware framebuffer
on disabled PCI devices.
So far, efifb tracked the status of the PCI parent device internally
and did not bind if it was disabled. This patch implements the
functionality for all PCI-based firmware framebuffers.
v3:
* make commit message more precise (Sui)
v2:
* rework sysfb_pci_dev_is_enabled() (Javier)
The EFI device has the correct parent device set. This allows Linux
to handle the power management internally. Hence, remove the manual
PM management for the parent device from efifb.
firmware/sysfb: Set firmware-framebuffer parent device
Set the firmware framebuffer's parent device, which usually is the
graphics hardware's physical device. Integrates the framebuffer in
the Linux device hierarchy and lets Linux handle dependencies among
devices. For example, the graphics hardware won't be suspended while
the firmware device is still active.
v4:
* fix build for CONFIG_SYSFB_SIMPLEFB=n, again
v3:
* fix build for CONFIG_SYSFB_SIMPLEFB=n (Sui)
* test result of screen_info_pci_dev() for errors (Sui)
v2:
* detect parent device in sysfb_parent_dev()
The plain values as stored in struct screen_info need to be decoded
before being used. Add helpers that decode the type of video output
and the framebuffer I/O aperture.
Old or non-x86 systems may not set the type of video directly, but
only indicate the presence by storing 0x01 in orig_video_isVGA. The
decoding logic in screen_info_video_type() takes this into account.
It then follows similar code in vgacon's vgacon_startup() to detect
the video type from the given values.
A call to screen_info_resources() returns all known resources of the
given screen_info. The resources' values have been taken from existing
code in vgacon and vga16fb. These drivers can later be converted to
use the new interfaces.
drm/xe: Add uAPI to query GuC firmware submission version
Due to a bug in GuC firmware, Mesa can't enable by default the usage of
compute engines in DG2 and newer.
A new GuC firmware fixed the issue but until now there was no way
for Mesa to know if KMD was running with the fixed GuC version or not,
so this uAPI is required.
It may be expanded in future to query other firmware versions too.
This is querying XE_UC_FW_VER_COMPATIBILITY/submission version because
that is also supported by VFs, while XE_UC_FW_VER_RELEASE don't.
v2:
- fixed drm_xe_query_uc_fw_version documentation
- moved branch_ver as the first version number
Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Francois Dugast <francois.dugast@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240208183539.185095-1-jose.souza@intel.com
Most Rockchip hdmi nodes are part of a power domain.
Add a power-domains property and include it to the example
with some reordering to align with the (new) documentation
about property ordering.
Johan Jonker [Wed, 31 Jan 2024 21:14:29 +0000 (22:14 +0100)]
dt-bindings: display: rockchip: rockchip,dw-hdmi: remove port property
The hdmi-connector nodes are now functional and the new way to model
hdmi ports nodes with both in and output port subnodes. Unfortunately
with the conversion to YAML the old method with only an input port node
was used. Later the new method was also added to the binding.
A binding must be unambiguously, so remove the old port property
entirely and make port@0 and port@1 a requirement as all
upstream dts files are updated as well and because checking
deprecated stuff is a bit pointless.
Update the example to avoid use of the removed property.