Lucas De Marchi [Tue, 16 Sep 2025 21:15:39 +0000 (14:15 -0700)]
drm/xe/configfs: Allow to select by class only
For a future configfs attribute, it's desirable to select by engine class
only, as the instance doesn't make sense.
Rename the function lookup_engine_mask() to lookup_engine_info() and
make it return the entry. This allows parse_engine() to still return an
item if the caller wants to allow parsing a class-only string like
"rcs", "bcs", "ccs", etc.
drm/xe/xe_late_bind_fw: Introduce debug fs node to disable late binding
Introduce a debug filesystem node to disable the late binding fw reload
during system or runtime resume. This is intended for situations
where the late binding fw needs to be loaded from user mode,
particularly for validation purposes.
Note that the xe KMD doesn't participate in the late binding flow from user
space. A binary loaded from user space will be lost upon entering
D3 cold, so the user space application needs to handle this situation.
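A minimal sketch of such a debugfs knob (the attribute name, field and placement are assumptions for illustration, not the actual xe code):

  #include <linux/debugfs.h>

  struct xe_late_bind {
          bool disable;   /* skip late binding fw reload on resume */
  };

  static void late_bind_debugfs_register(struct xe_late_bind *lb,
                                         struct dentry *root)
  {
          /* writing 1 here disables the fw reload on system/runtime resume */
          debugfs_create_bool("disable_late_bind", 0600, root, &lb->disable);
  }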
drm/xe/xe_late_bind_fw: Load late binding firmware
Load late binding firmware
v2:
- s/EAGAIN/EBUSY/
- Flush worker in suspend and driver unload (Daniele)
v3:
- Use a retry interval of 6s, in steps of 200ms, to allow
other OS components to release the MEI CL handle (Sasha)
v4:
- return -ENODEV if component not added (Daniele)
- parse and print status returned by csc
v5:
- Use payload to check firmware valid (Daniele)
- Obtain the RPM reference before scheduling the worker to
ensure the device remains awake until the worker completes
firmware loading (Rodrigo)
v6:
- In case of error do not re-attempt fw download (Daniele)
v7 (Rodrigo):
- Rename of mei structs and callback.
Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250905154953.3974335-6-badal.nilawar@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
drm/xe/xe_late_bind_fw: Initialize late binding firmware
Search for late binding firmware binaries and populate the meta data of
firmware structures.
v2 (Daniele):
- drm_err if firmware size is more than max payload size
- s/request_firmware/firmware_request_nowarn/ as firmware will
not be available for all possible cards
v3 (Daniele):
- init firmware from within xe_late_bind_init, propagate error
- switch late_bind_fw to array to handle multiple firmware types
v4 (Daniele):
- Alloc payload dynamically, fix nits
v6 (Daniele):
- %s/MAX_PAYLOAD_SIZE/XE_LB_MAX_PAYLOAD_SIZE/
Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250905154953.3974335-5-badal.nilawar@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Introduce xe_late_bind_fw to enable firmware loading for devices,
such as the fan controller, during driver probe. Typically,
firmware for such devices is part of the IFWI flash image but can be
replaced at probe time after OEM tuning.
Bind the mei late binding component to enable firmware loading.
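A rough sketch of the probe-time registration and unwind this describes (the ops name and the master/component split are assumptions; only the devm_add_action_or_reset() pattern mentioned in v2 below is the point here):

  /* unwind helper: drop the component master registered at probe */
  static void xe_late_bind_component_remove(void *arg)
  {
          struct xe_device *xe = arg;

          component_master_del(xe->drm.dev, &xe_late_bind_component_ops);
  }

  /* in xe_late_bind_init(), after component registration succeeds: */
  err = devm_add_action_or_reset(xe->drm.dev,
                                 xe_late_bind_component_remove, xe);
  if (err)
          return err;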
v2:
- Add devm_add_action_or_reset to remove the component (Daniele)
- Add INTEL_MEI_GSC check in xe_late_bind_init() (Daniele)
v3:
- Fail driver probe if late bind initialization fails,
add has_late_bind flag (Daniele)
v4:
- %s/I915_COMPONENT_LATE_BIND/INTEL_COMPONENT_LATE_BIND/
v6:
- rebased
v7:
- rebased
- In xe_late_bind_init, use drm_err when returning an error to
stop the probe (Lucas)
- Use imperative mode in commit message (Lucas)
Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250905154953.3974335-4-badal.nilawar@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Introduce a new MEI client driver to support Late Binding firmware
upload/update for Intel discrete graphics platforms.
Late Binding is a runtime firmware upload/update mechanism that allows
payloads, such as fan control and voltage regulator, to be securely
delivered and applied without requiring SPI flash updates or
system reboots. This driver enables the Xe graphics driver and other
user-space tools to push such firmware blobs to the authentication
firmware via the MEI interface.
The driver handles authentication, versioning, and communication
with the authentication firmware, which in turn coordinates with
the PUnit/PCODE to apply the payload.
This is a foundational component for enabling dynamic, secure,
and re-entrant configuration updates on platforms like Battlemage.
Cc: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20250905154953.3974335-3-badal.nilawar@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Add a new helper function that allows MEI client drivers
to query the maximum transmission unit (MTU) for a connected
MEI client.
This is useful for clients that need to transmit large payloads,
such as firmware blobs, allowing them to determine the maximum
message size that can be safely sent before starting transmission, as well
as the size of the buffer to allocate when receiving data.
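A hedged usage sketch from a client driver's point of view (the helper name mei_cldev_mtu() and the chunking logic are assumptions based on the description above):

  /* send a firmware blob in MTU-sized chunks over the MEI client */
  static int send_payload(struct mei_cl_device *cldev,
                          const u8 *buf, size_t len)
  {
          size_t mtu = mei_cldev_mtu(cldev);   /* max message size for this client */
          ssize_t ret;

          while (len) {
                  size_t chunk = min(len, mtu);

                  ret = mei_cldev_send(cldev, buf, chunk);
                  if (ret < 0)
                          return ret;

                  buf += chunk;
                  len -= chunk;
          }

          return 0;
  }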
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20250905154953.3974335-2-badal.nilawar@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Thomas Hellström [Thu, 11 Sep 2025 08:03:24 +0000 (10:03 +0200)]
drm/xe: Work around clang multiple goto-label error
When using drm_exec_retry_on_contention(), clang may consider
all labels whose addresses are taken in a function as
potential retry goto targets, although strictly only one
is possible. In some situations it will then generate false
positive errors.
In this case, for some architectures, the compiler considers the
might_lock(&m->job_mutex);
as a potential goto target from drm_exec_retry_on_contention(),
and errors out.
Work around that by moving the xe_validate / drm_exec
transaction to a separate function.
v2:
- New commit message based on analysis by Nathan Chancellor
Fixes: 59eabff2a352 ("drm/xe: Convert xe_bo_create_pin_map() for exhaustive eviction") Cc: Matthew Brost <matthew.brost@intel.com> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202509101853.nDmyxTEM-lkp@intel.com/ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> # build Link: https://lore.kernel.org/r/20250911080324.180307-1-thomas.hellstrom@linux.intel.com
Michal Wajdeczko [Tue, 16 Sep 2025 17:00:29 +0000 (19:00 +0200)]
drm/xe/sysfs: Simplify sysfs registration
Instead of manually maintaining each sysfs file, define and use
attribute groups and register them using a device managed function.
Then use is_visible() to filter out unsupported attributes.
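A minimal sketch of the pattern (the attribute name and the xe_device_supports_example() visibility condition are hypothetical, for illustration only):

  static ssize_t example_show(struct device *dev,
                              struct device_attribute *attr, char *buf)
  {
          return sysfs_emit(buf, "%d\n", 42);
  }
  static DEVICE_ATTR_RO(example);

  static struct attribute *xe_device_attrs[] = {
          &dev_attr_example.attr,
          NULL
  };

  static umode_t xe_attr_is_visible(struct kobject *kobj,
                                    struct attribute *attr, int index)
  {
          struct device *dev = kobj_to_dev(kobj);

          /* hide attributes the device doesn't support */
          if (attr == &dev_attr_example.attr && !xe_device_supports_example(dev))
                  return 0;

          return attr->mode;
  }

  static const struct attribute_group xe_device_attr_group = {
          .attrs = xe_device_attrs,
          .is_visible = xe_attr_is_visible,
  };

  /* device managed: the group is removed automatically on teardown */
  return devm_device_add_group(dev, &xe_device_attr_group);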
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250916170029.3313-3-michal.wajdeczko@intel.com
Michal Wajdeczko [Tue, 16 Sep 2025 17:00:28 +0000 (19:00 +0200)]
drm/xe/vf: Don't expose sysfs attributes not applicable for VFs
VFs can't read the BMG_PCIE_CAP(0x138340) register nor access PCODE
(already guarded by the info.skip_pcode flag), so we shouldn't
expose attributes that require any of them, to avoid the resulting errors.
Fixes: 0e414bf7ad01 ("drm/xe: Expose PCIe link downgrade attributes") Fixes: cdc36b66cd41 ("drm/xe: Expose fan control and voltage regulator version") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Lukasz Laguna <lukasz.laguna@intel.com> Reviewed-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250916170029.3313-2-michal.wajdeczko@intel.com
Shuicheng Lin [Thu, 11 Sep 2025 17:31:40 +0000 (17:31 +0000)]
drm/xe/madvise: Fix ioctl argument check
It is "preferred_mem_loc" instead of "atomic" for the ATTR_PREFERRED_LOC
path.
Also include 2 minor changes with no functional impact.
1. Remove the redundant "attr.atomic_access" assignment.
2. Replace down_read_interruptible() with
xe_svm_notifier_lock_interruptible() to pair with
xe_svm_notifier_unlock().
Fixes: ada7486c5668 ("drm/xe: Implement madvise ioctl for xe") Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://lore.kernel.org/r/20250911173139.1405878-2-shuicheng.lin@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Shuicheng Lin [Thu, 11 Sep 2025 03:14:06 +0000 (03:14 +0000)]
drm/xe: Misc refine for svm
These changes should have no functional impact.
1. Correct typo of "operation"in macro range_debug().
2. Combine 2 spin_lock() call in xe_svm_garbage_collector() into 1.
3. Drop redundant preferred_region_is_vram check in
xe_svm_range_needs_migrate_to_vram().
4. Combine the devmem_possible check in xe_svm_handle_pagefault().
need_vram includes the IS_DGFX() check, so there is no change for
.devmem_only.
v2: revert !ctx.devmem_only change (Matt)
v3: rebase code and refine commit message.
v4: rebase code and refine commit message.
Michal Wajdeczko [Tue, 16 Sep 2025 17:16:45 +0000 (19:16 +0200)]
drm/xe/tests: Add pre-GMDID IP descriptors to param generators
Recently introduced kunit parameter generators were based on
the existing arrays, which have only GMDID-based IPs and didn't
take into account IP definitions from the pre-GMDID era.
Add test-only arrays with pre-GMDID IPs (as those will not change)
and extend the param generators to iterate over them as well.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250916171645.3335-1-michal.wajdeczko@intel.com
Since the PXP start comes after __xe_exec_queue_init() has completed,
we need to clean up what was done in that function in case of a PXP
start error.
__xe_exec_queue_init calls the submission backend init() function,
so we need to introduce an opposite for that. Unfortunately, while
we already have a fini() function pointer, it performs other
operations in addition to cleaning up what was done by the init().
Therefore, for clarity, the existing fini() has been renamed to
destroy(), while a new fini() has been added to only clean up what was
done by the init(), with the latter being called by the former (via
xe_exec_queue_fini).
Fixes: 72d479601d67 ("drm/xe/pxp/uapi: Add userspace and LRC support for PXP-using queues") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://lore.kernel.org/r/20250909221240.3711023-3-daniele.ceraolospurio@intel.com
Yang Li [Tue, 16 Sep 2025 02:10:39 +0000 (10:10 +0800)]
drm/xe: Remove duplicate header files
Fix some duplicate includes in xe:
./drivers/gpu/drm/xe/xe_tlb_inval.c: xe_tlb_inval.h is included more than once.
./drivers/gpu/drm/xe/xe_pt.c: xe_tlb_inval_job.h is included more than once.
While at it, also sort the include lines alphabetically.
Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=24705 Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=24706 Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
[Reword commit message] Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250916021039.1632766-1-yang.lee@linux.alibaba.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
John Harrison [Tue, 9 Sep 2025 22:41:31 +0000 (15:41 -0700)]
drm/xe/guc: Return an error code if the GuC load fails
Due to multiple explosion issues in the early days of the Xe driver,
the GuC load was hacked to never return a failure. That prevented
kernel panics and such initially, but now all it achieves is creating
more confusing errors when the driver tries to submit commands to a
GuC it already knows is not there. So fix that up.
As a stop-gap and to help with debug of load failures due to invalid
GuC init params, a wedge call had been added to the inner GuC load
function. The reason being that it leaves the GuC log accessible via
debugfs. However, for an end user, simply aborting the module load is
much cleaner than wedging and trying to continue. The wedge blocks
user submissions but it seems that various bits of the driver itself
still try to submit to a dead GuC and lots of subsequent errors occur.
And with regards to developers debugging why their particular code
change is being rejected by the GuC, it is trivial to either add the
wedge back in and hack the return code to zero again or to just do a
GuC log dump to dmesg.
v2: Add support for error injection testing and drop the now redundant
wedge call.
Zongyao Bai [Mon, 15 Sep 2025 21:47:15 +0000 (05:47 +0800)]
drm/xe/sysfs: Add cleanup action in xe_device_sysfs_init
On partial failure, some sysfs files created before the failure might
not be removed. Add a common cleanup step to remove them all immediately,
as it should be harmless to attempt to remove non-existing files.
Fixes: 0e414bf7ad01 ("drm/xe: Expose PCIe link downgrade attributes") Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Stuart Summers <stuart.summers@intel.com> Cc: Shuicheng Lin <shuicheng.lin@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Zongyao Bai <zongyao.bai@intel.com> Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250915214716.1327379-2-zongyao.bai@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
John Harrison [Wed, 10 Sep 2025 21:02:35 +0000 (14:02 -0700)]
drm/xe/guc: Add test for G2G communications
Add a test for sending messages from every GuC to every other GuC to
test G2G communications.
Note that, being a debug only feature, the test interface only exists
in pre-production builds of the GuC firmware.
v2: Fix 'default' case to actually use the driver's registration code
as well as allocation. Add comments explaining the different test
types. Fix (C) date and an assert. Review feedback from Daniele.
John Harrison [Wed, 10 Sep 2025 21:02:34 +0000 (14:02 -0700)]
drm/xe: Allow freeing of a managed bo
If a bo is created via xe_managed_bo_create_pin_map() then it cannot be
freed by the driver using xe_bo_unpin_map_no_vm(), or indeed any other
existing function. The DRM layer will still have a pointer stashed
away for later freeing, causing an invalid memory access on driver
unload. So add a helper for releasing the DRM action as well.
v2: Drop 'xe' parameter (review feedback from Michal W)
John Harrison [Wed, 10 Sep 2025 21:02:33 +0000 (14:02 -0700)]
drm/xe/guc: Add firmware build type to available info
Some test features are not available in production builds of the GuC
firmware. So add the build type field to the available information
that tests can inspect to decide if they should skip or run.
John Harrison [Wed, 10 Sep 2025 21:02:32 +0000 (14:02 -0700)]
drm/xe/guc: Update CSS header structures
Rework the CSS header structure according to recent updates to the GuC
API spec. Also include more field definitions.
v2: Also pass the new GuC specific structure to a GuC specific
function instead of the higher level, generic structure (review
feedback from Daniele).
Also correct naming of CSS_TIME_* fields.
Lucas De Marchi [Fri, 12 Sep 2025 22:05:34 +0000 (15:05 -0700)]
drm/xe: Use ARRAY_SIZE in guc_waklv_init()
Prefer ARRAY_SIZE() where an array is used, and simply pass 1 for a
single element instead of calculating its size.
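The pattern in question, illustrated generically (the klv array and emit_klvs() helper below are hypothetical):

  static const u32 klv_ids[] = { 0x1, 0x2, 0x3 };

  /* for an array, let the compiler count the elements ... */
  emit_klvs(guc, klv_ids, ARRAY_SIZE(klv_ids));
  /* ... and for a single value, just pass 1 instead of sizeof()-math */
  emit_klvs(guc, &single_klv, 1);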
Cc: Jonathan Cavitt <jonathan.cavitt@intel.com> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202508130158.eogeBZQT-lkp@intel.com/ Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://lore.kernel.org/r/20250912-guc-ads-array-size-v1-1-a6555392a1f8@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Dan Carpenter [Thu, 7 Aug 2025 15:53:41 +0000 (18:53 +0300)]
drm/xe: Fix a NULL vs IS_ERR() in xe_vm_add_compute_exec_queue()
The xe_preempt_fence_create() function returns error pointers. It
never returns NULL. Update the error checking to match.
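The pattern being fixed, sketched generically (the call arguments shown are assumptions, not the actual xe_vm_add_compute_exec_queue() body):

  fence = xe_preempt_fence_create(q, context, seqno);
  if (IS_ERR(fence))              /* error pointer on failure, never NULL */
          return PTR_ERR(fence);
  /* the old check was "if (!fence) return -ENOMEM;", which can't trigger */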
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/aJTMBdX97cof_009@stanley.mountain Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
drm/xe: defer free of NVM auxiliary container to device release callback
Do not kfree the intel_dg_nvm_dev in xe_nvm_fini() right after
auxiliary_device_delete/uninit. The auxiliary_device embeds the
device/kobject (and its name); freeing it too early can race
with asynchronous device_del/udev processing and cause a use-after-free.
Signed-off-by: Nitin Gote <nitin.r.gote@intel.com> Fixes: c28bfb107dac ("drm/xe/nvm: add on-die non-volatile memory device") Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250911052823.226696-1-nitin.r.gote@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Refactor: eliminate type casts by using proper u32
declarations.
v2:
- Address review comments. (Karthik)
v3:
- Use the proper u32 type and drop cast. (Lucas De Marchi)
- Modify variable when actually using u64 value.
- Change r value to reg_value with u32 type.
v4:
- Remove newline between trailer and Signed-off-by. (Lucas De Marchi)
- Change reg_val to val for more user-friendly logging.
- Use mul_u32_u32 function since both values are u32.
v5:
- mul_u32_u32 function with shift. (Lucas De Marchi)
Fixes: 7596d839f6228 ("drm/xe/hwmon: Add support to manage power limits though mailbox") Signed-off-by: Mallesh Koujalagi <mallesh.koujalagi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250912113458.2815172-1-mallesh.koujalagi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
drm/xe/xe3: Extend Wa_18041344222 to graphics IP versions 30.00 and 30.01
Apply WA 18041344222 to Xe3 LPG graphics IP versions 30.00 and 30.01 too.
Bspec: 56024 Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Matt Atwood <matthew.s.atwood@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/7368f8059013424ac94f4a01c23f9c98a37b06dc.1757552915.git.harish.chegondi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
======================================================
WARNING: possible circular locking dependency detected
6.17.0-rc4-lgci-xe-xe-pw-153723v2+ #1 Tainted: G S U
------------------------------------------------------
xe_pm/11324 is trying to acquire lock: ffff8881085f22a0 (&pc->freq_lock){+.+.}-{3:3}, at:
xe_guc_pc_start+0x39f/0xf70 [xe]
drm/xe: Add dedicated printk macros for tile and device
We already have dedicated helper macros for printing GT-oriented
messages, but we don't have any to print messages that are tile
oriented, and we wrongly try to use plain drm or GT-oriented ones.
Add tile-oriented printk macros to provide coverage similar to what
we have with the xe_assert() macros. Also add a set of simple macros
for the top level xe_device, which we could easily tweak to include
extra device specific info if needed.
Typical output of our printk macros will look like:
[drm] this is xe_WARN()
[drm] *ERROR* this is xe_err()
[drm] *ERROR* this is xe_err_printer()
[drm] this is xe_info()
[drm] this is xe_info_printer()
[drm:printk_demo.cold] this is xe_dbg()
[drm:printk_demo.cold] this is xe_dbg_printer()
[drm] Tile0: this is xe_tile_WARN()
[drm] *ERROR* Tile0: this is xe_tile_err()
[drm] *ERROR* Tile0: this is xe_tile_err_printer()
[drm] Tile0: this is xe_tile_info()
[drm] Tile0: this is xe_tile_info_printer()
[drm:printk_demo.cold] Tile0: this is xe_tile_dbg()
[drm:printk_demo.cold] Tile0: this is xe_tile_dbg_printer()
[drm] Tile0: GT0: this is xe_gt_WARN()
[drm] *ERROR* Tile0: GT0: this is xe_gt_err()
[drm] *ERROR* Tile0: GT0: this is xe_gt_err_printer()
[drm] Tile0: GT0: this is xe_gt_info()
[drm] Tile0: GT0: this is xe_gt_info_printer()
[drm:printk_demo.cold] Tile0: GT0: this is xe_gt_dbg()
[drm:printk_demo.cold] Tile0: GT0: this is xe_gt_dbg_printer()
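Roughly, such tile-oriented macros can be layered on top of the existing drm helpers like this (a sketch only; the exact definitions in the driver may differ):

  #define xe_tile_err(tile, fmt, ...) \
          drm_err(&(tile)->xe->drm, "Tile%u: " fmt, (tile)->id, ##__VA_ARGS__)

  #define xe_tile_info(tile, fmt, ...) \
          drm_info(&(tile)->xe->drm, "Tile%u: " fmt, (tile)->id, ##__VA_ARGS__)

  #define xe_tile_dbg(tile, fmt, ...) \
          drm_dbg(&(tile)->xe->drm, "Tile%u: " fmt, (tile)->id, ##__VA_ARGS__)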
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250909165941.31730-5-michal.wajdeczko@intel.com
drm/xe: Prepare format for GT-oriented messages in one place
To avoid code duplication (and thus potential mistakes) and to
allow easier changes (if needed) of the prefix format of
GT-oriented messages, prepare that prefix in a dedicated macro.
All recent platforms (including all the ones officially supported by the
Xe driver) do not allow concurrent execution of RCS and CCS workloads
from different address spaces, with the HW blocking the context switch
when it detects such a scenario.
The DUAL_QUEUE flag helps with this, by causing the GuC to not submit a
context it knows will not be able to execute. This, however, causes a new
problem: if RCS and CCS queues have pending workloads from different
address spaces, the GuC needs to choose from which of the 2 queues to
pick the next workload to execute. By default, the GuC prioritizes RCS
submissions over CCS ones, which can lead to CCS workloads being
significantly (or completely) starved of execution time.
The driver can tune this by setting a dedicated scheduling policy KLV;
this KLV allows the driver to specify a quantum (in ms) and a ratio
(percentage value between 0 and 100), and the GuC will prioritize the CCS
for that percentage of each quantum.
Given that we want to guarantee enough RCS throughput to avoid missing
frames, we set the yield policy to 20% of each 80ms interval.
v2: updated quantum and ratio, improved comment, use xe_guc_submit_disable
in gt_sanitize
Fixes: d9a1ae0d17bd ("drm/xe/guc: Enable WA_DUAL_QUEUE for newer platforms") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Tested-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://lore.kernel.org/r/20250905235632.3333247-2-daniele.ceraolospurio@intel.com
Michal Wajdeczko [Wed, 10 Sep 2025 22:24:39 +0000 (00:24 +0200)]
drm/xe/pf: Drop rounddown_pow_of_two fair LMEM limitation
This effectively reverts commit 4c3fe5eae46b ("drm/xe/pf: Limit
fair VF LMEM provisioning") since we don't need it any more after
non-contig VRAM allocations were fixed. This allows larger LMEM
auto-provisioning for VFs, so instead:
[ ] GT0: PF: LMEM available(14096M) fair(1 x 8192M)
[ ] GT0: PF: VF1 provisioned with 8589934592 (8.00 GiB) LMEM
or
[ ] GT0: PF: LMEM available(14096M) fair(2 x 4096M)
[ ] GT0: PF: VF1..VF2 provisioned with 4294967296 (4.00 GiB) LMEM
we may get:
[ ] GT0: PF: LMEM available(14096M) fair(1 x 14096M)
[ ] GT0: PF: VF1 provisioned with 14780727296 (13.8 GiB) LMEM
and
[ ] GT0: PF: LMEM available(14096M) fair(2 x 7048M)
[ ] GT0: PF: VF1..VF2 provisioned with 7390363648 (6.88 GiB) LMEM
The GuC has an interface to set a power profile for the SLPC algorithm.
Base mode is the default and ensures balanced performance; power_saving
mode has conservative up/down thresholds and is suitable for use with
apps that typically need to be power efficient. This will result in
lower GT frequencies, thus consuming less power.
The selected power profile is then displayed to the user.
drm/xe: Convert pinned suspend eviction for exhaustive eviction
Pinned suspend eviction and preparation for eviction validate
system memory for eviction buffers. Do that under a
validation exclusive lock to avoid interfering with other
processes validating system graphics memory.
v2:
- Avoid gotos from within xe_validation_guard().
- Adapt to signature change of xe_validation_guard().
drm/xe: Rework instances of variants of xe_bo_create_locked()
A common pattern is to create a locked bo, pin it without mapping
and then unlock it. Add a function to do that, which internally
uses xe_validation_guard().
With that we can remove xe_bo_create_locked_range() and add
exhaustive eviction to stolen, pf_provision_vf_lmem and
psmi_alloc_object.
v4:
- New patch after reorganization.
v5:
- Replace DRM_XE_GEM_CPU_CACHING_WB with 0. (CI)
drm/xe: Convert xe_bo_create_pin_map() for exhaustive eviction
Introduce an xe_bo_create_pin_map_novm() function that does not
take the drm_exec parameter, to simplify the conversion of many
callsites.
For the rest, ensure that the same drm_exec context that was used
for locking the vm is passed down to validation.
Use xe_validation_guard() where appropriate.
v2:
- Avoid gotos from within xe_validation_guard(). (Matt Brost)
- Break out the change to pf_provision_vf_lmem to a separate
patch.
- Adapt to signature change of xe_validation_guard().
drm/xe: Convert xe_bo_create_pin_map_at() for exhaustive eviction
Most users of xe_bo_create_pin_map_at() and
xe_bo_create_pin_map_at_aligned() are not using the vm parameter,
and that simplifies conversion. Introduce an
xe_bo_create_pin_map_at_novm() function and make the _aligned()
version static. Use xe_validation_guard() for conversion.
v2:
- Adapt to signature change of xe_validation_guard(). (Matt Brost)
- Fix up documentation.
v4:
- Postpone the change to i915_gem_stolen_insert_node_in_range() to
a later patch.
drm/xe: Convert xe_dma_buf.c for exhaustive eviction
Convert dma-buf migration to XE_PL_TT and dma-buf import to
support exhaustive eviction, using xe_validation_guard().
It seems unlikely that the import would result in an -ENOMEM,
but convert import anyway for completeness.
The dma-buf map_attachment() functionality unfortunately doesn't
support passing a drm_exec, which means that foreign devices
validating a dma-buf that we exported will not, unless they are
xeKMD devices, participate in the exhaustive eviction scheme.
v2:
- Avoid gotos from within xe_validation_guard(). (Matt Brost)
- Adapt to signature change of xe_validation_guard(). (Matt Brost)
- Remove an unneeded (void)ret. (Matt Brost)
- Fix up an error path.
Convert __xe_pin_fb_vma() for exhaustive eviction
using xe_validation_guard().
v2:
- Avoid gotos from within xe_validation_guard(). (Matt Brost)
- Adapt to signature change of xe_validation_guard(). (Matt Brost)
- Use interruptible waiting, since xe_bo_migrate() already does that.
drm/xe: Convert the CPU fault handler for exhaustive eviction
The CPU fault handler may populate bos and migrate, and in doing
so might interfere with other tasks validating.
Rework the CPU fault handler completely into a fastpath
and a slowpath. The fastpath trylocks only the validation lock
in read-mode. If that fails, there's a fallback to the
slowpath, where we do a full validation transaction.
This mandates open-coding of bo locking, bo idling and
bo populating, but we still call into TTM for fault
finalizing.
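Schematically, the shape of that handler looks something like the sketch below (the lock helpers, fastpath/slowpath functions and their names are placeholders, not the real xe fault code):

  static vm_fault_t xe_bo_cpu_fault(struct vm_fault *vmf)
  {
          struct ttm_buffer_object *tbo = vmf->vma->vm_private_data;
          vm_fault_t ret;

          /* fastpath: trylock the validation lock in read mode only */
          if (fault_trylock_validation(tbo)) {
                  ret = xe_fault_fastpath(vmf);   /* no blocking allowed */
                  fault_unlock_validation(tbo);
                  if (ret != VM_FAULT_RETRY)
                          return ret;
          }

          /* slowpath: a full validation transaction that may block and migrate */
          return xe_fault_slowpath(vmf);
  }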
v2:
- Rework the CPU fault handler to actually take part in
the exhaustive eviction scheme (Matthew Brost).
v3:
- Don't return anything but VM_FAULT_RETRY if we've dropped the
mmap_lock. Not even if a signal is pending.
- Rebase on gpu_madvise() and split out fault migration.
- Wait for idle after migration.
- Check whether the resource manager uses tts to determine
whether to map the tt or iomem.
- Add a number of asserts.
- Allow passing a ttm_operation_ctx to xe_bo_migrate() so that
it's possible to try non-blocking migration.
- Don't fall through to TTM on migration / population error
Instead remove the gfp_retry_mayfail in mode 2 where we
must succeed. (Matthew Brost)
v5:
- Don't allow faulting in the imported bo case (Matthew Brost)
drm/xe: Convert existing drm_exec transactions for exhaustive eviction
Convert existing drm_exec transactions, like GT pagefault validation,
non-LR exec() IOCTL and the rebind worker to support
exhaustive eviction using the xe_validation_guard().
v2:
- Adapt to signature change in xe_validation_guard() (Matt Brost)
- Avoid gotos from within xe_validation_guard() (Matt Brost)
- Check error return from xe_validation_guard()
drm/xe: Introduce an xe_validation wrapper around drm_exec
Introduce a validation wrapper xe_validation_guard() as a helper
intended to be used around drm_exec transactions that perform
validations. Once TTM can handle exhaustive eviction we could
remove this wrapper or make it mostly a NO-OP unless other
functionality is added to it.
Currently the wrapper takes a read lock upon entry and if the
transaction hits an OOM, all locks are released and the
transaction is retried with a write-lock. If all other
validations participate in this scheme, the transaction with
the write lock will be the only transaction validating and
should have access to all available non-pinned memory.
There is currently a problem in that TTM converts -EDEADLOCK to
-ENOMEM, and with ww_mutex slowpath error injections we can hit
-ENOMEMs without actually having run out of memory. We abuse
ww_mutex internals to detect such situations until TTM is fixed
to not convert the error code. In the meantime, injecting
ww_mutex slowpath -EDEADLOCKs is a good way to test
the implementation in the absence of real OOMs.
Just introduce the wrapper in this commit. It will be hooked up
to the driver in following commits.
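The read-lock/write-lock retry scheme can be pictured roughly like this (a conceptual sketch only; the real xe_validation_guard() is a scoped guard macro with a different interface, and the xe->val_lock field and txn callback shape are assumptions):

  int validation_transaction(struct xe_device *xe,
                             int (*txn)(struct drm_exec *exec, void *arg),
                             void *arg)
  {
          struct drm_exec exec;
          bool write_locked = false;
          int ret;

  retry:
          if (write_locked)
                  down_write(&xe->val_lock);
          else
                  down_read(&xe->val_lock);

          drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0);
          ret = txn(&exec, arg);
          drm_exec_fini(&exec);

          if (write_locked)
                  up_write(&xe->val_lock);
          else
                  up_read(&xe->val_lock);

          if (ret == -ENOMEM && !write_locked) {
                  /* drop everything and retry as the only validating task */
                  write_locked = true;
                  goto retry;
          }

          return ret;
  }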
v2:
- Mark class_xe_validation conditional so that the loop is
skipped on initialization error.
- Argument sanitation (Matt Brost)
- Fix conditional execution of xe_validation_ctx_fini()
(Matt Brost)
- Add a no_block mode for upcoming use in the CPU fault handler.
v4:
- Update kerneldoc. (Xe CI).
We want all validation (potential backing store allocation) to be part
of a drm_exec transaction. Therefore add a drm_exec pointer argument
to xe_bo_validate() and ___xe_bo_create_locked(). Upcoming patches
will deal with making all (or nearly all) calls to these functions
part of a drm_exec transaction. In the meantime, define special values
of the drm_exec pointer:
XE_VALIDATION_UNIMPLEMENTED: Implementation of the drm_exec transaction
has not been done yet.
XE_VALIDATION_UNSUPPORTED: Some middle layers (dma-buf) don't allow
the drm_exec context to be passed down to map_attachment where
validation takes place.
XE_VALIDATION_OPT_OUT: May be used only for kunit tests where exhaustive
eviction isn't crucial and the ROI of converting those is very
small.
For XE_VALIDATION_UNIMPLEMENTED and XE_VALIDATION_OPT_OUT there is also
a lockdep check that a drm_exec transaction can indeed start at the
location where the macro is expanded. This is to encourage
developers to take this into consideration early in the code
development process.
v2:
- Fix xe_vm_set_validation_exec() imbalance. Add an assert that
hopefully catches future instances of this (Matt Brost)
v3:
- Extend to psmi_alloc_object
drm/xe/guc: Don't invoke disable_ct action during replacement
During the second CT initialization step, known as post_hwconfig, we
want to replace the previously registered CT disable devm action to
make sure it will be invoked prior to releasing the underlying BO.
But to replace this action we don't need to execute it right away,
since we know that CT was disabled prior to that late init step
and we already assert that. Use devm_remove_action() instead to
avoid the extra message about 'disabling CT' that could previously be seen:
[drm:guc_ct_change_state [xe]] GT0: GuC CT communication channel disabled
...
DEVRES REM ff11000149320940 guc_action_disable_ct (16 bytes)
[drm:guc_ct_change_state [xe]] GT0: GuC CT communication channel disabled
DEVRES ADD ff110001664fc040 guc_action_disable_ct (16 bytes)
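In code terms, the change amounts to something like the following sketch (the data pointers are placeholders; only the devres helpers are the point):

  /* before: replacing the action also ran it, logging 'channel disabled' */
  devm_release_action(dev, guc_action_disable_ct, old_data);

  /* after: just drop the old registration, CT is already known to be disabled */
  devm_remove_action(dev, guc_action_disable_ct, old_data);
  err = devm_add_action_or_reset(dev, guc_action_disable_ct, new_data);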
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Link: https://lore.kernel.org/r/20250908102053.539-3-michal.wajdeczko@intel.com
drm/xe/guc: Always add CT disable action during second init step
On DGFX, during the init_post_hwconfig() step, we reinitialize the
CTB BO in VRAM and have to replace the cleanup action so that CT
communication is disabled prior to the release of the underlying BO.
But that introduces some discrepancy between DGFX and iGFX, as for
iGFX we keep the previously added disable CT action, which would be
called during unwind much later.
To keep the same flow on both types of platforms, always replace the
old cleanup action and register a new one.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Link: https://lore.kernel.org/r/20250908102053.539-2-michal.wajdeczko@intel.com
Matt Roper [Fri, 5 Sep 2025 21:56:15 +0000 (14:56 -0700)]
drm/xe: Never report L3 bank mask for media GT going forward
We currently report an L3 bank mask as part of the GT topology on both
GTs (primary and media) because a copy of the L3 bank fuse register
exists on both GTs (e.g., $gsi_offset + 0x9130 on Xe3). After recent
discussions it's come to light that the only known userspace software
that uses this part of the uapi (the compute UMD and Mesa) only uses the
value reported for the primary GT; the value reported for the media GT
is ignored by both projects, and the media UMDs don't have any use for
L3 information today. Since we always strive to have our uapi match the
specific needs of userspace and not include additional unused baggage,
let's officially drop L3 bank reporting on the media GT going forward
and only keep it around for the primary GT where it actually gets used.
This change will only apply to future platforms (Xe3 and later); even
though it would probably be safe to remove it from Xe1/Xe2 as well, we
don't want to take any chances with changing existing ABI.
Note that we'd already disabled reading/reporting of the L3 bank for the
media GT on PTL in commit 9ab440a9d042 ("drm/xe/ptl: L3bank mask is not
available on the media GT") because it was discovered that the copy of
the fuse registers on the media GT were just reporting a bogus ~0 value
rather than an accurate mask. So this is just extending that PTL
behavior forward to WCL and other future platforms. Note that we're
also free to reinstate this part of the uapi in the future if/when some
new userspace consumer emerges that _does_ have a use for media-specific
L3 bank masks.
drm/xe/debugfs: Don't expose dgfx residencies attributes on VF
In addition to checking if we are running on BATTLEMAGE,
we should also check that we are not a VF driver, as VFs can't
access the necessary registers, and doing so leads to:
| .. [drm] GT0: VF is trying to read an inaccessible register 0x35b004+0x0
| RIP: 0010:xe_gt_sriov_vf_read32+0x5e2/0x8a0 [xe]
| Call Trace:
| xe_mmio_read32+0x110/0x280 [xe]
| read_residency_counter+0x42/0xd0 [xe]
| dgfx_pkg_residencies_show+0x115/0x190 [xe]
| .. [drm] Package G2 counter failed to read, ret -19
or
| .. [drm] GT0: VF is trying to read an inaccessible register 0x35b004+0x0
| RIP: 0010:xe_gt_sriov_vf_read32+0x5e2/0x8a0 [xe]
| Call Trace:
| xe_mmio_read32+0x110/0x280 [xe]
| read_residency_counter+0x42/0xd0 [xe]
| dgfx_pcie_link_residencies_show+0xe7/0x160 [xe]
| .. [drm] PCIE LINK L0 RESIDENCY counter failed to read, ret -19
Similarly, there is no point in exposing inject_csc_hw_error on VFs,
as HW error support is already disabled for VFs.
We only need a single set of VF CCS contexts; they are not per-tile
as the initial implementation might suggest. Move all VF CCS data from
xe_tile.sriov.vf to xe_device.sriov.vf. Also rename some structs to
align with the usage and fix their kernel-doc.
This will allow us to drop the ugly IS_VF_CCS_BB_VALID macro.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Link: https://lore.kernel.org/r/20250908123025.747-6-michal.wajdeczko@intel.com
drm/xe/vf: Use single check when calling VF CCS functions
All xe_sriov_vf_ccs() functions except init() expect to be called
only when initialization was successful and CCS handling is ready.
Update the IS_VF_CCS_READY macro and use it as a single entry guard.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Link: https://lore.kernel.org/r/20250908123025.747-5-michal.wajdeczko@intel.com
We only use this macro once and we can open-code it to explicitly
show relevant conditions and avoid duplications.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Link: https://lore.kernel.org/r/20250908123025.747-4-michal.wajdeczko@intel.com
This function is dedicated for use by the VFs; we shouldn't use a
name that might suggest it's general purpose. While there, update
the asserts to better reflect the intended usage.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Link: https://lore.kernel.org/r/20250908123025.747-2-michal.wajdeczko@intel.com
Lucas De Marchi [Fri, 5 Sep 2025 16:22:37 +0000 (09:22 -0700)]
drm/xe/configfs: Use config_group_put()
configfs has a config_group_put() helper that was adopted by
commit 88df7939d728 ("drm/xe/configfs: Rename struct xe_config_device").
Other pending work to add psmi later landed in commit afe902848b41 ("drm/xe/configfs: Allow to enable PSMI") and didn't use
the helper.
Use config_group_put() consistently to hide the inner workings of
configfs. No change in behavior since it does exactly the same thing
as currently being done.
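For reference, the change is purely cosmetic; a sketch of the two equivalent forms:

  /* open-coded form previously used */
  config_item_put(&group->cg_item);

  /* equivalent, but hides the configfs internals */
  config_group_put(group);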
John Harrison [Thu, 4 Sep 2025 19:57:51 +0000 (12:57 -0700)]
drm/xe/guc: Fix badly worded error message
If a GuC id lookup failed, the error message was 'Not engine present',
which is bad in multiple ways - incorrect English and 'engines' are
now called 'exec queues' in this context. So fix it.
drm/xe: Block exec and rebind worker while evicting for suspend / hibernate
When the xe pm_notifier evicts for suspend / hibernate, there might be
racing tasks trying to re-validate again. This can lead to suspend taking
excessive time or get stuck in a live-lock. This behaviour becomes
much worse with the fix that actually makes re-validation bring back
bos to VRAM rather than letting them remain in TT.
Prevent that by having exec and the rebind worker wait for a completion
that is set to block by the pm_notifier before suspend and is signaled
by the pm_notifier after resume / wakeup.
It's probably still possible to craft malicious applications that block
suspending. More work is pending to fix that.
v3:
- Avoid wait_for_completion() in the kernel worker since it could
potentially cause work item flushes from freezable processes to
wait forever. Instead terminate the rebind workers if needed and
re-launch at resume. (Matt Auld)
v4:
- Fix some bad naming and leftover debug printouts.
- Fix kerneldoc.
- Use drmm_mutex_init() for the xe->rebind_resume_lock (Matt Auld).
- Rework the interface of xe_vm_rebind_resume_worker (Matt Auld).
drm/xe: Attempt to bring bos back to VRAM after eviction
VRAM+TT bos that are evicted from VRAM to TT may remain in
TT also after a revalidation following eviction or suspend.
This manifests itself as applications becoming sluggish
after buffer objects get evicted or after a resume from
suspend or hibernation.
If the bo supports placement in both VRAM and TT, and
we are on DGFX, mark the TT placement as fallback. This means
that it is tried only after VRAM + eviction.
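A hedged sketch of the placement setup this describes, using the TTM_PL_FLAG_FALLBACK flag introduced by the commit referenced in the Fixes: tag below (the surrounding structure is illustrative, not the exact xe code):

  struct ttm_place places[2] = {
          { .mem_type = XE_PL_VRAM0 },                /* preferred placement */
          { .mem_type = XE_PL_TT,
            .flags = TTM_PL_FLAG_FALLBACK },          /* only after VRAM + eviction */
  };
  struct ttm_placement placement = {
          .num_placement = 2,
          .placement = places,
  };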
This flaw has probably been present since the xe module was
upstreamed, but use a Fixes: commit below where backporting is
likely to be simple. For earlier versions we need to open-code
the fallback algorithm in the driver.
v2:
- Remove check for dgfx. (Matthew Auld)
- Update the xe_dma_buf kunit test for the new strategy (CI)
- Allow dma-buf to pin in current placement (CI)
- Make xe_bo_validate() for pinned bos a NOP.
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/5995 Fixes: a78a8da51b36 ("drm/ttm: replace busy placement with flags v6") Cc: Matthew Brost <matthew.brost@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: <stable@vger.kernel.org> # v6.9+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://lore.kernel.org/r/20250904160715.2613-2-thomas.hellstrom@linux.intel.com
drm/xe/migrate: Remove unneeded emit_pte() when copying CCS only
In xe_migrate_copy(), when copy_only_ccs is true, we only need two
emit_pte() calls: one for the BO and one for the raw CCS storage.
However, the current implementation issues three emit_pte() calls,
resulting in an unnecessary PTE programming job.
Remove the redundant emit_pte() call to avoid programming
the same PTEs twice, reducing overhead during CCS-only migration.
v2: Preserve correct behavior on DG2, which requires both CCS and
page copies.
drm/xe: Fix broken kernel-doc for the struct xe_bo
Use correct multi-line kernel-doc style if required.
Some members were described only in the commit message.
Some other members were described using wrong names.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Link: https://lore.kernel.org/r/20250904144026.7222-1-michal.wajdeczko@intel.com
Michal Wajdeczko [Fri, 29 Aug 2025 17:19:21 +0000 (19:19 +0200)]
drm/xe/kunit: Drop xe_wa_test_exit
Remove xe_wa_test_exit() as it could crash the KUnit kernel when
hitting some asserts in xe_wa_test_init(), as test->priv
might not be pointing to the expected data.
| # xe_wa_gt: ASSERTION FAILED at drivers/gpu/drm/xe/tests/xe_wa_test.c:34
| Expected ret == 0, but
| ret == -19 (0xffffffffffffffed)
|Bus error - the host /dev/shm or /tmp mount likely just ran out of space
|Kernel panic - not syncing: Kernel mode signal 7
Note that there is no need to call drm_kunit_helper_free_device()
since our fake device allocated by drm_kunit_helper_alloc_device()
will be cleaned up automatically.
Michal Wajdeczko [Fri, 29 Aug 2025 17:19:19 +0000 (19:19 +0200)]
drm/xe/kunit: Drop custom struct platform_test_case
The custom struct platform_test_case definition in xe_wa is now almost
identical to the generic struct xe_pci_fake_data definition except for
the .name member, which could be generated by xe_pci_fake_data_desc().
Michal Wajdeczko [Fri, 29 Aug 2025 17:19:18 +0000 (19:19 +0200)]
drm/xe/kunit: Introduce xe_pci_fake_data_desc()
We already use struct xe_pci_fake_data to provide a custom config of
the fake PCI device and soon we will be using this struct also as a
direct parameter for the parameterized Xe KUnit tests.
Add a function to generate a description based on that config data.
For the platform or subplatform name, look up pciidlist, which already
has definitions of all supported platforms.
Matthew Auld [Fri, 29 Aug 2025 16:47:16 +0000 (17:47 +0100)]
drm/xe: improve dma-resv handling for backup object
Since the dma-resv is shared we don't need to reserve a fence slot and
add the fence twice, plus there is no need to loop through the dependencies.
Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://lore.kernel.org/r/20250829164715.720735-2-matthew.auld@intel.com
Matthew Auld [Thu, 28 Aug 2025 14:24:39 +0000 (15:24 +0100)]
drm/xe/pt: unify xe_pt_svm_pre_commit with userptr
We now use the same notifier lock for SVM and userptr, with that we can
combine xe_pt_userptr_pre_commit and xe_pt_svm_pre_commit.
v2: (Matt B)
- Re-use xe_svm_notifier_lock/unlock for userptr.
- Combine svm/userptr handling further down into op_check_svm_userptr.
v3:
- Only hide the ops if we lack DRM_GPUSVM, since we also need them for
userptr.
Suggested-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250828142430.615826-18-matthew.auld@intel.com
Matthew Auld [Thu, 28 Aug 2025 14:24:38 +0000 (15:24 +0100)]
drm/xe/userptr: replace xe_hmm with gpusvm
The goal here is to cut over to gpusvm and remove xe_hmm, relying instead on
common code. The core facilities we need are get_pages(), unmap_pages()
and free_pages() for a given userptr range, plus a vm level notifier
lock, which is now provided by gpusvm.
v2:
- Reuse the same SVM vm struct we use for full SVM, that way we can
use the same lock (Matt B & Himal)
v3:
- Re-use svm_init/fini for userptr.
v4:
- Allow building xe without userptr if we are missing DRM_GPUSVM
config. (Matt B)
- Always make .read_only match xe_vma_read_only() for the ctx. (Dafna)
v5:
- Fix missing conversion with CONFIG_DRM_XE_USERPTR_INVAL_INJECT
v6:
- Convert the new user in xe_vm_madvise.
Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Dafna Hirschfeld <dafna.hirschfeld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250828142430.615826-17-matthew.auld@intel.com
Matthew Auld [Thu, 28 Aug 2025 14:24:37 +0000 (15:24 +0100)]
drm/xe/vm: split userptr bits into separate file
This will simplify compiling out the bits that depend on DRM_GPUSVM in a
later patch. Without this we end up littering the code with ifdef
checks, plus it becomes hard to be sure that something won't blow up at
runtime due to something not being initialised, even though it passed
the build. Should be no functional change here.
Matthew Auld [Thu, 28 Aug 2025 14:24:35 +0000 (15:24 +0100)]
drm/gpusvm: refactor core API to use pages struct
Refactor the core API of get/unmap/free pages to all operate on
drm_gpusvm_pages. In the next patch we want to export a simplified core
API without needing fully blown svm range etc.
Suggested-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250828142430.615826-14-matthew.auld@intel.com
Matthew Auld [Thu, 28 Aug 2025 14:24:34 +0000 (15:24 +0100)]
drm/gpusvm: pull out drm_gpusvm_pages substructure
Pull the pages stuff from the svm range into its own substructure, with
the idea of having the main pages related routines, like get_pages(),
unmap_pages() and free_pages() all operating on some lower level
structures, which can then be re-used for stuff like userptr.
v2:
- Move seq into pages struct (Matt B)
v3:
- Small kernel-doc fixes
Suggested-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250828142430.615826-13-matthew.auld@intel.com
Matthew Auld [Thu, 28 Aug 2025 14:24:33 +0000 (15:24 +0100)]
drm/gpusvm: use more selective dma dir in get_pages()
If we are only reading the memory then from the device pov the direction
can be DMA_TO_DEVICE. This aligns with the xe-userptr code. Using the
most restrictive data direction to represent the access is normally a
good idea.
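In code, the selection boils down to something like this sketch (the read_only field on the ctx is assumed from the description; the later userptr patch mentions wiring it to xe_vma_read_only()):

  /* read-only mappings never need the device to write the pages back */
  enum dma_data_direction dir = ctx->read_only ? DMA_TO_DEVICE :
                                                 DMA_BIDIRECTIONAL;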
Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250828142430.615826-12-matthew.auld@intel.com
Matthew Auld [Thu, 28 Aug 2025 14:24:32 +0000 (15:24 +0100)]
drm/gpusvm: fix hmm_pfn_to_map_order() usage
Handle the case where the hmm range partially covers a huge page (like
2M), otherwise we can potentially end up doing something nasty like
mapping memory which is outside the range, and maybe not even mapped by
the mm. The fix is based on the xe userptr code, which in a future patch
will directly use gpusvm, so needs alignment here.
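A sketch of the clamping this implies (the helper name and exact arithmetic are illustrative, not the actual gpusvm fix):

  /* How many PFNs does this hmm_pfn contribute within [addr, end)?
   * The huge mapping reported by hmm_pfn_to_map_order() may start before
   * addr and/or extend past end, so clamp to both boundaries. */
  static unsigned long clamped_npages(unsigned long hmm_pfn,
                                      unsigned long addr, unsigned long end)
  {
          unsigned int order = hmm_pfn_to_map_order(hmm_pfn);
          unsigned long size = 1UL << (order + PAGE_SHIFT);
          unsigned long huge_start = ALIGN_DOWN(addr, size);
          unsigned long stop = min(huge_start + size, end);

          return (stop - addr) >> PAGE_SHIFT;
  }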
v2:
- Add kernel-doc (Matt B)
- s/fls/ilog2/ (Thomas)
Reported-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://lore.kernel.org/r/20250828142430.615826-11-matthew.auld@intel.com
Add Wa_18041344222 for Xe2_HPG, which requires disabling
the perf mode for subslice count during eustall sampling
when the enabled slices are discontiguous.