git.ipfire.org Git - thirdparty/kernel/stable.git/log

Merge tag 'amd-drm-next-6.18-2025-09-19' of https://gitlab.freedesktop.org/agd5f/linux into drm-next

amd-drm-next-6.18-2025-09-19:

amdgpu:
- Fence drv clean up fix
- DPC fixes
- Misc display fixes
- Support the MMIO remap page as a ttm pool
- JPEG parser updates
- UserQ updates
- VCN ctx handling fixes
- Documentation updates
- Misc cleanups
- SMU 13.0.x updates
- SI DPM updates
- GC 11.x cleaner shader updates
- DMCUB updates
- DML fixes
- Improve fallback handling for pixel encoding
- VCN reset improvements
- DCE6 DC updates
- DSC fixes
- Use devm for i2c buses
- GPUVM locking updates
- GPUVM documentation improvements
- Drop non-DC DCE11 code
- S0ix fixes
- Backlight fix
- SR-IOV fixes

amdkfd:
- SVM updates

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20250919193354.2989255-1-alexander.deucher@amd.com

Merge tag 'drm-xe-next-2025-09-19' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next

UAPI Changes:
- Drop L3 bank mask reporting from the media GT on Xe3 and later. Only
   do that for the primary GT. No userspace needs or uses it for media
   and some platforms may report bogus values.
- Add SLPC power_profile sysfs interface with support for base and
   power_saving modes (Vinay Belgaumkar, Rodrigo Vivi)
- Add configfs attributes to add post/mid context-switch commands
   (Lucas De Marchi)

Cross-subsystem Changes:
- Fix hmm_pfn_to_map_order() usage in gpusvm and refactor APIs to
   align with pieces previous handled by xe_hmm (Matthew Auld)

Core Changes:
- Add MEI driver for Late Binding Firmware Update/Upload
   (Alexander Usyskin)

Driver Changes:
- Fix GuC CT teardown wrt TLB invalidation (Satyanarayana)
- Fix CCS save/restore on VF (Satyanarayana)
- Increase default GuC crash buffer size (Zhanjun)
- Allow to clear GT stats in debugfs to aid debugging (Matthew Brost)
- Add more SVM GT stats to debugfs (Matthew Brost)
- Fix error handling in VMA attr query (Himal)
- Move sa_info in debugfs to be per tile (Michal Wajdeczko)
- Limit number of retries upon receiving NO_RESPONSE_RETRY from GuC to
   avoid endless loop (Michal Wajdeczko)
- Fix configfs handling for survivability_mode undoing user choice when
   unbinding the module (Michal Wajdeczko)
- Refactor configfs attribute visibility to future-proof it and stop
   exposing survivability_mode if not applicable (Michal Wajdeczko)
- Constify some functions (Harish Chegondi, Michal Wajdeczko)
- Add/extend more HW workarounds for Xe2 and Xe3
   (Harish Chegondi, Tangudu Tilak Tirumalesh)
- Replace xe_hmm with gpusvm (Matthew Auld)
- Improve fake pci and WA kunit handling for testing new platforms
   (Michal Wajdeczko)
- Reduce unnecessary PTE writes when migrating (Sanjay Yadav)
- Cleanup GuC interface definitions and log message (John Harrison)
- Small improvements around VF CCS (Michal Wajdeczko)
- Enable bus mastering for the I2C controller (Raag Jadav)
- Prefer devm_mutex of hand rolling it (Christophe JAILLET)
- Drop sysfs and debugfs attributes not available for VF (Michal Wajdeczko)
- GuC CT devm actions improvements (Michal Wajdeczko)
- Recommend new GuC versions for PTL and BMG (Julia Filipchuk)
- Improveme driver handling for exhaustive eviction using new
   xe_validation wrapper around drm_exec (Thomas Hellström)
- Add and use printk wrappers for tile and device (Michal Wajdeczko)
- Better document workaround handling in Xe (Lucas De Marchi)
- Improvements on ARRAY_SIZE  and ERR_CAST usage (Lucas De Marchi,
   Fushuai Wang)
- Align CSS firmware headers with the GuC APIs (John Harrison)
- Test GuC to GuC (G2G) communication to aid debug in pre-production
   firmware (John Harrison)
- Bail out driver probing if GuC fails to load (John Harrison)
- Allow error injection in xe_pxp_exec_queue_add()
   (Daniele Ceraolo Spurio)
- Minor refactors in xe_svm (Shuicheng Lin)
- Fix madvise ioctl error handling (Shuicheng Lin)
- Use attribute groups to simplify sysfs registration
   (Michal Wajdeczko)
- Add Late Binding Firmware implementation in Xe to work together with
   the MEI component (Badal Nilawar, Daniele Ceraolo Spurio, Rodrigo
   Vivi)
- Fix build with CONFIG_MODULES=n (Lucas De Marchi)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/c2et6dnkst2apsgt46dklej4nprqdukjosb55grpaknf3pvcxy@t7gtn3hqtp6n

drm/xe: Fix build with CONFIG_MODULES=n

When building with CONFIG_MODULES=n, the __exit functions are dropped.
However our init functions may call them for error handling, so they are
not good candidates for the exit sections.

Fix this error reported by 0day:

ld.lld: error: relocation refers to a symbol in a discarded section: xe_configfs_exit
>>> defined in vmlinux.a(drivers/gpu/drm/xe/xe_configfs.o)
>>> referenced by xe_module.c
>>> drivers/gpu/drm/xe/xe_module.o:(init_funcs) in archive vmlinux.a

This is the only exit function using __exit. Drop it to fix the build.

Cc: Riana Tauro <riana.tauro@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202506092221.1FmUQmI8-lkp@intel.com/
Fixes: 16280ded45fb ("drm/xe: Add configfs to enable survivability mode")
Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Link: https://lore.kernel.org/r/20250912-fix-nomodule-build-v1-1-d11b70a92516@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

Merge tag 'drm-intel-next-2025-09-12' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next

Cross-subsystem Changes:
- Overflow: add range_overflows and range_end_overflows (Jani)

Core Changes:
- Get rid of dev->struct_mutex (Luiz)

Non-display related:
- GVT: Remove redundant ternary operators (Liao)
- Various i915_utils clean-ups (Jani)

Display related:
- Wait PSR idle before on dsb commit (Jouni)
- Fix size for for_each_set_bit() in abox iteration (Jani)
- Abstract figuring out encoder name (Jani)
- Remove FBC modulo 4 restriction for ADL-P+ (Uma)
- Panic: refactor framebuffer allocation (Jani)
- Backlight luminance control improvements (Suraj, Aaron)
- Add intel_display_device_present (Jani)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/aMxX_lBxm7wd5wmi@intel.com

Merge tag 'drm-misc-next-fixes-2025-09-18' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next

Short summary of fixes pull:

pixpaper:
- Fix mode_valid function signature

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://lore.kernel.org/r/20250918064558.GA10017@linux.fritz.box

drm/xe/configfs: Add mid context restore bb

Like done for post context restore, allow the user to add commands to
the middle of context restore, at the beginning of engine restore
commands.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20250916-wa-bb-cmds-v5-7-306bddbc15da@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe/lrc: Allow to add user commands mid context switch

Like done for post-context-restore commands, allow to add commands from
configfs in the middle of context restore. Since currently the indirect
ctx hardcodes the offset to CTX_INDIRECT_CTX_OFFSET_DEFAULT, this is
executed in the very beginning of engine context restore.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20250916-wa-bb-cmds-v5-6-306bddbc15da@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe/lrc: Allow INDIRECT_CTX for more engine classes

Currently it's only allowed for render and compute. Going forward we
want to enable it for more engine classes. Let the XE_LRC_FLAG_INDIRECT_CTX
flag (and thus gt_engine_needs_indirect_ctx()) be the deciding factor
for its availability.

While at it, add the missing const to rcs_funcs array. Since
CTX_INDIRECT_CTX_OFFSET_DEFAULT already matches the HW default and
gt_engine_needs_indirect_ctx() only ever enables it for rcs/ccs, there
is no change in behavior, it's only preparation for future use case.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20250916-wa-bb-cmds-v5-5-306bddbc15da@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe/configfs: Add post context restore bb

Allow the user to specify commands to execute during a context restore.
Currently it's possible to parse 2 types of actions:

- cmd: the instructions are added as is to the bb
- reg: just use the address and value, without worrying about
encoding the right LRI instruction. This is possibly the most
useful use case, so added a dedicated action for that.

This also prepares for future BBs: mid context restore and rc6 context
restore that can re-use the same parsing functions.

Reviewed-by: Raag Jadav <raag.jadav@intel.com>
Link: https://lore.kernel.org/r/20250916-wa-bb-cmds-v5-4-306bddbc15da@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe/lrc: Allow to add user commands on context switch

During validation it's useful to allows additional commands to be
executed on context switch. Fetch the commands from configfs (to be
added) and add them to the WA BB.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20250916-wa-bb-cmds-v5-3-306bddbc15da@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe/configfs: Allow to select by class only

For a future configfs attribute, it's desirable to select by engine mask
only as the instance doesn't make sense.

Rename the function lookup_engine_mask() to lookup_engine_info() and
make it return the entry. This allows parse_engine() to still return an
item if the caller wants to allow parsing a class-only string like
"rcs", "bcs", "ccs", etc.

Reviewed-by: Raag Jadav <raag.jadav@intel.com>
Link: https://lore.kernel.org/r/20250916-wa-bb-cmds-v5-2-306bddbc15da@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe/configfs: Extract function to parse engine

Move the part that copies the engine to a local buffer so it can be
shared in future for other configfs attributes parsing an engine.

Reviewed-by: Raag Jadav <raag.jadav@intel.com>
Link: https://lore.kernel.org/r/20250916-wa-bb-cmds-v5-1-306bddbc15da@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/amd/display: Only restore backlight after amdgpu_dm_init or dm_resume

On clients that utilize AMD_PRIVATE_COLOR properties for HDR support,
brightness sliders can include a hardware controlled portion and a
gamma-based portion. This is the case on the Steam Deck OLED when using
gamescope with Steam as a client.

When a user sets a brightness level while HDR is active, the gamma-based
portion and/or hardware portion are adjusted to achieve the desired
brightness. However, when a modeset takes place while the gamma-based
portion is in-use, restoring the hardware brightness level overrides the
user's overall brightness level and results in a mismatch between what
the slider reports and the display's current brightness.

To avoid overriding gamma-based brightness, only restore HW backlight
level after boot or resume. This ensures that the backlight level is
set correctly after the DC layer resets it while avoiding interference
with subsequent modesets.

Fixes: 7875afafba84 ("drm/amd/display: Fix brightness level not retained over reboot")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4551
Signed-off-by: Matthew Schwartz <matthew.schwartz@linux.dev>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/atom: Check kcalloc() for WS buffer in amdgpu_atom_execute_table_locked()

kcalloc() may fail. When WS is non-zero and allocation fails, ectx.ws
remains NULL while ectx.ws_size is set, leading to a potential NULL
pointer dereference in atom_get_src_int() when accessing WS entries.

Return -ENOMEM on allocation failure to avoid the NULL dereference.

Signed-off-by: Guangshuo Li <lgs201920130244@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: revert to old status lock handling v3

It turned out that protecting the status of each bo_va with a
spinlock was just hiding problems instead of solving them.

Revert the whole approach, add a separate stats_lock and lockdep
assertions that the correct reservation lock is held all over the place.

This not only allows for better checks if a state transition is properly
protected by a lock, but also switching back to using list macros to
iterate over the state of lists protected by the dma_resv lock of the
root PD.

v2: re-add missing check
v3: split into two patches

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/xe/xe_late_bind_fw: Extract and print version info

Extract and print version info of the late binding binary.

v2: Some refinements (Daniele)

Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20250905154953.3974335-10-badal.nilawar@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe/xe_late_bind_fw: Introduce debug fs node to disable late binding

Introduce a debug filesystem node to disable late binding fw reload
during the system or runtime resume. This is intended for situations
where the late binding fw needs to be loaded from user mode,
perticularly for validation purpose.
Note that xe kmd doesn't participate in late binding flow from user
space. Binary loaded from the userspace will be lost upon entering to
D3 cold hence user space app need to handle this situation.

v2:
- s/(uval == 1) ? true : false/!!uval/ (Daniele)
v3:
- Refine the commit message (Daniele)

Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20250905154953.3974335-9-badal.nilawar@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe/xe_late_bind_fw: Reload late binding fw during system resume

Reload late binding fw during resume from system suspend

v2:
- Unconditionally reload late binding fw (Rodrigo)
- Flush worker during system suspend

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20250905154953.3974335-8-badal.nilawar@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe/xe_late_bind_fw: Reload late binding fw in rpm resume

Reload late binding fw during runtime resume.

Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20250905154953.3974335-7-badal.nilawar@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe/xe_late_bind_fw: Load late binding firmware

Load late binding firmware

v2:
- s/EAGAIN/EBUSY/
- Flush worker in suspend and driver unload (Daniele)
v3:
- Use retry interval of 6s, in steps of 200ms, to allow
   other OS components release MEI CL handle (Sasha)
v4:
- return -ENODEV if component not added (Daniele)
- parse and print status returned by csc
v5:
- Use payload to check firmware valid (Daniele)
- Obtain the RPM reference before scheduling the worker to
   ensure the device remains awake until the worker completes
   firmware loading (Rodrigo)
v6:
- In case of error donot re-attempt fw download (Daniele)
v7 (Rodrigo):
- Rename of mei structs and callback.

Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20250905154953.3974335-6-badal.nilawar@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe/xe_late_bind_fw: Initialize late binding firmware

Search for late binding firmware binaries and populate the meta data of
firmware structures.

v2 (Daniele):
- drm_err if firmware size is more than max pay load size
- s/request_firmware/firmware_request_nowarn/ as firmware will
not be available for all possible cards
v3 (Daniele):
- init firmware from within xe_late_bind_init, propagate error
- switch late_bind_fw to array to handle multiple firmware types
v4 (Daniele):
- Alloc payload dynamically, fix nits
v6 (Daniele)
- %s/MAX_PAYLOAD_SIZE/XE_LB_MAX_PAYLOAD_SIZE/

Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20250905154953.3974335-5-badal.nilawar@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe/xe_late_bind_fw: Introduce xe_late_bind_fw

Introduce xe_late_bind_fw to enable firmware loading for the devices,
such as the fan controller, during the driver probe. Typically,
firmware for such devices are part of IFWI flash image but can be
replaced at probe after OEM tuning.
This patch binds mei late binding component to enable firmware loading.

v2:
- Add devm_add_action_or_reset to remove the component (Daniele)
- Add INTEL_MEI_GSC check in xe_late_bind_init() (Daniele)
v3:
- Fail driver probe if late bind initialization fails,
add has_late_bind flag (Daniele)
v4:
- %s/I915_COMPONENT_LATE_BIND/INTEL_COMPONENT_LATE_BIND/
v6:
- rebased
v7:
- rebased
- In xe_late_bind_init, use drm_err when returning an error to
stop the probe (Lucas)
- Use imperative mode in commit message (Lucas)

Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20250905154953.3974335-4-badal.nilawar@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

mei: late_bind: add late binding component driver

Introduce a new MEI client driver to support Late Binding firmware
upload/update for Intel discrete graphics platforms.

Late Binding is a runtime firmware upload/update mechanism that allows
payloads, such as fan control and voltage regulator, to be securely
delivered and applied without requiring SPI flash updates or
system reboots. This driver enables the Xe graphics driver and other
user-space tools to push such firmware blobs to the authentication
firmware via the MEI interface.

The driver handles authentication, versioning, and communication
with the authentication firmware, which in turn coordinates with
the PUnit/PCODE to apply the payload.

This is a foundational component for enabling dynamic, secure,
and re-entrant configuration updates on platforms like Battlemage.

Cc: Badal Nilawar <badal.nilawar@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20250905154953.3974335-3-badal.nilawar@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

mei: bus: add mei_cldev_mtu interface

Add a new helper function that allows MEI client drivers
to query the maximum transmission unit (MTU) for a connected
MEI client.

This is useful for clients that need to transmit large payloads,
such as firmware blobs, allowing them to determine the maximum
message size that can be safely sent before starting transmission and
size of the buffer to allocate when receiving data.

Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20250905154953.3974335-2-badal.nilawar@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/amdgpu: add missing comment for the new argument

In function 'amdgpu_vm_lock_done_list' update the comment
for the new argument 'vm'.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202509180211.UAqME0zj-lkp@intel.com/
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: suspend KFD and KGD user queues for S0ix

We need to make sure the user queues are preempted so
GFX can enter gfxoff.

Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Tested-by: David Perry <david.perry@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/userq: Optimize S0ix handling

In S0i3, GFX state is retained, so it's preferrable to
preempt queues rather than unmapping them as the overhead
is lower.

Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Tested-by: David Perry <david.perry@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Fix PRT flag for gfx12

AMDGPU_PTE_PRT_GFX12 flag is missed during pageTable rework, add it back.

Fixes: 6716a823d18d ("drm/amdgpu: rework how PTE flags are generated v3")
Signed-off-by: Joe Wang <joe.wang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Check VF critical region before RAS poison injection

Check VF critical region before RAS poison injection to ensure that the
poison injection will not hit the VF critical region.

Signed-off-by: Xiang Liu <xiang.liu@amd.com>
Reviewed-by: Shravan Kumar Gande <Shravankumar.Gande@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd: add proper handling for S0ix

When in S0i3, the GFX state is retained, so all we need to do
is stop the runlist so GFX can enter gfxoff.

Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Tested-by: David Perry <david.perry@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Introduce VF critical region check for RAS poison injection

The SRIOV guest send requet to host to check whether the poison
injection address is in VF critical region or not via mabox.

Signed-off-by: Xiang Liu <xiang.liu@amd.com>
Reviewed-by: Shravan Kumar Gande <Shravankumar.Gande@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: remove non-DC DCE 11 code

DC has been the default for ~8 years now and supports
many things that the non-DC code does not (audio, DP MST, etc.).
No DCE 11.x IPs ever supported analog encoders so that is not
an issue. Finally drop this code.

Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: Enable npm metrics data

Enable npm metrics data for smu_v13_0_12

v3: Add node id check for setting NPM_CAPS (Lijo)

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: Fetch npm data from system metrics table

Fetch npm data from system metrics table for smu_v13_0_12

v3: Remove intermittent type for npm data, remove node id check,
move npm caps check to npm_get_data function (Lijo)

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: Add sysfs node for node power

Add sysfs node to expose node power limit for smu_v13_0_12

v2: Remove support check from visible function (Kevin)

v3: Update comments (Kevin)
Remove sysfs remove file, change format specifier
for sysfs_emit, use attribute_group.name (Lijo)

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: Allow system metrics table in 1vf mode

Allow fetching system metrics table in 1VF mode

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/xe: Work around clang multiple goto-label error

When using drm_exec_retry_on_contention(), clang may consider
all labels for which we take addresses in a function as
potential retry goto targets, although strictly only one
is possible. It will then in some situations generate false
positive errors.

In this case, the compiler, for some architectures, consider the

might_lock(&m->job_mutex);

as a potential goto target from drm_exec_retry_on_contention(),
and errors.

Work around that by moving the xe_validate / drm_exec
transaction to a separate function.

v2:
- New commit message based on analysis of Nathan Chancellor

Fixes: 59eabff2a352 ("drm/xe: Convert xe_bo_create_pin_map() for exhaustive eviction")
Cc: Matthew Brost <matthew.brost@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202509101853.nDmyxTEM-lkp@intel.com/
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org> # build
Link: https://lore.kernel.org/r/20250911080324.180307-1-thomas.hellstrom@linux.intel.com

drm/xe/sysfs: Simplify sysfs registration

Instead of manually maintaining each sysfs file define and use
attribute groups and register them using device managed function.
Then use is_visible() to filter-out unsupported attributes.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Raag Jadav <raag.jadav@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250916170029.3313-3-michal.wajdeczko@intel.com

drm/xe/vf: Don't expose sysfs attributes not applicable for VFs

VFs can't read BMG_PCIE_CAP(0x138340) register nor access PCODE
(already guarded by the info.skip_pcode flag) so we shouldn't
expose attributes that require any of them to avoid errors like:

[] xe 0000:03:00.1: [drm] Tile0: GT0: VF is trying to read an \
                     inaccessible register 0x138340+0x0
[] RIP: 0010:xe_gt_sriov_vf_read32+0x6c2/0x9a0 [xe]
[] Call Trace:
[]  xe_mmio_read32+0x110/0x280 [xe]
[]  auto_link_downgrade_capable_show+0x2e/0x70 [xe]
[]  dev_attr_show+0x1a/0x70
[]  sysfs_kf_seq_show+0xaa/0x120
[]  kernfs_seq_show+0x41/0x60

Fixes: 0e414bf7ad01 ("drm/xe: Expose PCIe link downgrade attributes")
Fixes: cdc36b66cd41 ("drm/xe: Expose fan control and voltage regulator version")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Lukasz Laguna <lukasz.laguna@intel.com>
Reviewed-by: Raag Jadav <raag.jadav@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250916170029.3313-2-michal.wajdeczko@intel.com

drm/xe/madvise: Fix ioctl argument check

It is "preferred_mem_loc" instead of "atomic" for the ATTR_PREFERRED_LOC
path.

Also include 2 minor changes with no functional impact.
1. Remove the redundant "attr.atomic_access" assignment.
2. Replace down_read_interruptible() with
xe_svm_notifier_lock_interruptible() to pair with
xe_svm_notifier_unlock().

Fixes: ada7486c5668 ("drm/xe: Implement madvise ioctl for xe")
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://lore.kernel.org/r/20250911173139.1405878-2-shuicheng.lin@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>

drm/xe: Misc refine for svm

These changes should have no functional impact.
1. Correct typo of "operation"in macro range_debug().
2. Combine 2 spin_lock() call in xe_svm_garbage_collector() into 1.
3. Drop redundant preferred_region_is_vram check in
   xe_svm_range_needs_migrate_to_vram().
4. Combine the devmem_possible check in xe_svm_handle_pagefault().
   need_vram includes the IS_DGFX() check, so there is no change for
   .devmem_only.

v2: revert !ctx.devmem_only change (Matt)
v3: rebase code and refine commit message.
v4: rebase code and refine commit message.

Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20250911031405.1371812-2-shuicheng.lin@intel.com

drm/xe/tests: Add pre-GMDID IP descriptors to param generators

Recently introduced kunit parameter generators were based on
the existing arrays which have only GDMID-based IPs and didn't
take into account IP definitions from pre-GMDID era.

Add test only arrays with pre-GMDID IPs (as those will not change)
and extend param generators to start iterating over them.

[ ] =================== xe_pci (2 subtests) ====================
[ ] ==================== check_graphics_ip ====================
[ ] [PASSED] 12.00 Xe_LP
[ ] [PASSED] 12.10 Xe_LP+
[ ] [PASSED] 12.55 Xe_HPG
[ ] [PASSED] 12.60 Xe_HPC
[ ] [PASSED] 12.70 Xe_LPG
[ ] [PASSED] 12.71 Xe_LPG
[ ] [PASSED] 12.74 Xe_LPG+
[ ] [PASSED] 20.01 Xe2_HPG
[ ] [PASSED] 20.02 Xe2_HPG
[ ] [PASSED] 20.04 Xe2_LPG
[ ] [PASSED] 30.00 Xe3_LPG
[ ] [PASSED] 30.01 Xe3_LPG
[ ] [PASSED] 30.03 Xe3_LPG
[ ] ================ [PASSED] check_graphics_ip ================
[ ] ===================== check_media_ip ======================
[ ] [PASSED] 12.00 Xe_M
[ ] [PASSED] 12.55 Xe_HPM
[ ] [PASSED] 13.00 Xe_LPM+
[ ] [PASSED] 13.01 Xe2_HPM
[ ] [PASSED] 20.00 Xe2_LPM
[ ] [PASSED] 30.00 Xe3_LPM
[ ] [PASSED] 30.02 Xe3_LPM
[ ] ================= [PASSED] check_media_ip ==================
[ ] ===================== [PASSED] xe_pci ======================

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250916171645.3335-1-michal.wajdeczko@intel.com

Merge tag 'drm-rust-next-2025-09-16' of https://gitlab.freedesktop.org/drm/rust/kernel into drm-next

DRM Rust changes for v6.18

Alloc
  - Add BorrowedPage type and AsPageIter trait
  - Implement Vmalloc::to_page() and VmallocPageIter
  - Implement AsPageIter for VBox and VVec

DMA & Scatterlist
  - Add dma::DataDirection and type alias for dma_addr_t
  - Abstraction for struct scatterlist and struct sg_table

DRM
  - In the DRM GEM module, simplify overall use of generics, add
    DriverFile type alias and drop Object::SIZE.

Nova (Core)
  - Various register!() macro improvements (paving the way for lifting
    it to common driver infrastructure)
  - Minor VBios fixes and refactoring
  - Minor firmware request refactoring
  - Advance firmware boot stages; process Booter and patch its
    signature, process GSP and GSP bootloader
  - Switch development fimrware version to r570.144
  - Add basic firmware bindings for r570.144
  - Move GSP boot code to its own module
  - Clean up and take advantage of pin-init features to store most of
    the driver's private data within a single allocation
  - Update ARef import from sync::aref
  - Add website to MAINTAINERS entry

Nova (DRM)
  - Update ARef import from sync::aref
  - Add website to MAINTAINERS entry

Pin-Init
  - Merge pin-init PR from Benno
    - `#[pin_data]` now generates a `*Projection` struct similar to the
      `pin-project` crate.

    - Add initializer code blocks to `[try_][pin_]init!` macros: make
      initializer macros accept any number of `_: {/* arbitrary code
      */},` & make them run the code at that point.

    - Make the `[try_][pin_]init!` macros expose initialized fields via
      a `let` binding as `&mut T` or `Pin<&mut T>` for later fields.

Rust
  - Various methods for AsBytes and FromBytes traits

Tyr
  - Initial Rust driver skeleton for ARM Mali GPUs.
    - It can power up the GPU, query for GPU metatdata through MMIO and
      provide the metadata to userspace via DRM device IOCTL (struct
      drm_panthor_dev_query).

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: "Danilo Krummrich" <dakr@kernel.org>
Link: https://lore.kernel.org/r/DCUC4SY6SRBD.1ZLHAIQZOC6KG@kernel.org

drm/xe: Allow error injection for xe_pxp_exec_queue_add

This will allow us to simulate this function returning an error like
we do for other functions called in the exec_queue_create path.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://lore.kernel.org/r/20250909221240.3711023-4-daniele.ceraolospurio@intel.com

drm/xe: Fix error handling if PXP fails to start

Since the PXP start comes after __xe_exec_queue_init() has completed,
we need to cleanup what was done in that function in case of a PXP
start error.
__xe_exec_queue_init calls the submission backend init() function,
so we need to introduce an opposite for that. Unfortunately, while
we already have a fini() function pointer, it performs other
operations in addition to cleaning up what was done by the init().
Therefore, for clarity, the existing fini() has been renamed to
destroy(), while a new fini() has been added to only clean up what was
done by the init(), with the latter being called by the former (via
xe_exec_queue_fini).

Fixes: 72d479601d67 ("drm/xe/pxp/uapi: Add userspace and LRC support for PXP-using queues")
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://lore.kernel.org/r/20250909221240.3711023-3-daniele.ceraolospurio@intel.com

drm/amdgpu: re-order and document VM code

Re-order fields in the VM structure and try to improve the
documentation a bit.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: remove check for BO reservation add assert instead

We should leave such checks to lockdep and not implement something
manually.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: Update pmfw headers for smu_v13_0_12

Update pmfw headers for smu_v13_0_12 to include node power limit

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: Rename amdgpu_hwmon_get_sensor_generic

Rename amdgpu_hwmon_get_sensor_generic to use for generic pm
interfaces

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd: Only restore cached manual clock settings in restore if OD enabled

If OD is not enabled then restoring cached clock settings doesn't make
sense and actually leads to errors in resume.

Check if enabled before restoring settings.

Fixes: 4e9526924d09 ("drm/amd: Restore cached manual clock settings during resume")
Reported-by: Jérôme Lécuyer <jerome.4a4c@gmail.com>
Closes: https://lore.kernel.org/amd-gfx/0ffe2692-7bfa-4821-856e-dd0f18e2c32b@amd.com/T/#me6db8ddb192626360c462b7570ed7eba0c6c9733
Suggested-by: Jérôme Lécuyer <jerome.4a4c@gmail.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: Use devm_i2c_add_adapter() in the V14_0_2 smu

The I2C init for V14_0_2 uses i2c_add_adapter() and i2c_del_adapter(),
this commit replaces the use of these two functions with
devm_i2c_add_adapter(). Notice that V14_0_2 init initializes multiple
I2C buses in a loop; if something goes wrong, the previous adapters are
removed, and the amdgpu load is interrupted. Since I2C init is required
for the correct load of amdgpu, it is safe to rely on
devm_i2c_add_adapter() to handle any previously initialized I2C adapter.

Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: Use devm_i2c_add_adapter() in the V13_0_6 smu

The I2C init for V13_0_6 uses i2c_add_adapter() and i2c_del_adapter(),
this commit replaces the use of these two functions with
devm_i2c_add_adapter(). Notice that V13_0_6 init initializes multiple
I2C buses in a loop; if something goes wrong, the previous adapters are
removed, and the amdgpu load is interrupted. Since I2C init is required
for the correct load of amdgpu, it is safe to rely on
devm_i2c_add_adapter() to handle any previously initialized I2C adapter.

Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: Use devm_i2c_add_adapter() in the V13 smu

The I2C init for SMU_V13 uses i2c_add_adapter() and i2c_del_adapter(),
this commit replaces the use of these two functions with
devm_i2c_add_adapter(). Notice that SMU_V13 init initializes multiple
I2C buses in a loop; if something goes wrong, the previous adapters are
removed, and the amdgpu load is interrupted. Since I2C init is required
for the correct load of amdgpu, it is safe to rely on
devm_i2c_add_adapter() to handle any previously initialized I2C adapter.

Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: Use devm_i2c_add_adapter() in the Sienna smu

The I2C init for Sienna Cichlid uses i2c_add_adapter() and
i2c_del_adapter(), this commit replaces the use of these two functions
with devm_i2c_add_adapter(). Notice that Sienna Cichlid init initializes
multiple I2C buses in a loop; if something goes wrong, the previous
adapters are removed, and the amdgpu load is interrupted. Since I2C init
is required for the correct load of amdgpu, it is safe to rely on
devm_i2c_add_adapter() to handle any previously initialized I2C adapter.

Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: Use devm_i2c_add_adapter() in the Navi10 smu

The I2C init for Navi10 uses i2c_add_adapter() and i2c_del_adapter(),
this commit replaces the use of these two functions with
devm_i2c_add_adapter(). Notice that Navi10 init initializes multiple I2C
buses in a loop; if something goes wrong, the previous adapters are
removed, and the amdgpu load is interrupted. Since I2C init is required
for the correct load of amdgpu, it is safe to rely on
devm_i2c_add_adapter() to handle any previously initialized I2C adapter.

Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: Use devm_i2c_add_adapter() in the Arcturus smu

The I2C init for Arcturus uses i2c_add_adapter() and i2c_del_adapter(),
this commit replaces the use of these two functions with
devm_i2c_add_adapter(). Notice that Arcturus init initializes multiple
I2C buses in a loop; if something goes wrong, the previous adapters are
removed, and the amdgpu load is interrupted. Since I2C init is required
for the correct load of amdgpu, it is safe to rely on
devm_i2c_add_adapter() to handle any previously initialized I2C adapter.

Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: Use devm_i2c_add_adapter() in the i2c init

Instead of using i2c_add_adapter() and i2c_del_adapter(), replace them
with devm_i2c_add_adapter() to simplify the i2c logic.

Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Use devm_i2c_add_adapter() in SMU V11

Instead of using i2c_add_adapter() and i2c_del_adapter() in the SMU V11,
use devm_i2c_add_adapter() to simplify the code path.

Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/amdgpu_i2c: Use devm_i2c_add_adapter instead of i2c_add_adapter

This commit replaces i2c_add_adapter() with devm_i2c_add_adapter() and
removes part of the cleanup logic since the new function handles the i2c
removal.

Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Use devm_i2c_add_adapter to simplify i2c cleanup logic

This commit replaces the utilization of i2c_add/del_adapter() with
devm_i2c_add_adapter() to reduce the amount of boilerplate. Using
devm_i2c_add_adapter() has the advantage of removing the manual
manipulation of the I2C adapter.

Suggested-by: Robert Beckett <bob.beckett@collabora.com>
Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Use kmalloc_array() instead of kmalloc()

Documentation/process/deprecated.rst recommends against the use of kmalloc
with dynamic size calculations due to the risk of overflow and smaller
allocation being made than the caller was expecting. This could lead to
buffer overflow in code similar to the memcpy in
amdgpu_dm_plane_add_modifier().

Signed-off-by: James Flowers <bold.zone2373@fastmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: fix userq VM validation v4

That was actually complete nonsense and not validating the BOs
at all. The code just cleared all VM areas were it couldn't grab the
lock for a BO.

Try to fix this. Only compile tested at the moment.

v2: fix fence slot reservation as well as pointed out by Sunil.
also validate PDs, PTs, per VM BOs and update PDEs
v3: grab the status_lock while working with the done list.
v4: rename functions, add some comments, fix waiting for updates to
complete.
v4: rename amdgpu_vm_lock_done_list(), add some more comments

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: reject gang submissions under SRIOV

Gang submission means that the kernel driver guarantees that multiple
submissions are executed on the HW at the same time on different engines.

Background is that those submissions then depend on each other and each
can't finish stand alone.

SRIOV now uses world switch to preempt submissions on the engines to allow
sharing the HW resources between multiple VFs.

The problem is now that the SRIOV world switch can't know about such inter
dependencies and will cause a timeout if it waits for a partially running
gang submission.

To conclude SRIOV and gang submissions are fundamentally incompatible at
the moment. For now just disable them.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/xe: Remove duplicate header files

Fix some duplicate includes in xe:
./drivers/gpu/drm/xe/xe_tlb_inval.c: xe_tlb_inval.h is included more than once.
./drivers/gpu/drm/xe/xe_pt.c: xe_tlb_inval_job.h is included more than once.

While at it, also sort the include lines alphabetically.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=24705
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=24706
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
[Reword commit message]
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250916021039.1632766-1-yang.lee@linux.alibaba.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe/guc: Return an error code if the GuC load fails

Due to multiple explosion issues in the early days of the Xe driver,
the GuC load was hacked to never return a failure. That prevented
kernel panics and such initially, but now all it achieves is creating
more confusing errors when the driver tries to submit commands to a
GuC it already knows is not there. So fix that up.

As a stop-gap and to help with debug of load failures due to invalid
GuC init params, a wedge call had been added to the inner GuC load
function. The reason being that it leaves the GuC log accessible via
debugfs. However, for an end user, simply aborting the module load is
much cleaner than wedging and trying to continue. The wedge blocks
user submissions but it seems that various bits of the driver itself
still try to submit to a dead GuC and lots of subsequent errors occur.
And with regards to developers debugging why their particular code
change is being rejected by the GuC, it is trivial to either add the
wedge back in and hack the return code to zero again or to just do a
GuC log dump to dmesg.

v2: Add support for error injection testing and drop the now redundant
wedge call.

CC: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Link: https://lore.kernel.org/r/20250909224132.536320-1-John.C.Harrison@Intel.com

drm/xe/sysfs: Add cleanup action in xe_device_sysfs_init

On partial failure, some sysfs files created before the failure might
not be removed. Add common cleanup step to remove them all immediately,
as is should be harmless to attempt to remove non-existing files.

Fixes: 0e414bf7ad01 ("drm/xe: Expose PCIe link downgrade attributes")
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Stuart Summers <stuart.summers@intel.com>
Cc: Shuicheng Lin <shuicheng.lin@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Zongyao Bai <zongyao.bai@intel.com>
Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250915214716.1327379-2-zongyao.bai@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

Merge tag 'exynos-drm-next-for-v6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-next

New feature
- Add glue layer support for Exynos7870 DSIM in Exynos DSI driver
  . Introduces Exynos7870 DSIM bridge integration at Exynos DRM DSI layer.

Bug fixups for exynos7_drm_decon.c module
- Remove redundant ctx->suspended state handling
  . Cleans up unused state check logic as call flow is now correctly managed.
  . Fixes an issue where decon_commit() was blocked from decon_atomic_enable() due to incorrect state setting.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Inki Dae <inki.dae@samsung.com>
Link: https://lore.kernel.org/r/20250915113543.51294-1-inki.dae@samsung.com

Merge tag 'exynos-drm-misc-next-for-v6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-next

New feature
- Add DSIM bridge driver support for Exynos7870
. Introduces Exynos7870 DSIM IP block support in the samsung-dsim bridge driver.
- Document Exynos7870 DSIM compatible in dt-bindings
. Adds exynos7870 compatible string and required clocks in device tree schema.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Inki Dae <inki.dae@samsung.com>
Link: https://lore.kernel.org/r/20250915111802.28177-1-inki.dae@samsung.com

Merge tag 'drm-msm-next-2025-09-12' of https://gitlab.freedesktop.org/drm/msm into drm-next

Changes for v6.18

GPU and Core:
- in DT bindings describe clocks per GPU type
- GMU bandwidth voting for x1-85
- a663 speedbins
- a623 speedbins
- cleanup some remaining no-iommu leftovers after VM_BIND conversion
- fix GEM obj 32b size truncation
- add missing VM_BIND param validation
- various fixes
- IFPC for x1-85 and a750
- register xml and gen_header.py sync from mesa

Display:
- add missing bindings for display on SC8180X
- added DisplayPort MST bindings
- conversion from round_rate() to determine_rate()
- DSI PHY fixes, correcting programming glitches
- misc small fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <rob.clark@oss.qualcomm.com>
Link: https://lore.kernel.org/r/CACSVV01FgXN+fD6U1Hi6Tj4WCf=V-+NO8BXi+80iS4qOZwpaGg@mail.gmail.com

drm/amd: Drop unnecessary calls to smu_dpm_set_vpe_enable()

smu_hw_init() and smu_hw_fini() call smu_dpm_set_vpe_enable() for
APUs as part of startup and teardown.  These calls however are
not necessary because vpe_hw_init()/vpe_hw_fini() will call at
init/fini:

```
vpe_hw_init() / vpe_hw_fini()
  amdgpu_device_ip_set_powergating_state()
    vpe_set_powergating_state()
      amdgpu_dpm_enable_vpe()
        amdgpu_dpm_set_powergating_by_smu()
          smu_dpm_set_power_gate()
            smu_dpm_set_vpe_enable()
```

Drop the extra calls.

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: revert "Implement new dummy vram manager"

This is should be unnecessary since a VRAM manager isn't mandatory in
the first place.

It could be that we have some missing checks inside AMDGPU or TTM but
those should then be fixed instead of worked around like that.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add AMDGPU_IDS_FLAGS_GANG_SUBMIT

Add a UAPI flag indicating if gang submit is supported or not.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Don't use non-registered VUPDATE on DCE 6

The VUPDATE interrupt isn't registered on DCE 6, so don't try
to use that.

This fixes a page flip timeout after sleep/resume on DCE 6.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Disable VRR on DCE 6

DCE 6 was not advertised as being able to support VRR,
so let's mark it as unsupported for now.

The VRR implementation in amdgpu_dm depends on the VUPDATE
interrupt which is not registered for DCE 6.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Disable fastboot on DCE 6 too

It already didn't work on DCE 8,
so there is no reason to assume it would on DCE 6.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display/dml2: Guard dml21_map_dc_state_into_dml_display_cfg with DC_FP_START

dml21_map_dc_state_into_dml_display_cfg calls (the call is usually
inlined by the compiler) populate_dml21_surface_config_from_plane_state
and populate_dml21_plane_config_from_plane_state which may use FPU.  In
a x86-64 build:

    $ objdump --disassemble=dml21_map_dc_state_into_dml_display_cfg \
    > drivers/gpu/drm/amd/display/dc/dml2/dml21/dml21_translation_helper.o |
    > grep %xmm -c
    63

Thus it needs to be guarded with DC_FP_START.  But we must note that the
current code quality of the in-kernel FPU use in AMD dml2 is very much
problematic: we are actually calling DC_FP_START in dml21_wrapper.c
here, and this translation unit is built with CC_FLAGS_FPU.  Strictly
speaking this does not make any sense: with CC_FLAGS_FPU the compiler is
allowed to generate FPU uses anywhere in the translated code, perhaps
out of the DC_FP_START guard.  This problematic pattern also occurs in
at least dml2_wrapper.c, dcn35_fpu.c, and dcn351_fpu.c.  Thus we really
need a careful audit and refactor for the in-kernel FPU uses, and this
patch is simply whacking a mole.  However per the reporter, whacking
this mole is enough to make a 9060XT "just work."

Reported-by: Asiacn <710187964@qq.com>
Closes: https://github.com/loongson-community/discussions/issues/102
Tested-by: Asiacn <710187964@qq.com>
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Reviewed-by: Huacai Chen <chenhuacai@loongson.cn>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Keep PLL0 running on DCE 6.0 and 6.4

DC can turn off the display clock when no displays are connected
or when all displays are off, for reference see:
- dce*_validate_bandwidth

DC also assumes that the DP clock is always on and never powers
it down, for reference see:
- dce110_clock_source_power_down

In case of DCE 6.0 and 6.4, PLL0 is the clock source for both
the engine clock and DP clock, for reference see:
- radeon_atom_pick_pll
- atombios_crtc_set_disp_eng_pll

Therefore, PLL0 should be always kept running on DCE 6.0 and 6.4.
This commit achieves that by ensuring that by setting the display
clock to the corresponding value in low power state instead of
zero.

This fixes a page flip timeout on SI with DC which happens when
all connected displays are blanked.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Fix DVI-D/HDMI adapters

When the EDID has the HDMI bit, we should simply select
the HDMI signal type even on DVI ports.

For reference see, the legacy amdgpu display code:
amdgpu_atombios_encoder_get_encoder_mode
which selects ATOM_ENCODER_MODE_HDMI for the same case.

This commit fixes DVI connectors to work with DVI-D/HDMI
adapters so that they can now produce output over these
connectors for HDMI monitors with higher bandwidth modes.
With this change, even HDMI audio works through DVI.

For testing, I used a CAA-DMDHFD3 DVI-D/HDMI adapter
with the following GPUs:

Tahiti (DCE 6) - DC can now output 4K 30 Hz over DVI
Polaris 10 (DCE 11.2) - DC can now output 4K 60 Hz over DVI

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd: add function svm_migrate_successful_pages

to get migration pages. dst MIGRATE_PFN_VALID bit and src
MIGRATE_PFN_MIGRATE bit should always be set when migration success.

cpage includes src MIGRATE_PFN_MIGRATE bit set and MIGRATE_PFN_VALID
bit unset pages for both RAM and VRAM when memory is only allocated
without being populated before migration, those ram pages should be
counted as migrated pages and those vram pages should not be counted
as migrated pages. Here migration pages refer to how many vram pages
invloved. Current svm_migrate_unsuccessful_pages only covers the
unsuccessful case that source is on RAM.

So far, we only see two unsuccessful migration cases. Since we
can clearly identify successful migration cases through dst
MIGRATE_PFN_VALID bit and src MIGRATE_PFN_MIGRATE bit within this
prange, also eventually successful migration pages will be used,
so we can use function svm_migrate_successful_pages to replace
function svm_migrate_unsuccessful_pages.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Philip Yang<Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Revert "drm/amdkfd: return migration pages from copy function"

This reverts commit bd6093e2f1601c0c83906f5115a2efb6b93050b1.

migrate_vma_pages can fail if a CPU thread faults on the same page.
However, the page table is locked and only one of the new pages will
be inserted. The device driver will see that the MIGRATE_PFN_MIGRATE
bit is cleared if it loses the race.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/amdgpu: Fix the mes version that support inv_tlbs

MES pipe0 will do VM invalidation with engine set 5 when assign VMID to a process,
driver will submit inv_tlb package to mes pipe1. It might run into race condition
if both pipes use the same invalidate engine set. From MES version 0x83 it will use
invalidate engine set 6 for pipe1 to fix the issue

Signed-off-by: Shaoyun Liu <shaoyun.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd: Avoid evicting resources at S5

Normally resources are evicted on dGPUs at suspend or hibernate and
on APUs at hibernate. These steps are unnecessary when using the S4
callbacks to put the system into S5.

Cc: AceLan Kao <acelan.kao@canonical.com>
Cc: Kai-Heng Feng <kaihengf@nvidia.com>
Cc: Mark Pearson <mpearson-lenovo@squebb.ca>
Cc: Denis Benato <benato.denis96@gmail.com>
Cc: Merthan Karakaş <m3rthn.k@gmail.com>
Tested-by: Eric Naim <dnaim@cachyos.org>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Switch user queues to use preempt/restore for eviction

This patch modifies the user queue management to use preempt/restore
operations instead of full map/unmap for queue eviction scenarios where
applicable. The changes include:

1. Introduces new helper functions:
   - amdgpu_userqueue_preempt_helper()
   - amdgpu_userqueue_restore_helper()

2. Updates queue state management to track PREEMPTED state

3. Modifies eviction handling to use preempt instead of unmap:
   - amdgpu_userq_evict_all() now uses preempt_helper
   - amdgpu_userq_restore_all() now uses restore_helper

The preempt/restore approach provides better performance during queue
eviction by avoiding the overhead of full queue teardown and setup.
Full map/unmap operations are still used for initial setup/teardown
and system suspend scenarios.

v2: rename amdgpu_userqueue_restore_helper/amdgpu_userqueue_preempt_helper to
amdgpu_userq_restore_helper/amdgpu_userq_preempt_helper for consistency. (Alex)

v3: amdgpu_userq_stop_sched_for_enforce_isolation() and
amdgpu_userq_start_sched_for_enforce_isolation() should use preempt and restore (Alex)

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: adjust MES API used for suspend and resume

Use the suspend and resume API rather than remove queue
and add queue API.  The former just preempts the queue
while the latter remove it from the scheduler completely.
There is no need to do that, we only need preemption
in this case.

V2: replace queue_active with queue state
v3: set the suspend_fence_addr
v4: allocate another per queue buffer for the suspend fence, and  set the sequence number.
    also wait for the suspend fence. (Alex)
v5: use a wb slot (Alex)
v6: Change the timeout period. For MES, the default timeout  is  2100000; /* 2100 ms */ (Alex)

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: unified smu feature cap for vcn reset

unified vcn reset smu feature cap

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: unified smu feature cap for sdma reset

unified sdma reset smu feature cap

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: unified smu feature cap for link reset

unified link reset smu feature cap

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Promote DC to 3.2.350

This version brings along following updates:
- Add DSC padding for OVT support
- Setup pixel encoding for YCBCR422
- Fix dml ms order
- Rename header file link.h to link_service.h
- Fix DMUB loading sequence
- Modify link training policy

Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Acked-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Revert "drm/amd/display: Reduce Stack Usage by moving 'audio_output' into 'stream_res' v4"

This reverts commit 1cf1205ef268 ("drm/amd/display: Reduce Stack Usage by moving 'audio_output' into 'stream_res' v4")

Reason for revert: Causes DP compliance errors

Acked-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Martin Leung <Martin.Leung@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Add DSC padding for OVT Support

[Why]
-Certain OVT timings require DSC configurations which divide the
horizontal active unevenly across DSC slices
-DSC slices must be even, so padding needs to be added to the active
to make this possible
-The pixel clock of the HW now needs to be increased to accommodate
the extra padded pixels
-To keep the line time the same, the blank of the HW timing needs to
be increased as well

[How]
-Calculate h_active padding, h_total padding, and pixel clock based
off of the original OVT timing and DSC calculations
-Store these values in the pipe and program HW with these modifications
-Added general support for cases where DSC slice config does not evenly
split the horizontal active by fixing some slice width calculations
-Updated PPS calculations for these cases

Reviewed-by: Chris Park <chris.park@amd.com>
Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Relja Vojvodic <rvojvodi@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Add fallback path for YCBCR422

[Why]
DP validation may fail with multiple displays and higher color depths.
The sink may support others though.

[How]
When DP bandwidth validation fails, progressively fallback through:
- YUV422 8bpc (bandwidth efficient)
- YUV422 6bpc (reduced color depth)
- YUV420 (last resort)

This resolves cases where displays would show no image due to insufficient
DP link bandwidth for the requested RGB mode.

Suggested-by: Mauri Carvalho <mcarvalho3@lenovo.com>
Reviewed-by: Wayne Lin <wayne.lin@amd.com>
Signed-off-by: Mario Limonciello <Mario.Limonciello@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Set up pixel encoding for YCBCR422

[Why]
fill_stream_properties_from_drm_display_mode() will not configure pixel
encoding to YCBCR422 when the DRM color format supports YCBCR422 but not
YCBCR420 or YCBCR4444. Instead it will fallback to RGB.

[How]
Add support for YCBCR422 in pixel encoding mapping.

Suggested-by: Mauri Carvalho <mcarvalho3@lenovo.com>
Reviewed-by: Wayne Lin <wayne.lin@amd.com>
Signed-off-by: Mario Limonciello <Mario.Limonciello@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: fix dml ms order of operations

[why&how]
small error in order of operations in immediateflipbytes
calculation on dml ms side that can result in dml ms
and mp mismatch immediateflip support for a given pipe
and thus an invalid hw state, correct the order to align
with mp.

Reviewed-by: Leo Chen <leo.chen@amd.com>
Signed-off-by: Ausef Yousof <Ausef.Yousof@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: limit one non-related log to dGPU

[Why&How]
some log are for dGPU only.
Added check to limit log.

Reviewed-by: Chris Park <chris.park@amd.com>
Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Allow RX6xxx & RX7700 to invoke amdgpu_irq_get/put

[Why&How]
As reported on https://gitlab.freedesktop.org/drm/amd/-/issues/3936,
SMU hang can occur if the interrupts are not enabled appropriately,
causing a vblank timeout.

This patch reverts commit 5009628d8509 ("drm/amd/display: Remove unnecessary
amdgpu_irq_get/put"), but only for RX6xxx & RX7700 GPUs, on which the
issue was observed.

This will re-enable interrupts regardless of whether the user space needed
it or not.

Fixes: 5009628d8509 ("drm/amd/display: Remove unnecessary amdgpu_irq_get/put")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3936
Suggested-by: Sun peng Li <sunpeng.li@amd.com>
Reviewed-by: Sun peng Li <sunpeng.li@amd.com>
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Rename header file link.h to link_service.h

[WHY]
Header file name "link.h" collides with system header when dc is
compiled as a user-mode library

[WHAT]
Rename link.h to link_service.h to avoid name collision

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Wesley Chalmers <Wesley.Chalmers@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Fix DMCUB loading sequence for DCN3.2

[Why]
New sequence from HW for reset and firmware reloading has been
provided that aims to stabilize the reload sequence in the case the
firmware is hung or has outstanding requests.

[How]
Update the sequence to remove the DMUIF reset and the redundant
writes in the release.

Reviewed-by: Sreeja Golui <sreeja.golui@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: prepare dml 2.1 for new asic

[Why&How]
prepare dml 2.1 for new asic

Acked-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Dmytro Laktyushkin <dmytro.laktyushkin@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Modify the link training policy

[Why&How]
Currently fallback to low link rate if the link training
fails once on USB4. It may cause the bandwidth couldn't
satisfy the requirement of streams. Modify the policy
to do training retry in the previous few times, only
do fallback at the last time.

Reviewed-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com>
Signed-off-by: Zhikai Zhai <zhikai.zhai@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Revert "drm/amdgpu: Allocate psp fw private buffer in vram"

This reverts commit 22dcb283d63d5677a5875d0002d04d2c61720f78.
Need to certain APU platforms and will proceed to rework
the patch accordingly

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>