Pei Xiao [Thu, 19 Mar 2026 02:04:05 +0000 (10:04 +0800)]
spi: sifive: Simplify clock handling with devm_clk_get_enabled()
Replace devm_clk_get() followed by clk_prepare_enable() with
devm_clk_get_enabled() for the bus clock. This reduces boilerplate code
and error handling, as the managed API automatically disables the clock
when the device is removed or if probe fails.
Remove the now-unnecessary clk_disable_unprepare() calls from the probe
error path and the remove callback. Adjust the error handling to use the
existing put_host label.
Pei Xiao [Thu, 19 Mar 2026 02:03:59 +0000 (10:03 +0800)]
spi: bcmbca-hsspi: Simplify clock handling with devm_clk_get_enabled()
Replace devm_clk_get() followed by clk_prepare_enable() with
devm_clk_get_enabled() for both the "hsspi" and "pll" clocks. This
reduces boilerplate code and error handling, as the managed API
automatically disables the clocks when the device is removed or if
probe fails.
Remove the now-unnecessary clk_disable_unprepare() calls from the
probe error paths and the remove callback. Simplify the error handling
by converting to direct returns with dev_err_probe() where appropriate.
Pei Xiao [Thu, 19 Mar 2026 02:03:58 +0000 (10:03 +0800)]
spi: bcm63xx-hsspi: Simplify clock handling with devm_clk_get_enabled()
Replace devm_clk_get() followed by clk_prepare_enable() with
devm_clk_get_enabled() for both the "hsspi" and "pll" clocks. This
reduces boilerplate code and error handling, as the managed API
automatically disables the clocks when the device is removed or if
probe fails.
Remove the now-unnecessary clk_disable_unprepare() calls from the
probe error paths and the remove callback. Accordingly, adjust the
error handling labels to direct returns where possible.
Ben Dooks [Mon, 23 Mar 2026 17:56:04 +0000 (17:56 +0000)]
ASoC: soc-topology: fix __le32 conversion in printed values
A number of dev_dbg() and dev_err() calls get passed values that are
of __le32 type which does not get noticed by sparse until my variadic
checking patches.
There are a number of these, and we should probably fix these up.
The sparse warnings are numerous so the first few are listed here that
this patch fixes:
sound/soc/soc-topology.c:226:9: warning: incorrect type in argument 4 (different base types)
sound/soc/soc-topology.c:226:9: expected int
sound/soc/soc-topology.c:226:9: got restricted __le32 [usertype] get
sound/soc/soc-topology.c:226:9: warning: incorrect type in argument 5 (different base types)
sound/soc/soc-topology.c:226:9: expected int
sound/soc/soc-topology.c:226:9: got restricted __le32 [usertype] put
sound/soc/soc-topology.c:226:9: warning: incorrect type in argument 6 (different base types)
sound/soc/soc-topology.c:226:9: expected int
sound/soc/soc-topology.c:226:9: got restricted __le32 [usertype] info
sound/soc/soc-topology.c:226:9: warning: incorrect type in argument 4 (different base types)
sound/soc/soc-topology.c:226:9: expected int
sound/soc/soc-topology.c:226:9: got restricted __le32 [usertype] get
sound/soc/soc-topology.c:226:9: warning: incorrect type in argument 5 (different base types)
sound/soc/soc-topology.c:226:9: expected int
sound/soc/soc-topology.c:226:9: got restricted __le32 [usertype] put
sound/soc/soc-topology.c:226:9: warning: incorrect type in argument 6 (different base types)
sound/soc/soc-topology.c:226:9: expected int
sound/soc/soc-topology.c:226:9: got restricted __le32 [usertype] info
sound/soc/soc-topology.c:226:9: warning: incorrect type in argument 4 (different base types)
sound/soc/soc-topology.c:226:9: expected int
sound/soc/soc-topology.c:226:9: got restricted __le32 [usertype] get
sound/soc/soc-topology.c:226:9: warning: incorrect type in argument 5 (different base types)
sound/soc/soc-topology.c:226:9: expected int
sound/soc/soc-topology.c:226:9: got restricted __le32 [usertype] put
sound/soc/soc-topology.c:226:9: warning: incorrect type in argument 6 (different base types)
sound/soc/soc-topology.c:226:9: expected int
sound/soc/soc-topology.c:226:9: got restricted __le32 [usertype] info
Joel Fernandes [Thu, 19 Mar 2026 21:07:22 +0000 (17:07 -0400)]
rust: interop: Add list module for C linked list interface
Add a new module `kernel::interop::list` for working with C's doubly
circular linked lists. Provide low-level iteration over list nodes.
Typed iteration over actual items is provided with a `clist_create`
macro to assist in creation of the `CList` type.
Cc: Nikola Djukic <ndjukic@nvidia.com> Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com> Reviewed-by: Alexandre Courbot <acourbot@nvidia.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Acked-by: Alexandre Courbot <acourbot@nvidia.com> Acked-by: Gary Guo <gary@garyguo.net> Acked-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com> Link: https://patch.msgid.link/20260319210722.1543776-1-joelagnelf@nvidia.com
[ * Remove stray empty comment and double blank line in doctest,
* Improve wording and fix a few typos,
* Use markdown emphasis instead of caps,
* Move interop/mod.rs to interop.rs.
Filipe Manana [Wed, 18 Mar 2026 16:17:59 +0000 (16:17 +0000)]
btrfs: fix lost error when running device stats on multiple devices fs
Whenever we get an error updating the device stats item for a device in
btrfs_run_dev_stats() we allow the loop to go to the next device, and if
updating the stats item for the next device succeeds, we end up losing
the error we had from the previous device.
Fix this by breaking out of the loop once we get an error and make sure
it's returned to the caller. Since we are in the transaction commit path
(and in the critical section actually), returning the error will result
in a transaction abort.
Fixes: 733f4fbbc108 ("Btrfs: read device stats on mount, write modified ones during commit") Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
Since commit 3d74a7556fba ("btrfs: zlib: introduce zlib_compress_bio()
helper"), there are some reports about different crashes in zlib
compression path. One of the symptoms is list corruption like the
following:
list_del corruption. next->prev should be fffffbb340204a08, but was ffff8d6517cb7de0. (next=fffffbb3402d62c8)
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:65!
Oops: invalid opcode: 0000 [#1] SMP NOPTI
CPU: 1 UID: 0 PID: 21436 Comm: kworker/u16:7 Not tainted 7.0.0-rc2-jcg+ #1 PREEMPT
Hardware name: LENOVO 10VGS02P00/3130, BIOS M1XKT57A 02/10/2022
Workqueue: btrfs-delalloc btrfs_work_helper [btrfs]
RIP: 0010:__list_del_entry_valid_or_report+0xec/0xf0
Call Trace:
<TASK>
btrfs_alloc_compr_folio+0xae/0xc0 [btrfs]
zlib_compress_bio+0x39d/0x6a0 [btrfs]
btrfs_compress_bio+0x2e3/0x3d0 [btrfs]
compress_file_range+0x2b0/0x660 [btrfs]
btrfs_work_helper+0xdb/0x3e0 [btrfs]
process_one_work+0x192/0x3d0
worker_thread+0x19a/0x310
kthread+0xdf/0x120
ret_from_fork+0x22e/0x310
ret_from_fork_asm+0x1a/0x30
</TASK>
---[ end trace 0000000000000000 ]---
Other symptoms include VM_BUG_ON() during folio_put() but it's rarer.
David Sterba firstly reported this during his CI runs but unfortunately
I'm unable to hit it.
Meanwhile zstd/lzo doesn't seem to have the same problem.
[CAUSE]
During zlib_compress_bio() every time the output buffer is full, we
queue the full folio into the compressed bio, and allocate a new folio
as the output folio.
After the input has finished, we loop through zlib_deflate() with
Z_FINISH to flush all output.
And when that is done, we still need to check if the last folio has any
content, and if so we still need to queue that part into the compressed
bio.
The problem is in the final folio handling, if the final folio is full
(for x86_64 the folio size is 4K), the length to queue is calculated by
But since total_out is 4K aligned, the resulted @cur_len will be 0, then
we hit the bio_add_folio(), which has a quirk that if bio_add_folio()
got an length 0, it will still queue the folio into the bio, but return
false.
In that case we go to out: tag, which calls btrfs_free_compr_folio() to
release @out_folio, which may put the out folio into the btrfs global
pool list.
On the other hand, that @out_folio is already added to the
compressed bio, and will later be released again by
cleanup_compressed_bio(), which results double release.
And if this time we still need to put the folio into the btrfs global
pool list, it will result a list corruption because it's already in the
list.
[FIX]
Instead of offset_inside_folio(), directly use the difference between
strm.total_out and bi_size.
So that if the last folio is completely full, we can still properly
queue the full folio other than queueing zero byte.
btrfs: fix leak of kobject name for sub-group space_info
When create_space_info_sub_group() allocates elements of
space_info->sub_group[], kobject_init_and_add() is called for each
element via btrfs_sysfs_add_space_info_type(). However, when
check_removing_space_info() frees these elements, it does not call
btrfs_sysfs_remove_space_info() on them. As a result, kobject_put() is
not called and the associated kobj->name objects are leaked.
This memory leak is reproduced by running the blktests test case
zbd/009 on kernels built with CONFIG_DEBUG_KMEMLEAK. The kmemleak
feature reports the following error:
Filipe Manana [Tue, 17 Feb 2026 14:46:50 +0000 (14:46 +0000)]
btrfs: fix zero size inode with non-zero size after log replay
When logging that an inode exists, as part of logging a new name or
logging new dir entries for a directory, we always set the generation of
the logged inode item to 0. This is to signal during log replay (in
overwrite_item()), that we should not set the i_size since we only logged
that an inode exists, so the i_size of the inode in the subvolume tree
must be preserved (as when we log new names or that an inode exists, we
don't log extents).
This works fine except when we have already logged an inode in full mode
or it's the first time we are logging an inode created in a past
transaction, that inode has a new i_size of 0 and then we log a new name
for the inode (due to a new hardlink or a rename), in which case we log
an i_size of 0 for the inode and a generation of 0, which causes the log
replay code to not update the inode's i_size to 0 (in overwrite_item()).
After log replay the file remains with a size of 64K. This is because when
we first log the inode, when we fsync file foo, we log its current i_size
of 0, and then when we create a hard link we log again the inode in exists
mode (LOG_INODE_EXISTS) but we set a generation of 0 for the inode item we
add to the log tree, so during log replay overwrite_item() sees that the
generation is 0 and i_size is 0 so we skip updating the inode's i_size
from 64K to 0.
Fix this by making sure at fill_inode_item() we always log the real
generation of the inode if it was logged in the current transaction with
the i_size we logged before. Also if an inode created in a previous
transaction is logged in exists mode only, make sure we log the i_size
stored in the inode item located from the commit root, so that if we log
multiple times that the inode exists we get the correct i_size.
Mark Harmstone [Tue, 17 Feb 2026 17:35:42 +0000 (17:35 +0000)]
btrfs: fix super block offset in error message in btrfs_validate_super()
Fix the superblock offset mismatch error message in
btrfs_validate_super(): we changed it so that it considers all the
superblocks, but the message still assumes we're only looking at the
first one.
The change from %u to %llu is because we're changing from a constant to
a u64.
Fixes: 069ec957c35e ("btrfs: Refactor btrfs_check_super_valid") Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Mark Harmstone <mark@harmstone.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
Matt Roper [Thu, 19 Mar 2026 22:30:34 +0000 (15:30 -0700)]
drm/xe: Implement recent spec updates to Wa_16025250150
The hardware teams noticed that the originally documented workaround
steps for Wa_16025250150 may not be sufficient to fully avoid a hardware
issue. The workaround documentation has been augmented to suggest
programming one additional register; make the corresponding change in
the driver.
kernel log:
[ 756.442527] amdgpu 0000:03:00.0: amdgpu: Fan curve temp setting(50) must be within [0, 0]!
[ 777.345800] amdgpu 0000:03:00.0: amdgpu: Fan curve temp setting(50) must be within [0, 0]!
Closes: https://github.com/ROCm/amdgpu/issues/208 Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 470891606c5a97b1d0d937e0aa67a3bed9fcb056) Cc: stable@vger.kernel.org
Asad Kamal [Wed, 18 Mar 2026 05:52:57 +0000 (13:52 +0800)]
drm/amd/pm: Return -EOPNOTSUPP for unsupported OD_MCLK on smu_v13_0_6
When SET_UCLK_MAX capability is absent, return -EOPNOTSUPP from
smu_v13_0_6_emit_clk_levels() for OD_MCLK instead of 0. This makes
unsupported OD_MCLK reporting consistent with other clock types
and allows callers to skip the entry cleanly.
Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit d82e0a72d9189e8acd353988e1a57f85ce479e37) Cc: stable@vger.kernel.org
Asad Kamal [Wed, 18 Mar 2026 05:48:30 +0000 (13:48 +0800)]
drm/amd/pm: Skip redundant UCLK restore in smu_v13_0_6
Only reapply UCLK soft limits during PP_OD_RESTORE_DEFAULT when the
current max differs from the DPM table max. This avoids redundant
SMC updates and prevents -EINVAL on restore when no change is needed.
Fixes: b7a900344546 ("drm/amd/pm: Allow setting max UCLK on SMU v13.0.6") Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 17f11bbbc76c8e83c8474ea708316b1e3631d927)
Alex Hung [Mon, 9 Mar 2026 17:16:08 +0000 (11:16 -0600)]
drm/amd/display: Fix drm_edid leak in amdgpu_dm
[WHAT]
When a sink is connected, aconnector->drm_edid was overwritten without
freeing the previous allocation, causing a memory leak on resume.
[HOW]
Free the previous drm_edid before updating it.
Reviewed-by: Roman Li <roman.li@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 52024a94e7111366141cfc5d888b2ef011f879e5) Cc: stable@vger.kernel.org
Eric Huang [Mon, 16 Mar 2026 15:01:30 +0000 (11:01 -0400)]
drm/amdgpu: prevent immediate PASID reuse case
PASID resue could cause interrupt issue when process
immediately runs into hw state left by previous
process exited with the same PASID, it's possible that
page faults are still pending in the IH ring buffer when
the process exits and frees up its PASID. To prevent the
case, it uses idr cyclic allocator same as kernel pid's.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 8f1de51f49be692de137c8525106e0fce2d1912d) Cc: stable@vger.kernel.org
Ruijing Dong [Tue, 17 Mar 2026 17:54:11 +0000 (13:54 -0400)]
drm/amdgpu: fix strsep() corrupting lockup_timeout on multi-GPU (v3)
amdgpu_device_get_job_timeout_settings() passes a pointer directly
to the global amdgpu_lockup_timeout[] buffer into strsep().
strsep() destructively replaces delimiter characters with '\0'
in-place.
On multi-GPU systems, this function is called once per device.
When a multi-value setting like "0,0,0,-1" is used, the first
GPU's call transforms the global buffer into "0\00\00\0-1". The
second GPU then sees only "0" (terminated at the first '\0'),
parses a single value, hits the single-value fallthrough
(index == 1), and applies timeout=0 to all rings — causing
immediate false job timeouts.
Fix this by copying into a stack-local array before calling
strsep(), so the global module parameter buffer remains intact
across calls. The buffer is AMDGPU_MAX_TIMEOUT_PARAM_LENGTH
(256) bytes, which is safe for the stack.
v2: wrap commit message to 72 columns, add Assisted-by tag.
v3: use stack array with strscpy() instead of kstrdup()/kfree()
to avoid unnecessary heap allocation (Christian).
This patch was developed with assistance from Claude (claude-opus-4-6).
Assisted-by: Claude:claude-opus-4-6 Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Ruijing Dong <ruijing.dong@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 94d79f51efecb74be1d88dde66bdc8bfcca17935) Cc: stable@vger.kernel.org
Yussuf Khalil [Fri, 6 Mar 2026 12:06:35 +0000 (12:06 +0000)]
drm/amd/display: Do not skip unrelated mode changes in DSC validation
Starting with commit 17ce8a6907f7 ("drm/amd/display: Add dsc pre-validation in
atomic check"), amdgpu resets the CRTC state mode_changed flag to false when
recomputing the DSC configuration results in no timing change for a particular
stream.
However, this is incorrect in scenarios where a change in MST/DSC configuration
happens in the same KMS commit as another (unrelated) mode change. For example,
the integrated panel of a laptop may be configured differently (e.g., HDR
enabled/disabled) depending on whether external screens are attached. In this
case, plugging in external DP-MST screens may result in the mode_changed flag
being dropped incorrectly for the integrated panel if its DSC configuration
did not change during precomputation in pre_validate_dsc().
At this point, however, dm_update_crtc_state() has already created new streams
for CRTCs with DSC-independent mode changes. In turn,
amdgpu_dm_commit_streams() will never release the old stream, resulting in a
memory leak. amdgpu_dm_atomic_commit_tail() will never acquire a reference to
the new stream either, which manifests as a use-after-free when the stream gets
disabled later on:
BUG: KASAN: use-after-free in dc_stream_release+0x25/0x90 [amdgpu]
Write of size 4 at addr ffff88813d836524 by task kworker/9:9/29977
Since there is no reliable way of figuring out whether a CRTC has unrelated
mode changes pending at the time of DSC validation, remember the value of the
mode_changed flag from before the point where a CRTC was marked as potentially
affected by a change in DSC configuration. Reset the mode_changed flag to this
earlier value instead in pre_validate_dsc().
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/5004 Fixes: 17ce8a6907f7 ("drm/amd/display: Add dsc pre-validation in atomic check") Signed-off-by: Yussuf Khalil <dev@pp3345.net> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit cc7c7121ae082b7b82891baa7280f1ff2608f22b)
Varun R Mallya [Mon, 23 Mar 2026 08:11:31 +0000 (13:41 +0530)]
selftests/bpf: Improve connect_force_port test reliability
The connect_force_port test fails intermittently in CI because the
hardcoded server ports (60123/60124) may already be in use by other
tests or processes [1].
Fix this by passing port 0 to start_server(), letting the kernel assign
a free port dynamically. The actual assigned port is then propagated to
the BPF programs by writing it into the .bss map's initial value (via
bpf_map__initial_value()) before loading, so the BPF programs use the
correct backend port at runtime.
Felix Gu [Sun, 22 Mar 2026 13:29:56 +0000 (21:29 +0800)]
spi: meson-spicc: Fix double-put in remove path
meson_spicc_probe() registers the controller with
devm_spi_register_controller(), so teardown already drops the
controller reference via devm cleanup.
Calling spi_controller_put() again in meson_spicc_remove()
causes a double-put.
Fixes: 8311ee2164c5 ("spi: meson-spicc: fix memory leak in meson_spicc_remove") Signed-off-by: Felix Gu <ustc.gu@gmail.com> Reviewed-by: Johan Hovold <johan@kernel.org> Link: https://patch.msgid.link/20260322-rockchip-v1-1-fac3f0c6dad8@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
kernel log:
[ 756.442527] amdgpu 0000:03:00.0: amdgpu: Fan curve temp setting(50) must be within [0, 0]!
[ 777.345800] amdgpu 0000:03:00.0: amdgpu: Fan curve temp setting(50) must be within [0, 0]!
Closes: https://github.com/ROCm/amdgpu/issues/208 Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Hou Wenlong [Mon, 16 Mar 2026 03:46:29 +0000 (11:46 +0800)]
drm/amd/display: Rename enum 'pixel_format' to 'dc_pixel_format'
Rename the enum 'pixel_format' to 'dc_pixel_format' to avoid potential
name conflicts with the pixel_format struct defined in
include/video/pixel_format.h.
Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Asad Kamal [Thu, 25 Dec 2025 16:30:41 +0000 (00:30 +0800)]
drm/amd/pm: Add get_thermal_temperature_range support
Add get_thermal_temperature_range support smu_v15_0_8
v2: Remove sriov check (Lijo)
v3: Restrict to 1VF mode(Lijo)
Signed-off-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Asad Kamal [Thu, 25 Dec 2025 16:13:13 +0000 (00:13 +0800)]
drm/amd/pm: Add od_edit_dpm_table support
Add od_edit_dpm_table support for smu_v15_0_8
v2: Skip Gl2clk/Fclk (Lijo)
v3: sqaush in set_performance_support (Asad)
Signed-off-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Asad Kamal [Thu, 25 Dec 2025 15:44:20 +0000 (23:44 +0800)]
drm/amd/pm: add populate_umd_state_clk support
add populate_umd_state_clk support for smu 15.0.8
v2: remove gl2clk/socclk/fclk, restrict to only current min/max (Lijo)
Signed-off-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Fix NULL pointer assumptions in dcn42_init_hw()
dcn42_init_hw() calls update_bw_bounding_box() when FAMS2 is disabled or
when the dchub reference clock changes. However the existing condition
mixes the callback pointer check with only one side of the || expression:
This allows the block to be entered through the freq_changed path even
when update_bw_bounding_box() is NULL. The function is then called
unconditionally inside the block, which can lead to a NULL pointer
dereference.
Additionally, the code dereferences dc->clk_mgr->bw_params without
verifying that dc->clk_mgr and bw_params are valid.
Restructure the condition so that the update trigger remains the same
(FAMS2 disabled or dchub ref clock changed), but guard the call with
explicit checks for:
Also introduce a helper boolean (dchub_ref_freq_changed) to improve
readability of the clock-change condition.
This fixes Smatch warnings about inconsistent NULL assumptions in
dcn42_init_hw().
drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn42/dcn42_hwseq.c:264 dcn42_init_hw() error: we previously assumed 'dc->clk_mgr' could be null (see line 253)
drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn42/dcn42_hwseq.c:278 dcn42_init_hw() error: we previously assumed 'dc->res_pool->funcs->update_bw_bounding_box' could be null (see line 274)
Cc: Roman Li <roman.li@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Jerry Zuo <jerry.zuo@amd.com> Cc: Sun peng Li <sunpeng.li@amd.com> Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Dan Carpenter <dan.carpenter@linaro.org> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Add clk_mgr NULL checks in dcn32_initialize_min_clocks()
dcn32_init_hw() checks dc->clk_mgr before calling init_clocks(), so the
clock manager is not treated as unconditionally present on this path.
However, dcn32_initialize_min_clocks() later dereferences dc->clk_mgr,
bw_params, and clk_mgr callbacks without validating them.
Add the required guards in dcn32_initialize_min_clocks() before
accessing clk_mgr-dependent state, and check callback presence before
calling get_dispclk_from_dentist() and update_clocks().
Also guard the later update_bw_bounding_box() call in the FAMS2-disabled
path since it also dereferences dc->clk_mgr->bw_params.
This keeps clk_mgr handling consistent in the DCN32 HW init flow and
avoids possible NULL pointer dereferences reported by Smatch.
Fixes the below:
drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn32/dcn32_hwseq.c:1012 dcn32_init_hw() error: we previously assumed 'dc->clk_mgr' could be null (see line 978)
Cc: Roman Li <roman.li@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Jerry Zuo <jerry.zuo@amd.com> Cc: Sun peng Li <sunpeng.li@amd.com> Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Dan Carpenter <dan.carpenter@linaro.org> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Yang Wang [Thu, 23 Oct 2025 03:04:40 +0000 (11:04 +0800)]
drm/amd/pm: add get_gpu_metrics support for 15.0.8
export .get_gpu_metrics interface for 15.0.8
v2: Remove members already exposed by other interfaces, use mask,
logical conversion (Lijo)
v3: Use correct logic for hbm stacks loop (Lijo)
Remove buffer allocation
v4: Make out of bound check outside loop (Lijo)
v5: fix locking in error case (Alex)
Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Asad Kamal [Wed, 26 Nov 2025 10:41:34 +0000 (18:41 +0800)]
drm/amd/pm: Add get_pm_metrics support for smu 15.0.8
export .get_pm_metrics interface for smu 15.0.8.
v2: Make tmo as unsigned (Lijo)
Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Yang Wang [Wed, 26 Nov 2025 08:02:41 +0000 (16:02 +0800)]
drm/amd/pm: Setup driver pptable for smu 15.0.8
Setup driver pptable and initialize data from static metrics table for
smu_v15_0_8
v2: Remove unrelated changes and update description (Lijo)
v3: Use ARRAY_SIZE (Lijo)
v4: Move structure to header file
v5: squash in static metrics support (Asad)
Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Asad Kamal [Mon, 24 Nov 2025 17:19:13 +0000 (01:19 +0800)]
drm/amd/pm: Add mode2 support for smu_v15_0_8
Add initial mode2 support for smu_v15_0_8
v2: Move out non smu code, remove pci save/restore logic (Lijo)
v3: squash in updated msg (Alex)
Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sunil Khatri [Fri, 20 Mar 2026 11:59:01 +0000 (17:29 +0530)]
drm/amdgpu/userq: cleanup amdgpu_userq_get/put where not needed
amdgpu_userq_put/get are not needed in case we already holding
the userq_mutex and reference is valid already from queue create
time or from signal ioctl. These additional get/put could be a
potential reason for deadlock in case the ref count reaches zero
and destroy is called which again try to take the userq_mutex.
Due to the above change we avoid deadlock between suspend/restore
calling destroy queues trying to take userq_mutex again.
Cc: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Asad Kamal [Wed, 18 Mar 2026 05:52:57 +0000 (13:52 +0800)]
drm/amd/pm: Return -EOPNOTSUPP for unsupported OD_MCLK on smu_v13_0_6
When SET_UCLK_MAX capability is absent, return -EOPNOTSUPP from
smu_v13_0_6_emit_clk_levels() for OD_MCLK instead of 0. This makes
unsupported OD_MCLK reporting consistent with other clock types
and allows callers to skip the entry cleanly.
Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Asad Kamal [Wed, 18 Mar 2026 05:48:30 +0000 (13:48 +0800)]
drm/amd/pm: Skip redundant UCLK restore in smu_v13_0_6
Only reapply UCLK soft limits during PP_OD_RESTORE_DEFAULT when the
current max differs from the DPM table max. This avoids redundant
SMC updates and prevents -EINVAL on restore when no change is needed.
Fixes: b7a900344546 ("drm/amd/pm: Allow setting max UCLK on SMU v13.0.6") Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Taimur Hassan [Fri, 13 Mar 2026 22:42:59 +0000 (17:42 -0500)]
drm/amd/display: Promote DC to 3.2.375
This version brings along following fixes:
- Rework YCbCr422 DSC policy
- Restore full update for tiling change to linear
- add dccg FGCG mask init
- Remove unnecessary completion flag for secure display
- Agument live + capture with CVT case.
- remove dc_clock_limit for apu
- Fix Signed/Unsigned Int Usage Compiler Warning
- Hardcode dtbclk value in bw_params
- Revert inbox0 lock for cursor due to deadlock
- Add 3DLUT DMA broadcast support
- Fix Silence warnings
- export get_power_profile interface for later use
- pg cntl update based on previous asic.
- remove disable_sutter touch pstate debug code
- Refactor DC update checks
- Fix drm_edid leak in amdgpu_dm
- Add Extra SMU Log for dtbclk
- Clamp min DS DCFCLK value to DCN limit
- Update dpia supported configuration
- Multiple DCN42 updates
Joshua Aberback [Thu, 12 Mar 2026 22:33:49 +0000 (18:33 -0400)]
drm/amd/display: Restore full update for tiling change to linear
[Why]
There was previously a dc debug flag to indicate that tiling
changes should only be a medium update instead of full. The
function get_plane_info_type was refactored to not rely on dc
state, but in the process the logic was unintentionally changed,
which leads to screen corruption in some cases.
[How]
- add flag to tiling struct to avoid full update when necessary
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: Aric Cyr <aric.cyr@amd.com> Signed-off-by: Joshua Aberback <joshua.aberback@amd.com> Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Wayne Lin [Fri, 13 Mar 2026 04:05:40 +0000 (12:05 +0800)]
drm/amd/display: Remove unnecessary completion flag for secure display
The completion flag is not used in secure display today.
Remove unnecessary code.
Reviewed-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
ChunTao Tso [Wed, 4 Mar 2026 08:37:58 +0000 (16:37 +0800)]
drm/amd/display: Agument live + capture with CVT case.
1. Add LIVE_CAPTURE_WITH_CVT bit (bit[2]) in union replay_optimization
to control this feature via DalRegKey_ReplayOptimization.
2. Check the bit in mod_power_set_live_capture_with_cvt_activate function
before enabling live capture with CVT.
3. Use LIVE_CAPTURE_WITH_CVT to control if Replay want to send CVT in
live + capture or not.
Reviewed-by: Leon Huang <leon.huang1@amd.com> Signed-off-by: ChunTao Tso <chuntao.tso@amd.com> Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Charlene Liu [Wed, 11 Mar 2026 21:05:19 +0000 (17:05 -0400)]
drm/amd/display: remove dc_clock_limit for apu
[why]
current apu pmfw does not support dc_clock_limit
Reviewed-by: Roman Li <roman.li@amd.com> Signed-off-by: Charlene Liu <Charlene.Liu@amd.com> Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Fix Signed/Unsigned Int Usage Compiler Warning
[Why] Compiler generates compiler warnings when signed enum
constants or literal -1 are implicitly converted to unsigned
integer types, cluttering build output and masking genuine issues.
[How] Use UINT_MAX as the invalid sentinel for unsigned IDs and align
loop/index types to unsigned where appropriate to remove implicit
signed-to-unsigned conversions, with no functional behavior change.
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Gaghik Khachatrian <gaghik.khachatrian@amd.com> Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Matthew Stewart [Wed, 11 Mar 2026 19:16:00 +0000 (15:16 -0400)]
drm/amd/display: Hardcode dtbclk value in bw_params
[why&how]
dtbclk should always be 600MHz. Previous logic was to get the real value
from SMU, but this returns 0 when dtbclk is off. Not a problem during
boot when pre-OS enables dtbclk, but PnP was broken due to this.
Reviewed-by: Charlene Liu <charlene.liu@amd.com> Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Matthew Stewart <Matthew.Stewart2@amd.com> Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Revert inbox0 lock for cursor due to deadlock
[Why]
A deadlock occurs when using inbox0 lock for cursor operations on
PSR-SU and Replays that does not when using the inbox1 locking path.
This is because of a priority inversion issue where inbox1 work
cannot be serviced while holding the HW lock from driver and sending
cursor notifications to DMUB.
Typically the lower priority of inbox1 for the lock command would
allow the PSR and Replay FSMs to complete their transition prior
to giving driver the lock but this is no longer the case with inbox0
having the highest priority in servicing.
[How]
This will reintroduce any synchronization bugs that were there
with Replay or PSR-SU touching the cursor at the same time as driver.
Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dillon Varone [Sat, 7 Mar 2026 05:53:03 +0000 (05:53 +0000)]
drm/amd/display: Add 3DLUT DMA broadcast support
[WHY&HOW]
A single HUBP can be used to fetch 3DLUT and broadcast to a
single HUBP. Add logic to select the top pipe for a given
plane and use it's HUBP as the broadcast source for multiple
MPC's.
Charlene Liu [Fri, 6 Mar 2026 15:40:07 +0000 (10:40 -0500)]
drm/amd/display: export get_power_profile interface for later use
[why]
export dcn401 get_power_profile for later asic.
Reviewed-by: Roman Li <roman.li@amd.com> Signed-off-by: Charlene Liu <Charlene.Liu@amd.com> Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dillon Varone [Thu, 5 Mar 2026 21:42:29 +0000 (16:42 -0500)]
drm/amd/display: Refactor DC update checks
[WHY&HOW]
DC currently has fragmented definitions of update types. This changes
consolidates them into a single interface, and adds expanded
functionality to accommodate all use cases.
- adds `dc_check_update_state_and_surfaces_for_stream` to determine
update type including state, surface, and stream changes.
- adds missing surface/stream update checks to
`dc_check_update_surfaces_for_stream`
- adds new update type `UPDATE_TYPE_ADDR_ONLY` to accomodate flows where
further distinction from `UPDATE_TYPE_FAST` was needed
- removes caller reliance on `enable_legacy_fast_update` to determine
which commit function to use, instead embedding it in the update type
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Dillon Varone <Dillon.Varone@amd.com> Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Hung [Mon, 9 Mar 2026 17:16:08 +0000 (11:16 -0600)]
drm/amd/display: Fix drm_edid leak in amdgpu_dm
[WHAT]
When a sink is connected, aconnector->drm_edid was overwritten without
freeing the previous allocation, causing a memory leak on resume.
[HOW]
Free the previous drm_edid before updating it.
Reviewed-by: Roman Li <roman.li@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Fix DCN42 memory clock table using MemClk instead of UClk
[Why]
DCN42 was using UClk values instead of MemClk from MemPstateTable, causing
DML to see half the actual DRAM bandwidth on DDR5 systems and reject high
refresh rate modes.
[How]
Change dcn42_init_clocks() to use MemPstateTable[i].MemClk instead of
MemPstateTable[i].UClk for memclk_mhz initialization.
Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Alexander Chechik <alexander.chechik@amd.com> Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Roman Li [Tue, 17 Mar 2026 00:17:57 +0000 (20:17 -0400)]
drm/amd/display: Update underflow detection for DCN42
[Why]
The DCN42 underflow detection functions in dcn42_optc.c use
OPTC_RSMU_UNDERFLOW register but the register offset definitions
were missing from dcn_4_2_0_offset.h and dcn42_resource.h.
[How]
Add missing register definitions.
Fixes: e56e3cff2a1b ("drm/amd/display: Sync dcn42 with DC 3.2.373") Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Roman Li <Roman.Li@amd.com> Signed-off-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Tue, 3 Feb 2026 16:30:57 +0000 (17:30 +0100)]
drm/amdgpu: fix some more bug in amdgpu_gem_va_ioctl
Some illegal combination of input flags were not checked and we need to
take the PDEs into account when returning the fence as well.
Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Roman Li [Sat, 14 Mar 2026 00:34:48 +0000 (20:34 -0400)]
drm/amd/display: Clamp min DS DCFCLK value to DCN limit
[why & how]
DCN has a global limit for minimum DS DCFCLK during any operation.
Adhere to that limit and add a debug flag.
Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Ovidiu Bunea <ovidiu.bunea@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Signed-off-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Split arbiter programming for DCN42
[Why]
We don't want to update the timeout threshold for stall recovery in
firmware dynamically for DCN42 as we're not using FAMS.
Firmware should own programming of this register since the recovery
can be broken if driver updates the value to 0.
[How]
Split program_arbiter for dcn42 and skip the part that updates the
timeout threshold.
Reviewed-by: Leo Chen <leo.chen@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Signed-off-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Roman Li [Sat, 14 Mar 2026 00:27:54 +0000 (20:27 -0400)]
drm/amd/display: Add missing dcn42 hubbub function pointers
This aligning commit combines:
- fix dcn42 det programming)
- fix missing dcn42 pointers
- fix SDPIF_Request_Rate_Limit programming value
V2: Add back dchvm_init for DCN42
Reviewed-by: Alex Hung <alex.hung@amd.com> Reviewed-by: Leo Chen <leo.chen@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Signed-off-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Roman Li [Fri, 13 Mar 2026 23:34:02 +0000 (19:34 -0400)]
drm/amd/display: Add get_default_tiling_info for dcn42
Add DCN42 portion that was stripped during previously.
Fixes: 8333f22e44a9 ("drm/amd/display: Query DC for gfx handling when setting linear tiling") Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Signed-off-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Move it out of smu present block for cases where it isn't
Reviewed-by: Ivan Lipski <ivan.lipski@amd.com> Signed-off-by: Dmytro Laktyushkin <dmytro.laktyushkin@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Signed-off-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Charlene Liu [Thu, 5 Mar 2026 15:14:39 +0000 (10:14 -0500)]
drm/amd/display: System Hang When System enters to S0i3 w/ iGPU
[why]
System Hang when system enters to S0i3 w/ iGPU
some link_enc are NULL due to BIOS integration info table not correct,
but driver should have enough null pointer protection.
Reviewed-by: Leo Chen <leo.chen@amd.com> Signed-off-by: Charlene Liu <Charlene.Liu@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Signed-off-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ivan Lipski [Wed, 4 Mar 2026 01:07:58 +0000 (20:07 -0500)]
drm/amd/display: Move DPM clk read to clk_mgr_construct in DCN42
[Why&How]
The DPM clocks on DCN42 are currently read on every dm_resume, which can
cause in gpu memory freeing while the device is still in suspend.
Move the DPM clock read functionality to clk_mgr_construct() so it
completes once on driver enablement.
Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Signed-off-by: Dmytro Laktyushkin <dmytro.laktyushkin@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Signed-off-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
DCN401 didn't have a MRQ present so these fields didn't exist.
They are still present on DCN42 so we need to continue programming
them like we did on DCN35 or we can block have poor meta requesting
efficiency which blocks p-state.
[How]
Add `hubp42_program_requestor` which takes DML21 input and programs
the registers like DCN35 and prior.
Reviewed-by: Leo Chen <leo.chen@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Signed-off-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Charlene Liu [Mon, 2 Mar 2026 20:45:41 +0000 (15:45 -0500)]
drm/amd/display: dcn42 don't round up disclk and dppclk
[why]
dml2 based on num_enabled clock != 2 to do clock ramming to dpm.
apu has 8 levels dispclk/dppclk/dcfclk/fclk, but only 4 levels of memclk.
to avoid mapping dispclk/dppclk to DPM clock,
based on arch review, force dispclk/dppclk num_level as 2.
Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Charlene Liu <Charlene.Liu@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Signed-off-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Kexin Sun [Sat, 21 Mar 2026 10:59:40 +0000 (18:59 +0800)]
spi: pl022: update outdated references to pump_transfers()
The pump_transfers tasklet was removed when the driver
switched to the SPI core message processing loop in commit 9b2ef250b31d ("spi: spl022: switch to use default
spi_transfer_one_message()"). The kdoc of
pl022_interrupt_handler() still describes the old flow: scheduling
the pump_transfers tasklet, calling giveback(), and flagging
STATE_ERROR — none of which exist in the current code. Rewrite
the kdoc to reflect the current paths, which flag errors via
SPI_TRANS_FAIL_IO and call spi_finalize_current_transfer()
directly.
Eric Huang [Mon, 16 Mar 2026 15:01:30 +0000 (11:01 -0400)]
drm/amdgpu: prevent immediate PASID reuse case
PASID resue could cause interrupt issue when process
immediately runs into hw state left by previous
process exited with the same PASID, it's possible that
page faults are still pending in the IH ring buffer when
the process exits and frees up its PASID. To prevent the
case, it uses idr cyclic allocator same as kernel pid's.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ruijing Dong [Tue, 17 Mar 2026 17:54:11 +0000 (13:54 -0400)]
drm/amdgpu: fix strsep() corrupting lockup_timeout on multi-GPU (v3)
amdgpu_device_get_job_timeout_settings() passes a pointer directly
to the global amdgpu_lockup_timeout[] buffer into strsep().
strsep() destructively replaces delimiter characters with '\0'
in-place.
On multi-GPU systems, this function is called once per device.
When a multi-value setting like "0,0,0,-1" is used, the first
GPU's call transforms the global buffer into "0\00\00\0-1". The
second GPU then sees only "0" (terminated at the first '\0'),
parses a single value, hits the single-value fallthrough
(index == 1), and applies timeout=0 to all rings — causing
immediate false job timeouts.
Fix this by copying into a stack-local array before calling
strsep(), so the global module parameter buffer remains intact
across calls. The buffer is AMDGPU_MAX_TIMEOUT_PARAM_LENGTH
(256) bytes, which is safe for the stack.
v2: wrap commit message to 72 columns, add Assisted-by tag.
v3: use stack array with strscpy() instead of kstrdup()/kfree()
to avoid unnecessary heap allocation (Christian).
This patch was developed with assistance from Claude (claude-opus-4-6).
Assisted-by: Claude:claude-opus-4-6 Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Ruijing Dong <ruijing.dong@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Thu, 19 Feb 2026 23:20:27 +0000 (18:20 -0500)]
drm/amdgpu/gfx11: look at the right prop for gfx queue priority
Look at hqd_queue_priority rather than hqd_pipe_priority.
In practice, it didn't matter as both were always set for
kernel queues, but that will change in the future.
Fixes: 2e216b1e6ba2 ("drm/amdgpu/gfx11: handle priority setup for gfx pipe1")
Reviewed-by:Jesse Zhang <jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Thu, 19 Feb 2026 23:18:28 +0000 (18:18 -0500)]
drm/amdgpu/gfx10: look at the right prop for gfx queue priority
Look at hqd_queue_priority rather than hqd_pipe_priority.
In practice, it didn't matter as both were always set for
kernel queues, but that will change in the future.
Fixes: b07d1d73b09e ("drm/amd/amdgpu: Enable high priority gfx queue")
Reviewed-by:Jesse Zhang <jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 17 Mar 2026 20:34:41 +0000 (16:34 -0400)]
drm/amdgpu/pm: drop SMU driver if version not matched messages
It just leads to user confusion.
Cc: Yang Wang <kevinyang.wang@amd.com> Cc: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amdgpu: Skip discovery dump when topology is unavailable
When generating a devcoredump, amdgpu_discovery_dump() prints the IP
discovery topology.
The function already needs to handle the case where
adev->discovery.ip_top is NULL to avoid a crash.
Currently, the code prints a section header and an additional message
when the topology is unavailable.
However, for platforms where discovery is not used, this section is not
expected to be present. Printing an extra message adds unnecessary
output.
Simplify this by skipping the entire section when ip_top is NULL.
The NULL check is kept to avoid a crash, but no output is generated when
the discovery topology is unavailable.
Cc: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
amdgpu_discovery_dump() uses adev->discovery.ip_top. However,
ip_top may be NULL if the discovery topology was never initialized.
The current code does not check for this before using ip_top. As a
result, when ip_top is NULL, the coredump worker crashes while taking
the spinlock for ip_top->die_kset.
Fix this by checking for a missing ip_top before walking the discovery
topology. If it is unavailable, print a short message in the dump and
return safely.
- If ip_top is NULL, print a message and skip the dump
- Also add the same check in the cleanup path
This makes the coredump and cleanup paths safe even when the
discovery topology is not available.
v2: Updated commit message - Clarified that ip_top is not freed, it can
just be NULL if discovery was not initialized. (Christian/Lijo)
v3: Removed the extra drm_warn() for sysfs init failure as sysfs already
reports errors. (Christian)
Fixes: e81eff80aad6 ("drm/amdgpu: include ip discovery data in devcoredump") Cc: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Yussuf Khalil [Fri, 6 Mar 2026 12:06:35 +0000 (12:06 +0000)]
drm/amd/display: Do not skip unrelated mode changes in DSC validation
Starting with commit 17ce8a6907f7 ("drm/amd/display: Add dsc pre-validation in
atomic check"), amdgpu resets the CRTC state mode_changed flag to false when
recomputing the DSC configuration results in no timing change for a particular
stream.
However, this is incorrect in scenarios where a change in MST/DSC configuration
happens in the same KMS commit as another (unrelated) mode change. For example,
the integrated panel of a laptop may be configured differently (e.g., HDR
enabled/disabled) depending on whether external screens are attached. In this
case, plugging in external DP-MST screens may result in the mode_changed flag
being dropped incorrectly for the integrated panel if its DSC configuration
did not change during precomputation in pre_validate_dsc().
At this point, however, dm_update_crtc_state() has already created new streams
for CRTCs with DSC-independent mode changes. In turn,
amdgpu_dm_commit_streams() will never release the old stream, resulting in a
memory leak. amdgpu_dm_atomic_commit_tail() will never acquire a reference to
the new stream either, which manifests as a use-after-free when the stream gets
disabled later on:
BUG: KASAN: use-after-free in dc_stream_release+0x25/0x90 [amdgpu]
Write of size 4 at addr ffff88813d836524 by task kworker/9:9/29977
Since there is no reliable way of figuring out whether a CRTC has unrelated
mode changes pending at the time of DSC validation, remember the value of the
mode_changed flag from before the point where a CRTC was marked as potentially
affected by a change in DSC configuration. Reset the mode_changed flag to this
earlier value instead in pre_validate_dsc().
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/5004 Fixes: 17ce8a6907f7 ("drm/amd/display: Add dsc pre-validation in atomic check") Signed-off-by: Yussuf Khalil <dev@pp3345.net> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/ras: Remove redundant NULL check in pending bad-bank list iteration
ras_umc_log_pending_bad_bank() walks through a list of pending ECC
bad-bank entries. These entries are saved when a bad-bank error cannot
be processed immediately, for example during a GPU reset.
Later, this function iterates over the pending list and retries logging
each bad-bank error. If logging succeeds, the entry is removed from the
list and the memory for that node is freed.
The loop uses list_for_each_entry_safe(), which already guarantees that
ecc_node points to a valid list entry while the loop body is executing.
Checking "ecc_node &&" inside the loop is therefore unnecessary and
redundant.
Fixes the below:
drivers/gpu/drm/amd/amdgpu/../ras/rascore/ras_umc.c:225 ras_umc_log_pending_bad_bank() warn: variable dereferenced before check 'ecc_node' (see line 223)
Fixes: 7a3f9c0992c4 ("drm/amd/ras: Add umc common ras functions") Cc: Dan Carpenter <dan.carpenter@linaro.org> Cc: YiPeng Chai <YiPeng.Chai@amd.com> Cc: Tao Zhou <tao.zhou1@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: YiPeng Chai <YiPeng.Chai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>