git.ipfire.org Git - thirdparty/linux.git/log

ARM: omap2: dead code cleanup in kconfig for ARCH_OMAP4

The same kconfig 'select OMAP_INTERCONNECT' appears twice for ARCH_OMAP4.
I propose removing the second instance, as it is effectively dead code.

This dead code was found by kconfirm, a static analysis tool for Kconfig.

Signed-off-by: Julian Braha <julianbraha@gmail.com>
Link: https://patch.msgid.link/20260329183018.519560-1-julianbraha@gmail.com
Signed-off-by: Kevin Hilman <khilman@baylibre.com>

ARM: OMAP1: Fix DEBUG_LL and earlyprintk on OMAP16XX

On OMAP16XX, the UART enable bit shifts are written instead of the actual
bits. This breaks the boot when DEBUG_LL and earlyprintk is enabled;
the UART gets disabled and some random bits get enabled. Fix that.

Fixes: 34c86239b184 ("ARM: OMAP1: clock: Fix early UART rate issues")
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Link: https://patch.msgid.link/aca7HnXZ-aCSJPW7@darkstar.musicnaut.iki.fi
Signed-off-by: Kevin Hilman <khilman@baylibre.com>

drm/amdgpu/uvd4.2: Don't initialize UVD 4.2 when DPM is disabled

UVD 4.2 doesn't work at all when DPM is disabled because
the SMU is responsible for ungating it. So, Linux fails
to boot with CIK GPUs when using the amdgpu.dpm=0 parameter.

Fix this by returning -ENOENT from uvd_v4_2_early_init()
when amdgpu_dpm isn't enabled.

Note: amdgpu.dpm=0 is often suggested as a workaround
for issues and is useful for debugging.

Fixes: a2e73f56fa62 ("drm/amdgpu: Add support for CIK parts")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm/smu7: Add SCLK cap for quirky Hawaii board

On a specific Radeon R9 390X board, the GPU can "randomly" hang
while gaming. Initially I thought this was a RADV bug and tried
to work around this in Mesa:
commit 8ea08747b86b ("radv: Mitigate GPU hang on Hawaii in Dota 2 and RotTR")

However, I got some feedback from other users who are reporting
that the above mitigation causes a significant performance
regression for them, and they didn't experience the hang on their
GPU in the first place.

After some further investigation, it turns out that the problem
is that the highest SCLK DPM level on this board isn't stable.
Lowering SCLK to 1040 MHz (from 1070 MHz) works around the issue,
and has a negligible impact on performance compared to the Mesa
patch. (Note that increasing the voltage can also work around it,
but we felt that lowering the SCLK is the safer option.)

To solve the above issue, add an "sclk_cap" field to smu7_hwmgr
and set this field for the affected board. The capped SCLK value
correctly appears on the sysfs interface and shows up in GUI
tools such as LACT.

Fixes: 9f4b35411cfe ("drm/amd/powerplay: add CI asics support to smumgr (v3)")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm/ci: Fill DW8 fields from SMC

In ci_populate_dw8() we currently just read a value from the SMU
and then throw it away. Instead of throwing away the value,
we should use it to fill other fields in DW8 (like radeon).

Otherwise the value of the other fiels is just cleared when
we copy this data to the SMU later.

Fixes: 9f4b35411cfe ("drm/amd/powerplay: add CI asics support to smumgr (v3)")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm/ci: Clear EnabledForActivity field for memory levels

Follow what radeon did and what amdgpu does for other GPUs with SMU7.

Fixes: 9f4b35411cfe ("drm/amd/powerplay: add CI asics support to smumgr (v3)")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm/ci: Fix powertune defaults for Hawaii 0x67B0

There is no AMD GPU with the ID 0x66B0, this looks like a typo.
It should be 0x67B0 which is actually part of the PCI ID list,
and should use the Hawaii XT powertune defaults according to
the old radeon driver.

Fixes: 9f4b35411cfe ("drm/amd/powerplay: add CI asics support to smumgr (v3)")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm/smu7: Remove non-functional SMU7 voltage dependency on DAL

It looks like this was written for an old version of DC (DAL)
and was never adapted afterwards. This was non-functional
because it relied on the "dal_power_level" field which was
never assigned anywhere in the code base.

Also, it was not implemented for CI ASICs.

Now superseded by the newer voltage dependency on display
clock table added by the previous commit, let's remove.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm/smu7: Fix SMU7 voltage dependency on display clock

The DCE (display controller engine) requires a minimum voltage
in order to function correctly, depending on which clock level
it currently uses.

Add a new table that contains display clock frequency levels
and the corresponding required voltages. The clock frequency
levels are taken from DC (and the old radeon driver's voltage
dependency table for CI in cases where its values were lower).
The voltage levels are taken from the following function:
phm_initializa_dynamic_state_adjustment_rule_settings().
Furthermore, in case of CI, call smu7_patch_vddc() on the new
table to account for leakage voltage (like in radeon).

Use the display clock value from amd_pp_display_configuration
to look up the voltage level needed by the DCE. Send the
voltage to the SMU via the PPSMC_MSG_VddC_Request command.

The previous implementation of this feature was non-functional
because it relied on a "dal_power_level" field which was never
assigned; and it was not at all implemented for CI ASICs.

I verified this on a Radeon R9 M380 which previously booted to
a black screen with DC enabled (default since Linux 6.19), but
now works correctly.

Fixes: 599a7e9fe1b6 ("drm/amd/powerplay: implement smu7 hwmgr to manager asics with smu ip version 7.")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm/ci: Disable MCLK DPM on problematic CI ASICs

There are two known cases where MCLK DPM can causes issues:

Radeon R9 M380 found in iMac computers from 2015.
The SMU in this GPU just hangs as soon as we send it the
PPSMC_MSG_MCLKDPM_Enable command, even when MCLK switching is
disabled, and even when we only populate one MCLK DPM level.
Apply workaround to all devices with the same subsystem ID.

Radeon R7 260X due to old memory controller microcode.
We only flash the MC ucode when it isn't set up by the VBIOS,
therefore there is no way to make sure that it has the correct
ucode version.

I verified that this patch fixes the SMU hang on the R9 M380
which would previously fail to boot. This also fixes the UVD
initialization error on that GPU which happened because the
SMU couldn't ungate the UVD after it hung.

Fixes: 86457c3b21cb ("drm/amd/powerplay: Add support for CI asics to hwmgr")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm/ci: Use highest MCLK on CI when MCLK DPM is disabled

When MCLK DPM is disabled for any reason, populate the MCLK
table with the highest MCLK DPM level, so that the ASIC can
use the highest possible memory clock to get good performance
even when MCLK DPM is disabled.

Fixes: 9f4b35411cfe ("drm/amd/powerplay: add CI asics support to smumgr (v3)")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Merge tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux

Pull crypto library fix from Eric Biggers:
"Fix missing zeroization of the ChaCha state"

* tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
lib/crypto: chacha: Zeroize permuted_state before it leaves scope

spi: cadence-qspi: Fix exec_mem_op error handling

cqspi_exec_mem_op() increments the runtime PM usage counter before all
refcount checks are performed. If one of these checks fails, the function
returns without dropping the PM reference.

Move the pm_runtime_resume_and_get() call after the refcount checks so
that runtime PM is only acquired when the operation can proceed and
drop the inflight_ops refcount if the PM resume fails.

Cc: stable@vger.kernel.org
Fixes: 7446284023e8 ("spi: cadence-quadspi: Implement refcount to handle unbind during busy")
Signed-off-by: Emanuele Ghidoli <emanuele.ghidoli@toradex.com>
Link: https://patch.msgid.link/20260313135236.46642-1-ghidoliemanuele@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>

drm/amdkfd: Fix queue preemption/eviction failures by aligning control stack size to GPU page size

The control stack size is calculated based on the number of CUs and
waves, and is then aligned to PAGE_SIZE. When the resulting control
stack size is aligned to 64 KB, GPU hangs and queue preemption
failures are observed while running RCCL unit tests on systems with
more than two GPUs.

amdgpu 0048:0f:00.0: amdgpu: Queue preemption failed for queue with
doorbell_id: 80030008
amdgpu 0048:0f:00.0: amdgpu: Failed to evict process queues
amdgpu 0048:0f:00.0: amdgpu: GPU reset begin!. Source: 4
amdgpu 0048:0f:00.0: amdgpu: Queue preemption failed for queue with
doorbell_id: 80030008
amdgpu 0048:0f:00.0: amdgpu: Failed to evict process queues
amdgpu 0048:0f:00.0: amdgpu: Failed to restore process queues

This issue is observed on both 4 KB and 64 KB system page-size
configurations.

This patch fixes the issue by aligning the control stack size to
AMDGPU_GPU_PAGE_SIZE instead of PAGE_SIZE, so the control stack size
will not be 64 KB on systems with a 64 KB page size and queue
preemption works correctly.

Additionally, In the current code, wg_data_size is aligned to PAGE_SIZE,
which can waste memory if the system page size is large. In this patch,
wg_data_size is aligned to AMDGPU_GPU_PAGE_SIZE. The cwsr_size, calculated
from wg_data_size and the control stack size, is aligned to PAGE_SIZE.

Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit a3e14436304392fbada359edd0f1d1659850c9b7)

hwmon: (occ) Fix missing newline in occ_show_extended()

In occ_show_extended() case 0, when the EXTN_FLAG_SENSOR_ID flag
is set, the sysfs_emit format string "%u" is missing the trailing
newline that the sysfs ABI expects. The else branch correctly uses
"%4phN\n", and all other show functions in this file include the
trailing newline.

Add the missing "\n" for consistency and correct sysfs output.

Fixes: c10e753d43eb ("hwmon (occ): Add sensor types and versions")
Signed-off-by: Sanman Pradhan <psanman@juniper.net>
Link: https://lore.kernel.org/r/20260326224510.294619-3-sanman.pradhan@hpe.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>

drm/amdgpu: Fix wait after reset sequence in S4

For a mode-1 reset done at the end of S4 on PSPv11 dGPUs, only check if
TOS is unloaded.

Fixes: 32f73741d6ee ("drm/amdgpu: Wait for bootloader after PSPv11 reset")
Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/4853
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 2fb4883b884a437d760bd7bdf7695a7e5a60bba3)
Cc: stable@vger.kernel.org

hwmon: (occ) Fix division by zero in occ_show_power_1()

In occ_show_power_1() case 1, the accumulator is divided by
update_tag without checking for zero. If no samples have been
collected yet (e.g. during early boot when the sensor block is
included but hasn't been updated), update_tag is zero, causing
a kernel divide-by-zero crash.

The 2019 fix in commit 211186cae14d ("hwmon: (occ) Fix division by
zero issue") only addressed occ_get_powr_avg() used by
occ_show_power_2() and occ_show_power_a0(). This separate code
path in occ_show_power_1() was missed.

Fix this by reusing the existing occ_get_powr_avg() helper, which
already handles the zero-sample case and uses mul_u64_u32_div()
to multiply before dividing for better precision. Move the helper
above occ_show_power_1() so it is visible at the call site.

Fixes: c10e753d43eb ("hwmon (occ): Add sensor types and versions")
Cc: stable@vger.kernel.org
Signed-off-by: Sanman Pradhan <psanman@juniper.net>
Link: https://lore.kernel.org/r/20260326224510.294619-2-sanman.pradhan@hpe.com
[groeck: Fix alignment problems reported by checkpatch]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>

drm/amd/display: Fix NULL pointer dereference in dcn401_init_hw()

dcn401_init_hw() assumes that update_bw_bounding_box() is valid when
entering the update path. However, the existing condition:

((!fams2_enable && update_bw_bounding_box) || freq_changed)

does not guarantee this, as the freq_changed branch can evaluate to true
independently of the callback pointer.

This can result in calling update_bw_bounding_box() when it is NULL.

Fix this by separating the update condition from the pointer checks and
ensuring the callback, dc->clk_mgr, and bw_params are validated before
use.

Fixes the below:
../dc/hwss/dcn401/dcn401_hwseq.c:367 dcn401_init_hw() error: we previously assumed 'dc->res_pool->funcs->update_bw_bounding_box' could be null (see line 362)

Fixes: ca0fb243c3bb ("drm/amd/display: Underflow Seen on DCN401 eGPU")
Cc: Daniel Sa <Daniel.Sa@amd.com>
Cc: Alvin Lee <alvin.lee2@amd.com>
Cc: Roman Li <roman.li@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 86117c5ab42f21562fedb0a64bffea3ee5fcd477)
Cc: stable@vger.kernel.org

drm/amdgpu: Change AMDGPU_VA_RESERVED_TRAP_SIZE to 64KB

Currently, AMDGPU_VA_RESERVED_TRAP_SIZE is hardcoded to 8KB, while
KFD_CWSR_TBA_TMA_SIZE is defined as 2 * PAGE_SIZE. On systems with
4K pages, both values match (8KB), so allocation and reserved space
are consistent.

However, on 64K page-size systems, KFD_CWSR_TBA_TMA_SIZE becomes 128KB,
while the reserved trap area remains 8KB. This mismatch causes the
kernel to crash when running rocminfo or rccl unit tests.

Kernel attempted to read user page (2) - exploit attempt? (uid: 1001)
BUG: Kernel NULL pointer dereference on read at 0x00000002
Faulting instruction address: 0xc0000000002c8a64
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
CPU: 34 UID: 1001 PID: 9379 Comm: rocminfo Tainted: G E
6.19.0-rc4-amdgpu-00320-gf23176405700 #56 VOLUNTARY
Tainted: [E]=UNSIGNED_MODULE
Hardware name: IBM,9105-42A POWER10 (architected) 0x800200 0xf000006
of:IBM,FW1060.30 (ML1060_896) hv:phyp pSeries
NIP: c0000000002c8a64 LR: c00000000125dbc8 CTR: c00000000125e730
REGS: c0000001e0957580 TRAP: 0300 Tainted: G E
MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 24008268
XER: 00000036
CFAR: c00000000125dbc4 DAR: 0000000000000002 DSISR: 40000000
IRQMASK: 1
GPR00: c00000000125d908 c0000001e0957820 c0000000016e8100
c00000013d814540
GPR04: 0000000000000002 c00000013d814550 0000000000000045
0000000000000000
GPR08: c00000013444d000 c00000013d814538 c00000013d814538
0000000084002268
GPR12: c00000000125e730 c000007e2ffd5f00 ffffffffffffffff
0000000000020000
GPR16: 0000000000000000 0000000000000002 c00000015f653000
0000000000000000
GPR20: c000000138662400 c00000013d814540 0000000000000000
c00000013d814500
GPR24: 0000000000000000 0000000000000002 c0000001e0957888
c0000001e0957878
GPR28: c00000013d814548 0000000000000000 c00000013d814540
c0000001e0957888
NIP [c0000000002c8a64] __mutex_add_waiter+0x24/0xc0
LR [c00000000125dbc8] __mutex_lock.constprop.0+0x318/0xd00
Call Trace:
0xc0000001e0957890 (unreliable)
__mutex_lock.constprop.0+0x58/0xd00
amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x6fc/0xb60 [amdgpu]
kfd_process_alloc_gpuvm+0x54/0x1f0 [amdgpu]
kfd_process_device_init_cwsr_dgpu+0xa4/0x1a0 [amdgpu]
kfd_process_device_init_vm+0xd8/0x2e0 [amdgpu]
kfd_ioctl_acquire_vm+0xd0/0x130 [amdgpu]
kfd_ioctl+0x514/0x670 [amdgpu]
sys_ioctl+0x134/0x180
system_call_exception+0x114/0x300
system_call_vectored_common+0x15c/0x2ec

This patch changes AMDGPU_VA_RESERVED_TRAP_SIZE to 64 KB and
KFD_CWSR_TBA_TMA_SIZE to the AMD GPU page size. This means we reserve
64 KB for the trap in the address space, but only allocate 8 KB within
it. With this approach, the allocation size never exceeds the reserved
area.

Fixes: 34a1de0f7935 ("drm/amdkfd: Relocate TBA/TMA to opposite side of VM hole")
Reviewed-by: Christian König <christian.koenig@amd.com>
Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 31b8de5e55666f26ea7ece5f412b83eab3f56dbb)
Cc: stable@vger.kernel.org

drm/amdgpu/userq: fix memory leak in MQD creation error paths

In mes_userq_mqd_create(), the memdup_user() allocations for
IP-specific MQD structs are not freed when subsequent VA validation
fails. The goto free_mqd label only cleans up the MQD BO object and
userq_props.

Fix by adding kfree() before each goto free_mqd on VA validation
failure in the COMPUTE, GFX, and SDMA branches.

Fixes: 9e46b8bb0539 ("drm/amdgpu: validate userq buffer virtual address and size")
Reported-by: Yuhao Jiang <danisjiang@gmail.com>
Signed-off-by: Junrui Luo <moonafterrain@outlook.com>
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 27f5ff9e4a4150d7cf8b4085aedd3b77ddcc5d08)
Cc: stable@vger.kernel.org

drm/amd: Fix MQD and control stack alignment for non-4K

For gfxV9, due to a hardware bug ("based on the comments in the code
here [1]"), the control stack of a user-mode compute queue must be
allocated immediately after the page boundary of its regular MQD buffer.
To handle this, we allocate an enlarged MQD buffer where the first page
is used as the MQD and the remaining pages store the control stack.
Although these regions share the same BO, they require different memory
types: the MQD must be UC (uncached), while the control stack must be
NC (non-coherent), matching the behavior when the control stack is
allocated in user space.

This logic works correctly on systems where the CPU page size matches
the GPU page size (4K). However, the current implementation aligns both
the MQD and the control stack to the CPU PAGE_SIZE. On systems with a
larger CPU page size, the entire first CPU page is marked UC—even though
that page may contain multiple GPU pages. The GPU treats the second 4K
GPU page inside that CPU page as part of the control stack, but it is
incorrectly mapped as UC.

This patch fixes the issue by aligning both the MQD and control stack
sizes to the GPU page size (4K). The first 4K page is correctly marked
as UC for the MQD, and the remaining GPU pages are marked NC for the
control stack. This ensures proper memory type assignment on systems
with larger CPU page sizes.

[1]: https://elixir.bootlin.com/linux/v6.18/source/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c#L118

Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 998d6781410de1c4b787fdbf6c56e851ea7fa553)

Merge tag 'trace-rtla-v7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull rtla build fix from Steven Rostedt:

- Fix build failure when libbpf does not exist

   RTLA supports building without BPF libraries, but a recent change
   added a libbpf.h include outside of the BPF protection which caused
   build failures when libbpf was not installed.

* tag 'trace-rtla-v7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  rtla: Fix build without libbpf header

drm/amdkfd: Align expected_queue_size to PAGE_SIZE

The AQL queue size can be 4K, but the minimum buffer object (BO)
allocation size is PAGE_SIZE. On systems with a page size larger
than 4K, the expected queue size does not match the allocated BO
size, causing queue creation to fail.

Align the expected queue size to PAGE_SIZE so that it matches the
allocated BO size and allows queue creation to succeed.

Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit b01cd158a2f5230b137396c5f8cda3fc780abbc2)

drm/amdgpu: fix the idr allocation flags

Fix the IDR allocation flags by using atomic GFP
flags in non‑sleepable contexts to avoid the __might_sleep()
complaint.

  268.290239] [drm] Initialized amdgpu 3.64.0 for 0000:03:00.0 on minor 0
[  268.294900] BUG: sleeping function called from invalid context at ./include/linux/sched/mm.h:323
[  268.295355] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1744, name: modprobe
[  268.295705] preempt_count: 1, expected: 0
[  268.295886] RCU nest depth: 0, expected: 0
[  268.296072] 2 locks held by modprobe/1744:
[  268.296077]  #0: ffff8c3a44abd1b8 (&dev->mutex){....}-{4:4}, at: __driver_attach+0xe4/0x210
[  268.296100]  #1: ffffffffc1a6ea78 (amdgpu_pasid_idr_lock){+.+.}-{3:3}, at: amdgpu_pasid_alloc+0x26/0xe0 [amdgpu]
[  268.296494] CPU: 12 UID: 0 PID: 1744 Comm: modprobe Tainted: G     U     OE       6.19.0-custom #16 PREEMPT(voluntary)
[  268.296498] Tainted: [U]=USER, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[  268.296499] Hardware name: AMD Majolica-RN/Majolica-RN, BIOS RMJ1009A 06/13/2021
[  268.296501] Call Trace:

Fixes: 8f1de51f49be ("drm/amdgpu: prevent immediate PASID reuse case")
Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ea56aa2625708eaf96f310032391ff37746310ef)
Cc: stable@vger.kernel.org

drm/amdgpu: validate doorbell_offset in user queue creation

amdgpu_userq_get_doorbell_index() passes the user-provided
doorbell_offset to amdgpu_doorbell_index_on_bar() without bounds
checking. An arbitrarily large doorbell_offset can cause the
calculated doorbell index to fall outside the allocated doorbell BO,
potentially corrupting kernel doorbell space.

Validate that doorbell_offset falls within the doorbell BO before
computing the BAR index, using u64 arithmetic to prevent overflow.

Fixes: f09c1e6077ab ("drm/amdgpu: generate doorbell index for userqueue")
Reported-by: Yuhao Jiang <danisjiang@gmail.com>
Signed-off-by: Junrui Luo <moonafterrain@outlook.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit de1ef4ffd70e1d15f0bf584fd22b1f28cbd5e2ec)
Cc: stable@vger.kernel.org

drm/amdgpu/pm: drop SMU driver if version not matched messages

It just leads to user confusion.

Cc: Yang Wang <kevinyang.wang@amd.com>
Cc: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit e471627d56272a791972f25e467348b611c31713)
Cc: stable@vger.kernel.org

cpufreq: Add boost_freq_req QoS request

The Power Management Quality of Service (PM QoS) allows to
aggregate constraints from multiple entities. It is currently
used to manage the min/max frequency of a given policy.

Frequency constraints can come for instance from:
- Thermal framework: acpi_thermal_cpufreq_init()
- Firmware: _PPC objects: acpi_processor_ppc_init()
- User: by setting policyX/scaling_[min|max]_freq
The minimum of the max frequency constraints is used to compute
the resulting maximum allowed frequency.

When enabling boost frequencies, the same frequency request object
(policy->max_freq_req) as to handle requests from users is used.
As a result, when setting:
- scaling_max_freq
- boost
The last sysfs file used overwrites the request from the other
sysfs file.

To avoid this, create a per-policy boost_freq_req to save the boost
constraints instead of overwriting the last scaling_max_freq
constraint.

policy_set_boost() calls the cpufreq set_boost callback.
Update the newly added boost_freq_req request from there:
- whenever boost is toggled
- to cover all possible paths

In the existing .set_boost() callbacks:
- Don't update policy->max as this is done through the qos notifier
   cpufreq_notifier_max() which calls cpufreq_set_policy().
- Remove freq_qos_update_request() calls as the qos request is now
   done in policy_set_boost() and updates the new boost_freq_req

$ ## Init state
scaling_max_freq:1000000
cpuinfo_max_freq:1000000

$ echo 700000 > scaling_max_freq
scaling_max_freq:700000
cpuinfo_max_freq:1000000

$ echo 1 > ../boost
scaling_max_freq:1200000
cpuinfo_max_freq:1200000

$ echo 800000 > scaling_max_freq
scaling_max_freq:800000
cpuinfo_max_freq:1200000

$ ## Final step:
$ ## Without the patches:
$ echo 0 > ../boost
scaling_max_freq:1000000
cpuinfo_max_freq:1000000

$ ## With the patches:
$ echo 0 > ../boost
scaling_max_freq:800000
cpuinfo_max_freq:1000000

Note:
cpufreq_frequency_table_cpuinfo() updates policy->min
and max from:
A.
cpufreq_boost_set_sw()
\-cpufreq_frequency_table_cpuinfo()
B.
cpufreq_policy_online()
\-cpufreq_table_validate_and_sort()
  \-cpufreq_frequency_table_cpuinfo()
Keep these updates as some drivers expect policy->min and
max to be set through B.

Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/20260326204404.1401849-3-pierre.gondois@arm.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

cpufreq: Remove max_freq_req update for pre-existing policy

policy->max_freq_req QoS constraint represents the maximal allowed
frequency than can be requested. It is set by:
- writing to policyX/scaling_max sysfs file
- toggling the cpufreq/boost sysfs file

Upon calling freq_qos_update_request(), a successful update
of the max_freq_req value triggers cpufreq_notifier_max(),
followed by cpufreq_set_policy() which update the requested
frequency for the policy.
If the new max_freq_req value is not different from the
original value, no frequency update is triggered.

In a specific sequence of toggling:
- cpufreq/boost sysfs file
- CPU hot-plugging
a CPU could end up with boost enabled but running at the
maximal non-boost frequency, cpufreq_notifier_max() not being
triggered. The following fixed that:
commit 1608f0230510 ("cpufreq: Fix re-boost issue after hotplugging
a CPU")

The following:
commit dd016f379ebc ("cpufreq: Introduce a more generic way to
set default per-policy boost flag")
also fixed the issue by correctly setting the max_freq_req
constraint of a policy that is re-activated. This makes the
first fix unnecessary.

As the original issue is fixed by another method,
this patch reverts:
commit 1608f0230510 ("cpufreq: Fix re-boost issue after hotplugging
a CPU")

Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/20260326204404.1401849-2-pierre.gondois@arm.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

rcutorture: Test call_srcu() with preemption disabled and not

This commit tests invoking call_srcu() with preemption both enabled
and disabled, via acquiring of pi lock.

[ Joel: reword commit message. ]

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>

rcu: Add BOOTPARAM_RCU_STALL_PANIC Kconfig option

Add a Kconfig option to set the default value of the
kernel.panic_on_rcu_stall sysctl, allowing the kernel to be built
with panic-on-RCU-stall enabled by default.

This is useful for high-availability systems that require automatic
recovery (via panic_timeout) when a CPU stall is detected, without
needing userspace to configure the sysctl at boot.

This follows the pattern established by BOOTPARAM_SOFTLOCKUP_PANIC
and BOOTPARAM_HUNG_TASK_PANIC. The runtime sysctl can still override
the Kconfig default.

Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Gustavo Luiz Duarte <gustavold@gmail.com>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>

torture: Avoid modulo-zero error in torture_hrtimeout_ns()

Currently, all calls to torture_hrtimeout_ns() either provide a non-zero
fuzzt_ns or a NULL trsp, either of which avoids taking the modulus of a
zero-valued fuzzt_ns. But this code should do a better job of defending
itself, so this commit explicitly checks fuzzt_ns and avoids the modulus
when its value is zero.

Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>

rcu/nocb: Extract nocb_bypass_needs_flush() to reduce duplication

The bypass flush decision logic is duplicated in rcu_nocb_try_bypass()
and nocb_gp_wait() with similar conditions.

This commit therefore extracts the functionality into a common helper
function nocb_bypass_needs_flush() improving the code readability.

A flush_faster parameter is added to controlling the flushing thresholds
and timeouts. This design was in the original commit d1b222c6be1f
("rcu/nocb: Add bypass callback queueing") to avoid having the GP
kthread aggressively flush the bypass queue.

Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>

rcu/nocb: Consolidate rcu_nocb_cpu_offload/deoffload functions

The rcu_nocb_cpu_offload() and rcu_nocb_cpu_deoffload() functions are
nearly duplicates.

Therefore, extract the common logic into rcu_nocb_cpu_toggle_offload()
which takes an 'offload' boolean, and make both exported functions
simple wrappers.

This eliminates a bunch of duplicate code at the call sites, namely
mutex locking, CPU hotplug locking and CPU online checks.

Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>

rcu-tasks: Remove unnecessary smp_store_release() in cblist_init_generic()

The cblist_init_generic() is executed during the CPU early boot
phase due to commit:30ef09635b9e ("rcu-tasks: Initialize callback
lists at rcu_init() time"), at this time, only one boot CPU is
online and the irq is disabled. this commit therefore use routine
assignment replace of smp_store_release() and WRITE_ONCE() in the
cblist_init_generic().

Signed-off-by: Zqiang <qiang.zhang@linux.dev>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>

rcutorture: Add NOCB02 config for nocb poll mode testing

Add new rcutorture config NOCB02 that enables rcu_nocb_poll boot
parameter combined with CONFIG_RCU_NOCB_CPU to exercise the polling
mode code paths in the NOCB implementation.

This config exercises poll-mode paths not covered by other configs,
where callback invocation uses active polling instead of kthread
wakeups.

This config is not added to CFLIST to avoid increasing the default
test duration; it can be run explicitly when poll-mode testing
is needed.

Acked-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>

rcutorture: Add NOCB01 config for RCU_LAZY torture testing

Add new rcutorture config NOCB01 that enables CONFIG_RCU_LAZY combined
with CONFIG_RCU_NOCB_CPU to exercise the lazy callback code paths in
the NOCB implementation.

This config exercises lazy callback paths not covered by other configs,
including lazy-only wake and lazy defer logic.

This config is not added to CFLIST to avoid increasing the default
test duration; it can be run explicitly when lazy callback testing
is needed.

Acked-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>

rcu-tasks: Document that RCU Tasks Trace grace periods now imply RCU grace periods

Now that RCU Tasks Trace is implemented in terms of SRCU-fast, the fact
that each SRCU-fast grace period implies at least two RCU grace periods
in turn means that each RCU Tasks Trace grace period implies at least
two grace periods. This commit therefore updates the documentation
accordingly.

Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>

srcu: Fix s/they disables/they disable/ typo in srcu_read_unlock_fast()

Typo fix in srcu_read_unlock_fast() header comment.

Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>

srcu: Fix SRCU read flavor macro comments

The SRCU_READ_FLAVOR_FAST and SRCU_READ_FLAVOR_FAST_UPDOWN comments
need repair. The former fails to not that SRCU-fast can be used in NMI
handlers, and the latter says that it goes with srcu_read_lock_fast()
when it really goes with srcu_read_lock_fast_updown(). This commit
therefore fixes both comments.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>

rcuscale: Ditch rcu_scale_shutdown in favor of torture_shutdown_init()

The torture_shutdown_init() function spawns a shutdown kthread in
a manner very similar to that implemented by rcu_scale_shutdown().
This commit therefore re-implements rcu_scale_shutdown() in terms of
torture_shutdown_init().

This patch was generated by Claude given as input the patch making the
same transformation of ref_scale_shutdown().

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>

refscale: Ditch ref_scale_shutdown in favor of torture_shutdown_init()

The torture_shutdown_init() function spawns a shutdown kthread in
a manner very similar to that implemented by ref_scale_shutdown().
This commit therefore re-implements ref_scale_shutdown in terms of
torture_shutdown_init().

The initial draft of this patch was generated by version 2.1.16 of the
Claude AI/LLM, but trained and configured for use by my employer, and
prompted to refer to Linux-kernel source code. This initial draft failed
to provide a forward reference to ref_scale_cleanup(), passed zero to
torture_shutdown_init() for an unwelcome insta-shutdown, and failed to
pass the kvm.sh --duration argument in as a refscale module parameter.
On the other hand, it did catch the need to NULL main_task on the
post-test self-shutdown code path, which I might well have forgotten
to do.

This version of the patch fixes those problems, and in fact very little
of the initial draft remains.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>

rcutorture: Fix numeric "test" comparison in srcu_lockdep.sh

This commit switches from "-eq" to "=" to handle the non-numeric
comparisons in srcu_lockdep.sh. While in the area, adjust SRCU flavor
to improve coverage.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>

torture: Print informative message for test without recheck file

If a type of torture test lacks a recheck file, a bash diagnostic is
printed, which looks like a torture-test bug. This commit gets rid of
this false positive by explicitly checking for the file, invoking it if
it exists, otherwise printing an informative non-diagnostic message.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>

torture: Make hangs more visible in torture.sh output

This commit labels "QEMU killed" lines so that they will be picked up
by torture.sh processing.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>

kvm-check-branches.sh: Remove in favor of kvm-series.sh

The kvm-series.sh script is an order-of-magnitude optimization of
kvm-check-branches.sh, so remove the old script.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>

rcutorture: Add a textbook-style trivial preemptible RCU

This commit adds a trivial textbook implementation of preemptible RCU
to rcutorture ("torture_type=trivial-preempt"), similar in spirit to the
existing "torture_type=trivial" textbook implementation of non-preemptible
RCU. Neither trivial RCU implementation has any value for production use,
and are intended only to keep Paul honest in his introductory writings
and presentations.

[ paulmck: Apply kernel test robot feedback. ]

Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>

PM: EM: Fix NULL pointer dereference when perf domain ID is not found

dev_energymodel_nl_get_perf_domains_doit() calls
em_perf_domain_get_by_id() but does not check the return value before
passing it to __em_nl_get_pd_size(). When a caller supplies a
non-existent perf domain ID, em_perf_domain_get_by_id() returns NULL,
and __em_nl_get_pd_size() immediately dereferences pd->cpus
(struct offset 0x30), causing a NULL pointer dereference.

The sister handler dev_energymodel_nl_get_perf_table_doit() already
handles this correctly via __em_nl_get_pd_table_id(), which returns
NULL and causes the caller to return -EINVAL. Add the same NULL check
in the get-perf-domains do handler.

Fixes: 380ff27af25e ("PM: EM: Add dump to get-perf-domains in the EM YNL spec")
Reported-by: Yi Lai <yi1.lai@linux.intel.com>
Closes: https://lore.kernel.org/lkml/aXiySM79UYfk+ytd@ly-workstation/
Signed-off-by: Changwoo Min <changwoo@igalia.com>
Cc: 6.19+ <stable@vger.kernel.org> # 6.19+
[ rjw: Subject and changelog edits ]
Link: https://patch.msgid.link/20260329073615.649976-1-changwoo@igalia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

lib/crypto: tests: Migrate ChaCha20Poly1305 self-test to KUnit

Move the ChaCha20Poly1305 test from an ad-hoc self-test to a KUnit test.

Keep the same test logic for now, just translated to KUnit.

Moving to KUnit has multiple benefits, such as:

- Consistency with the rest of the lib/crypto/ tests.

- Kernel developers familiar with KUnit, which is used kernel-wide, can
  quickly understand the test and how to enable and run it.

- The test will be automatically run by anyone using
  lib/crypto/.kunitconfig or KUnit's all_tests.config.

- Results are reported using the standard KUnit mechanism.

- It eliminates one of the few remaining back-references to crypto/ from
  lib/crypto/, specifically a reference to CONFIG_CRYPTO_SELFTESTS.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260327224229.137532-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

lib/crypto: sparc: Drop optimized MD5 code

MD5 is obsolete. Continuing to maintain architecture-optimized
implementations of MD5 is unnecessary and risky. It diverts resources
from the modern algorithms that are actually important.

While there was demand for continuing to maintain the PowerPC optimized
MD5 code to accommodate userspace programs that are misusing AF_ALG
(https://lore.kernel.org/linux-crypto/c4191597-341d-4fd7-bc3d-13daf7666c41@csgroup.eu/),
no such demand has been seen for the SPARC optimized MD5 code.

Thus, let's drop it and focus effort on the more modern SHA algorithms,
which already have optimized code for SPARC.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260326203341.60393-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

lib/crypto: mips: Drop optimized MD5 code

MD5 is obsolete.  Continuing to maintain architecture-optimized
implementations of MD5 is unnecessary and risky.  It diverts resources
from the modern algorithms that are actually important.

While there was demand for continuing to maintain the PowerPC optimized
MD5 code to accommodate userspace programs that are misusing AF_ALG
(https://lore.kernel.org/linux-crypto/c4191597-341d-4fd7-bc3d-13daf7666c41@csgroup.eu/),
no such demand has been seen for the MIPS Cavium Octeon optimized MD5
code.  Note that this code runs on only one particular line of SoCs.

Thus, let's drop it and focus effort on the more modern SHA algorithms,
which already have optimized code for the same SoCs.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260326204824.62010-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

PCI/AER: Stop ruling out unbound devices as error source

When searching for the error source, the AER driver rules out devices whose
enable_cnt is zero.  This was introduced in 2009 by commit 28eb27cf0839
("PCI AER: support invalid error source IDs") without providing a
rationale.

Drivers typically call pci_enable_device() on probe, hence the enable_cnt
check essentially filters out unbound devices.  At the time of the commit,
drivers had to opt in to AER by calling pci_enable_pcie_error_reporting()
and so any AER-enabled device could be assumed to be bound to a driver.
The check thus made sense because it allowed skipping config space accesses
to devices which were known not to be the error source.

But since 2022, AER is universally enabled on all devices when they are
enumerated, cf. commit f26e58bf6f54 ("PCI/AER: Enable error reporting when
AER is native").

Errors may very well be reported by unbound devices, e.g. due to link
instability.  By ruling them out as error source, errors reported by them
are neither logged nor cleared.  When they do get bound and another error
occurs, the earlier error is reported together with the new error, which
may confuse users.  Stop doing so.

Fixes: f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is native")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Stefan Roese <stefan.roese@mailbox.org>
Cc: stable@vger.kernel.org # v6.0+
Link: https://patch.msgid.link/734338c2e8b669db5a5a3b45d34131b55ffebfca.1774605029.git.lukas@wunner.de

drm/amdgpu: use multiple entities in amdgpu_move_blit

Thanks to "drm/ttm: rework pipelined eviction fence handling", ttm
can deal correctly with moves and evictions being executed from
different contexts.

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd: Fix queue preemption/eviction failures by aligning control stack size to GPU page size

The control stack size is calculated based on the number of CUs and
waves, and is then aligned to PAGE_SIZE. When the resulting control
stack size is aligned to 64 KB, GPU hangs and queue preemption
failures are observed while running RCCL unit tests on systems with
more than two GPUs.

amdgpu 0048:0f:00.0: amdgpu: Queue preemption failed for queue with
doorbell_id: 80030008
amdgpu 0048:0f:00.0: amdgpu: Failed to evict process queues
amdgpu 0048:0f:00.0: amdgpu: GPU reset begin!. Source: 4
amdgpu 0048:0f:00.0: amdgpu: Queue preemption failed for queue with
doorbell_id: 80030008
amdgpu 0048:0f:00.0: amdgpu: Failed to evict process queues
amdgpu 0048:0f:00.0: amdgpu: Failed to restore process queues

This issue is observed on both 4 KB and 64 KB system page-size
configurations.

This patch fixes the issue by aligning the control stack size to
AMDGPU_GPU_PAGE_SIZE instead of PAGE_SIZE, so the control stack size
will not be 64 KB on systems with a 64 KB page size and queue
preemption works correctly.

Additionally, In the current code, wg_data_size is aligned to PAGE_SIZE,
which can waste memory if the system page size is large. In this patch,
wg_data_size is aligned to AMDGPU_GPU_PAGE_SIZE. The cwsr_size, calculated
from wg_data_size and the control stack size, is aligned to PAGE_SIZE.

Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: use TTM_NUM_MOVE_FENCES when reserving fences

Use TTM_NUM_MOVE_FENCES as an upperbound of how many fences
ttm might need to deal with moves/evictions.

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: round robin through clear_entities in amdgpu_fill_buffer

This makes clear of different BOs run in parallel. Partial jobs to
clear a single BO still execute sequentially.

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: allocate move entities dynamically

No functional change for now, as we always allocate a single entity.

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: allocate clear entities dynamically

No functional change for now, as we always allocate a single entity
and use it everywhere.

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd: fix kernel crash on releasing NULL sysfs entry

there is an abnormal case that When a process re-opens kfd
with different mm_struct(execve() called by user), the
allocated p->kobj will be freed, but missed setting it to NULL,
that will cause sysfs/kernel crash with NULL pointers in p->kobj
on kfd_process_remove_sysfs() when releasing process, and the
similar error on kfd_procfs_del_queue() as well.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: Unify version check in SMUv14

Use common helper function for firmware version check and logging in
SMUv14

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Add Idle state manager(ISM)

[Why]

Rapid allow/disallow of idle optimization calls, whether it be IPS or
self-refresh features, can end up using more power if actual
time-in-idle is low. It can also spam DMUB command submission in a way
that prevents it from servicing other requestors.

[How]

Introduce the Idle State Manager (ISM) to amdgpu. It maintains a finite
state machine that uses a hysteresis to determine if a delay should be
inserted between a caller allowing idle, and when the actual idle
optimizations are programmed.

A second timer is also introduced to enable static screen optimizations
(SSO) such as PSR1 and Replay low HZ idle mode. Rapid SSO enable/disable
can have a negative power impact on some low hz video playback, and can
introduce user lag for PSR1 (due to up to 3 frames of sync latency).

This effectively rate-limits idle optimizations, based on hysteresis.

This also replaces the existing delay logic used for PSR1, allowing
drm_vblank_crtc_config.disable_immediate = true, and thus allowing
drm_crtc_vblank_restore().

v2:
* Loosen criteria for ISM to exit idle optimizations; it failed to exit
  idle correctly on cursor updates when there are no drm_vblank
  requestors,
* Document default_ism_config
* Convert pr_debug to trace events to reduce overhead on frequent
  codepaths
* checkpatch.pl fixes

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/4527
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3709
Fixes: 58a261bfc967 ("drm/amd/display: use a more lax vblank enable policy for older ASICs")
Signed-off-by: Ray Wu <ray.wu@amd.com>
Signed-off-by: Leo Li <sunpeng.li@amd.com>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

ASoC: amd: acp: update dmic_num logic for acp pdm dmic

Vijendar Mukunda <Vijendar.Mukunda@amd.com> says:

This patch series updates the dmic_num logic for acp pdm dmic and
renames the dmic component name in acp soundwire legacy machine driver.

ASoC: amd: acp-sdw-legacy: rename the dmic component name

For acp pdm dmic use case, user space needs a reliable identifier
to select the correct UCM configuration. Rename component string
as acp-dmic to select the correct UCM configuration for acp pdm dmic.

Signed-off-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com>
Link: https://patch.msgid.link/20260330072431.3512358-3-Vijendar.Mukunda@amd.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: amd: acp: update dmic_num logic for acp pdm dmic

Currently there is no mechanism to read dmic_num in mach_params
structure. In this scenario mach_params->dmic_num check always
returns 0 which fails to add component string for dmic.
Update the condition check with acp pdm dmic quirk check and
pass the dmic_num as 1.

Fixes: 2981d9b0789c ("ASoC: amd: acp: add soundwire machine driver for legacy stack")
Signed-off-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com>
Link: https://patch.msgid.link/20260330072431.3512358-2-Vijendar.Mukunda@amd.com
Signed-off-by: Mark Brown <broonie@kernel.org>

drm/amdgpu/gfx11: Add Cleaner Shader Support for GFX11.5.4

The Cleaner Shader is responsible for clearing LDS, VGPRs and SGPRs
between GPU workloads to enforce process isolation and avoid data
leakage.

The cleaner shader clears per-wave GPU state (LDS, VGPRs and SGPRs)
between workloads, improving process isolation and preventing stale data
from being observed by subsequent tasks.

This reuses the existing cleaner shader used on GFX11.0.3 and enables it
for GFX11.5.4 GPUs when firmware requirements are met.

Cc: Muhammad Adam <muhammad.adam@amd.com>
Cc: Mario Sopena-Novales <mario.novales@amd.com>
Cc: Tom Wu <Tom.Wu@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Fix wait after reset sequence in S4

For a mode-1 reset done at the end of S4 on PSPv11 dGPUs, only check if
TOS is unloaded.

Fixes: 32f73741d6ee ("drm/amdgpu: Wait for bootloader after PSPv11 reset")
Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/4853
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd: Switch to dev_* printk stuff in kfd_int_process_v12_1.c

dev_* printk stuff is multi-GPU friendly.

Use dev_warn_ratelimited() for print_sq_intr_info_error() which is
consistent with previous IPs.

Use dev_dbg_ratelimited() for irrelevant node interrupt print to
avoid too much noise.

Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: Unify version check in SMUv12

Use common helper function for firmware version check and logging in
SMUv12.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: Unify version check in SMUv11

Use common helper function for firmware version check and logging in
SMUv11

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Promote DC to 3.2.376

This version brings along following fixes:

- correct unknown plane state patch
- Revert "Refactor DC update checks"
- Revert "Add 3DLUT DMA broadcast support"
- Remove invalid DPSTREAMCLK mask usage
- enable eDP DSC seamless boot support
- Revert "Rework HDMI link training and YCbCr422 with DSC policy"
- Disable PSR & Replay CRTC disable by default
- Fix Silence Compiler Warnings
- Add link output control for DPIA
- eliminate clock manager code duplication
- Don't set 4to1MPC config dynamically
- Merge pipes for validate
- Fix bounds checking in dml2_0 clock table array
- Avoid turning off the PHY when OTG is running for DVI
- Should support p-state under dcn21
- Enable Replay support for dcn42
- Remove check for DC_DMCUB_ENABLE on DCN42

Acked-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: [FW Promotion] Release 0.1.53.0

[Why]
dmu: Parse freesync mccs vcp code

Acked-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Silence type conversion warnings in dml2

[Why]
Compiler build generates type conversion warnings throughout dc/dml2_0
where values are implicitly narrowed (e.g. int/uint32_t/uint64_t assigned
to uint8_t, unsigned char, char, bool, or dml_bool_t), cluttering build
output and masking genuine issues.

[How]
Add explicit casts at each narrowing assignment with ASSERT guards
to catch out-of-range values in debug builds:
- uint8_t: otg_inst, num_planes, pipe_idx, vblank_index fields
- unsigned char: pipe_dlg_param.otg_inst from tg->inst
- char: mcache num_pipes from num_dpps_required
- bool/dml_bool_t: INTERLACE bitfield and fams2 enable flag use != 0
- uint64_t: widen min_hardware_refresh_in_uhz to hold div64_u64 result,
then cast to unsigned long for min_refresh_uhz with ASSERT

Reviewed-by: Austin Zheng <austin.zheng@amd.com>
Signed-off-by: Gaghik Khachatrian <gaghik.khachatrian@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Fix Compiler warnings in dmub

[Why]
Resolve compiler warnings by marking unused parameters explicitly.

[How]
In .c and .h files, keep parameter names in signatures and add a
line with`(void)param;` inside the function body

Preserved function signatures and avoids breaking code paths that
may reference the parameter under conditional compilation.

Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Gaghik Khachatrian <gaghik.khachatrian@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Fixed Silence complier warnings in dc

[Why]
Resolve compiler warnings by marking unused parameters explicitly.

[How]
In .c and .h function definitions, keep parameter names
in signatures and add a line with `(void)param;` in function body

Preserved function signatures and avoids breaking code paths that
may reference the parameter under conditional compilation.

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Reviewed-by: Austin Zheng <austin.zheng@amd.com>
Signed-off-by: Gaghik Khachatrian <gaghik.khachatrian@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

ASoC: SOF: topology: use kzalloc_flex

Simplify allocation by using a flexible array member.

Add __counted_by for extra runtime analysis.

Signed-off-by: Rosen Penev <rosenp@gmail.com>
Link: https://patch.msgid.link/20260326023053.53493-1-rosenp@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>

drm/amd/display: Move FPU Guards From DML To DC - Part 3

[Why]
FPU guards (DC_FP_START/DC_FP_END) are required to wrap around code that
can manipulates floats. To do this properly, the FPU guards must be used
in a file that is not compiled as a FPU unit. If the guards are used in
a file that is a FPU unit, other sections in the file that aren't guarded
may be end up being compiled to use FPU operations.

[How]
Added DC_FP_START and DC_FP_END to DC functions that call DML functions
using FPU.

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Rafal Ostrowski <rafal.ostrowski@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

ASoC: generic: keep fallback dai_name stable across rebind

simple_parse_dai() and graph_util_parse_dai() first try to identify a
DAI via dai_args. When that works the card can rebind without relying on
dlc->dai_name.

The fallback path still calls snd_soc_get_dlc(), which returns a
borrowed dai_name pointer. If the CPU or codec component is unbound
while the sound card stays registered, the generic card keeps that
pointer and the next rebind may compare stale memory while matching the
DAI.

Stage the fallback result in a temporary dai_link_component and move
only a card-owned copy of dai_name into the live link component. Use
devm_kstrdup_const() so static names are reused and dynamic ones remain
valid for the lifetime of the card device.

Suggested-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Cássio Gabriel <cassiogabrielcontato@gmail.com>
Link: https://patch.msgid.link/20260327-asoc-generic-fallback-dai-name-rebind-v3-1-c206e44f40c8@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>

drm/amd/display: Move FPU Guards From DML To DC - Part 2

[Why]
FPU guards (DC_FP_START/DC_FP_END) are required to wrap around code that
can manipulates floats. To do this properly, the FPU guards must be used
in a file that is not compiled as a FPU unit. If the guards are used in
a file that is a FPU unit, other sections in the file that aren't guarded
may be end up being compiled to use FPU operations.

[How]
Removed DC_FP_START and DC_FP_END.

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Rafal Ostrowski <rafal.ostrowski@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Move FPU Guards From DML To DC - Part 1

[Why]
FPU guards (DC_FP_START/DC_FP_END) are required to wrap around code that
can manipulates floats. To do this properly, the FPU guards must be used
in a file that is not compiled as a FPU unit. If the guards are used in
a file that is a FPU unit, other sections in the file that aren't guarded
may be end up being compiled to use FPU operations.

[How]
Added DC_FP_START and DC_FP_END to DC functions that call DML functions
using FPU.

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Rafal Ostrowski <rafal.ostrowski@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add support to query vram info from firmware

add support to query vram info from firmware

v2: change APU vram type, add multi-aid check
v3: seperate vram info query function into 3 parts and
call them in a helper func when requirements
are met.
v4: calculate vram_width for v9.x

Signed-off-by: Gangliang Xie <ganglxie@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Merge branch 'for-7.0-fixes' into for-7.1

Conflict in kernel/sched/ext.c init_sched_ext_class() between:

  415cb193bb97 ("sched_ext: Fix SCX_KICK_WAIT deadlock by deferring wait
  to balance callback")

which adds cpus_to_sync cpumask allocation, and:

  84b1a0ea0b7c ("sched_ext: Implement scx_bpf_dsq_reenq() for user DSQs")
  8c1b9453fde6 ("sched_ext: Convert deferred_reenq_locals from llist to
  regular list")

which add deferred_reenq init code at the same location. Both are
independent additions. Include both.

Signed-off-by: Tejun Heo <tj@kernel.org>

drm/amd/display: correct unknown plane state patch

[why]
dcn42x is using same gfx as dcn35, i.e. not use gfx_address3.

Reviewed-by: Ovidiu Bunea <ovidiu.bunea@amd.com>
Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Revert "drm/amd/display: Refactor DC update checks"

Revert commit c24bb00cc6cf ("drm/amd/display: Refactor DC update checks")

[WHY]
Causing issues with PSR/Replay, reverting until those can be fixed.

Reviewed-by: Martin Leung <Martin.Leung@amd.com>
Signed-off-by: Dillon Varone <Dillon.Varone@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Revert "drm/amd/display: Add 3DLUT DMA broadcast support"

Revert commit 7d59465de38e ("drm/amd/display: Add 3DLUT DMA broadcast support")

[WHY&HOW]
Dependencies of this change are still causing issues, so reverting until
those can be fixed.

Reviewed-by: Martin Leung <Martin.Leung@amd.com>
Signed-off-by: Dillon Varone <Dillon.Varone@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Remove invalid DPSTREAMCLK mask usage

[Why]
The invalid register field access causes ASSERT(mask != 0) to fire
in set_reg_field_values() during display enable.

WARNING: at drivers/gpu/drm/amd/amdgpu/../display/dc/dc_helper.c:100
set_reg_field_values.isra.0+0xcf/0xf0 [amdgpu]
Call Trace:
<TASK>
generic_reg_update_ex+0x66/0x1d0 [amdgpu]
dccg401_set_dpstreamclk+0xed/0x350 [amdgpu]
dcn401_enable_stream+0x165/0x370 [amdgpu]
link_set_dpms_on+0x6e9/0xe90 [amdgpu]
dce110_apply_single_controller_ctx_to_hw+0x343/0x530 [amdgpu]
dce110_apply_ctx_to_hw+0x1f6/0x2d0 [amdgpu]
dc_commit_state_no_check+0x49a/0xe20 [amdgpu]
dc_commit_streams+0x354/0x570 [amdgpu]
amdgpu_dm_atomic_commit_tail+0x6f8/0x3fc0 [amdgpu]

DCN4.x hardware does not have DPSTREAMCLK_GATE_DISABLE and
DPSTREAMCLK_ROOT_GATE_DISABLE fields in DCCG_GATE_DISABLE_CNTL3.
These global fields only exist in DCN3.1.x hardware.

[How]
Remove the call that tries to update non-existent fields in CNTL3.
DCN4.x uses per-instance fields in CNTL5 instead,
which are already correctly programmed in the switch cases above.

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Roman Li <Roman.Li@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: using cm structure for lut3d related info

[Why]
Using the alternative implementation via cm structure of config
lut3d data

Reviewed-by: Austin Zheng <austin.zheng@amd.com>
Signed-off-by: Dillon Varone <Dillon.Varone@amd.com>
Signed-off-by: ChuanYu Tseng <ChuanYu.Tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: enable eDP DSC seamless boot support

[Why]
VBIOS supports DSC for seamless boot on newer hardware.
Reading hardware state allows proper DSC validation without breaking
existing boot display.

[What]
Remove DSC block for boot timing validation and implement hardware state
reading to populate DSC configuration from VBIOS-configured state.
Enhance dsc_read_state function in DCN401 to read additional
DSC parameters.

Reviewed-by: Yihan Zhu <yihan.zhu@amd.com>
Signed-off-by: Mohit Bawa <Mohit.Bawa@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Revert "drm/amd/display: Rework YCbCr422 DSC policy"

Revert commit 19b79e4f2182 ("drm/amd/display: Rework YCbCr422 DSC policy")

Reason for Revert:
This commit is causing compliance failures

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Relja Vojvodic <Relja.Vojvodic@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/dc: Disable PSR & Replay CRTC disable by default

[why & how]
Let IPS FSM handle OTG disable.

Reviewed-by: Leo Chen <leo.chen@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Ovidiu Bunea <ovidiu.bunea@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Fixed silence signed/unsigned mismatch warnings

Fix compiler warnings by consistently use the same signedness for
a given value

Reviewed-by: Joshua Aberback <joshua.aberback@amd.com>
Signed-off-by: Clay King <clayking@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/dc: Add link output control for DPIA

[Why]
To support specific sequencing requirements for DPIA link output

[How]
Implement the dpia_link_hwss structure and define the necessary
control function pointers. The initialization order is
aligned with the core link_hwss definition to ensure consistency

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Lincheng Ku <LinCheng.Ku@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Fix silence signed/unsigned mismatch warnings in dml

[Why & How]
Fix signed/unsigned mismatch warnings by using the same signedness for a
given value

Reviewed-by: Austin Zheng <austin.zheng@amd.com>
Signed-off-by: Clay King <clayking@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Fix Silence signed/unsighed mismatch warning in dc

[Why]
Implicit signed-to-unsigned conversions caused compiler
warnings in DC paths.

[How]
Added explicit (unsigned int)/(uint32_t) casts for sentinel -1
assignments and IRQ ~MASK initializers, with small cast alignment
in logging/DPCD code.

Functionality and behavior is unchanged; only type intent is explicit.

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Gaghik Khachatrian <gaghik.khachatrian@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: eliminate clock manager code duplication

[Why]
Clock manager contained significant duplicate code between
variants with identical logic for functions using only SMU
calls or shared registers. This increases maintenance overhead
and potential for bugs.

[How]
Expose clock constants and internal functions in header for
sharing. Remove duplicate implementations and update function
pointers to use shared functions. Refactor remaining
variant-specific functions to use shared constants and helper
functions. Add compatibility comments for hardware differences.

Reviewed-by: Dmytro Laktyushkin <dmytro.laktyushkin@amd.com>
Signed-off-by: Gabe Teeger <gabe.teeger@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Fix Silence Conversion Warnings in Dmub

Fix Conversion that might result in a loss of data  warnings in dmub/src/:

- dmub_dcn20/31/32/35/42/60/401.c: Add ASSERT(value <= 0xFF) and
  explicit (uint8_t) cast when storing REG_GET results into uint8_t
  debug struct fields. Add != 0 for bool assignments from uint32_t
  bitfield reads.
- dmub_reg.c: Cast va_arg shift value to uint8_t with ASSERT guard
  before passing to set_reg_field_value_masks().
- dmub_srv.c: Widen num_pending to uint64_t to match uint64_t
  arithmetic; use != 0 for bool assignments from unsigned expressions.

No functional change intended.

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Gaghik Khachatrian <gaghik.khachatrian@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Don't set 4to1MPC config dynamically

We were previously modifying the global dc->config.enable_4to1MPC
dynamically. These variables are meant as global configs, not to
by dynamically modified. Modifying them dynamically prevents us
from enabling/disabling functionality for debug purposes and can
easily lead to bad things since we're not operating on the current
state but on DC-wide variables.

Instead we should look at the existing split4mpc decision in
dcn20_validate_apply_split_flags and make the decision there,
if the global config.enable_4to1MPC is set to true for the
DCN version we're running.

This fixes corruption that is observed when running a new IGT
kms_colorop test for color-space-conversion that uses a
YUV plane and outputs to a writeback connector.

Co-developed by Claude Sonnet 4.5.

Assisted-by: Claude:claude-sonnet-4.5
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Fix NULL pointer dereference in dcn401_init_hw()

dcn401_init_hw() assumes that update_bw_bounding_box() is valid when
entering the update path. However, the existing condition:

((!fams2_enable && update_bw_bounding_box) || freq_changed)

does not guarantee this, as the freq_changed branch can evaluate to true
independently of the callback pointer.

This can result in calling update_bw_bounding_box() when it is NULL.

Fix this by separating the update condition from the pointer checks and
ensuring the callback, dc->clk_mgr, and bw_params are validated before
use.

Fixes the below:
../dc/hwss/dcn401/dcn401_hwseq.c:367 dcn401_init_hw() error: we previously assumed 'dc->res_pool->funcs->update_bw_bounding_box' could be null (see line 362)

Fixes: ca0fb243c3bb ("drm/amd/display: Underflow Seen on DCN401 eGPU")
Cc: Daniel Sa <Daniel.Sa@amd.com>
Cc: Alvin Lee <alvin.lee2@amd.com>
Cc: Roman Li <roman.li@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

driver core: Make deferred_probe_timeout default a Kconfig option

Code using driver_deferred_probe_check_state() differs from most
EPROBE_DEFER handling in the kernel. Where other EPROBE_DEFER handling
(e.g. clks, gpios and regulators) waits indefinitely for suppliers to
show up, code using driver_deferred_probe_check_state() will fail
after the deferred_probe_timeout.

This is a problem for generic distro kernels which want to support many
boards using a single kernel build. These kernels want as much drivers to
be modular as possible. The initrd also should be as small as possible,
so the initrd will *not* have drivers not needing to get the rootfs.

Combine this with waiting for a full-disk encryption password in
the initrd and it is pretty much guaranteed that the default 10s timeout
will be hit, causing probe() failures when drivers on the rootfs happen
to get modprobe-d before other rootfs modules providing their suppliers.

Make the default timeout configurable from Kconfig to allow distro kernel
configs where many of the supplier drivers are modules to set the default
through Kconfig.

Reviewed-by: Saravana Kannan <saravanak@kernel.org>
Signed-off-by: Hans de Goede <johannes.goede@oss.qualcomm.com>
Link: https://patch.msgid.link/20260314084916.10868-1-johannes.goede@oss.qualcomm.com
[ Drop deferred_probe_timeout documentation change in
kernel-parameters.txt. - Danilo ]
Signed-off-by: Danilo Krummrich <dakr@kernel.org>

drm/amd/display: Merge pipes for validate

Validation expects to operate on non-split pipes. This is
seen in dcn20_fast_validate_bw, which merges pipes for
validation. We weren't doing that in the non-fast path
which lead to validation failures when operating with
4-to-1 MPC and a writeback connector.

Co-developed by Claude Sonnet 4.5

Assisted-by: Claude:claude-sonnet-4.5
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Add update_descriptor param info in 'update_planes_and_stream_state'

Add missing info for the update_descriptor parameter in
update_planes_and_stream_state().

Fixes the below with gcc W=1:
../display/dc/core/dc.c:3630 function parameter 'update_descriptor' not described in 'update_planes_and_stream_state'

Fixes: c24bb00cc6cf ("drm/amd/display: Refactor DC update checks")
Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Cc: Dillon Varone <Dillon.Varone@amd.com>
Cc: Chuanyu Tseng <chuanyu.tseng@amd.com>
Cc: Roman Li <roman.li@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Add NULL check for integrated_info in clk_mgr_construct

clk_mgr_construct() initializes display clock and memory bandwidth
settings during driver bring-up.

As part of this, the driver selects a watermark table based on the
memory type (DDR4, LPDDR4, LPDDR5) from ctx->dc_bios->integrated_info.

The display pipeline continuously reads pixel data from memory,
processes it (such as scaling, color conversion, and blending), and
sends it to the screen. To keep this pipeline running smoothly, the
driver must ensure there is enough memory bandwidth and that clocks are
increased when needed.

Watermark tables define when the GPU should increase clocks to ensure
there is enough bandwidth to feed pixel data without underflow.

However, ctx->dc_bios->integrated_info is dereferenced without checking
for NULL in multiple clk_mgr_construct() implementations. On some
platforms, BIOS may not provide this information, and accessing it
directly can cause a NULL pointer dereference during initialization.

Fix this by adding a NULL check before accessing integrated_info.

If integrated_info is not available, the driver safely falls back to
default watermark tables.

Fixes:
../dcn21/rn_clk_mgr.c:775 rn_clk_mgr_construct() warn: variable dereferenced before check 'ctx->dc_bios->integrated_info' (see line 743)
../dcn301/vg_clk_mgr.c:750 vg_clk_mgr_construct() warn: variable dereferenced before check 'ctx->dc_bios->integrated_info' (see line 736)
../dcn31/dcn31_clk_mgr.c:789 dcn31_clk_mgr_construct() warn: variable dereferenced before check 'ctx->dc_bios->integrated_info' (see line 728)
../dcn314/dcn314_clk_mgr.c:906 dcn314_clk_mgr_construct() warn: variable dereferenced before check 'ctx->dc_bios->integrated_info' (see line 845)
../dcn315/dcn315_clk_mgr.c:716 dcn315_clk_mgr_construct() warn: variable dereferenced before check 'ctx->dc_bios->integrated_info' (see line 655)
../dcn316/dcn316_clk_mgr.c:660 dcn316_clk_mgr_construct() warn: variable dereferenced before check 'ctx->dc_bios->integrated_info' (see line 639)
../dcn35/dcn35_clk_mgr.c:1540 dcn35_clk_mgr_construct() warn: variable dereferenced before check 'ctx->dc_bios->integrated_info' (see line 1467)

Fixes: 25879d7b4986 ("drm/amd/display: Clean FPGA code in dc")
Cc: Roman Li <roman.li@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>