git.ipfire.org Git - thirdparty/kernel/stable.git/log
thirdparty/kernel/stable.git
3 weeks ago drm/radeon/radeon_connectors: remove radeon_connector_free_edid
Joshua Peisach [Sat, 23 May 2026 14:27:48 +0000 (10:27 -0400)] 
drm/radeon/radeon_connectors: remove radeon_connector_free_edid

Since we are using struct drm_edid, we can call drm_edid_free directly.
Also make sure to set the pointer to NULL afterwards.
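
A minimal userspace sketch of the free-then-clear pattern described above. The names `struct fake_edid`, `struct fake_connector`, `fake_edid_free()` and `connector_drop_edid()` are hypothetical stand-ins and are not the radeon or DRM types.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the EDID object and the connector. */
struct fake_edid { unsigned char raw[128]; };
struct fake_connector { struct fake_edid *edid; };

/* Stand-in for the real free helper; free() accepts NULL. */
static void fake_edid_free(struct fake_edid *edid)
{
	free(edid);
}

/* Free the cached EDID and clear the pointer so that a later call
 * cannot free or dereference stale memory. */
static void connector_drop_edid(struct fake_connector *c)
{
	fake_edid_free(c->edid);
	c->edid = NULL;
}
```

Because the pointer is cleared, calling `connector_drop_edid()` twice in a row is harmless in this sketch.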

Signed-off-by: Joshua Peisach <jpeisach@ubuntu.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/radeon/radeon_connectors: use struct drm_edid instead of struct edid
Joshua Peisach [Sat, 23 May 2026 14:27:47 +0000 (10:27 -0400)] 
drm/radeon/radeon_connectors: use struct drm_edid instead of struct edid

This was done with amdgpu, just bringing the same patch to radeon.

The goal of this is to stop using the deprecated edid functions,
specifically drm_connector_update_edid_property. Switch to struct
drm_edid and the appropriate function replacements for the new type.

Also, for audio, use the raw edid for SADB allocations and for
equivalent drm_edid_is_digital expressions.

Signed-off-by: Joshua Peisach <jpeisach@ubuntu.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amd/display: Initialize dsc_caps to 0
Ivan Lipski [Wed, 13 May 2026 21:53:57 +0000 (17:53 -0400)] 
drm/amd/display: Initialize dsc_caps to 0

[Why&How]
If we don't do that we make DSC decisions based on random
inputs, which might result in disallowing DSC when the
monitor and HW support it.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu: fix calling VM invalidation in amdgpu_hmm_invalidate_gfx
Christian König [Wed, 18 Feb 2026 11:31:29 +0000 (12:31 +0100)] 
drm/amdgpu: fix calling VM invalidation in amdgpu_hmm_invalidate_gfx

Otherwise we don't invalidate page tables on next CS.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Tested-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu: fix amdgpu_hmm_range_get_pages
Christian König [Wed, 18 Feb 2026 11:53:27 +0000 (12:53 +0100)] 
drm/amdgpu: fix amdgpu_hmm_range_get_pages

The notifier sequence must only be read once or otherwise we could work
with invalid pages.

While at it also fix the coding style, e.g. drop the pre-initialized
return value and use the common define for 2G range.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Tested-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amd/ras: cap pending_ecc_list size
Stanley.Yang [Mon, 11 May 2026 11:44:16 +0000 (19:44 +0800)] 
drm/amd/ras: cap pending_ecc_list size

Drop new entries once pending_ecc_count hits RAS_UMC_PENDING_ECC_MAX
(8192) so an ECC storm or repeated UMC error injection cannot exhaust
kernel memory. Dropped events are counted and reported via a
rate-limited warning.
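
The capping behaviour can be sketched roughly as follows. This is an illustration only: `struct pending_ecc` and `pending_ecc_add()` are hypothetical, the real driver keeps a list of ECC records rather than a bare counter, and only the 8192 limit comes from the text above.

```c
#include <stdbool.h>

#define PENDING_ECC_MAX 8192u  /* value of RAS_UMC_PENDING_ECC_MAX per the commit text */

/* Hypothetical bookkeeping for the pending list. */
struct pending_ecc {
	unsigned int count;     /* entries currently queued */
	unsigned long dropped;  /* entries rejected because the cap was hit */
};

/* Returns true if the entry was accepted, false if it was dropped. */
static bool pending_ecc_add(struct pending_ecc *p)
{
	if (p->count >= PENDING_ECC_MAX) {
		p->dropped++;  /* a driver would also emit a rate-limited warning here */
		return false;
	}
	p->count++;
	return true;
}
```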

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago nvmet-tcp: check return value of nvmet_tcp_set_queue_sock
Geliang Tang [Tue, 26 May 2026 09:28:05 +0000 (17:28 +0800)] 
nvmet-tcp: check return value of nvmet_tcp_set_queue_sock

The return value of nvmet_tcp_set_queue_sock() is currently ignored in
nvmet_tcp_tls_handshake_done(). If it fails (e.g., due to the socket
not being in TCP_ESTABLISHED state), the socket callbacks will not be
properly set, leading to queue and socket leakage.

Fix this by capturing the return value and calling
nvmet_tcp_schedule_release_queue() on failure to ensure proper cleanup.
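
A userspace sketch of the capture-and-clean-up pattern, under stated assumptions: `struct queue_sketch` and the three functions below are hypothetical stand-ins, and the `-ENOTCONN` error code is an arbitrary choice for the illustration, not taken from the nvmet-tcp source.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical queue state; not the nvmet-tcp structures. */
struct queue_sketch {
	bool sock_established;   /* socket reached the established state */
	bool callbacks_set;      /* socket callbacks installed */
	bool release_scheduled;  /* cleanup has been queued */
};

/* Stand-in for the socket setup step: fails if the socket is not ready. */
static int set_queue_sock(struct queue_sketch *q)
{
	if (!q->sock_established)
		return -ENOTCONN;
	q->callbacks_set = true;
	return 0;
}

/* Stand-in for scheduling the queue release. */
static void schedule_release_queue(struct queue_sketch *q)
{
	q->release_scheduled = true;
}

/* Handshake completion: capture the return value and clean up on failure
 * instead of silently ignoring it. */
static void handshake_done(struct queue_sketch *q)
{
	int ret = set_queue_sock(q);

	if (ret)
		schedule_release_queue(q);
}
```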

Fixes: 675b453e0241 ("nvmet-tcp: enable TLS handshake upcall")
Reviewed-by: Hannes Reinecke <hare@kernel.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Keith Busch <kbusch@kernel.org>
3 weeks ago drm/amdgpu: init locals in umc_v12_0_convert_error_address
Stanley.Yang [Mon, 11 May 2026 09:27:29 +0000 (17:27 +0800)] 
drm/amdgpu: init locals in umc_v12_0_convert_error_address

row, col, col_lower, row_lower, row_high and bank could be read on
code paths that never assign them. Initialize them to 0.

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu/userq: use array instead of list for userq_vas
Sunil Khatri [Wed, 20 May 2026 11:09:49 +0000 (16:39 +0530)] 
drm/amdgpu/userq: use array instead of list for userq_vas

Use an array instead of a list for userq_vas since we have a fixed
number of BOs. Also, we don't have to worry about freeing that memory
later, since the array is freed along with the queue.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu/userq: move mqd_destroy to later stage to keep core obj valid
Sunil Khatri [Wed, 20 May 2026 10:55:50 +0000 (16:25 +0530)] 
drm/amdgpu/userq: move mqd_destroy to later stage to keep core obj valid

mqd_destroy cleans up queue core objects like mqd and fw_object
which are needed for any pending fence to signal properly.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdkfd: fix a vulnerability of integer overflow in kfd debugger
Eric Huang [Tue, 12 May 2026 14:19:52 +0000 (10:19 -0400)] 
drm/amdkfd: fix a vulnerability of integer overflow in kfd debugger

get_queue_ids() computes array_size = num_queues * sizeof(uint32_t),
which could overflow on builds with a 32-bit size_t. Use array_size()
instead, which saturates to SIZE_MAX on overflow.
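
For illustration, a userspace approximation of a saturating size multiplication. `array_size_sat()` is a hypothetical name for this sketch; it mimics the saturate-to-SIZE_MAX behaviour described above and is not the kernel's implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Multiply an element count by an element size, returning SIZE_MAX
 * instead of wrapping around when the product does not fit. */
static size_t array_size_sat(size_t n, size_t elem)
{
	if (elem != 0 && n > SIZE_MAX / elem)
		return SIZE_MAX;
	return n * elem;
}
```

A saturated result is useful because an allocation of SIZE_MAX bytes fails cleanly, whereas a wrapped product would yield a buffer that is too small for the requested element count.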

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amd: Add dedicated helper for amdgpu_device_find_parent()
Mario Limonciello [Wed, 20 May 2026 15:46:17 +0000 (10:46 -0500)] 
drm/amd: Add dedicated helper for amdgpu_device_find_parent()

There are a few cases where code walks up the topology to find the
link partner of the integrated switch in a dGPU.  Split this out
into a helper and call it in all those places.

This does introduce a functional change: amdgpu_device_gpu_bandwidth()
no longer caches the internal link, only the parent.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu/userq: remove amdgpu_userq_create/destroy_object wrapper
Sunil Khatri [Wed, 20 May 2026 10:43:09 +0000 (16:13 +0530)] 
drm/amdgpu/userq: remove amdgpu_userq_create/destroy_object wrapper

Remove the amdgpu_userq_create/destroy_object wrappers and
directly use the kernel bo allocation function, which already does
everything the wrappers did.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu: Fix TOCTOU on UniRAS command response size
Chenglei Xie [Mon, 11 May 2026 18:13:45 +0000 (14:13 -0400)] 
drm/amdgpu: Fix TOCTOU on UniRAS command response size

The guest maps the PF response in shared VRAM (struct ras_cmd_ctx in the
command buffer). After amdgpu_virt_send_remote_ras_cmd() returns, the code
validated rcmd->output_size against the caller buffer, then copied
rcmd->output_buff_raw using rcmd->output_size again. A malicious PF could
change output_size between those reads so the memcpy length exceeds the
caller’s output_size and overflows guest stack or heap buffers.

Snapshot output_size with READ_ONCE() once, assign cmd->output_size from
that value, and use the same snapshot for the bounds check and memcpy.
Also read cmd_res once with READ_ONCE() so the error branch and
cmd->cmd_res assignment do not observe different values from shared memory.
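
The snapshot approach can be sketched in userspace as below. Assumptions: `struct shared_resp`, `READ_ONCE_U32()` and `copy_response()` are hypothetical names, the 64-byte buffer is invented for the example, and a volatile load stands in for the kernel's READ_ONCE().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical response layout in memory that another party can rewrite. */
struct shared_resp {
	volatile uint32_t output_size;
	uint8_t output_buf[64];
};

/* Userspace stand-in for READ_ONCE(): a single volatile load. */
#define READ_ONCE_U32(x) (*(const volatile uint32_t *)&(x))

/* Copy the response into dst. Returns the number of bytes copied,
 * or -1 if the advertised size does not fit. */
static int copy_response(const struct shared_resp *resp, void *dst, size_t dst_len)
{
	uint32_t size = READ_ONCE_U32(resp->output_size);  /* one snapshot */

	if (size > dst_len || size > sizeof(resp->output_buf))
		return -1;
	memcpy(dst, resp->output_buf, size);  /* uses the value that was checked */
	return (int)size;
}
```

The point of the design is that the length used by the copy is the same local value that passed the bounds check, so a later change to the shared field cannot enlarge the copy.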

Signed-off-by: Chenglei Xie <Chenglei.Xie@amd.com>
Reviewed-by: YiPeng Chai <YiPeng.Chai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu: bound SR-IOV RAS CPER dump parsing against used_size
Chenglei Xie [Mon, 11 May 2026 19:24:29 +0000 (15:24 -0400)] 
drm/amdgpu: bound SR-IOV RAS CPER dump parsing against used_size

The VF copies a PF-provided CPER telemetry blob and walks records using
cper_dump->count and each entry's record_length. count is u64 while the
loop used u32, so a large count could loop indefinitely. record_length was
not limited to the kmemdup'd region, so the first iteration could read far
past the allocation; record_length == 0 could spin forever on the same
entry. Together that allowed a malicious hypervisor to leak heap past the
blob into the CPER ring or hang the guest.

Require used_size to cover the fixed header before buf and stay within the
telemetry cap. Track remaining bytes in buf, cap iterations with u64 and
CPER_MAX_ALLOWED_COUNT, and reject record_length outside
[sizeof(cper_hdr), remaining] before writing to the ring.
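
A simplified sketch of this kind of bounded record walk. Everything here is hypothetical: `struct rec_hdr`, `MAX_RECORDS` (standing in for CPER_MAX_ALLOWED_COUNT, whose value is not given above) and `walk_records()`. The sketch rejects an oversized count outright; whether the driver rejects or clamps is not stated in the text.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_RECORDS 64u  /* hypothetical iteration cap */

/* Hypothetical record header; only the length field matters here. */
struct rec_hdr { uint32_t record_length; };

/* Walk up to `count` records in buf[0..used). Returns the number of
 * records accepted, or -1 if the count or any length is malformed. */
static int walk_records(const uint8_t *buf, size_t used, uint64_t count)
{
	size_t remaining = used;
	uint64_t i;  /* same width as count, so the comparison cannot wrap */
	int accepted = 0;

	if (count > MAX_RECORDS)
		return -1;

	for (i = 0; i < count; i++) {
		struct rec_hdr hdr;

		if (remaining < sizeof(hdr))
			return -1;
		memcpy(&hdr, buf, sizeof(hdr));  /* avoids unaligned access */
		/* Reject lengths outside [sizeof(hdr), remaining]; this also
		 * rules out a zero length that would never advance. */
		if (hdr.record_length < sizeof(hdr) || hdr.record_length > remaining)
			return -1;
		buf += hdr.record_length;
		remaining -= hdr.record_length;
		accepted++;
	}
	return accepted;
}
```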

Signed-off-by: Chenglei Xie <Chenglei.Xie@amd.com>
Reviewed-by: YiPeng Chai <YiPeng.Chai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amd/pm/si: Notify the SMC when switching to AC
Jeremy Klarenbeek [Tue, 19 May 2026 08:41:58 +0000 (10:41 +0200)] 
drm/amd/pm/si: Notify the SMC when switching to AC

There are some platforms that don't have a dedicated
GPIO line to manage the AC/DC switch. In this case,
the SI SMC automatically notices when switching to DC,
but needs to be notified when switching to AC.

Fixup and use si_notify_hw_of_powersource() which was
previously hidden behind an "#if 0".

This fixes some SI laptop GPUs to be able to use their
performance power states after switching from DC to AC.

Some affected GPUs are:
FirePro W4170M - Dell Precision M2800
Radeon HD 8790M - Dell Latitude E6540

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Co-developed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Jeremy Klarenbeek <jeremy.klarenbeek99@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amd/pm/si: Fix updating clock limits from power states
Jeremy Klarenbeek [Tue, 19 May 2026 08:41:57 +0000 (10:41 +0200)] 
drm/amd/pm/si: Fix updating clock limits from power states

VBIOS can contain conflicting values between:
- the maximum allowed clocks and voltages on AC or DC
- the clocks and voltages in power states on AC or DC

Update maximum clock (and voltage) limits for both AC/DC
and take the highest value from the VBIOS limits and
the performance/battery power states. Previously this
was only done for AC, but is also needed for DC.
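
As a rough illustration of "take the highest value", the sketch below raises a set of limits to a power state's values and applies that to both AC and DC. The struct, its fields and the function names are hypothetical and do not reflect the si_dpm data structures.

```c
/* Hypothetical subset of the limits tracked per power source. */
struct clk_limits {
	unsigned int sclk;  /* engine clock */
	unsigned int mclk;  /* memory clock */
	unsigned int vddc;  /* core voltage */
};

static unsigned int max_u(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}

/* Raise each limit to the power state's value when the state is higher. */
static void raise_limits(struct clk_limits *limit, const struct clk_limits *state)
{
	limit->sclk = max_u(limit->sclk, state->sclk);
	limit->mclk = max_u(limit->mclk, state->mclk);
	limit->vddc = max_u(limit->vddc, state->vddc);
}

/* Apply the same rule to both power sources, not only to AC. */
static void update_limits(struct clk_limits *ac, const struct clk_limits *perf_state,
			  struct clk_limits *dc, const struct clk_limits *battery_state)
{
	raise_limits(ac, perf_state);
	raise_limits(dc, battery_state);
}
```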

This commit fixes the behaviour on some laptop GPUs,
where the VBIOS limit was set to the lowest possible
clock frequency, so the GPU was stuck on the lowest
possible power level on battery.

Some affected GPUs are:
FirePro W4170M (Dell Precision M2800)
Radeon HD 8790M (Dell Latitude E6540)
and possibly other laptop GPUs.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Co-developed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Jeremy Klarenbeek <jeremy.klarenbeek99@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amd/pm/smu7: Notify SMU7 of DC->AC switch
Timur Kristóf [Tue, 19 May 2026 08:41:56 +0000 (10:41 +0200)] 
drm/amd/pm/smu7: Notify SMU7 of DC->AC switch

When ATOM_PP_PLATFORM_CAP_HARDWAREDC is set,
the SMU has a GPIO pin for detecting AC/DC switch
and everything works automatically.

Otherwise when there is no GPIO pin, the SMU can
automatically detect switching to DC, but needs
to be notified of switching to AC.

Use PPSMC_MSG_RunningOnAC to notify the SMC
when switching to AC.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amd/pm: Rename enable_bapm() to notify_ac_dc()
Timur Kristóf [Tue, 19 May 2026 08:41:55 +0000 (10:41 +0200)] 
drm/amd/pm: Rename enable_bapm() to notify_ac_dc()

No functional changes, just change the name of this
function pointer to be more generic.

BAPM refers to a specific feature on KV, but other kinds of
ASICs may also need the SMU to be notified on AC/DC changes.

Also remove the argument and use adev->pm.ac_power instead.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amd/pm/si: Disregard vblank time when no displays are connected
Timur Kristóf [Tue, 19 May 2026 08:41:54 +0000 (10:41 +0200)] 
drm/amd/pm/si: Disregard vblank time when no displays are connected

When no displays are connected, there is no vblank
happening so the power management code shouldn't
worry about it.

This fixes a regression that caused the memory clock
to be stuck at maximum when there were no displays
connected to a SI GPU.

Fixes: 9003a0746864 ("drm/amd/pm: Treat zero vblank time as too short in si_dpm (v3)")
Fixes: 9d73b107a61b ("drm/amd/pm: Use pm_display_cfg in legacy DPM (v2)")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Jeremy Klarenbeek <jeremy.klarenbeek99@gmail.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amd/pm: Delete PP_DAL_POWERLEVEL
Timur Kristóf [Tue, 19 May 2026 10:21:18 +0000 (12:21 +0200)] 
drm/amd/pm: Delete PP_DAL_POWERLEVEL

Not used and not needed anymore.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amd/pm: Delete get_dal_power_level
Timur Kristóf [Tue, 19 May 2026 10:21:17 +0000 (12:21 +0200)] 
drm/amd/pm: Delete get_dal_power_level

Not needed anymore.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amd/pm: Delete vddc_dep_on_dal_pwrl
Timur Kristóf [Tue, 19 May 2026 10:21:16 +0000 (12:21 +0200)] 
drm/amd/pm: Delete vddc_dep_on_dal_pwrl

It was not used by anything anymore.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amd/pm: Delete non-functional SMU8 get_dal_power_level implementation
Timur Kristóf [Tue, 19 May 2026 10:21:15 +0000 (12:21 +0200)] 
drm/amd/pm: Delete non-functional SMU8 get_dal_power_level implementation

This function was effectively a no-op because it always
returned the maximum possible power level, because the
maximum voltage is in millivolts while the dependency
table didn't contain actual voltages.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amd/pm: Delete dummy get_dal_power_level implementations
Timur Kristóf [Tue, 19 May 2026 10:21:14 +0000 (12:21 +0200)] 
drm/amd/pm: Delete dummy get_dal_power_level implementations

These implementations did not actually return
the DAL power level, so they were effectively
a no-op.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amd/pm: Delete unused get_display_power_level() function
Timur Kristóf [Tue, 19 May 2026 10:21:13 +0000 (12:21 +0200)] 
drm/amd/pm: Delete unused get_display_power_level() function

Was not called from anywhere.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amd/display: Delete dm_pp_clocks_state
Timur Kristóf [Tue, 19 May 2026 10:21:12 +0000 (12:21 +0200)] 
drm/amd/display: Delete dm_pp_clocks_state

It isn't used by anything anymore.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amd/display: Delete disp_clk_voltage from integrated info (v2)
Timur Kristóf [Tue, 19 May 2026 10:21:11 +0000 (12:21 +0200)] 
drm/amd/display: Delete disp_clk_voltage from integrated info (v2)

Only DCE 11.0 relies on this information and even that
didn't use this field, because it queries the information
from the pplib. It also filled the field incorrectly on
that version.

On newer GPUs, the VBIOS integrated info no longer contains
display clock voltage dependencies, so we don't need it.

v2:
- Also delete some code wrapped in #if 0

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amd/display: Delete max_clks_by_state from DCE clock manager (v2)
Timur Kristóf [Tue, 19 May 2026 10:21:10 +0000 (12:21 +0200)] 
drm/amd/display: Delete max_clks_by_state from DCE clock manager (v2)

It was not used by anything anymore.

Note that the parts of DC that need this information actually
already query it from the pplib and don't use the hardcoded
information from max_clks_by_state.

v2:
- Also delete state_dependent_clocks

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amd/display: Set max supported display clock without max_clks_by_state (v2)
Timur Kristóf [Tue, 19 May 2026 10:21:09 +0000 (12:21 +0200)] 
drm/amd/display: Set max supported display clock without max_clks_by_state (v2)

The max_clks_by_state was based on hardcoded values, which are
not really used anywhere, only to know the maximum clock.
Just hardcode the same maximum clock for each DCE version.

v2:
- Use previous max display clock for DCE 11.2

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amd/display: Delete max_clocks_state
Timur Kristóf [Tue, 19 May 2026 10:21:08 +0000 (12:21 +0200)] 
drm/amd/display: Delete max_clocks_state

It's not used by anything anymore.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amd/display: Remove min/max clock levels from clk_mgr (v2)
Timur Kristóf [Tue, 19 May 2026 10:21:07 +0000 (12:21 +0200)] 
drm/amd/display: Remove min/max clock levels from clk_mgr (v2)

These fields are not used by anything anymore.

v2:
- Delete dm_pp_get_static_clocks()
- Delete pp_to_dc_powerlevel_state()

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amd/display: Delete dce_get_required_clocks_state()
Timur Kristóf [Tue, 19 May 2026 10:21:06 +0000 (12:21 +0200)] 
drm/amd/display: Delete dce_get_required_clocks_state()

It is not called from anywhere anymore.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdkfd: Check for pdd drm file first in CRIU restore path
David Francis [Thu, 14 May 2026 14:31:20 +0000 (10:31 -0400)] 
drm/amdkfd: Check for pdd drm file first in CRIU restore path

CRIU restore ioctls are meant to be called by CRIU with no
existing drm file. There is an error path for the case where
the drm file unexpectedly exists, but it was positioned so that
it was missing a fput(drm_file).

Do that check earlier, as soon as we have the pdd.

Signed-off-by: David Francis <David.Francis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu: fix potential overflow in fs_info.debugfs_name
Stanley.Yang [Mon, 11 May 2026 08:49:19 +0000 (16:49 +0800)] 
drm/amdgpu: fix potential overflow in fs_info.debugfs_name

Use snprintf() with sizeof(fs_info.debugfs_name) so a long RAS block
name plus the "_err_inject" suffix cannot overflow the 32-byte buffer.
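
A small userspace sketch of the bounded formatting described above. `struct fs_info_sketch` and `build_debugfs_name()` are hypothetical; only the 32-byte buffer and the "_err_inject" suffix come from the text. Reporting truncation through the return value is an addition for this example, not something the commit message states.

```c
#include <stdio.h>

/* Hypothetical struct holding the 32-byte name buffer. */
struct fs_info_sketch { char debugfs_name[32]; };

/* Build "<block>_err_inject" without writing past the buffer.
 * Returns 0 on success, -1 if the name had to be truncated. */
static int build_debugfs_name(struct fs_info_sketch *fs, const char *block)
{
	int n = snprintf(fs->debugfs_name, sizeof(fs->debugfs_name),
			 "%s_err_inject", block);

	if (n < 0 || (size_t)n >= sizeof(fs->debugfs_name))
		return -1;
	return 0;
}
```

Even when the input is too long, snprintf() writes at most sizeof(fs->debugfs_name) bytes including the terminating NUL, so the buffer stays valid.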

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu/userq: make sure queue is valid in the hang_detect_work
Sunil Khatri [Mon, 18 May 2026 14:28:08 +0000 (19:58 +0530)] 
drm/amdgpu/userq: make sure queue is valid in the hang_detect_work

Thread 1: running amdgpu_userq_destroy, which eventually removes
the queue from the doorbell and sets userq_mgr = NULL.

Thread 2: an interrupt might have scheduled hang_detect_work,
which still needs userq_mgr to be valid but could see a NULL
pointer.

To fix that, make sure we cancel hang_detect_work again before
setting userq_mgr to NULL.

Along with that, all the queue VAs must remain valid for as long as
anything could still be running on the queue, so move the userq_va
cleanup to after the hang_detect handler is cancelled.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu/userq: reserve root bo without interruption
Sunil Khatri [Mon, 18 May 2026 13:25:25 +0000 (18:55 +0530)] 
drm/amdgpu/userq: reserve root bo without interruption

Fix the code to make it an uninterruptible reservation
for root bo.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu/userq: add amdgpu_bo_unpin when amdgpu_ttm_alloc_gart fails
Sunil Khatri [Mon, 18 May 2026 13:03:00 +0000 (18:33 +0530)] 
drm/amdgpu/userq: add amdgpu_bo_unpin when amdgpu_ttm_alloc_gart fails

Unpin the wptr_obj->obj when amdgpu_ttm_alloc_gart fails.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu: simplify return value in amdgpu_userq_get_doorbell_index
Sunil Khatri [Mon, 18 May 2026 12:12:15 +0000 (17:42 +0530)] 
drm/amdgpu: simplify return value in amdgpu_userq_get_doorbell_index

amdgpu_userq_get_doorbell_index returns a uint64 index as well as
int failure values. Simplify this by returning an int and passing
the index back through a uint64 output pointer.

Also, since it is used in only one place, make it static.
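
The signature change follows a common C convention, sketched below with hypothetical names: the status goes in the int return value and the result goes through an output pointer. `DOORBELL_SLOTS` and the index computation are placeholders, not the driver's logic.

```c
#include <errno.h>
#include <stdint.h>

#define DOORBELL_SLOTS 16u  /* hypothetical limit for this sketch */

/* Returns 0 and stores the index in *index, or a negative errno value.
 * Keeping error codes out of the uint64_t result avoids mixing a signed
 * error domain with an unsigned index domain. */
static int get_doorbell_index(uint32_t offset, uint64_t *index)
{
	if (!index || offset >= DOORBELL_SLOTS)
		return -EINVAL;
	*index = (uint64_t)offset * 2;  /* placeholder computation */
	return 0;
}
```

On failure the output pointer is left untouched, so a caller can rely on the return value alone to decide whether `*index` is meaningful.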

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdkfd: fix NULL pointer bug in svm_range_set_attr
Eric Huang [Thu, 7 May 2026 19:51:49 +0000 (15:51 -0400)] 
drm/amdkfd: fix NULL pointer bug in svm_range_set_attr

The process_info could be NULL if the user doesn't call
kfd_ioctl_acquire_vm before calling kfd_ioctl_svm.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago blk-throttle: schedule parent dispatch in tg_flush_bios()
Tao Cui [Fri, 22 May 2026 09:15:30 +0000 (17:15 +0800)] 
blk-throttle: schedule parent dispatch in tg_flush_bios()

tg_flush_bios() schedules pending_timer on the child tg's own
service_queue, which causes throtl_pending_timer_fn() to dispatch from
the child's pending_tree.  For leaf cgroups this tree is empty, so the
timer fires and exits without dispatching the throttled bio.

The throttled bio sits in the parent's pending_tree with disptime set
to jiffies (THROTL_TG_CANCELING zeroes all dispatch times), but the
parent's timer is never explicitly rescheduled.  The bio only gets
dispatched when the parent timer eventually fires at its previously
scheduled expiry.

Fix by calling throtl_schedule_next_dispatch(sq->parent_sq, true)
instead, matching what tg_set_limit() already does.  This forces the
parent's dispatch cycle to run immediately and flush all canceling
bios without waiting for a stale timer.

For the device deletion path (blk_throtl_cancel_bios), directly
complete throttled bios with EIO via bio_io_error() instead of
dispatching them through the timer -> work -> submission chain.
This avoids a race with the SCSI state machine where bios can reach
the SCSI layer while the device is in SDEV_CANCEL state, causing
ENODEV instead of the expected EIO.

Reported-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/all/ag2owaQQoigp_fSV@shinmob/
Signed-off-by: Tao Cui <cuitao@kylinos.cn>
Link: https://patch.msgid.link/20260522091530.1901437-1-cuitao@kylinos.cn
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 weeks ago rust: block: mq: align init_request numa_node arg with C signature
Andreas Hindborg [Wed, 27 May 2026 09:18:09 +0000 (11:18 +0200)] 
rust: block: mq: align init_request numa_node arg with C signature

Commit b040a1a4523d ("block: switch numa_node to int in
blk_mq_hw_ctx and init_request") changed the type of the
`numa_node` argument of `blk_mq_ops::init_request` from
`unsigned int` to `int`. Update the Rust callback signature to
match, so that the function item can be coerced to the C fn
pointer type stored in `blk_mq_ops`.

Without this change the Rust block layer fails to build:

  error[E0308]: mismatched types
     --> rust/kernel/block/mq/operations.rs:274:28
      |
  274 |         init_request: Some(Self::init_request_callback),
      |                       ---- ^^^^^^^^^^^^^^^^^^^^^^^^^^^
      |                       expected fn pointer, found fn item
      |
      = note: expected fn pointer
                `unsafe extern "C" fn(_, _, _, i32) -> _`
                    found fn item
                `unsafe extern "C" fn(_, _, _, u32) -> _ {...}`

The argument is unused on the Rust side, so this is a pure
type-signature change with no functional impact.

Fixes: b040a1a4523d ("block: switch numa_node to int in blk_mq_hw_ctx and init_request")
Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
Reviewed-by: Gary Guo <gary@garyguo.net>
Link: https://patch.msgid.link/20260527-block-for-next-2026-05-26-2200-failure-v1-1-4865889e282c@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 weeks ago drm/amd/display: Write REFCLK to 48MHz on DCN21
Ivan Lipski [Thu, 14 May 2026 15:53:50 +0000 (11:53 -0400)] 
drm/amd/display: Write REFCLK to 48MHz on DCN21

[Why&How]
dccg21_init() calls dccg2_init() which hardcodes 100MHz refclk values
for MICROSECOND_TIME_BASE_DIV and MILLISECOND_TIME_BASE_DIV. DCN21
uses 48MHz refclk, so the wrong values corrupt DCCG timing and cause eDP
link training failure on cold boot.

Write the correct 48MHz values directly instead of calling dccg2_init().

v2:
Fixed typo

Fixes: e6e2b956fc81 ("drm/amd/display: Add missing DCCG register entries for DCN20-DCN316")
Reported-by: Max Chernoff <git@maxchernoff.ca>
Tested-by: Max Chernoff <git@maxchernoff.ca>
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago block: partitions: replace __get_free_page() with kmalloc()
Mike Rapoport (Microsoft) [Wed, 27 May 2026 14:33:28 +0000 (17:33 +0300)] 
block: partitions: replace __get_free_page() with kmalloc()

check_partition() allocates a buffer to use as backing memory for
seq_buf.

This buffer can be allocated with kmalloc() as there's nothing special
about it to go directly to the page allocator.

kmalloc() provides a better API that does not require ugly casts and
kfree() does not need to know the size of the freed object.

For a single allocation on the cold path the performance difference between
kmalloc() and __get_free_pages() is not measurable as both allocators take
an object/page from a per-CPU list for fast path allocations.

For the slow path the performance is anyway determined by the amount of
reclaim involved rather than by what allocator is used.

Replace use of __get_free_page() with kmalloc() and free_page() with
kfree().

Link: https://lore.kernel.org/all/635405e4-9423-4a25-a6e7-e03c8ea0bcbe@redhat.com
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Link: https://patch.msgid.link/20260527-block-v2-1-8e06f914c484@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 weeks ago drm/amdgpu/userq: Fix the mutex_init cleanup for fence_drv_lock
Sunil Khatri [Tue, 19 May 2026 09:42:42 +0000 (15:12 +0530)] 
drm/amdgpu/userq: Fix the mutex_init cleanup for fence_drv_lock

The mutex fence_drv_lock is destroyed in amdgpu_userq_fence_driver_free,
but one of the error jump paths also calls mutex_destroy, leading
to a double mutex_destroy.

Rearrange the code so that amdgpu_userq_fence_driver_free takes care
of the cleanup, including the mutex_destroy.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago KVM: arm64: Don't populate TPIDR_EL2 in finalise_el2()
Will Deacon [Mon, 18 May 2026 15:31:26 +0000 (16:31 +0100)] 
KVM: arm64: Don't populate TPIDR_EL2 in finalise_el2()

Currently, it is not necessary for __finalise_el2() to configure
TPIDR_EL2:

* The hyp stub code does not consume the value of TPIDR_EL2.

* On the boot cpu, TPIDR_EL1 is used for the percpu offset until the
  ARM64_HAS_VIRT_HOST_EXTN cpucap is detected and boot alternatives
  are patched. Before boot alternatives are patched,
  cpu_copy_el2regs() will copy TPIDR_EL1 into TPIDR_EL2. It is not
  necessary for __finalise_el2() to initialise TPIDR_EL2 before this.

* Secondary CPUs are brought up after boot alternatives have been
  patched, and __secondary_switched() will initialize TPIDR_EL2 in
  'init_cpu_task', after finalise_el2() calls __finalise_el2().

* KVM hyp code which may consume TPIDR_EL2 is brought up after all
  secondaries have been booted, once TPIDR_EL2 has been configured on
  all CPUs.

Remove the redundant initialisation from __finalise_el2().

Cc: Oliver Upton <oupton@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://patch.msgid.link/20260518153127.6078-1-will@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
3 weeks ago samples: rust: rust_driver_auxiliary: showcase lifetime-bound registration data
Danilo Krummrich [Mon, 25 May 2026 20:21:11 +0000 (22:21 +0200)] 
samples: rust: rust_driver_auxiliary: showcase lifetime-bound registration data

Make the Data struct lifetime-parameterized, storing a reference to the
parent pci::Device<Bound>. This demonstrates that registration data can
hold device resources tied to the parent driver's lifetime.

In connect(), retrieve the parent PCI device from the registration data
rather than casting through adev.parent().

Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Link: https://patch.msgid.link/20260525202921.124698-25-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
3 weeks ago remoteproc: xlnx: Remove binding header dependency
Tanmay Shah [Fri, 8 May 2026 17:40:06 +0000 (10:40 -0700)] 
remoteproc: xlnx: Remove binding header dependency

Bindings can be deprecated, and drivers should not include binding
headers directly. Instead, define the needed constants in the driver.

Signed-off-by: Tanmay Shah <tanmay.shah@amd.com>
Link: https://lore.kernel.org/r/20260508174006.3783082-1-tanmay.shah@amd.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
3 weeks ago drm/amd/display: Delete unimplemented dm_pp_apply_power_level_change_request() (v2)
Timur Kristóf [Tue, 19 May 2026 10:21:05 +0000 (12:21 +0200)] 
drm/amd/display: Delete unimplemented dm_pp_apply_power_level_change_request() (v2)

dm_pp_apply_power_level_change_request() was called from old
DCE clock manager implementations on DCE6, 8, 10, 11.2
but has never been implemented since the beginning of DC.

Affected GPUs have been working fine without that implementation
for many years. Let's delete it now.

v2:
- Delete dm_pp_apply_power_level_change_request too

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu/userq: Fix doorbell object cleanup of queue
Sunil Khatri [Tue, 19 May 2026 09:32:00 +0000 (15:02 +0530)] 
drm/amdgpu/userq: Fix doorbell object cleanup of queue

Unpin and unref the doorbell object if queue creation fails before
initialization is complete.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago rust: auxiliary: generalize Registration over ForLt
Danilo Krummrich [Mon, 25 May 2026 20:21:10 +0000 (22:21 +0200)] 
rust: auxiliary: generalize Registration over ForLt

Generalize Registration<T> to Registration<F: ForLt> and
Device::registration_data<F: ForLt>() to return Pin<&F::Of<'_>>.

The stored 'static lifetime is shortened to the borrow lifetime of &self
via ForLt::cast_ref; ForLt's covariance guarantee makes this sound.
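
The lifetime shortening described above can be sketched in plain Rust. This is a simplified model under stated assumptions, not the kernel implementation: the `cast_ref` signature, `DataFamily`, `Data`, and the plain-reference return type (the real accessor returns `Pin<&F::Of<'_>>`) are all hypothetical. Each implementor writes `cast_ref` as a plain coercion, so the compiler itself checks that the type is covariant.

```rust
/// Hypothetical, simplified stand-in for the kernel's `ForLt` trait.
pub trait ForLt {
    /// The type instantiated for a specific lifetime.
    type Of<'a>;

    /// Shorten the inner lifetime to the borrow lifetime (assumed signature).
    fn cast_ref<'short, 'long: 'short>(
        long: &'short Self::Of<'long>,
    ) -> &'short Self::Of<'short>;
}

/// Example registration data borrowing from a parent device (hypothetical).
pub struct Data<'a> {
    pub parent_name: &'a str,
}

/// Marker type selecting `Data<'a>` for any lifetime `'a`.
pub struct DataFamily;

impl ForLt for DataFamily {
    type Of<'a> = Data<'a>;

    fn cast_ref<'short, 'long: 'short>(long: &'short Data<'long>) -> &'short Data<'short> {
        // Compiles only because `Data<'a>` is covariant in `'a`.
        long
    }
}

/// Stores the data with a `'static` lifetime and hands it out shortened.
pub struct Registration<F: ForLt> {
    data: F::Of<'static>,
}

impl<F: ForLt> Registration<F> {
    pub fn new(data: F::Of<'static>) -> Self {
        Self { data }
    }

    /// Returns the data with its lifetime shortened to the borrow of `self`.
    pub fn registration_data(&self) -> &F::Of<'_> {
        F::cast_ref(&self.data)
    }
}
```

A caller would write `Registration::<DataFamily>::new(Data { parent_name: "pci0" })` and read the data back through `registration_data()`, never observing the stored `'static` lifetime.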

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Link: https://patch.msgid.link/20260525202921.124698-24-dakr@kernel.org
[ Use PhantomData<F::Of<'a>> instead of
  PhantomData<(fn(&'a ()) -> &'a (), F)>, which also gets rid of
  #[allow(clippy::type_complexity)]. - Danilo ]
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
3 weeks ago drm/amdgpu: Replace use of system_unbound_wq with system_dfl_wq
Marco Crivellari [Thu, 14 May 2026 10:38:09 +0000 (12:38 +0200)] 
drm/amdgpu: Replace use of system_unbound_wq with system_dfl_wq

This patch continues the effort to refactor workqueue APIs, which has begun
with the changes introducing new workqueues and a new alloc_workqueue flag:

   commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq")
   commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag")

The point of the refactoring is to eventually alter the default behavior of
workqueues to become unbound by default so that their workload placement is
optimized by the scheduler.

Before that can happen, workqueue users must be converted to the better-named
new workqueues with no intended behaviour changes:

   system_wq -> system_percpu_wq
   system_unbound_wq -> system_dfl_wq

This way the old obsolete workqueues (system_wq, system_unbound_wq) can be
removed in the future.

Link: https://lore.kernel.org/all/20250221112003.1dSuoGyc@linutronix.de/
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amd/display: Replace use of system_unbound_wq with system_dfl_wq
Marco Crivellari [Thu, 14 May 2026 10:38:08 +0000 (12:38 +0200)] 
drm/amd/display: Replace use of system_unbound_wq with system_dfl_wq

This patch continues the effort to refactor workqueue APIs, which has begun
with the changes introducing new workqueues and a new alloc_workqueue flag:

   commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq")
   commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag")

The point of the refactoring is to eventually alter the default behavior of
workqueues to become unbound by default so that their workload placement is
optimized by the scheduler.

Before that can happen, workqueue users must be converted to the better-named
new workqueues with no intended behaviour changes:

   system_wq -> system_percpu_wq
   system_unbound_wq -> system_dfl_wq

This way the old obsolete workqueues (system_wq, system_unbound_wq) can be
removed in the future.

Cc: Ray Wu <ray.wu@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Leo Li <sunpeng.li@amd.com>
Cc: Rodrigo Siqueira <siqueira@igalia.com>
Link: https://lore.kernel.org/all/20250221112003.1dSuoGyc@linutronix.de/
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu: check num_entries in GEM_OP GET_MAPPING_INFO
Ziyi Guo [Sun, 8 Feb 2026 00:02:55 +0000 (00:02 +0000)] 
drm/amdgpu: check num_entries in GEM_OP GET_MAPPING_INFO

kvcalloc(args->num_entries, sizeof(*vm_entries), GFP_KERNEL) at
amdgpu_gem.c:1050 uses the user-supplied num_entries directly without
any upper bounds check. Since num_entries is a __u32 and
sizeof(drm_amdgpu_gem_vm_entry) is 32 bytes, a large num_entries
produces an allocation exceeding INT_MAX, triggering a WARNING in
__kvmalloc_node_noprof(), which results in TAINT_WARN and a panic
on CONFIG_PANIC_ON_WARN=y systems.

Add a size bounds check before we invoke kvcalloc() to
reject oversized num_entries early with -EINVAL.
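
The arithmetic behind the overflow can be checked with a small sketch. It is written in Rust purely for illustration (the driver itself is C), and `check_num_entries`, `ENTRY_SIZE`, and the exact form of the comparison are assumptions rather than the patch's actual code. Only the 32-byte entry size, the 32-bit unsigned count, the INT_MAX limit, and -EINVAL come from the message above.

```rust
/// Size of one mapping entry in bytes, as stated in the commit message.
pub const ENTRY_SIZE: u64 = 32;
/// Largest allocation size tolerated before the allocator warns.
pub const INT_MAX: u64 = i32::MAX as u64;
/// Errno value for "invalid argument".
pub const EINVAL: i32 = 22;

/// Returns the allocation size in bytes, or -EINVAL if it would exceed INT_MAX.
pub fn check_num_entries(num_entries: u32) -> Result<u64, i32> {
    // A u32 count times 32 always fits in u64, so this cannot overflow.
    let bytes = u64::from(num_entries) * ENTRY_SIZE;
    if bytes > INT_MAX {
        Err(-EINVAL)
    } else {
        Ok(bytes)
    }
}
```

Under these assumptions the largest accepted count is INT_MAX / 32 = 67108863 entries (2147483616 bytes); one more entry reaches 2147483648 bytes and is rejected.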

Fixes: 4d82724f7f2b ("drm/amdgpu: Add mapping info option for GEM_OP ioctl")
Signed-off-by: Ziyi Guo <n7l8m4@u.northwestern.edu>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago rust: types: add `ForLt` trait for higher-ranked lifetime support
Gary Guo [Mon, 25 May 2026 20:21:09 +0000 (22:21 +0200)] 
rust: types: add `ForLt` trait for higher-ranked lifetime support

There are a few cases, e.g. when dealing with data referencing each
other, where one might want to write code that is generic over lifetimes. For
example, if you want to take a function that takes `&'a Foo` and gives
`Bar<'a>`, you can write:

    f: impl for<'a> FnOnce(&'a Foo) -> Bar<'a>,

However, it becomes tricky when you want that function to not have a
fixed `Bar`, but have it be generic again. In this case, one needs
something that is generic over types that are themselves generic over
lifetimes.

`ForLt` provides such support. It provides a trait `ForLt` which
describes a type generic over a lifetime. One may use `ForLt::Of<'a>` to
get an instance of a type for a specific lifetime.

For the case of cross referencing, one would almost always want the
lifetime to be covariant. Therefore this is also made a requirement for
the `ForLt` trait, so functions with `ForLt` trait bound can assume
covariance.

A macro `ForLt!()` is provided to be able to obtain a type that
implements `ForLt`. For example, `ForLt!(for<'a> Bar<'a>)` would yield a
type for which `<TheType as ForLt>::Of<'a>` is `Bar<'a>`. This also works
with lifetime elision, e.g. `ForLt!(Bar<'_>)` or for types without
lifetime at all, e.g. `ForLt!(u32)`.
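
A minimal sketch of the idea, assuming a simplified trait: the `ForLt` below only has the `Of<'a>` associated type and does not enforce covariance, and the marker types `BarFamily` and `U32Family` are hand-written stand-ins for what `ForLt!(Bar<'_>)` and `ForLt!(u32)` would produce. `Foo`, `Bar`, and `with_view` are hypothetical names used only for illustration.

```rust
/// Simplified stand-in for the kernel trait (no covariance requirement here).
pub trait ForLt {
    /// The type instantiated for a specific lifetime.
    type Of<'a>;
}

pub struct Foo {
    pub name: String,
}

pub struct Bar<'a> {
    pub name: &'a str,
}

/// Plays the role of `ForLt!(Bar<'_>)`.
pub struct BarFamily;
impl ForLt for BarFamily {
    type Of<'a> = Bar<'a>;
}

/// Plays the role of `ForLt!(u32)`: a type without any lifetime.
pub struct U32Family;
impl ForLt for U32Family {
    type Of<'a> = u32;
}

/// Generic over the *family* `F`: `make` maps `&'a Foo` to `F::Of<'a>`
/// for every lifetime `'a`, and `consume` accepts the result.
pub fn with_view<F, R, M, C>(foo: &Foo, make: M, consume: C) -> R
where
    F: ForLt,
    M: for<'a> FnOnce(&'a Foo) -> F::Of<'a>,
    C: for<'a> FnOnce(F::Of<'a>) -> R,
{
    consume(make(foo))
}

pub fn borrow_name(foo: &Foo) -> Bar<'_> {
    Bar { name: &foo.name }
}

pub fn bar_len(bar: Bar<'_>) -> usize {
    bar.name.len()
}

pub fn name_len(foo: &Foo) -> u32 {
    foo.name.len() as u32
}

pub fn double(n: u32) -> u32 {
    n * 2
}
```

The same `with_view` works with `with_view::<BarFamily, _, _, _>(&foo, borrow_name, bar_len)` and with `with_view::<U32Family, _, _, _>(&foo, name_len, double)`, which is the "generic over types that are themselves generic over lifetimes" case the message describes.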

The API design draws inspiration from the higher-kinded-types [1] crate,
however a different design decision has been taken (e.g. covariance
requirement) and the implementation is independent.

License headers use "Apache-2.0 OR MIT" because I anticipate this to be
used in pin-init crate too which is licensed as such.

Link: https://docs.rs/higher-kinded-types/ [1]
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
Signed-off-by: Gary Guo <gary@garyguo.net>
Acked-by: Miguel Ojeda <ojeda@kernel.org>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Link: https://patch.msgid.link/20260525202921.124698-23-dakr@kernel.org
[ Handle macro_rules! invocations in the ForLt! proc macro's covariance
  and WF checks. Since proc macros cannot expand macro_rules!, add a
  visit_macro() implementation to conservatively assume macro
  invocations may contain lifetimes, forcing them through the
  compiler-assisted covariance proof.

  Fix a few typos in the documentation and in the commit message, add
  empty lines before samples, add missing periods and consistently use
  markdown.

  - Danilo ]
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
3 weeks ago gpu: nova-core: separate driver type from driver data
Danilo Krummrich [Mon, 25 May 2026 20:21:08 +0000 (22:21 +0200)] 
gpu: nova-core: separate driver type from driver data

Introduce NovaCoreDriver as the driver type implementing pci::Driver,
keeping NovaCore as the per-device data type. This prepares for making
NovaCore lifetime-parameterized once auxiliary::Registration requires a
lifetime for the binding scope.

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Link: https://patch.msgid.link/20260525202921.124698-22-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
3 weeks ago samples: rust: rust_driver_pci: use HRT lifetime for Bar
Danilo Krummrich [Mon, 25 May 2026 20:21:07 +0000 (22:21 +0200)] 
samples: rust: rust_driver_pci: use HRT lifetime for Bar

Convert the sample driver to SampleDriver<'bound>, taking advantage of
the lifetime-parameterized Driver trait.

The driver struct holds &'bound pci::Device directly instead of
ARef<pci::Device>, and pci::Bar<'bound> directly instead of
Devres<pci::Bar>. This removes PinnedDrop, pin_init_scope, and runtime
revocation checks on BAR access.

Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Link: https://patch.msgid.link/20260525202921.124698-21-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
3 weeks ago rust: io: make IoMem and ExclusiveIoMem lifetime-parameterized
Danilo Krummrich [Mon, 25 May 2026 20:21:06 +0000 (22:21 +0200)] 
rust: io: make IoMem and ExclusiveIoMem lifetime-parameterized

Add a lifetime parameter to IoMem<'a, SIZE> and ExclusiveIoMem<'a,
SIZE>, storing a &'a Device<Bound> reference to tie the mapping to the
device's lifetime.

This mirrors the pci::Bar<'a, SIZE> design and enables drivers to hold
I/O memory mappings directly in their HRT private data, tied to the
device lifetime.

IoRequest::iomap_* methods now return the mapping directly instead of
wrapping it in Devres. Callers that need device-managed revocation can
call the new into_devres() method.

Acked-by: Uwe Kleine-König <ukleinek@kernel.org>
Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Link: https://patch.msgid.link/20260525202921.124698-20-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
3 weeks ago rust: pci: make Bar lifetime-parameterized
Danilo Krummrich [Mon, 25 May 2026 20:21:05 +0000 (22:21 +0200)] 
rust: pci: make Bar lifetime-parameterized

Convert pci::Bar<SIZE> to pci::Bar<'a, SIZE>, storing &'a Device<Bound>
to tie the BAR mapping lifetime to the device.

iomap_region_sized() now returns Result<Bar<'a, SIZE>> directly instead
of impl PinInit<Devres<Bar<SIZE>>, Error>.

Since the lifetime ties the mapping to the device's bound state, callers
no longer need Devres for the common case where the Bar lives in the
driver's private data.

Add Bar::into_devres() to consume the bar and register it as a
device-managed resource, returning Devres<Bar<'static, SIZE>>. The
lifetime is erased to 'static because Devres guarantees the bar does not
actually outlive the device -- access is revoked on unbind.

Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Link: https://patch.msgid.link/20260525202921.124698-19-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
3 weeks ago rust: driver: update module documentation for GAT-based Data type
Danilo Krummrich [Mon, 25 May 2026 20:21:04 +0000 (22:21 +0200)] 
rust: driver: update module documentation for GAT-based Data type

Now that all bus driver traits use type Data<'bound>: 'bound, update the
illustrative driver trait in the module documentation to reflect the GAT
pattern and lifetime-parameterized callbacks.

Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Gary Guo <gary@garyguo.net>
Link: https://patch.msgid.link/20260525202921.124698-18-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
3 weeks ago rust: i2c: make Driver trait lifetime-parameterized
Danilo Krummrich [Mon, 25 May 2026 20:21:03 +0000 (22:21 +0200)] 
rust: i2c: make Driver trait lifetime-parameterized

Add a 'bound lifetime to the associated Data, changing type Data to type
Data<'bound>.

This allows the driver's bus device private data to capture the device /
driver bound lifetime; device resources can be stored directly by
reference rather than requiring Devres.

The probe() and unbind() callbacks thus gain a 'bound lifetime parameter
on the methods themselves; avoiding a global lifetime on the trait impl.

Existing drivers set type Data<'bound> = Self, preserving the current
behavior.
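
The pattern can be sketched outside the kernel as follows. This is a toy model under stated assumptions: `Device`, the `i32` error type, the `String` returned by `unbind()`, and the names `BorrowingDriver`, `LegacyDriver`, and `bind_then_unbind` are hypothetical and do not mirror the real bus traits. Only the `type Data<'bound>: 'bound` associated type and the per-method `'bound` parameter are taken from the description above.

```rust
/// Toy device standing in for a bus device in its bound state.
pub struct Device {
    pub name: String,
}

pub trait Driver {
    /// Private data that may borrow from the device for the binding scope.
    type Data<'bound>: 'bound;

    /// The lifetime lives on the method, not on the trait impl.
    fn probe<'bound>(dev: &'bound Device) -> Result<Self::Data<'bound>, i32>;

    /// Returns a message only so the sketch has something observable.
    fn unbind<'bound>(dev: &'bound Device, data: &Self::Data<'bound>) -> String;
}

/// A driver whose data borrows the device directly.
pub struct BorrowingDriver;

pub struct BorrowingData<'bound> {
    pub dev: &'bound Device,
}

impl Driver for BorrowingDriver {
    type Data<'bound> = BorrowingData<'bound>;

    fn probe<'bound>(dev: &'bound Device) -> Result<Self::Data<'bound>, i32> {
        Ok(BorrowingData { dev })
    }

    fn unbind<'bound>(_dev: &'bound Device, data: &Self::Data<'bound>) -> String {
        format!("unbound {}", data.dev.name)
    }
}

/// An existing-style driver that keeps the previous behavior.
pub struct LegacyDriver {
    pub probed: bool,
}

impl Driver for LegacyDriver {
    type Data<'bound> = Self;

    fn probe<'bound>(_dev: &'bound Device) -> Result<Self::Data<'bound>, i32> {
        Ok(LegacyDriver { probed: true })
    }

    fn unbind<'bound>(dev: &'bound Device, _data: &Self::Data<'bound>) -> String {
        format!("unbound {}", dev.name)
    }
}

/// Bus-side helper that works for any driver, borrowing or not.
pub fn bind_then_unbind<D: Driver>(dev: &Device) -> Result<String, i32> {
    let data = D::probe(dev)?;
    Ok(D::unbind(dev, &data))
}
```

Because the lifetime is introduced per method, `impl Driver for BorrowingDriver` needs no lifetime parameter of its own, and `LegacyDriver` opts out simply by setting `Data<'bound> = Self`.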

Acked-by: Igor Korotin <igor.korotin@linux.dev>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Gary Guo <gary@garyguo.net>
Link: https://patch.msgid.link/20260525202921.124698-17-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
3 weeks ago rust: usb: make Driver trait lifetime-parameterized
Danilo Krummrich [Mon, 25 May 2026 20:21:02 +0000 (22:21 +0200)] 
rust: usb: make Driver trait lifetime-parameterized

Add a 'bound lifetime to the associated Data, changing type Data to type
Data<'bound>.

This allows the driver's bus device private data to capture the device /
driver bound lifetime; device resources can be stored directly by
reference rather than requiring Devres.

The probe() and disconnect() callbacks thus gain a 'bound lifetime
parameter on the methods themselves; avoiding a global lifetime on the
trait impl.

Existing drivers set type Data<'bound> = Self, preserving the current
behavior.

Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
Link: https://patch.msgid.link/20260525202921.124698-16-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
3 weeks ago rust: auxiliary: make Driver trait lifetime-parameterized
Danilo Krummrich [Mon, 25 May 2026 20:21:01 +0000 (22:21 +0200)] 
rust: auxiliary: make Driver trait lifetime-parameterized

Add a 'bound lifetime to the associated Data, changing type Data to type
Data<'bound>.

This allows the driver's bus device private data to capture the device /
driver bound lifetime; device resources can be stored directly by
reference rather than requiring Devres.

The probe() and unbind() callbacks thus gain a 'bound lifetime parameter
on the methods themselves; avoiding a global lifetime on the trait impl.

Existing drivers set type Data<'bound> = Self, preserving the current
behavior.

Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Gary Guo <gary@garyguo.net>
Link: https://patch.msgid.link/20260525202921.124698-15-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
3 weeks ago rust: platform: make Driver trait lifetime-parameterized
Danilo Krummrich [Mon, 25 May 2026 20:21:00 +0000 (22:21 +0200)] 
rust: platform: make Driver trait lifetime-parameterized

Add a 'bound lifetime to the associated Data, changing type Data to type
Data<'bound>.

This allows the driver's bus device private data to capture the device /
driver bound lifetime; device resources can be stored directly by
reference rather than requiring Devres.

The probe() and unbind() callbacks thus gain a 'bound lifetime parameter
on the methods themselves; avoiding a global lifetime on the trait impl.

Existing drivers set type Data<'bound> = Self, preserving the current
behavior.

Acked-by: Uwe Kleine-König <ukleinek@kernel.org>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Gary Guo <gary@garyguo.net>
Link: https://patch.msgid.link/20260525202921.124698-14-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
3 weeks ago drm/amdgpu: fix lock leak on ENOMEM in AMDGPU_GEM_OP_GET_MAPPING_INFO
Michael Bommarito [Sun, 17 May 2026 13:17:42 +0000 (09:17 -0400)] 
drm/amdgpu: fix lock leak on ENOMEM in AMDGPU_GEM_OP_GET_MAPPING_INFO

The AMDGPU_GEM_OP_GET_MAPPING_INFO branch of amdgpu_gem_op_ioctl()
holds three cleanup-tracked resources before calling kvcalloc():
the drm_gem_object reference from drm_gem_object_lookup(), the
drm_exec lock on the looked-up GEM via drm_exec_lock_obj(), and
the drm_exec lock on the per-process VM root page directory via
amdgpu_vm_lock_pd().  All three are released by the out_exec
label that every other error path in this function jumps to.
The kvcalloc() failure path returns -ENOMEM directly, skipping
out_exec and leaking all three.

The leaked per-process VM root PD dma_resv lock is the
load-bearing leak: any subsequent operation on the same VM
(further GEM ops, command-submission, eviction, TTM shrinker
callbacks) blocks on the held lock.  DRM_IOCTL_AMDGPU_GEM_OP is
DRM_AUTH | DRM_RENDER_ALLOW, so this is an unprivileged-local
denial of service against the caller's GPU context, reachable
by any process with /dev/dri/renderD* access.

Route the failure through out_exec so drm_exec_fini() and
drm_gem_object_put() run.
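
The bug class can be modelled with a deliberately small sketch. It is written in Rust only for illustration (the driver is C and uses a goto label), and every name in it is hypothetical: `Held` merely counts outstanding resources, and `alloc_ok` stands in for the outcome of the allocation.

```rust
/// Errno value for "out of memory".
pub const ENOMEM: i32 = 12;

/// Counts cleanup-tracked resources that are currently held.
#[derive(Default)]
pub struct Held {
    pub count: u32,
}

/// Models the buggy path: the allocation failure returns directly.
pub fn get_mapping_info_leaky(held: &mut Held, alloc_ok: bool) -> Result<(), i32> {
    held.count += 3; // GEM reference, GEM lock, VM root PD lock
    if !alloc_ok {
        return Err(-ENOMEM); // skips the common cleanup below
    }
    held.count -= 3; // common cleanup
    Ok(())
}

/// Models the fix: every outcome passes through the common cleanup.
pub fn get_mapping_info_fixed(held: &mut Held, alloc_ok: bool) -> Result<(), i32> {
    held.count += 3;
    let ret = if alloc_ok { Ok(()) } else { Err(-ENOMEM) };
    held.count -= 3; // common cleanup runs on success and failure
    ret
}
```

After a failed allocation the leaky variant leaves all three resources held, while the fixed variant leaves none. In the real driver the held VM root PD lock is what blocks later operations on the same VM.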

Reproduced on stock 7.0.0-10, Ryzen 7 5700U / Radeon Vega
(Lucienne): the failing ioctl returns -ENOMEM and a second
GET_MAPPING_INFO on the same fd then blocks in
drm_exec_lock_obj() on the leaked dma_resv.  SIGKILL on the
caller does not reap the task; the fd-release path during
process exit goes through amdgpu_gem_object_close() ->
drm_exec_prepare_obj() on the same lock, leaving the task in D
state until the box is rebooted.  The patched kernel was not
rebuilt and re-tested on this hardware; the fix is mechanical.
Tested on a single Lucienne / Vega box only.

Ziyi Guo posted an independent INT_MAX-bound check for
args->num_entries in the same branch [1]; the two patches are
complementary and can land in either order.

Fixes: 4d82724f7f2b ("drm/amdgpu: Add mapping info option for GEM_OP ioctl")
Link: https://lore.kernel.org/all/20260208000255.4073363-1-n7l8m4@u.northwestern.edu/ [1]
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago rust: pci: make Driver trait lifetime-parameterized
Danilo Krummrich [Mon, 25 May 2026 20:20:59 +0000 (22:20 +0200)] 
rust: pci: make Driver trait lifetime-parameterized

Add a 'bound lifetime to the associated Data, changing type Data to type
Data<'bound>.

This allows the driver's bus device private data to capture the device /
driver bound lifetime; device resources can be stored directly by
reference rather than requiring Devres.

The probe() and unbind() callbacks thus gain a 'bound lifetime parameter
on the methods themselves; avoiding a global lifetime on the trait impl.

Existing drivers set type Data<'bound> = Self, preserving the current
behavior.

Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Gary Guo <gary@garyguo.net>
Link: https://patch.msgid.link/20260525202921.124698-13-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
3 weeks ago drm/amd/display: Fix amdgpu_dm KUnit allmodconfig build
Ray Wu [Tue, 19 May 2026 03:44:55 +0000 (11:44 +0800)] 
drm/amd/display: Fix amdgpu_dm KUnit allmodconfig build

[Why]
With CONFIG_DRM_AMD_DC_KUNIT_TEST=m, allmodconfig only defines the
_MODULE variant. Four KUnit helper headers gate their declarations
with #ifdef CONFIG_DRM_AMD_DC_KUNIT_TEST, so the declarations vanish
while the matching .c files (driven by IS_ENABLED() via
STATIC_IFN_KUNIT) keep the functions non-static. The build breaks
with implicit declarations and -Werror=missing-prototypes.

amdgpu_dm_crc.h additionally uses symbols that its test file does not
pull in indirectly, amdgpu_dm_colorop_test.c has a copy-paste
duplicate function with the wrong expected bitmask, and the three
colorop TF bitmasks are not exported for modpost.

[How]
- Switch the crc/hdcp/color/psr KUnit guards to IS_ENABLED().
- Make amdgpu_dm_crc.h self-contained (dc_types.h + forward decl).
- Rename the duplicated shaper test back to its intended name and
  fix its expected bitmask.
- Export amdgpu_dm_supported_{degam,shaper,blnd}_tfs via
  EXPORT_IF_KUNIT().

Assisted-by: Copilot:claude-4-opus
Reviewed-by: Alex Hung <alex.hung@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago rust: device: make Core and CoreInternal lifetime-parameterized
Danilo Krummrich [Mon, 25 May 2026 20:20:58 +0000 (22:20 +0200)] 
rust: device: make Core and CoreInternal lifetime-parameterized

Device<Core> references in probe callbacks are scoped to the callback,
not the full binding duration. Add a lifetime parameter to Core and
CoreInternal to accurately represent this in the type system.

Suggested-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Link: https://patch.msgid.link/20260525202921.124698-12-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
3 weeks ago rust: device: implement Sync for Device<Bound>
Danilo Krummrich [Mon, 25 May 2026 20:20:57 +0000 (22:20 +0200)] 
rust: device: implement Sync for Device<Bound>

Implement Sync for Device<Bound> in addition to Device<Normal>.

Device<Bound> uses the same underlying struct device as Device<Normal>;
Bound is a zero-sized type-state marker that does not affect thread
safety.

This is needed for types that hold &'bound Device<Bound>, such as
io::mem::IoMem, to be Send.
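
The underlying language rule is that `&T` is `Send` only if `T` is `Sync`. A toy model, assuming hypothetical stand-ins for `Device`, `Normal`, `Bound`, and `IoMem` (none of these mirror the real kernel definitions), shows the effect:

```rust
use std::marker::PhantomData;

/// Zero-sized type-state markers (toy versions).
pub struct Normal;
pub struct Bound;

/// Toy device wrapping a raw pointer, which makes it !Send and !Sync by default.
pub struct Device<Ctx> {
    raw: *mut u8,
    _ctx: PhantomData<Ctx>,
}

// SAFETY (toy model only): the pointer is never dereferenced in this sketch,
// so sharing references across threads cannot cause a data race.
unsafe impl Sync for Device<Normal> {}
// SAFETY (toy model only): same reasoning; the marker type changes nothing.
unsafe impl Sync for Device<Bound> {}

/// A resource that borrows a bound device.
pub struct IoMem<'a> {
    dev: &'a Device<Bound>,
}

impl IoMem<'_> {
    /// Reads the stored fields so the sketch has observable behavior.
    pub fn device_is_null(&self) -> bool {
        self.dev.raw.is_null()
    }
}

/// Accepts only values whose type is `Send`; a compile-time check.
fn is_send<T: Send>(_value: &T) -> bool {
    true
}

/// Compiles only because `Device<Bound>` is `Sync`.
pub fn io_mem_is_send() -> bool {
    let dev = Device::<Bound> {
        raw: std::ptr::null_mut(),
        _ctx: PhantomData,
    };
    let mem = IoMem { dev: &dev };
    is_send(&mem) && mem.device_is_null()
}
```

Removing the `unsafe impl Sync for Device<Bound>` line makes `io_mem_is_send()` fail to compile, which is the situation the commit addresses for types holding `&'bound Device<Bound>`.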

Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Uwe Kleine-König <ukleinek@kernel.org>
Tested-by: Dirk Behme <dirk.behme@de.bosch.com>
Link: https://patch.msgid.link/20260525202921.124698-11-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
3 weeks ago rust: usb: implement Sync for Device<Bound>
Danilo Krummrich [Mon, 25 May 2026 20:20:56 +0000 (22:20 +0200)] 
rust: usb: implement Sync for Device<Bound>

Implement Sync for Device<Bound> in addition to Device<Normal>.

Device<Bound> uses the same underlying struct usb_device as
Device<Normal>; Bound is a zero-sized type-state marker that does not
affect thread safety.

This is needed for drivers to store &'bound usb::Device<Bound> in their
private data while remaining Send.

Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Uwe Kleine-König <ukleinek@kernel.org>
Tested-by: Dirk Behme <dirk.behme@de.bosch.com>
Link: https://patch.msgid.link/20260525202921.124698-10-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
3 weeks ago rust: auxiliary: implement Sync for Device<Bound>
Danilo Krummrich [Mon, 25 May 2026 20:20:55 +0000 (22:20 +0200)] 
rust: auxiliary: implement Sync for Device<Bound>

Implement Sync for Device<Bound> in addition to Device<Normal>.

Device<Bound> uses the same underlying struct auxiliary_device as
Device<Normal>; Bound is a zero-sized type-state marker that does not
affect thread safety.

This is needed for drivers to store &'bound auxiliary::Device<Bound> in
their private data while remaining Send.

Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Uwe Kleine-König <ukleinek@kernel.org>
Tested-by: Dirk Behme <dirk.behme@de.bosch.com>
Link: https://patch.msgid.link/20260525202921.124698-9-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
3 weeks ago rust: platform: implement Sync for Device<Bound>
Danilo Krummrich [Mon, 25 May 2026 20:20:54 +0000 (22:20 +0200)] 
rust: platform: implement Sync for Device<Bound>

Implement Sync for Device<Bound> in addition to Device<Normal>.

Device<Bound> uses the same underlying struct platform_device as
Device<Normal>; Bound is a zero-sized type-state marker that does not
affect thread safety.

This is needed for drivers to store &'bound platform::Device<Bound> in
their private data while remaining Send.

Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Uwe Kleine-König <ukleinek@kernel.org>
Tested-by: Dirk Behme <dirk.behme@de.bosch.com>
Link: https://patch.msgid.link/20260525202921.124698-8-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
3 weeks ago rust: pci: implement Sync for Device<Bound>
Danilo Krummrich [Mon, 25 May 2026 20:20:53 +0000 (22:20 +0200)] 
rust: pci: implement Sync for Device<Bound>

Implement Sync for Device<Bound> in addition to Device<Normal>.

Device<Bound> uses the same underlying struct pci_dev as Device<Normal>;
Bound is a zero-sized type-state marker that does not affect thread
safety.

This is needed for drivers to store &'bound pci::Device<Bound> in their
private data while remaining Send.

Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Uwe Kleine-König <ukleinek@kernel.org>
Tested-by: Dirk Behme <dirk.behme@de.bosch.com>
Link: https://patch.msgid.link/20260525202921.124698-7-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
3 weeks ago rust: driver core: drop drvdata before devres release
Danilo Krummrich [Mon, 25 May 2026 20:20:52 +0000 (22:20 +0200)] 
rust: driver core: drop drvdata before devres release

Move the post_unbind_rust callback before devres_release_all() in
device_unbind_cleanup().

With drvdata() removed, the driver's bus device private data is only
accessible by the owning driver itself. It is hence safe to drop the
driver's bus device private data before devres actions are released.

This reordering is the key enabler for Higher-Ranked Lifetime Types
(HRT) in Rust device drivers -- it allows driver structs to hold direct
references to devres-managed resources, because the bus device private
data (and with it all such references) is guaranteed to be dropped while
the underlying devres resources are still alive.

Without this change, devres resources would be freed first, leaving the
driver's bus device private data with dangling references during its
destructor.
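
The required ordering can be illustrated with ordinary Rust destructors. This is a user-space sketch under stated assumptions: `DevresResource`, `DriverData`, and `unbind_cleanup` are hypothetical names, and the log of strings exists only to make the order observable.

```rust
use std::cell::RefCell;

/// Shared event log used to record the order of destruction.
type Log = RefCell<Vec<&'static str>>;

/// Stands in for a devres-managed resource.
pub struct DevresResource<'l> {
    log: &'l Log,
}

impl Drop for DevresResource<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push("devres released");
    }
}

/// Stands in for the driver's private data, holding a direct reference.
pub struct DriverData<'r, 'l> {
    resource: &'r DevresResource<'l>,
}

impl Drop for DriverData<'_, '_> {
    fn drop(&mut self) {
        // The destructor still uses the resource, so it must be alive here.
        self.resource.log.borrow_mut().push("drvdata dropped");
    }
}

/// Models the reordered cleanup: private data first, resources second.
pub fn unbind_cleanup() -> Vec<&'static str> {
    let log = Log::new(Vec::new());
    {
        let resource = DevresResource { log: &log };
        let data = DriverData { resource: &resource };
        drop(data); // private data goes away while the resource is alive
        drop(resource); // only then is the resource released
    }
    log.into_inner()
}
```

Swapping the two `drop` calls is rejected by the borrow checker in this sketch, because `data` would still borrow `resource`. The C driver core has no such check, which is why the callback order itself had to change.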

Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patch.msgid.link/20260525202921.124698-6-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
3 weeks ago rust: driver: decouple driver private data from driver type
Danilo Krummrich [Mon, 25 May 2026 20:20:51 +0000 (22:20 +0200)] 
rust: driver: decouple driver private data from driver type

Add a type Data<'bound> associated type to all bus driver traits,
decoupling the driver's bus device private data type from the driver
struct itself.

In the context of adding a 'bound lifetime, making this an associated
type has the advantage that it allows us to avoid a driver trait global
lifetime and it avoids the need for ForLt for bus device private data;
both of which make the subsequent implementation by buses much simpler.

All existing drivers and doc examples set type Data = Self to preserve
the current behavior.

Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patch.msgid.link/20260525202921.124698-5-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
3 weeks ago rust: driver: move 'static bounds to constructor
Gary Guo [Mon, 25 May 2026 20:20:50 +0000 (22:20 +0200)] 
rust: driver: move 'static bounds to constructor

With the ForeignOwnable lifetime change, the 'static bound is no longer
necessary on the drvdata methods or bus adapter impls. Move it to the
Registration constructor instead.

Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Gary Guo <gary@garyguo.net>
Link: https://patch.msgid.link/20260525202921.124698-4-dakr@kernel.org
Co-developed-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
3 weeks ago rust: alloc: remove `'static` bound on `ForeignOwnable`
Gary Guo [Mon, 25 May 2026 20:20:49 +0000 (22:20 +0200)] 
rust: alloc: remove `'static` bound on `ForeignOwnable`

The `'static` bound is currently necessary because there's no
restriction on the lifetime of the GAT. Add a `Self: 'a` bound to
restrict possible lifetimes on `Borrowed` and `BorrowedMut`, and lift
the `'static` requirement.

Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Gary Guo <gary@garyguo.net>
Acked-by: Miguel Ojeda <ojeda@kernel.org>
Link: https://patch.msgid.link/20260525202921.124698-3-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
3 weeks ago rust: pci: use 'static lifetime for PCI BAR resource names
Danilo Krummrich [Mon, 25 May 2026 20:20:48 +0000 (22:20 +0200)] 
rust: pci: use 'static lifetime for PCI BAR resource names

pci_request_region() stores the name pointer directly in struct
resource; use &'static CStr to ensure the pointer remains valid even if
the Bar is leaked.

Cc: stable@vger.kernel.org
Reported-by: Sashiko <sashiko-bot@kernel.org>
Closes: https://lore.kernel.org/all/20260522004943.CDA7C1F000E9@smtp.kernel.org/
Fixes: 3c2e31d717ac ("rust: pci: move I/O infrastructure to separate file")
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Link: https://patch.msgid.link/20260525202921.124698-2-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
3 weeks agodrm/amdkfd: Fix UML build guards for x86_64-only code
Alex Hung [Thu, 14 May 2026 17:01:39 +0000 (11:01 -0600)] 
drm/amdkfd: Fix UML build guards for x86_64-only code

cpu_data().topo.apicid and kfd_fill_iolink_info_for_cpu() rely on
x86-specific structs that are not present on UML. The affected code in
kfd_topology.c and kfd_crat.c was guarded by CONFIG_X86_64 alone,
causing build failures when CONFIG_DRM_AMDGPU is selected on UML.

Update guards to '#if defined(CONFIG_X86_64) && !defined(CONFIG_UML)'
to ensure x86_64-only paths are excluded on UML builds.

Fixes: af3f2f5db265 ("drm/amdgpu: Remove UML build exclusion from Kconfig")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202605140506.TI8zPIBG-lkp@intel.com/
Cc: Harry Wentland <harry.wentland@amd.com>
Assisted-by: Copilot:Claude-Sonnet-4.6
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
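
The guard described in the commit above can be sketched outside the kernel
tree as follows. This is only an illustration: cpu_apicid() and its return
values are hypothetical stand-ins, and the fallback branch is an assumption
made so the snippet compiles on its own. CONFIG_X86_64 and CONFIG_UML are
normally provided by Kconfig, so neither is defined in a plain userspace
build and the fallback is selected.

```c
#include <assert.h>

/*
 * Compile the x86_64-only path only when building for real x86_64,
 * not for UML (which may also define CONFIG_X86_64).
 */
#if defined(CONFIG_X86_64) && !defined(CONFIG_UML)
/* Hypothetical x86-specific path; the value is a placeholder. */
static int cpu_apicid(void)
{
	return 42;
}
#else
/* Hypothetical generic fallback used on UML and non-x86 builds. */
static int cpu_apicid(void)
{
	return -1;
}
#endif
```

Building the same file with -DCONFIG_X86_64 selects the first branch, and
adding -DCONFIG_UML switches back to the fallback.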
3 weeks agonvmet-tcp: fix page fragment cache leak in error path
Geliang Tang [Tue, 26 May 2026 09:22:22 +0000 (17:22 +0800)] 
nvmet-tcp: fix page fragment cache leak in error path

In nvmet_tcp_alloc_queue(), when a connection is closed during the
allocation process (e.g., nvmet_tcp_set_queue_sock() returns -ENOTCONN),
the error handling jumps to out_destroy_sq and then to out_ida_remove
without draining the page fragment cache.

Although nvmet_tcp_free_cmd() is called in some error paths to release
individual page fragments, the underlying page cache reference held by
queue->pf_cache is never released. The first allocation using pf_cache
is the call to nvmet_tcp_alloc_cmd() for queue->connect, which happens
after ida_alloc() returns successfully. This results in a page leak each
time a connection fails during allocation, which could lead to memory
exhaustion over time if connections are repeatedly opened and closed.

Fix this by calling page_frag_cache_drain() before freeing the queue
structure in the out_ida_remove label.

Fixes: 872d26a391da ("nvmet-tcp: add NVMe over TCP target driver")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Keith Busch <kbusch@kernel.org>
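
The error-path pattern described in the commit above can be sketched in
userspace C. This is a simplified model under stated assumptions, not the
driver code: struct frag_cache, frag_alloc(), frag_cache_drain(),
alloc_queue() and the out_free label are hypothetical stand-ins, and
live_pages exists only so the leak can be observed.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a page fragment cache: one backing page. */
struct frag_cache {
	void *page;
};

struct queue {
	struct frag_cache pf_cache;
	void *connect_cmd;
};

static int live_pages;	/* backing pages currently held by any cache */

/* Hand out a fragment, taking a backing page on first use. */
static void *frag_alloc(struct frag_cache *c)
{
	if (!c->page) {
		c->page = malloc(4096);
		if (!c->page)
			return NULL;
		live_pages++;
	}
	return c->page;
}

/* Release the backing page held by the cache, if any. */
static void frag_cache_drain(struct frag_cache *c)
{
	if (c->page) {
		free(c->page);
		c->page = NULL;
		live_pages--;
	}
}

/* sock_err models a failure after the first fragment allocation. */
static int alloc_queue(int sock_err)
{
	struct queue *q = calloc(1, sizeof(*q));
	int ret = 0;

	if (!q)
		return -ENOMEM;

	q->connect_cmd = frag_alloc(&q->pf_cache);
	if (!q->connect_cmd) {
		ret = -ENOMEM;
		goto out_free;
	}
	if (sock_err) {
		ret = sock_err;
		goto out_free;
	}
	/* Success: this sketch tears the queue down again for simplicity. */
out_free:
	/* Without this drain the backing page would outlive the queue. */
	frag_cache_drain(&q->pf_cache);
	free(q);
	return ret;
}
```

Calling alloc_queue(-ENOTCONN) returns -ENOTCONN and leaves live_pages at 0.
Removing the drain call from the label would leave one page behind per
failed attempt.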
3 weeks agonvme-core: fix unsigned comparison warning in nvme_wait_freeze_timeout
Maurizio Lombardi [Thu, 21 May 2026 15:37:16 +0000 (17:37 +0200)] 
nvme-core: fix unsigned comparison warning in nvme_wait_freeze_timeout

The timeout variable in nvme_wait_freeze_timeout() is an unsigned type.
Checking if it is <= 0 triggers a compiler warning because an unsigned
variable can never be negative.

Fix this warning by changing the type to long.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <error27@gmail.com>
Closes: https://lore.kernel.org/r/202605211257.STzj2Ujv-lkp@intel.com/
Fixes: 23b6d2cbf75f ("nvme: remove redundant timeout argument from nvme_wait_freeze_timeout")
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
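
Why a "<= 0" check on an unsigned variable is suspicious can be shown with a
small userspace sketch. The helper wait_step() is hypothetical and, unlike
the kernel wait primitive, is allowed to return a negative value so that the
difference between the two types becomes visible.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical wait step: consumes 10 units and may go negative. */
static long wait_step(long remaining)
{
	return remaining - 10;
}

/* Unsigned variant: "<= 0" can only ever be true when the value is 0. */
static bool timed_out_unsigned(unsigned long timeout)
{
	timeout = (unsigned long)wait_step((long)timeout);
	return timeout <= 0;	/* a negative result wrapped to a huge value */
}

/* Signed variant: zero and negative results both count as a timeout. */
static bool timed_out_signed(long timeout)
{
	timeout = wait_step(timeout);
	return timeout <= 0;
}
```

With a starting budget of 5, the signed variant reports a timeout while the
unsigned variant does not, because -5 converts to a very large unsigned
value.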
3 weeks agodrm/xe: Restore IDLEDLY register on engine reset
Balasubramani Vivekanandan [Fri, 22 May 2026 16:35:32 +0000 (22:05 +0530)] 
drm/xe: Restore IDLEDLY register on engine reset

Wa_16023105232 programs the register IDLEDLY. The register is reset
whenever the engine is reset. Therefore it should be added to the GuC
save-restore register list for it to be restored after reset.

Fixes: 7c53ff050ba8 ("drm/xe: Apply Wa_16023105232")
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260522163531.1365540-2-balasubramani.vivekanandan@intel.com
Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
3 weeks agodrm/dp: Add DSC virtual DPCD quirk for Realtek MST branch device
Imre Deak [Mon, 25 May 2026 12:55:16 +0000 (15:55 +0300)] 
drm/dp: Add DSC virtual DPCD quirk for Realtek MST branch device

The ASUS DC301 USB-C dock containing a Realtek MST branch device
supports the DSC decompression functionality on each of the dock's
downstream connectors, even though there is no discoverable peer-to-peer
virtual device in the MST topology (which the DP Standard
requires/suggests to control the DSC functionality on a per-DFP basis).
Add the DP_DPCD_QUIRK_DSC_WITHOUT_VIRTUAL_DPCD quirk for this branch
device as well to enable the DSC decompression functionality on all DFP
connectors of the dock, similarly to how this is done for docks
containing older Synaptics branch devices.

Cc: Lyude Paul <lyude@redhat.com>
Reported-and-tested-by: Shawn C Lee <shawn.c.lee@intel.com>
Reviewed-by: Mika Kahola <mika.kahola@intel.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patch.msgid.link/20260525125516.2794636-1-imre.deak@intel.com
3 weeks agoarm64: dts: rockchip: add rga3 dt nodes to rk3588
Sven Püschel [Wed, 20 May 2026 22:44:33 +0000 (00:44 +0200)] 
arm64: dts: rockchip: add rga3 dt nodes to rk3588

Add devicetree nodes for the RGA3 (Raster Graphics Acceleration 3)
peripheral in the RK3588.

The existing rga node refers to the RGA2-Enhanced peripheral. The RK3588
contains one RGA2-Enhanced core and two RGA3 cores. Both offer similar
functionality: scaling, cropping and rotating up to two input images
into one output image. Key differences of the RGA3 are:

- supports 10bit YUV output formats
- supports 8x8 tiles and FBCD as inputs and outputs
- supports BT2020 color space conversion
- max output resolution of (8192-64)x(8192-64)
- MMU can map up to 32G DDR RAM
- fully planar formats (3 planes) are not supported
- max scale up/down factor of 8 (RGA2 allows up to 16)

Signed-off-by: Sven Püschel <s.pueschel@pengutronix.de>
Link: https://patch.msgid.link/20260521-spu-rga3-v7-28-3f33e8c7145f@pengutronix.de
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
3 weeks agoKVM: arm64: Pre-check vcpu memcache for host->guest donate
Fuad Tabba [Fri, 1 May 2026 11:21:49 +0000 (12:21 +0100)] 
KVM: arm64: Pre-check vcpu memcache for host->guest donate

__pkvm_host_donate_guest() flips the host stage-2 PTE for the
donated page to a non-valid annotation via
host_stage2_set_owner_metadata_locked() and then calls
kvm_pgtable_stage2_map() to install the matching guest stage-2
mapping. The map's return value is wrapped in WARN_ON() and
otherwise discarded, asserting that the call cannot fail.

WARN_ON() at nVHE EL2 panics, so this assertion is only correct
if the call genuinely cannot fail. kvm_pgtable_stage2_map() can
fail with -ENOMEM even at PAGE_SIZE granularity: the donate path
verifies PKVM_NOPAGE for the guest IPA before the map, so the
walker must allocate fresh page-table pages from the vcpu
memcache, and the host controls the vcpu memcache via the topup
interface. An under-provisioned donation request would otherwise
turn a recoverable -ENOMEM into a fatal hyp panic.

Bound the worst-case walker allocation alongside the existing
__host_check_page_state_range() / __guest_check_page_state_range()
pre-checks, using the helper introduced for host->guest share. If
the vcpu memcache holds fewer pages than kvm_mmu_cache_min_pages(),
return -ENOMEM before any state mutation.

Fixes: 1e579adca177 ("KVM: arm64: Introduce __pkvm_host_donate_guest()")
Assisted-by: Gemini:gemini-3.1-pro review-prompts
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260501112149.2824881-7-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
3 weeks agoKVM: arm64: Pre-check vcpu memcache for host->guest share
Fuad Tabba [Fri, 1 May 2026 11:21:48 +0000 (12:21 +0100)] 
KVM: arm64: Pre-check vcpu memcache for host->guest share

__pkvm_host_share_guest() ends with kvm_pgtable_stage2_map() to
install the guest stage-2 mapping, after a forward pass that mutates
the host vmemmap (sets PKVM_PAGE_SHARED_OWNED and increments
host_share_guest_count) for every page in the range. The map's
return value is wrapped in WARN_ON() and otherwise discarded,
asserting that the call cannot fail.

WARN_ON() at nVHE EL2 panics, so this assertion is only correct if
the call genuinely cannot fail. kvm_pgtable_stage2_map() can fail
with -ENOMEM when the stage-2 walker exhausts the caller's
memcache, and the host controls the vcpu memcache via the topup
interface, so an under-provisioned share request would otherwise
turn a recoverable -ENOMEM into a fatal hyp panic.

Bound the worst-case walker allocation in the existing pre-check
pass so that kvm_pgtable_stage2_map() cannot fail at the call
site, using kvm_mmu_cache_min_pages() -- the same bound host EL1
uses for its own stage-2 maps. If the vcpu memcache holds fewer
pages, return -ENOMEM before any state mutation.

Fixes: d0bd3e6570ae ("KVM: arm64: Introduce __pkvm_host_share_guest()")
Assisted-by: Gemini:gemini-3.1-pro review-prompts
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260501112149.2824881-6-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
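
The "check before any state mutation" ordering described in the commit above
can be sketched in plain C. Everything here is a simplified, hypothetical
model: struct memcache, struct page_state, MIN_WALK_PAGES and share_pages()
are illustrative names, and the fixed bound of 3 pages is an assumption
standing in for whatever kvm_mmu_cache_min_pages() would return.

```c
#include <assert.h>
#include <errno.h>

struct memcache {
	int nr_pages;		/* pages available to the table walker */
};

struct page_state {
	int shared;		/* models the shared-owned page state */
	int share_count;	/* models the per-page share counter */
};

/* Hypothetical worst-case number of table pages one map call may need. */
#define MIN_WALK_PAGES 3

static int share_pages(struct memcache *mc, struct page_state *pages, int n)
{
	/* Pre-check: fail while nothing has been modified yet. */
	if (mc->nr_pages < MIN_WALK_PAGES)
		return -ENOMEM;

	/* Forward pass: mutate per-page state for the whole range. */
	for (int i = 0; i < n; i++) {
		pages[i].shared = 1;
		pages[i].share_count++;
	}

	/* Stand-in for the map call, which can no longer run out of pages. */
	mc->nr_pages -= MIN_WALK_PAGES;
	return 0;
}
```

With an under-provisioned cache the function returns -ENOMEM and the page
array is untouched, so the caller can top up the cache and retry instead of
having to unwind a half-applied update.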
3 weeks agoKVM: arm64: Seed pkvm_ownership_selftest vcpu memcache
Fuad Tabba [Fri, 1 May 2026 11:21:47 +0000 (12:21 +0100)] 
KVM: arm64: Seed pkvm_ownership_selftest vcpu memcache

The hypercall handlers call pkvm_refill_memcache() to top up the
hyp_vcpu memcache before invoking __pkvm_host_{share,donate}_guest().
pkvm_ownership_selftest invokes those functions directly with a
static selftest_vcpu that has an empty memcache.

Seed selftest_vcpu's memcache from the prepopulated selftest
pages, leaving the remainder for selftest_vm.pool. Required by
the memcache-sufficiency pre-check added in the following
patches.

Assisted-by: Gemini:gemini-3.1-pro review-prompts
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260501112149.2824881-5-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
3 weeks agoKVM: arm64: Fix __deactivate_fgt macro parameter typo
Fuad Tabba [Fri, 1 May 2026 11:21:46 +0000 (12:21 +0100)] 
KVM: arm64: Fix __deactivate_fgt macro parameter typo

__deactivate_fgt() declares its first parameter as "htcxt" but the body
references "hctxt". The parameter is unused; the macro silently captures
"hctxt" from the enclosing scope. Both existing callers
(__deactivate_traps_hfgxtr() and __deactivate_traps_ich_hfgxtr()) happen
to define a local "struct kvm_cpu_context *hctxt", so the macro works
by coincidence.

A future caller without an "hctxt" local in scope, or naming it
differently, would compile but bind to the wrong context. Align the
parameter name with the sibling __activate_fgt() macro.

The "vcpu" parameter remains unused in the body and is kept for API
symmetry with __activate_fgt() (which uses it).

Fixes: f5a5a406b4b8 ("KVM: arm64: Propagate and handle Fine-Grained UNDEF bits")
Assisted-by: Gemini:gemini-3.1-pro review-prompts
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260501112149.2824881-4-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
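
The class of bug described in the commit above is easy to reproduce in
isolation. In this sketch the macro and function names (READ_BUGGY,
READ_FIXED, read_buggy, read_fixed) and struct ctxt are hypothetical; only
the "htcxt" versus "hctxt" spelling mirrors the commit message.

```c
#include <assert.h>

struct ctxt {
	int val;
};

/*
 * Typo: the parameter is spelled "htcxt" but the body uses "hctxt".
 * The argument is dropped and the macro captures whatever variable
 * named "hctxt" exists at the expansion site.
 */
#define READ_BUGGY(htcxt)	((hctxt)->val)

/* Fixed: parameter and body agree, so the argument is actually used. */
#define READ_FIXED(hctxt)	((hctxt)->val)

static int read_buggy(struct ctxt *hctxt, struct ctxt *other)
{
	(void)other;			/* silently ignored by the macro */
	return READ_BUGGY(other);	/* expands to (hctxt)->val */
}

static int read_fixed(struct ctxt *hctxt, struct ctxt *other)
{
	(void)hctxt;
	return READ_FIXED(other);	/* expands to (other)->val */
}
```

Given two contexts holding 1 and 2, read_buggy() returns 1 although it was
asked to read the second one, while read_fixed() returns 2. If no "hctxt"
were in scope, the buggy macro would fail to compile instead.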
3 weeks agoKVM: arm64: Guard against NULL vcpu on VHE hyp panic path
Fuad Tabba [Fri, 1 May 2026 11:21:45 +0000 (12:21 +0100)] 
KVM: arm64: Guard against NULL vcpu on VHE hyp panic path

On VHE, __hyp_call_panic() unconditionally calls __deactivate_traps(vcpu)
on the vcpu pointer read from host_ctxt->__hyp_running_vcpu. That pointer
is cleared after every guest exit (and is never set when no guest is
running), so an unexpected EL2 exception landing in _guest_exit_panic
(e.g. via the el2t*_invalid / el2h_irq_invalid vectors) reaches this
function with vcpu == NULL. __deactivate_traps() then dereferences vcpu
via ___deactivate_traps() -> vserror_state_is_nested() -> vcpu_has_nv()
-> vcpu->arch.features, faulting inside the panic handler and obscuring
the original failure.

The nVHE counterpart (hyp_panic() in arch/arm64/kvm/hyp/nvhe/switch.c)
already guards its vcpu-using cleanup with "if (vcpu)"; mirror that
here. sysreg_restore_host_state_vhe() does not depend on vcpu and
continues to run unconditionally, preserving panic forensics. The
trailing panic("...VCPU:%p", vcpu) prints "(null)" safely via printk's
%p handling.

Fixes: 6a0259ed29bb ("KVM: arm64: Remove hyp_panic arguments")
Assisted-by: Gemini:gemini-3.1-pro review-prompts
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260501112149.2824881-3-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
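
The shape of the guard described in the commit above can be sketched as
follows. This is a userspace model only: struct vcpu, deactivate_traps(),
restore_host_state(), panic_cleanup() and host_state_restored are
hypothetical stand-ins for the kernel functions named in the message.

```c
#include <assert.h>
#include <stddef.h>

struct vcpu {
	int traps_active;
};

static int host_state_restored;

/* Dereferences vcpu, so it must never be called with NULL. */
static void deactivate_traps(struct vcpu *vcpu)
{
	vcpu->traps_active = 0;
}

/* Does not depend on vcpu, so it can always run. */
static void restore_host_state(void)
{
	host_state_restored = 1;
}

static void panic_cleanup(struct vcpu *vcpu)
{
	/* Only touch vcpu state if a guest was actually running. */
	if (vcpu)
		deactivate_traps(vcpu);

	/* Host state is restored unconditionally. */
	restore_host_state();
}
```

Calling panic_cleanup(NULL) restores the host state without faulting, and
calling it with a valid pointer additionally clears traps_active.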
3 weeks agopinctrl: mediatek: fix SPDX comment style in header
Mayur Kumar [Mon, 11 May 2026 18:30:17 +0000 (00:00 +0530)] 
pinctrl: mediatek: fix SPDX comment style in header

Header files should use the C-style '/*' block comment for SPDX
license identifiers. Correct the style in pinctrl-mtk-mt8365.h
to satisfy checkpatch requirements.

Signed-off-by: Mayur Kumar <kmayur809@gmail.com>
Signed-off-by: Linus Walleij <linusw@kernel.org>
3 weeks agopinctrl: actions: fix SPDX comment style in header
Mayur Kumar [Mon, 11 May 2026 18:30:02 +0000 (00:00 +0530)] 
pinctrl: actions: fix SPDX comment style in header

Header files should use the C-style '/*' block comment for SPDX
license identifiers. Correct the style in pinctrl-owl.h
to satisfy checkpatch requirements.

Signed-off-by: Mayur Kumar <kmayur809@gmail.com>
Signed-off-by: Linus Walleij <linusw@kernel.org>
3 weeks agopinctrl: bcm: fix SPDX comment style in header
Mayur Kumar [Mon, 11 May 2026 18:29:43 +0000 (23:59 +0530)] 
pinctrl: bcm: fix SPDX comment style in header

Header files should use the C-style '/*' block comment for SPDX
license identifiers. Correct the style in pinctrl-bcm63xx.h
to satisfy checkpatch requirements.

Signed-off-by: Mayur Kumar <kmayur809@gmail.com>
Signed-off-by: Linus Walleij <linusw@kernel.org>
3 weeks agofs/select: replace __get_free_page() with kmalloc()
Mike Rapoport (Microsoft) [Sat, 23 May 2026 17:54:25 +0000 (20:54 +0300)] 
fs/select: replace __get_free_page() with kmalloc()

poll_get_entry() allocates new memory for poll_table entries using
__get_free_page().

kmalloc() is a better API for such use and it also provides better
scalability and more debugging possibilities.

Replace use of __get_free_page() with kmalloc().

Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Link: https://patch.msgid.link/20260523-b4-fs-v1-13-275e36a83f0e@kernel.org
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
3 weeks agofuse: replace __get_free_page() with kmalloc()
Mike Rapoport (Microsoft) [Sat, 23 May 2026 17:54:24 +0000 (20:54 +0300)] 
fuse: replace __get_free_page() with kmalloc()

fuse_do_ioctl allocates memory for struct iov array using
__get_free_page().

kmalloc() is a better API for such use and it also provides better
scalability and more debugging possibilities.

Replace use of __get_free_page() with kmalloc().

Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Link: https://patch.msgid.link/20260523-b4-fs-v1-12-275e36a83f0e@kernel.org
Acked-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
3 weeks agoisofs: replace __get_free_page() with kmalloc()
Mike Rapoport (Microsoft) [Sat, 23 May 2026 17:54:23 +0000 (20:54 +0300)] 
isofs: replace __get_free_page() with kmalloc()

isofs_readdir() allocates a temporary buffer with __get_free_page().

kmalloc() is a better API for such use and it also provides better
scalability and more debugging possibilities.

Replace use of __get_free_page() with kmalloc().

Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Link: https://patch.msgid.link/20260523-b4-fs-v1-11-275e36a83f0e@kernel.org
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
3 weeks agojbd2: replace __get_free_pages() with kmalloc()
Mike Rapoport (Microsoft) [Sat, 23 May 2026 17:54:22 +0000 (20:54 +0300)] 
jbd2: replace __get_free_pages() with kmalloc()

jbd2_alloc() falls back from kmem_cache_alloc() to __get_free_pages() for
allocations larger than PAGE_SIZE, but kmalloc() can handle such cases
with essentially the same fallback.

Replace use of __get_free_pages() with kmalloc() and simplify
jbd2_free() as both kmem_cache_alloc() and kmalloc() allocations can be
freed with kfree().

Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Link: https://patch.msgid.link/20260523-b4-fs-v1-10-275e36a83f0e@kernel.org
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
3 weeks agojfs: replace __get_free_page() with kmalloc()
Mike Rapoport (Microsoft) [Sat, 23 May 2026 17:54:21 +0000 (20:54 +0300)] 
jfs: replace __get_free_page() with kmalloc()

jfs_readdir() allocates dirent_buf with __get_free_page().

kmalloc() is a better API for such use and it also provides better
scalability and more debugging possibilities.

Replace use of __get_free_page() with kmalloc().

Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Link: https://patch.msgid.link/20260523-b4-fs-v1-9-275e36a83f0e@kernel.org
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
3 weeks agolibfs: simple_transaction_get(): replace get_zeroed_page() with kzalloc()
Mike Rapoport (Microsoft) [Sat, 23 May 2026 17:54:20 +0000 (20:54 +0300)] 
libfs: simple_transaction_get(): replace get_zeroed_page() with kzalloc()

simple_transaction_get() allocates memory with get_zeroed_page(). That
memory is used as a file local buffer that is accessed using
copy_from_user() and simple_read_from_buffer().

kmalloc() is a better API for such use and it also provides better
scalability and more debugging possibilities.

Replace use of get_zeroed_page() with kzalloc().

Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Link: https://patch.msgid.link/20260523-b4-fs-v1-8-275e36a83f0e@kernel.org
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
3 weeks agoNFSD: replace __get_free_page() with kmalloc() in nfsd_buffered_readdir()
Mike Rapoport (Microsoft) [Sat, 23 May 2026 17:54:19 +0000 (20:54 +0300)] 
NFSD: replace __get_free_page() with kmalloc() in nfsd_buffered_readdir()

nfsd_buffered_readdir() allocates a staging buffer with __get_free_page().

kmalloc() is a better API for such use and it also provides better
scalability and more debugging possibilities.

Replace use of __get_free_page() with kmalloc().

Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Link: https://patch.msgid.link/20260523-b4-fs-v1-7-275e36a83f0e@kernel.org
Acked-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
3 weeks agoNFS: remove unused page and page2 in nfs4_replace_transport()
Mike Rapoport (Microsoft) [Sat, 23 May 2026 17:54:18 +0000 (20:54 +0300)] 
NFS: remove unused page and page2 in nfs4_replace_transport()

Temporary buffers page and page2 allocated by nfs4_replace_transport() and
passed to nfs4_try_replacing_one_location() are never used.

Remove them and the code that allocates and frees memory for these buffers.

Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Link: https://patch.msgid.link/20260523-b4-fs-v1-6-275e36a83f0e@kernel.org
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>