Xiang Liu [Thu, 7 May 2026 12:56:15 +0000 (20:56 +0800)]
drm/amd/ras: Fix CPER ring debugfs read overflow
The legacy CPER debugfs reader can reach the payload path without a
valid pointer snapshot. The remaining user byte count is also treated as
the ring occupancy in dwords, so reads past the header can copy more than
requested.
Take the CPER lock before sampling pointers. Resample rptr/wptr for
payload reads, bound the payload copy by available dwords and the
remaining user size, and advance the file position for each dword copied.
Signed-off-by: Xiang Liu <xiang.liu@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Taimur Hassan [Sat, 2 May 2026 09:39:37 +0000 (04:39 -0500)]
drm/amd/display: Promote DC to 3.2.382
This version brings along following update:
-Revert "Enable HUBP/OPTC/DPP power gating"
-Revert "Unify fast update classification paths"
-enable ODM 2:1 on single eDP based on pixel clock
-Enable IPS on DCN42
-Add additional IPS entry/exit for PSR/Replay
-Separate ABM functions into dedicated power_abm.c file
-Fix always-true lower-bound assert
-Refactor dc_link_aux_transfer_raw
-only call pmfw if smu present flags true
-Fix multiple compiler warnings
-Fix CRC open failure during active rendering
-Fix white screen on boot with OLED panel
-Fix refresh rate round up case
Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com> Signed-off-by: James Lin <pinglei.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Taimur Hassan [Fri, 1 May 2026 23:19:13 +0000 (19:19 -0400)]
drm/amd/display: [FW Promotion] Release 0.1.59.0
[Why & How]
Update DMUB related command structure.
Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com> Signed-off-by: James Lin <pinglei.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Wrap DCN32 phantom-plane allocation in DC_RUN_WITH_PREEMPTION_ENABLED
[Why]
dcn32_validate_bandwidth() wraps dcn32_internal_validate_bw() with
DC_FP_START()/DC_FP_END(). In x86 non-RT, DC_FP_START takes fpregs_lock(),
which disables local softirqs.
The DML1 path through dcn32_enable_phantom_plane() calls kvzalloc() to
allocate ~335 KiB for dc_plane_state. This triggers the vmalloc path,
which calls BUG_ON(in_interrupt()) because it's invoked within the
FPU-enabled (softirq disabled) region, leading to a kernel crash.
[How]
Wrap the dc_state_create_phantom_plane() call with the
DC_RUN_WITH_PREEMPTION_ENABLED() macro to allow preemption during
this memory allocation.
Fixes: 235c67634230 ("drm/amd/display: add DCN32/321 specific files for Display Core") Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/4470 Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Signed-off-by: James Lin <pinglei.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Ovidiu Bunea <ovidiu.bunea@amd.com> Signed-off-by: Leo Chen <leo.chen@amd.com> Signed-off-by: James Lin <pinglei.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ovidiu Bunea [Fri, 1 May 2026 20:18:36 +0000 (16:18 -0400)]
drm/amd/display: Revert "Unify fast update classification paths"
[why & how]
This change causes regressions in ACPI and display off/on testing.
Revert the change to unblock testing.
This reverts commit 5f6937c1afb151c85af721fad180d588060430d7.
Reviewed-by: Aric Cyr <aric.cyr@amd.com> Signed-off-by: Ovidiu Bunea <ovidiu.bunea@amd.com> Signed-off-by: James Lin <pinglei.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Charlene Liu [Thu, 30 Apr 2026 21:24:38 +0000 (17:24 -0400)]
drm/amd/display: enable ODM 2:1 on single eDP based on pixel clock
[Why & How]
this is to force ODM 2:1 on single eDP to lower dispclk/dppclk.
Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Charlene Liu <Charlene.Liu@amd.com> Signed-off-by: James Lin <pinglei.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
warnings were triggered by enum forward declarations that are not
valid in C++ without an explicit underlying type.
[How]
- Replace problematic enum forward declarations with C++-safe forms where
applicable.
- Use plain integer types for interface-only declarations that do not
require strong enum typing.
- Update dependent winterface signatures and related type usage
consistently.
- Add required include and type-visibility fixes to avoid follow-on parse
and type-resolution issues.
Reviewed-by: Aric Cyr <aric.cyr@amd.com> Signed-off-by: Gaghik Khachatrian <gaghik.khachatrian@amd.com> Signed-off-by: James Lin <pinglei.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ivan Lipski [Wed, 29 Apr 2026 18:21:46 +0000 (14:21 -0400)]
drm/amd/display: Enable IPS on DCN42
[Why & How]
Fully enable IPS to achieve higher power savings.
Reviewed-by: Sunpeng Li <sunpeng.li@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Signed-off-by: James Lin <pinglei.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ivan Lipski [Wed, 29 Apr 2026 23:05:20 +0000 (19:05 -0400)]
drm/amd/display: Add additional IPS entry/exit for PSR/Replay
[Why]
Multiple paths issue DMUB commands without managing IPS state, causing
dc_wake_and_execute_gpint/dmub_cmd to internally wake from IPS and
reallow idle. This flips idle_allowed back to true while
idle_optimizations_allowed remains false during in-flight commits,
desynchronizing the two flags.
Affected paths:
- amdgpu_dm_psr_set_event() and amdgpu_dm_replay_set_event() calls from
amdgpu_dm_handle_vrr_transition(), amdgpu_dm_commit_planes() and
amdgpu_dm_mod_power_update_streams(), that are invoked on atomic commits.
- debugfs psr_get(), psr_read_residency(), replay_get_state(),
replay_set_residency() access hardware without holding dc_lock or
disabling IPS.
[How]
- Explicitly exit IPS before PSR/Replay set_event w/ hw_programming,
called within atomic commit.
- Wrap debugfs PSR/Replay state getters and setters with IPS exit/entry +
dc_lock.
Reviewed-by: Sunpeng Li <sunpeng.li@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Signed-off-by: James Lin <pinglei.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Lohita Mudimela [Tue, 10 Mar 2026 12:16:01 +0000 (17:46 +0530)]
drm/amd/display: Separate ABM functions into dedicated power_abm.c file
[Why]
Improves code organization by separating Adaptive Backlight
Modulation functionality from general power management.
This modular approach enhances maintainability and makes the
codebase easier to navigate.
[How]
Create new power_abm.c file containing all ABM-related functions moved from power.c.
Remove static qualifier from shared functions to enable cross-file access:
- initialize_backlight_caps: Initialize backlight capabilities
- validate_ext_backlight_caps: Validate external backlight capabilities
- backlight_millipercent_to_pwm: Convert brightness percent to PWM
- backlight_millipercent_to_millinit: Convert brightness percent to nits
- fill_backlight_level_params: Populate backlight level parametersAdd function
declarations to mod_power.h header. Update CMakeLists.txt to include power_abm.c in build.
Maintain forward declaration of struct core_power for type compatibility.
Rename struct core_power field from 'public' to 'mod_public'.
Move internal structures (backlight_state, backlight_properties,
dmcu_varibright_cached_properties, core_power) to power_helpers.h to
ensure consistent memory layouts across compilation units.
Reviewed-by: Martin Leung <martin.leung@amd.com> Signed-off-by: Lohita Mudimela <lohita.mudimela@amd.com> Signed-off-by: James Lin <pinglei.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
A recent type change made the lower-bound part of the OTG instance
assert redundant, which can trigger static-analysis noise and distract
from actionable diagnostics.
[How]
Kept the meaningful upper-bound range validation required for safe
narrowing to uint8_t. Removed the redundant non-negative portion of the
assert so the check matches current type semantics. Revalidated with the
latest debug build log: no warnings and no build-failure markers.
Reviewed-by: Aric Cyr <aric.cyr@amd.com> Signed-off-by: Gaghik Khachatrian <gaghik.khachatrian@amd.com> Signed-off-by: James Lin <pinglei.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why & How]
The logic for choosing between the dce_aux_transfer function variants is
moved into dce_aux.c rather than link_ddc.c.
The "dce_aux_transfer_with_retries" function now uses
dce_aux_transfer_raw in its implementation as the logic is equivalent.
Reviewed-by: Nevenko Stupar <nevenko.stupar@amd.com> Reviewed-by: Gabe Teeger <gabe.teeger@amd.com> Signed-off-by: Matthew Stewart <Matthew.Stewart2@amd.com> Signed-off-by: James Lin <pinglei.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Charlene Liu [Mon, 20 Apr 2026 15:47:11 +0000 (11:47 -0400)]
drm/amd/display: only call pmfw if smu present flags true
[Why & How]
for fault safe case: only call pmfw if smu present flags true
and default to 2 channle for bios intergration info table error.
Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Charlene Liu <Charlene.Liu@amd.com> Signed-off-by: James Lin <pinglei.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Clay King [Mon, 27 Apr 2026 18:51:35 +0000 (14:51 -0400)]
drm/amd/display: Fix warnings
[Why & How]
Fix various warnings related to unsigned/signed mismatches
- Consistently use the same signedness for a given value
- Explcitly cast between types when needed
Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Clay King <clayking@amd.com> Signed-off-by: James Lin <pinglei.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
Unreachable Code;
Copy Constructor Deleted;
Local Declaration Hides Parameter;
Local Declaration Hides Outer Scope;
Uninitialized or Suspicious Memory Use.
[How]
- Removed or refactored unreachable code paths
- Ensured proper copy constructors in C++ classes
- Renamed local variables that shadowed function parameters
- Renamed inner loop/block variables to avoid shadowing outer scope
Fixed in 8 files across several FPU layers
Also fixed in color_gamma and cs_funcs modules
- Reordered guard conditions to validate pipe type before accessing stream
- Ensures safe memory access patterns in DC DMUB service layer
All changes maintain backward compatibility and preserve functional behavior.
Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Gaghik Khachatrian <gaghik.khachatrian@amd.com> Signed-off-by: James Lin <pinglei.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why & How]
In dml2_translation_helper.c, rename the inner loop index inside
dml2_init_soc_states() for several project cases
to avoid shadowing the outer function-scope index variable.
In display_mode_core.c, replace shift-based power-of-two expressions
used to compute dpte_row_height and dpte_row_height_linear with an
equivalent floating-point power function, consistent with existing
usage elsewhere in the file.
Behavior for valid inputs is preserved in both cases.
Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Gaghik Khachatrian <gaghik.khachatrian@amd.com> Signed-off-by: James Lin <pinglei.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
Address signed/unsigned comparison warnings in DC paths
to keep builds warning-clean and improve type safety at comparison boundaries.
Most warnings came from signed loop/index temporaries compared against unsigned
counters (for example pipe_count, num_states, and resource-cap counters), plus a
small number of mixed signed/unsigned checks in writeback and clock-related assertions.
[How]
Aligned iterator and temporary variable types with the semantic type of the compared
bounds. Used unsigned indices for loops bounded by unsigned counters, and retained signed
types where values are semantically signed (for example arithmetic with sentinel or signed
intermediate values). Where mixed signed/unsigned comparisons are intentional, applied
explicit boundary casts or split assertions (for example non-negative signed-cap
checks before unsigned comparisons) instead of broad type changes.
No functional behavior changes are intended; this is a warning-resolution and
type-alignment cleanup.
Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Gaghik Khachatrian <gaghik.khachatrian@amd.com> Signed-off-by: James Lin <pinglei.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Tom Chung [Tue, 28 Apr 2026 08:41:12 +0000 (16:41 +0800)]
drm/amd/display: Fix CRC open failure during active rendering
[Why]
Opening the CRC data file during active rendering can fail with -EINVAL.
The wait for commit->hw_done returns remaining jiffies on success, but
the CRC path was treating that as an error.
[How]
Handle wait_for_completion_interruptible_timeout() correctly:
positive return as success, 0 as timeout, and negative as error.
Reviewed-by: Ray Wu <ray.wu@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: James Lin <pinglei.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ray Wu [Thu, 23 Apr 2026 07:06:12 +0000 (15:06 +0800)]
drm/amd/display: Fix white screen on boot with OLED panel
[Why]
During mode change, replay_event_general_ui may remain set on the old
stream while replay_event_hw_programming is set. This can re-enable
Replay too early before hardware programming is complete.
[How]
Clear replay_event_general_ui in the mode-change path when setting
replay_event_hw_programming to keep Replay blocked until programming
finishes, avoiding white screen on OLED panels after boot.
Reviewed-by: Sunpeng Li <sunpeng.li@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Signed-off-by: James Lin <pinglei.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
ChunTao Tso [Wed, 18 Mar 2026 06:12:18 +0000 (14:12 +0800)]
drm/amd/display: Fix refresh rate round up case
[Why & How]
fix refresh rate round up case
Reviewed-by: Robin Chen <robin.chen@amd.com> Signed-off-by: ChunTao Tso <ChunTao.Tso@amd.com> Signed-off-by: James Lin <pinglei.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Yang Wang [Thu, 7 May 2026 08:09:51 +0000 (16:09 +0800)]
drm/amdgpu: fix error return code in mes_v12_1_map_test_bo
The function mes_v12_1_map_test_bo incorrectly returned 0 unconditionallyon error path,
which would hide the real error code and mislead upperlayers about the failure status.
Fix it by returning the correct error code 'r' instead of 0.
Fixes: 44e5195fa3d4 ("drm/amdgpu/mes_v12_1: add mes self test"); Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Yang Wang [Mon, 27 Apr 2026 07:09:38 +0000 (15:09 +0800)]
drm/amd/pm: use the SMU multi-msgs helper in smu_v15_0_8
Convert the SMU15.0.8 enabled-feature query to
smu_cmn_send_smc_msg_with_params() so it uses the common SMU
multi-msgs helper.
No functional change intended.
Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Yang Wang [Mon, 27 Apr 2026 07:09:37 +0000 (15:09 +0800)]
drm/amd/pm: use the SMU multi-msgs helper in smu_v15_0_0
Convert the SMU15.0.0 table transfer path and enabled-feature query to
smu_cmn_send_smc_msg_with_params() so both paths use the common SMU
multi-msgs helper.
No functional change intended.
Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Mon, 20 Apr 2026 14:08:35 +0000 (16:08 +0200)]
drm/amdgpu: fix userq hang detection and reset
Fix lock inversions pointed out by Prike and Sunil. The hang detection
timeout *CAN'T* grab locks under which we wait for fences, especially
not the userq_mutex lock.
Then instead of this completely broken handling with the
hang_detect_fence just cancel the work when fences are processed and
re-start if necessary.
Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Mon, 20 Apr 2026 13:13:57 +0000 (15:13 +0200)]
drm/amdgpu: remove almost all calls to amdgpu_userq_detect_and_reset_queues
Well the reset handling seems broken on multiple levels.
As first step of fixing this remove most calls to the hang detection.
That function should only be called after we run into a timeout! And *NOT*
as random check spread over the code in multiple places.
Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Thu, 16 Apr 2026 13:32:11 +0000 (15:32 +0200)]
drm/amdgpu: rework amdgpu_userq_signal_ioctl v3
This one was fortunately not looking so bad as the wait ioctl path, but
there were still a few things which could be fixed/improved:
1. Allocating with GFP_ATOMIC was quite unnecessary, we can do that
before taking the userq_lock.
2. Use a new mutex as protection for the fence_drv_xa so that we can do
memory allocations while holding it.
3. Starting the reset timer is unnecessary when the fence is already
signaled when we create it.
4. Cleanup error handling, avoid trying to free the queue when we don't
even got one.
v2: fix incorrect usage of xa_find, destroy the new mutex on error
v3: cleanup ref ordering
Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Yang Wang [Mon, 27 Apr 2026 07:09:37 +0000 (15:09 +0800)]
drm/amd/pm: use the SMU multi-msgs helper in smu_v15_0
Convert the SMU15 table address messages to
smu_cmn_send_smc_msg_with_params() so they use the common SMU
multi-msgs helper instead of open-coding struct smu_msg_args.
No functional change intended.
Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Yang Wang [Mon, 27 Apr 2026 07:09:37 +0000 (15:09 +0800)]
drm/amd/pm: add SMU multi-msgs helpers
SMU15 driver messages can carry multiple input parameters and return
values, but callers still have to build struct smu_msg_args directly.
Add common SMU multi-msgs helpers in smu_cmn and reuse them in the
single-parameter wrapper and the shared table transfer path.
Keep smu_cmn_send_smc_msg() semantics unchanged for older callers.
No functional change intended.
Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Mon, 20 Apr 2026 18:18:43 +0000 (20:18 +0200)]
drm/amdgpu: remove deadlocks from amdgpu_userq_pre_reset
The purpose of a GPU reset is to make sure that fence can be signaled
again and the signal and resume workers can make progress again.
So waiting for the resume worker or any fence in the GPU reset path is
just utterly nonsense.
Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Billy Tsai [Wed, 6 May 2026 08:06:20 +0000 (16:06 +0800)]
pinctrl: aspeed: Add AST2700 SoC0 support
Add pinctrl support for the SoC0 instance of the ASPEED AST2700.
AST2700 consists of two interconnected SoC instances, each with its own
pinctrl register block.
The SoC0 pinctrl hardware closely follows the design found in previous
ASPEED BMC generations, allowing the driver to build upon the common
ASPEED pinctrl infrastructure.
Signed-off-by: Billy Tsai <billy_tsai@aspeedtech.com> Signed-off-by: Linus Walleij <linusw@kernel.org>
Chen-Yu Tsai [Tue, 5 May 2026 10:40:55 +0000 (18:40 +0800)]
pinctrl: mediatek: common-v1: bypass pinctrl GPIO layer in set GPIO direction
pinctrl_gpio_direction_input() / pinctrl_gpio_direction_output() take
the pinctrl mutex. This causes a gpiochip operations to need to sleep.
Worse yet, the .can_sleep field in the gpiochip is not set. This causes
the shared GPIO proxy to trip over, as it uses gpiod_cansleep() to check
whether it can use a spinlock or needs a mutex. In this case, it ends
up taking a spinlock, then calls pinctrl_gpio_direction_output(), which
takes a mutex. This causes a huge warning.
Since the Mediatek hardware has separate clear/set registers, there is
no risk of clobbering other bits like with a read-modify-write pattern.
Also, once the GPIO function is selected / muxed in, further GPIO
operations do not involve pinctrl operations or state. The GPIO direction
and level values do not require toggling the pinmux or any other pin config
options.
Switch to directly calling mtk_pmx_gpio_set_direction() in the GPIO set
direction callbacks to avoid taking the pinctrl mutex. Drop the
.gpio_set_direction field in mtk_pmx_ops to signal we are no longer using
the pinctrl GPIO layer for setting the direction.
Chen-Yu Tsai [Tue, 5 May 2026 10:39:57 +0000 (18:39 +0800)]
pinctrl: mediatek: paris: bypass pinctrl GPIO layer in set GPIO direction
pinctrl_gpio_direction_input() / pinctrl_gpio_direction_output() take
the pinctrl mutex. This causes a gpiochip operations to need to sleep.
Worse yet, the .can_sleep field in the gpiochip is not set. This causes
the shared GPIO proxy to trip over, as it uses gpiod_cansleep() to check
whether it can use a spinlock or needs a mutex. In this case, it ends
up taking a spinlock, then calls pinctrl_gpio_direction_output(), which
takes a mutex. This causes a huge warning.
While this class of Mediatek hardware does not have separate clear/set
registers, the pinctrl context has a spinlock that is taken whenever
a register read-modify-write is done. Also, once the GPIO function is
selected / muxed in, further GPIO operations do not involve pinctrl
operations or state. The GPIO direction and level values do not require
toggling the pinmux or any other pin config options.
Switch to directly calling mtk_pinmux_gpio_set_direction() in the GPIO
set direction callbacks to avoid taking the pinctrl mutex. Drop the
.gpio_set_direction field in mtk_pmxops to signal we are no longer using
the pinctrl GPIO layer for setting the direction.
Perry Yuan [Wed, 15 Apr 2026 02:34:03 +0000 (10:34 +0800)]
drm/amdkfd: bump KFD ioctl minor version to 1.23
Bump `KFD_IOCTL_MINOR_VERSION` from 22 to 23 and document version 1.23
in `kfd_ioctl.h` so userspace can detect profiler ioctl support.
Signed-off-by: Perry Yuan <perry.yuan@amd.com> Suggested-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Perry Yuan [Tue, 10 Mar 2026 02:39:08 +0000 (10:39 +0800)]
drm/amdgpu: fix ptl state isssue after GPU reset or suspend
Fix this by skipping the sysfs disable mapping when the GPU is
currently undergoing a reset or suspend flow.
Additionally, add debug logging in psp_ptl_invoke() to better
trace PTL state and format queries/updates cmd.
Signed-off-by: Perry Yuan <perry.yuan@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Perry Yuan [Fri, 13 Mar 2026 08:31:07 +0000 (16:31 +0800)]
drm/amdgpu/gfx9.4.3: skip PTL disable during GPU reset
During RAS UE-triggered GPU reset, gfx_v9_4_3_hw_fini() attempts to
send a PTL disable command to PSP. Since PSP is unresponsive at that
point, this produces spurious error logs on all hive nodes:
PTL command 0xa0000001 failed, PSP response status: 0xFFFFFFFF
PTL initialization failed (-5)
Skip the PTL disable command when GPU reset is in progress, as PTL
will be properly re-initialized during post-reset recovery via
gfx_v9_4_3_late_init().
Signed-off-by: Perry Yuan <perry.yuan@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Perry Yuan [Thu, 26 Feb 2026 09:50:33 +0000 (17:50 +0800)]
drm/amdgpu: Move KFD sched stop/start into PTL control path
Move amdgpu_amdkfd_stop/start_sched calls from kfd_ptl_control()
into amdgpu_ptl_perf_monitor_ctrl() so all PTL callers (KFD ioctl,
sysfs, GFX init) get consistent scheduling management.
Add amdgpu_amdkfd_stop/start_sched_all() wrappers to stop and
restart KFD scheduling on all nodes without assuming node ID ordering.
v3:
* call start/stop for PTL Set Only
v2:
* move the stop/start sched function to
amdgpu_ptl_perf_monitor_ctrl(Lijo)
* add wrapper amdgpu_amdkfd_stop_sched_all and
amdgpu_amdkfd_start_sched_all (Lijo)
Signed-off-by: Perry Yuan <perry.yuan@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Perry Yuan [Mon, 23 Feb 2026 15:20:50 +0000 (23:20 +0800)]
drm/amdgpu: add SPI idle check for GC 9.4.4 in gfx_v9_4_3_is_idle()
GC 9.4.4 uses SPI busy status for idle detection instead of GRBM GUI_ACTIVE.
Add version check to use SPI_BUSY for 9.4.4 while keeping GRBM_STATUS
GUI_ACTIVE check for other GC versions.
v2: move this check into amdgpu_ptl_perf_monitor_ctrl(Lijo)
Signed-off-by: Perry Yuan <perry.yuan@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Perry Yuan [Sun, 8 Feb 2026 16:42:12 +0000 (00:42 +0800)]
drm/amdgpu: Wait for GFX idle before PTL state transition
Ensure GFX engine is idle before switching PTL state to prevent
register access violations and CP hang. This addresses the race
condition where in-flight GPU commands could conflict with PTL
state changes.
Signed-off-by: Perry Yuan <perry.yuan@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Perry Yuan [Sun, 8 Feb 2026 16:42:09 +0000 (00:42 +0800)]
drm/amdgpu: Track PTL disable requests by source
Use a bitmap to track PTL disable requests from sysfs and profiler.
PTL is only re-enabled once all sources have released their disable
requests, avoiding premature enablement.
Signed-off-by: Perry Yuan <perry.yuan@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Perry Yuan [Sun, 8 Feb 2026 16:42:08 +0000 (00:42 +0800)]
drm/amdkfd: suspend scheduler during PTL re-enabling
Stop the scheduler before releasing the PTL disable request to ensure
the GPU is quiescent during the PTL state transition. This prevents
potential queue preemption failures and GPU resets caused by modifying
PTL state while waves are executing
v1->v2:
only stop/start the scheduler when the PTL state actually needs to transition(Yifan)
Signed-off-by: Perry Yuan <perry.yuan@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
v3 changes:
* move N/A to previous format in format show(Alex)
* fix format check for format store(Alex)
* drop the ptl declarations into amdgpu_ptl.h(Alex)
v2 changes:
* add usage commands in commit info (Alex)
* move amdgpu_ptl_fmt into kgd_kfd_interface.h (Alex)
Signed-off-by: Perry Yuan <perry.yuan@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Perry Yuan [Sun, 8 Feb 2026 16:42:05 +0000 (00:42 +0800)]
Documentation/amdgpu: Add documentation for Peak Tops Limiter (PTL) sysfs interface
The PTL (Peak Tops Limiter) feature exposes per-GPU sysfs files under
/sys/class/drm/cardX/device/ptl/ to allow users to enable or disable PTL,
configure preferred data formats, and query supported formats. The usage
of these sysfs files is not always obvious, so add documentation to
describe their purpose and provide concrete usage examples.
V3 changes:
* format show will display preferred formats instead of N/A (Alex)
Signed-off-by: Perry Yuan <perry.yuan@amd.com> Suggested-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Perry Yuan [Sun, 8 Feb 2026 16:42:03 +0000 (00:42 +0800)]
drm/amdgpu: add PTL enable/query gfx control support for GC 9.4.4
Introduce hardware detection, runtime state tracking and a
kgd->ptl_ctrl() callback to enable/disable/query PTL via the
PSP performance-monitor interface (commands 0xA0000000/1).
The driver now exposes PTL capability to KFD and keeps the
software state in sync with the hardware.
Signed-off-by: Perry Yuan <perry.yuan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Perry Yuan [Sun, 8 Feb 2026 16:42:02 +0000 (00:42 +0800)]
drm/amdgpu: add psp interfaces for peak tops limiter driver
Introduce a Peak Tops Limiter (PTL) driver that dynamically caps
engine frequency to ensure delivered TOPS never exceeds a defined
TOPS_limit. This initial implementation provides core data structures
and kernel-space interfaces (set/get, enable/disable) to manage PTL state.
PTL performs a firmware handshake to initialize its state and update
predefined format types. It supports updating these format types at
runtime while user-space tools automatically switch PTL state, and
also allows explicitly switching PTL state via newly added commands.
Signed-off-by: Perry Yuan <perry.yuan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Benjamin Welton [Sun, 8 Feb 2026 16:42:00 +0000 (00:42 +0800)]
amd/amdkfd: Add kfd_ioctl_profiler to contain profiler kernel driver changes
kfd_ioctl_profiler takes a similar approach to that of
kfd_ioctl_dbg_trap (which contains debugger related IOCTL
services) where kfd_ioctl_profiler will contain all profiler
related IOCTL services. The IOCTL is designed to be expanded
as needed to support additional profiler functionality.
The current functionality of the IOCTL is to allow for profilers
which need PMC counters from GPU devices to both signal to other
profilers that may be on the system that the device has active PMC
profiling taking place on it (multiple PMC profilers on the same
device can result in corrupted counter data) and to setup the device
to allow for the collection of SQ PMC data on all queues on the device.
For PMC data for the SQ block (such as SQ_WAVES) to be available
to a profiler, mmPERFCOUNT_ENABLE must be set on the queues. When
profiling a single process, the profiler can inject PM4 packets into
each queue to turn on PERFCOUNT_ENABLE. When profiling system wide,
the profiler does not have this option and must have a way to turn
on profiling for queues in which it cannot inject packets into directly.
Accomplishing this requires a few steps:
1. Checking if the user has the necessary permissions to profile system
wide on the device. This check uses the same check that linux perf
uses to determine if a user has the necessary permissions to profile
at this scope (primarily if the process has CAP_SYS_PERFMON or is root).
2. Locking the device for profiling. This is done by setting a lock bit
on the device struct and storing the process that locked the device.
3. Iterating all queues on the device and issuing an MQD Update to enable
perfcounting on the queues.
4. Actions to cleanup if the process exits or releases the lock.
The IOCTL also contains a link to the existing PC Sampling IOCTL as well.
This is per a suggestion that we should potentially remove the PC Sampling
IOCTL to have it be a part of the profiler IOCTL. This is a future change.
In addition, we do expect to expand the profiler IOCTL to include
additional profiler functionality in the future (which necessitates the
use of a version number).
Signed-off-by: Benjamin Welton <benjamin.welton@amd.com> Signed-off-by: Perry Yuan <perry.yuan@amd.com> Acked-by: Kent Russell <kent.russell@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
device property: initialize the remaining fields of fwnode_handle in fwnode_init()
If a firmware node is allocated on the stack (for instance: temporary
software node whose life-time we control) or on the heap - but using a
non-zeroing allocation function - and initialized using fwnode_init(),
its secondary pointer will contain uninitialized memory which likely
will be neither NULL nor IS_ERR() and so may end up being dereferenced
(for example: in dev_to_swnode()). Set fwnode->secondary to NULL on
initialization. While at it: initialize the remaining fields of struct
fwnode_handle too just to be sure.
Cc: stable@vger.kernel.org Fixes: 01bb86b380a3 ("driver core: Add fwnode_init()") Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com> Link: https://patch.msgid.link/20260511074927.9473-1-bartosz.golaszewski@oss.qualcomm.com
[ Fix typo in commit message. - Danilo ] Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Frank Li [Tue, 5 May 2026 16:09:02 +0000 (12:09 -0400)]
pinctrl: imx1: Allow parsing DT without function nodes
The old format to define pinctrl settings for imx in DT has two hierarchy
levels. The first level are function device nodes. The second level are
pingroups which contain a property fsl,pins. The original ntention was to
define all pin functions in a single dtsi file and just reference the
correct ones in the board files.
The commit ("5fcdf6a7ed95e pinctrl: imx: Allow parsing DT without function
nodes") already make moden i.MX chip support flatten layout.
Make legacy chipes (more than 15 years) support this flatten layout also.
Fixes: e948cbdc41d6f ("ARM: dts: imx: remove redundant intermediate node in pinmux hierarchy") Tested-by: Sébastien Szymanski <sebastien.szymanski@armadeus.com> Signed-off-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Linus Walleij <linusw@kernel.org>
Andrea Righi [Mon, 11 May 2026 19:19:40 +0000 (21:19 +0200)]
sched_ext: Replace tryget_task_struct() with get_task_struct()
The tryget_task_struct() calls in scx_sub_disable(),
scx_root_enable_workfn() and scx_sub_enable_workfn() can never fail at
the points they're invoked:
- scx_root_enable_workfn() iterates over scx_tasks under scx_tasks_lock
and rq lock. sched_ext_dead() removes tasks from scx_tasks under the
same scx_tasks_lock before put_task_struct_rcu_user() runs in
finish_task_switch(). So any task observed in scx_tasks must have
usage > 0; put_task_struct_rcu_user() hasn't been called and the
delayed_put_task_struct() callback that decrements usage cannot have
been queued.
- scx_sub_disable() and scx_sub_enable_workfn() iterate via
css_task_iter, which takes a reference on each task in
css_task_iter_next() and holds it until the next iter_next() call, so
usage > 0 is guaranteed by the iter itself.
The actual filter for dead tasks is the SCX_TASK_DEAD check inside
scx_task_iter_next_locked(), not tryget; tryget only fails on zero
usage, a state that can't be reached for tasks visible to these iters.
Commit b7d4b28db7da ("sched_ext: Use SCX_TASK_READY test instead of
tryget_task_struct() during class switch") removed an analogous tryget
in the class-switch loop. Convert the remaining tryget calls to plain
get_task_struct() and update the comment in scx_root_enable_workfn()
that suggested tasks could be observed with zero @usage waiting for an
RCU grace period.
Breno Leitao [Mon, 11 May 2026 12:26:55 +0000 (05:26 -0700)]
workqueue: forbid TEST_WORKQUEUE from being built-in
The benchmark drives the workqueue's affinity_scope through sysfs by
filp_open()'ing /sys/bus/workqueue/devices/bench_wq/affinity_scope. When
CONFIG_TEST_WORKQUEUE=y, the module_init runs during kernel init before
userspace has mounted sysfs, so every open returns -ENOENT and the
benchmark loop spins emitting:
test_workqueue: open /sys/bus/workqueue/devices/bench_wq/affinity_scope failed: -2
Mirror the TEST_BPF pattern and add "depends on m" so Kconfig will not
let this be built into the kernel image, and document the reason in the
help text.
Breno Leitao [Mon, 11 May 2026 12:14:18 +0000 (05:14 -0700)]
workqueue: drop apply_wqattrs_lock()/unlock() wrappers
The apply_wqattrs_lock()/unlock() helpers were introduced by
commit a0111cf6710b ("workqueue: separate out and refactor the locking
of applying attrs") to encapsulate the get_online_cpus() (later
cpus_read_lock()) + mutex_lock(&wq_pool_mutex) acquire pair that was
duplicated across the apply-attrs paths.
Since commit 19af45757383 ("workqueue: Remove cpus_read_lock() from
apply_wqattrs_lock()") removed the cpus_read_lock() (pwq creation and
installation now operate on wq_online_cpumask, so CPU hotplug no longer
needs to be excluded), the wrappers have been one-line forwarders to
mutex_lock(&wq_pool_mutex)/mutex_unlock(&wq_pool_mutex).
They no longer encode any non-trivial locking rule and obscure the fact
that callers just take the existing wq_pool_mutex. This align with the
"unnecessary" helpers that got discussed in [1]
Inline the eight call sites and remove the wrappers. No functional
change.
Raphael Zimmer [Tue, 5 May 2026 09:08:12 +0000 (11:08 +0200)]
libceph: Fix potential out-of-bounds access in osdmap_decode()
When decoding osd_state and osd_weight from an incoming osdmap in
osdmap_decode(), both are decoded for each osd, i.e., map->max_osd
times. The ceph_decode_need() check only accounts for
sizeof(*map->osd_weight) once. This can potentially result in an
out-of-bounds memory access if the incoming message is corrupted such
that the max_osd value exceeds the actual content of the osdmap message.
This patch fixes the issue by changing the corresponding part in the
ceph_decode_need() check to account for
map->max_osd*sizeof(*map->osd_weight).
Brendan Jackman [Mon, 23 Mar 2026 14:25:07 +0000 (14:25 +0000)]
x86: Update comment about pgd_list
This venerable comment got detached from its context when the code moved
in commit 394158559d4c ("x86: move all the pgd_list handling to one
place"). Put it back next to its context. It was originally on
pgd_list_add() but it actually describes pgd_list so put it there.
While moving it, update it to strip away stale and superfluous info.
pageattr.c doesn't exist any more. pgd_list is now required
for all x86 architectures. Also be slightly more precise about what PGDs
are in this list.
[ dhansen: tweak and trim the updated comment a bit ]
Sven Eckelmann [Sun, 10 May 2026 09:31:03 +0000 (11:31 +0200)]
batman-adv: tp_meter: fix tp_vars reference leak in receiver shutdown
The receiver shutdown timer handler, batadv_tp_receiver_shutdown(), is
responsible for releasing the tp_vars reference it holds. However, the
existing logic for coordinating this release with batadv_tp_stop_all() was
flawed.
timer_shutdown_sync() guarantees the timer will not fire again after it
returns, but it returns non-zero only when the timer was pending at the
time of the call. If the timer had already expired (and
batadv_tp_stop_all() would unsucessfully try to rearm itself),
batadv_tp_stop_all() skips its batadv_tp_vars_put(), and
batadv_tp_receiver_shutdown() fails to put its own reference as well.
Fix this by introducing a new atomic variable receiving that is set to 1
when the receiver is initialized and cleared atomically with atomic_xchg()
by whichever side claims it first. Only the side that observes the
transition from 1 to 0 is responsible for releasing the tp_vars timer
reference, eliminating the uncertainty.
Cc: stable@kernel.org Fixes: 3d3cf6a7314a ("batman-adv: stop tp_meter sessions during mesh teardown") Signed-off-by: Sven Eckelmann <sven@narfation.org>
Luxiao Xu [Mon, 11 May 2026 16:52:09 +0000 (18:52 +0200)]
batman-adv: fix tp_meter counter underflow during shutdown
batadv_tp_sender_shutdown() unconditionally decrements the "sending"
atomic counter. If multiple paths (e.g. timeout, user cancel, and
normal finish) call this function, the counter can underflow to -1.
Since the sender logic treats any non-zero value as "still sending",
a negative value causes the sender kthread to loop indefinitely.
This leads to a use-after-free when the interface is removed while
the zombie thread is still active.
Fix this by using atomic_xchg() to ensure the counter only transitions
from 1 to 0 once.
Fixes: 33a3bb4a3345 ("batman-adv: throughput meter implementation") Cc: stable@kernel.org Reported-by: Yuan Tan <yuantan098@gmail.com> Reported-by: Yifan Wu <yifanwucs@gmail.com> Reported-by: Juefei Pu <tomapufckgml@gmail.com> Reported-by: Xin Liu <bird@lzu.edu.cn> Signed-off-by: Luxiao Xu <rakukuip@gmail.com> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
[sven: added missing change in batadv_tp_send] Signed-off-by: Sven Eckelmann <sven@narfation.org>
Jens Axboe [Mon, 11 May 2026 16:58:56 +0000 (10:58 -0600)]
io_uring: hold uring_lock across io_kill_timeouts() in cancel path
io_uring_try_cancel_requests() dropped ctx->uring_lock before calling
io_kill_timeouts(), which walks each timeout's link chain via
io_match_task() to test REQ_F_INFLIGHT. With chain mutation now
serialized by ctx->uring_lock, that walk needs the lock too.
Jens Axboe [Mon, 11 May 2026 16:58:50 +0000 (10:58 -0600)]
io_uring: defer linked-timeout chain splice out of hrtimer context
io_link_timeout_fn() is the hrtimer callback that fires when a linked
timeout expires. It currently calls io_remove_next_linked(prev) under
ctx->timeout_lock to splice the timeout request out of the link chain.
This is the only chain-mutation site that runs without ctx->uring_lock,
because hrtimer callbacks cannot take a mutex. Defer the splicing until
the task_work callback.
Jens Axboe [Mon, 11 May 2026 16:58:38 +0000 (10:58 -0600)]
io_uring: hold uring_lock when walking link chain in io_wq_free_work()
io_wq_free_work() calls io_req_find_next() from io-wq worker context,
which reads and clears req->link without holding any lock. This can
potentially race with other paths that mutate the same chain under
ctx->uring_lock.
Take ctx->uring_lock around the io_req_find_next() call. Only requests
with IO_REQ_LINK_FLAGS reach this path, which is not the hot path.
nvme: fix race condition between connected uevent and STARTED_ONCE flag
When a controller connects, nvme_start_ctrl() emits the
"NVME_EVENT=connected" uevent and sets the NVME_CTRL_STARTED_ONCE flag.
Currently, the uevent is emitted before the flag is set. This creates a
race condition for userspace tools (like udev rules) that might rely on
the "connected" event to configure other attributes.
Swap the order of operations in nvme_start_ctrl() so that the
NVME_CTRL_STARTED_ONCE flag is set before the uevent is sent. This
guarantees that the admin_timeout can already be changed when userspace
is notified.
Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Hannes Reinecke <hare@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
Rosen Penev [Mon, 30 Mar 2026 21:35:28 +0000 (14:35 -0700)]
ARM: omap2: simplify allocation for omap_device
Use a flexible array member (FAM) to combine hwmods array allocation
with the omap_device structure. This reduces the number of allocations
from two separate calls (one for the device, one for the array) to a
single allocation, improving efficiency and reducing memory fragmentation.
The FAM approach also enables bounds checking through __counted_by(),
which provides runtime verification that array accesses stay within
the allocated size. This improves security and helps catch bugs during
development.
Simplify error handling by removing the unnecessary multi-label goto
pattern. The new code is more straightforward: allocate, verify, copy
data, and either return success or error immediately.
Also removes the now-redundant kfree(od->hwmods) in omap_device_delete()
since the hwmods array is now embedded in the structure rather than
separately allocated.
ACPI: driver: Check ACPI_COMPANION() against NULL during probe
Since every platform driver can be forced to match a device that doesn't
match its list of device IDs because of device_match_driver_override(),
platform drivers that rely on the existence of a device's ACPI companion
object should verify its presence.
Accordingly, add requisite ACPI_COMPANION() or ACPI_HANDLE() checks
against NULL to 13 platform drivers handling core ACPI devices.
Also change the value returned by the ACPI thermal zone driver when
the device's ACPI companion is not present to -ENODEV for consistency
with the other drivers.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Hans de Goede <johannes.goede@oss.qualcomm.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://patch.msgid.link/4516068.ejJDZkT8p0@rafael.j.wysocki Cc: 7.0+ <stable@vger.kernel.org> # 7.0+
Jann Horn [Mon, 11 May 2026 15:55:11 +0000 (08:55 -0700)]
exit: prevent preemption of oopsing TASK_DEAD task
When an already-exiting task oopses, make_task_dead() currently calls
do_task_dead() with preemption enabled. That is forbidden:
do_task_dead() calls __schedule(), which has a comment saying "WARNING:
must be called with preemption disabled!".
If an oopsing task is preempted in do_task_dead(), between becoming
TASK_DEAD and entering the scheduler explicitly, bad things happen:
finish_task_switch() assumes that once the scheduler has switched away
from a TASK_DEAD task, the task can never run again and its stack is no
longer needed; but that assumption apparently doesn't hold if the dead
task was preempted (the SM_PREEMPT case).
This means that the scheduler ends up repeatedly dropping references on
the dead task's stack, which can lead to use-after-free or double-free
of the entire task stack; in other words, two tasks can end up running
on the same stack, resulting in various kinds of memory corruption.
(This does not just affect "recursively oopsing" tasks; it is enough to
oops once during task exit, for example in a file_operations::release
handler)
Fixes: 7f80a2fd7db9 ("exit: Stop poorly open coding do_task_dead in make_task_dead") Cc: stable@kernel.org Signed-off-by: Jann Horn <jannh@google.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ulf Hansson [Mon, 11 May 2026 15:36:21 +0000 (17:36 +0200)]
mmc: Merge branch fixes into next
Merge the mmc fixes for v7.1-rc[n] into the next branch, to allow them to
get tested together with the mmc changes that are targeted for the next
release.
Iker Pedrosa [Mon, 11 May 2026 08:53:59 +0000 (10:53 +0200)]
mmc: sdhci-of-k1: add comprehensive SDR tuning support
Implement software tuning algorithm to enable UHS-I SDR modes for SD
card operation and HS200 mode for eMMC. This adds both TX and RX delay
line tuning based on the SpacemiT K1 controller capabilities.
Algorithm features:
- Add tuning register definitions (RX_CFG, DLINE_CTRL, DLINE_CFG)
- Conditional tuning: only for high-speed modes (≥100MHz)
- TX tuning: configure transmit delay line with optimal values
(dline_reg=0, delaycode=127) to ensure optimal signal output timing
- RX tuning: single-pass window detection algorithm testing full
delay range (0-255) to find optimal receive timing window
- Retry mechanism: multiple fallback delays within optimal window
for improved reliability
Iker Pedrosa [Mon, 11 May 2026 08:53:58 +0000 (10:53 +0200)]
mmc: sdhci-of-k1: add regulator and pinctrl voltage switching support
Add voltage switching infrastructure for UHS-I modes by integrating both
regulator framework (for supply voltage control) and pinctrl state
switching (for pin drive strength optimization).
- Add regulator supply parsing and voltage switching callback
- Add optional pinctrl state switching between "default" (3.3V) and
"state_uhs" (1.8V) configurations
- Enable coordinated voltage and pin configuration changes for UHS modes
This provides complete voltage switching support while maintaining
backward compatibility when pinctrl states are not defined.
Iker Pedrosa [Mon, 11 May 2026 08:53:57 +0000 (10:53 +0200)]
mmc: sdhci-of-k1: enable essential clock infrastructure for SD operation
Ensure SD card pins receive clock signals by enabling pad clock
generation and overriding automatic clock gating. Required for all SD
operation modes.
The SDHC_GEN_PAD_CLK_ON setting in LEGACY_CTRL_REG is safe for both SD
and eMMC operation as both protocols use the same physical MMC interface
pins and require proper clock signal generation at the hardware level
for signal integrity and timing.
Additional SD-specific clock overrides (SDHC_OVRRD_CLK_OEN and
SDHC_FORCE_CLK_ON) are conditionally applied only for SD-only
controllers to handle removable card scenarios.
Iker Pedrosa [Mon, 11 May 2026 08:53:56 +0000 (10:53 +0200)]
dt-bindings: mmc: spacemit,sdhci: add pinctrl support for voltage switching
Document pinctrl properties to support voltage-dependent pin
configuration switching for UHS-I SD card modes.
Add optional pinctrl-names property with two states:
- "default": For 3.3V operation with standard drive strength
- "state_uhs": For 1.8V operation with optimized drive strength
These pinctrl states allow the SDHCI driver to coordinate voltage
switching with pin configuration changes, ensuring proper signal
integrity during UHS-I mode transitions.
====================
bpf: Fix call offset truncation and OOB read in bpf_patch_call_args()
From: Yazhou Tang <tangyazhou518@outlook.com>
This patchset addresses a silent truncation bug in the BPF verifier that
occurs when a bpf-to-bpf call involves a massive relative jump offset.
Additionally, it fixes a pre-existing out-of-bounds (OOB) read issue
in the interpreter fallback path.
Because the BPF instruction set utilizes a 32-bit imm field for bpf-to-bpf
calls, implicitly downcasting it to the 16-bit insn->off in bpf_patch_call_args()
causes incorrect call targets or subprog ID resolution for large BPF programs.
While fixing this by swapping the imm and off fields, it was discovered that
the original code also had a load-time OOB read vulnerability when the stack
depth exceeds MAX_BPF_STACK during JIT fallback.
Patch 1/3 fixes the pre-existing OOB read in bpf_patch_call_args(). It
changes the function to return an int and explicitly rejects the JIT
fallback if the stack depth exceeds MAX_BPF_STACK, preventing a potential
stack buffer overflow.
Patch 2/3 fixes the s16 truncation bug.
1. Keep the original imm field unchanged and use the off field to store
the interpreter function index.
2. Adjust the JMP_CALL_ARGS case in ___bpf_prog_run() accordingly.
3. Restore the legacy xlated dump layout in bpf_insn_prepare_dump().
Patch 3/3 introduces a selftest for this fix.
---
Change log:
v10:
1. Make the error log in patch 1/3 more clear. (Kuohai)
2. Drop bpftool and disasm_helpers.c changes, and instead restore the
legacy xlated dump layout in bpf_insn_prepare_dump(). This avoids
requiring bpftool compatibility handling. (Quentin and Alexei)
v9: https://lore.kernel.org/bpf/20260429171904.107244-1-tangyazhou@zju.edu.cn/
1. Modify the selftest in patch 3/3: use __clobber_all in inline asm.
(Sashiko AI reviewer)
v8: https://lore.kernel.org/bpf/20260429105608.92741-1-tangyazhou@zju.edu.cn/
1. Update cfg_partition_funcs() in bpftool to use insn->imm for call target
calculation. (Sashiko AI reviewer)
2. Modify the selftest in patch 3/3: add a large padding before the call
instruction, preventing the kernel panic on kernel without the fix.
(Sashiko AI reviewer)
3. Modify the selftest in patch 3/3: make it more clear.
v7: https://lore.kernel.org/bpf/20260421144504.823756-1-tangyazhou@zju.edu.cn/
1. Rebase the patchset to the bpf-next tree to resolve the apply conflict.
(Alexei)
2. Add Patch 1/3 to properly fix a pre-existing OOB read in bpf_patch_call_args().
(Sashiko AI reviewer)
v6: https://lore.kernel.org/bpf/20260412170334.716778-1-tangyazhou@zju.edu.cn/
1. Use a different but clearer approach to resolve this issue: keeping
the original imm field unchanged and using the off field to store the
interpreter function index. (Kuohai)
2. Update the related dumper code and remove a previous workaround in the
selftests disasm helpers, which is no longer needed after this fix.
v5: https://lore.kernel.org/bpf/20260326090133.221957-1-tangyazhou@zju.edu.cn/
1. Some minor changes in commit messages. (AI Reviewer)
v4: https://lore.kernel.org/bpf/20260326063329.10031-1-tangyazhou@zju.edu.cn/
1. Remove some redundant commit messages of patch 2/3. (Emil)
2. Change the number of instructions in padding_subprog() from 200,000
to 32,765, which is the minimum number of instructions required to
trigger the verifier failure. (Emil)
v3: https://lore.kernel.org/bpf/20260323122254.98540-1-tangyazhou@zju.edu.cn/
1. Resend to fix a typo in v2 and add "Fixes" tag. The rest of the changes
are identical to v2.
v2 (incorrect): https://lore.kernel.org/bpf/20260323081748.106603-1-tangyazhou@zju.edu.cn/
1. Move the s16 boundary check from fixup_call_args() to bpf_patch_call_args(),
and change the return type of bpf_patch_call_args() to int. (Emil)
2. Add Patch 3/3 to fix the incorrect subprog ID in dumped bpf_pseudo_call
instructions, which is caused by the same truncation issue. (Puranjay)
3. Refine the new selftest for clarity and add detailed comments explaining
the test design. (Emil)
Yazhou Tang [Wed, 6 May 2026 09:47:14 +0000 (17:47 +0800)]
selftests/bpf: Add test for large offset bpf-to-bpf call
Add a selftest to verify the verifier and JIT behavior when handling
bpf-to-bpf calls with relative jump offsets exceeding the s16 boundary.
The test utilizes an inline assembly block with ".rept 32765" to generate
a massive dummy subprogram. By placing this padding between the main
program and the target subprogram, it forces the verifier to process a
bpf-to-bpf call where the imm field exceeds the s16 range.
- When JIT is enabled, it asserts that the program is successfully loaded
and executes correctly to return the expected value. Since the fix
does not change the JIT behavior, the test passes whether the fix is
applied or not.
- When JIT is disabled, it also asserts that the program is successfully
loaded and executes correctly to return the expected value 3.
- Before the fix, the verifier rewrites the call instruction with a
truncated offset (here 32768 -> -32768) and lets it pass. When the
program is executed, the call instruction will go to a wrong target
(the landing pad) instead of the intended subprogram, then return -1
and fail.
- After the fix, the verifier correctly handles the large offset and
allows it to pass. The program then executes correctly to return the
expected value 3.
Yazhou Tang [Wed, 6 May 2026 09:47:13 +0000 (17:47 +0800)]
bpf: Fix s16 truncation for large bpf-to-bpf call offsets
Currently, the BPF instruction set allows bpf-to-bpf calls (or internal
calls, pseudo calls) to use a 32-bit imm field to represent the relative
jump offset.
However, when JIT is disabled or falls back to the interpreter, the
verifier invokes bpf_patch_call_args() to rewrite the call instruction.
In this function, the 32-bit imm is downcast to s16 and stored in the off
field.
If the original imm exceeds the s16 range (i.e., a jump offset greater
than 32767 instructions), this downcast silently truncates the offset,
resulting in an incorrect call target.
Fix this by:
1. In bpf_patch_call_args(), keeping the imm field unchanged and using the
off field to store the index of the interpreter function.
2. In ___bpf_prog_run() for the JMP_CALL_ARGS case, retrieving the
interpreter function pointer from the interpreters_args array using the
off field as the index, and passing the original imm to calculate the
last argument of the interpreter function.
After these changes, the truncation issue is resolved, and __bpf_call_base_args
is also no longer needed and can be removed, which makes the code cleaner.
Performance: In ___bpf_prog_run() for the JMP_CALL_ARGS case, changing the
retrieval of the interpreter function pointer from pointer addition to
direct array indexing improves performance. The possible reason is that the
latter has better instruction-level parallelism. See the v5 discussion [1]
for more details.
To avoid requiring bpftool changes, keep the new imm/off encoding internal
and restore the legacy xlated dump layout in bpf_insn_prepare_dump().
For bpf-to-bpf call offsets that do not fit in s16, export off as 0 instead
of a truncated and misleading value.
Fixes: 1ea47e01ad6e ("bpf: add support for bpf_call to interpreter") Fixes: 7105e828c087 ("bpf: allow for correlation of maps and helpers in dump") Suggested-by: Xu Kuohai <xukuohai@huaweicloud.com> Suggested-by: Puranjay Mohan <puranjay@kernel.org> Co-developed-by: Tianci Cao <ziye@zju.edu.cn> Signed-off-by: Tianci Cao <ziye@zju.edu.cn> Co-developed-by: Shenghao Yuan <shenghaoyuan0928@163.com> Signed-off-by: Shenghao Yuan <shenghaoyuan0928@163.com> Signed-off-by: Yazhou Tang <tangyazhou518@outlook.com> Link: https://lore.kernel.org/r/20260506094714.419842-3-tangyazhou@zju.edu.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Yazhou Tang [Wed, 6 May 2026 09:47:12 +0000 (17:47 +0800)]
bpf: Fix out-of-bounds read in bpf_patch_call_args()
The interpreters_args array only accommodates stack depths up to
MAX_BPF_STACK (512 bytes). However, do_misc_fixups() may allow a larger
stack depth if JIT is requested.
If JIT compilation later fails and falls back to the interpreter, the
verifier invokes bpf_patch_call_args() with this oversized stack depth.
This causes a load-time out-of-bounds (OOB) read when calculating the
interpreter function pointer index.
Fix this by changing bpf_patch_call_args() to return an int and explicitly
rejecting the JIT fallback (returning -EINVAL) if the stack depth exceeds
MAX_BPF_STACK.
Fixes: 1ea47e01ad6e ("bpf: add support for bpf_call to interpreter") Co-developed-by: Tianci Cao <ziye@zju.edu.cn> Signed-off-by: Tianci Cao <ziye@zju.edu.cn> Co-developed-by: Shenghao Yuan <shenghaoyuan0928@163.com> Signed-off-by: Shenghao Yuan <shenghaoyuan0928@163.com> Signed-off-by: Yazhou Tang <tangyazhou518@outlook.com> Acked-by: Xu Kuohai <xukuohai@huawei.com> Link: https://lore.kernel.org/r/20260506094714.419842-2-tangyazhou@zju.edu.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>
mmc: via-sdmmc: Simplify initialisation of pci_device_id array
Instead of assigning the pci_device_id members using a list (which is
hard to read as you need to look at the order of the members in that
struct in parallel) use the PCI_VDEVICE() convenience macro to compact
the initialisation while improving readability.
Also drop trailing zeros that the compiler will care about then.
The change doesn't introduce binary changes to the compiled driver,
verified on both ARCH=x86 and ARCH=arm64.
When uart_flush_buffer() runs before the DMA completion IRQ is delivered,
the following race can occur (all steps serialized by uart_port_lock):
1. DMA starts: tx_remaining = N, kfifo contains N bytes
2. DMA completes in hardware; IRQ is pending but not yet delivered
3. uart_flush_buffer() acquires the port lock and calls kfifo_reset(),
making kfifo_len() = 0 while tx_remaining remains N
4. uart_flush_buffer() releases the port lock
5. DMA IRQ fires; handle_tx_dma() acquires the port lock and calls
uart_xmit_advance(uport, tx_remaining) on an empty kfifo
uart_xmit_advance() increments kfifo->out by tx_remaining. Since
kfifo_reset() already set both in and out to 0, out wraps past in,
causing kfifo_len() to return UART_XMIT_SIZE - tx_remaining. The next
start_tx_dma() call then submits a DMA transfer of stale buffer data.
Fix this by snapshotting kfifo_len() at the start of handle_tx_dma()
and skipping uart_xmit_advance() when fifo_len < tx_remaining, which
indicates the kfifo was reset by a preceding flush.
Fixes: 2aaa43c70778 ("tty: serial: qcom-geni-serial: add support for serial engine DMA") Cc: stable <stable@kernel.org> Signed-off-by: Viken Dadhaniya <viken.dadhaniya@oss.qualcomm.com> Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com> Link: https://patch.msgid.link/20260506-serial-dma-stale-tx-buf-v1-1-e3ccb360d719@oss.qualcomm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
serial: fsl_lpuart: fix rx buffer and DMA map leaks in start_rx_dma
lpuart_start_rx_dma() allocates sport->rx_ring.buf with kzalloc() and
then maps a scatterlist via dma_map_sg(). On three subsequent error
paths the function returns directly without releasing those resources:
- when dma_map_sg() returns 0 (-EINVAL):
ring->buf is leaked.
- when dmaengine_slave_config() fails:
ring->buf and the DMA mapping are leaked.
- when dmaengine_prep_dma_cyclic() returns NULL:
ring->buf and the DMA mapping are leaked.
The sole cleanup path, lpuart_dma_rx_free(), is only reached when
lpuart_dma_rx_use is set, and the caller lpuart_rx_dma_startup() clears
that flag on failure of lpuart_start_rx_dma(). So these resources are
permanently leaked on every failure in this function. Repeated port
open/close or termios changes under error conditions will slowly consume
memory and leave stale streaming DMA mappings behind.
Fix it by introducing two error labels that unmap the scatterlist and
free the ring buffer as appropriate. While here, replace the misleading
-EFAULT (bad userspace pointer) returned when dmaengine_prep_dma_cyclic()
fails with the more accurate -ENOMEM, matching how other dmaengine users
in the tree treat this failure.
No functional change on the success path.
Fixes: 5887ad43ee02 ("tty: serial: fsl_lpuart: Use cyclic DMA for Rx") Cc: stable <stable@kernel.org> Signed-off-by: Shitalkumar Gandhi <shitalkumar.gandhi@cambiumnetworks.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Link: https://patch.msgid.link/20260420135903.2062024-1-shitalkumar.gandhi@cambiumnetworks.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
So we may legitimately reach the rest of the handler with
host->data == NULL (and therefore data == NULL). The DATDNE branch
already guards against this with an explicit "if (data != NULL)"
check, but the subsequent TOUTRD ("read data timeout") and
CRCWR/CRCRD ("data CRC error") branches dereference data
unconditionally: