drm/amdgpu/jpeg: set no_user_fence for JPEG v5.0.1 ring

JPEG rings do not support 64-bit user fence writes, reject CS
submissions with user fences.

Fixes: b8f57b69942b ("drm/amdgpu: Add JPEG5_0_1 support")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
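
As a rough illustration of what this flag does in this and the related JPEG/VCN commits, here is a user-space sketch. The names `ring_funcs`, `ring`, and `cs_check_user_fence` are hypothetical stand-ins, not the amdgpu code. A ring whose function table sets `no_user_fence` causes a submission that carries a user fence to be rejected with -EINVAL:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's ring structures. */
struct ring_funcs {
	bool no_user_fence;	/* ring cannot write 64-bit user fences */
};

struct ring {
	const struct ring_funcs *funcs;
};

/* Sketch of a submission-time check: reject user fences on such rings. */
int cs_check_user_fence(const struct ring *ring, bool has_user_fence)
{
	if (has_user_fence && ring->funcs->no_user_fence)
		return -EINVAL;
	return 0;
}

/* Illustrative function tables: one media ring, one ring without the limit. */
const struct ring_funcs jpeg_ring_funcs = { .no_user_fence = true };
const struct ring_funcs other_ring_funcs = { .no_user_fence = false };
```

Submissions without a user fence are unaffected in this sketch, so only the unsupported combination fails.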

drm/amdgpu/jpeg: set no_user_fence for JPEG v5.0.0 ring

JPEG rings do not support 64-bit user fence writes, reject CS
submissions with user fences.

Fixes: dfad65c65728 ("drm/amdgpu: Add JPEG5 support")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/jpeg: set no_user_fence for JPEG v4.0.5 ring

JPEG rings do not support 64-bit user fence writes, reject CS
submissions with user fences.

Fixes: 8f98a715da8e ("drm/amdgpu/jpeg: add jpeg support for VCN4_0_5")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/jpeg: set no_user_fence for JPEG v4.0.3 ring

JPEG rings do not support 64-bit user fence writes, reject CS
submissions with user fences.

Fixes: e684e654eba9 ("drm/amdgpu/jpeg: add jpeg support for VCN4_0_3")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/jpeg: set no_user_fence for JPEG v4.0 ring

JPEG rings do not support 64-bit user fence writes, reject CS
submissions with user fences.

Fixes: b13111de32a9 ("drm/amdgpu/jpeg: add jpeg support for VCN4_0_0")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/jpeg: set no_user_fence for JPEG v3.0 ring

JPEG rings do not support 64-bit user fence writes, reject CS
submissions with user fences.

Fixes: dfd57dbf44dd ("drm/amdgpu: add JPEG3.0 support for Sienna_Cichlid")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/jpeg: set no_user_fence for JPEG v2.5 ring

JPEG rings do not support 64-bit user fence writes, reject CS
submissions with user fences.

Fixes: 14f43e8f88c5 ("drm/amdgpu: move JPEG2.5 out from VCN2.5")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/jpeg: set no_user_fence for JPEG v2.0 ring

JPEG rings do not support 64-bit user fence writes, reject CS
submissions with user fences.

Fixes: 6ac27241106b ("drm/amdgpu: add JPEG v2.0 function supports")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/vcn: set no_user_fence for VCN v5.0.2 enc ring

VCN encoder and decoder rings do not support 64-bit user fence writes,
reject CS submissions with user fences.

Fixes: 8433398c789c ("drm/amdgpu: Add VCN v5_0_2")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/vcn: set no_user_fence for VCN v5.0.1 enc ring

VCN encoder and decoder rings do not support 64-bit user fence writes,
reject CS submissions with user fences.

Fixes: 346492f30ce3 ("drm/amdgpu: Add VCN_5_0_1 support")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/vcn: set no_user_fence for VCN v5.0.0 enc ring

VCN encoder and decoder rings do not support 64-bit user fence writes,
reject CS submissions with user fences.

Fixes: b6d1a0632051 ("drm/amdgpu: add VCN_5_0_0 IP block support")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/vcn: set no_user_fence for VCN v4.0.5 enc ring

VCN encoder and decoder rings do not support 64-bit user fence writes,
reject CS submissions with user fences.

Fixes: 547aad32edac ("drm/amdgpu: add VCN4 ip block support")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/vcn: set no_user_fence for VCN v4.0.3 enc ring

VCN encoder and decoder rings do not support 64-bit user fence writes,
reject CS submissions with user fences.

Fixes: b889ef4ac988 ("drm/amdgpu/vcn: add vcn support for VCN4_0_3")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/vcn: set no_user_fence for VCN v4.0 enc ring

VCN encoder and decoder rings do not support 64-bit user fence writes,
reject CS submissions with user fences.

Fixes: 8da1170a16e4 ("drm/amdgpu: add VCN4 ip block support")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/vcn: set no_user_fence for VCN v3.0 enc/dec rings

VCN encoder and decoder rings do not support 64-bit user fence writes,
reject CS submissions with user fences.

Fixes: cf14826cdfb5 ("drm/amdgpu: add VCN3.0 support for Sienna_Cichlid")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/vcn: set no_user_fence for VCN v2.5 enc/dec rings

VCN encoder and decoder rings do not support 64-bit user fence writes,
reject CS submissions with user fences.

Fixes: 28c17d72072b ("drm/amdgpu: add VCN2.5 basic supports")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: properly handle family setting for early GC 11.5.4

Early variants need an override.

Fixes: 57d00816c6a9 ("drm/amdgpu: set family for GC 11.5.4")
Cc: Pratik Vishwakarma <Pratik.Vishwakarma@amd.com>
Cc: Roman Li <Roman.Li@amd.com>
Cc: Mario Limonciello <superm1@kernel.org>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Tested-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/vcn: set no_user_fence for VCN v2.0 enc/dec rings

VCN encoder and decoder rings do not support 64-bit user fence writes,
reject CS submissions with user fences.

Fixes: 1b61de45dfaf ("drm/amdgpu: add initial VCN2.0 support (v2)")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Move amdgpu_device_check_iommu_direct_map() earlier

Do the check earlier in device init so that ram_is_direct_mapped is
available when gmc_funcs are selected during IP early init.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: Update emit clock logic

If only one level is enabled in clock table, there is no need to
follow the fine grained clock logic which expects a minimum of
two levels (min/max).

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Promote DC to 3.2.380

This version brings along the following updates:
- Fix root clock disabled when DSC power gate disabled for DCN314
- Enable/Disable some power gating
- Remove Mall, SubVP and MCLK from DCN42
- Unify fast update classification paths
- Fix narrowing boundaries in dml
- Update MCIF_ADDR macro to address IGT DWB regression
- Fix dual cursor shows on extend desktop
- Fix hubp tmz field define mismatch

Acked-by: Alex Hung <Alex.Hung@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: [FW Promotion] Release 0.1.57.0

[Why & How]
Modify some IPS related commands.

Acked-by: Alex Hung <Alex.Hung@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: fix root clock disabled when DSC power gate disabled for DCN314

[Why]
When debug.disable_dsc_power_gate = true, the original code uses an
early return that skips the power gate sequence and also the root
clock enable and disable.

This breaks when a new driver is installed without uninstalling the
old driver. The sequence is:
1. On the power-off path, the old driver power gates DSC and calls
   disable_dsc() (root clock disable) because
   debug.disable_dsc_power_gate = false.
2. On the power-on path, the new driver forces DSC power on but skips
   enable_dsc() (root clock enable) because
   debug.disable_dsc_power_gate = true.

As a result, when a mode needs DSC but the root clock is disabled,
underflow occurs.

[How]
- Move enable_dsc() before the disable_dsc_power_gate check so the
  root clock is always enabled on the power-on path.
- Replace the early return with a goto that skips only the power gate
  register writes, allowing disable_dsc() to still execute on the
  power-off path.

Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Jing Zhou <Jing.Zhou@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
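
The control-flow change in the DCN314 DSC commit above can be sketched in user space. This is a simplified model, not the driver function: `dsc_state`, `dsc_pg_control`, and the register-write counter are hypothetical, and only the ordering of the clock handling relative to the debug check follows the commit text.

```c
#include <stdbool.h>

/* Hypothetical model of the hardware state touched by the sequence. */
struct dsc_state {
	bool root_clock_on;
	int pg_reg_writes;	/* number of power gate register sequences run */
};

static void enable_dsc(struct dsc_state *s)  { s->root_clock_on = true; }
static void disable_dsc(struct dsc_state *s) { s->root_clock_on = false; }

void dsc_pg_control(struct dsc_state *s, bool power_on,
		    bool disable_dsc_power_gate)
{
	/* Root clock is enabled before the debug check on the power-on path. */
	if (power_on)
		enable_dsc(s);

	/* Skip only the power gate register writes, not the clock handling. */
	if (disable_dsc_power_gate)
		goto skip_pg;

	s->pg_reg_writes++;	/* stands in for the power gate register writes */

skip_pg:
	/* Root clock is still disabled on the power-off path. */
	if (!power_on)
		disable_dsc(s);
}
```

With the debug option set, the model still toggles the root clock in both directions while leaving the power gate registers alone.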

drm/amd/display: Disable hpo power gate

[Why & How]
Disable HPO power gate temporarily to
work around some DP 2 LL compliance failures on DCN42.

Reviewed-by: Roman Li <roman.li@amd.com>
Signed-off-by: Leo Chen <leo.chen@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Remove Mall, SubVP and MCLK from DCN42

[Why&How]
Remove MALL, SubVP and MCLK features from DCN42 resource file since it is
an APU and does not support them.

Assisted-by: Claude:opus-4.6
Reviewed-by: Dmytro Laktyushkin <dmytro.laktyushkin@amd.com>
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Unify fast update classification paths

[Why]
The dc_fast_update intermediate struct created code duplication and
complexity with multiple classification paths (populate_fast_updates,
fast_nonaddr_updates_exist, full_update_required). This refactoring
simplifies the update classification system by consolidating to a
single path while maintaining compatibility and adding comprehensive
test coverage for fast sequence functionality.

[How]
Remove the entire dc_fast_update struct and its associated helper
functions:
- populate_fast_updates
- fast_nonaddr_updates_exist
- full_update_required

In addition:
- Refactor check_update_surfaces_for_stream as the
  single classification path with explicit handling for func_shaper,
  lut3d_func, cursor_csc_color_matrix_change and
  scaler_sharpener_update.
- Add reserved bitfields to surface_update_flags and
  stream_update_flags unions for completeness guards.
- Extract dc_check_address_only_update and
  dc_check_update_surfaces_for_stream as public.
- Add comprehensive test coverage with parameterized tests for all
  FAST flags, update classification tests for MED/FULL paths,
  and completeness guard tests.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Rafal Ostrowski <rafal.ostrowski@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Fix narrowing boundaries in dml

[Why]
DML code paths include implicit integer narrowing at protocol and
storage boundaries. Making these boundaries explicit improves clarity
and reduces warning noise while preserving behavior.

[How]
Apply explicit C-style casts at intentional narrowing boundaries
across DML calculation, mode support, RQ/DLG, wrapper, and DSC
helper paths.
Keep intermediate arithmetic in natural-width (wider) types where
practical, with minor type-consistency cleanups where needed to
maintain behavior and readability.

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Gaghik Khachatrian <gaghik.khachatrian@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd: Validate CRIU-restored IDs before idr_alloc

The KFD CRIU restore flow restores previously saved object IDs from
userspace.

For event restore:

  kfd_criu_restore_event()
      -> create_signal_event() / create_other_event()
          -> allocate_event_notification_slot()
              -> idr_alloc(..., *restore_id, *restore_id + 1, ...)

For BO restore:

  criu_restore_memory_of_gpu()
      -> idr_alloc(..., bo_priv->idr_handle, ...)

In both cases, the restored ID comes from userspace-provided CRIU data.

idr_alloc() expects the ID range values to fit within signed int
limits. If a restored ID is larger than INT_MAX, it can trigger a WARN
in the IDR layer.

A kernel WARN is undesirable because it prints a warning trace and may
cause a panic or reboot on systems with panic_on_warn enabled.

Smatch reported these paths as allowing unchecked userspace values to
reach idr_alloc().

Add INT_MAX validation before using restored IDs in:

- kfd_criu_restore_event()
- criu_restore_memory_of_gpu()

If the restored ID is invalid, return -EINVAL.

This prevents invalid restore data from reaching the IDR layer and
avoids WARN-triggering paths, while keeping valid restore behavior
unchanged.

Fixes: 40e8a766a761 ("drm/amdkfd: CRIU checkpoint and restore events")
Reported-by: Dan Carpenter <error27@gmail.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: David Yat Sin <david.yatsin@amd.com>
Cc: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: David Yat Sin <david.yatsin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
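
A minimal sketch of the kind of range check described in the CRIU commit above, written as plain user-space C. The helper name `criu_check_restore_id` is hypothetical, and the exact comparison used by the patch is an assumption based on the "larger than INT_MAX" wording:

```c
#include <errno.h>
#include <limits.h>
#include <stdint.h>

/*
 * Hypothetical helper: validate an ID taken from userspace restore data
 * before it would be handed to an allocator that expects signed int ranges.
 */
int criu_check_restore_id(uint32_t restore_id)
{
	if (restore_id > (uint32_t)INT_MAX)
		return -EINVAL;	/* invalid restore data, reject early */
	return 0;
}
```

Valid IDs pass through unchanged, so the restore path behaves as before for well-formed checkpoint data.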

drm/amdgpu: rework userq fence signal processing

Move more code into a common userq function.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: move read_indexed_register to amdgpu_reg_access

The read_indexed_register helper is duplicated across multiple files
with identical logic.

Move it to amdgpu_reg_access.c as
amdgpu_read_indexed_register and update all users accordingly.

No functional changes intended.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Gabriel Almeida <gabrielsousa230@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: move program_aspm to amdgpu_nbio

The program_aspm helper is duplicated across multiple files with
identical logic.

Move it to amdgpu_nbio.c as amdgpu_nbio_program_aspm and update
all users accordingly.

Signed-off-by: Gabriel Almeida <gabrielsousa230@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Enable HUBP/OPTC/DPP power gating

[Why & How]
Enable driver power gating on DCN42 for the HUBP, OPTC
and DPP HW blocks.

Reviewed-by: Ovidiu Bunea <ovidiu.bunea@amd.com>
Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Signed-off-by: Leo Chen <leo.chen@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Fix implicit conversion warning

[Why & How]
Fix implicit narrowing conversion warnings.

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Gaghik Khachatrian <gaghik.khachatrian@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

tracepoint: Fix typo in tracepoint.h comment

Change "my" to "may" in the description of subsystem configurations.

Link: https://patch.msgid.link/20260422021819.1788091-1-synte4028@gmail.com
Signed-off-by: Sheng Che Peng <synte4028@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

tracing: branch: Fix inverted check on stat tracer registration

init_annotated_branch_stats() and all_annotated_branch_stats() check the
return value of register_stat_tracer() with "if (!ret)", but
register_stat_tracer() returns 0 on success and a negative errno on
failure. The inverted check causes the warning to be printed on every
successful registration, e.g.:

Warning: could not register annotated branches stats

while leaving real failures silent. The initcall also returned a
hard-coded 1 instead of the actual error.

Invert the check and propagate ret so that the warning fires on real
errors and the initcall reports the correct status.

Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: https://patch.msgid.link/20260420-tracing-v1-1-d8f4cd0d6af1@debian.org
Fixes: 002bb86d8d42 ("tracing/ftrace: separate events tracing and stats tracing engine")
Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
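
The inverted check in the tracing commit above can be illustrated with a small user-space sketch. `register_stat_tracer_stub` and `init_branch_stats` are hypothetical stand-ins. The only property taken from the commit text is the return convention: 0 on success, negative errno on failure.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in: returns 0 on success, negative errno on failure. */
int register_stat_tracer_stub(int should_fail)
{
	return should_fail ? -ENOMEM : 0;
}

/* Sketch of the corrected initcall: warn and propagate only on real errors. */
int init_branch_stats(int should_fail)
{
	int ret = register_stat_tracer_stub(should_fail);

	if (ret) {	/* the buggy form was "if (!ret)", which fired on success */
		fprintf(stderr,
			"Warning: could not register annotated branches stats\n");
		return ret;	/* propagate the error instead of a fixed 1 */
	}
	return 0;
}
```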

rust: ACPI: fix missing match data for PRP0001

Export `acpi_of_match_device` function and use it to match the of device
table against ACPI PRP0001 in Rust.

This fixes id_info being None on ACPI PRP0001 devices.

Using `device_get_match_data` is not possible, because Rust stores an
index in the of device id instead of a data pointer. This was done this
way to provide a convenient and obvious API for drivers, which can be
evaluated in const context without the use of any unstable language
features.

Fixes: 7a718a1f26d1 ("rust: driver: implement `Adapter`")
Signed-off-by: Markus Probst <markus.probst@posteo.de>
Acked-by: Rafael J. Wysocki (Intel) <rafael@kernel.org> # ACPI
Link: https://patch.msgid.link/20260427-rust_acpi_prp0001-v6-1-6119b2a66183@posteo.de
Signed-off-by: Danilo Krummrich <dakr@kernel.org>

drm/amd/display: Update MCIF_ADDR macro to address IGT DWB regression

[Why]
A previous warning-fix commit updated type casts in the DCN3
mmhubbub code but missed updating the MCIF_ADDR macro to the
correct, fully parenthesized and casted version. This caused
a regression during DWB tests, where address values could be
misinterpreted, potentially leading to incorrect hardware
programming.

[How]
Updated the MCIF_ADDR macro in dcn30_mmhubbub.c to use the
proper parenthesization and type casting, ensuring correct
address handling. Removed redundant casts from REG_UPDATE
calls for improved clarity and consistency with current
coding standards.

Fixes: f4cdbb5d5405 ("drm/amd/display: Fix implicit narrowing conversion warnings")
Reviewed-by: Clayton King <clayton.king@amd.com>
Signed-off-by: Gaghik Khachatrian <gaghik.khachatrian@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: fix dual cursor shows on extend desktop

[why & how]
When DPP pipe power gating is disabled in the driver, disable_pipe
does not disable the cursor, so the cursor becomes visible again the
next time this pipe powers up.

Port the DCN314 logic: disable the cursor when the pipe should be
power gated.

Reviewed-by: Roman Li <roman.li@amd.com>
Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: fix hubp tmz field define mismatch

[why & how]
Fix the mismatch between the hubp surface_flip_registers field
definition and dc_plane_address.

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Reviewed-by: Dmytro Laktyushkin <dmytro.laktyushkin@amd.com>
Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Enable driver power gating

[Why & How]
Enable driver power gating. Temporarily disable DIO power gating.

Reviewed-by: Ovidiu Bunea <ovidiu.bunea@amd.com>
Signed-off-by: Leo Chen <leo.chen@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

ipmi:ssif: NULL thread on error

The cleanup code checks the thread pointer for NULL, but in one spot
the pointer could hold an ERR_PTR() value instead.

Spotted with static analysis.

Link: https://sourceforge.net/p/openipmi/mailman/message/59324676/
Fixes: 75c486cb1bca ("ipmi:ssif: Clean up kthread on errors")
Cc: <stable@vger.kernel.org> # 91eb7ec72612: ipmi:ssif: Remove unnecessary indention
Cc: stable@vger.kernel.org
Signed-off-by: Corey Minyard <corey@minyard.net>
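
To show why an error-encoded pointer defeats a NULL check, as in the ipmi:ssif commit above, here is a user-space sketch. `err_ptr` and `is_err` are simplified stand-ins for the kernel's ERR_PTR()/IS_ERR(). `ssif_model`, `start_thread_stub`, and `ssif_start` are hypothetical and do not reproduce the driver code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified user-space stand-ins for the kernel's ERR_PTR()/IS_ERR(). */
#define MAX_ERRNO 4095
static inline void *err_ptr(long err) { return (void *)(intptr_t)err; }
static inline int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

struct ssif_model {
	void *thread;	/* NULL when no thread is running */
};

/* Hypothetical thread creation that may hand back an encoded error. */
static void *start_thread_stub(int fail)
{
	static int dummy;

	return fail ? err_ptr(-ENOMEM) : (void *)&dummy;
}

int ssif_start(struct ssif_model *s, int fail)
{
	s->thread = start_thread_stub(fail);
	if (is_err(s->thread)) {
		int rv = (int)(intptr_t)s->thread;

		/* Reset to NULL so a later "if (s->thread)" cleanup skips it. */
		s->thread = NULL;
		return rv;
	}
	return 0;
}
```

Without the reset, cleanup code that tests only for NULL would treat the encoded error as a live thread pointer.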

ipmi:si: Return state to normal if message allocation fails

There were places where nothing would get started if a message
allocation failed, so the driver needs to return to normal state.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: <stable@vger.kernel.org>
Signed-off-by: Corey Minyard <corey@minyard.net>

ipmi: Add limits to event and receive message requests

The driver would just fetch events and receive messages until the
BMC said it was done.  To avoid issues with BMCs that never say they are
done, add a limit of 10 fetches at a time.

In addition, an si interface has an attn state it can return from the
hardware, which is supposed to cause a flag fetch to see if the driver
needs to fetch events, messages, or a few other things.  If the attn
bit gets stuck, it's a similar problem.  So allow messages in between
flag fetches so the driver itself doesn't get stuck.

This is a more general fix than the previous fix for the specific bad
BMC, but should fix the more general issue of a BMC that won't stop
saying it has data.

This has been there from the beginning of the driver.  It's not a bug
per-se, but it is accounting for bugs in BMCs.

Reported-by: Matt Fleming <mfleming@cloudflare.com>
Closes: https://lore.kernel.org/lkml/20260415115930.3428942-1-matt@readmodwrite.com/
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Corey Minyard <corey@minyard.net>
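
The bounded-fetch idea from the ipmi commit above can be sketched as a user-space model. `bmc_model`, `bmc_has_more`, and `fetch_events` are hypothetical. Only the limit of 10 fetches at a time comes from the commit text.

```c
#include <stdbool.h>

#define MAX_FETCHES_PER_PASS 10	/* limit named in the commit message */

/* Hypothetical BMC model: "more data" never clears when stuck is true. */
struct bmc_model {
	int pending;
	bool stuck;
};

static bool bmc_has_more(const struct bmc_model *b)
{
	return b->stuck || b->pending > 0;
}

/* Fetch until the BMC says it is done or the per-pass limit is hit. */
int fetch_events(struct bmc_model *b)
{
	int fetched = 0;

	while (bmc_has_more(b) && fetched < MAX_FETCHES_PER_PASS) {
		if (b->pending > 0)
			b->pending--;
		fetched++;
	}
	return fetched;
}
```

A well-behaved BMC drains normally, while one that never reports completion costs at most one bounded pass before other work can run.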

drm/amdgpu: fix build for CONFIG_DRM_FBDEV_EMULATION=n

The merge commit 02e778f12359 ("Merge tag 'amd-drm-next-7.1-2026-03-12' of
https://gitlab.freedesktop.org/agd5f/linux into drm-next") removes the stub
for drm_fb_helper_gem_is_fb(), so the build breaks if DRM_FBDEV_EMULATION
is not set.

‘drm_fb_helper_gem_is_fb’; did you mean ‘drm_fb_helper_from_client’? [-Wimplicit-function-declaration]
1777 |                 if (!drm_fb_helper_gem_is_fb(dev->fb_helper, fb->obj[0])) {
      |                      ^~~~~~~~~~~~~~~~~~~~~~~
      |                      drm_fb_helper_from_client

Restore it.

Fixes: 02e778f12359 ("Merge tag 'amd-drm-next-7.1-2026-03-12' of https://gitlab.freedesktop.org/agd5f/linux into drm-next")
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Yury Norov <ynorov@nvidia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

sched_ext: Fix scx_flush_disable_work() UAF race

scx_flush_disable_work() calls irq_work_sync() followed by
kthread_flush_work() to ensure that the disable kthread work has
fully completed before bpf_scx_unreg() frees the SCX scheduler.

However, a concurrent scx_vexit() (e.g., triggered by a watchdog stall)
creates a race window between scx_claim_exit() and irq_work_queue():

  CPU A (scx_vexit (watchdog))        CPU B (bpf_scx_unreg)
  ----                                ----
  scx_claim_exit()
    atomic_try_cmpxchg(NONE->kind)
  stack_trace_save()
  vscnprintf()
                                      scx_disable()
                                        scx_claim_exit() -> FAIL
                                      scx_flush_disable_work()
                                        irq_work_sync()      // no-op: not queued yet
                                        kthread_flush_work() // no-op: not queued yet
                                      kobject_put(&sch->kobj) -> free %sch
  irq_work_queue() -> UAF on %sch
  scx_disable_irq_workfn()
    kthread_queue_work() -> UAF

The root cause is that CPU B's scx_flush_disable_work() returns after
syncing an irq_work that has not yet been queued, while CPU A is still
executing the code between scx_claim_exit() and irq_work_queue().

Loop until exit_kind reaches SCX_EXIT_DONE or SCX_EXIT_NONE, draining
disable_irq_work and disable_work in each pass. This ensures that any
work queued after the previous check is caught, while also correctly
handling cases where no disable was triggered (e.g., the
scx_sub_enable_workfn() abort path).

Fixes: 510a27055446 ("sched_ext: sync disable_irq_work in bpf_scx_unreg()")
Reported-by: https://sashiko.dev/#/patchset/20260424100221.32407-1-icheng%40nvidia.com
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Cheng-Yang Chou <yphbchou0911@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

sched_ext: Collect ext_*.c include headers in build_policy.c

Move <linux/btf_ids.h> from ext.c and "ext_idle.h" from ext.c (plus its
self-include in ext_idle.c) into build_policy.c. Subsequent patches add
their headers the same way for consistency.

No functional change.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Andrea Righi <arighi@nvidia.com>

HID: uclogic: Fix regression of input name assignment

The previous fix for adding the devm_kasprintf() return check in the
commit bd07f751208b ("HID: uclogic: Add NULL check in
uclogic_input_configured()") changed the condition of hi->input->name
assignment, and it resulted in missing the proper input device name
when no custom suffix is defined.

Restore the conditional to the original content to address the
regression.

Fixes: bd07f751208b ("HID: uclogic: Add NULL check in uclogic_input_configured()")
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jiri Kosina <jkosina@suse.com>

HID: intel-thc-hid: Intel-quickspi: Fix some error codes

A partial read is supposed to be treated as a failure, but this code
forgot to set the error code. Return -EINVAL.

Fixes: 9d8d51735a3a ("HID: intel-thc-hid: intel-quickspi: Add HIDSPI protocol implementation")
Signed-off-by: Dan Carpenter <error27@gmail.com>
Reviewed-by: Even Xu <even.xu@intel.com>
Reviewed-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Signed-off-by: Jiri Kosina <jkosina@suse.com>

HID: hid-lenovo-go-s: restore OS_TYPE after resume from s2idle

The controller MCU does not persist OS_TYPE across power cycles. During
s2idle resume, the USB device may be power-cycled, causing the OS_TYPE
setting to revert to the default Windows value.

Add a reset_resume callback so that this is correctly restored after
resume.

Fixes: a23f3497bf208c59ad ("HID: hid-lenovo-go-s: Add Lenovo Legion Go S Series HID Driver")
Reviewed-by: Derek J. Clark <derekjohn.clark@gmail.com>
Signed-off-by: Matthew Schwartz <matthew.schwartz@linux.dev>
Signed-off-by: Jiri Kosina <jkosina@suse.com>

HID: elan: Add support for ELAN SB974D touchpad

Elan SB974D touchpad uses ELAN_MT_I2C format to send HID reports. Add an
entry to match for the device and parse its vendor specific format.

Signed-off-by: Damien Dejean <damiendejean@google.com>
Signed-off-by: Kornel Dulęba <korneld@google.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>

sched_ext: Call wakeup_preempt() in local_dsq_post_enq()

There are several edge cases (see linked thread) where an IMMED task
can be left lingering on a local DSQ if an RT task swoops in at the
wrong time. All of these edge cases are due to rq->next_class being idle
even after dispatching a task to rq's local DSQ. We should bump
rq->next_class to &ext_sched_class as soon as we've inserted a task into
the local DSQ.

To optimize the common case of rq->next_class == &ext_sched_class,
only call wakeup_preempt() if rq->next_class is below EXT. If next_class
is EXT or above, wakeup_preempt() is a no-op anyway.

This lets us also simplify the preempt_curr() logic a bit since
wakeup_preempt() will call preempt_curr() for us if next_class is
below EXT.

Link: https://lore.kernel.org/all/DHZPHUFXB4N3.2RY28MUEWBNYK@google.com/
Signed-off-by: Kuba Piecuch <jpiecuch@google.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

HID: sony: add missing size validation for Rock Band 3 Pro instruments

Add the missing size validation for Rock Band 3 PS3 Pro instruments in
sony_raw_event(). This prevents a malicious device from causing
hid-sony to read out of bounds of the provided buffer.

Signed-off-by: Rosalie Wanders <rosalie@mailbox.org>
Signed-off-by: Jiri Kosina <jkosina@suse.com>

HID: sony: add missing size validation for SMK-Link remotes

Add the missing size validation for SMK-Link remotes in
sony_raw_event(). This prevents a malicious device from causing
hid-sony to read out of bounds of the provided buffer.

I do not own these devices so the size check only forces that the buffer
is large enough for nsg_mrxu_parse_report().

Signed-off-by: Rosalie Wanders <rosalie@mailbox.org>
Signed-off-by: Jiri Kosina <jkosina@suse.com>

HID: sony: remove unneeded WARN_ON() in sony_leds_init()

Remove the unneeded WARN_ON() in sony_leds_init(). It is unneeded
because the sony_leds_init() call is already gated behind a
SONY_LED_SUPPORT check in sony_input_configured().

Signed-off-by: Rosalie Wanders <rosalie@mailbox.org>
Signed-off-by: Jiri Kosina <jkosina@suse.com>

HID: ft260: validate i2c input report length

Add two checks to ft260_raw_event() to prevent out-of-bounds reads
from malicious or malfunctioning devices:

First, reject reports shorter than the 2-byte header (report ID +
length fields). Without this, even accessing xfer->length on a
1-byte report is an OOB read.

Second, validate xfer->length against the actual data capacity of
the received HID report. Each I2C data report ID (0xD0 through
0xDE) defines a different report size in the HID descriptor, so the
available payload varies per report. A corrupted length field could
cause memcpy to read beyond the report buffer.
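
The two checks can be sketched in userspace C as follows. This is only
an illustration, not the driver code: the struct layout, the names
demo_i2c_report and demo_copy_payload(), and the -1 return value are
assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical report layout: 1-byte report ID, 1-byte length, payload. */
struct demo_i2c_report {
	uint8_t report_id;
	uint8_t length;
	uint8_t data[];
};

/*
 * Copy the payload of a received report into dst.
 * Returns the number of payload bytes copied, or -1 if the report is
 * shorter than its header or claims more payload than was received.
 */
static int demo_copy_payload(const uint8_t *buf, size_t size,
			     uint8_t *dst, size_t dst_size)
{
	const size_t hdr = offsetof(struct demo_i2c_report, data);
	const struct demo_i2c_report *xfer;

	/* Check 1: the 2-byte header must be present before it is read. */
	if (size < hdr)
		return -1;

	xfer = (const struct demo_i2c_report *)buf;

	/* Check 2: the claimed length must fit in the received report. */
	if (xfer->length > size - hdr)
		return -1;

	/* Extra guard for this sketch: the destination must be big enough. */
	if (xfer->length > dst_size)
		return -1;

	memcpy(dst, xfer->data, xfer->length);
	return xfer->length;
}
```

The order matters: the length field is only read after the first check
has shown that it lies inside the buffer.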

Reported-by: Sebastián Josué Alba Vives <sebasjosue84@gmail.com>
Signed-off-by: Michael Zaidman <michael.zaidman@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>

HID: sony: fix incorrect force-feedback check in sony_suspend()

This commit fixes the incorrect force-feedback check in sony_suspend().
Without this fix, the check is always true because it tests a constant
define that is never 0.
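
The class of bug can be illustrated with a small userspace sketch. All
names here (DEMO_QUIRK_A, DEMO_QUIRK_B, DEMO_FF_SUPPORT and the two
helpers) are hypothetical, and the sketch assumes the intended test is
against per-device quirk bits; it does not reproduce the driver code.

```c
#include <stdbool.h>

/* Hypothetical quirk bits; the support mask is a non-zero constant. */
#define DEMO_QUIRK_A	(1UL << 0)
#define DEMO_QUIRK_B	(1UL << 1)
#define DEMO_FF_SUPPORT	(DEMO_QUIRK_A | DEMO_QUIRK_B)

/* Buggy shape: tests the constant itself, which is never 0. */
static bool demo_ff_check_buggy(unsigned long quirks)
{
	(void)quirks;
	return DEMO_FF_SUPPORT != 0;
}

/* Fixed shape: tests whether this device's quirks include any FF bit. */
static bool demo_ff_check_fixed(unsigned long quirks)
{
	return (quirks & DEMO_FF_SUPPORT) != 0;
}
```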

Signed-off-by: Rosalie Wanders <rosalie@mailbox.org>
Signed-off-by: Jiri Kosina <jkosina@suse.com>

workqueue: fix devm_alloc_workqueue() va_list misuse

devm_alloc_workqueue() built a va_list and passed it as a single
positional argument to the variadic alloc_workqueue() macro:

va_start(args, max_active);
wq = alloc_workqueue(fmt, flags, max_active, args);
va_end(args);

C does not allow forwarding a va_list through a ... parameter.
alloc_workqueue() expands to alloc_workqueue_noprof(), which runs
its own va_start() over its ... params, so the inner
vsnprintf(wq->name, sizeof(wq->name), fmt, args) in
__alloc_workqueue() received the outer va_list object as the first
variadic slot rather than the caller's actual format arguments.

Add a new static helper alloc_workqueue_va() that wraps
__alloc_workqueue() and runs wq_init_lockdep() on success, and
fold both alloc_workqueue_noprof() and devm_alloc_workqueue_noprof()
onto it as suggested by Tejun.
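
The general pattern (a va_list-taking core shared by several variadic
front ends) can be sketched in userspace C. The demo_* names are
hypothetical and stand in for the kernel helpers; this is not the
workqueue code itself.

```c
#include <stdarg.h>
#include <stdio.h>

/* va_list-taking core, analogous in role to a v*() helper. */
static int demo_name_va(char *buf, size_t len, const char *fmt, va_list args)
{
	return vsnprintf(buf, len, fmt, args);
}

/* Variadic front end: starts its own va_list and hands it to the core. */
static int demo_name(char *buf, size_t len, const char *fmt, ...)
{
	va_list args;
	int ret;

	va_start(args, fmt);
	ret = demo_name_va(buf, len, fmt, args);
	va_end(args);
	return ret;
}

/*
 * A second variadic wrapper must also call the va_list core.
 * Calling demo_name(buf, len, fmt, args) here would pass the va_list
 * object as the first variadic argument, which is the bug described
 * above.
 */
static int demo_managed_name(char *buf, size_t len, const char *fmt, ...)
{
	va_list args;
	int ret;

	va_start(args, fmt);
	ret = demo_name_va(buf, len, fmt, args);
	va_end(args);
	return ret;
}
```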

The wq_init_lockdep() step is required on the devm path
too, otherwise __flush_workqueue()'s on-stack
COMPLETION_INITIALIZER_ONSTACK_MAP would NULL-deref wq->lockdep_map.

No caller changes are required. devm_alloc_ordered_workqueue() is
a macro forwarding to devm_alloc_workqueue() and inherits the fix.
Two in-tree callers actively trigger the broken path on every probe:

drivers/power/supply/mt6370-charger.c:889
drivers/power/supply/max77705_charger.c:649

both of which use devm_alloc_ordered_workqueue(dev, "%s", 0,
dev_name(dev)).

A standalone reproducer module is available at [1].

Link: https://github.com/leitao/debug/blob/main/workqueue/valist/wq_va_test.c [1]
Fixes: 1dfc9d60a69e ("workqueue: devres: Add device-managed allocate workqueue")
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Tejun Heo <tj@kernel.org>

platform/x86: thinkpad_acpi: Remove unneeded goto

Remove an unneeded goto statement in hotkey_kthread(). Since
the function has a single exit location with no cleanup code,
the jump provides no benefit. Per the kernel coding style,
returning directly is preferred over goto in such case [1].

[1] https://www.kernel.org/doc/html/latest/process/coding-style.html

Signed-off-by: Eduardo Vasconcelos <eduardo@eduardovasconcelos.com>
Tested-by: Eduardo Vasconcelos <eduardo@eduardovasconcelos.com>
Reviewed-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Link: https://patch.msgid.link/20260425063936.9360-1-eduardo@eduardovasconcelos.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

platform/x86: meraki-mx100: use real software node references

The lpc_ich MFD driver now exposes the software node associated with
its GPIO controller cell. Remove the dummy software node from the
meraki-mx100 driver and reference the real one instead.

Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Link: https://patch.msgid.link/20260427-meraki-swnodes-v5-1-ad91cd306472@oss.qualcomm.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

sonypi: use strscpy() in sonypi_acpi_probe

strcpy() has been deprecated¹ because it performs no bounds checking on
the destination buffer, which can lead to buffer overflows. While the
current code works correctly, replace strcpy() with the safer strscpy()
to follow secure coding best practices.

¹ https://www.kernel.org/doc/html/latest/process/deprecated.html#strcpy

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patch.msgid.link/20260424075755.305770-3-thorsten.blum@linux.dev
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

iio: chemical: scd30: fix division by zero in write_raw

Add a zero check for val2 before using it as a divisor when setting the
sampling frequency. A user writing a zero fractional part to the
sampling_frequency sysfs attribute triggers a division by zero in the
kernel.
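
A minimal sketch of such a guard, assuming the frequency is given as an
integer part (val) plus a fractional part in nano units (val2) and is
converted to a whole-second interval. The function name and the exact
conversion are assumptions for illustration, not the driver's code.

```c
#include <errno.h>

/*
 * Convert a sub-1 Hz frequency, given as val + val2 * 1e-9 Hz, into a
 * measurement interval in seconds. The divisor is validated first.
 */
static int demo_freq_to_interval(int val, int val2, int *interval_s)
{
	/* Reject a non-zero integer part and a zero or negative divisor. */
	if (val != 0 || val2 <= 0)
		return -EINVAL;

	*interval_s = 1000000000 / val2;
	return 0;
}
```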

Fixes: 64b3d8b1b0f5 ("iio: chemical: scd30: add core driver")
Signed-off-by: Antoniu Miclaus <antoniu.miclaus@analog.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>

iio: adc: npcm: fix unbalanced clk_disable_unprepare()

The driver acquired the ADC clock with devm_clk_get() and read its
rate, but never called clk_prepare_enable(). The probe error path and
npcm_adc_remove() both called clk_disable_unprepare() unconditionally,
causing the clk framework's enable/prepare counts to underflow on
probe failure or module unbind.

The issue went unnoticed because NPCM BMC firmware leaves the ADC
clock enabled at boot, so the driver happened to work in practice.

Switch to devm_clk_get_enabled() so the clock is properly enabled
during probe and automatically released by the device-managed
cleanup, and drop the now-redundant clk_disable_unprepare() from
both the probe error path and remove().

While at it, drop the duplicate error message on devm_request_irq()
failure since the IRQ core already logs it.

Fixes: 9bf85fbc9d8f ("iio: adc: add NPCM ADC driver")
Signed-off-by: David Carlier <devnexen@gmail.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>

iio: adc: nxp-sar-adc: Avoid division by zero

When the Common Clock Framework is disabled, clk_get_rate() returns 0.
This value is used as part of the divisor when computing nanosecond
delays for ndelay(). Division by zero is undefined behaviour in C, so
the compiler is free to do what it wants. Here it saturates the value,
which is logical from a mathematical point of view. However, the
ndelay() implementation sets a reasonable upper threshold and refuses
to provide anything for such a long delay, which is why the code may
fail to link under these circumstances.

To solve the issue, provide a wrapper that calls ndelay() when
the value is known not to be zero.

Fixes: 4434072a893e ("iio: adc: Add the NXP SAR ADC support for the s32g2/3 platforms")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202603311958.ly6uROit-lkp@intel.com/
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@oss.qualcomm.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>

iio: Fix iio_multiply_value use in iio_read_channel_processed_scale

The function iio_multiply_value returns IIO_VAL_INT (1) on success or a
negative error number on failure, while iio_read_channel_processed_scale
should return an error code or 0. This creates a situation where the
expected result is treated as an error. Fix this by checking the
iio_multiply_value result separately, instead of passing it as a return
value.
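
The return-code mismatch can be sketched as follows. DEMO_VAL_INT and
both functions are hypothetical stand-ins that only mimic the
convention described above (a positive type code on success from the
helper, 0 on success from the caller).

```c
#include <errno.h>
#include <limits.h>

#define DEMO_VAL_INT 1	/* positive "value type" code returned on success */

/* Helper: returns DEMO_VAL_INT on success, negative errno on failure. */
static int demo_multiply_value(int *result, int multiplier, int val)
{
	long long product = (long long)multiplier * val;

	if (product > INT_MAX || product < INT_MIN)
		return -ERANGE;

	*result = (int)product;
	return DEMO_VAL_INT;
}

/* Caller contract: 0 on success, negative errno on failure. */
static int demo_read_processed_scale(int raw, int scale, int *val)
{
	int ret = demo_multiply_value(val, scale, raw);

	/* Check the helper's result separately instead of returning it. */
	if (ret < 0)
		return ret;

	return 0;
}
```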

Fixes: 05f958d003c9 ("iio: Improve iio_read_channel_processed_scale() precision")
Signed-off-by: Svyatoslav Ryhel <clamor95@gmail.com>
Reviewed-by: Hans de Goede <johannes.goede@oss.qualcomm.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>

iio: gyro: adis16260: fix division by zero in write_raw

Add a validation check for the sampling frequency value before using it
as a divisor. A user writing zero to the sampling_frequency sysfs
attribute triggers a division by zero in the kernel.

Fixes: 089a41985c6c ("staging: iio: adis16260 digital gyro driver")
Signed-off-by: Antoniu Miclaus <antoniu.miclaus@analog.com>
Reviewed-by: Nuno Sá <nuno.sa@analog.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

iio: adc: ad4695: Fix call ordering in offload buffer postenable

ad4695_enter_advanced_sequencer_mode() was called after
spi_offload_trigger_enable(). That is wrong because
ad4695_enter_advanced_sequencer_mode() issues regular SPI transfers to
put the ADC into advanced sequencer mode, and not all SPI offload capable
controllers support regular SPI transfers while offloading is enabled.

Fix this by calling ad4695_enter_advanced_sequencer_mode() before
spi_offload_trigger_enable(), so the ADC is fully configured before the
first CNV pulse can occur. This is consistent with the same constraint
that already applies to the BUSY_GP_EN write above it.

Update the error unwind labels accordingly: add err_exit_conversion_mode
so that a failure of spi_offload_trigger_enable() correctly exits
conversion mode before clearing BUSY_GP_EN.

Fixes: f09f140e3ea8 ("iio: adc: ad4695: Add support for SPI offload")
Reviewed-by: Nuno Sá <nuno.sa@analog.com>
Reviewed-by: David Lechner <dlechner@baylibre.com>
Signed-off-by: Radu Sabau <radu.sabau@analog.com>
Cc: Stable@vger.kernel.org
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

iio: light: veml6070: Fix resource leak in probe error path

The driver calls i2c_new_dummy_device() to create a dummy device,
then calls i2c_smbus_write_byte(). If i2c_smbus_write_byte() fails and
returns, the cleanup via devm_add_action_or_reset() was never registered,
so the dummy device leaks.

Switch to devm_i2c_new_dummy_device() which registers cleanup atomically
with device creation, eliminating the error-path window.

Fixes: 7501bff87c3e ("iio: light: veml6070: add action for i2c_unregister_device")
Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Signed-off-by: Felix Gu <ustc.gu@gmail.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

iio: chemical: mhz19b: reject oversized serial replies

mhz19b_receive_buf() appends each serdev chunk into the fixed
MHZ19B_CMD_SIZE receive buffer and advances buf_idx by len without
checking that the chunk fits in the remaining space. A large callback
can therefore overflow st->buf before the command path validates the
reply.

Reset the reply state before each command and reject oversized serial
replies before copying them into the fixed buffer. When an oversized
reply is detected, wake the waiter and report -EMSGSIZE instead of
overwriting st->buf.
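
A userspace sketch of the bounds-checked append. The buffer size, the
struct, and the function names are assumptions for the example, and
waking the waiter is reduced to recording an error code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_CMD_SIZE 9	/* hypothetical fixed reply size */

struct demo_state {
	uint8_t buf[DEMO_CMD_SIZE];
	size_t buf_idx;
	int error;	/* 0, or a negative errno seen by the waiter */
};

/* Reset the reply state before each command. */
static void demo_reset(struct demo_state *st)
{
	st->buf_idx = 0;
	st->error = 0;
}

/*
 * Append one received chunk. A chunk that does not fit in the
 * remaining space is rejected with -EMSGSIZE instead of being copied.
 * Returns the number of bytes consumed from the caller's chunk.
 */
static size_t demo_receive(struct demo_state *st, const uint8_t *data,
			   size_t len)
{
	if (len > sizeof(st->buf) - st->buf_idx) {
		st->error = -EMSGSIZE;
		return len;	/* consume and drop the oversized chunk */
	}

	memcpy(st->buf + st->buf_idx, data, len);
	st->buf_idx += len;
	return len;
}
```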

Fixes: 4572a70b3681 ("iio: chemical: Add support for Winsen MHZ19B CO2 sensor")
Cc: stable@vger.kernel.org
Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn>
Acked-by: Gyeyoung Baek <gye976@gmail.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

iio: adc: xilinx-xadc: Fix sequencer mode in postdisable for dual mux

xadc_postdisable() unconditionally sets the sequencer to continuous
mode. For dual external multiplexer configurations this is incorrect:
simultaneous sampling mode is required so that ADC-A samples through
the mux on VAUX[0-7] while ADC-B simultaneously samples through the
mux on VAUX[8-15]. In continuous mode only ADC-A is active, so
VAUX[8-15] channels return incorrect data.

Since postdisable is also called from xadc_probe() to set the initial
idle state, the wrong sequencer mode is active from the moment the
driver loads.

The preenable path already uses xadc_get_seq_mode() which returns
SIMULTANEOUS for dual mux. Fix postdisable to do the same.

Fixes: bdc8cda1d010 ("iio:adc: Add Xilinx XADC driver")
Cc: stable@vger.kernel.org
Signed-off-by: Christofer Jonason <christofer.jonason@guidelinegeo.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Reviewed-by: Nuno Sá <nuno.sa@analog.com>
Reviewed-by: Salih Erim <salih.erim@amd.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Merge patch series "eventpoll: clarity refactor"

Christian Brauner <brauner@kernel.org> says:

The recent UAF series (a6dc643c6931 and follow-ups) rode on
invariants in fs/eventpoll.c that were nowhere documented and had
to be reverse-engineered from the code: the lifetime relationships
between struct eventpoll, struct epitem, and struct file, the three
removal paths coordinating via epi_fget() pins and ep->mtx, the
ovflist sentinel-encoded scan state machine, the POLLFREE
release/acquire handshake, and the loop / path check globals
serialized by epnested_mutex. The fix was correct but the next
person to touch this code will hit the same learning curve.

This adds a bunch of documentation (a bunch of swearwords were removed
by having an LLM go over it) and refactors. The end goal is hopefully a
bit more palatable than what this is right now. No functional changes
intended yet.

This series codifies those invariants in source and tightens the
surrounding structure.

First there are a couple of pure documentation changes. A top-of-file
overview with field-protection tables for struct eventpoll and struct
epitem, a section gathering the loop-check / path-check globals next to
their declarations, labelled comments on the two sides of the POLLFREE
handshake, refreshed comments on epi_fget() and ep_remove_file() (whose
contract the UAF fix re-shaped), and a docblock on
ep_clear_and_put() that names its two-pass structure as load-bearing.

Next are a couple of mechanical naming cleanups.
ep_refcount_dec_and_test() -> ep_put() to pair with ep_get(); the unused
depth argument dropped from epoll_mutex_lock() (all three callers passed
zero); attach_epitem() -> ep_attach_file() for ep_remove_file()
symmetry; and the CONFIG_KCMP block relocated next to CONFIG_COMPAT so
the hot-path code is contiguous.

Next are a couple of changes that extract long bodies into named
helpers. ep_insert() splits into ep_alloc_epitem() and
ep_register_epitem(); ep_clear_and_put()'s two passes become
ep_drain_pollwaits() and ep_drain_tree() so the ordering invariant is
enforced by the call sequence rather than convention; the per-event
delivery loop body extracts from ep_send_events() as ep_deliver_event();
and the ep->mtx + epnested_mutex acquisition dance lifts out of
do_epoll_ctl() into ep_ctl_lock() / ep_ctl_unlock(), with a return value
that doubles as the @full_check argument to ep_insert().

Next are a couple of changes that address sentinel and predicate sprawl.
The EP_UNACTIVE_PTR overload (meaning "no scan in progress" on
ep->ovflist and "epi not on ovflist" on epi->next) is hidden behind
named helpers (ep_is_scanning, epi_on_ovflist, ...); epi->next is
renamed to epi->ovflist_next and the local txlist to scan_batch; and
is_file_epoll(), ep_is_linked(), ep_events_available() are converted to
return bool to match their already-boolean bodies.

And last we move the per-CTL_ADD scratch state (tfile_check_list,
path_count[], inserting_into) from file-scope globals into a
stack-allocated struct ep_ctl_ctx plumbed through the loop / path check
chain. loop_check_gen stays at file scope because the stamp it leaves on
ep->gen across calls must not collide with a future walk.

The load-bearing invariants the UAF series closed are preserved
verbatim: the epi_fget() pin in ep_remove(), the ordering of
ep_unregister_pollwait() before ep_remove_file() / ep_remove_epi()
in all three removal paths, kfree_rcu(epi) and kfree_rcu(ep), the
POLLFREE smp_store_release / smp_load_acquire pair on pwq->whead,
ep->lock IRQ-safety, the mutex_lock_nested() subclass arithmetic
in ep_insert (subclass 0 outer, 1 for tep) and __ep_eventpoll_poll
/ ep_loop_check_proc (depth-based), and the WARN_ON_ONCE contract
on ep_put() in ep_remove().

* patches from https://patch.msgid.link/20260424-work-epoll-rework-v1-0-249ed00a20f3@kernel.org:
  eventpoll: hoist CTL_ADD scratch state into struct ep_ctl_ctx
  eventpoll: use bool for predicate helpers
  eventpoll: rename epi->next and txlist for clarity
  eventpoll: wrap EP_UNACTIVE_PTR in typed sentinel helpers
  eventpoll: extract lock dance from do_epoll_ctl() into ep_ctl_lock()
  eventpoll: extract ep_deliver_event() from ep_send_events()
  eventpoll: split ep_clear_and_put() into drain helpers
  eventpoll: split ep_insert() into alloc + register stages
  eventpoll: relocate KCMP helpers near compat syscalls
  eventpoll: rename attach_epitem() to ep_attach_file()
  eventpoll: drop unused depth argument from epoll_mutex_lock()
  eventpoll: rename ep_refcount_dec_and_test() to ep_put()
  eventpoll: document ep_clear_and_put() two-pass pattern
  eventpoll: refresh epi_fget() / ep_remove_file() comments
  eventpoll: clarify POLLFREE handshake comments
  eventpoll: document loop-check / path-check globals
  eventpoll: expand top-of-file overview / locking doc

Link: https://patch.msgid.link/20260424-work-epoll-rework-v1-0-249ed00a20f3@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>

eventpoll: hoist CTL_ADD scratch state into struct ep_ctl_ctx

Three globals were shared between the loop check and the path check
paths: tfile_check_list (chain of epitems_head to walk afterwards),
path_count[] (per-depth wakeup-path tally) and inserting_into
(cycle-detection sentinel). All three are scratch state used only
during a single EPOLL_CTL_ADD full_check, yet they sit at file
scope and rely on epnested_mutex for exclusion.

The area has had three bugs in the last year -- CVE-2025-38349,
f2e467a48287 ("eventpoll: Fix semi-unbounded recursion"), and
fdcfce93073d ("eventpoll: Fix integer overflow in
ep_loop_check_proc()") -- all rooted in the shared-mutable-global
pattern being hard to reason about.

Collect the three into a stack-allocated struct ep_ctl_ctx:

    struct ep_ctl_ctx {
        struct eventpoll   *inserting_into;
        struct epitems_head *tfile_check_list;
        int                 path_count[PATH_ARR_SIZE];
    };

do_epoll_ctl() zero-initializes one on its stack and plumbs it
through ep_ctl_lock() / ep_ctl_unlock() / ep_insert() /
ep_register_epitem() / list_file() / ep_loop_check() /
ep_loop_check_proc() / reverse_path_check() /
reverse_path_check_proc() / path_count_inc() / path_count_init() /
clear_tfile_check_list(). Non-nested inserts leave the ctx zeroed
and skip the machinery entirely.
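
The plumbing pattern can be sketched in userspace C. This is a toy
cycle check over a singly linked chain, not the eventpoll walk; the
demo_* names, DEMO_MAX_DEPTH, and the -EINVAL depth error are
assumptions for the example.

```c
#include <errno.h>
#include <stddef.h>

#define DEMO_MAX_DEPTH 4	/* hypothetical nesting limit */

struct demo_node {
	struct demo_node *child;
};

/* Per-call scratch state that would otherwise live in file-scope globals. */
struct demo_ctx {
	const struct demo_node *inserting_into;
	int path_count[DEMO_MAX_DEPTH];
};

/* Recursive walk: every level receives the same ctx pointer. */
static int demo_walk(struct demo_ctx *ctx, const struct demo_node *n,
		     int depth)
{
	if (depth >= DEMO_MAX_DEPTH)
		return -EINVAL;
	if (n == ctx->inserting_into)
		return -ELOOP;	/* adding the link would close a cycle */

	ctx->path_count[depth]++;
	if (!n->child)
		return 0;
	return demo_walk(ctx, n->child, depth + 1);
}

/* Entry point: the ctx is zero-initialized on the stack for each call. */
static int demo_check(const struct demo_node *into,
		      const struct demo_node *target)
{
	struct demo_ctx ctx = { .inserting_into = into };

	return demo_walk(&ctx, target, 0);
}
```

Because the ctx lives on the caller's stack, two concurrent checks in
this sketch cannot observe each other's scratch state.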

With the scratch state in ctx:
  - tfile_check_list no longer has an EP_UNACTIVE_PTR sentinel --
    NULL is the obvious "empty" value and the zero-init handles it
    for free;
  - path_count[] is no longer an array global that could be touched
    in unexpected orderings;
  - inserting_into is scoped to the exact call that set it.

loop_check_gen stays as a file-scope monotonic counter, because the
stamp left on ep->gen by a completed walk must not equal the stamp
of a future walk -- something a stack-local value cannot guarantee
across calls. It remains protected by epnested_mutex for the bump
and read lockless for the "do we need a full check" trigger in
ep_ctl_lock().

Every bail-out that existed before (the ELOOP on cycle, the path
limit check, the unbounded-recursion cap, the +1 overflow guard) is
preserved verbatim; only the data they operate on moved from file
scope to the stack ctx.

No functional change.

Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
Link: https://patch.msgid.link/20260424-work-epoll-rework-v1-17-249ed00a20f3@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>

eventpoll: use bool for predicate helpers

Three inline predicates -- is_file_epoll(), ep_is_linked(),
ep_events_available() -- were declared to return int even though
their only use is as a truthy test and their bodies are already
boolean expressions. ep_has_wakeup_source(), in the same file,
returns bool, so the convention was already inconsistent.

Convert all three to return bool. Rewrite ep_events_available()'s
verbose kerneldoc to the same one-line style the rest of the
predicates use now.

ep_poll()'s local eavail variable stores the result of
ep_events_available() (already boolean), ep_busy_loop() (returns
bool), and list_empty() (int but tested as boolean). Split it out
of the combined int declaration and give it bool type; replace the
one "eavail = 1" after a wakeup with "eavail = true" to match.

No functional change.

Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
Link: https://patch.msgid.link/20260424-work-epoll-rework-v1-16-249ed00a20f3@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>

eventpoll: rename epi->next and txlist for clarity

Two list-related names were confusing in isolation:

  struct epitem::next
    A singly-linked link slot used only when an epi is queued on
    ep->ovflist during an ep_start_scan/ep_done_scan window. The
    bare name "next" suggests a generic list link and doesn't say
    which list it belongs to.

  txlist
    The caller-local list_head used by ep_send_events() and
    __ep_eventpoll_poll() to hold the batch of items stolen from
    ep->rdllist for the current scan. "txlist" ("transmission
    list") is abbreviated and overloaded: it doesn't distinguish
    itself from ep->rdllist or ep->ovflist at a glance.

Rename for what each actually is:

  struct epitem::next   -> struct epitem::ovflist_next
  local txlist          -> scan_batch

With these in place:
  - epi->ovflist_next reads as "this is the ep->ovflist link slot",
    matching the rdllink pattern above it.
  - scan_batch reads as "the batch currently being scanned", clearly
    distinct from rdllist (canonical ready list) and ovflist
    (scan-window overflow).

ep->rdllist and ep->ovflist struct field names are preserved -- they
are long-standing interface-facing identifiers, and the new inline
helpers (ep_is_scanning, epi_on_ovflist, ...) already hide the
sentinel semantics at call sites.

No functional change.

Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
Link: https://patch.msgid.link/20260424-work-epoll-rework-v1-15-249ed00a20f3@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>

eventpoll: wrap EP_UNACTIVE_PTR in typed sentinel helpers

ep->ovflist and epi->next both use EP_UNACTIVE_PTR (a cast to
(void *)-1) as a sentinel, with distinct meanings at each site:

  ep->ovflist == EP_UNACTIVE_PTR         no scan in progress
  epi->next   == EP_UNACTIVE_PTR         epi not on ovflist

Call sites had to know the sentinel's value and, by convention, what
it meant in each context. Hide both behind inline helpers:

  ep_is_scanning(ep)       predicate for "scan in progress"
  ep_enter_scan(ep)        WRITE_ONCE flip to NULL (scan start)
  ep_exit_scan(ep)         WRITE_ONCE flip to sentinel (scan end)
  epi_on_ovflist(epi)      predicate for "epi is on ovflist"
  epi_clear_ovflist(epi)   clear epi's ovflist link slot
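
A userspace sketch of how such wrappers might look. The demo_* types
are simplified stand-ins, and plain assignments replace WRITE_ONCE();
the real helpers may differ in detail.

```c
#include <stdbool.h>
#include <stddef.h>

/* Sentinel distinct from NULL and from any valid pointer. */
#define DEMO_UNACTIVE_PTR ((void *)-1L)

struct demo_epi {
	struct demo_epi *next;		/* ovflist link slot */
};

struct demo_ep {
	struct demo_epi *ovflist;	/* sentinel when no scan is running */
};

/* "scan in progress": ovflist is NULL or a real list, not the sentinel */
static bool demo_ep_is_scanning(const struct demo_ep *ep)
{
	return ep->ovflist != DEMO_UNACTIVE_PTR;
}

/* scan start: flip ovflist to NULL (an empty overflow list) */
static void demo_ep_enter_scan(struct demo_ep *ep)
{
	ep->ovflist = NULL;
}

/* scan end: flip ovflist back to the sentinel */
static void demo_ep_exit_scan(struct demo_ep *ep)
{
	ep->ovflist = DEMO_UNACTIVE_PTR;
}

/* "epi is on ovflist": its link slot holds something other than the sentinel */
static bool demo_epi_on_ovflist(const struct demo_epi *epi)
{
	return epi->next != DEMO_UNACTIVE_PTR;
}

/* clear the epi's ovflist link slot */
static void demo_epi_clear_ovflist(struct demo_epi *epi)
{
	epi->next = DEMO_UNACTIVE_PTR;
}
```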

Convert ep_events_available(), ep_start_scan(), ep_done_scan(),
ep_poll_callback(), and ep_alloc_epitem() to use the wrappers. The
ovflist state-machine transitions are now named, not encoded in
sentinel comparisons, and the top-of-file "Ready-list state machine"
section is the single place that spells out the sentinel's meaning.

ep_alloc() keeps the raw "ep->ovflist = EP_UNACTIVE_PTR" init (no
concurrent access at that point) with an inline "not scanning"
comment, and the tfile_check_list sentinel is left alone -- it will
disappear entirely when the loop-check globals move into a
stack-allocated ep_ctl_ctx in a later commit.

Also rework ep_done_scan()'s for-loop: the combined initializer +
update clause that advanced nepi AND cleared epi->next in one step
was clever but hard to read; splitting the update into two
statements inside the body makes the epi_clear_ovflist() call
visible.

No functional change.

Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
Link: https://patch.msgid.link/20260424-work-epoll-rework-v1-14-249ed00a20f3@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>

eventpoll: extract lock dance from do_epoll_ctl() into ep_ctl_lock()

do_epoll_ctl() interleaved three concerns in one body: input
validation, the ep->mtx + epnested_mutex acquisition dance for
EPOLL_CTL_ADD on potentially-nested topologies, and the op dispatch
with final unlock. The middle concern is the error-prone one; the
error_tgt_fput label existed mainly to orchestrate it.

Extract the acquisition as ep_ctl_lock() and the release as
ep_ctl_unlock(). ep_ctl_lock() always takes ep->mtx and, for
EPOLL_CTL_ADD on a topology that can change, additionally runs the
loop / path check under epnested_mutex. The return value is a
ternary:

   0        ep->mtx held.
   1        ep->mtx AND epnested_mutex held (full-check mode).
   -errno   failure, no locks held.

The non-negative value doubles as the @full_check argument to
ep_insert() and as the argument to ep_ctl_unlock(), so the caller
neither needs an out-parameter nor a separate boolean:

   full_check = ep_ctl_lock(ep, op, epfile, tfile, nonblock);
   if (full_check < 0)
       return full_check;
   ...
   ep_ctl_unlock(ep, full_check);

ep_ctl_unlock() drops ep->mtx and, if full_check == 1, clears
tfile_check_list, bumps loop_check_gen, and drops epnested_mutex --
mirroring the old error_tgt_fput block.
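
The return-value convention can be modelled in userspace with flags in
place of real mutexes. Everything here (struct demo_locks, the demo_*
functions, the needs_full_check and check_fails parameters) is a
hypothetical model of the contract, not the kernel implementation.

```c
#include <errno.h>
#include <stdbool.h>

/* Flags standing in for ep->mtx and epnested_mutex. */
struct demo_locks {
	int mtx_held;
	int nested_held;
};

/*
 * Returns 0 with mtx held, 1 with mtx and nested held (full-check
 * mode), or a negative errno with no locks held.
 */
static int demo_ctl_lock(struct demo_locks *l, bool needs_full_check,
			 bool check_fails)
{
	if (!needs_full_check) {
		l->mtx_held = 1;
		return 0;
	}

	l->nested_held = 1;
	if (check_fails) {
		l->nested_held = 0;	/* failure path: drop everything */
		return -ELOOP;
	}
	l->mtx_held = 1;
	return 1;
}

/* Takes the non-negative value returned by demo_ctl_lock(). */
static void demo_ctl_unlock(struct demo_locks *l, int full_check)
{
	l->mtx_held = 0;
	if (full_check)
		l->nested_held = 0;
}

/* Caller shape: one acquisition call, one release call, no labels. */
static int demo_ctl(struct demo_locks *l, bool needs_full_check,
		    bool check_fails)
{
	int full_check = demo_ctl_lock(l, needs_full_check, check_fails);

	if (full_check < 0)
		return full_check;

	/* The op dispatch would run here with the locks held. */

	demo_ctl_unlock(l, full_check);
	return 0;
}
```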

With that in place do_epoll_ctl()'s preconditions become direct
returns (no locks held, nothing to clean up), the acquisition is a
single call, the op dispatch is unchanged, and the epilogue is a
single ep_ctl_unlock() before return. The error_tgt_fput label goes
away.

The two loop_check_gen bumps (one at the start of the full check,
one after) are preserved inside ep_ctl_lock() / ep_ctl_unlock(),
keeping the invariant that ep->gen stamps left on per-eventpoll
caches never equal loop_check_gen after the check completes.

No functional change.

Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
Link: https://patch.msgid.link/20260424-work-epoll-rework-v1-13-249ed00a20f3@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>

eventpoll: extract ep_deliver_event() from ep_send_events()

ep_send_events()'s body covered two concerns: per-item work (PM
wakeup-source bookkeeping, re-poll, copy_to_user, level-trigger
re-queue, EPOLLONESHOT mask clear) and the scan-level accumulator
(maxevents cap, EFAULT preservation, txlist/rdllist splice).

Extract the per-item work as ep_deliver_event(), which returns a
tri-state int:

  1        one event was delivered; caller advances the counter,
  0        re-poll produced no caller-requested events (item drops
           out of the ready list; a future callback will re-queue),
  -EFAULT  copy_to_user() faulted; item is already re-inserted at
           the head of the txlist so ep_done_scan() splices it back
           to rdllist.

The per-item comments (PM ordering, the "sole writer to rdllist"
invariant for the LT re-queue, the EFAULT semantics) move into
ep_deliver_event(). ep_send_events() reduces to the fatal-signal
short-circuit, scan bracket, and a short txlist walk that accumulates
the deliveries and preserves the "first error wins" EFAULT contract
(res = delivered only if no event was previously delivered; otherwise
the success count is returned and -EFAULT is reported on the next
call).
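
The accumulator side of this contract can be sketched in userspace C.
The item encoding (positive, zero, negative) and the demo_* names are
assumptions made for the example; the real per-item work is replaced by
a trivial classifier.

```c
#include <errno.h>

/*
 * Stand-in for the per-item helper: 1 if an event was delivered, 0 if
 * the item had nothing to report, -EFAULT if the copy-out failed.
 */
static int demo_deliver(int item)
{
	if (item < 0)
		return -EFAULT;
	return item > 0 ? 1 : 0;
}

/*
 * Walk the batch and accumulate deliveries. An error is returned only
 * if nothing was delivered before it; otherwise the success count wins.
 */
static int demo_send_events(const int *items, int n, int maxevents)
{
	int res = 0;
	int i;

	for (i = 0; i < n && res < maxevents; i++) {
		int delivered = demo_deliver(items[i]);

		if (delivered < 0) {
			if (!res)
				res = delivered;
			break;
		}
		res += delivered;
	}
	return res;
}
```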

No functional change.

Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
Link: https://patch.msgid.link/20260424-work-epoll-rework-v1-12-249ed00a20f3@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>

eventpoll: split ep_clear_and_put() into drain helpers

ep_clear_and_put()'s two-pass walk is the main way an ep file close
tears down its state, and the ordering between the passes is
load-bearing (see previous commit's docblock). Give each pass its
own function so the ordering is enforced by the call sequence in
ep_clear_and_put() rather than by convention inside one body.

ep_drain_pollwaits() carries out Pass 1: walk the rbtree and
ep_unregister_pollwait() each epi. The function-level comment names
it as Pass 1 and spells out the synchronization contract with
ep_poll_callback().

ep_drain_tree() carries out Pass 2: walk the rbtree and ep_remove()
each epi, capturing rb_next() before each erase. The comment names
it as Pass 2 and documents the hand-off with a concurrent
eventpoll_release_file() (removal path C).

ep_clear_and_put() keeps the poll-on-ep wakeup, ep->mtx bracketing,
and ep_put() + conditional ep_free(), and its docblock shrinks to
the high-level summary; the per-pass detail moved into the helpers.

No functional change.

Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
Link: https://patch.msgid.link/20260424-work-epoll-rework-v1-11-249ed00a20f3@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>

eventpoll: split ep_insert() into alloc + register stages

ep_insert() was 130 lines and mixed four concerns in one body: user
quota charge and epitem allocation, attach-into-file-hlist plus
rbtree insert plus target-ep locking, reverse-path + EPOLLWAKEUP +
poll-queue install with rollback, and ready-list publication.
Factor the first two concerns into named helpers so the body reduces
to orchestration.

ep_alloc_epitem() charges the user's epoll_watches quota, allocates
a fresh epitem, and initializes its fields. On failure it returns
ERR_PTR(-ENOSPC) or ERR_PTR(-ENOMEM); on success the epi is not yet
linked into anything.

ep_register_epitem() installs @epi into @tfile's f_ep hlist and
@ep's rbtree, optionally chains @tfile onto tfile_check_list for the
path check, takes the tep->mtx nested lock for the epoll-watches-
epoll case, and finally takes the ep_get() reference that pairs
with ep_remove()'s ep_put() in ep_insert()'s error paths. On failure
it frees the epi and decrements epoll_watches to match
ep_alloc_epitem().

ep_insert()'s remaining body is the rollback-via-ep_remove() chain
(reverse_path_check, EPOLLWAKEUP source creation, ep_ptable_queue_proc
allocation) and the ready-list / wake publication. Remove a few
stale comments that duplicated function-level documentation or
described obvious code.

No functional change; rollback boundaries unchanged -- every error
path after ep_register_epitem() still calls ep_remove(), preserving
the ep->refcount invariant that keeps ep_remove()'s WARN_ON_ONCE safe.

Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
Link: https://patch.msgid.link/20260424-work-epoll-rework-v1-10-249ed00a20f3@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>

eventpoll: relocate KCMP helpers near compat syscalls

ep_find_tfd() and get_epoll_tfile_raw_ptr() are only used when
CONFIG_KCMP=y. They implement the lookup side of the kcmp(2)
KCMP_EPOLL_TFD query. The helpers currently live between ep_find()
and ep_poll_callback(), interrupting the run of hot-path code
(callback, wait-queue setup, path check, insert, modify, send_events,
poll) with a feature-gated block.

Move the #ifdef CONFIG_KCMP block next to #ifdef CONFIG_COMPAT, which
is also a peripheral ABI extension. Hot-path code becomes a
contiguous span, and the userspace-adjacent extensions cluster at the
end of the file just before eventpoll_init().

Pure code movement; diff is 44 removed and 44 added, all within one
block. No functional change.

Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
Link: https://patch.msgid.link/20260424-work-epoll-rework-v1-9-249ed00a20f3@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>

eventpoll: rename attach_epitem() to ep_attach_file()

ep_remove_file() tears down the f_ep linkage that attach_epitem()
establishes, so the pair should look like one. Rename to
ep_attach_file() for the "ep_*" + subject symmetry and to match the
naming used elsewhere in the file (ep_insert, ep_modify, ep_remove,
ep_remove_file, ep_remove_epi, ep_unregister_pollwait).

Pure rename; no functional change.

Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
Link: https://patch.msgid.link/20260424-work-epoll-rework-v1-8-249ed00a20f3@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>

eventpoll: drop unused depth argument from epoll_mutex_lock()

epoll_mutex_lock() has three callers, all in do_epoll_ctl(), and every
one passes depth == 0. The argument has been dead since the helper was
introduced. Drop it. Because a zero subclass makes mutex_lock_nested()
equivalent to mutex_lock(), switch the blocking path to the simpler
primitive as well.

No functional change.

Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
Link: https://patch.msgid.link/20260424-work-epoll-rework-v1-7-249ed00a20f3@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>

eventpoll: rename ep_refcount_dec_and_test() to ep_put()

ep_refcount_dec_and_test() mirrors refcount_dec_and_test() verbatim,
which reads fine at a call site like

  if (ep_refcount_dec_and_test(ep))
      ep_free(ep);

but reads awkwardly at

  WARN_ON_ONCE(ep_refcount_dec_and_test(ep));

and does not pair cleanly with ep_get(). Rename to the idiomatic
ep_put() and reword the kerneldoc to spell out the return-value
contract (caller is responsible for ep_free() iff the return is
true). Leave ep_put() as a bool-returning wrapper -- we cannot fold
ep_free() into it because ep_remove() calls it under ep->mtx and the
mutex would still be held when ep_free()'s mutex_destroy() ran (see
commit 8c2e52ebbe88 "eventpoll: don't decrement ep refcount while
still holding the ep mutex").
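
As a rough userspace illustration of the return-value contract described
above (not the kernel code), the sketch below uses C11 atomics in place
of refcount_t. struct obj, obj_get() and obj_put() are hypothetical
names chosen for this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct eventpoll; only the refcount matters. */
struct obj {
	atomic_int refcount;
};

static void obj_get(struct obj *o)
{
	atomic_fetch_add(&o->refcount, 1);
}

/*
 * Drop one reference. Returns true iff this was the last reference, in
 * which case the caller is responsible for freeing the object, and only
 * after it has released any lock that lives inside the object.
 */
static bool obj_put(struct obj *o)
{
	int old = atomic_fetch_sub(&o->refcount, 1);

	assert(old > 0);	/* a put without a matching get is a bug */
	return old == 1;
}
```

Keeping the free out of obj_put() lets a caller that holds a lock
embedded in the object drop the reference, unlock, and only then free.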

No functional change.

Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
Link: https://patch.msgid.link/20260424-work-epoll-rework-v1-6-249ed00a20f3@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>

eventpoll: document ep_clear_and_put() two-pass pattern

ep_clear_and_put() walks the rbtree twice: once to drain each epi's
pwqlist, then again to ep_remove() each entry. The split is
load-bearing -- fusing the passes into one loop would let a poll
callback still queued on some epi_i fire after epi_{i+k} has already
been freed -- but the previous comments described each pass in
isolation and did not explain the ordering invariant or the
cooperation with removal path C (eventpoll_release_file).

Add a function-level docblock that labels this as path B from the
top-of-file "Removal paths" section, names the two passes and the
ordering invariant, explains the pwqlist drain as synchronization
with in-flight ep_poll_callback() via whead->lock, describes the
C-path hand-off when epi_fget() returns NULL, and states the
ep->refcount invariant that keeps ep_remove()'s WARN_ON_ONCE safe
across the loop.

Also tighten the per-pass comments to one line each and fix the
minor grammar bug in the poll_wait release comment ("these file" ->
"poll-on-ep").

No functional change.

Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
Link: https://patch.msgid.link/20260424-work-epoll-rework-v1-5-249ed00a20f3@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>

eventpoll: refresh epi_fget() / ep_remove_file() comments

Two comments drifted from the code they sit on.

epi_fget()'s block comment still referenced atomic_long_inc_not_zero,
which has been file_ref_get() for a while, and described only one of
the function's two roles: safe dereference of epi->ffd.file under
ep->mtx. Since commit a6dc643c6931 ("eventpoll: fix ep_remove struct
eventpoll / struct file UAF") the refcount bump also serves as a pin
that blocks __fput() from starting, which is what lets ep_remove()
touch file->f_lock and file->f_ep without racing
eventpoll_release_file(). Update the block to name both roles and the
commit that introduced the pin role.

ep_remove_file()'s one-line "See eventpoll_release() for details"
pointed at an inline in include/linux/eventpoll.h but said nothing
about what those details were. Replace it with a short explanation:
we publish NULL so the eventpoll_release() fastpath can skip the slow
path, and this is safe because every f_ep writer either holds a pin
via epi_fget() or is __fput() itself.

Comment-only; no functional change.

Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
Link: https://patch.msgid.link/20260424-work-epoll-rework-v1-4-249ed00a20f3@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>

eventpoll: clarify POLLFREE handshake comments

ep_remove_wait_queue() and the POLLFREE branch of ep_poll_callback()
are the two halves of a release/acquire handshake that lets a
subsystem (binder, signalfd, ...) tear down a wait-queue head from
under a registered epitem. The existing local comments documented the
race but did not name the protocol or refer readers from one side to
the other. After the previous commit added a "POLLFREE handshake"
section to the top-of-file banner, these sites can point at the
banner and at each other.

Rework the two comment blocks so that each side is labelled
"acquire side" or "release side", references the banner, and
explains its role in the protocol. On the release side fuse the two
former comments into one narrative: list_del_init() tolerates a
second delete from a racing ep_remove_wait_queue(), and the
smp_store_release() is what lets that racing remover discover the
teardown.
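
The shape of the handshake can be sketched in userspace with C11
atomics. This is a simplified model, not the eventpoll code: struct
wait_head, struct pwq and both functions are hypothetical, the "queued"
flag stands in for the wait-queue list entry, and the locking that the
real remover takes on the wait-queue head is omitted.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct wait_head {
	int unused;
};

struct pwq {
	_Atomic(struct wait_head *) whead;
	bool queued;	/* stand-in for the wait-queue list entry */
};

/* Release side: the wait-queue owner is tearing the head down. */
static void pollfree_release(struct pwq *p)
{
	p->queued = false;	/* models list_del_init() */
	/* Publish the teardown; must be the last access to *p here. */
	atomic_store_explicit(&p->whead, NULL, memory_order_release);
}

/*
 * Acquire side: the remover. Returns true if the head was still live
 * and the entry was dequeued here, false if the teardown was observed
 * and the head must not be touched.
 */
static bool remove_wait_queue_acquire(struct pwq *p)
{
	struct wait_head *h =
		atomic_load_explicit(&p->whead, memory_order_acquire);

	if (!h)
		return false;
	p->queued = false;
	return true;
}
```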

Comment-only; no functional change.

Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
Link: https://patch.msgid.link/20260424-work-epoll-rework-v1-3-249ed00a20f3@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>

eventpoll: document loop-check / path-check globals

The globals that support EPOLL_CTL_ADD's cycle and path-length checks
are scattered: epnested_mutex, loop_check_gen, inserting_into, and
tfile_check_list sit at the top of the file; path_count[] and
path_limits[] are declared inline with the path-check code further
down. Their interaction -- the "ep->gen == loop_check_gen" trigger in
do_epoll_ctl(), the two loop_check_gen++ bumps that sandwich a check,
the EP_UNACTIVE_PTR sentinel on tfile_check_list, the -ELOOP back-edge
detection via inserting_into -- is not documented anywhere.

The area has had three recent fixes (CVE-2025-38349, the unbounded
recursion fix, and the overflow fix) whose logic depends on these
invariants. Collect the description in one block alongside the
declarations, cross-reference the path_count[] declaration that lives
with the path-check code, and name the fix commits so future readers
can find the context.

Also add a short comment on struct epitems_head describing its
dual use (wrapper for non-epoll file->f_ep versus pointing into
&ep->refs for the epoll-watches-epoll case), which the old comment
on tfile_check_list had accidentally attached to the struct.

Comment-only; no functional change.

Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
Link: https://patch.msgid.link/20260424-work-epoll-rework-v1-2-249ed00a20f3@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>

eventpoll: expand top-of-file overview / locking doc

The existing ~40-line "LOCKING:" banner covered the three-level lock
hierarchy (epnested_mutex > ep->mtx > ep->lock) but nothing else.
Lifetime rules, the ready-list state machine, the three removal paths,
and the POLLFREE contract are implicit in the code. The recent UAF
series (a6dc643c6931, 07712db80857, 8c2e52ebbe88, f2e467a48287) rode
on invariants that were only implicit.

Codify them at the top of the file: the subsystem overview, the lock
hierarchy and its mutex_lock_nested() subclass convention (reworded
from the old banner), a field-protection table for struct eventpoll
and struct epitem that names the two faces of the rbn/rcu union (rbn
under ep->mtx while linked into ep->rbr; rcu touched only by
kfree_rcu(epi) on the free path), the ovflist sentinel encoding and
scan-flip invariants, the three removal paths (A ep_remove, B
ep_clear_and_put, C eventpoll_release_file) and the epi_fget() pin
that orchestrates A vs C, and the POLLFREE store-release /
load-acquire handshake.

No functional change.

Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
Link: https://patch.msgid.link/20260424-work-epoll-rework-v1-1-249ed00a20f3@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>

parisc: drivers: switch to dynamic root device

Driver core expects devices to be dynamically allocated and will, for
example, complain loudly if a device that lacks a release function
is ever freed.

Use root_device_register() to allocate and register the root device
instead of open coding using a static device.

While at it, drop the redundant additional reference taken at init.

Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Helge Deller <deller@gmx.de>

Revert "parisc: led: fix reference leak on failed device registration"

This reverts commit 707610bcccbd0327530938e33f3f33211a640a4e.

platform_device_register() is going to be fixed instead.

Signed-off-by: Helge Deller <deller@gmx.de>

RDMA/mlx5: Fix error path fall-through in mlx5_ib_dev_res_srq_init()

mlx5_ib_dev_res_srq_init() allocates two SRQs, s0 and s1. When
ib_create_srq() fails for s1, the error branch destroys s0 but falls
through and unconditionally assigns the freed s0 and the ERR_PTR s1 to
devr->s0 and devr->s1.

This leads to several problems: the lock-free fast path checks
"if (devr->s1) return 0;" and treats the ERR_PTR as already initialised;
users in mlx5_ib_create_qp() dereference the freed SRQ or ERR_PTR via
to_msrq(devr->s0)->msrq.srqn; and mlx5_ib_dev_res_cleanup() dereferences
the ERR_PTR and double-frees s0 on teardown.

Fix by adding the same `goto unlock` in the s1 failure path.
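
The control flow after the fix can be sketched as below. This is a
userspace model under stated assumptions, not the mlx5 code: struct res,
create_obj() and res_init() are hypothetical, allocation failure is
modelled with NULL rather than ERR_PTR, and the mutex is reduced to
comments.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct res {
	void *s0;
	void *s1;
};

/* Hypothetical allocator that can be forced to fail. */
static void *create_obj(bool fail)
{
	return fail ? NULL : malloc(1);
}

static int res_init(struct res *r, bool fail_s0, bool fail_s1)
{
	void *s0, *s1;
	int ret = 0;

	assert(r);
	if (r->s1)		/* fast path: already initialised */
		return 0;

	/* lock would be taken here */
	s0 = create_obj(fail_s0);
	if (!s0) {
		ret = -ENOMEM;
		goto unlock;
	}

	s1 = create_obj(fail_s1);
	if (!s1) {
		free(s0);
		ret = -ENOMEM;
		goto unlock;	/* without this, stale s0/s1 get published */
	}

	/* Publish only after both allocations succeeded. */
	r->s0 = s0;
	r->s1 = s1;
unlock:
	/* lock would be released here */
	return ret;
}
```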

Cc: stable@vger.kernel.org
Fixes: 5895e70f2e6e ("IB/mlx5: Allocate resources just before first QP/SRQ is created")
Link: https://patch.msgid.link/r/SYBPR01MB7881E1E0970268BD69C0BA75AF2B2@SYBPR01MB7881.ausprd01.prod.outlook.com
Reported-by: Yuhao Jiang <danisjiang@gmail.com>
Signed-off-by: Junrui Luo <moonafterrain@outlook.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

remoteproc: imx_rproc: Add support for i.MX94

Add basic remoteproc support for the i.MX94 M-core processors, including
address translation tables (dev addr is from the view of the remote
processor, sys addr is from the view of the main processor) and device
configuration data for the CM70, CM71, and CM33S cores.

Reviewed-by: Daniel Baluta <daniel.baluta@nxp.com>
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Link: https://lore.kernel.org/r/20260427-imx943-rproc-v4-3-68d7c7253acd@nxp.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>

remoteproc: imx_rproc: Program non-zero SM CPU/LMM reset vector

Cortex-M[7,33] processors use a fixed reset vector table format:

  0x00  Initial SP value
  0x04  Reset vector
  0x08  NMI
  0x0C  ...
  ...
  IRQ[n]

In ELF images, the corresponding layout is:

reset_vectors:  --> hardware reset address
        .word __stack_end__
        .word Reset_Handler
        .word NMI_Handler
        .word HardFault_Handler
        ...
        .word UART_IRQHandler
        .word SPI_IRQHandler
        ...

Reset_Handler:  --> ELF entry point address
        ...

The hardware fetches the first two words from reset_vectors and populates
SP with __stack_end__ and PC with Reset_Handler. Execution proceeds from
Reset_Handler.

However, the ELF entry point does not always match the hardware reset
address. For example, on i.MX94 CM33S:

  ELF entry point:     0x0ffc211d
  hardware reset base: 0x0ffc0000 (default reset value, sw programmable)

The current driver always programs the reset vector as 0. But the i.MX94
CM33S's default reset base is 0x0ffc0000, so the correct reset vector must
be passed to the SM API; otherwise the M33 Sync core cannot boot.

rproc_elf_get_boot_addr() returns the ELF entry point, which is not the
hardware reset vector address. Fix the issue by deriving the hardware reset
vector locally using a SoC-specific mask:

  reset_vector = rproc->bootaddr & reset_vector_mask

The ELF entry point semantics remain unchanged. The masking is applied only
at the point where the SM reset vector is programmed.

Add reset_vector_mask = GENMASK_U32(31, 16) to the i.MX95 M7 configuration
so the hardware reset vector is derived correctly. Without this mask, the
SM reset vector would be programmed with an unaligned ELF entry point and
the M7 core would fail to boot.
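
The masking arithmetic can be checked with a small standalone sketch.
GENMASK_U32_SKETCH() and derive_reset_vector() are hypothetical names
for this example, and applying the 31:16 mask to the CM33S entry point
quoted above is an assumption made only to have concrete numbers.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for GENMASK_U32(h, l): bits h..l set, 0 <= l <= h <= 31. */
#define GENMASK_U32_SKETCH(h, l) \
	((UINT32_MAX << (l)) & (UINT32_MAX >> (31 - (h))))

static_assert(GENMASK_U32_SKETCH(31, 16) == UINT32_C(0xffff0000),
	      "mask should cover bits 31..16");

/* Derive the hardware reset vector base from the ELF entry point. */
static uint32_t derive_reset_vector(uint32_t bootaddr, uint32_t mask)
{
	return bootaddr & mask;
}
```

With bootaddr 0x0ffc211d and the 31:16 mask this yields 0x0ffc0000. A
mask of 0 yields 0, which matches the value the driver programmed
before this change.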

Reviewed-by: Daniel Baluta <daniel.baluta@nxp.com>
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Link: https://lore.kernel.org/r/20260427-imx943-rproc-v4-2-68d7c7253acd@nxp.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>

dt-bindings: remoteproc: imx-rproc: Support i.MX94

Add compatible strings for:
Cortex-M7 core[0,1] in i.MX94
Cortex-M33 Sync core in i.MX94

On i.MX94, Cortex-M7 core0 and core1 have different memory views as seen
from the Cortex-A55 cores, so different compatible strings are used.

Reviewed-by: Daniel Baluta <daniel.baluta@nxp.com>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Link: https://lore.kernel.org/r/20260427-imx943-rproc-v4-1-68d7c7253acd@nxp.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>

RDMA/rxe: Reject non-8-byte ATOMIC_WRITE payloads

atomic_write_reply() at drivers/infiniband/sw/rxe/rxe_resp.c
unconditionally dereferences 8 bytes at payload_addr(pkt):

value = *(u64 *)payload_addr(pkt);

check_rkey() previously accepted an ATOMIC_WRITE request with pktlen ==
resid == 0 because the length validation only compared pktlen against
resid. A remote initiator that sets the RETH length to 0 therefore reaches
atomic_write_reply() with a zero-byte logical payload, and the responder
reads sizeof(u64) bytes from past the logical end of the packet into
skb->head tailroom, then writes those 8 bytes into the attacker's MR via
rxe_mr_do_atomic_write(). That is a remote disclosure of 4 bytes of kernel
tailroom per probe (the other 4 bytes are the packet's own trailing ICRC).

IBA oA19-28 defines ATOMIC_WRITE as exactly 8 bytes. Anything else is
protocol-invalid. Hoist a strict length check into check_rkey() so the
responder never reaches the unchecked dereference, and keep the existing
WRITE-family length logic for the normal RDMA WRITE path.
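
The difference between the two checks can be illustrated with a small
model. Both helpers are hypothetical and are not the rxe code:
lengths_match() is a simplified stand-in for a check that only compares
pktlen against resid, and atomic_write_len_ok() is one possible strict
form.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ATOMIC_WRITE_LEN 8u

static_assert(ATOMIC_WRITE_LEN == sizeof(uint64_t),
	      "ATOMIC_WRITE carries exactly one 64-bit value");

/* Simplified model of a check that only compares the two lengths. */
static bool lengths_match(size_t pktlen, size_t resid)
{
	return pktlen == resid;
}

/* Strict form: both lengths must be exactly 8 bytes. */
static bool atomic_write_len_ok(size_t pktlen, size_t resid)
{
	return pktlen == ATOMIC_WRITE_LEN && resid == ATOMIC_WRITE_LEN;
}
```

A request with pktlen == resid == 0 passes the first helper and fails
the second, so the 8-byte dereference is never reached for it.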

Reproduced on mainline with an unmodified rxe driver: a sustained
zero-length ATOMIC_WRITE probe repeatedly leaks adjacent skb head-buffer
bytes into the attacker's MR, including recognisable kernel strings and
partial kernel-direct-map pointer words. With this patch applied the
responder rejects the PDU and the MR stays all-zero.

Cc: stable@vger.kernel.org
Fixes: 034e285f8b99 ("RDMA/rxe: Make responder support atomic write on RC service")
Link: https://patch.msgid.link/r/20260418162141.3610201-1-michael.bommarito@gmail.com
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

selftests/bpf: Rename libarena malloc/free methods

The s390 architecture uses the token "free" for an enum, conflicting
with the malloc/free definitions. Rename the calls to arena_malloc and
arena_free instead to prevent collisions.

Reported-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Signed-off-by: Emil Tsalapatis <etsal@meta.com>
Fixes: 86426a28c52d ("selftests/bpf: Add buddy allocator for libarena")
Acked-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Link: https://lore.kernel.org/r/20260428134252.2783519-1-etsal@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

RDMA/rxe: Reject unknown opcodes before ICRC processing

Even after applying commit 7244491dab34 ("RDMA/rxe: Validate pad and ICRC
before payload_size() in rxe_rcv"), a single unauthenticated UDP packet
can still trigger a panic.  That patch handled payload_size() underflow only
for valid opcodes with short packets, not for packets carrying an unknown
opcode.  The unknown-opcode OOB read described below predates that commit
and reaches back to the initial Soft RoCE driver.

The check added there reads

    pkt->paylen < header_size(pkt) + bth_pad(pkt) + RXE_ICRC_SIZE

where header_size(pkt) expands to rxe_opcode[pkt->opcode].length.  The
rxe_opcode[] array has 256 entries but is only populated for defined IB
opcodes; any other entry (for example opcode 0xff) is zero-initialized, so
length == 0 and the check degenerates to

    pkt->paylen < 0 + bth_pad(pkt) + RXE_ICRC_SIZE

which does not constrain pkt->paylen enough.  rxe_icrc_hdr() then computes

    rxe_opcode[pkt->opcode].length - RXE_BTH_BYTES

which underflows when length == 0 and passes a huge value to rxe_crc32(),
causing an out-of-bounds read of the skb payload.

Reproduced on v7.0-rc7 with that fix applied, QEMU/KVM with
CONFIG_RDMA_RXE=y and CONFIG_KASAN=y, after

    rdma link add rxe0 type rxe netdev eth0

A single 48-byte UDP packet to port 4791 with BTH opcode=0xff and
QPN=IB_MULTICAST_QPN triggers:

    BUG: KASAN: slab-out-of-bounds in crc32_le+0x115/0x170
    Read of size 1 at addr ...
    The buggy address is located 0 bytes to the right of
     allocated 704-byte region
    Call Trace:
     crc32_le+0x115/0x170
     rxe_icrc_hdr.isra.0+0x226/0x300
     rxe_icrc_check+0x13f/0x3a0
     rxe_rcv+0x6e1/0x16e0
     rxe_udp_encap_recv+0x20a/0x320
     udp_queue_rcv_one_skb+0x7ed/0x12c0

Subsequent packets with the same shape fault on unmapped memory and panic
the kernel.  The trigger requires only module load and "rdma link add"; no
QP, no connection, and no authentication.

Fix this by rejecting packets whose opcode has no rxe_opcode[] entry,
detected via the zero mask or zero length, before any length arithmetic
runs.
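
The underflow and the guard can be reproduced in a standalone sketch.
struct opcode_info and both helpers are hypothetical, and BTH_BYTES is
an assumed stand-in for RXE_BTH_BYTES; the point is only what happens
when a zero length is used in the subtraction and then treated as a
size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BTH_BYTES 12	/* assumption: stand-in for RXE_BTH_BYTES */

/* Hypothetical, trimmed-down opcode table entry. */
struct opcode_info {
	unsigned int mask;
	int length;
};

/* Length that reaches the CRC helper when the opcode is not validated. */
static size_t crc_hdr_len_unchecked(const struct opcode_info *info)
{
	assert(info);
	/* length == 0 gives -12, which becomes a huge size_t. */
	return (size_t)(info->length - BTH_BYTES);
}

/* Guard: an entry with zero mask or zero length was never populated. */
static bool opcode_is_known(const struct opcode_info *info)
{
	assert(info);
	return info->mask != 0 && info->length != 0;
}
```

For a zero-initialized entry the unchecked length is SIZE_MAX - 11,
while opcode_is_known() returns false before any arithmetic is done.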

Cc: stable@vger.kernel.org
Fixes: 8700e3e7c485 ("Soft RoCE driver")
Link: https://patch.msgid.link/r/20260414111555.3386793-1-michael.bommarito@gmail.com
Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Merge tag 'md-7.1-20260428' of https://git.kernel.org/pub/scm/linux/kernel/git/mdraid/linux into block-7.1

Pull MD fixes from Yu Kuai:

"Bug Fixes:
- Fix a raid5 UAF on IO across the reshape position.
- Avoid failing RAID1/RAID10 devices for invalid IO errors.
- Fix RAID10 divide-by-zero when far_copies is zero.
- Restore bitmap grow through sysfs.

Cleanups:
- Use mddev_is_dm() instead of open-coding gendisk checks.
- Use ATTRIBUTE_GROUPS() for md default sysfs attributes.
- Replace open-coded wait loops with wait_event helpers."

* tag 'md-7.1-20260428' of https://git.kernel.org/pub/scm/linux/kernel/git/mdraid/linux:
  md: use ATTRIBUTE_GROUPS() for md default sysfs attributes
  md: use mddev_is_dm() instead of open-coding gendisk checks
  md/raid1: replace wait loop with wait_event_idle() in raid1_write_request()
  md/md-bitmap: add a none backend for bitmap grow
  md/md-bitmap: split bitmap sysfs groups
  md: factor bitmap creation away from sysfs handling
  md: use mddev_lock_nointr() in mddev_suspend_and_lock_nointr()
  md: replace wait loop with wait_event() in md_handle_request()
  md/raid10: fix divide-by-zero in setup_geo() with zero far_copies
  md/raid1,raid10: don't fail devices for invalid IO errors
  MAINTAINERS: Add Xiao Ni as md/raid reviewer
  md/raid5: Fix UAF on IO across the reshape position

IB/hfi1: Fix potential use-after-free in PIO and SDMA map teardown

The current teardown logic for dd->pio_map and dd->sdma_map frees the
structures while they might still be accessed by RCU readers. Although the
pointer is nulled under a spinlock, the memory is reclaimed before waiting
for the grace period to end.

This patch fixes the sequence by:
1. Extracting the pointer under the lock.
2. Clearing the RCU-protected pointer.
3. Waiting for readers to finish with synchronize_rcu().
4. Finally freeing the memory.

Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Link: https://patch.msgid.link/r/20260206050836.5890-1-lirongqing@baidu.com
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

net: phy: dp83869: fix setting CLK_O_SEL field.

Table 7-121 in the datasheet says that register 0xc6 must be set to
0x10 before CLK_O_SEL can be modified. The datasheet gives no further
information about this field. With this fix, setting the CLK_O_SEL
field in the IO_MUX_CFG register through the dts property
"ti,clk-output-sel" works on a DP83869HMRGZR.

Signed-off-by: Heiko Schocher <hs@nabladev.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Fixes: 01db923e8377 ("net: phy: dp83869: Add TI dp83869 phy")
Link: https://patch.msgid.link/20260425031339.3318-1-hs@nabladev.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

platform/x86: dell_rbu: use strscpy in image_type_write

strcpy() has been deprecated [1] because it performs no bounds checking
on the destination buffer, which can lead to buffer overflows. While the
current code works correctly, replace strcpy() with the safer strscpy()
to follow secure coding best practices.
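
For readers outside the kernel, the behaviour that makes strscpy() the
safer choice can be approximated in userspace. bounded_copy() below is
a hypothetical helper written for this example, not the kernel
implementation: it always NUL-terminates the destination and reports
truncation with -E2BIG instead of writing past the buffer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Copy src into dst, writing at most size bytes including the NUL.
 * Returns the number of characters copied (excluding the NUL), or
 * -E2BIG if size is 0 or src did not fit.
 */
static long bounded_copy(char *dst, const char *src, size_t size)
{
	size_t i;

	assert(dst && src);
	if (size == 0)
		return -E2BIG;

	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';

	return src[i] == '\0' ? (long)i : -E2BIG;
}
```

Unlike strcpy(), an oversized source leaves a truncated but terminated
string in dst, and the caller can detect the truncation from the return
value.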

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strcpy
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Link: https://patch.msgid.link/20260410091633.2822-6-thorsten.blum@linux.dev
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>