]> git.ipfire.org Git - thirdparty/linux.git/log
thirdparty/linux.git
9 days agodrm/amd/pm: smu_v14_0_0: use SoftMin for gfxclk in set_soft_freq_limited_range
Priya Hosur [Thu, 7 May 2026 08:01:37 +0000 (13:31 +0530)] 
drm/amd/pm: smu_v14_0_0: use SoftMin for gfxclk in set_soft_freq_limited_range

In smu_v14_0_0_set_soft_freq_limited_range(), the gfxclk floor is
programmed via SetHardMinGfxClk together with SetSoftMaxGfxClk. Under
power_dpm_force_performance_level=high this pins HardMin to peak gfxclk.

In PMFW arbitration HardMin has higher priority than SoftMax, so the
firmware thermal/PPT throttler cannot clamp gfxclk via SoftMax once
HardMin is set to peak. Replace SetHardMinGfxClk with SetSoftMinGfxclk
so the driver still requests peak performance but the firmware
throttler retains the ability to clamp gfxclk under thermal/PPT
pressure. SoftMax handling is unchanged and no other clock domains
are affected.

Signed-off-by: Priya Hosur <Priya.Hosur@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 3ea273267fd29cbf6d83ee72329f59eb5042605b)
Cc: stable@vger.kernel.org
9 days agodrm/amdgpu: Fix incorrect VRAM GART mappings on non-4K page size systems
Donet Tom [Wed, 27 May 2026 13:19:31 +0000 (18:49 +0530)] 
drm/amdgpu: Fix incorrect VRAM GART mappings on non-4K page size systems

When mapping VRAM pages into the GART page table,
amdgpu_gart_map_vram_range() assumes that the system page size is the
same as the GPU page size.

On systems with non-4K page sizes, multiple GPU pages can exist within
a single CPU page. As a result, the mappings are created incorrectly
because fewer page table entries are programmed than required.

Fix this by programming the mappings correctly for non-4K page size
systems.

Fixes: 237d623ae659 ("drm/amdgpu/gart: Add helper to bind VRAM pages (v2)")
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit a8f0bc22388f74e0cf4ed8b7d1846c580eaf44cc)
Cc: stable@vger.kernel.org
9 days agodrm/amdgpu/userq: move wptr_obj cleanup in mqd_destroy
Sunil Khatri [Mon, 25 May 2026 04:26:23 +0000 (09:56 +0530)] 
drm/amdgpu/userq: move wptr_obj cleanup in mqd_destroy

In case when queue_create fails and mqd has already been
allocated and hence wptr_obj is not cleaned up.

So moving that cleanup part to mqd_destroy so it takes
care of all the cases of clean up and during tear down of
the queue.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 43355f62cd2ef5386c2693df537c232ea0f2ce6c)

9 days agodrm/amdgpu: improve the userq seq BO free bit lookup
Prike Liang [Tue, 26 May 2026 02:25:26 +0000 (10:25 +0800)] 
drm/amdgpu: improve the userq seq BO free bit lookup

Use find_next_zero_bit() to locate the next free seq slot bit
instead of the current walk, for more efficient bitmap scanning.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ff905a9b6228de9eedd0db71ecb1bdde91fb898d)

9 days agodrm/amdgpu/userq: remove the vital queue unmap logging
Sunil Khatri [Mon, 25 May 2026 07:48:00 +0000 (13:18 +0530)] 
drm/amdgpu/userq: remove the vital queue unmap logging

Mesa userqueues free does not wait for the free to complete and go ahead
in unmapping the vital bos while kernel is still in queue free and
corresponding cleanup.

So ideally we don't need the logging for that and hence remove the warn
message as this is expected behaviour and functionally, we are making
sure to wait for the required fences before unmap.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 758a868043dcb07eca923bc451c16da3e73dc47c)

9 days agodrm/amdkfd: Fix buffer overflow in SDMA queue checkpoint/restore on GFX11
Andrew Martin [Thu, 28 May 2026 16:54:39 +0000 (12:54 -0400)] 
drm/amdkfd: Fix buffer overflow in SDMA queue checkpoint/restore on GFX11

The v11 MQD manager incorrectly assigned the CP-compute variants of
checkpoint_mqd/restore_mqd for KFD_MQD_TYPE_SDMA queues. These functions
use sizeof(struct v11_compute_mqd) (2048 bytes) instead of sizeof(struct
v11_sdma_mqd) (512 bytes), causing a 1536-byte overflow.

During CRIU checkpoint of an SDMA queue on Navi3x:
- checkpoint_mqd() reads 2048 bytes from a 512-byte SDMA MQD buffer,
  leaking 1536 bytes of adjacent GTT memory to userspace

During CRIU restore:
- restore_mqd() writes 2048 bytes into a 512-byte SDMA MQD buffer,
  corrupting 1536 bytes of adjacent GTT memory (often the ring buffer
  or neighboring MQDs)

This is a copy-paste regression unique to v11. All other ASIC backends
(cik, vi, v9, v10, v12) correctly use the SDMA-specific variants.

Add checkpoint_mqd_sdma() and restore_mqd_sdma() functions that properly
handle the smaller v11_sdma_mqd structure, matching the pattern used in
other MQD managers.

Fixes: cc009e613de6 ("drm/amdkfd: Add KFD support for soc21 v3")
Assisted-by: Claude:Sonnet 4-5
Signed-off-by: Andrew Martin <andrew.martin@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 6fa41db7ffdec97d62433adf03b7b9b759af8c2c)
Cc: stable@vger.kernel.org
9 days agodrm/amdkfd: fix NULL dereference in get_queue_ids()
Muhammad Bilal [Sat, 23 May 2026 16:56:46 +0000 (16:56 +0000)] 
drm/amdkfd: fix NULL dereference in get_queue_ids()

When usr_queue_id_array is NULL and num_queues is non-zero,
get_queue_ids() returns NULL. The callers check only IS_ERR() on the
return value; since IS_ERR(NULL) == false the check passes, and
suspend_queues() calls q_array_invalidate() which immediately
dereferences NULL while iterating num_queues times.

Userspace can trigger this via kfd_ioctl_set_debug_trap() by supplying
num_queues > 0 with a zero queue_array_ptr, causing a kernel panic.

A NULL usr_queue_id_array with num_queues == 0 is a legitimate no-op
(q_array_invalidate never executes, and resume_queues already guards
all queue_ids dereferences behind a NULL check). Return ERR_PTR(-EINVAL)
only when num_queues is non-zero and the pointer is absent; both callers
already propagate IS_ERR() returns correctly to userspace.

Fixes: a70a93fa568b ("drm/amdkfd: add debug suspend and resume process queues operation")
Signed-off-by: Muhammad Bilal <meatuni001@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit f165a82cdf503884bb1797771c61b2fcc72113d4)
Cc: stable@vger.kernel.org
9 days agodrm/amdgpu: set noretry=1 as default for GFX 10.1.x (Navi10/12/14)
Vitaly Prosyak [Fri, 29 May 2026 17:50:38 +0000 (13:50 -0400)] 
drm/amdgpu: set noretry=1 as default for GFX 10.1.x (Navi10/12/14)

Problem:
While developing the amd_close_race IGT test (which intentionally triggers
execute permission faults by removing VM_PAGE_EXECUTABLE from GPU page table
entries), we discovered that on Navi10 (GFX 10.1.x) these faults produce
zero diagnostic output. The GPU simply hangs silently for ~10s until the
scheduler timeout fires. There is no way to distinguish an execute
permission fault from any other type of GPU hang.

Root cause:
GFX 10.1.x defaults to noretry=0, which sets
RETRY_PERMISSION_OR_INVALID_PAGE_FAULT=1 in the GFXHUB UTCL2 registers
(gfxhub_v2_0.c line 313). With this bit set, permission faults (valid PTE,
wrong R/W/X bits) are handled entirely within the UTCL1/UTCL2 hardware
loop: UTCL2 returns an XNACK to UTCL1, and UTCL1 re-requests the
translation indefinitely, expecting software to eventually fix the
permission bits (as happens in SVM/HMM recovery). No interrupt of any kind
reaches the IH ring.

This is different from invalid-page faults (V=0) which DO generate a retry
fault interrupt that the driver can escalate to a no-retry fault. Permission
faults with valid PTEs loop silently forever in hardware.

GFX 10.3+ already defaults to noretry=1, which makes permission faults
generate immediate L2 protection fault interrupts. GFX 10.1.x was
inadvertently left out of this default.

Fix:
Change the noretry=1 threshold from IP_VERSION(10, 3, 0) to
IP_VERSION(10, 1, 0) in amdgpu_gmc_noretry_set(). This is a one-line
change that aligns GFX 10.1.x behavior with GFX 10.3+ and all newer
generations.

With noretry=1, the existing non-retry fault handler
(gmc_v10_0_process_interrupt) already decodes and prints the full
GCVM_L2_PROTECTION_FAULT_STATUS register including PERMISSION_FAULTS,
faulting address, VMID, PASID, and process name. No additional logging
code is needed — the fix is purely routing permission faults to the
existing, fully-capable non-retry interrupt handler.

v2: Dropped GFX10-specific logging from gmc_v10_0.c and
kfd_int_process_v10.c (Felix Kuehling). v1 added logging in the retry
fault handler, but with noretry=1 permission faults take the non-retry
path — the v1 retry handler code was dead and would never execute.

Tested on Navi10 (GFX 10.1.10):
- Execute permission faults now produce immediate, clear output:
    [gfxhub] page fault (src_id:0 ring:64 vmid:4 pasid:592)
     Process amd_close_race pid 13380 thread amd_close_race pid 13384
      in page at address 0x40001000 from client 0x1b (UTCL2)
    GCVM_L2_PROTECTION_FAULT_STATUS:0x00700881
         PERMISSION_FAULTS: 0x8
- No regressions with properly-mapped GPU workloads

Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit eb21edd24c40d81066753f8ac6f23bce15745395)
Cc: stable@vger.kernel.org
9 days agodrm/amdgpu/gfxhub: Program CRASH_ON_*_FAULT bits to 0 as needed
Timur Kristóf [Mon, 25 May 2026 11:45:02 +0000 (13:45 +0200)] 
drm/amdgpu/gfxhub: Program CRASH_ON_*_FAULT bits to 0 as needed

When the fault stop mode isn't AMDGPU_VM_FAULT_STOP_ALWAYS,
these bits should be programmed to 0.

Program CRASH_ON_NO_RETRY_FAULT and CRASH_ON_RETRY_FAULT
always, to make sure to clear the bits when we don't want
to crash.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit d0cd99e73090700b7a942b98a3327ec966597d0a)

9 days agodrm/amdgpu: fix waiting for all submissions for userptrs
Christian König [Wed, 18 Feb 2026 12:05:46 +0000 (13:05 +0100)] 
drm/amdgpu: fix waiting for all submissions for userptrs

Wait for all submissions when userptrs need to be invalidated by the MMU
notifier, not just the one the userptr was involved into.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Tested-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 91250893cbaa25c86872deca95a540d08de1f91e)
Cc: stable@vger.kernel.org
9 days agodrm/amdgpu: drm/amdgpu: Set correct DMA mask for gfx12.1
Harish Kasiviswanathan [Tue, 12 May 2026 14:57:49 +0000 (10:57 -0400)] 
drm/amdgpu: drm/amdgpu: Set correct DMA mask for gfx12.1

Set correct DMA mask for gfx12

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit a2ef14ee2593b48242b8d90f229f71c1710529da)

9 days agodrm/amdgpu: Use asic specific pte_addr_mask
Harish Kasiviswanathan [Tue, 28 Apr 2026 21:45:06 +0000 (17:45 -0400)] 
drm/amdgpu: Use asic specific pte_addr_mask

For PTE creation use asic specific physical page base address mask

v2: Change variable name from pa_mask to pte_addr_mask

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 2ea989885941a6e5607ef86dbe309e90b7191f21)

9 days agodrm/amd/pm: zero unused SMU argument registers
Yang Wang [Mon, 11 May 2026 08:33:37 +0000 (16:33 +0800)] 
drm/amd/pm: zero unused SMU argument registers

SMU messages may use fewer arguments than the available argument registers,
the previous code only wrote used registers and left the rest unchanged,
so stale values from a prior message could persist.

Write all argument registers for each message and zero the unused tail
to keep command arguments deterministic and avoid unintended carry-over.

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit e03b635f61f77ebd5107ef82f48e3221cb695856)

9 days agodrm/amd/pm: mark metrics.energy_accumulator is invalid for smu 14.0.2
Yang Wang [Fri, 29 May 2026 03:47:31 +0000 (11:47 +0800)] 
drm/amd/pm: mark metrics.energy_accumulator is invalid for smu 14.0.2

EnergyAccumulator is unsupported on SMU 14.0.2, mark it invalid.

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 646b05043eeed04b51c14aad22a400a8250af4b7)
Cc: stable@vger.kernel.org
9 days agodrm/amd/pm: fix smu13 power limit default/cap calculation
Yang Wang [Tue, 19 May 2026 03:18:12 +0000 (11:18 +0800)] 
drm/amd/pm: fix smu13 power limit default/cap calculation

smu_v13_0_0_get_power_limit() and smu_v13_0_7_get_power_limit() mix
runtime power_limit with PP table limits when reporting default/min/max.

When current power limit query succeeds, default_power_limit was set to the
runtime value instead of the PP table default, and min/max could be derived
from inconsistent bases (MsgLimits/runtime), leading to incorrect cap info.

Use SocketPowerLimitAc/Dc as the PP default base (pp_limit), keep
current_power_limit as runtime value, and derive min/max from pp_limit with
OD percentages.

Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/5227
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 1eaf26db95901ca70737503a89b831dd763c8453)
Cc: stable@vger.kernel.org
9 days agodrm/amd/pm: apply SMU 13.0.10 workaround during MP1 unload
Yang Wang [Thu, 21 May 2026 14:36:37 +0000 (22:36 +0800)] 
drm/amd/pm: apply SMU 13.0.10 workaround during MP1 unload

On SMU v13.0.10, sending PrepareMp1ForUnload with the default
parameter may leave the device in an inaccessible state. This can
affect runtime power management and partial PnP flows.
e.g: kexec, driver unload, boco/d3cold.

Pass the required workaround parameter 0x55, when preparing MP1 for
unload on SMU v13.0.10, keep the existing behavior for other SMU
versions.

Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/5133
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 4e8ee1afeedb8d24dd22cdd5ae9f98a6d76ebe4b)
Cc: stable@vger.kernel.org
9 days agodrm/amdgpu: Align amdgpu_gtt_mgr entries to TLB size on all SI
Timur Kristóf [Mon, 25 May 2026 11:22:04 +0000 (13:22 +0200)] 
drm/amdgpu: Align amdgpu_gtt_mgr entries to TLB size on all SI

It seems that Pitcairn has the same issues as Tahiti
with regards to the TLB size. This commit fixes a
VCE1 FW validation timeout on suspend/resume on Pitcairn.

Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/5336
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 629279e2e798cd161cf74f40aaebfeb16d45eb01)

9 days agodrm/amdgpu: unmap userq for evicting user queue
Prike Liang [Thu, 14 May 2026 09:21:09 +0000 (17:21 +0800)] 
drm/amdgpu: unmap userq for evicting user queue

If the driver only preempts queues, there can still be inflight waves,
pending dispatch state, or resume/redispatch possibility tied to the
same queue. Then the VM/TTM side may proceed to move/unmap queue related
BOs during evicting userq objects while shader TCP clients still need to
access them.

So for eviction, unmap is safer because it makes the queue nonrunnable
before memory backing is invalidated. Meanwhile, for a idle queue it's
more sutiable for unmapping it rather preempt and unmapping also can
save more processing time than preempt.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit d87c9d86727a0bcc95c3009a213a1b27a11b691e)

9 days agodrm/amdgpu/sdma7.1: fix support for disable_kq
Alex Deucher [Wed, 27 May 2026 19:41:59 +0000 (15:41 -0400)] 
drm/amdgpu/sdma7.1: fix support for disable_kq

Set the flag in the ring structure.

Fixes: 80d4d3a45b86 ("drm/amdgpu/sdma7.1: add support for disable_kq")
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit e0a3aa8a6750e8cf067fe2146dc618ffd296d5ef)

9 days agodrm/amdkfd: fix UAF race in destroy_queue_cpsch
Alysa Liu [Wed, 27 May 2026 15:31:35 +0000 (11:31 -0400)] 
drm/amdkfd: fix UAF race in destroy_queue_cpsch

wait_on_destroy_queue() drops locks to wait for queue resume, allowing
a concurrent destroy to free the queue. Use is_being_destroyed flag to
serialize destruction.

Reviewed-by: Amir Shetaia <Amir.Shetaia@amd.com>
Signed-off-by: Alysa Liu <Alysa.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ac081deaf16a639ea7dff2f285fe421a33c1ade0)

9 days agodrm/amd/display: Bound VBIOS record-chain walk loops
Harry Wentland [Tue, 12 May 2026 19:24:22 +0000 (15:24 -0400)] 
drm/amd/display: Bound VBIOS record-chain walk loops

[Why & How]
All record-chain walk loops in bios_parser.c and bios_parser2.c use
for(;;) and only terminate on a 0xFF record_type sentinel or zero
record_size. A malformed VBIOS image missing the terminator record
causes unbounded iteration at probe time, potentially hundreds of
thousands of iterations with record_size=1. In the final iterations
near the BIOS image boundary, struct casts beyond the 2-byte header
validated by GET_IMAGE can also read out of bounds.

Cap all 14 record-chain walk loops to BIOS_MAX_NUM_RECORD (256)
iterations. The atombios.h defines up to 22 distinct record types
and atomfirmware.h has 13. Assuming an average of less than 10
records per type (which is reasonable since most are connector-
based) 256 is a generous upper bound.

Fixes: 4562236b3bc0 ("drm/amd/dc: Add dc display driver (v2)")
Assisted-by: Copilot:claude-opus-4.6 Mythos
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 95700a3d660287ed657d6892f7be9ffc0e294a93)
Cc: stable@vger.kernel.org
9 days agodrm/amd/display: Clamp HDMI HDCP2 rx_id_list read to buffer size
Harry Wentland [Thu, 7 May 2026 19:38:37 +0000 (15:38 -0400)] 
drm/amd/display: Clamp HDMI HDCP2 rx_id_list read to buffer size

[Why & How]
During HDCP 2.x repeater authentication over HDMI, the driver reads the
sink's RxStatus register and extracts a 10-bit message size field (max
value 1023). This value is used as the read length for the ReceiverID
list without being clamped to the size of the destination buffer
rx_id_list[177]. A malicious HDMI repeater could advertise a message
size larger than the buffer, causing an out-of-bounds write during the
I2C read.

Clamp the read length in mod_hdcp_read_rx_id_list() to the size of the
rx_id_list buffer, matching the approach already used in the DP branch.

Fixes: eff682f83c9c ("drm/amd/display: Add DDC handles for HDCP2.2")
Assisted-by: Copilot:claude-opus-4.6
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 229212219e4247d9486f8ba41ef087358490be09)
Cc: stable@vger.kernel.org
9 days agodrm/amd/display: Reject gpio_bitshift >= 32 in bios_parser_get_gpio_pin_info()
Harry Wentland [Tue, 5 May 2026 15:50:07 +0000 (11:50 -0400)] 
drm/amd/display: Reject gpio_bitshift >= 32 in bios_parser_get_gpio_pin_info()

[Why & How]
gpio_bitshift is a uint8_t read directly from the VBIOS GPIO pin table.
If the value is >= 32, the expression "1 << gpio_bitshift" triggers
undefined behaviour in C (shift count exceeds type width). On x86 the
shift is silently masked to 5 bits, producing an incorrect GPIO mask
that may cause wrong MMIO register bits to be toggled.

Validate gpio_bitshift before use and return BP_RESULT_BADBIOSTABLE for
out-of-range values.

Fixes: ae79c310b1a6 ("drm/amd/display: Add DCE12 bios parser support")
Assisted-by: Copilot:claude-opus-4.6
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit eadf438ab8d370b9d19acee9359918c85afeb80d)
Cc: stable@vger.kernel.org
9 days agodrm/amd/display: Use krealloc_array() in dal_vector_reserve()
Harry Wentland [Tue, 5 May 2026 15:52:15 +0000 (11:52 -0400)] 
drm/amd/display: Use krealloc_array() in dal_vector_reserve()

[Why & How]
dal_vector_reserve() computes the allocation size as
"capacity * vector->struct_size" using uint32_t arithmetic, which can
silently wrap to a small value on overflow. This would cause krealloc to
return a smaller buffer than expected, leading to heap overflows on
subsequent vector appends.

Replace krealloc() with krealloc_array() which performs an internal
overflow check and returns NULL on wrap, preventing the issue.

Fixes: 2004f45ef83f ("drm/amd/display: Use kernel alloc/free")
Assisted-by: Copilot:claude-opus-4.6
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 37668568641ccc4cc1dbca4923d0a16609dd5707)
Cc: stable@vger.kernel.org
9 days agodrm/amd/display: Fix NULL deref and buffer over-read in SDP debugfs
Harry Wentland [Mon, 11 May 2026 20:46:25 +0000 (16:46 -0400)] 
drm/amd/display: Fix NULL deref and buffer over-read in SDP debugfs

[Why & How]
dp_sdp_message_debugfs_write() dereferences connector->base.state->crtc
without checking for NULL. A connector can be connected but not bound to
any CRTC (e.g. after hot-plug before the next atomic commit), causing a
kernel crash when writing to the sdp_message debugfs node.

The function also ignores the user-provided size argument and always
passes 36 bytes to copy_from_user(), reading past the user buffer when
size < 36.

Fix both issues by:
- Returning -ENODEV when connector->base.state or state->crtc is NULL
- Clamping write_size to min(size, sizeof(data))

Fixes: c7ba3653e977 ("drm/amd/display: Generic SDP message access in amdgpu")
Assisted-by: Copilot:claude-opus-4.6
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 6ab4c36a522842ff70474a1c0af2e40e50fc8300)
Cc: stable@vger.kernel.org
9 days agodrm/amd/display: Clamp VBIOS HDMI retimer register count to array size
Harry Wentland [Mon, 4 May 2026 19:51:13 +0000 (15:51 -0400)] 
drm/amd/display: Clamp VBIOS HDMI retimer register count to array size

[Why & How]
The VBIOS integrated info tables (v1_11 and v2_1) contain HdmiRegNum and
Hdmi6GRegNum fields that are used as loop bounds when copying retimer I2C
register settings into fixed-size arrays (dp*_ext_hdmi_reg_settings[9]
and dp*_ext_hdmi_6g_reg_settings[3]). These u8 fields are not validated
before use, so a malformed VBIOS can specify values up to 255, causing an
out-of-bounds heap write during driver probe.

Clamp each register count to the destination array size using min_t()
before the copy loops, in both get_integrated_info_v11() and
get_integrated_info_v2_1().

Assisted-by: GitHub Copilot:claude-opus-4.6
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 5a7f0ef90195940c54b0f5bb85b87da55f038c69)
Cc: stable@vger.kernel.org
9 days agodrm/amd/display: Fix out-of-bounds read in dp_get_eq_aux_rd_interval()
Harry Wentland [Tue, 5 May 2026 15:44:15 +0000 (11:44 -0400)] 
drm/amd/display: Fix out-of-bounds read in dp_get_eq_aux_rd_interval()

[Why & How]
The aux_rd_interval array in struct dc_lttpr_caps is declared with
MAX_REPEATER_CNT - 1 (7) elements, indexed 0..6. However, the offset
parameter passed to dp_get_eq_aux_rd_interval() can be as large as
MAX_REPEATER_CNT (8) when a sink reports 8 LTTPR repeaters via DPCD.
This leads to an out-of-bounds read of aux_rd_interval[7] when offset
is 8.

Fix this by growing aux_rd_interval to MAX_REPEATER_CNT elements to
accommodate the full range of valid repeater counts defined by the DP
spec.

Assisted-by: GitHub Copilot:Claude claude-4-opus
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit a55a458a8df37a65ffda5cf721d554a8f74f6b04)
Cc: stable@vger.kernel.org
9 days agodrm/amd/display: add missing CSC entries for BT.2020 for DCE IPs
Leorize [Thu, 28 May 2026 06:58:54 +0000 (23:58 -0700)] 
drm/amd/display: add missing CSC entries for BT.2020 for DCE IPs

DCE-based hardware does not have the CSC matrices for BT.2020, which
causes the driver to fallback to the GPU built-in matrices. This does
not appear to cause any issues for RGB sinks, but causes major color
artifacts for YCbCr ones (e.g. black becomes green).

This commit adds the missing CSC matrices (taken from DC common) to DCE
CSC tables, resolving the issue.

Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/3358
Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/5333
Assisted-by: oh-my-pi:GPT-5.5
Signed-off-by: Leorize <leorize+oss@disroot.org>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 51e6668ab4baf55b082c376318d51ef965757196)
Cc: stable@vger.kernel.org
9 days agoRDMA/core: Validate cpu_id against nr_cpu_ids in DMAH alloc
Yishai Hadas [Mon, 25 May 2026 14:21:36 +0000 (17:21 +0300)] 
RDMA/core: Validate cpu_id against nr_cpu_ids in DMAH alloc

The cpu_id attribute supplied by user space through
UVERBS_ATTR_ALLOC_DMAH_CPU_ID is passed directly to cpumask_test_cpu()
without first verifying that the value is within the valid CPU range.

Passing such untrusted data to cpumask_test_cpu() may lead to an
out-of-bounds read of the underlying cpumask bitmap: the helper expands
to a test_bit() that indexes the bitmap by cpu_id / BITS_PER_LONG with
no bound check.

In addition, on kernels built with CONFIG_DEBUG_PER_CPU_MAPS it trips
the WARN_ON_ONCE() in cpumask_check(); combined with panic_on_warn this
turns a bad user input into a machine reboot.

Reject any cpu_id that is not smaller than nr_cpu_ids with -EINVAL
before it is used.

Reported by Smatch.

Fixes: d83edab562a4 ("RDMA/core: Introduce a DMAH object and its alloc/free APIs")
Link: https://patch.msgid.link/r/20260525142136.28165-1-yishaih@nvidia.com
Cc: stable@vger.kernel.org
Reported-by: Dan Carpenter <error27@gmail.com>
Closes: https://lore.kernel.org/r/ag68qoAW3P04J7pT@stanley.mountain/
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
9 days agodrm/dumb-buffer: Drop buffer-size limits for now
Thomas Zimmermann [Tue, 2 Jun 2026 11:24:01 +0000 (13:24 +0200)] 
drm/dumb-buffer: Drop buffer-size limits for now

The size limits break some of the CI tests. So drop them for now. Keep
the other overflow tests from commit 5ab62dd3687b ("drm: prevent integer
overflows in dumb buffer creation helpers") in place.

There is still a pre-existing overflow check for 32-bit type limits in
drm_mode_create_dumb() that will catch the really absurd size requests.
Drivers that still do not use drm_mode_size_dumb() should be updated. The
helper calculates dumb-buffer geometry with overflow checks.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Fixes: 5ab62dd3687b ("drm: prevent integer overflows in dumb buffer creation helpers")
Reported-by: Jani Nikula <jani.nikula@linux.intel.com>
Closes: https://lore.kernel.org/dri-devel/ddf0233e50044059c85279f928661563ef6a55bf@intel.com/
Cc: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patch.msgid.link/20260602112842.252279-1-tzimmermann@suse.de
9 days agoMerge tag 'mmc-v7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Linus Torvalds [Wed, 3 Jun 2026 16:09:24 +0000 (09:09 -0700)] 
Merge tag 'mmc-v7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc

Pull MMC fixes from Ulf Hansson:
 "MMC core:
   - Fix host controller programming for eMMC fixed driver type

  MMC host:
   - dw_mmc-rockchip: Add missing private data for very old controllers
   - litex_mmc: Fix clock management
   - renesas_sdhi: Add OF entry for RZ/G2H SoC
   - sdhci: Manage signal voltage switch during system resume for some hosts
   - sdhci-of-dwcmshc: Fix reset, clk and SDIO support for Eswin EIC7700"

* tag 'mmc-v7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: sdhci: add signal voltage switch in sdhci_resume_host
  mmc: dw_mmc-rockchip: Add missing private data for very old controllers
  mmc: litex_mmc: Set mandatory idle clocks before CMD0
  mmc: litex_mmc: Use DIV_ROUND_UP for more accurate clock calculation
  mmc: renesas_sdhi: Add OF entry for RZ/G2H SoC
  mmc: sdhci-of-dwcmshc: Fix reset, clk, and SDIO support for Eswin EIC7700
  mmc: core: Fix host controller programming for fixed driver type

9 days agoMerge tag 'cgroup-for-7.1-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 3 Jun 2026 15:59:24 +0000 (08:59 -0700)] 
Merge tag 'cgroup-for-7.1-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup fixes from Tejun Heo:
 "One cpuset fix and a maintenance update, both low-risk:

   - Fix cpuset partition CPU accounting under sibling CPU exclusion
     that could produce wrong CPU assignments and trigger
     scheduling-domain warnings. Includes selftests.

   - Update an email address in MAINTAINERS"

* tag 'cgroup-for-7.1-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup/cpuset: Change Ridong's email
  cgroup/cpuset: Add test cases for sibling CPU exclusion on partition update
  cgroup/cpuset: Use effective_xcpus in partcmd_update add/del mask calculation

10 days agoMerge tag 'sched_ext-for-7.1-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 3 Jun 2026 15:52:26 +0000 (08:52 -0700)] 
Merge tag 'sched_ext-for-7.1-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext

Pull sched_ext fixes from Tejun Heo:
 "Two low-risk fixes:

   - Drop a spurious warning that can fire during cgroup migration while
     a sched_ext scheduler is loaded

   - Fix a drgn-based debug script that broke after scheduler state
     moved into a per-scheduler struct"

* tag 'sched_ext-for-7.1-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext:
  sched_ext: Don't warn on NULL cgrp_moving_from in scx_cgroup_move_task()
  tools/sched_ext: Fix scx_show_state per-scheduler state reads

10 days agoBluetooth: MGMT: Fix backward compatibility with userspace
Luiz Augusto von Dentz [Tue, 2 Jun 2026 20:48:34 +0000 (16:48 -0400)] 
Bluetooth: MGMT: Fix backward compatibility with userspace

bluetoothd has a bug with makes it send extra bytes as part of
MGMT_OP_ADD_EXT_ADV_DATA which are now being checked to be the
exact the expected length, relax this so only when the expected
length is greater than the data length to cause an error since
that would result in accessing invalid memory, otherwise just
ignore the extra bytes.

Link: https://lore.kernel.org/linux-bluetooth/20260602204749.210857-1-luiz.dentz@gmail.com/T/#u
Fixes: d3f7d17960ed ("Bluetooth: MGMT: validate Add Extended Advertising Data length")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
10 days agoBluetooth: SCO: Fix data-race on sco_pi fields in sco_connect
SeungJu Cheon [Mon, 1 Jun 2026 11:19:08 +0000 (20:19 +0900)] 
Bluetooth: SCO: Fix data-race on sco_pi fields in sco_connect

sco_sock_connect() copies the destination address into sco_pi(sk)->dst
under lock_sock(), then releases the lock and calls sco_connect(),
which reads dst, src, setting, and codec without holding lock_sock() in
hci_get_route() and hci_connect_sco().

These fields may be modified concurrently by connect(), bind(), or
setsockopt() on the same socket, resulting in data-races reported by
KCSAN.

Fix this by snapshotting dst, src, setting, and codec under lock_sock()
at the start of sco_connect() before passing them to hci_get_route()
and hci_connect_sco().

BUG: KCSAN: data-race in memcmp+0x45/0xb0

race at unknown origin, with read to 0xffff88800e6b0dd0 of 1 bytes
by task 315 on cpu 0:
 memcmp+0x45/0xb0
 hci_connect_acl+0x1b7/0x6b0
 hci_connect_sco+0x4d/0xb30
 sco_sock_connect+0x27b/0xd60
 __sys_connect_file+0xbd/0xe0
 __sys_connect+0xe0/0x110
 __x64_sys_connect+0x40/0x50
 x64_sys_call+0xcad/0x1c60
 do_syscall_64+0x133/0x590
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Fixes: 9a8ec9e8ebb5 ("Bluetooth: SCO: Fix possible circular locking dependency on sco_connect_cfm")
Signed-off-by: SeungJu Cheon <suunj1331@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
10 days agoBluetooth: ISO: Fix data-race on iso_pi fields in hci_get_route calls
SeungJu Cheon [Mon, 1 Jun 2026 11:19:07 +0000 (20:19 +0900)] 
Bluetooth: ISO: Fix data-race on iso_pi fields in hci_get_route calls

iso_connect_bis(), iso_connect_cis(), iso_listen_bis(), and
iso_conn_big_sync() call hci_get_route() using iso_pi(sk)->dst,
iso_pi(sk)->src, and iso_pi(sk)->src_type without holding lock_sock().

These fields may be modified concurrently by connect() or setsockopt()
on the same socket, resulting in data-races reported by KCSAN.

Fix this by snapshotting the required fields under lock_sock() before
calling hci_get_route().

BUG: KCSAN: data-race in memcmp+0x45/0xb0

race at unknown origin, with read to 0xffff8880122135cf of 1 bytes
by task 333 on cpu 1:
 memcmp+0x45/0xb0
 hci_get_route+0x27e/0x490
 iso_connect_cis+0x4c/0xa10
 iso_sock_connect+0x60e/0xb30
 __sys_connect_file+0xbd/0xe0
 __sys_connect+0xe0/0x110
 __x64_sys_connect+0x40/0x50
 x64_sys_call+0xcad/0x1c60
 do_syscall_64+0x133/0x590
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Fixes: 241f51931c35 ("Bluetooth: ISO: Avoid circular locking dependency")
Signed-off-by: SeungJu Cheon <suunj1331@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
10 days agoBluetooth: ISO: Fix a use-after-free of the hci_conn pointer
Luiz Augusto von Dentz [Mon, 1 Jun 2026 18:52:09 +0000 (14:52 -0400)] 
Bluetooth: ISO: Fix a use-after-free of the hci_conn pointer

In iso_sock_rebind_bc(), the bis pointer is cached, then the socket lock is
dropped:
bis = iso_pi(sk)->conn->hcon;
/* Release the socket before lookups since that requires hci_dev_lock
 * which shall not be acquired while holding sock_lock for proper
 * ordering.
 */
release_sock(sk);
hci_dev_lock(bis->hdev);

During the unlocked window, could a concurrent close() destroy the connection
and free the bis structure, causing hci_dev_lock(bis->hdev) to access memory
after it is freed, fix this by using the hdev reference which was safely
acquired via iso_conn_get_hdev().

Fixes: d3413703d5f8 ("Bluetooth: ISO: Add support to bind to trigger PAST")
Reported-by: Sashiko <sashiko-bot@kernel.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
10 days agoBluetooth: ISO: Fix not releasing hdev reference on iso_conn_big_sync
Luiz Augusto von Dentz [Mon, 1 Jun 2026 18:45:42 +0000 (14:45 -0400)] 
Bluetooth: ISO: Fix not releasing hdev reference on iso_conn_big_sync

hci_get_route() returns a reference-counted hci_dev pointer via
hci_dev_hold(). The function exits normally or with an error without ever
releasing it.

Fixes: 07a9342b94a9 ("Bluetooth: ISO: Send BIG Create Sync via hci_sync")
Reported-by: Sashiko <sashiko-bot@kernel.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
10 days agoBluetooth: fix memory leak in error path of hci_alloc_dev()
Bharath Reddy [Mon, 1 Jun 2026 03:24:26 +0000 (08:54 +0530)] 
Bluetooth: fix memory leak in error path of hci_alloc_dev()

Early failures in Bluetooth HCI UART configuration leak SRCU percpu
memory.

When device initialization fails before hci_register_dev() completes,
the HCI_UNREGISTER flag is never set. As a result, when the device
reference count reaches zero, bt_host_release() evaluates this flag as
false and falls back to a direct kfree(hdev).

Because hci_release_dev() is bypassed, the SRCU struct initialized
early in hci_alloc_dev() is never cleaned up, resulting in a leak of
percpu memory.

Fix the leak by explicitly calling cleanup_srcu_struct() in the
fallback (unregistered) branch of bt_host_release() before freeing
the device.

Reported-by: syzbot+535ecc844591e50588a5@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=535ecc844591e50588a5
Tested-by: syzbot+535ecc844591e50588a5@syzkaller.appspotmail.com
Fixes: 1d6123102e9f ("Bluetooth: hci_core: Fix use-after-free in vhci_flush()")
Signed-off-by: Bharath Reddy <kbreddy.rpbc@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
10 days agoBluetooth: bnep: reject short frames before parsing
Zhang Cen [Fri, 29 May 2026 03:22:09 +0000 (11:22 +0800)] 
Bluetooth: bnep: reject short frames before parsing

A BNEP peer can send a short BNEP SDU. bnep_rx_frame() reads the
packet type byte immediately and, for control packets, reads the control
opcode and setup UUID-size byte before proving that those bytes are
present. bnep_rx_control() also dereferences the control opcode without
rejecting an empty control payload.

Use skb_pull_data() for the fixed fields in bnep_rx_frame() so a NULL
return gates each dereference. Split the control handler so the frame
path can pass an opcode that has already been pulled, and keep the
byte-buffer wrapper for extension control payloads.

For BNEP_SETUP_CONN_REQ, name the UUID-size byte before pulling the
setup payload. struct bnep_setup_conn_req carries destination and source
service UUIDs after that byte, each uuid_size bytes, so the parser now
documents that tuple explicitly instead of leaving the pull length as an
opaque multiplication.

Validation reproduced this kernel report:
KASAN slab-out-of-bounds in bnep_rx_frame.isra.0+0x130c/0x1790
The buggy address belongs to the object at ffff88800c0f7908 which belongs
to the cache kmalloc-8 of size 8
The buggy address is located 0 bytes to the right of allocated 1-byte
region [ffff88800c0f7908ffff88800c0f7909)
Read of size 1
Call trace:
  dump_stack_lvl+0xb3/0x140 (?:?)
  print_address_description+0x57/0x3a0 (?:?)
  bnep_rx_frame+0x130c/0x1790 (net/bluetooth/bnep/core.c:306)
  print_report+0xb9/0x2b0 (?:?)
  __virt_addr_valid+0x1ba/0x3a0 (?:?)
  srso_alias_return_thunk+0x5/0xfbef5 (?:?)
  kasan_addr_to_slab+0x21/0x60 (?:?)
  kasan_report+0xe0/0x110 (?:?)
  process_one_work+0xfce/0x17e0 (kernel/workqueue.c:3200)
  worker_thread+0x65c/0xe40 (?:?)
  __kthread_parkme+0x184/0x230 (?:?)
  kthread+0x35e/0x470 (?:?)
  _raw_spin_unlock_irq+0x28/0x50 (?:?)
  ret_from_fork+0x586/0x870 (?:?)
  __switch_to+0x74f/0xdc0 (?:?)
  ret_from_fork_asm+0x1a/0x30 (?:?)

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Assisted-by: Codex:gpt-5.5
Signed-off-by: Zhang Cen <rollkingzzc@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
10 days agoBluetooth: hci_sync: reject oversized Broadcast Announcement prepend
Yuqi Xu [Fri, 29 May 2026 08:54:23 +0000 (16:54 +0800)] 
Bluetooth: hci_sync: reject oversized Broadcast Announcement prepend

Existing advertising instances can already hold the maximum extended
advertising payload. When hci_adv_bcast_annoucement() prepends the
Broadcast Announcement service data to that payload, the combined data
may no longer fit in the temporary buffer used to rebuild the
advertising data.

Reject that case before copying the existing payload and report the
failure through the device log. This keeps the existing advertising
data intact and avoids overrunning the temporary buffer.

Fixes: 5725bc608252 ("Bluetooth: hci_sync: Fix broadcast/PA when using an existing instance")
Cc: stable@kernel.org
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Zhengchuan Liang <zcliangcn@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Assisted-by: Codex:GPT-5.4
Signed-off-by: Yuqi Xu <xuyq21@lenovo.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
10 days agoBluetooth: L2CAP: reject BR/EDR signaling packets over MTUsig
Michael Bommarito [Thu, 21 May 2026 14:45:17 +0000 (10:45 -0400)] 
Bluetooth: L2CAP: reject BR/EDR signaling packets over MTUsig

net/bluetooth/l2cap_core.c:l2cap_sig_channel() accepts BR/EDR
signaling packets up to the channel MTU and dispatches each command
without enforcing the signaling MTU (MTUsig). A Bluetooth BR/EDR peer
within radio range can send a fixed-channel CID 0x0001 packet that is
larger than MTUsig and contains many L2CAP_ECHO_REQ commands before
pairing. In a real-radio stock-kernel run, one 681-byte signaling
packet containing 168 zero-length ECHO_REQ commands made the target
transmit 168 ECHO_RSP frames over about 220 ms.

Impact: a Bluetooth BR/EDR peer within radio range, before pairing, can
force 168 ECHO_RSP frames from one 681-byte fixed-channel signaling
packet containing packed ECHO_REQ commands.

Define Linux's BR/EDR signaling MTU as the spec minimum of 48 bytes and
reject any larger signaling packet with one L2CAP_COMMAND_REJECT_RSP
carrying L2CAP_REJ_MTU_EXCEEDED before any command is dispatched.

The Bluetooth Core spec wording for MTUExceeded says the reject
identifier shall match the first request command in the packet, and
that packets containing only responses shall be silently discarded.
Linux intentionally deviates from that prescription: silently
discarding desynchronizes the peer because the remote stack never
learns its responses were dropped, and locating the first request
command requires walking command headers past MTUsig, i.e. processing
bytes from a packet we have already decided is too large to process.
We therefore always emit one reject and use the identifier from the
first command header, a single fixed-offset byte read.

The unrestricted BR/EDR signaling parser and ECHO_REQ response path both
trace to the initial git import; no later introducing commit is
available for a Fixes tag.

Cc: stable@vger.kernel.org
Suggested-by: Luiz Augusto von Dentz <luiz.dentz@gmail.com>
Link: https://lore.kernel.org/r/20260518002800.1361430-1-michael.bommarito@gmail.com
Link: https://lore.kernel.org/r/20260520135034.1060859-1-michael.bommarito@gmail.com
Link: https://lore.kernel.org/r/20260521000555.3712030-1-michael.bommarito@gmail.com
Assisted-by: Claude:claude-opus-4-7
Assisted-by: Codex:gpt-5-5-xhigh
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
10 days agoBluetooth: RFCOMM: validate skb length in MCC handlers
SeungJu Cheon [Mon, 25 May 2026 11:04:43 +0000 (20:04 +0900)] 
Bluetooth: RFCOMM: validate skb length in MCC handlers

The RFCOMM MCC handlers cast skb->data to protocol-specific structs
without validating skb->len first. A malicious remote device can send
truncated MCC frames and trigger out-of-bounds reads in these handlers.

Fix this by using skb_pull_data() to validate and access the required
data before dereferencing it.

rfcomm_recv_rpn() requires special handling since ETSI TS 07.10 allows
1-byte RPN requests. Handle this by validating only the DLCI byte first,
and validating the full struct only when len > 1.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Suggested-by: Muhammad Bilal <meatuni001@gmail.com>
Signed-off-by: SeungJu Cheon <suunj1331@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
10 days agoBluetooth: MGMT: validate advertising TLV before type checks
Zhang Cen [Thu, 28 May 2026 09:45:06 +0000 (17:45 +0800)] 
Bluetooth: MGMT: validate advertising TLV before type checks

tlv_data_is_valid() reads each advertising data field length from
data[i], then inspects data[i + 1] for managed EIR types before
checking that the current field still fits inside the supplied buffer.

A malformed field whose length byte is the last byte of the buffer can
therefore make the parser read one byte past the advertising data.

KASAN reported the following when a malformed MGMT_OP_ADD_ADVERTISING
request reached that path:

  BUG: KASAN: vmalloc-out-of-bounds in tlv_data_is_valid()
  Read of size 1
  Call trace:
    tlv_data_is_valid()
    add_advertising()
    hci_mgmt_cmd()
    hci_sock_sendmsg()

Move the existing element-length check before any type-octet inspection
so each non-empty element is proven to contain its type byte before the
parser looks at data[i + 1].

Fixes: 2bb36870e8cb ("Bluetooth: Unify advertising instance flags check")
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Zhang Cen <rollkingzzc@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
10 days agoBluetooth: RFCOMM: hold listener socket in rfcomm_connect_ind()
Zhang Cen [Thu, 28 May 2026 07:56:41 +0000 (15:56 +0800)] 
Bluetooth: RFCOMM: hold listener socket in rfcomm_connect_ind()

rfcomm_get_sock_by_channel() scans rfcomm_sk_list under the list lock,
but returns the selected listener after dropping that lock without
taking a reference. rfcomm_connect_ind() then locks the listener,
queues a child socket on it, and may notify it after unlocking it.

The buggy scenario involves two paths, with each column showing the
order within that path:

rfcomm_connect_ind():            listener close:
  1. Find parent in              1. close() enters
     rfcomm_get_sock_by_channel()   rfcomm_sock_release().
  2. Drop rfcomm_sk_list.lock    2. rfcomm_sock_shutdown()
     without pinning parent.        closes the listener.
  3. Call lock_sock(parent) and  3. rfcomm_sock_kill()
     bt_accept_enqueue(parent,      unlinks and puts parent.
     sk, true).
  4. Read parent flags and may   4. parent can be freed.
     call sk_state_change().

If close wins the race, parent can be freed before
rfcomm_connect_ind() reaches lock_sock(), bt_accept_enqueue(), or the
deferred-setup callback.

Take a reference on the listener before leaving rfcomm_sk_list.lock.
After lock_sock() succeeds, recheck that it is still in BT_LISTEN
before queueing a child, cache the deferred-setup bit while the parent
is locked, and drop the reference after the last parent use.

KASAN reported a slab-use-after-free in lock_sock_nested() from
rfcomm_connect_ind(), with the freeing stack going through
rfcomm_sock_kill() and rfcomm_sock_release().

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Zhang Cen <rollkingzzc@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
10 days agoMerge tag 'kvm-s390-master-7.1-3' of https://git.kernel.org/pub/scm/linux/kernel...
Paolo Bonzini [Wed, 3 Jun 2026 14:46:31 +0000 (16:46 +0200)] 
Merge tag 'kvm-s390-master-7.1-3' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

KVM: s390: More gmap and vsie fixes

10 days agoKVM: SEV: Unmap and unpin the GHCB as needed on vCPU free
Sean Christopherson [Fri, 29 May 2026 18:35:42 +0000 (20:35 +0200)] 
KVM: SEV: Unmap and unpin the GHCB as needed on vCPU free

Unmap and unpin the GHCB as needed when freeing a vCPU.  If the VM is
destroyed after mapping+pinning the GHCB on #VMGEXIT, without re-running
the vCPU, KVM will effectively leak the GHCB and any mappings created for
the GHCB.

Fixes: 291bd20d5d88 ("KVM: SVM: Add initial support for a VMGEXIT VMEXIT")
Cc: stable@vger.kernel.org
Tested-by: Michael Roth <michael.roth@amd.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-18-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20260529183549.1104619-18-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
10 days agoKVM: SEV: Decouple the need to sync the GHCB SA from the need to free the SA
Sean Christopherson [Fri, 29 May 2026 18:35:41 +0000 (20:35 +0200)] 
KVM: SEV: Decouple the need to sync the GHCB SA from the need to free the SA

Decouple synchronizing the GHCB SA from freeing/unpinning the SA, so that
the free/unpin path can be reused when freeing a vCPU.

Opportunistically add a WARN to harden KVM against stomping over (and thus
leaking) an already-allocated scratch area.

Cc: stable@vger.kernel.org
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-17-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20260529183549.1104619-17-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
10 days agoKVM: SEV: Move sev_free_vcpu() down below sev_es_unmap_ghcb()
Sean Christopherson [Fri, 29 May 2026 18:35:40 +0000 (20:35 +0200)] 
KVM: SEV: Move sev_free_vcpu() down below sev_es_unmap_ghcb()

Relocate sev_free_vcpu() down in sev.c so that it's definition comes after
sev_es_unmap_ghcb().  This will allow sharing unmap functionality between
the two functions without needing a forward declaration (or weird placement
of the common code).

No functional change intended.

Cc: stable@vger.kernel.org
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-16-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20260529183549.1104619-16-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
10 days agoKVM: Don't WARN if memory is dirtied without a vCPU when the VM is dying
Sean Christopherson [Fri, 29 May 2026 18:35:39 +0000 (20:35 +0200)] 
KVM: Don't WARN if memory is dirtied without a vCPU when the VM is dying

When marking a page dirty, complain about not having a running/loaded vCPU
if and only if the VM is still alive, i.e. its refcount is non-zero.  This
will allow fixing a memory leak for x86 SEV-ES guests without hitting what
is effectively a false positive on the WARN.

For some SEV-ES VM-Exits, KVM keeps a writable mapping of a guest page
across an exit to userspace, and typically unmaps the page on the next
KVM_RUN.  But if userspace never calls KVM_RUN after such an exit, then KVM
needs to unmap the page when the vCPU is destroyed, which in turn triggers
the WARN about not having a running vCPU.

Alternatively, SEV-ES could temporarily load the vCPU to suppress the WARN,
as is done in nested_vmx_free_vcpu() (but for completely unrelated reasons;
suppressing WARN from nested_put_vmcs12_pages() is pure happenstance).  But
loading a vCPU during destruction is gross (ideally nVMX code would be
cleaned up), risks complicating the SEV-ES code (KVM would need to ensure
the temporarily load()+put() only runs when the vCPU isn't already loaded),
and is ultimately pointless.

The motivation for the WARN is to guard against KVM dirtying guest memory
without pushing the corresponding GFN to the active vCPU's dirty ring, e.g.
to ensure userspace doesn't miss a dirty page.  But for the VM's refcount
to reach zero, there can't be _any_ userspace mappings to the dirty ring,
as mapping the dirty ring requires doing mmap() on the vCPU FD.  I.e. if
userspace had a valid mapping for the dirty ring, then the vCPU file and
thus the owning VM would still be alive.  And so since userspace can't
possibly reach the dirty ring, whether or not KVM technically "misses" a
push to the dirty ring is irrelevant.

Reported-by: Michael Roth <michael.roth@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-15-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20260529183549.1104619-15-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
10 days agoKVM: SEV: Read start/end indices of PSC requests exactly once per #VMGEXIT
Sean Christopherson [Fri, 29 May 2026 18:35:38 +0000 (20:35 +0200)] 
KVM: SEV: Read start/end indices of PSC requests exactly once per #VMGEXIT

Rework Page State Change (PSC) handling to read the guest-provided start
and end indices exactly once, at the beginning of the request.  Re-reading
the indices is "fine", _if_ the guest is well-behaved.  KVM _should_ be
safe against concurrent guest modification of the indices, but there is
zero reason to introduce unnecessary risk.

Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-14-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20260529183549.1104619-14-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
10 days agoKVM: SEV: Add an anonymous "psc" struct to track current PSC metadata
Sean Christopherson [Fri, 29 May 2026 18:35:37 +0000 (20:35 +0200)] 
KVM: SEV: Add an anonymous "psc" struct to track current PSC metadata

Add a "psc" struct to vcpu_sev_es_state to avoid having to prefix all of
the fields with "psc_".

Take advantage of the code churn to opportunistically rename local
variables to "guest_psc" to make it more obvious that the buffer is guest
data, and more importantly, guest accessible!

Opportunistically rename inflight => batch_size as well, because there can
really only be one operation in-flight (per-vCPU), i.e. "inflight" _looks_
like a boolean, but in actuality is an integer tracking how many pages are
being handled by the current operation.

No functional change intended.

Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-13-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20260529183549.1104619-13-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
10 days agoKVM: SEV: Make it more obvious when KVM is writing back the current PSC index
Sean Christopherson [Fri, 29 May 2026 18:35:36 +0000 (20:35 +0200)] 
KVM: SEV: Make it more obvious when KVM is writing back the current PSC index

Increment the guest-visible "cur_entry" index outside of the for-loop
when processing Page State Change entries, and add a comment to make it
more obvious which code is operating on trusted data, and which code is
touching guest-accessible data.

No functional change intended.

Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260501202250.2115252-12-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20260529183549.1104619-12-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
10 days agodma-debug: fix physical address retrieval in debug_dma_sync_sg_for_device
Li RongQing [Wed, 3 Jun 2026 12:37:08 +0000 (20:37 +0800)] 
dma-debug: fix physical address retrieval in debug_dma_sync_sg_for_device

In debug_dma_sync_sg_for_device(), when iterating over a scatterlist,
the debug entry population mistakenly uses the head of the scatterlist
'sg' to fetch the physical address via sg_phys(), instead of using the
current iterator variable 's'.

This causes dma-debug to track the physical address of the very first
scatterlist entry for all subsequent entries in the list.

Fix this by passing the correct loop iterator 's' to sg_phys()

Fixes: 9d4f645a1fd49ee ("dma-debug: store a phys_addr_t in struct dma_debug_entry")
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/20260603123708.1665-1-lirongqing@baidu.com
10 days agoRDMA/umem: Fix truncation for block sizes >= 4G
Jason Gunthorpe [Mon, 1 Jun 2026 16:52:31 +0000 (13:52 -0300)] 
RDMA/umem: Fix truncation for block sizes >= 4G

When the iommu is used the linearization of the mapping can give a single
block that is very large split across multiple SG entries.

When __rdma_block_iter_next() reassembles the split SG entries it is
overflowing the 32 bit stack values and computed the wrong DMA addresses
for blocks after the truncation.

Use the right types to hold DMA addresses.

Link: https://patch.msgid.link/r/1-v1-88303e9e509f+f7-ib_umem_types_jgg@nvidia.com
Cc: stable@vger.kernel.org
Fixes: a808273a495c ("RDMA/verbs: Add a DMA iterator to return aligned contiguous memory blocks")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
10 days agowifi: cfg80211: enforce HE/EHT cap/oper consistency
Johannes Berg [Wed, 3 Jun 2026 09:18:11 +0000 (11:18 +0200)] 
wifi: cfg80211: enforce HE/EHT cap/oper consistency

Xiang Mei reports that mac80211 could crash if eht_cap is set
but eht_oper isn't. Rather than fixing that for the individual
user(s), enforce that both HE/EHT have consistent elements.

Reported-by: Xiang Mei <xmei5@asu.edu>
Fixes: 22c64f37e1d4 ("wifi: mac80211: Update MCS15 support in link_conf")
Link: https://patch.msgid.link/20260603091812.101894-2-johannes@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
10 days agodrm/v3d: Skip CSD when it has zeroed workgroups
Maíra Canal [Tue, 2 Jun 2026 17:50:15 +0000 (14:50 -0300)] 
drm/v3d: Skip CSD when it has zeroed workgroups

A compute shader dispatch encodes its workgroup counts in the CFG0..CFG2
registers. Kicking off a dispatch with a zero count in any of the three
dimensions is invalid. First, the hardware will process 0 as 65536,
while the user-space driver exposes a maximum of 65535. Over that, a
submission with a zeroed workgroup dimension should be a no-op.

These zeroed counts can reach the dispatch path through an indirect CSD
job, whose workgroup counts are only known once the indirect buffer is
read and may legitimately be zero, but such scenario should only result in
a no-op.

Overwrite the indirect CSD job workgroup counts with the indirect BO
ones, even if they are zeroed, and don't submit the job to the hardware
when any of the workgroup counts is zero, so the job completes immediately
instead of running the shader.

Cc: stable@vger.kernel.org
Fixes: d223f98f0209 ("drm/v3d: Add support for compute shader dispatch.")
Suggested-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patch.msgid.link/20260602-v3d-fix-indirect-csd-v4-2-654309e32bc0@igalia.com
Signed-off-by: Maíra Canal <mcanal@igalia.com>
10 days agodrm/v3d: Fix vaddr leak when indirect CSD has zeroed workgroups
Maíra Canal [Tue, 2 Jun 2026 17:50:14 +0000 (14:50 -0300)] 
drm/v3d: Fix vaddr leak when indirect CSD has zeroed workgroups

v3d_rewrite_csd_job_wg_counts_from_indirect() maps both the indirect
buffer and the workgroup buffer and is expected to release them before
returning. When any of the workgroup counts read from the buffer is zero,
the function bailed out early and skipped the cleanup, leaking the vaddr
mappings of both BOs.

Jump to the cleanup path instead of returning directly, so the mappings
are always dropped.

Cc: stable@vger.kernel.org
Fixes: 18b8413b25b7 ("drm/v3d: Create a CPU job extension for a indirect CSD job")
Suggested-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patch.msgid.link/20260602-v3d-fix-indirect-csd-v4-1-654309e32bc0@igalia.com
Signed-off-by: Maíra Canal <mcanal@igalia.com>
10 days agorv: Use 0 to check preemption enabled in opid
Gabriele Monaco [Mon, 1 Jun 2026 15:38:37 +0000 (17:38 +0200)] 
rv: Use 0 to check preemption enabled in opid

Tracepoint handlers no longer run with preemption disabled by default
since a46023d5616 ("tracing: Guard __DECLARE_TRACE() use of
__DO_TRACE_CALL() with SRCU-fast"), the opid monitor should now count 1
in the preemption count as preemption disabled.

Change the rule for preempt_off to preempt > 0.

Reviewed-by: Nam Cao <namcao@linutronix.de>
Link: https://lore.kernel.org/r/20260601153840.124372-11-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
10 days agorv: Prevent task migration while handling per-CPU events
Gabriele Monaco [Mon, 1 Jun 2026 15:38:36 +0000 (17:38 +0200)] 
rv: Prevent task migration while handling per-CPU events

Tracepoint handlers are fully preemptible after a46023d5616 ("tracing:
Guard __DECLARE_TRACE() use of __DO_TRACE_CALL() with SRCU-fast"). When
a per-CPU monitor handles an event, it retrieves the monitor state using
a per-CPU pointer. If the event itself doesn't disable preemption, the
task can migrate to a different CPU and we risk updating the wrong
monitor.

Mitigate this by explicitly disabling task migration before acquiring
the monitor pointer. This cannot guarantee the monitor runs on the
correct CPU but reduces the race condition window and prevents warnings.

Reviewed-by: Wen Yang <wen.yang@linux.dev>
Reviewed-by: Nam Cao <namcao@linutronix.de>
Link: https://lore.kernel.org/r/20260601153840.124372-10-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
10 days agorv: Ensure synchronous cleanup for HA monitors
Gabriele Monaco [Mon, 1 Jun 2026 15:38:35 +0000 (17:38 +0200)] 
rv: Ensure synchronous cleanup for HA monitors

HA monitors may start timers, all cleanup functions currently stop the
timers asynchronously to avoid sleeping in the wrong context.
Nothing makes sure running callbacks terminate on cleanup.

Run the entire HA timer callback in an RCU read-side critical section,
this way we can simply synchronize_rcu() with any pending timer and are
sure any cleanup using kfree_rcu() runs after callbacks terminated.
Additionally make sure any unlikely callback running late won't run any
code if the monitor is marked as disabled or if destruction started.
Use memory barriers to serialise with racing resets.

Fixes: f5587d1b6ec9 ("rv: Add Hybrid Automata monitor type")
Fixes: 4a24127bd6cb ("rv: Add support for per-object monitors in DA/HA")
Reviewed-by: Nam Cao <namcao@linutronix.de>
Link: https://lore.kernel.org/r/20260601153840.124372-9-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
10 days agorv: Add automatic cleanup handlers for per-task HA monitors
Gabriele Monaco [Mon, 1 Jun 2026 15:38:34 +0000 (17:38 +0200)] 
rv: Add automatic cleanup handlers for per-task HA monitors

Hybrid automata monitors may start timers, depending on the model, these
may remain active on an exiting task and cause false positives or even
access freed memory.

Add an enable/disable hook in the HA code, currently only populated by
the per-task handler for registration and deregistration.
This hooks to the sched_process_exit event and ensures the timer is
stopped for every exiting task. The handler is enabled automatically but
may be disabled, for instance if the monitor uses the event for another
purpose (but should still manually ensure timers are stopped).

Fixes: f5587d1b6ec9 ("rv: Add Hybrid Automata monitor type")
Reviewed-by: Nam Cao <namcao@linutronix.de>
Link: https://lore.kernel.org/r/20260601153840.124372-8-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
10 days agorv: Do not rely on clean monitor when initialising HA
Gabriele Monaco [Mon, 1 Jun 2026 15:38:33 +0000 (17:38 +0200)] 
rv: Do not rely on clean monitor when initialising HA

Hybrid Automata monitors hook into the DA implementation when doing
da_monitor_reset(). This function is called both on initialisation and
teardown, HA monitors try to cancel a timer only when it's initialised
relying on the da_mon->monitoring flag. This flag could however be
corrupted during initialisation. This happens for instance on per-task
monitors that share the same storage with different type of monitors
like LTL or in case of races during a previous teardown.

Stop relying on the monitoring flag during initialisation, assume that
can have any value, so use a separate da_reset_state() skiping timer
cancellation.
New monitors (e.g. new tasks) are always zero-initialised so it is safe
to rely on the monitoring flag for those.

Reported-by: Wen Yang <wen.yang@linux.dev>
Closes: https://lore.kernel.org/lkml/d02c656aada7d071f083460a5c9a454363669b61.1778522945.git.wen.yang@linux.dev
Suggested-by: Nam Cao <namcao@linutronix.de>
Fixes: f5587d1b6ec9 ("rv: Add Hybrid Automata monitor type")
Reviewed-by: Wen Yang <wen.yang@linux.dev>
Reviewed-by: Nam Cao <namcao@linutronix.de>
Link: https://lore.kernel.org/r/20260601153840.124372-7-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
10 days agorv: Fix monitor start ordering and memory ordering for monitoring flag
Wen Yang [Mon, 1 Jun 2026 15:38:32 +0000 (17:38 +0200)] 
rv: Fix monitor start ordering and memory ordering for monitoring flag

da_monitor_start() set monitoring=1 before calling da_monitor_init_hook(),
may racing with the sched_switch handler:

  da_monitor_start()               sched_switch handler
  -------------------------        ---------------------------------
  da_mon->monitoring = 1;
                                   if (da_monitoring(da_mon))  /* true  */
                                       ha_start_timer_ns(...);
                                       /* hrtimer->base == NULL, crash */
  da_monitor_init_hook(da_mon);
  /* hrtimer_setup() sets base */

Fix the ordering and pair with release/acquire semantics:

  da_monitor_init_hook(da_mon);
  smp_store_release(&da_mon->monitoring, 1);    /* da_monitor_start()  */
  return smp_load_acquire(&da_mon->monitoring); /* da_monitoring()     */

On ARM64 a plain STR + LDR does not form a release-acquire pair, so
the load can observe monitoring=1 while hrtimer->base is still NULL.
The plain accesses are also data races under KCSAN.

Use WRITE_ONCE for the monitoring=0 store in da_monitor_reset() to
cover the reset path.

Fixes: 792575348ff7 ("rv/include: Add deterministic automata monitor definition via C macros")
Signed-off-by: Wen Yang <wen.yang@linux.dev>
Reviewed-by: Gabriele Monaco <gmonaco@redhat.com>
Reviewed-by: Nam Cao <namcao@linutronix.de>
Link: https://lore.kernel.org/r/20260601153840.124372-6-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
10 days agorv: Ensure all pending probes terminate on per-obj monitor destroy
Gabriele Monaco [Mon, 1 Jun 2026 15:38:31 +0000 (17:38 +0200)] 
rv: Ensure all pending probes terminate on per-obj monitor destroy

The monitor disable/destroy sequence detaches all probes and resets the
monitor's data, however it doesn't wait for pending probes. This is an
issue with per-object monitors, which free the monitor storage.

Call tracepoint_synchronize_unregister() to make sure to wait for all
pending probes before destroying the monitor storage.

Fixes: 4a24127bd6cb ("rv: Add support for per-object monitors in DA/HA")
Reviewed-by: Wen Yang <wen.yang@linux.dev>
Reviewed-by: Nam Cao <namcao@linutronix.de>
Link: https://lore.kernel.org/r/20260601153840.124372-5-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
10 days agorv: Prevent in-flight per-task handlers from using invalid slots
Gabriele Monaco [Mon, 1 Jun 2026 15:38:30 +0000 (17:38 +0200)] 
rv: Prevent in-flight per-task handlers from using invalid slots

Per-task monitors use a slot in the task_struct->rv[] array and store
that locally (e.g. task_mon_slot), this slot is returned during the
destruction process but currently hanlers can be running while that slot
is returning and this race may lead to accessing an invalid slot.

Synchronise with all in-flight tracepoint handlers using
tracepoint_synchronize_unregister() before returning the slot.

Fixes: f5587d1b6ec9 ("rv: Add Hybrid Automata monitor type")
Fixes: a9769a5b9878 ("rv: Add support for LTL monitors")
Suggested-by: Wen Yang <wen.yang@linux.dev>
Reviewed-by: Nam Cao <namcao@linutronix.de>
Link: https://lore.kernel.org/r/20260601153840.124372-4-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
10 days agorv: Reset per-task DA monitors before releasing the slot
Gabriele Monaco [Mon, 1 Jun 2026 15:38:29 +0000 (17:38 +0200)] 
rv: Reset per-task DA monitors before releasing the slot

Per-task monitors use task_mon_slot to determine which slot in the array
to use for the monitor. During destruction, this slot is returned but
this is done before resetting the monitor. As a result, the monitor's
reset is in fact resetting a slot that is outside of the array
(RV_PER_TASK_MONITOR_INIT).

Release the slot only after the reset to avoid out-of-bound memory
access.

Fixes: f5587d1b6ec93 ("rv: Add Hybrid Automata monitor type")
Cc: stable@vger.kernel.org
Suggested-by: Wen Yang <wen.yang@linux.dev>
Reviewed-by: Wen Yang <wen.yang@linux.dev>
Reviewed-by: Nam Cao <namcao@linutronix.de>
Link: https://lore.kernel.org/r/20260601153840.124372-3-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
10 days agorv: Fix __user specifier usage in extract_params()
Gabriele Monaco [Mon, 1 Jun 2026 15:38:28 +0000 (17:38 +0200)] 
rv: Fix __user specifier usage in extract_params()

The attributes variables extracted from syscalls in the helper are both
defined with the __user specifier although only the actual pointer to
user data should be marked.

Remove the __user specifier from attr.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202604150820.Ny143u6X-lkp@intel.com
Fixes: b133207deb72 ("rv: Add nomiss deadline monitor")
Reviewed-by: Wen Yang <wen.yang@linux.dev>
Reviewed-by: Nam Cao <namcao@linutronix.de>
Link: https://lore.kernel.org/r/20260601153840.124372-2-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
10 days agopmdomain: imx: fix OF node refcount
Bartosz Golaszewski [Thu, 21 May 2026 08:36:27 +0000 (10:36 +0200)] 
pmdomain: imx: fix OF node refcount

for_each_child_of_node_scoped() decrements the reference count of the
nod after each iteration. Assigning it without incrementing the refcount
to a dynamically allocated platform device will result in a double put
in platform_device_release(). Add the missing call to of_node_get().

Cc: stable@vger.kernel.org
Fixes: 3e4d109ee8fc ("pmdomain: imx: gpc: Simplify with scoped for each OF child loop")
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Signed-off-by: Ulf Hansson <ulfh@kernel.org>
10 days agopmdomain: ti_sci: add wakeup constraint to parent devices of wakeup source
Kendall Willis [Thu, 7 May 2026 03:16:45 +0000 (22:16 -0500)] 
pmdomain: ti_sci: add wakeup constraint to parent devices of wakeup source

Set wakeup constraint for any device in a wakeup path. All parent devices
of a wakeup device should not be turned off during suspend. This ensures
the wakeup device is kept on while the system is suspended.

Cc: stable@vger.kernel.org
Fixes: 9d8aa0dd3be4 ("pmdomain: ti_sci: add wakeup constraint management")
Reported-by: Vitor Soares <vitor.soares@toradex.com>
Closes: https://lore.kernel.org/linux-pm/c0fe43a2339c802e9ce5900092cd530a2ba17a6b.camel@gmail.com/
Signed-off-by: Kendall Willis <k-willis@ti.com>
Reviewed-by: Sebin Francis <sebin.francis@ti.com>
Signed-off-by: Ulf Hansson <ulfh@kernel.org>
10 days agodrm/i915: Fix color blob reference handling in intel_plane_state
Chaitanya Kumar Borah [Mon, 1 Jun 2026 08:29:53 +0000 (13:59 +0530)] 
drm/i915: Fix color blob reference handling in intel_plane_state

Take proper references for hw color blobs (degamma_lut, gamma_lut,
ctm, lut_3d) in intel_plane_duplicate_state() and drop them in
intel_plane_destroy_state().

v2:
- handle blobs in hw state clear

Cc: <stable@vger.kernel.org> #v6.19+
Fixes: 3b7476e786c2 ("drm/i915/color: Add framework to program PRE/POST CSC LUT")
Fixes: a78f1b6baf4d ("drm/i915/color: Add framework to program CSC")
Fixes: 65db7a1f9cf7 ("drm/i915/color: Add 3D LUT to color pipeline")
Reviewed-by: Pranay Samala <pranay.samala@intel.com> #v1
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com>
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patch.msgid.link/20260601082953.128539-4-chaitanya.kumar.borah@intel.com
(cherry picked from commit c6eea1925154b6697fe22b217faab9bb30635e6b)
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
10 days agoMerge tag 'iwlwifi-fixes-2026-05-31' of https://git.kernel.org/pub/scm/linux/kernel...
Johannes Berg [Tue, 2 Jun 2026 11:28:38 +0000 (13:28 +0200)] 
Merge tag 'iwlwifi-fixes-2026-05-31' of https://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-next

wifi: iwlwifi: fixes - 2026-05-31

Miri Korenblit says:
====================
This contains a few fixes:
- Don't grab nic access in non-fast-resume
- Don't send a large hcmd than transport supports
- In AP mode, don't send tx power constraints command before activating
  the link
- Don't do sw reset handshake on older firmwares.
====================

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
10 days agowifi: fix leak if split 6 GHz scanning fails
Fedor Pchelkin [Mon, 1 Jun 2026 09:41:56 +0000 (12:41 +0300)] 
wifi: fix leak if split 6 GHz scanning fails

rdev->int_scan_req is leaked if cfg80211_scan() fails.  Note that it's
supposed to be released at ___cfg80211_scan_done() but this doesn't happen
as rdev->scan_req is NULL at that point, too, leading to the early return
from the freeing function.

unreferenced object 0xffff8881161d0800 (size 512):
  comm "wpa_supplicant", pid 379, jiffies 4294749765
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 f0 81 13 16 81 88 ff ff  ................
  backtrace (crc c867fdb6):
    kmemleak_alloc+0x89/0x90
    __kmalloc_noprof+0x2fd/0x410
    cfg80211_scan+0x133/0x730
    nl80211_trigger_scan+0xc69/0x1cc0
    genl_family_rcv_msg_doit+0x204/0x2f0
    genl_rcv_msg+0x431/0x6b0
    netlink_rcv_skb+0x143/0x3f0
    genl_rcv+0x27/0x40
    netlink_unicast+0x4f6/0x820
    netlink_sendmsg+0x797/0xce0
    __sock_sendmsg+0xc4/0x160
    ____sys_sendmsg+0x5e4/0x890
    ___sys_sendmsg+0xf8/0x180
    __sys_sendmsg+0x136/0x1e0
    __x64_sys_sendmsg+0x76/0xc0
    x64_sys_call+0x13f0/0x17d0

Found by Linux Verification Center (linuxtesting.org).

Fixes: c8cb5b854b40 ("nl80211/cfg80211: support 6 GHz scanning")
Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
Link: https://patch.msgid.link/20260601094157.92703-1-pchelkin@ispras.ru
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
10 days agodma-mapping: direct: fix missing mapping for THRU_HOST_BRIDGE segments
Li RongQing [Wed, 3 Jun 2026 01:37:23 +0000 (09:37 +0800)] 
dma-mapping: direct: fix missing mapping for THRU_HOST_BRIDGE segments

In dma_direct_map_sg(), the case PCI_P2PDMA_MAP_THRU_HOST_BRIDGE
incorrectly used 'break' instead of falling through to MAP_NONE.
As a result, segments traversing the host bridge skipped the required
dma_direct_map_phys() call entirely, leaving sg->dma_address
uninitialized and leading to DMA failures. Fix this by using
'fallthrough;'.

Fixes: a25e7962db0d79 ("PCI/P2PDMA: Refactor the p2pdma mapping helpers")
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/20260603013723.2439-1-lirongqing@baidu.com
10 days agoipv6: anycast: insert aca into global hash under idev->lock
Jiayuan Chen [Fri, 29 May 2026 15:22:18 +0000 (23:22 +0800)] 
ipv6: anycast: insert aca into global hash under idev->lock

syzbot reported a splat [1]: a slab-use-after-free in
ipv6_chk_acast_addr(), which walks the global inet6_acaddr_lst[] hash
under RCU and dereferences a struct ifacaddr6 that has already been
freed while still linked in the hash, so a later reader walks into a
dangling node.

In __ipv6_dev_ac_inc() the aca is allocated with refcount 1, then
aca_get() bumps it to 2 to keep it alive across the unlocked region.
It is published to idev->ac_list under idev->lock, but
ipv6_add_acaddr_hash() runs after write_unlock_bh(). A concurrent
teardown (ipv6_ac_destroy_dev() from addrconf_ifdown(), under RTNL)
can slip into that window:

  CPU0 __ipv6_dev_ac_inc           CPU1 ipv6_ac_destroy_dev (RTNL)
  ------------------------------   ------------------------------------
  aca_alloc()              refcnt 1
  aca_get()               refcnt 2
  write_lock_bh(idev->lock)
    add aca to ac_list
  write_unlock_bh(idev->lock)
                                   write_lock_bh(idev->lock)
                                     pull aca off ac_list
                                   write_unlock_bh(idev->lock)
                                   ipv6_del_acaddr_hash(aca)
                                     hlist_del_init_rcu() is a no-op,
                                     aca is not in the hash yet
                                   aca_put()           refcnt 2->1
  ipv6_add_acaddr_hash(aca)
    aca now inserted into the hash
  aca_put()                refcnt 1->0
    call_rcu(aca_free_rcu) -> kfree(aca)

The hash removal becomes a no-op because the insertion has not
happened yet, so once CPU0 inserts and drops the last reference, the
aca is freed while still linked in inet6_acaddr_lst[], and readers
dereference freed memory after the slab slot is reused.

This window opened once RTNL stopped serializing the join path against
device teardown. Move ipv6_add_acaddr_hash() inside the idev->lock
section so the ac_list and hash insertions are atomic with respect to
teardown: a racing remover now either misses the aca entirely or finds
it in both lists.

acaddr_hash_lock is now nested under idev->lock, which is acquired in
softirq context, so switch all acaddr_hash_lock sites to spin_lock_bh()
to avoid the irq lock inversion reported in [2].

[1] https://syzkaller.appspot.com/bug?extid=a01df04303c131efbf3a
[2] https://lore.kernel.org/netdev/6a194ef7.ba3b1513.1890b4.0000.GAE@google.com/

Reported-by: syzbot+819eb928d120d2bdad0e@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/6a191f87.ce022c6e.138e56.0003.GAE@google.com/T/
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Fixes: eb1ac9ff6c4a ("ipv6: anycast: Don't hold RTNL for IPV6_JOIN_ANYCAST.")
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20260529152219.235475-1-jiayuan.chen@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 days agoInput: atkbd - add DMI quirk for Lenovo Yoga Air 14 (83QK)
Zeyu WANG [Tue, 2 Jun 2026 17:09:09 +0000 (01:09 +0800)] 
Input: atkbd - add DMI quirk for Lenovo Yoga Air 14 (83QK)

The Lenovo Yoga Air 14 (83QK) laptop keyboard becomes unresponsive
after the standard atkbd init sequence. Controlled testing on the
actual hardware shows the F5 (ATKBD_CMD_RESET_DIS / deactivate)
command specifically corrupts the EC state, causing zero IRQ1
interrupts after init.

Skipping only the deactivate command (while keeping F4 ENABLE)
resolves the issue completely: both keystroke input and CapsLock
LED toggle work correctly. The reverse test - skipping only F4
while keeping F5 - makes the problem worse (zero keystroke
interrupts), confirming F5 is the sole culprit.

Add a DMI quirk entry for LENOVO/83QK using the existing
atkbd_deactivate_fixup callback, consistent with the existing
entries for LG Electronics and HONOR FMB-P that address the
same EC F5 deactivate issue.

Signed-off-by: Zeyu WANG <zeyu.thomas.wang@gmail.com>
Link: https://patch.msgid.link/20260602170909.14725-1-zeyu.thomas.wang@gmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
10 days agocgroup/cpuset: Change Ridong's email
Ridong Chen [Tue, 2 Jun 2026 09:10:38 +0000 (17:10 +0800)] 
cgroup/cpuset: Change Ridong's email

The chenridong@huaweicloud.com is no longer a valid email,
replace it with the personal email ridong.chen@linux.dev

Signed-off-by: Ridong Chen <ridong.chen@linux.dev>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
10 days agosched_ext: Don't warn on NULL cgrp_moving_from in scx_cgroup_move_task()
Tejun Heo [Mon, 1 Jun 2026 19:22:37 +0000 (09:22 -1000)] 
sched_ext: Don't warn on NULL cgrp_moving_from in scx_cgroup_move_task()

A WARN fires when systemd's user manager writes "+cpu +memory +pids" to
its own subtree_control while a sched_ext scheduler is loaded:

  WARNING: at kernel/sched/ext.c:3227 scx_cgroup_move_task+0xa8/0xb0
   scx_cgroup_move_task+0xa8/0xb0
   sched_move_task+0x134/0x290
   cpu_cgroup_attach+0x39/0x70
   cgroup_migrate_execute+0x37d/0x450
   cgroup_update_dfl_csses+0x1e3/0x270
   cgroup_subtree_control_write+0x3e7/0x440

scx_cgroup_can_attach() arms cgrp_moving_from only when a task's cpu
cgroup changes. It can still be NULL when scx_cgroup_move_task() runs,
through this sequence:

  Step                               Result
  ---------------------------------  ----------------------------------
  1. cpu enabled on cgroup G         cpu css = A
  2. cpu toggled off then on for G   A killed, B created (same cgroup)
  3. an exiting task keeps A alive   migration skips it, A now stale
  4. +memory migrates G              stale A vs current B pulls cpu in
  5. cpu attach runs for all tasks   hits a live, cpu-unchanged task
  6. scx_cgroup_move_task() on it    cgrp_moving_from NULL -> WARN

The mismatch is that scx_cgroup_can_attach() keys on cgroup identity
while migration drives the move on css identity, so a NULL cgrp_moving_from
here is a legitimate css-only migration, not a missing prep.

The call is already gated on cgrp_moving_from, so just drop the warning.
ops.cgroup_prep_move() and ops.cgroup_move() stay paired.

Fixes: 819513666966 ("sched_ext: Add cgroup support")
Cc: stable@vger.kernel.org # v6.12+
Reported-by: Matt Fleming <mfleming@cloudflare.com>
Closes: https://lore.kernel.org/all/20260601124156.2205704-1-mfleming@cloudflare.com/
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Andrea Righi <arighi@nvidia.com>
10 days agosctp: diag: reject stale associations in dump_one path
Zhao Zhang [Sat, 30 May 2026 15:57:14 +0000 (23:57 +0800)] 
sctp: diag: reject stale associations in dump_one path

The SCTP exact sock_diag lookup can hold a transport reference, block on
lock_sock(sk), and then resume after sctp_association_free() has marked
the association dead and freed its bind address list.

When that happens, inet_assoc_attr_size() and
inet_diag_msg_sctpasoc_fill() can still dereference association state
that is no longer valid for reporting. In particular,
inet_diag_msg_sctpasoc_fill() may read an empty bind-address list as a
real sctp_sockaddr_entry and trigger an out-of-bounds read from
unrelated association memory.

Reject the association after taking the socket lock if it has been
reaped or detached from the endpoint, and report the lookup as stale.
This keeps the exact dump-one path from formatting torn association
state.

Fixes: 8f840e47f190 ("sctp: add the sctp_diag.c file")
Cc: stable@kernel.org
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Zhengchuan Liang <zcliangcn@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Signed-off-by: Zhao Zhang <zzhan461@ucr.edu>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Acked-by: Xin Long <lucien.xin@gmail.com>
Link: https://patch.msgid.link/fac6043fa20a2ff68e12958c431836f692c51268.1780113823.git.zzhan461@ucr.edu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 days agonet: fec: fix pinctrl default state restore order on resume
Tapio Reijonen [Fri, 29 May 2026 06:18:57 +0000 (06:18 +0000)] 
net: fec: fix pinctrl default state restore order on resume

In fec_resume(), fec_enet_clk_enable() is called before
pinctrl_pm_select_default_state() in the non-WoL path, inverting the
ordering used in fec_suspend() which correctly switches to the sleep
pinctrl state before disabling clocks.

For PHYs with the PHY_RST_AFTER_CLK_EN flag (e.g. TI DP83848 or
SMSC LAN87xx), fec_enet_clk_enable() triggers a hardware reset pulse
via the phy-reset GPIO. With the GPIO pin still in sleep pinctrl state
at that point, the GPIO write has no physical effect and the PHY never
receives the required reset after clock enable, leading to unreliable
link establishment after system resume.

Fix by restoring the default pinctrl state before enabling clocks,
making resume the proper mirror of suspend. The call is made
unconditionally: fec_suspend() only switches to the sleep pinctrl state
on the non-WoL path and leaves the pins in the default state when WoL
is enabled, so on a WoL resume the device is already in the default
state and pinctrl_pm_select_default_state() is a no-op.

Fixes: de40ed31b3c5 ("net: fec: add Wake-on-LAN support")
Signed-off-by: Tapio Reijonen <tapio.reijonen@vaisala.com>
Reviewed-by: Wei Fang <wei.fang@nxp.com>
Link: https://patch.msgid.link/20260529-b4-fec-resume-pinctrl-order-v3-1-6eda0f592fca@vaisala.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 days agonet: lan743x: permit VLAN-tagged packets up to configured MTU
David Thompson [Fri, 29 May 2026 21:03:00 +0000 (21:03 +0000)] 
net: lan743x: permit VLAN-tagged packets up to configured MTU

VLAN-tagged interfaces on lan743x devices were previously unreachable via
SSH and failed to respond to large ping packets (e.g. "ping -s 1469" given
MTU=1500). In these scenarios, "ethtool -S" reports non-zero "RX Oversize
Frame Errors". According to Microchip AN2948, the MAC_RX FSE (VLAN field
size enforcement) bit determines whether frames with VLAN tags exceeding
the base MTU plus tag length are discarded.

The driver must set the MAC_RX.FSE bit before setting MAC_RX.RXEN to allow
VLAN-tagged frames up to the interface MTU, preventing them from being
treated as oversized. As a result, both the base and VLAN-tagged interfaces
can use the same MTU without receive errors.

Fixes: 23f0703c125b ("lan743x: Add main source files for new lan743x driver")
Signed-off-by: David Thompson <davthompson@nvidia.com>
Reviewed-by: Thangaraj Samynathan <Thangaraj.s@microchip.com>
Reviewed-by: Nicolai Buchwitz <nb@tipi-net.de>
Tested-by: Nicolai Buchwitz <nb@tipi-net.de> # lan7430 on arm64 (RevPi
Link: https://patch.msgid.link/20260529210300.433135-1-davthompson@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 days agonet: rds: clear i_sends on setup unwind
Yuqi Xu [Fri, 29 May 2026 13:01:44 +0000 (21:01 +0800)] 
net: rds: clear i_sends on setup unwind

The RDS IB connection teardown path is written so it can run during
partial startup and on repeated shutdown attempts. It uses NULL
pointers to distinguish resources that are still owned from resources
that have already been released.

When rds_ib_setup_qp() fails after allocating i_sends but before
allocating i_recvs, the sends_out path frees i_sends without clearing
the pointer. A later shutdown pass can still treat that stale pointer
as a live send ring allocation.

Clear i_sends after vfree() in the error unwind path so the existing
shutdown logic continues to use the correct ownership state.

Fixes: 3b12f73a5c29 ("rds: ib: add error handle")
Cc: stable@kernel.org
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Zhengchuan Liang <zcliangcn@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Signed-off-by: Yuqi Xu <xuyq21@lenovo.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Reviewed-by: Allison Henderson <achender@kernel.org>
Link: https://patch.msgid.link/5a0f7624bb9845a7b67d26166a150b59e7f394ce.1779632468.git.xuyq21@lenovo.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 days agofutex/requeue: Prevent NULL pointer dereference in remove_waiter() on self-deadlock
Ji'an Zhou [Tue, 2 Jun 2026 09:12:04 +0000 (09:12 +0000)] 
futex/requeue: Prevent NULL pointer dereference in remove_waiter() on self-deadlock

When FUTEX_CMP_REQUEUE_PI requeues a non-top waiter that already owns the
target PI futex, task_blocks_on_rt_mutex() returns -EDEADLK before setting
waiter->task.

The subsequent remove_waiter() in rt_mutex_start_proxy_lock() dereferences
the NULL waiter->task, causing a kernel crash.

Add a self-deadlock check for non-top waiters before calling
rt_mutex_start_proxy_lock(), analogous to the top-waiter check in
futex_lock_pi_atomic().

Fixes: 3bfdc63936dd4773109b7b8c280c0f3b5ae7d349 ("rtmutex: Use waiter::task instead of current in remove_waiter()")
Signed-off-by: Ji'an Zhou <eilaimemedsnaimel@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Cc: stable@vger.kernel.org
10 days agovdso/datastore: Mark vdso_k_*_data pointers as __ro_after_init
Thomas Weißschuh [Wed, 13 May 2026 06:32:46 +0000 (08:32 +0200)] 
vdso/datastore: Mark vdso_k_*_data pointers as __ro_after_init

These pointers are only modified once in vdso_setup_data_pages(),
during the init phase. Make them read-only after that.

Drop __refdata as that would conflict with __ro_after_init.
Modpost does accept the reference from a __ro_after_init symbol to
an __init one.

Fixes: 05988dba1179 ("vdso/datastore: Allocate data pages dynamically")
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260513-vdso-ro-after-init-v1-1-4b51f74015a4@linutronix.de
10 days agonet: garp: fix unsigned integer underflow in garp_pdu_parse_attr
Yizhou Zhao [Wed, 27 May 2026 08:31:58 +0000 (16:31 +0800)] 
net: garp: fix unsigned integer underflow in garp_pdu_parse_attr

The receive-side GARP attribute parser computes dlen with reversed
operands:

        dlen = sizeof(*ga) - ga->len;

ga->len is the on-wire attribute length and includes the GARP attribute
header. For normal attributes with data, ga->len is larger than
sizeof(*ga), so the subtraction underflows in unsigned arithmetic.

The resulting value is later passed to garp_attr_lookup(), whose length
argument is u8. After truncation, the parsed data length usually no
longer matches the length stored for locally registered attributes, so
received Join/Leave events are ignored. This breaks the GARP receive path
for common attributes, such as GVRP VLAN registration attributes.

Compute the data length as the attribute length minus the header length.

Fixes: eca9ebac651f ("net: Add GARP applicant-only participant")
Reported-by: Yizhou Zhao <zhaoyz24@mails.tsinghua.edu.cn>
Reported-by: Yuxiang Yang <yangyx22@mails.tsinghua.edu.cn>
Reported-by: Ao Wang <wangao@seu.edu.cn>
Reported-by: Xuewei Feng <fengxw06@126.com>
Reported-by: Qi Li <qli01@tsinghua.edu.cn>
Reported-by: Ke Xu <xuke@tsinghua.edu.cn>
Signed-off-by: Yizhou Zhao <zhaoyz24@mails.tsinghua.edu.cn>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260527083200.42861-1-zhaoyz24@mails.tsinghua.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 days agotime: Fix off-by-one in settimeofday() usec validation
Naveen Kumar Chaudhary [Tue, 2 Jun 2026 18:07:37 +0000 (23:37 +0530)] 
time: Fix off-by-one in settimeofday() usec validation

The validation check uses '>' instead of '>=' when comparing tv_usec
against USEC_PER_SEC, allowing the value 1000000 through. After
conversion to nanoseconds (*= 1000), this produces tv_nsec ==
NSEC_PER_SEC, violating the timespec invariant that tv_nsec must be
less than NSEC_PER_SEC.

Use '>=' to reject tv_usec values that are not in the valid range of
0 to 999999.

Fixes: 5e0fb1b57bea ("y2038: time: avoid timespec usage in settimeofday()")
Signed-off-by: Naveen Kumar Chaudhary <naveen.osdev@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Acked-by: John Stultz <jstultz@google.com>
Link: https://patch.msgid.link/4rikk44zew3s6577dugmx4jyblz7o5c57niuap6ct3td5yfm6w@gh7pcumg7qor
10 days agoclockevents: Fix duplicate type specifier in stub function parameter
Naveen Kumar Chaudhary [Mon, 1 Jun 2026 03:57:46 +0000 (09:27 +0530)] 
clockevents: Fix duplicate type specifier in stub function parameter

The stub for arch_inlined_clockevent_set_next_coupled() has 'u64 u64
cycles' in its parameter list. Since u64 is a typedef, the compiler
parses the second 'u64' as the parameter name, making 'cycles' an
unused token. Remove the duplicate so the parameter is correctly named.

Fixes: 89f951a1e8ad ("clockevents: Provide support for clocksource coupled comparators")
Signed-off-by: Naveen Kumar Chaudhary <naveen.osdev@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/7tostpvxzdn6tobmyow63a5rweatls5kux3scqp2vzhe7mv6uq@ecr746b4hyhf
10 days agohsr: Remove WARN_ONCE() in hsr_addr_is_self().
Kuniyuki Iwashima [Sat, 30 May 2026 06:42:58 +0000 (06:42 +0000)] 
hsr: Remove WARN_ONCE() in hsr_addr_is_self().

syzbot reported the warning [0] in hsr_addr_is_self(),
whose assumption is simply wrong.

hsr->self_node is cleared in hsr_del_self_node(), which
is called from hsr_dellink().

Since dev->rtnl_link_ops->dellink() is called before
unregister_netdevice_many(), there is a window when
user can find the device but without hsr->self_node.

Let's remove WARN_ONCE() in hsr_addr_is_self().

[0]:
HSR: No self node
WARNING: net/hsr/hsr_framereg.c:39 at hsr_addr_is_self+0x211/0x3f0 net/hsr/hsr_framereg.c:39, CPU#0: syz.4.16848/17220
Modules linked in:
CPU: 0 UID: 0 PID: 17220 Comm: syz.4.16848 Tainted: G             L      syzkaller #0 PREEMPT_{RT,(full)}
Tainted: [L]=SOFTLOCKUP
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/18/2026
RIP: 0010:hsr_addr_is_self+0x211/0x3f0 net/hsr/hsr_framereg.c:39
Code: 33 2f 41 0f b7 dd 89 ee 09 de 31 ff e8 c8 b4 c6 f6 09 dd 74 54 e8 0f b0 c6 f6 31 ed eb 53 e8 06 b0 c6 f6 48 8d 3d 2f 50 9c 04 <67> 48 0f b9 3a 31 ed eb 42 e8 c1 13 1f 00 89 c5 31 ff 89 c6 e8 96
RSP: 0018:ffffc900041c70e0 EFLAGS: 00010283
RAX: ffffffff8afdc6ca RBX: ffffffff8afdc4e6 RCX: 0000000000080000
RDX: ffffc90010493000 RSI: 0000000000000948 RDI: ffffffff8f9a1700
RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
R10: ffffc900041c71e8 R11: fffff52000838e3f R12: dffffc0000000000
R13: ffff888041f9e3c0 R14: ffff888086ee3802 R15: 0000000000000000
FS:  00007f6fe985d6c0(0000) GS:ffff888126176000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f80bd437dac CR3: 0000000025096000 CR4: 00000000003526f0
DR0: ffffffffffffffff DR1: 00000000000001f8 DR2: 0000000000000002
DR3: ffffffffefffff15 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 check_local_dest net/hsr/hsr_forward.c:592 [inline]
 fill_frame_info net/hsr/hsr_forward.c:728 [inline]
 hsr_forward_skb+0xa11/0x2a80 net/hsr/hsr_forward.c:739
 hsr_dev_xmit+0x253/0x370 net/hsr/hsr_device.c:236
 __netdev_start_xmit include/linux/netdevice.h:5368 [inline]
 netdev_start_xmit include/linux/netdevice.h:5377 [inline]
 xmit_one net/core/dev.c:3888 [inline]
 dev_hard_start_xmit+0x2df/0x860 net/core/dev.c:3904
 __dev_queue_xmit+0x1428/0x3900 net/core/dev.c:4870
 neigh_output include/net/neighbour.h:556 [inline]
 ip_finish_output2+0xcec/0x10b0 net/ipv4/ip_output.c:237
 ip_send_skb net/ipv4/ip_output.c:1510 [inline]
 ip_push_pending_frames+0x8b/0x110 net/ipv4/ip_output.c:1530
 raw_sendmsg+0x1547/0x1a50 net/ipv4/raw.c:659
 sock_sendmsg_nosec net/socket.c:787 [inline]
 __sock_sendmsg net/socket.c:802 [inline]
 ____sys_sendmsg+0x7da/0x9c0 net/socket.c:2698
 ___sys_sendmsg+0x2a5/0x360 net/socket.c:2752
 __sys_sendmsg net/socket.c:2784 [inline]
 __do_sys_sendmsg net/socket.c:2789 [inline]
 __se_sys_sendmsg net/socket.c:2787 [inline]
 __x64_sys_sendmsg+0x1c3/0x2a0 net/socket.c:2787
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x15f/0xf80 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f6feb62ce59
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f6fe985d028 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f6feb8a6090 RCX: 00007f6feb62ce59
RDX: 0000000000000000 RSI: 0000200000000000 RDI: 0000000000000004
RBP: 00007f6feb6c2d6f R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f6feb8a6128 R14: 00007f6feb8a6090 R15: 00007ffcf01cc488
 </TASK>

Fixes: f266a683a480 ("net/hsr: Better frame dispatch")
Reported-by: syzbot+652670cf249077eb498b@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/6a1a861e.b111c304.35cd64.0016.GAE@google.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Fernando Fernandez Mancera <fmancera@suse.de>
Link: https://patch.msgid.link/20260530064300.340793-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 days agoMerge tag 'nf-26-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Jakub Kicinski [Tue, 2 Jun 2026 18:57:21 +0000 (11:57 -0700)] 
Merge tag 'nf-26-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf

Pablo Neira Ayuso says:

====================
Netfilter/IPVS fixes for net

The following patchset contains Netfilter/IPVS fixes for net:

1) Fix splat with PREEMPT_RCU because smp_processor_id() in nfqueue,
   from Fernando Fernandez Mancera.

2) Fix possible use of pointer to old IPVS scheduler after RCU grace
   period when editing service, from Julian Anastasov.

3) Fix possible forever RCU walk over rt->fib6_siblings in nft_fib6,
   if rt is unlinked mid-iteration, apparently same issue happens in
   the fib6 core. From Jiayuan Chen.

4) Add mutex to guard refcount in synproxy infrastructure, since
   concurrent hook {un}registration can happen.
   From Fernando Fernandez Mancera.

5) Bail out if IRC conntrack helper fails to parse a command, do not
   try parsing using other command handlers, from Florian Westphal.
   This fixes a possible out-of-bound read.

6) Possible use-after-free in nft_tunnel by releasing template dst
   after all references has been dropped, from Tristan Madani.

7) Ignore conntrack template in nft_ct, from Jiayuan Chen.

8) Missing skb_ensure_writable() in ebt_snat, Yiming Qian.

9) Remove multi-register byteorder support, this allows for kernel
   stack info leak, from Florian Westphal.

* tag 'nf-26-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: nft_byteorder: remove multi-register support
  netfilter: bridge: make ebt_snat ARP rewrite writable
  netfilter: nft_ct: bail out on template ct in get eval
  netfilter: nft_tunnel: fix use-after-free on object destroy
  netfilter: conntrack_irc: fix possible out-of-bounds read
  netfilter: synproxy: add mutex to guard hook reference counting
  netfilter: nft_fib_ipv6: bail out of sibling walk if rt got unlinked
  ipvs: clear the svc scheduler ptr early on edit
  netfilter: xt_NFQUEUE: prefer raw_smp_processor_id
====================

Link: https://patch.msgid.link/20260601115923.433946-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 days agotcp: Add preempt_{disable,enable}_nested() in reqsk_queue_hash_req().
Kuniyuki Iwashima [Mon, 1 Jun 2026 18:20:55 +0000 (18:20 +0000)] 
tcp: Add preempt_{disable,enable}_nested() in reqsk_queue_hash_req().

syzbot reported a weird reqsk->rsk_refcnt underflow in
__inet_csk_reqsk_queue_drop().

The captured reqsk_put() in __inet_csk_reqsk_queue_drop()
is called only when it successfully removes reqsk from ehash.

Moreover, reqsk_timer_handler() calls another reqsk_put()
after that.

This indicates that the reqsk was missing both refcnts for
ehash and the timer itself.

Since all the syzbot reports had PREEMPT_RT enabled, the only
possible scenario is that reqsk_queue_hash_req() is preempted
after mod_timer() and before refcount_set(), and then the timer
triggered after 1s aborts the reqsk due to its listener's close().

Let's wrap mod_timer() and refcount_set() with
preempt_disable_nested() and preempt_enable_nested().

Note that inet_ehash_insert() holds the normal spin_lock()
(mutex in PREEMPT_RT), so it must be called outside of
preempt_disable_nested(), but this is fine.

The lookup path just ignores 0 sk_refcnt entries in ehash
and tries to create another reqsk, but this will fail at
inet_ehash_insert().

[0]:
refcount_t: underflow; use-after-free.
WARNING: lib/refcount.c:28 at refcount_warn_saturate+0xb2/0x110 lib/refcount.c:28, CPU#0: ktimers/0/16
Modules linked in:
CPU: 0 UID: 0 PID: 16 Comm: ktimers/0 Tainted: G             L      syzkaller #0 PREEMPT_{RT,(full)}
Tainted: [L]=SOFTLOCKUP
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/18/2026
RIP: 0010:refcount_warn_saturate+0xb2/0x110 lib/refcount.c:28
Code: e4 7d d1 0a 67 48 0f b9 3a eb 4a e8 38 3d 23 fd 48 8d 3d e1 7d d1 0a 67 48 0f b9 3a eb 37 e8 25 3d 23 fd 48 8d 3d de 7d d1 0a <67> 48 0f b9 3a eb 24 e8 12 3d 23 fd 48 8d 3d db 7d d1 0a 67 48 0f
RSP: 0000:ffffc90000157948 EFLAGS: 00010246
RAX: ffffffff84a1301b RBX: 0000000000000003 RCX: ffff88801ca98000
RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffffffff8f72ae00
RBP: ffffffff99ae3b01 R08: ffff88801ca98000 R09: 0000000000000005
R10: 0000000000000100 R11: 0000000000000004 R12: ffff8880425ef568
R13: ffff8880425ef4f8 R14: ffff8880425ef578 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff888126386000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7b46710e9c CR3: 000000000dbb6000 CR4: 00000000003526f0
Call Trace:
 <TASK>
 __refcount_sub_and_test include/linux/refcount.h:400 [inline]
 __refcount_dec_and_test include/linux/refcount.h:432 [inline]
 refcount_dec_and_test include/linux/refcount.h:450 [inline]
 reqsk_put include/net/request_sock.h:136 [inline]
 __inet_csk_reqsk_queue_drop+0x3ce/0x440 net/ipv4/inet_connection_sock.c:1007
 reqsk_timer_handler+0x651/0xdf0 net/ipv4/inet_connection_sock.c:1137
 call_timer_fn+0x192/0x5e0 kernel/time/timer.c:1748
 expire_timers kernel/time/timer.c:1799 [inline]
 __run_timers kernel/time/timer.c:2374 [inline]
 __run_timer_base+0x6a3/0x9f0 kernel/time/timer.c:2386
 run_timer_base kernel/time/timer.c:2395 [inline]
 run_timer_softirq+0x67/0x170 kernel/time/timer.c:2403
 handle_softirqs+0x1de/0x6d0 kernel/softirq.c:622
 __do_softirq kernel/softirq.c:656 [inline]
 run_ktimerd+0x69/0x100 kernel/softirq.c:1151
 smpboot_thread_fn+0x541/0xa50 kernel/smpboot.c:160
 kthread+0x388/0x470 kernel/kthread.c:436
 ret_from_fork+0x514/0xb70 arch/x86/kernel/process.c:158
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
 </TASK>

Fixes: d2d6422f8bd1 ("x86: Allow to enable PREEMPT_RT.")
Reported-by: syzbot+e809069bc15f26300526@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/6a1a7bcf.0a9e871e.332604.000b.GAE@google.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20260601182101.3183993-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 days agonet: Annotate sk->sk_write_space() for UDP SOCKMAP.
Kuniyuki Iwashima [Fri, 29 May 2026 19:39:23 +0000 (19:39 +0000)] 
net: Annotate sk->sk_write_space() for UDP SOCKMAP.

UDP TX skb->destructor() is sock_wfree(), and UDP holds lock_sock()
only for UDP_CORK / MSG_MORE sendmsg().

Otherwise, sk->sk_write_space() may be read locklessly while SOCKMAP
rewrites sk->sk_write_space().

Let's use WRITE_ONCE() and READ_ONCE() for sk->sk_write_space().

Note that the write side is annotated by commit 2ef2b20cf4e0
("net: annotate data-races around sk->sk_{data_ready,write_space}").

Fixes: 7b98cd42b049 ("bpf: sockmap: Add UDP support")
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://patch.msgid.link/20260529193941.3897256-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 days agopcnet32: stop holding device spin lock during napi_complete_done
Oscar Maes [Thu, 28 May 2026 14:03:20 +0000 (16:03 +0200)] 
pcnet32: stop holding device spin lock during napi_complete_done

napi_complete_done may call gro_flush_normal (though not currently, as GRO
is unsupported at the moment), which may result in packet TX. This will
eventually result in calling pcnet32_start_xmit - resulting in a deadlock
while trying to re-acquire the already locked spin lock.

It is safe to split the spinlock block into two, because the hardware
registers are still protected from concurrent access, and the two blocks
perform unrelated operations that don't need to happen atomically.

Fixes: 5b2ec6f2be51 ("pcnet32: use napi_complete_done()")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Oscar Maes <oscmaes92@gmail.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://patch.msgid.link/20260528140320.5556-1-oscmaes92@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 days agoMerge tag 'soc-fixes-7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Linus Torvalds [Tue, 2 Jun 2026 17:54:11 +0000 (10:54 -0700)] 
Merge tag 'soc-fixes-7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull SoC fixes from Arnd Bergmann:
 "Following the previous set of fixes, this addresses another
  significant number of small issues found in firmware drivers (tee,
  optee, qcomtee, qcom ice, exynos acpm) drivers through various tools.

  This is about error handling, resource leaks, concurrency and a
  use-after-free bug.

  The fixes for the Qualcomm ICE driver also introduce interface changes
  in the UFS and MMC drivers using it.

  Outside of firmware drivers, there are a few fixes across the tree:

   - Minor driver code mistakes in the Atmel EBI memory controller, the
     i.MX soc ID driver and socfpga boot logic

   - A defconfig change to avoid a boot time regression on multiple
     qualcomm boards

   - Device tree fixes for qualcomm, at91 and gemini, addressing mostly
     minor configuration mistakes"

* tag 'soc-fixes-7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (28 commits)
  firmware: samsung: acpm: Fix infinite loop on sequence number exhaustion
  firmware: samsung: acpm: Fix missing LKMM barriers in sequence allocator
  firmware: samsung: acpm: Fix false timeouts and Use-After-Free in polling
  ARM: dts: gemini: Fix partition offsets
  ARM: socfpga: Fix OF node refcount leak in SMP setup
  soc: qcom: ice: Fix the error code when 'qcom,ice' property is not found
  arm64: dts: qcom: eliza: Add power-domain and iface clk for ice node
  arm64: dts: qcom: milos: Add power-domain and iface clk for ice node
  tee: qcomtee: add missing va_end in early return qcomtee_object_user_init()
  tee: fix params_from_user() error path in tee_ioctl_supp_recv
  tee: shm: fix shm leak in register_shm_helper()
  tee: fix tee_ioctl_object_invoke_arg padding
  arm64: defconfig: Enable PCI M.2 power sequencing driver
  scsi: ufs: ufs-qcom: Remove NULL check from devm_of_qcom_ice_get()
  mmc: sdhci-msm: Remove NULL check from devm_of_qcom_ice_get()
  soc: qcom: ice: Return proper error codes from devm_of_qcom_ice_get() instead of NULL
  soc: qcom: ice: Return -ENODEV if the ICE platform device is not found
  soc: qcom: ice: Fix race between qcom_ice_probe() and of_qcom_ice_get()
  ARM: dts: microchip: sam9x7: fix GMAC clock configuration
  firmware: samsung: acpm: Fix mailbox channel leak on probe error
  ...

10 days agoRDMA/efa: Validate SQ ring size against max LLQ size
Yonatan Nachum [Tue, 26 May 2026 08:15:36 +0000 (08:15 +0000)] 
RDMA/efa: Validate SQ ring size against max LLQ size

Validate the SQ ring size against the device's max LLQ size. This
ensures that when using 128-byte WQEs, userspace cannot exceed the queue
limits.

On create QP, userspace provides the SQ ring size (depth x WQE size)
which is validated against the max LLQ size.

Fixes: 40909f664d27 ("RDMA/efa: Add EFA verbs implementation")
Link: https://patch.msgid.link/r/20260526081536.1203553-1-ynachum@amazon.com
Reviewed-by: Michael Margolin <mrgolin@amazon.com>
Signed-off-by: Yonatan Nachum <ynachum@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
10 days agocpufreq/amd-pstate: Fix setting EPP in performance mode
Mario Limonciello (AMD) [Sat, 30 May 2026 15:04:34 +0000 (17:04 +0200)] 
cpufreq/amd-pstate: Fix setting EPP in performance mode

EPP 0 is the only supported value in the performance policy.
commit 798c47593cca ("cpufreq/amd-pstate: Add support for platform profile
class") changed this while adding platform profile support to the
dynamic EPP feature, but this actually wasn't necessary since platform
profile writes disable manual EPP writes.

Restore allowing writing EPP of 0 when in performance mode.

Reviewed-by: Marco Scardovi <scardracs@disroot.org>
Tested-by: Marco Scardovi <scardracs@disroot.org>
Reported-by: Stuart Meckle <stuartmeckle@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=221473
Closes: https://gitlab.freedesktop.org/upower/power-profiles-daemon/-/work_items/190
Fixes: 798c47593cca ("cpufreq/amd-pstate: Add support for platform profile class")
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
10 days agoKVM: s390: Remove ptep_zap_softleaf_entry()
Claudio Imbrenda [Tue, 2 Jun 2026 14:23:56 +0000 (16:23 +0200)] 
KVM: s390: Remove ptep_zap_softleaf_entry()

Migration entries do not need to be removed.

The swap subsystem has been (and still is being) heavily reworked. The
current implementation of ptep_zap_softleaf_entry() has been slowly
modified and is now wrong, since it unconditionally calls
swap_put_entries_direct() for both swap and migration entries.

Remove ptep_zap_softleaf_entry() altogether, merge the path for proper
swap entries directly in the only caller, and ignore migration entries.

Fixes: 200197908dc4 ("KVM: s390: Refactor and split some gmap helpers")
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Message-ID: <20260602142356.169458-11-imbrenda@linux.ibm.com>

10 days agoKVM: s390: Fix possible reference leak in fault-in code
Claudio Imbrenda [Tue, 2 Jun 2026 14:23:55 +0000 (16:23 +0200)] 
KVM: s390: Fix possible reference leak in fault-in code

If kvm_s390_new_mmu_cache() fails, kvm_s390_faultin_gfn() returns
without releasing the faulted page.

Fix this by moving the allocation of the memory cache outside of the
loop. There is no reason to check at every iteration.

Opportunistically fix a comment.

Fixes: e907ae530133 ("KVM: s390: Add helper functions for fault handling")
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Message-ID: <20260602142356.169458-10-imbrenda@linux.ibm.com>

10 days agoKVM: s390: Prevent memslots outside the ASCE range
Claudio Imbrenda [Tue, 2 Jun 2026 14:23:54 +0000 (16:23 +0200)] 
KVM: s390: Prevent memslots outside the ASCE range

With KVM_S390_VM_MEM_LIMIT_SIZE, userspace can set the highest address
allowed for the VM. Creating a memslot that lies over the maximum
address does not make sense and is only a potential source of bugs.

Prevent creation of memslots over the maximum address, and prevent the
maximum address from being reduced below the end of existing memslots.

Fixes: e38c884df921 ("KVM: s390: Switch to new gmap")
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Message-ID: <20260602142356.169458-9-imbrenda@linux.ibm.com>

10 days agoMerge tag 'mm-hotfixes-stable-2026-06-01-20-58' of git://git.kernel.org/pub/scm/linux...
Linus Torvalds [Tue, 2 Jun 2026 15:59:35 +0000 (08:59 -0700)] 
Merge tag 'mm-hotfixes-stable-2026-06-01-20-58' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM fixes from Andrew Morton:
 "13 hotfixes. All are for MM. 10 are cc:stable and the remaining 3
  address post-7.1 issues or aren't considered suitable for backporting.

  There's a three-patch series "userfaultfd: verify VMA state across
  UFFDIO_COPY retry" from Mike Rapoport which fixes a few uffd things.
  The rest are singletons - please see the individual changelogs for
  details"

* tag 'mm-hotfixes-stable-2026-06-01-20-58' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  userfaultfd: remove redundant check in vm_uffd_ops()
  userfaultfd: refuse to __mfill_atomic_pte() for unsupported VMAs
  userfaultfd: verify VMA state across UFFDIO_COPY retry
  mm/huge_memory: update file PMD counter before folio_put()
  mm/huge_memory: update file PUD counter before folio_put()
  mm/hugetlb_vmemmap: fix incorrect vmemmap restore in rollback
  mm/damon/ops-common: call folio_test_lru() after folio_get()
  mm/cma: fix reserved page leak on activation failure
  mm/memory-failure: fix hugetlb_lock AA deadlock in get_huge_page_for_hwpoison
  mm/hugetlb: restore reservation on error in hugetlb folio copy paths
  mm/cma_debug: fix invalid accesses for inactive CMA areas
  memcg: use round-robin victim selection in refill_stock
  mm/hugetlb: avoid false positive lockdep assertion

11 days agoKVM: s390: Lock pte when making page secure
Claudio Imbrenda [Tue, 2 Jun 2026 14:23:53 +0000 (16:23 +0200)] 
KVM: s390: Lock pte when making page secure

Make sure _kvm_s390_pv_make_secure() takes the pte lock for the given
address when attempting to make the page secure.

One of the steps in making the page secure is freezing the folio using
folio_ref_freeze(), which temporarily sets the reference count to 0.
Any attempt to get such a folio while frozen will fail and cause a
warning to be printed.

Other users of folio_ref_freeze() make sure that the page is not mapped
while it's being frozen, thus preventing gup functions from being able
to access it. For _kvm_s390_pv_make_secure(), this is not possible,
because the page needs to be mapped in order for the import to succeed.

By taking the pte lock, gup functions will be blocked until the import
operation is done, thus avoiding the race.

In theory this does not completely solve the issue: if a page is mapped
through multiple mappings, locking one pte does not protect from
calling gup on it through the other mapping. In practice this does not
happen and it is a decent stopgap solution until a more correct
solution is available.

Fixes: e38c884df921 ("KVM: s390: Switch to new gmap")
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Message-ID: <20260602142356.169458-8-imbrenda@linux.ibm.com>