git.ipfire.org Git - thirdparty/kernel/linux.git/log

Merge tag 'batadv-net-pullrequest-20260515' of https://git.open-mesh.org/batadv

Simon Wunderlich says:

====================
Here are various batman-adv bugfixes:

- fix tp_meter counter underflow during shutdown, by Luxiao Xu

- fix tp_meter tp_vars reference leak in receiver shutdown,
   by Sven Eckelmann

- fix various translation table integer handling issues,
   by Sven Eckelmann (3 patches)

- fix various translation table counter issues,
   by Sven Eckelmann (3 patches)

- fix fragment reassembly length accounting, by Ruide Cao

- clear current gateway during teardown, by Ruijie Li

- handle forward allocation error in DAT, by Sven Eckelmann

- tp_meter: avoid use of uninitialized sender variables in tp_meter,
   by Sven Eckelmann

- disallow unicast fragment in fragment, by Sven Eckelmann

- directly shut down tp_meter timer on cleanup, by Sven Eckelmann

* tag 'batadv-net-pullrequest-20260515' of https://git.open-mesh.org/batadv:
  batman-adv: tp_meter: directly shut down timer on cleanup
  batman-adv: frag: disallow unicast fragment in fragment
  batman-adv: tp_meter: avoid use of uninit sender vars
  batman-adv: dat: handle forward allocation error
  batman-adv: clear current gateway during teardown
  batman-adv: fix fragment reassembly length accounting
  batman-adv: tt: prevent TVLV entry number overflow
  batman-adv: tt: avoid empty VLAN responses
  batman-adv: tt: fix TOCTOU race for reported vlans
  batman-adv: tt: fix negative last_changeset_len
  batman-adv: tt: fix negative tt_buff_len
  batman-adv: tt: reject oversized local TVLV buffers
  batman-adv: tp_meter: fix tp_vars reference leak in receiver shutdown
  batman-adv: fix tp_meter counter underflow during shutdown
====================

Link: https://patch.msgid.link/20260515095540.325586-1-sw@simonwunderlich.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'for-net-2026-05-14' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth

Luiz Augusto von Dentz says:

====================
bluetooth pull request for net:

- af_bluetooth: serialize accept_q access
- L2CAP: ecred_reconfigure: send packed pdu, not stack pointer
- btmtk: accept too short WMT FUNC_CTRL events
- hci_qca: Convert timeout from jiffies to ms

* tag 'for-net-2026-05-14' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
  Bluetooth: hci_qca: Convert timeout from jiffies to ms
  Bluetooth: L2CAP: ecred_reconfigure: send packed pdu, not stack pointer
  Bluetooth: btmtk: accept too short WMT FUNC_CTRL events
  Bluetooth: serialize accept_q access
====================

Link: https://patch.msgid.link/20260514172340.1515042-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

openvswitch: vport: fix race between linking and the device notifier

Sashiko reports that it is technically possible that we got the device
reference, but by the time we're linking it to the OVS datapath, it
may be already in the process of being deleted.  In this case if the
notifier wins the race for RTNL, it will see that the device is not
yet in the OVS datapath (ovs_netdev_get_vport() will fail in the
dp_device_event()) and will do nothing.  Then the ovs_netdev_link()
will take the RTNL and link the unregistering device to OVS datapath.

Eventually, netdev_wait_allrefs_any() will re-broadcast the event and
the device will be properly detached, but it will take at least a
second before that happens, so it's not something we should rely on.

Let's avoid linking the non-registered device in the first place.

Note: As per documentation, RTNL doesn't protect the reg_state, but
it actually does for all the state transitions we care about here,
so it should not be necessary to use READ_ONCE or taking the instance
lock.  We can still do that, but we have a few more places even in
this file where the reg_state is accessed without those while under
RTNL, and many more places like this across the kernel code, so it
might make more sense to change all of them in a more centralized
fashion in the future, if necessary.

Fixes: ccb1352e76cf ("net: Add Open vSwitch kernel components.")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: Aaron Conole <aconole@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Link: https://patch.msgid.link/20260514184702.2461435-1-i.maximets@ovn.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: qualcomm: rmnet: fix endpoint use-after-free in rmnet_dellink()

rmnet_dellink() removes the endpoint from the hash table with
hlist_del_init_rcu() and then immediately frees it with kfree(). However,
RCU readers on the receive path (rmnet_rx_handler ->
__rmnet_map_ingress_handler) may still hold a reference to the endpoint and
dereference ep->egress_dev after the memory has been freed. The endpoint is
a kmalloc-32 object, and the stale read at offset 8 corresponds to the
egress_dev pointer.

  BUG: unable to handle page fault for address: ffffffffde942eef
  Oops: 0002 [#1] SMP NOPTI
  CPU: 1 UID: 0 PID: 137 Comm: poc_write Not tainted 7.0.0+ #4 PREEMPTLAZY
  RIP: 0010:rmnet_vnd_rx_fixup (rmnet_vnd.c:27)
  Call Trace:
   <TASK>
   __rmnet_map_ingress_handler (rmnet_handlers.c:48 rmnet_handlers.c:101)
   rmnet_rx_handler (rmnet_handlers.c:129 rmnet_handlers.c:235)
   __netif_receive_skb_core.constprop.0 (net/core/dev.c:6096)
   __netif_receive_skb_one_core (net/core/dev.c:6208)
   netif_receive_skb (net/core/dev.c:6467)
   tun_get_user (drivers/net/tun.c:1955)
   tun_chr_write_iter (drivers/net/tun.c:2003)
   vfs_write (fs/read_write.c:688)
   ksys_write (fs/read_write.c:740)
   </TASK>

Add an rcu_head field to struct rmnet_endpoint and replace kfree() with
kfree_rcu() so the endpoint memory remains valid through the RCU grace
period. Also remove the rmnet_vnd_dellink() call and inline only the
nr_rmnet_devs decrement, since rmnet_vnd_dellink() would set
ep->egress_dev to NULL during the grace period, creating a data race
with lockless readers.

Fixes: ceed73a2cf4a ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation")
Reported-by: Xiang Mei <xmei5@asu.edu>
Signed-off-by: Weiming Shi <bestswngs@gmail.com>
Link: https://patch.msgid.link/20260514122511.3083479-2-bestswngs@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: appletalk: fix NULL pointer dereference in aarp_send_ddp()

aarp_send_ddp() calls atalk_find_dev_addr(dev) in the LocalTalk fast
path without checking for NULL. When the device has no AppleTalk
interface configured (dev->atalk_ptr == NULL), this leads to a NULL
pointer dereference at the at->s_net access.

KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
RIP: 0010:aarp_send_ddp (net/appletalk/aarp.c:552 (discriminator 2))
Call Trace:
  <TASK>
  atalk_sendmsg (net/appletalk/ddp.c:1715)
  __sys_sendto (net/socket.c:2265 (discriminator 1))
  __x64_sys_sendto (net/socket.c:2272)
  do_syscall_64 (arch/x86/entry/syscall_64.c:94)
  entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:121)

Add a NULL check consistent with the other callers of
atalk_find_dev_addr().

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: Xiang Mei <xmei5@asu.edu>
Signed-off-by: Weiming Shi <bestswngs@gmail.com>
Link: https://patch.msgid.link/20260514123806.3085961-3-bestswngs@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

netdevsim: psp: reset spi on key rotation and check for exhaustion on alloc

The PSP spec states that the lower 31b of the SPI need to be
non-zero. Though not in the spec, I think it is reasonable to reset
the lower 31b of the spi space after a key rotation, and to also
decline to generate session keys when the lower 31b saturate.

Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260515-spi-handle-v1-1-debf8cb467cb@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'net-mlx5-frag-buffer-improvements'

Tariq Toukan says:

====================
net/mlx5: frag buffer improvements

This series adds observability for mlx5 fragment buffer DMA pools and
improves the default NUMA placement policy for fragment buffer
allocations.

Patch 1 adds a debugfs interface exposing per-node DMA pool usage
statistics for mlx5_frag_buf allocations, helping with debugging and
visibility into pool utilization.

Patch 2 improves locality of default fragment buffer allocations by
using numa_mem_id() when no explicit NUMA node is requested, allowing
allocations to prefer the current CPU's local memory node.

Together, these changes improve both introspection and memory locality
behavior of mlx5 fragment buffer allocations.
====================

Link: https://patch.msgid.link/20260514104925.337570-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5: add debugfs stats for frag buf dma pools

Add a debugfs file exposing per-node DMA pool usage for mlx5_frag_buf
allocations.

  # cat /sys/kernel/debug/mlx5/<dev>/frag_buf_dma_pools
  node  block_size  used_blocks  allocated_blocks
     0        4096            0                 0
     0        8192            0                 0
     0       16384            0                 0
     0       32768            0                 0
     0       65536            0                 0
     1        4096            0                 0
     1        8192            0                 0
     1       16384            0                 0
     1       32768            0                 0
     1       65536            0                 0

Signed-off-by: Nimrod Oren <noren@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260514104925.337570-3-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5: use numa_mem_id() for default frag buf allocations

Use the current CPU's local memory node when callers do not request a
specific NUMA node for mlx5_frag_buf allocations.

Signed-off-by: Nimrod Oren <noren@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260514104925.337570-2-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5e: xsk: Fix unlocked writing to ICOSQ

During napi poll, when the affinity changes and there's still XSK work
to be done, we trigger an ICOSQ interrupt on the new CPU. However, this
triggering on the ICOSQ is done unprotected.

There are 2 such races:

A) mlx5e_trigger_irq() is called while mlx5e_xsk_alloc_rx_mpwqe() is
running from a different CPU due to affinity change. This can happen
because IRQ triggering is done after napi_complete_done(). At this point
the NAPI can be scheduled on a different CPU. Like this:

  CPU A (old affinity, NAPI tail)    CPU B (new affinity, fresh NAPI)
  -------------------------------    --------------------------------
  napi_complete_done()  clears SCHED
  mlx5e_cq_arm(...)
                                     napi_schedule_prep() sets SCHED
                                     mlx5e_napi_poll()
                                       mlx5e_xsk_alloc_rx_mpwqe()
                                         mlx5e_icosq_sync_lock() // noop
                                         memcpy 640 B UMR body
                                         advance sq->pc by 10
  mlx5e_trigger_irq(&c->icosq)
    wqe_info[pi] = {NOP, 1}
    mlx5e_post_nop() advances sq->pc

B) mlx5e_trigger_irq() is called on the ICOSQ when
mlx5e_trigger_napi_icosq() is running.

The obvious fix would be to lock the ICOSQ. But ICOSQ has an optimized
locking scheme that doesn't work for this scenario. Kick the async ICOSQ
instead which is always locked.

This issue was noticed in the wild with the following splat:

  netdevice: ge-0-0-1: Bad OP in ICOSQ CQE: 0xd
  WARNING: drivers/net/ethernet/mellanox/mlx5/core/en_rx.c:826 [...]
  [...]
  Call Trace:
   <IRQ>
   mlx5e_napi_poll+0x11d/0x7f0 [mlx5_core]
   __napi_poll+0x30/0x200
   ? skb_defer_free_flush+0x9c/0xc0
   net_rx_action+0x2fe/0x3f0
   handle_softirqs+0xd8/0x340
   __irq_exit_rcu+0xbc/0xe0
   common_interrupt+0x85/0xa0
   </IRQ>
   <TASK>
   asm_common_interrupt+0x26/0x40
  [...]
  ---[ end trace 0000000000000000 ]---
  mlx5_core 0000:08:00.0 ge-0-0-1: Error cqe on cqn 0x548, ci 0x2022, qn 0x8f4,
  opcode 0xd, syndrome 0x2, vendor syndrome 0x68
  00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  00000030: 00 00 00 00 01 00 68 02 01 00 08 f4 de 14 59 d2
  WQE DUMP: WQ size 16384 WQ cur size 0, WQE index 0x1e14, len: 64
  00000000: 00 00 00 01 d9 ed 80 02 00 00 00 01 d9 ed 90 02
  00000010: 00 00 00 01 d9 ed a0 02 00 00 00 01 d9 ed b0 02
  00000020: 00 00 00 01 d9 ed c0 02 00 00 00 01 d9 ed d0 02
  00000030: 00 00 00 01 d9 ed e0 02 00 00 00 01 d9 ed f0 02
  mlx5_core 0000:08:00.0 ge-0-0-1: Error cqe on cqn 0x548, ci 0x2023, qn 0x8f4,
  opcode 0xd, syndrome 0x5, vendor syndrome 0xf9
  00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  00000030: 00 00 00 00 01 00 f9 05 01 00 08 f4 de 15 cf d2

Fixes: db05815b36cb ("net/mlx5e: Add XSK zero-copy support")
Reported-by: Paul Saab <ps@mu.org>
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260513064613.334602-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

accel/amdxdna: Add expandable device heap support

Introduce an expandable device heap to avoid allocating a large heap
upfront. Start with a smaller initial heap and grow it on demand.
Return -EAGAIN when BO allocation fails due to insufficient heap space,
allowing userspace to trigger heap expansion via a heap BO creation
IOCTL and retry the allocation.

Manage heap chunks using an xarray. On expansion, register new chunks
with the firmware via MSG_OP_ADD_HOST_BUFFER.

Since heap shrinking is not supported by the firmware, release all heap
chunks on device close.

Co-developed-by: Wendy Liang <wendy.liang@amd.com>
Signed-off-by: Wendy Liang <wendy.liang@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20260515161922.744647-1-lizhi.hou@amd.com

drm/v3d: Release indirect CSD GEM reference on CPU job free

v3d_get_cpu_indirect_csd_params() takes a reference to the indirect BO via
drm_gem_object_lookup() and stashes it in cpu_job->indirect_csd.indirect,
but nothing on the CPU job teardown path ever drops that reference.

Drop the extra reference in v3d_cpu_job_free(). The NULL check covers ioctl
errors before the lookup ran and CPU job types other than
V3D_CPU_JOB_TYPE_INDIRECT_CSD, which leave the field zero-initialised.

Cc: stable@vger.kernel.org
Fixes: 18b8413b25b7 ("drm/v3d: Create a CPU job extension for a indirect CSD job")
Assisted-by: Claude:claude-opus-4.7
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patch.msgid.link/20260515-v3d-cpu-job-leaks-v1-2-7f147cbbf935@igalia.com
Signed-off-by: Maíra Canal <mcanal@igalia.com>

drm/v3d: Fix use-after-free of CPU job query arrays on error path

The CPU job ioctl's fail label calls kvfree() on cpu_job's timestamp and
performance query arrays after v3d_job_cleanup(), which drops the job's
last reference and frees cpu_job. Reading cpu_job at that point is a
use-after-free. Also, on the early v3d_job_init() failure path, it is a
NULL dereference, since v3d_job_deallocate() zeroes the local pointer.

In the success path, the arrays are released from the scheduler's
.free_job callback, but on the error path, they are freed manually, as
the job was never pushed to the scheduler. While the success path deals
with this correctly, the fail path doesn't.

On top of that, the manual kvfree() calls only free the array storage;
they don't drm_syncobj_put() the per-query syncobjs that
v3d_timestamp_query_info_free() and v3d_performance_query_info_free()
release on the success path. So the same fail path that triggers the
use-after-free also leaks one syncobj reference per query.

Unify the CPU job teardown into the CPU job's kref destructor, mirroring
v3d_render_job_free(). The scheduler's .free_job slot reverts to the
generic v3d_sched_job_free() and the fail label drops the manual
kvfree() calls, leaving a single teardown path that is reached from both
the scheduler and the ioctl error path. That removes the use-after-free,
the NULL dereference, and the syncobj leak by construction.

Cc: stable@vger.kernel.org
Fixes: 9ba0ff3e083f ("drm/v3d: Create a CPU job extension for the timestamp query job")
Assisted-by: Claude:claude-opus-4.7
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patch.msgid.link/20260515-v3d-cpu-job-leaks-v1-1-7f147cbbf935@igalia.com
Signed-off-by: Maíra Canal <mcanal@igalia.com>

drm/amd/display: Fix eDP receiver ready status check in T7 sequence

[Why]
Some eDP panels return sinkstatus as 0x5, causing the original sinkstatus == 1
check to never match and resulting in unnecessary polling delay. The
equality check is too restrictive and doesn't properly validate the
specific status bit that indicates receiver readiness.

[How]
Replace direct value comparison with proper bitmask check using
DP_RECEIVE_PORT_0_STATUS constant.

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Sung-huai Wang <Danny.Wang@amd.com>
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Revert "drm/amd/display: dmub_cmd.h: add missing kernel-doc for enums"

[Why & How]
This reverts commit f71d0b68ec58b781f4f44ea642846bedce075e85.

1. Auto-generated Header: The file 'dmub_cmd.h' is an auto-generated header
   managed in an external repository (dmu_stg). Manual changes made directly in
   this repository will be overwritten and lost during the next automated weekly
   synchronization.

2. Tooling Compatibility: This header is governed by internal AMD firmware
   standards which require Doxygen formatting for cross-team documentation.
   Moving to kernel-doc syntax may break internal documentation pipelines.

3. Suppressing Warnings: Current 'make htmldocs' and 'make W=1' builds
   do not actively scan 'dmub_cmd.h' for kernel-doc compliance, thus no warnings
   are triggered during standard compilation. To address warnings generated when
   manually running './scripts/kernel-doc', we have added a notice at the file
   header indicating that this is an auto-generated file that does not strictly
   follow kernel-doc formatting. This ensures that any future linting tools or
   manual checks recognize the formatting as intentional.

Acked-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: James Lin <pinglei.lin@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Add some missing code for dcn42

[why & how]
Some DCN4.2 related code is missing from upstream

Fixes: e56e3cff2a1b ("drm/amd/display: Sync dcn42 with DC 3.2.373")
Acked-by: ChiaHsuan Chung <ChiaHsuan.Chung@amd.com>
Reviewed-by: Roman Li <Roman.Li@amd.com>
Signed-off-by: James Lin <pinglei.lin@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd: Reduce code duplication in runtime PM

[Why]
amdgpu_pmops_runtime_suspend() runs almost the same code that
amdgpu_pmops_runtime_idle() runs. That is there is pointless code
duplication.

[How]
Move amdgpu_pmops_runtime_idle() up, extract common code and then
call from both functions. No intended functional changes.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd: Check bounds for allocate_sdma_queue restore_sdma_id

allocate_sdma_queue has an option where the sdma queue id can be
specified (used by CRIU). We weren't bounds-checking that
value.

Confirm it's less than the maximum number of queues.

Signed-off-by: David Francis <David.Francis@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: use atomic operation to achieve lockless serialization

In amdgpu_seq64_alloc there is a possibility that two difference cores
from two separate NODES can try to and could get the same free slot.
So this fixes that race here using atomic test_and_set clear operations.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/vce3: Fix VCE 3 firmware size and offsets

The VCPU BO contains the actual FW at an offset, but
it was not calculated into the VCPU BO size.
Subtract this from the FW size to make sure there is
no out of bounds access.

This may fix VM faults when using VCE 3.

Cc: John Olender <john.olender@gmail.com>
Fixes: e98226221467 ("drm/amdgpu: recalculate VCE firmware BO size")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd: Check bounds on allocate_doorbell

allocated_doorbell has an option to set the doorbell id
to a specific value (used by CRIU). This value was not
bounds checked.

Check to confirm it's less than KFD_MAX_NUM_OF_QUEUES_PER_PROCESS.

Signed-off-by: David Francis <David.Francis@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/vce2: Fix VCE 2 firmware size and offsets

The VCPU BO contains the actual FW at an offset, but
it was not calculated into the VCPU BO size.
Subtract this from the FW size to make sure there is
no out of bounds access.

Additionally, increase the VCE_V2_0_DATA_SIZE to
have extra space after the VCE handles.

Also increase the data size used for each VCE handle.
The FW needs 23744 bytes, use 24K to be safe.

This fixes VM faults when using VCE 2.

Cc: John Olender <john.olender@gmail.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/4802
Fixes: e98226221467 ("drm/amdgpu: recalculate VCE firmware BO size")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/vce1: Stop using amdgpu_vce_resume

The VCE1 firmware works slightly differently and is already
loaded by vce_v1_0_load_fw(). It doesn't actually need to
call amdgpu_vce_resume().

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/vce1: Fix VCE 1 firmware size and offsets

The VCPU BO contains the actual FW at an offset, but
it was not calculated into the VCPU BO size.
Subtract this from the FW size to make sure there is
no out of bounds access.

Make sure the stack and data offsets are aligned to
the 32K TLB size.

Check that the FW microcode actually fits in the
space that is reserved for it.

Fixes: d4a640d4b9f3 ("drm/amdgpu/vce1: Implement VCE1 IP block (v2)")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/vce1: Don't repeat GTT MGR node allocation

Only allocate entries from the GTT manager when the
VCE GTT node is not allocated yet. This prevents the
possibility of allocating them multiple times, which
causes issues during GPU reset and suspend/resume.

Fixes: 71aec08f80e7 ("amdgpu/vce: use amdgpu_gtt_mgr_alloc_entries")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/vce1: Check if VRAM address is lower than GART.

Previously, I had assumed this was not possible
so it was OK to not handle it, but now we got a report
from a user who has a board that is configured this way.

When the VCPU BO is already located in a low 32-bit address
in VRAM (eg. when VRAM is mapped to the low address space),
don't do the workaround.

Fixes: 71aec08f80e7 ("amdgpu/vce: use amdgpu_gtt_mgr_alloc_entries")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/vce1: Remove superfluous address check

The same thing is already checked a few lines above.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/vce1: Check that the GPU address is < 128 MiB

When ensuring the low 32-bit address, make sure it is
less than 128 MiB, otherwise the VCE seems to fail to initialize.
This seems to be an undocumented limitation of the firmware
validation mechanism. Note that in case of VCE1 the BAR
address is zero and we can't change it also due to the
firmware validator.

When programming the mmVCE_VCPU_CACHE_OFFSETn registers,
don't AND them with a mask. This is incorrect because
the register mask is actually 0x0fffffff and useless because
we already ensure the addresses are below the limit.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Align amdgpu_gtt_mgr entries to TLB size on Tahiti (v2)

The TLB is organized in groups of 8 entries, each one is 4K.
On Tahiti, the HW requires these GART entries to be 32K-aligned.

This fixes a VCE 1 firmware validation failure that can happen
after suspend/resume since we use amdgpu_gtt_mgr for VCE 1.

v2:
- Change variable declaration order
- Add comment about "V bit HW bug"

Fixes: 698fa62f56aa ("drm/amdgpu: Add helper to alloc GART entries")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd: Fix OOB memory exposure in get_wave_state()

The get_wave_state() function for v9 trusts cp_hqd_cntl_stack_size and
cp_hqd_cntl_stack_offset values read directly from the MQD, which are
written by GPU microcode and fully attacker-controlled on the
CRIU-restore path (via AMDKFD_IOC_RESTORE_PROCESS with H3).

this leads to an unbounded copy_to_user() that can leak adjacent
GTT/kernel memory. If offset > size, integer underflow produces a ~4 GiB
read length, if size is set to 1 MiB against a 4 KiB allocation, we leak
1 MiB of adjacent kernel memory (other queues' MQDs, ring buffers, KASLR
pointers).

Fix by clamping both cp_hqd_cntl_stack_size to the actual allocated
buffer size (q->ctl_stack_size) and cp_hqd_cntl_stack_offset to the
clamped size before performing arithmetic and copy_to_user().

This ensures we never read beyond the allocated kernel BO regardless of
attacker-supplied MQD field values.

Signed-off-by: Sunday Clement <Sunday.Clement@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: fix memleak of dpm_policies on smu v15

In smu_v15_0_fini_smc_tables, dpm_policies was not freed or NULLed, causing a memory leak.
Add kfree() and NULL assignment to properly release memory and avoid dangling pointers.

Fixes: 2beedc3a92b7 ("drm/amd/pm: Add initial support for smu v15_0_8");
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd: Enable SDMA queue reset on gfx v12.1

After suspend/resume sdma_gang is supported on MES 12.1, SDMA queue reset
is supported too.

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Michael Chen<michael.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Support MES suspend_all_sdma_gangs

suspend_all_sdma_gangs is supported in new MES firmware for gfx 12.1

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michael Chen<michael.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: fix OOB risk parsing virt RAS batch trace replies on the VF

amdgpu_virt_ras_get_batch_records() indexed batchs[] and records[]
from ras_cmd_batch_trace_record_rsp copied out of shared memory without
fully bounding the cache window or per-batch offset/trace_num. A
tampered or corrupted buffer could set real_batch_num past the array,
make a naive start_batch_id + real_batch_num comparison wrap in
uint64_t, or point offset+trace_num past records[].

Add amdgpu_virt_ras_check_batch_cached() for a subtraction-based window
with a real_batch_num cap, re-run it after GET_BATCH_TRACE_RECORD, and
use an explicit batch index into batchs[]. Consolidate batch_id,
trace_num, and offset+trace_num checks; on any failure memset the cache
and return -EIO so the next call refetches.

Signed-off-by: Chenglei Xie <Chenglei.Xie@amd.com>
Reviewed-by: YiPeng Chai <YiPeng.Chai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Add guest driver CUID support

v3:
improve the coding style.

v2:
use debugfs_create_x64 and debugfs_create_x8 to create node.

v1:
1. Add guest driver CUID support
2. Do not expose vf index(variable "fcn_idx") to customers,
replace the fcn_idx with pad.
Only expose the unitid to customers.

background:
Change fcn_idx to pad, VF index won't expose to guest vm.
Introduce a new unitid field as the VF identifier to replace the VF index:
1).unitid is assigned by the host driver
2).It is delivered to the guest via the pf2vf message
3).The application or umd can retrieve united from the sysfs node

Signed-off-by: chong li <chongli2@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Fix discovery offset check under VF

Discovery table may be kept at offset 0 by host driver. Remove the
validation check.

Fixes: 01bdc7e219c4 ("drm/amdgpu: New interface to get IP discovery binary v3")
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Ellen Pan <yunru.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: remove va cursors for all mappings

va_cursor struct needs to be cleaned even if the mapping
has been removed already.

Also simplify it by make it a void function as return value
check isn't needed as its called during tear down.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: reject non-user addresses early in GEM_USERPTR ioctl

amdgpu_gem_userptr_ioctl() currently accepts any value of args->addr
and only discovers an out-of-range pointer much later, inside
amdgpu_gem_object_create() and the HMM mirror registration path.
Userspace can drive that path with kernel-side virtual addresses;
the get_user_pages() layer rejects them, but only after the driver
has already allocated a GEM object and started wiring up notifier
state that then has to be torn down on failure.

Add an access_ok() guard at the top of the ioctl, right after the
existing page-alignment check and before flag validation, so any
address that does not lie within the calling task's user address
range is rejected with -EFAULT before any allocation occurs. No
legitimate ROCm/HSA userspace passes kernel-mode pointers through
this interface, so this is defense-in-depth rather than a behaviour
change for valid callers; -EFAULT matches the convention already
used by other uaccess-style rejections in the kernel.

Also add an explicit #include <linux/uaccess.h>; access_ok() is
otherwise only available transitively through other headers in
this translation unit.

Signed-off-by: Amir Shetaia <Amir.Shetaia@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

watchdog: ziirave_wdt: Use named initializers for struct i2c_device_id

While being less compact, using named initializers allows to more easily
see which members of the structs are assigned which value without having
to lookup the declaration of the struct. And it's also more robust
against changes to the struct definition.

This patch doesn't modify the compiled arrays, only their representation
in source form benefits. The former was confirmed with x86 and arm64
builds.

Signed-off-by: Uwe Kleine-König (The Capable Hub) <u.kleine-koenig@baylibre.com>
Link: https://lore.kernel.org/r/20260518171901.904094-2-u.kleine-koenig@baylibre.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>

drm/amdgpu/vpe: Force collaborate sync after TRAP

VPE1 could possibly hang and fail to power off at the end of commands in
collaboration mode. This workaround adds a COLLAB_SYNC after TRAP to
force instances synchronized to avoid VPE1 fail to power off.

Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Alan liu <haoping.liu@amd.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/5171
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/userq: update the vm task info during signal ioctl

Pagefaults does not have process information correctly populated
as vm->task is not set during vm_init but should be updated while
real submission. So setting that up during signal_ioctl to get
the correct submission process details.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/userq: cancel reset work while tear down in progress

While tear down of a userq_mgr is happening when all the queues
are free we should cancel any reset work if pending before exiting.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Remove UML build exclusion from Kconfig

The depends on !UML was added in commit dffe68131707 ("amdgpu: Avoid
building on UML") to work around build failures with allyesconfig on
UML. The original errors were:

- smu7_hwmgr.c: incompatible pointer type 'struct cpuinfo_um *' vs
'struct cpuinfo_x86 *' in intel_core_rkl_chk()
- kfd_topology.c: 'struct cpuinfo_um' has no member named 'apicid'

Both issues have since been resolved independently:
- intel_core_rkl_chk() has been removed entirely.
- kfd_topology.c now uses a proper #ifdef CONFIG_X86_64 guard.
- All other cpuinfo_x86/cpu_data() references in the driver are
guarded by #if IS_ENABLED(CONFIG_X86) or #ifdef CONFIG_X86_64.

Removing this exclusion allows CONFIG_DRM_AMDGPU to be selected on UML,
which in turn enables running KUnit tests (such as amdgpu_dm_crc_test)
under UML without needing a full hardware-capable kernel build.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Assisted-by: Claude:claude-opus-4.6
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: rework userq reset work handling

It is illegal to schedule reset work from another reset work!

Fix this by scheduling the userq reset work directly on the work queue
of the reset domain.

Not fully tested, I leave that to the IGT test cases.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/userq: pin mqd and fw object bo to avoid eviction

mqd and fw objects are queue core objects which should remain
valid and never be unmapped and evicted for user queues to work
properly.

During eviction if these buffers are evicted the hw continue to
use the invalid addresses and caused page faults and system hung.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/userq: use drm_exec in amdgpu_userq_fence_read_wptr

To access the bo from vm mapping first lock the root bo and
then the object bo of the mapping to make sure both locks
are taken safely.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/xe/guc: Use xe_device_is_l2_flush_optimized()

We encapsulate the logic to check if the platform has L2 flush
optimization feature in xe_device_is_l2_flush_optimized(), but
guc_ctl_feature_flags() is using an open-coded version of that same type
of check. Fix that by replacing the open-coded check with
xe_device_is_l2_flush_optimized().

Reviewed-by: Himanshu Girotra <himanshu.girotra@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Link: https://patch.msgid.link/20260513-guc-l2-flush-opt-use-xe_device_is_l2_flush_optimized-v1-1-36fa866d6ed8@intel.com
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>

HID: core: Fix size_t specifier in hid_report_raw_event()

When building for 32-bit platforms, for which 'size_t' is
'unsigned int', there are warnings around using the incorrect format
specifier to print bsize in hid_report_raw_event():

  drivers/hid/hid-core.c:2054:29: error: format specifies type 'long' but the argument has type 'size_t' (aka 'unsigned int') [-Werror,-Wformat]
   2053 |                 hid_warn_ratelimited(hid, "Event data for report %d is incorrect (%d vs %ld)\n",
        |                                                                                         ~~~
        |                                                                                         %zu
   2054 |                                      report->id, csize, bsize);
        |                                                         ^~~~~
  drivers/hid/hid-core.c:2076:29: error: format specifies type 'long' but the argument has type 'size_t' (aka 'unsigned int') [-Werror,-Wformat]
   2075 |                 hid_warn_ratelimited(hid, "Event data for report %d was too short (%d vs %ld)\n",
        |                                                                                          ~~~
        |                                                                                          %zu
   2076 |                                      report->id, rsize, bsize);
        |                                                         ^~~~~

Use the proper 'size_t' format specifier, '%zu', to clear up the
warnings.

Cc: stable@vger.kernel.org
Fixes: 2c85c61d1332 ("HID: pass the buffer size to hid_report_raw_event")
Reported-by: Miguel Ojeda <ojeda@kernel.org>
Closes: https://lore.kernel.org/20260516020430.110135-1-ojeda@kernel.org/
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'perf-upstream'

sched/cache: Fix stale preferred_llc for a new task

On fork without CLONE_VM, the child gets a new mm,
the parent's preferred_llc value is stale for the
child.

Fix this by resetting the task's preferred_llc to -1.

This bug was reported by sashiko.

Fixes: 47d8696b95f7 ("sched/cache: Assign preferred LLC ID to processes")
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Co-developed-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/0ec7309d0e24ede97656754d1505b7490403d966.1778703694.git.tim.c.chen@linux.intel.com

sched/cache: Fix has_multi_llcs iff at least one partition has multiple LLCs

sched_cache_present is a global static key, but build_sched_domains()
is called per partition from the "Build new domains" loop in
partition_sched_domains_locked(). Each call unconditionally sets the
key based solely on the has_multi_llcs local variable for that partition.

The call to the last partition set the value even when there
are previous partitions with multiple LLCs.

If partition A (multi-LLC) is built first, the key is enabled. Then
when partition B (single-LLC) is built, the key is disabled. The
multi-LLC partition A is still active but the key is now off.

Fix it by doing a similar thing as sched_energy_present: check the
multi-LLCs during the iteration over all the partitions rather than
checking it on a single partition.

This bug was reported by sashiko.

Fixes: d59f4fd1d303 ("sched/cache: Enable cache aware scheduling for multi LLCs NUMA node")
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Co-developed-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/c541af2547d54509fbfd3b3a1e8072e2e5c7ff68.1778703694.git.tim.c.chen@linux.intel.com

sched/cache: Fix cache aware scheduling enabling for multi LLCs system

If there are multiple LLCs in the system, cache aware scheduling
should be enabled. However, there is a corner case where, if there
is a single NUMA node and a single LLC per node, cache aware
scheduling will be turned on in the current implementation -
because at this moment, the parent domain has not yet been
degenerated, and it is possible that the current domain has the
same cpu span as its parent. There is no need to turn cache aware
scheduling on in this scenario.

Fix it by iterating the parent domains to find a domain that is
a superset of the current sd_llc, so that later, after the duplicated
parent domains have been degenerated, cache aware scheduling will
take effect.

For example, the expected behavior would be:
2 sockets, 1 LLC per socket: MC span=0-3, PKG span=0-7, has_multi_llcs=true
1 socket, 2 LLCs per socket: MC span=0-3, PKG span=0-7, has_multi_llcs=true
2 sockets, 2 LLCs per socket: MC span=0-3, PKG span=0-7, has_multi_llcs=true
1 socket, 1 LLC per socket: MC span=0-3, PKG span=0-3, has_multi_llcs=false

This bug was reported by sashiko.

Fixes: d59f4fd1d303 ("sched/cache: Enable cache aware scheduling for multi LLCs NUMA node")
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Co-developed-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/6328a8a7f40925cec2a712d81ee58128a4c4444a.1778703694.git.tim.c.chen@linux.intel.com

sched/cache: Fix race condition during sched domain rebuild

sched_cache_active_set_unlocked() checks hardware support without
locks:
static void sched_cache_active_set(bool locked)
{
        /* hardware does not support */
        if (!static_branch_likely(&sched_cache_present)) {
                _sched_cache_active_set(false, locked);
                return;
        }
    ...
If build_sched_domains() runs concurrently during CPU hotplug,
it can disable sched_cache_present under sched_domains_mutex
and the CPU hotplug lock. If a debugfs write thread evaluates
sched_cache_present as true right before that, and then blocks
or gets preempted, it might proceed to enable sched_cache_active
after the hardware support has been marked as absent. Make it
safer by acquiring cpus_read_lock() and sched_domains_mutex_lock()
when the user changes sched_cache_active via debugfs.

This bug was reported by sashiko.

Fixes: 067a31358143 ("sched/cache: Allow the user space to turn on and off cache aware scheduling")
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Co-developed-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/9afddf439687f04bb56b46625bd9f153eb8abad5.1778703694.git.tim.c.chen@linux.intel.com

sched/cache: Fix checking active load balance by only considering the CFS task

The currently running task cur may not be a CFS task, such as
an RT or Deadline task. For non-CFS tasks, the task_util(cur)
utilization average is not maintained, so this might pass a
stale or meaningless value to can_migrate_llc().

Check if the task is CFS before getting its task_util().

This bug was reported by sashiko.

Fixes: 714059f79ff0 ("sched/cache: Handle moving single tasks to/from their preferred LLC")
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Co-developed-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/f9161133cf040d286dca11344a112c5ef2a5253d.1778703694.git.tim.c.chen@linux.intel.com

sched/cache: Fix unpaired account_llc_enqueue/dequeue

There is a race condition that, after a task is enqueued
on a runqueue, task_llc(p) may change due to CPU hotplug,
because the llc_id is dynamically allocated and adjusted
at runtime.
Therefore, checking task_llc(p) to determine whether the
task is being dequeued from its preferred LLC is unreliable
and can cause inconsistent values.

To fix this problem, record whether p is enqueued on its
preferred LLC, in order to pair with account_llc_dequeue()
to maintain a consistent nr_pref_llc_running per runqueue.

This bug was reported by sashiko, and the solution was once
suggested by Prateek.

Fixes: 46afe3af7ead ("sched/cache: Track LLC-preferred tasks per runqueue")
Suggested-by: K Prateek Nayak <kprateek.nayak@amd.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Co-developed-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/0c8c6a1571d66792a4d2ff0103ba3cc13e059046.1778703694.git.tim.c.chen@linux.intel.com

sched/cache: Annotate lockless accesses to mm->sc_stat.cpu

mm->sc_stat.cpu is written by task_cache_work() and could be read
locklessly by several functions on other CPUs. Use READ_ONCE and
WRITE_ONCE on mm->sc_stat.cpu access and write to prevent inconsistent
values from compiler optimizations when there are multiple accesses.

For example in get_pref_llc(), if the writer updated the field between
two compiler-generated loads, the validation (e.g., cpu != -1) and
subsequent use (e.g., llc_id(cpu)) could operate on different values,
allowing a negative CPU ID to be used as an index.

Leave plain write in mm_init_sched(), where the mm is not
yet visible to other CPUs.

This bug was reported by sashiko.

Fixes: 47d8696b95f7 ("sched/cache: Assign preferred LLC ID to processes")
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Co-developed-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/63ea494f12efcf265d7134400a06cd75d7f2c310.1778703694.git.tim.c.chen@linux.intel.com

sched/cache: Fix potential NULL mm pointer access

A concurrent task exit might cause a NULL pointer dereference
in account_mm_sched(). Use the locally cached mm pointer instead,
since the active_mm reference guarantees the structure remains
allocated. Meanwhile, skip the kernel thread because it has
nothing to do with cache aware scheduling.

This bug was reported by sashiko and Vern.

Fixes: df0d98475954 ("sched/cache: Introduce infrastructure for cache-aware load balancing")
Reported-by: Vern Hao <haoxing990@gmail.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Co-developed-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/all/09cf7ee3-6e27-4505-9692-4b4a4707c8b2@gmail.com/
Link: https://patch.msgid.link/066d8cfa45d4822bf4367e788c50377c66bbcc82.1778703694.git.tim.c.chen@linux.intel.com

sched/cache: Fix rcu warning when accessing sd_llc domain

rcu_dereference_all() should be used to access the
sd_llc domain under RCU protection.

This bug was reported by sashiko.

Fixes: df0d98475954 ("sched/cache: Introduce infrastructure for cache-aware load balancing")
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Co-developed-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/2dc49455e861215d8059a1c877953f0b95990038.1778703694.git.tim.c.chen@linux.intel.com

sched/cache: Add user control to adjust the aggressiveness of cache-aware scheduling

Introduce a set of debugfs knobs to control how aggressively the
cache aware scheduling does the task aggregation.

(1) aggr_tolerance
With sched_cache enabled, the scheduler uses a process's footprint
as a proxy for its LLC footprint to determine if aggregating tasks
on the preferred LLC could cause cache contention. If the footprint
exceeds the LLC size, aggregation is skipped. Since the kernel
cannot efficiently track per-task cache usage (resctrl is
user-space only), userspace can provide a more accurate hint.

Introduce /sys/kernel/debug/sched/llc_balancing/aggr_tolerance to
let users control how strictly footprint limits aggregation. Values
range from 0 to 100:
  - 0: Cache-aware scheduling is disabled.
  - 1: Strict; tasks with footprint larger than LLC size are skipped.
  - >=100: Aggressive; tasks are aggregated regardless of footprint.
For example, with a 32MB L3 cache:

  - aggr_tolerance=1 -> tasks with footprint > 32MB are skipped.
  - aggr_tolerance=99 -> tasks with footprint > 784GB are skipped
    (784GB = (1 + (99 - 1) * 256) * 32MB).
Similarly, /sys/kernel/debug/sched/llc_balancing/aggr_tolerance also
controls how strictly the number of active threads is considered when
doing cache aware load balance. The number of SMTs is also considered.
High SMT counts reduce the aggregation capacity, preventing excessive
task aggregation on SMT-heavy systems like Power10/Power11.

Yangyu suggested introducing separate aggregation controls for the
number of active threads and memory footprint checks. Since there are
plans to add per-process/task group controls, fine-grained tunables are
deferred to that implementation.

(2) epoch_period, epoch_affinity_timeout,
    imb_pct, overaggr_pct are also turned into tunables.

Suggested-by: K Prateek Nayak <kprateek.nayak@amd.com>
Suggested-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com>
Suggested-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Suggested-by: Tingyin Duan <tingyin.duan@gmail.com>
Suggested-by: Jianyong Wu <jianyong.wu@outlook.com>
Suggested-by: Yangyu Chen <cyy@cyyself.name>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Co-developed-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Tingyin Duan <tingyin.duan@gmail.com>
Link: https://patch.msgid.link/1c62cc060ba2b33d7b1f0ed98b3390128edbae93.1778703694.git.tim.c.chen@linux.intel.com

sched/cache: Avoid cache-aware scheduling for memory-heavy processes

Prateek and Tingyin reported that memory-intensive workloads (such as
stream) can saturate memory bandwidth and caches on the preferred LLC
when sched_cache aggregates too many threads.

To mitigate this, estimate a process's memory footprint by comparing
its NUMA balancing fault statistics to the size of the LLC. If the
footprint exceeds the LLC size, skip cache-aware scheduling.

Note that footprint is only an approximation of the memory footprint,
since the kernel lacks suitable metrics to estimate the real working
set. If a user-provided hint is available in the future, it would be
more accurate. A later patch will allow users to provide a hint to
adjust this threshold.

Suggested-by: K Prateek Nayak <kprateek.nayak@amd.com>
Suggested-by: Vern Hao <vernhao@tencent.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Co-developed-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Tingyin Duan <tingyin.duan@gmail.com>
Link: https://patch.msgid.link/95cf64a385bcc12f18dcebe9d59e8d3ba8bb318f.1778703694.git.tim.c.chen@linux.intel.com

sched/cache: Calculate the LLC size and store it in sched_domain

Cache aware scheduling needs to know the LLC size that a process
can use, so as to avoid memory-intensive tasks from being
over-aggregated on a single LLC.

Introduce a preparation patch to add get_effective_llc_bytes() to
get the LLC size that a CPU can use. The function can be further
enhanced by subtracting the LLC cache ways reserved by resctrl
(CAT in Intel RDT, etc).

Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Co-developed-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Tingyin Duan <tingyin.duan@gmail.com>
Link: https://patch.msgid.link/37afee09ff608034da0ce149e72d33b6f4698edf.1778703694.git.tim.c.chen@linux.intel.com

sched/cache: Skip cache-aware scheduling for single-threaded processes

For a single thread, the current wakeup path tends to place it
on the same LLC where it was previously running with cache-hot
data. There is no need to enable cache-aware scheduling for
single-threaded processes for the following reasons:

1. Cache-aware scheduling primarily benefits multi-threaded
   processes where threads share data. Single-threaded processes
   typically have no inter-thread data sharing and thus gain little.

2. Enabling it incurs the additional overhead of tracking the
   thread's residency in the LLCs.

3. Bypassing single-threaded processes avoids excessive
   concentration of such tasks on a single LLC.

Nevertheless, this check can be omitted if users explicitly
provide hints for such single-threaded workloads where different
processes have shared memory, e.g., via prctl() or other interfaces
to be added in the future.

Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Co-developed-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Tingyin Duan <tingyin.duan@gmail.com>
Link: https://patch.msgid.link/8a59a13aa58fdb48e410ecb2aabd97fe3ea5d256.1778703694.git.tim.c.chen@linux.intel.com

sched/cache: Disable cache aware scheduling for processes with high thread counts

A performance regression was observed by Prateek when running hackbench
with many threads per process (high fd count). To avoid this, processes
with a large number of active threads are excluded from cache-aware
scheduling.

With sched_cache enabled, record the number of active threads in each
process during the periodic task_cache_work(). While iterating over
CPUs, if the currently running task belongs to the same process as the
task that launched task_cache_work(), increment the active thread count.

If the number of active threads within the process exceeds the number
of Cores (divided by the SMT number) in the LLC, do not enable
cache-aware scheduling. However, on systems with a smaller number of
CPUs within 1 LLC, like Power10/Power11 with SMT4 and an LLC size of 4,
this check effectively disables cache-aware scheduling for any process.
One possible solution suggested by Peter is to use an LLC-mask instead
of a single LLC value for preference. Once there are a 'few' LLCs as
preference, this constraint becomes a little easier. It could be an
enhancement in the future.

For users who wish to perform task aggregation regardless, a debugfs knob
is provided for tuning in a subsequent change.

Suggested-by: K Prateek Nayak <kprateek.nayak@amd.com>
Suggested-by: Aaron Lu <ziqianlu@bytedance.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Co-developed-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Tingyin Duan <tingyin.duan@gmail.com>
Link: https://patch.msgid.link/d076cd21a8e6c6341d1e2d927e118db770ebb650.1778703694.git.tim.c.chen@linux.intel.com

sched/cache: Allow only 1 thread of the process to calculate the LLC occupancy

Scanning online CPUs to calculate the occupancy might be
time-consuming. Only allow 1 thread of the process to scan
the CPUs at the same time, which is similar to what
NUMA balance does in task_numa_work().

Signed-off-by: Jianyong Wu <wujianyong@hygon.cn>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/5672b52e588b855b01e5a1a17822f7c6c7237a3d.1778703694.git.tim.c.chen@linux.intel.com

drm/xe/multi_queue: Fix secondary queue error case

If xe_lrc_create() fails, the secondary queue added to the
multi-queue group list is not removed before freeing the
queue. Fix error path handling for secondary queues by
removing it from the multi-queue group list at the right
place.

Reported-by: Sebastian Österlund <sebastian.osterlund@intel.com>
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/work_items/7979
Fixes: d716a5088c88 ("drm/xe/multi_queue: Handle tearing down of a multi queue")
Cc: stable@vger.kernel.org # v7.0+
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patch.msgid.link/20260518191639.320890-2-niranjana.vishwanathapura@intel.com

cgroup/rstat: validate cpu before css_rstat_cpu() access

css_rstat_updated() is exposed as a BPF kfunc and accepts a
caller-provided cpu argument. The function uses cpu for per-cpu rstat
lookups without checking whether it refers to a valid possible CPU.

A BPF iter/cgroup program with CAP_BPF and CAP_PERFMON can pass an
invalid cpu value. On an unfixed UBSCAN_BOUNDS test kernel, cpu ==
0x7fffffff triggers:

  UBSAN: array-index-out-of-bounds in kernel/cgroup/rstat.c:31:9
  index 2147483647 is out of range for type 'long unsigned int [64]'
  Call Trace:
    css_rstat_updated
    bpf_iter_run_prog
    cgroup_iter_seq_show
    bpf_seq_read

Add cpu validation to the BPF-facing css_rstat_updated() kfunc and
move the common implementation to __css_rstat_updated() for in-kernel
callers.

Fixes: a319185be9f5 ("cgroup: bpf: enable bpf programs to integrate with rstat")
Signed-off-by: Qing Ming <a0yami@mailbox.org>
Signed-off-by: Tejun Heo <tj@kernel.org>

srcu: Don't queue workqueue handlers to never-online CPUs

While an srcu_struct structure is in the midst of switching from CPU-0
to all-CPUs state, it can attempt to invoke callbacks for CPUs that
have never been online.  Worse yet, it can attempt in invoke callbacks
for CPUs that never will be online, even including imaginary CPUs not in
cpu_possible_mask.  This can cause hangs on s390, which is not set up to
deal with workqueue handlers being scheduled on such CPUs.  This commit
therefore causes Tree SRCU to refrain from queueing workqueue handlers
on CPUs that have not yet (and might never) come online.

Because callbacks are not invoked on CPUs that have not been
online, it is an error to invoke call_srcu(), synchronize_srcu(), or
synchronize_srcu_expedited() on a CPU that is not yet fully online.
However, it turns out to be less code to redirect the callbacks
from too-early invocations of call_srcu() than to warn about such
invocations.  This commit therefore also redirects callbacks queued on
not-yet-fully-online CPUs to the boot CPU.

Reported-by: Vasily Gorbik <gor@linux.ibm.com>
Fixes: 61bbcfb50514 ("srcu: Push srcu_node allocation to GP when non-preemptible")
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Tested-by: Vasily Gorbik <gor@linux.ibm.com>
Tested-by: Samir <samir@linux.ibm.com>
Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Boqun Feng <boqun@kernel.org>

cgroup/rdma: Drop unnecessary READ_ONCE() on event counters

All accesses to the event counters are serialized by rdmacg_mutex,
making the READ_ONCE() annotations unnecessary. Remove them.

Signed-off-by: Tao Cui <cuitao@kylinos.cn>
Signed-off-by: Tejun Heo <tj@kernel.org>

drm/dp/mst: fix OOB reads on 2-byte fields in sideband reply parsers

Three sideband reply parsers read 16-bit fields as:

  val = (raw->msg[idx] << 8) | (raw->msg[idx+1]);

and check bounds only after the fact. When idx == raw->curlen,
raw->msg[idx+1] reads one byte past the received message data into
the following struct fields (curchunk_len, curchunk_idx, curlen).

Affected functions:
- drm_dp_sideband_parse_enum_path_resources_ack()
   full_payload_bw_number and avail_payload_bw_number fields
- drm_dp_sideband_parse_allocate_payload_ack()
   allocated_pbn field
- drm_dp_sideband_parse_query_payload_ack()
   allocated_pbn field

Fix by using a single combined check (idx + 2 > curlen) before each
2-byte read. Since the check is strictly tighter than idx > curlen,
no separate step is needed.

Fixes: ad7f8a1f9ced ("drm/helper: add Displayport multi-stream helper (v0.6)")
Cc: <stable@vger.kernel.org> # v3.17+
Signed-off-by: Ashutosh Desai <ashutoshdesai993@gmail.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
[added fixes tag]
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patch.msgid.link/20260510203128.2884846-1-ashutoshdesai993@gmail.com

drm/dp/mst: fix OOB reads in remote DPCD/I2C sideband reply parsers

drm_dp_sideband_parse_remote_dpcd_read() reads num_bytes from the raw
message and then unconditionally does:

memcpy(bytes, &raw->msg[idx], num_bytes);

without checking that idx + num_bytes <= raw->curlen. raw->msg[] is
256 bytes; if a malicious or misbehaving MST hub sets num_bytes larger
than the remaining payload, the memcpy reads past the received data
into whatever follows in raw->msg[].

drm_dp_sideband_parse_remote_i2c_read_ack() has the same flaw (noted
with a /* TODO check */ comment since the code was introduced).

Fix both functions by using a single combined check
(idx + num_bytes > curlen) before each memcpy. Since num_bytes is u8,
it is always >= 0, so this strictly subsumes the simpler idx > curlen
form and no separate step is needed.

Fixes: ad7f8a1f9ced ("drm/helper: add Displayport multi-stream helper (v0.6)")
Cc: <stable@vger.kernel.org> # v3.17+
Signed-off-by: Ashutosh Desai <ashutoshdesai993@gmail.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
[added missing fixes tag]
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patch.msgid.link/20260510201733.2882224-1-ashutoshdesai993@gmail.com

drm/panel-edp: Add panel for Surface Pro 12in

Add an entry for the BOE NE120DRM-N28 panel,
used in the Microsoft Surface Pro 12-inch.

The values chosen were tested to be working fine
for wake from sleep and hibernation.

Panel edid:

00 ff ff ff ff ff ff 00 09 e5 c9 0c a0 06 00 07
0a 22 01 04 a5 19 11 78 07 9f 15 a6 55 4c 9b 25
0e 50 54 00 00 00 01 01 01 01 01 01 01 01 01 01
01 01 01 01 01 01 62 53 94 a0 80 b8 2e 50 18 10
3a 00 fe a9 00 00 00 1a 13 7d 94 a0 80 b8 2e 50
18 10 3a 00 fe a9 00 00 00 1a 00 00 00 fd 00 18
5a 5b 88 20 01 0a 20 20 20 20 20 20 00 00 00 fc
00 4e 45 31 32 30 44 52 4d 2d 4e 32 38 0a 00 0a

Signed-off-by: Harrison Vanderbyl <harrison.vanderbyl@gmail.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://patch.msgid.link/9e749a3a483e4a3c684eac3ee6a4b241c94a0362.1778822464.git.harrison.vanderbyl@gmail.com

ASoC: Add support for GPIOs driven amplifiers

Herve Codina <herve.codina@bootlin.com> says:

On some embedded system boards, audio amplifiers are designed using
discrete components such as op-amp, several resistors and switches to
either adjust the gain (switching resistors) or fully switch the
audio signal path (mute and/or bypass features).

Those switches are usually driven by simple GPIOs.

This kind of amplifiers are not handled in ASoC and the fallback is to
let the user-space handle those GPIOs out of the ALSA world.

In order to have those kind of amplifiers fully integrated in the audio
stack, this series introduces the audio-gpio-amp to handle them.

This new ASoC component allows to have the amplifiers seen as ASoC
auxiliarty devices and so it allows to control them through audio mixer
controls.

In order to ease the review, I choose to split modifications related
to the merge of the gpio-audio-amp part into the simple-amplfier driver
in several commits.

Link: https://patch.msgid.link/20260513081702.317117-1-herve.codina@bootlin.com

MAINTAINERS: Add the ASoC gpio audio amplifier entry

After contributing the component, add myself as the maintainer for the
ASoC gpio audio amplifier component.

Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Link: https://patch.msgid.link/20260513081702.317117-18-herve.codina@bootlin.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: simple-amplifier: Update author and copyright

After reworking the simple-amplifier driver and adding support for
gpio-audio-amp in the driver, add myself as the author of the
gpio-audio-amp part of the driver and add a related copyright.

Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Link: https://patch.msgid.link/20260513081702.317117-17-herve.codina@bootlin.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: simple-amplifier: gpio-audio-amp: Add support for gain-labels

The possible gain values can be described using labels instead of gain
values in dB.

Those different labels are attached to a gpio values using the
gain-labels property.

Using the gain-labels description is mutually exclusive with gain-ranges
description used to describe the relationship between gpios values and
gain values.

Handle the gain-labels description and the related kcontrol.

Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Link: https://patch.msgid.link/20260513081702.317117-16-herve.codina@bootlin.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: simple-amplifier: gpio-audio-amp: Add support for gain-ranges

The mapping between physical gain values and gpio values can be
expressed using ranges described in the gain-ranges property.

This gain-ranges property is an array of ranges.

Each range in the array is defined by the first point and last point in
the range. Those points are a pair of values, the gpios value and the
related gain (dB) value.

With that, a given range defines N possible items (from the first point
gpios value to the last point gpios value) in order to set a gain from
the first point gain value to the last point gain value.

Handle this description and the related kcontrol.

Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Link: https://patch.msgid.link/20260513081702.317117-15-herve.codina@bootlin.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: simple-amplifier: gpio-audio-amp: Add support for basic gain

Several gpios can be used to control the amplifier gain.

Add basic support for those gpios.

This basic support doesn't include any mapping between the GPIOs value
and the physical gain value (dB).

The support for this kind of mapping will be added later on.

Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Link: https://patch.msgid.link/20260513081702.317117-14-herve.codina@bootlin.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: simple-amplifier: gpio-audio-amp: Add support for bypass gpio

A gpio can be used to control the amplifier bypass feature.

Add support for this bypass gpio in the same way as it has been done for
the mute gpio.

Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Link: https://patch.msgid.link/20260513081702.317117-13-herve.codina@bootlin.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: simple-amplifier: gpio-audio-amp: Add support for mute gpio

A gpio can be used to control the amplifier mute feature.

Add support for this mute gpio.

Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Link: https://patch.msgid.link/20260513081702.317117-12-herve.codina@bootlin.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: simple-amplifier: gpio-audio-amp: Add support for extra power supplies

The gpio-audio-amp devices can use additional power supplies:
  - vddio,
  - vdda1,
  - vdda2

Add support for those additional power supplies.

Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Link: https://patch.msgid.link/20260513081702.317117-11-herve.codina@bootlin.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: simple-amplifier: Introduce support for gpio-audio-amp

Improve the simple-amplifier introducing preliminary support for
gpio-audio-amp.

Those amplifiers are amplifiers driven by gpios.

This support introduction doesn't handle any GPIO yet but introduces
the compatible strings and the related DAPM table.

Two gpio-audio-amp are available: A mono and a stereo version.

The mono version has only one audio channel and gpio settings impact
features such as the gain or mute of this sole channel.

The stereo version has two channels (left and right). Gpio settings
impact both channels in the same manner and at the same time. For
instance, the gain setting set the gain of both channels as well as
the mute setting mutes both channels.

Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Link: https://patch.msgid.link/20260513081702.317117-10-herve.codina@bootlin.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: simple-amplifier: Remove DAPM widgets and routes from the ASoC component driver

The simple-amplifier set the DAPM wigets and routes table in the ASoC
component driver. This is perfectly fine when the component has well
known DAPM tables.

The simple-amplifier is going to handle several kind of components based
on the driver compatible string. The DAPM table will not be the same for
all components supported by the driver.

In order to have different DAPM table based on matching compatible
strings, move those tables from the ASoC component driver to the device
compatible string matching data.

Add those DAPM widgets and routes dynamically during the ASoC component
probe operation.

Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Link: https://patch.msgid.link/20260513081702.317117-9-herve.codina@bootlin.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: simple-amplifier: Use 'simple_amp' variable name instead of 'priv'

The simple-amplifier driver use 'priv' as variable name for its private
data (struct simple_amp).

With the support for gpio-audio-amp, more functions and data
structures will be added.

Those future additions will add more complexity in data manipulation and
will make the 'priv' term error prone.

In order to clearly identify the struct simple_amp private data, use
'simple_amp' as variable name when this structure is involved.

Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Link: https://patch.msgid.link/20260513081702.317117-8-herve.codina@bootlin.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: simple-amplifier: Rename drv_event() function

The drv_event() is used to handle power events related to the DRV item.

Later, with the support for gpio-audio-amp, this function will be
also used to handle power events related to the PGA item.

Also, more functions will be added in the driver and it is a common
usage to prefix functions based on the driver name.

Rename the drv_event() function to simple_amp_power_event() to follow
common usage and get rid of the 'drv' term.

Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Link: https://patch.msgid.link/20260513081702.317117-7-herve.codina@bootlin.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: simple-amplifier: Remove CONFIG_OF flag and of_match_ptr()

The simple-amplifier Use CONFIG_OF flag for its of_device_id table
and of_match_ptr() when it assigns the table in the driver declaration.

This is no more needed. Drop them.

Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Link: https://patch.msgid.link/20260513081702.317117-6-herve.codina@bootlin.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: simple-amplifier: Add missing headers

The simple-amplifier driver is a platform device driver.

Add missing include files related to this kind of driver.

Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Link: https://patch.msgid.link/20260513081702.317117-5-herve.codina@bootlin.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: simple-amplifier: Remove DRV_NAME defined value

DRV_NAME is defined and used only in the simple-amplifier driver
declaration.

Remove the useless defined and use directly the value in the driver
declaration itself.

Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Link: https://patch.msgid.link/20260513081702.317117-4-herve.codina@bootlin.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: dt-bindings: Add support for the GPIOs driven amplifier

Some amplifiers based on analog switches and op-amps can be present in
the audio path and can be driven by GPIOs in order to control their gain
value, their mute and/or bypass functions.

Those components needs to be viewed as audio components in order to be
fully integrated in the audio path.

gpio-audio-amp allows to consider these GPIO driven amplifiers as
auxiliary audio devices.

Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20260513081702.317117-3-herve.codina@bootlin.com
Signed-off-by: Mark Brown <broonie@kernel.org>

of: Introduce of_property_read_s32_index()

Signed integers can be read from single value properties using
of_property_read_s32() but nothing exist to read signed integers
from multi-value properties.

Fix this lack adding of_property_read_s32_index().

Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20260513081702.317117-2-herve.codina@bootlin.com
Signed-off-by: Mark Brown <broonie@kernel.org>

spi: ti-qspi: fix use-after-free after DMA setup failure

The driver falls back to PIO mode if DMA setup fails during probe.

Make sure to clear the DMA channel pointer also if buffer allocation
fails to avoid passing a pointer to the released channel to the DMA
engine (or trying to free the channel a second time on late probe errors
or driver unbind).

This issue was flagged by Sashiko when reviewing a devres allocation
conversion patch.

Fixes: c687c46e9e45 ("spi: spi-ti-qspi: Use bounce buffer if read buffer is not DMA'ble")
Link: https://sashiko.dev/#/patchset/20260505072909.618363-1-johan%40kernel.org?part=17
Cc: stable@vger.kernel.org # 4.12
Cc: Vignesh R <vigneshr@ti.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://patch.msgid.link/20260512074809.915084-1-johan@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>

spi: sprd: fix error pointer deref after DMA setup failure

The driver falls back to PIO mode if DMA setup fails during probe.

Make sure to check the dma.enabled flag before trying to release the DMA
channels also on late probe errors to avoid dereferencing an error
pointer (or attempting to release a channel a second time).

This issue was flagged by Sashiko when reviewing a devres allocation
conversion patch.

Fixes: 386119bc7be9 ("spi: sprd: spi: sprd: Add DMA mode support")
Link: https://sashiko.dev/#/patchset/20260505072909.618363-1-johan%40kernel.org?part=10
Cc: stable@vger.kernel.org # 5.1
Cc: Lanqing Liu <lanqing.liu@unisoc.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://patch.msgid.link/20260512074733.915029-1-johan@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: fsl_sai: Eliminate possible interrupt storm during probe

When the SAI peripheral is left in a running state by the bootloader,
the driver can experience an interrupt storm during probe that prevents
successful initialization. This occurs because the current code registers
the IRQ handler before resetting the hardware to a known state.

The issue manifests as:
- Continuous interrupts firing immediately after devm_request_irq()
- Driver probe failure or system hang
- Error messages about unhandled interrupts

This is particularly problematic on systems where U-Boot or other
bootloaders enable SAI for boot-time audio feedback or diagnostics
and don't properly disable it before handing control to Linux.

Fix this by reordering the probe sequence:
1. Add fsl_sai_reset_hw() to clear TCSR/RCSR control registers,
which disables the transmitter/receiver and all interrupt sources
2. Move devm_request_irq() to after hardware initialization

This ensures the SAI is in a clean reset state before the interrupt
handler can be invoked, preventing the storm while maintaining proper
error handling and cleanup paths.

Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Link: https://patch.msgid.link/20260512065252.75859-1-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>

spi: qup: fix error pointer deref after DMA setup failure

The driver falls back to PIO mode if DMA setup fails during probe.

Make sure to the clear the DMA channel pointers on setup failure to
avoid dereferencing an error pointer (or attempting to release a channel
a second time) on later probe errors or driver unbind.

This issue was flagged by Sashiko when reviewing a devres allocation
conversion patch.

Fixes: 612762e82ae6 ("spi: qup: Add DMA capabilities")
Link: https://sashiko.dev/#/patchset/20260505072909.618363-1-johan%40kernel.org?part=4
Cc: stable@vger.kernel.org # 4.1
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://patch.msgid.link/20260512074334.914735-1-johan@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>

nfc: nxp-nci: i2c: use rising-edge IRQ on ACPI systems

Some ACPI-based platforms report incorrect IRQ trigger types (e.g.
IRQF_TRIGGER_HIGH), which can lead to interrupt storms.

Use the historically working rising-edge trigger on ACPI systems to
avoid this regression.

Device Tree-based systems continue to use the firmware-provided
trigger type.

Fixes: 57be33f85e36 ("nfc: nxp-nci: remove interrupt trigger type")
Signed-off-by: Carl Lee <carl.lee@amd.com>
Tested-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Reviewed-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Tested-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Tested-by: Luca Stefani <luca.stefani.ge1@gmail.com>
Link: https://patch.msgid.link/20260516-nfc-nxp-nci-i2c-restore-irq-trigger-fallback-v3-1-37ba4b6e9086@amd.com
Signed-off-by: David Heidelberg <david@ixit.cz>

ASoC: mediatek: mt8196: Fix probe resource cleanup

The MT8196 AFE probe assigns reserved memory with
of_reserved_mem_device_init(), but never releases it.
This leaks the reserved memory assignment on driver
removal and on later probe failures.

The same probe path also uses unchecked pm_runtime_get_sync() calls.
A failure while resuming the device can leave the runtime PM usage
count in an unexpected state.

The regmap error path returns directly while the device is still
runtime active, and the remove path drops a runtime PM reference even
though successful probe has already released its temporary reference.

Register a devm cleanup action for the reserved memory assignment,
use pm_runtime_resume_and_get(), and only drop runtime PM references
on paths where they are actually held.

Fixes: 57513aabfe5b ("ASoC: mediatek: mt8196: add platform driver")
Signed-off-by: Cássio Gabriel <cassiogabrielcontato@gmail.com>
Link: https://patch.msgid.link/20260517-asoc-mt8196-probe-cleanup-v1-1-a5d26949d7fe@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>

Merge tag 'media/v7.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media fix from Mauro Carvalho Chehab:
"Fix inverted error logic in ttusbir driver"

* tag 'media/v7.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
media: rc: ttusbir: fix inverted error logic

dt-bindings: fpga: altr,socfpga-fpga-mgr: convert to DT schema

Convert the Altera SoCFPGA FPGA Manager bindings from text
format to YAML schema.

Signed-off-by: Manish Baing <manishbaing2789@gmail.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Acked-by: Xu Yilun <yilun.xu@intel.com>
Link: https://lore.kernel.org/r/20260512182033.66222-1-manishbaing2789@gmail.com
Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>

dt-bindings: fpga: altr,a10-pr-ip: convert to DT schema

Convert the Altera Arria 10 Partial Reconfiguration IP bindings
from text format to YAML schema.

Signed-off-by: Manish Baing <manishbaing2789@gmail.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Acked-by: Xu Yilun <yilun.xu@intel.com>
Link: https://lore.kernel.org/r/20260512180225.65902-1-manishbaing2789@gmail.com
Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>

Merge tag 'soc_fsl-7.1-2' of https://git.kernel.org/pub/scm/linux/kernel/git/chleroy/linux into soc/drivers

FSL SOC Changes for 7.1

Freescale QUICC Engine:
- Add missing cleanup on device removal and switch to irq_domain_create_linear()
in interrupt controller for IO Ports
- Panic on ioremap() failure in qe_reset()

Freescale Management Complex:
- Move fsl-mc over to device MSI infrastructure
- Wait for the MC firmware to complete its boot

Freescale Hypervisor:
- Fix header kernel-doc warnings

* tag 'soc_fsl-7.1-2' of https://git.kernel.org/pub/scm/linux/kernel/git/chleroy/linux:
  bus: fsl-mc: wait for the MC firmware to complete its boot
  soc: fsl: qe: panic on ioremap() failure in qe_reset()
  soc: fsl: qe_ports_ic: switch to irq_domain_create_linear()
  soc: fsl: qe_ports_ic: Add missing cleanup on device removal
  virt: fsl_hypervisor: fix header kernel-doc warnings
  platform-msi: Remove stale comment
  fsl-mc: Remove legacy MSI implementation
  fsl-mc: Switch over to per-device platform MSI
  irqchip/gic-v3-its: Add fsl_mc device plumbing to the msi-parent handling
  fsl-mc: Add minimal infrastructure to use platform MSI
  fsl-mc: Remove MSI domain propagation to sub-devices

Signed-off-by: Arnd Bergmann <arnd@arndb.de>

io_uring: propagate array_index_nospec opcode into req->opcode

Commit 1e988c3fe126 ("io_uring: prevent opcode speculation") added
array_index_nospec() to io_init_req(), but applied it only to a local
opcode variable. req->opcode is initialized from sqe->opcode before the
bounds check and remains the raw value.

Keep req->opcode as the canonical opcode in io_init_req(): reject
out-of-range values architecturally, then write the array_index_nospec()
result back to req->opcode before any table lookup. This keeps downstream
users of req->opcode from observing the raw user byte on a mispredicted
path.

No functional change: array_index_nospec() is a no-op for opcodes in
[0, IORING_OP_LAST), and out-of-range opcodes are still rejected at the
bounds check above the assignment.

Fixes: 1e988c3fe126 ("io_uring: prevent opcode speculation")
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Link: https://patch.msgid.link/20260517213010.696135-1-michael.bommarito@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>