git.ipfire.org Git - thirdparty/kernel/stable.git/log

ext4: publish jinode after initialization

ext4_inode_attach_jinode() publishes ei->jinode to concurrent users.
It used to set ei->jinode before jbd2_journal_init_jbd_inode(),
allowing a reader to observe a non-NULL jinode with i_vfs_inode
still unset.

The fast commit flush path can then pass this jinode to
jbd2_wait_inode_data(), which dereferences i_vfs_inode->i_mapping and
may crash.

Below is the crash I observe:
```
BUG: unable to handle page fault for address: 000000010beb47f4
PGD 110e51067 P4D 110e51067 PUD 0
Oops: Oops: 0000 [#1] SMP NOPTI
CPU: 1 UID: 0 PID: 4850 Comm: fc_fsync_bench_ Not tainted 6.18.0-00764-g795a690c06a5 #1 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.17.0-2-2 04/01/2014
RIP: 0010:xas_find_marked+0x3d/0x2e0
Code: e0 03 48 83 f8 02 0f 84 f0 01 00 00 48 8b 47 08 48 89 c3 48 39 c6 0f 82 fd 01 00 00 48 85 c9 74 3d 48 83 f9 03 77 63 4c 8b 0f <49> 8b 71 08 48 c7 47 18 00 00 00 00 48 89 f1 83 e1 03 48 83 f9 02
RSP: 0018:ffffbbee806e7bf0 EFLAGS: 00010246
RAX: 000000000010beb4 RBX: 000000000010beb4 RCX: 0000000000000003
RDX: 0000000000000001 RSI: 0000002000300000 RDI: ffffbbee806e7c10
RBP: 0000000000000001 R08: 0000002000300000 R09: 000000010beb47ec
R10: ffff9ea494590090 R11: 0000000000000000 R12: 0000002000300000
R13: ffffbbee806e7c90 R14: ffff9ea494513788 R15: ffffbbee806e7c88
FS: 00007fc2f9e3e6c0(0000) GS:ffff9ea6b1444000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000010beb47f4 CR3: 0000000119ac5000 CR4: 0000000000750ef0
PKRU: 55555554
Call Trace:
<TASK>
filemap_get_folios_tag+0x87/0x2a0
__filemap_fdatawait_range+0x5f/0xd0
? srso_alias_return_thunk+0x5/0xfbef5
? __schedule+0x3e7/0x10c0
? srso_alias_return_thunk+0x5/0xfbef5
? srso_alias_return_thunk+0x5/0xfbef5
? srso_alias_return_thunk+0x5/0xfbef5
? preempt_count_sub+0x5f/0x80
? srso_alias_return_thunk+0x5/0xfbef5
? cap_safe_nice+0x37/0x70
? srso_alias_return_thunk+0x5/0xfbef5
? preempt_count_sub+0x5f/0x80
? srso_alias_return_thunk+0x5/0xfbef5
filemap_fdatawait_range_keep_errors+0x12/0x40
ext4_fc_commit+0x697/0x8b0
? ext4_file_write_iter+0x64b/0x950
? srso_alias_return_thunk+0x5/0xfbef5
? preempt_count_sub+0x5f/0x80
? srso_alias_return_thunk+0x5/0xfbef5
? vfs_write+0x356/0x480
? srso_alias_return_thunk+0x5/0xfbef5
? preempt_count_sub+0x5f/0x80
ext4_sync_file+0xf7/0x370
do_fsync+0x3b/0x80
? syscall_trace_enter+0x108/0x1d0
__x64_sys_fdatasync+0x16/0x20
do_syscall_64+0x62/0x2c0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
...
```

Fix this by initializing the jbd2_inode first.
Use smp_wmb() and WRITE_ONCE() to publish ei->jinode after
initialization. Readers use READ_ONCE() to fetch the pointer.

Fixes: a361293f5fede ("jbd2: Fix oops in jbd2_journal_file_inode()")
Cc: stable@vger.kernel.org
Signed-off-by: Li Chen <me@linux.beauty>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20260225082617.147957-1-me@linux.beauty
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org

ext4: replace BUG_ON with proper error handling in ext4_read_inline_folio

Replace BUG_ON() with proper error handling when inline data size
exceeds PAGE_SIZE. This prevents kernel panic and allows the system to
continue running while properly reporting the filesystem corruption.

The error is logged via ext4_error_inode(), the buffer head is released
to prevent memory leak, and -EFSCORRUPTED is returned to indicate
filesystem corruption.

Signed-off-by: Yuto Ohnuki <ytohnuki@amazon.com>
Link: https://patch.msgid.link/20260223123345.14838-2-ytohnuki@amazon.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org

ext4: fix fsync(2) for nojournal mode

When inode metadata is changed, we sometimes just call
ext4_mark_inode_dirty() to track modified metadata. This copies inode
metadata into block buffer which is enough when we are journalling
metadata. However when we are running in nojournal mode we currently
fail to write the dirtied inode buffer during fsync(2) because the inode
is not marked as dirty. Use explicit ext4_write_inode() call to make
sure the inode table buffer is written to the disk. This is a band aid
solution but proper solution requires a much larger rewrite including
changes in metadata bh tracking infrastructure.

Reported-by: Free Ekanayaka <free.ekanayaka@gmail.com>
Link: https://lore.kernel.org/all/87il8nhxdm.fsf@x1.mail-host-address-is-not-set/
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Link: https://patch.msgid.link/20260216164848.3074-4-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org

ext4: make recently_deleted() properly work with lazy itable initialization

recently_deleted() checks whether inode has been used in the near past.
However this can give false positive result when inode table is not
initialized yet and we are in fact comparing to random garbage (or stale
itable block of a filesystem before mkfs). Ultimately this results in
uninitialized inodes being skipped during inode allocation and possibly
they are never initialized and thus e2fsck complains. Verify if the
inode has been initialized before checking for dtime.

Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Link: https://patch.msgid.link/20260216164848.3074-3-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org

Merge branch 'fix-page-fragment-handling-when-page_size-4k'

Dimitri Daskalakis says:

====================
Fix page fragment handling when PAGE_SIZE > 4K

FBNIC operates on fixed size descriptors (4K). When the OS supports pages
larger than 4K, we fragment the page across multiple descriptors.

While performance testing, I found several issues with our page fragment
handling, resulting in low throughput and potential RX stalls.
====================

Link: https://patch.msgid.link/20260324195123.3486219-1-dimitri.daskalakis1@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ext4: fix journal credit check when setting fscrypt context

Fix an issue arising when ext4 features has_journal, ea_inode, and encrypt
are activated simultaneously, leading to ENOSPC when creating an encrypted
file.

Fix by passing XATTR_CREATE flag to xattr_set_handle function if a handle
is specified, i.e., when the function is called in the control flow of
creating a new inode. This aligns the number of jbd2 credits set_handle
checks for with the number allocated for creating a new inode.

ext4_set_context must not be called with a non-null handle (fs_data) if
fscrypt context xattr is not guaranteed to not exist yet. The only other
usage of this function currently is when handling the ioctl
FS_IOC_SET_ENCRYPTION_POLICY, which calls it with fs_data=NULL.

Fixes: c1a5d5f6ab21eb7e ("ext4: improve journal credit handling in set xattr paths")
Co-developed-by: Anthony Durrer <anthonydev@fastmail.com>
Signed-off-by: Anthony Durrer <anthonydev@fastmail.com>
Signed-off-by: Simon Weber <simon.weber.39@gmail.com>
Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Link: https://patch.msgid.link/20260207100148.724275-4-simon.weber.39@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org

eth: fbnic: Fix debugfs output for BDQ's with page frags

The rings size_mask represents the number of pages, so we need to
determine the number of page frags when dumping the descriptors.

Fixes: df04373b0dab ("eth fbnic: Add debugfs hooks for tx/rx rings")
Signed-off-by: Dimitri Daskalakis <daskald@meta.com>
Link: https://patch.msgid.link/20260324195123.3486219-3-dimitri.daskalakis1@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

eth: fbnic: Account for page fragments when updating BDQ tail

FBNIC supports fixed size buffers of 4K. When PAGE_SIZE > 4K, we
fragment the page across multiple descriptors (FBNIC_BD_FRAG_COUNT).
When refilling the BDQ, the correct number of entries are populated,
but tail was only incremented by one. So on a system with 64K pages,
HW would get one descriptor refilled for every 16 we populate.

Additionally, we program the ring size in the HW when enabling the BDQ.
This was not accounting for page fragments, so on systems with 64K pages,
the HW used 1/16th of the ring.

Fixes: 0cb4c0a13723 ("eth: fbnic: Implement Rx queue alloc/start/stop/free")
Signed-off-by: Dimitri Daskalakis <daskald@meta.com>
Link: https://patch.msgid.link/20260324195123.3486219-2-dimitri.daskalakis1@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ext4: convert inline data to extents when truncate exceeds inline size

Add a check in ext4_setattr() to convert files from inline data storage
to extent-based storage when truncate() grows the file size beyond the
inline capacity. This prevents the filesystem from entering an
inconsistent state where the inline data flag is set but the file size
exceeds what can be stored inline.

Without this fix, the following sequence causes a kernel BUG_ON():

1. Mount filesystem with inode that has inline flag set and small size
2. truncate(file, 50MB) - grows size but inline flag remains set
3. sendfile() attempts to write data
4. ext4_write_inline_data() hits BUG_ON(write_size > inline_capacity)

The crash occurs because ext4_write_inline_data() expects inline storage
to accommodate the write, but the actual inline capacity (~60 bytes for
i_block + ~96 bytes for xattrs) is far smaller than the file size and
write request.

The fix checks if the new size from setattr exceeds the inode's actual
inline capacity (EXT4_I(inode)->i_inline_size) and converts the file to
extent-based storage before proceeding with the size change.

This addresses the root cause by ensuring the inline data flag and file
size remain consistent during truncate operations.

Reported-by: syzbot+7de5fe447862fc37576f@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=7de5fe447862fc37576f
Tested-by: syzbot+7de5fe447862fc37576f@syzkaller.appspotmail.com
Signed-off-by: Deepanshu Kartikey <Kartikey406@gmail.com>
Link: https://patch.msgid.link/20260207043607.1175976-1-kartikey406@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org

ext4: fix stale xarray tags after writeback

There are cases where ext4_bio_write_page() gets called for a page which
has no buffers to submit. This happens e.g. when the part of the file is
actually a hole, when we cannot allocate blocks due to being called from
jbd2, or in data=journal mode when checkpointing writes the buffers
earlier. In these cases we just return from ext4_bio_write_page()
however if the page didn't need redirtying, we will leave stale DIRTY
and/or TOWRITE tags in xarray because those get cleared only in
__folio_start_writeback(). As a result we can leave these tags set in
mappings even after a final sync on filesystem that's getting remounted
read-only or that's being frozen. Various assertions can then get upset
when writeback is started on such filesystems (Gerald reported assertion
in ext4_journal_check_start() firing).

Fix the problem by cycling the page through writeback state even if we
decide nothing needs to be written for it so that xarray tags get
properly updated. This is slightly silly (we could update the xarray
tags directly) but I don't think a special helper messing with xarray
tags is really worth it in this relatively rare corner case.

Reported-by: Gerald Yang <gerald.yang@canonical.com>
Link: https://lore.kernel.org/all/20260128074515.2028982-1-gerald.yang@canonical.com
Fixes: dff4ac75eeee ("ext4: move keep_towrite handling to ext4_bio_write_page()")
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20260205092223.21287-2-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org

ip6_tunnel: clear skb2->cb[] in ip4ip6_err()

Oskar Kjos reported the following problem.

ip4ip6_err() calls icmp_send() on a cloned skb whose cb[] was written
by the IPv6 receive path as struct inet6_skb_parm. icmp_send() passes
IPCB(skb2) to __ip_options_echo(), which interprets that cb[] region
as struct inet_skb_parm (IPv4). The layouts differ: inet6_skb_parm.nhoff
at offset 14 overlaps inet_skb_parm.opt.rr, producing a non-zero rr
value. __ip_options_echo() then reads optlen from attacker-controlled
packet data at sptr[rr+1] and copies that many bytes into dopt->__data,
a fixed 40-byte stack buffer (IP_OPTIONS_DATA_FIXED_SIZE).

To fix this we clear skb2->cb[], as suggested by Oskar Kjos.

Also add minimal IPv4 header validation (version == 4, ihl >= 5).

Fixes: c4d3efafcc93 ("[IPV6] IP6TUNNEL: Add support to IPv4 over IPv6 tunnel.")
Reported-by: Oskar Kjos <oskar.kjos@hotmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20260326155138.2429480-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ipv6: icmp: clear skb2->cb[] in ip6_err_gen_icmpv6_unreach()

Sashiko AI-review observed:

  In ip6_err_gen_icmpv6_unreach(), the skb is an outer IPv4 ICMP error packet
  where its cb contains an IPv4 inet_skb_parm. When skb is cloned into skb2
  and passed to icmp6_send(), it uses IP6CB(skb2).

  IP6CB interprets the IPv4 inet_skb_parm as an inet6_skb_parm. The cipso
  offset in inet_skb_parm.opt directly overlaps with dsthao in inet6_skb_parm
  at offset 18.

  If an attacker sends a forged ICMPv4 error with a CIPSO IP option, dsthao
  would be a non-zero offset. Inside icmp6_send(), mip6_addr_swap() is called
  and uses ipv6_find_tlv(skb, opt->dsthao, IPV6_TLV_HAO).

  This would scan the inner, attacker-controlled IPv6 packet starting at that
  offset, potentially returning a fake TLV without checking if the remaining
  packet length can hold the full 18-byte struct ipv6_destopt_hao.

  Could mip6_addr_swap() then perform a 16-byte swap that extends past the end
  of the packet data into skb_shared_info?

  Should the cb array also be cleared in ip6_err_gen_icmpv6_unreach() and
  ip6ip6_err() to prevent this?

This patch implements the first suggestion.

I am not sure if ip6ip6_err() needs to be changed.
A separate patch would be better anyway.

Fixes: ca15a078bd90 ("sit: generate icmpv6 error when receiving icmpv4 error")
Reported-by: Ido Schimmel <idosch@nvidia.com>
Closes: https://sashiko.dev/#/patchset/20260326155138.2429480-1-edumazet%40google.com
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Oskar Kjos <oskar.kjos@hotmail.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20260326202608.2976021-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ext4: do not check fast symlink during orphan recovery

Commit '5f920d5d6083 ("ext4: verify fast symlink length")' causes the
generic/475 test to fail during orphan cleanup of zero-length symlinks.

  generic/475  84s ... _check_generic_filesystem: filesystem on /dev/vde is inconsistent

The fsck reports are provided below:

  Deleted inode 9686 has zero dtime.
  Deleted inode 158230 has zero dtime.
  ...
  Inode bitmap differences:  -9686 -158230
  Orphan file (inode 12) block 13 is not clean.
  Failed to initialize orphan file.

In ext4_symlink(), a newly created symlink can be added to the orphan
list due to ENOSPC. Its data has not been initialized, and its size is
zero. Therefore, we need to disregard the length check of the symbolic
link when cleaning up orphan inodes. Instead, we should ensure that the
nlink count is zero.

Fixes: 5f920d5d6083 ("ext4: verify fast symlink length")
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20260131091156.1733648-1-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org

Update MAINTAINERS file to add reviewers for ext4

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>

Merge tag 'hwmon-for-v7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Pull hwmon fixes from Guenter Roeck:

- PMBus driver fixes:
     - Add mutex protection for regulator operations
     - Fix reading from "write-only" attributes
     - Mark lowest/average/highest/rated attributes as read-only
     - isl68137: Add mutex protection for AVS enable sysfs attributes
     - ina233:  Fix error handling and sign extension when reading shunt voltage

- adm1177: Fix sysfs ABI violation and current unit conversion

- peci: Fix off-by-one in cputemp_is_visible(), and crit_hyst returning
   delta instead of absolute temperature

* tag 'hwmon-for-v7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (pmbus/core) Protect regulator operations with mutex
  hwmon: (pmbus) Introduce the concept of "write-only" attributes
  hwmon: (pmbus) Mark lowest/average/highest/rated attributes as read-only
  hwmon: (adm1177) fix sysfs ABI violation and current unit conversion
  hwmon: (peci/cputemp) Fix off-by-one in cputemp_is_visible()
  hwmon: (peci/cputemp) Fix crit_hyst returning delta instead of absolute temperature
  hwmon: (pmbus/isl68137) Add mutex protection for AVS enable sysfs attributes
  hwmon: (pmbus/ina233) Fix error handling and sign extension in shunt voltage read

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
"Driver (and enclosure) only fixes. Most are obvious. The big change is
  in the tcm_loop driver to add command draining to error handling (the
  lack of which was causing hangs with the potential for double use
  crashes)"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: target: file: Use kzalloc_flex for aio_cmd
  scsi: scsi_transport_sas: Fix the maximum channel scanning issue
  scsi: target: tcm_loop: Drain commands in target_reset handler
  scsi: ibmvfc: Fix OOB access in ibmvfc_discover_targets_done()
  scsi: ses: Handle positive SCSI error from ses_recv_diag()

MAINTAINERS: Change email address for Thierry Reding

Use the kernel.org email address as a level of indirection to enable
transparent switching of providers if necessary.

Signed-off-by: Thierry Reding <treding@nvidia.com>

arm64: tegra: Add Tegra264 GPIO controllers

Add device tree nodes for MAIN, AON and UPHY GPIO controller instances.

Signed-off-by: Prathamesh Shete <pshete@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>

arm64: tegra: smaug: Enable SPI-NOR flash

Add support for the SPI-NOR flash found in Pixel C devices.

Signed-off-by: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt>
Signed-off-by: Thierry Reding <treding@nvidia.com>

Merge tag 'drm-fixes-2026-03-28-1' of https://gitlab.freedesktop.org/drm/kernel

Pull drm fixes from Dave Airlie:
"Weekly fixes, still a bit busy, but the usual suspects amdgpu and
  i915/xe have a bunch of small fixes, and otherwise it's just a few
  minor driver fixes.

  loognsoon:
   - update MAINTAINERS

  shmem:
   - fault handler fix

  syncobj:
   - fix GFP flags

  amdgpu:
   - DSC fix
   - Module parameter parsing fix
   - PASID reuse fix
   - drm_edid leak fix
   - SMU 13.x fixes
   - SMU 14.x fix
   - Fence fix in amdgpu_amdkfd_submit_ib()
   - LVDS fixes
   - GPU page fault fix for non-4K pages

  amdkfd:
   - Ordering fix in kfd_ioctl_create_process()

  i915/display:
   - DP tunnel error handling fix
   - Spurious GMBUS timeout fix
   - Unlink NV12 planes earlier
   - Order OP vs. timeout correctly in __wait_for()

  xe:
   - Fix UAF in SRIOV migration restore
   - Updates to HW W/a
   - VMBind remap fix

  ivpu:
   - poweroff fix

  mediatek:
   - fix register ordering"

* tag 'drm-fixes-2026-03-28-1' of https://gitlab.freedesktop.org/drm/kernel: (25 commits)
  MAINTAINERS: Update GPU driver maintainer information
  drm/xe: always keep track of remap prev/next
  drm/syncobj: Fix xa_alloc allocation flags
  drm/amd/display: Fix DCE LVDS handling
  drm/amdgpu: Handle GPU page faults correctly on non-4K page systems
  drm/amd/pm: disable OD_FAN_CURVE if temp or pwm range invalid for smu v14
  drm/amdkfd: Fix NULL pointer check order in kfd_ioctl_create_process
  drm/amd/display: check if ext_caps is valid in BL setup
  drm/amdgpu: Fix fence put before wait in amdgpu_amdkfd_submit_ib
  drm/xe: Implement recent spec updates to Wa_16025250150
  accel/ivpu: Add disable clock relinquish workaround for NVL-A0
  drm/i915/dp_tunnel: Fix error handling when clearing stream BW in atomic state
  drm/amd/pm: disable OD_FAN_CURVE if temp or pwm range invalid for smu v13
  drm/amd/pm: Return -EOPNOTSUPP for unsupported OD_MCLK on smu_v13_0_6
  drm/amd/pm: Skip redundant UCLK restore in smu_v13_0_6
  drm/amd/display: Fix drm_edid leak in amdgpu_dm
  drm/amdgpu: prevent immediate PASID reuse case
  drm/amdgpu: fix strsep() corrupting lockup_timeout on multi-GPU (v3)
  drm/amd/display: Do not skip unrelated mode changes in DSC validation
  drm/xe/pf: Fix use-after-free in migration restore
  ...

ptr_ring: disable KCSAN warnings

Eric Dumazet reported KCSAN warnings:

BUG: KCSAN: data-race in pfifo_fast_dequeue / pfifo_fast_enqueue

write to 0xffff88811d5ccc00 of 8 bytes by interrupt on cpu 0:
__ptr_ring_zero_tail include/linux/ptr_ring.h:259 [inline]
__ptr_ring_discard_one include/linux/ptr_ring.h:291 [inline]
__ptr_ring_consume include/linux/ptr_ring.h:311 [inline]
__skb_array_consume include/linux/skb_array.h:98 [inline]
pfifo_fast_dequeue+0x770/0x8f0 net/sched/sch_generic.c:770
dequeue_skb net/sched/sch_generic.c:297 [inline]
qdisc_restart net/sched/sch_generic.c:402 [inline]
__qdisc_run+0x189/0xc80 net/sched/sch_generic.c:420
qdisc_run include/net/pkt_sched.h:120 [inline]
net_tx_action+0x379/0x590 net/core/dev.c:5793
handle_softirqs+0xb9/0x280 kernel/softirq.c:622
do_softirq+0x45/0x60 kernel/softirq.c:523
__local_bh_enable_ip+0x70/0x80 kernel/softirq.c:450
local_bh_enable include/linux/bottom_half.h:33 [inline]
bpf_test_run+0x2db/0x620 net/bpf/test_run.c:426
bpf_prog_test_run_skb+0x9a4/0xef0 net/bpf/test_run.c:1159
bpf_prog_test_run+0x204/0x340 kernel/bpf/syscall.c:4721
__sys_bpf+0x52e/0x7e0 kernel/bpf/syscall.c:6246
__do_sys_bpf kernel/bpf/syscall.c:6341 [inline]
__se_sys_bpf kernel/bpf/syscall.c:6339 [inline]
__x64_sys_bpf+0x41/0x50 kernel/bpf/syscall.c:6339
x64_sys_call+0x10cb/0x3020 arch/x86/include/generated/asm/syscalls_64.h:322
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x12c/0x370 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f

read to 0xffff88811d5ccc00 of 8 bytes by task 22632 on cpu 1:
__ptr_ring_produce include/linux/ptr_ring.h:106 [inline]
ptr_ring_produce include/linux/ptr_ring.h:129 [inline]
skb_array_produce include/linux/skb_array.h:44 [inline]
pfifo_fast_enqueue+0xd5/0x2c0 net/sched/sch_generic.c:741
dev_qdisc_enqueue net/core/dev.c:4144 [inline]
__dev_xmit_skb net/core/dev.c:4188 [inline]
__dev_queue_xmit+0x6a4/0x1f20 net/core/dev.c:4795
dev_queue_xmit include/linux/netdevice.h:3384 [inline]
__bpf_tx_skb net/core/filter.c:2153 [inline]
__bpf_redirect_common net/core/filter.c:2197 [inline]
__bpf_redirect+0x862/0x990 net/core/filter.c:2204
____bpf_clone_redirect net/core/filter.c:2487 [inline]
bpf_clone_redirect+0x20c/0x290 net/core/filter.c:2450
bpf_prog_53f18857bc887b09+0x22/0x2a
bpf_dispatcher_nop_func include/linux/bpf.h:1402 [inline]
__bpf_prog_run include/linux/filter.h:723 [inline]
bpf_prog_run include/linux/filter.h:730 [inline]
bpf_test_run+0x29d/0x620 net/bpf/test_run.c:423
bpf_prog_test_run_skb+0x9a4/0xef0 net/bpf/test_run.c:1159
bpf_prog_test_run+0x204/0x340 kernel/bpf/syscall.c:4721
__sys_bpf+0x52e/0x7e0 kernel/bpf/syscall.c:6246
__do_sys_bpf kernel/bpf/syscall.c:6341 [inline]
__se_sys_bpf kernel/bpf/syscall.c:6339 [inline]
__x64_sys_bpf+0x41/0x50 kernel/bpf/syscall.c:6339
x64_sys_call+0x10cb/0x3020 arch/x86/include/generated/asm/syscalls_64.h:322
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x12c/0x370 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f

value changed: 0xffff888104a93a00 -> 0x0000000000000000

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 UID: 0 PID: 22632 Comm: syz.0.4135 Tainted: G W syzkaller #0
PREEMPT(full)
Tainted: [W]=WARN
Hardware name: Google Google Compute Engine/Google Compute Engine,
BIOS Google 01/24/2026

There is no race on ring accesses: reading/writing a partial pointer
would be fine, because the reading is done by the producer
which merely cares about NULL/non NULL.
Document and disable the warnings using data_race().

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/dd3984b3bce9df3591927f927668cb31cc7ecf34.1774460059.git.mst@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

arm64: tegra: Add Jetson AGX Thor Developer Kit support

Add basic support for the Jetson AGX Thor Developer Kit. It's quite
similar to the existing reference platform but has a slightly different
carrier board with different mass storage options and I/O.

Signed-off-by: Thierry Reding <treding@nvidia.com>

dt-bindings: arm: tegra: Document Jetson AGX Thor DevKit

The Jetson AGX Thor Developer Kit uses the same module (P3834) as the
P3971 reference platform but a slightly different carrier board (P4071).

Acked-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Thierry Reding <treding@nvidia.com>

soc/tegra: pmc: Add IO pads for Tegra264

Populate the IO pads and pins for Tegra264. Tegra264 has internal 1.8V
and 0.6V regulators that must be enabled when selecting the 1.8V mode
for the sdmmc1-hv IO pad. To support this a new 'ena_1v8' member is
added to the 'tegra_io_pad_vctrl' structure to populate the bits that
need to be set to enable these internal regulators. Although this is
enabling 1.8V (bit 1) and 0.6V (bit 2) regulators, it is simply called
'ena_1v8' because these are both enabled for 1.8V operation. Note that
these internal regulators are disabled when not using 1.8V mode.

Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>

soc/tegra: pmc: Rename has_impl_33v_pwr flag

The flag 'has_impl_33v_pwr' is now only used to determine if we need to
set the write-enable bit before we can set the bit to select if 3.3V IO
is used or not. Therefore, rename the flag to 'has_io_pad_wren' to
indicate that the SoC supports the write-enable register.

Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>

soc/tegra: pmc: Refactor IO pad voltage control

For Tegra devices, only a subset of IO pads can be configured for 1.8V
or 3.3V. Therefore, in the 'tegra_io_pad_soc' structure for Tegra SoCs
either all or most of the 'voltage' entries are set to UINT_MAX to
indicate the IO pad voltage cannot be configured. So for the majority of
IO pads this configuration is not applicable. However, refactoring the
IO pad data to move this parameter into a separate structure does not
make sense because the benefits are marginal.

Support for the Tegra264 IO pads is currently missing and the control
for configuring the voltage for the IO pads for Tegra264 has changed.
Instead of having a single register that is used for setting the IO pad
voltage for all IO pads, there is now a register associated with the
specific IO pad. For Tegra264, there is now only one IO pad that can be
configured for 1.8V or 3.3V which is the sdmmc1-hv. While we could make
this work with by adding a new SoC flag, the implementation will be a
bit cumbersome. Therefore, it now seems reasonable to refactor the IO
pad code. Hence, introduce a new 'tegra_io_pad_vctrl' structure that
contains the register offset and bit for enabling/disabling 3.3V mode
and move the existing voltage control data for supported SoCs to this
structure. This has an added benefit of simplifying the code in the
functions tegra_io_pad_get_voltage and tegra_io_pad_set_voltage.

Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>

soc/tegra: pmc: Add Tegra264 wake events

Populate the various wake events for the Tegra264 device.

Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>

soc/tegra: pmc: Add AOWAKE regs for Tegra264

Populate the AOWAKE register offsets for Tegra264.

Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>

soc/tegra: pmc: Add support for SoC specific AOWAKE offsets

For Tegra264, some of the AOWAKE registers have different register
offsets. Prepare for adding the Tegra264 AOWAKE register by moving the
offsets for the AOWAKE registers that are different for Tegra264 into
the 'tegra_pmc_regs' structure and populate these offsets for the SoCs
that support these registers.

Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>

soc/tegra: pmc: Remove unused AOWAKE definitions

For Tegra264, the offsets for the AOWAKE registers have changed. Before
adding support for the Tegra264 AOWAKE register offsets, remove the
unused AOWAKE definitions.

Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>

soc/tegra: pmc: Add kerneldoc for wake-up variables

Commit e6d96073af68 ("soc/tegra: pmc: Fix unsafe generic_handle_irq()
call") added the variables 'wake_work' and 'wake_status' to the
'tegra_pmc' structure but did not add the associated kerneldoc for these
new variables. Add the kerneldoc for these variables.

Fixes: e6d96073af68 ("soc/tegra: pmc: Fix unsafe generic_handle_irq() call")
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>

soc/tegra: pmc: Correct function names in kerneldoc

Commit 70f752ebb08c ("soc/tegra: pmc: Add PMC contextual functions")
added the functions devm_tegra_pmc_get() and
tegra_pmc_io_pad_power_enable(), but the names of the functions in the
associated kerneldoc is incorrect. Update the kerneldoc for these
functions to correct their names.

Fixes: 70f752ebb08c ("soc/tegra: pmc: Add PMC contextual functions")
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>

soc/tegra: pmc: Add kerneldoc for reboot notifier

Commit 48b7f802fb78 ("soc/tegra: pmc: Embed reboot notifier in PMC
context") added the reboot_notifier structure to the PMC SoC structure
but did not update the kerneldoc accordingly. Add this missing kerneldoc
description to fix this.

Fixes: 48b7f802fb78 ("soc/tegra: pmc: Embed reboot notifier in PMC context")
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>

soc/tegra: common: Add Tegra114 support to devm_tegra_core_dev_init_opp_table

Determine the Tegra114 hardware version using the SoC Speedo ID bit macro,
mirroring the approach already used for Tegra30 and Tegra124.

Signed-off-by: Svyatoslav Ryhel <clamor95@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>

soc/tegra: pmc: Enable core domain support for Tegra114

Enable core domain support for Tegra114 since now it has power domains
fully configured.

Signed-off-by: Svyatoslav Ryhel <clamor95@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>

ARM: tegra: paz00: Configure WiFi rfkill switch through device tree

As of d64c732dfc9e ("net: rfkill: gpio: add DT support") rfkill-gpio
device can be instantiated via device tree.

Add the declaration there and drop board-paz00.c file and relevant
Makefile fragments.

Tested-by: Marc Dietrich <marvin24@gmx.de>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>

s390/entry: Scrub r12 register on kernel entry

Before commit f33f2d4c7c80 ("s390/bp: remove TIF_ISOLATE_BP"),
all entry handlers loaded r12 with the current task pointer
(lg %r12,__LC_CURRENT) for use by the BPENTER/BPEXIT macros. That
commit removed TIF_ISOLATE_BP, dropping both the branch prediction
macros and the r12 load, but did not add r12 to the register clearing
sequence.

Add the missing xgr %r12,%r12 to make the register scrub consistent
across all entry points.

Fixes: f33f2d4c7c80 ("s390/bp: remove TIF_ISOLATE_BP")
Cc: stable@kernel.org
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>

s390/syscalls: Add spectre boundary for syscall dispatch table

The s390 syscall number is directly controlled by userspace, but does
not have an array_index_nospec() boundary to prevent access past the
syscall function pointer tables.

Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Fixes: 56e62a737028 ("s390: convert to generic entry")
Cc: stable@kernel.org
Assisted-by: gkh_clanker_2000
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/2026032404-sterling-swoosh-43e6@gregkh
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>

s390/barrier: Make array_index_mask_nospec() __always_inline

Mark array_index_mask_nospec() as __always_inline to guarantee the
mitigation is emitted inline regardless of compiler inlining decisions.

Fixes: e2dd833389cc ("s390: add optimized array_index_mask_nospec")
Cc: stable@kernel.org
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>

ARM: tegra: transformers: Add connector node

All ASUS Transformers have micro-HDMI connector directly available.
After Tegra HDMI got bridge/connector support, we should use connector
framework for proper HW description.

Tested-by: Andreas Westman Dorcsak <hedmoo@yahoo.com> # ASUS TF T30
Tested-by: Robert Eckelmann <longnoserob@gmail.com> # ASUS TF101 T20
Tested-by: Svyatoslav Ryhel <clamor95@gmail.com> # ASUS TF201 T30
Signed-off-by: Svyatoslav Ryhel <clamor95@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>

Merge tag 'spi-fix-v7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi

Pull spi fixes from Mark Brown:
"There are two core fixes here. One is from Johan dealing with an issue
  introduced by a devm_ API usage update causing things to be freed
  earlier than they had earlier when we fail to register a device,
  another from Danilo avoids unlocked acccess to data by converting to
  use a driver core API.

  We also have a few relatively minor driver specific fixes"

* tag 'spi-fix-v7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: spi-fsl-lpspi: fix teardown order issue (UAF)
  spi: fix use-after-free on managed registration failure
  spi: use generic driver_override infrastructure
  spi: meson-spicc: Fix double-put in remove path
  spi: sn-f-ospi: Use devm_mutex_init() to simplify code
  spi: sn-f-ospi: Fix resource leak in f_ospi_probe()

Merge tag 'regulator-fix-v7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator fix from Mark Brown:
"A fix from Alice for the rust bindings, they didn't handle the stub
  implementation of the C API used when CONFIG_REGULATOR is disabled
  leading to undefined behaviour"

* tag 'regulator-fix-v7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  rust: regulator: do not assume that regulator_get() returns non-null

Merge tag 'regmap-fix-v7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap

Pull regmap fix from Mark Brown:
"A fix from Andy Shevchenko for an issue with caching of page selector
registers which are located inside the page they are switching"

* tag 'regmap-fix-v7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: Synchronize cache for the page selector

ASoC: Intel: avs: Include CPUID header at file scope

Commit

    cbe37a4d2b3c ("ASoC: Intel: avs: Configure basefw on TGL-based platforms")

includes the main CPUID header from within a C function.  This works by
luck and forbids valid refactoring inside that header.

Include the CPUID header at file scope instead.

Remove the COMPILE_TEST build flag so that the CONFIG_X86 conditionals can
be removed.  The driver gets enough compilation testing already on x86.

For clarity, refactor the CPUID(0x15) code into its own function without
changing any of the driver's logic.

Fixes: cbe37a4d2b3c ("ASoC: Intel: avs: Configure basefw on TGL-based platforms")
Suggested-by: Borislav Petkov <bp@alien8.de> # CONFIG_X86 removal
Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Cezary Rojewski <cezary.rojewski@intel.com>
Link: https://lore.kernel.org/all/20250612234010.572636-3-darwi@linutronix.de

ASoC: Intel: avs: Check maximum valid CPUID leaf

The Intel AVS driver queries CPUID(0x15) before checking if the CPUID leaf
is available. Check the maximum-valid CPU standard leaf beforehand.

Use the CPUID_LEAF_TSC macro instead of the custom local one for the
CPUID(0x15) leaf number.

Fixes: cbe37a4d2b3c ("ASoC: Intel: avs: Configure basefw on TGL-based platforms")
Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Cezary Rojewski <cezary.rojewski@intel.com>
Link: https://patch.msgid.link/20260327021645.555257-2-darwi@linutronix.de

Merge tag 'tsm-fixes-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/devsec/tsm

Pull tsm fix from Dan Williams:

- Fix a VMM controlled buffer length used to emit TDX attestation
reports

* tag 'tsm-fixes-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/devsec/tsm:
virt: tdx-guest: Fix handling of host controlled 'quote' buffer length

Merge tag 'vfio-v7.0-rc6' of https://github.com/awilliam/linux-vfio

Pull VFIO fix from Alex Williamson:

- Fix double-free and reference count underflow if dma-buf file
allocation fails (Alex Williamson)

* tag 'vfio-v7.0-rc6' of https://github.com/awilliam/linux-vfio:
vfio/pci: Fix double free in dma-buf feature

Merge tag 'efi-fixes-for-v7.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi

Pull EFI fix from Ard Biesheuvel:
"Fix a potential buffer overrun issue introduced by the previous fix
for EFI boot services region reservations on x86"

* tag 'efi-fixes-for-v7.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
x86/efi: efi_unmap_boot_services: fix calculation of ranges_to_free size

Merge tag 'loongarch-fixes-7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson

Pull LoongArch fixes from Huacai Chen:
"Fix missing NULL checks for kstrdup(), workaround LS2K/LS7A GPU
  DMA hang bug, emit GNU_EH_FRAME for vDSO correctly, and fix some
  KVM-related bugs"

* tag 'loongarch-fixes-7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
  LoongArch: KVM: Fix base address calculation in kvm_eiointc_regs_access()
  LoongArch: KVM: Handle the case that EIOINTC's coremap is empty
  LoongArch: KVM: Make kvm_get_vcpu_by_cpuid() more robust
  LoongArch: vDSO: Emit GNU_EH_FRAME correctly
  LoongArch: Workaround LS2K/LS7A GPU DMA hang bug
  LoongArch: Fix missing NULL checks for kstrdup()

Merge tag 'io_uring-7.0-20260327' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

Pull io_uring fixes from Jens Axboe:
"Just two small fixes, both fixing regressions added in the fdinfo code
  in 6.19 with the SQE mixed size support"

* tag 'io_uring-7.0-20260327' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  io_uring/fdinfo: fix OOB read in SQE_MIXED wrap check
  io_uring/fdinfo: fix SQE_MIXED SQE displaying

Merge tag 'mediatek-drm-fixes-20260323' of https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-fixes

Mediatek DRM Fixes - 20260323

1. dsi: Store driver data before invoking mipi_dsi_host_register

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Link: https://patch.msgid.link/20260323160135.39609-1-chunkuang.hu@kernel.org

KVM: x86/mmu: Only WARN in direct MMUs when overwriting shadow-present SPTE

Adjust KVM's sanity check against overwriting a shadow-present SPTE with a
another SPTE with a different target PFN to only apply to direct MMUs,
i.e. only to MMUs without shadowed gPTEs.  While it's impossible for KVM
to overwrite a shadow-present SPTE in response to a guest write, writes
from outside the scope of KVM, e.g. from host userspace, aren't detected
by KVM's write tracking and so can break KVM's shadow paging rules.

  ------------[ cut here ]------------
  pfn != spte_to_pfn(*sptep)
  WARNING: arch/x86/kvm/mmu/mmu.c:3069 at mmu_set_spte+0x1e4/0x440 [kvm], CPU#0: vmx_ept_stale_r/872
  Modules linked in: kvm_intel kvm irqbypass
  CPU: 0 UID: 1000 PID: 872 Comm: vmx_ept_stale_r Not tainted 7.0.0-rc2-eafebd2d2ab0-sink-vm #319 PREEMPT
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:mmu_set_spte+0x1e4/0x440 [kvm]
  Call Trace:
   <TASK>
   ept_page_fault+0x535/0x7f0 [kvm]
   kvm_mmu_do_page_fault+0xee/0x1f0 [kvm]
   kvm_mmu_page_fault+0x8d/0x620 [kvm]
   vmx_handle_exit+0x18c/0x5a0 [kvm_intel]
   kvm_arch_vcpu_ioctl_run+0xc55/0x1c20 [kvm]
   kvm_vcpu_ioctl+0x2d5/0x980 [kvm]
   __x64_sys_ioctl+0x8a/0xd0
   do_syscall_64+0xb5/0x730
   entry_SYSCALL_64_after_hwframe+0x4b/0x53
   </TASK>
  ---[ end trace 0000000000000000 ]---

Fixes: 11d45175111d ("KVM: x86/mmu: Warn if PFN changes on shadow-present SPTE in shadow MMU")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>

KVM: x86/mmu: Drop/zap existing present SPTE even when creating an MMIO SPTE

When installing an emulated MMIO SPTE, do so *after* dropping/zapping the
existing SPTE (if it's shadow-present).  While commit a54aa15c6bda3 was
right about it being impossible to convert a shadow-present SPTE to an
MMIO SPTE due to a _guest_ write, it failed to account for writes to guest
memory that are outside the scope of KVM.

E.g. if host userspace modifies a shadowed gPTE to switch from a memslot
to emulted MMIO and then the guest hits a relevant page fault, KVM will
install the MMIO SPTE without first zapping the shadow-present SPTE.

  ------------[ cut here ]------------
  is_shadow_present_pte(*sptep)
  WARNING: arch/x86/kvm/mmu/mmu.c:484 at mark_mmio_spte+0xb2/0xc0 [kvm], CPU#0: vmx_ept_stale_r/4292
  Modules linked in: kvm_intel kvm irqbypass
  CPU: 0 UID: 1000 PID: 4292 Comm: vmx_ept_stale_r Not tainted 7.0.0-rc2-eafebd2d2ab0-sink-vm #319 PREEMPT
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:mark_mmio_spte+0xb2/0xc0 [kvm]
  Call Trace:
   <TASK>
   mmu_set_spte+0x237/0x440 [kvm]
   ept_page_fault+0x535/0x7f0 [kvm]
   kvm_mmu_do_page_fault+0xee/0x1f0 [kvm]
   kvm_mmu_page_fault+0x8d/0x620 [kvm]
   vmx_handle_exit+0x18c/0x5a0 [kvm_intel]
   kvm_arch_vcpu_ioctl_run+0xc55/0x1c20 [kvm]
   kvm_vcpu_ioctl+0x2d5/0x980 [kvm]
   __x64_sys_ioctl+0x8a/0xd0
   do_syscall_64+0xb5/0x730
   entry_SYSCALL_64_after_hwframe+0x4b/0x53
  RIP: 0033:0x47fa3f
   </TASK>
  ---[ end trace 0000000000000000 ]---

Reported-by: Alexander Bulekov <bkov@amazon.com>
Debugged-by: Alexander Bulekov <bkov@amazon.com>
Suggested-by: Fred Griffoul <fgriffo@amazon.co.uk>
Fixes: a54aa15c6bda3 ("KVM: x86/mmu: Handle MMIO SPTEs directly in mmu_set_spte()")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>

Merge tag 'kvm-s390-master-7.0-2' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

KVM: s390: More memory management fixes

Lots of small and not-so-small fixes for the newly rewritten gmap,
mostly affecting the handling of nested guests.

EDAC/ie31200: Make rpl_s_cfg static

The rpl_s_cfg variable is only used within this file, so mark it static, as
Sparse reports:

  drivers/edac/ie31200_edac.c:709:19: warning: symbol 'rpl_s_cfg' was not
  declared. Should it be static?

  [ bp: Massage commit message. ]

Signed-off-by: Marilene Andrade Garcia <marilene.agarcia@gmail.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://patch.msgid.link/f353c3aff47abd88bafc46a1c0c5eb9917e11b28.1774619301.git.marilene.agarcia@gmail.com

dm: provide helper to set stacked limits

There are multiple device mappers that set up their stacking limits
exactly the same for the logical, physical and minimum IO queue limits.
Provide a helper for it.

Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

dm-integrity: always set the io hints

Don't depend on the defaults to be what is desired if the integrity
device was set up with 512b sector size. Always set the queue limits to
be at least what the device mapper wants.

Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

dm-integrity: fix mismatched queue limits

A user can integritysetup a device with a backing device using a 4k
logical block size, but request the dm device use 1k or 2k. This
mismatch creates an inconsistency such that the dm device would report
limits for IO that it can't actually execute. Fix this by using the
backing device's limits if they are larger.

Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

hfsplus: extract hidden directory search into a helper function

In hfsplus_fill_super(), the process of looking up the hidden directory
involves initializing a catalog search, building a search key, reading
the b-tree record, and releasing the search data.

Currently, this logic is open-coded directly within the main superblock
initialization routine. This makes hfsplus_fill_super() quite lengthy
and its error handling paths less straightforward.

Extract the hidden directory search sequence into a new helper function,
hfsplus_get_hidden_dir_entry(). This improves overall code readability,
cleanly encapsulates the hfs_find_data lifecycle, and simplifies the
error exits in hfsplus_fill_super().

Signed-off-by: Zilin Guan <zilin@seu.edu.cn>
Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com>
Tested-by: Viacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>

hfsplus: fix held lock freed on hfsplus_fill_super()

hfsplus_fill_super() calls hfs_find_init() to initialize a search
structure, which acquires tree->tree_lock. If the subsequent call to
hfsplus_cat_build_key() fails, the function jumps to the out_put_root
error label without releasing the lock. The later cleanup path then
frees the tree data structure with the lock still held, triggering a
held lock freed warning.

Fix this by adding the missing hfs_find_exit(&fd) call before jumping
to the out_put_root error label. This ensures that tree->tree_lock is
properly released on the error path.

The bug was originally detected on v6.13-rc1 using an experimental
static analysis tool we are developing, and we have verified that the
issue persists in the latest mainline kernel. The tool is specifically
designed to detect memory management issues. It is currently under active
development and not yet publicly available.

We confirmed the bug by runtime testing under QEMU with x86_64 defconfig,
lockdep enabled, and CONFIG_HFSPLUS_FS=y. To trigger the error path, we
used GDB to dynamically shrink the max_unistr_len parameter to 1 before
hfsplus_asc2uni() is called. This forces hfsplus_asc2uni() to naturally
return -ENAMETOOLONG, which propagates to hfsplus_cat_build_key() and
exercises the faulty error path. The following warning was observed
during mount:

=========================
WARNING: held lock freed!
7.0.0-rc3-00016-gb4f0dd314b39 #4 Not tainted
-------------------------
mount/174 is freeing memory ffff888103f92000-ffff888103f92fff, with a lock still held there!
ffff888103f920b0 (&tree->tree_lock){+.+.}-{4:4}, at: hfsplus_find_init+0x154/0x1e0
2 locks held by mount/174:
#0: ffff888103f960e0 (&type->s_umount_key#42/1){+.+.}-{4:4}, at: alloc_super.constprop.0+0x167/0xa40
#1: ffff888103f920b0 (&tree->tree_lock){+.+.}-{4:4}, at: hfsplus_find_init+0x154/0x1e0

stack backtrace:
CPU: 2 UID: 0 PID: 174 Comm: mount Not tainted 7.0.0-rc3-00016-gb4f0dd314b39 #4 PREEMPT(lazy)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x82/0xd0
debug_check_no_locks_freed+0x13a/0x180
kfree+0x16b/0x510
? hfsplus_fill_super+0xcb4/0x18a0
hfsplus_fill_super+0xcb4/0x18a0
? __pfx_hfsplus_fill_super+0x10/0x10
? srso_return_thunk+0x5/0x5f
? bdev_open+0x65f/0xc30
? srso_return_thunk+0x5/0x5f
? pointer+0x4ce/0xbf0
? trace_contention_end+0x11c/0x150
? __pfx_pointer+0x10/0x10
? srso_return_thunk+0x5/0x5f
? bdev_open+0x79b/0xc30
? srso_return_thunk+0x5/0x5f
? srso_return_thunk+0x5/0x5f
? vsnprintf+0x6da/0x1270
? srso_return_thunk+0x5/0x5f
? __mutex_unlock_slowpath+0x157/0x740
? __pfx_vsnprintf+0x10/0x10
? srso_return_thunk+0x5/0x5f
? srso_return_thunk+0x5/0x5f
? mark_held_locks+0x49/0x80
? srso_return_thunk+0x5/0x5f
? srso_return_thunk+0x5/0x5f
? irqentry_exit+0x17b/0x5e0
? trace_irq_disable.constprop.0+0x116/0x150
? __pfx_hfsplus_fill_super+0x10/0x10
? __pfx_hfsplus_fill_super+0x10/0x10
get_tree_bdev_flags+0x302/0x580
? __pfx_get_tree_bdev_flags+0x10/0x10
? vfs_parse_fs_qstr+0x129/0x1a0
? __pfx_vfs_parse_fs_qstr+0x3/0x10
vfs_get_tree+0x89/0x320
fc_mount+0x10/0x1d0
path_mount+0x5c5/0x21c0
? __pfx_path_mount+0x10/0x10
? trace_irq_enable.constprop.0+0x116/0x150
? trace_irq_enable.constprop.0+0x116/0x150
? srso_return_thunk+0x5/0x5f
? srso_return_thunk+0x5/0x5f
? kmem_cache_free+0x307/0x540
? user_path_at+0x51/0x60
? __x64_sys_mount+0x212/0x280
? srso_return_thunk+0x5/0x5f
__x64_sys_mount+0x212/0x280
? __pfx___x64_sys_mount+0x10/0x10
? srso_return_thunk+0x5/0x5f
? trace_irq_enable.constprop.0+0x116/0x150
? srso_return_thunk+0x5/0x5f
do_syscall_64+0x111/0x680
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7ffacad55eae
Code: 48 8b 0d 85 1f 0f 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00 00 8
RSP: 002b:00007fff1ab55718 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ffacad55eae
RDX: 000055740c64e5b0 RSI: 000055740c64e630 RDI: 000055740c651ab0
RBP: 000055740c64e380 R08: 0000000000000000 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 000055740c64e5b0 R14: 000055740c651ab0 R15: 000055740c64e380
</TASK>

After applying this patch, the warning no longer appears.

Fixes: 89ac9b4d3d1a ("hfsplus: fix longname handling")
CC: stable@vger.kernel.org
Signed-off-by: Zilin Guan <zilin@seu.edu.cn>
Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com>
Tested-by: Viacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>

perf tools: Add --pmu-filter option for filtering PMUs

This patch adds a new --pmu-filter option to perf-stat command to allow
filtering events on specific PMUs. This is useful when there are
multiple PMUs with same type (e.g. hisi_sicl2_cpa0 and hisi_sicl0_cpa0).

[root@localhost tmp]# perf stat -M cpa_p0_avg_bw
Performance counter stats for 'system wide':

    19,417,779,115      hisi_sicl0_cpa0/cpa_cycles/      #     0.00 cpa_p0_avg_bw
                 0      hisi_sicl0_cpa0/cpa_p0_wr_dat/
                 0      hisi_sicl0_cpa0/cpa_p0_rd_dat_64b/
                 0      hisi_sicl0_cpa0/cpa_p0_rd_dat_32b/
    19,417,751,103      hisi_sicl10_cpa0/cpa_cycles/     #     0.00 cpa_p0_avg_bw
                 0      hisi_sicl10_cpa0/cpa_p0_wr_dat/
                 0      hisi_sicl10_cpa0/cpa_p0_rd_dat_64b/
                 0      hisi_sicl10_cpa0/cpa_p0_rd_dat_32b/
    19,417,730,679      hisi_sicl2_cpa0/cpa_cycles/      #     0.31 cpa_p0_avg_bw
        75,635,749      hisi_sicl2_cpa0/cpa_p0_wr_dat/
        18,520,640      hisi_sicl2_cpa0/cpa_p0_rd_dat_64b/
                 0      hisi_sicl2_cpa0/cpa_p0_rd_dat_32b/
    19,417,674,227      hisi_sicl8_cpa0/cpa_cycles/      #     0.00 cpa_p0_avg_bw
                 0      hisi_sicl8_cpa0/cpa_p0_wr_dat/
                 0      hisi_sicl8_cpa0/cpa_p0_rd_dat_64b/
                 0      hisi_sicl8_cpa0/cpa_p0_rd_dat_32b/

      19.417734480 seconds time elapsed

[root@localhost tmp]# perf stat --pmu-filter hisi_sicl2_cpa0 -M cpa_p0_avg_bw
Performance counter stats for 'system wide':

     6,234,093,559      cpa_cycles                       #     0.60 cpa_p0_avg_bw
        50,548,465      cpa_p0_wr_dat
         7,552,182      cpa_p0_rd_dat_64b
                 0      cpa_p0_rd_dat_32b

       6.234139320 seconds time elapsed

Signed-off-by: Qinxin Xia <xiaqinxin@huawei.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

gpu: nova-core: use sized array for GSP log buffers

Switch LogBuffer from Coherent<[u8]> (unsized) to
Coherent<[u8; LOG_BUFFER_SIZE]> (sized). The buffer size is a
compile-time constant (RM_LOG_BUFFER_NUM_PAGES * GSP_PAGE_SIZE), so a
fixed-size array is more precise and avoids the need for the runtime
length parameter of zeroed_slice().

Acked-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Link: https://patch.msgid.link/20260325003921.3420-3-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>

rust: dma: generalize BinaryWriter impl for Coherent<T>

Generalize the BinaryWriter implementation from Coherent<[u8]> to
Coherent<T> where T: KnownSize + AsBytes + ?Sized. The implementation
only uses size() and write_dma(), neither of which depends on the
inner type being a byte slice.

This allows any Coherent allocation with an AsBytes inner type to be
exposed as a debugfs binary file.

Acked-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Link: https://patch.msgid.link/20260325003921.3420-2-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>

rust: uaccess: generalize write_dma() to accept any Coherent<T>

Generalize write_dma() from &Coherent<[u8]> to &Coherent<T> where
T: KnownSize + AsBytes + ?Sized. The function body only uses as_ptr()
and size(), which work for any such T, so there is no reason to
restrict it to byte slices.

Acked-by: Miguel Ojeda <ojeda@kernel.org>
Acked-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://patch.msgid.link/20260325003921.3420-1-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>

rust: drm: gem: shmem: Add DRM shmem helper abstraction

The DRM shmem helper includes common code useful for drivers which
allocate GEM objects as anonymous shmem. Add a Rust abstraction for
this. Drivers can choose the raw GEM implementation or the shmem layer,
depending on their needs.

Signed-off-by: Asahi Lina <lina@asahilina.net>
Signed-off-by: Daniel Almeida <daniel.almeida@collabora.com>
Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Janne Grunau <j@jananu.net>
Tested-by: Deborah Brouwer <deborah.brouwer@collabora.com>
Link: https://patch.msgid.link/20260316211646.650074-6-lyude@redhat.com
[ * DRM_GEM_SHMEM_HELPER is a tristate; when a module driver selects it,
    it becomes =m. The Rust kernel crate and its C helpers are always
    built into vmlinux and can't reference symbols from a module,
    causing link errors.

    Thus, add RUST_DRM_GEM_SHMEM_HELPER bool Kconfig that selects
    DRM_GEM_SHMEM_HELPER, forcing it built-in when Rust drivers need it;
    use cfg(CONFIG_RUST_DRM_GEM_SHMEM_HELPER) for the shmem module.

  * Add cfg_attr(not(CONFIG_RUST_DRM_GEM_SHMEM_HELPER), expect(unused))
    on pub(crate) use impl_aref_for_gem_obj and BaseObjectPrivate, so
    that unused warnings are suppressed when shmem is not enabled.

  * Enable const_refs_to_static (stabilized in 1.83) to prevent build
    errors with older compilers.

  * Use &raw const for bindings::drm_gem_shmem_vm_ops and add
    #[allow(unused_unsafe, reason = "Safe since Rust 1.82.0")].

  * Fix incorrect C Header path and minor spelling and formatting
    issues.

  * Drop shmem::Object::sg_table() as the current implementation is
    unsound.

    - Danilo ]
Signed-off-by: Danilo Krummrich <dakr@kernel.org>

rust: drm: gem: Add raw_dma_resv() function

For retrieving a pointer to the struct dma_resv for a given GEM object. We
also introduce it in a new trait, BaseObjectPrivate, which we automatically
implement for all gem objects and don't expose to users outside of the
crate.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Janne Grunau <j@jananu.net>
Tested-by: Janne Grunau <j@jannau.net>
Tested-by: Deborah Brouwer <deborah.brouwer@collabora.com>
Link: https://patch.msgid.link/20260316211646.650074-3-lyude@redhat.com
[ Fix incorrect reference in safety comment. - Danilo ]
Signed-off-by: Danilo Krummrich <dakr@kernel.org>

lib/crypto: chacha: Zeroize permuted_state before it leaves scope

Since the ChaCha permutation is invertible, the local variable
'permuted_state' is sufficient to compute the original 'state', and thus
the key, even after the permutation has been done.

While the kernel is quite inconsistent about zeroizing secrets on the
stack (and some prominent userspace crypto libraries don't bother at all
since it's not guaranteed to work anyway), the kernel does try to do it
as a best practice, especially in cases involving the RNG.

Thus, explicitly zeroize 'permuted_state' before it goes out of scope.

Fixes: c08d0e647305 ("crypto: chacha20 - Add a generic ChaCha20 stream cipher implementation")
Cc: stable@vger.kernel.org
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260326032920.39408-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma fixes from Jason Gunthorpe:

- Quite a few irdma bug fixes, several user triggerable

- Fix a 0 SMAC header in ionic

- Tolerate FW errors for RAAS in bng_re

- Don't UAF in efa when printing error events

- Better handle pool exhaustion in the new bvec paths

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/irdma: Harden depth calculation functions
  RDMA/irdma: Return EINVAL for invalid arp index error
  RDMA/irdma: Fix deadlock during netdev reset with active connections
  RDMA/irdma: Remove reset check from irdma_modify_qp_to_err()
  RDMA/irdma: Clean up unnecessary dereference of event->cm_node
  RDMA/irdma: Remove a NOP wait_event() in irdma_modify_qp_roce()
  RDMA/irdma: Update ibqp state to error if QP is already in error state
  RDMA/irdma: Initialize free_qp completion before using it
  RDMA/efa: Fix possible deadlock
  RDMA/rw: Fix MR pool exhaustion in bvec RDMA READ path
  RDMA/rw: Fall back to direct SGE on MR pool exhaustion
  RDMA/efa: Fix use of completion ctx after free
  RDMA/bng_re: Fix silent failure in HWRM version query
  RDMA/ionic: Preserve and set Ethernet source MAC after ib_ud_header_init()
  RDMA/irdma: Fix double free related to rereg_user_mr

Merge tag 'pci-v7.0-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull pci fixes from Bjorn Helgaas:

- Remove power-off from pwrctrl drivers since this is now done directly
   by the PCI controller drivers (Chen-Yu Tsai)

- Fix pwrctrl device node leak (Felix Gu)

- Document a TLP header decoder for AER log messages (Lukas Wunner)

* tag 'pci-v7.0-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
  Documentation: PCI: Document PCIe TLP Header decoder for AER messages
  PCI/pwrctrl: Fix pci_pwrctrl_is_required() device node leak
  PCI/pwrctrl: Do not power off on pwrctrl device removal

Merge tag 'sound-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
"This became slightly big partly due to my time off in the last week.
  But all changes are about device-specific fixes, so it should be
  safely applicable.

  ASoC:
   - Fix double free in sma1307
   - Fix uninitialized variables in simple-card-utils/imx-card
   - Address clock leaks and error propagation in ADAU1372
   - Add DMI quirks and ACP/SDW support for ASUS
   - Fix Intel CATPT DMA mask
   - Fix SOF topology parsing
   - Fix DT bindings for RK3576 SPDIF, STM32 SAI and WCD934x

  HD-audio:
   - Quirks for Lenovo, ASUS, and various HP models, as well as
     a speaker pop fix on Star Labs StarFighter
   - Revert MSI X870E Tomahawk denylist again

  USB-Audio:
   - Fix distorted audio on Focusrite Scarlett 2i2/2i4 1st Gen
   - Add iface reset quirk for AB17X
   - Update Qualcomm USB audio Kconfig dependencies and license

  Misc:
   - Fix minor compile warnings for firewire and asihpi drivers"

* tag 'sound-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (35 commits)
  Revert "ALSA: hda/intel: Add MSI X870E Tomahawk to denylist"
  ALSA: usb-audio: Add iface reset and delay quirk for AB17X USB Audio
  ALSA: hda/realtek: add HP Laptop 15-fd0xxx mute LED quirk
  ALSA: usb-audio: Exclude Scarlett 2i4 1st Gen from SKIP_IFACE_SETUP
  ALSA: hda/realtek: Add mute LED quirk for HP Pavilion 15-eg0xxx
  ALSA: hda/realtek - Fixed Speaker Mute LED for HP EliteBoard G1a platform
  ASoC: SOF: ipc4-topology: Allow bytes controls without initial payload
  ASoC: adau1372: Fix clock leak on PLL lock failure
  ASoC: adau1372: Fix unchecked clk_prepare_enable() return value
  ASoC: SDCA: fix finding wrong entity
  ASoC: SDCA: remove the max count of initialization table
  ASoC: codecs: wcd934x: fix typo in dt parsing
  ASoC: dt-bindings: stm32: Fix incorrect compatible string in stm32h7-sai match
  ASoC: Intel: catpt: Fix the device initialization
  ASoC: amd: acp: add ASUS HN7306EA quirk for legacy SDW machine
  ASoC: SOF: topology: reject invalid vendor array size in token parser
  ASoC: tas2781: Add null check for calibration data
  ALSA: asihpi: avoid write overflow check warning
  ASoC: fsl: imx-card: initialize playback_only and capture_only
  ASoC: simple-card-utils: Check value of is_playback_only and is_capture_only
  ...

Merge tag 'media/v7.0-6' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media fixes from Mauro Carvalho Chehab:

- uvcvideo may cause OOPS when out of memory

- remove a deadlock in the ccs driver

* tag 'media/v7.0-6' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
media: ccs: Avoid deadlock in ccs_init_state()
media: uvcvideo: Fix bug in error path of uvc_alloc_urb_buffers

Merge tag 'sysctl-7.00-fixes-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl

Pull sysctl fix from Joel Granados:
"Fix uninitialized variable error when writing to a sysctl bitmap

  Removed the possibility of returning an unjustified -EINVAL when
  writing to a sysctl bitmap"

* tag 'sysctl-7.00-fixes-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl:
  sysctl: fix uninitialized variable in proc_do_large_bitmap

rust: helpers: Add bindings/wrappers for dma_resv_lock

This is just for basic usage in the DRM shmem abstractions for implied
locking, not intended as a full DMA Reservation abstraction yet.

Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Asahi Lina <lina+kernel@asahilina.net>
Signed-off-by: Daniel Almeida <daniel.almeida@collabora.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Janne Grunau <j@jannau.net>
Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
Acked-by: David Airlie <airlied@gmail.com>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Link: https://patch.msgid.link/20260320-gpuvm-rust-v5-2-76fd44f17a87@google.com
Signed-off-by: Danilo Krummrich <dakr@kernel.org>

Merge tag 'xfs-fixes-7.0-rc6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs fixes from Carlos Maiolino:
"This includes a few important bug fixes, and some code refactoring
  that was necessary for one of the fixes"

* tag 'xfs-fixes-7.0-rc6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: remove file_path tracepoint data
  xfs: don't irele after failing to iget in xfs_attri_recover_work
  xfs: remove redundant validation in xlog_recover_attri_commit_pass2
  xfs: fix ri_total validation in xlog_recover_attri_commit_pass2
  xfs: close crash window in attr dabtree inactivation
  xfs: factor out xfs_attr3_leaf_init
  xfs: factor out xfs_attr3_node_entry_remove
  xfs: only assert new size for datafork during truncate extents
  xfs: annotate struct xfs_attr_list_context with __counted_by_ptr
  xfs: cleanup buftarg handling in XFS_IOC_VERIFY_MEDIA
  xfs: scrub: unlock dquot before early return in quota scrub
  xfs: refactor xfsaild_push loop into helper
  xfs: save ailp before dropping the AIL lock in push callbacks
  xfs: avoid dereferencing log items after push callbacks
  xfs: stop reclaim before pushing AIL during unmount

tracing: Fix potential deadlock in cpu hotplug with osnoise

The following sequence may leads deadlock in cpu hotplug:

    task1        task2        task3
    -----        -----        -----

mutex_lock(&interface_lock)

            [CPU GOING OFFLINE]

            cpus_write_lock();
            osnoise_cpu_die();
              kthread_stop(task3);
                wait_for_completion();

                      osnoise_sleep();
                        mutex_lock(&interface_lock);

cpus_read_lock();

[DEAD LOCK]

Fix by swap the order of cpus_read_lock() and mutex_lock(&interface_lock).

Cc: stable@vger.kernel.org
Cc: <mathieu.desnoyers@efficios.com>
Cc: <zhang.run@zte.com.cn>
Cc: <yang.tao172@zte.com.cn>
Cc: <ran.xiaokai@zte.com.cn>
Fixes: bce29ac9ce0bb ("trace: Add osnoise tracer")
Link: https://patch.msgid.link/20260326141953414bVSj33dAYktqp9Oiyizq8@zte.com.cn
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Luo Haiyang <luo.haiyang@zte.com.cn>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

ASoC: Intel: ehl_rt5660: remove unused macro definitions

DUAL_CHANNEL and NAME_SIZE macros are not being used (anymore) but the
macros are still defined. Remove them to clean up dead code.

Signed-off-by: Sachin Mokashi <sachin.mokashi@intel.com>
Link: https://patch.msgid.link/20260324163400.1276247-1-sachin.mokashi@intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: SDCA: fix the register to ctl value conversion for Q7.8 format

The division calculation should be implemented using signed integer format.
This patch changes mc->shift from an unsigned type to a signed integer during the calculation.

Fixes: 501efdcb3b3a ("ASoC: SDCA: Pull the Q7.8 volume helpers out of soc-ops")
Signed-off-by: Shuming Fan <shumingf@realtek.com>
Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://patch.msgid.link/20260327082331.2277498-1-shumingf@realtek.com
Signed-off-by: Mark Brown <broonie@kernel.org>

Merge tag 'v7.0-rc5-ksmbd-srv-fixes' of git://git.samba.org/ksmbd

Pull smb server fixes from Steve French:

- Fix out of bounds write

- Fix for better calculating max output buffers

- Fix memory leaks in SMB2/SMB3 lock

- Fix use after free

- Multichannel fix

* tag 'v7.0-rc5-ksmbd-srv-fixes' of git://git.samba.org/ksmbd:
  ksmbd: fix potencial OOB in get_file_all_info() for compound requests
  ksmbd: replace hardcoded hdr2_len with offsetof() in smb2_calc_max_out_buf_len()
  ksmbd: fix memory leaks and NULL deref in smb2_lock()
  ksmbd: fix use-after-free and NULL deref in smb_grant_oplock()
  ksmbd: do not expire session on binding failure

iommu/riscv: Fix signedness bug

The function platform_irq_count() returns negative error codes and
iommu->irqs_count is an unsigned integer, so the check
(iommu->irqs_count <= 0) is always impossible.

Make the return value of platform_irq_count() be assigned to ret, check
for error, and then assign iommu->irqs_count to ret.

Detected by Smatch:
drivers/iommu/riscv/iommu-platform.c:119 riscv_iommu_platform_probe() warn:
'iommu->irqs_count' unsigned <= 0

Signed-off-by: Ethan Tidmore <ethantidmore06@gmail.com>
Fixes: 5c0ebbd3c6c6 ("iommu/riscv: Add RISC-V IOMMU platform device driver")
Reviewed-by: Andrew Jones <andrew.jones@oss.qualcomm.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>

cxl/core: use cleanup.h for devm_cxl_add_dax_region

Cleanup the gotos in the function. No functional change.

Signed-off-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://patch.msgid.link/20260327020203.876122-4-gourry@gourry.net
Signed-off-by: Dave Jiang <dave.jiang@intel.com>

cxl/core/region: move dax region device logic into region_dax.c

core/region.c is overloaded with per-region control logic (pmem, dax,
sysram, etc). Move the CXL DAX region device infrastructure from
region.c into a new region_dax.c file.

This will also allow us to add additional dax-driver integration paths
that don't further dirty the core region.c logic.

No functional changes.

Signed-off-by: Gregory Price <gourry@gourry.net>
Co-developed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://patch.msgid.link/20260327020203.876122-3-gourry@gourry.net
Signed-off-by: Dave Jiang <dave.jiang@intel.com>

cxl/core/region: move pmem region driver logic into region_pmem.c

core/region.c is overloaded with per-region control logic (pmem, dax,
sysram, etc). Move the pmem region driver logic from region.c into
region_pmem.c make it clear that this code only applies to pmem regions.

No functional changes.

[ dj: Fixed up some tabbing issues, may be from original code. ]

Signed-off-by: Gregory Price <gourry@gourry.net>
Co-developed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://patch.msgid.link/20260327020203.876122-2-gourry@gourry.net
Signed-off-by: Dave Jiang <dave.jiang@intel.com>

ASoC: soc-core: remove unused dobj_list

commit 8a9782346dccd ("ASoC: topology: Add topology core")
added dobj_list to Component and Card, but Card side has
never been used. Remove it.

Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Link: https://patch.msgid.link/874im2xa98.wl-kuninori.morimoto.gx@renesas.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: Drop some unused GPIO includes

Linus Walleij <linusw@kernel.org> says:

This drops the unnecessary legacy includes from three
more codecs.

Link: https://patch.msgid.link/20260327-asoc-rt1318-v1-0-9fcecf868fda@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: ts3a227e: Drop unused include

The driver includes the legacy GPIO header <linux/gpio.h> but does
not use any symbols from it so drop the include.

Signed-off-by: Linus Walleij <linusw@kernel.org>
Link: https://patch.msgid.link/20260327-asoc-rt1318-v1-3-9fcecf868fda@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: nau8315: Drop unused include

The driver includes the legacy GPIO header <linux/gpio.h> but does
not use any symbols from it so drop the include. (It is already
using the consumer header as is proper.)

Signed-off-by: Linus Walleij <linusw@kernel.org>
Link: https://patch.msgid.link/20260327-asoc-rt1318-v1-2-9fcecf868fda@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: rt1318: Drop unused include

The driver includes the legacy GPIO header <linux/gpio.h> but does
not use any symbols from it so drop the include.

Signed-off-by: Linus Walleij <linusw@kernel.org>
Link: https://patch.msgid.link/20260327-asoc-rt1318-v1-1-9fcecf868fda@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>

sched_ext: Document why built-in DSQs are unsupported sources in scx_bpf_dsq_move_to_local()

Add a comment explaining the design intent behind rejecting built-in DSQs
(%SCX_DSQ_GLOBAL and %SCX_DSQ_LOCAL*) as sources. Local DSQs support
reenqueueing but the BPF scheduler cannot directly iterate or move tasks
from them. %SCX_DSQ_GLOBAL is similar but also doesn't support
reenqueueing because it maps to multiple per-node DSQs, making the scope
difficult to define.

Also annotate @dsq_id to make clear it must be a user-created DSQ.

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Cheng-Yang Chou <yphbchou0911@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

scx_central: Defer timer start to central dispatch to fix init error

scx_central currently assumes that ops.init() runs on the selected
central CPU and aborts otherwise. This is no longer true, as ops.init()
is invoked from the scx_enable_helper thread, which can run on any
CPU.

As a result, sched_setaffinity() from userspace doesn't work, causing
scx_central to fail when loading with:

[ 1985.319942] sched_ext: central: scx_central.bpf.c:314: init from non-central CPU
[ 1985.320317]    scx_exit+0xa3/0xd0
[ 1985.320535]    scx_bpf_error_bstr+0xbd/0x220
[ 1985.320840]    bpf_prog_3a445a8163fa8149_central_init+0x103/0x1ba
[ 1985.321073]    bpf__sched_ext_ops_init+0x40/0xa8
[ 1985.321286]    scx_root_enable_workfn+0x507/0x1650
[ 1985.321461]    kthread_worker_fn+0x260/0x940
[ 1985.321745]    kthread+0x303/0x3e0
[ 1985.321901]    ret_from_fork+0x589/0x7d0
[ 1985.322065]    ret_from_fork_asm+0x1a/0x30

DEBUG DUMP
===================================================================

central: root
scx_enable_help[134] triggered exit kind 1025:
  scx_bpf_error (scx_central.bpf.c:314: init from non-central CPU)

Fix this by:
- Defer bpf_timer_start() to the first dispatch on the central CPU.
- Initialize the BPF timer in central_init() and kick the central CPU
to guarantee entering the dispatch path on the central CPU immediately.
- Remove the unnecessary sched_setaffinity() call in userspace.

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Zhao Mengmeng <zhaomengmeng@kylinos.cn>
Signed-off-by: Tejun Heo <tj@kernel.org>

arm64: armv8_deprecated: Disable swp emulation when FEAT_LSUI present

The purpose of supporting LSUI is to eliminate PAN toggling. CPUs that
support LSUI are unlikely to support a 32-bit runtime. Rather than
emulating the SWP instruction using LSUI instructions in order to remove
PAN toggling, simply disable SWP emulation.

Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
[catalin.marinas@arm.com: some tweaks to the in-code comment]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

dax/hmem, cxl: Defer and resolve Soft Reserved ownership

The current probe time ownership check for Soft Reserved memory based
solely on CXL window intersection is insufficient. dax_hmem probing is not
always guaranteed to run after CXL enumeration and region assembly, which
can lead to incorrect ownership decisions before the CXL stack has
finished publishing windows and assembling committed regions.

Introduce deferred ownership handling for Soft Reserved ranges that
intersect CXL windows. When such a range is encountered during the
initial dax_hmem probe, schedule deferred work to wait for the CXL stack
to complete enumeration and region assembly before deciding ownership.

Once the deferred work runs, evaluate each Soft Reserved range
individually: if a CXL region fully contains the range, skip it and let
dax_cxl bind. Otherwise, register it with dax_hmem. This per-range
ownership model avoids the need for CXL region teardown and
alloc_dax_region() resource exclusion prevents double claiming.

Introduce a boolean flag dax_hmem_initial_probe to live inside device.c
so it survives module reload. Ensure dax_cxl defers driver registration
until dax_hmem has completed ownership resolution. dax_cxl calls
dax_hmem_flush_work() before cxl_driver_register(), which both waits for
the deferred work to complete and creates a module symbol dependency that
forces dax_hmem.ko to load before dax_cxl.

Co-developed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260322195343.206900-9-Smita.KoralahalliChannabasappa@amd.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>

cxl/region: Add helper to check Soft Reserved containment by CXL regions

Add a helper to determine whether a given Soft Reserved memory range is
fully contained within the committed CXL region.

This helper provides a primitive for policy decisions in subsequent
patches such as co-ordination with dax_hmem to determine whether CXL has
fully claimed ownership of Soft Reserved memory ranges.

Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20260322195343.206900-8-Smita.KoralahalliChannabasappa@amd.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>

dax: Track all dax_region allocations under a global resource tree

Introduce a global "DAX Regions" resource root and register each
dax_region->res under it via request_resource(). Release the resource on
dax_region teardown.

By enforcing a single global namespace for dax_region allocations, this
ensures only one of dax_hmem or dax_cxl can successfully register a
dax_region for a given range.

Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://patch.msgid.link/20260322195343.206900-7-Smita.KoralahalliChannabasappa@amd.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>

ASoC: dt-bindings: mediatek,mt8173-rt5650-rt5514: convert to DT schema

Convert the Mediatek MT8173 with RT5650 and RT5514 sound card
bindings to DT schema.

Signed-off-by: Khushal Chitturi <khushalchitturi@gmail.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Link: https://patch.msgid.link/20260327134649.31376-1-khushalchitturi@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>

dax/cxl, hmem: Initialize hmem early and defer dax_cxl binding

Move hmem/ earlier in the dax Makefile so that hmem_init() runs before
dax_cxl.

In addition, defer registration of the dax_cxl driver to a workqueue
instead of using module_cxl_driver(). This ensures that dax_hmem has
an opportunity to initialize and register its deferred callback and make
ownership decisions before dax_cxl begins probing and claiming Soft
Reserved ranges.

Mark the dax_cxl driver as PROBE_PREFER_ASYNCHRONOUS so its probe runs
out of line from other synchronous probing avoiding ordering
dependencies while coordinating ownership decisions with dax_hmem.

Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Tested-by: Tomasz Wolski <tomasz.wolski@fujitsu.com>
Link: https://patch.msgid.link/20260322195343.206900-6-Smita.KoralahalliChannabasappa@amd.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>

dax/hmem: Gate Soft Reserved deferral on DEV_DAX_CXL

Replace IS_ENABLED(CONFIG_CXL_REGION) with IS_ENABLED(CONFIG_DEV_DAX_CXL)
so that HMEM only defers Soft Reserved ranges when CXL DAX support is
enabled. This makes the coordination between HMEM and the CXL stack more
precise and prevents deferral in unrelated CXL configurations.

Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://patch.msgid.link/20260322195343.206900-5-Smita.KoralahalliChannabasappa@amd.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>

dax/hmem: Request cxl_acpi and cxl_pci before walking Soft Reserved ranges

Ensure cxl_acpi has published CXL Window resources before HMEM walks Soft
Reserved ranges.

Replace MODULE_SOFTDEP("pre: cxl_acpi") with an explicit, synchronous
request_module("cxl_acpi"). MODULE_SOFTDEP() only guarantees eventual
loading, it does not enforce that the dependency has finished init
before the current module runs. This can cause HMEM to start before
cxl_acpi has populated the resource tree, breaking detection of overlaps
between Soft Reserved and CXL Windows.

Also, request cxl_pci before HMEM walks Soft Reserved ranges. Unlike
cxl_acpi, cxl_pci attach is asynchronous and creates dependent devices
that trigger further module loads. Asynchronous probe flushing
(wait_for_device_probe()) is added later in the series in a deferred
context before HMEM makes ownership decisions for Soft Reserved ranges.

Add an additional explicit Kconfig ordering so that CXL_ACPI and CXL_PCI
must be initialized before DEV_DAX_HMEM. This prevents HMEM from consuming
Soft Reserved ranges before CXL drivers have had a chance to claim them.

Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Tested-by: Tomasz Wolski <tomasz.wolski@fujitsu.com>
Link: https://patch.msgid.link/20260322195343.206900-4-Smita.KoralahalliChannabasappa@amd.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>

Merge back earlier Intel thermal drivers updates for 7.1

dax/hmem: Factor HMEM registration into __hmem_register_device()

Separate the CXL overlap check from the HMEM registration path and keep
the platform-device setup in a dedicated __hmem_register_device().

This makes hmem_register_device() the policy entry point for deciding
whether a range should be deferred to CXL, while __hmem_register_device()
handles the HMEM registration flow.

No functional changes.

Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://patch.msgid.link/20260322195343.206900-3-Smita.KoralahalliChannabasappa@amd.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>

dax/bus: Use dax_region_put() in alloc_dax_region() error path

alloc_dax_region() calls kref_init() on the dax_region early in the
function, but the error path for sysfs_create_groups() failure uses
kfree() directly to free the dax_region. This bypasses the kref lifecycle.

Use dax_region_put() instead to handle kref lifecycle correctly.

Suggested-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://patch.msgid.link/20260322195343.206900-2-Smita.KoralahalliChannabasappa@amd.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>