Boris Burkov [Tue, 12 May 2026 16:55:28 +0000 (09:55 -0700)]
btrfs: swallow btrfs_record_squota_delta() ENOENT
I thought that it was likely I could harden squota deletion to the point
that it was impossible to end up with an extent accounted to a qgroup
outliving its qgroup. Several recent bugs have made me re-consider that
position.
Ultimately, this is a tradeoff between short term stability and long
term strictness, but I think given that there could be another layer of
bugs behind the 2-3 I just fixed, I would feel much more confident in
people using squotas if the risk was "your values can get a bit out of
whack which you can fix by deleting stuff or
disabling/re-enabling/repairing" vs "it will abort your filesystem".
As the final nail in the coffin, the Meta production kernel was lacking
earlier fixes from me and Qu regarding subvol qgroup lifetime, so this
is what we have been testing at scale. I think that, at least for now,
upstream should have the same extra layer of protection.
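The policy described above can be sketched in user-space C. This is only an illustration under stated assumptions: `struct qgroup`, `record_delta()` and `record_delta_tolerant()` are hypothetical stand-ins, not the btrfs structures or the body of btrfs_record_squota_delta().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a qgroup table; not the btrfs data structures. */
struct qgroup {
	unsigned long long id;
	long long usage;
};

/* Strict variant: returns -ENOENT when no qgroup with this id exists. */
static int record_delta(struct qgroup *tbl, size_t n,
			unsigned long long id, long long delta)
{
	assert(tbl != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].id == id) {
			tbl[i].usage += delta;
			return 0;
		}
	}
	return -ENOENT;
}

/*
 * Tolerant variant: a missing qgroup means the accounting may drift,
 * but it is not treated as a fatal error by the caller.
 */
static int record_delta_tolerant(struct qgroup *tbl, size_t n,
				 unsigned long long id, long long delta)
{
	int ret = record_delta(tbl, n, id, delta);

	if (ret == -ENOENT)
		return 0;
	return ret;
}
```

With the tolerant variant, a delta for an already deleted qgroup is dropped, while deltas for existing qgroups are applied as before.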
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Boris Burkov [Mon, 11 May 2026 20:06:24 +0000 (13:06 -0700)]
btrfs: clamp to avoid squota underflow
Simple quota accounting can undercount metadata tree block allocations
in certain scenarios. When an undercounted subvolume is deleted and its
tree blocks freed, the free deltas decrement rfer/excl past zero,
wrapping the u64 to a value near U64_MAX.
Once wrapped, can_delete_squota_qgroup() sees non-zero rfer and refuses
to delete the qgroup. The qgroup becomes permanently orphaned in the
quota tree, since there is no subvolume left to generate frees that
would bring the counter back to zero.
While we ultimately want to fix any mis-accounting at the source, it is
also helpful and worthwhile to mitigate the damage by clamping rfer and
excl to zero on underflow rather than allowing the u64 to wrap. This at
least allows us to clean up the messed up qgroups on subvol deletion.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Boris Burkov [Tue, 12 May 2026 02:53:46 +0000 (19:53 -0700)]
btrfs: fix squota accounting during enable generation
The first transaction that enables squotas is special and a bit tricky.
We have to set BTRFS_FS_QUOTA_ENABLED after the transaction to avoid a
deadlock, so any delayed refs that run before we set the bit are not
squota accounted. For data this is fine, we don't get an owner_ref, so
there is no real harm, it's as if the extent predated squotas. However
for metadata, the tree block will have gen == enable_gen so when we free
it later, we will decrement the squota accounting, which can result in
an underflow. Before it is freed, btrfs check shows errors, as we have
mismatched usage between the node generations/owners and the squota
values.
There are two angles to this fix:
1. For extents that come in delayed_refs that run during the
enable_gen transaction, we must actually set enable_gen to the *next*
transaction. That is the first transaction that we can really
properly account in any way.
2. For extents that come in between the end of our transaction handle
and the time we set the BTRFS_FS_QUOTA_ENABLED bit, we need an
additional bit, BTRFS_FS_SQUOTA_ENABLING which only affects recording
squota deltas, so we do pick up those extents. Otherwise, we would
miss them, even for enable_gen + 1.
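Point 1 can be modelled with a small sketch. It assumes that a delta is accounted only when the extent generation is at least enable_gen (the generation check named in the Fixes tag); the helper names are hypothetical, and point 2 (the extra enabling bit) is not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed rule: only extents created at or after enable_gen are accounted. */
static bool squota_should_account(uint64_t extent_gen, uint64_t enable_gen)
{
	return extent_gen >= enable_gen;
}

/* With the fix, the first accountable generation is the *next* transaction. */
static uint64_t squota_enable_gen(uint64_t enabling_transid)
{
	assert(enabling_transid < UINT64_MAX);
	return enabling_transid + 1;
}
```

If squotas are enabled in transaction 10, a tree block allocated in generation 10 was never counted. With enable_gen == 10 its later free would still be accounted and could underflow; with enable_gen == 11 the free is skipped, matching the uncounted allocation.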
Fixes: bd7c1ea3a302 ("btrfs: qgroup: check generation when recording simple quota delta")
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Boris Burkov [Mon, 11 May 2026 20:07:11 +0000 (13:07 -0700)]
btrfs: check for subvolume before deleting squota qgroup
The invariant that we want to maintain with subvolume qgroups is that
the qgroup can only be deleted if there is no root. With squotas, we
thought that it was sufficient to just check the usage, because we
assumed that deleting a subvolume will drive its qgroup's usage to 0,
and thus 0 usage implies no subvolume.
However, this is false, for two reasons:
- A subvol whose extents are all from before squotas were enabled.
- A subvol that was created in this transaction and for which we have
not yet run any delayed refs.
In both cases, deleting the qgroup breaks the desired invariant and we
are left with a subvolume with no qgroup but squotas are enabled.
Fix this by unifying the deletion check logic between full qgroups and
squotas. Squotas do all the same checks *and* the additional usage == 0
check, which is the one extra rule peculiar to squotas.
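The unified rule can be sketched as a single predicate. This is a simplified model, assuming the shared checks reduce to "no subvolume exists"; `struct qgroup_state` and `can_delete_qgroup()` are hypothetical names, not the kernel's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of the state the deletion check looks at. */
struct qgroup_state {
	bool subvol_exists;	/* a root for this qgroup id still exists */
	bool simple_quota;	/* filesystem is in squota mode */
	uint64_t rfer;
	uint64_t excl;
};

static bool can_delete_qgroup(const struct qgroup_state *q)
{
	assert(q != NULL);

	/* Shared rule: never delete the qgroup of a live subvolume. */
	if (q->subvol_exists)
		return false;

	/* Extra squota rule: usage must have dropped to zero. */
	if (q->simple_quota && (q->rfer != 0 || q->excl != 0))
		return false;

	return true;
}
```

A freshly created subvolume with zero usage is rejected by the first check, which is the case the usage-only test missed.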
Link: https://lore.kernel.org/linux-btrfs/adnBhWfJQ1n3hZC8@merlins.org/
Fixes: a8df35619948 ("btrfs: forbid deleting live subvol qgroup")
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Boris Burkov [Fri, 8 May 2026 20:11:26 +0000 (13:11 -0700)]
btrfs: always drop root->inodes lock before cond_resched()
find_first_inode() and find_first_inode_to_shrink() lock root->inodes,
then loop over them, occasionally skipping some inodes. When they skip
an inode, they attempt to share the cpu/lock with cond_resched_lock().
However, that has a subtle problem associated with it.
cond_resched_lock() only drops the lock if it needs to actually call
schedule(). With CONFIG_PREEMPT_NONE, this means the full timeslice as
detected at ticks. With 8+ cpus and default tunables, this is 2.8ms. So
regardless of HZ, we will run for at least 2.8ms in this loop without
dropping the lock, assuming it finds no suitable inodes. If HZ is
small enough, it might be even worse as the tick granularity becomes
bigger than the timeslice.
The knock-on effect of this is that callers to
btrfs_del_inode_from_root() like kswapd trying to shrink the inode slab
or userspace threads calling evict() will spin on xa_lock(&root->inodes)
for 2.8ms, so the extent map shrinker dominates the lock even though
ostensibly it is intending to share it. This produces memory pressure as
there is only one kswapd and it runs sequentially so it can get stuck in
the inode slab shrinking.
To fix it, simply replace cond_resched_lock() with an open coded variant
which unconditionally does unlock/lock around cond_resched. Sharing the
lock is decoupled from sharing the CPU, and all the users of the lock
now share it fairly.
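The open-coded pattern can be illustrated with a user-space analogue. This is a sketch under assumptions: `struct toy_lock` is a hypothetical C11 spinlock standing in for the xarray lock, sched_yield() stands in for cond_resched(), and the array scan stands in for the inode walk. The real code must also revalidate its iteration state after retaking the lock, which this toy does not need.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical toy spinlock; counts how often the scan hands the lock over. */
struct toy_lock {
	atomic_flag f;
	unsigned long handoffs;
};

static void toy_lock_init(struct toy_lock *l)
{
	atomic_flag_clear(&l->f);
	l->handoffs = 0;
}

static void toy_lock_acquire(struct toy_lock *l)
{
	while (atomic_flag_test_and_set_explicit(&l->f, memory_order_acquire))
		sched_yield();
}

static void toy_lock_release(struct toy_lock *l)
{
	atomic_flag_clear_explicit(&l->f, memory_order_release);
}

/* Returns the index of the first non-zero ("suitable") item, or -1. */
static long find_first_suitable(struct toy_lock *l, const int *items, size_t n)
{
	long found = -1;

	assert(l != NULL);
	toy_lock_acquire(l);
	for (size_t i = 0; i < n; i++) {
		if (items[i]) {
			found = (long)i;
			break;
		}
		/*
		 * Skipped item: always drop the lock, whether or not a
		 * reschedule is due, so that waiters can make progress.
		 */
		l->handoffs++;
		toy_lock_release(l);
		sched_yield();
		toy_lock_acquire(l);
	}
	toy_lock_release(l);
	return found;
}
```

The point of the design is that the lock is released once per skipped item, independently of whether the scheduler wants the CPU back.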
I was able to reproduce this on test systems by producing a lot of empty
files (to make a big root->inodes xarray), then producing memory
pressure by reading large files larger than ram, triggering kswapd and
the extent_map shrinker. The lock contention is visible with perf or
lockstat. This patch also relieved a user-apparent bottleneck on a
production system from the original report.
Tested-by: Rik van Riel <riel@surriel.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: David Sterba <dsterba@suse.com>
Robbie Ko [Fri, 8 May 2026 13:42:11 +0000 (21:42 +0800)]
btrfs: mark file extent range dirty after converting prealloc extents
When writing into a preallocated extent, ordered extent completion calls
btrfs_mark_extent_written() to convert the file extent item from the
BTRFS_FILE_EXTENT_PREALLOC type to the BTRFS_FILE_EXTENT_REG type.
If the preallocated extent was created beyond i_size with fallocate
keep-size, and the inode is evicted and loaded again before the write,
the inode's file_extent_tree is initialized only up to i_size.
The beyond i_size prealloc extent is therefore not tracked there.
After a write into that extent extends i_size, btrfs_mark_extent_written()
updates the file extent item, but the corresponding range is not marked
dirty in the inode's file_extent_tree.
This can leave disk_i_size stale when the filesystem does not use the
no-holes feature, so after remount the file size can go back to the old
value.
The following reproducer triggers the problem:
$ cat test.sh
#!/bin/bash
DEV=/dev/sdi
MNT=/mnt/sdi
mkfs.btrfs -f -O ^no-holes $DEV
mount $DEV $MNT
touch $MNT/file
fallocate -n -l 2M $MNT/file
umount $MNT
mount $DEV $MNT
dd if=/dev/zero of=$MNT/file bs=1M count=1 conv=notrunc
ls -lh $MNT/file
umount $MNT
mount $DEV $MNT
ls -lh $MNT/file
umount $MNT
Running the reproducer gives the following result:
Fix this by marking the written range dirty in the inode's
file_extent_tree after successfully converting the prealloc extent to a
regular extent.
Fixes: 9ddc959e802b ("btrfs: use the file extent tree infrastructure")
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Robbie Ko <robbieko@synology.com>
[ Minor change log updates ]
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Avinash Duduskar [Wed, 13 May 2026 09:22:53 +0000 (14:52 +0530)]
llc: avoid sparse cast-truncates warning in counter clamps
llc_conn_ac_inc_npta_value() and llc_conn_ac_inc_tx_win_size()
clamp their counters to the maximum valid 7-bit value via
(u8) ~LLC_2_SEQ_NBR_MODULO. LLC_2_SEQ_NBR_MODULO is defined as
((u8) 128) in include/net/llc_pdu.h, but the (u8) cast does not
prevent integer promotion of the operand of ~: ~128 is computed
as int (0xffffff7f), and the surrounding (u8) cast truncates
back to 0x7f. The result is correct (127), but the implicit
truncation is flagged by sparse:
net/llc/llc_c_ac.c:1008:38: warning: cast truncates bits from
constant value (ffffff7f becomes 7f)
(and three more at lines 1009, 1099, 1100)
Replace the (u8) ~LLC_2_SEQ_NBR_MODULO expression with
LLC_2_SEQ_NBR_MODULO - 1, which evaluates to 127 directly and
silences sparse.
The same ~LLC_2_SEQ_NBR_MODULO pattern also appears in
include/net/llc_pdu.h:148 as part of PDU_GET_NEXT_Vr, but there
the result is immediately &-masked, so the int promotion is
harmless and sparse does not flag it; it is left alone.
This patch is the minimum diff to silence the warning. The
counter-clamp idiom itself could be modernized to
min_t(u8, ..., LLC_2_SEQ_NBR_MODULO - 1), but that is a
separate cleanup left for another patch.
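The promotion behaviour described above can be reproduced in a few lines of standard C. This is an illustrative sketch, assuming a 32-bit int; `SEQ_NBR_MODULO` mirrors the quoted definition of LLC_2_SEQ_NBR_MODULO and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the ((u8) 128) definition quoted in the message. */
#define SEQ_NBR_MODULO ((uint8_t)128)

/* The operand of ~ is promoted to int first, so this is ~128 as an int. */
static unsigned int promoted_complement(void)
{
	return (unsigned int)~SEQ_NBR_MODULO;
}

/* Old idiom: correct result, but relies on truncating the promoted value. */
static uint8_t clamp_limit_old(void)
{
	return (uint8_t)~SEQ_NBR_MODULO;
}

/* New idiom: computes the limit directly, nothing is truncated. */
static uint8_t clamp_limit_new(void)
{
	return SEQ_NBR_MODULO - 1;
}
```

Both idioms yield 127; only the intermediate value differs, which is what sparse reports.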
Eric Dumazet [Thu, 14 May 2026 09:55:06 +0000 (09:55 +0000)]
net: always declare __sock_wfree() and tcp_wfree()
Even if guarded by IS_ENABLED(CONFIG_INET) compilers need to know
what __sock_wfree() and tcp_wfree() are:
include/net/sock.h:1861:63: note: each undeclared identifier is reported only once for each function it appears in
include/net/sock.h:1862:63: error: 'tcp_wfree' undeclared (first use in this function); did you mean 'sock_wfree'?
1862 | (IS_ENABLED(CONFIG_INET) && skb->destructor == tcp_wfree);
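The underlying C rule can be shown with a reduced example. It is a sketch only: `FEATURE_ENABLED` is a hypothetical stand-in for IS_ENABLED(CONFIG_INET), and `struct buf` and `feature_wfree()` are made-up names. A stub definition is included so that the example links on any toolchain; the declaration is the part that matters here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for IS_ENABLED(CONFIG_INET) evaluating to 0. */
#define FEATURE_ENABLED 0

struct buf {
	void (*destructor)(struct buf *);
};

/*
 * Declared unconditionally: the identifier must be known to the compiler
 * even though the comparison below is short-circuited by a constant 0.
 */
void feature_wfree(struct buf *b);

/* Stub definition, present only to keep this sketch self-contained. */
void feature_wfree(struct buf *b)
{
	(void)b;
}

static int is_feature_buf(const struct buf *b)
{
	assert(b != NULL);
	return FEATURE_ENABLED && b->destructor == feature_wfree;
}
```

Removing the declaration makes the function fail to compile with an "undeclared identifier" error of the kind quoted above, regardless of the value of the guard.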
Fixes: f0de88303d5e ("net: make is_skb_wmem() available to modules")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202605141607.mDXnYFKY-lkp@intel.com/
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260514095506.3919094-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
vsock/virtio: fix zerocopy completion for multi-skb sends
When a large message is fragmented into multiple skbs, the zerocopy
uarg is only allocated and attached to the last skb in the loop.
Non-final skbs carry pinned user pages with no completion tracking,
so the kernel has no way to notify userspace when those pages are safe
to reuse. If the loop breaks early the uarg is never allocated at all,
leaking pinned pages with no completion notification.
Fix this by following the approach used by TCP: allocate the zerocopy
uarg (if not provided by the caller) before the send loop and attach
it to every skb via skb_zcopy_set(), which takes a reference per skb.
Each skb's completion properly decrements the refcount, and the
notification only fires after the last skb is freed.
On failure, if no data was sent, the uarg is cleanly aborted via
net_zcopy_put_abort().
This issue was initially discovered by sashiko while reviewing commit 1cb36e252211 ("vsock/virtio: fix MSG_ZEROCOPY pinned-pages accounting")
but was pre-existing.
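The reference counting scheme can be modelled with a toy counter. This is a simplified sketch, not the vsock or skb API: `struct toy_uarg` and its helpers are hypothetical, and the sender holding its own reference during the loop is an assumption of the model.

```c
#include <assert.h>

/* Hypothetical completion object: notifies once the last reference is dropped. */
struct toy_uarg {
	unsigned int refs;
	unsigned int notifications;
};

static void toy_uarg_get(struct toy_uarg *u)
{
	u->refs++;
}

static void toy_uarg_put(struct toy_uarg *u)
{
	assert(u->refs > 0);
	if (--u->refs == 0)
		u->notifications++;
}

/* Attach the same completion object to every fragment of one message. */
static void toy_send_fragments(struct toy_uarg *u, unsigned int nr_frags)
{
	toy_uarg_get(u);		/* sender's own reference */
	for (unsigned int i = 0; i < nr_frags; i++)
		toy_uarg_get(u);	/* one reference per fragment */
	toy_uarg_put(u);		/* sender is done */
}

/* Called when one fragment has been consumed and freed. */
static void toy_free_fragment(struct toy_uarg *u)
{
	toy_uarg_put(u);
}
```

With three fragments, the notification count stays at zero after the first two frees and becomes one only after the third, which is the behaviour the fix is after.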
Fixes: 581512a6dc93 ("vsock/virtio: MSG_ZEROCOPY flag support")
Closes: https://sashiko.dev/#/patchset/20260420132051.217589-1-sgarzare%40redhat.com
Reported-by: Maher Azzouzi <maherazz04@gmail.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Arseniy Krasnov <avkrasnov@salutedevices.com>
Link: https://patch.msgid.link/20260514092948.268720-1-sgarzare@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Luka Gejak [Wed, 13 May 2026 18:26:57 +0000 (20:26 +0200)]
net: hsr: reject unresolved interlink ifindex
In hsr_newlink(), a provided but invalid IFLA_HSR_INTERLINK attribute
was silently ignored if __dev_get_by_index() returned NULL. This leads
to incorrect RedBox topology creation without notifying the user.
Fix this by returning -EINVAL and an extack message when the
interlink attribute is present but cannot be resolved.
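The changed behaviour can be sketched outside the kernel. Everything here is hypothetical scaffolding: `struct toy_netdev`, `toy_dev_by_index()` and the error string are illustrative and are not the HSR code or its actual extack text.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_netdev {
	int ifindex;
};

/* Hypothetical lookup: returns NULL when no device has this ifindex. */
static struct toy_netdev *toy_dev_by_index(struct toy_netdev *devs, size_t n,
					   int ifindex)
{
	for (size_t i = 0; i < n; i++)
		if (devs[i].ifindex == ifindex)
			return &devs[i];
	return NULL;
}

/*
 * An absent attribute is fine (no interlink); a present attribute that
 * does not resolve is an error instead of being silently ignored.
 */
static int toy_resolve_interlink(struct toy_netdev *devs, size_t n,
				 bool attr_present, int ifindex,
				 struct toy_netdev **out, const char **errmsg)
{
	assert(out != NULL && errmsg != NULL);
	*out = NULL;
	*errmsg = NULL;

	if (!attr_present)
		return 0;

	*out = toy_dev_by_index(devs, n, ifindex);
	if (*out == NULL) {
		*errmsg = "interlink device not found";
		return -EINVAL;
	}
	return 0;
}
```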
Reviewed-by: Felix Maurer <fmaurer@redhat.com>
Signed-off-by: Luka Gejak <luka.gejak@linux.dev>
Reviewed-by: Fernando Fernandez Mancera <fmancera@suse.de>
Link: https://patch.msgid.link/20260513182657.20346-3-luka.gejak@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sam Daly [Wed, 13 May 2026 16:42:53 +0000 (18:42 +0200)]
octeontx2-af: CGX: add bounds check to cgx_speed_mbps index
cgx_speed_mbps has 13 elements but RESP_LINKSTAT_SPEED can yield values
0-15. If it returns a value >= 13, this causes an out-of-bounds array
access. Add a bounds check and default to speed 0 if the index is out of
range.
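A minimal sketch of the bounds check, assuming a 13-entry table indexed by a 4-bit field. The table contents below are placeholder values, not the driver's cgx_speed_mbps entries, and the names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Placeholder table with 13 entries; values are NOT the real speeds. */
static const unsigned int toy_speed_mbps[13] = {
	0, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200,
};

/* A 4-bit field can encode 0-15, so indexes 13-15 must be rejected. */
static unsigned int toy_speed_from_index(unsigned int idx)
{
	if (idx >= ARRAY_SIZE(toy_speed_mbps))
		return 0;	/* default to speed 0 when out of range */
	return toy_speed_mbps[idx];
}
```

Comparing against ARRAY_SIZE() rather than a literal 13 keeps the check correct if entries are later added to the table.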
Fixes: 61071a871ea6 ("octeontx2-af: Forward CGX link notifications to PFs")
Cc: Sunil Goutham <sgoutham@marvell.com>
Cc: Linu Cherian <lcherian@marvell.com>
Cc: Geetha sowjanya <gakula@marvell.com>
Cc: hariprasad <hkelam@marvell.com>
Cc: Subbaraya Sundeep <sbhatta@marvell.com>
Cc: Andrew Lunn <andrew+netdev@lunn.ch>
Cc: stable <stable@kernel.org>
Signed-off-by: Sam Daly <sam@samdaly.ie>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patch.msgid.link/2026051352-refined-demise-e88d@gregkh
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Dragos Tatulea [Wed, 13 May 2026 12:45:18 +0000 (15:45 +0300)]
IB/IPoIB: ndo_set_rx_mode_async conversion
The commit in the Fixes tag added a warning that netdev ops-locked
devices should be converted to .ndo_set_rx_mode_async. IPoIB for
mlx5 is such a driver; it was missed during the conversion because
its flow is more complex:
- mlx5 part of IPoIB device was converted to ops-lock in commit [1].
- ipoib_intf_init() then overrides netdev_ops with
ipoib_netdev_ops_{pf,vf}, which still wired ndo_set_rx_mode to the
legacy sync path -- tripping the new warning on every probe.
So now we have the following splat:
netdevice: ib0 (uninitialized): ops-locked drivers should use ndo_set_rx_mode_async
WARNING: net/core/dev.c:11366 at register_netdevice+0x83c/0x21d0
...
register_netdev+0x1f/0x40
ipoib_add_one+0x35c/0x880 [ib_ipoib]
This patch implements .ndo_set_rx_mode_async but it simply schedules the
multicast restart task like before. This is done to maintain the
assumption that this task and others [2] must run on the same ordered
workqueue to avoid racing with themselves. The race between
ipoib_mcast_join_task() and ipoib_mcast_restart_task() would be the most
obvious example.
Linus Torvalds [Sat, 16 May 2026 00:00:45 +0000 (17:00 -0700)]
Merge tag 'drm-fixes-2026-05-16' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
"Weekly fixes pull, small and all over fixes, mostly xe and amdgpu,
with some ttm and a core fix for the handle change pain.
core:
- fix for the fix for the handle change race
ttm:
- avoid infinite loop in swap out
- avoid infinite loop in BO shrinking
- convert -EAGAIN from dmem_cgroup_try_charge to -ENOSPC
bridge:
- imx8qxp-pxl2dpi: avoid ERR_PTR with device_node cleanup
i915:
- Skip __i915_request_skip() for already signaled requests
- Fix VSC dynamic range signaling for RGB formats [dp]
xe:
- Madvise fix around purgeability tracking
- Restore engine mask for specific blitter style
- Couple UAF fixes
- Drop unused ggtt_balloon field
loongson:
- use managed cleanup for connector polling
panfrost:
- handle results from reservation locking correctly
qaic:
- check for integer overflows in mmap logic
rocket:
- handle results from reservation locking correctly"
* tag 'drm-fixes-2026-05-16' of https://gitlab.freedesktop.org/drm/kernel: (26 commits)
drm: Replace old pointer to new idr
drm/loongson: Use managed KMS polling
drm/ttm: Fix ttm_bo_shrink() infinite LRU walk on backup failure
drm/ttm: Convert -EAGAIN from dmem_cgroup_try_charge to -ENOSPC
drm/gma500/oaktrail_lvds: fix i2c adapter leaks on init
drm/gma500/oaktrail_lvds: fix hang on init failure
drm/gma500/oaktrail_hdmi: fix i2c adapter leak on setup
drm/xe: Drop unused ggtt_balloon field
accel/qaic: Add overflow check to remap_pfn_range during mmap
drm/i915/dp: Fix VSC dynamic range signaling for RGB formats
drm/i915: skip __i915_request_skip() for already signaled requests
drm/bridge: imx8qxp-pxl2dpi: avoid ERR_PTR with device_node cleanup
drm/amdgpu/gfx_v12_0: set gfx.rs64_enable from PFP header on GFX12
drm/amd/ras: Fix CPER ring debugfs read overflow
drm/amd/display: Wrap DCN32 phantom-plane allocation in DC_RUN_WITH_PREEMPTION_ENABLED
drm/amdgpu: fix userq hang detection and reset
drm/amdgpu: remove almost all calls to amdgpu_userq_detect_and_reset_queues
drm/amdgpu: rework amdgpu_userq_signal_ioctl v3
drm/amdgpu: remove deadlocks from amdgpu_userq_pre_reset
drm/xe/dma-buf: fix UAF with retry loop
...
Commit 5e28b7b94408 introduced a logical error by failing to replace the
newly generated IDR pointer to old id's pointer at the correct location
within the "change handle" logic; this resulted in the issue reported by
syzbot [1].
Specifically, the new IDR object pointer is intended to replace the original
id's pointer during the normal execution flow.
Additionally, an unnecessary conditional check for the ret exit path has
been removed.
Fixes: 5e28b7b94408 ("drm: Set old handle to NULL before prime swap in change_handle")
Reported-by: syzbot+d7c9eed171647e421013@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=d7c9eed171647e421013
Cc: stable@vger.kernel.org
Tested-by: syzbot+d7c9eed171647e421013@syzkaller.appspotmail.com
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patch.msgid.link/tencent_C267296443AAA4567771176886DFF364A305@qq.com
The unfortunately named 'phandle-array' property type is really a matrix
with phandle and fixed arg cells entries. A matrix property should have 2
levels of items constraints.
if (iphlen >= sizeof(*iph)) {
/* fix up saddr, tot_len, id, csum, transport_header */
}
It does not, however, reject ihl < 5. For such a packet the
"if (iphlen >= sizeof(*iph))" branch is skipped, leaving the
crafted iphdr untouched, but the packet is still handed to
__ip_local_out() and onward. Downstream consumers that read
iph->ihl assume a sane value: net/ipv4/ah4.c:ah_output() in
particular subtracts sizeof(struct iphdr) from top_iph->ihl * 4
and passes the (signed-int-negative, then cast to size_t)
result to memcpy(), producing an OOB access of length close to
SIZE_MAX and a host kernel panic.
An IPv4 header with ihl < 5 is malformed by definition (RFC 791:
"Internet Header Length is the length of the internet header in
32 bit words ... Note that the minimum value for a correct header
is 5."). The kernel should not be willing to inject such a
packet into its own output path.
Reject "iphlen < sizeof(*iph)" alongside the existing
"iphlen > length" check. This matches the principle that locally
constructed packets that re-enter the IP stack must pass the same
basic sanity tests that a foreign packet would be subjected to.
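The two checks and the failure mode they prevent can be sketched as follows. This is an illustration, not the kernel code: the function names are hypothetical, `MIN_IPHDR_LEN` stands in for sizeof(struct iphdr), and returning -EINVAL is an assumption about the error code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for sizeof(struct iphdr): 20 bytes, i.e. ihl == 5. */
#define MIN_IPHDR_LEN 20u

/* Validate the self-reported header length against both bounds. */
static int toy_check_iphlen(unsigned int ihl, size_t length)
{
	size_t iphlen = (size_t)ihl * 4;

	assert(ihl <= 15);	/* ihl is a 4-bit field */
	if (iphlen > length)		/* existing check */
		return -EINVAL;
	if (iphlen < MIN_IPHDR_LEN)	/* added check: ihl < 5 */
		return -EINVAL;
	return 0;
}

/*
 * What a downstream consumer effectively computes when it trusts ihl:
 * a negative int converted to size_t becomes a huge length.
 */
static size_t toy_options_len_unchecked(unsigned int ihl)
{
	int len = (int)(ihl * 4) - (int)MIN_IPHDR_LEN;

	return (size_t)len;
}
```

For ihl == 4 the unchecked computation yields -4, which converts to SIZE_MAX - 3 as a size_t; the added lower bound rejects the packet before any consumer can see it.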
Once this lands, the "if (iphlen >= sizeof(*iph))" wrapper around
the fixup branch becomes redundant; left in place to keep the
patch minimal and backport-friendly. A follow-up can unwrap it.
Note that commit 86f4c90a1c5c ("ipv4, ipv6: ensure raw socket
message is big enough to hold an IP header") ensures the message
buffer is large enough to hold an iphdr, but does not constrain
the self-reported iph->ihl.
Reachability: the malformed packet source is any caller with
CAP_NET_RAW, including an unprivileged process in a user+net
namespace on a kernel with CONFIG_USER_NS=y. The reproduced AH
crash also requires a matching xfrm AH policy on the outgoing
route; a container granted CAP_NET_ADMIN can install that state
and policy in its netns. Loopback bypasses xfrm_output, so the
trigger uses a real netdev.
Reproduced on UML + KASAN: kernel-mode fault at addr 0x0 with
memcpy_orig at the crash site. Same shape reproduces inside a
rootless Docker container with --cap-add NET_ADMIN on a stock
distro kernel.
Update the docs to match the code (include/linux/netlink.h):
/*
* skb should fit one page. This choice is good for headerless malloc.
* But we should limit to 8K so that userspace does not have to
* use enormous buffer sizes on recvmsg() calls just to avoid
* MSG_TRUNC when PAGE_SIZE is very large.
*/
#if PAGE_SIZE < 8192UL
#define NLMSG_GOODSIZE SKB_WITH_OVERHEAD(PAGE_SIZE)
#else
#define NLMSG_GOODSIZE SKB_WITH_OVERHEAD(8192UL)
#endif
Linus Torvalds [Fri, 15 May 2026 22:40:25 +0000 (15:40 -0700)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 MPAM fixes from Catalin Marinas:
- Fix NULL dereference and a false-positive warning when the driver
probes hardware with surprising version numbers
- Fix writing values to the wrong registers when probing
cache-utilisation counters. Replace 'NRDY' probing with a version
that is robust for platforms where the bit is writeable by both
hardware and software
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm_mpam: Check whether the config array is allocated before destroying it
arm_mpam: Fix false positive assert failure during mpam_disable()
arm_mpam: Improve check for whether or not NRDY is hardware managed
arm_mpam: Pretend that NRDY is always hardware managed
arm_mpam: Fix monitor instance selection when checking for hardware NRDY
Linus Torvalds [Fri, 15 May 2026 22:22:26 +0000 (15:22 -0700)]
Merge tag 'iommu-fixes-v7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux
Pull iommu fixes from Joerg Roedel:
"This is probably the largest fixes pull-request ever sent for IOMMU. I
partially blame it on AI code review which found some issues but there
is also some rework in here to fix issues in the iommu parts of PCI
device reset.
AMD-Vi:
- Add bounds checks to debugfs and table lookups
Intel VT-d:
- Apply an existing quirk for Q35 graphic device
- Skip dev_pasid teardown for the blocked domain to avoid
out-of-bounds access
- Return early if dev_pasid is missing to prevent NULL dereference
or UAF
Core:
- Fix bugs and corner cases in pci_dev_reset_iommu_prepare/done()
- Fix various issues found by AI in iommupt code
MAINTAINERS email address update for RISCV IOMMU"
* tag 'iommu-fixes-v7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux:
MAINTAINERS: update Tomasz Jeznach's email address
iommupt: Fix the end_index calculation in __map_range_leaf()
iommupt: Check for missing PAGE_SIZE in the pgsize_bitmap
iommu: Handle unmap error when iommu_debug is enabled
iommu: Fix up map/unmap debugging for iommupt domains
iommu: Fix loss of errno on map failure for classic ops
iommu/vt-d: Avoid NULL pointer dereference or refcount corruption
iommu/vt-d: Fix oops due to out of scope access
iommu/vt-d: Disable DMAR for Intel Q35 IGFX
iommu: Warn on premature unblock during DMA aliased sibling reset
iommu: Fix WARN_ON in __iommu_group_set_domain_nofail() due to reset
iommu: Fix ATS invalidation timeouts during __iommu_remove_group_pasid()
iommu: Fix nested pci_dev_reset_iommu_prepare/done()
iommu: Fix pasid attach in pci_dev_reset_iommu_prepare/done()
iommu: Replace per-group resetting_domain with per-gdev blocked flag
iommu: Fix kdocs of pci_dev_reset_iommu_done()
iommu: Fix NULL group->domain dereference in pci_dev_reset_iommu_done()
iommu/amd: Bounds-check devid in __rlookup_amd_iommu()
iommu/amd: Remove latent out-of-bounds access in IOMMU debugfs
Linus Torvalds [Fri, 15 May 2026 22:13:02 +0000 (15:13 -0700)]
Merge tag 'vfio-v7.1-rc4' of https://github.com/awilliam/linux-vfio
Pull VFIO fixes from Alex Williamson:
- Convert vfio-pci BAR resource requests and iomaps initialization
from a lazy, on-demand model to an eager pre-allocation model to
avoid races while preserving legacy error behavior. Fix unchecked
barmap access in dma-buf export path (Matt Evans)
- Introduce an implicit unsigned cast in converting vfio-pci device
offsets to region indexes, closing a potential out-of-bounds
access through the vfio_pci_ioeventfd() interface (Matt Evans)
- Fix a dma-buf kref underflow and stuck wait_for_completion() when
closing a previously revoked dma-buf (Alex Williamson)
* tag 'vfio-v7.1-rc4' of https://github.com/awilliam/linux-vfio:
vfio/pci: Check BAR resources before exporting a DMABUF
vfio/pci: Set up BAR resources and maps in vfio_pci_core_enable()
vfio/pci: Make VFIO_PCI_OFFSET_TO_INDEX() return unsigned
vfio/pci: fix dma-buf kref underflow after revoke
Linus Torvalds [Fri, 15 May 2026 21:52:17 +0000 (14:52 -0700)]
Merge tag 'v7.1-rc4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
- Fix integer overflow in read
- Fix smbdirect error cleanup
- Multichannel reconnect fix
- Add some missing defines and correct some references to protocol spec
- Fix oob symlink read
* tag 'v7.1-rc4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smbdirect: Fix error cleanup in smbdirect_map_sges_from_iter()
smb: client: avoid integer overflow in SMB2 READ length check
cifs: client: stage smb3_reconfigure() updates and restore ctx on failure
smb/client: fix possible infinite loop and oob read in symlink_data()
SMB3.1.1: add missing QUERY_DIR info levels
Jakub Sitnicki [Mon, 11 May 2026 12:10:22 +0000 (14:10 +0200)]
bpf: Add Jiayuan Chen to sockmap maintainers
Nominate Jiayuan Chen as a sockmap co-maintainer. Jiayuan has been a
regular contributor and reviewer for the sockmap and networking code.
Since we are now down to just two maintainers, and John has to split his
time between BPF core, BPF networking, and sockmap, having three
maintainers again will help with the review load.
Linus Torvalds [Fri, 15 May 2026 21:48:09 +0000 (14:48 -0700)]
Merge tag 'ceph-for-7.1-rc4' of https://github.com/ceph/ceph-client
Pull ceph fixes from Ilya Dryomov:
"An important patch from Hristo that squashes a folio reference leak
that could lead to OOM kills in CephFS and a number of miscellaneous
fixes from Raphael and Slava.
All but two are marked for stable"
* tag 'ceph-for-7.1-rc4' of https://github.com/ceph/ceph-client:
libceph: Fix potential null-ptr-deref in decode_choose_args()
libceph: handle rbtree insertion error in decode_choose_args()
libceph: Fix potential out-of-bounds access in osdmap_decode()
ceph: put folios not suitable for writeback
ceph: add ceph_has_realms_with_quotas() check to ceph_quota_update_statfs()
libceph: Fix potential out-of-bounds access in __ceph_x_decrypt()
ceph: fix BUG_ON in __ceph_build_xattrs_blob() due to stale blob size
ceph: fix a buffer leak in __ceph_setxattr()
libceph: Fix unnecessarily high ceph_decode_need() for uniform bucket
libceph: Fix potential out-of-bounds access in crush_decode()
Gustavo Sousa [Thu, 14 May 2026 21:44:50 +0000 (18:44 -0300)]
drm/xe/reg_sr: Do sanity check for MCR vs non-MCR
The type struct xe_reg_mcr exists to ensure that the correct API is used
when handling MCR registers. However, for the register save/restore
functionality, the RTP processing always casts the register to a struct
xe_reg and then apply_one_mmio() selects the MMIO API based on the "mcr"
field of the register instance.
This allows the developer to commit mistakes like passing a MCR register
for an RTP action for a GT where the respective register is not MCR; and
vice-versa.
To capture such scenarios, do a sanity check in xe_reg_sr_add() that,
upon an inconsistency:
- "fixes" the register type by favoring what we have in our MCR range
tables instead of what the developer selected for the save/restore
entry;
- raises a notice-level message to inform about the inconsistency.
Note: As a collateral of this change, we need to include MCR
initialization in xe_wa_test.c, otherwise a bunch of test cases end up
failing because xe_gt_mcr_check_reg() will always return false, meaning
that it will incorrectly report an MCR register as non-MCR.
v2:
- Downgrade messages to notice level so as not to block CI execution
when inconsistencies are found. (Matt)
- Add missing EXPORT_SYMBOL_IF_KUNIT() calls. (Gustavo)
Gustavo Sousa [Thu, 14 May 2026 21:44:49 +0000 (18:44 -0300)]
drm/xe/mcr: Extract reg_in_steering_type_ranges()
The logic to check if a register falls within one of the ranges for a
steering type is already duplicated in
xe_gt_mcr_get_nonterminated_steering(). We will also want to use that
same logic in another upcoming function. Let's factor out that logic
and put it into a function named reg_in_steering_type_ranges().
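The kind of helper being factored out can be sketched as a simple range-table lookup. This is an assumption-laden illustration: `struct toy_mmio_range`, the inclusive bounds and the zero-terminated table are hypothetical choices, not the xe driver's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical inclusive address range; a table ends with an all-zero entry. */
struct toy_mmio_range {
	uint32_t start;
	uint32_t end;
};

/* Returns true if addr falls inside any range of the table. */
static bool toy_reg_in_ranges(uint32_t addr, const struct toy_mmio_range *ranges)
{
	if (ranges == NULL)
		return false;

	for (; ranges->end > 0; ranges++) {
		assert(ranges->start <= ranges->end);
		if (addr >= ranges->start && addr <= ranges->end)
			return true;
	}
	return false;
}
```

Keeping the lookup in one function lets every caller share the same boundary semantics instead of repeating the loop.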
Gustavo Sousa [Thu, 14 May 2026 21:44:47 +0000 (18:44 -0300)]
drm/xe: Extract xe_hw_engine_setup_reg_lrc()
The steps for processing RTP rules that build up an engine's reg_lrc
arguably belong to xe_hw_engine.c and should be encapsulated into a
function in that unit.
Move that logic to a new function called xe_hw_engine_setup_reg_lrc().
Gustavo Sousa [Thu, 14 May 2026 21:44:46 +0000 (18:44 -0300)]
drm/xe: Define and use MCR version of COMMON_SLICE_CHICKEN4
The register COMMON_SLICE_CHICKEN4 is a MCR register on both Xe2 and
Xe3. Let's make sure to define a MCR version of it and use it for the
relevant IP versions.
Use XEHP_ as prefix for the register name, since it is MCR as of Xe_HP.
v2:
- Also change one entry in lrc_tunings, which was caught by
manual testing, and add the corresponding Fixes tag to the commit message.
(Gustavo)
Fixes: 8d6f16f1f082 ("drm/xe: Extend Wa_22021007897 to Xe3 platforms")
Fixes: e5c13e2c505b ("drm/xe/xe2hpg: Add Wa_22021007897")
Fixes: 8ccf5f6b2295 ("drm/xe/tuning: Apply windower hardware filtering setting on Xe3 and Xe3p")
Bspec: 66534, 71185, 74417
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260514-rtp-mcr-check-v3-3-30dd47855fee@intel.com
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
Daniel Lezcano [Tue, 5 May 2026 14:44:47 +0000 (16:44 +0200)]
thermal/core: Split __thermal_cooling_device_register() into two functions
In preparation for the upcoming changes separating OF and non-OF code,
split __thermal_cooling_device_register() into allocation and addition
phases.
This allows moving the device node assignment out of the core
initialization path.
This change is not a trivial split. The lifetime of the cooling device
is managed by the device core through put_device(), which triggers
thermal_release() to free all associated resources.
With the introduction of thermal_cooling_device_alloc(), the allocation
path must mirror what thermal_release() undoes. In contrast,
thermal_cooling_device_add() must not perform any rollback and relies
on put_device() for cleanup on error paths. This avoids both double
free and resource leaks.
As part of this rework, add the missing device_initialize() call when
allocating the cooling device.
Suggested-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@oss.qualcomm.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
[ rjw: Replace device_register() with device_add() ]
[ rjw: Rebase on top of previously applied material ]
Link: https://patch.msgid.link/20260505144447.2853933-1-daniel.lezcano@oss.qualcomm.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Linus Torvalds [Fri, 15 May 2026 20:22:07 +0000 (13:22 -0700)]
Merge tag 'for-7.1-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- fixup warning when allocating memory for readahead, __GFP_NOWARN was
accidentally dropped when setting mapping constraints
- in tracepoint of file sync, fix sleeping in atomic context when
handling dentries
- harden initial loading of block group on crafted/fuzzed images,
iterate all chunk mapping entries unconditionally
- fix freeing pages of submitted io after checking for errors
- fix incorrect inode size after remount when using fallocate KEEP_SIZE
mode (also requires disabled 'no-holes' feature)
* tag 'for-7.1-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix incorrect i_size after remount caused by KEEP_SIZE prealloc gap
btrfs: only release the dirty pages io tree after successful writes
btrfs: tracepoints: fix sleep while in atomic context in btrfs_sync_file()
btrfs: always pass __GFP_NOWARN from add_ra_bio_pages()
btrfs: fix check_chunk_block_group_mappings() to iterate all chunk maps
Linus Torvalds [Fri, 15 May 2026 20:17:46 +0000 (13:17 -0700)]
Merge tag 'xfs-fixes-7.1-rc4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fixes from Carlos Maiolino:
"A few bug fixes, nothing really special stands out"
* tag 'xfs-fixes-7.1-rc4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: Fix typo in comment
xfs: fix the "limiting open zones" message
xfs: flush delalloc blocks on ENOSPC in xfs_trans_alloc_icreate
xfs: check da node block pad field during scrub
xfs: fix memory leak for data allocated by xfs_zone_gc_data_alloc()
xfs: fix memory leak on error in xfs_alloc_zone_info()
xfs: check directory data block header padding in scrub
xfs: zero directory data block padding on write verification
xfs: zero entire directory data block header region at init
xfs: remove the meaningless XFS_ALLOC_FLAG_FREEING
Linus Torvalds [Fri, 15 May 2026 20:11:41 +0000 (13:11 -0700)]
Merge tag 'nfsd-7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd fixes from Chuck Lever:
"Fixes for this release:
- Correctness fix for the new sunrpc cache netlink protocol
Marked for stable:
- Correctness fixes for delegated attributes
- Prevent an infinite loop when revoking layouts"
* tag 'nfsd-7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
NFSD: Fix infinite loop in layout state revocation
sunrpc: start cache request seqno at 1 to fix netlink GET_REQS
nfsd: update mtime/ctime on COPY in presence of delegated attributes
nfsd: update mtime/ctime on CLONE in presense of delegated attributes
nfsd: fix file change detection in CB_GETATTR
nfsd: fix GET_DIR_DELEGATION when VFS leases are disabled
Linus Torvalds [Fri, 15 May 2026 19:47:00 +0000 (12:47 -0700)]
Merge tag 'block-7.1-20260515' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull block fixes from Jens Axboe:
- NVMe merge request via Keith:
- Fix memory leak on a passthrough integrity mapping failure (Keith)
- Hide secrets behind debug option (Hannes)
- Fix pci use-after-free for host memory buffer (Chia-Lin Kao)
- Fix tcp target use-after-free for data digest (Sagi)
- Revert a mistaken quirk (Alan Cui)
- Fix uevent and controller state race condition (Maurizio)
- Fix apple submission queue re-initialization (Nick Chan)
- Three fixes for blk-integrity, fixing an issue with the user data
mapping and two problems with recomputing number of segments
- Two fixes for the iov_iter bounce buffering
- Fix for the handling of dead zoned write plugs
- ublk max_sectors validation fix, with associated selftest addition
* tag 'block-7.1-20260515' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
nvme-apple: Reset q->sq_tail during queue init
block: align down bounces bios
block: pass a minsize argument to bio_iov_iter_bounce
selftests: ublk: cap nthreads to kernel's actual nr_hw_queues
block: fix handling of dead zone write plugs
block: bio-integrity: Fix null-ptr-deref in bio_integrity_map_user()
block: recompute nr_integrity_segments in blk_insert_cloned_request
block: don't overwrite bip_vcnt in bio_integrity_copy_user()
nvme: fix race condition between connected uevent and STARTED_ONCE flag
Revert "nvme: add quirk NVME_QUIRK_IGNORE_DEV_SUBNQN for 144d:a808"
nvmet-tcp: Fix potential UAF when ddgst mismatch
nvme-pci: fix use-after-free in nvme_free_host_mem()
nvmet-auth: Do not print DH-HMAC-CHAP secrets
nvme: fix bio leak on mapping failure
nvme: make prp passthrough usage less scary
ublk: reject max_sectors smaller than PAGE_SECTORS in parameter validation
Linus Torvalds [Fri, 15 May 2026 19:34:02 +0000 (12:34 -0700)]
Merge tag 'io_uring-7.1-20260515' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull io_uring fixes from Jens Axboe:
- Small series sanitizing the locking done for either modifying or
reading a chain of requests
- If the application has a pid namespace, ensure that the sqthread pid
is correctly printed in fdinfo
- Fix for a hashing issue in the io-wq thread pool, which could lead to
a use-after-free
- Kill dead argument from io_prep_rw_pi()
- Fix for a missed validation of the CQ ring head, affecting CQE refill
* tag 'io_uring-7.1-20260515' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
io_uring: validate user-controlled cq.head in io_cqe_cache_refill()
io-wq: check that the predecessor is hashed in io_wq_remove_pending()
io_uring/rw: drop unused attr_type_mask from io_prep_rw_pi()
io_uring: hold uring_lock across io_kill_timeouts() in cancel path
io_uring: defer linked-timeout chain splice out of hrtimer context
io_uring: hold uring_lock when walking link chain in io_wq_free_work()
io_uring/fdinfo: translate SqThread PID through caller's pid_ns
Linus Torvalds [Fri, 15 May 2026 19:27:03 +0000 (12:27 -0700)]
Merge tag 'hardening-v7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening fix from Kees Cook:
- gcc-plugins: Fix GCC 16 removal of CONST_CAST macros
* tag 'hardening-v7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
gcc-plugins: Always define CONST_CAST_GIMPLE and CONST_CAST_TREE
Linus Torvalds [Fri, 15 May 2026 19:24:09 +0000 (12:24 -0700)]
Merge tag 'docs-7.1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/docs/linux
Pull documentation fixes from Jonathan Corbet:
"This is Willy Tarreau's new document clarifying the definition and
handling of security-related bugs, which we're trying to get out there
quickly on the theory that some of the bug reporters might actually
read and pay attention to it"
* tag 'docs-7.1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/docs/linux:
docs: threat-model: don't limit root capabilities to CAP_SYS_ADMIN
docs: security-bugs: add a link to the threat-model documentation
Documentation: security-bugs: clarify requirements for AI-assisted reports
Documentation: security-bugs: explain what is and is not a security bug
Documentation: security-bugs: do not systematically Cc the security team
Reading the debugfs file /sys/kernel/debug/dri/0/gt*/pf/adverse_events
with CFI (Control Flow Integrity) enabled panics the kernel at
xe_gt_debugfs_simple_show+0x82/0xc0.
xe_gt_debugfs_simple_show() declares a function pointer that expects an int
return type, but xe_gt_sriov_pf_monitor_print_events() returns void,
which leads to a CFI failure and the kernel panic.
Michal Wajdeczko [Thu, 14 May 2026 15:57:26 +0000 (17:57 +0200)]
drm/xe/vf: Fix signature of print functions
We have plugged-in existing VF print functions into our GT debugfs
show helper as-is, but we missed that the helper expects functions
to return int, while they were defined as void. This can lead to
errors being reported when CFI is enabled.
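The mismatch can be illustrated in user space. The sketch below is not the xe driver code: demo_show_fn, demo_call_show() and demo_print_events() are hypothetical names. It shows a helper that calls through an int-returning function pointer, and a print function whose signature matches that pointer type exactly, so no cast is needed and an indirect-call type check has nothing to reject.

```c
#include <stdio.h>

/* Hypothetical helper type: the show helper expects an int-returning callback. */
typedef int (*demo_show_fn)(FILE *out, const void *data);

/* Generic helper that calls through the pointer and propagates the result. */
static int demo_call_show(demo_show_fn fn, FILE *out, const void *data)
{
	return fn(out, data);
}

/*
 * Print function defined with the exact signature the helper expects.
 * Had it been declared as returning void and cast to demo_show_fn, the
 * indirect call would go through a mismatched type, which is what a CFI
 * check is designed to reject.
 */
static int demo_print_events(FILE *out, const void *data)
{
	const int *count = data;

	fprintf(out, "events: %d\n", *count);
	return 0;
}
```

Keeping the definition and the pointer type identical lets the compiler diagnose a mismatch at build time instead of leaving it to a runtime check.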
Fixes: 63d8cb8fe3dd ("drm/xe/vf: Expose SR-IOV VF attributes to GT debugfs")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Mohanram Meenakshisundaram <mohanram.meenakshisundaram@intel.com>
Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
Link: https://patch.msgid.link/20260514155726.7165-1-michal.wajdeczko@intel.com
Arnd Bergmann [Fri, 15 May 2026 10:57:09 +0000 (12:57 +0200)]
ring-buffer remote: Avoid unexpected symbol warnings (arm, s390)
The now more verbose check found more architecture-specific symbols
missing from the whitelist during randconfig testing on s390
and 32-bit arm:
Unexpected symbols in kernel/trace/simple_ring_buffer.o:
U __aeabi_unwind_cpp_pr1
Unexpected symbols in kernel/trace/simple_ring_buffer.o:
U __s390_indirect_jump_r1
U __s390_indirect_jump_r10
U __s390_indirect_jump_r14
U __s390_indirect_jump_r2
U __s390_indirect_jump_r5
U __s390_indirect_jump_r7
U __s390_indirect_jump_r8
U __s390_indirect_jump_r9
make[6]: *** [/home/arnd/arm-soc/kernel/trace/Makefile:160: kernel/trace/simple_ring_buffer.o.checked] Error 1
Add these to the list and keep it roughly sorted into sanitizer
and architecture symbols.
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Vincent Donnefort <vdonnefort@google.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: https://patch.msgid.link/20260515105717.1023007-1-arnd@kernel.org
Fixes: 1211907ac0b5 ("tracing: Generate undef symbols allowlist for simple_ring_buffer")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Linus Torvalds [Fri, 15 May 2026 18:24:51 +0000 (11:24 -0700)]
Merge tag 'for-linus-7.1b-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
- one simple cleanup
- a fix for a corner case when running as Xen PV dom0
- a fix of a regression for Xen PV guests, introduced in 7.0
* tag 'for-linus-7.1b-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
x86/xen: Tolerate nested XEN_LAZY_MMU entering/leaving
x86/xen: Fix xen_e820_swap_entry_with_ram()
xen/arm: Replace __ASSEMBLY__ with __ASSEMBLER__ in interface.h
Linus Torvalds [Fri, 15 May 2026 18:12:54 +0000 (11:12 -0700)]
Merge tag 'platform-drivers-x86-v7.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fixes from Ilpo Järvinen:
- asus-nb-wmi:
- Use existing keyboard quirk for ASUS Zenbook Duo UX8407AA
- hp-wmi:
- Add support for Victus 16-r0xxx (8BC2)
- intel/vsec_tpmi:
- Move debugfs register before creating devices
- Prevent fault during unbind
- lenovo-wmi-*:
- Fix memory leak in lwmi_dev_evaluate_int()
- Balance IDA id allocation and free
- Balance component bind and unbind
- Prevent sending uninitialized WMI arguments to the device
- Decouple lenovo-wmi-gamezone and lenovo-wmi-other to simplify
module dependency graph
- Limit adding attributes to supported devices
- samsung-galaxybook:
- Handle kbd backlight, mic mute and camera block hotkeys
* tag 'platform-drivers-x86-v7.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/x86: asus-nb-wmi: add DMI quirk for ASUS Zenbook Duo UX8407AA
platform/x86: lenovo-wmi-other: Limit adding attributes to supported devices
platform/x86: lenovo-wmi-other: Add Attribute ID helper functions
platform/x86: lenovo-wmi-helpers: Move gamezone enums to wmi-helpers
platform/x86: lenovo: Decouple lenovo-wmi-gamezone and lenovo-wmi-other
platform/x86: lenovo-wmi-other: Fix tunable_attr_01 struct members
platform/x86: lenovo-wmi-other: Zero initialize WMI arguments
platform/x86: lenovo-wmi-other: Balance component bind and unbind
platform/x86: lenovo-wmi-other: Balance IDA id allocation and free
platform/x86: lenovo-wmi-helpers: Fix memory leak in lwmi_dev_evaluate_int()
platform/x86: hp-wmi: Add support for Victus 16-r0xxx (8BC2)
platform/x86/intel/tpmi/plr: Prevent fault during unbind
platform/x86: intel: Add notifiers support
platform/x86: intel: Move debugfs register before creating devices
platform/x86: samsung-galaxybook: Handle ACPI hotkey notifications
platform/x86: samsung-galaxybook: Refactor camera lens cover input device
PCI: brcmstb: Assign pcie->gen from of_pci_get_max_link_speed()
After commit 03f920936977 ("PCI: controller: Validate max-link-speed"),
pcie->gen stopped being assigned and as a result the established PCIe link
would stop supporting Gen3 speeds on 2712 since pcie->gen is used to
populate LnkCntl2 and LnkCap in brcm_pcie_set_gen().
If the 'max-link-speed' property is not specified, or it exceeds Gen3,
resort to the HW defaults.
Linus Torvalds [Fri, 15 May 2026 17:38:37 +0000 (10:38 -0700)]
Merge tag 'v7.1-p4' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
- Fix potential dead-lock in rhashtable when used by xattr
- Avoid calling kvfree on atomic path in rhashtable
* tag 'v7.1-p4' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
rhashtable: Add bucket_table_free_atomic() helper
mm/slab: Add kvfree_atomic() helper
rhashtable: drop ht->mutex in rhashtable_free_and_destroy()
Merge patch series "VFS changes for nfsd CB_NOTIFY callbacks in directory delegations"
The series starts with patches to allow the vfs to ignore certain types
of events on directories. nfsd can then request these sorts of
delegations on directories, and then set up inotify watches on the
directory to trigger sending CB_NOTIFY events.
* patches from https://patch.msgid.link/20260428-dir-deleg-v3-0-5a0780ba9def@kernel.org:
fsnotify: add FSNOTIFY_EVENT_RENAME data type
fsnotify: add fsnotify_modify_mark_mask()
fsnotify: new tracepoint in fsnotify()
filelock: add an inode_lease_ignore_mask helper
filelock: add a tracepoint to start of break_lease()
filelock: add support for ignoring deleg breaks for dir change events
filelock: pass current blocking lease to trace_break_lease_block() rather than "new_fl"
Jeff Layton [Tue, 28 Apr 2026 07:09:51 +0000 (08:09 +0100)]
fsnotify: add FSNOTIFY_EVENT_RENAME data type
Add a new fsnotify_rename_data struct and FSNOTIFY_EVENT_RENAME data
type that carries both the moved dentry and the inode that was
overwritten by the rename (if any).
Update fsnotify_data_inode(), fsnotify_data_dentry(), and
fsnotify_data_sb() to handle the new type, and add a new
fsnotify_data_rename_target() helper for extracting the overwritten
target inode.
Update fsnotify_move() to use the new data type for FS_RENAME and
FS_MOVED_TO events, passing the overwritten target inode through the
event data. FS_MOVED_FROM is unchanged since the source directory
doesn't need overwrite information.
This is done so that fsnotify consumers like nfsd can atomically
observe the overwritten file when a rename replaces an existing entry,
without needing a separate FS_DELETE event.
Assisted-by: Claude (Anthropic Claude Code)
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Link: https://patch.msgid.link/20260428-dir-deleg-v3-7-5a0780ba9def@kernel.org
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Jeff Layton [Tue, 28 Apr 2026 07:09:50 +0000 (08:09 +0100)]
fsnotify: add fsnotify_modify_mark_mask()
nfsd needs to be able to modify the mask on an existing mark when new
directory delegations are set or unset. Add an exported function that
allows the caller to set and clear bits in the mark->mask, and does
the recalculation if something changed.
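The set/clear/recalculate behaviour can be sketched in user space as follows. This is only an illustration: demo_modify_mask() and its recalcs counter are hypothetical, and applying the set bits before the clear bits is an assumption, since the description above does not state the order.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: set and clear bits in *mask, and count a
 * "recalculation" only when the mask actually changed.
 * Assumption: set bits are applied before clear bits.
 */
static bool demo_modify_mask(uint32_t *mask, uint32_t set, uint32_t clear,
			     unsigned int *recalcs)
{
	uint32_t updated = (*mask | set) & ~clear;

	if (updated == *mask)
		return false;	/* nothing changed, skip the recalculation */

	*mask = updated;
	(*recalcs)++;		/* stands in for recomputing the combined mask */
	return true;
}
```

Skipping the recalculation on a no-op update keeps repeated requests for the same bits cheap.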
Suggested-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Link: https://patch.msgid.link/20260428-dir-deleg-v3-6-5a0780ba9def@kernel.org
Acked-by: Jan Kara <jack@suse.cz>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Tejun Heo [Tue, 5 May 2026 00:51:21 +0000 (14:51 -1000)]
cgroup: Defer kill_css_finish() in cgroup_apply_control_disable()
Same race shape as the rmdir path that 93618edf7538 ("cgroup: Defer css
percpu_ref kill on rmdir until cgroup is depopulated") fixed: a task past
exit_signals() whose cset subsys[ssid] still pins the disabled controller's
css can be touching subsys state while ->css_offline() runs. The earlier
patches in this series built up the per-subsys-css deferral machinery and
routed cgroup_destroy_locked() through it. Apply the same shape to
cgroup_apply_control_disable():
kill_css_sync(css);
if (!css_is_populated(css))
kill_css_finish(css);
When the dying css is still populated, kill_css_finish() is deferred. The
walker in css_update_populated() fires kill_finish_work once the css's
hierarchical populated count drops to zero.
cgroup_lock_and_drain_offline()'s wait predicate switches from
percpu_ref_is_dying() to css_is_dying(). CSS_DYING is set by kill_css_sync()
and is a strict superset of percpu_ref_is_dying. Without this change, a +cpu
re-enable after a deferred -cpu disable would skip the drain (percpu_ref
isn't killed yet) and observe the still-CSS_DYING css through cgroup_css(),
treating it as live.
Jeff Layton [Tue, 28 Apr 2026 07:09:48 +0000 (08:09 +0100)]
filelock: add an inode_lease_ignore_mask helper
Add a new routine that returns a mask of all dir change events that are
currently ignored by any leases. nfsd will use this to determine how to
configure the fsnotify_mark mask.
Jeff Layton [Tue, 28 Apr 2026 07:09:46 +0000 (08:09 +0100)]
filelock: add support for ignoring deleg breaks for dir change events
If an NFS client requests a directory delegation with a notification
bitmask covering directory change events, the server shouldn't recall
the delegation. Instead the client will be notified of the change after
the fact.
Add support for ignoring lease breaks on directory changes. Add a new
flags parameter to try_break_deleg() and teach __break_lease how to
ignore certain types of delegation break events.
Jeff Layton [Tue, 28 Apr 2026 07:09:45 +0000 (08:09 +0100)]
filelock: pass current blocking lease to trace_break_lease_block() rather than "new_fl"
The break_lease_block tracepoint currently just shows the type of
"new_fl", which we can predict from the "flags" value. Switch it to
display info about "fl" instead, as that's the file_lease on which the
code is blocking.
For trace_break_lease_unblock(), pass it a NULL pointer. "fl" may have
been freed by that point, and passing it the info in new_fl is
deceptive.
93618edf7538 ("cgroup: Defer css percpu_ref kill on rmdir until cgroup is
depopulated") deferred kill_css_finish() at the cgroup level: rmdir waits
for the entire cgroup's populated count to drop to zero, then fires
kill_css_finish() on every subsystem css at once. Replace that with
per-subsys-css deferral. Each subsystem css now tracks its own hierarchical
populated count and independently defers its kill_css_finish() until its own
subtree drains.
The rmdir-race fix carries through unchanged in shape. The dying css's
->css_offline() still waits until no PF_EXITING task references it, and v2's
cgroup-level machinery goes away.
cgroup_apply_control_disable() has the same race shape (PF_EXITING tasks
pinning a css whose ->css_offline() is about to run) and stays synchronous
here. This patch lays the groundwork for fixing it - per-cgroup waiting
can't gate one subsys css being killed while the rest of the cgroup stays
live, but per-css can.
Subtree-wide invariant preserved: a dying ancestor css stays populated
through nr_populated_children until every dying descendant's task drains, so
the walker fires the ancestor's kill_finish_work only after all descendants
have drained.
Add paired smp_mb()s in kill_css_sync() and css_update_populated() to fence
the StoreLoad on (CSS_DYING, populated counter), guaranteeing that either
the walker queues kill_finish_work or the caller fires synchronously.
cgroup_destroy_locked() was implicitly fenced by an unrelated css_set_lock
pair; cgroup_apply_control_disable() in the next patch is not.
Tejun Heo [Tue, 5 May 2026 00:51:19 +0000 (14:51 -1000)]
cgroup: Move populated counters to cgroup_subsys_state
Later patches replace the cgroup-level finish_destroy_work deferral added
by 93618edf7538 ("cgroup: Defer css percpu_ref kill on rmdir until cgroup
is depopulated") with a per-subsys-css deferral. That needs each subsystem
css to track its own populated count. Move the populated counters from
cgroup onto cgroup_subsys_state. cgroup->self is itself a
cgroup_subsys_state and self.parent walks the same chain as cgroup_parent(),
so cgroup_update_populated() generalizes to a single css_update_populated()
taking a css. The cgroup-side bookkeeping runs only when the walk started
from a self css.
Keep nr_populated_{domain,threaded}_children on cgroup. The two sum to
self.nr_populated_children, but they stay as dedicated fields so that readers
like cgroup_can_be_thread_root() can access them without locking.
css_set_update_populated() also walks the per-subsys-css chain so each
subsystem css's hierarchical populated count is maintained. No reader
consumes those counts yet.
Tejun Heo [Tue, 5 May 2026 00:51:18 +0000 (14:51 -1000)]
cgroup: Annotate unlocked nr_populated_* accesses with READ_ONCE/WRITE_ONCE
cgroup_update_populated() updates nr_populated_csets,
nr_populated_domain_children, and nr_populated_threaded_children under
css_set_lock, but cgroup_has_tasks(), cgroup_is_populated(), and
cgroup_can_be_thread_root() read them without holding it. Use
READ_ONCE/WRITE_ONCE.
Tejun Heo [Tue, 5 May 2026 00:51:17 +0000 (14:51 -1000)]
cgroup: Inline cgroup_has_tasks() in cgroup.h
cpuset reads cs->css.cgroup->nr_populated_csets directly in two places to
test whether a cgroup has tasks. cgroup.c already has a matching helper,
cgroup_has_tasks(). Move it to cgroup.h as static inline and use that
instead. This is to prepare for relocation of cgroup->nr_populated_csets. No
semantic change.
Jens Axboe [Fri, 15 May 2026 16:19:09 +0000 (10:19 -0600)]
io_uring/net: punt IORING_OP_BIND async if it needs file create
For two reasons:
1) An opcode cannot block inside io_uring_enter() doing submissions, as
it'll stall the submission side pipeline.
2) Ending up in sb_start_write() -> __sb_start_write() ->
percpu_down_read_freezable() introduces a new lockdep edge, which it
correctly complains about.
Check if the socket type is AF_UNIX and has a non-empty pathname. If it
does, mark it REQ_F_FORCE_ASYNC to punt the submission to io-wq rather
than attempt to do it inline.
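The address check described above can be sketched in user space. The function name demo_bind_may_create_file() is hypothetical and this is not the io_uring code; it only shows how to tell an AF_UNIX address with a non-empty pathname apart from other addresses. Treating an empty first path byte as "no file" also covers Linux abstract socket addresses, which start with a NUL byte and do not create a filesystem entry.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/*
 * Hypothetical predicate: return true when binding to this address would
 * create a socket file, i.e. the family is AF_UNIX and the pathname is
 * non-empty.
 */
static bool demo_bind_may_create_file(const struct sockaddr *addr,
				      socklen_t len)
{
	const struct sockaddr_un *un;

	/* Too short to carry any path bytes: nothing to create. */
	if (len <= offsetof(struct sockaddr_un, sun_path))
		return false;
	if (addr->sa_family != AF_UNIX)
		return false;

	un = (const struct sockaddr_un *)addr;
	/* An empty first byte means no pathname (or an abstract address). */
	return un->sun_path[0] != '\0';
}
```

A caller could use such a predicate to decide up front whether an operation may block in the filesystem and should therefore be handed to a worker thread rather than run inline.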
With PROVE_LOCKING on a Snapdragon X1 and VM reclaim pressure, we see:
======================================================
WARNING: possible circular locking dependency detected
7.0.0-debug+ #43 Tainted: G W
------------------------------------------------------
kswapd0/82 is trying to acquire lock: ffff800080ec3870 (reservation_ww_class_acquire){+.+.}-{0:0}, at: msm_gem_shrinker_scan+0x17c/0x400 [msm]
but task is already holding lock: ffffc31709b263b8 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x88/0x988
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
other info that might help us debug this:
Chain exists of:
reservation_ww_class_acquire --> reservation_ww_class_mutex --> fs_reclaim
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(fs_reclaim);
lock(reservation_ww_class_mutex);
lock(fs_reclaim);
lock(reservation_ww_class_acquire);
*** DEADLOCK ***
1 lock held by kswapd0/82:
#0: ffffc31709b263b8 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x88/0x988
Hans Zhang [Fri, 15 May 2026 15:36:35 +0000 (23:36 +0800)]
MAINTAINERS: Remove Jianjun Wang as PCIe mediatek maintainer
Email to Jianjun Wang <jianjun.wang@mediatek.com> bounces with error:
"550 Relaying mail to jianjun.wang@mediatek.com is not allowed".
Remove the address to avoid sending future kernel maintenance queries
to an unreachable destination.
The MediaTek PCIe driver remains supported by Ryder Lee.
io_uring/epoll: disallow adding an epoll file to an epoll context
One of the nastier things about epoll is how it allows adding epoll
files to epoll contexts. This leads to all sorts of loop detection
code, and has been a source of issues in the past.
Arguably adding IORING_EPOLL_CTL is a historical mistake on the
io_uring side, but we're kind of stuck with it now as it does seem
to be in use according to code searches. But we can at least minimize
the damage a bit and just disallow this part of epoll, where nesting
issues can arise.
Cássio Gabriel [Mon, 11 May 2026 04:29:34 +0000 (01:29 -0300)]
ALSA: hda/cs35l41: Fix firmware load work teardown
cs35l41_hda creates ALSA controls whose private data points at the
cs35l41_hda object. The firmware load control can also queue
fw_load_work.
Those controls are not removed on component unbind, and device remove
only cancels fw_load_work through cs35l41_remove_dsp(). That helper is
skipped when halo_initialized is false. With firmware_autostart
disabled, a firmware load can be requested before the DSP has been
initialized. If the component or device is removed before the queued
work runs, the worker can run after teardown and dereference driver
state that is no longer valid.
Track the created controls and remove them on unbind so no new control
callback can reach the driver data or queue more work. Then cancel
fw_load_work to drain any request that was already queued. Also cancel
the work unconditionally during device remove before runtime PM teardown.
Fixes: 47ceabd99a28 ("ALSA: hda: cs35l41: Support Firmware switching and reloading")
Fixes: 4c870513fbb0 ("ALSA: hda: cs35l41: Add read-only ALSA control for forced mute")
Cc: stable@vger.kernel.org
Signed-off-by: Cássio Gabriel <cassiogabrielcontato@gmail.com>
Reviewed-by: Stefan Binding <sbinding@opensource.cirrus.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://patch.msgid.link/20260511-alsa-hda-cs35l41-fw-work-teardown-v1-1-1184e9bc4f25@gmail.com
* patches from https://patch.msgid.link/20260515153515.362266-1-cel@kernel.org:
nfsd: Cap case-folding probe cost across READDIR entries
nfsd: Map -ESTALE from case probe to NFS3ERR_STALE
nfsd: Use kernel credentials for case-info probe
fs: Clarify FS_CASEFOLD_FL semantics in UAPI header
nfs: Skip pathconf probe when neither field is consumed
nfs: Avoid transient zeroed case capability bits during probe
tools headers UAPI: Sync case-sensitivity flags from linux/fs.h
Chuck Lever [Fri, 15 May 2026 15:35:15 +0000 (11:35 -0400)]
nfsd: Cap case-folding probe cost across READDIR entries
NFSv4 READDIR carries a per-entry attrmask. When the attrmask
includes FATTR4_CASE_INSENSITIVE or FATTR4_CASE_PRESERVING,
nfsd4_encode_fattr4() resolves each non-directory child's case
attributes by calling nfsd_get_case_info(), which dget_parent()s
back to the directory being read and re-runs the cred swap and LSM
probe per child. The encoder amplifies a single answer into one
prepare_kernel_cred() allocation, two LSM hooks, and one put_cred()
RCU callback for every non-directory entry.
No mainstream NFSv4 client has been observed to populate a READDIR
attrmask with these attributes; the Linux client queries them only
via SERVER_CAPS at mount time. The exposure is therefore to test
clients exploring corner cases and to hostile clients that submit
an attrmask designed to multiply server work by rd_dircount.
Probe the directory being read once and cache the result on
struct nfsd4_readdir for use by every non-directory child. The
probe targets the readdir filehandle's dentry, which is held for
the duration of the request, rather than dget_parent() of a
child's locklessly-acquired dentry; the latter could be moved out
of the directory by a concurrent rename and report attributes
from an unrelated parent. Directory entries continue to be
queried individually, because casefold-capable filesystems (ext4,
f2fs) report case state per directory. The other callers of
nfsd4_encode_fattr4() (single GETATTR, the buffer wrapper) pass
NULL for the cache pointer and behave as before.
Reported-by: sashiko-bot <sashiko-bot@kernel.org>
Closes: https://sashiko.dev/#/patchset/20260507-case-sensitivity-v14-0-e62cc8200435@oracle.com?part=14
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://patch.msgid.link/20260515153515.362266-8-cel@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Chuck Lever [Fri, 15 May 2026 15:35:14 +0000 (11:35 -0400)]
nfsd: Map -ESTALE from case probe to NFS3ERR_STALE
The PATHCONF switch in nfsd3_proc_pathconf() recognizes -EOPNOTSUPP
(filesystem does not expose case state) and maps -EACCES / -EPERM to
nfserr_stale, but lets every other errno fall through to
nfserr_serverfault. -ESTALE escapes the same way even though RFC 1813
lists NFS3ERR_STALE as a permitted PATHCONF status, so a probe of an
NFS-backed re-export whose parent dentry has been invalidated returns
SERVERFAULT and tells the client the server is broken when the handle
itself simply went stale.
Add an explicit -ESTALE arm that maps to nfserr_stale.
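The resulting mapping can be sketched as a plain switch in user space. The enum and function names below are hypothetical and this is not the nfsd code; the numeric values for NFS3ERR_STALE (70) and NFS3ERR_SERVERFAULT (10006) are those defined in RFC 1813. Treating -EOPNOTSUPP as success with default case values is an assumption made only for this illustration.

```c
#include <errno.h>

/* Hypothetical status codes; numeric values follow RFC 1813. */
enum demo_nfs3_status {
	DEMO_NFS3_OK = 0,
	DEMO_NFS3ERR_STALE = 70,
	DEMO_NFS3ERR_SERVERFAULT = 10006,
};

/* Map the errno returned by a case-info probe to a PATHCONF status. */
static enum demo_nfs3_status demo_map_probe_error(int err)
{
	switch (err) {
	case 0:
		return DEMO_NFS3_OK;
	case -EOPNOTSUPP:
		/* Assumption: no case state exposed, reply with defaults. */
		return DEMO_NFS3_OK;
	case -EACCES:
	case -EPERM:
	case -ESTALE:	/* the explicit arm: a stale handle is not a server fault */
		return DEMO_NFS3ERR_STALE;
	default:
		return DEMO_NFS3ERR_SERVERFAULT;
	}
}
```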
Fixes: a8de9c3b40e4 ("nfsd: Report export case-folding via NFSv3 PATHCONF")
Reported-by: sashiko-bot <sashiko-bot@kernel.org>
Closes: https://sashiko.dev/#/patchset/20260507-case-sensitivity-v14-0-e62cc8200435@oracle.com?part=13
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://patch.msgid.link/20260515153515.362266-7-cel@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Chuck Lever [Fri, 15 May 2026 15:35:13 +0000 (11:35 -0400)]
nfsd: Use kernel credentials for case-info probe
nfsd_get_case_info() takes prepare_creds() and overrides fsuid/fsgid
to GLOBAL_ROOT, intending to escape per-client policy on the parent
directory. prepare_creds() copies the calling task's full credential,
including the LSM security label, so only the DAC identity is
neutralized. With labeled NFS, where the active LSM context has been
mapped to the client, security_inode_file_getattr() can still deny the
probe with -EACCES even though the case-folding property the caller
wants is structural and identical for every client. The docblock
already states the intent ("the probe runs with kernel credentials"),
which the implementation does not deliver.
prepare_kernel_cred(&init_task) constructs a credential from
init_task's identity and security label, the kernel's own unconfined
context. Use it instead and drop the redundant fsuid/fsgid overrides
that init_task already supplies. The probe now matches the docblock,
LSM denials on the parent disappear, and the call sites that map an
unexpected error to NFS3ERR_SERVERFAULT or fail an NFSv4 GETATTR
outright stop seeing -EACCES from this path.
Reported-by: sashiko-bot <sashiko-bot@kernel.org>
Closes: https://sashiko.dev/#/patchset/20260507-case-sensitivity-v14-0-e62cc8200435@oracle.com?part=14
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://patch.msgid.link/20260515153515.362266-6-cel@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Chuck Lever [Fri, 15 May 2026 15:35:12 +0000 (11:35 -0400)]
fs: Clarify FS_CASEFOLD_FL semantics in UAPI header
The existing one-liner "Folder is case insensitive" leaves the
impression that FS_CASEFOLD_FL is reserved for directories.
That impression is wrong: filesystems that derive
case-insensitivity from mount or volume state report the bit on
non-directory inodes via i_op->fileattr_get, so userspace
inspecting FS_IOC_GETFLAGS can see it on any inode type.
Replace the one-liner with a block comment that names directories
as the typical case, records that non-directory inodes may also
report the bit, and notes FS_XFLAG_CASEFOLD as the read-only
companion exposed through FS_IOC_FSGETXATTR.
Reported-by: sashiko-bot <sashiko-bot@kernel.org>
Closes: https://sashiko.dev/#/patchset/20260507-case-sensitivity-v14-0-e62cc8200435@oracle.com?part=3
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://patch.msgid.link/20260515153515.362266-5-cel@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Chuck Lever [Fri, 15 May 2026 15:35:11 +0000 (11:35 -0400)]
nfs: Skip pathconf probe when neither field is consumed
The PATHCONF RPC issued from nfs_probe_fsinfo() supplies two pieces of
information: max_namelen, used only when server->namelen has not been
pinned by mount options, and the case_insensitive / case_preserving
fields, used only by the NFSv2/NFSv3 path. NFSv4 receives its case
sensitivity caps from the FATTR4_CASE_* attributes during the
set_capabilities probe, and a non-zero server->namelen short-circuits
the only other field of interest.
When both conditions hold (NFSv4 with namelen pinned), the pathconf
reply is discarded in full but the round-trip is still on the mount
critical path. Gate the call on version < 4 || namelen == 0 so that
mounts which cannot benefit from the reply do not pay for it.
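The gate described above can be read as a small predicate. This is only a sketch: pathconf_probe_needed() is a hypothetical name, and its two parameters stand in for clp->rpc_ops->version and server->namelen.

```c
#include <stdbool.h>

/*
 * Hypothetical predicate mirroring the gate in the commit message.
 * version stands in for clp->rpc_ops->version, namelen for server->namelen.
 */
bool pathconf_probe_needed(unsigned int version, unsigned int namelen)
{
	/*
	 * NFSv2/v3 consume the case fields from the reply. Any version
	 * consumes max_namelen when mount options did not pin namelen.
	 */
	return version < 4 || namelen == 0;
}
```

Only the NFSv4 case with a pinned namelen evaluates to false, which is the one case where the reply would be discarded in full.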
Reported-by: sashiko-bot <sashiko-bot@kernel.org>
Closes: https://sashiko.dev/#/patchset/20260507-case-sensitivity-v14-0-e62cc8200435@oracle.com?part=10
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://patch.msgid.link/20260515153515.362266-4-cel@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Chuck Lever [Fri, 15 May 2026 15:35:10 +0000 (11:35 -0400)]
nfs: Avoid transient zeroed case capability bits during probe
nfs_probe_fsinfo() clears NFS_CAP_CASE_INSENSITIVE and
NFS_CAP_CASE_NONPRESERVING ahead of the synchronous pathconf RPC and
sets them again only after the reply arrives. The code path is gated
by clp->rpc_ops->version < 4 and is therefore reached on NFSv2/v3
remount via nfs_reconfigure(), which calls nfs_probe_server() against
a live mount. Concurrent readers walking server->caps can observe the
cleared state for the duration of the round-trip and report the wrong
case-sensitivity attributes.
Compute the post-probe capability mask on the stack and assign it to
server->caps in a single store so readers see either the stale value
or the freshly computed one, never an intermediate zero. Preserve the
original behaviour of dropping the bits when the pathconf RPC itself
fails.
The analogous transient zero on the NFSv4 path lives in
nfs4_server_capabilities() and is left for a separate fix.
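The compute-then-store pattern used on the NFSv2/v3 path can be sketched in userspace as follows. Everything here is an assumption made for illustration: struct server_model, update_case_caps() and the two bit values are placeholders rather than the kernel's definitions, and a plain assignment stands in for whatever store the real code uses.

```c
#include <stdbool.h>

/* Placeholder bit values for this sketch, not the kernel's definitions. */
#define CAP_CASE_INSENSITIVE	(1U << 0)
#define CAP_CASE_NONPRESERVING	(1U << 1)

/* Hypothetical stand-in for the relevant part of struct nfs_server. */
struct server_model {
	unsigned int caps;
};

/*
 * Hypothetical helper: fold a pathconf result into the capability mask.
 * rpc_ok is false when the pathconf RPC itself failed.
 */
void update_case_caps(struct server_model *server, bool rpc_ok,
		      bool case_insensitive, bool case_preserving)
{
	/* Work on a local copy; server->caps keeps its old value meanwhile. */
	unsigned int caps = server->caps;

	caps &= ~(CAP_CASE_INSENSITIVE | CAP_CASE_NONPRESERVING);
	if (rpc_ok) {
		if (case_insensitive)
			caps |= CAP_CASE_INSENSITIVE;
		if (!case_preserving)
			caps |= CAP_CASE_NONPRESERVING;
	}

	/* Publish the finished mask with one assignment. */
	server->caps = caps;
}
```

Because the bits are cleared only in the local copy, server->caps never holds the cleared state before the RPC result is known. When rpc_ok is false both bits end up dropped, matching the preserved failure behaviour.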
Reported-by: sashiko-bot <sashiko-bot@kernel.org>
Closes: https://sashiko.dev/#/patchset/20260507-case-sensitivity-v14-0-e62cc8200435@oracle.com?part=10
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://patch.msgid.link/20260515153515.362266-3-cel@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Chuck Lever [Fri, 15 May 2026 15:35:09 +0000 (11:35 -0400)]
tools headers UAPI: Sync case-sensitivity flags from linux/fs.h
The case-sensitivity series adds FS_XFLAG_CASEFOLD and
FS_XFLAG_CASENONPRESERVING to include/uapi/linux/fs.h, and
tools/perf/check-headers.sh would warn about the resulting drift
in the perf beauty copy. Pick up only those two flags (and the
surrounding comment block) so the series does not introduce new
drift of its own.
This is not a full sync. The perf copy is also missing the
FS_IOC_SHUTDOWN block added by commit 1f662195dbc0 ("fs: add
generic FS_IOC_SHUTDOWN definitions"). Because
tools/perf/check-headers.sh emits a single warning per file, that
warning will remain active until the older drift is picked up
too; closing it is left to a separate sync outside this series.
Reported-by: sashiko-bot <sashiko-bot@kernel.org>
Closes: https://sashiko.dev/#/patchset/20260507-case-sensitivity-v14-0-e62cc8200435@oracle.com?part=2
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://patch.msgid.link/20260515153515.362266-2-cel@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Merge patch series "io_uring related epoll cleanups"
Jens Axboe <axboe@kernel.dk> says:
One of the nastier things about epoll is how it allows nesting contexts
inside each other, leading to the necessity of loop detection and the
issues that have come with that.
I don't believe there's any reason to support nesting on the io_uring
side, in fact IORING_OP_EPOLL_CTL is a historical mistake, imho. But
let's at least try and contain the damage and disallow nested contexts
from our side.
Christian Brauner <brauner@kernel.org> says:
Bring in the eventpoll specific io_uring changes together with the
eventpoll cleanup I did this cycle. The io_uring changes can go on top
of both through the block tree.
* patches from https://patch.msgid.link/20260514140817.623026-1-axboe@kernel.dk:
eventpoll: rename struct epoll_filefd to epoll_key
eventpoll: add file based control interface
eventpoll: export is_file_epoll()
eventpoll: pass struct epoll_filefd through ep_find() and ep_insert()
Jens Axboe [Thu, 14 May 2026 14:07:19 +0000 (08:07 -0600)]
eventpoll: add file based control interface
Add do_epoll_ctl_file(), which takes a pre-resolved epoll file and a
struct epoll_filefd for the target rather than two integer file
descriptors. do_epoll_ctl() remains as a thin wrapper.
This is in preparation for using the file based interface from io_uring.
Cheng-Han Wu [Sun, 3 May 2026 10:14:29 +0000 (18:14 +0800)]
docs: admin-guide: add IGNORE_DIRS example for cscope
The workload tracing guide shows how to build a cscope database by
running the cscope command directly. The kernel build system also provides
a cscope target, which supports IGNORE_DIRS for excluding directories
from the generated database.
Mention make cscope and show how to exclude Documentation/ as an example.
Cheng-Han Wu [Sun, 3 May 2026 10:14:28 +0000 (18:14 +0800)]
docs: admin-guide: clarify perf bench all behavior
The workload tracing guide lists a fixed set of benchmarks for
"perf bench all". This list is stale and can become outdated when
perf adds, removes, or renames benchmark collections or individual
benchmarks.
Describe "perf bench all" as running all available benchmarks in the perf
bench framework instead. Also document how to list the collections and
benchmarks available on a given system.
Cheng-Han Wu [Sun, 3 May 2026 10:14:27 +0000 (18:14 +0800)]
docs: admin-guide: fix stress-ng command examples
The workload tracing guide includes stress-ng command examples with a
stray "command." word at the end. This makes the examples invalid if they
are copied and run directly.
Remove the stray word from the stress-ng example. Also use "--" in the
perf record example to clearly separate perf record options from the
workload command being recorded.
Arnd Bergmann [Fri, 15 May 2026 14:47:09 +0000 (16:47 +0200)]
Merge tag 'pxa1908-dt-for-7.2' of https://codeberg.org/pxa1908-mainline/linux into soc/dt
PXA1908 DT changes for 7.2
This set consists of a bug fix and three QoL fixes:
* Fix touchscreen breakage in low ambient temperatures on coreprimevelte
* Free up most of the framebuffer memory on coreprimevelte
* Fill in some missing properties for pre-0.2 PSCI and coreprimevelte SDIO
* tag 'pxa1908-dt-for-7.2' of https://codeberg.org/pxa1908-mainline/linux:
arm64: dts: marvell: samsung-coreprimevelte: Add missing SDIO properties
arm64: dts: marvell: pxa1908: Add PSCI function IDs
arm64: dts: marvell: samsung,coreprimevelte: Use memory-region for framebuffer
arm64: dts: marvell: samsung-coreprimevelte: Increase touchscreen voltage
Chao Gao [Thu, 7 May 2026 13:47:31 +0000 (06:47 -0700)]
Documentation: core-api/cpu_hotplug: Remove stale cpu0_hotplug docs
Commit e59e74dc48a3 ("x86/topology: Remove CPU0 hotplug option")
removed the 'cpu0_hotplug' option, but its documentation remained in
cpu_hotplug.rst. Remove the stale entry.
Reported-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Message-ID: <20260507134732.254617-1-chao.gao@intel.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>