Merge tag 'landlock-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux
Pull Landlock update from Mickaël Salaün:
"This adds a new Landlock access right for pathname UNIX domain sockets
thanks to a new LSM hook, and a few fixes"
* tag 'landlock-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: (23 commits)
landlock: Document fallocate(2) as another truncation corner case
landlock: Document FS access right for pathname UNIX sockets
selftests/landlock: Simplify ruleset creation and enforcement in fs_test
selftests/landlock: Check that coredump sockets stay unrestricted
selftests/landlock: Audit test for LANDLOCK_ACCESS_FS_RESOLVE_UNIX
selftests/landlock: Test LANDLOCK_ACCESS_FS_RESOLVE_UNIX
selftests/landlock: Replace access_fs_16 with ACCESS_ALL in fs_test
samples/landlock: Add support for named UNIX domain socket restrictions
landlock: Clarify BUILD_BUG_ON check in scoping logic
landlock: Control pathname UNIX domain socket resolution by path
landlock: Use mem_is_zero() in is_layer_masks_allowed()
lsm: Add LSM hook security_unix_find
landlock: Fix kernel-doc warning for pointer-to-array parameters
landlock: Fix formatting in tsync.c
landlock: Improve kernel-doc "Return:" section consistency
landlock: Add missing kernel-doc "Return:" sections
selftests/landlock: Fix format warning for __u64 in net_test
selftests/landlock: Skip stale records in audit_match_record()
selftests/landlock: Drain stale audit records on init
selftests/landlock: Fix socket file descriptor leaks in audit helpers
...
Merge tag 'selinux-pr-20260410' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux update from Paul Moore:
- Annotate a known race condition to soothe KCSAN
* tag 'selinux-pr-20260410' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: annotate intentional data race in inode_doinit_with_dentry()
Merge tag 'lsm-pr-20260410' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm
Pull LSM updates from Paul Moore:
"We only have five patches in the LSM tree, but three of the five are
for an important bugfix relating to overlayfs and the mmap() and
mprotect() access controls for LSMs. Highlights below:
- Fix problems with the mmap() and mprotect() LSM hooks on overlayfs
As we are dealing with problems both in mmap() and mprotect() there
are essentially two components to this fix, spread across three
patches with all marked for stable.
The simplest portion of the fix is the creation of a new LSM hook,
security_mmap_backing_file(), that is used to enforce LSM mmap()
access controls on backing files in the stacked/overlayfs case. The
existing security_mmap_file() does not have visibility past the
user file. You can see from the associated SELinux hook callback
the code is fairly straightforward.
The mprotect() fix is a bit more complicated as there is no way in
the mprotect() code path to inspect both the user and backing
files, and bolting on a second file reference to vm_area_struct
wasn't really an option.
The solution taken here adds a LSM security blob and associated
hooks to the backing_file struct that LSMs can use to capture and
store relevant information from the user file. While the necessary
SELinux information is relatively small, a single u32, I expect
other LSMs to require more than that, and a dedicated backing_file
LSM blob provides a storage mechanism without negatively impacting
other filesystems.
I want to note that other LSMs beyond SELinux have been involved in
the discussion of the fixes presented here and they are working on
their own related changes using these new hooks, but due to other
issues those patches will be coming at a later date.
- Use kstrdup_const()/kfree_const() for securityfs symlink targets
- Resolve a handful of kernel-doc warnings in cred.h"
* tag 'lsm-pr-20260410' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm:
selinux: fix overlayfs mmap() and mprotect() access checks
lsm: add backing_file LSM hooks
fs: prepare for adding LSM blob to backing_file
securityfs: use kstrdup_const() to manage symlink targets
cred: fix kernel-doc warnings in cred.h
Merge tag 'audit-pr-20260410' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit
Pull audit updates from Paul Moore:
- Improved handling of unknown status requests from userspace
The current kernel code ignores unknown/unused request bits sent from
userspace and returns an error code based on the results of the
request(s) it does understand. The patch from Ricardo fixes this so
that unknown requests return an -EINVAL to userspace, making
compatibility a bit easier moving forward.
- A number of small style and formatting cleanups
* tag 'audit-pr-20260410' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
audit: handle unknown status requests in audit_receive_msg()
audit: fix coding style issues
audit: remove redundant initialization of static variables to 0
audit: fix whitespace alignment in include/uapi/linux/audit.h
Merge tag 'vfs-7.1-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull misc vfs updates from Christian Brauner:
"Features:
- coredump: add tracepoint for coredump events
- fs: hide file and bfile caches behind runtime const machinery
Fixes:
- fix architecture-specific compat_ftruncate64 implementations
- dcache: Limit the minimal number of bucket to two
- fs/omfs: reject s_sys_blocksize smaller than OMFS_DIR_START
- fs/mbcache: cancel shrink work before destroying the cache
- dcache: permit dynamic_dname()s up to NAME_MAX
Cleanups:
- remove or unexport unused fs_context infrastructure
- trivial ->setattr cleanups
- selftests/filesystems: Assume that TIOCGPTPEER is defined
- writeback: fix kernel-doc function name mismatch for wb_put_many()
- autofs: replace manual symlink buffer allocation in autofs_dir_symlink
- init/initramfs.c: trivial fix: FSM -> Finite-state machine
- fs: remove stale and duplicate forward declarations
- readdir: Introduce dirent_size()
- fs: Replace user_access_{begin/end} by scoped user access
- kernel: acct: fix duplicate word in comment
- fs: write a better comment in step_into() concerning .mnt assignment
- fs: attr: fix comment formatting and spelling issues"
* tag 'vfs-7.1-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (28 commits)
dcache: permit dynamic_dname()s up to NAME_MAX
fs: attr: fix comment formatting and spelling issues
fs: hide file and bfile caches behind runtime const machinery
fs: write a better comment in step_into() concerning .mnt assignment
proc: rename proc_notify_change to proc_setattr
proc: rename proc_setattr to proc_nochmod_setattr
affs: rename affs_notify_change to affs_setattr
adfs: rename adfs_notify_change to adfs_setattr
hfs: update comments on hfs_inode_setattr
kernel: acct: fix duplicate word in comment
fs: Replace user_access_{begin/end} by scoped user access
readdir: Introduce dirent_size()
coredump: add tracepoint for coredump events
fs: remove do_sys_truncate
fs: pass on FTRUNCATE_* flags to do_truncate
fs: fix archiecture-specific compat_ftruncate64
fs: remove stale and duplicate forward declarations
init/initramfs.c: trivial fix: FSM -> Finite-state machine
autofs: replace manual symlink buffer allocation in autofs_dir_symlink
fs/mbcache: cancel shrink work before destroying the cache
...
Merge tag 'vfs-7.1-rc1.pidfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull clone and pidfs updates from Christian Brauner:
"Add three new clone3() flags for pidfd-based process lifecycle
management.
CLONE_AUTOREAP:
CLONE_AUTOREAP makes a child process auto-reap on exit without ever
becoming a zombie. This is a per-process property in contrast to
the existing auto-reap mechanism via SA_NOCLDWAIT or SIG_IGN for
SIGCHLD which applies to all children of a given parent.
Currently the only way to automatically reap children is to set
SA_NOCLDWAIT or SIG_IGN on SIGCHLD. This is a parent-scoped
property affecting all children which makes it unsuitable for
libraries or applications that need selective auto-reaping of
specific children while still being able to wait() on others.
CLONE_AUTOREAP stores an autoreap flag in the child's
signal_struct. When the child exits do_notify_parent() checks this
flag and causes exit_notify() to transition the task directly to
EXIT_DEAD. Since the flag lives on the child it survives
reparenting: if the original parent exits and the child is
reparented to a subreaper or init the child still auto-reaps when
it eventually exits. This is cleaner than forcing the subreaper to
get SIGCHLD and then reaping it. If the parent doesn't care the
subreaper won't care. If there's a subreaper that would care it
would be easy enough to add a prctl() that either just turns back
on SIGCHLD and turns off auto-reaping or a prctl() that just
notifies the subreaper whenever a child is reparented to it.
CLONE_AUTOREAP can be combined with CLONE_PIDFD to allow the parent
to monitor the child's exit via poll() and retrieve exit status via
PIDFD_GET_INFO. Without CLONE_PIDFD it provides a fire-and-forget
pattern. No exit signal is delivered so exit_signal must be zero.
CLONE_THREAD and CLONE_PARENT are rejected: CLONE_THREAD because
autoreap is a process-level property, and CLONE_PARENT because an
autoreap child reparented via CLONE_PARENT could become an
invisible zombie under a parent that never calls wait().
The flag is not inherited by the autoreap process's own children.
Each child that should be autoreaped must be explicitly created
with CLONE_AUTOREAP.
CLONE_NNP:
CLONE_NNP sets no_new_privs on the child at clone time. Unlike
prctl(PR_SET_NO_NEW_PRIVS) which a process sets on itself,
CLONE_NNP allows the parent to impose no_new_privs on the child at
creation without affecting the parent's own privileges.
CLONE_THREAD is rejected because threads share credentials.
CLONE_NNP is useful on its own for any spawn-and-sandbox pattern
but was specifically introduced to enable unprivileged usage of
CLONE_PIDFD_AUTOKILL.
CLONE_PIDFD_AUTOKILL:
This flag ties a child's lifetime to the pidfd returned from
clone3(). When the last reference to the struct file created by
clone3() is closed the kernel sends SIGKILL to the child. A pidfd
obtained via pidfd_open() for the same process does not keep the
child alive and does not trigger autokill - only the specific
struct file from clone3() has this property. This is useful for
container runtimes, service managers, and sandboxed subprocess
execution - any scenario where the child must die if the parent
crashes or abandons the pidfd or just wants a throwaway helper
process.
CLONE_PIDFD_AUTOKILL requires both CLONE_PIDFD and CLONE_AUTOREAP.
It requires CLONE_PIDFD because the whole point is tying the
child's lifetime to the pidfd. It requires CLONE_AUTOREAP because a
killed child with no one to reap it would become a zombie - the
primary use case is the parent crashing or abandoning the pidfd so
no one is around to call waitpid(). CLONE_THREAD is rejected
because autokill targets a process not a thread.
If CLONE_NNP is specified together with CLONE_PIDFD_AUTOKILL an
unprivileged user may spawn a process that is autokilled. The child
cannot escalate privileges via setuid/setgid exec after being
spawned. If CLONE_PIDFD_AUTOKILL is specified without CLONE_NNP the
caller must have CAP_SYS_ADMIN in its user namespace"
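
The flag combination rules described above can be summarised as a validation function. This is a userspace sketch only: the bit values chosen for the three new flags are placeholders (the real UAPI values are not given in the message), the function name is hypothetical, and the -EINVAL/-EPERM error codes are assumptions rather than what the kernel necessarily returns.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Existing flags, mirroring the values in the Linux UAPI <linux/sched.h>. */
#define DEMO_CLONE_PIDFD  0x00001000ULL
#define DEMO_CLONE_PARENT 0x00008000ULL
#define DEMO_CLONE_THREAD 0x00010000ULL

/* New flags: placeholder bits for illustration, NOT the real UAPI values. */
#define DEMO_CLONE_AUTOREAP       (1ULL << 40)
#define DEMO_CLONE_NNP            (1ULL << 41)
#define DEMO_CLONE_PIDFD_AUTOKILL (1ULL << 42)

/* Returns 0 if the combination is allowed by the rules in the message. */
static int demo_check_clone_flags(uint64_t flags, int exit_signal,
				  bool has_cap_sys_admin)
{
	assert(exit_signal >= 0);

	if (flags & DEMO_CLONE_AUTOREAP) {
		/* Process-level property; no invisible zombies via CLONE_PARENT. */
		if (flags & (DEMO_CLONE_THREAD | DEMO_CLONE_PARENT))
			return -EINVAL;
		/* No exit signal is delivered, so exit_signal must be zero. */
		if (exit_signal != 0)
			return -EINVAL;
	}

	/* Threads share credentials, so CLONE_NNP cannot apply to one thread. */
	if ((flags & DEMO_CLONE_NNP) && (flags & DEMO_CLONE_THREAD))
		return -EINVAL;

	if (flags & DEMO_CLONE_PIDFD_AUTOKILL) {
		/* Requires both CLONE_PIDFD and CLONE_AUTOREAP. */
		if (!(flags & DEMO_CLONE_PIDFD) || !(flags & DEMO_CLONE_AUTOREAP))
			return -EINVAL;
		/* Autokill targets a process, not a thread. */
		if (flags & DEMO_CLONE_THREAD)
			return -EINVAL;
		/* Without CLONE_NNP the caller needs CAP_SYS_ADMIN. */
		if (!(flags & DEMO_CLONE_NNP) && !has_cap_sys_admin)
			return -EPERM;
	}

	return 0;
}
```

For example, an unprivileged caller passing DEMO_CLONE_PIDFD | DEMO_CLONE_AUTOREAP | DEMO_CLONE_PIDFD_AUTOKILL | DEMO_CLONE_NNP with exit_signal 0 passes the check, while dropping DEMO_CLONE_NNP makes the same call depend on the capability.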
Merge tag 'vfs-7.1-rc1.bh.metadata' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs buffer_head updates from Christian Brauner:
"This cleans up the mess that has accumulated over the years in
metadata buffer_head tracking for inodes.
It moves the tracking into a dedicated structure in the filesystem-private
part of the inode (so that we don't use private_list, private_data,
and private_lock in struct address_space), and also moves a couple of other
users of private_data and private_list so these are removed from
struct address_space, saving 3 longs in struct inode for 99% of inodes"
* tag 'vfs-7.1-rc1.bh.metadata' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (42 commits)
fs: Drop i_private_list from address_space
fs: Drop mapping_metadata_bhs from address space
ext4: Track metadata bhs in fs-private inode part
minix: Track metadata bhs in fs-private inode part
udf: Track metadata bhs in fs-private inode part
fat: Track metadata bhs in fs-private inode part
bfs: Track metadata bhs in fs-private inode part
affs: Track metadata bhs in fs-private inode part
ext2: Track metadata bhs in fs-private inode part
fs: Provide functions for handling mapping_metadata_bhs directly
fs: Switch inode_has_buffers() to take mapping_metadata_bhs
fs: Make bhs point to mapping_metadata_bhs
fs: Move metadata bhs tracking to a separate struct
fs: Fold fsync_buffers_list() into sync_mapping_buffers()
fs: Drop osync_buffers_list()
kvm: Use private inode list instead of i_private_list
fs: Remove i_private_data
aio: Stop using i_private_data and i_private_lock
hugetlbfs: Stop using i_private_data
fs: Stop using i_private_data for metadata bh tracking
...
Merge tag 'vfs-7.1-rc1.kino' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs i_ino updates from Christian Brauner:
"For historical reasons, the inode->i_ino field is an unsigned long,
which means that it's 32 bits on 32-bit architectures. This has caused
a number of filesystems to implement hacks to hash a 64-bit identifier
into a 32-bit field, and deprives us of a universal identifier field
for an inode.
This changes the inode->i_ino field from an unsigned long to a u64.
This shouldn't make any material difference on 64-bit hosts, but
32-bit hosts will see struct inode grow by at least 4 bytes. This
could have effects on slabcache sizes and field alignment.
The bulk of the changes are to format strings and tracepoints, since
the kernel itself doesn't care that much about the i_ino field. The
first patch changes some vfs function arguments, so check that one out
carefully.
With this change, we may be able to shrink some inode structures. For
instance, struct nfs_inode has a fileid field that holds the 64-bit
inode number. With this set of changes, that field could be
eliminated. I'd rather leave that sort of cleanups for later just to
keep this simple"
* tag 'vfs-7.1-rc1.kino' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
nilfs2: fix 64-bit division operations in nilfs_bmap_find_target_in_group()
EVM: add comment describing why ino field is still unsigned long
vfs: remove externs from fs.h on functions modified by i_ino widening
treewide: fix missed i_ino format specifier conversions
ext4: fix signed format specifier in ext4_load_inode trace event
treewide: change inode->i_ino from unsigned long to u64
nilfs2: widen trace event i_ino fields to u64
f2fs: widen trace event i_ino fields to u64
ext4: widen trace event i_ino fields to u64
zonefs: widen trace event i_ino fields to u64
hugetlbfs: widen trace event i_ino fields to u64
ext2: widen trace event i_ino fields to u64
cachefiles: widen trace event i_ino fields to u64
vfs: widen trace event i_ino fields to u64
net: change sock.sk_ino and sock_i_ino() to u64
audit: widen ino fields to u64
vfs: widen inode hash/lookup functions to u64
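
To make the motivation concrete, here is a small userspace sketch of the kind of folding hack the message refers to, and of why a full 64-bit field avoids it. The function names are hypothetical and the XOR fold is only one plausible scheme, not a copy of any particular filesystem's code. PRIu64 stands in for the kernel-side format specifier conversions.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Fold a 64-bit identifier into 32 bits, as a filesystem might have done
 * when i_ino was only 32 bits wide. Distinct ids can collide.
 */
static uint32_t demo_fold_ino32(uint64_t fileid)
{
	return (uint32_t)(fileid ^ (fileid >> 32));
}

/*
 * Format a 64-bit inode number. Widening the field means every format
 * string that printed it as an unsigned long needs a 64-bit specifier.
 */
static int demo_format_ino(char *buf, size_t len, uint64_t ino)
{
	assert(buf != NULL);
	return snprintf(buf, len, "ino=%" PRIu64, ino);
}
```

Two different identifiers such as 0x0000000100000002 and 0x0000000200000001 fold to the same 32-bit value here, which is exactly the loss of a universal identifier that a u64 i_ino avoids.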
Merge tag 'vfs-7.1-rc1.integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs integrity updates from Christian Brauner:
"This adds support to generate and verify integrity information (aka
T10 PI) in the file system, instead of the automatic below the covers
support that is currently used.
The implementation is based on refactoring the existing block layer PI
code to be reusable for this use case, and then adding relatively
small wrappers for the file system use case. These are then used in
iomap to implement the semantics, and wired up in XFS with a small
amount of glue code.
Compared to the baseline this does not change performance for writes,
but increases read performance by up to 15% for 4k I/O, with the benefit
decreasing with larger I/O sizes as even the baseline maxes out the
device quickly on my older enterprise SSD"
* tag 'vfs-7.1-rc1.integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
xfs: support T10 protection information
iomap: support T10 protection information
iomap: support ioends for buffered reads
iomap: add a bioset pointer to iomap_read_folio_ops
ntfs3: remove copy and pasted iomap code
iomap: allow file systems to hook into buffered read bio submission
iomap: only call into ->submit_read when there is a read_ctx
iomap: pass the iomap_iter to ->submit_read
iomap: refactor iomap_bio_read_folio_range
block: pass a maxlen argument to bio_iov_iter_bounce
block: add fs_bio_integrity helpers
block: make max_integrity_io_size public
block: prepare generation / verification helpers for fs usage
block: add a bdev_has_integrity_csum helper
block: factor out a bio_integrity_setup_default helper
block: factor out a bio_integrity_action helper
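
As a rough illustration of what "generate and verify" means for protection information, the sketch below builds and checks a PI tuple for one data block. It assumes the CRC guard type (CRC-16 with polynomial 0x8BB7) and a Type 1 style reference tag taken from the low 32 bits of the block address. The struct and function names are hypothetical, endianness of the on-media format is ignored, and none of this is the kernel's block-layer or iomap code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified in-memory view of an 8-byte PI tuple (host byte order). */
struct demo_pi_tuple {
	uint16_t guard;   /* CRC over the data block */
	uint16_t app_tag; /* application tag, unused in this sketch */
	uint32_t ref_tag; /* low 32 bits of the block address */
};

/* Bitwise CRC-16/T10-DIF: polynomial 0x8BB7, initial value 0, MSB first. */
static uint16_t demo_crc16_t10dif(const uint8_t *data, size_t len)
{
	uint16_t crc = 0;

	for (size_t i = 0; i < len; i++) {
		crc ^= (uint16_t)((uint16_t)data[i] << 8);
		for (int bit = 0; bit < 8; bit++) {
			if (crc & 0x8000)
				crc = (uint16_t)((crc << 1) ^ 0x8BB7);
			else
				crc = (uint16_t)(crc << 1);
		}
	}
	return crc;
}

/* Generate the tuple on the write path. */
static struct demo_pi_tuple demo_pi_generate(const uint8_t *block, size_t len,
					     uint64_t lba)
{
	struct demo_pi_tuple pi;

	assert(block != NULL);
	pi.guard = demo_crc16_t10dif(block, len);
	pi.app_tag = 0;
	pi.ref_tag = (uint32_t)lba;
	return pi;
}

/* Verify on the read path: 0 if intact, -1 guard mismatch, -2 ref mismatch. */
static int demo_pi_verify(const uint8_t *block, size_t len, uint64_t lba,
			  const struct demo_pi_tuple *pi)
{
	assert(block != NULL && pi != NULL);
	if (pi->guard != demo_crc16_t10dif(block, len))
		return -1;
	if (pi->ref_tag != (uint32_t)lba)
		return -2;
	return 0;
}
```

The guard catches corrupted data, while the reference tag catches a block that is intact but was read from or written to the wrong location.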
Merge tag 'vfs-7.1-rc1.directory' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs directory updates from Christian Brauner:
"Recently 'start_creating', 'start_removing', 'start_renaming' and
related interfaces were added which combine the locking and the
lookup.
At that time many callers were changed to use the new interfaces.
However there are still an assortment of places outside of the core
vfs where the directory is locked explicitly, whether with inode_lock()
or lock_rename() or similar. These were missed in the first pass for
an assortment of uninteresting reasons.
This addresses the remaining places where explicit locking is used,
and changes them to use the new interfaces, or otherwise removes the
explicit locking.
The biggest changes are in overlayfs. The other changes are quite
simple, though maybe the cachefiles change is the least simple of
those"
* tag 'vfs-7.1-rc1.directory' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
VFS: unexport lock_rename(), lock_rename_child(), unlock_rename()
ovl: remove ovl_lock_rename_workdir()
ovl: use is_subdir() for testing if one thing is a subdir of another
ovl: change ovl_create_real() to get a new lock when re-opening created file.
ovl: pass name buffer to ovl_start_creating_temp()
cachefiles: change cachefiles_bury_object to use start_renaming_dentry()
ovl: Simplify ovl_lookup_real_one()
VFS: make lookup_one_qstr_excl() static.
nfsd: switch purge_old() to use start_removing_noperm()
selinux: Use simple_start_creating() / simple_done_creating()
Apparmor: Use simple_start_creating() / simple_done_creating()
libfs: change simple_done_creating() to use end_creating()
VFS: move the start_dirop() kerndoc comment to before start_dirop()
fs/proc: Don't lock root inode when creating "self" and "thread-self"
VFS: note error returns in documentation for various lookup functions
Merge tag 'vfs-7.1-rc1.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs xattr updates from Christian Brauner:
"This reworks the simple_xattr infrastructure and adds support for
user.* extended attributes on sockets.
The simple_xattr subsystem currently uses an rbtree protected by a
reader-writer spinlock. This series replaces the rbtree with an
rhashtable giving O(1) average-case lookup with RCU-based lockless
reads. This sped up concurrent access patterns on tmpfs quite a bit
and it's an overall easy enough conversion to do and gets rid of
rwlock_t.
The conversion is done incrementally: a new rhashtable path is added
alongside the existing rbtree, consumers are migrated one at a time
(shmem, kernfs, pidfs), and then the rbtree code is removed. All three
consumers switch from embedded structs to pointer-based lazy
allocation so the rhashtable overhead is only paid for inodes that
actually use xattrs.
With this infrastructure in place the series adds support for user.*
xattrs on sockets. Path-based AF_UNIX sockets inherit xattr support
from the underlying filesystem (e.g. tmpfs) but sockets in sockfs -
that is everything created via socket() including abstract namespace
AF_UNIX sockets - had no xattr support at all.
The xattr_permission() checks are reworked to allow user.* xattrs on
S_IFSOCK inodes. Sockfs sockets get per-inode limits of 128 xattrs and
128KB total value size matching the limits already in use for kernfs.
The practical motivation comes from several directions. systemd and
GNOME are expanding their use of Varlink as an IPC mechanism.
For D-Bus there are tools like dbus-monitor that can observe IPC
traffic across the system but this only works because D-Bus has a
central broker.
For Varlink there is no broker and there is currently no way to
identify which sockets speak Varlink. With user.* xattrs on sockets a
service can label its socket with the IPC protocol it speaks (e.g.,
user.varlink=1) and an eBPF program can then selectively capture
traffic on those sockets. Enumerating bound sockets via netlink
combined with these xattr labels gives a way to discover all Varlink
IPC entrypoints for debugging and introspection.
Similarly, systemd-journald wants to use xattrs on the /dev/log socket
for protocol negotiation to indicate whether RFC 5424 structured
syslog is supported or whether only the legacy RFC 3164 format should
be used.
In containers these labels are particularly useful as high-privilege
or more complicated solutions for socket identification aren't
available.
The series comes with comprehensive selftests covering path-based
AF_UNIX sockets, sockfs socket operations, per-inode limit
enforcement, and xattr operations across multiple address families
(AF_INET, AF_INET6, AF_NETLINK, AF_PACKET)"
* tag 'vfs-7.1-rc1.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
selftests/xattr: test xattrs on various socket families
selftests/xattr: sockfs socket xattr tests
selftests/xattr: path-based AF_UNIX socket xattr tests
xattr: support extended attributes on sockets
xattr,net: support limited amount of extended attributes on sockfs sockets
xattr: move user limits for xattrs to generic infra
xattr: switch xattr_permission() to switch statement
xattr: add xattr_permission_error()
xattr: remove rbtree-based simple_xattr infrastructure
pidfs: adapt to rhashtable-based simple_xattrs
kernfs: adapt to rhashtable-based simple_xattrs with lazy allocation
shmem: adapt to rhashtable-based simple_xattrs with lazy allocation
xattr: add rhashtable-based simple_xattr infrastructure
xattr: add rcu_head and rhash_head to struct simple_xattr
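
From userspace, labelling a socket as described above would go through the ordinary xattr system calls on the socket file descriptor. The sketch below uses fsetxattr(2) and fgetxattr(2), which are existing calls. The helper names are hypothetical, and on kernels without this series the set call is expected to fail, so the error is returned rather than treated as fatal.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/xattr.h>
#include <unistd.h>

/*
 * Attach a user.* label to a socket, e.g. name "user.varlink", value "1".
 * Returns 0 on success or a negative errno value.
 */
static int demo_label_socket(int fd, const char *name, const char *value)
{
	assert(name != NULL && value != NULL);
	if (fsetxattr(fd, name, value, strlen(value), 0) == -1)
		return -errno;
	return 0;
}

/*
 * Read a label back as a NUL-terminated string.
 * Returns the value length or a negative errno value.
 */
static ssize_t demo_read_label(int fd, const char *name, char *buf, size_t len)
{
	ssize_t n;

	assert(buf != NULL && len > 0);
	n = fgetxattr(fd, name, buf, len - 1);
	if (n < 0)
		return -errno;
	buf[n] = '\0';
	return n;
}
```

A service would call demo_label_socket(fd, "user.varlink", "1") right after socket(), and an introspection tool holding the same descriptor could read the label back with demo_read_label().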
Merge tag 'vfs-7.1-rc1.writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs writeback updates from Christian Brauner:
"This introduces writeback helper APIs and converts f2fs, gfs2 and nfs
to stop accessing writeback internals directly"
* tag 'vfs-7.1-rc1.writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
nfs: stop using writeback internals for WB_WRITEBACK accounting
gfs2: stop using writeback internals for dirty_exceeded check
f2fs: stop using writeback internals for dirty_exceeded checks
writeback: prep helpers for dirty-limit and writeback accounting
Merge tag 'rust-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux
Pull Rust updates from Miguel Ojeda:
"Toolchain and infrastructure:
- Bump the minimum Rust version to 1.85.0 (and 'bindgen' to 0.71.1).
As proposed in LPC 2025 and the Maintainers Summit [1], we are
going to follow Debian Stable's Rust versions as our minimum
versions.
Debian Trixie was released on 2025-08-09 with a Rust 1.85.0 and
'bindgen' 0.71.1 toolchain, which is a fair amount of time for e.g.
kernel developers to upgrade.
Other major distributions support a Rust version that is high
enough as well, including:
+ Arch Linux.
+ Fedora Linux.
+ Gentoo Linux.
+ Nix.
+ openSUSE Slowroll and openSUSE Tumbleweed.
+ Ubuntu 25.10 and 26.04 LTS. In addition, 24.04 LTS using
their versioned packages.
The merged patch series comes with the associated cleanups and
simplifications treewide that can be performed thanks to both
bumps, as well as documentation updates.
In addition, start using 'bindgen''s '--with-attribute-custom-enum'
feature to set the 'cfi_encoding' attribute for the 'lru_status'
enum used in Binder.
Link: https://lwn.net/Articles/1050174/ [1]
- Add experimental Kconfig option ('CONFIG_RUST_INLINE_HELPERS') that
inlines C helpers into Rust.
Essentially, it performs a step similar to LTO, but just for the
helpers, i.e. very local and fast.
It relies on 'llvm-link' and its '--internalize' flag, and requires
a compatible LLVM between Clang and 'rustc' (i.e. same major
version, 'CONFIG_RUSTC_CLANG_LLVM_COMPATIBLE'). It is only enabled
for two architectures for now.
The result is a measurable speedup in different workloads that
different users have tested. For instance, for the null block
driver, it amounts to 2%.
- Support global per-version flags.
While we already have per-version flags in many places, we didn't
have a place to set global ones that depend on the compiler
version, i.e. in 'rust_common_flags', which sometimes is needed to
e.g. tweak the lints set per version.
Use that to allow the 'clippy::precedence' lint for Rust < 1.86.0,
since it had a change in behavior.
- Support overriding the crate name and apply it to Rust Binder,
which wanted the module to be called 'rust_binder'.
- Add the remaining '__rust_helper' annotations (started in the
previous cycle).
'kernel' crate:
- Introduce the 'const_assert!' macro: a more powerful version of
'static_assert!' that can refer to generics inside functions or
implementation bodies, e.g.:
In addition, reorganize our set of build-time assertion macros
('{build,const,static_assert}!') to live in the 'build_assert'
module.
Finally, improve the docs as well to clarify how these are
different from one another and how to pick the right one to use,
and their equivalence (if any) to the existing C ones for extra
clarity.
- 'sizes' module: add 'SizeConstants' trait.
This gives us typed 'SZ_*' constants (avoiding casts) for use in
device address spaces where the address width depends on the
hardware (e.g. 32-bit MMIO windows, 64-bit GPU framebuffers, etc.),
e.g.:
let gpu_heap = 14 * u64::SZ_1M;
let mmio_window = u32::SZ_16M;
- 'clk' module: implement 'Send' and 'Sync' for 'Clk' and thus
simplify the users in Tyr and PWM.
- 'ptr' module: add 'const_align_up'.
- 'str' module: improve the documentation of the 'c_str!' macro to
explain that one should only use it for non-literal cases (for the
other case we instead use C string literals, e.g. 'c"abc"').
- Disallow the use of 'CStr::{as_ptr,from_ptr}' and clean one such
use in the 'task' module.
- 'sync' module: finish the move of 'ARef' and 'AlwaysRefCounted'
outside of the 'types' module, i.e. update the last remaining
instances and finally remove the re-exports.
- 'error' module: clarify that 'from_err_ptr' can return 'Ok(NULL)',
including runtime-tested examples.
The intention is to hopefully prevent UB that assumes the result of
the function is not 'NULL' if successful. This originated from a
case of UB I noticed in 'regulator' that created a 'NonNull' on it.
Timekeeping:
- Expand the example section in the 'HrTimer' documentation.
- Mark the 'ClockSource' trait as unsafe to ensure valid values for
'ktime_get()'.
- Add 'Delta::from_nanos()'.
'pin-init' crate:
- Replace the 'Zeroable' impls for 'Option<NonZero*>' with impls of
'ZeroableOption' for 'NonZero*'.
- Improve feature gate handling for unstable features.
- Declutter the documentation of implementations of 'Zeroable' for
tuples.
- Replace uses of 'addr_of[_mut]!' with '&raw [mut]'.
rust-analyzer:
- Add type annotations to 'generate_rust_analyzer.py'.
- Add support for scripts written in Rust ('generate_rust_target.rs',
'rustdoc_test_builder.rs', 'rustdoc_test_gen.rs').
- Refactor 'generate_rust_analyzer.py' to explicitly identify host
and target crates, improve readability, and reduce duplication.
And some other fixes, cleanups and improvements"
* tag 'rust-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux: (79 commits)
rust: sizes: add SizeConstants trait for device address space constants
rust: kernel: update `file_with_nul` comment
rust: kbuild: allow `clippy::precedence` for Rust < 1.86.0
rust: kbuild: support global per-version flags
rust: declare cfi_encoding for lru_status
docs: rust: general-information: use real example
docs: rust: general-information: simplify Kconfig example
docs: rust: quick-start: remove GDB/Binutils mention
docs: rust: quick-start: remove Nix "unstable channel" note
docs: rust: quick-start: remove Gentoo "testing" note
docs: rust: quick-start: add Ubuntu 26.04 LTS and remove subsection title
docs: rust: quick-start: update minimum Ubuntu version
docs: rust: quick-start: update Ubuntu versioned packages
docs: rust: quick-start: openSUSE provides `rust-src` package nowadays
rust: kbuild: remove "dummy parameter" workaround for `bindgen` < 0.71.1
rust: kbuild: update `bindgen --rust-target` version and replace comment
rust: rust_is_available: remove warning for `bindgen` < 0.69.5 && libclang >= 19.1
rust: rust_is_available: remove warning for `bindgen` 0.66.[01]
rust: bump `bindgen` minimum supported version to 0.71.1 (Debian Trixie)
rust: block: update `const_refs_to_static` MSRV TODO comment
...
Merge tag 'rcu.2026.03.31a' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux
Pull RCU updates from Joel Fernandes:
"NOCB CPU management:
- Consolidate rcu_nocb_cpu_offload() and rcu_nocb_cpu_deoffload() to
reduce code duplication
- Extract nocb_bypass_needs_flush() helper to reduce duplication in
NOCB bypass path
rcutorture/torture infrastructure:
- Add NOCB01 config for RCU_LAZY torture testing
- Add NOCB02 config for NOCB poll mode testing
- Add TRIVIAL-PREEMPT config for textbook-style preemptible RCU
torture
- Test call_srcu() with preemption both disabled and enabled
- Remove kvm-check-branches.sh in favor of kvm-series.sh
- Make hangs more visible in torture.sh output
- Add informative message for tests without a recheck file
- Fix numeric test comparison in srcu_lockdep.sh
- Use torture_shutdown_init() in refscale and rcuscale instead of
open-coded shutdown functions
- Fix modulo-zero error in torture_hrtimeout_ns().
SRCU:
- Fix SRCU read flavor macro comments
- Fix s/they disables/they disable/ typo in srcu_read_unlock_fast()
RCU Tasks:
- Document that RCU Tasks Trace grace periods now imply RCU grace
periods
- Remove unnecessary smp_store_release() in cblist_init_generic()"
* tag 'rcu.2026.03.31a' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux:
rcutorture: Test call_srcu() with preemption disabled and not
rcu: Add BOOTPARAM_RCU_STALL_PANIC Kconfig option
torture: Avoid modulo-zero error in torture_hrtimeout_ns()
rcu/nocb: Extract nocb_bypass_needs_flush() to reduce duplication
rcu/nocb: Consolidate rcu_nocb_cpu_offload/deoffload functions
rcu-tasks: Remove unnecessary smp_store_release() in cblist_init_generic()
rcutorture: Add NOCB02 config for nocb poll mode testing
rcutorture: Add NOCB01 config for RCU_LAZY torture testing
rcu-tasks: Document that RCU Tasks Trace grace periods now imply RCU grace periods
srcu: Fix s/they disables/they disable/ typo in srcu_read_unlock_fast()
srcu: Fix SRCU read flavor macro comments
rcuscale: Ditch rcu_scale_shutdown in favor of torture_shutdown_init()
refscale: Ditch ref_scale_shutdown in favor of torture_shutdown_init()
rcutorture: Fix numeric "test" comparison in srcu_lockdep.sh
torture: Print informative message for test without recheck file
torture: Make hangs more visible in torture.sh output
kvm-check-branches.sh: Remove in favor of kvm-series.sh
rcutorture: Add a textbook-style trivial preemptible RCU
proc: make PROC_MEM_FORCE_PTRACE the Kconfig default
This kconfig option was introduced 18 months ago, with the historical
default of always allowing forcing memory permission overrides in order
to not change any existing behavior.
But it was documented as "for now", and this is a gentle nudge to people
that you probably _should_ be using PROC_MEM_FORCE_PTRACE. I've had
that in my local kernel config since the option was introduced.
Anybody who just does "make oldconfig" will pick up their old
configuration with no change, so this is still meant to not change any
existing system behavior, but at least gently prod people into trying
it.
I'd love to get rid of FOLL_FORCE entirely (see commit 8ee74a91ac30
"proc: try to remove use of FOLL_FORCE entirely" from roughly a decade
ago), but sadly that is likely not a realistic option (see commit
f511c0b17b08 "Yes, people use FOLL_FORCE ;)" three weeks later).
But at least let's make it more obvious that you have the choice to
limit it and force people to at least be a bit more conscious about
their use of FOLL_FORCE, since judging from a recent discussion people
weren't even aware of this one.
This series cleans up some of the special user copy functions naming and
semantics. In particular, get rid of the (very traditional) double
underscore names and behavior: the whole "optimize away the range check"
model has been largely excised from the other user accessors because
it's so subtle and can be unsafe, but also because it's just not a
relevant optimization any more.
To do that, a couple of drivers that misused the "user" copies as kernel
copies in order to get non-temporal stores had to be fixed up, but that
kind of code should never have been allowed anyway.
The x86-only "nocache" version was also renamed to more accurately
reflect what it actually does.
This was all done because I looked at this code due to a report by Jann
Horn, and I just couldn't stand the inconsistent naming, the horrible
semantics, and the random misuse of these functions. This code should
probably be cleaned up further, but it's at least slightly closer to
normal semantics.
I had a more intrusive series that went even further in trying to
normalize the semantics, but that ended up hitting so many other
inconsistencies between different architectures in this area (eg
'size_t' vs 'unsigned long' vs 'int' as size arguments, and various
iovec check differences that Vasily Gorbik pointed out) that I ended up
with this more limited version that fixed the worst of the issues.
Reported-by: Jann Horn <jannh@google.com>
Tested-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/all/CAHk-=wgg1QVWNWG-UCFo1hx0zqrPnB3qhPzUTrWNft+MtXQXig@mail.gmail.com/
* nocache-cleanup:
x86-64/arm64/powerpc: clean up and rename __copy_from_user_flushcache
x86: rename and clean up __copy_from_user_inatomic_nocache()
x86-64: rename misleadingly named '__copy_user_nocache()' function
Merge tag 'wq-for-7.0-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue fix from Tejun Heo:
"This is a fix for a stall which triggers on ordered workqueues when
there are multiple inactive work items during workqueue property
changes through sysfs, which doesn't happen that frequently.
While really late, the fix is very low risk as it just repeats an
operation which is already being performed:
- Fix incomplete activation of multiple inactive works when
unplugging a pool_workqueue, where the pending_pwqs list
wasn't being updated for subsequent works"
* tag 'wq-for-7.0-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: Add pool_workqueue to pending_pwqs list when unplugging multiple inactive works
Merge tag 'timers-urgent-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Thomas Gleixner:
"Two fixes for the time/timers subsystem:
- Invert the inverted fastpath decision in check_tick_dependency(),
which prevents NOHZ full from stopping the tick. That's a regression
introduced in the 7.0 merge window.
- Prevent an unprivileged DoS in the clockevents code, where user
space can starve the timer interrupt by arming a timerfd or posix
interval timer in a tight loop with an absolute expiry time in the
past. The fix turned out to be incomplete and was amended
yesterday to make it work on some 20-year-old AMD machines as
well. All issues with it have been confirmed to be resolved by
various reporters"
* tag 'timers-urgent-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clockevents: Prevent timer interrupt starvation
tick/nohz: Fix inverted return value in check_tick_dependency() fast path
Merge tag 'perf-urgent-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"Four Intel uncore PMU driver fixes by Zide Chen"
* tag 'perf-urgent-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel/uncore: Remove extra double quote mark
perf/x86/intel/uncore: Fix die ID init and look up bugs
perf/x86/intel/uncore: Skip discovery table for offline dies
perf/x86/intel/uncore: Fix iounmap() leak on global_init failure
X.509: Fix out-of-bounds access when parsing extensions
Leo reports an out-of-bounds access when parsing a certificate with
empty Basic Constraints or Key Usage extension because the first byte of
the extension is read before checking its length. Fix it.
The bug can be triggered by an unprivileged user by submitting a
specially crafted certificate to the kernel through the keyrings(7) API.
Leo has demonstrated this with a proof-of-concept program responsibly
disclosed off-list.
Fixes: 30eae2b037af ("KEYS: X.509: Parse Basic Constraints for CA")
Fixes: 567671281a75 ("KEYS: X.509: Parse Key Usage")
Reported-by: Leo Lin <leo@depthfirst.com> # off-list
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Ignat Korchagin <ignat@linux.win>
Cc: stable@vger.kernel.org # v6.4+
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
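The ordering problem described in this commit message can be illustrated with a minimal userspace sketch. The helper name ext_first_byte() and its signature are hypothetical and are not taken from the kernel's X.509 parser; the only point is that the length check comes before the first read of the extension value.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the kernel's parser): return the first byte
 * of an extension value, or -1 if the value is empty.
 */
static int ext_first_byte(const uint8_t *v, size_t vlen)
{
	/* Validate the length before touching v[0]. */
	if (vlen < 1)
		return -1;

	return v[0];
}
```

Called with a zero-length value, the helper returns -1 without dereferencing the pointer. Reading v[0] first and checking vlen afterwards would be the out-of-bounds pattern the fix removes.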
Merge tag 'spi-fix-v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A couple of changes here, one update to MAINTAINERS for the AMD
controller and a change from Pei Xiao which in spite of the changelog
is actually a fix - previously the zynq-qspi driver leaked a clock
enable for every flash operation it did which isn't good, these extra
enables were removed when doing the enable cleanup which are probably
a good idea anyway"
* tag 'spi-fix-v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
MAINTAINERS: Update AMD SPI driver maintainers
spi: zynq-qspi: Simplify clock handling with devm_clk_get_enabled()
Merge tag 'regulator-fix-v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fix from Mark Brown:
"One last fix for v7.0, the BD72720 incorrectly described which DCDC is
tied to the LDO for its LDON-HEAD mode which automates using the DCDC
to more efficiently drop a supply for delivery via the LDO"
* tag 'regulator-fix-v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: bd71828-regulator.c: Fix LDON-HEAD mode
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"s390:
- vsie: Fix races with partial gmap invalidations
x86:
- Use __DECLARE_FLEX_ARRAY() for UAPI structures with VLAs"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: s390: vsie: Fix races with partial gmap invalidations
KVM: x86: Use __DECLARE_FLEX_ARRAY() for UAPI structures with VLAs
Merge tag 'usb-7.0-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fix from Greg KH:
"Here is a single USB fix for a reported regression in a recent USB
typec patch for 7.0-final. Sorry for the late submission, but it does
fix a problem that people have been seeing with 7.0-rc7 and the stable
kernels (due to a backported fix from there.)
This has been in linux-next this week with no reported issues, and the
reporter (Takashi), has said it resolves the problem they were seeing"
* tag 'usb-7.0-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: typec: ucsi: skip connector validation before init
Merge tag 'input-for-v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:
"Two fixes for force feedback handling in uinput driver:
- fix circular locking dependency in uinput
- fix potential corruption of uinput event queue"
* tag 'input-for-v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: uinput - take event lock when submitting FF request "event"
Input: uinput - fix circular locking dependency with ff-core
Merge tag 'riscv-for-linus-v7.0-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Paul Walmsley:
"Before v7.0 is released, fix a few issues with the CFI patchset,
merged earlier in v7.0-rc, that primarily affect interfaces to
non-kernel code:
- Improve the prctl() interface for per-task indirect branch landing
pad control to expand abbreviations and to resemble the speculation
control prctl() interface
- Expand the "LP" and "SS" abbreviations in the ptrace uapi header
file to "branch landing pad" and "shadow stack", to improve
readability
- Fix a typo in a CFI-related macro name in the ptrace uapi header
file
- Ensure that the indirect branch tracking state and shadow stack
state are unlocked immediately after an exec() on the new task so
that libc subsequently can control it
- While working in this area, clean up the kernel-internal,
cross-architecture prctl() function names by expanding the
abbreviations mentioned above"
* tag 'riscv-for-linus-v7.0-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
prctl: cfi: change the branch landing pad prctl()s to be more descriptive
riscv: ptrace: cfi: expand "SS" references to "shadow stack" in uapi headers
prctl: rename branch landing pad implementation functions to be more explicit
riscv: ptrace: expand "LP" references to "branch landing pads" in uapi headers
riscv: cfi: clear CFI lock status in start_thread()
riscv: ptrace: cfi: fix "PRACE" typo in uapi header
Merge tag 'drm-fixes-2026-04-11' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
"Last set of fixes, a few vc4, and i915, one xe and one ethosu Kconfig
fix.
xe:
- Fix HW engine idleness unit conversion
i915:
- Drop check for changed VM in EXECBUF
- Fix refcount underflow race in intel_engine_park_heartbeat
- Do not use pipe_src as borders for SU area in PSR
* tag 'drm-fixes-2026-04-11' of https://gitlab.freedesktop.org/drm/kernel:
drm/i915/gem: Drop check for changed VM in EXECBUF
drm/i915/gt: fix refcount underflow in intel_engine_park_heartbeat
drm/xe: Fix bug in idledly unit conversion
drm/i915/psr: Do not use pipe_src as borders for SU area
accel: ethosu: Add hardware dependency hint
drm/vc4: Protect madv read in vc4_gem_object_mmap() with madv_lock
drm/vc4: Fix a memory leak in hang state error path
drm/vc4: Fix memory leak of BO array in hang state
drm/vc4: Release runtime PM reference after binding V3D
Dave Airlie [Fri, 10 Apr 2026 21:35:21 +0000 (07:35 +1000)]
Merge tag 'drm-intel-fixes-2026-04-09' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes
- Drop check for changed VM in EXECBUF
- Fix refcount underflow race in intel_engine_park_heartbeat
- Do not use pipe_src as borders for SU area in PSR
Thomas Gleixner [Tue, 7 Apr 2026 08:54:17 +0000 (10:54 +0200)]
clockevents: Prevent timer interrupt starvation
Calvin reported an odd NMI watchdog lockup which claims that the CPU locked
up in user space. He provided a reproducer, which sets up a timerfd based
timer and then rearms it in a loop with an absolute expiry time of 1ns.
As the expiry time is in the past, the timer ends up as the first expiring
timer in the per CPU hrtimer base and the clockevent device is programmed
with the minimum delta value. If the machine is fast enough, this ends up
in an endless loop of programming the delta value to the minimum value
defined by the clock event device, before the timer interrupt can fire,
which starves the interrupt and consequently triggers the lockup detector
because the hrtimer callback of the lockup mechanism is never invoked.
As a first step to prevent this, avoid reprogramming the clock event device
when:
- a forced minimum delta event is pending
- the new expiry delta is less than or equal to the minimum delta
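The two conditions above can be sketched as a small predicate. This is a simplified userspace model, not the clockevents code: the struct, its fields and should_reprogram() are hypothetical names, and the sketch assumes that both conditions must hold together before reprogramming is skipped.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a clock event device. */
struct clockevent_sketch {
	int64_t min_delta_ns;     /* smallest delta the device accepts */
	bool forced_min_pending;  /* a forced minimum-delta event is armed */
};

/* Return true if the device should be reprogrammed for delta_ns. */
static bool should_reprogram(const struct clockevent_sketch *dev,
			     int64_t delta_ns)
{
	/*
	 * Assumption: skip only when a forced minimum-delta event is
	 * already pending AND the new delta would not expire any later
	 * than that pending event.
	 */
	if (dev->forced_min_pending && delta_ns <= dev->min_delta_ns)
		return false;

	return true;
}
```

With a pending forced event, repeated requests for an expiry in the past no longer push the event further out, so the interrupt gets a chance to fire.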
Thanks to Calvin for providing the reproducer and to Borislav for testing
and providing data from his Zen5 machine.
The problem is not limited to Zen5, but depending on the underlying
clock event device (e.g. TSC deadline timer on Intel) and the CPU speed
not necessarily observable.
This change serves only as the last resort and further changes will be made
to prevent this scenario earlier in the call chain as far as possible.
[ tglx: Updated to restore the old behaviour vs. !force and delta <= 0 and
fixed up the tick-broadcast handlers as pointed out by Borislav ]
Merge tag 'vfs-7.0-rc8.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
"The kernfs rbtree is keyed by (hash, ns, name) where the hash
is seeded with the raw namespace pointer via init_name_hash(ns).
The resulting hash values are exposed to userspace through
readdir seek positions, and the pointer-based ordering in
kernfs_name_compare() is observable through entry order.
Switch from raw pointers to ns_common::ns_id for both hashing
and comparison.
A preparatory commit first replaces all const void * namespace
parameters with const struct ns_common * throughout kernfs, sysfs,
and kobject so the code can access ns->ns_id. Also compare the
ns_id when hashes match in the rbtree to handle crafted collisions.
Also fix eventpoll RCU grace period issue and a cachefiles refcount
problem"
* tag 'vfs-7.0-rc8.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
kernfs: make directory seek namespace-aware
kernfs: use namespace id instead of pointer for hashing and comparison
kernfs: pass struct ns_common instead of const void * for namespace tags
eventpoll: defer struct eventpoll free to RCU grace period
cachefiles: fix incorrect dentry refcount in cachefiles_cull()
Merge tag 'pinctrl-v7.0-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
"Some late pin control fixes. I'm not happy to have bugs so late in the
kernel cycle, but they are all driver specifics so I guess it's how it
is.
- Three fixes for the Intel pin control driver fixing the feature set
for the new silicon
- One fix for an IRQ storm in the MCP23S08 pin controller/GPIO
expander"
* tag 'pinctrl-v7.0-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: mcp23s08: Disable all pin interrupts during probe
pinctrl: intel: Enable 3-bit PAD_OWN feature
pinctrl: intel: Fix the revision for new features (1kOhm PD, HW debouncer)
pinctrl: intel: Improve capability support
This occurs because perf_l2_init() calls err(). However, the code has been
written in such a manner that it is able to perform cleanup and continue.
Therefore, this issue can be addressed by changing the appropriate calls
to err() to warnx().
Additionally, correct the PMU type arguments passed to the warning strings
in the ecore and lcore blocks so the logs accurately reflect the failing
counter type.
Signed-off-by: David Arcari <darcari@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
gpio: tegra: return -ENOMEM on allocation failure in probe
devm_kzalloc() failure in tegra_gpio_probe() returns -ENODEV, which
indicates "no such device". The correct error code for a memory
allocation failure is -ENOMEM.
Merge tag 'kbuild-fixes-7.0-4' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux
Pull Kbuild fixes from Nathan Chancellor:
- Make modules-cpio-pkg respect INSTALL_MOD_PATH so that it can be
used with distribution initramfs files that have a merged /usr,
such as Fedora
- Silence an instance of -Wunused-but-set-global, a strengthening
of -Wunused-but-set-variable in tip of tree Clang, in modpost,
as the variable for extra warnings is currently unused
* tag 'kbuild-fixes-7.0-4' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux:
modpost: Declare extra_warn with unused attribute
kbuild: modules-cpio-pkg: Respect INSTALL_MOD_PATH
Merge tag 'sound-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Still a bit higher amount than wished, but nothing looks really scary,
and all changes are about nice and smooth device-specific fixes.
- HD-audio quirks, one revert for a regression and another oneliner
- AMD ACP quirks
- Fixes for SDCA interrupt handling
- A few Intel SOF, avs and NVL fixes
- Fixes for TAS2552 DT, NAU8325, and STM32"
* tag 'sound-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ASoC: amd: acp: update DMI quirk and add ACP DMIC for Lenovo platforms
ASoC: SDCA: Unregister IRQ handlers on module remove
ASoC: SDCA: mask Function_Status value
ASoC: SDCA: Fix overwritten var within for loop
ASoC: stm32_sai: fix incorrect BCLK polarity for DSP_A/B, LEFT_J
ASoC: SOF: Intel: hda: modify period size constraints for ACE4
ALSA: hda/intel: enforce stricter period-size alignment for Intel NVL
ASoC: nau8325: Add software reset during probe
Revert "ALSA: hda/realtek: Add quirk for Gigabyte Technology to fix headphone"
ASoC: Intel: avs: Fix memory leak in avs_register_i2s_test_boards()
ASoC: SOF: Intel: fix iteration in is_endpoint_present()
ASoC: SOF: Intel: Fix endpoint index if endpoints are missing
ASoC: SDCA: Fix errors in IRQ cleanup
ASoC: amd: acp: add Lenovo P16s G5 AMD quirk for legacy SDW machine
ASoC: dt-bindings: ti,tas2552: Add sound-dai-cells
ALSA: hda/realtek: Add quirk for Lenovo Yoga Pro 7 14IAH10
Merge tag 'mmc-v7.0-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:
- vub300: Fix use-after-free and NULL-deref on disconnect
* tag 'mmc-v7.0-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: vub300: fix use-after-free on disconnect
mmc: vub300: fix NULL-deref on disconnect
Merge tag 'pmdomain-v7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm
Pull pmdomain fixes from Ulf Hansson:
- imx: Prevent hang at power down for imx8mp-blk-ctrl
- thead: Fix buffer overflow for TH1520 AON driver
- Change Ulf Hansson's email
* tag 'pmdomain-v7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
MAINTAINERS, mailmap: Change Ulf Hansson's email
pmdomain: imx8mp-blk-ctrl: Keep the NOC_HDCP clock enabled
firmware: thead: Fix buffer overflow and use standard endian macros
Merge tag 'dma-mapping-7.0-2026-04-09' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux
Pull dma-mapping fix from Marek Szyprowski:
"A fix for DMA-mapping subsystem, which hides annoying, false-positive
warnings from DMA-API debug on coherent platforms like x86_64 (Mikhail
Gavrilov)"
* tag 'dma-mapping-7.0-2026-04-09' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux:
dma-debug: suppress cacheline overlap warning when arch has no DMA alignment requirement
Merge tag 'net-7.0-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from netfilter, IPsec and wireless. This is again
considerably bigger than the old average. No known outstanding
regressions.
Current release - regressions:
- net: increase IP_TUNNEL_RECURSION_LIMIT to 5
- eth: ice: fix PTP timestamping broken by SyncE code on E825C
Current release - new code bugs:
- eth: stmmac: dwmac-motorcomm: fix eFUSE MAC address read failure
Previous releases - regressions:
- core: fix cross-cache free of KFENCE-allocated skb head
- sched: act_csum: validate nested VLAN headers
- rxrpc: fix call removal to use RCU safe deletion
- xfrm:
- wait for RCU readers during policy netns exit
- fix refcount leak in xfrm_migrate_policy_find
- wifi: rt2x00usb: fix devres lifetime
- mptcp: fix slab-use-after-free in __inet_lookup_established
- ipvs: fix NULL deref in ip_vs_add_service error path
- eth:
- airoha: fix memory leak in airoha_qdma_rx_process()
- lan966x: fix use-after-free and leak in lan966x_fdma_reload()
Previous releases - always broken:
- ipv6: ioam: fix potential NULL dereferences in __ioam6_fill_trace_data()
- ipv4: nexthop: avoid duplicate NHA_HW_STATS_ENABLE on nexthop group
dump
- bridge: guard local VLAN-0 FDB helpers against NULL vlan group
- xsk: tailroom reservation and MTU validation
- rxrpc:
- fix to request an ack if window is limited
- fix RESPONSE authenticator parser OOB read
- netfilter: nft_ct: fix use-after-free in timeout object destroy
- batman-adv: hold claim backbone gateways by reference
- eth:
- stmmac: fix PTP ref clock for Tegra234
- idpf: fix PREEMPT_RT raw/bh spinlock nesting for async VC handling
- ipa: fix GENERIC_CMD register field masks for IPA v5.0+"
* tag 'net-7.0-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (104 commits)
net: lan966x: fix use-after-free and leak in lan966x_fdma_reload()
net: lan966x: fix page pool leak in error paths
net: lan966x: fix page_pool error handling in lan966x_fdma_rx_alloc_page_pool()
nfc: pn533: allocate rx skb before consuming bytes
l2tp: Drop large packets with UDP encap
net: ipa: fix event ring index not programmed for IPA v5.0+
net: ipa: fix GENERIC_CMD register field masks for IPA v5.0+
MAINTAINERS: Add Prashanth as additional maintainer for amd-xgbe driver
devlink: Fix incorrect skb socket family dumping
af_unix: read UNIX_DIAG_VFS data under unix_state_lock
Revert "mptcp: add needs_id for netlink appending addr"
mptcp: fix slab-use-after-free in __inet_lookup_established
net: txgbe: leave space for null terminators on property_entry
net: ioam6: fix OOB and missing lock
rxrpc: proc: size address buffers for %pISpc output
rxrpc: only handle RESPONSE during service challenge
rxrpc: Fix buffer overread in rxgk_do_verify_authenticator()
rxrpc: Fix leak of rxgk context in rxgk_verify_response()
rxrpc: Fix integer overflow in rxgk_verify_response()
rxrpc: Fix missing error checks for rxkad encryption/decryption failure
...
Merge tag 'iommu-fixes-v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux
Pull IOMMU fix from Will Deacon:
- Fix regression introduced by the empty MMU gather fix in -rc7, where
the ->iotlb_sync() callback can be elided incorrectly, resulting in
boot failures (hangs), crashes and potential memory corruption.
* tag 'iommu-fixes-v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux:
iommu: Ensure .iotlb_sync is called correctly
Merge tag 'platform-drivers-x86-v7.0-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform drivers fixes from Ilpo Järvinen:
- amd/pmc: Add Thinkpad L14 Gen3 to quirk_s2idle_bug
- asus-armoury: Add support for FA607NU, GU605MU, and GV302XU.
- intel-uncore-freq: Handle autonomous UFS status bit
- ISST: Handle cases with less than max buckets correctly
- intel-uncore-freq & ISST: Mark minor version 3 supported (no
additional driver changes required)
* tag 'platform-drivers-x86-v7.0-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/x86: asus-armoury: add support for GU605MU
platform/x86: asus-armoury: add support for FA607NU
platform/x86: asus-armoury: add support for GV302XU
platform/x86/amd: pmc: Add Thinkpad L14 Gen3 to quirk_s2idle_bug
platform/x86/intel-uncore-freq: Increase minor version
platform/x86: ISST: Increase minor version
platform/x86/intel-uncore-freq: Handle autonomous UFS status bit
platform/x86: ISST: Reset core count to 0
====================
net: lan966x: fix page_pool error handling and error paths
This series fixes error handling around the lan966x page pool:
1/3 adds the missing IS_ERR check after page_pool_create(), preventing
a kernel oops when the error pointer flows into
xdp_rxq_info_reg_mem_model().
2/3 plugs page pool leaks in the lan966x_fdma_rx_alloc() and
lan966x_fdma_init() error paths, now reachable after 1/3.
3/3 fixes a use-after-free and page pool leak in the
lan966x_fdma_reload() restore path, where the hardware could
resume DMA into pages already returned to the page pool.
====================
David Carlier [Sun, 5 Apr 2026 05:52:41 +0000 (06:52 +0100)]
net: lan966x: fix use-after-free and leak in lan966x_fdma_reload()
When lan966x_fdma_reload() fails to allocate new RX buffers, the restore
path restarts DMA using old descriptors whose pages were already freed
via lan966x_fdma_rx_free_pages(). Since page_pool_put_full_page() can
release pages back to the buddy allocator, the hardware may DMA into
memory now owned by other kernel subsystems.
Additionally, on the restore path, the newly created page pool (if
allocation partially succeeded) is overwritten without being destroyed,
leaking it.
Fix both issues by deferring the release of old pages until after the
new allocation succeeds. Save the old page array before the allocation
so old pages can be freed on the success path. On the failure path, the
old descriptors, pages and page pool are all still valid, making the
restore safe. Also ensure the restore path re-enables NAPI and wakes
the netdev, matching the success path.
Fixes: 89ba464fcf54 ("net: lan966x: refactor buffer reload function")
Cc: stable@vger.kernel.org
Signed-off-by: David Carlier <devnexen@gmail.com>
Link: https://patch.msgid.link/20260405055241.35767-4-devnexen@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
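The "allocate new first, release old only on success" ordering described in this commit message can be sketched in userspace as follows. The struct and function names are hypothetical, and plain heap memory stands in for DMA pages and the page pool; this is not the driver's code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an RX ring that owns an array of pages. */
struct ring_sketch {
	int *pages;
	size_t n;
};

/* fail_alloc is a test knob that simulates an allocation failure. */
static int ring_reload(struct ring_sketch *ring, size_t new_n, int fail_alloc)
{
	int *old = ring->pages;	/* save the old array before allocating */
	int *fresh = fail_alloc ? NULL : calloc(new_n, sizeof(*fresh));

	if (!fresh)
		return -ENOMEM;	/* old state untouched, restore is safe */

	ring->pages = fresh;
	ring->n = new_n;
	free(old);		/* release old pages only on success */
	return 0;
}
```

On failure the ring still points at its original, still-valid memory, which is the property that makes restarting with the old descriptors safe.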
David Carlier [Sun, 5 Apr 2026 05:52:40 +0000 (06:52 +0100)]
net: lan966x: fix page pool leak in error paths
lan966x_fdma_rx_alloc() creates a page pool but does not destroy it if
the subsequent fdma_alloc_coherent() call fails, leaking the pool.
Similarly, lan966x_fdma_init() frees the coherent DMA memory when
lan966x_fdma_tx_alloc() fails but does not destroy the page pool that
was successfully created by lan966x_fdma_rx_alloc(), leaking it.
Add the missing page_pool_destroy() calls in both error paths.
David Carlier [Sun, 5 Apr 2026 05:52:39 +0000 (06:52 +0100)]
net: lan966x: fix page_pool error handling in lan966x_fdma_rx_alloc_page_pool()
page_pool_create() can return an ERR_PTR on failure. The return value
is used unconditionally in the loop that follows, passing the error
pointer through xdp_rxq_info_reg_mem_model() into page_pool_use_xdp_mem(),
which dereferences it, causing a kernel oops.
Add an IS_ERR check after page_pool_create() to return early on failure.
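The error-pointer convention behind this fix can be modelled in userspace. The sketch_* helpers below imitate the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() idiom but are written from scratch for illustration; pool_create_sketch() and rx_alloc_pool_sketch() are hypothetical stand-ins, not the driver's functions.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *sketch_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* True if the pointer lies in the top SKETCH_MAX_ERRNO addresses. */
static inline int sketch_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

static inline long sketch_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

struct pool_sketch {
	int id;
};

/* Hypothetical creator: returns an error pointer on failure, never NULL. */
static struct pool_sketch *pool_create_sketch(int fail)
{
	static struct pool_sketch pool = { .id = 1 };

	return fail ? sketch_err_ptr(-ENOMEM) : &pool;
}

static int rx_alloc_pool_sketch(int fail, struct pool_sketch **out)
{
	struct pool_sketch *pool = pool_create_sketch(fail);

	/* Return early instead of passing an error pointer onwards. */
	if (sketch_is_err(pool))
		return (int)sketch_ptr_err(pool);

	*out = pool;
	return 0;
}
```

A NULL check would not catch this failure mode, because an error pointer is non-NULL; that is why the dedicated check is needed before the value is used.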
The rbtree backing kernfs directories is ordered by (hash, ns_id, name)
but kernfs_dir_pos() only searches by hash when seeking to a position
during readdir. When two nodes from different namespaces share the same
hash value, the binary search can land on a node in the wrong namespace.
The subsequent skip-forward loop walks rb_next() and may overshoot the
correct node, silently dropping an entry from the readdir results.
With the recent switch from raw namespace pointers to public namespace
ids as hash seeds, computing hash collisions became an offline operation.
An unprivileged user could unshare into a new network namespace, create
a single interface whose name-hash collides with a target entry in
init_net, and cause a victim's seekdir/readdir on /sys/class/net to miss
that entry.
Fix this by extending the rbtree search in kernfs_dir_pos() to also
compare namespace ids when hashes match. Since the rbtree is already
ordered by (hash, ns_id, name), this makes the seek land directly in the
correct namespace's range, eliminating the wrong-namespace overshoot.
Signed-off-by: Christian Brauner <brauner@kernel.org>
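The effect of including the namespace id in the seek can be shown with a sorted array standing in for the rbtree. Everything here is a hypothetical userspace model: entry_sketch, entry_cmp() and seek_pos() are illustrative names, and the binary search is only an analogue of the tree descent in kernfs_dir_pos().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct entry_sketch {
	unsigned int hash;
	uint64_t ns_id;
	const char *name;
};

/* Full ordering used by the container: (hash, ns_id, name). */
static int entry_cmp(unsigned int hash, uint64_t ns_id, const char *name,
		     const struct entry_sketch *e)
{
	if (hash != e->hash)
		return hash < e->hash ? -1 : 1;
	if (ns_id != e->ns_id)
		return ns_id < e->ns_id ? -1 : 1;
	return strcmp(name, e->name);
}

/*
 * Seek in an array sorted by entry_cmp(): return the index of the first
 * entry whose (hash, ns_id) is not below the requested pair.
 */
static size_t seek_pos(const struct entry_sketch *tbl, size_t n,
		       unsigned int hash, uint64_t ns_id)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		const struct entry_sketch *e = &tbl[mid];
		int below = e->hash < hash ||
			    (e->hash == hash && e->ns_id < ns_id);

		if (below)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}
```

When two entries from different namespaces share a hash, a search keyed on the hash alone may stop at either of them, whereas the (hash, ns_id) key lands on the start of the requested namespace's range.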
kernfs: use namespace id instead of pointer for hashing and comparison
kernfs uses the namespace tag as both a hash seed (via init_name_hash())
and a comparison key in the rbtree. The resulting hash values are exposed
to userspace through directory seek positions (ctx->pos), and the raw
pointer comparisons in kernfs_name_compare() encode kernel pointer
ordering into the rbtree layout.
This constitutes a KASLR information leak since the hash and ordering
derived from kernel pointers can be observed from userspace.
Fix this by using the 64-bit namespace id (ns_common::ns_id) instead of
the raw pointer value for both hashing and comparison. The namespace id
is a stable, non-secret identifier that is already exposed to userspace
through other interfaces (e.g., /proc/pid/ns/, ioctl NS_GET_NSID).
Introduce kernfs_ns_id() as a helper that extracts the namespace id from
a potentially-NULL ns_common pointer, returning 0 for the no-namespace
case.
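A helper with the behaviour described above might look like the following userspace sketch. The struct is a hypothetical one-field stand-in for ns_common, and ns_id_or_zero() is an illustrative name rather than the kernel's kernfs_ns_id() implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in carrying only the field the sketch needs. */
struct ns_common_sketch {
	uint64_t ns_id;
};

/* Map a possibly-NULL namespace tag to an id, using 0 for "no namespace". */
static uint64_t ns_id_or_zero(const struct ns_common_sketch *ns)
{
	return ns ? ns->ns_id : 0;
}
```

Hashing and comparing this id rather than the pointer value means nothing derived from a kernel address reaches userspace.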
All namespace equality checks in the directory iteration and dentry
revalidation paths are also switched from pointer comparison to ns_id
comparison for consistency.
Signed-off-by: Christian Brauner <brauner@kernel.org>
kernfs: pass struct ns_common instead of const void * for namespace tags
kernfs has historically used const void * to pass around namespace tags
used for directory-level namespace filtering. The only current user of
this is sysfs network namespace tagging where struct net pointers are
cast to void *.
Replace all const void * namespace parameters with const struct
ns_common * throughout the kernfs, sysfs, and kobject namespace layers.
This includes the kobj_ns_type_operations callbacks, kobject_namespace(),
and all sysfs/kernfs APIs that accept or return namespace tags.
Passing struct ns_common is needed because various codepaths require
access to the underlying namespace. A struct ns_common can always be
converted back to the concrete namespace type (e.g., struct net) via
container_of() or to_ns_common() in the reverse direction.
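The conversion from the embedded common structure back to the concrete type relies on the container_of() idiom, which can be reproduced in userspace as below. The macro and both structs are hypothetical stand-ins written for this illustration, not the kernel definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's container_of() idiom. */
#define container_of_sketch(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical common header embedded in every namespace type. */
struct ns_tag_sketch {
	uint64_t ns_id;
};

/* Hypothetical concrete namespace type embedding the common header. */
struct net_sketch {
	int ifcount;
	struct ns_tag_sketch ns;
};

/* Recover the enclosing structure from a pointer to its embedded member. */
static struct net_sketch *to_net_sketch(struct ns_tag_sketch *ns)
{
	return container_of_sketch(ns, struct net_sketch, ns);
}
```

Because the member offset is known at compile time, the conversion is a constant pointer adjustment, which a bare void * tag cannot offer in a type-checked way.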
This is a preparatory change for switching to ns_id-based directory
iteration to prevent a KASLR pointer leak through the current use of
raw namespace pointers as hash seeds and comparison keys.
Signed-off-by: Christian Brauner <brauner@kernel.org>
Robin Murphy [Wed, 8 Apr 2026 14:40:57 +0000 (15:40 +0100)]
iommu: Ensure .iotlb_sync is called correctly
Many drivers have no reason to use the iotlb_gather mechanism, but do
still depend on .iotlb_sync being called to properly complete an unmap.
Since the core code is now relying on the gather to detect when there
is legitimately something to sync, it should also take care of encoding
a successful unmap when the driver does not touch the gather itself.
Fixes: 90c5def10bea ("iommu: Do not call drivers for empty gathers")
Reported-by: Jon Hunter <jonathanh@nvidia.com>
Closes: https://lore.kernel.org/r/8800a38b-8515-4bbe-af15-0dae81274bf7@nvidia.com
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Tested-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Will Deacon <will@kernel.org>
nfc: pn533: allocate rx skb before consuming bytes
pn532_receive_buf() reports the number of accepted bytes to the serdev
core. The current code consumes bytes into recv_skb and may already hand
a complete frame to pn533_recv_frame() before allocating a fresh receive
buffer.
If that alloc_skb() fails, the callback returns 0 even though it has
already consumed bytes, and it leaves recv_skb as NULL for the next
receive callback. That breaks the receive_buf() accounting contract and
can also lead to a NULL dereference on the next skb_put_u8().
Allocate the receive skb lazily before consuming the next byte instead.
If allocation fails, return the number of bytes already accepted.
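The lazy-allocation approach can be sketched with a toy receive callback. This is a userspace model under stated assumptions: rx_sketch, rx_receive() and the fail_alloc knob are hypothetical, a "frame" is simply cap bytes, and malloc() stands in for alloc_skb().

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct rx_sketch {
	uint8_t *buf;	/* NULL until lazily allocated */
	size_t len;	/* bytes currently stored in buf */
	size_t cap;	/* toy frame size */
	int fail_alloc;	/* test knob: simulate allocation failure */
};

/* Returns the number of bytes accepted from data[0..count). */
static size_t rx_receive(struct rx_sketch *rx, const uint8_t *data,
			 size_t count)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (!rx->buf) {
			/* Allocate before consuming the next byte. */
			rx->buf = rx->fail_alloc ? NULL : malloc(rx->cap);
			if (!rx->buf)
				return i;	/* bytes accepted so far */
			rx->len = 0;
		}

		rx->buf[rx->len++] = data[i];

		if (rx->len == rx->cap) {
			/* Frame complete: hand it off (here: just free it). */
			free(rx->buf);
			rx->buf = NULL;
		}
	}
	return i;
}
```

The return value always matches the bytes actually stored, and a failed allocation is retried on the next call because the buffer pointer is checked before every store.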
drm/i915/gem: Drop check for changed VM in EXECBUF
Since the introduction of d4433c7600f7 ("drm/i915/gem: Use the proto-context
to handle create parameters (v5)") it has not been possible for VM to change
after context creation so the check will never fail.
Sima's analysis:
This check was added in f7ce8639f6ff ("drm/i915/gem: Split the context's
obj:vma lut into its own mutex") but without any hint in the commit
message as to why. In another hunk of that commit there's a hint though in
__eb_add_lut:
/* user racing with ctx set-vm */
This would mean that this bug was introduced in e0695db7298e ("drm/i915:
Create/destroy VM (ppGTT) for use with contexts"), which allowed changing
gem_ctx->vm at runtime, opening up the race that was partially fixed
in the earlier referenced commit about a year later.
But it cannot be exploited anymore in anything remotely recent because
with the introduction of proto-contexts we've made gem_ctx->vm invariant
again, exactly to preemptively close all these potential issues.
Specifically d4433c7600f7 ("drm/i915/gem: Use the proto-context to handle
create parameters (v5)") is the vm specific part of the proto-context
work.
gpio: tegra: fix irq_release_resources calling enable instead of disable
tegra_gpio_irq_release_resources() erroneously calls tegra_gpio_enable()
instead of tegra_gpio_disable(). When IRQ resources are released, the
GPIO configuration bit (CNF) should be cleared to deconfigure the pin as
a GPIO. Leaving it enabled wastes power and can cause unexpected behavior
if the pin is later reused for an alternate function via pinctrl.
syzbot reported a WARN on my patch series [1]. The actual issue is an
overflow of 16-bit UDP length field, and it exists in the upstream code.
My series added a debug WARN with an overflow check that exposed the
issue, that's why syzbot tripped on my patches, rather than on upstream
code.
It basically sends an oversized (0x34000 bytes) PPPoL2TP packet with UDP
encapsulation, and l2tp_xmit_core doesn't check for overflows when it
assigns the UDP length field. The value gets trimmed to 16 bits.
Add an overflow check that drops oversized packets and avoids sending
packets with trimmed UDP length to the wire.
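The truncation can be illustrated with a small userspace sketch. The helper names below are hypothetical and do not come from l2tp_xmit_core; treating 0x34000 as the payload length ahead of an 8-byte UDP header is an assumption made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define UDP_HDR_LEN 8u /* size of a UDP header in bytes */

/* What an unchecked assignment to a 16-bit length field keeps. */
static uint16_t udp_len_truncated(size_t payload_len)
{
	return (uint16_t)(payload_len + UDP_HDR_LEN);
}

/* True when header plus payload fits the 16-bit UDP length field. */
static bool udp_len_fits(size_t payload_len)
{
	return payload_len <= UINT16_MAX - UDP_HDR_LEN;
}
```

A sender following this pattern would call udp_len_fits() first and drop the packet when it returns false, rather than putting the truncated value on the wire.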
net: ipa: fix event ring index not programmed for IPA v5.0+
For IPA v5.0+, the event ring index field moved from CH_C_CNTXT_0 to
CH_C_CNTXT_1. The v5.0 register definition intended to define this
field in the CH_C_CNTXT_1 fmask array but used the old identifier of
ERINDEX instead of CH_ERINDEX.
Without a valid event ring, GSI channels could never signal transfer
completions. This caused gsi_channel_trans_quiesce() to block
forever in wait_for_completion().
At least for IPA v5.2 this resolves an issue seen where runtime
suspend, system suspend, and remoteproc stop all hung forever. It
also meant the IPA data path was completely non-functional.
Fixes: faf0678ec8a0 ("net: ipa: add IPA v5.0 GSI register definitions") Signed-off-by: Alexander Koskovich <akoskovich@pm.me> Signed-off-by: Luca Weiss <luca.weiss@fairphone.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260403-milos-ipa-v1-2-01e9e4e03d3e@fairphone.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Merge tag 'asoc-fix-v7.0-rc7' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v7.0
A somewhat larger set of fixes than I'd like unfortunately, not from any
one place but rather spread out over different drivers. We've got a
bunch more fixes for the SDCA interrupt support, several relatively
minor SOF fixes, a few more driver specific fixes and a couple more AMD
quirks.
Emil converts to use spinlock_t for virtchnl transactions to make
consistent use of the xn_bm_lock when accessing the free_xn_bm bitmap,
while also avoiding nested raw/bh spinlock issue on PREEMPT_RT kernels.
He also sets payload size before calling the async handler, to make sure
it doesn't error out prematurely due to invalid size check for idpf.
Kohei Enju changes WARN_ON for missing PTP control PF to a dev_info() on
ice as there are cases where this is expected and acceptable.
Petr Oros fixes conditions in which error paths failed to call
ice_ptp_port_phy_restart() breaking PTP functionality on ice.
Alex significantly reduces reporting of driver information, and time
under RTNL lock, on ixgbe e610 devices by limiting reads of flash info
to events that could change it.
Michal Schmidt adds missing Hyper-V op on ixgbevf.
Alex Dvoretsky removes call to napi_synchronize() in igb_down() to
resolve a deadlock.
Agalakov Daniil adds error check on e1000 for failed EEPROM read.
* '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
e1000: check return value of e1000_read_eeprom
igb: remove napi_synchronize() in igb_down()
ixgbevf: add missing negotiate_features op to Hyper-V ops table
ixgbe: stop re-reading flash on every get_drvinfo for e610
ice: fix PTP timestamping broken by SyncE code on E825C
ice: ptp: don't WARN when controlling PF is unavailable
idpf: set the payload size before calling the async handler
idpf: improve locking around idpf_vc_xn_push_free()
idpf: fix PREEMPT_RT raw/bh spinlock nesting for async VC handling
====================
Li RongQing [Tue, 7 Apr 2026 02:27:30 +0000 (22:27 -0400)]
devlink: Fix incorrect skb socket family dumping
The devlink_fmsg_dump_skb function was incorrectly using the socket
type (sk->sk_type) instead of the socket family (sk->sk_family)
when filling the "family" field in the fast message dump.
This patch fixes this to properly display the socket family.
Jiexun Wang [Tue, 7 Apr 2026 08:00:14 +0000 (16:00 +0800)]
af_unix: read UNIX_DIAG_VFS data under unix_state_lock
Exact UNIX diag lookups hold a reference to the socket, but not to
u->path. Meanwhile, unix_release_sock() clears u->path under
unix_state_lock() and drops the path reference after unlocking.
Read the inode and device numbers for UNIX_DIAG_VFS while holding
unix_state_lock(), then emit the netlink attribute after dropping the
lock.
This keeps the VFS data stable while the reply is being built.
Fixes: 5f7b0569460b ("unix_diag: Unix inode info NLA") Reported-by: Yifan Wu <yifanwucs@gmail.com> Reported-by: Juefei Pu <tomapufckgml@gmail.com> Co-developed-by: Yuan Tan <yuantan098@gmail.com> Signed-off-by: Yuan Tan <yuantan098@gmail.com> Suggested-by: Xin Liu <bird@lzu.edu.cn> Tested-by: Ren Wei <enjou1224z@gmail.com> Signed-off-by: Jiexun Wang <wangjiexun2025@gmail.com> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20260407080015.1744197-1-n05ec@lzu.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Revert "mptcp: add needs_id for netlink appending addr"
This commit was originally adding the ability to add MPTCP endpoints
with ID 0 by accident. The in-kernel PM, handling MPTCP endpoints at the
net namespace level, is not supposed to handle endpoints with such ID,
because this ID 0 is reserved to the initial subflow, as mentioned in
the MPTCPv1 protocol [1], a per-connection setting.
Note that 'ip mptcp endpoint add id 0' stops early with an error, but
other tools might still request the in-kernel PM to create MPTCP
endpoints with this restricted ID 0.
In other words, it was wrong to call the mptcp_pm_has_addr_attr_id
helper to check whether the address ID attribute is set: if it was set
to 0, a new MPTCP endpoint would be created with ID 0, which is not
expected, and might cause various issues later.
mptcp: fix slab-use-after-free in __inet_lookup_established
The ehash table lookups are lockless and rely on
SLAB_TYPESAFE_BY_RCU to guarantee socket memory stability
during RCU read-side critical sections. Both tcp_prot and
tcpv6_prot have their slab caches created with this flag
via proto_register().
However, MPTCP's mptcp_subflow_init() copies tcpv6_prot into
tcpv6_prot_override during inet_init() (fs_initcall, level 5),
before inet6_init() (module_init/device_initcall, level 6) has
called proto_register(&tcpv6_prot). At that point,
tcpv6_prot.slab is still NULL, so tcpv6_prot_override.slab
remains NULL permanently.
This causes MPTCP v6 subflow child sockets to be allocated via
kmalloc (falling into kmalloc-4k) instead of the TCPv6 slab
cache. The kmalloc-4k cache lacks SLAB_TYPESAFE_BY_RCU, so
when these sockets are freed without SOCK_RCU_FREE (which is
cleared for child sockets by design), the memory can be
immediately reused. Concurrent ehash lookups under
rcu_read_lock can then access freed memory, triggering a
slab-use-after-free in __inet_lookup_established.
Fix this by splitting the IPv6-specific initialization out of
mptcp_subflow_init() into a new mptcp_subflow_v6_init(), called
from mptcp_proto_v6_init() before protocol registration. This
ensures tcpv6_prot_override.slab correctly inherits the
SLAB_TYPESAFE_BY_RCU slab cache.
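The ordering problem comes down to a struct copy taking a snapshot of a field that is only filled in later. A minimal sketch, assuming hypothetical names (struct proto_sketch, register_proto(), make_override()) that merely mirror the roles of tcpv6_prot, proto_register() and tcpv6_prot_override:

```c
#include <stddef.h>

struct proto_sketch {
	const char *name;
	void *slab; /* set only by registration */
};

static int slab_cache_storage; /* stands in for a real slab cache */

/* Registration is what gives the protocol its slab cache. */
static void register_proto(struct proto_sketch *p)
{
	p->slab = &slab_cache_storage;
}

/* Struct assignment snapshots the fields as they are right now. */
static struct proto_sketch make_override(const struct proto_sketch *base)
{
	struct proto_sketch ovr = *base;

	return ovr;
}
```

An override made before register_proto() keeps a NULL slab forever, while one made afterwards inherits the cache, which is the effect the reordering in the fix is after.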
net: txgbe: leave space for null terminators on property_entry
Lists of struct property_entry are supposed to be terminated with an
empty property, but this driver currently seems to be allocating exactly
the number of entries used.
Change the struct definition to leave an extra element for all
property_entry.
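The sentinel convention can be sketched as follows. This is an illustration only: struct prop, NUM_PROPS, the property names and count_props() are hypothetical and simplified compared with the kernel's struct property_entry.

```c
#include <stddef.h>

struct prop {
	const char *name; /* NULL name marks the end of the list */
	int value;
};

#define NUM_PROPS 2

/* One extra zero-initialised slot acts as the terminator. */
static const struct prop props[NUM_PROPS + 1] = {
	{ "speed", 1000 },
	{ "lanes", 4 },
	/* remaining element is implicitly { NULL, 0 } */
};

/* Walks the list the way a sentinel-based consumer would. */
static size_t count_props(const struct prop *p)
{
	size_t n = 0;

	while (p[n].name)
		n++;
	return n;
}
```

Had the array been declared with exactly NUM_PROPS elements, count_props() would read past the end looking for the terminator.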
Fixes: c3e382ad6d15 ("net: txgbe: Add software nodes to support phylink") Signed-off-by: Fabio Baltieri <fabio.baltieri@gmail.com> Tested-by: Jiawen Wu <jiawenwu@trustnetic.com> Link: https://patch.msgid.link/20260405222013.5347-1-fabio.baltieri@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This code can lead to an out-of-bounds access of the dev->_tx[] array
when is_input is true. In such a case, the packet is on the RX path and
skb->queue_mapping contains the RX queue index of the ingress device. If
the ingress device has more RX queues than the egress device (dev) has
TX queues, skb_get_queue_mapping(skb) will exceed dev->num_tx_queues.
Add a check to avoid this situation since skb_get_tx_queue() does not
clamp the index. This issue has also revealed that per queue visibility
cannot be accurate and will be replaced later as a new feature.
While at it, add missing lock around qdisc_qstats_qlen_backlog(). The
function __ioam6_fill_trace_data() is called from both softirq and
process contexts, hence the use of spin_lock_bh() here.
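The bounds check described above can be sketched in userspace C. The names struct dev_sketch and tx_queue_index_ok() are hypothetical; the sketch shows only the range test, not the locking or the real IOAM code.

```c
#include <stdbool.h>
#include <stdint.h>

struct dev_sketch {
	unsigned int num_tx_queues;
};

/*
 * Returns true and stores the index when it is a valid TX queue of
 * dev; returns false when the recorded mapping is out of range.
 */
static bool tx_queue_index_ok(const struct dev_sketch *dev,
			      uint16_t queue_mapping, unsigned int *idx)
{
	if (queue_mapping >= dev->num_tx_queues)
		return false;
	*idx = queue_mapping;
	return true;
}
```

A mapping of 7 recorded on an ingress device with eight RX queues is rejected for an egress device with four TX queues, instead of indexing past the end of its queue array.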
Fixes: b63c5478e9cb ("ipv6: ioam: Support for Queue depth data field") Reported-by: Jakub Kicinski <kuba@kernel.org> Closes: https://lore.kernel.org/netdev/20260403214418.2233266-2-kuba@kernel.org/ Signed-off-by: Justin Iurman <justin.iurman@gmail.com> Link: https://patch.msgid.link/20260404134137.24553-1-justin.iurman@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 9 Apr 2026 01:56:17 +0000 (18:56 -0700)]
Merge tag 'wireless-2026-04-08' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless
Johannes Berg says:
====================
A few last-minute fixes:
- rfkill: prevent boundless event list
- rt2x00: fix USB resource management
- brcmfmac: validate firmware IDs
- brcmsmac: fix DMA free size
* tag 'wireless-2026-04-08' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
net: rfkill: prevent unlimited numbers of rfkill events from being created
wifi: rt2x00usb: fix devres lifetime
wifi: brcmfmac: validate bsscfg indices in IF events
wifi: brcmsmac: Fix dma_free_coherent() size
====================
1) Clear trailing padding in build_polexpire() to prevent
leaking uninitialized memory. From Yasuaki Torimaru.
2) Fix aevent size calculation when XFRMA_IF_ID is used.
From Keenan Dong.
3) Wait for RCU readers during policy netns exit before
freeing the policy hash tables.
4) Fix some too early dropped references on the netdev
when using transport mode. From Qi Tang.
5) Fix refcount leak in xfrm_migrate_policy_find().
From Kotlyarov Mihail.
6) Fix two info leaks in build_report() and
in build_mapping(). From Greg Kroah-Hartman.
7) Zero aligned sockaddr tail in PF_KEY exports.
From Zhengchuan Liang.
* tag 'ipsec-2026-04-08' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
net: af_key: zero aligned sockaddr tail in PF_KEY exports
xfrm_user: fix info leak in build_report()
xfrm_user: fix info leak in build_mapping()
xfrm: fix refcount leak in xfrm_migrate_policy_find
xfrm: hold dev ref until after transport_finish NF_HOOK
xfrm: Wait for RCU readers during policy netns exit
xfrm: account XFRMA_IF_ID in aevent size calculation
xfrm: clear trailing padding in build_polexpire()
====================
Jakub Kicinski [Thu, 9 Apr 2026 01:50:27 +0000 (18:50 -0700)]
Merge tag 'batadv-net-pullrequest-20260408' of https://git.open-mesh.org/linux-merge
Simon Wunderlich says:
====================
Here are two batman-adv bugfixes:
- reject oversized global TT response buffers, by Ruide Cao
- hold claim backbone gateways by reference, by Haoze Xie
* tag 'batadv-net-pullrequest-20260408' of https://git.open-mesh.org/linux-merge:
batman-adv: hold claim backbone gateways by reference
batman-adv: reject oversized global TT response buffers
====================
Jakub Kicinski [Thu, 9 Apr 2026 01:48:44 +0000 (18:48 -0700)]
Merge tag 'nf-26-04-08' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Florian Westphal says:
====================
netfilter updates for net
I only included crash fixes, as we're closer to a release, rest will
be handled via -next.
1) Fix a NULL pointer dereference in ip_vs_add_service error path, from
Weiming Shi, bug added in 6.2 development cycle.
2) Don't leak kernel data bytes from allocator to userspace: nfnetlink_log
needs to init the trailing NLMSG_DONE terminator. From Xiang Mei.
3) xt_multiport match lacks range validation, bogus userspace request will
cause out-of-bounds read. From Ren Wei.
4) ip6t_eui64 match must reject packets with invalid mac header before
calling eth_hdr. Make existing check unconditional. From Zhengchuan
Liang.
5) nft_ct timeout policies are free'd via kfree() while they may still
be reachable by other cpus that process a conntrack object that
uses such a timeout policy. Existing reaping of entries is not
sufficient because it doesn't wait for a grace period. Use kfree_rcu().
From Tuan Do.
6/7) Make nfnetlink_queue hash table per queue. As-is we can hit a page
fault in case underlying page of removed element was free'd. Per-queue
hash prevents parallel lookups. This comes with a test case that
demonstrates the bug, from Fernando Fernandez Mancera.
* tag 'nf-26-04-08' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
selftests: nft_queue.sh: add a parallel stress test
netfilter: nfnetlink_queue: make hash table per queue
netfilter: nft_ct: fix use-after-free in timeout object destroy
netfilter: ip6t_eui64: reject invalid MAC header for all packets
netfilter: xt_multiport: validate range encoding in checkentry
netfilter: nfnetlink_log: initialize nfgenmsg in NLMSG_DONE terminator
ipvs: fix NULL deref in ip_vs_add_service error path
====================
Jakub Kicinski [Thu, 9 Apr 2026 01:44:37 +0000 (18:44 -0700)]
Merge branch 'rxrpc-miscellaneous-fixes'
David Howells says:
====================
rxrpc: Miscellaneous fixes
Here are some fixes for rxrpc:
(1) Fix key quota calculation.
(2) Fix a memory leak.
(3) Fix rxrpc_new_client_call_for_sendmsg() to substitute NULL for an
empty key.
Might want to remove this substitution entirely or handle it in
rxrpc_init_client_call_security() instead.
(4) Fix deletion of call->link to be RCU safe.
(5) Fix missing bounds checks when parsing RxGK tickets.
(6) Fix use of wrong skbuff to get challenge serial number. Also actually
substitute the newer response skbuff and release the older one.
(7) Fix unexpected RACK timer warning to report old mode.
(8) Fix call key refcount leak.
(9) Fix the interaction of jumbograms with Tx window space, setting the
request-ack flag when the window space is getting low, typically
because each jumbogram takes a big bite out of the window and fewer UDP
packets get traded.
(10) Don't call rxrpc_put_call() with a NULL pointer.
(11) Reject undecryptable rxkad response tickets by checking result of
decryption.
(12) Fix buffer bounds calculation in the RESPONSE authenticator parser.
(13) Fix oversized response length check.
(14) Fix refcount leak on multiple setting of server keyring.
(15) Fix checks made by RXRPC_SECURITY_KEY and RXRPC_SECURITY_KEYRING (both
should be allowed).
(16) Fix lack of result checking on calls to crypto_skcipher_en/decrypt().
(17) Fix token_len limit check in rxgk_verify_response().
(18) Fix rxgk context leak in rxgk_verify_response().
(19) Fix read beyond end of buffer in rxgk_do_verify_authenticator().
(20) Fix parsing of RESPONSE packet on a connection that has already been set
from a prior response.
(21) Fix size of buffers used for rendering addresses for procfiles.
====================
rxrpc: proc: size address buffers for %pISpc output
The AF_RXRPC procfs helpers format local and remote socket addresses into
fixed 50-byte stack buffers with "%pISpc".
That is too small for the longest current-tree IPv6-with-port form the
formatter can produce. In lib/vsprintf.c, the compressed IPv6 path uses a
dotted-quad tail not only for v4mapped addresses, but also for ISATAP
addresses via ipv6_addr_is_isatap().
An ISATAP address with a dotted-quad tail and a five-digit port is
therefore possible with the current formatter. That is 50 visible characters, so
51 bytes including the trailing NUL, which does not fit in the existing
char[50] buffers used by net/rxrpc/proc.c.
Size the buffers from the formatter's maximum textual form and switch the
call sites to scnprintf().
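The size arithmetic can be checked with a short userspace sketch. The address string below is an illustrative ISATAP-style example chosen here to reach 50 characters; it is an assumption, not text quoted from the changelog, and format_addr() is a hypothetical helper using snprintf() rather than the kernel's "%pISpc" formatter.

```c
#include <stdio.h>
#include <string.h>

/* Illustrative ISATAP-style address with a dotted-quad tail and port. */
#define EXAMPLE_ADDR "[ffff:ffff:ffff:ffff:0:5efe:255.255.255.255]:65535"

/* Formats into buf; returns the length snprintf() wanted to write. */
static int format_addr(char *buf, size_t size)
{
	return snprintf(buf, size, "%s", EXAMPLE_ADDR);
}
```

With a char[50] buffer the last character is cut off to make room for the trailing NUL, while a char[51] buffer holds the whole string.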
Changes since v1:
- correct the changelog to cite the actual maximum current-tree case
explicitly
- frame the proof around the ISATAP formatting path instead of the earlier
mapped-v4 example
Fixes: 75b54cb57ca3 ("rxrpc: Add IPv6 support") Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn> Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Anderson Nascimento <anderson@allelesecurity.com>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
cc: stable@kernel.org Link: https://patch.msgid.link/20260408121252.2249051-22-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Wang Jie [Wed, 8 Apr 2026 12:12:48 +0000 (13:12 +0100)]
rxrpc: only handle RESPONSE during service challenge
Only process RESPONSE packets while the service connection is still in
RXRPC_CONN_SERVICE_CHALLENGING. Check that state under state_lock before
running response verification and security initialization, then use a local
secured flag to decide whether to queue the secured-connection work after
the state transition. This keeps duplicate or late RESPONSE packets from
re-running the setup path and removes the unlocked post-transition state
test.
Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both") Reported-by: Yifan Wu <yifanwucs@gmail.com> Reported-by: Juefei Pu <tomapufckgml@gmail.com> Co-developed-by: Yuan Tan <yuantan098@gmail.com> Signed-off-by: Yuan Tan <yuantan098@gmail.com> Suggested-by: Xin Liu <bird@lzu.edu.cn> Signed-off-by: Jie Wang <jiewang2024@lzu.edu.cn> Signed-off-by: Yang Yang <n05ec@lzu.edu.cn> Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Jeffrey Altman <jaltman@auristor.com>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
cc: stable@kernel.org Link: https://patch.msgid.link/20260408121252.2249051-21-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
David Howells [Wed, 8 Apr 2026 12:12:45 +0000 (13:12 +0100)]
rxrpc: Fix integer overflow in rxgk_verify_response()
In rxgk_verify_response(), there's a potential integer overflow due to
rounding up token_len before checking it, thereby allowing the length check to
be bypassed.
Fix this by checking the unrounded value against len too (len is limited as
the response must fit in a single UDP packet).
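How rounding before checking lets a huge value slip through can be sketched as follows. The helper names are hypothetical and the rounding to a multiple of 4 is an assumption about the alignment involved; the real code in rxgk_verify_response() may differ in detail.

```c
#include <stdbool.h>
#include <stdint.h>

/* Round up to a multiple of 4; wraps for values near UINT32_MAX. */
static uint32_t round_up4(uint32_t v)
{
	return (v + 3u) & ~3u;
}

/* Flawed: only the rounded value is compared against len. */
static bool token_fits_rounded_only(uint32_t token_len, uint32_t len)
{
	return round_up4(token_len) <= len;
}

/* Safer: reject on the unrounded value first, then on the rounded one. */
static bool token_fits_checked(uint32_t token_len, uint32_t len)
{
	if (token_len > len)
		return false;
	return round_up4(token_len) <= len;
}
```

A token_len of 0xfffffffe rounds up to 0 and passes the flawed check, but is rejected once the unrounded value is compared too. The second rounding cannot wrap here as long as len is small, which matches the note that the response must fit in a single UDP packet.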
Fixes: 9d1d2b59341f ("rxrpc: rxgk: Implement the yfs-rxgk security class (GSSAPI)") Closes: https://sashiko.dev/#/patchset/20260401105614.1696001-10-dhowells@redhat.com Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Jeffrey Altman <jaltman@auristor.com>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
cc: stable@kernel.org Link: https://patch.msgid.link/20260408121252.2249051-18-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
David Howells [Wed, 8 Apr 2026 12:12:44 +0000 (13:12 +0100)]
rxrpc: Fix missing error checks for rxkad encryption/decryption failure
Add error checking for failure of crypto_skcipher_en/decrypt() to various
rxkad functions as the crypto functions can fail with ENOMEM at least.
Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both") Closes: https://sashiko.dev/#/patchset/20260401105614.1696001-10-dhowells@redhat.com Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Jeffrey Altman <jaltman@auristor.com>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
cc: stable@kernel.org Link: https://patch.msgid.link/20260408121252.2249051-17-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
David Howells [Wed, 8 Apr 2026 12:12:43 +0000 (13:12 +0100)]
rxrpc: Fix key/keyring checks in setsockopt(RXRPC_SECURITY_KEY/KEYRING)
An AF_RXRPC socket can be both client and server at the same time. When
sending new calls (ie. it's acting as a client), it uses rx->key to set the
security, and when accepting incoming calls (ie. it's acting as a server),
it uses rx->securities.
setsockopt(RXRPC_SECURITY_KEY) sets rx->key to point to an rxrpc-type key
and setsockopt(RXRPC_SECURITY_KEYRING) sets rx->securities to point to a
keyring of rxrpc_s-type keys.
Now, it should be possible to use both rx->key and rx->securities on the
same socket - but for userspace AF_RXRPC sockets rxrpc_setsockopt()
prevents that.
Fix this by:
(1) Remove the incorrect check rxrpc_setsockopt(RXRPC_SECURITY_KEYRING)
makes on rx->key.
(2) Move the check that rxrpc_setsockopt(RXRPC_SECURITY_KEY) makes on
rx->key down into rxrpc_request_key().
(3) Remove rxrpc_request_key()'s check on rx->securities.
This (in combination with a previous patch) pushes the checks down into the
functions that set those pointers and removes the cross-checks that prevent
both key and keyring being set.
Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both") Closes: https://sashiko.dev/#/patchset/20260401105614.1696001-10-dhowells@redhat.com Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Anderson Nascimento <anderson@allelesecurity.com>
cc: Luxiao Xu <rakukuip@gmail.com>
cc: Yuan Tan <yuantan098@gmail.com>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
cc: stable@kernel.org Link: https://patch.msgid.link/20260408121252.2249051-16-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
rxgk_verify_response() decodes auth_len from the packet and is supposed
to verify that it fits in the remaining bytes. The existing check is
inverted, so oversized RESPONSE authenticators are accepted and passed
to rxgk_decrypt_skb(), which can later reach skb_to_sgvec() with an
impossible length and hit BUG_ON(len).
Decoded from the original latest-net reproduction logs with
scripts/decode_stacktrace.sh:
rxgk_verify_authenticator() copies auth_len bytes into a temporary
buffer and then passes p + auth_len as the parser limit to
rxgk_do_verify_authenticator(). Since p is a __be32 *, that inflates the
parser end pointer by a factor of four and lets malformed RESPONSE
authenticators read past the kmalloc() buffer.
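The factor-of-four inflation is ordinary C pointer arithmetic, shown here in a userspace sketch. uint32_t stands in for __be32, and end_scaled() and end_bytes() are hypothetical helpers; the byte-wise version shows the intended limit and is not claimed to be the exact form of the fix.

```c
#include <stddef.h>
#include <stdint.h>

/* Wrong: adding a byte count to a 32-bit-word pointer scales it by 4. */
static const uint32_t *end_scaled(const uint32_t *p, size_t auth_len)
{
	return p + auth_len;
}

/* Intended: advance by auth_len bytes. */
static const uint32_t *end_bytes(const uint32_t *p, size_t auth_len)
{
	return (const uint32_t *)((const unsigned char *)p + auth_len);
}
```

For a 16-byte authenticator, end_scaled() lands 64 bytes past the start while end_bytes() lands 16 bytes past it, so a parser bounded by the former can read 48 bytes beyond the data.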
Decoded from the original latest-net reproduction logs with
scripts/decode_stacktrace.sh:
rxkad_decrypt_ticket() decrypts the RXKAD response ticket and then
parses the buffer as plaintext without checking whether
crypto_skcipher_decrypt() succeeded.
A malformed RESPONSE can therefore use a non-block-aligned ticket
length, make the decrypt operation fail, and still drive the ticket
parser with attacker-controlled bytes.
Check the decrypt result and abort the connection with RXKADBADTICKET
when ticket decryption fails.
Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both") Reported-by: Yifan Wu <yifanwucs@gmail.com> Reported-by: Juefei Pu <tomapufckgml@gmail.com> Co-developed-by: Yuan Tan <yuantan098@gmail.com> Signed-off-by: Yuan Tan <yuantan098@gmail.com> Suggested-by: Xin Liu <bird@lzu.edu.cn> Tested-by: Ren Wei <enjou1224z@gmail.com> Signed-off-by: Yuqi Xu <xuyuqiabc@gmail.com> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn> Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
cc: stable@kernel.org Link: https://patch.msgid.link/20260408121252.2249051-12-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Douya Le [Wed, 8 Apr 2026 12:12:38 +0000 (13:12 +0100)]
rxrpc: Only put the call ref if one was acquired
rxrpc_input_packet_on_conn() can process a to-client packet after the
current client call on the channel has already been torn down. In that
case chan->call is NULL, rxrpc_try_get_call() returns NULL and there is
no reference to drop.
The client-side implicit-end error path does not account for that and
unconditionally calls rxrpc_put_call(). This turns a protocol error
path into a kernel crash instead of rejecting the packet.
Only drop the call reference if one was actually acquired. Keep the
existing protocol error handling unchanged.
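The pattern of dropping a reference only when one was taken can be sketched as follows. All names here (struct call_sketch, try_get_call(), put_call(), reject_packet()) are hypothetical simplifications that only mirror the roles of rxrpc_try_get_call() and rxrpc_put_call().

```c
#include <stddef.h>

struct call_sketch {
	int refs;
};

/* Takes a reference if the call exists; returns NULL otherwise. */
static struct call_sketch *try_get_call(struct call_sketch *call)
{
	if (!call)
		return NULL;
	call->refs++;
	return call;
}

static void put_call(struct call_sketch *call)
{
	call->refs--; /* would dereference NULL if called unconditionally */
}

/* Error path: reject the packet, dropping a ref only if one was taken. */
static int reject_packet(struct call_sketch *chan_call)
{
	struct call_sketch *call = try_get_call(chan_call);

	if (call)
		put_call(call);
	return -1; /* protocol error either way */
}
```

The return value is the same whether or not a call was present, so the protocol error handling stays unchanged and only the NULL dereference goes away.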
Fixes: 5e6ef4f1017c ("rxrpc: Make the I/O thread take over the call and local processor work") Reported-by: Yifan Wu <yifanwucs@gmail.com> Reported-by: Juefei Pu <tomapufckgml@gmail.com> Signed-off-by: Douya Le <ldy3087146292@gmail.com> Co-developed-by: Yuan Tan <tanyuan98@gmail.com> Signed-off-by: Yuan Tan <tanyuan98@gmail.com> Suggested-by: Xin Liu <bird@lzu.edu.cn> Signed-off-by: Ao Zhou <n05ec@lzu.edu.cn> Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
cc: stable@kernel.org Link: https://patch.msgid.link/20260408121252.2249051-11-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Marc Dionne [Wed, 8 Apr 2026 12:12:37 +0000 (13:12 +0100)]
rxrpc: Fix to request an ack if window is limited
Peers may only send immediate acks for every 2 UDP packets received.
When sending a jumbogram, it is important to check that there is
sufficient window space to send another same sized jumbogram following
the current one, and request an ack if there isn't. Failure to do so may
cause the call to stall waiting for an ack until the resend timer fires.
Where jumbograms are in use this causes a very significant drop in
performance.
Fixes: fe24a5494390 ("rxrpc: Send jumbo DATA packets") Signed-off-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jeffrey Altman <jaltman@auristor.com>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
cc: stable@kernel.org Link: https://patch.msgid.link/20260408121252.2249051-10-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>