Linus Torvalds [Tue, 10 Feb 2026 00:00:21 +0000 (16:00 -0800)]
Merge tag 'hfs-v7.0-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/vdubeyko/hfs
Pull hfs/hfsplus updates from Viacheslav Dubeyko:
"This pull request contains several fixes of syzbot reported issues and
HFS+ fixes of xfstests failures.
- fix an issue reported by syzbot triggering BUG_ON() in the case of
corrupted superblock, replacing the BUG_ON()s with proper error
handling (Jori Koolstra)
- fix memory leaks in the mount logic of HFS/HFS+ file systems. When
HFS/HFS+ were converted to the new mount api a bug was introduced
by changing the allocation pattern of sb->s_fs_info (Mehdi Ben Hadj
Khelifa)
- fix hfs_bnode_create() by returning ERR_PTR(-EEXIST) instead of
the node pointer when it's already hashed. This avoids a double
unload_nls() on mount failure (suggested by Shardul Bankar)
- set inode's mode as regular file for system inodes (Tetsuo Handa)
The rest fix failures in generic/020, generic/037, generic/062,
generic/480, and generic/498 xfstests for the case of HFS+ file
system. Currently, only 30 xfstests' test-cases experience failures
for HFS+ file system (initially, it was around 100 failed xfstests)"
* tag 'hfs-v7.0-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/vdubeyko/hfs:
hfsplus: avoid double unload_nls() on mount failure
hfsplus: fix warning issue in inode.c
hfsplus: fix generic/062 xfstests failure
hfsplus: fix generic/037 xfstests failure
hfsplus: pretend special inodes as regular files
hfsplus: return error when node already exists in hfs_bnode_create
hfs: Replace BUG_ON with error handling for CNID count checks
hfsplus: fix generic/020 xfstests failure
hfsplus: fix volume corruption issue for generic/498
hfsplus: fix volume corruption issue for generic/480
hfsplus: ensure sb->s_fs_info is always cleaned up
hfs: ensure sb->s_fs_info is always cleaned up
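The hfs_bnode_create() item above relies on the kernel's error-pointer convention. What follows is only a userspace sketch of that convention, not the HFS+ code: ERR_PTR(), IS_ERR() and PTR_ERR() are re-implemented here as stand-ins for the kernel helpers, and demo_tree, demo_node and demo_node_create() are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long err) { return (void *)err; }
static inline long PTR_ERR(const void *p) { return (long)p; }
static inline int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

#define DEMO_SLOTS 8

/* Hypothetical node cache, indexed directly by node id. */
struct demo_node { unsigned int id; };
struct demo_tree { struct demo_node *slots[DEMO_SLOTS]; };

/*
 * Create a node. If the id is already cached, report -EEXIST through the
 * pointer instead of handing back the existing node, so the caller cannot
 * mistake it for a freshly created one.
 */
static struct demo_node *demo_node_create(struct demo_tree *tree,
					  unsigned int id)
{
	struct demo_node *node;

	assert(id < DEMO_SLOTS);
	if (tree->slots[id])
		return ERR_PTR(-EEXIST);

	node = calloc(1, sizeof(*node));
	if (!node)
		return ERR_PTR(-ENOMEM);
	node->id = id;
	tree->slots[id] = node;
	return node;
}
```

A caller checks IS_ERR() on the result and takes its error path with PTR_ERR() when creation is refused, rather than continuing with a node it does not own.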
Linus Torvalds [Mon, 9 Feb 2026 23:55:41 +0000 (15:55 -0800)]
Merge tag 'nilfs2-v7.0-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/vdubeyko/nilfs2
Pull nilfs2 updates from Viacheslav Dubeyko:
- Fix potential block overflow that causes system hang
When executing the FITRIM command, an underflow can occur in the
calculation of nblocks. This ultimately leads to the block layer
function __blkdev_issue_discard() taking an excessively long time
to process the bio chain, and the ns_segctor_sem lock remains held
for a long period.
This prevents other tasks from acquiring the ns_segctor_sem lock,
resulting in a hang reported by syzbot (Edward Adam Davis)
- Fix missing struct keywords in nilfs2_api.h kernel-doc (Ryusuke
Konishi)
- Convert nilfs_super_block to kernel-doc
Eliminate 40+ kernel-doc warnings in nilfs2_ondisk.h by converting
all of the struct member comments to kernel-doc comments (Randy
Dunlap)
* tag 'nilfs2-v7.0-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/vdubeyko/nilfs2:
nilfs2: fix missing struct keywords in nilfs2_api.h kernel-doc
nilfs2: convert nilfs_super_block to kernel-doc
nilfs2: Fix potential block overflow that cause system hang
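The FITRIM item above describes an underflow in a block-count calculation. The sketch below shows the general hazard with unsigned arithmetic and one way to guard against it. It is not the nilfs2 code: the function names and the inclusive [start, end] range are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Unchecked: wraps to a huge value when end < start. */
static uint64_t demo_nblocks_unchecked(uint64_t start, uint64_t end)
{
	return end - start + 1;
}

/*
 * Checked: reject an inverted range before subtracting, so a bogus
 * block count is never handed on to a discard request.
 */
static int demo_nblocks_checked(uint64_t start, uint64_t end,
				uint64_t *nblocks)
{
	assert(nblocks != NULL);
	if (end < start)
		return -EINVAL;
	*nblocks = end - start + 1;
	return 0;
}
```

With start = 10 and end = 5 the unchecked helper yields a value close to 2^64, which is the kind of length that would keep a discard loop busy for a very long time.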
Linus Torvalds [Mon, 9 Feb 2026 23:45:21 +0000 (15:45 -0800)]
Merge tag 'for-6.20-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs updates from David Sterba:
"User visible changes, feature updates:
- when using block size > page size, enable direct IO
- fallback to buffered IO if the data profile has duplication,
workaround to avoid checksum mismatches on block group profiles
with redundancy, real direct IO is possible on single or RAID0
- redo export of zoned statistics, moved from sysfs to
/proc/pid/mountstats due to size limitations of the former
Experimental features:
- remove offload checksum tunable, intended to find best way to do it
but since we've switched to offload to thread for everything we
don't need it anymore
- initial support for remap-tree feature, a translation layer of
logical block addresses that allow changes without moving/rewriting
blocks to do eg. relocation, or other changes that require COW
Notable fixes:
- automatic removal of accidentally leftover chunks when
free-space-tree is enabled since mkfs.btrfs v6.16.1
- zoned mode:
- do not try to append to conventional zones when RAID is mixing
zoned and conventional drives
- fixup write pointers when mixing zoned and conventional on
DUP/RAID* profiles
- when using squota, relax deletion rules for qgroups with 0 members
to allow easier recovery from accounting bugs, also add more checks
to detect bad accounting
- fix periodic reclaim scanning, properly check boundary conditions
not to trigger it unexpectedly or miss the time to run it
- trim:
- continue after first error
- change reporting to the first detected error
- add more cancellation points
- reduce contention of big device lock that can block other
operations when there's lots of trimmed space
- when chunk allocation is forced (needs experimental build) fix
transaction abort when unexpected space layout is detected
Core:
- switch to crypto library API for checksumming, removed module
dependencies, pointer indirections, etc.
- error handling improvements
- adjust how and where transaction commits or aborts are done,
including cases where they may not be necessary
- minor compression optimization to skip single block ranges
- improve how compression folios are handled
- new and updated selftests
- cleanups, refactoring:
- auto-freeing and other automatic variable cleanup conversion
- structure size optimizations
- condition annotations"
* tag 'for-6.20-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (137 commits)
btrfs: get rid of compressed_bio::compressed_folios[]
btrfs: get rid of compressed_folios[] usage for encoded writes
btrfs: get rid of compressed_folios[] usage for compressed read
btrfs: remove the old btrfs_compress_folios() infrastructure
btrfs: switch to btrfs_compress_bio() interface for compressed writes
btrfs: introduce btrfs_compress_bio() helper
btrfs: zlib: introduce zlib_compress_bio() helper
btrfs: zstd: introduce zstd_compress_bio() helper
btrfs: lzo: introduce lzo_compress_bio() helper
btrfs: zoned: factor out the zone loading part into a testable function
btrfs: add cleanup function for btrfs_free_chunk_map
btrfs: tests: add cleanup functions for test specific functions
btrfs: raid56: fix memory leak of btrfs_raid_bio::stripe_uptodate_bitmap
btrfs: tests: add unit tests for pending extent walking functions
btrfs: fix EEXIST abort due to non-consecutive gaps in chunk allocation
btrfs: fix transaction commit blocking during trim of unallocated space
btrfs: handle user interrupt properly in btrfs_trim_fs()
btrfs: preserve first error in btrfs_trim_fs()
btrfs: continue trimming remaining devices on failure
btrfs: do not BUG_ON() in btrfs_remove_block_group()
...
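The trim fixes above (continue after the first error, report the first detected error) follow a common loop pattern. The sketch below illustrates only that pattern and is not the btrfs_trim_fs() implementation: demo_trim_all() is a hypothetical name, and per-device results are passed in as plain integers rather than produced by real discard requests.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Walk every device result. A failure does not stop the loop; only the
 * first error code is remembered and returned at the end.
 */
static int demo_trim_all(const int *dev_results, size_t ndevs,
			 size_t *trimmed_ok)
{
	int first_err = 0;
	size_t i;

	assert(dev_results != NULL && trimmed_ok != NULL);
	*trimmed_ok = 0;
	for (i = 0; i < ndevs; i++) {
		int ret = dev_results[i];

		if (ret) {
			if (!first_err)
				first_err = ret;  /* keep the first error */
			continue;                 /* keep trimming the rest */
		}
		(*trimmed_ok)++;
	}
	return first_err;
}
```

For per-device results { 0, -EIO, 0, -ENOSPC } the helper still processes all four entries, counts two successes, and returns -EIO.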
Linus Torvalds [Mon, 9 Feb 2026 23:13:05 +0000 (15:13 -0800)]
Merge tag 'vfs-7.0-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull misc vfs updates from Christian Brauner:
"This contains a mix of VFS cleanups, performance improvements, API
fixes, documentation, and a deprecation notice.
Scalability and performance:
- Rework pid allocation to only take pidmap_lock once instead of
twice during alloc_pid(), improving thread creation/teardown
throughput by 10-16% depending on false-sharing luck. Pad the
namespace refcount to reduce false-sharing
- Track file lock presence via a flag in ->i_opflags instead of
reading ->i_flctx, avoiding false-sharing with ->i_readcount on
open/close hot paths. Measured 4-16% improvement on 24-core
open-in-a-loop benchmarks
- Use a consume fence in locks_inode_context() to match the
store-release/load-consume idiom, eliminating a hardware fence on
some architectures
- Annotate cdev_lock with __cacheline_aligned_in_smp to prevent
false-sharing
- Remove a redundant DCACHE_MANAGED_DENTRY check in
__follow_mount_rcu() that never fires since the caller already
verifies it, eliminating a 100% mispredicted branch
- Fix a 100% mispredicted likely() in devcgroup_inode_permission()
that became wrong after a prior code reorder
Bug fixes and correctness:
- Make insert_inode_locked() wait for inode destruction instead of
skipping, fixing a corner case where two matching inodes could
exist in the hash
- Move f_mode initialization before file_ref_init() in alloc_file()
to respect the SLAB_TYPESAFE_BY_RCU ordering contract
- Add a WARN_ON_ONCE guard in try_to_free_buffers() for folios with
no buffers attached, preventing a null pointer dereference when
AS_RELEASE_ALWAYS is set but no release_folio op exists
- Fix select restart_block to store end_time as timespec64, avoiding
truncation of tv_sec on 32-bit architectures
- Make dump_inode() use get_kernel_nofault() to safely access inode
and superblock fields, matching the dump_mapping() pattern
API modernization:
- Make posix_acl_to_xattr() allocate the buffer internally since
every single caller was doing it anyway. Reduces boilerplate and
unnecessary error checking across ~15 filesystems
- Replace deprecated simple_strtoul() with kstrtoul() for the
ihash_entries, dhash_entries, mhash_entries, and mphash_entries
boot parameters, adding proper error handling
- Convert chardev code to use guard(mutex) and __free(kfree) cleanup
patterns
- Replace min_t() with min() or umin() in VFS code to avoid silently
truncating unsigned long to unsigned int
- Gate LOOKUP_RCU assertions behind CONFIG_DEBUG_VFS since callers
already check the flag
Deprecation:
- Begin deprecating legacy BSD process accounting (acct(2)). The
interface has numerous footguns and better alternatives exist
(eBPF)
Documentation:
- Fix and complete kernel-doc for struct export_operations, removing
duplicated documentation between ReST and source
- Fix kernel-doc warnings for __start_dirop() and ilookup5_nowait()
Testing:
- Add a kunit test for initramfs cpio handling of entries with
filesize > PATH_MAX
Misc:
- Add missing <linux/init_task.h> include in fs_struct.c"
* tag 'vfs-7.0-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (28 commits)
posix_acl: make posix_acl_to_xattr() alloc the buffer
fs: make insert_inode_locked() wait for inode destruction
initramfs_test: kunit test for cpio.filesize > PATH_MAX
fs: improve dump_inode() to safely access inode fields
fs: add <linux/init_task.h> for 'init_fs'
docs: exportfs: Use source code struct documentation
fs: move initializing f_mode before file_ref_init()
exportfs: Complete kernel-doc for struct export_operations
exportfs: Mark struct export_operations functions at kernel-doc
exportfs: Fix kernel-doc output for get_name()
acct(2): begin the deprecation of legacy BSD process accounting
device_cgroup: remove branch hint after code refactor
VFS: fix __start_dirop() kernel-doc warnings
fs: Describe @isnew parameter in ilookup5_nowait()
fs/namei: Remove redundant DCACHE_MANAGED_DENTRY check in __follow_mount_rcu
fs: only assert on LOOKUP_RCU when built with CONFIG_DEBUG_VFS
select: store end_time as timespec64 in restart block
chardev: Switch to guard(mutex) and __free(kfree)
namespace: Replace simple_strtoul with kstrtoul to parse boot params
dcache: Replace simple_strtoul with kstrtoul in set_dhash_entries
...
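The min_t() item above concerns silent truncation when the cast type is narrower than one of the operands. The userspace sketch below reproduces the effect. demo_min_t() only mimics the shape of the kernel macro, and the two helper functions are hypothetical names for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Mimics the shape of min_t(): both operands are cast before comparing. */
#define demo_min_t(type, a, b) \
	((type)(a) < (type)(b) ? (type)(a) : (type)(b))

/* Truncating variant: the 64-bit value is cut to 32 bits first. */
static uint32_t demo_min_truncating(uint64_t a, uint32_t b)
{
	return demo_min_t(uint32_t, a, b);
}

/* Widening variant: compare at 64 bits, then narrow the result. */
static uint32_t demo_min_widening(uint64_t a, uint32_t b)
{
	uint64_t m = a < b ? a : b;

	/* The minimum can never exceed b, so narrowing is safe. */
	assert(m <= UINT32_MAX);
	return (uint32_t)m;
}
```

With a = 2^32 and b = 4096 the truncating variant returns 0, because the upper bits of a are discarded before the comparison, while the widening variant returns 4096.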
Linus Torvalds [Mon, 9 Feb 2026 23:08:16 +0000 (15:08 -0800)]
Merge tag 'vfs-7.0-rc1.iomap' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs iomap updates from Christian Brauner:
- Erofs page cache sharing preliminaries:
Plumb a void *private parameter through iomap_read_folio() and
iomap_readahead() into iomap_iter->private, matching iomap DIO. Erofs
uses this to replace a bogus kmap_to_page() call, as preparatory work
for page cache sharing.
- Fix for invalid folio access:
Fix an invalid folio access when a folio without iomap_folio_state
is fully submitted to the IO helper — the helper may call
folio_end_read() at any time, so ctx->cur_folio must be invalidated
after full submission.
* tag 'vfs-7.0-rc1.iomap' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
iomap: fix invalid folio access after folio_end_read()
erofs: hold read context in iomap_iter if needed
iomap: stash iomap read ctx in the private field of iomap_iter
Linus Torvalds [Mon, 9 Feb 2026 22:43:47 +0000 (14:43 -0800)]
Merge tag 'vfs-7.0-rc1.namespace' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs mount updates from Christian Brauner:
- statmount: accept fd as a parameter
Extend struct mnt_id_req with a file descriptor field and a new
STATMOUNT_BY_FD flag. When set, statmount() returns mount information
for the mount the fd resides on — including detached mounts
(unmounted via umount2(MNT_DETACH)).
For detached mounts the STATMOUNT_MNT_POINT and STATMOUNT_MNT_NS_ID
mask bits are cleared since neither is meaningful. The capability
check is skipped for STATMOUNT_BY_FD since holding an fd already
implies prior access to the mount and equivalent information is
available through fstatfs() and /proc/pid/mountinfo without
privilege. Includes comprehensive selftests covering both attached
and detached mount cases.
- fs: Remove internal old mount API code (1 patch)
Now that every in-tree filesystem has been converted to the new
mount API, remove all the legacy shim code in fs_context.c that
handled unconverted filesystems. This deletes ~280 lines including
legacy_init_fs_context(), the legacy_fs_context struct, and
associated wrappers. The mount(2) syscall path for userspace remains
untouched. Documentation references to the legacy callbacks are
cleaned up.
- mount: add OPEN_TREE_NAMESPACE to open_tree()
Container runtimes currently use CLONE_NEWNS to copy the caller's
entire mount namespace — only to then pivot_root() and recursively
unmount everything they just copied. With large mount tables and
thousands of parallel container launches this creates significant
contention on the namespace semaphore.
OPEN_TREE_NAMESPACE copies only the specified mount tree (like
OPEN_TREE_CLONE) but returns a mount namespace fd instead of a
detached mount fd. The new namespace contains the copied tree mounted
on top of a clone of the real rootfs.
This functions as a combined unshare(CLONE_NEWNS) + pivot_root() in a
single syscall. Works with user namespaces: an unshare(CLONE_NEWUSER)
followed by OPEN_TREE_NAMESPACE creates a mount namespace owned by
the new user namespace. Mount namespace file mounts are excluded from
the copy to prevent cycles. Includes ~1000 lines of selftests
* tag 'vfs-7.0-rc1.namespace' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
selftests/open_tree: add OPEN_TREE_NAMESPACE tests
mount: add OPEN_TREE_NAMESPACE
fs: Remove internal old mount API code
selftests: statmount: tests for STATMOUNT_BY_FD
statmount: accept fd as a parameter
statmount: permission check should return EPERM
Linus Torvalds [Mon, 9 Feb 2026 22:25:37 +0000 (14:25 -0800)]
Merge tag 'vfs-7.0-rc1.atomic_open' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs atomic_open updates from Christian Brauner:
"Allow knfsd to use atomic_open()
While knfsd offers combined exclusive create and open results to
clients, on some filesystems those results are not atomic. The
separate vfs_create() + vfs_open() sequence in dentry_create() can
produce races and unexpected errors. For example, open O_CREAT with
mode 0 will succeed in creating the file but return -EACCES from
vfs_open(). Additionally, network filesystems benefit from reducing
remote round-trip operations by using a single atomic_open() call.
Teach dentry_create() -- whose sole caller is knfsd -- to use
atomic_open() for filesystems that support it"
* tag 'vfs-7.0-rc1.atomic_open' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
fs/namei: fix kernel-doc markup for dentry_create
VFS/knfsd: Teach dentry_create() to use atomic_open()
VFS: Prepare atomic_open() for dentry_create()
VFS: move dentry_create() from fs/open.c to fs/namei.c
Linus Torvalds [Mon, 9 Feb 2026 21:41:34 +0000 (13:41 -0800)]
Merge tag 'vfs-7.0-rc1.nullfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs nullfs update from Christian Brauner:
"Add a completely catatonic minimal pseudo filesystem called "nullfs"
and make pivot_root() work in the initramfs.
Currently pivot_root() does not work on the real rootfs because it
cannot be unmounted. Userspace has to recursively delete initramfs
contents manually before continuing boot, using the fragile
switch_root sequence (overmount + chroot).
Add nullfs, a minimal immutable filesystem that serves as the true
root of the mount hierarchy. The mutable rootfs (tmpfs/ramfs) is
mounted on top of it. This allows userspace to simply call pivot_root()
without the traditional switch_root workarounds. systemd already
handles this correctly. It tries pivot_root() first and falls back
to MS_MOVE only when that fails.
This also means rootfs mounts in unprivileged namespaces no longer
need MNT_LOCKED, since the immutable nullfs guarantees nothing can be
revealed by unmounting the covering mount.
nullfs is a single-instance filesystem (get_tree_single()) marked
SB_NOUSER | SB_I_NOEXEC | SB_I_NODEV with an immutable empty root
directory. This means sooner or later it can be used to overmount
other directories to hide their contents without any additional
protection needed.
We enable it unconditionally. If we see any real regression we'll
hide it behind a boot option.
nullfs will gain extensions beyond this in the future. It will serve as a
concept to support the creation of completely empty mount namespaces -
which is work coming up in the next cycle"
* tag 'vfs-7.0-rc1.nullfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
fs: use nullfs unconditionally as the real rootfs
docs: mention nullfs
fs: add immutable rootfs
fs: add init_pivot_root()
fs: ensure that internal tmpfs mount gets mount id zero
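For readers unfamiliar with the pivot_root() approach described in the nullfs message, the sketch below follows the pivot_root(".", ".") idiom documented in the pivot_root(2) manual page. It is an assumption about what such a sequence could look like in an init program, not code taken from this series. demo_pivot_to_new_root() is a hypothetical name, and the call needs CAP_SYS_ADMIN and a mount point as new_root to succeed.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/mount.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Switch the root mount to new_root and detach the old root.
 * Returns 0 on success or a negative errno value on failure.
 */
static int demo_pivot_to_new_root(const char *new_root)
{
	assert(new_root != NULL);

	if (chdir(new_root) < 0)
		return -errno;
	/* glibc has no pivot_root() wrapper, so use the raw syscall. */
	if (syscall(SYS_pivot_root, ".", ".") < 0)
		return -errno;
	/* The old root is now stacked at "."; detach it lazily. */
	if (umount2(".", MNT_DETACH) < 0)
		return -errno;
	if (chdir("/") < 0)
		return -errno;
	return 0;
}
```

A caller that has to run on older kernels as well could fall back to the MS_MOVE-based sequence when this helper fails, which is the behaviour the message attributes to systemd.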
Linus Torvalds [Mon, 9 Feb 2026 21:38:07 +0000 (13:38 -0800)]
Merge tag 'vfs-7.0-rc1.minix' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull minix update from Christian Brauner:
"Consolidate and strengthen superblock validation in
minix_check_superblock()
The minix filesystem driver does not validate several superblock
fields before using them during mount, allowing a crafted filesystem
image to trigger out-of-bounds accesses (reported by syzbot)"
* tag 'vfs-7.0-rc1.minix' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
minix: Add required sanity checking to minix_check_superblock()
Linus Torvalds [Mon, 9 Feb 2026 21:05:35 +0000 (13:05 -0800)]
Merge tag 'vfs-7.0-rc1.btrfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs updates for btrfs from Christian Brauner:
"This contains some changes for btrfs that are taken to the vfs tree to
stop duplicating VFS code for subvolume/snapshot dentry
Btrfs has carried private copies of the VFS may_delete() and
may_create() functions in fs/btrfs/ioctl.c for permission checks
during subvolume creation and snapshot destruction. These copies have
drifted out of sync with the VFS originals — btrfs_may_delete() is
missing the uid/gid validity check and btrfs_may_create() is missing
the audit_inode_child() call.
Export the VFS functions as may_{create,delete}_dentry() and switch
btrfs to use them, removing ~70 lines of duplicated code"
* tag 'vfs-7.0-rc1.btrfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
btrfs: use may_create_dentry() in btrfs_mksubvol()
btrfs: use may_delete_dentry() in btrfs_ioctl_snap_destroy()
fs: export may_create() as may_create_dentry()
fs: export may_delete() as may_delete_dentry()
Linus Torvalds [Mon, 9 Feb 2026 20:21:37 +0000 (12:21 -0800)]
Merge tag 'vfs-7.0-rc1.fserror' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs error reporting updates from Christian Brauner:
"This contains the changes to support generic I/O error reporting.
Filesystems currently have no standard mechanism for reporting
metadata corruption and file I/O errors to userspace via fsnotify.
Each filesystem (xfs, ext4, erofs, f2fs, etc.) privately defines
EFSCORRUPTED, and error reporting to fanotify is inconsistent or
absent entirely.
This introduces a generic fserror infrastructure built around struct
super_block that gives filesystems a standard way to queue metadata
and file I/O error reports for delivery to fsnotify.
Errors are queued via mempools and queue_work to avoid holding
filesystem locks in the notification path; unmount waits for pending
events to drain. A new super_operations::report_error callback lets
filesystem drivers respond to file I/O errors themselves (to be used
by an upcoming XFS self-healing patchset).
On the uapi side, EFSCORRUPTED and EUCLEAN are promoted from private
per-filesystem definitions to canonical errno.h values across all
architectures"
* tag 'vfs-7.0-rc1.fserror' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
ext4: convert to new fserror helpers
xfs: translate fsdax media errors into file "data lost" errors when convenient
xfs: report fs metadata errors via fsnotify
iomap: report file I/O errors to the VFS
fs: report filesystem and file I/O errors to fsnotify
uapi: promote EFSCORRUPTED and EUCLEAN to errno.h
Linus Torvalds [Mon, 9 Feb 2026 19:59:07 +0000 (11:59 -0800)]
Merge tag 'vfs-7.0-rc1.leases' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs lease updates from Christian Brauner:
"This contains updates for lease support to require filesystems to
explicitly opt-in to lease support
Currently kernel_setlease() falls through to generic_setlease() when a
filesystem does not define ->setlease(), silently granting lease
support to every filesystem regardless of whether it is prepared for
it.
This is a poor default: most filesystems never intended to support
leases, and the silent fallthrough makes it impossible to distinguish
"supports leases" from "never thought about it".
This inverts the default. It adds explicit
.setlease = generic_setlease;
assignments to every in-tree filesystem that should retain lease
support, then changes kernel_setlease() to return -EINVAL when
->setlease is NULL.
With the new default in place, simple_nosetlease() is redundant and
is removed along with all references to it"
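The inverted default described above can be modelled in a few lines. This is a userspace sketch under stated assumptions, not the kernel code: the demo_ structures are simplified stand-ins for struct file and struct file_operations, and the functions only mirror the names mentioned in the message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_file;

/* Simplified stand-in for the operations table. */
struct demo_file_operations {
	int (*setlease)(struct demo_file *file, int arg);
};

struct demo_file {
	const struct demo_file_operations *f_op;
	int lease;
};

/* Stand-in for the generic implementation a filesystem opts in to. */
static int demo_generic_setlease(struct demo_file *file, int arg)
{
	file->lease = arg;
	return 0;
}

/* New default: no ->setlease means no lease support. */
static int demo_kernel_setlease(struct demo_file *file, int arg)
{
	assert(file != NULL && file->f_op != NULL);
	if (!file->f_op->setlease)
		return -EINVAL;
	return file->f_op->setlease(file, arg);
}
```

A filesystem that wants leases sets .setlease = demo_generic_setlease in its table; one that leaves the member NULL gets -EINVAL, so "never thought about it" is no longer treated as "supports leases".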
Linus Torvalds [Mon, 9 Feb 2026 19:25:01 +0000 (11:25 -0800)]
Merge tag 'vfs-7.0-rc1.nonblocking_timestamps' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs timestamp updates from Christian Brauner:
"This contains the changes to support non-blocking timestamp updates.
Since commit 66fa3cedf16a ("fs: Add async write file modification
handling") file_update_time_flags() unconditionally returns -EAGAIN
when any timestamp needs updating and IOCB_NOWAIT is set. This makes
non-blocking direct writes impossible on file systems with granular
enough timestamps, which in practice means all of them.
This reworks the timestamp update path to propagate IOCB_NOWAIT
through ->update_time so that file systems which can update timestamps
without blocking are no longer penalized.
With that groundwork in place, the core change passes IOCB_NOWAIT into
->update_time and returns -EAGAIN only when the file system indicates
it would block.
XFS implements non-blocking timestamp updates by using the new
->sync_lazytime and open-coding generic_update_time without the
S_NOWAIT check, since the lazytime path through the generic helpers
can never block in XFS"
* tag 'vfs-7.0-rc1.nonblocking_timestamps' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
xfs: enable non-blocking timestamp updates
xfs: implement ->sync_lazytime
fs: refactor file_update_time_flags
fs: add support for non-blocking timestamp updates
fs: add a ->sync_lazytime method
fs: factor out a sync_lazytime helper
fs: refactor ->update_time handling
fat: cleanup the flags for fat_truncate_time
nfs: split nfs_update_timestamps
fs: allow error returns from generic_update_time
fs: remove inode_update_time
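The behavioural change described in the timestamp message can be summarised with a small model. The sketch below is illustrative only and does not reflect the real ->update_time signature: DEMO_NOWAIT, struct demo_inode and both functions are hypothetical, and a boolean field stands in for the filesystem's own decision about whether it would block.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_NOWAIT 0x1u

struct demo_inode {
	bool update_would_block;  /* decided by the filesystem */
	bool time_updated;
};

/* Old behaviour: any non-blocking caller is refused outright. */
static int demo_update_time_old(struct demo_inode *inode, unsigned int flags)
{
	assert(inode != NULL);
	if (flags & DEMO_NOWAIT)
		return -EAGAIN;
	inode->time_updated = true;
	return 0;
}

/* New behaviour: refuse only if this filesystem would really block. */
static int demo_update_time_new(struct demo_inode *inode, unsigned int flags)
{
	assert(inode != NULL);
	if ((flags & DEMO_NOWAIT) && inode->update_would_block)
		return -EAGAIN;
	inode->time_updated = true;
	return 0;
}
```

In this model a filesystem whose timestamp update never blocks completes a non-blocking write directly, where the old path forced the caller to retry in a blocking context.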
Linus Torvalds [Mon, 9 Feb 2026 19:03:25 +0000 (11:03 -0800)]
Merge tag 'vfs-7.0-rc1.initrd' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs initrd removal from Christian Brauner:
"Remove the deprecated linuxrc-based initrd code path and related dead
code. The linuxrc initrd path was deprecated in 2020 and this series
completes its removal. If we see real-life regressions we'll revert.
The core change removes handle_initrd() and init_linuxrc() — the
entire flow that ran /linuxrc from an initrd, pivoted roots, and
handed off to the real root filesystem. With that gone, initrd_load()
becomes void (no longer short-circuits prepare_namespace()),
rd_load_image() is simplified to always load /initrd.image instead of
taking a path, and rd_load_disk() is deleted.
The /proc/sys/kernel/real-root-dev sysctl and its backing variable are
removed since they only existed for linuxrc to communicate the real
root device back to the kernel.
The no-op load_ramdisk= and prompt_ramdisk= parameters are dropped,
and noinitrd and ramdisk_start= gain deprecation warnings.
Initramfs is entirely unaffected. The non-linuxrc initrd path
(root=/dev/ram0) is preserved but now carries a deprecation warning
targeting January 2027 removal"
* tag 'vfs-7.0-rc1.initrd' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
init: remove /proc/sys/kernel/real-root-dev
initrd: remove deprecated code path (linuxrc)
init: remove deprecated "load_ramdisk" and "prompt_ramdisk" command line parameters
Linus Torvalds [Mon, 9 Feb 2026 18:41:56 +0000 (10:41 -0800)]
Merge tag 'vfs-7.0-rc1.rust' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs rust updates from Christian Brauner:
"Allow inlining C helpers into Rust when using LTO: Add the
__rust_helper annotation to all VFS-related Rust helper functions.
Currently, C helpers cannot be inlined into Rust code even under LTO
because LLVM detects slightly different codegen options between the C
and Rust compilation units (differing null-pointer-check flags,
builtin lists, and target feature strings). The __rust_helper macro is
the first step toward fixing this: it is currently #defined to
nothing, but a follow-up series will change it to __always_inline when
compiling with LTO (while keeping it empty for bindgen, which ignores
inline functions).
This picks up the VFS portion (fs, pid_namespace, poll) of a larger
tree-wide series"
* tag 'vfs-7.0-rc1.rust' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
rust: poll: add __rust_helper to helpers
rust: pid_namespace: add __rust_helper to helpers
rust: fs: add __rust_helper to helpers
Linus Torvalds [Mon, 9 Feb 2026 18:38:05 +0000 (10:38 -0800)]
Merge tag 'selinux-pr-20260203' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux updates from Paul Moore:
- Add support for SELinux based access control of BPF tokens
We worked with the BPF devs to add the necessary LSM hooks when the
BPF token code was first introduced, but it took us a bit longer to
add the SELinux wiring and support.
In order to preserve existing token-unaware SELinux policies, the new
code is gated by the new "bpf_token_perms" policy capability.
Additional details regarding the new permissions and behaviors can
be found in the associated commit.
- Remove a BUG() from the SELinux capability code
We now perform a similar check during compile time so we can safely
remove the BUG() call.
* tag 'selinux-pr-20260203' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: drop the BUG() in cred_has_capability()
selinux: fix a capabilities parsing typo in selinux_bpf_token_capable()
selinux: add support for BPF token access control
selinux: move the selinux_blob_sizes struct
Linus Torvalds [Mon, 9 Feb 2026 18:16:48 +0000 (10:16 -0800)]
Merge tag 'lsm-pr-20260203' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm
Pull lsm updates from Paul Moore:
- Unify the security_inode_listsecurity() calls in NFSv4
While looking at security_inode_listsecurity() with an eye towards
improving the interface, we realized that the NFSv4 code was making
multiple calls to the LSM hook that could be consolidated into one.
- Mark the LSM static branch keys as static - this helps resolve some
sparse warnings
- Add __rust_helper annotations to the LSM and cred wrapper functions
- Remove the unused set_security_override_from_ctx() function
- Minor fixes to some of the LSM kdoc comment blocks
* tag 'lsm-pr-20260203' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm:
lsm: make keys for static branch static
cred: remove unused set_security_override_from_ctx()
rust: security: add __rust_helper to helpers
rust: cred: add __rust_helper to helpers
nfs: unify security_inode_listsecurity() calls
lsm: fix kernel-doc struct member names
Linus Torvalds [Mon, 9 Feb 2026 18:13:03 +0000 (10:13 -0800)]
Merge tag 'audit-pr-20260203' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit
Pull audit updates from Paul Moore:
- Improve the NETFILTER_PKT audit records
Add source and destination ports to the NETFILTER_PKT audit records
while also consolidating a lot of the code into a new, singular
audit_log_nf_skb() function. This new approach to structuring the
NETFILTER_PKT record generation should eliminate some unnecessary
overhead when audit is not built into the kernel.
- Update the audit syscall classifier code
Add the listxattrat(), getxattrat(), and fchmodat2() syscalls to the
audit code which classifies syscalls into categories of operations,
e.g. "read" or "change attributes".
- Move the syscall classifier declarations into audit_arch.h
Shuffle around some header file declarations to resolve some sparse
warnings.
* tag 'audit-pr-20260203' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
audit: move the compat_xxx_class[] extern declarations to audit_arch.h
audit: add missing syscalls to read class
audit: include source and destination ports to NETFILTER_PKT
audit: add audit_log_nf_skb helper function
audit: add fchmodat2() to change attributes class
* tag 'i3c/for-6.20' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux: (52 commits)
i3c: dw-i3c-master: fix SIR reject bit mapping for dynamic addresses
i3c: dw-i3c-master: convert spinlock usage to scoped guards
i3c: dw: Fix memory leak in dw_i3c_master_i2c_xfers()
i3c: mipi-i3c-hci-pci: Add System Suspend support
i3c: mipi-i3c-hci: Add optional System Suspend support
i3c: master: Add i3c_master_do_daa_ext() for post-hibernation address recovery
i3c: dw: Initialize spinlock to avoid upsetting lockdep
i3c: mipi-i3c-hci-pci: Add Runtime PM support
i3c: mipi-i3c-hci: Add optional Runtime PM support
i3c: master: Introduce optional Runtime PM support
i3c: mipi-i3c-hci: Factor out master dynamic address setting into helper
i3c: mipi-i3c-hci: Allow core re-initialization for Runtime PM support
i3c: mipi-i3c-hci: Factor out core initialization into helper
i3c: mipi-i3c-hci: Factor out IO mode setting into helper
i3c: mipi-i3c-hci: Factor out software reset into helper
i3c: mipi-i3c-hci: Add PIO suspend and resume support
i3c: mipi-i3c-hci: Refactor PIO register initialization
i3c: mipi-i3c-hci: Add DMA suspend and resume support
i3c: mipi-i3c-hci: Extract ring initialization from hci_dma_init()
i3c: mipi-i3c-hci: Introduce helper to restore DAT
...
Linus Torvalds [Mon, 9 Feb 2026 17:46:26 +0000 (09:46 -0800)]
Merge tag 'rcu.release.v7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux
Pull RCU updates from Boqun Feng:
- RCU Tasks Trace:
Re-implement RCU Tasks Trace in terms of SRCU-fast. The reimplementation
saves more than 500 lines of code and also makes a new set of APIs,
rcu_read_{,un}lock_tasks_trace(), possible. Compared to the previous
rcu_read_{,un}lock_trace(), the new API avoids the task_struct accesses
thanks to the SRCU-fast semantics. As a result, the old
rcu_read_{,un}lock_trace() API is now deprecated.
- RCU Torture Test:
- Multiple improvements on kvm-series.sh (parallel run and
progress showing metrics)
- Add context checks to rcu_torture_timer()
- Make config2csv.sh properly handle comments in .boot files
- Include commit description in testid.txt
- Miscellaneous RCU changes:
- Reduce synchronize_rcu() latency by reporting GP kthread's
CPU QS early
- Use suitable gfp_flags for the init_srcu_struct_nodes()
- Fix rcu_read_unlock() deadloop due to softirq
- Correctly compute probability to invoke ->exp_current()
in rcutorture
- Make expedited RCU CPU stall warnings detect stall-end races
- RCU nocb:
- Remove unnecessary WakeOvfIsDeferred wake path and callback
overload handling
- Extract nocb_defer_wakeup_cancel() helper
* tag 'rcu.release.v7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux: (25 commits)
rcu/nocb: Extract nocb_defer_wakeup_cancel() helper
rcu/nocb: Remove dead callback overload handling
rcu/nocb: Remove unnecessary WakeOvfIsDeferred wake path
rcu: Reduce synchronize_rcu() latency by reporting GP kthread's CPU QS early
srcu: Use suitable gfp_flags for the init_srcu_struct_nodes()
rcu: Fix rcu_read_unlock() deadloop due to softirq
rcutorture: Correctly compute probability to invoke ->exp_current()
rcu: Make expedited RCU CPU stall warnings detect stall-end races
rcutorture: Add --kill-previous option to terminate previous kvm.sh runs
rcutorture: Prevent concurrent kvm.sh runs on same source tree
torture: Include commit discription in testid.txt
torture: Make config2csv.sh properly handle comments in .boot files
torture: Make kvm-series.sh give run numbers and totals
torture: Make kvm-series.sh give build numbers and totals
torture: Parallelize kvm-series.sh guest-OS execution
rcutorture: Add context checks to rcu_torture_timer()
rcutorture: Test rcu_tasks_trace_expedite_current()
srcu: Create an rcu_tasks_trace_expedite_current() function
checkpatch: Deprecate rcu_read_{,un}lock_trace()
rcu: Update Requirements.rst for RCU Tasks Trace
...
Linus Torvalds [Mon, 9 Feb 2026 17:42:21 +0000 (09:42 -0800)]
Merge tag 'linux_kselftest-next-6.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kselftest updates from Shuah Khan:
"resctrl test:
- fix division by zero error on Hygon
- fix non-contiguous CBM check for Hygon
- define CPU vendor IDs as bits to match usage
- add CPU vendor detection for Hygon
misc:
- coredump test: use __builtin_trap() instead of a null pointer
- anon_inode: replace null pointers with empty arrays
- kublk: include message in _Static_assert for C11 compatibility
- run_kselftest.sh: add `--skip` argument option
- pidfd: fix typo in comment"
* tag 'linux_kselftest-next-6.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests/pidfd: fix typo in comment
selftests/run_kselftest.sh: Add `--skip` argument option
selftests/resctrl: Fix non-contiguous CBM check for Hygon
selftests/resctrl: Add CPU vendor detection for Hygon
selftests/resctrl: Define CPU vendor IDs as bits to match usage
selftests/resctrl: Fix a division by zero error on Hygon
kselftest/kublk: include message in _Static_assert for C11 compatibility
kselftest/anon_inode: replace null pointers with empty arrays
kselftest/coredump: use __builtin_trap() instead of null pointer
Linus Torvalds [Mon, 9 Feb 2026 17:37:55 +0000 (09:37 -0800)]
Merge tag 'linux_kselftest-kunit-6.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kunit updates from Shuah Khan:
"kunit:
- add __rust_helper to helpers
- fix up const mismatch in many assert functions
- fix up const mismatch in test_list_sort
- protect KUNIT_BINARY_STR_ASSERTION against ERR_PTR values
- respect KBUILD_OUTPUT env variable by default
- add bash completion
kunit tool:
- add test for nested test result reporting
- do not overwrite test status based on subtest counts
- add 32-bit big endian ARM configuration to qemu_configs
- rename test_data_path() to _test_data_path()
- do not rely on implicit working directory change"
* tag 'linux_kselftest-kunit-6.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
kunit: add bash completion
kunit: tool: test: Don't rely on implicit working directory change
kunit: tool: test: Rename test_data_path() to _test_data_path()
kunit: qemu_configs: Add 32-bit big endian ARM configuration
kunit: tool: Don't overwrite test status based on subtest counts
kunit: tool: Add test for nested test result reporting
kunit: respect KBUILD_OUTPUT env variable by default
kunit: Protect KUNIT_BINARY_STR_ASSERTION against ERR_PTR values
test_list_sort: fix up const mismatch
kunit: fix up const mis-match in many assert functions
rust: kunit: add __rust_helper to helpers
Linus Torvalds [Sat, 7 Feb 2026 17:37:34 +0000 (09:37 -0800)]
Merge tag 'spi-fix-v6.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"One final batch of fixes for the Tegra SPI drivers, the main one is a
batch of fixes for races with the interrupts in the Tegra210 QSPI
driver that Breno has been working on for a while"
* tag 'spi-fix-v6.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: tegra114: Preserve SPI mode bits in def_command1_reg
spi: tegra: Fix a memory leak in tegra_slink_probe()
spi: tegra210-quad: Protect curr_xfer check in IRQ handler
spi: tegra210-quad: Protect curr_xfer clearing in tegra_qspi_non_combined_seq_xfer
spi: tegra210-quad: Protect curr_xfer in tegra_qspi_combined_seq_xfer
spi: tegra210-quad: Protect curr_xfer assignment in tegra_qspi_setup_transfer_one
spi: tegra210-quad: Move curr_xfer read inside spinlock
spi: tegra210-quad: Return IRQ_HANDLED when timeout already processed transfer
Linus Torvalds [Sat, 7 Feb 2026 17:34:49 +0000 (09:34 -0800)]
Merge tag 'regulator-fix-v6.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fix from Mark Brown:
"One last fix for v6.19: the voltages for the SpaceMIT P1 were not
described correctly"
* tag 'regulator-fix-v6.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: spacemit-p1: Fix n_voltages for BUCK and LDO regulators
Linus Torvalds [Sat, 7 Feb 2026 17:27:57 +0000 (09:27 -0800)]
Merge tag 'char-misc-6.19-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull binder fixes from Greg KH:
"Here are some small, last-minute binder C and Rust driver fixes for
reported issues. They include a number of fixes for reported crashes
and other problems.
All of these have been in linux-next this week, and longer"
* tag 'char-misc-6.19-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
binderfs: fix ida_alloc_max() upper bound
rust_binderfs: fix ida_alloc_max() upper bound
binder: fix BR_FROZEN_REPLY error log
rust_binder: add additional alignment checks
binder: fix UAF in binder_netlink_report()
rust_binder: correctly handle FDA objects of length zero
Linus Torvalds [Sat, 7 Feb 2026 17:10:42 +0000 (09:10 -0800)]
Merge tag 'sched-urgent-2026-02-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
"Miscellaneous MMCID fixes to address bugs and performance regressions
in the recent rewrite of the SCHED_MM_CID management code:
- Fix livelock triggered by BPF CI testing
- Fix hard lockup on weakly ordered systems
- Simplify the dropping of CIDs in the exit path by removing an
unintended transition phase
- Fix performance/scalability regression on a thread-pool benchmark
by optimizing transitional CIDs when scheduling out"
* tag 'sched-urgent-2026-02-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/mmcid: Optimize transitional CIDs when scheduling out
sched/mmcid: Drop per CPU CID immediately when switching to per task mode
sched/mmcid: Protect transition on weakly ordered systems
sched/mmcid: Prevent live lock on task to CPU mode transition
Linus Torvalds [Sat, 7 Feb 2026 16:21:21 +0000 (08:21 -0800)]
Merge tag 'objtool-urgent-2026-02-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool fixes from Ingo Molnar:
- Bump up the Clang minimum version requirements for livepatch
builds, due to Clang assembler section handling bugs causing
silent miscompilations
- Strip livepatching symbol artifacts from non-livepatch modules
- Fix livepatch build warnings when certain Clang LTO options
are enabled
- Fix livepatch build error when CONFIG_MEM_ALLOC_PROFILING_DEBUG=y
* tag 'objtool-urgent-2026-02-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool/klp: Fix unexported static call key access for manually built livepatch modules
objtool/klp: Fix symbol correlation for orphaned local symbols
livepatch: Free klp_{object,func}_ext data after initialization
livepatch: Fix having __klp_objects relics in non-livepatch modules
livepatch/klp-build: Require Clang assembler >= 20
Shardul Bankar [Wed, 4 Feb 2026 17:04:40 +0000 (22:34 +0530)]
hfsplus: avoid double unload_nls() on mount failure
The recent commit "hfsplus: ensure sb->s_fs_info is always cleaned up"
[1] introduced a custom ->kill_sb() handler (hfsplus_kill_super) that
cleans up the s_fs_info structure (including the NLS table) on
superblock destruction.
However, the error handling path in hfsplus_fill_super() still calls
unload_nls() before returning an error. Since the VFS layer calls
->kill_sb() when fill_super fails, this results in unload_nls() being
called twice for the same sbi->nls pointer: once in hfsplus_fill_super()
and again in hfsplus_kill_super() (via delayed_free).
Remove the explicit unload_nls() call from the error path in
hfsplus_fill_super() to rely solely on the cleanup in ->kill_sb().
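The ownership rule described above can be sketched in a small userspace model. This is only an illustration under stated assumptions: every toy_* name is hypothetical, the counter stands in for the real unload_nls(), and the caller of toy_fill_super() plays the role the commit attributes to the VFS layer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the NLS table and the per-superblock info. */
struct toy_nls { int unload_count; };
struct toy_sbi { struct toy_nls *nls; };

/* Counts releases; two releases of one table would be the double unload. */
static void toy_unload_nls(struct toy_nls *nls)
{
	if (nls)
		nls->unload_count++;
}

/* Analogue of ->kill_sb(): the single place that cleans up. */
static void toy_kill_super(struct toy_sbi *sbi)
{
	assert(sbi != NULL);
	toy_unload_nls(sbi->nls);
	sbi->nls = NULL;
}

/* Analogue of fill_super: on failure it returns an error and frees nothing. */
static int toy_fill_super(struct toy_sbi *sbi, struct toy_nls *nls, int fail)
{
	assert(sbi != NULL);
	sbi->nls = nls;
	if (fail)
		return -1;	/* no toy_unload_nls() here; toy_kill_super() owns it */
	return 0;
}

/* Mount attempt: the caller runs the kill_sb analogue when fill_super fails. */
static int toy_mount(struct toy_sbi *sbi, struct toy_nls *nls, int fail)
{
	int err = toy_fill_super(sbi, nls, fail);

	if (err)
		toy_kill_super(sbi);
	return err;
}
```

With cleanup kept in one place, a failed toy_mount() releases the table exactly once.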
Josh Poimboeuf [Fri, 6 Feb 2026 22:24:55 +0000 (14:24 -0800)]
x86/vmware: Fix hypercall clobbers
Fedora QA reported the following panic:
BUG: unable to handle page fault for address: 0000000040003e54
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS edk2-20251119-3.fc43 11/19/2025
RIP: 0010:vmware_hypercall4.constprop.0+0x52/0x90
..
Call Trace:
vmmouse_report_events+0x13e/0x1b0
psmouse_handle_byte+0x15/0x60
ps2_interrupt+0x8a/0xd0
...
because the QEMU VMware mouse emulation is buggy, and clears the top 32
bits of %rdi that the kernel kept a pointer in.
The QEMU vmmouse driver saves and restores the register state in a
"uint32_t data[6];" and as a result restores the state with the high
bits all cleared.
RDI originally contained the value of a valid kernel stack address
(0xff5eeb3240003e54). After the vmware hypercall it now contains
0x40003e54, and we get a page fault as a result when it is dereferenced.
The proper fix would be in QEMU, but this works around the issue in the
kernel to keep old setups working, when old kernels had not happened to
keep any state in %rdi over the hypercall.
In theory this same issue exists for all the hypercalls in the vmmouse
driver; in practice it has only been seen with vmware_hypercall3() and
vmware_hypercall4(). For now, just mark RDI/RSI as clobbered for those
two calls. This should have a minimal effect on code generation overall
as it should be rare for the compiler to want to make RDI/RSI live
across hypercalls.
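The truncation described above can be reproduced in plain C. This is a minimal sketch of the arithmetic only, not of the hypercall or of the QEMU code; the function names are hypothetical, and the two addresses are the ones quoted in the report.

```c
#include <assert.h>
#include <stdint.h>

/* Models saving a 64-bit register into a uint32_t slot and restoring it. */
static uint64_t save_restore_via_u32(uint64_t reg)
{
	uint32_t slot = (uint32_t)reg;	/* the upper 32 bits are dropped here */

	return (uint64_t)slot;		/* zero-extended on restore */
}

/* The stack address from the report loses its top half on the round trip. */
static void check_truncation(void)
{
	assert(save_restore_via_u32(UINT64_C(0xff5eeb3240003e54)) ==
	       UINT64_C(0x40003e54));
}
```

Declaring the registers as clobbered tells the compiler not to rely on their values after the call, so a round trip of this kind can no longer corrupt a live pointer.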
Linus Torvalds [Fri, 6 Feb 2026 21:07:47 +0000 (13:07 -0800)]
Merge tag 'mm-hotfixes-stable-2026-02-06-12-37' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull hotfixes from Andrew Morton:
"A couple of late-breaking MM fixes. One against a new-in-this-cycle
patch and the other addresses a locking issue which has been there for
over a year"
* tag 'mm-hotfixes-stable-2026-02-06-12-37' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm/memory-failure: reject unsupported non-folio compound page
procfs: avoid fetching build ID while holding VMA lock
Linus Torvalds [Fri, 6 Feb 2026 20:37:28 +0000 (12:37 -0800)]
Merge tag 'trace-v6.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fix from Steven Rostedt:
- Fix event format field alignments for 32 bit architectures
The fields in the event format files are used to parse the raw binary
buffer data by applications. If they are incorrect, then the
application produces garbage.
On 32 bit architectures, the function graph 64bit calltime and
rettime were off by 4bytes. That's because the actual fields are in a
packed structure but the macros used by the ftrace events did not
mark them as packed, and instead, gave them their natural alignment
which made their offsets off by 4 bytes.
There are macros to have a packed field within an embedded structure
of an event, but there's no macro for normal fields within a packed
structure of the event. The macro __field_packed() was used for the
packed embedded structure field. Rename that to __field_desc_packed()
(to match the non-packed embedded field macro __field_desc()), and
make __field_packed() for fields that are in a packed event structure
(which matches the unpacked __field() macro).
Switch the calltime and rettime fields of the function graph event to
use the new __field_packed() and this makes the offsets correct.
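The effect of packing on a field offset can be checked with offsetof(). The sketch below uses hypothetical structures, not the real ftrace entry layout, and relies on the GCC/Clang packed attribute. On ABIs that align 64-bit integers to 8 bytes, the natural layout puts the timestamp 4 bytes later than the packed one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layouts: a 32-bit field followed by a 64-bit timestamp. */
struct entry_natural {
	uint32_t depth;
	uint64_t calltime;	/* padding may be inserted before this field */
};

struct entry_packed {
	uint32_t depth;
	uint64_t calltime;	/* follows depth directly */
} __attribute__((packed));

static size_t natural_offset(void)
{
	return offsetof(struct entry_natural, calltime);
}

static size_t packed_offset(void)
{
	return offsetof(struct entry_packed, calltime);
}

/* A format description built from the natural layout can disagree with packed data. */
static void check_offsets(void)
{
	assert(packed_offset() == 4);
	assert(natural_offset() >= packed_offset());
}
```

A parser that trusts the larger offset while the buffer holds packed data reads the wrong bytes, which is the kind of garbage the fix avoids.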
* tag 'trace-v6.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: Fix ftrace event field alignments
Linus Torvalds [Fri, 6 Feb 2026 18:34:17 +0000 (10:34 -0800)]
Merge tag 'ceph-for-6.19-rc9' of https://github.com/ceph/ceph-client
Pull ceph fixes from Ilya Dryomov:
"One RBD and two CephFS fixes which address potential oopses.
The RBD thing is more of a rare edge case that pops up in our CI,
while the two CephFS scenarios are regressions that were reported by
users and can be triggered trivially in normal operation. All marked
for stable"
* tag 'ceph-for-6.19-rc9' of https://github.com/ceph/ceph-client:
ceph: fix NULL pointer dereference in ceph_mds_auth_match()
ceph: fix oops due to invalid pointer for kfree() in parse_longname()
rbd: check for EOD after exclusive lock is ensured to be held
Linus Torvalds [Fri, 6 Feb 2026 18:27:42 +0000 (10:27 -0800)]
Merge tag 'dma-mapping-6.19-2026-02-06' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux
Pull dma-mapping fixes from Marek Szyprowski:
"Two minor fixes for the DMA-mapping subsystem:
- check for the rare case of the allocation failure of the global CMA
pool (Shanker Donthineni)
- avoid perf buffer overflow when tracing large scatter-gather lists
(Deepanshu Kartikey)"
* tag 'dma-mapping-6.19-2026-02-06' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux:
dma: contiguous: Check return value of dma_contiguous_reserve_area()
tracing/dma: Cap dma_map_sg tracepoint arrays to prevent buffer overflow
Linus Torvalds [Fri, 6 Feb 2026 18:10:39 +0000 (10:10 -0800)]
Merge tag 'pmdomain-v6.19-rc3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm
Pull pmdomain fixes from Ulf Hansson:
- imx:
- Fix system wakeup support for imx8mp power domains
- Fix potential out-of-range access for imx8m power domains
- Fix the imx8mm gpu hang
- qcom: Fix off-by-one error for highest state in rpmpd
* tag 'pmdomain-v6.19-rc3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
pmdomain: imx8mp-blk-ctrl: Keep usb phy power domain on for system wakeup
pmdomain: imx8mp-blk-ctrl: Keep gpc power domain on for system wakeup
pmdomain: imx8m-blk-ctrl: fix out-of-range access of bc->domains
pmdomain: imx: gpcv2: Fix the imx8mm gpu hang due to wrong adb400 reset
pmdomain: qcom: rpmpd: fix off-by-one error in clamping to the highest state
Linus Torvalds [Fri, 6 Feb 2026 17:59:40 +0000 (09:59 -0800)]
Merge tag 'sound-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A collection of small fixes. It became a bit larger than wished, but
all of them are device-specific small fixes, and it should be still
fairly safe to take at the last minute.
Included are a few quirks and fixes for Intel, AMD, HD-audio, and
USB-audio, as well as a race fix in aloop driver and corrections of
Cirrus firmware kunit test"
* tag 'sound-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda/realtek: Enable headset mic for Acer Nitro 5
ASoC: fsl_xcvr: fix missing lock in fsl_xcvr_mode_put()
ASoC: dt-bindings: ti,tlv320aic3x: Add compatible string ti,tlv320aic23
ASoC: amd: fix memory leak in acp3x pdm dma ops
ALSA: usb-audio: fix broken logic in snd_audigy2nx_led_update()
ALSA: aloop: Fix racy access at PCM trigger
ASoC: rt1320: fix intermittent no-sound issue
ASoC: SOF: Intel: use hdev->info.link_mask directly
firmware: cs_dsp: rate-limit log messages in KUnit builds
ASoC: amd: yc: Add quirk for HP 200 G2a 16
ASoC: cs42l43: Correct handling of 3-pole jack load detection
ASoC: Intel: sof_es8336: Add DMI quirk for Huawei BOD-WXX9
ASoC: sof_sdw: Add a quirk for Lenovo laptop using sidecar amps with cs42l43
Linus Torvalds [Fri, 6 Feb 2026 17:56:03 +0000 (09:56 -0800)]
Merge tag 'slab-for-6.19-rc8-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab
Pull slab fix from Vlastimil Babka:
"A stable fix for memory allocation profiling tag not being cleared
when aborting an allocation due to memcg charge failure (Hao Ge)"
* tag 'slab-for-6.19-rc8-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
mm/slab: Add alloc_tagging_slab_free_hook for memcg_alloc_abort_single
Viktor Kleen [Thu, 5 Feb 2026 08:49:41 +0000 (16:49 +0800)]
iommu/vt-d: Treat PAGE_SNOOP and PWSNP separately
The PASID_FLAG_PAGE_SNOOP and PASID_FLAG_PWSNP constants are identical.
This will cause the pasid code to always set both or neither of the
PGSNP and PWSNP bits in PASID table entries. However, PWSNP is a
reserved bit if SMPWC is not set in the IOMMU's extended capability
register, even if SC is supported.
This has resulted in DMAR errors when testing the iommufd code on an
Arrow Lake platform. With this patch, those errors disappear and the
PASID table entries look correct.
Fixes: 101a2854110fa ("iommu/vt-d: Follow PT_FEAT_DMA_INCOHERENT into the PASID entry")
Cc: stable@vger.kernel.org
Signed-off-by: Viktor Kleen <viktor@kleen.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20260202192109.1665799-1-viktor@kleen.org
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
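Why two identical flag constants force both bits to move together can be shown with a few lines of C. The bit positions and TOY_* names below are assumptions made for illustration; the real values live in the VT-d driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit positions, before the fix: both names map to one bit. */
#define TOY_FLAG_PAGE_SNOOP_OLD	(UINT32_C(1) << 2)
#define TOY_FLAG_PWSNP_OLD	(UINT32_C(1) << 2)

/* Hypothetical bit positions, after the fix: each name has its own bit. */
#define TOY_FLAG_PAGE_SNOOP	(UINT32_C(1) << 2)
#define TOY_FLAG_PWSNP		(UINT32_C(1) << 3)

/* Nonzero when asking for page snoop alone also tests true for PWSNP. */
static int pwsnp_leaks(uint32_t page_snoop_flag, uint32_t pwsnp_flag)
{
	uint32_t requested = page_snoop_flag;	/* caller wants page snoop only */

	return (requested & pwsnp_flag) != 0;
}

static void check_flags(void)
{
	assert(pwsnp_leaks(TOY_FLAG_PAGE_SNOOP_OLD, TOY_FLAG_PWSNP_OLD));
	assert(!pwsnp_leaks(TOY_FLAG_PAGE_SNOOP, TOY_FLAG_PWSNP));
}
```

With distinct bits, code that programs the PASID entry can leave PWSNP clear on hardware where it is reserved while still setting PGSNP.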
When __memcg_slab_post_alloc_hook() fails, there are two different
free paths depending on whether size == 1 or size != 1. In the
kmem_cache_free_bulk() path, we do call alloc_tagging_slab_free_hook().
However, in memcg_alloc_abort_single() we don't, so the above warning will be
triggered on the next allocation.
Therefore, add alloc_tagging_slab_free_hook() to the
memcg_alloc_abort_single() path.
Fixes: 9f9796b413d3 ("mm, slab: move memcg charging to post-alloc hook")
Cc: stable@vger.kernel.org
Suggested-by: Hao Li <hao.li@linux.dev>
Signed-off-by: Hao Ge <hao.ge@linux.dev>
Reviewed-by: Hao Li <hao.li@linux.dev>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
Link: https://patch.msgid.link/20260204101401.202762-1-hao.ge@linux.dev
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Linus Torvalds [Fri, 6 Feb 2026 05:33:22 +0000 (21:33 -0800)]
Merge tag 'hwmon-for-v6.19-final' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
- occ: Mark occ_init_attribute() as __printf to avoid build failure due
to '-Werror=suggest-attribute=format'
- gpio-fan: Allow to stop fans when CONFIG_PM is disabled, and fix
set_rpm() return value
- acpi_power_meter: Fix deadlocks related to acpi_power_meter_notify()
- dell-smm: Add Dell G15 5510 to fan control whitelist
* tag 'hwmon-for-v6.19-final' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (occ) Mark occ_init_attribute() as __printf
hwmon: (gpio-fan) Allow to stop FANs when CONFIG_PM is disabled
hwmon: (gpio-fan) Fix set_rpm() return value
hwmon: (acpi_power_meter) Fix deadlocks related to acpi_power_meter_notify()
hwmon: (dell-smm) Add Dell G15 5510 to fan control whitelist
Linus Torvalds [Fri, 6 Feb 2026 03:56:47 +0000 (19:56 -0800)]
Merge tag 'drm-fixes-2026-02-06' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
"The usual xe/amdgpu selection, and a couple of misc changes for
gma500, mgag200 and bridge. There is a nouveau revert, and also a set
of changes that fix a regression since we moved to 570 firmware.
Suspend/resume was broken on a bunch of GPUs. The fix looks big, but
it's mostly just refactoring to pass an extra bit down the nouveau
abstractions to the firmware command.
amdgpu:
- MES 11 old firmware compatibility fix
- ASPM fix
- DC LUT fixes
amdkfd:
- Fix possible double deletion of validate list
xe:
- Fix topology query pointer advance
- A couple of kerneldoc fixes
- Disable D3Cold for BMG only on specific platforms
- Fix CFI violation in debugfs access
nouveau:
- Revert adding atomic commit functions as it regresses pre-nv50
- Fix suspend/resume bugs exposed by enabling 570 firmware
gma500:
- Revert a regression caused by vblank changes
mgag200:
- Replace a busy loop with a polling loop so that it no longer blocks
one CPU for about 300 ms roughly every 20 minutes
bridge:
- imx8mp-hdmi-pa: Use runtime pm to fix a bug in channel ordering"
* tag 'drm-fixes-2026-02-06' of https://gitlab.freedesktop.org/drm/kernel:
drm/xe/guc: Fix CFI violation in debugfs access.
drm/bridge: imx8mp-hdmi-pai: enable PM runtime
drm/xe/pm: Disable D3Cold for BMG only on specific platforms
drm/xe: Fix kerneldoc for xe_tlb_inval_job_alloc_dep
drm/xe: Fix kerneldoc for xe_gt_tlb_inval_init_early
drm/xe: Fix kerneldoc for xe_migrate_exec_queue
drm/xe/query: Fix topology query pointer advance
drm/mgag200: fix mgag200_bmc_stop_scanout()
nouveau/gsp: fix suspend/resume regression on r570 firmware
nouveau: add a third state to the fini handler.
nouveau/gsp: use rpc sequence numbers properly.
drm/amdgpu: Fix double deletion of validate_list
drm/amd/display: remove assert around dpp_base replacement
drm/amd/display: extend delta clamping logic to CM3 LUT helper
drm/amd/display: fix wrong color value mapping on MCM shaper LUT
Revert "drm/amd: Check if ASPM is enabled from PCIe subsystem"
drm/amd: Set minimum version for set_hw_resource_1 on gfx11 to 0x52
Revert "drm/gma500: use drm_crtc_vblank_crtc()"
Revert "drm/nouveau/disp: Set drm_mode_config_funcs.atomic_(check|commit)"
Dave Airlie [Fri, 6 Feb 2026 02:41:35 +0000 (12:41 +1000)]
Merge tag 'drm-xe-fixes-2026-02-05' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
Driver Changes:
- Fix topology query pointer advance (Shuicheng)
- A couple of kerneldoc fixes (Shuicheng)
- Disable D3Cold for BMG only on specific platforms (Karthik)
- Fix CFI violation in debugfs access (Daniele)
Linus Torvalds [Thu, 5 Feb 2026 23:00:53 +0000 (15:00 -0800)]
Merge tag 'block-6.19-20260205' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull block fixes from Jens Axboe:
- Revert of a change for loop, which caused regressions for some users
(Actually revert of two commits, where one is just an existing fix
for the offending commit)
- NVMe pull via Keith:
- Fix NULL pointer access setting up dma mappings
- Fix invalid memory access from malformed TCP PDU
* tag 'block-6.19-20260205' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
loop: revert exclusive opener loop status change
nvmet-tcp: add bounds checks in nvmet_tcp_build_pdu_iovec
nvme-pci: handle changing device dma map requirements
Linus Torvalds [Thu, 5 Feb 2026 22:40:06 +0000 (14:40 -0800)]
Merge tag 'io_uring-6.19-20260205' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull io_uring fixes from Jens Axboe:
- Two small fixes for zcrx
- Two small fixes for fdinfo - one is just killing a superfluous newline
* tag 'io_uring-6.19-20260205' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
io_uring/fdinfo: be a bit nicer when looping a lot of SQEs/CQEs
io_uring/fdinfo: kill unnecessary newline feed in CQE32 printing
io_uring/zcrx: fix rq flush locking
io_uring/zcrx: fix page array leak
When !CONFIG_TRANSPARENT_HUGEPAGE, a non-folio compound page can appear in
a userspace mapping via either vm_insert_*() functions or
vm_operations_struct->fault(). They are not folios, thus should not be
considered for folio operations like split. To reject these pages, make
sure get_hwpoison_page() is always called as HWPoisonHandlable() will do
the right work.
[Some commit log borrowed from Zi Yan. Thanks.]
Link: https://lkml.kernel.org/r/20260205075328.523211-1-linmiaohe@huawei.com
Fixes: 689b8986776c ("mm/memory-failure: improve large block size folio handling")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reported-by: 是参差 <shicenci@gmail.com>
Closes: https://lore.kernel.org/all/PS1PPF7E1D7501F1E4F4441E7ECD056DEADAB98A@PS1PPF7E1D7501F.apcprd02.prod.outlook.com/
Reviewed-by: Zi Yan <ziy@nvidia.com>
Tested-by: Zi Yan <ziy@nvidia.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Jane Chu <jane.chu@oracle.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Andrii Nakryiko [Thu, 29 Jan 2026 21:53:40 +0000 (13:53 -0800)]
procfs: avoid fetching build ID while holding VMA lock
Fix PROCMAP_QUERY to fetch optional build ID only after dropping mmap_lock
or per-VMA lock, whichever was used to lock the VMA in question, to avoid
deadlock reported by syzbot:
This seems to be exacerbated (as we haven't seen these syzbot reports
before that) by the recent:
777a8560fd29 ("lib/buildid: use __kernel_read() for sleepable context")
To make this safe, we need to grab file refcount while VMA is still locked, but
other than that everything is pretty straightforward. Internal build_id_parse()
API assumes VMA is passed, but it only needs the underlying file reference, so
just add another variant build_id_parse_file() that expects file passed
directly.
[akpm@linux-foundation.org: fix up kerneldoc]
Link: https://lkml.kernel.org/r/20260129215340.3742283-1-andrii@kernel.org
Fixes: ed5d583a88a9 ("fs/procfs: implement efficient VMA querying API for /proc/<pid>/maps")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reported-by: <syzbot+4e70c8e0a2017b432f7a@syzkaller.appspotmail.com>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Tested-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Song Liu <song@kernel.org>
Cc: Stanislav Fomichev <sdf@fomichev.me>
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Vishwaroop A [Wed, 4 Feb 2026 14:12:12 +0000 (14:12 +0000)]
spi: tegra114: Preserve SPI mode bits in def_command1_reg
The COMMAND1 register bits [29:28] set the SPI mode, which controls
the clock idle level. When a transfer ends, tegra_spi_transfer_end()
writes def_command1_reg back to restore the default state, but this
register value currently lacks the mode bits. This results in the
clock always being configured as idle low, breaking devices that
need it high.
Fix this by storing the mode bits in def_command1_reg during setup,
to prevent this field from always being cleared.
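The bit handling described above can be sketched as follows. Only the [29:28] field position comes from the commit text; the TOY_* macro and function names are hypothetical and are not the driver's own definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the field occupies bits [29:28] of COMMAND1. */
#define TOY_MODE_SHIFT	28
#define TOY_MODE_MASK	(UINT32_C(3) << TOY_MODE_SHIFT)

/* Returns the default COMMAND1 value with the mode field replaced by `mode`. */
static uint32_t toy_store_mode(uint32_t def_command1, unsigned int mode)
{
	assert(mode <= 3);
	def_command1 &= ~TOY_MODE_MASK;			/* drop the old mode bits */
	def_command1 |= (uint32_t)mode << TOY_MODE_SHIFT;	/* keep the new ones */
	return def_command1;
}

/* Extracts the mode field, e.g. from the value written back after a transfer. */
static unsigned int toy_read_mode(uint32_t command1)
{
	return (unsigned int)((command1 & TOY_MODE_MASK) >> TOY_MODE_SHIFT);
}
```

If the stored default carries the mode bits, writing it back at the end of a transfer leaves the field as configured instead of resetting it to zero.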
Linus Torvalds [Thu, 5 Feb 2026 19:19:26 +0000 (11:19 -0800)]
Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull dcache fixes from Al Viro:
"A couple of regression fixes for the tree-in-dcache series this cycle"
* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
functionfs: use spinlock for FFS_DEACTIVATED/FFS_CLOSING transitions
rust_binderfs: fix a dentry leak
Al Viro [Sat, 31 Jan 2026 23:24:41 +0000 (18:24 -0500)]
functionfs: use spinlock for FFS_DEACTIVATED/FFS_CLOSING transitions
When all files are closed, functionfs needs ffs_data_reset() to be
done before any further opens are allowed.
During that time we have ffs->state set to FFS_CLOSING; that makes
->open() fail with -EBUSY. Once ffs_data_reset() is done, it
switches state (to FFS_READ_DESCRIPTORS) indicating that opening
that thing is allowed again. There's a couple of additional twists:
* mounting with -o no_disconnect delays ffs_data_reset()
from doing that at the final ->release() to the first subsequent
open(). That's indicated by ffs->state set to FFS_DEACTIVATED;
if open() sees that, it immediately switches to FFS_CLOSING and
proceeds with doing ffs_data_reset() before returning to userland.
* a couple of usb callbacks need to force the delayed
transition; unfortunately, they are done in locking environment
that does not allow blocking and ffs_data_reset() can block.
As a result, if these callbacks see FFS_DEACTIVATED, they change
state to FFS_CLOSING and use schedule_work() to get ffs_data_reset()
executed asynchronously.
Unfortunately, the locking is rather insufficient. A fix attempted
in e5bf5ee26663 ("functionfs: fix the open/removal races") had closed
a bunch of UAF, but it didn't do anything to the callbacks, lacked
barriers in transition from FFS_CLOSING to FFS_READ_DESCRIPTORS
_and_ it had been too heavy-handed in open()/open() serialization -
I've used ffs->mutex for that, and it's being held over actual IO on
ep0, complete with copy_from_user(), etc.
Even more unfortunately, the userland side is apparently racy enough
to have the resulting timing changes (no failures, just a delayed
return of open(2)) disrupt the things quite badly. Userland bugs
or not, it's a clear regression that needs to be dealt with.
Solution is to use a spinlock for serializing these state checks and
transitions - unlike ffs->mutex it can be taken in these callbacks
and it doesn't disrupt the timings in open().
We could introduce a new spinlock, but it's easier to use the one
that is already there (ffs->eps_lock) instead - the locking
environment is safe for it in all affected places.
Since now it is held over all places that alter or check the
open count (ffs->opened), there's no need to keep that atomic_t -
int would serve just fine and it's simpler that way.
Fixes: e5bf5ee26663 ("functionfs: fix the open/removal races")
Fixes: 18d6b32fca38 ("usb: gadget: f_fs: add "no_disconnect" mode") # v4.0
Tested-by: Samuel Wu <wusamuel@google.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
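The idea of doing the state check and the state change under one non-blocking lock can be modelled in userspace. This is a rough sketch under stated assumptions, not the functionfs code: the state names follow the commit text, while the toy_* helpers and the atomic_flag spinlock standing in for ffs->eps_lock are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* State names follow the commit text; everything else is hypothetical. */
enum toy_state { TOY_READ_DESCRIPTORS, TOY_DEACTIVATED, TOY_CLOSING };

struct toy_ffs {
	atomic_flag lock;	/* toy spinlock standing in for ffs->eps_lock */
	enum toy_state state;	/* only read or written with the lock held */
};

static void toy_lock(struct toy_ffs *ffs)
{
	while (atomic_flag_test_and_set_explicit(&ffs->lock, memory_order_acquire))
		;	/* spin until the holder clears the flag */
}

static void toy_unlock(struct toy_ffs *ffs)
{
	atomic_flag_clear_explicit(&ffs->lock, memory_order_release);
}

/*
 * Check and transition as one step. Returns 1 if this caller moved
 * DEACTIVATED -> CLOSING and so owns the reset, 0 otherwise.
 */
static int toy_claim_reset(struct toy_ffs *ffs)
{
	int claimed = 0;

	toy_lock(ffs);
	if (ffs->state == TOY_DEACTIVATED) {
		ffs->state = TOY_CLOSING;
		claimed = 1;
	}
	toy_unlock(ffs);
	return claimed;
}

/* Called by the owner once the reset is done; reopening is allowed again. */
static void toy_finish_reset(struct toy_ffs *ffs)
{
	toy_lock(ffs);
	assert(ffs->state == TOY_CLOSING);
	ffs->state = TOY_READ_DESCRIPTORS;
	toy_unlock(ffs);
}
```

Because the check and the transition share one critical section, at most one caller can claim a pending reset, and the lock never sleeps, which matters for callers that are not allowed to block.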
Al Viro [Mon, 26 Jan 2026 06:05:57 +0000 (01:05 -0500)]
rust_binderfs: fix a dentry leak
Parallel to binderfs patches - 02da8d2c0965 "binderfs_binder_ctl_create():
kill a bogus check" and the bit of b89aa544821d "convert binderfs" that
got lost when making 4433d8e25d73 "convert rust_binderfs"; the former is
a cleanup, the latter is about marking /binder-control persistent, so that
it would be taken out on umount.
Fixes: 4433d8e25d73 ("convert rust_binderfs")
Acked-by: Alice Ryhl <aliceryhl@google.com>
Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Shigeru Yoshida [Wed, 4 Feb 2026 09:58:37 +0000 (18:58 +0900)]
ipv6: Fix ECMP sibling count mismatch when clearing RTF_ADDRCONF
syzbot reported a kernel BUG in fib6_add_rt2node() when adding an IPv6
route. [0]
Commit f72514b3c569 ("ipv6: clear RA flags when adding a static
route") introduced logic to clear RTF_ADDRCONF from existing routes
when a static route with the same nexthop is added. However, this
causes a problem when the existing route has a gateway.
When RTF_ADDRCONF is cleared from a route that has a gateway, that
route becomes eligible for ECMP, i.e. rt6_qualify_for_ecmp() returns
true. The issue is that this route was never added to the
fib6_siblings list.
This leads to a mismatch between the following counts:
- The sibling count computed by iterating fib6_next chain, which
includes the newly ECMP-eligible route
- The actual siblings in fib6_siblings list, which does not include
that route
When a subsequent ECMP route is added, fib6_add_rt2node() hits
BUG_ON(sibling->fib6_nsiblings != rt->fib6_nsiblings) because the
counts don't match.
Fix this by only clearing RTF_ADDRCONF when the existing route does
not have a gateway. Routes without a gateway cannot qualify for ECMP
anyway (rt6_qualify_for_ecmp() requires fib_nh_gw_family), so clearing
RTF_ADDRCONF on them is safe and matches the original intent of the
commit.
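The rule described above can be modelled in a few lines of C. This is only a toy sketch: struct toy_route, TOY_RTF_ADDRCONF and both helpers are hypothetical stand-ins, not the kernel's struct fib6_info or its real flag values. It shows why limiting the flag clearing to gateway-less routes can never turn an existing route into an ECMP candidate.

```c
#include <stdbool.h>

/* Toy flag value for illustration; not taken from the kernel headers. */
#define TOY_RTF_ADDRCONF 0x1u

/* Simplified stand-in for a FIB entry (hypothetical, not struct fib6_info). */
struct toy_route {
	unsigned int flags;
	bool has_gateway;
};

/* Models the idea behind rt6_qualify_for_ecmp(): a gateway and no RA flag. */
bool toy_qualifies_for_ecmp(const struct toy_route *rt)
{
	return rt->has_gateway && !(rt->flags & TOY_RTF_ADDRCONF);
}

/*
 * Clear the RA flag only on routes without a gateway, so that an existing
 * route can never become ECMP-eligible without being on the sibling list.
 */
void toy_clear_addrconf(struct toy_route *rt)
{
	if (!rt->has_gateway)
		rt->flags &= ~TOY_RTF_ADDRCONF;
}
```

In this model a route with a gateway keeps its flag and stays ineligible, while a route without a gateway loses the flag but is ineligible anyway.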
Jakub Kicinski [Thu, 5 Feb 2026 16:38:02 +0000 (08:38 -0800)]
Merge tag 'nf-26-02-05' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Florian Westphal says:
====================
netfilter: update for net
This is one last-minute crash fix for nf_tables, from Andrew Fasano:
Logical check is inverted, this makes kernel fail to correctly undo
the transaction, leading to a use-after-free.
* tag 'nf-26-02-05' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nf_tables: fix inverted genmask check in nft_map_catchall_activate()
====================
Jens Axboe [Thu, 5 Feb 2026 16:26:53 +0000 (09:26 -0700)]
loop: revert exclusive opener loop status change
This commit effectively reverts the following two commits:
2704024d83fa ("loop: add missing bd_abort_claiming in loop_set_status")
08e136ebd193 ("loop: don't change loop device under exclusive opener in loop_set_status")
as there are reports of them causing issues with unmounting. As we're
close to the 6.19 kernel release and the original author hasn't taken a
closer look at this yet, revert them for release.
This is caused by an extra file->klp sanity check which was added by commit 164c9201e1da ("objtool: Add base objtool support for livepatch
modules"). That check was intended to ensure that livepatch modules
built with klp-build always have full access to their static call keys.
However, it failed to account for the fact that manually built livepatch
modules (i.e., not built with klp-build) might need access to unexported
static call keys, for which read-only access is typically allowed for
modules.
While the livepatch-shadow-fix1 module doesn't explicitly use any static
calls, it does have a memory allocation, which can cause
CONFIG_MEM_ALLOC_PROFILING_DEBUG to insert a WARN() call. And WARN() is
now an unexported static call as of commit 860238af7a33 ("x86_64/bug:
Inline the UD1").
Fix it by removing the overzealous file->klp check, restoring the
original behavior for manually built livepatch modules.
Josh Poimboeuf [Mon, 2 Feb 2026 18:01:08 +0000 (10:01 -0800)]
objtool/klp: Fix symbol correlation for orphaned local symbols
When compiling with CONFIG_LTO_CLANG_THIN, vmlinux.o has
__irf_[start|end] before the first FILE entry:
$ readelf -sW vmlinux.o
Symbol table '.symtab' contains 597706 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 NOTYPE LOCAL DEFAULT 18 __irf_start
2: 0000000000000200 0 NOTYPE LOCAL DEFAULT 18 __irf_end
3: 0000000000000000 0 SECTION LOCAL DEFAULT 17 .text
4: 0000000000000000 0 SECTION LOCAL DEFAULT 18 .init.ramfs
This causes klp-build warnings like:
vmlinux.o: warning: objtool: no correlation: __irf_start
vmlinux.o: warning: objtool: no correlation: __irf_end
The problem is that Clang LTO is stripping the initramfs_data.o FILE
symbol, causing those two symbols to be orphaned and not noticed by
klp-diff's correlation logic. Add a loop to correlate any symbols found
before the first FILE symbol.
Petr Pavlu [Fri, 23 Jan 2026 10:26:57 +0000 (11:26 +0100)]
livepatch: Free klp_{object,func}_ext data after initialization
The klp_object_ext and klp_func_ext data, which are stored in the
__klp_objects and __klp_funcs sections, respectively, are not needed
after they are used to create the actual klp_object and klp_func
instances. This operation is implemented by the init function in
scripts/livepatch/init.c.
Prefix the two sections with ".init" so they are freed after the module
is initialized.
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com> Acked-by: Joe Lawrence <joe.lawrence@redhat.com> Acked-by: Miroslav Benes <mbenes@suse.cz> Reviewed-by: Aaron Tomlin <atomlin@atomlin.com> Link: https://patch.msgid.link/20260123102825.3521961-3-petr.pavlu@suse.com Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Petr Pavlu [Fri, 23 Jan 2026 10:26:56 +0000 (11:26 +0100)]
livepatch: Fix having __klp_objects relics in non-livepatch modules
The linker script scripts/module.lds.S specifies that all input
__klp_objects sections should be consolidated into an output section of
the same name, and start/stop symbols should be created to enable
scripts/livepatch/init.c to locate this data.
This start/stop pattern is not ideal for modules because the symbols are
created even if no __klp_objects input sections are present.
Consequently, a dummy __klp_objects section also appears in the
resulting module. This unnecessarily pollutes non-livepatch modules.
Instead, since modules are relocatable files, the usual method for
locating consolidated data in a module is to read its section table.
This approach avoids the aforementioned problem.
The klp_modinfo already stores a copy of the entire section table with
the final addresses. Introduce a helper function that
scripts/livepatch/init.c can call to obtain the location of the
__klp_objects section from this data.
Fixes: dd590d4d57eb ("objtool/klp: Introduce klp diff subcommand for diffing object files") Signed-off-by: Petr Pavlu <petr.pavlu@suse.com> Acked-by: Joe Lawrence <joe.lawrence@redhat.com> Acked-by: Miroslav Benes <mbenes@suse.cz> Reviewed-by: Aaron Tomlin <atomlin@atomlin.com> Link: https://patch.msgid.link/20260123102825.3521961-2-petr.pavlu@suse.com Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
YunJe Shin [Wed, 28 Jan 2026 00:41:07 +0000 (09:41 +0900)]
nvmet-tcp: add bounds checks in nvmet_tcp_build_pdu_iovec
nvmet_tcp_build_pdu_iovec() could walk past cmd->req.sg when a PDU
length or offset exceeds sg_cnt and then use bogus sg->length/offset
values, leading to _copy_to_iter() GPF/KASAN. Guard sg_idx, remaining
entries, and sg->length/offset before building the bvec.
The initial state of dma_needs_unmap may be false, but change to true
while mapping the data iterator. Enabling swiotlb is one such case that
can change the result. The nvme driver needs to save the mapped dma
vectors to be unmapped later, so allocate as needed during iteration
rather than assume it was always allocated at the beginning. This fixes
a NULL dereference from accessing an uninitialized dma_vecs when the
device dma unmapping requirements change mid-iteration.
Steven Rostedt [Wed, 4 Feb 2026 16:36:28 +0000 (11:36 -0500)]
tracing: Fix ftrace event field alignments
The fields of ftrace specific events (events used to save ftrace internal
events like function traces and trace_printk) are generated similarly to
how normal trace event fields are generated. That is, the fields are added
to a trace_events_fields array that saves the name, offset, size,
alignment and signedness of the field. It is used to produce the output in
the format file in tracefs so that tooling knows how to parse the binary
data of the trace events.
The issue is that some of the ftrace event structures are packed. The
function graph exit event structures are one of them. The 64 bit calltime
and rettime fields end up 4 byte aligned, but the algorithm to show to
userspace shows them as 8 byte aligned.
The macros that create the ftrace events include one for embedded structure
fields. There are two macros for these fields:
__field_desc() and __field_packed()
The difference of the latter macro is that it treats the field as packed.
Rename that macro to __field_desc_packed() and redefine
__field_packed() to be a normal field that is packed, and have the calltime
and rettime fields use it.
This showed up on 32bit architectures for function graph time fields. It
had:
~# cat /sys/kernel/tracing/events/ftrace/funcgraph_exit/format
[..]
field:unsigned long func; offset:8; size:4; signed:0;
field:unsigned int depth; offset:12; size:4; signed:0;
field:unsigned int overrun; offset:16; size:4; signed:0;
field:unsigned long long calltime; offset:24; size:8; signed:0;
field:unsigned long long rettime; offset:32; size:8; signed:0;
Notice that overrun is at offset 16 with size 4, where in the structure
calltime is at offset 20 (16 + 4), but it shows the offset at 24. That's
because it used the alignment of unsigned long long when used as a
declaration and not as a member of a structure where it would be aligned
by word size (in this case 4).
By using the proper structure alignment, the format has it at the correct
offset:
~# cat /sys/kernel/tracing/events/ftrace/funcgraph_exit/format
[..]
field:unsigned long func; offset:8; size:4; signed:0;
field:unsigned int depth; offset:12; size:4; signed:0;
field:unsigned int overrun; offset:16; size:4; signed:0;
field:unsigned long long calltime; offset:20; size:8; signed:0;
field:unsigned long long rettime; offset:28; size:8; signed:0;
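The offset difference can be reproduced in user space. The sketch below is an illustration only: struct toy_exit_packed and toy_align_up() are hypothetical names, and the two-word common header is an assumption made so that func lands at offset 8 as in the format output. Rounding offset 20 up to an 8-byte boundary gives the wrong value 24 that was reported before the fix, while the packed structure really places calltime at 20.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the 32-bit layout in the format output above. */
struct toy_exit_packed {
	uint32_t common[2];   /* bytes 0..7: assumed common event header */
	uint32_t func;        /* offset 8 */
	uint32_t depth;       /* offset 12 */
	uint32_t overrun;     /* offset 16 */
	uint64_t calltime;    /* offset 20 when packed */
	uint64_t rettime;     /* offset 28 when packed */
} __attribute__((packed));

/* Round an offset up to a power-of-two alignment. */
size_t toy_align_up(size_t offset, size_t align)
{
	return (offset + align - 1) & ~(align - 1);
}

/* The real layout: the 64-bit fields directly follow overrun. */
static_assert(offsetof(struct toy_exit_packed, calltime) == 20, "calltime");
static_assert(offsetof(struct toy_exit_packed, rettime) == 28, "rettime");
```

Calling toy_align_up(20, 8) models the old reporting logic, and toy_align_up(20, 4) models alignment by a 4-byte word size.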
Xu Yang [Wed, 4 Feb 2026 11:11:41 +0000 (19:11 +0800)]
pmdomain: imx8mp-blk-ctrl: Keep gpc power domain on for system wakeup
The current design powers off all dependent GPC power domains in
imx8mp_blk_ctrl_suspend(), even when the user device has enabled its
wakeup capability. As a result, wakeup never works for such a
device.
An example is USB wakeup on i.MX8MP. PHY device '382f0040.usb-phy'
is attached to power domain 'hsioblk-usb-phy2' which is spawned by hsio
block control. A virtual power domain device 'genpd:3:32f10000.blk-ctrl'
is created to build connection with 'hsioblk-usb-phy2' and it depends on
GPC power domain 'usb-otg2'. If device '382f0040.usb-phy' enables wakeup,
only power domain 'hsioblk-usb-phy2' stays on during system suspend,
while power domain 'usb-otg2' is powered off. So the wakeup event can
never happen.
In order to further establish a connection between the power domains
related to GPC and block control during system suspend, register a genpd
power on/off notifier for the power_dev. This allows us to prevent the GPC
power domain from being powered off, in case the block control power
domain is kept on to serve system wakeup.
Shengjiu Wang [Fri, 30 Jan 2026 08:09:10 +0000 (16:09 +0800)]
drm/bridge: imx8mp-hdmi-pai: enable PM runtime
There is an audio channel shift issue in the multi-channel case: the
channel order is correct for the first run, but the channel order is
shifted for the second run. The fix is to reset the PAI interface
at the end of playback.
The reset can be handled by PM runtime, so enable PM runtime.
Fixes: 0205fae6327a ("drm/bridge: imx: add driver for HDMI TX Parallel Audio Interface") Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com> Reviewed-by: Liu Ying <victor.liu@nxp.com> Signed-off-by: Liu Ying <victor.liu@nxp.com> Link: https://lore.kernel.org/r/20260130080910.3532724-1-shengjiu.wang@nxp.com
Andrew Fasano [Wed, 4 Feb 2026 16:46:58 +0000 (17:46 +0100)]
netfilter: nf_tables: fix inverted genmask check in nft_map_catchall_activate()
nft_map_catchall_activate() has an inverted element activity check
compared to its non-catchall counterpart nft_mapelem_activate() and
compared to what is logically required.
nft_map_catchall_activate() is called from the abort path to re-activate
catchall map elements that were deactivated during a failed transaction.
It should skip elements that are already active (they don't need
re-activation) and process elements that are inactive (they need to be
restored). Instead, the current code does the opposite: it skips inactive
elements and processes active ones.
Compare the non-catchall activate callback, which is correct:
nft_mapelem_activate():
if (nft_set_elem_active(ext, iter->genmask))
return 0; /* skip active, process inactive */
With the buggy catchall version:
nft_map_catchall_activate():
if (!nft_set_elem_active(ext, genmask))
continue; /* skip inactive, process active */
The consequence is that when a DELSET operation is aborted,
nft_setelem_data_activate() is never called for the catchall element.
For NFT_GOTO verdict elements, this means nft_data_hold() is never
called to restore the chain->use reference count. Each abort cycle
permanently decrements chain->use. Once chain->use reaches zero,
DELCHAIN succeeds and frees the chain while catchall verdict elements
still reference it, resulting in a use-after-free.
This is exploitable for local privilege escalation from an unprivileged
user via user namespaces + nftables on distributions that enable
CONFIG_USER_NS and CONFIG_NF_TABLES.
Fix by removing the negation so the check matches nft_mapelem_activate():
skip active elements, process inactive ones.
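The corrected behaviour can be illustrated with a small user-space model. Everything here is a simplified assumption: struct toy_elem, toy_catchall_activate() and the plain integer counter are hypothetical, and the real code tests activity against a generation mask rather than a boolean.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy catchall element; the real code tests activity against a genmask. */
struct toy_elem {
	bool active;
};

/*
 * Abort-path re-activation: skip elements that are already active and
 * restore the inactive ones, taking the chain reference back for each.
 * Returns the number of elements restored.
 */
size_t toy_catchall_activate(struct toy_elem *elems, size_t n, int *chain_use)
{
	size_t restored = 0;

	for (size_t i = 0; i < n; i++) {
		if (elems[i].active)
			continue;	/* skip active, process inactive */
		elems[i].active = true;
		(*chain_use)++;		/* stands in for nft_data_hold() */
		restored++;
	}
	return restored;
}
```

With the negated check, the loop would skip exactly the elements that need restoring, so the counter would never be incremented again after an abort.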
Fixes: 628bd3e49cba ("netfilter: nf_tables: drop map element references from preparation phase") Signed-off-by: Andrew Fasano <andrew.fasano@nist.gov> Signed-off-by: Florian Westphal <fw@strlen.de>
Jakub Kicinski [Thu, 5 Feb 2026 04:29:53 +0000 (20:29 -0800)]
Merge tag 'wireless-2026-02-04' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless
Johannes Berg says:
====================
Two last-minute iwlwifi fixes:
- cancel mlo_scan_work on disassoc to avoid
use-after-free/init-after-queue issues
- pause TCM work on suspend to avoid crashing
the FW (and sometimes the host) on resume
with traffic
* tag 'wireless-2026-02-04' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
wifi: iwlwifi: mvm: pause TCM on fast resume
wifi: iwlwifi: mld: cancel mlo_scan_start_wk
====================
Linus Torvalds [Thu, 5 Feb 2026 00:04:00 +0000 (16:04 -0800)]
Merge tag 'mm-hotfixes-stable-2026-02-04-15-55' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"Five hotfixes. Two are cc:stable, two are for MM.
All are singletons - please see the changelogs for details"
* tag 'mm-hotfixes-stable-2026-02-04-15-55' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
Documentation: document liveupdate cmdline parameter
mm, shmem: prevent infinite loop on truncate race
mailmap: update Alexander Mikhalitsyn's emails
liveupdate: luo_file: do not clear serialized_data on unfreeze
x86/kfence: fix booting on 32bit non-PAE systems
Linus Torvalds [Wed, 4 Feb 2026 23:15:54 +0000 (15:15 -0800)]
Merge tag 'tsm-fixes-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/devsec/tsm
Pull TSM (TEE security Manager) fixes from Dan Williams:
"The largest change is reverting part of an ABI that never shipped in a
released kernel (Documentation/ABI/testing/sysfs-class-tsm). The fix /
replacement for that is too large to squeeze in at this late date.
The rest is a collection of small fixups:
- Fix multiple streams per host bridge for SEV-TIO
- Drop the TSM ABI for reporting IDE streams (to be replaced)
- Fix virtual function enumeration
- Fix reserved stream ID initialization
- Fix unused variable compiler warning"
* tag 'tsm-fixes-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/devsec/tsm:
crypto/ccp: Allow multiple streams on the same root bridge
crypto/ccp: Use PCI bridge defaults for IDE
coco/tsm: Remove unused variable tsm_rwsem
PCI/IDE: Fix reading a wrong reg for unused sel stream initialization
PCI/IDE: Fix off by one error calculating VF RID range
Revert "PCI/TSM: Report active IDE streams"
Linus Torvalds [Wed, 4 Feb 2026 23:11:24 +0000 (15:11 -0800)]
Merge tag 'sched_ext-for-6.19-rc8-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext
Pull sched_ext fix from Tejun Heo:
- Fix race where sched_class operations (sched_setscheduler() and
friends) could be invoked on dead tasks after sched_ext_dead()
already ran, causing invalid SCX task state transitions and NULL
pointer dereferences.
This was a regression from the cgroup exit ordering fix which
moved sched_ext_free() to finish_task_switch().
* tag 'sched_ext-for-6.19-rc8-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext:
sched_ext: Short-circuit sched_class operations on dead tasks
Arnd Bergmann [Tue, 3 Feb 2026 16:34:36 +0000 (17:34 +0100)]
hwmon: (occ) Mark occ_init_attribute() as __printf
This is a printf-style function, which gcc -Werror=suggest-attribute=format
correctly points out:
drivers/hwmon/occ/common.c: In function 'occ_init_attribute':
drivers/hwmon/occ/common.c:761:9: error: function 'occ_init_attribute' might be a candidate for 'gnu_printf' format attribute [-Werror=suggest-attribute=format]
Add the attribute to avoid this warning and ensure any incorrect
format strings are detected here.
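For readers unfamiliar with the attribute, a minimal user-space sketch follows. toy_format_name() is a hypothetical helper, not the real occ_init_attribute(); the kernel's __printf(a, b) macro expands to the same GCC format attribute used here.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical helper, not the real occ_init_attribute(). The attribute
 * says argument 3 is a printf format and the variadic arguments start at 4,
 * so the compiler can check callers' format strings.
 */
__attribute__((format(printf, 3, 4)))
int toy_format_name(char *buf, size_t len, const char *fmt, ...)
{
	va_list args;
	int ret;

	va_start(args, fmt);
	ret = vsnprintf(buf, len, fmt, args);
	va_end(args);
	return ret;
}
```

With the attribute in place, a call such as toy_format_name(buf, len, "temp%d_input", "x") is diagnosed by -Wformat, because a string is passed where %d expects an int.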
Tejun Heo [Wed, 4 Feb 2026 20:07:55 +0000 (10:07 -1000)]
sched_ext: Short-circuit sched_class operations on dead tasks
7900aa699c34 ("sched_ext: Fix cgroup exit ordering by moving sched_ext_free()
to finish_task_switch()") moved sched_ext_free() to finish_task_switch() and
renamed it to sched_ext_dead() to fix cgroup exit ordering issues. However,
this created a race window where certain sched_class ops may be invoked on
dead tasks leading to failures - e.g. sched_setscheduler() may try to switch a
task which finished sched_ext_dead() back into SCX triggering invalid SCX task
state transitions.
Add task_dead_and_done() which tests whether a task is TASK_DEAD and has
completed its final context switch, and use it to short-circuit sched_class
operations which may be called on dead tasks.
Fixes: 7900aa699c34 ("sched_ext: Fix cgroup exit ordering by moving sched_ext_free() to finish_task_switch()") Reported-by: Andrea Righi <arighi@nvidia.com> Link: http://lkml.kernel.org/r/20260202151341.796959-1-arighi@nvidia.com Reviewed-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
ceph: fix NULL pointer dereference in ceph_mds_auth_match()
The CephFS kernel client has a regression starting from 6.18-rc1.
There is an issue in ceph_mds_auth_match() if fs_name == NULL:
const char *fs_name = mdsc->fsc->mount_options->mds_namespace;
...
if (auth->match.fs_name && strcmp(auth->match.fs_name, fs_name)) {
/* fsname mismatch, try next one */
return 0;
}
Patrick Donnelly suggested that: In summary, we should definitely start
decoding `fs_name` from the MDSMap and do strict authorizations checks
against it. Note that the `-o mds_namespace=foo` should only be used for
selecting the file system to mount and nothing else. It's possible
no mds_namespace is specified but the kernel will mount the only
file system that exists which may have name "foo".
This patch reworks ceph_mdsmap_decode() and namespace_equals() with
the goal of supporting the suggested concept. Now struct ceph_mdsmap
contains an m_fs_name field that receives a copy of the FS name extracted
by ceph_extract_encoded_string(). For "old" CephFS file
systems, the name "cephfs" is used.
[ idryomov: replace redundant %*pE with %s in ceph_mdsmap_decode(),
get rid of a series of strlen() calls in ceph_namespace_match(),
drop changes to namespace_equals() body to avoid treating empty
mds_namespace as equal, drop changes to ceph_mdsc_handle_fsmap()
as namespace_equals() isn't an equivalent substitution there ]
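The crash in the snippet quoted earlier comes from passing a NULL fs_name to strcmp(). The sketch below only illustrates that hazard with a defensive comparison; toy_fs_name_mismatch() is a hypothetical helper, and treating a missing name as a mismatch is an assumption of this example. The patch itself avoids the NULL case differently, by comparing against the name decoded from the MDSMap.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns true when an auth entry restricted to
 * auth_fs_name must be rejected for a file system named fs_name.
 * A NULL auth_fs_name means "no restriction"; a NULL fs_name never
 * reaches strcmp().
 */
bool toy_fs_name_mismatch(const char *auth_fs_name, const char *fs_name)
{
	if (!auth_fs_name)
		return false;
	if (!fs_name)
		return true;	/* assumption: unknown name cannot match */
	return strcmp(auth_fs_name, fs_name) != 0;
}
```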
Linus Torvalds [Wed, 4 Feb 2026 18:38:56 +0000 (10:38 -0800)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
- Fix a bug where AVIC is incorrectly inhibited when running with
x2AVIC disabled via module param (or on a system without x2AVIC)
- Fix a dangling device posted IRQs bug by explicitly checking if the
irqfd is still active (on the list) when handling an eventfd signal,
instead of zeroing the irqfd's routing information when the irqfd is
deassigned.
Zeroing the irqfd's routing info causes arm64 and x86's to not
disable posting for the IRQ (kvm_arch_irq_bypass_del_producer() looks
for an MSI), incorrectly leaving the IRQ in posted mode (and leading
to use-after-free and memory leaks on AMD in particular).
This is both the most pressing and scariest, but it's been in -next
for a while.
- Disable FORTIFY_SOURCE for KVM selftests to prevent the compiler from
generating calls to the checked versions of memset() and friends,
which leads to unexpected page faults in guest code due to e.g.
__memset_chk@plt not being resolved.
- Explicitly configure the supported XSS capabilities from within
{svm,vmx}_set_cpu_caps() to fix a bug where VMX will compute the
reference VMCS configuration with SHSTK and IBT enabled, but then
compute each CPUs local config with SHSTK and IBT disabled if not all
CET xfeatures are enabled, e.g. if the kernel is built with
X86_KERNEL_IBT=n.
The mismatch in features results in differing nVMX setting, and
ultimately causes kvm-intel.ko to refuse to load with nested=1.
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: Explicitly configure supported XSS from {svm,vmx}_set_cpu_caps()
KVM: selftests: Add -U_FORTIFY_SOURCE to avoid some unpredictable test failures
KVM: x86: Assert that non-MSI doesn't have bypass vCPU when deleting producer
KVM: Don't clobber irqfd routing type when deassigning irqfd
KVM: SVM: Check vCPU ID against max x2AVIC ID if and only if x2AVIC is enabled
Paolo Bonzini [Wed, 4 Feb 2026 17:30:32 +0000 (18:30 +0100)]
Merge tag 'kvm-x86-fixes-6.19-rc8' of https://github.com/kvm-x86/linux into HEAD
Final KVM fixes for 6.19:
- Fix a bug where AVIC is incorrectly inhibited when running with x2AVIC
disabled via module param (or on a system without x2AVIC).
- Fix a dangling device posted IRQs bug by explicitly checking if the irqfd is
still active (on the list) when handling an eventfd signal, instead of
zeroing the irqfd's routing information when the irqfd is deassigned.
Zeroing the irqfd's routing info causes arm64 and x86's to not disable
posting for the IRQ (kvm_arch_irq_bypass_del_producer() looks for an MSI),
incorrectly leaving the IRQ in posted mode (and leading to use-after-free
and memory leaks on AMD in particular).
This is both the most pressing and scariest, but it's been in -next for
a while.
- Disable FORTIFY_SOURCE for KVM selftests to prevent the compiler from
generating calls to the checked versions of memset() and friends, which
leads to unexpected page faults in guest code due to e.g. __memset_chk@plt
not being resolved.
- Explicitly configure the supported XSS from within {svm,vmx}_set_cpu_caps() to
fix a bug where VMX will compute the reference VMCS configuration with SHSTK
and IBT enabled, but then compute each CPUs local config with SHSTK and IBT
disabled if not all CET xfeatures are enabled, e.g. if the kernel is built
with X86_KERNEL_IBT=n. The mismatch in features results in differing nVMX
setting, and ultimately causes kvm-intel.ko to refuse to load with nested=1.
Linus Torvalds [Wed, 4 Feb 2026 16:26:22 +0000 (08:26 -0800)]
Merge tag 'soc-fixes-6.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC fixes from Arnd Bergmann:
"Shawn Guo is moving on from maintaining the NXP i.MX platform and
hands over to Frank Li. Shawn has maintained the platform for 15 years
after initially upstreaming support for i.MX6 and i.MX23/28, and his
work has helped make this the most important industrial embedded Linux
platform. Roughly one out of five devicetree files in mainline kernels
are for the wider i.MX platform. Many thanks to Shawn for taking
care of the platform all these years!
There are also two additional updates for the MAINTAINERS file, and a
fix for error handling in the qualcomm smem driver"
* tag 'soc-fixes-6.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
MAINTAINERS: Change Sudeep Holla's email address
MAINTAINERS: Add myself as maintainer of hisi_soc_hha
soc: qcom: smem: fix qcom_smem_is_available and check if __smem is valid
MAINTAINERS: Replace Shawn with Frank as i.MX platform maintainer
Takashi Iwai [Wed, 4 Feb 2026 16:03:08 +0000 (17:03 +0100)]
Merge tag 'asoc-fix-v6.19-rc8' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v6.19
A bunch more small fixes here, plus some more of the constant stream of
quirks. The most notable change here is Richard's change to the cs_dsp
code for the KUnit tests which is relatively large, mostly due to
boilerplate. The tests were triggering large numbers of error messages
as part of verifying that problems with input data are appropriately
detected which in turn caused runtime issues for the framework due to
the performance impact of pushing the logging out, while the logging is
valuable in normal operation it's basically useless while doing tests
designed to trigger it so rate limiting is an appropriate fix.
Shuicheng Lin [Thu, 29 Jan 2026 23:38:38 +0000 (23:38 +0000)]
drm/xe: Fix kerneldoc for xe_tlb_inval_job_alloc_dep
Correct the function name in the kerneldoc.
It is for below warning:
"Warning: drivers/gpu/drm/xe/xe_tlb_inval_job.c:210 expecting prototype for
xe_tlb_inval_alloc_dep(). Prototype was for xe_tlb_inval_job_alloc_dep()
instead"
Fixes: 15366239e2130 ("drm/xe: Decouple TLB invalidations from GT") Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20260129233834.419977-8-shuicheng.lin@intel.com
(cherry picked from commit 9f9c117ac566cb567dd56cc5b7564c45653f7a2a) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Shuicheng Lin [Thu, 29 Jan 2026 23:38:37 +0000 (23:38 +0000)]
drm/xe: Fix kerneldoc for xe_gt_tlb_inval_init_early
Correct the function name in the kerneldoc.
It is for below warning:
"Warning: drivers/gpu/drm/xe/xe_tlb_inval.c:136 expecting prototype for
xe_gt_tlb_inval_init(). Prototype was for xe_gt_tlb_inval_init_early()
instead"
v2: add () for the function. (Michal)
Fixes: db16f9d90c1d9 ("drm/xe: Split TLB invalidation code in frontend and backend") Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20260129233834.419977-7-shuicheng.lin@intel.com
(cherry picked from commit 0651dbb9d6a72e99569576fbec4681fd8160d161) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Shuicheng Lin [Thu, 29 Jan 2026 23:38:36 +0000 (23:38 +0000)]
drm/xe: Fix kerneldoc for xe_migrate_exec_queue
Correct the function name in the kerneldoc.
It is for below warning:
"Warning: drivers/gpu/drm/xe/xe_migrate.c:1262 expecting prototype for
xe_get_migrate_exec_queue(). Prototype was for xe_migrate_exec_queue()
instead"
Fixes: 916ee4704a865 ("drm/xe/vf: Register CCS read/write contexts with Guc") Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20260129233834.419977-6-shuicheng.lin@intel.com
(cherry picked from commit 9fd8da717934f05125b9ba6782622c459a368dc0) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Shuicheng Lin [Fri, 30 Jan 2026 04:39:08 +0000 (04:39 +0000)]
drm/xe/query: Fix topology query pointer advance
The topology query helper advanced the user pointer by the size
of the pointer, not the size of the structure. This can misalign
the output blob and corrupt the following mask. Fix the increment
to use sizeof(*topo).
There is no issue currently, as sizeof(*topo) happens to be equal
to sizeof(topo) on 64-bit systems (both evaluate to 8 bytes).
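The difference between the two expressions is easy to show in isolation. In this sketch struct toy_topo is a hypothetical 8-byte header chosen to match the size mentioned above, and the two helpers exist only for illustration; they are not the driver's code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header that precedes a variable-length mask in a blob. */
struct toy_topo {
	uint16_t gt_id;
	uint16_t type;
	uint32_t num_bytes;
};

/* Buggy advance: steps by the size of the pointer itself. */
size_t toy_advance_by_pointer(size_t offset, const struct toy_topo *topo)
{
	return offset + sizeof(topo);
}

/* Fixed advance: steps by the size of the structure pointed to. */
size_t toy_advance_by_struct(size_t offset, const struct toy_topo *topo)
{
	return offset + sizeof(*topo);
}
```

The two helpers agree only where a pointer happens to be 8 bytes wide; on a target with 4-byte pointers the first one would advance by 4 and land inside the header.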
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patch.msgid.link/20260130043907.465128-2-shuicheng.lin@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
(cherry picked from commit c2a6859138e7f73ad904be17dd7d1da6cc7f06b3) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Ziyi Guo [Mon, 2 Feb 2026 17:41:12 +0000 (17:41 +0000)]
ASoC: fsl_xcvr: fix missing lock in fsl_xcvr_mode_put()
fsl_xcvr_activate_ctl() has
lockdep_assert_held(&card->snd_card->controls_rwsem),
but fsl_xcvr_mode_put() calls it without acquiring this lock.
Other callers of fsl_xcvr_activate_ctl() in fsl_xcvr_startup() and
fsl_xcvr_shutdown() properly acquire the lock with down_read()/up_read().
Add the missing down_read()/up_read() calls around fsl_xcvr_activate_ctl()
in fsl_xcvr_mode_put() to fix the lockdep assertion and prevent potential
race conditions when multiple userspace threads access the control.
Add compatible string ti,tlv320aic23 to fix below CHECK_DTB warning:
arch/arm/boot/dts/nxp/imx/imx35-eukrea-mbimxsd35-baseboard.dtb:
/soc/bus@43f00000/i2c@43f80000/codec@1a: failed to match any schema with compatible: ['ti,tlv320aic23']
Thomas Gleixner [Mon, 2 Feb 2026 09:39:55 +0000 (10:39 +0100)]
sched/mmcid: Optimize transitional CIDs when scheduling out
During the investigation of the various transition mode issues
instrumentation revealed that the amount of bitmap operations can be
significantly reduced when a task with a transitional CID schedules out
after the fixup function completed and disabled the transition mode.
At that point the mode is stable and therefore it is not required to drop
the transitional CID back into the pool. As the fixup is complete the
potential exhaustion of the CID pool is no longer possible, so the CID can
be transferred to the scheduling out task or to the CPU depending on the
current ownership mode.
The racy snapshot of mm_cid::mode which contains both the ownership state
and the transition bit is valid because runqueue lock is held and the fixup
function of a concurrent mode switch is serialized.
Assigning the ownership right there not only spares the bitmap access for
dropping the CID it also avoids it when the task is scheduled back in as it
directly hits the fast path in both modes when the CID is within the
optimal range. If it's outside the range the next schedule in will need to
converge so dropping it right away is sensible. In the good case this also
allows to go into the fast path on the next schedule in operation.
With a thread pool benchmark which is configured to cross the mode switch
boundaries frequently this reduces the number of bitmap operations by about
30% and increases the fastpath utilization in the low single digit
percentage range.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://patch.msgid.link/20260201192835.100194627@kernel.org
Thomas Gleixner [Mon, 2 Feb 2026 09:39:50 +0000 (10:39 +0100)]
sched/mmcid: Drop per CPU CID immediately when switching to per task mode
When an exiting task initiates the switch from per CPU back to per task
mode, it has already dropped its CID and marked itself inactive. But a
leftover from an earlier iteration of the rework then reassigns the per
CPU CID to the exiting task with the transition bit set.
That's wrong as the task is already marked CID inactive, which means it is
in an inconsistent state. It's harmless because the CID is marked in transit and
therefore dropped back into the pool when the exiting task schedules out
either through preemption or the final schedule().
Simply drop the per CPU CID when the exiting task triggered the transition.
Fixes: fbd0e71dc370 ("sched/mmcid: Provide CID ownership mode fixup functions") Signed-off-by: Thomas Gleixner <tglx@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://patch.msgid.link/20260201192835.032221009@kernel.org