git.ipfire.org Git - thirdparty/kernel/linux.git/log

Merge tag 'hfs-v7.0-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/vdubeyko/hfs

Pull hfs/hfsplus updates from Viacheslav Dubeyko:
"This pull request contains several fixes of syzbot reported issues and
  HFS+ fixes of xfstests failures.

   - fix an issue reported by syzbot triggering BUG_ON() in the case of
     corrupted superblock, replacing the BUG_ON()s with proper error
     handling (Jori Koolstra)

   - fix memory leaks in the mount logic of HFS/HFS+ file systems. When
     HFS/HFS+ were converted to the new mount api a bug was introduced
     by changing the allocation pattern of sb->s_fs_info (Mehdi Ben Hadj
     Khelifa)

   - fix hfs_bnode_create() by returning ERR_PTR(-EEXIST) instead of
     the node pointer when it's already hashed.  This avoids a double
     unload_nls() on mount failure (suggested by Shardul Bankar)

   - set inode's mode as regular file for system inodes (Tetsuo Handa)

  The rest fix failures in generic/020, generic/037, generic/062,
  generic/480, and generic/498 xfstests for the case of HFS+ file
  system. Currently, only 30 xfstests' test-cases experience failures
  for HFS+ file system (initially, it was around 100 failed xfstests)"

* tag 'hfs-v7.0-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/vdubeyko/hfs:
  hfsplus: avoid double unload_nls() on mount failure
  hfsplus: fix warning issue in inode.c
  hfsplus: fix generic/062 xfstests failure
  hfsplus: fix generic/037 xfstests failure
  hfsplus: pretend special inodes as regular files
  hfsplus: return error when node already exists in hfs_bnode_create
  hfs: Replace BUG_ON with error handling for CNID count checks
  hfsplus: fix generic/020 xfstests failure
  hfsplus: fix volume corruption issue for generic/498
  hfsplus: fix volume corruption issue for generic/480
  hfsplus: ensure sb->s_fs_info is always cleaned up
  hfs: ensure sb->s_fs_info is always cleaned up

Merge tag 'nilfs2-v7.0-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/vdubeyko/nilfs2

Pull nilfs2 updates from Viacheslav Dubeyko:

- Fix potential block overflow that cause system hang

   When executing the FITRIM command, an underflow can occur in the
   calculation of nblocks. This ultimately leads to the block layer
   function __blkdev_issue_discard() taking an excessively long time
   to process the bio chain, and the ns_segctor_sem lock remains held
   for a long period.

   This prevents other tasks from acquiring the ns_segctor_sem lock,
   resulting in a hang reported by syzbot (Edward Adam Davis)

- Fix missing struct keywords in nilfs2_api.h kernel-doc (Ryusuke
   Konishi)

- Convert nilfs_super_block to kernel-doc

   Eliminate 40+ kernel-doc warnings in nilfs2_ondisk.h by converting
   all of the struct member comments to kernel-doc comments (Randy
   Dunlap)

* tag 'nilfs2-v7.0-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/vdubeyko/nilfs2:
  nilfs2: fix missing struct keywords in nilfs2_api.h kernel-doc
  nilfs2: convert nilfs_super_block to kernel-doc
  nilfs2: Fix potential block overflow that cause system hang

Merge tag 'for-6.20-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs updates from David Sterba:
"User visible changes, feature updates:

   - when using block size > page size, enable direct IO

   - fallback to buffered IO if the data profile has duplication,
     workaround to avoid checksum mismatches on block group profiles
     with redundancy, real direct IO is possible on single or RAID0

   - redo export of zoned statistics, moved from sysfs to
     /proc/pid/mountstats due to size limitations of the former

  Experimental features:

   - remove offload checksum tunable, intended to find best way to do it
     but since we've switched to offload to thread for everything we
     don't need it anymore

   - initial support for remap-tree feature, a translation layer of
     logical block addresses that allow changes without moving/rewriting
     blocks to do eg. relocation, or other changes that require COW

  Notable fixes:

   - automatic removal of accidentally leftover chunks when
     free-space-tree is enabled since mkfs.btrfs v6.16.1

   - zoned mode:
      - do not try to append to conventional zones when RAID is mixing
        zoned and conventional drives
      - fixup write pointers when mixing zoned and conventional on
        DUP/RAID* profiles

   - when using squota, relax deletion rules for qgroups with 0 members
     to allow easier recovery from accounting bugs, also add more checks
     to detect bad accounting

   - fix periodic reclaim scanning, properly check boundary conditions
     not to trigger it unexpectedly or miss the time to run it

   - trim:
      - continue after first error
      - change reporting to the first detected error
      - add more cancellation points
      - reduce contention of big device lock that can block other
        operations when there's lots of trimmed space

   - when chunk allocation is forced (needs experimental build) fix
     transaction abort when unexpected space layout is detected

  Core:

   - switch to crypto library API for checksumming, removed module
     dependencies, pointer indirections, etc.

   - error handling improvements

   - adjust how and where transaction commit or abort are done and are
     maybe not necessary

   - minor compression optimization to skip single block ranges

   - improve how compression folios are handled

   - new and updated selftests

   - cleanups, refactoring:
      - auto-freeing and other automatic variable cleanup conversion
      - structure size optimizations
      - condition annotations"

* tag 'for-6.20-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (137 commits)
  btrfs: get rid of compressed_bio::compressed_folios[]
  btrfs: get rid of compressed_folios[] usage for encoded writes
  btrfs: get rid of compressed_folios[] usage for compressed read
  btrfs: remove the old btrfs_compress_folios() infrastructure
  btrfs: switch to btrfs_compress_bio() interface for compressed writes
  btrfs: introduce btrfs_compress_bio() helper
  btrfs: zlib: introduce zlib_compress_bio() helper
  btrfs: zstd: introduce zstd_compress_bio() helper
  btrfs: lzo: introduce lzo_compress_bio() helper
  btrfs: zoned: factor out the zone loading part into a testable function
  btrfs: add cleanup function for btrfs_free_chunk_map
  btrfs: tests: add cleanup functions for test specific functions
  btrfs: raid56: fix memory leak of btrfs_raid_bio::stripe_uptodate_bitmap
  btrfs: tests: add unit tests for pending extent walking functions
  btrfs: fix EEXIST abort due to non-consecutive gaps in chunk allocation
  btrfs: fix transaction commit blocking during trim of unallocated space
  btrfs: handle user interrupt properly in btrfs_trim_fs()
  btrfs: preserve first error in btrfs_trim_fs()
  btrfs: continue trimming remaining devices on failure
  btrfs: do not BUG_ON() in btrfs_remove_block_group()
  ...

fs: fuse: fix max() of incompatible types

The 'max()' value of a 'long long' and an 'unsigned int' is problematic
if the former is negative:

In function 'fuse_wr_pages',
    inlined from 'fuse_perform_write' at fs/fuse/file.c:1347:27:
include/linux/compiler_types.h:652:45: error: call to '__compiletime_assert_390' declared with attribute error: min(((pos + len - 1) >> 12) - (pos >> 12) + 1, max_pages) signedness error
  652 |         _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
      |                                             ^

Use a temporary variable to make it clearer what is going on here.

Fixes: 0f5bb0cfb0b4 ("fs: use min() or umin() instead of min_t()")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'vfs-7.0-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull misc vfs updates from Christian Brauner:
"This contains a mix of VFS cleanups, performance improvements, API
  fixes, documentation, and a deprecation notice.

  Scalability and performance:

   - Rework pid allocation to only take pidmap_lock once instead of
     twice during alloc_pid(), improving thread creation/teardown
     throughput by 10-16% depending on false-sharing luck. Pad the
     namespace refcount to reduce false-sharing

   - Track file lock presence via a flag in ->i_opflags instead of
     reading ->i_flctx, avoiding false-sharing with ->i_readcount on
     open/close hot paths. Measured 4-16% improvement on 24-core
     open-in-a-loop benchmarks

   - Use a consume fence in locks_inode_context() to match the
     store-release/load-consume idiom, eliminating a hardware fence on
     some architectures

   - Annotate cdev_lock with __cacheline_aligned_in_smp to prevent
     false-sharing

   - Remove a redundant DCACHE_MANAGED_DENTRY check in
     __follow_mount_rcu() that never fires since the caller already
     verifies it, eliminating a 100% mispredicted branch

   - Fix a 100% mispredicted likely() in devcgroup_inode_permission()
     that became wrong after a prior code reorder

  Bug fixes and correctness:

   - Make insert_inode_locked() wait for inode destruction instead of
     skipping, fixing a corner case where two matching inodes could
     exist in the hash

   - Move f_mode initialization before file_ref_init() in alloc_file()
     to respect the SLAB_TYPESAFE_BY_RCU ordering contract

   - Add a WARN_ON_ONCE guard in try_to_free_buffers() for folios with
     no buffers attached, preventing a null pointer dereference when
     AS_RELEASE_ALWAYS is set but no release_folio op exists

   - Fix select restart_block to store end_time as timespec64, avoiding
     truncation of tv_sec on 32-bit architectures

   - Make dump_inode() use get_kernel_nofault() to safely access inode
     and superblock fields, matching the dump_mapping() pattern

  API modernization:

   - Make posix_acl_to_xattr() allocate the buffer internally since
     every single caller was doing it anyway. Reduces boilerplate and
     unnecessary error checking across ~15 filesystems

   - Replace deprecated simple_strtoul() with kstrtoul() for the
     ihash_entries, dhash_entries, mhash_entries, and mphash_entries
     boot parameters, adding proper error handling

   - Convert chardev code to use guard(mutex) and __free(kfree) cleanup
     patterns

   - Replace min_t() with min() or umin() in VFS code to avoid silently
     truncating unsigned long to unsigned int

   - Gate LOOKUP_RCU assertions behind CONFIG_DEBUG_VFS since callers
     already check the flag

  Deprecation:

   - Begin deprecating legacy BSD process accounting (acct(2)). The
     interface has numerous footguns and better alternatives exist
     (eBPF)

  Documentation:

   - Fix and complete kernel-doc for struct export_operations, removing
     duplicated documentation between ReST and source

   - Fix kernel-doc warnings for __start_dirop() and ilookup5_nowait()

  Testing:

   - Add a kunit test for initramfs cpio handling of entries with
     filesize > PATH_MAX

  Misc:

   - Add missing <linux/init_task.h> include in fs_struct.c"

* tag 'vfs-7.0-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (28 commits)
  posix_acl: make posix_acl_to_xattr() alloc the buffer
  fs: make insert_inode_locked() wait for inode destruction
  initramfs_test: kunit test for cpio.filesize > PATH_MAX
  fs: improve dump_inode() to safely access inode fields
  fs: add <linux/init_task.h> for 'init_fs'
  docs: exportfs: Use source code struct documentation
  fs: move initializing f_mode before file_ref_init()
  exportfs: Complete kernel-doc for struct export_operations
  exportfs: Mark struct export_operations functions at kernel-doc
  exportfs: Fix kernel-doc output for get_name()
  acct(2): begin the deprecation of legacy BSD process accounting
  device_cgroup: remove branch hint after code refactor
  VFS: fix __start_dirop() kernel-doc warnings
  fs: Describe @isnew parameter in ilookup5_nowait()
  fs/namei: Remove redundant DCACHE_MANAGED_DENTRY check in __follow_mount_rcu
  fs: only assert on LOOKUP_RCU when built with CONFIG_DEBUG_VFS
  select: store end_time as timespec64 in restart block
  chardev: Switch to guard(mutex) and __free(kfree)
  namespace: Replace simple_strtoul with kstrtoul to parse boot params
  dcache: Replace simple_strtoul with kstrtoul in set_dhash_entries
  ...

Merge tag 'vfs-7.0-rc1.iomap' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs iomap updates from Christian Brauner:

- Erofs page cache sharing preliminaries:

   Plumb a void *private parameter through iomap_read_folio() and
   iomap_readahead() into iomap_iter->private, matching iomap DIO. Erofs
   uses this to replace a bogus kmap_to_page() call, as preparatory work
   for page cache sharing.

- Fix for invalid folio access:

   Fix an invalid folio access when a folio without iomap_folio_state
   is fully submitted to the IO helper — the helper may call
   folio_end_read() at any time, so ctx->cur_folio must be invalidated
   after full submission.

* tag 'vfs-7.0-rc1.iomap' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  iomap: fix invalid folio access after folio_end_read()
  erofs: hold read context in iomap_iter if needed
  iomap: stash iomap read ctx in the private field of iomap_iter

Merge tag 'vfs-7.0-rc1.namespace' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs mount updates from Christian Brauner:

- statmount: accept fd as a parameter

   Extend struct mnt_id_req with a file descriptor field and a new
   STATMOUNT_BY_FD flag. When set, statmount() returns mount information
   for the mount the fd resides on — including detached mounts
   (unmounted via umount2(MNT_DETACH)).

   For detached mounts the STATMOUNT_MNT_POINT and STATMOUNT_MNT_NS_ID
   mask bits are cleared since neither is meaningful. The capability
   check is skipped for STATMOUNT_BY_FD since holding an fd already
   implies prior access to the mount and equivalent information is
   available through fstatfs() and /proc/pid/mountinfo without
   privilege. Includes comprehensive selftests covering both attached
   and detached mount cases.

- fs: Remove internal old mount API code (1 patch)

   Now that every in-tree filesystem has been converted to the new
   mount API, remove all the legacy shim code in fs_context.c that
   handled unconverted filesystems. This deletes ~280 lines including
   legacy_init_fs_context(), the legacy_fs_context struct, and
   associated wrappers. The mount(2) syscall path for userspace remains
   untouched. Documentation references to the legacy callbacks are
   cleaned up.

- mount: add OPEN_TREE_NAMESPACE to open_tree()

   Container runtimes currently use CLONE_NEWNS to copy the caller's
   entire mount namespace — only to then pivot_root() and recursively
   unmount everything they just copied. With large mount tables and
   thousands of parallel container launches this creates significant
   contention on the namespace semaphore.

   OPEN_TREE_NAMESPACE copies only the specified mount tree (like
   OPEN_TREE_CLONE) but returns a mount namespace fd instead of a
   detached mount fd. The new namespace contains the copied tree mounted
   on top of a clone of the real rootfs.

   This functions as a combined unshare(CLONE_NEWNS) + pivot_root() in a
   single syscall. Works with user namespaces: an unshare(CLONE_NEWUSER)
   followed by OPEN_TREE_NAMESPACE creates a mount namespace owned by
   the new user namespace. Mount namespace file mounts are excluded from
   the copy to prevent cycles. Includes ~1000 lines of selftests"

* tag 'vfs-7.0-rc1.namespace' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  selftests/open_tree: add OPEN_TREE_NAMESPACE tests
  mount: add OPEN_TREE_NAMESPACE
  fs: Remove internal old mount API code
  selftests: statmount: tests for STATMOUNT_BY_FD
  statmount: accept fd as a parameter
  statmount: permission check should return EPERM

Merge tag 'vfs-7.0-rc1.atomic_open' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs atomic_open updates from Christian Brauner:
"Allow knfsd to use atomic_open()

  While knfsd offers combined exclusive create and open results to
  clients, on some filesystems those results are not atomic. The
  separate vfs_create() + vfs_open() sequence in dentry_create() can
  produce races and unexpected errors. For example, open O_CREAT with
  mode 0 will succeed in creating the file but return -EACCES from
  vfs_open(). Additionally, network filesystems benefit from reducing
  remote round-trip operations by using a single atomic_open() call.

  Teach dentry_create() -- whose sole caller is knfsd -- to use
  atomic_open() for filesystems that support it"

* tag 'vfs-7.0-rc1.atomic_open' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  fs/namei: fix kernel-doc markup for dentry_create
  VFS/knfsd: Teach dentry_create() to use atomic_open()
  VFS: Prepare atomic_open() for dentry_create()
  VFS: move dentry_create() from fs/open.c to fs/namei.c

Merge tag 'vfs-7.0-rc1.nullfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs nullfs update from Christian Brauner:
"Add a completely catatonic minimal pseudo filesystem called "nullfs"
  and make pivot_root() work in the initramfs.

  Currently pivot_root() does not work on the real rootfs because it
  cannot be unmounted. Userspace has to recursively delete initramfs
  contents manually before continuing boot, using the fragile
  switch_root sequence (overmount + chroot).

  Add nullfs, a minimal immutable filesystem that serves as the true
  root of the mount hierarchy. The mutable rootfs (tmpfs/ramfs) is
  mounted on top of it. This allows userspace to simply:

      chdir(new_root);
      pivot_root(".", ".");
      umount2(".", MNT_DETACH);

  without the traditional switch_root workarounds. systemd already
  handles this correctly. It tries pivot_root() first and falls back
  to MS_MOVE only when that fails.

  This also means rootfs mounts in unprivileged namespaces no longer
  need MNT_LOCKED, since the immutable nullfs guarantees nothing can be
  revealed by unmounting the covering mount.

  nullfs is a single-instance filesystem (get_tree_single()) marked
  SB_NOUSER | SB_I_NOEXEC | SB_I_NODEV with an immutable empty root
  directory. This means sooner or later it can be used to overmount
  other directories to hide their contents without any additional
  protection needed.

  We enable it unconditionally. If we see any real regression we'll
  hide it behind a boot option.

  nullfs has extensions beyond this in the future. It will serve as a
  concept to support the creation of completely empty mount namespaces -
  which is work coming up in the next cycle"

* tag 'vfs-7.0-rc1.nullfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  fs: use nullfs unconditionally as the real rootfs
  docs: mention nullfs
  fs: add immutable rootfs
  fs: add init_pivot_root()
  fs: ensure that internal tmpfs mount gets mount id zero

Merge tag 'vfs-7.0-rc1.minix' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull minix update from Christian Brauner:
"Consolidate and strengthen superblock validation in
  minix_check_superblock()

  The minix filesystem driver does not validate several superblock
  fields before using them during mount, allowing a crafted filesystem
  image to trigger out-of-bounds accesses (reported by syzbot)"

* tag 'vfs-7.0-rc1.minix' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  minix: Add required sanity checking to minix_check_superblock()

Merge tag 'vfs-7.0-rc1.btrfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs updates for btrfs from Christian Brauner:
"This contains some changes for btrfs that are taken to the vfs tree to
  stop duplicating VFS code for subvolume/snapshot dentry

  Btrfs has carried private copies of the VFS may_delete() and
  may_create() functions in fs/btrfs/ioctl.c for permission checks
  during subvolume creation and snapshot destruction. These copies have
  drifted out of sync with the VFS originals — btrfs_may_delete() is
  missing the uid/gid validity check and btrfs_may_create() is missing
  the audit_inode_child() call.

  Export the VFS functions as may_{create,delete}_dentry() and switch
  btrfs to use them, removing ~70 lines of duplicated code"

* tag 'vfs-7.0-rc1.btrfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  btrfs: use may_create_dentry() in btrfs_mksubvol()
  btrfs: use may_delete_dentry() in btrfs_ioctl_snap_destroy()
  fs: export may_create() as may_create_dentry()
  fs: export may_delete() as may_delete_dentry()

Merge tag 'vfs-7.0-rc1.fserror' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs error reporting updates from Christian Brauner:
"This contains the changes to support generic I/O error reporting.

  Filesystems currently have no standard mechanism for reporting
  metadata corruption and file I/O errors to userspace via fsnotify.
  Each filesystem (xfs, ext4, erofs, f2fs, etc.) privately defines
  EFSCORRUPTED, and error reporting to fanotify is inconsistent or
  absent entirely.

  This introduces a generic fserror infrastructure built around struct
  super_block that gives filesystems a standard way to queue metadata
  and file I/O error reports for delivery to fsnotify.

  Errors are queued via mempools and queue_work to avoid holding
  filesystem locks in the notification path; unmount waits for pending
  events to drain. A new super_operations::report_error callback lets
  filesystem drivers respond to file I/O errors themselves (to be used
  by an upcoming XFS self-healing patchset).

  On the uapi side, EFSCORRUPTED and EUCLEAN are promoted from private
  per-filesystem definitions to canonical errno.h values across all
  architectures"

* tag 'vfs-7.0-rc1.fserror' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  ext4: convert to new fserror helpers
  xfs: translate fsdax media errors into file "data lost" errors when convenient
  xfs: report fs metadata errors via fsnotify
  iomap: report file I/O errors to the VFS
  fs: report filesystem and file I/O errors to fsnotify
  uapi: promote EFSCORRUPTED and EUCLEAN to errno.h

Merge tag 'vfs-7.0-rc1.leases' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs lease updates from Christian Brauner:
"This contains updates for lease support to require filesystems to
  explicitly opt-in to lease support

  Currently kernel_setlease() falls through to generic_setlease() when a
  a filesystem does not define ->setlease(), silently granting lease
  support to every filesystem regardless of whether it is prepared for
  it.

  This is a poor default: most filesystems never intended to support
  leases, and the silent fallthrough makes it impossible to distinguish
  "supports leases" from "never thought about it".

  This inverts the default. It adds explicit

.setlease = generic_setlease;

  assignments to every in-tree filesystem that should retain lease
  support, then changes kernel_setlease() to return -EINVAL when
  ->setlease is NULL.

  With the new default in place, simple_nosetlease() is redundant and
  is removed along with all references to it"

* tag 'vfs-7.0-rc1.leases' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (25 commits)
  fuse: add setlease file operation
  fs: remove simple_nosetlease()
  filelock: default to returning -EINVAL when ->setlease operation is NULL
  xfs: add setlease file operation
  ufs: add setlease file operation
  udf: add setlease file operation
  tmpfs: add setlease file operation
  squashfs: add setlease file operation
  overlayfs: add setlease file operation
  orangefs: add setlease file operation
  ocfs2: add setlease file operation
  ntfs3: add setlease file operation
  nilfs2: add setlease file operation
  jfs: add setlease file operation
  jffs2: add setlease file operation
  gfs2: add a setlease file operation
  fat: add setlease file operation
  f2fs: add setlease file operation
  exfat: add setlease file operation
  ext4: add setlease file operation
  ...

Merge tag 'vfs-7.0-rc1.nonblocking_timestamps' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs timestamp updates from Christian Brauner:
"This contains the changes to support non-blocking timestamp updates.

  Since commit 66fa3cedf16a ("fs: Add async write file modification
  handling") file_update_time_flags() unconditionally returns -EAGAIN
  when any timestamp needs updating and IOCB_NOWAIT is set. This makes
  non-blocking direct writes impossible on file systems with granular
  enough timestamps, which in practice means all of them.

  This reworks the timestamp update path to propagate IOCB_NOWAIT
  through ->update_time so that file systems which can update timestamps
  without blocking are no longer penalized.

  With that groundwork in place, the core change passes IOCB_NOWAIT into
  ->update_time and returns -EAGAIN only when the file system indicates
  it would block.

  XFS implements non-blocking timestamp updates by using the new
  ->sync_lazytime and open-coding generic_update_time without the
  S_NOWAIT check, since the lazytime path through the generic helpers
  can never block in XFS"

* tag 'vfs-7.0-rc1.nonblocking_timestamps' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  xfs: enable non-blocking timestamp updates
  xfs: implement ->sync_lazytime
  fs: refactor file_update_time_flags
  fs: add support for non-blocking timestamp updates
  fs: add a ->sync_lazytime method
  fs: factor out a sync_lazytime helper
  fs: refactor ->update_time handling
  fat: cleanup the flags for fat_truncate_time
  nfs: split nfs_update_timestamps
  fs: allow error returns from generic_update_time
  fs: remove inode_update_time

Merge tag 'vfs-7.0-rc1.initrd' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs initrd removal from Christian Brauner:
"Remove the deprecated linuxrc-based initrd code path and related dead
  code. The linuxrc initrd path was deprecated in 2020 and this series
  completes its removal. If we see real-life regressions we'll revert.

  The core change removes handle_initrd() and init_linuxrc() — the
  entire flow that ran /linuxrc from an initrd, pivoted roots, and
  handed off to the real root filesystem. With that gone, initrd_load()
  becomes void (no longer short-circuits prepare_namespace()),
  rd_load_image() is simplified to always load /initrd.image instead of
  taking a path, and rd_load_disk() is deleted.

  The /proc/sys/kernel/real-root-dev sysctl and its backing variable are
  removed since they only existed for linuxrc to communicate the real
  root device back to the kernel.

  The no-op load_ramdisk= and prompt_ramdisk= parameters are dropped,
  and noinitrd and ramdisk_start= gain deprecation warnings.

  Initramfs is entirely unaffected. The non-linuxrc initrd path
  (root=/dev/ram0) is preserved but now carries a deprecation warning
  targeting January 2027 removal"

* tag 'vfs-7.0-rc1.initrd' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  init: remove /proc/sys/kernel/real-root-dev
  initrd: remove deprecated code path (linuxrc)
  init: remove deprecated "load_ramdisk" and "prompt_ramdisk" command line parameters

Merge tag 'vfs-7.0-rc1.rust' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs rust updates from Christian Brauner:
"Allow inlining C helpers into Rust when using LTO: Add the
  __rust_helper annotation to all VFS-related Rust helper functions.

  Currently, C helpers cannot be inlined into Rust code even under LTO
  because LLVM detects slightly different codegen options between the C
  and Rust compilation units (differing null-pointer-check flags,
  builtin lists, and target feature strings). The __rust_helper macro is
  the first step toward fixing this: it is currently #defined to
  nothing, but a follow-up series will change it to __always_inline when
  compiling with LTO (while keeping it empty for bindgen, which ignores
  inline functions).

  This picks up the VFS portion (fs, pid_namespace, poll) of a larger
  tree-wide series"

* tag 'vfs-7.0-rc1.rust' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  rust: poll: add __rust_helper to helpers
  rust: pid_namespace: add __rust_helper to helpers
  rust: fs: add __rust_helper to helpers

Merge tag 'selinux-pr-20260203' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull selinux updates from Paul Moore:

- Add support for SELinux based access control of BPF tokens

   We worked with the BPF devs to add the necessary LSM hooks when the
   BPF token code was first introduced, but it took us a bit longer to
   add the SELinux wiring and support.

   In order to preserve existing token-unaware SELinux policies, the new
   code is gated by the new "bpf_token_perms" policy capability.

   Additional details regarding the new permissions, and behaviors can
   be found in the associated commit.

- Remove a BUG() from the SELinux capability code

   We now perform a similar check during compile time so we can safely
   remove the BUG() call.

* tag 'selinux-pr-20260203' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinux: drop the BUG() in cred_has_capability()
  selinux: fix a capabilities parsing typo in selinux_bpf_token_capable()
  selinux: add support for BPF token access control
  selinux: move the selinux_blob_sizes struct

Merge tag 'lsm-pr-20260203' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm

Pull lsm updates from Paul Moore:

- Unify the security_inode_listsecurity() calls in NFSv4

   While looking at security_inode_listsecurity() with an eye towards
   improving the interface, we realized that the NFSv4 code was making
   multiple calls to the LSM hook that could be consolidated into one.

- Mark the LSM static branch keys as static - this helps resolve some
   sparse warnings

- Add __rust_helper annotations to the LSM and cred wrapper functions

- Remove the unsused set_security_override_from_ctx() function

- Minor fixes to some of the LSM kdoc comment blocks

* tag 'lsm-pr-20260203' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm:
  lsm: make keys for static branch static
  cred: remove unused set_security_override_from_ctx()
  rust: security: add __rust_helper to helpers
  rust: cred: add __rust_helper to helpers
  nfs: unify security_inode_listsecurity() calls
  lsm: fix kernel-doc struct member names

Merge tag 'audit-pr-20260203' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit

Pull audit updates from Paul Moore:

- Improve the NETFILTER_PKT audit records

   Add source and destination ports to the NETFILTER_PKT audit records
   while also consolidating a lot of the code into a new, singular
   audit_log_nf_skb() function. This new approach to structuring the
   NETFILTER_PKT record generation should eliminate some unnecessary
   overhead when audit is not built into the kernel.

- Update the audit syscall classifier code

   Add the listxattrat(), getxattrat(), and fchmodat2() syscall to the
   audit code which classifies syscalls into categories of operations,
   e.g. "read" or "change attributes".

- Move the syscall classifier declarations into audit_arch.h

   Shuffle around some header file declarations to resolve some sparse
   warnings.

* tag 'audit-pr-20260203' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
  audit: move the compat_xxx_class[] extern declarations to audit_arch.h
  audit: add missing syscalls to read class
  audit: include source and destination ports to NETFILTER_PKT
  audit: add audit_log_nf_skb helper function
  audit: add fchmodat2() to change attributes class

Merge tag 'tpmdd-next-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd

Pull tpm updates from Jarkko Sakkinen.

* tag 'tpmdd-next-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
tpm: st33zp24: Fix missing cleanup on get_burstcount() error
tpm: tpm_i2c_infineon: Fix locality leak on get_burstcount() failure

Merge tag 'i3c/for-6.20' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux

Pull i3c updates from Alexandre Belloni:
"Subsystem:
   - add sysfs entry and attribute for Device NACK Retry count

  Drivers:
   - dw: Device NACK Retry configuration knob
   - mipi-i3c-hci: support multi-bus instances, runtime PM, and suspend
   - renesas: suspend/resume support"

* tag 'i3c/for-6.20' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux: (52 commits)
  i3c: dw-i3c-master: fix SIR reject bit mapping for dynamic addresses
  i3c: dw-i3c-master: convert spinlock usage to scoped guards
  i3c: dw: Fix memory leak in dw_i3c_master_i2c_xfers()
  i3c: mipi-i3c-hci-pci: Add System Suspend support
  i3c: mipi-i3c-hci: Add optional System Suspend support
  i3c: master: Add i3c_master_do_daa_ext() for post-hibernation address recovery
  i3c: dw: Initialize spinlock to avoid upsetting lockdep
  i3c: mipi-i3c-hci-pci: Add Runtime PM support
  i3c: mipi-i3c-hci: Add optional Runtime PM support
  i3c: master: Introduce optional Runtime PM support
  i3c: mipi-i3c-hci: Factor out master dynamic address setting into helper
  i3c: mipi-i3c-hci: Allow core re-initialization for Runtime PM support
  i3c: mipi-i3c-hci: Factor out core initialization into helper
  i3c: mipi-i3c-hci: Factor out IO mode setting into helper
  i3c: mipi-i3c-hci: Factor out software reset into helper
  i3c: mipi-i3c-hci: Add PIO suspend and resume support
  i3c: mipi-i3c-hci: Refactor PIO register initialization
  i3c: mipi-i3c-hci: Add DMA suspend and resume support
  i3c: mipi-i3c-hci: Extract ring initialization from hci_dma_init()
  i3c: mipi-i3c-hci: Introduce helper to restore DAT
  ...

Merge tag 'rcu.release.v7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux

Pull RCU updates from Boqun Feng:

- RCU Tasks Trace:

   Re-implement RCU tasks trace in term of SRCU-fast, not only more than
   500 lines of code are saved because of the reimplementation, a new
   set of API, rcu_read_{,un}lock_tasks_trace(), becomes possible as
   well. Compared to the previous rcu_read_{,un}lock_trace(), the new
   API avoid the task_struct accesses thanks to the SRCU-fast semantics.

   As a result, the old rcu_read{,un}lock_trace() API is now deprecated.

- RCU Torture Test:
    - Multiple improvements on kvm-series.sh (parallel run and
      progress showing metrics)
    - Add context checks to rcu_torture_timer()
    - Make config2csv.sh properly handle comments in .boot files
    - Include commit discription in testid.txt

- Miscellaneous RCU changes:
    - Reduce synchronize_rcu() latency by reporting GP kthread's
      CPU QS early
    - Use suitable gfp_flags for the init_srcu_struct_nodes()
    - Fix rcu_read_unlock() deadloop due to softirq
    - Correctly compute probability to invoke ->exp_current()
      in rcutorture
    - Make expedited RCU CPU stall warnings detect stall-end races

- RCU nocb:
    - Remove unnecessary WakeOvfIsDeferred wake path and callback
      overload handling
    - Extract nocb_defer_wakeup_cancel() helper

* tag 'rcu.release.v7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux: (25 commits)
  rcu/nocb: Extract nocb_defer_wakeup_cancel() helper
  rcu/nocb: Remove dead callback overload handling
  rcu/nocb: Remove unnecessary WakeOvfIsDeferred wake path
  rcu: Reduce synchronize_rcu() latency by reporting GP kthread's CPU QS early
  srcu: Use suitable gfp_flags for the init_srcu_struct_nodes()
  rcu: Fix rcu_read_unlock() deadloop due to softirq
  rcutorture: Correctly compute probability to invoke ->exp_current()
  rcu: Make expedited RCU CPU stall warnings detect stall-end races
  rcutorture: Add --kill-previous option to terminate previous kvm.sh runs
  rcutorture: Prevent concurrent kvm.sh runs on same source tree
  torture: Include commit discription in testid.txt
  torture: Make config2csv.sh properly handle comments in .boot files
  torture: Make kvm-series.sh give run numbers and totals
  torture: Make kvm-series.sh give build numbers and totals
  torture: Parallelize kvm-series.sh guest-OS execution
  rcutorture: Add context checks to rcu_torture_timer()
  rcutorture: Test rcu_tasks_trace_expedite_current()
  srcu: Create an rcu_tasks_trace_expedite_current() function
  checkpatch: Deprecate rcu_read_{,un}lock_trace()
  rcu: Update Requirements.rst for RCU Tasks Trace
  ...

Merge tag 'linux_kselftest-next-6.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kselftest updates from Shuah Khan:
"resctrl test:
   - fix division by zero error on Hygon
   - fix non-contiguous CBM check for Hygon
   - define CPU vendor IDs as bits to match usage
   - add CPU vendor detection for Hygon

  misc:
   - coredeump test: use __builtin_trap() instead of a null pointer
   - anon_inode: replace null pointers with empty arrays
   - kublk: include message in _Static_assert for C11 compatibility
   - run_kselftest.sh: add `--skip` argument option
   - pidfd: fix typo in comment"

* tag 'linux_kselftest-next-6.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests/pidfd: fix typo in comment
  selftests/run_kselftest.sh: Add `--skip` argument option
  selftests/resctrl: Fix non-contiguous CBM check for Hygon
  selftests/resctrl: Add CPU vendor detection for Hygon
  selftests/resctrl: Define CPU vendor IDs as bits to match usage
  selftests/resctrl: Fix a division by zero error on Hygon
  kselftest/kublk: include message in _Static_assert for C11 compatibility
  kselftest/anon_inode: replace null pointers with empty arrays
  kselftest/coredump: use __builtin_trap() instead of null pointer

Merge tag 'linux_kselftest-kunit-6.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kunit updates from Shuah Khan:
"kunit:
   - add __rust_helper to helpers
   - fix up const mismatch in many assert functions
   - fix up const mismatch in test_list_sort
   - protect KUNIT_BINARY_STR_ASSERTION against ERR_PTR values
   - respect KBUILD_OUTPUT env variable by default
   - add bash completion

  kunit tool:
   - add test for nested test result reporting
   - do not overwrite test status based on subtest counts
   - add 32-bit big endian ARM configuration to qemu_configs
   - rename test_data_path() to _test_data_path()
   - do not rely on implicit working directory change"

* tag 'linux_kselftest-kunit-6.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: add bash completion
  kunit: tool: test: Don't rely on implicit working directory change
  kunit: tool: test: Rename test_data_path() to _test_data_path()
  kunit: qemu_configs: Add 32-bit big endian ARM configuration
  kunit: tool: Don't overwrite test status based on subtest counts
  kunit: tool: Add test for nested test result reporting
  kunit: respect KBUILD_OUTPUT env variable by default
  kunit: Protect KUNIT_BINARY_STR_ASSERTION against ERR_PTR values
  test_list_sort: fix up const mismatch
  kunit: fix up const mis-match in many assert functions
  rust: kunit: add __rust_helper to helpers

Merge tag 'auxdisplay-v6.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/andy/linux-auxdisplay

Pull auxdisplay updates from Andy Shevchenko:

- A good refactoring of arm-charlcd to use modern kernel APIs

* tag 'auxdisplay-v6.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/andy/linux-auxdisplay:
  auxdisplay: max6959: Replace slab.h with device/devres.h
  auxdisplay: arm-charlcd: Remove redundant ternary operators
  auxdisplay: arm-charlcd: Join string literals back
  auxdisplay: arm-charlcd: Use readl_poll_timeout
  auxdisplay: arm-charlcd: Don't use "proxy" headers
  auxdisplay: arm-charlcd: drop of_match_ptr()
  auxdisplay: arm-charlcd: Remove unneeded info message
  auxdisplay: arm-charlcd: convert to use device managed APIs
  auxdisplay: arm-charlcd: fix release_mem_region() size

Linux 6.19

Merge tag 'i2c-for-6.19-final' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c fix from Wolfram Sang:

- imx: preserve error state during SMBus block read length handling

* tag 'i2c-for-6.19-final' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: imx: preserve error state in block data length handler

Merge tag 'spi-fix-v6.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi

Pull spi fixes from Mark Brown:
"One final batch of fixes for the Tegra SPI drivers, the main one is a
  batch of fixes for races with the interrupts in the Tegra210 QSPI
  driver that Breno has been working on for a while"

* tag 'spi-fix-v6.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: tegra114: Preserve SPI mode bits in def_command1_reg
  spi: tegra: Fix a memory leak in tegra_slink_probe()
  spi: tegra210-quad: Protect curr_xfer check in IRQ handler
  spi: tegra210-quad: Protect curr_xfer clearing in tegra_qspi_non_combined_seq_xfer
  spi: tegra210-quad: Protect curr_xfer in tegra_qspi_combined_seq_xfer
  spi: tegra210-quad: Protect curr_xfer assignment in tegra_qspi_setup_transfer_one
  spi: tegra210-quad: Move curr_xfer read inside spinlock
  spi: tegra210-quad: Return IRQ_HANDLED when timeout already processed transfer

Merge tag 'regulator-fix-v6.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator fix from Mark Brown:
"One last fix for v6.19: the voltages for the SpaceMIT P1 were not
described correctly"

* tag 'regulator-fix-v6.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: spacemit-p1: Fix n_voltages for BUCK and LDO regulators

Merge tag 'char-misc-6.19-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull binder fixes from Greg KH:
"Here are some small, last-minute binder C and Rust driver fixes for
  reported issues. They include a number of fixes for reported crashes
  and other problems.

  All of these have been in linux-next this week, and longer"

* tag 'char-misc-6.19-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  binderfs: fix ida_alloc_max() upper bound
  rust_binderfs: fix ida_alloc_max() upper bound
  binder: fix BR_FROZEN_REPLY error log
  rust_binder: add additional alignment checks
  binder: fix UAF in binder_netlink_report()
  rust_binder: correctly handle FDA objects of length zero

Merge tag 'sched-urgent-2026-02-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fixes from Ingo Molnar:
"Miscellaneous MMCID fixes to address bugs and performance regressions
  in the recent rewrite of the SCHED_MM_CID management code:

   - Fix livelock triggered by BPF CI testing

   - Fix hard lockup on weakly ordered systems

   - Simplify the dropping of CIDs in the exit path by removing an
     unintended transition phase

   - Fix performance/scalability regression on a thread-pool benchmark
     by optimizing transitional CIDs when scheduling out"

* tag 'sched-urgent-2026-02-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/mmcid: Optimize transitional CIDs when scheduling out
  sched/mmcid: Drop per CPU CID immediately when switching to per task mode
  sched/mmcid: Protect transition on weakly ordered systems
  sched/mmcid: Prevent live lock on task to CPU mode transition

Merge tag 'objtool-urgent-2026-02-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool fixes from Ingo Molnar::

- Bump up the Clang minimum version requirements for livepatch
   builds, due to Clang assembler section handling bugs causing
   silent miscompilations

- Strip livepatching symbol artifacts from non-livepatch modules

- Fix livepatch build warnings when certain Clang LTO options
   are enabled

- Fix livepatch build error when CONFIG_MEM_ALLOC_PROFILING_DEBUG=y

* tag 'objtool-urgent-2026-02-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool/klp: Fix unexported static call key access for manually built livepatch modules
  objtool/klp: Fix symbol correlation for orphaned local symbols
  livepatch: Free klp_{object,func}_ext data after initialization
  livepatch: Fix having __klp_objects relics in non-livepatch modules
  livepatch/klp-build: Require Clang assembler >= 20

hfsplus: avoid double unload_nls() on mount failure

The recent commit "hfsplus: ensure sb->s_fs_info is always cleaned up"
[1] introduced a custom ->kill_sb() handler (hfsplus_kill_super) that
cleans up the s_fs_info structure (including the NLS table) on
superblock destruction.

However, the error handling path in hfsplus_fill_super() still calls
unload_nls() before returning an error. Since the VFS layer calls
->kill_sb() when fill_super fails, this results in unload_nls() being
called twice for the same sbi->nls pointer: once in hfsplus_fill_super()
and again in hfsplus_kill_super() (via delayed_free).

Remove the explicit unload_nls() call from the error path in
hfsplus_fill_super() to rely solely on the cleanup in ->kill_sb().

[1] https://lore.kernel.org/r/20251201222843.82310-3-mehdi.benhadjkhelifa@gmail.com/

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Link: https://lore.kernel.org/r/20260203043806.GF3183987@ZenIV/
Signed-off-by: Shardul Bankar <shardul.b@mpiricsoftware.com>
Link: https://lore.kernel.org/r/20260204170440.1337261-1-shardul.b@mpiricsoftware.com
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>

x86/vmware: Fix hypercall clobbers

Fedora QA reported the following panic:

  BUG: unable to handle page fault for address: 0000000040003e54
  #PF: supervisor write access in kernel mode
  #PF: error_code(0x0002) - not-present page
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS edk2-20251119-3.fc43 11/19/2025
  RIP: 0010:vmware_hypercall4.constprop.0+0x52/0x90
  ..
  Call Trace:
   vmmouse_report_events+0x13e/0x1b0
   psmouse_handle_byte+0x15/0x60
   ps2_interrupt+0x8a/0xd0
   ...

because the QEMU VMware mouse emulation is buggy, and clears the top 32
bits of %rdi that the kernel kept a pointer in.

The QEMU vmmouse driver saves and restores the register state in a
"uint32_t data[6];" and as a result restores the state with the high
bits all cleared.

RDI originally contained the value of a valid kernel stack address
(0xff5eeb3240003e54).  After the vmware hypercall it now contains
0x40003e54, and we get a page fault as a result when it is dereferenced.

The proper fix would be in QEMU, but this works around the issue in the
kernel to keep old setups working, when old kernels had not happened to
keep any state in %rdi over the hypercall.

In theory this same issue exists for all the hypercalls in the vmmouse
driver; in practice it has only been seen with vmware_hypercall3() and
vmware_hypercall4().  For now, just mark RDI/RSI as clobbered for those
two calls.  This should have a minimal effect on code generation overall
as it should be rare for the compiler to want to make RDI/RSI live
across hypercalls.

Reported-by: Justin Forbes <jforbes@fedoraproject.org>
Link: https://lore.kernel.org/all/99a9c69a-fc1a-43b7-8d1e-c42d6493b41f@broadcom.com/
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'mm-hotfixes-stable-2026-02-06-12-37' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull hotfixes from Andrew Morton:
"A couple of late-breaking MM fixes. One against a new-in-this-cycle
  patch and the other addresses a locking issue which has been there for
  over a year"

* tag 'mm-hotfixes-stable-2026-02-06-12-37' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mm/memory-failure: reject unsupported non-folio compound page
  procfs: avoid fetching build ID while holding VMA lock

Merge tag 'trace-v6.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing fix from Steven Rostedt:

- Fix event format field alignments for 32 bit architectures

   The fields in the event format files are used to parse the raw binary
   buffer data by applications. If they are incorrect, then the
   application produces garbage.

   On 32 bit architectures, the function graph 64bit calltime and
   rettime were off by 4bytes. That's because the actual fields are in a
   packed structure but the macros used by the ftrace events did not
   mark them as packed, and instead, gave them their natural alignment
   which made their offsets off by 4 bytes.

   There are macros to have a packed field within an embedded structure
   of an event, but there's no macro for normal fields within a packed
   structure of the event. The macro __field_packed() was used for the
   packed embedded structure field. Rename that to __field_desc_packed()
   (to match the non-packed embedded field macro __field_desc()), and
   make __field_packed() for fields that are in a packed event structure
   (which matches the unpacked __field() macro).

   Switch the calltime and rettime fields of the function graph event to
   use the new __field_packed() and this makes the offsets correct.

* tag 'trace-v6.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: Fix ftrace event field alignments

Merge tag 'ceph-for-6.19-rc9' of https://github.com/ceph/ceph-client

Pull ceph fixes from Ilya Dryomov:
"One RBD and two CephFS fixes which address potential oopses.

  The RBD thing is more of a rare edge case that pops up in our CI,
  while the two CephFS scenarios are regressions that were reported by
  users and can be triggered trivially in normal operation. All marked
  for stable"

* tag 'ceph-for-6.19-rc9' of https://github.com/ceph/ceph-client:
  ceph: fix NULL pointer dereference in ceph_mds_auth_match()
  ceph: fix oops due to invalid pointer for kfree() in parse_longname()
  rbd: check for EOD after exclusive lock is ensured to be held

Merge tag 'dma-mapping-6.19-2026-02-06' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux

Pull dma-mapping fixes from Marek Szyprowski:
"Two minor fixes for the DMA-mapping subsystem:

   - check for the rare case of the allocation failure of the global CMA
     pool (Shanker Donthineni)

   - avoid perf buffer overflow when tracing large scatter-gather lists
     (Deepanshu Kartikey)"

* tag 'dma-mapping-6.19-2026-02-06' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux:
  dma: contiguous: Check return value of dma_contiguous_reserve_area()
  tracing/dma: Cap dma_map_sg tracepoint arrays to prevent buffer overflow

Merge tag 'iommu-fix-v6.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux

Pull iommu fix from Joerg Roedel:

- Fix wrong definition of PASID_FLAG_PWSNP bit. This caused DMAR errors
on Arrow Lake platforms.

* tag 'iommu-fix-v6.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux:
iommu/vt-d: Treat PAGE_SNOOP and PWSNP separately

Merge tag 'pmdomain-v6.19-rc3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm

Pull pmdomain fixes from Ulf Hansson:

- imx:
     - Fix system wakeup support for imx8mp power domains
     - Fix potential out-of-range access for imx8m power domains
     - Fix the imx8mm gpu hang

- qcom: Fix off-by-one error for highest state in rpmpd

* tag 'pmdomain-v6.19-rc3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
  pmdomain: imx8mp-blk-ctrl: Keep usb phy power domain on for system wakeup
  pmdomain: imx8mp-blk-ctrl: Keep gpc power domain on for system wakeup
  pmdomain: imx8m-blk-ctrl: fix out-of-range access of bc->domains
  pmdomain: imx: gpcv2: Fix the imx8mm gpu hang due to wrong adb400 reset
  pmdomain: qcom: rpmpd: fix off-by-one error in clamping to the highest state

Merge tag 'gpio-fixes-for-v6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux

Pull gpio fixes from Bartosz Golaszewski:

- fix incorrect retval check in gpio-loongson-64bit

- fix GPIO counting with ACPI

* tag 'gpio-fixes-for-v6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio: loongson-64bit: Fix incorrect NULL check after devm_kcalloc()
gpiolib: acpi: Fix gpio count with string references

Merge tag 'sound-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
"A collection of small fixes. It became a bit larger than wished, but
  all of them are device-specific small fixes, and it should be still
  fairly safe to take at the last minute.

  Included are a few quirks and fixes for Intel, AMD, HD-audio, and
  USB-audio, as well as a race fix in aloop driver and corrections of
  Cirrus firmware kunit test"

* tag 'sound-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda/realtek: Enable headset mic for Acer Nitro 5
  ASoC: fsl_xcvr: fix missing lock in fsl_xcvr_mode_put()
  ASoC: dt-bindings: ti,tlv320aic3x: Add compatible string ti,tlv320aic23
  ASoC: amd: fix memory leak in acp3x pdm dma ops
  ALSA: usb-audio: fix broken logic in snd_audigy2nx_led_update()
  ALSA: aloop: Fix racy access at PCM trigger
  ASoC: rt1320: fix intermittent no-sound issue
  ASoC: SOF: Intel: use hdev->info.link_mask directly
  firmware: cs_dsp: rate-limit log messages in KUnit builds
  ASoC: amd: yc: Add quirk for HP 200 G2a 16
  ASoC: cs42l43: Correct handling of 3-pole jack load detection
  ASoC: Intel: sof_es8336: Add DMI quirk for Huawei BOD-WXX9
  ASoC: sof_sdw: Add a quirk for Lenovo laptop using sidecar amps with cs42l43

Merge tag 'slab-for-6.19-rc8-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab

Pull slab fix from Vlastimil Babka:
"A stable fix for memory allocation profiling tag not being cleared
when aborting an allocation due to memcg charge failure (Hao Ge)"

* tag 'slab-for-6.19-rc8-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
mm/slab: Add alloc_tagging_slab_free_hook for memcg_alloc_abort_single

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux

Pull ARM fix from Russell King:
"Just one fix for memset64() on big endian 32-bit ARM systems"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux:
ARM: 9468/1: fix memset64() on big-endian

iommu/vt-d: Treat PAGE_SNOOP and PWSNP separately

The PASID_FLAG_PAGE_SNOOP and PASID_FLAG_PWSNP constants are identical.
This will cause the pasid code to always set both or neither of the
PGSNP and PWSNP bits in PASID table entries. However, PWSNP is a
reserved bit if SMPWC is not set in the IOMMU's extended capability
register, even if SC is supported.

This has resulted in DMAR errors when testing the iommufd code on an
Arrow Lake platform. With this patch, those errors disappear and the
PASID table entries look correct.

Fixes: 101a2854110fa ("iommu/vt-d: Follow PT_FEAT_DMA_INCOHERENT into the PASID entry")
Cc: stable@vger.kernel.org
Signed-off-by: Viktor Kleen <viktor@kleen.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20260202192109.1665799-1-viktor@kleen.org
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>

mm/slab: Add alloc_tagging_slab_free_hook for memcg_alloc_abort_single

When CONFIG_MEM_ALLOC_PROFILING_DEBUG is enabled, the following warning
may be noticed:

[ 3959.023862] ------------[ cut here ]------------
[ 3959.023891] alloc_tag was not cleared (got tag for lib/xarray.c:378)
[ 3959.023947] WARNING: ./include/linux/alloc_tag.h:155 at alloc_tag_add+0x128/0x178, CPU#6: mkfs.ntfs/113998
[ 3959.023978] Modules linked in: dns_resolver tun brd overlay exfat btrfs blake2b libblake2b xor xor_neon raid6_pq loop sctp ip6_udp_tunnel udp_tunnel ext4 crc16 mbcache jbd2 rfkill sunrpc vfat fat sg fuse nfnetlink sr_mod virtio_gpu cdrom drm_client_lib virtio_dma_buf drm_shmem_helper drm_kms_helper ghash_ce drm sm4 backlight virtio_net net_failover virtio_scsi failover virtio_console virtio_blk virtio_mmio dm_mirror dm_region_hash dm_log dm_multipath dm_mod i2c_dev aes_neon_bs aes_ce_blk [last unloaded: hwpoison_inject]
[ 3959.024170] CPU: 6 UID: 0 PID: 113998 Comm: mkfs.ntfs Kdump: loaded Tainted: G        W           6.19.0-rc7+ #7 PREEMPT(voluntary)
[ 3959.024182] Tainted: [W]=WARN
[ 3959.024186] Hardware name: QEMU KVM Virtual Machine, BIOS unknown 2/2/2022
[ 3959.024192] pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 3959.024199] pc : alloc_tag_add+0x128/0x178
[ 3959.024207] lr : alloc_tag_add+0x128/0x178
[ 3959.024214] sp : ffff80008b696d60
[ 3959.024219] x29: ffff80008b696d60 x28: 0000000000000000 x27: 0000000000000240
[ 3959.024232] x26: 0000000000000000 x25: 0000000000000240 x24: ffff800085d17860
[ 3959.024245] x23: 0000000000402800 x22: ffff0000c0012dc0 x21: 00000000000002d0
[ 3959.024257] x20: ffff0000e6ef3318 x19: ffff800085ae0410 x18: 0000000000000000
[ 3959.024269] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[ 3959.024281] x14: 0000000000000000 x13: 0000000000000001 x12: ffff600064101293
[ 3959.024292] x11: 1fffe00064101292 x10: ffff600064101292 x9 : dfff800000000000
[ 3959.024305] x8 : 00009fff9befed6e x7 : ffff000320809493 x6 : 0000000000000001
[ 3959.024316] x5 : ffff000320809490 x4 : ffff600064101293 x3 : ffff800080691838
[ 3959.024328] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0000d5bcd640
[ 3959.024340] Call trace:
[ 3959.024346]  alloc_tag_add+0x128/0x178 (P)
[ 3959.024355]  __alloc_tagging_slab_alloc_hook+0x11c/0x1a8
[ 3959.024362]  kmem_cache_alloc_lru_noprof+0x1b8/0x5e8
[ 3959.024369]  xas_alloc+0x304/0x4f0
[ 3959.024381]  xas_create+0x1e0/0x4a0
[ 3959.024388]  xas_store+0x68/0xda8
[ 3959.024395]  __filemap_add_folio+0x5b0/0xbd8
[ 3959.024409]  filemap_add_folio+0x16c/0x7e0
[ 3959.024416]  __filemap_get_folio_mpol+0x2dc/0x9e8
[ 3959.024424]  iomap_get_folio+0xfc/0x180
[ 3959.024435]  __iomap_get_folio+0x2f8/0x4b8
[ 3959.024441]  iomap_write_begin+0x198/0xc18
[ 3959.024448]  iomap_write_iter+0x2ec/0x8f8
[ 3959.024454]  iomap_file_buffered_write+0x19c/0x290
[ 3959.024461]  blkdev_write_iter+0x38c/0x978
[ 3959.024470]  vfs_write+0x4d4/0x928
[ 3959.024482]  ksys_write+0xfc/0x1f8
[ 3959.024489]  __arm64_sys_write+0x74/0xb0
[ 3959.024496]  invoke_syscall+0xd4/0x258
[ 3959.024507]  el0_svc_common.constprop.0+0xb4/0x240
[ 3959.024514]  do_el0_svc+0x48/0x68
[ 3959.024520]  el0_svc+0x40/0xf8
[ 3959.024526]  el0t_64_sync_handler+0xa0/0xe8
[ 3959.024533]  el0t_64_sync+0x1ac/0x1b0
[ 3959.024540] ---[ end trace 0000000000000000 ]---

When __memcg_slab_post_alloc_hook() fails, there are two different
free paths depending on whether size == 1 or size != 1. In the
kmem_cache_free_bulk() path, we do call alloc_tagging_slab_free_hook().
However, in memcg_alloc_abort_single() we don't, the above warning will be
triggered on the next allocation.

Therefore, add alloc_tagging_slab_free_hook() to the
memcg_alloc_abort_single() path.

Fixes: 9f9796b413d3 ("mm, slab: move memcg charging to post-alloc hook")
Cc: stable@vger.kernel.org
Suggested-by: Hao Li <hao.li@linux.dev>
Signed-off-by: Hao Ge <hao.ge@linux.dev>
Reviewed-by: Hao Li <hao.li@linux.dev>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
Link: https://patch.msgid.link/20260204101401.202762-1-hao.ge@linux.dev
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>

Merge tag 'hwmon-for-v6.19-final' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Pull hwmon fixes from Guenter Roeck:

- occ: Mark occ_init_attribute() as __printf to avoid build failure due
   to '-Werror=suggest-attribute=format'

- gpio-fan: Allow to stop fans when CONFIG_PM is disabled, and fix
   set_rpm() return value

- acpi_power_meter: Fix deadlocks related to acpi_power_meter_notify()

- dell-smm: Add Dell G15 5510 to fan control whitelist

* tag 'hwmon-for-v6.19-final' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (occ) Mark occ_init_attribute() as __printf
  hwmon: (gpio-fan) Allow to stop FANs when CONFIG_PM is disabled
  hwmon: (gpio-fan) Fix set_rpm() return value
  hwmon: (acpi_power_meter) Fix deadlocks related to acpi_power_meter_notify()
  hwmon: (dell-smm) Add Dell G15 5510 to fan control whitelist

Merge tag 'drm-fixes-2026-02-06' of https://gitlab.freedesktop.org/drm/kernel

Pull drm fixes from Dave Airlie:
"The usual xe/amdgpu selection, and a couple of misc changes for
  gma500, mgag200 and bridge. There is a nouveau revert, and also a set
  of changes that fix a regression since we moved to 570 firmware.
  Suspend/resume was broken on a bunch of GPUs. The fix looks big, but
  it's mostly just refactoring to pass an extra bit down the nouveau
  abstractions to the firmware command.

  amdgpu:
   - MES 11 old firmware compatibility fix
   - ASPM fix
   - DC LUT fixes

  amdkfd:
   - Fix possible double deletion of validate list

  xe:
   - Fix topology query pointer advance
   - A couple of kerneldoc fixes
   - Disable D3Cold for BMG only on specific platforms
   - Fix CFI violation in debugfs access

  nouveau:
   - Revert adding atomic commit functions as it regresses pre-nv50
   - Fix suspend/resume bugs exposed by enabling 570 firmware

  gma500:
   - Revert a regression caused by vblank changes

  mgag200:
   - Replace a busy loop with a polling loop to fix that blocking 1 cpu
     for 300 ms roughly every 20 minutes

bridge:
   - imx8mp-hdmi-pa: Use runtime pm to fix a bug in channel ordering"

* tag 'drm-fixes-2026-02-06' of https://gitlab.freedesktop.org/drm/kernel:
  drm/xe/guc: Fix CFI violation in debugfs access.
  drm/bridge: imx8mp-hdmi-pai: enable PM runtime
  drm/xe/pm: Disable D3Cold for BMG only on specific platforms
  drm/xe: Fix kerneldoc for xe_tlb_inval_job_alloc_dep
  drm/xe: Fix kerneldoc for xe_gt_tlb_inval_init_early
  drm/xe: Fix kerneldoc for xe_migrate_exec_queue
  drm/xe/query: Fix topology query pointer advance
  drm/mgag200: fix mgag200_bmc_stop_scanout()
  nouveau/gsp: fix suspend/resume regression on r570 firmware
  nouveau: add a third state to the fini handler.
  nouveau/gsp: use rpc sequence numbers properly.
  drm/amdgpu: Fix double deletion of validate_list
  drm/amd/display: remove assert around dpp_base replacement
  drm/amd/display: extend delta clamping logic to CM3 LUT helper
  drm/amd/display: fix wrong color value mapping on MCM shaper LUT
  Revert "drm/amd: Check if ASPM is enabled from PCIe subsystem"
  drm/amd: Set minimum version for set_hw_resource_1 on gfx11 to 0x52
  Revert "drm/gma500: use drm_crtc_vblank_crtc()"
  Revert "drm/nouveau/disp: Set drm_mode_config_funcs.atomic_(check|commit)"

Merge tag 'amd-drm-fixes-6.19-2026-02-05' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-6.19-2026-02-05:

amdgpu:
- MES 11 old firmware compatibility fix
- ASPM fix
- DC LUT fixes

amdkfd:
- Fix possible double deletion of validate list

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patch.msgid.link/20260205182017.2409773-1-alexander.deucher@amd.com

Merge tag 'drm-xe-fixes-2026-02-05' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes

Driver Changes:
- Fix topology query pointer advance (Shuicheng)
- A couple of kerneldoc fixes (Shuicheng)
- Disable D3Cold for BMG only on specific platforms (Karthik)
- Fix CFI violation in debugfs access (Daniele)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Link: https://patch.msgid.link/aYS2v12R8ELQoTiZ@fedora

Merge tag 'drm-misc-fixes-2026-02-05' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes

drm-misc-fixes for v6.19 final:

nouveau
-------
Revert adding atomic commit functions as it regresses pre-nv50.
Fix bugs exposed by enabling 570 firmware.

gma500
------
Revert a regression caused by vblank changes.

mgag200
-------
Replace a busy loop with a polling loop to fix that blocking 1 cpu for 300 ms roughly every 20 minutes.

bridge
------
imx8mp-hdmi-pa: Use runtime pm to fix a bug in channel ordering.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patch.msgid.link/c0077ea5-faeb-4b0c-bd4a-ea2384d6dc0c@linux.intel.com

Merge tag 'block-6.19-20260205' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

Pull block fixes from Jens Axboe:

- Revert of a change for loop, which caused regressions for some users
   (Actually revert of two commits, where one is just an existing fix
   for the offending commit)

- NVMe pull via Keith:
      - Fix NULL pointer access setting up dma mappings
      - Fix invalid memory access from malformed TCP PDU

* tag 'block-6.19-20260205' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  loop: revert exclusive opener loop status change
  nvmet-tcp: add bounds checks in nvmet_tcp_build_pdu_iovec
  nvme-pci: handle changing device dma map requirements

Merge tag 'io_uring-6.19-20260205' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

Pull io_uring fixes from Jens Axboe:

- Two small fixes for zcrx

- Two small fixes for fdinfo - one is just killing a superflous newline

* tag 'io_uring-6.19-20260205' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  io_uring/fdinfo: be a bit nicer when looping a lot of SQEs/CQEs
  io_uring/fdinfo: kill unnecessary newline feed in CQE32 printing
  io_uring/zcrx: fix rq flush locking
  io_uring/zcrx: fix page array leak

mm/memory-failure: reject unsupported non-folio compound page

When !CONFIG_TRANSPARENT_HUGEPAGE, a non-folio compound page can appear in
a userspace mapping via either vm_insert_*() functions or
vm_operatios_struct->fault(). They are not folios, thus should not be
considered for folio operations like split. To reject these pages, make
sure get_hwpoison_page() is always called as HWPoisonHandlable() will do
the right work.

[Some commit log borrowed from Zi Yan. Thanks.]

Link: https://lkml.kernel.org/r/20260205075328.523211-1-linmiaohe@huawei.com
Fixes: 689b8986776c ("mm/memory-failure: improve large block size folio handling")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reported-by: 是参差 <shicenci@gmail.com>
Closes: https://lore.kernel.org/all/PS1PPF7E1D7501F1E4F4441E7ECD056DEADAB98A@PS1PPF7E1D7501F.apcprd02.prod.outlook.com/
Reviewed-by: Zi Yan <ziy@nvidia.com>
Tested-by: Zi Yan <ziy@nvidia.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Jane Chu <jane.chu@oracle.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

procfs: avoid fetching build ID while holding VMA lock

Fix PROCMAP_QUERY to fetch optional build ID only after dropping mmap_lock
or per-VMA lock, whichever was used to lock VMA under question, to avoid
deadlock reported by syzbot:

-> #1 (&mm->mmap_lock){++++}-{4:4}:
        __might_fault+0xed/0x170
        _copy_to_iter+0x118/0x1720
        copy_page_to_iter+0x12d/0x1e0
        filemap_read+0x720/0x10a0
        blkdev_read_iter+0x2b5/0x4e0
        vfs_read+0x7f4/0xae0
        ksys_read+0x12a/0x250
        do_syscall_64+0xcb/0xf80
        entry_SYSCALL_64_after_hwframe+0x77/0x7f

-> #0 (&sb->s_type->i_mutex_key#8){++++}-{4:4}:
        __lock_acquire+0x1509/0x26d0
        lock_acquire+0x185/0x340
        down_read+0x98/0x490
        blkdev_read_iter+0x2a7/0x4e0
        __kernel_read+0x39a/0xa90
        freader_fetch+0x1d5/0xa80
        __build_id_parse.isra.0+0xea/0x6a0
        do_procmap_query+0xd75/0x1050
        procfs_procmap_ioctl+0x7a/0xb0
        __x64_sys_ioctl+0x18e/0x210
        do_syscall_64+0xcb/0xf80
        entry_SYSCALL_64_after_hwframe+0x77/0x7f

other info that might help us debug this:

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   rlock(&mm->mmap_lock);
                                lock(&sb->s_type->i_mutex_key#8);
                                lock(&mm->mmap_lock);
   rlock(&sb->s_type->i_mutex_key#8);

  *** DEADLOCK ***

This seems to be exacerbated (as we haven't seen these syzbot reports
before that) by the recent:

777a8560fd29 ("lib/buildid: use __kernel_read() for sleepable context")

To make this safe, we need to grab file refcount while VMA is still locked, but
other than that everything is pretty straightforward. Internal build_id_parse()
API assumes VMA is passed, but it only needs the underlying file reference, so
just add another variant build_id_parse_file() that expects file passed
directly.

[akpm@linux-foundation.org: fix up kerneldoc]
Link: https://lkml.kernel.org/r/20260129215340.3742283-1-andrii@kernel.org
Fixes: ed5d583a88a9 ("fs/procfs: implement efficient VMA querying API for /proc/<pid>/maps")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reported-by: <syzbot+4e70c8e0a2017b432f7a@syzkaller.appspotmail.com>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Tested-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Song Liu <song@kernel.org>
Cc: Stanislav Fomichev <sdf@fomichev.me>
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

spi: tegra114: Preserve SPI mode bits in def_command1_reg

The COMMAND1 register bits [29:28] set the SPI mode, which controls
the clock idle level. When a transfer ends, tegra_spi_transfer_end()
writes def_command1_reg back to restore the default state, but this
register value currently lacks the mode bits. This results in the
clock always being configured as idle low, breaking devices that
need it high.

Fix this by storing the mode bits in def_command1_reg during setup,
to prevent this field from always being cleared.

Fixes: f333a331adfa ("spi/tegra114: add spi driver")
Signed-off-by: Vishwaroop A <va@nvidia.com>
Link: https://patch.msgid.link/20260204141212.1540382-1-va@nvidia.com
Signed-off-by: Mark Brown <broonie@kernel.org>

Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull dcache fixes from Al Viro:
"A couple of regression fixes for the tree-in-dcache series this cycle"

* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
functionfs: use spinlock for FFS_DEACTIVATED/FFS_CLOSING transitions
rust_binderfs: fix a dentry leak

functionfs: use spinlock for FFS_DEACTIVATED/FFS_CLOSING transitions

When all files are closed, functionfs needs ffs_data_reset() to be
done before any further opens are allowed.

During that time we have ffs->state set to FFS_CLOSING; that makes
->open() fail with -EBUSY.  Once ffs_data_reset() is done, it
switches state (to FFS_READ_DESCRIPTORS) indicating that opening
that thing is allowed again.  There's a couple of additional twists:
* mounting with -o no_disconnect delays ffs_data_reset()
from doing that at the final ->release() to the first subsequent
open().  That's indicated by ffs->state set to FFS_DEACTIVATED;
if open() sees that, it immediately switches to FFS_CLOSING and
proceeds with doing ffs_data_reset() before returning to userland.
* a couple of usb callbacks need to force the delayed
transition; unfortunately, they are done in locking environment
that does not allow blocking and ffs_data_reset() can block.
As the result, if these callbacks see FFS_DEACTIVATED, they change
state to FFS_CLOSING and use schedule_work() to get ffs_data_reset()
executed asynchronously.

Unfortunately, the locking is rather insufficient.  A fix attempted
in e5bf5ee26663 ("functionfs: fix the open/removal races") had closed
a bunch of UAF, but it didn't do anything to the callbacks, lacked
barriers in transition from FFS_CLOSING to FFS_READ_DESCRIPTORS
_and_ it had been too heavy-handed in open()/open() serialization -
I've used ffs->mutex for that, and it's being held over actual IO on
ep0, complete with copy_from_user(), etc.

Even more unfortunately, the userland side is apparently racy enough
to have the resulting timing changes (no failures, just a delayed
return of open(2)) disrupt the things quite badly.  Userland bugs
or not, it's a clear regression that needs to be dealt with.

Solution is to use a spinlock for serializing these state checks and
transitions - unlike ffs->mutex it can be taken in these callbacks
and it doesn't disrupt the timings in open().

We could introduce a new spinlock, but it's easier to use the one
that is already there (ffs->eps_lock) instead - the locking
environment is safe for it in all affected places.

Since now it is held over all places that alter or check the
open count (ffs->opened), there's no need to keep that atomic_t -
int would serve just fine and it's simpler that way.

Fixes: e5bf5ee26663 ("functionfs: fix the open/removal races")
Fixes: 18d6b32fca38 ("usb: gadget: f_fs: add "no_disconnect" mode") # v4.0
Tested-by: Samuel Wu <wusamuel@google.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

rust_binderfs: fix a dentry leak

Parallel to binderfs patches - 02da8d2c0965 "binderfs_binder_ctl_create():
kill a bogus check" and the bit of b89aa544821d "convert binderfs" that
got lost when making 4433d8e25d73 "convert rust_binderfs"; the former is
a cleanup, the latter is about marking /binder-control persistent, so that
it would be taken out on umount.

Fixes: 4433d8e25d73 ("convert rust_binderfs")
Acked-by: Alice Ryhl <aliceryhl@google.com>
Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Merge tag 'net-6.19-rc9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
"Including fixes from wireless and Netfilter.

  Previous releases - regressions:

   - eth: stmmac: fix stm32 (and potentially others) resume regression

   - nf_tables: fix inverted genmask check in nft_map_catchall_activate()

   - usb: r8152: fix resume reset deadlock

   - fix reporting RXH_XFRM_NO_CHANGE as input_xfrm for RSS contexts

  Previous releases - always broken:

   - sched: cls_u32: use skb_header_pointer_careful() to avoid OOB reads
     with malicious u32 rules

   - eth: ice: timestamping related fixes"

* tag 'net-6.19-rc9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (38 commits)
  ipv6: Fix ECMP sibling count mismatch when clearing RTF_ADDRCONF
  netfilter: nf_tables: fix inverted genmask check in nft_map_catchall_activate()
  net: cpsw: Execute ndo_set_rx_mode callback in a work queue
  net: cpsw_new: Execute ndo_set_rx_mode callback in a work queue
  gve: Correct ethtool rx_dropped calculation
  gve: Fix stats report corruption on queue count change
  selftest: net: add a test-case for encap segmentation after GRO
  net: gro: fix outer network offset
  net: add proper RCU protection to /proc/net/ptype
  net: ethernet: adi: adin1110: Check return value of devm_gpiod_get_optional() in adin1110_check_spi()
  wifi: iwlwifi: mvm: pause TCM on fast resume
  wifi: iwlwifi: mld: cancel mlo_scan_start_wk
  net: spacemit: k1-emac: fix jumbo frame support
  net: enetc: Convert 16-bit register reads to 32-bit for ENETC v4
  net: enetc: Convert 16-bit register writes to 32-bit for ENETC v4
  net: enetc: Remove CBDR cacheability AXI settings for ENETC v4
  net: enetc: Remove SI/BDR cacheability AXI settings for ENETC v4
  tipc: use kfree_sensitive() for session key material
  net: stmmac: fix stm32 (and potentially others) resume regression
  net: rss: fix reporting RXH_XFRM_NO_CHANGE as input_xfrm for contexts
  ...

gpio: loongson-64bit: Fix incorrect NULL check after devm_kcalloc()

Fix incorrect NULL check in loongson_gpio_init_irqchip().
The function checks chip->parent instead of chip->irq.parents.

Fixes: 03c146cb6cd1 ("gpio: loongson-64bit: Add support for Loongson-2K0300 SoC")
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Link: https://patch.msgid.link/20260205072649.3271158-1-nichen@iscas.ac.cn
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>

ipv6: Fix ECMP sibling count mismatch when clearing RTF_ADDRCONF

syzbot reported a kernel BUG in fib6_add_rt2node() when adding an IPv6
route. [0]

Commit f72514b3c569 ("ipv6: clear RA flags when adding a static
route") introduced logic to clear RTF_ADDRCONF from existing routes
when a static route with the same nexthop is added. However, this
causes a problem when the existing route has a gateway.

When RTF_ADDRCONF is cleared from a route that has a gateway, that
route becomes eligible for ECMP, i.e. rt6_qualify_for_ecmp() returns
true. The issue is that this route was never added to the
fib6_siblings list.

This leads to a mismatch between the following counts:

- The sibling count computed by iterating fib6_next chain, which
includes the newly ECMP-eligible route

- The actual siblings in fib6_siblings list, which does not include
that route

When a subsequent ECMP route is added, fib6_add_rt2node() hits
BUG_ON(sibling->fib6_nsiblings != rt->fib6_nsiblings) because the
counts don't match.

Fix this by only clearing RTF_ADDRCONF when the existing route does
not have a gateway. Routes without a gateway cannot qualify for ECMP
anyway (rt6_qualify_for_ecmp() requires fib_nh_gw_family), so clearing
RTF_ADDRCONF on them is safe and matches the original intent of the
commit.

[0]:
kernel BUG at net/ipv6/ip6_fib.c:1217!
Oops: invalid opcode: 0000 [#1] SMP KASAN PTI
CPU: 0 UID: 0 PID: 6010 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025
RIP: 0010:fib6_add_rt2node+0x3433/0x3470 net/ipv6/ip6_fib.c:1217
[...]
Call Trace:
<TASK>
fib6_add+0x8da/0x18a0 net/ipv6/ip6_fib.c:1532
__ip6_ins_rt net/ipv6/route.c:1351 [inline]
ip6_route_add+0xde/0x1b0 net/ipv6/route.c:3946
ipv6_route_ioctl+0x35c/0x480 net/ipv6/route.c:4571
inet6_ioctl+0x219/0x280 net/ipv6/af_inet6.c:577
sock_do_ioctl+0xdc/0x300 net/socket.c:1245
sock_ioctl+0x576/0x790 net/socket.c:1366
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:597 [inline]
__se_sys_ioctl+0xfc/0x170 fs/ioctl.c:583
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xfa/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f

Fixes: f72514b3c569 ("ipv6: clear RA flags when adding a static route")
Reported-by: syzbot+cb809def1baaac68ab92@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=cb809def1baaac68ab92
Tested-by: syzbot+cb809def1baaac68ab92@syzkaller.appspotmail.com
Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
Reviewed-by: Fernando Fernandez Mancera <fmancera@suse.de>
Link: https://patch.msgid.link/20260204095837.1285552-1-syoshida@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'nf-26-02-05' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf

Florian Westphal says:

====================
netfilter: update for net

This is one last-minute crash fix for nf_tables, from Andrew Fasano:

Logical check is inverted, this makes kernel fail to correctly undo
the transaction, leading to a use-after-free.

* tag 'nf-26-02-05' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nf_tables: fix inverted genmask check in nft_map_catchall_activate()
====================

Link: https://patch.msgid.link/20260205074450.3187-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

loop: revert exclusive opener loop status change

This commit effectively reverts the following two commits:

2704024d83fa ("loop: add missing bd_abort_claiming in loop_set_status")
08e136ebd193 ("loop: don't change loop device under exclusive opener in loop_set_status")

as there are reports of them causing issues with unmounting. As we're
close to the 6.19 kernel release and the original author hasn't taken a
closer look at this yet, revert them for release.

Reported-by: nokangaroo <nokangaroo@aon.at>
Link: https://lore.kernel.org/all/62de4453-17e8-47f6-a10b-39bf5a49fdee@leemhuis.info/
Signed-off-by: Jens Axboe <axboe@kernel.dk>

objtool/klp: Fix unexported static call key access for manually built livepatch modules

Enabling CONFIG_MEM_ALLOC_PROFILING_DEBUG with CONFIG_SAMPLE_LIVEPATCH
results in the following error:

  samples/livepatch/livepatch-shadow-fix1.o: error: objtool: static_call: can't find static_call_key symbol: __SCK__WARN_trap

This is caused an extra file->klp sanity check which was added by commit
164c9201e1da ("objtool: Add base objtool support for livepatch
modules").  That check was intended to ensure that livepatch modules
built with klp-build always have full access to their static call keys.

However, it failed to account for the fact that manually built livepatch
modules (i.e., not built with klp-build) might need access to unexported
static call keys, for which read-only access is typically allowed for
modules.

While the livepatch-shadow-fix1 module doesn't explicitly use any static
calls, it does have a memory allocation, which can cause
CONFIG_MEM_ALLOC_PROFILING_DEBUG to insert a WARN() call.  And WARN() is
now an unexported static call as of commit 860238af7a33 ("x86_64/bug:
Inline the UD1").

Fix it by removing the overzealous file->klp check, restoring the
original behavior for manually built livepatch modules.

Fixes: 164c9201e1da ("objtool: Add base objtool support for livepatch modules")
Reported-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Song Liu <song@kernel.org>
Tested-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patch.msgid.link/0bd3ae9a53c3d743417fe842b740a7720e2bcd1c.1770058775.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool/klp: Fix symbol correlation for orphaned local symbols

When compiling with CONFIG_LTO_CLANG_THIN, vmlinux.o has
__irf_[start|end] before the first FILE entry:

  $ readelf -sW vmlinux.o
  Symbol table '.symtab' contains 597706 entries:
     Num:    Value          Size Type    Bind   Vis      Ndx Name
       0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
       1: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT   18 __irf_start
       2: 0000000000000200     0 NOTYPE  LOCAL  DEFAULT   18 __irf_end
       3: 0000000000000000     0 SECTION LOCAL  DEFAULT   17 .text
       4: 0000000000000000     0 SECTION LOCAL  DEFAULT   18 .init.ramfs

This causes klp-build warnings like:

  vmlinux.o: warning: objtool: no correlation: __irf_start
  vmlinux.o: warning: objtool: no correlation: __irf_end

The problem is that Clang LTO is stripping the initramfs_data.o FILE
symbol, causing those two symbols to be orphaned and not noticed by
klp-diff's correlation logic.  Add a loop to correlate any symbols found
before the first FILE symbol.

Fixes: dd590d4d57eb ("objtool/klp: Introduce klp diff subcommand for diffing object files")
Reported-by: Song Liu <song@kernel.org>
Acked-by: Song Liu <song@kernel.org>
Link: https://patch.msgid.link/e21ec1141fc749b5f538d7329b531c1ab63a6d1a.1770055235.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

livepatch: Free klp_{object,func}_ext data after initialization

The klp_object_ext and klp_func_ext data, which are stored in the
__klp_objects and __klp_funcs sections, respectively, are not needed
after they are used to create the actual klp_object and klp_func
instances. This operation is implemented by the init function in
scripts/livepatch/init.c.

Prefix the two sections with ".init" so they are freed after the module
is initializated.

Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
Acked-by: Joe Lawrence <joe.lawrence@redhat.com>
Acked-by: Miroslav Benes <mbenes@suse.cz>
Reviewed-by: Aaron Tomlin <atomlin@atomlin.com>
Link: https://patch.msgid.link/20260123102825.3521961-3-petr.pavlu@suse.com
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

livepatch: Fix having __klp_objects relics in non-livepatch modules

The linker script scripts/module.lds.S specifies that all input
__klp_objects sections should be consolidated into an output section of
the same name, and start/stop symbols should be created to enable
scripts/livepatch/init.c to locate this data.

This start/stop pattern is not ideal for modules because the symbols are
created even if no __klp_objects input sections are present.
Consequently, a dummy __klp_objects section also appears in the
resulting module. This unnecessarily pollutes non-livepatch modules.

Instead, since modules are relocatable files, the usual method for
locating consolidated data in a module is to read its section table.
This approach avoids the aforementioned problem.

The klp_modinfo already stores a copy of the entire section table with
the final addresses. Introduce a helper function that
scripts/livepatch/init.c can call to obtain the location of the
__klp_objects section from this data.

Fixes: dd590d4d57eb ("objtool/klp: Introduce klp diff subcommand for diffing object files")
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
Acked-by: Joe Lawrence <joe.lawrence@redhat.com>
Acked-by: Miroslav Benes <mbenes@suse.cz>
Reviewed-by: Aaron Tomlin <atomlin@atomlin.com>
Link: https://patch.msgid.link/20260123102825.3521961-2-petr.pavlu@suse.com
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

Merge tag 'nvme-6.19-2026-02-05' of git://git.infradead.org/nvme into block-6.19

Pull NVMe fixes from Keith:

"- Fix NULL pointer access setting up dma mappings (Keith)
- Fix invalid memory access from malformed TCP PDU (YunJe)"

* tag 'nvme-6.19-2026-02-05' of git://git.infradead.org/nvme:
nvmet-tcp: add bounds checks in nvmet_tcp_build_pdu_iovec
nvme-pci: handle changing device dma map requirements

nvmet-tcp: add bounds checks in nvmet_tcp_build_pdu_iovec

nvmet_tcp_build_pdu_iovec() could walk past cmd->req.sg when a PDU
length or offset exceeds sg_cnt and then use bogus sg->length/offset
values, leading to _copy_to_iter() GPF/KASAN. Guard sg_idx, remaining
entries, and sg->length/offset before building the bvec.

Fixes: 872d26a391da ("nvmet-tcp: add NVMe over TCP target driver")
Signed-off-by: YunJe Shin <ioerts@kookmin.ac.kr>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Joonkyo Jung <joonkyoj@yonsei.ac.kr>
Signed-off-by: Keith Busch <kbusch@kernel.org>

nvme-pci: handle changing device dma map requirements

The initial state of dma_needs_unmap may be false, but change to true
while mapping the data iterator. Enabling swiotlb is one such case that
can change the result. The nvme driver needs to save the mapped dma
vectors to be unmapped later, so allocate as needed during iteration
rather than assume it was always allocated at the beginning. This fixes
a NULL dereference from accessing an uninitialized dma_vecs when the
device dma unmapping requirements change mid-iteration.

Fixes: b8b7570a7ec8 ("nvme-pci: fix dma unmapping when using PRPs and not using the IOVA mapping")
Link: https://lore.kernel.org/linux-nvme/20260202125738.1194899-1-pradeep.pragallapati@oss.qualcomm.com/
Reported-by: Pradeep P V K <pradeep.pragallapati@oss.qualcomm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>

tracing: Fix ftrace event field alignments

The fields of ftrace specific events (events used to save ftrace internal
events like function traces and trace_printk) are generated similarly to
how normal trace event fields are generated. That is, the fields are added
to a trace_events_fields array that saves the name, offset, size,
alignment and signness of the field. It is used to produce the output in
the format file in tracefs so that tooling knows how to parse the binary
data of the trace events.

The issue is that some of the ftrace event structures are packed. The
function graph exit event structures are one of them. The 64 bit calltime
and rettime fields end up 4 byte aligned, but the algorithm to show to
userspace shows them as 8 byte aligned.

The macros that create the ftrace events has one for embedded structure
fields. There's two macros for theses fields:

  __field_desc() and __field_packed()

The difference of the latter macro is that it treats the field as packed.

Rename that field to __field_desc_packed() and create replace the
__field_packed() to be a normal field that is packed and have the calltime
and rettime use those.

This showed up on 32bit architectures for function graph time fields. It
had:

~# cat /sys/kernel/tracing/events/ftrace/funcgraph_exit/format
[..]
        field:unsigned long func;       offset:8;       size:4; signed:0;
        field:unsigned int depth;       offset:12;      size:4; signed:0;
        field:unsigned int overrun;     offset:16;      size:4; signed:0;
        field:unsigned long long calltime;      offset:24;      size:8; signed:0;
        field:unsigned long long rettime;       offset:32;      size:8; signed:0;

Notice that overrun is at offset 16 with size 4, where in the structure
calltime is at offset 20 (16 + 4), but it shows the offset at 24. That's
because it used the alignment of unsigned long long when used as a
declaration and not as a member of a structure where it would be aligned
by word size (in this case 4).

By using the proper structure alignment, the format has it at the correct
offset:

~# cat /sys/kernel/tracing/events/ftrace/funcgraph_exit/format
[..]
        field:unsigned long func;       offset:8;       size:4; signed:0;
        field:unsigned int depth;       offset:12;      size:4; signed:0;
        field:unsigned int overrun;     offset:16;      size:4; signed:0;
        field:unsigned long long calltime;      offset:20;      size:8; signed:0;
        field:unsigned long long rettime;       offset:28;      size:8; signed:0;

Cc: stable@vger.kernel.org
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reported-by: "jempty.liang" <imntjempty@163.com>
Link: https://patch.msgid.link/20260204113628.53faec78@gandalf.local.home
Fixes: 04ae87a52074e ("ftrace: Rework event_create_dir()")
Closes: https://lore.kernel.org/all/20260130015740.212343-1-imntjempty@163.com/
Closes: https://lore.kernel.org/all/20260202123342.2544795-1-imntjempty@163.com/
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

pmdomain: imx8mp-blk-ctrl: Keep usb phy power domain on for system wakeup

USB system wakeup need its PHY on, so add the GENPD_FLAG_ACTIVE_WAKEUP
flags to USB PHY genpd configuration.

Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Fixes: 556f5cf9568a ("soc: imx: add i.MX8MP HSIO blk-ctrl")
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

pmdomain: imx8mp-blk-ctrl: Keep gpc power domain on for system wakeup

Current design will power off all dependent GPC power domains in
imx8mp_blk_ctrl_suspend(), even though the user device has enabled
wakeup capability. The result is that wakeup function never works
for such device.

An example will be USB wakeup on i.MX8MP. PHY device '382f0040.usb-phy'
is attached to power domain 'hsioblk-usb-phy2' which is spawned by hsio
block control. A virtual power domain device 'genpd:3:32f10000.blk-ctrl'
is created to build connection with 'hsioblk-usb-phy2' and it depends on
GPC power domain 'usb-otg2'. If device '382f0040.usb-phy' enable wakeup,
only power domain 'hsioblk-usb-phy2' keeps on during system suspend,
power domain 'usb-otg2' is off all the time. So the wakeup event can't
happen.

In order to further establish a connection between the power domains
related to GPC and block control during system suspend, register a genpd
power on/off notifier for the power_dev. This allows us to prevent the GPC
power domain from being powered off, in case the block control power
domain is kept on to serve system wakeup.

Suggested-by: Ulf Hansson <ulf.hansson@linaro.org>
Fixes: 556f5cf9568a ("soc: imx: add i.MX8MP HSIO blk-ctrl")
Cc: stable@vger.kernel.org
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

drm/xe/guc: Fix CFI violation in debugfs access.

xe_guc_print_info is void-returning, but the function pointer it is
assigned to expects an int-returning function, leading to the following
CFI error:

[ 206.873690] CFI failure at guc_debugfs_show+0xa1/0xf0 [xe]
(target: xe_guc_print_info+0x0/0x370 [xe]; expected type: 0xbe3bc66a)

Fix this by updating xe_guc_print_info to return an integer.

Fixes: e15826bb3c2c ("drm/xe/guc: Refactor GuC debugfs initialization")
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: George D Sworo <george.d.sworo@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20260129182547.32899-2-daniele.ceraolospurio@intel.com
(cherry picked from commit dd8ea2f2ab71b98887fdc426b0651dbb1d1ea760)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/bridge: imx8mp-hdmi-pai: enable PM runtime

There is an audio channel shift issue with multi channel case - the
channel order is correct for the first run, but the channel order is
shifted for the second run. The fix method is to reset the PAI interface
at the end of playback.

The reset can be handled by PM runtime, so enable PM runtime.

Fixes: 0205fae6327a ("drm/bridge: imx: add driver for HDMI TX Parallel Audio Interface")
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Reviewed-by: Liu Ying <victor.liu@nxp.com>
Signed-off-by: Liu Ying <victor.liu@nxp.com>
Link: https://lore.kernel.org/r/20260130080910.3532724-1-shengjiu.wang@nxp.com

ALSA: hda/realtek: Enable headset mic for Acer Nitro 5

Add quirk to support microphone input through headphone jack on Acer Nitro 5 AN515-57 (ALC295).

Signed-off-by: Breno Baptista <brenomb07@gmail.com>
Link: https://patch.msgid.link/20260205024341.26694-1-brenomb07@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>

netfilter: nf_tables: fix inverted genmask check in nft_map_catchall_activate()

nft_map_catchall_activate() has an inverted element activity check
compared to its non-catchall counterpart nft_mapelem_activate() and
compared to what is logically required.

nft_map_catchall_activate() is called from the abort path to re-activate
catchall map elements that were deactivated during a failed transaction.
It should skip elements that are already active (they don't need
re-activation) and process elements that are inactive (they need to be
restored). Instead, the current code does the opposite: it skips inactive
elements and processes active ones.

Compare the non-catchall activate callback, which is correct:

  nft_mapelem_activate():
    if (nft_set_elem_active(ext, iter->genmask))
        return 0;   /* skip active, process inactive */

With the buggy catchall version:

  nft_map_catchall_activate():
    if (!nft_set_elem_active(ext, genmask))
        continue;   /* skip inactive, process active */

The consequence is that when a DELSET operation is aborted,
nft_setelem_data_activate() is never called for the catchall element.
For NFT_GOTO verdict elements, this means nft_data_hold() is never
called to restore the chain->use reference count. Each abort cycle
permanently decrements chain->use. Once chain->use reaches zero,
DELCHAIN succeeds and frees the chain while catchall verdict elements
still reference it, resulting in a use-after-free.

This is exploitable for local privilege escalation from an unprivileged
user via user namespaces + nftables on distributions that enable
CONFIG_USER_NS and CONFIG_NF_TABLES.

Fix by removing the negation so the check matches nft_mapelem_activate():
skip active elements, process inactive ones.

Fixes: 628bd3e49cba ("netfilter: nf_tables: drop map element references from preparation phase")
Signed-off-by: Andrew Fasano <andrew.fasano@nist.gov>
Signed-off-by: Florian Westphal <fw@strlen.de>

Merge tag 'wireless-2026-02-04' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless

Johannes Berg says:

====================
Two last-minute iwlwifi fixes:
- cancel mlo_scan_work on disassoc to avoid
   use-after-free/init-after-queue issues
- pause TCM work on suspend to avoid crashing
   the FW (and sometimes the host) on resume
   with traffic

* tag 'wireless-2026-02-04' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
  wifi: iwlwifi: mvm: pause TCM on fast resume
  wifi: iwlwifi: mld: cancel mlo_scan_start_wk
====================

Link: https://patch.msgid.link/20260204113547.159742-4-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'mm-hotfixes-stable-2026-02-04-15-55' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
"Five hotfixes.  Two are cc:stable, two are for MM.

  All are singletons - please see the changelogs for details"

* tag 'mm-hotfixes-stable-2026-02-04-15-55' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  Documentation: document liveupdate cmdline parameter
  mm, shmem: prevent infinite loop on truncate race
  mailmap: update Alexander Mikhalitsyn's emails
  liveupdate: luo_file: do not clear serialized_data on unfreeze
  x86/kfence: fix booting on 32bit non-PAE systems

Merge tag 'tsm-fixes-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/devsec/tsm

Pull TSM (TEE security Manager) fixes from Dan Williams:
"The largest change is reverting part of an ABI that never shipped in a
  released kernel (Documentation/ABI/testing/sysfs-class-tsm). The fix /
  replacement for that is too large to squeeze in at this late date.

  The rest is a collection of small fixups:

   - Fix multiple streams per host bridge for SEV-TIO

   - Drop the TSM ABI for reporting IDE streams (to be replaced)

   - Fix virtual function enumeration

   - Fix reserved stream ID initialization

   - Fix unused variable compiler warning"

* tag 'tsm-fixes-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/devsec/tsm:
  crypto/ccp: Allow multiple streams on the same root bridge
  crypto/ccp: Use PCI bridge defaults for IDE
  coco/tsm: Remove unused variable tsm_rwsem
  PCI/IDE: Fix reading a wrong reg for unused sel stream initialization
  PCI/IDE: Fix off by one error calculating VF RID range
  Revert "PCI/TSM: Report active IDE streams"

Merge tag 'sched_ext-for-6.19-rc8-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext

Pull sched_ext fix from Tejun Heo:

- Fix race where sched_class operations (sched_setscheduler() and
   friends) could be invoked on dead tasks after sched_ext_dead()
   already ran, causing invalid SCX task state transitions and NULL
   pointer dereferences.

   This was a regression from the cgroup exit ordering fix which
   moved sched_ext_free() to finish_task_switch().

* tag 'sched_ext-for-6.19-rc8-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext:
  sched_ext: Short-circuit sched_class operations on dead tasks

hwmon: (occ) Mark occ_init_attribute() as __printf

This is a printf-style function, which gcc -Werror=suggest-attribute=format
correctly points out:

drivers/hwmon/occ/common.c: In function 'occ_init_attribute':
drivers/hwmon/occ/common.c:761:9: error: function 'occ_init_attribute' might be a candidate for 'gnu_printf' format attribute [-Werror=suggest-attribute=format]

Add the attribute to avoid this warning and ensure any incorrect
format strings are detected here.

Fixes: 744c2fe950e9 ("hwmon: (occ) Rework attribute registration for stack usage")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20260203163440.2674340-1-arnd@kernel.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>

sched_ext: Short-circuit sched_class operations on dead tasks

7900aa699c34 ("sched_ext: Fix cgroup exit ordering by moving sched_ext_free()
to finish_task_switch()") moved sched_ext_free() to finish_task_switch() and
renamed it to sched_ext_dead() to fix cgroup exit ordering issues. However,
this created a race window where certain sched_class ops may be invoked on
dead tasks leading to failures - e.g. sched_setscheduler() may try to switch a
task which finished sched_ext_dead() back into SCX triggering invalid SCX task
state transitions.

Add task_dead_and_done() which tests whether a task is TASK_DEAD and has
completed its final context switch, and use it to short-circuit sched_class
operations which may be called on dead tasks.

Fixes: 7900aa699c34 ("sched_ext: Fix cgroup exit ordering by moving sched_ext_free() to finish_task_switch()")
Reported-by: Andrea Righi <arighi@nvidia.com>
Link: http://lkml.kernel.org/r/20260202151341.796959-1-arighi@nvidia.com
Reviewed-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

ceph: fix NULL pointer dereference in ceph_mds_auth_match()

The CephFS kernel client has regression starting from 6.18-rc1.
We have issue in ceph_mds_auth_match() if fs_name == NULL:

    const char fs_name = mdsc->fsc->mount_options->mds_namespace;
    ...
    if (auth->match.fs_name && strcmp(auth->match.fs_name, fs_name)) {
            / fsname mismatch, try next one */
            return 0;
    }

Patrick Donnelly suggested that: In summary, we should definitely start
decoding `fs_name` from the MDSMap and do strict authorizations checks
against it. Note that the `-o mds_namespace=foo` should only be used for
selecting the file system to mount and nothing else. It's possible
no mds_namespace is specified but the kernel will mount the only
file system that exists which may have name "foo".

This patch reworks ceph_mdsmap_decode() and namespace_equals() with
the goal of supporting the suggested concept. Now struct ceph_mdsmap
contains m_fs_name field that receives copy of extracted FS name
by ceph_extract_encoded_string(). For the case of "old" CephFS file
systems, it is used "cephfs" name.

[ idryomov: replace redundant %*pE with %s in ceph_mdsmap_decode(),
  get rid of a series of strlen() calls in ceph_namespace_match(),
  drop changes to namespace_equals() body to avoid treating empty
  mds_namespace as equal, drop changes to ceph_mdsc_handle_fsmap()
  as namespace_equals() isn't an equivalent substitution there ]

Cc: stable@vger.kernel.org
Fixes: 22c73d52a6d0 ("ceph: fix multifs mds auth caps issue")
Link: https://tracker.ceph.com/issues/73886
Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Reviewed-by: Patrick Donnelly <pdonnell@ibm.com>
Tested-by: Patrick Donnelly <pdonnell@ibm.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:

- Fix a bug where AVIC is incorrectly inhibited when running with
   x2AVIC disabled via module param (or on a system without x2AVIC)

- Fix a dangling device posted IRQs bug by explicitly checking if the
   irqfd is still active (on the list) when handling an eventfd signal,
   instead of zeroing the irqfd's routing information when the irqfd is
   deassigned.

   Zeroing the irqfd's routing info causes arm64 and x86's to not
   disable posting for the IRQ (kvm_arch_irq_bypass_del_producer() looks
   for an MSI), incorrectly leaving the IRQ in posted mode (and leading
   to use-after-free and memory leaks on AMD in particular).

   This is both the most pressing and scariest, but it's been in -next
   for a while.

- Disable FORTIFY_SOURCE for KVM selftests to prevent the compiler from
   generating calls to the checked versions of memset() and friends,
   which leads to unexpected page faults in guest code due e.g.
   __memset_chk@plt not being resolved.

- Explicitly configure the supported XSS capabilities from within
   {svm,vmx}_set_cpu_caps() to fix a bug where VMX will compute the
   reference VMCS configuration with SHSTK and IBT enabled, but then
   compute each CPUs local config with SHSTK and IBT disabled if not all
   CET xfeatures are enabled, e.g. if the kernel is built with
   X86_KERNEL_IBT=n.

   The mismatch in features results in differing nVMX setting, and
   ultimately causes kvm-intel.ko to refuse to load with nested=1.

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: Explicitly configure supported XSS from {svm,vmx}_set_cpu_caps()
  KVM: selftests: Add -U_FORTIFY_SOURCE to avoid some unpredictable test failures
  KVM: x86: Assert that non-MSI doesn't have bypass vCPU when deleting producer
  KVM: Don't clobber irqfd routing type when deassigning irqfd
  KVM: SVM: Check vCPU ID against max x2AVIC ID if and only if x2AVIC is enabled

Merge tag 'kvm-x86-fixes-6.19-rc8' of https://github.com/kvm-x86/linux into HEAD

Final KVM fixes for 6.19:

- Fix a bug where AVIC is incorrectly inhibited when running with x2AVIC
   disabled via module param (or on a system without x2AVIC).

- Fix a dangling device posted IRQs bug by explicitly checking if the irqfd is
   still active (on the list) when handling an eventfd signal, instead of
   zeroing the irqfd's routing information when the irqfd is deassigned.
   Zeroing the irqfd's routing info causes arm64 and x86's to not disable
   posting for the IRQ (kvm_arch_irq_bypass_del_producer() looks for an MSI),
   incorrectly leaving the IRQ in posted mode (and leading to use-after-free
   and memory leaks on AMD in particular).

   This is both the most pressing and scariest, but it's been in -next for
   a while.

- Disable FORTIFY_SOURCE for KVM selftests to prevent the compiler from
   generating calls to the checked versions of memset() and friends, which
   leads to unexpected page faults in guest code due e.g. __memset_chk@plt
   not being resolved.

- Explicitly configure the support XSS from within {svm,vmx}_set_cpu_caps() to
   fix a bug where VMX will compute the reference VMCS configuration with SHSTK
   and IBT enabled, but then compute each CPUs local config with SHSTK and IBT
   disabled if not all CET xfeatures are enabled, e.g. if the kernel is built
   with X86_KERNEL_IBT=n.  The mismatch in features results in differing nVMX
   setting, and ultimately causes kvm-intel.ko to refuse to load with nested=1.

Merge tag 'soc-fixes-6.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull SoC fixes from Arnd Bergmann:
"Shawn Guo is moving on from maintaining the NXP i.MX platform and
  hands over to Frank Li. Shawn has maintained the platform for 15 years
  after initially upstreaming support for i.MX6 and i.MX23/28, and his
  work has helped make this the most important industrial embedded Linux
  platform. Roughly one out of five devicetree files in mainline kernels
  are for the wider i.MX platform. Many thanks to Shawn for the taking
  care of the platform all these years!

  There are also two additional updates for the MAINTAINERS file, and a
  fix for error handling in the qualcomm smem driver"

* tag 'soc-fixes-6.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  MAINTAINERS: Change Sudeep Holla's email address
  MAINTAINERS: Add myself as maintainer of hisi_soc_hha
  soc: qcom: smem: fix qcom_smem_is_available and check if __smem is valid
  MAINTAINERS: Replace Shawn with Frank as i.MX platform maintainer

Merge tag 'asoc-fix-v6.19-rc8' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v6.19

A bunch more small fixes here, plus some more of the constant stream of
quirks. The most notable change here is Richard's change to the cs_dsp
code for the KUnit tests which is relatively large, mostly due to
boilerplate. The tests were triggering large numbers of error messages
as part of verifying that problems with input data are appropriately
detected which in turn caused runtime issues for the framework due to
the performance impact of pushing the logging out, while the logging is
valuable in normal operation it's basically useless while doing tests
designed to trigger it so rate limiting is an appropriate fix.

drm/xe/pm: Disable D3Cold for BMG only on specific platforms

Restrict D3Cold disablement for BMG to unsupported NUC platforms,
instead of disabling it on all platforms.

Signed-off-by: Karthik Poosa <karthik.poosa@intel.com>
Fixes: 3e331a6715ee ("drm/xe/pm: Temporarily disable D3Cold on BMG")
Link: https://patch.msgid.link/20260123173238.1642383-1-karthik.poosa@intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 39125eaf8863ab09d70c4b493f58639b08d5a897)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/xe: Fix kerneldoc for xe_tlb_inval_job_alloc_dep

Correct the function name in the kerneldoc.
It is for below warning:
"Warning: drivers/gpu/drm/xe/xe_tlb_inval_job.c:210 expecting prototype for
xe_tlb_inval_alloc_dep(). Prototype was for xe_tlb_inval_job_alloc_dep()
instead"

Fixes: 15366239e2130 ("drm/xe: Decouple TLB invalidations from GT")
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20260129233834.419977-8-shuicheng.lin@intel.com
(cherry picked from commit 9f9c117ac566cb567dd56cc5b7564c45653f7a2a)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/xe: Fix kerneldoc for xe_gt_tlb_inval_init_early

Correct the function name in the kerneldoc.
It is for below warning:
"Warning: drivers/gpu/drm/xe/xe_tlb_inval.c:136 expecting prototype for
xe_gt_tlb_inval_init(). Prototype was for xe_gt_tlb_inval_init_early()
instead"

v2: add () for the function. (Michal)

Fixes: db16f9d90c1d9 ("drm/xe: Split TLB invalidation code in frontend and backend")
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20260129233834.419977-7-shuicheng.lin@intel.com
(cherry picked from commit 0651dbb9d6a72e99569576fbec4681fd8160d161)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/xe: Fix kerneldoc for xe_migrate_exec_queue

Correct the function name in the kerneldoc.
It is for below warning:
"Warning: drivers/gpu/drm/xe/xe_migrate.c:1262 expecting prototype for
xe_get_migrate_exec_queue(). Prototype was for xe_migrate_exec_queue()
instead"

Fixes: 916ee4704a865 ("drm/xe/vf: Register CCS read/write contexts with Guc")
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20260129233834.419977-6-shuicheng.lin@intel.com
(cherry picked from commit 9fd8da717934f05125b9ba6782622c459a368dc0)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/xe/query: Fix topology query pointer advance

The topology query helper advanced the user pointer by the size
of the pointer, not the size of the structure. This can misalign
the output blob and corrupt the following mask. Fix the increment
to use sizeof(*topo).
There is no issue currently, as sizeof(*topo) happens to be equal
to sizeof(topo) on 64-bit systems (both evaluate to 8 bytes).

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260130043907.465128-2-shuicheng.lin@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
(cherry picked from commit c2a6859138e7f73ad904be17dd7d1da6cc7f06b3)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>

Merge tag 'iwlwifi-fixes-2026-02-03' of https://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-next

Miri Korenblit says:
====================
iwlwifi fixes

- Cancel mlo_scan_work on disassoc
- Pause TCM work on suspend
====================

Signed-off-by: Johannes Berg <johannes.berg@intel.com>

ASoC: fsl_xcvr: fix missing lock in fsl_xcvr_mode_put()

fsl_xcvr_activate_ctl() has
lockdep_assert_held(&card->snd_card->controls_rwsem),
but fsl_xcvr_mode_put() calls it without acquiring this lock.

Other callers of fsl_xcvr_activate_ctl() in fsl_xcvr_startup() and
fsl_xcvr_shutdown() properly acquire the lock with down_read()/up_read().

Add the missing down_read()/up_read() calls around fsl_xcvr_activate_ctl()
in fsl_xcvr_mode_put() to fix the lockdep assertion and prevent potential
race conditions when multiple userspace threads access the control.

Signed-off-by: Ziyi Guo <n7l8m4@u.northwestern.edu>
Link: https://patch.msgid.link/20260202174112.2018402-1-n7l8m4@u.northwestern.edu
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: dt-bindings: ti,tlv320aic3x: Add compatible string ti,tlv320aic23

Add compatible string ti,tlv320aic23 to fix below CHECK_DTB warning:
arch/arm/boot/dts/nxp/imx/imx35-eukrea-mbimxsd35-baseboard.dtb:
/soc/bus@43f00000/i2c@43f80000/codec@1a: failed to match any schema with compatible: ['ti,tlv320aic23']

Signed-off-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Daniel Baluta <daniel.baluta@nxp.com>
Link: https://patch.msgid.link/20260202205758.3044617-1-Frank.Li@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: amd: fix memory leak in acp3x pdm dma ops

Fixes: 4a767b1d039a8 ("ASoC: amd: add acp3x pdm driver dma ops")
Signed-off-by: Chris Bainbridge <chris.bainbridge@gmail.com>
Link: https://patch.msgid.link/20260202205034.7697-1-chris.bainbridge@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>

sched/mmcid: Optimize transitional CIDs when scheduling out

During the investigation of the various transition mode issues
instrumentation revealed that the amount of bitmap operations can be
significantly reduced when a task with a transitional CID schedules out
after the fixup function completed and disabled the transition mode.

At that point the mode is stable and therefore it is not required to drop
the transitional CID back into the pool. As the fixup is complete the
potential exhaustion of the CID pool is not longer possible, so the CID can
be transferred to the scheduling out task or to the CPU depending on the
current ownership mode.

The racy snapshot of mm_cid::mode which contains both the ownership state
and the transition bit is valid because runqueue lock is held and the fixup
function of a concurrent mode switch is serialized.

Assigning the ownership right there not only spares the bitmap access for
dropping the CID it also avoids it when the task is scheduled back in as it
directly hits the fast path in both modes when the CID is within the
optimal range. If it's outside the range the next schedule in will need to
converge so dropping it right away is sensible. In the good case this also
allows to go into the fast path on the next schedule in operation.

With a thread pool benchmark which is configured to cross the mode switch
boundaries frequently this reduces the number of bitmap operations by about
30% and increases the fastpath utilization in the low single digit
percentage range.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://patch.msgid.link/20260201192835.100194627@kernel.org

sched/mmcid: Drop per CPU CID immediately when switching to per task mode

When a exiting task initiates the switch from per CPU back to per task
mode, it has already dropped its CID and marked itself inactive. But a
leftover from an earlier iteration of the rework then reassigns the per
CPU CID to the exiting task with the transition bit set.

That's wrong as the task is already marked CID inactive, which means it is
inconsistent state. It's harmless because the CID is marked in transit and
therefore dropped back into the pool when the exiting task schedules out
either through preemption or the final schedule().

Simply drop the per CPU CID when the exiting task triggered the transition.

Fixes: fbd0e71dc370 ("sched/mmcid: Provide CID ownership mode fixup functions")
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://patch.msgid.link/20260201192835.032221009@kernel.org