]> git.ipfire.org Git - thirdparty/kernel/linux.git/log
thirdparty/kernel/linux.git
7 months agoovl: remove unused code in lowerdir param parsing
Amir Goldstein [Sat, 28 Oct 2023 09:07:45 +0000 (12:07 +0300)] 
ovl: remove unused code in lowerdir param parsing

Commit beae836e9c61 ("ovl: temporarily disable appending lowedirs")
removed the ability to append lowerdirs with syntax lowerdir=":<path>".
Remove leftover code and comments that are irrelevant with lowerdir
append mode disabled.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
7 months agoovl: Add documentation on nesting of overlayfs mounts
Alexander Larsson [Wed, 16 Aug 2023 10:57:41 +0000 (12:57 +0200)] 
ovl: Add documentation on nesting of overlayfs mounts

Signed-off-by: Alexander Larsson <alexl@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
7 months agoovl: Add an alternative type of whiteout
Alexander Larsson [Wed, 23 Aug 2023 14:33:42 +0000 (16:33 +0200)] 
ovl: Add an alternative type of whiteout

An xattr whiteout (called "xwhiteout" in the code) is a reguar file of
zero size with the "overlay.whiteout" xattr set. A file like this in a
directory with the "overlay.whiteouts" xattrs set will be treated the
same way as a regular whiteout.

The "overlay.whiteouts" directory xattr is used in order to
efficiently handle overlay checks in readdir(), as we only need to
checks xattrs in affected directories.

The advantage of this kind of whiteout is that they can be escaped
using the standard overlay xattr escaping mechanism. So, a file with a
"overlay.overlay.whiteout" xattr would be unescaped to
"overlay.whiteout", which could then be consumed by another overlayfs
as a whiteout.

Overlayfs itself doesn't create whiteouts like this, but a userspace
mechanism could use this alternative mechanism to convert images that
may contain whiteouts to be used with overlayfs.

To work as a whiteout for both regular overlayfs mounts as well as
userxattr mounts both the "user.overlay.whiteout*" and the
"trusted.overlay.whiteout*" xattrs will need to be created.

Signed-off-by: Alexander Larsson <alexl@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
7 months agoovl: Support escaped overlay.* xattrs
Alexander Larsson [Tue, 15 Aug 2023 07:40:50 +0000 (09:40 +0200)] 
ovl: Support escaped overlay.* xattrs

There are cases where you want to use an overlayfs mount as a lowerdir
for another overlayfs mount. For example, if the system rootfs is on
overlayfs due to composefs, or to make it volatile (via tmps), then
you cannot currently store a lowerdir on the rootfs. This means you
can't e.g. store on the rootfs a prepared container image for use
using overlayfs.

To work around this, we introduce an escapment mechanism for overlayfs
xattrs. Whenever the lower/upper dir has a xattr named
"overlay.overlay.XYZ", we list it as "overlay.XYZ" in listxattrs, and
when the user calls getxattr or setxattr on "overlay.XYZ", we apply to
"overlay.overlay.XYZ" in the backing directories.

This allows storing any kind of overlay xattrs in a overlayfs mount
that can be used as a lowerdir in another mount. It is possible to
stack this mechanism multiple times, such that
"overlay.overlay.overlay.XYZ" will survive two levels of overlay mounts,
however this is not all that useful in practice because of stack depth
limitations of overlayfs mounts.

Note: These escaped xattrs are copied to upper during copy-up.

Signed-off-by: Alexander Larsson <alexl@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
7 months agoovl: Add OVL_XATTR_TRUSTED/USER_PREFIX_LEN macros
Alexander Larsson [Tue, 15 Aug 2023 07:33:44 +0000 (09:33 +0200)] 
ovl: Add OVL_XATTR_TRUSTED/USER_PREFIX_LEN macros

These match the ones for e.g. XATTR_TRUSTED_PREFIX_LEN.

Signed-off-by: Alexander Larsson <alexl@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
7 months agoovl: Move xattr support to new xattrs.c file
Amir Goldstein [Tue, 10 Oct 2023 11:17:59 +0000 (14:17 +0300)] 
ovl: Move xattr support to new xattrs.c file

This moves the code from super.c and inode.c, and makes ovl_xattr_get/set()
static.

This is in preparation for doing more work on xattrs support.

Signed-off-by: Alexander Larsson <alexl@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
7 months agoovl: do not encode lower fh with upper sb_writers held
Amir Goldstein [Wed, 16 Aug 2023 13:47:59 +0000 (16:47 +0300)] 
ovl: do not encode lower fh with upper sb_writers held

When lower fs is a nested overlayfs, calling encode_fh() on a lower
directory dentry may trigger copy up and take sb_writers on the upper fs
of the lower nested overlayfs.

The lower nested overlayfs may have the same upper fs as this overlayfs,
so nested sb_writers lock is illegal.

Move all the callers that encode lower fh to before ovl_want_write().

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
7 months agoovl: do not open/llseek lower file with upper sb_writers held
Amir Goldstein [Wed, 16 Aug 2023 09:42:18 +0000 (12:42 +0300)] 
ovl: do not open/llseek lower file with upper sb_writers held

overlayfs file open (ovl_maybe_lookup_lowerdata) and overlay file llseek
take the ovl_inode_lock, without holding upper sb_writers.

In case of nested lower overlay that uses same upper fs as this overlay,
lockdep will warn about (possibly false positive) circular lock
dependency when doing open/llseek of lower ovl file during copy up with
our upper sb_writers held, because the locking ordering seems reverse to
the locking order in ovl_copy_up_start():

- lower ovl_inode_lock
- upper sb_writers

Let the copy up "transaction" keeps an elevated mnt write count on upper
mnt, but leaves taking upper sb_writers to lower level helpers only when
they actually need it.  This allows to avoid holding upper sb_writers
during lower file open/llseek and prevents the lockdep warning.

Minimizing the scope of upper sb_writers during copy up is also needed
for fixing another possible deadlocks by a following patch.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
7 months agoovl: reorder ovl_want_write() after ovl_inode_lock()
Amir Goldstein [Thu, 20 Jul 2023 09:51:21 +0000 (12:51 +0300)] 
ovl: reorder ovl_want_write() after ovl_inode_lock()

Make the locking order of ovl_inode_lock() strictly between the two
vfs stacked layers, i.e.:
- ovl vfs locks: sb_writers, inode_lock, ...
- ovl_inode_lock
- upper vfs locks: sb_writers, inode_lock, ...

To that effect, move ovl_want_write() into the helpers ovl_nlink_start()
and ovl_copy_up_start which currently take the ovl_inode_lock() after
ovl_want_write().

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
7 months agoovl: split ovl_want_write() into two helpers
Amir Goldstein [Wed, 16 Aug 2023 09:18:15 +0000 (12:18 +0300)] 
ovl: split ovl_want_write() into two helpers

ovl_get_write_access() gets write access to upper mnt without taking
freeze protection on upper sb and ovl_start_write() only takes freeze
protection on upper sb.

These helpers will be used to breakup the large ovl_want_write() scope
during copy up into finer grained freeze protection scopes.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
7 months agoovl: add helper ovl_file_modified()
Amir Goldstein [Wed, 27 Sep 2023 10:43:44 +0000 (13:43 +0300)] 
ovl: add helper ovl_file_modified()

A simple wrapper for updating ovl inode size/mtime, to conform
with ovl_file_accessed().

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
7 months agoovl: protect copying of realinode attributes to ovl inode
Amir Goldstein [Thu, 24 Aug 2023 11:51:17 +0000 (14:51 +0300)] 
ovl: protect copying of realinode attributes to ovl inode

ovl_copyattr() may be called concurrently from aio completion context
without any lock and that could lead to overlay inode attributes getting
permanently out of sync with real inode attributes.

Use ovl inode spinlock to protect ovl_copyattr().

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
7 months agoovl: punt write aio completion to workqueue
Amir Goldstein [Tue, 22 Aug 2023 17:50:59 +0000 (20:50 +0300)] 
ovl: punt write aio completion to workqueue

We want to protect concurrent updates of ovl inode size and mtime
(i.e. ovl_copyattr()) from aio completion context.

Punt write aio completion to a workqueue so that we can protect
ovl_copyattr() with a spinlock.

Export sb_init_dio_done_wq(), so that overlayfs can use its own
dio workqueue to punt aio completions.

Suggested-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/r/8620dfd3-372d-4ae0-aa3f-2fe97dda1bca@kernel.dk/
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
7 months agoovl: propagate IOCB_APPEND flag on writes to realfile
Amir Goldstein [Tue, 29 Aug 2023 13:25:47 +0000 (16:25 +0300)] 
ovl: propagate IOCB_APPEND flag on writes to realfile

If ovl file is opened O_APPEND, the underlying realfile is also
opened O_APPEND, so it makes sense to propagate the IOCB_APPEND flags
on sync writes to realfile, just as we do with aio writes.

Effectively, because sync ovl writes are protected by inode lock,
this change only makes a difference if the realfile is written to (size
extending writes) from underneath overlayfs.  The behavior in this case
is undefined, so it is ok if we change the behavior (to fail the ovl
IOCB_APPEND write).

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
7 months agoovl: use simpler function to convert iocb to rw flags
Amir Goldstein [Wed, 6 Sep 2023 07:52:13 +0000 (10:52 +0300)] 
ovl: use simpler function to convert iocb to rw flags

Overlayfs implements its own function to translate iocb flags into rw
flags, so that they can be passed into another vfs call.

With commit ce71bfea207b4 ("fs: align IOCB_* flags with RWF_* flags")
Jens created a 1:1 matching between the iocb flags and rw flags,
simplifying the conversion.

Signed-off-by: Alessio Balsini <balsini@android.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
7 months agoMerge tag 'vfs-6.7.ctime' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs
Linus Torvalds [Mon, 30 Oct 2023 19:47:13 +0000 (09:47 -1000)] 
Merge tag 'vfs-6.7.ctime' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs

Pull vfs inode time accessor updates from Christian Brauner:
 "This finishes the conversion of all inode time fields to accessor
  functions as discussed on list. Changing timestamps manually as we
  used to do before is error prone. Using accessors function makes this
  robust.

  It does not contain the switch of the time fields to discrete 64 bit
  integers to replace struct timespec and free up space in struct inode.
  But after this, the switch can be trivially made and the patch should
  only affect the vfs if we decide to do it"

* tag 'vfs-6.7.ctime' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: (86 commits)
  fs: rename inode i_atime and i_mtime fields
  security: convert to new timestamp accessors
  selinux: convert to new timestamp accessors
  apparmor: convert to new timestamp accessors
  sunrpc: convert to new timestamp accessors
  mm: convert to new timestamp accessors
  bpf: convert to new timestamp accessors
  ipc: convert to new timestamp accessors
  linux: convert to new timestamp accessors
  zonefs: convert to new timestamp accessors
  xfs: convert to new timestamp accessors
  vboxsf: convert to new timestamp accessors
  ufs: convert to new timestamp accessors
  udf: convert to new timestamp accessors
  ubifs: convert to new timestamp accessors
  tracefs: convert to new timestamp accessors
  sysv: convert to new timestamp accessors
  squashfs: convert to new timestamp accessors
  server: convert to new timestamp accessors
  client: convert to new timestamp accessors
  ...

7 months agoMerge tag 'vfs-6.7.xattr' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs
Linus Torvalds [Mon, 30 Oct 2023 19:29:44 +0000 (09:29 -1000)] 
Merge tag 'vfs-6.7.xattr' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs

Pull vfs xattr updates from Christian Brauner:
 "The 's_xattr' field of 'struct super_block' currently requires a
  mutable table of 'struct xattr_handler' entries (although each handler
  itself is const). However, no code in vfs actually modifies the
  tables.

  This changes the type of 's_xattr' to allow const tables, and modifies
  existing file systems to move their tables to .rodata. This is
  desirable because these tables contain entries with function pointers
  in them; moving them to .rodata makes it considerably less likely to
  be modified accidentally or maliciously at runtime"

* tag 'vfs-6.7.xattr' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: (30 commits)
  const_structs.checkpatch: add xattr_handler
  net: move sockfs_xattr_handlers to .rodata
  shmem: move shmem_xattr_handlers to .rodata
  overlayfs: move xattr tables to .rodata
  xfs: move xfs_xattr_handlers to .rodata
  ubifs: move ubifs_xattr_handlers to .rodata
  squashfs: move squashfs_xattr_handlers to .rodata
  smb: move cifs_xattr_handlers to .rodata
  reiserfs: move reiserfs_xattr_handlers to .rodata
  orangefs: move orangefs_xattr_handlers to .rodata
  ocfs2: move ocfs2_xattr_handlers and ocfs2_xattr_handler_map to .rodata
  ntfs3: move ntfs_xattr_handlers to .rodata
  nfs: move nfs4_xattr_handlers to .rodata
  kernfs: move kernfs_xattr_handlers to .rodata
  jfs: move jfs_xattr_handlers to .rodata
  jffs2: move jffs2_xattr_handlers to .rodata
  hfsplus: move hfsplus_xattr_handlers to .rodata
  hfs: move hfs_xattr_handlers to .rodata
  gfs2: move gfs2_xattr_handlers_max to .rodata
  fuse: move fuse_xattr_handlers to .rodata
  ...

7 months agoMerge tag 'vfs-6.7.iov_iter' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs
Linus Torvalds [Mon, 30 Oct 2023 19:24:21 +0000 (09:24 -1000)] 
Merge tag 'vfs-6.7.iov_iter' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs

Pull iov_iter updates from Christian Brauner:
 "This contain's David's iov_iter cleanup work to convert the iov_iter
  iteration macros to inline functions:

   - Remove last_offset from iov_iter as it was only used by ITER_PIPE

   - Add a __user tag on copy_mc_to_user()'s dst argument on x86 to
     match that on powerpc and get rid of a sparse warning

   - Convert iter->user_backed to user_backed_iter() in the sound PCM
     driver

   - Convert iter->user_backed to user_backed_iter() in a couple of
     infiniband drivers

   - Renumber the type enum so that the ITER_* constants match the order
     in iterate_and_advance*()

   - Since the preceding patch puts UBUF and IOVEC at 0 and 1, change
     user_backed_iter() to just use the type value and get rid of the
     extra flag

   - Convert the iov_iter iteration macros to always-inline functions to
     make the code easier to follow. It uses function pointers, but they
     get optimised away

   - Move the check for ->copy_mc to _copy_from_iter() and
     copy_page_from_iter_atomic() rather than in memcpy_from_iter_mc()
     where it gets repeated for every segment. Instead, we check once
     and invoke a side function that can use iterate_bvec() rather than
     iterate_and_advance() and supply a different step function

   - Move the copy-and-csum code to net/ where it can be in proximity
     with the code that uses it

   - Fold memcpy_and_csum() in to its two users

   - Move csum_and_copy_from_iter_full() out of line and merge in
     csum_and_copy_from_iter() since the former is the only caller of
     the latter

   - Move hash_and_copy_to_iter() to net/ where it can be with its only
     caller"

* tag 'vfs-6.7.iov_iter' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs:
  iov_iter, net: Move hash_and_copy_to_iter() to net/
  iov_iter, net: Merge csum_and_copy_from_iter{,_full}() together
  iov_iter, net: Fold in csum_and_memcpy()
  iov_iter, net: Move csum_and_copy_to/from_iter() to net/
  iov_iter: Don't deal with iter->copy_mc in memcpy_from_iter_mc()
  iov_iter: Convert iterate*() to inline funcs
  iov_iter: Derive user-backedness from the iterator type
  iov_iter: Renumber ITER_* constants
  infiniband: Use user_backed_iter() to see if iterator is UBUF/IOVEC
  sound: Fix snd_pcm_readv()/writev() to use iov access functions
  iov_iter, x86: Be consistent about the __user tag on copy_mc_to_user()
  iov_iter: Remove last_offset from iov_iter as it was for ITER_PIPE

7 months agoMerge tag 'vfs-6.7.misc' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs
Linus Torvalds [Mon, 30 Oct 2023 19:14:19 +0000 (09:14 -1000)] 
Merge tag 'vfs-6.7.misc' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs

Pull misc vfs updates from Christian Brauner:
 "This contains the usual miscellaneous features, cleanups, and fixes
  for vfs and individual fses.

  Features:

   - Rename and export helpers that get write access to a mount. They
     are used in overlayfs to get write access to the upper mount.

   - Print the pretty name of the root device on boot failure. This
     helps in scenarios where we would usually only print
     "unknown-block(1,2)".

   - Add an internal SB_I_NOUMASK flag. This is another part in the
     endless POSIX ACL saga in a way.

     When POSIX ACLs are enabled via SB_POSIXACL the vfs cannot strip
     the umask because if the relevant inode has POSIX ACLs set it might
     take the umask from there. But if the inode doesn't have any POSIX
     ACLs set then we apply the umask in the filesytem itself. So we end
     up with:

      (1) no SB_POSIXACL -> strip umask in vfs
      (2) SB_POSIXACL    -> strip umask in filesystem

     The umask semantics associated with SB_POSIXACL allowed filesystems
     that don't even support POSIX ACLs at all to raise SB_POSIXACL
     purely to avoid umask stripping. That specifically means NFS v4 and
     Overlayfs. NFS v4 does it because it delegates this to the server
     and Overlayfs because it needs to delegate umask stripping to the
     upper filesystem, i.e., the filesystem used as the writable layer.

     This went so far that SB_POSIXACL is raised eve on kernels that
     don't even have POSIX ACL support at all.

     Stop this blatant abuse and add SB_I_NOUMASK which is an internal
     superblock flag that filesystems can raise to opt out of umask
     handling. That should really only be the two mentioned above. It's
     not that we want any filesystems to do this. Ideally we have all
     umask handling always in the vfs.

   - Make overlayfs use SB_I_NOUMASK too.

   - Now that we have SB_I_NOUMASK, stop checking for SB_POSIXACL in
     IS_POSIXACL() if the kernel doesn't have support for it. This is a
     very old patch but it's only possible to do this now with the wider
     cleanup that was done.

   - Follow-up work on fake path handling from last cycle. Citing mostly
     from Amir:

     When overlayfs was first merged, overlayfs files of regular files
     and directories, the ones that are installed in file table, had a
     "fake" path, namely, f_path is the overlayfs path and f_inode is
     the "real" inode on the underlying filesystem.

     In v6.5, we took another small step by introducing of the
     backing_file container and the file_real_path() helper. This change
     allowed vfs and filesystem code to get the "real" path of an
     overlayfs backing file. With this change, we were able to make
     fsnotify work correctly and report events on the "real" filesystem
     objects that were accessed via overlayfs.

     This method works fine, but it still leaves the vfs vulnerable to
     new code that is not aware of files with fake path. A recent
     example is commit db1d1e8b9867 ("IMA: use vfs_getattr_nosec to get
     the i_version"). This commit uses direct referencing to f_path in
     IMA code that otherwise uses file_inode() and file_dentry() to
     reference the filesystem objects that it is measuring.

     This contains work to switch things around: instead of having
     filesystem code opt-in to get the "real" path, have generic code
     opt-in for the "fake" path in the few places that it is needed.

     Is it far more likely that new filesystems code that does not use
     the file_dentry() and file_real_path() helpers will end up causing
     crashes or averting LSM/audit rules if we keep the "fake" path
     exposed by default.

     This change already makes file_dentry() moot, but for now we did
     not change this helper just added a WARN_ON() in ovl_d_real() to
     catch if we have made any wrong assumptions.

     After the dust settles on this change, we can make file_dentry() a
     plain accessor and we can drop the inode argument to ->d_real().

   - Switch struct file to SLAB_TYPESAFE_BY_RCU. This looks like a small
     change but it really isn't and I would like to see everyone on
     their tippie toes for any possible bugs from this work.

     Essentially we've been doing most of what SLAB_TYPESAFE_BY_RCU for
     files since a very long time because of the nasty interactions
     between the SCM_RIGHTS file descriptor garbage collection. So
     extending it makes a lot of sense but it is a subtle change. There
     are almost no places that fiddle with file rcu semantics directly
     and the ones that did mess around with struct file internal under
     rcu have been made to stop doing that because it really was always
     dodgy.

     I forgot to put in the link tag for this change and the discussion
     in the commit so adding it into the merge message:

       https://lore.kernel.org/r/20230926162228.68666-1-mjguzik@gmail.com

  Cleanups:

   - Various smaller pipe cleanups including the removal of a spin lock
     that was only used to protect against writes without pipe_lock()
     from O_NOTIFICATION_PIPE aka watch queues. As that was never
     implemented remove the additional locking from pipe_write().

   - Annotate struct watch_filter with the new __counted_by attribute.

   - Clarify do_unlinkat() cleanup so that it doesn't look like an extra
     iput() is done that would cause issues.

   - Simplify file cleanup when the file has never been opened.

   - Use module helper instead of open-coding it.

   - Predict error unlikely for stale retry.

   - Use WRITE_ONCE() for mount expiry field instead of just commenting
     that one hopes the compiler doesn't get smart.

  Fixes:

   - Fix readahead on block devices.

   - Fix writeback when layztime is enabled and inodes whose timestamp
     is the only thing that changed reside on wb->b_dirty_time. This
     caused excessively large zombie memory cgroup when lazytime was
     enabled as such inodes weren't handled fast enough.

   - Convert BUG_ON() to WARN_ON_ONCE() in open_last_lookups()"

* tag 'vfs-6.7.misc' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: (26 commits)
  file, i915: fix file reference for mmap_singleton()
  vfs: Convert BUG_ON to WARN_ON_ONCE in open_last_lookups
  writeback, cgroup: switch inodes with dirty timestamps to release dying cgwbs
  chardev: Simplify usage of try_module_get()
  ovl: rely on SB_I_NOUMASK
  fs: fix umask on NFS with CONFIG_FS_POSIX_ACL=n
  fs: store real path instead of fake path in backing file f_path
  fs: create helper file_user_path() for user displayed mapped file path
  fs: get mnt_writers count for an open backing file's real path
  vfs: stop counting on gcc not messing with mnt_expiry_mark if not asked
  vfs: predict the error in retry_estale as unlikely
  backing file: free directly
  vfs: fix readahead(2) on block devices
  io_uring: use files_lookup_fd_locked()
  file: convert to SLAB_TYPESAFE_BY_RCU
  vfs: shave work on failed file open
  fs: simplify misleading code to remove ambiguity regarding ihold()/iput()
  watch_queue: Annotate struct watch_filter with __counted_by
  fs/pipe: use spinlock in pipe_read() only if there is a watch_queue
  fs/pipe: remove unnecessary spinlock from pipe_write()
  ...

7 months agoMerge tag 'vfs-6.7.autofs' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs
Linus Torvalds [Mon, 30 Oct 2023 19:10:21 +0000 (09:10 -1000)] 
Merge tag 'vfs-6.7.autofs' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs

Pull autofs mount api updates from Christian Brauner:
 "This ports autofs to the new mount api. The patchset has existed for
  quite a while but never made it upstream. Ian picked it back up.

  This also fixes a bug where fs_param_is_fd() was passed a garbage
  param->dirfd but it expected it to be set to the fd that was used to
  set param->file otherwise result->uint_32 contains nonsense. So make
  sure it's set.

  One less filesystem using the old mount api. We're getting there,
  albeit rather slow. The last remaining major filesystem that hasn't
  converted is btrfs. Patches exist - I even wrote them - but so far
  they haven't made it upstream"

* tag 'vfs-6.7.autofs' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs:
  autofs: fix add autofs_parse_fd()
  fsconfig: ensure that dirfd is set to aux
  autofs: fix protocol sub version setting
  autofs: convert autofs to use the new mount api
  autofs: validate protocol version
  autofs: refactor parse_options()
  autofs: reformat 0pt enum declaration
  autofs: refactor super block info init
  autofs: add autofs_parse_fd()
  autofs: refactor autofs_prepare_pipe()

7 months agoMerge tag 'vfs-6.7.super' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs
Linus Torvalds [Mon, 30 Oct 2023 18:59:05 +0000 (08:59 -1000)] 
Merge tag 'vfs-6.7.super' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs

Pull vfs superblock updates from Christian Brauner:
 "This contains the work to make block device opening functions return a
  struct bdev_handle instead of just a struct block_device. The same
  struct bdev_handle is then also passed to block device closing
  functions.

  This allows us to propagate context from opening to closing a block
  device without having to modify all users everytime.

  Sidenote, in the future we might even want to try and have block
  device opening functions return a struct file directly but that's a
  series on top of this.

  These are further preparatory changes to be able to count writable
  opens and blocking writes to mounted block devices. That's a separate
  piece of work for next cycle and for that we absolutely need the
  changes to btrfs that have been quietly dropped somehow.

  Originally the series contained a patch that removed the old
  blkdev_*() helpers. But since this would've caused needles churn in
  -next for bcachefs we ended up delaying it.

  The second piece of work addresses one of the major annoyances about
  the work last cycle, namely that we required dropping s_umount
  whenever we used the superblock and fs_holder_ops for a block device.

  The reason for that requirement had been that in some codepaths
  s_umount could've been taken under disk->open_mutex (that's always
  been the case, at least theoretically). For example, on surprise block
  device removal or media change. And opening and closing block devices
  required grabbing disk->open_mutex as well.

  So we did the work and went through the block layer and fixed all
  those places so that s_umount is never taken under disk->open_mutex.
  This means no more brittle games where we yield and reacquire s_umount
  during block device opening and closing and no more requirements where
  block devices need to be closed. Filesystems don't need to care about
  this.

  There's a bunch of other follow-up work such as moving block device
  freezing and thawing to holder operations which makes it work for all
  block devices and not just the main block device just as we did for
  surprise removal. But that is for next cycle.

  Tested with fstests for all major fses, blktests, LTP"

* tag 'vfs-6.7.super' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: (37 commits)
  porting: update locking requirements
  fs: assert that open_mutex isn't held over holder ops
  block: assert that we're not holding open_mutex over blk_report_disk_dead
  block: move bdev_mark_dead out of disk_check_media_change
  block: WARN_ON_ONCE() when we remove active partitions
  block: simplify bdev_del_partition()
  fs: Avoid grabbing sb->s_umount under bdev->bd_holder_lock
  jfs: fix log->bdev_handle null ptr deref in lbmStartIO
  bcache: Fixup error handling in register_cache()
  xfs: Convert to bdev_open_by_path()
  reiserfs: Convert to bdev_open_by_dev/path()
  ocfs2: Convert to use bdev_open_by_dev()
  nfs/blocklayout: Convert to use bdev_open_by_dev/path()
  jfs: Convert to bdev_open_by_dev()
  f2fs: Convert to bdev_open_by_dev/path()
  ext4: Convert to bdev_open_by_dev()
  erofs: Convert to use bdev_open_by_path()
  btrfs: Convert to bdev_open_by_path()
  fs: Convert to bdev_open_by_dev()
  mm/swap: Convert to use bdev_open_by_dev()
  ...

7 months agoLinux 6.6 v6.6
Linus Torvalds [Mon, 30 Oct 2023 02:31:08 +0000 (16:31 -1000)] 
Linux 6.6

7 months agoMerge tag 'x86-urgent-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 28 Oct 2023 18:15:07 +0000 (08:15 -1000)] 
Merge tag 'x86-urgent-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull misc x86 fixes from Ingo Molnar:

 - Fix a possible CPU hotplug deadlock bug caused by the new TSC
   synchronization code

 - Fix a legacy PIC discovery bug that results in device troubles on
   affected systems, such as non-working keybards, etc

 - Add a new Intel CPU model number to <asm/intel-family.h>

* tag 'x86-urgent-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/tsc: Defer marking TSC unstable to a worker
  x86/i8259: Skip probing when ACPI/MADT advertises PCAT compatibility
  x86/cpu: Add model number for Intel Arrow Lake mobile processor

7 months agoMerge tag 'irq-urgent-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 28 Oct 2023 18:12:34 +0000 (08:12 -1000)] 
Merge tag 'irq-urgent-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fix from Ingo Molnar:
 "Restore unintentionally lost quirk settings in the GIC irqchip driver,
  which broke certain devices"

* tag 'irq-urgent-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/gic-v3-its: Don't override quirk settings with default values

7 months agoMerge tag 'perf-urgent-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 28 Oct 2023 18:10:47 +0000 (08:10 -1000)] 
Merge tag 'perf-urgent-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf event fix from Ingo Molnar:
 "Fix a potential NULL dereference bug"

* tag 'perf-urgent-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Fix potential NULL deref

7 months agoMerge tag 'probes-fixes-v6.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 28 Oct 2023 18:04:56 +0000 (08:04 -1000)] 
Merge tag 'probes-fixes-v6.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull probes fixes from Masami Hiramatsu:

 - tracing/kprobes: Fix kernel-doc warnings for the variable length
   arguments

 - tracing/kprobes: Fix to count the symbols in modules even if the
   module name is not specified so that user can probe the symbols in
   the modules without module name

* tag 'probes-fixes-v6.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing/kprobes: Fix symbol counting logic by looking at modules as well
  tracing/kprobes: Fix the description of variable length arguments

7 months agoMerge tag 'dma-mapping-6.6-2023-10-28' of git://git.infradead.org/users/hch/dma-mapping
Linus Torvalds [Sat, 28 Oct 2023 18:01:31 +0000 (08:01 -1000)] 
Merge tag 'dma-mapping-6.6-2023-10-28' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping fix from Christoph Hellwig:

 - reduce the initialy dynamic swiotlb size to remove an annoying but
   harmless warning from the page allocator (Petr Tesarik)

* tag 'dma-mapping-6.6-2023-10-28' of git://git.infradead.org/users/hch/dma-mapping:
  swiotlb: do not try to allocate a TLB bigger than MAX_ORDER pages

7 months agoMerge tag 'char-misc-6.6-final' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 28 Oct 2023 17:51:27 +0000 (07:51 -1000)] 
Merge tag 'char-misc-6.6-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver fixes from Greg KH:
 "Here are some very small driver fixes for 6.6-final that have shown up
  in the past two weeks. Included in here are:

   - tiny fastrpc bugfixes for reported errors

   - nvmem register fixes

   - iio driver fixes for some reported problems

   - fpga test fix

   - MAINTAINERS file update for fpga

  All of these have been in linux-next this week with no reported
  problems"

* tag 'char-misc-6.6-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  fpga: Fix memory leak for fpga_region_test_class_find()
  fpga: m10bmc-sec: Change contact for secure update driver
  fpga: disable KUnit test suites when module support is enabled
  iio: afe: rescale: Accept only offset channels
  nvmem: imx: correct nregs for i.MX6ULL
  nvmem: imx: correct nregs for i.MX6UL
  nvmem: imx: correct nregs for i.MX6SLL
  misc: fastrpc: Unmap only if buffer is unmapped from DSP
  misc: fastrpc: Clean buffers on remote invocation failures
  misc: fastrpc: Free DMA handles for RPC calls with no arguments
  misc: fastrpc: Reset metadata buffer to avoid incorrect free
  iio: exynos-adc: request second interupt only when touchscreen mode is used
  iio: adc: xilinx-xadc: Correct temperature offset/scale for UltraScale
  iio: adc: xilinx-xadc: Don't clobber preset voltage/temperature thresholds
  dt-bindings: iio: add missing reset-gpios constrain

7 months agoMerge tag 'i2c-for-6.6-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa...
Linus Torvalds [Sat, 28 Oct 2023 17:48:37 +0000 (07:48 -1000)] 
Merge tag 'i2c-for-6.6-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:
 "Bugfixes for Axxia when it is a target and for PEC handling of
  stm32f7.

  Plus, fix an OF node leak pattern in the mux subsystem"

* tag 'i2c-for-6.6-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: stm32f7: Fix PEC handling in case of SMBUS transfers
  i2c: muxes: i2c-mux-gpmux: Use of_get_i2c_adapter_by_node()
  i2c: muxes: i2c-demux-pinctrl: Use of_get_i2c_adapter_by_node()
  i2c: muxes: i2c-mux-pinctrl: Use of_get_i2c_adapter_by_node()
  i2c: aspeed: Fix i2c bus hang in slave read

7 months agoporting: update locking requirements
Christian Brauner [Wed, 18 Oct 2023 10:26:20 +0000 (12:26 +0200)] 
porting: update locking requirements

Now that s_umount is never taken under open_mutex update the
documentation to say so.

Link: https://lore.kernel.org/r/20231017184823.1383356-1-hch@lst.de
Signed-off-by: Christian Brauner <brauner@kernel.org>
7 months agofs: assert that open_mutex isn't held over holder ops
Christian Brauner [Tue, 17 Oct 2023 18:48:23 +0000 (20:48 +0200)] 
fs: assert that open_mutex isn't held over holder ops

With recent block level changes we should never be in a situation where
we hold disk->open_mutex when calling into these helpers. So assert that
in the code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20231017184823.1383356-6-hch@lst.de
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Christian Brauner <brauner@kernel.org>
7 months agoblock: assert that we're not holding open_mutex over blk_report_disk_dead
Christian Brauner [Tue, 17 Oct 2023 18:48:22 +0000 (20:48 +0200)] 
block: assert that we're not holding open_mutex over blk_report_disk_dead

blk_report_disk_dead() has the following major callers:

(1) del_gendisk()
(2) blk_mark_disk_dead()

Since del_gendisk() acquires disk->open_mutex it's clear that all
callers are assumed to be called without disk->open_mutex held.
In turn, blk_report_disk_dead() is called without disk->open_mutex held
in del_gendisk().

All callers of blk_mark_disk_dead() call it without disk->open_mutex as
well.

Ensure that it is clear that blk_report_disk_dead() is called without
disk->open_mutex on purpose by asserting it and a comment in the code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20231017184823.1383356-5-hch@lst.de
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Christian Brauner <brauner@kernel.org>
7 months agoblock: move bdev_mark_dead out of disk_check_media_change
Christoph Hellwig [Tue, 17 Oct 2023 18:48:21 +0000 (20:48 +0200)] 
block: move bdev_mark_dead out of disk_check_media_change

disk_check_media_change is mostly called from ->open where it makes
little sense to mark the file system on the device as dead, as we
are just opening it.  So instead of calling bdev_mark_dead from
disk_check_media_change move it into the few callers that are not
in an open instance.  This avoid calling into bdev_mark_dead and
thus taking s_umount with open_mutex held.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20231017184823.1383356-4-hch@lst.de
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Christian Brauner <brauner@kernel.org>
7 months agoblock: WARN_ON_ONCE() when we remove active partitions
Christian Brauner [Tue, 17 Oct 2023 18:48:20 +0000 (20:48 +0200)] 
block: WARN_ON_ONCE() when we remove active partitions

The logic for disk->open_partitions is:

blkdev_get_by_*()
-> bdev_is_partition()
   -> blkdev_get_part()
      -> blkdev_get_whole() // bdev_whole->bd_openers++
      -> if (part->bd_openers == 0)
                 disk->open_partitions++
         part->bd_openers

In other words, when we first claim/open a partition we increment
disk->open_partitions and only when all part->bd_openers are closed will
disk->open_partitions be zero. That should mean that
disk->open_partitions is always > 0 as long as there's anyone that
has an open partition.

So the check for disk->open_partitions should mean that we can never
remove an active partition that has a holder and holder ops set. Assert
that in the code. The main disk isn't removed so that check doesn't work
for disk->part0 which is what we want. After all we only care about
partition not about the main disk.

Link: https://lore.kernel.org/r/20231017184823.1383356-3-hch@lst.de
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Christian Brauner <brauner@kernel.org>
7 months agoblock: simplify bdev_del_partition()
Christian Brauner [Tue, 17 Oct 2023 18:48:19 +0000 (20:48 +0200)] 
block: simplify bdev_del_partition()

BLKPG_DEL_PARTITION refuses to delete partitions that still have
openers, i.e., that has an elevated @bdev->bd_openers count. If a device
is claimed by setting @bdev->bd_holder and @bdev->bd_holder_ops
@bdev->bd_openers and @bdev->bd_holders are incremented.
@bdev->bd_openers is effectively guaranteed to be >= @bdev->bd_holders.
So as long as @bdev->bd_openers isn't zero we know that this partition
is still in active use and that there might still be @bdev->bd_holder
and @bdev->bd_holder_ops set.

The only current example is @fs_holder_ops for filesystems. But that
means bdev_mark_dead() which calls into
bdev->bd_holder_ops->mark_dead::fs_bdev_mark_dead() is a nop. As long as
there's an elevated @bdev->bd_openers count we can't delete the
partition and if there isn't an elevated @bdev->bd_openers count then
there's no @bdev->bd_holder or @bdev->bd_holder_ops.

So simply open-code what we need to do. This gets rid of one more
instance where we acquire s_umount under @disk->open_mutex.

Link: https://lore.kernel.org/r/20231016-fototermin-umriss-59f1ea6c1fe6@brauner
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20231017184823.1383356-2-hch@lst.de
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
7 months agofs: Avoid grabbing sb->s_umount under bdev->bd_holder_lock
Jan Kara [Wed, 18 Oct 2023 15:29:24 +0000 (17:29 +0200)] 
fs: Avoid grabbing sb->s_umount under bdev->bd_holder_lock

The implementation of bdev holder operations such as fs_bdev_mark_dead()
and fs_bdev_sync() grab sb->s_umount semaphore under
bdev->bd_holder_lock. This is problematic because it leads to
disk->open_mutex -> sb->s_umount lock ordering which is counterintuitive
(usually we grab higher level (e.g. filesystem) locks first and lower
level (e.g. block layer) locks later) and indeed makes lockdep complain
about possible locking cycles whenever we open a block device while
holding sb->s_umount semaphore. Implement a function
bdev_super_lock_shared() which safely transitions from holding
bdev->bd_holder_lock to holding sb->s_umount on alive superblock without
introducing the problematic lock dependency. We use this function
fs_bdev_sync() and fs_bdev_mark_dead().

Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20231018152924.3858-1-jack@suse.cz
Link: https://lore.kernel.org/r/20231017184823.1383356-1-hch@lst.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <brauner@kernel.org>
7 months agojfs: fix log->bdev_handle null ptr deref in lbmStartIO
Lizhi Xu [Mon, 9 Oct 2023 09:45:57 +0000 (17:45 +0800)] 
jfs: fix log->bdev_handle null ptr deref in lbmStartIO

When sbi->flag is JFS_NOINTEGRITY in lmLogOpen(), log->bdev_handle can't
be inited, so it value will be NULL.
Therefore, add the "log ->no_integrity=1" judgment in lbmStartIO() to avoid such
problems.

Reported-and-tested-by: syzbot+23bc20037854bb335d59@syzkaller.appspotmail.com
Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com>
Link: https://lore.kernel.org/r/20231009094557.1398920-1-lizhi.xu@windriver.com
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
7 months agobcache: Fixup error handling in register_cache()
Jan Kara [Wed, 4 Oct 2023 09:37:57 +0000 (11:37 +0200)] 
bcache: Fixup error handling in register_cache()

Coverity has noticed that the printing of error message in
register_cache() uses already freed bdev_handle to get to bdev. In fact
the problem has been there even before commit "bcache: Convert to
bdev_open_by_path()" just a bit more subtle one - cache object itself
could have been freed by the time we looked at ca->bdev and we don't
hold any reference to bdev either so even that could in principle go
away (due to device unplug or similar). Fix all these problems by
printing the error message before closing the bdev.

Fixes: dc893f51d24a ("bcache: Convert to bdev_open_by_path()")
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20231004093757.11560-1-jack@suse.cz
Asked-by: Coly Li <colyli@suse.de>
Signed-off-by: Christian Brauner <brauner@kernel.org>
7 months agoxfs: Convert to bdev_open_by_path()
Jan Kara [Wed, 27 Sep 2023 09:34:34 +0000 (11:34 +0200)] 
xfs: Convert to bdev_open_by_path()

Convert xfs to use bdev_open_by_path() and pass the handle around.

CC: "Darrick J. Wong" <djwong@kernel.org>
CC: linux-xfs@vger.kernel.org
Acked-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230927093442.25915-28-jack@suse.cz
Acked-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
7 months agoreiserfs: Convert to bdev_open_by_dev/path()
Jan Kara [Wed, 27 Sep 2023 09:34:33 +0000 (11:34 +0200)] 
reiserfs: Convert to bdev_open_by_dev/path()

Convert reiserfs to use bdev_open_by_dev/path() and pass the handle
around.

CC: reiserfs-devel@vger.kernel.org
Acked-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230927093442.25915-27-jack@suse.cz
Signed-off-by: Christian Brauner <brauner@kernel.org>
7 months agoocfs2: Convert to use bdev_open_by_dev()
Jan Kara [Wed, 27 Sep 2023 09:34:32 +0000 (11:34 +0200)] 
ocfs2: Convert to use bdev_open_by_dev()

Convert ocfs2 heartbeat code to use bdev_open_by_dev() and pass the
handle around.

CC: Joseph Qi <joseph.qi@linux.alibaba.com>
CC: ocfs2-devel@oss.oracle.com
Acked-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230927093442.25915-26-jack@suse.cz
Signed-off-by: Christian Brauner <brauner@kernel.org>
7 months agonfs/blocklayout: Convert to use bdev_open_by_dev/path()
Jan Kara [Wed, 27 Sep 2023 09:34:31 +0000 (11:34 +0200)] 
nfs/blocklayout: Convert to use bdev_open_by_dev/path()

Convert block device handling to use bdev_open_by_dev/path() and pass
the handle around.

CC: linux-nfs@vger.kernel.org
CC: Trond Myklebust <trond.myklebust@hammerspace.com>
CC: Anna Schumaker <anna@kernel.org>
Acked-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230927093442.25915-25-jack@suse.cz
Signed-off-by: Christian Brauner <brauner@kernel.org>
7 months agojfs: Convert to bdev_open_by_dev()
Jan Kara [Wed, 27 Sep 2023 09:34:30 +0000 (11:34 +0200)] 
jfs: Convert to bdev_open_by_dev()

Convert jfs to use bdev_open_by_dev() and pass the handle around.

CC: Dave Kleikamp <shaggy@kernel.org>
CC: jfs-discussion@lists.sourceforge.net
Acked-by: Christoph Hellwig <hch@lst.de>
Acked-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230927093442.25915-24-jack@suse.cz
Signed-off-by: Christian Brauner <brauner@kernel.org>
7 months agof2fs: Convert to bdev_open_by_dev/path()
Jan Kara [Wed, 27 Sep 2023 09:34:29 +0000 (11:34 +0200)] 
f2fs: Convert to bdev_open_by_dev/path()

Convert f2fs to use bdev_open_by_dev/path() and pass the handle around.

CC: Jaegeuk Kim <jaegeuk@kernel.org>
CC: Chao Yu <chao@kernel.org>
CC: linux-f2fs-devel@lists.sourceforge.net
Acked-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230927093442.25915-23-jack@suse.cz
Signed-off-by: Christian Brauner <brauner@kernel.org>
7 months agoext4: Convert to bdev_open_by_dev()
Jan Kara [Wed, 27 Sep 2023 09:34:28 +0000 (11:34 +0200)] 
ext4: Convert to bdev_open_by_dev()

Convert ext4 to use bdev_open_by_dev() and pass the handle around.

CC: linux-ext4@vger.kernel.org
CC: Ted Tso <tytso@mit.edu>
Acked-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230927093442.25915-22-jack@suse.cz
Signed-off-by: Christian Brauner <brauner@kernel.org>
7 months agoerofs: Convert to use bdev_open_by_path()
Jan Kara [Wed, 27 Sep 2023 09:34:27 +0000 (11:34 +0200)] 
erofs: Convert to use bdev_open_by_path()

Convert erofs to use bdev_open_by_path() and pass the handle around.

CC: Gao Xiang <xiang@kernel.org>
CC: Chao Yu <chao@kernel.org>
CC: linux-erofs@lists.ozlabs.org
Acked-by: Christoph Hellwig <hch@lst.de>
Acked-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230927093442.25915-21-jack@suse.cz
Signed-off-by: Christian Brauner <brauner@kernel.org>
7 months agobtrfs: Convert to bdev_open_by_path()
Jan Kara [Wed, 27 Sep 2023 09:34:26 +0000 (11:34 +0200)] 
btrfs: Convert to bdev_open_by_path()

Convert btrfs to use bdev_open_by_path() and pass the handle around.  We
also drop the holder from struct btrfs_device as it is now not needed
anymore.

CC: David Sterba <dsterba@suse.com>
CC: linux-btrfs@vger.kernel.org
Acked-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230927093442.25915-20-jack@suse.cz
Signed-off-by: Christian Brauner <brauner@kernel.org>
7 months agofs: Convert to bdev_open_by_dev()
Jan Kara [Wed, 27 Sep 2023 09:34:25 +0000 (11:34 +0200)] 
fs: Convert to bdev_open_by_dev()

Convert mount code to use bdev_open_by_dev() and propagate the handle
around to bdev_release().

Acked-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230927093442.25915-19-jack@suse.cz
Signed-off-by: Christian Brauner <brauner@kernel.org>
7 months agomm/swap: Convert to use bdev_open_by_dev()
Jan Kara [Wed, 27 Sep 2023 09:34:24 +0000 (11:34 +0200)] 
mm/swap: Convert to use bdev_open_by_dev()

Convert swapping code to use bdev_open_by_dev() and pass the handle
around.

CC: linux-mm@kvack.org
CC: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230927093442.25915-18-jack@suse.cz
Signed-off-by: Christian Brauner <brauner@kernel.org>
7 months agoPM: hibernate: Drop unused snapshot_test argument
Jan Kara [Wed, 27 Sep 2023 09:34:23 +0000 (11:34 +0200)] 
PM: hibernate: Drop unused snapshot_test argument

snapshot_test argument is now unused in swsusp_close() and
load_image_and_restore(). Drop it

CC: linux-pm@vger.kernel.org
Acked-by: Christoph Hellwig <hch@lst.de>
Acked-by: "Rafael J. Wysocki" <rafael@kernel.org>
Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230927093442.25915-17-jack@suse.cz
Signed-off-by: Christian Brauner <brauner@kernel.org>
7 months agoPM: hibernate: Convert to bdev_open_by_dev()
Jan Kara [Wed, 27 Sep 2023 09:34:22 +0000 (11:34 +0200)] 
PM: hibernate: Convert to bdev_open_by_dev()

Convert hibernation code to use bdev_open_by_dev().

CC: linux-pm@vger.kernel.org
Acked-by: Christoph Hellwig <hch@lst.de>
Acked-by: "Rafael J. Wysocki" <rafael@kernel.org>
Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230927093442.25915-16-jack@suse.cz
Signed-off-by: Christian Brauner <brauner@kernel.org>
7 months agoscsi: target: Convert to bdev_open_by_path()
Jan Kara [Wed, 27 Sep 2023 09:34:21 +0000 (11:34 +0200)] 
scsi: target: Convert to bdev_open_by_path()

Convert iblock and pscsi drivers to use bdev_open_by_path() and pass the
handle around.

CC: target-devel@vger.kernel.org
CC: linux-scsi@vger.kernel.org
Acked-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230927093442.25915-15-jack@suse.cz
Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
7 months agos390/dasd: Convert to bdev_open_by_path()
Jan Kara [Wed, 27 Sep 2023 09:34:20 +0000 (11:34 +0200)] 
s390/dasd: Convert to bdev_open_by_path()

Convert dasd to use bdev_open_by_path() and pass the handle around.

CC: linux-s390@vger.kernel.org
CC: Christian Borntraeger <borntraeger@linux.ibm.com>
CC: Sven Schnelle <svens@linux.ibm.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230927093442.25915-14-jack@suse.cz
Signed-off-by: Christian Brauner <brauner@kernel.org>
7 months agonvmet: Convert to bdev_open_by_path()
Jan Kara [Wed, 27 Sep 2023 09:34:19 +0000 (11:34 +0200)] 
nvmet: Convert to bdev_open_by_path()

Convert nvmet to use bdev_open_by_path() and pass the handle around.

CC: linux-nvme@lists.infradead.org
Acked-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian Brauner <brauner@kernel.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230927093442.25915-13-jack@suse.cz
Signed-off-by: Christian Brauner <brauner@kernel.org>
7 months agomtd: block2mtd: Convert to bdev_open_by_dev/path()
Jan Kara [Wed, 27 Sep 2023 09:34:18 +0000 (11:34 +0200)] 
mtd: block2mtd: Convert to bdev_open_by_dev/path()

Convert block2mtd to use bdev_open_by_dev() and bdev_open_by_path() and
pass the handle around.

CC: Joern Engel <joern@lazybastard.org>
CC: linux-mtd@lists.infradead.org
Acked-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230927093442.25915-12-jack@suse.cz
Signed-off-by: Christian Brauner <brauner@kernel.org>
7 months agomd: Convert to bdev_open_by_dev()
Jan Kara [Wed, 27 Sep 2023 09:34:17 +0000 (11:34 +0200)] 
md: Convert to bdev_open_by_dev()

Convert md to use bdev_open_by_dev() and pass the handle around. We also
don't need the 'Holder' flag anymore so remove it.

CC: linux-raid@vger.kernel.org
CC: Song Liu <song@kernel.org>
Acked-by: Song Liu <song@kernel.org>
Acked-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230927093442.25915-11-jack@suse.cz
Signed-off-by: Christian Brauner <brauner@kernel.org>
7 months agodm: Convert to bdev_open_by_dev()
Jan Kara [Wed, 27 Sep 2023 09:34:16 +0000 (11:34 +0200)] 
dm: Convert to bdev_open_by_dev()

Convert device mapper to use bdev_open_by_dev() and pass the handle
around.

CC: Alasdair Kergon <agk@redhat.com>
CC: Mike Snitzer <snitzer@kernel.org>
CC: dm-devel@redhat.com
Acked-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230927093442.25915-10-jack@suse.cz
Signed-off-by: Christian Brauner <brauner@kernel.org>
7 months agobcache: Convert to bdev_open_by_path()
Jan Kara [Wed, 27 Sep 2023 09:34:15 +0000 (11:34 +0200)] 
bcache: Convert to bdev_open_by_path()

Convert bcache to use bdev_open_by_path() and pass the handle around.

CC: linux-bcache@vger.kernel.org
CC: Coly Li <colyli@suse.de>
CC: Kent Overstreet <kent.overstreet@gmail.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian Brauner <brauner@kernel.org>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230927093442.25915-9-jack@suse.cz
Signed-off-by: Christian Brauner <brauner@kernel.org>
7 months agozram: Convert to use bdev_open_by_dev()
Jan Kara [Wed, 27 Sep 2023 09:34:14 +0000 (11:34 +0200)] 
zram: Convert to use bdev_open_by_dev()

Convert zram to use bdev_open_by_dev() and pass the handle around.

CC: Minchan Kim <minchan@kernel.org>
CC: Sergey Senozhatsky <senozhatsky@chromium.org>
Acked-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian Brauner <brauner@kernel.org>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230927093442.25915-8-jack@suse.cz
Signed-off-by: Christian Brauner <brauner@kernel.org>
7 months agoxen/blkback: Convert to bdev_open_by_dev()
Jan Kara [Wed, 27 Sep 2023 09:34:13 +0000 (11:34 +0200)] 
xen/blkback: Convert to bdev_open_by_dev()

Convert xen/blkback to use bdev_open_by_dev() and pass the
handle around.

CC: xen-devel@lists.xenproject.org
Acked-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230927093442.25915-7-jack@suse.cz
Signed-off-by: Christian Brauner <brauner@kernel.org>
7 months agornbd-srv: Convert to use bdev_open_by_path()
Jan Kara [Wed, 27 Sep 2023 09:34:12 +0000 (11:34 +0200)] 
rnbd-srv: Convert to use bdev_open_by_path()

Convert rnbd-srv to use bdev_open_by_path() and pass the handle
around.

CC: Jack Wang <jinpu.wang@ionos.com>
CC: "Md. Haris Iqbal" <haris.iqbal@ionos.com>
Acked-by: "Md. Haris Iqbal" <haris.iqbal@ionos.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230927093442.25915-6-jack@suse.cz
Signed-off-by: Christian Brauner <brauner@kernel.org>
7 months agopktcdvd: Convert to bdev_open_by_dev()
Jan Kara [Wed, 27 Sep 2023 09:34:11 +0000 (11:34 +0200)] 
pktcdvd: Convert to bdev_open_by_dev()

Convert pktcdvd to use bdev_open_by_dev().

Acked-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230927093442.25915-5-jack@suse.cz
Signed-off-by: Christian Brauner <brauner@kernel.org>
7 months agodrdb: Convert to use bdev_open_by_path()
Jan Kara [Wed, 27 Sep 2023 09:34:10 +0000 (11:34 +0200)] 
drdb: Convert to use bdev_open_by_path()

Convert drdb to use bdev_open_by_path().

CC: drbd-dev@lists.linbit.com
Acked-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230927093442.25915-4-jack@suse.cz
Signed-off-by: Christian Brauner <brauner@kernel.org>
7 months agoblock: Use bdev_open_by_dev() in disk_scan_partitions() and blkdev_bszset()
Jan Kara [Wed, 27 Sep 2023 09:34:09 +0000 (11:34 +0200)] 
block: Use bdev_open_by_dev() in disk_scan_partitions() and blkdev_bszset()

Convert disk_scan_partitions() and blkdev_bszset() to use
bdev_open_by_dev().

Acked-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230927093442.25915-3-jack@suse.cz
Signed-off-by: Christian Brauner <brauner@kernel.org>
7 months agoblock: Use bdev_open_by_dev() in blkdev_open()
Jan Kara [Wed, 27 Sep 2023 09:34:08 +0000 (11:34 +0200)] 
block: Use bdev_open_by_dev() in blkdev_open()

Convert blkdev_open() to use bdev_open_by_dev(). To be able to propagate
handle from blkdev_open() to blkdev_release() we need to stop using
existence of file->private_data to determine exclusive block device
opens. Use bdev_handle->mode for this purpose since file->f_flags
isn't usable for this (O_EXCL is cleared from the flags during open).

Acked-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230927093442.25915-2-jack@suse.cz
Signed-off-by: Christian Brauner <brauner@kernel.org>
7 months agoblock: Provide bdev_open_* functions
Jan Kara [Wed, 27 Sep 2023 09:34:07 +0000 (11:34 +0200)] 
block: Provide bdev_open_* functions

Create struct bdev_handle that contains all parameters that need to be
passed to blkdev_put() and provide bdev_open_* functions that return
this structure instead of plain bdev pointer. This will eventually allow
us to pass one more argument to blkdev_put() (renamed to bdev_release())
without too much hassle.

Acked-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230927093442.25915-1-jack@suse.cz
Signed-off-by: Christian Brauner <brauner@kernel.org>
7 months agoMerge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 28 Oct 2023 02:52:51 +0000 (16:52 -1000)] 
Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull clk fixes from Stephen Boyd:
 "Three fixes, one for the clk framework and two for clk drivers:

   - Avoid an oops in possible_parent_show() by checking for no parent
     properly when a DT index based lookup is used

   - Handle errors returned from divider_ro_round_rate() in
     clk_stm32_composite_determine_rate()

   - Fix clk_ops::determine_rate() implementation of socfpga's
     gateclk_ops that was ruining uart output because the divider
     was forgotten about"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: stm32: Fix a signedness issue in clk_stm32_composite_determine_rate()
  clk: Sanitize possible_parent_show to Handle Return Value of of_clk_get_parent_name
  clk: socfpga: gate: Account for the divider in determine_rate

7 months agoMerge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Linus Torvalds [Sat, 28 Oct 2023 02:44:58 +0000 (16:44 -1000)] 
Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull misc filesystem fixes from Al Viro:
 "Assorted fixes all over the place: literally nothing in common, could
  have been three separate pull requests.

  All are simple regression fixes, but not for anything from this cycle"

* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  ceph_wait_on_conflict_unlink(): grab reference before dropping ->d_lock
  io_uring: kiocb_done() should *not* trust ->ki_pos if ->{read,write}_iter() failed
  sparc32: fix a braino in fault handling in csum_and_copy_..._user()

7 months agotracing/kprobes: Fix symbol counting logic by looking at modules as well
Andrii Nakryiko [Fri, 27 Oct 2023 23:31:26 +0000 (16:31 -0700)] 
tracing/kprobes: Fix symbol counting logic by looking at modules as well

Recent changes to count number of matching symbols when creating
a kprobe event failed to take into account kernel modules. As such, it
breaks kprobes on kernel module symbols, by assuming there is no match.

Fix this my calling module_kallsyms_on_each_symbol() in addition to
kallsyms_on_each_match_symbol() to perform a proper counting.

Link: https://lore.kernel.org/all/20231027233126.2073148-1-andrii@kernel.org/
Cc: Francis Laniel <flaniel@linux.microsoft.com>
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Fixes: b022f0c7e404 ("tracing/kprobes: Return EADDRNOTAVAIL when func matches several symbols")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
7 months agoceph_wait_on_conflict_unlink(): grab reference before dropping ->d_lock
Al Viro [Fri, 15 Sep 2023 01:55:29 +0000 (21:55 -0400)] 
ceph_wait_on_conflict_unlink(): grab reference before dropping ->d_lock

Use of dget() after we'd dropped ->d_lock is too late - dentry might
be gone by that point.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
7 months agoio_uring: kiocb_done() should *not* trust ->ki_pos if ->{read,write}_iter() failed
Al Viro [Mon, 28 Aug 2023 22:47:31 +0000 (18:47 -0400)] 
io_uring: kiocb_done() should *not* trust ->ki_pos if ->{read,write}_iter() failed

->ki_pos value is unreliable in such cases.  For an obvious example,
consider O_DSYNC write - we feed the data to page cache and start IO,
then we make sure it's completed.  Update of ->ki_pos is dealt with
by the first part; failure in the second ends up with negative value
returned _and_ ->ki_pos left advanced as if sync had been successful.
In the same situation write(2) does not advance the file position
at all.

Reviewed-by: Christian Brauner <brauner@kernel.org>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
7 months agoMerge tag 'io_uring-6.6-2023-10-27' of git://git.kernel.dk/linux
Linus Torvalds [Sat, 28 Oct 2023 00:10:32 +0000 (14:10 -1000)] 
Merge tag 'io_uring-6.6-2023-10-27' of git://git.kernel.dk/linux

Pull io_uring fixes from Jens Axboe:
 "Fix for an issue reported where reading fdinfo could find a NULL
  thread as we didn't properly synchronize, and then a disable for the
  IOCB_DIO_CALLER_COMP optimization as a recent reported highlighted how
  that could lead to deadlocks if the task issued async O_DIRECT writes
  and then proceeded to do sync fallocate() calls"

* tag 'io_uring-6.6-2023-10-27' of git://git.kernel.dk/linux:
  io_uring/rw: disable IOCB_DIO_CALLER_COMP
  io_uring/fdinfo: lock SQ thread while retrieving thread cpu/pid

7 months agosparc32: fix a braino in fault handling in csum_and_copy_..._user()
Al Viro [Sun, 22 Oct 2023 23:34:28 +0000 (19:34 -0400)] 
sparc32: fix a braino in fault handling in csum_and_copy_..._user()

Fault handler used to make non-trivial calls, so it needed
to set a stack frame up.  Used to be
save ... - grab a stack frame, old %o... become %i...
....
ret - go back to address originally in %o7, currently %i7
 restore - switch to previous stack frame, in delay slot
Non-trivial calls had been gone since ab5e8b331244 and that code should
have become
retl - go back to address in %o7
 clr %o0 - have return value set to 0
What it had become instead was
ret - go back to address in %i7 - return address of *caller*
 clr %o0 - have return value set to 0
which is not good, to put it mildly - we forcibly return 0 from
csum_and_copy_{from,to}_iter() (which is what the call of that
thing had been inlined into) and do that without dropping the
stack frame of said csum_and_copy_..._iter().  Confuses the
hell out of the caller of csum_and_copy_..._iter(), obviously...

Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Fixes: ab5e8b331244 "sparc32: propagate the calling conventions change down to __csum_partial_copy_sparc_generic()"
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
7 months agoMerge tag 'block-6.6-2023-10-27' of git://git.kernel.dk/linux
Linus Torvalds [Sat, 28 Oct 2023 00:01:59 +0000 (14:01 -1000)] 
Merge tag 'block-6.6-2023-10-27' of git://git.kernel.dk/linux

Pull block fix from Jens Axboe:
 "Just a single fix for a potential divide-by-zero, introduced in this
  cycle"

* tag 'block-6.6-2023-10-27' of git://git.kernel.dk/linux:
  blk-throttle: check for overflow in calculate_bytes_allowed

7 months agoMerge tag 'ata-6.6-final' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal...
Linus Torvalds [Fri, 27 Oct 2023 23:38:59 +0000 (13:38 -1000)] 
Merge tag 'ata-6.6-final' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata

Pull ATA fix from Damien Le Moal:
 "A single patch to fix a regression introduced by the recent
  suspend/resume fixes.

  The regression is that ATA disks are not stopped on system shutdown,
  which is not recommended and increases the disks SMART counters for
  unclean power off events.

  This patch fixes this by refining the recent rework of the scsi device
  manage_xxx flags"

* tag 'ata-6.6-final' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
  scsi: sd: Introduce manage_shutdown device flag

7 months agoMerge tag 'platform-drivers-x86-v6.6-6' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 27 Oct 2023 23:32:48 +0000 (13:32 -1000)] 
Merge tag 'platform-drivers-x86-v6.6-6' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86

Pull x86 platform driver fix from Hans de Goede:
 "A single patch to extend the AMD PMC driver DMI quirk list
  for laptops which need special handling to avoid NVME s2idle
  suspend/resume errors"

* tag 'platform-drivers-x86-v6.6-6' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  platform/x86: Add s2idle quirk for more Lenovo laptops

7 months agox86/tsc: Defer marking TSC unstable to a worker
Thomas Gleixner [Wed, 25 Oct 2023 21:31:35 +0000 (23:31 +0200)] 
x86/tsc: Defer marking TSC unstable to a worker

Tetsuo reported the following lockdep splat when the TSC synchronization
fails during CPU hotplug:

   tsc: Marking TSC unstable due to check_tsc_sync_source failed

   WARNING: inconsistent lock state
   inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
   ffffffff8cfa1c78 (watchdog_lock){?.-.}-{2:2}, at: clocksource_watchdog+0x23/0x5a0
   {IN-HARDIRQ-W} state was registered at:
     _raw_spin_lock_irqsave+0x3f/0x60
     clocksource_mark_unstable+0x1b/0x90
     mark_tsc_unstable+0x41/0x50
     check_tsc_sync_source+0x14f/0x180
     sysvec_call_function_single+0x69/0x90

   Possible unsafe locking scenario:
     lock(watchdog_lock);
     <Interrupt>
       lock(watchdog_lock);

   stack backtrace:
    _raw_spin_lock+0x30/0x40
    clocksource_watchdog+0x23/0x5a0
    run_timer_softirq+0x2a/0x50
    sysvec_apic_timer_interrupt+0x6e/0x90

The reason is the recent conversion of the TSC synchronization function
during CPU hotplug on the control CPU to a SMP function call. In case
that the synchronization with the upcoming CPU fails, the TSC has to be
marked unstable via clocksource_mark_unstable().

clocksource_mark_unstable() acquires 'watchdog_lock', but that lock is
taken with interrupts enabled in the watchdog timer callback to minimize
interrupt disabled time. That's obviously a possible deadlock scenario,

Before that change the synchronization function was invoked in thread
context so this could not happen.

As it is not crucical whether the unstable marking happens slightly
delayed, defer the call to a worker thread which avoids the lock context
problem.

Fixes: 9d349d47f0e3 ("x86/smpboot: Make TSC synchronization function call based")
Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/87zg064ceg.ffs@tglx
7 months agox86/i8259: Skip probing when ACPI/MADT advertises PCAT compatibility
Thomas Gleixner [Wed, 25 Oct 2023 21:04:15 +0000 (23:04 +0200)] 
x86/i8259: Skip probing when ACPI/MADT advertises PCAT compatibility

David and a few others reported that on certain newer systems some legacy
interrupts fail to work correctly.

Debugging revealed that the BIOS of these systems leaves the legacy PIC in
uninitialized state which makes the PIC detection fail and the kernel
switches to a dummy implementation.

Unfortunately this fallback causes quite some code to fail as it depends on
checks for the number of legacy PIC interrupts or the availability of the
real PIC.

In theory there is no reason to use the PIC on any modern system when
IO/APIC is available, but the dependencies on the related checks cannot be
resolved trivially and on short notice. This needs lots of analysis and
rework.

The PIC detection has been added to avoid quirky checks and force selection
of the dummy implementation all over the place, especially in VM guest
scenarios. So it's not an option to revert the relevant commit as that
would break a lot of other scenarios.

One solution would be to try to initialize the PIC on detection fail and
retry the detection, but that puts the burden on everything which does not
have a PIC.

Fortunately the ACPI/MADT table header has a flag field, which advertises
in bit 0 that the system is PCAT compatible, which means it has a legacy
8259 PIC.

Evaluate that bit and if set avoid the detection routine and keep the real
PIC installed, which then gets initialized (for nothing) and makes the rest
of the code with all the dependencies work again.

Fixes: e179f6914152 ("x86, irq, pic: Probe for legacy PIC and set legacy_pic appropriately")
Reported-by: David Lazar <dlazar@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: David Lazar <dlazar@gmail.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Cc: stable@vger.kernel.org
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218003
Link: https://lore.kernel.org/r/875y2u5s8g.ffs@tglx
7 months agox86/cpu: Add model number for Intel Arrow Lake mobile processor
Tony Luck [Wed, 25 Oct 2023 20:25:13 +0000 (13:25 -0700)] 
x86/cpu: Add model number for Intel Arrow Lake mobile processor

For "reasons" Intel has code-named this CPU with a "_H" suffix.

[ dhansen: As usual, apply this and send it upstream quickly to
   make it easier for anyone who is doing work that
   consumes this. ]

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/all/20231025202513.12358-1-tony.luck%40intel.com
7 months agoMerge tag 'iommu-fix-v6.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro...
Linus Torvalds [Fri, 27 Oct 2023 15:43:05 +0000 (05:43 -1000)] 
Merge tag 'iommu-fix-v6.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu fix from Joerg Roedel:

 - Fix boot regression for Sapphire Rapids with Intel VT-d driver

* tag 'iommu-fix-v6.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu: Avoid unnecessary cache invalidations

7 months agoMerge tag 'powerpc-6.6-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc...
Linus Torvalds [Fri, 27 Oct 2023 15:40:42 +0000 (05:40 -1000)] 
Merge tag 'powerpc-6.6-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:

 - Fix boot crash with FLATMEM since set_ptes() introduction

 - Avoid calling arch_enter/leave_lazy_mmu() in set_ptes()

Thanks to Aneesh Kumar K.V and Erhard Furtner.

* tag 'powerpc-6.6-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/mm: Avoid calling arch_enter/leave_lazy_mmu() in set_ptes
  powerpc/mm: Fix boot crash with FLATMEM

7 months agoplatform/x86: Add s2idle quirk for more Lenovo laptops
David Lazar [Wed, 25 Oct 2023 19:30:16 +0000 (21:30 +0200)] 
platform/x86: Add s2idle quirk for more Lenovo laptops

When suspending to idle and resuming on some Lenovo laptops using the
Mendocino APU, multiple NVME IOMMU page faults occur, showing up in
dmesg as repeated errors:

nvme 0000:01:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000b
address=0xb6674000 flags=0x0000]

The system is unstable afterwards.

Applying the s2idle quirk introduced by commit 455cd867b85b ("platform/x86:
thinkpad_acpi: Add a s2idle resume quirk for a number of laptops")
allows these systems to work with the IOMMU enabled and s2idle
resume to work.

Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218024
Suggested-by: Mario Limonciello <mario.limonciello@amd.com>
Suggested-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Signed-off-by: David Lazar <dlazar@gmail.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Link: https://lore.kernel.org/r/ZTlsyOaFucF2pWrL@localhost
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
7 months agotracing/kprobes: Fix the description of variable length arguments
Yujie Liu [Fri, 27 Oct 2023 04:13:14 +0000 (12:13 +0800)] 
tracing/kprobes: Fix the description of variable length arguments

Fix the following kernel-doc warnings:

kernel/trace/trace_kprobe.c:1029: warning: Excess function parameter 'args' description in '__kprobe_event_gen_cmd_start'
kernel/trace/trace_kprobe.c:1097: warning: Excess function parameter 'args' description in '__kprobe_event_add_fields'

Refer to the usage of variable length arguments elsewhere in the kernel
code, "@..." is the proper way to express it in the description.

Link: https://lore.kernel.org/all/20231027041315.2613166-1-yujie.liu@intel.com/
Fixes: 2a588dd1d5d6 ("tracing: Add kprobe event command generation functions")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202310190437.paI6LYJF-lkp@intel.com/
Signed-off-by: Yujie Liu <yujie.liu@intel.com>
Reviewed-by: Mukesh Ojha <quic_mojha@quicinc.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
7 months agoiommu: Avoid unnecessary cache invalidations
Lu Baolu [Thu, 26 Oct 2023 08:49:42 +0000 (16:49 +0800)] 
iommu: Avoid unnecessary cache invalidations

The iommu_create_device_direct_mappings() only needs to flush the caches
when the mappings are changed in the affected domain. This is not true
for non-DMA domains, or for devices attached to the domain that have no
reserved regions. To avoid unnecessary cache invalidations, add a check
before iommu_flush_iotlb_all().

Fixes: a48ce36e2786 ("iommu: Prevent RESV_DIRECT devices from blocking domains")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Henry Willard <henry.willard@oracle.com>
Link: https://lore.kernel.org/r/20231026084942.17387-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
7 months agoMerge tag 'drm-fixes-2023-10-27' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Fri, 27 Oct 2023 06:42:02 +0000 (20:42 -1000)] 
Merge tag 'drm-fixes-2023-10-27' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "This is the final set of fixes for 6.6, just misc bits mainly in
  amdgpu and i915, nothing too noteworthy.

  amdgpu:
   - ignore duplicated BOs in CS parser
   - remove redundant call to amdgpu_ctx_priority_is_valid()
   - Extend VI APSM quirks to more platforms

  amdkfd:
   - reserve fence slot while locking BO

  dp_mst:
   - Fix NULL deref in get_mst_branch_device_by_guid_helper()

  logicvc:
   - Kconfig: Select REGMAP and REGMAP_MMIO

  ivpu:
   - Fix missing VPUIP interrupts

  i915:
   - Determine context valid in OA reports
   - Hold GT forcewake during steering operations
   - Check if PMU is closed before stopping event"

* tag 'drm-fixes-2023-10-27' of git://anongit.freedesktop.org/drm/drm:
  accel/ivpu/37xx: Fix missing VPUIP interrupts
  drm/amd: Disable ASPM for VI w/ all Intel systems
  drm/i915/pmu: Check if pmu is closed before stopping event
  drm/i915/mcr: Hold GT forcewake during steering operations
  drm/logicvc: Kconfig: select REGMAP and REGMAP_MMIO
  drm/i915/perf: Determine context valid in OA reports
  drm/amdkfd: reserve a fence slot while locking the BO
  drm/amdgpu: Remove redundant call to priority_is_valid()
  drm/dp_mst: Fix NULL deref in get_mst_branch_device_by_guid_helper()
  drm/amdgpu: ignore duplicate BOs again

7 months agoMerge tag 'amd-drm-fixes-6.6-2023-10-25' of https://gitlab.freedesktop.org/agd5f...
Dave Airlie [Fri, 27 Oct 2023 02:13:29 +0000 (12:13 +1000)] 
Merge tag 'amd-drm-fixes-6.6-2023-10-25' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-6.6-2023-10-25:

amdgpu:
- Extend VI APSM quirks to more platforms

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231026035452.14921-1-alexander.deucher@amd.com
7 months agoMerge tag 'drm-intel-fixes-2023-10-26' of git://anongit.freedesktop.org/drm/drm-intel...
Dave Airlie [Fri, 27 Oct 2023 01:58:28 +0000 (11:58 +1000)] 
Merge tag 'drm-intel-fixes-2023-10-26' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

- Determine context valid in OA reports (Umesh)
- Hold GT forcewake during steering operations (Matt Roper)
- Check if PMU is closed before stopping event (Umesh)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZTp8IQ0wxzxVjN7J@intel.com
7 months agoMerge tag 'drm-misc-fixes-2023-10-26' of git://anongit.freedesktop.org/drm/drm-misc...
Dave Airlie [Fri, 27 Oct 2023 01:50:51 +0000 (11:50 +1000)] 
Merge tag 'drm-misc-fixes-2023-10-26' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

Short summary of fixes pull:

amdgpu:
- ignore duplicated BOs in CS parser
- remove redundant call to amdgpu_ctx_priority_is_valid()

amdkfd:
- reserve fence slot while locking BO

dp_mst:
- Fix NULL deref in get_mst_branch_device_by_guid_helper()

logicvc:
- Kconfig: Select REGMAP and REGMAP_MMIO

ivpu:
- Fix missing VPUIP interrupts

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20231026110132.GA10591@linux-uq9g.fritz.box
7 months agoscsi: sd: Introduce manage_shutdown device flag
Damien Le Moal [Wed, 25 Oct 2023 06:46:12 +0000 (15:46 +0900)] 
scsi: sd: Introduce manage_shutdown device flag

Commit aa3998dbeb3a ("ata: libata-scsi: Disable scsi device
manage_system_start_stop") change setting the manage_system_start_stop
flag to false for libata managed disks to enable libata internal
management of disk suspend/resume. However, a side effect of this change
is that on system shutdown, disks are no longer being stopped (set to
standby mode with the heads unloaded). While this is not a critical
issue, this unclean shutdown is not recommended and shows up with
increased smart counters (e.g. the unexpected power loss counter
"Unexpect_Power_Loss_Ct").

Instead of defining a shutdown driver method for all ATA adapter
drivers (not all of them define that operation), this patch resolves
this issue by further refining the sd driver start/stop control of disks
using the new flag manage_shutdown. If this new flag is set to true by
a low level driver, the function sd_shutdown() will issue a
START STOP UNIT command with the start argument set to 0 when a disk
needs to be powered off (suspended) on system power off, that is, when
system_state is equal to SYSTEM_POWER_OFF.

Similarly to the other manage_xxx flags, the new manage_shutdown flag is
exposed through sysfs as a read-write device attribute.

To avoid any confusion between manage_shutdown and
manage_system_start_stop, the comments describing these flags in
include/scsi/scsi.h are also improved.

Fixes: aa3998dbeb3a ("ata: libata-scsi: Disable scsi device manage_system_start_stop")
Cc: stable@vger.kernel.org
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218038
Link: https://lore.kernel.org/all/cd397c88-bf53-4768-9ab8-9d107df9e613@gmail.com/
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
7 months agoMerge tag 'soc-fixes-6.7-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Linus Torvalds [Thu, 26 Oct 2023 18:17:26 +0000 (08:17 -1000)] 
Merge tag 'soc-fixes-6.7-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull ARM SoC fixes from Arnd Bergmann:
 "A couple of platforms have some last-minute fixes, in particular:

   - riscv gets some fixes for noncoherent DMA on the renesas and thead
     platforms and dts fix for SPI on the visionfive 2 board

   - Qualcomm Snapdragon gets three dts fixes to address board specific
     regressions on the pmic and gpio nodes

   - Rockchip platforms get multiple dts fixes to address issues on the
     recent rk3399 platform as well as the older rk3128 platform that
     apparently regressed a while ago.

   - TI OMAP gets some trivial code and dts fixes and a regression fix
     for the omap1 ams-delta modem

   - NXP i.MX firmware has one fix for a use-after-free but in its error
     handling"

* tag 'soc-fixes-6.7-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (25 commits)
  soc: renesas: ARCH_R9A07G043 depends on !RISCV_ISA_ZICBOM
  riscv: only select DMA_DIRECT_REMAP from RISCV_ISA_ZICBOM and ERRATA_THEAD_PBMT
  riscv: RISCV_NONSTANDARD_CACHE_OPS shouldn't depend on RISCV_DMA_NONCOHERENT
  riscv: dts: thead: set dma-noncoherent to soc bus
  arm64: dts: rockchip: Fix i2s0 pin conflict on ROCK Pi 4 boards
  arm64: dts: rockchip: Add i2s0-2ch-bus-bclk-off pins to RK3399
  clk: ti: Fix missing omap5 mcbsp functional clock and aliases
  clk: ti: Fix missing omap4 mcbsp functional clock and aliases
  ARM: OMAP1: ams-delta: Fix MODEM initialization failure
  soc: renesas: Make ARCH_R9A07G043 depend on required options
  riscv: dts: starfive: visionfive 2: correct spi's ss pin
  firmware/imx-dsp: Fix use_after_free in imx_dsp_setup_channels()
  ARM: OMAP: timer32K: fix all kernel-doc warnings
  ARM: omap2: fix a debug printk
  ARM: dts: rockchip: Fix timer clocks for RK3128
  ARM: dts: rockchip: Add missing quirk for RK3128's dma engine
  ARM: dts: rockchip: Add missing arm timer interrupt for RK3128
  ARM: dts: rockchip: Fix i2c0 register address for RK3128
  arm64: dts: rockchip: set codec system-clock-fixed on px30-ringneck-haikou
  arm64: dts: rockchip: use codec as clock master on px30-ringneck-haikou
  ...

7 months agoMerge tag 'net-6.6-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 26 Oct 2023 17:41:27 +0000 (07:41 -1000)] 
Merge tag 'net-6.6-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from WiFi and netfilter.

  Most regressions addressed here come from quite old versions, with the
  exceptions of the iavf one and the WiFi fixes. No known outstanding
  reports or investigation.

  Fixes to fixes:

   - eth: iavf: in iavf_down, disable queues when removing the driver

  Previous releases - regressions:

   - sched: act_ct: additional checks for outdated flows

   - tcp: do not leave an empty skb in write queue

   - tcp: fix wrong RTO timeout when received SACK reneging

   - wifi: cfg80211: pass correct pointer to rdev_inform_bss()

   - eth: i40e: sync next_to_clean and next_to_process for programming
     status desc

   - eth: iavf: initialize waitqueues before starting watchdog_task

  Previous releases - always broken:

   - eth: r8169: fix data-races

   - eth: igb: fix potential memory leak in igb_add_ethtool_nfc_entry

   - eth: r8152: avoid writing garbage to the adapter's registers

   - eth: gtp: fix fragmentation needed check with gso"

* tag 'net-6.6-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (43 commits)
  iavf: in iavf_down, disable queues when removing the driver
  vsock/virtio: initialize the_virtio_vsock before using VQs
  net: ipv6: fix typo in comments
  net: ipv4: fix typo in comments
  net/sched: act_ct: additional checks for outdated flows
  netfilter: flowtable: GC pushes back packets to classic path
  i40e: Fix wrong check for I40E_TXR_FLAGS_WB_ON_ITR
  gtp: fix fragmentation needed check with gso
  gtp: uapi: fix GTPA_MAX
  Fix NULL pointer dereference in cn_filter()
  sfc: cleanup and reduce netlink error messages
  net/handshake: fix file ref count in handshake_nl_accept_doit()
  wifi: mac80211: don't drop all unprotected public action frames
  wifi: cfg80211: fix assoc response warning on failed links
  wifi: cfg80211: pass correct pointer to rdev_inform_bss()
  isdn: mISDN: hfcsusb: Spelling fix in comment
  tcp: fix wrong RTO timeout when received SACK reneging
  r8152: Block future register access if register access fails
  r8152: Rename RTL8152_UNPLUG to RTL8152_INACCESSIBLE
  r8152: Check for unplug in r8153b_ups_en() / r8153c_ups_en()
  ...

7 months agoMerge tag 'renesas-fixes-for-v6.6-tag3' of git://git.kernel.org/pub/scm/linux/kernel...
Arnd Bergmann [Thu, 26 Oct 2023 15:06:37 +0000 (17:06 +0200)] 
Merge tag 'renesas-fixes-for-v6.6-tag3' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel into arm/fixes

Renesas fixes for v6.6 (take three)

  - Sort out a few Kconfig dependency issues for the rich set of RISC-V
    non-coherent DMA support.

* tag 'renesas-fixes-for-v6.6-tag3' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel:
  soc: renesas: ARCH_R9A07G043 depends on !RISCV_ISA_ZICBOM
  riscv: only select DMA_DIRECT_REMAP from RISCV_ISA_ZICBOM and ERRATA_THEAD_PBMT
  riscv: RISCV_NONSTANDARD_CACHE_OPS shouldn't depend on RISCV_DMA_NONCOHERENT

Link: https://lore.kernel.org/r/cover.1698312384.git.geert+renesas@glider.be
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
7 months agosoc: renesas: ARCH_R9A07G043 depends on !RISCV_ISA_ZICBOM
Christoph Hellwig [Wed, 18 Oct 2023 05:26:54 +0000 (07:26 +0200)] 
soc: renesas: ARCH_R9A07G043 depends on !RISCV_ISA_ZICBOM

ARCH_R9A07G043 has its own non-standard global pool based DMA coherent
allocator, which conflicts with the remap based RISCV_ISA_ZICBOM version.
Add a proper dependency.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20231018052654.50074-4-hch@lst.de
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
7 months agoriscv: only select DMA_DIRECT_REMAP from RISCV_ISA_ZICBOM and ERRATA_THEAD_PBMT
Christoph Hellwig [Wed, 18 Oct 2023 05:26:53 +0000 (07:26 +0200)] 
riscv: only select DMA_DIRECT_REMAP from RISCV_ISA_ZICBOM and ERRATA_THEAD_PBMT

RISCV_DMA_NONCOHERENT is also used for whacky non-standard
non-coherent ops that use different hooks in dma-direct.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Tested-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20231018052654.50074-3-hch@lst.de
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
7 months agoriscv: RISCV_NONSTANDARD_CACHE_OPS shouldn't depend on RISCV_DMA_NONCOHERENT
Christoph Hellwig [Wed, 18 Oct 2023 05:26:52 +0000 (07:26 +0200)] 
riscv: RISCV_NONSTANDARD_CACHE_OPS shouldn't depend on RISCV_DMA_NONCOHERENT

RISCV_NONSTANDARD_CACHE_OPS is also used for the pmem cache maintenance
helpers, which are built into the kernel unconditionally.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20231018052654.50074-2-hch@lst.de
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
7 months agoaccel/ivpu/37xx: Fix missing VPUIP interrupts
Karol Wachowski [Tue, 24 Oct 2023 16:19:52 +0000 (18:19 +0200)] 
accel/ivpu/37xx: Fix missing VPUIP interrupts

Move sequence of masking and unmasking global interrupts from buttress
interrupt handler to generic one that handles both VPUIP and BTRS
interrupts. Unmasking global interrupts will re-trigger MSI for any
pending interrupts.

Lack of this sequence will cause the driver to miss any
VPUIP interrupt that comes after reading VPU_37XX_HOST_SS_ICB_STATUS_0
and before clearing all active interrupt sources.

Fixes: 35b137630f08 ("accel/ivpu: Introduce a new DRM driver for Intel VPU")
Cc: stable@vger.kernel.org
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231024161952.759914-1-stanislaw.gruszka@linux.intel.com
7 months agoiavf: in iavf_down, disable queues when removing the driver
Michal Schmidt [Wed, 25 Oct 2023 18:32:13 +0000 (11:32 -0700)] 
iavf: in iavf_down, disable queues when removing the driver

In iavf_down, we're skipping the scheduling of certain operations if
the driver is being removed. However, the IAVF_FLAG_AQ_DISABLE_QUEUES
request must not be skipped in this case, because iavf_close waits
for the transition to the __IAVF_DOWN state, which happens in
iavf_virtchnl_completion after the queues are released.

Without this fix, "rmmod iavf" takes half a second per interface that's
up and prints the "Device resources not yet released" warning.

Fixes: c8de44b577eb ("iavf: do not process adminq tasks when __IAVF_IN_REMOVE_TASK is set")
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Tested-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231025183213.874283-1-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 months agoMerge tag 'nf-23-10-25' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Jakub Kicinski [Wed, 25 Oct 2023 23:02:06 +0000 (16:02 -0700)] 
Merge tag 'nf-23-10-25' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

This patch contains two late Netfilter's flowtable fixes for net:

1) Flowtable GC pushes back packets to classic path in every GC run,
   ie. every second. This is because NF_FLOW_HW_ESTABLISHED is only
   used by sched/act_ct (never set) and IPS_SEEN_REPLY might be unset
   by the time the flow is offloaded (this status bit is only reliable
   in the sched/act_ct datapath).

2) sched/act_ct logic to push back packets to classic path to reevaluate
   if UDP flow is unidirectional only applies if IPS_HW_OFFLOAD_BIT is
   set on and no hardware offload request is pending to be handled.
   From Vlad Buslov.

These two patches fixes two problems that were introduced in the
previous 6.5 development cycle.

* tag 'nf-23-10-25' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  net/sched: act_ct: additional checks for outdated flows
  netfilter: flowtable: GC pushes back packets to classic path
====================

Link: https://lore.kernel.org/r/20231025100819.2664-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 months agovsock/virtio: initialize the_virtio_vsock before using VQs
Alexandru Matei [Tue, 24 Oct 2023 19:17:42 +0000 (22:17 +0300)] 
vsock/virtio: initialize the_virtio_vsock before using VQs

Once VQs are filled with empty buffers and we kick the host, it can send
connection requests. If the_virtio_vsock is not initialized before,
replies are silently dropped and do not reach the host.

virtio_transport_send_pkt() can queue packets once the_virtio_vsock is
set, but they won't be processed until vsock->tx_run is set to true. We
queue vsock->send_pkt_work when initialization finishes to send those
packets queued earlier.

Fixes: 0deab087b16a ("vsock/virtio: use RCU to avoid use-after-free on the_virtio_vsock")
Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20231024191742.14259-1-alexandru.matei@uipath.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 months agofile, i915: fix file reference for mmap_singleton()
Christian Brauner [Wed, 25 Oct 2023 10:14:37 +0000 (12:14 +0200)] 
file, i915: fix file reference for mmap_singleton()

Today we got a report at [1] for rcu stalls on the i915 testsuite in [2]
due to the conversion of files to SLAB_TYPSSAFE_BY_RCU. Afaict,
get_file_rcu() goes into an infinite loop trying to carefully verify
that i915->gem.mmap_singleton hasn't changed - see the splat below.

So I stared at this code to figure out what it actually does. It seems
that the i915->gem.mmap_singleton pointer itself never had rcu semantics.

The i915->gem.mmap_singleton is replaced in
file->f_op->release::singleton_release():

        static int singleton_release(struct inode *inode, struct file *file)
        {
                struct drm_i915_private *i915 = file->private_data;

                cmpxchg(&i915->gem.mmap_singleton, file, NULL);
                drm_dev_put(&i915->drm);

                return 0;
        }

The cmpxchg() is ordered against a concurrent update of
i915->gem.mmap_singleton from mmap_singleton(). IOW, when
mmap_singleton() fails to get a reference on i915->gem.mmap_singleton:

While mmap_singleton() does

        rcu_read_lock();
        file = get_file_rcu(&i915->gem.mmap_singleton);
        rcu_read_unlock();

it allocates a new file via anon_inode_getfile() and does

        smp_store_mb(i915->gem.mmap_singleton, file);

So, then what happens in the case of this bug is that at some point
fput() is called and drops the file->f_count to zero leaving the pointer
in i915->gem.mmap_singleton in tact.

Now, there might be delays until
file->f_op->release::singleton_release() is called and
i915->gem.mmap_singleton is set to NULL.

Say concurrently another task hits mmap_singleton() and does:

        rcu_read_lock();
        file = get_file_rcu(&i915->gem.mmap_singleton);
        rcu_read_unlock();

When get_file_rcu() fails to get a reference via atomic_inc_not_zero()
it will try the reload from i915->gem.mmap_singleton expecting it to be
NULL, assuming it has comparable semantics as we expect in
__fget_files_rcu().

But it hasn't so it reloads the same pointer again, trying the same
atomic_inc_not_zero() again and doing so until
file->f_op->release::singleton_release() of the old file has been
called.

So, in contrast to __fget_files_rcu() here we want to not retry when
atomic_inc_not_zero() has failed. We only want to retry in case we
managed to get a reference but the pointer did change on reload.

<3> [511.395679] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
<3> [511.395716] rcu:   Tasks blocked on level-1 rcu_node (CPUs 0-9): P6238
<3> [511.395934] rcu:   (detected by 16, t=65002 jiffies, g=123977, q=439 ncpus=20)
<6> [511.395944] task:i915_selftest   state:R  running task     stack:10568 pid:6238  tgid:6238  ppid:1001   flags:0x00004002
<6> [511.395962] Call Trace:
<6> [511.395966]  <TASK>
<6> [511.395974]  ? __schedule+0x3a8/0xd70
<6> [511.395995]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
<6> [511.396003]  ? lockdep_hardirqs_on+0xc3/0x140
<6> [511.396013]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
<6> [511.396029]  ? get_file_rcu+0x10/0x30
<6> [511.396039]  ? get_file_rcu+0x10/0x30
<6> [511.396046]  ? i915_gem_object_mmap+0xbc/0x450 [i915]
<6> [511.396509]  ? i915_gem_mmap+0x272/0x480 [i915]
<6> [511.396903]  ? mmap_region+0x253/0xb60
<6> [511.396925]  ? do_mmap+0x334/0x5c0
<6> [511.396939]  ? vm_mmap_pgoff+0x9f/0x1c0
<6> [511.396949]  ? rcu_is_watching+0x11/0x50
<6> [511.396962]  ? igt_mmap_offset+0xfc/0x110 [i915]
<6> [511.397376]  ? __igt_mmap+0xb3/0x570 [i915]
<6> [511.397762]  ? igt_mmap+0x11e/0x150 [i915]
<6> [511.398139]  ? __trace_bprintk+0x76/0x90
<6> [511.398156]  ? __i915_subtests+0xbf/0x240 [i915]
<6> [511.398586]  ? __pfx___i915_live_setup+0x10/0x10 [i915]
<6> [511.399001]  ? __pfx___i915_live_teardown+0x10/0x10 [i915]
<6> [511.399433]  ? __run_selftests+0xbc/0x1a0 [i915]
<6> [511.399875]  ? i915_live_selftests+0x4b/0x90 [i915]
<6> [511.400308]  ? i915_pci_probe+0x106/0x200 [i915]
<6> [511.400692]  ? pci_device_probe+0x95/0x120
<6> [511.400704]  ? really_probe+0x164/0x3c0
<6> [511.400715]  ? __pfx___driver_attach+0x10/0x10
<6> [511.400722]  ? __driver_probe_device+0x73/0x160
<6> [511.400731]  ? driver_probe_device+0x19/0xa0
<6> [511.400741]  ? __driver_attach+0xb6/0x180
<6> [511.400749]  ? __pfx___driver_attach+0x10/0x10
<6> [511.400756]  ? bus_for_each_dev+0x77/0xd0
<6> [511.400770]  ? bus_add_driver+0x114/0x210
<6> [511.400781]  ? driver_register+0x5b/0x110
<6> [511.400791]  ? i915_init+0x23/0xc0 [i915]
<6> [511.401153]  ? __pfx_i915_init+0x10/0x10 [i915]
<6> [511.401503]  ? do_one_initcall+0x57/0x270
<6> [511.401515]  ? rcu_is_watching+0x11/0x50
<6> [511.401521]  ? kmalloc_trace+0xa3/0xb0
<6> [511.401532]  ? do_init_module+0x5f/0x210
<6> [511.401544]  ? load_module+0x1d00/0x1f60
<6> [511.401581]  ? init_module_from_file+0x86/0xd0
<6> [511.401590]  ? init_module_from_file+0x86/0xd0
<6> [511.401613]  ? idempotent_init_module+0x17c/0x230
<6> [511.401639]  ? __x64_sys_finit_module+0x56/0xb0
<6> [511.401650]  ? do_syscall_64+0x3c/0x90
<6> [511.401659]  ? entry_SYSCALL_64_after_hwframe+0x6e/0xd8
<6> [511.401684]  </TASK>

Link: [1]: https://lore.kernel.org/intel-gfx/SJ1PR11MB6129CB39EED831784C331BAFB9DEA@SJ1PR11MB6129.namprd11.prod.outlook.com
Link: [2]: https://intel-gfx-ci.01.org/tree/linux-next/next-20231013/bat-dg2-11/igt@i915_selftest@live@mman.html#dmesg-warnings10963
Cc: Jann Horn <jannh@google.com>,
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20231025-formfrage-watscheln-84526cd3bd7d@brauner
Signed-off-by: Christian Brauner <brauner@kernel.org>