Merge series 'filelock: split file leases out of struct file_lock' of https://lore.kernel.org/r/20240131-flsplit-v3-0-c6129007ee8d@kernel.org
Pull file locking series from Jeff Layton:
Long ago, file locks used to hang off of a singly-linked list in struct
inode. Because of this, when leases were added, they were added to the
same list and so they had to be tracked using the same sort of
structure.
Several years ago, we added struct file_lock_context, which allowed us
to use separate lists to track different types of file locks. Given
that, leases no longer need to be tracked using struct file_lock.
That said, a lot of the underlying infrastructure _is_ the same between
file leases and locks, so we can't completely separate everything.
This patchset first splits a group of fields used by both file locks and
leases into a new struct file_lock_core, that is then embedded in struct
file_lock. Coccinelle was then used to convert a lot of the callers to
deal with the move, with the remaining 25% or so converted by hand.
It then converts several internal functions in fs/locks.c to work with
struct file_lock_core. Lastly, struct file_lock is split into struct
file_lock and file_lease, and the lease-related APIs converted to take
struct file_lease.
I also added a few small helpers and converted several users over to
them. That reduces the size of the per-fs conversion patches later in
the series. I played with some others too, but they were too awkward
or not frequently used enough to make it worthwhile.
* series 'filelock: split file leases out of struct file_lock' of https://lore.kernel.org/r/20240131-flsplit-v3-0-c6129007ee8d@kernel.org: (47 commits)
filelock: split leases out of struct file_lock
filelock: remove temporary compatibility macros
smb/server: adapt to breakup of struct file_lock
smb/client: adapt to breakup of struct file_lock
ocfs2: adapt to breakup of struct file_lock
nfsd: adapt to breakup of struct file_lock
nfs: adapt to breakup of struct file_lock
lockd: adapt to breakup of struct file_lock
fuse: adapt to breakup of struct file_lock
gfs2: adapt to breakup of struct file_lock
dlm: adapt to breakup of struct file_lock
ceph: adapt to breakup of struct file_lock
afs: adapt to breakup of struct file_lock
9p: adapt to breakup of struct file_lock
filelock: convert seqfile handling to use file_lock_core
filelock: convert locks_translate_pid to take file_lock_core
filelock: convert locks_insert_lock_ctx and locks_delete_lock_ctx
filelock: convert locks_wake_up_blocks to take a file_lock_core pointer
filelock: make assign_type helper take a file_lock_core pointer
filelock: reorganize locks_delete_block and __locks_insert_block
...
Signed-off-by: Christian Brauner <brauner@kernel.org>
Jeff Layton [Wed, 31 Jan 2024 23:02:28 +0000 (18:02 -0500)]
filelock: split leases out of struct file_lock
Add a new struct file_lease and move the lease-specific fields from
struct file_lock to it. Convert the appropriate API calls to take
struct file_lease instead, and convert the callers to use them.
There is zero overlap between the lock manager operations for file
locks and the ones for file leases, so split the lease-related
operations off into a new lease_manager_operations struct.
Jeff Layton [Wed, 31 Jan 2024 23:02:09 +0000 (18:02 -0500)]
filelock: reorganize locks_delete_block and __locks_insert_block
Rename the old __locks_delete_block to __locks_unlink_lock. Rename
change old locks_delete_block function to __locks_delete_block and
have it take a file_lock_core. Make locks_delete_block a simple wrapper
around __locks_delete_block.
Also, change __locks_insert_block to take struct file_lock_core, and
fix up its callers.
Jeff Layton [Wed, 31 Jan 2024 23:02:07 +0000 (18:02 -0500)]
filelock: convert fl_blocker to file_lock_core
Both locks and leases deal with fl_blocker. Switch the fl_blocker
pointer in struct file_lock_core to point to the file_lock_core of the
blocker instead of a file_lock structure.
Jeff Layton [Wed, 31 Jan 2024 23:02:06 +0000 (18:02 -0500)]
filelock: convert __locks_insert_block, conflict and deadlock checks to use file_lock_core
Have both __locks_insert_block and the deadlock and conflict checking
functions take a struct file_lock_core pointer instead of a struct
file_lock one. Also, change posix_locks_deadlock to return bool.
Jeff Layton [Wed, 31 Jan 2024 23:01:59 +0000 (18:01 -0500)]
filelock: have fs/locks.c deal with file_lock_core directly
Convert fs/locks.c to access fl_core fields direcly rather than using
the backward-compatibility macros. Most of this was done with
coccinelle, with a few by-hand fixups.
Jeff Layton [Wed, 31 Jan 2024 23:01:58 +0000 (18:01 -0500)]
filelock: split common fields into struct file_lock_core
In a future patch, we're going to split file leases into their own
structure. Since a lot of the underlying machinery uses the same fields
move those into a new file_lock_core, and embed that inside struct
file_lock.
For now, add some macros to ensure that we can continue to build while
the conversion is in progress.
Jeff Layton [Wed, 31 Jan 2024 23:01:53 +0000 (18:01 -0500)]
nfsd: convert to using new filelock helpers
Convert to using the new file locking helper functions. Also, in later
patches we're going to introduce some macros with names that clash with
the variable names in nfsd4_lock. Rename them.
Jeff Layton [Wed, 31 Jan 2024 23:01:52 +0000 (18:01 -0500)]
nfs: convert to using new filelock helpers
Convert to using the new file locking helper functions. Also, in later
patches we're going to introduce some temporary macros with names that
clash with the variable name in nfs4_proc_unlck. Rename it.
Jeff Layton [Wed, 31 Jan 2024 23:01:51 +0000 (18:01 -0500)]
lockd: convert to using new filelock helpers
Convert to using the new file locking helper functions. Also in later
patches we're going to introduce some macros with names that clash with
the variable names in nlmclnt_lock. Rename them.
Jeff Layton [Wed, 31 Jan 2024 23:01:49 +0000 (18:01 -0500)]
dlm: convert to using new filelock helpers
Convert to using the new file locking helper functions. Also, in later
patches we're going to introduce some temporary macros with names that
clash with the variable name in dlm_posix_unlock. Rename it.
Jeff Layton [Wed, 31 Jan 2024 23:01:47 +0000 (18:01 -0500)]
afs: convert to using new filelock helpers
Convert to using the new file locking helper functions. Also, in later
patches we're going to introduce macros that conflict with the variable
name in afs_next_locker. Rename it.
Jeff Layton [Wed, 31 Jan 2024 23:01:45 +0000 (18:01 -0500)]
filelock: add some new helper functions
In later patches we're going to embed some common fields into a new
structure inside struct file_lock. Smooth the transition by adding some
new helper functions, and converting the core file locking code to use
them.
Jeff Layton [Wed, 31 Jan 2024 23:01:43 +0000 (18:01 -0500)]
filelock: rename some fields in tracepoints
In later patches we're going to introduce some macros with names that
clash with fields here. To prevent problems building, just rename the
fields in the trace entry structures.
Jeff Layton [Wed, 31 Jan 2024 23:01:42 +0000 (18:01 -0500)]
filelock: fl_pid field should be signed int
This field has been unsigned for a very long time, but most users of the
struct file_lock and the file locking internals themselves treat it as a
signed value. Change it to be pid_t (which is a signed int).
Linus Torvalds [Sun, 21 Jan 2024 22:01:12 +0000 (14:01 -0800)]
Merge tag 'bcachefs-2024-01-21' of https://evilpiepirate.org/git/bcachefs
Pull more bcachefs updates from Kent Overstreet:
"Some fixes, Some refactoring, some minor features:
- Assorted prep work for disk space accounting rewrite
- BTREE_TRIGGER_ATOMIC: after combining our trigger callbacks, this
makes our trigger context more explicit
- A few fixes to avoid excessive transaction restarts on
multithreaded workloads: fstests (in addition to ktest tests) are
now checking slowpath counters, and that's shaking out a few bugs
- Assorted tracepoint improvements
- Starting to break up bcachefs_format.h and move on disk types so
they're with the code they belong to; this will make room to start
documenting the on disk format better.
- A few minor fixes"
* tag 'bcachefs-2024-01-21' of https://evilpiepirate.org/git/bcachefs: (46 commits)
bcachefs: Improve inode_to_text()
bcachefs: logged_ops_format.h
bcachefs: reflink_format.h
bcachefs; extents_format.h
bcachefs: ec_format.h
bcachefs: subvolume_format.h
bcachefs: snapshot_format.h
bcachefs: alloc_background_format.h
bcachefs: xattr_format.h
bcachefs: dirent_format.h
bcachefs: inode_format.h
bcachefs; quota_format.h
bcachefs: sb-counters_format.h
bcachefs: counters.c -> sb-counters.c
bcachefs: comment bch_subvolume
bcachefs: bch_snapshot::btime
bcachefs: add missing __GFP_NOWARN
bcachefs: opts->compression can now also be applied in the background
bcachefs: Prep work for variable size btree node buffers
bcachefs: grab s_umount only if snapshotting
...
Linus Torvalds [Sun, 21 Jan 2024 19:14:40 +0000 (11:14 -0800)]
Merge tag 'timers-core-2024-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Thomas Gleixner:
"Updates for time and clocksources:
- A fix for the idle and iowait time accounting vs CPU hotplug.
The time is reset on CPU hotplug which makes the accumulated
systemwide time jump backwards.
- Assorted fixes and improvements for clocksource/event drivers"
* tag 'timers-core-2024-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tick-sched: Fix idle and iowait sleeptime accounting vs CPU hotplug
clocksource/drivers/ep93xx: Fix error handling during probe
clocksource/drivers/cadence-ttc: Fix some kernel-doc warnings
clocksource/drivers/timer-ti-dm: Fix make W=n kerneldoc warnings
clocksource/timer-riscv: Add riscv_clock_shutdown callback
dt-bindings: timer: Add StarFive JH8100 clint
dt-bindings: timer: thead,c900-aclint-mtimer: separate mtime and mtimecmp regs
Kent Overstreet [Tue, 16 Jan 2024 21:20:21 +0000 (16:20 -0500)]
bcachefs: opts->compression can now also be applied in the background
The "apply this compression method in the background" paths now use the
compression option if background_compression is not set; this means that
setting or changing the compression option will cause existing data to
be compressed accordingly in the background.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 16 Jan 2024 18:29:59 +0000 (13:29 -0500)]
bcachefs: Prep work for variable size btree node buffers
bcachefs btree nodes are big - typically 256k - and btree roots are
pinned in memory. As we're now up to 18 btrees, we now have significant
memory overhead in mostly empty btree roots.
And in the future we're going to start enforcing that certain btree node
boundaries exist, to solve lock contention issues - analagous to XFS's
AGIs.
Thus, we need to start allocating smaller btree node buffers when we
can. This patch changes code that refers to the filesystem constant
c->opts.btree_node_size to refer to the btree node buffer size -
btree_buf_bytes() - where appropriate.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
In __bch2_ioctl_subvolume_create(), we grab s_umount unconditionally
and unlock it at the end of the function. There is a comment
"why do we need this lock?" about the lock coming from
commit 42d237320e98 ("bcachefs: Snapshot creation, deletion")
The reason is that __bch2_ioctl_subvolume_create() calls
sync_inodes_sb() which enforce locked s_umount to writeback all dirty
nodes before doing snapshot works.
Fix it by read locking s_umount for snapshotting only and unlocking
s_umount after sync_inodes_sb().
Signed-off-by: Su Yue <glass.su@suse.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Colin Ian King [Tue, 16 Jan 2024 11:07:23 +0000 (11:07 +0000)]
bcachefs: remove redundant variable tmp
The variable tmp is being assigned a value but it isn't being
read afterwards. The assignment is redundant and so tmp can be
removed.
Cleans up clang scan build warning:
warning: Although the value stored to 'ret' is used in the enclosing
expression, the value is never actually read from 'ret'
[deadcode.DeadStores]
Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 16 Jan 2024 01:37:23 +0000 (20:37 -0500)]
bcachefs: Fix excess transaction restarts in __bchfs_fallocate()
drop_locks_do() should not be used in a fastpath without first trying
the do in nonblocking mode - the unlock and relock will cause excessive
transaction restarts and potentially livelocking with other threads that
are contending for the same locks.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 15 Jan 2024 22:59:51 +0000 (17:59 -0500)]
bcachefs: Better journal tracepoints
Factor out bch2_journal_bufs_to_text(), and use it in the
journal_entry_full() tracepoint; when we can't get a journal reservation
we need to know the outstanding journal entry sizes to know if the
problem is due to excessive flushing.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 15 Jan 2024 22:56:22 +0000 (17:56 -0500)]
bcachefs: Avoid flushing the journal in the discard path
When issuing discards, we may need to flush the journal if there's too
many buckets that can't be discarded until a journal flush.
But the heuristic was bad; we should be comparing the number of buckets
that need to flushes against the number of free buckets, not the number
of buckets we saw.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 15 Jan 2024 19:15:26 +0000 (14:15 -0500)]
bcachefs: bch2_kthread_io_clock_wait() no longer sleeps until full amount
Drop t he loop in bch2_kthread_io_clock_wait(): this allows the code
that uses it to be woken up for other reasons, and fixes a bug where
rebalance wouldn't wake up when a scan was requested.
This raises the possibility of spurious wakeups, but callers should
always be able to handle that reasonably well.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 11 Jan 2024 04:47:04 +0000 (23:47 -0500)]
bcachefs: Reduce would_deadlock restarts
We don't have to take locks in any particular ordering - we'll make
forward progress just fine - but if we try to stick to an ordering, it
can help to avoid excessive would_deadlock transaction restarts.
This tweaks the reflink path to take extents btree locks in the right
order.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 11 Jan 2024 04:08:30 +0000 (23:08 -0500)]
bcachefs: Don't log errors if BCH_WRITE_ALLOC_NOWAIT
Previously, we added logging in the write path to ensure that any
unexpected errors getting reported to userspace have a log message; but
BCH_WRITE_ALLOC_NOWAIT is a special case, it's used for promotes where
errors are expected and not reported out to userspace - so we need to
silence those.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
- retry (reconnect) improvement including new retrans mount parm, and
handling of two additional return codes that need to be retried on
- two minor cleanup patches and another to remove duplicate query
info code
- two documentation cleanup, and one reviewer email correction"
* tag 'v6.8-rc-part2-smb-client' of git://git.samba.org/sfrench/cifs-2.6:
cifs: update iface_last_update on each query-and-update
cifs: handle servers that still advertise multichannel after disabling
cifs: new mount option called retrans
cifs: reschedule periodic query for server interfaces
smb: client: don't clobber ->i_rdev from cached reparse points
smb: client: get rid of smb311_posix_query_path_info()
smb: client: parse owner/group when creating reparse points
smb: client: fix parsing of SMB3.1.1 POSIX create context
cifs: update known bugs mentioned in kernel docs for cifs
cifs: new nt status codes from MS-SMB2
cifs: pick channel for tcon and tdis
cifs: open_cached_dir should not rely on primary channel
smb3: minor documentation updates
Update MAINTAINERS email address
cifs: minor comment cleanup
smb3: show beginning time for per share stats
cifs: remove redundant variable tcon_exist