git.ipfire.org Git - thirdparty/kernel/linux.git/log
6 months agobcachefs: Don't BUG_ON() when superblock feature wasn't set for compressed data
Kent Overstreet [Fri, 15 Nov 2024 04:03:40 +0000 (23:03 -0500)] 
bcachefs: Don't BUG_ON() when superblock feature wasn't set for compressed data

We don't allocate the mempools for compression/decompression unless we
need them - but that means there's an inconsistency to check for.
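
A minimal userspace sketch of the idea, assuming hypothetical names (`struct fs`, `fs_decompress()`, `FEATURE_COMPRESSION`) that are not taken from the bcachefs source: when the pool was never allocated because the feature bit was not set, the inconsistency is reported as an error instead of tripping an assertion.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from bcachefs. */
#define FEATURE_COMPRESSION	(1u << 0)

struct fs {
	unsigned	features;		/* feature bits from the superblock */
	void		*decompress_pool;	/* allocated only if the bit is set */
};

/*
 * Returns 0 on success, or -EIO when compressed data shows up on a
 * filesystem whose pool was never allocated. The caller can treat that
 * as an on-disk inconsistency rather than crashing.
 */
static int fs_decompress(const struct fs *c, int data_is_compressed)
{
	if (!data_is_compressed)
		return 0;

	if (!c->decompress_pool)
		return -EIO;	/* feature bit was never set: inconsistent */

	/* ... real decompression would use the pool here ... */
	return 0;
}
```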

Reported-by: syzbot+cb3fbcfb417448cfd278@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Don't use a shared decompress workspace mempool
Kent Overstreet [Fri, 15 Nov 2024 05:52:20 +0000 (00:52 -0500)] 
bcachefs: Don't use a shared decompress workspace mempool

gzip and zstd require different decompress workspace sizes, and if we
start with one and then start using the other at runtime we may not get
the correct size
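
As a rough illustration, one workspace per algorithm avoids the size mismatch. The enum, the sizes and `workspace_get()` below are made up for the sketch and do not reflect the real gzip or zstd requirements.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical algorithm list and sizes, for illustration only. */
enum algo { ALGO_GZIP, ALGO_ZSTD, ALGO_NR };

static const size_t workspace_size[ALGO_NR] = {
	[ALGO_GZIP]	= 64 * 1024,
	[ALGO_ZSTD]	= 256 * 1024,
};

struct workspace {
	void	*buf;
	size_t	size;
};

/* One workspace per algorithm, each sized for that algorithm. */
static struct workspace workspaces[ALGO_NR];

/* Lazily allocate and return the workspace for algorithm 'a'. */
static struct workspace *workspace_get(enum algo a)
{
	struct workspace *w = &workspaces[a];

	if (!w->buf) {
		w->buf = malloc(workspace_size[a]);
		if (!w->buf)
			return NULL;
		w->size = workspace_size[a];
	}
	return w;
}
```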

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: compression workspaces should be indexed by opt, not type
Kent Overstreet [Sun, 17 Nov 2024 02:03:53 +0000 (21:03 -0500)] 
bcachefs: compression workspaces should be indexed by opt, not type

type includes lz4 and lz4_old, which do not get different compression
workspaces, and incompressible, a fake type - BCH_COMPRESSION_OPTS() is
the correct enum to use.
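
The distinction can be sketched as a mapping from on-disk type to option. The enums below are simplified stand-ins modelled on the description above, not the real BCH_COMPRESSION_TYPES()/BCH_COMPRESSION_OPTS() lists.

```c
/* Hypothetical on-disk types: includes a legacy variant and a fake type. */
enum compression_type {
	TYPE_none,
	TYPE_lz4_old,
	TYPE_gzip,
	TYPE_lz4,
	TYPE_zstd,
	TYPE_incompressible,
};

/* Hypothetical options: one entry per algorithm needing a workspace. */
enum compression_opt {
	OPT_none,
	OPT_lz4,
	OPT_gzip,
	OPT_zstd,
	OPT_NR,
};

/* Workspaces are indexed by option, so lz4 and lz4_old share a slot. */
static void *workspace[OPT_NR];

static enum compression_opt type_to_opt(enum compression_type t)
{
	switch (t) {
	case TYPE_lz4_old:
	case TYPE_lz4:
		return OPT_lz4;
	case TYPE_gzip:
		return OPT_gzip;
	case TYPE_zstd:
		return OPT_zstd;
	default:
		/* none and incompressible have no workspace */
		return OPT_none;
	}
}
```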

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: add missing BTREE_ITER_intent
Kent Overstreet [Sun, 17 Nov 2024 08:31:01 +0000 (03:31 -0500)] 
bcachefs: add missing BTREE_ITER_intent

this fixes excessive transaction restarts due to trans_commit having to
upgrade

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Kill bch2_get_next_backpointer()
Kent Overstreet [Fri, 15 Nov 2024 02:53:38 +0000 (21:53 -0500)] 
bcachefs: Kill bch2_get_next_backpointer()

Since for quite some time backpointers have only been stored in the
backpointers btree, not alloc keys (an aborted experiment, support for
which has been removed) - we can replace get_next_backpointer() with
simple btree iteration.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Delete backpointers check in try_alloc_bucket()
Kent Overstreet [Fri, 15 Nov 2024 02:28:40 +0000 (21:28 -0500)] 
bcachefs: Delete backpointers check in try_alloc_bucket()

try_alloc_bucket() has a "safety" check, which avoids allocating a
bucket if there are any backpointers present.

But backpointers are not the source of truth for live data in a bucket,
the bucket sector counts are; this check was fairly useless, and we're
also deferring backpointers checks from fsck to runtime in the near
future.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: peek_prev_min(): Search forwards for extents, snapshots
Kent Overstreet [Sat, 26 Oct 2024 00:41:06 +0000 (20:41 -0400)] 
bcachefs: peek_prev_min(): Search forwards for extents, snapshots

With extents and snapshots, for slightly different reasons, we may have
to search forwards to find a key that compares equal to iter->pos (i.e.
a key that peek_prev() should return, as it returns keys <= iter->pos).

peek_slot() does this, and is an easy way to fix this case.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Implement bch2_btree_iter_prev_min()
Kent Overstreet [Fri, 25 Oct 2024 02:12:37 +0000 (22:12 -0400)] 
bcachefs: Implement bch2_btree_iter_prev_min()

A user contributed a filesystem dump, where the dump was actually
corrupted (due to being taken while the filesystem was online), but
which exposed an interesting bug in fsck - reconstruct_inode().

When iterating in BTREE_ITER_filter_snapshots mode, it's required to
give an end position for the iteration and it can't span inode numbers;
continuing into the next inode might mean we start seeing keys from a
different snapshot tree, that the is_ancestor() checks always filter,
thus we're never able to return a key and stop iterating.

Backwards iteration never implemented the end position because nothing
else needed it - except for reconstruct_inode().
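
A toy model of backwards iteration with an end position, assuming a sorted array of (inode, offset) keys in place of a btree. `struct pos` and `peek_prev_min()` here are illustrative only and are not the bcachefs implementation.

```c
#include <stddef.h>
#include <stdint.h>

struct pos {
	uint64_t inode;
	uint64_t offset;
};

static int pos_cmp(struct pos a, struct pos b)
{
	if (a.inode != b.inode)
		return a.inode < b.inode ? -1 : 1;
	if (a.offset != b.offset)
		return a.offset < b.offset ? -1 : 1;
	return 0;
}

/*
 * Search backwards through 'keys' (sorted ascending) for the last key
 * <= pos. Returns its index, or -1 once the search would go below
 * 'min', so iteration never wanders into a previous inode.
 */
static long peek_prev_min(const struct pos *keys, size_t nr,
			  struct pos pos, struct pos min)
{
	for (long i = (long) nr - 1; i >= 0; i--) {
		if (pos_cmp(keys[i], min) < 0)
			return -1;	/* crossed the end position: stop */
		if (pos_cmp(keys[i], pos) <= 0)
			return i;
	}
	return -1;
}
```

Without the `min` bound, the second lookup in a case like `pos = (2, 4)` would return a key belonging to inode 1.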

Additionally, backwards iteration is now able to overlay keys from the
journal, which will be useful if we ever decide to start doing journal
replay in the background.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: discard_one_bucket() now uses need_discard_or_freespace_err()
Kent Overstreet [Sun, 27 Oct 2024 03:25:17 +0000 (23:25 -0400)] 
bcachefs: discard_one_bucket() now uses need_discard_or_freespace_err()

More conversion of inconsistent errors to fsck errors.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: bch2_bucket_do_index(): inconsistent_err -> fsck_err
Kent Overstreet [Sun, 27 Oct 2024 02:21:20 +0000 (22:21 -0400)] 
bcachefs: bch2_bucket_do_index(): inconsistent_err -> fsck_err

Factor out a common helper, need_discard_or_freespace_err(), which is
now used by both fsck and the runtime checks, and can repair.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: try_alloc_bucket() now uses bch2_check_discard_freespace_key()
Kent Overstreet [Sun, 27 Oct 2024 04:40:43 +0000 (00:40 -0400)] 
bcachefs: try_alloc_bucket() now uses bch2_check_discard_freespace_key()

check_discard_freespace_key() was doing all the same checks as
try_alloc_bucket(), but with repair.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: rework bch2_bucket_alloc_freelist() freelist iteration
Kent Overstreet [Mon, 28 Oct 2024 00:47:03 +0000 (20:47 -0400)] 
bcachefs: rework bch2_bucket_alloc_freelist() freelist iteration

Prep work for converting try_alloc_bucket() to use
bch2_check_discard_freespace_key().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: kill inconsistent err in invalidate_one_bucket()
Kent Overstreet [Sun, 27 Oct 2024 04:05:54 +0000 (00:05 -0400)] 
bcachefs: kill inconsistent err in invalidate_one_bucket()

Change it to a normal fsck_err() - meaning it'll get repaired at runtime
when that's flipped on.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Don't delete reflink pointers to missing indirect extents
Kent Overstreet [Mon, 21 Oct 2024 00:27:44 +0000 (20:27 -0400)] 
bcachefs: Don't delete reflink pointers to missing indirect extents

To avoid tragic loss in the event of transient errors (i.e., a btree
node topology error that was later corrected by btree node scan), we
can't delete reflink pointers to correct errors.

This adds a new error bit to bch_reflink_p, indicating that it is known
to point to a missing indirect extent, and the error has already been
reported.

Indirect extent lookups now use bch2_lookup_indirect_extent(), which on
error reports it as a fsck_err() and sets the error bit, and clears it
if necessary on successful lookup.

This also gets rid of the bch2_inconsistent_error() call in
__bch2_read_indirect_extent, and in the reflink_p trigger: part of the
online self healing project.

An on disk format change isn't necessary here: setting the error bit
will be interpreted by older versions as pointing to a different index,
which will also be missing - which is fine.
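
The compatibility argument can be illustrated with plain bit arithmetic. The layout below (a 56-bit index with the error flag in bit 56) is an assumption made for the sketch; the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: low 56 bits are the index, bit 56 is the error flag. */
#define REFLINK_P_IDX_BITS	56
#define REFLINK_P_IDX_MASK	((UINT64_C(1) << REFLINK_P_IDX_BITS) - 1)
#define REFLINK_P_ERROR		(UINT64_C(1) << REFLINK_P_IDX_BITS)

/* A newer reader masks off the reserved bits before using the index. */
static uint64_t new_reader_idx(uint64_t raw)
{
	return raw & REFLINK_P_IDX_MASK;
}

static bool new_reader_error(uint64_t raw)
{
	return (raw & REFLINK_P_ERROR) != 0;
}

/*
 * An older reader treats all 64 bits as the index, so a pointer with
 * the error bit set looks like a different, out-of-range index that
 * is also missing.
 */
static uint64_t old_reader_idx(uint64_t raw)
{
	return raw;
}
```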

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Reorganize reflink.c a bit
Kent Overstreet [Thu, 31 Oct 2024 05:25:09 +0000 (01:25 -0400)] 
bcachefs: Reorganize reflink.c a bit

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Reserve 8 bits in bch_reflink_p
Kent Overstreet [Tue, 29 Oct 2024 03:43:16 +0000 (23:43 -0400)] 
bcachefs: Reserve 8 bits in bch_reflink_p

Better repair for reflink pointers, as well as propagating new inode
options to indirect extents, are going to require a few extra bits in
bch_reflink_p: so claim a few from the high end of the destination
index.

Also add some missing bounds checking.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Kill FSCK_NEED_FSCK
Kent Overstreet [Tue, 29 Oct 2024 01:27:23 +0000 (21:27 -0400)] 
bcachefs: Kill FSCK_NEED_FSCK

If we find an error that indicates that we need to run fsck, we can
specify that directly with run_explicit_recovery_pass().

These are now log_fsck_err() calls: we're just logging in the superblock
that an error occurred - and possibly doing an emergency shutdown,
depending on policy.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: lru errors are expected when reconstructing alloc
Kent Overstreet [Tue, 29 Oct 2024 05:17:08 +0000 (01:17 -0400)] 
bcachefs: lru errors are expected when reconstructing alloc

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Delete dead code from bch2_discard_one_bucket()
Kent Overstreet [Sun, 27 Oct 2024 02:52:06 +0000 (22:52 -0400)] 
bcachefs: Delete dead code from bch2_discard_one_bucket()

alloc key validation ensures that if a bucket is in need_discard state
the sector counts are all zero - we don't have to check for that.

The NEED_INC_GEN check appears to be dead code, as well: we only see
buckets in the need_discard btree, and it's an error if they aren't in
the need_discard state.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: bch2_btree_bit_mod_iter()
Kent Overstreet [Sun, 27 Oct 2024 03:35:03 +0000 (23:35 -0400)] 
bcachefs: bch2_btree_bit_mod_iter()

factor out a new helper, make it handle extents bitset btrees
(freespace).

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: delete dead code
Kent Overstreet [Tue, 12 Nov 2024 08:53:30 +0000 (03:53 -0500)] 
bcachefs: delete dead code

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Fix shutdown message
Kent Overstreet [Fri, 8 Nov 2024 02:50:00 +0000 (21:50 -0500)] 
bcachefs: Fix shutdown message

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Don't use page allocator for sb_read_scratch
Kent Overstreet [Fri, 8 Nov 2024 00:15:38 +0000 (19:15 -0500)] 
bcachefs: Don't use page allocator for sb_read_scratch

Kill another unnecessary dependency on PAGE_SIZE

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Simplify code in bch2_dev_alloc()
Youling Tang [Wed, 16 Oct 2024 01:49:11 +0000 (09:49 +0800)] 
bcachefs: Simplify code in bch2_dev_alloc()

- Remove unnecessary variable 'ret'.
- Remove unnecessary bch2_dev_free() operations.

Signed-off-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Remove redundant initialization in bch2_vfs_inode_init()
Youling Tang [Fri, 27 Sep 2024 08:40:42 +0000 (16:40 +0800)] 
bcachefs: Remove redundant initialization in bch2_vfs_inode_init()

`inode->v.i_ino` has been initialized to `inum.inum`. If `inum.inum` and
`bi->bi_inum` are not equal, BUG_ON() is triggered in
bch2_inode_update_after_write().

Signed-off-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Removes NULL pointer checks for __filemap_get_folio return values
Youling Tang [Tue, 24 Sep 2024 02:53:50 +0000 (10:53 +0800)] 
bcachefs: Removes NULL pointer checks for __filemap_get_folio return values

__filemap_get_folio() never returns NULL, so the unnecessary checks
are removed.

Signed-off-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Add support for FS_IOC_GETFSSYSFSPATH
Kent Overstreet [Tue, 9 Jul 2024 01:11:34 +0000 (09:11 +0800)] 
bcachefs: Add support for FS_IOC_GETFSSYSFSPATH

[TEST]:
```
$ cat ioctl_getsysfspath.c
 #include <stdio.h>
 #include <stdlib.h>
 #include <fcntl.h>
 #include <sys/ioctl.h>
 #include <linux/fs.h>
 #include <unistd.h>

 int main(int argc, char *argv[]) {
     int fd;
     struct fs_sysfs_path sysfs_path = {};

     if (argc != 2) {
         fprintf(stderr, "Usage: %s <path_to_file_or_directory>\n", argv[0]);
         exit(EXIT_FAILURE);
     }

     fd = open(argv[1], O_RDONLY);
     if (fd == -1) {
         perror("open");
         exit(EXIT_FAILURE);
     }

     if (ioctl(fd, FS_IOC_GETFSSYSFSPATH, &sysfs_path) == -1) {
         perror("ioctl FS_IOC_GETFSSYSFSPATH");
         close(fd);
         exit(EXIT_FAILURE);
     }

     printf("FS_IOC_GETFSSYSFSPATH: %s\n", sysfs_path.name);
     close(fd);
     return 0;
 }

$ gcc ioctl_getsysfspath.c
$ sudo bcachefs format /dev/sda
$ sudo mount.bcachefs /dev/sda /mnt
$ sudo ./a.out /mnt
  FS_IOC_GETFSSYSFSPATH: bcachefs/c380b4ab-fbb6-41d2-b805-7a89cae9cadb
```

Original patch link:
[1]: https://lore.kernel.org/all/20240207025624.1019754-8-kent.overstreet@linux.dev/

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Youling Tang <youling.tang@linux.dev>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Add support for FS_IOC_GETFSUUID
Kent Overstreet [Tue, 9 Jul 2024 01:11:33 +0000 (09:11 +0800)] 
bcachefs: Add support for FS_IOC_GETFSUUID

Use super_set_uuid() to set `sb->s_uuid_len` to avoid returning `-ENOTTY`
with sb->s_uuid_len being 0.

Original patch link:
[1]: https://lore.kernel.org/all/20240207025624.1019754-2-kent.overstreet@linux.dev/

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Correct the description of the '--bucket=size' options
Youling Tang [Wed, 16 Oct 2024 01:50:26 +0000 (09:50 +0800)] 
bcachefs: Correct the description of the '--bucket=size' options

Signed-off-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: add support for true/false & yes/no in bool-type options
Integral [Wed, 23 Oct 2024 10:00:33 +0000 (18:00 +0800)] 
bcachefs: add support for true/false & yes/no in bool-type options

Here is the patch which uses existing constant table:

Currently, when using bcachefs-tools to set options, bool-type options
can only accept 1 or 0. Add support for accepting true/false and yes/no
for these options.
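
A table-driven parser along these lines might look as follows. This is a standalone sketch: `bool_names` and `parse_bool()` are hypothetical and are not the constant table the patch refers to.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table of accepted spellings. */
static const struct {
	const char	*name;
	bool		val;
} bool_names[] = {
	{ "0",		false },
	{ "1",		true },
	{ "false",	false },
	{ "true",	true },
	{ "no",		false },
	{ "yes",	true },
};

/* Returns 0 and stores the value in *res, or -1 for an unknown string. */
static int parse_bool(const char *s, bool *res)
{
	for (size_t i = 0; i < sizeof(bool_names) / sizeof(bool_names[0]); i++)
		if (!strcmp(s, bool_names[i].name)) {
			*res = bool_names[i].val;
			return 0;
		}
	return -1;
}
```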

Signed-off-by: Integral <integral@murena.io>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Acked-by: David Howells <dhowells@redhat.com>
6 months agobcachefs: Move fsck ioctl code to fsck.c
Kent Overstreet [Wed, 6 Nov 2024 18:13:25 +0000 (13:13 -0500)] 
bcachefs: Move fsck ioctl code to fsck.c

chardev.c and fs-ioctl.c are not organized by subject; let's try to fix
this.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Kill unnecessary iter_rewind() in bkey_get_empty_slot()
Kent Overstreet [Sat, 26 Oct 2024 02:16:19 +0000 (22:16 -0400)] 
bcachefs: Kill unnecessary iter_rewind() in bkey_get_empty_slot()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Simplify btree_iter_peek() filter_snapshots
Kent Overstreet [Fri, 25 Oct 2024 05:48:26 +0000 (01:48 -0400)] 
bcachefs: Simplify btree_iter_peek() filter_snapshots

Collapse all the BTREE_ITER_filter_snapshots handling down into a single
block; btree iteration is much simpler in the !filter_snapshots case.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Rename btree_iter_peek_upto() -> btree_iter_peek_max()
Kent Overstreet [Thu, 24 Oct 2024 22:39:59 +0000 (18:39 -0400)] 
bcachefs: Rename btree_iter_peek_upto() -> btree_iter_peek_max()

We'll be introducing btree_iter_peek_prev_min(), so rename for
consistency.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Assert that we're not violating key cache coherency rules
Kent Overstreet [Sat, 26 Oct 2024 02:31:20 +0000 (22:31 -0400)] 
bcachefs: Assert that we're not violating key cache coherency rules

We're not allowed to have a dirty key in the key cache if the key
doesn't exist at all in the btree - creation has to bypass the key
cache, so that iteration over the btree can check if the key is present
in the key cache.

Things break in subtle ways if cache coherency is broken, so this needs
an assert.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: bch2_trans_verify_not_unlocked_or_in_restart()
Kent Overstreet [Sun, 27 Oct 2024 23:32:40 +0000 (19:32 -0400)] 
bcachefs: bch2_trans_verify_not_unlocked_or_in_restart()

Fold two asserts into one.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Better in_restart error
Kent Overstreet [Tue, 15 Oct 2024 03:52:51 +0000 (23:52 -0400)] 
bcachefs: Better in_restart error

We're ramping up on checking transaction restart handling correctness -
so, in debug mode we now save a backtrace for where the restart was
emitted, which makes it much easier to track down the incorrect
handling.
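
As a simplified stand-in for saving a full backtrace, the sketch below records only the function and line that emitted the restart. `struct trans` and its helpers are hypothetical and much smaller than the real btree_trans.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down transaction state. */
struct trans {
	bool		in_restart;
	const char	*restart_fn;	/* where the restart was emitted */
	int		restart_line;
};

/* Mark the transaction as restarted and remember who asked for it. */
static int trans_restart_at(struct trans *t, const char *fn, int line)
{
	t->in_restart	= true;
	t->restart_fn	= fn;
	t->restart_line	= line;
	return -1;	/* stand-in for a transaction restart error code */
}

#define trans_restart(t)	trans_restart_at((t), __func__, __LINE__)

/* Handling a restart means beginning the transaction again. */
static void trans_begin(struct trans *t)
{
	t->in_restart	= false;
	t->restart_fn	= NULL;
	t->restart_line	= 0;
}

/* Using a transaction that is still in restart is a handling bug. */
static bool trans_ok_to_use(const struct trans *t)
{
	return !t->in_restart;
}
```

When `trans_ok_to_use()` fails, the saved location points at the restart that was never handled, which is the debugging aid the commit describes.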

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Assert we're not in a restart in bch2_trans_put()
Kent Overstreet [Tue, 15 Oct 2024 03:33:57 +0000 (23:33 -0400)] 
bcachefs: Assert we're not in a restart in bch2_trans_put()

This always indicates a transaction restart handling bug

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Fix unhandled transaction restart in evacuate_bucket()
Kent Overstreet [Fri, 8 Nov 2024 03:00:05 +0000 (22:00 -0500)] 
bcachefs: Fix unhandled transaction restart in evacuate_bucket()

Generally, releasing a transaction within a transaction restart means an
unhandled transaction restart: but this can happen legitimately within
the move code, e.g. when bch2_move_ratelimit() tells us to exit before
we've retried.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Improved check_topology() assert
Kent Overstreet [Thu, 31 Oct 2024 04:25:36 +0000 (00:25 -0400)] 
bcachefs: Improved check_topology() assert

On interior btree node updates, we always verify that we're not
introducing topology errors: child nodes should exactly span the range
of the parent node.
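
The invariant can be written down for a toy model where each node covers an inclusive integer key range. `struct node_range` and `children_span_parent()` are hypothetical names for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node covering the inclusive key range [min, max]. */
struct node_range {
	uint64_t min;
	uint64_t max;
};

/*
 * True if the children, in order, tile the parent's range exactly:
 * the first starts at parent.min, each one starts right after the
 * previous one ends, and the last ends at parent.max.
 */
static bool children_span_parent(struct node_range parent,
				 const struct node_range *child, size_t nr)
{
	if (!nr || child[0].min != parent.min)
		return false;

	for (size_t i = 0; i < nr; i++) {
		if (child[i].min > child[i].max)
			return false;
		if (i + 1 < nr &&
		    (child[i].max == UINT64_MAX ||
		     child[i + 1].min != child[i].max + 1))
			return false;	/* gap or overlap between siblings */
	}

	return child[nr - 1].max == parent.max;
}
```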

single_device.ktest small_nodes has been popping this assert: change it
to give us more information.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Kill BCH_TRANS_COMMIT_lazy_rw
Kent Overstreet [Thu, 31 Oct 2024 07:39:32 +0000 (03:39 -0400)] 
bcachefs: Kill BCH_TRANS_COMMIT_lazy_rw

We unconditionally go read-write, if we're going to do so, before
journal replay: lazy_rw is obsolete.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Add assert for use of journal replay keys for updates
Kent Overstreet [Thu, 31 Oct 2024 07:35:41 +0000 (03:35 -0400)] 
bcachefs: Add assert for use of journal replay keys for updates

The journal replay keys mechanism can only be used for updates in early
recovery, when still single threaded.

Add some asserts to make sure we never accidentally use it elsewhere.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: use attribute define helper for sysfs attribute
Hongbo Li [Tue, 29 Oct 2024 12:54:08 +0000 (20:54 +0800)] 
bcachefs: use attribute define helper for sysfs attribute

The sysfs attribute definitions are already wrapped in macros
(rw_attribute, read_attribute and write_attribute); we can use these
helpers to make the attribute definitions uniform.
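
A userspace approximation of such helpers, assuming a simplified `struct attribute` and a hypothetical `sysfs_attribute_define()` macro. The mode values are the conventional ones for write-only, read-only and read-write files, and `example_tunable` is an invented attribute name.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's struct attribute. */
struct attribute {
	const char	*name;
	unsigned short	mode;
};

/* Hypothetical helper: define sysfs_<name> with the given mode. */
#define sysfs_attribute_define(_name, _mode)				\
	static struct attribute sysfs_##_name =				\
		{ .name = #_name, .mode = _mode }

#define write_attribute(n)	sysfs_attribute_define(n, 0200)
#define read_attribute(n)	sysfs_attribute_define(n, 0444)
#define rw_attribute(n)		sysfs_attribute_define(n, 0644)

/* A status-only attribute is declared read-only ... */
read_attribute(gc_gens_pos);
/* ... while a tunable (invented name) gets read-write permissions. */
rw_attribute(example_tunable);
```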

Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: remove write permission for gc_gens_pos sysfs interface
Hongbo Li [Tue, 29 Oct 2024 12:53:50 +0000 (20:53 +0800)] 
bcachefs: remove write permission for gc_gens_pos sysfs interface

The gc_gens_pos is used to show the status of bucket gen gc.
There is no need to assign write permissions for this attribute.
Here we can use read_attribute helper to define this attribute.

```
[Before]
  $ ll internal/gc_gens_pos
  -rw-r--r-- 1 root root 4096 Oct 28 15:27 internal/gc_gens_pos

[After]
  $ ll internal/gc_gens_pos
  -r--r--r-- 1 root root 4096 Oct 28 17:27 internal/gc_gens_pos
```

Fixes: ac516d0e7db7 ("bcachefs: Add the status of bucket gen gc to sysfs")
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Move bch_extent_rebalance code to rebalance.c
Kent Overstreet [Tue, 29 Oct 2024 03:23:18 +0000 (23:23 -0400)] 
bcachefs: Move bch_extent_rebalance code to rebalance.c

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Improve trace_rebalance_extent
Kent Overstreet [Sat, 26 Oct 2024 05:42:57 +0000 (01:42 -0400)] 
bcachefs: Improve trace_rebalance_extent

We now say explicitly which pointers are being moved or compressed

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Simplify option logic in rebalance
Kent Overstreet [Sun, 20 Oct 2024 01:41:20 +0000 (21:41 -0400)] 
bcachefs: Simplify option logic in rebalance

Since bch2_move_get_io_opts() now synchronizes io_opts with options from
bch_extent_rebalance, delete the ad-hoc logic in rebalance.c that
previously did this.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: get_update_rebalance_opts()
Kent Overstreet [Sun, 20 Oct 2024 01:41:20 +0000 (21:41 -0400)] 
bcachefs: get_update_rebalance_opts()

bch2_move_get_io_opts() now synchronizes options loaded from the
filesystem and inode (if present, i.e. not walking the reflink btree
directly) with options from the bch_extent_rebalance_entry, updating the
extent if necessary.

Since bch_extent_rebalance tracks where its option came from we can
preserve "inode options override filesystem options", even for indirect
extents where we don't have access to the inode the options came from.
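
The precedence rule can be sketched with a value plus a provenance bit. `struct tracked_opt` and `opt_update()` are hypothetical and far simpler than bch_extent_rebalance.

```c
#include <stdbool.h>

/* Hypothetical option: the value and where it came from. */
struct tracked_opt {
	unsigned	value;
	bool		from_inode;
};

/* Apply a new setting coming from the filesystem or from an inode. */
static void opt_update(struct tracked_opt *o, unsigned value, bool from_inode)
{
	/* A filesystem-level change must not override an inode setting. */
	if (o->from_inode && !from_inode)
		return;

	o->value	= value;
	o->from_inode	= from_inode;
}
```

Because the provenance is stored next to the value, the rule can be applied even when the inode that supplied the option is no longer reachable.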

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: bch2_write_inode() now checks for changing rebalance options
Kent Overstreet [Mon, 21 Oct 2024 00:53:53 +0000 (20:53 -0400)] 
bcachefs: bch2_write_inode() now checks for changing rebalance options

Previously, BCHFS_IOC_REINHERIT_ATTRS didn't trigger rebalance scans
when changing rebalance options - it had been missed, only the xattr
interface triggered them.

Ideally they'd be done by the transactional trigger, but unpacking the
inode to get the options is too heavy to be done in the low level
trigger - the inode trigger is run on every extent update, since the
bch_inode.bi_journal_seq has to be updated for fsync.

bch2_write_inode() is a good compromise, it already unpacks and repacks
and is not run in any super-fast paths.

Additionally, creating the new rebalance entry to trigger the scan is
now done in the same transaction as the inode update that changed the
options.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: New bch_extent_rebalance fields
Kent Overstreet [Sun, 20 Oct 2024 01:41:20 +0000 (21:41 -0400)] 
bcachefs: New bch_extent_rebalance fields

- Add more io path options to bch_extent_rebalance
- For each option, track whether it came from the filesystem or the
  inode

This will be used for improved rebalance support for reflinked data.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: bch2_prt_csum_opt()
Kent Overstreet [Mon, 28 Oct 2024 05:14:53 +0000 (01:14 -0400)] 
bcachefs: bch2_prt_csum_opt()

bounds checking helper

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: copygc_enabled, rebalance_enabled now opts.h options
Kent Overstreet [Thu, 24 Oct 2024 05:06:53 +0000 (01:06 -0400)] 
bcachefs: copygc_enabled, rebalance_enabled now opts.h options

They can now be set at mount time

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Add bch_io_opts fields for indicating whether the opts came from the inode
Kent Overstreet [Sun, 20 Oct 2024 03:26:11 +0000 (23:26 -0400)] 
bcachefs: Add bch_io_opts fields for indicating whether the opts came from the inode

This is going to be used in the bch_extent_rebalance improvements, which
propagate io_path options into the extent (important for rebalance,
which needs something present in the extent for transactionally tagging
them in the rebalance_work btree, and also for indirect extents).

By tracking in bch_extent_rebalance whether the option came from the
filesystem or the inode we can correctly handle options being changed on
indirect extents.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: io_opts_to_rebalance_opts()
Kent Overstreet [Sun, 20 Oct 2024 06:28:51 +0000 (02:28 -0400)] 
bcachefs: io_opts_to_rebalance_opts()

New helper to simplify bch2_bkey_set_needs_rebalance()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: rename bch_extent_rebalance fields to match other opts structs
Kent Overstreet [Sun, 20 Oct 2024 06:21:28 +0000 (02:21 -0400)] 
bcachefs: rename bch_extent_rebalance fields to match other opts structs

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: kill __bch2_bkey_sectors_need_rebalance()
Kent Overstreet [Sun, 20 Oct 2024 06:14:53 +0000 (02:14 -0400)] 
bcachefs: kill __bch2_bkey_sectors_need_rebalance()

Single caller, fold into bch2_bkey_sectors_need_rebalance()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: kill bch2_bkey_needs_rebalance()
Kent Overstreet [Sun, 20 Oct 2024 05:40:19 +0000 (01:40 -0400)] 
bcachefs: kill bch2_bkey_needs_rebalance()

Dead code

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: small cleanup for extent ptr bitmasks
Kent Overstreet [Sun, 20 Oct 2024 05:32:55 +0000 (01:32 -0400)] 
bcachefs: small cleanup for extent ptr bitmasks

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: bch2_io_opts_fixups()
Kent Overstreet [Sun, 20 Oct 2024 05:21:43 +0000 (01:21 -0400)] 
bcachefs: bch2_io_opts_fixups()

Centralize some io path option fixups - they weren't always being
applied correctly:

- background_compression uses compression if unset
- background_target uses foreground_target if unset
- nocow disables most fancy io path options
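
The three rules above can be sketched as a single helper. `struct io_opts` is a hypothetical cut-down struct where 0 means "unset", and the exact set of options that nocow clears here is an assumption.

```c
#include <stdbool.h>

/* Hypothetical subset of the io path options; 0 means "unset". */
struct io_opts {
	unsigned	compression;
	unsigned	background_compression;
	unsigned	foreground_target;
	unsigned	background_target;
	unsigned	data_checksum;
	bool		nocow;
};

static void io_opts_fixups(struct io_opts *o)
{
	/* Background settings fall back to their foreground counterparts. */
	if (!o->background_compression)
		o->background_compression = o->compression;
	if (!o->background_target)
		o->background_target = o->foreground_target;

	/* nocow writes in place, so compression and checksums are dropped. */
	if (o->nocow) {
		o->compression			= 0;
		o->background_compression	= 0;
		o->data_checksum		= 0;
	}
}
```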

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: use bch2_data_update_opts_to_text() in trace_move_extent_fail()
Kent Overstreet [Sun, 20 Oct 2024 05:16:16 +0000 (01:16 -0400)] 
bcachefs: use bch2_data_update_opts_to_text() in trace_move_extent_fail()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: avoid 'unsigned flags'
Kent Overstreet [Sun, 20 Oct 2024 05:11:29 +0000 (01:11 -0400)] 
bcachefs: avoid 'unsigned flags'

flags should have actual types, where possible: fix btree_update.h
helpers

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Annotate struct bucket_gens with __counted_by()
Thorsten Blum [Sat, 26 Oct 2024 15:47:04 +0000 (17:47 +0200)] 
bcachefs: Annotate struct bucket_gens with __counted_by()

Add the __counted_by compiler attribute to the flexible array member b
to improve access bounds-checking via CONFIG_UBSAN_BOUNDS and
CONFIG_FORTIFY_SOURCE.

Use struct_size() to calculate the number of bytes to be allocated.

Update bucket_gens->nbuckets and bucket_gens->nbuckets_minus_first when
resizing.
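
A userspace sketch of the same pattern, assuming a hypothetical `counted_by_attr()` wrapper that expands to nothing on compilers without the attribute. The struct is a reduced stand-in for bucket_gens, and the size computation is a hand-written equivalent of what struct_size() provides.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Use the counted_by attribute where the compiler supports it. */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define counted_by_attr(member)	__attribute__((counted_by(member)))
# endif
#endif
#ifndef counted_by_attr
# define counted_by_attr(member)
#endif

/* Reduced stand-in for struct bucket_gens. */
struct bucket_gens_sketch {
	size_t	nbuckets;
	uint8_t	b[] counted_by_attr(nbuckets);
};

static struct bucket_gens_sketch *gens_alloc(size_t nbuckets)
{
	struct bucket_gens_sketch *g;

	/* Overflow-checked header + nbuckets * sizeof(b[0]), b[0] is 1 byte. */
	if (nbuckets > SIZE_MAX - sizeof(*g))
		return NULL;

	g = calloc(1, sizeof(*g) + nbuckets);
	if (g)
		g->nbuckets = nbuckets;	/* set the count before touching b[] */
	return g;
}
```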

Compile-tested only.

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Use str_write_read() helper in write_super_endio()
Thorsten Blum [Sat, 26 Oct 2024 10:47:23 +0000 (12:47 +0200)] 
bcachefs: Use str_write_read() helper in write_super_endio()

Remove hard-coded strings by using the str_write_read() helper function.
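
For readers unfamiliar with the helper, a userspace equivalent is sketched below. It follows the kernel's string-choice helpers as I understand them, so treat the exact definitions as an assumption.

```c
#include <stdbool.h>

/* Returns "read" for true, "write" for false. */
static inline const char *str_read_write(bool v)
{
	return v ? "read" : "write";
}

/* Returns "write" for true, "read" for false. */
static inline const char *str_write_read(bool v)
{
	return str_read_write(!v);
}
```

A call such as `str_write_read(is_write)` then replaces an open-coded `is_write ? "write" : "read"` in an error message.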

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Use str_write_read() helper in ec_block_endio()
Thorsten Blum [Sun, 20 Oct 2024 11:20:46 +0000 (13:20 +0200)] 
bcachefs: Use str_write_read() helper in ec_block_endio()

Remove hard-coded strings by using the helper function str_write_read().

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Use str_write_read() helper function
Thorsten Blum [Sat, 19 Oct 2024 12:25:27 +0000 (14:25 +0200)] 
bcachefs: Use str_write_read() helper function

Remove hard-coded strings by using the helper function str_write_read().

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Add version check for bch_btree_ptr_v2.sectors_written validate
Kent Overstreet [Sun, 20 Oct 2024 23:02:44 +0000 (19:02 -0400)] 
bcachefs: Add version check for bch_btree_ptr_v2.sectors_written validate

A user popped up with a very old (0.11) filesystem that needed repair
and wasn't recently backed up.

Reported-by: Manoa <manoa@mail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Add block plugging to read paths
Kent Overstreet [Tue, 15 Oct 2024 01:35:44 +0000 (21:35 -0400)] 
bcachefs: Add block plugging to read paths

This will help with some of the btree_trans srcu lock hold time warnings
that are still turning up; submit_bio() can block for a while if the
device is sufficiently congested.

It's not a perfect solution since blk_plug bios are submitted when
scheduling; we might want a way to disable the "submit on context
switch" behaviour, or switch to our own plugging in the future.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Fix warning about passing flex array member by value
Kent Overstreet [Sat, 12 Oct 2024 18:07:44 +0000 (14:07 -0400)] 
bcachefs: Fix warning about passing flex array member by value

this showed up when building in userspace

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: bch2_journal_meta() takes ref on c->writes
Kent Overstreet [Sat, 12 Oct 2024 02:50:48 +0000 (22:50 -0400)] 
bcachefs: bch2_journal_meta() takes ref on c->writes

This is part of addressing
https://github.com/koverstreet/bcachefs/issues/656

where we're getting stuck in bch2_journal_meta() in the dump tool.

We shouldn't be invoking the journal without a ref on c->writes (if
we're not RW), and there's no reason for the dump tool to be going
read-write.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: -o norecovery now bails out of recovery earlier
Kent Overstreet [Sat, 12 Oct 2024 02:53:09 +0000 (22:53 -0400)] 
bcachefs: -o norecovery now bails out of recovery earlier

-o norecovery (used by the dump tool) should be doing the absolute
minimum amount of work to get the filesystem up and readable; we
shouldn't be running check and repair code, or going read-write.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Refactor new stripe path to reduce dependencies on ec_stripe_head
Kent Overstreet [Sun, 1 Sep 2024 18:57:26 +0000 (14:57 -0400)] 
bcachefs: Refactor new stripe path to reduce dependencies on ec_stripe_head

We need to add a path for reshaping existing stripes (for e.g. device
removal), and this new path won't necessarily use ec_stripe_head.

Refactor the code to avoid unnecessary references to it for clarity.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Avoid bch2_btree_id_str()
Kent Overstreet [Thu, 10 Oct 2024 03:02:04 +0000 (23:02 -0400)] 
bcachefs: Avoid bch2_btree_id_str()

Prefer bch2_btree_id_to_text() - it prints out the integer ID when
unknown.
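
The difference is easy to see in a small sketch. The name table and the output format for unknown IDs are invented for illustration and do not match the real bch2_btree_id_to_text().

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, shortened table of known btree names. */
static const char * const btree_names[] = {
	"extents",
	"inodes",
	"dirents",
};

#define BTREE_NAMES_NR	(sizeof(btree_names) / sizeof(btree_names[0]))

/* Print the name if known, otherwise fall back to the integer ID. */
static void btree_id_to_text(char *buf, size_t len, unsigned id)
{
	if (id < BTREE_NAMES_NR)
		snprintf(buf, len, "%s", btree_names[id]);
	else
		snprintf(buf, len, "(unknown btree %u)", id);
}
```

A lookup that only returns a string has to collapse every unknown ID into the same placeholder, which loses the one detail needed when debugging a filesystem written by a newer version.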

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: better error message in check_snapshot_tree()
Kent Overstreet [Thu, 10 Oct 2024 01:27:11 +0000 (21:27 -0400)] 
bcachefs: better error message in check_snapshot_tree()

If we find a snapshot node and it didn't match the snapshot tree, we
should print it.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Factor out jset_entry_log_msg_bytes()
Kent Overstreet [Thu, 10 Oct 2024 01:51:05 +0000 (21:51 -0400)] 
bcachefs: Factor out jset_entry_log_msg_bytes()

Needed for improved userspace cmd_list_journal

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: improved bkey_val_copy()
Kent Overstreet [Thu, 10 Oct 2024 01:26:05 +0000 (21:26 -0400)] 
bcachefs: improved bkey_val_copy()

Factor out some common code, add typechecking.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: bch2_btree_lost_data() now uses run_explicit_recovery_pass_persistent()
Kent Overstreet [Sun, 22 Sep 2024 03:40:01 +0000 (23:40 -0400)] 
bcachefs: bch2_btree_lost_data() now uses run_explicit_recovery_pass_persistent()

Also get a bit more fine-grained about which passes to run for which
btrees.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Add locking for bch_fs.curr_recovery_pass
Kent Overstreet [Sun, 22 Sep 2024 03:27:59 +0000 (23:27 -0400)] 
bcachefs: Add locking for bch_fs.curr_recovery_pass

Recovery can rewind in certain situations - when we discover we need to
run a pass that doesn't normally run.

This can happen from another thread for btree node read errors, so we
need a bit of locking.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: lru, accounting are alloc btrees
Kent Overstreet [Sun, 22 Sep 2024 03:22:48 +0000 (23:22 -0400)] 
bcachefs: lru, accounting are alloc btrees

They can be regenerated by fsck and don't require a btree node scan,
like other alloc btrees.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: bch2_run_explicit_recovery_pass() returns different error when not in recovery
Kent Overstreet [Sun, 22 Sep 2024 00:21:18 +0000 (20:21 -0400)] 
bcachefs: bch2_run_explicit_recovery_pass() returns different error when not in recovery

If we're not in recovery then there's no way to rewind recovery - give
this a different errcode so that any error messages will give us a
better idea of what happened.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: add more path idx debug asserts
Kent Overstreet [Mon, 23 Sep 2024 22:11:07 +0000 (18:11 -0400)] 
bcachefs: add more path idx debug asserts

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Use FOREACH_ACL_ENTRY() macro to iterate over acl entries
Thorsten Blum [Mon, 23 Sep 2024 14:44:53 +0000 (16:44 +0200)] 
bcachefs: Use FOREACH_ACL_ENTRY() macro to iterate over acl entries

Use the existing FOREACH_ACL_ENTRY() macro to iterate over POSIX acl
entries and remove the custom acl_for_each_entry() macro.

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
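
As a rough illustration of the iteration pattern this commit switches to, here is a userspace sketch of a "for each entry" macro over a counted array. The struct layout, `EXAMPLE_FOREACH_ENTRY()` and `example_count_tag()` are assumptions made for the example; the kernel's FOREACH_ACL_ENTRY() and struct posix_acl are not reproduced here.

```c
#include <stddef.h>

/* Hypothetical stand-ins for an ACL and its entries. */
struct example_acl_entry {
	unsigned short tag;
	unsigned short perm;
};

struct example_acl {
	unsigned int count;
	struct example_acl_entry entries[4];
};

/* Walk pa from the first entry up to (not including) the end pointer pe. */
#define EXAMPLE_FOREACH_ENTRY(pa, acl, pe) \
	for ((pa) = (acl)->entries, (pe) = (pa) + (acl)->count; \
	     (pa) < (pe); (pa)++)

/* Count the entries carrying a given tag. */
static unsigned int example_count_tag(const struct example_acl *acl,
				      unsigned short tag)
{
	const struct example_acl_entry *pa, *pe;
	unsigned int n = 0;

	EXAMPLE_FOREACH_ENTRY(pa, acl, pe)
		if (pa->tag == tag)
			n++;
	return n;
}
```

Using one shared macro keeps the loop bounds in a single place instead of repeating them in a filesystem-private copy.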
6 months agobcachefs: Remove duplicate included headers
Thorsten Blum [Mon, 23 Sep 2024 14:20:29 +0000 (16:20 +0200)] 
bcachefs: Remove duplicate included headers

The header files dirent_format.h and disk_groups_format.h are included
twice. Remove the redundant includes and the following warnings reported
by make includecheck:

  disk_groups_format.h is included more than once
  dirent_format.h is included more than once

Reviewed-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agodocs: filesystems: bcachefs: fixed some spelling mistakes in the bcachefs coding...
Dennis Lam [Thu, 12 Sep 2024 01:16:28 +0000 (21:16 -0400)] 
docs: filesystems: bcachefs: fixed some spelling mistakes in the bcachefs coding style page

Specifically, fixed spelling of "commit" and pluralization of last sentence.

Signed-off-by: Dennis Lam <dennis.lamerice@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: kill btree_trans_restart_nounlock()
Kent Overstreet [Tue, 24 Sep 2024 02:11:41 +0000 (22:11 -0400)] 
bcachefs: kill btree_trans_restart_nounlock()

Redundant, the normal btree_trans_restart() doesn't unlock.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Remove unnecessary peek_slot()
Kent Overstreet [Tue, 24 Sep 2024 09:08:39 +0000 (05:08 -0400)] 
bcachefs: Remove unnecessary peek_slot()

hash_lookup() used to return an error code, and a peek_slot() call was
required to get the key it looked up. But we're adding fault injection
for transaction restarts, so fix this old unconverted code.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: move bch2_xattr_handlers to .rodata
Thomas Bertschinger [Sat, 14 Sep 2024 00:11:22 +0000 (18:11 -0600)] 
bcachefs: move bch2_xattr_handlers to .rodata

A series posted previously moved all of the `struct xattr_handler`
tables to .rodata for each filesystem [1].

However, this appears to have been done shortly before bcachefs was
merged, so bcachefs was missed at that time.

Link: https://lkml.kernel.org/r/20230930050033.41174-1-wedsonaf@gmail.com
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Thomas Bertschinger <tahbertschinger@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Delete dead code
Alan Huang [Fri, 27 Sep 2024 14:26:53 +0000 (22:26 +0800)] 
bcachefs: Delete dead code

lock_fail_root_changed has not been used since commit
0d7009d7ca99 ("bcachefs: Delete old deadlock avoidance code")

Remove it.

Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Pull disk accounting hooks out of trans_commit.c
Kent Overstreet [Tue, 1 Oct 2024 20:59:08 +0000 (16:59 -0400)] 
bcachefs: Pull disk accounting hooks out of trans_commit.c

Also, fix a minor bug in the revert path, where we weren't checking the
journal entry type correctly.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: bch_verbose_ratelimited
Kent Overstreet [Sun, 29 Sep 2024 03:10:48 +0000 (23:10 -0400)] 
bcachefs: bch_verbose_ratelimited

ratelimit "deleting unlinked inode" messages

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: rcu_pending: don't invoke __call_rcu() under lock
Kent Overstreet [Sun, 22 Sep 2024 05:11:36 +0000 (01:11 -0400)] 
bcachefs: rcu_pending: don't invoke __call_rcu() under lock

In userspace we don't (yet) have an SRCU implementation, so call_srcu()
recurses.

But we don't want to be invoking it under the lock anyway.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
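
The general pattern behind this change (collect work under the lock, drop the lock, then invoke callbacks) can be sketched as follows. This is not the rcu_pending code: `struct example_pending`, the toy non-recursive lock and all `example_*` functions are hypothetical, and the callback re-taking the lock merely models the recursion the commit message mentions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct example_pending {
	bool locked;		/* toy non-recursive lock */
	int queued[8];		/* items waiting for their callback */
	size_t nr;
	int processed;		/* sum of items handled by the callback */
};

static void example_lock(struct example_pending *p)
{
	assert(!p->locked);	/* re-entry would be a self-deadlock */
	p->locked = true;
}

static void example_unlock(struct example_pending *p)
{
	assert(p->locked);
	p->locked = false;
}

/* A callback that itself needs the lock, modelling a recursive path. */
static void example_callback(struct example_pending *p, int item)
{
	example_lock(p);
	p->processed += item;
	example_unlock(p);
}

static void example_flush(struct example_pending *p)
{
	int batch[8];
	size_t i, nr;

	/* Step 1: move the queued items to a private batch under the lock. */
	example_lock(p);
	nr = p->nr;
	for (i = 0; i < nr; i++)
		batch[i] = p->queued[i];
	p->nr = 0;
	example_unlock(p);

	/* Step 2: run callbacks only after the lock has been dropped. */
	for (i = 0; i < nr; i++)
		example_callback(p, batch[i]);
}
```

Had the callbacks run inside step 1, the assert in example_lock() would fire, which is the toy equivalent of recursing into the lock.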
6 months agobcachefs: __bch2_key_has_snapshot_overwrites uses for_each_btree_key_reverse_norestart()
Kent Overstreet [Mon, 30 Sep 2024 04:14:09 +0000 (00:14 -0400)] 
bcachefs: __bch2_key_has_snapshot_overwrites uses for_each_btree_key_reverse_norestart()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: remove_backpointer() now uses dirent_get_by_pos()
Kent Overstreet [Tue, 1 Oct 2024 21:45:58 +0000 (17:45 -0400)] 
bcachefs: remove_backpointer() now uses dirent_get_by_pos()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: bch2_inode_should_have_bp -> bch2_inode_should_have_single_bp
Kent Overstreet [Sat, 28 Sep 2024 18:27:24 +0000 (14:27 -0400)] 
bcachefs: bch2_inode_should_have_bp -> bch2_inode_should_have_single_bp

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: remove superfluous ; after statements
Colin Ian King [Mon, 7 Oct 2024 08:11:21 +0000 (09:11 +0100)] 
bcachefs: remove superfluous ; after statements

There are several statements followed by two semicolons; replace
these with just one semicolon.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
6 months agobcachefs: Fix racy use of jiffies
Kent Overstreet [Wed, 9 Oct 2024 20:53:59 +0000 (16:53 -0400)] 
bcachefs: Fix racy use of jiffies

Calculate the timeout, then check if it's positive before calling
schedule_timeout().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
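
One plausible reading of the race: if the code first checks that the deadline is still in the future and then reads the tick counter a second time to compute the sleep length, the counter may advance between the two reads and yield a non-positive timeout. The userspace sketch below reads the counter once and checks the result. `example_jiffies`, `example_schedule_timeout()` and `example_wait_until()` are stand-ins invented for this example, not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the kernel's jiffies counter; it can change at any time. */
static volatile unsigned long example_jiffies;

/* Records the last timeout slept for; a stub in place of a real sleep. */
static long example_last_sleep = -1;

static void example_schedule_timeout(long timeout)
{
	assert(timeout > 0);	/* a non-positive timeout is the bug to avoid */
	example_last_sleep = timeout;
}

/* Returns true if we slept, false if the deadline had already passed. */
static bool example_wait_until(unsigned long deadline)
{
	/* Single read of the counter; the signed cast makes "past" negative. */
	long timeout = (long)(deadline - example_jiffies);

	if (timeout <= 0)
		return false;

	example_schedule_timeout(timeout);
	return true;
}
```

Because the value that is checked is the same value that is passed on, a later tick can no longer make the sleep length inconsistent with the check.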
6 months agoMerge branch 'bcachefs-kill-retry-estale' into HEAD
Kent Overstreet [Fri, 11 Oct 2024 23:23:26 +0000 (19:23 -0400)] 
Merge branch 'bcachefs-kill-retry-estale' into HEAD

6 months agoLinux 6.13-rc3 v6.13-rc3
Linus Torvalds [Sun, 15 Dec 2024 23:58:23 +0000 (15:58 -0800)] 
Linux 6.13-rc3

6 months agoMerge tag 'arc-6.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Linus Torvalds [Sun, 15 Dec 2024 23:38:12 +0000 (15:38 -0800)] 
Merge tag 'arc-6.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc

Pull ARC fixes from Vineet Gupta:

 - Sundry build and misc fixes

* tag 'arc-6.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARC: build: Try to guess GCC variant of cross compiler
  ARC: bpf: Correct conditional check in 'check_jmp_32'
  ARC: dts: Replace deprecated snps,nr-gpios property for snps,dw-apb-gpio-port devices
  ARC: build: Use __force to suppress per-CPU cmpxchg warnings
  ARC: fix reference of dependency for PAE40 config
  ARC: build: disallow invalid PAE40 + 4K page config
  arc: rename aux.h to arc_aux.h

6 months agoMerge tag 'efi-fixes-for-v6.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 15 Dec 2024 23:33:41 +0000 (15:33 -0800)] 
Merge tag 'efi-fixes-for-v6.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi

Pull EFI fixes from Ard Biesheuvel:

 - Limit EFI zboot to GZIP and ZSTD before it comes into wider use

 - Fix inconsistent error when looking up a non-existent file in
   efivarfs with a name that does not adhere to the NAME-GUID format

 - Drop some unused code

* tag 'efi-fixes-for-v6.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
  efi/esrt: remove esre_attribute::store()
  efivarfs: Fix error on non-existent file
  efi/zboot: Limit compression options to GZIP and ZSTD

6 months agoMerge tag 'i2c-for-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa...
Linus Torvalds [Sun, 15 Dec 2024 23:29:07 +0000 (15:29 -0800)] 
Merge tag 'i2c-for-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:
 "i2c host fixes: PNX used the wrong unit for timeouts, Nomadik was
  missing a sentinel, and RIIC was missing rounding up"

* tag 'i2c-for-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: riic: Always round-up when calculating bus period
  i2c: nomadik: Add missing sentinel to match table
  i2c: pnx: Fix timeout in wait functions