git.ipfire.org Git - thirdparty/xfsprogs-dev.git/log
14 months ago xfs_repair: remove the old bag implementation
Darrick J. Wong [Mon, 22 Apr 2024 17:01:19 +0000 (10:01 -0700)] 
xfs_repair: remove the old bag implementation

Remove the old bag implementation.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
14 months ago xfs_repair: port to the new refcount bag structure
Darrick J. Wong [Mon, 22 Apr 2024 17:01:19 +0000 (10:01 -0700)] 
xfs_repair: port to the new refcount bag structure

Port the refcount record generating code to use the new refcount bag
data structure.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
14 months ago xfs_repair: create refcount bag
Darrick J. Wong [Mon, 22 Apr 2024 17:01:19 +0000 (10:01 -0700)] 
xfs_repair: create refcount bag

Create a bag structure for refcount information that uses the refcount
bag btree defined in the previous patch.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
14 months ago xfs_repair: define an in-memory btree for storing refcount bag info
Darrick J. Wong [Mon, 22 Apr 2024 17:01:18 +0000 (10:01 -0700)] 
xfs_repair: define an in-memory btree for storing refcount bag info

Create a new in-memory btree type so that we can store refcount bag info
in a much more memory-efficient format.  Note that the xfs_repair rcbag
btree stores inode numbers (unlike the kernel rcbag btree) because
xfs_repair needs to compute the bitmap of inodes that must have the
reflink iflag set.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
14 months ago xfs_repair: remove the old rmap collection slabs
Darrick J. Wong [Mon, 22 Apr 2024 17:01:18 +0000 (10:01 -0700)] 
xfs_repair: remove the old rmap collection slabs

Now that we've switched the offline repair code to use an in-memory
rmap btree for everything except recording the rmaps for the newly
generated per-AG btrees, get rid of all the old code.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
14 months ago xfs_repair: reduce rmap bag memory usage when creating refcounts
Darrick J. Wong [Mon, 22 Apr 2024 17:01:18 +0000 (10:01 -0700)] 
xfs_repair: reduce rmap bag memory usage when creating refcounts

The algorithm that computes reference count records uses a "bag"
structure to remember the rmap records corresponding to the current
block.  In the previous patch we converted the bag structure to store
actual rmap records instead of pointers to rmap records owned by another
structure as part of preparing for converting this algorithm to use
in-memory rmap btrees.

However, the memory usage of the bag structure is now excessive -- we
only need the physical extent and inode owner information to generate
refcount records and mark inodes that require the reflink flag.  IOWs,
the flags and offset fields are unnecessary.  Create a custom structure
for the bag, which halves its memory usage.
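
As a rough illustration of the saving described above, the sketch below
compares a full reverse-mapping record with a bag entry that keeps only
the physical extent and the owner.  The structure and function names are
hypothetical and are not the ones used in xfs_repair; the field widths
are assumptions chosen to mirror the description.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical full rmap record: everything an rmap btree would track. */
struct full_rmap_rec {
	uint32_t	startblock;	/* physical start of the extent */
	uint32_t	blockcount;	/* length of the extent */
	uint64_t	owner;		/* inode (or metadata) owner */
	uint64_t	offset;		/* file offset; not needed for refcounts */
	uint32_t	flags;		/* fork/unwritten flags; not needed either */
};

/* Hypothetical bag entry: only what refcount generation consumes. */
struct bag_rec {
	uint32_t	startblock;
	uint32_t	blockcount;
	uint64_t	owner;
};

/* Copy the fields the bag needs and drop the rest. */
struct bag_rec bag_rec_from_rmap(const struct full_rmap_rec *r)
{
	struct bag_rec b = {
		.startblock	= r->startblock,
		.blockcount	= r->blockcount,
		.owner		= r->owner,
	};
	return b;
}

/* Bytes saved when nr records are held in the slimmer form. */
size_t bag_bytes_saved(size_t nr)
{
	return nr * (sizeof(struct full_rmap_rec) - sizeof(struct bag_rec));
}
```

On a typical LP64 ABI the first structure pads out to 32 bytes and the
second occupies 16, which matches the "halves" figure; other ABIs may
pad differently.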

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
14 months ago xfs_repair: compute refcount data from in-memory rmap btrees
Darrick J. Wong [Mon, 22 Apr 2024 17:01:18 +0000 (10:01 -0700)] 
xfs_repair: compute refcount data from in-memory rmap btrees

Use the in-memory rmap btrees to compute the reference count
information.  Convert the bag implementation to hold actual records
instead of pointers to slab objects.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
14 months ago xfs_repair: verify on-disk rmap btrees with in-memory btree data
Darrick J. Wong [Mon, 22 Apr 2024 17:01:18 +0000 (10:01 -0700)] 
xfs_repair: verify on-disk rmap btrees with in-memory btree data

Check the on-disk reverse mappings with the observations we've recorded
in the in-memory btree during the filesystem walk.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
14 months ago xfs_repair: convert regular rmap repair to use in-memory btrees
Darrick J. Wong [Mon, 22 Apr 2024 17:01:17 +0000 (10:01 -0700)] 
xfs_repair: convert regular rmap repair to use in-memory btrees

Convert the rmap btree repair code to use in-memory rmap btrees to store
the observed reverse mapping records.  This will eliminate the need for
a separate record sorting step, as well as eliminate the need for all
the code that turns multiple consecutive bmap records into a single rmap
record.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
14 months ago libxfs: provide a kernel-compatible kasprintf
Christoph Hellwig [Mon, 22 Apr 2024 17:01:17 +0000 (10:01 -0700)] 
libxfs: provide a kernel-compatible kasprintf

The kernel-like kasprintf will be used by the new metadir code, as well
as the rmap data structures in xfs_repair.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: tweak commit message]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
14 months ago xfs_repair: check num before bplist[num]
Darrick J. Wong [Mon, 22 Apr 2024 17:01:17 +0000 (10:01 -0700)] 
xfs_repair: check num before bplist[num]

smatch complained that the array index was range-checked only after it
had already been used to index the array, so check the index first.
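
The ordering issue can be shown with a small sketch.  The helper below
is hypothetical (it is not the xfs_repair code); it only demonstrates
putting the range check on the left of && so that the array is never
read with an out-of-range index.

```c
#include <stddef.h>

/*
 * Count the leading non-NULL entries of list[0..max).
 * Hypothetical helper used only to illustrate the check ordering.
 */
size_t count_filled(void *const *list, size_t max)
{
	size_t num = 0;

	/*
	 * Test num against max first.  Because && short-circuits,
	 * list[num] is only evaluated once num is known to be in range.
	 * Writing "list[num] != NULL && num < max" would read one element
	 * past the end when every slot is filled.
	 */
	while (num < max && list[num] != NULL)
		num++;

	return num;
}
```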

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
14 months ago xfs_repair: log when buffers fail CRC checks even if we just recompute it
Darrick J. Wong [Sun, 2 Jun 2024 23:32:50 +0000 (16:32 -0700)] 
xfs_repair: log when buffers fail CRC checks even if we just recompute it

We should always log metadata block CRC validation errors, even if we
decide that the block contents are ok and that we'll simply recompute
the checksum.  Without this patch, xfs_repair -n won't say anything
about crc errors on these blocks.
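
A minimal sketch of the intended behaviour, assuming a hypothetical
block descriptor and a no_modify flag standing in for xfs_repair -n
(none of these names come from the xfs_repair sources): the mismatch is
reported unconditionally, and only the rewrite is skipped in no-modify
mode.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical view of a metadata block after it has been read. */
struct meta_block {
	uint32_t	stored_crc;	/* checksum found on disk */
	uint32_t	computed_crc;	/* checksum of the current contents */
	bool		contents_ok;	/* did the structural checks pass? */
};

enum crc_action {
	CRC_OK,			/* checksum matched, nothing to report */
	CRC_LOGGED_ONLY,	/* mismatch reported, block left alone */
	CRC_RECOMPUTED,		/* mismatch reported, checksum rewritten */
	CRC_BLOCK_BAD,		/* mismatch reported, contents also bad */
};

enum crc_action check_block_crc(struct meta_block *b, bool no_modify,
				unsigned int *nr_warnings)
{
	if (b->stored_crc == b->computed_crc)
		return CRC_OK;

	/* Always log the mismatch, whatever we decide to do next. */
	(*nr_warnings)++;
	fprintf(stderr, "bad CRC 0x%x, expected 0x%x\n",
		(unsigned int)b->stored_crc, (unsigned int)b->computed_crc);

	if (!b->contents_ok)
		return CRC_BLOCK_BAD;
	if (no_modify)
		return CRC_LOGGED_ONLY;

	/* Contents look fine, so simply recompute the checksum. */
	b->stored_crc = b->computed_crc;
	return CRC_RECOMPUTED;
}
```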

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
14 months ago xfs_scrub: upload clean bills of health
Darrick J. Wong [Mon, 22 Apr 2024 17:01:17 +0000 (10:01 -0700)] 
xfs_scrub: upload clean bills of health

If scrub terminates with a clean bill of health, tell the kernel that
the result of the scan is that everything's healthy.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
14 months ago xfs_scrub: use multiple threads to run in-kernel metadata scrubs that scan inodes
Darrick J. Wong [Mon, 22 Apr 2024 17:01:17 +0000 (10:01 -0700)] 
xfs_scrub: use multiple threads to run in-kernel metadata scrubs that scan inodes

Instead of running the inode link count and quotacheck scanners in
serial, run them in parallel, with a slight delay to stagger the work to
reduce inode resource contention.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
14 months ago xfs_scrub: update health status if we get a clean bill of health
Darrick J. Wong [Mon, 22 Apr 2024 17:01:16 +0000 (10:01 -0700)] 
xfs_scrub: update health status if we get a clean bill of health

If we checked a filesystem and it turned out to be clean, upload that
information into the kernel.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
14 months ago xfs_scrub: check file link counts
Darrick J. Wong [Mon, 22 Apr 2024 17:01:16 +0000 (10:01 -0700)] 
xfs_scrub: check file link counts

Check file link counts as part of checking a filesystem.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
14 months ago xfs_scrub: implement live quotacheck inode scan
Darrick J. Wong [Mon, 22 Apr 2024 17:01:16 +0000 (10:01 -0700)] 
xfs_scrub: implement live quotacheck inode scan

Teach xfs_scrub to check quota resource usage counters when checking a
filesystem.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
14 months ago xfs_spaceman: report health of inode link counts
Darrick J. Wong [Mon, 22 Apr 2024 17:01:16 +0000 (10:01 -0700)] 
xfs_spaceman: report health of inode link counts

Report on the health of the inode link counts.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
14 months ago xfs_spaceman: report the health of quota counts
Darrick J. Wong [Mon, 22 Apr 2024 17:01:15 +0000 (10:01 -0700)] 
xfs_spaceman: report the health of quota counts

Report the health of quota counts.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
14 months ago libxfs: add a realtime flag to the bmap update log redo items
Darrick J. Wong [Mon, 22 Apr 2024 17:01:15 +0000 (10:01 -0700)] 
libxfs: add a realtime flag to the bmap update log redo items

Extend the bmap update (BUI) log items with a new realtime flag that
indicates that the updates apply against a realtime file's data fork.
We'll wire up the actual code later.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
14 months ago libxfs: add a xattr_entry helper
Darrick J. Wong [Mon, 22 Apr 2024 17:01:15 +0000 (10:01 -0700)] 
libxfs: add a xattr_entry helper

Add a helper to translate from the item list head to the attr_intent
item structure and use it to shorten assignments and avoid the need for
extra local variables.

Inspired-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
14 months ago libxfs: reuse xfs_bmap_update_cancel_item
Darrick J. Wong [Mon, 22 Apr 2024 17:01:15 +0000 (10:01 -0700)] 
libxfs: reuse xfs_bmap_update_cancel_item

Reuse xfs_bmap_update_cancel_item to put the AG/RTG and free the item in
a few places that currently open code the logic.

Inspired-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
14 months ago libxfs: add a bi_entry helper
Darrick J. Wong [Mon, 22 Apr 2024 17:01:15 +0000 (10:01 -0700)] 
libxfs: add a bi_entry helper

Add a helper to translate from the item list head to the bmap_intent
structure and use it to shorten assignments and avoid the need for extra
local variables.
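
For readers unfamiliar with the pattern, a helper of this kind is
usually a thin wrapper around container_of.  The sketch below is
self-contained and uses simplified, assumed definitions of the list
head and the intent structure; the real libxfs structures carry more
fields.

```c
#include <stddef.h>

/* Minimal stand-in for the kernel's doubly linked list head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified, assumed layout of a deferred bmap work item. */
struct bmap_intent {
	int			bi_type;	/* illustrative payload */
	struct list_head	bi_list;	/* linkage in the work list */
};

/* Translate from the list head embedded in an item to the item itself. */
struct bmap_intent *bi_entry(const struct list_head *e)
{
	return container_of(e, struct bmap_intent, bi_list);
}
```

A caller that previously needed a local variable and an open-coded
container_of can then write bi_entry(item)->bi_type directly.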

Inspired-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
14 months ago libxfs: remove kmem_alloc, kmem_zalloc, and kmem_free
Darrick J. Wong [Mon, 22 Apr 2024 17:01:15 +0000 (10:01 -0700)] 
libxfs: remove kmem_alloc, kmem_zalloc, and kmem_free

Remove all three of these helpers now that the kernel has dropped them.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
14 months ago xfs: allow sunit mount option to repair bad primary sb stripe values
Dave Chinner [Mon, 22 Apr 2024 17:01:14 +0000 (10:01 -0700)] 
xfs: allow sunit mount option to repair bad primary sb stripe values

Source kernel commit: 15922f5dbf51dad334cde888ce6835d377678dc9

If a filesystem has a busted stripe alignment configuration on disk
(e.g. because broken RAID firmware told mkfs that swidth was smaller
than sunit), then the filesystem will refuse to mount due to the
stripe validation failing. This failure is triggering during distro
upgrades from old kernels lacking this check to newer kernels with
this check, and currently the only way to fix it is with offline
xfs_db surgery.

This runtime validity checking occurs when we read the superblock
for the first time and causes the mount to fail immediately. This
prevents the rewrite of stripe unit/width via
mount options that occurs later in the mount process. Hence there is
no way to recover this situation without resorting to offline xfs_db
rewrite of the values.

However, we parse the mount options long before we read the
superblock, and we know if the mount has been asked to re-write the
stripe alignment configuration when we are reading the superblock
and verifying it for the first time. Hence we can conditionally
ignore stripe verification failures if the mount options specified
will correct the issue.

We validate that the new stripe unit/width are valid before we
overwrite the superblock values, so we can ignore the invalid config
at verification and fail the mount later if the new values are not
valid. This, at least, gives users the chance of correcting the
issue after a kernel upgrade without having to resort to xfs_db
hacks.
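
The control flow described above can be sketched as follows.  This is a
simplified model under stated assumptions, not the kernel code: the
structure, the function names and the validity rule (stripe width must
be a non-zero multiple of the stripe unit, or both must be zero) are
illustrative only.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stripe geometry, in filesystem blocks. */
struct stripe_geom {
	uint32_t	sunit;
	uint32_t	swidth;
};

enum mount_result {
	MOUNT_OK,
	MOUNT_FAIL_SB,		/* bad on-disk values and no override given */
	MOUNT_FAIL_OPTS,	/* the override itself is invalid */
};

/* Simplified validity rule, assumed for this sketch. */
bool stripe_geom_valid(const struct stripe_geom *g)
{
	if (g->sunit == 0)
		return g->swidth == 0;
	return g->swidth != 0 && g->swidth % g->sunit == 0;
}

/*
 * opts is NULL when no sunit/swidth mount options were given.  The
 * options are parsed before the superblock is read, so by the time we
 * verify the superblock we already know whether an override is coming.
 */
enum mount_result mount_check_stripe(struct stripe_geom *sb,
				     const struct stripe_geom *opts)
{
	/* Only a bad superblock with no pending override is fatal here. */
	if (opts == NULL && !stripe_geom_valid(sb))
		return MOUNT_FAIL_SB;

	if (opts != NULL) {
		/* Validate the new values before overwriting the old ones. */
		if (!stripe_geom_valid(opts))
			return MOUNT_FAIL_OPTS;
		*sb = *opts;
	}
	return MOUNT_OK;
}
```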

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago xfs: shrink failure needs to hold AGI buffer
Dave Chinner [Mon, 22 Apr 2024 17:01:14 +0000 (10:01 -0700)] 
xfs: shrink failure needs to hold AGI buffer

Source kernel commit: 75bcffbb9e7563259b7aed0fa77459d6a3a35627

Chandan reported an AGI/AGF lock order hang on xfs/168 during recent
testing. The cause of the problem was the task running xfs_growfs
to shrink the filesystem. A failure occurred trying to remove the
free space from the btrees that the shrink would make disappear,
and that meant it ran the error handling for a partial failure.

This error path involves restoring the per-ag block reservations,
and that requires calculating the amount of space needed to be
reserved for the free inode btree. The growfs operation hung here:

[18679.536829]  down+0x71/0xa0
[18679.537657]  xfs_buf_lock+0xa4/0x290 [xfs]
[18679.538731]  xfs_buf_find_lock+0xf7/0x4d0 [xfs]
[18679.539920]  xfs_buf_lookup.constprop.0+0x289/0x500 [xfs]
[18679.542628]  xfs_buf_get_map+0x2b3/0xe40 [xfs]
[18679.547076]  xfs_buf_read_map+0xbb/0x900 [xfs]
[18679.562616]  xfs_trans_read_buf_map+0x449/0xb10 [xfs]
[18679.569778]  xfs_read_agi+0x1cd/0x500 [xfs]
[18679.573126]  xfs_ialloc_read_agi+0xc2/0x5b0 [xfs]
[18679.578708]  xfs_finobt_calc_reserves+0xe7/0x4d0 [xfs]
[18679.582480]  xfs_ag_resv_init+0x2c5/0x490 [xfs]
[18679.586023]  xfs_ag_shrink_space+0x736/0xd30 [xfs]
[18679.590730]  xfs_growfs_data_private.isra.0+0x55e/0x990 [xfs]
[18679.599764]  xfs_growfs_data+0x2f1/0x410 [xfs]
[18679.602212]  xfs_file_ioctl+0xd1e/0x1370 [xfs]

trying to get the AGI lock. The AGI lock was held by an fsstress task
trying to do an inode allocation, and it was waiting on the AGF
lock to allocate a new inode chunk on disk. Hence deadlock.

The fix for this is for the growfs code to hold the AGI over the
transaction roll it does in the error path. It already holds the AGF
locked across this, and that is what causes the lock order inversion
in the xfs_ag_resv_init() call.

Reported-by: Chandan Babu R <chandanbabu@kernel.org>
Fixes: 46141dc891f7 ("xfs: introduce xfs_ag_shrink_space()")
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago xfs: xfs_btree_bload_prep_block() should use __GFP_NOFAIL
Dave Chinner [Mon, 22 Apr 2024 17:01:14 +0000 (10:01 -0700)] 
xfs: xfs_btree_bload_prep_block() should use __GFP_NOFAIL

Source kernel commit: 3aca0676a1141c4d198f8b3c934435941ba84244

This was missed in the conversion from KM* flags.

Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Fixes: 10634530f7ba ("xfs: convert kmem_zalloc() to kzalloc()")
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago xfs: move symlink target write function to libxfs
Darrick J. Wong [Mon, 22 Apr 2024 17:01:14 +0000 (10:01 -0700)] 
xfs: move symlink target write function to libxfs

Source kernel commit: b8102b61f7b8929ad8043e4574a1e26276398041

Move xfs_symlink_write_target to xfs_symlink_remote.c so that kernel and
mkfs can share the same function.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago xfs: move remote symlink target read function to libxfs
Darrick J. Wong [Mon, 22 Apr 2024 17:01:13 +0000 (10:01 -0700)] 
xfs: move remote symlink target read function to libxfs

Source kernel commit: 376b4f0522484f43660dab8e4e92b471863b49f9

Move xfs_readlink_bmap_ilocked to xfs_symlink_remote.c so that the
swapext code can use it to convert a remote format symlink back to
shortform format after a metadata repair.  While we're at it, fix a
broken printf prefix.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago xfs: move xfs_symlink_remote.c declarations to xfs_symlink_remote.h
Darrick J. Wong [Mon, 22 Apr 2024 17:01:13 +0000 (10:01 -0700)] 
xfs: move xfs_symlink_remote.c declarations to xfs_symlink_remote.h

Source kernel commit: 622d88e2ad7960b83af38dabf6b848a22a5a1c1f

Move declarations for libxfs symlink functions into a separate header
file like we do for most everything else.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago xfs: xfs_bmap_finish_one should map unwritten extents properly
Darrick J. Wong [Mon, 22 Apr 2024 17:01:13 +0000 (10:01 -0700)] 
xfs: xfs_bmap_finish_one should map unwritten extents properly

Source kernel commit: 6c8127e93e3ac9c2cf6a13b885dd2d057b7e7d50

The deferred bmap work state and the log item can transmit unwritten
state, so the XFS_BMAP_MAP handler must map in extents with that
unwritten state.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago xfs: support deferred bmap updates on the attr fork
Darrick J. Wong [Mon, 22 Apr 2024 17:01:13 +0000 (10:01 -0700)] 
xfs: support deferred bmap updates on the attr fork

Source kernel commit: 52f807067ba4a122e75bf1e0e0595c78e6a3d8b6

The deferred bmap update log item has always supported the attr fork, so
plumb this in so that higher layers can access this.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago xfs: add a realtime flag to the bmap update log redo items
Darrick J. Wong [Mon, 22 Apr 2024 17:01:13 +0000 (10:01 -0700)] 
xfs: add a realtime flag to the bmap update log redo items

Source kernel commit: 7302cda7f8b08062b11d2ba9ae0b4f3871fe3d46

Extend the bmap update (BUI) log items with a new realtime flag that
indicates that the updates apply against a realtime file's data fork.
We'll wire up the actual code later.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago xfs: fix xfs_bunmapi to allow unmapping of partial rt extents
Darrick J. Wong [Mon, 22 Apr 2024 17:01:12 +0000 (10:01 -0700)] 
xfs: fix xfs_bunmapi to allow unmapping of partial rt extents

Source kernel commit: 2b6a5ec26887cba195022286b039f2cc0ec683b1

When XFS_BMAPI_REMAP is passed to bunmapi, that means that we want to
remove part of a block mapping without touching the allocator.  For
realtime files with rtextsize > 1, that also means that we should skip
all the code that changes a partial remove request into an unwritten
extent conversion.  IOWs, bunmapi in this mode should handle removing
the mapping from the rt file and nothing else.

Note that XFS_BMAPI_REMAP callers are required to decrement the
reference count and/or free the space manually.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago xfs: move xfs_bmap_defer_add to xfs_bmap_item.c
Darrick J. Wong [Mon, 22 Apr 2024 17:01:12 +0000 (10:01 -0700)] 
xfs: move xfs_bmap_defer_add to xfs_bmap_item.c

Source kernel commit: 80284115854e60686b2e0183b31bb303ae69aa8c

Move the code that adds the incore xfs_bmap_item deferred work data to a
transaction to live with the BUI log item code.  This means that the file
mapping code no longer has to know about the inner workings of the BUI
log items.

As a consequence, we can hide the _get_group helper.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago xfs: clean up bmap log intent item tracepoint callsites
Darrick J. Wong [Mon, 22 Apr 2024 17:01:12 +0000 (10:01 -0700)] 
xfs: clean up bmap log intent item tracepoint callsites

Source kernel commit: 2a15e7686094d1362b5026533b96f57ec989a245

Pass the incore bmap structure to the tracepoints instead of open-coding
the argument passing.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago xfs: hook live rmap operations during a repair operation
Darrick J. Wong [Mon, 22 Apr 2024 17:01:12 +0000 (10:01 -0700)] 
xfs: hook live rmap operations during a repair operation

Source kernel commit: 7e1b84b24d257700e417bc9cd724c1efdff653d7

Hook the regular rmap code when an rmapbt repair operation is running so
that we can unlock the AGF buffer to scan the filesystem and keep the
in-memory btree up to date during the scan.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago xfs: create a shadow rmap btree during rmap repair
Darrick J. Wong [Mon, 22 Apr 2024 17:01:11 +0000 (10:01 -0700)] 
xfs: create a shadow rmap btree during rmap repair

Source kernel commit: 4787fc802752c9b73b28ff18860c0560bf4337f2

Create an in-memory btree of rmap records instead of an array.  This
enables us to do live record collection instead of freezing the fs.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago xfs: repair the rmapbt
Darrick J. Wong [Mon, 22 Apr 2024 17:01:11 +0000 (10:01 -0700)] 
xfs: repair the rmapbt

Source kernel commit: 32080a9b9b2ef8f4089e8e28a2c307334431757e

Rebuild the reverse mapping btree from all primary metadata.  This first
patch establishes the bare mechanics of finding records and putting
together a new ondisk tree; more complex pieces are needed to make it
work properly.

Link: Documentation/filesystems/xfs-online-fsck-design.rst
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago xfs: create a helper to decide if a file mapping targets the rt volume
Darrick J. Wong [Mon, 22 Apr 2024 17:01:11 +0000 (10:01 -0700)] 
xfs: create a helper to decide if a file mapping targets the rt volume

Source kernel commit: 5049ff4d140c8f6545464811409302cab017321a

Create a helper so that we can stop open-coding this decision
everywhere.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago xfs: launder in-memory btree buffers before transaction commit
Darrick J. Wong [Mon, 22 Apr 2024 17:01:11 +0000 (10:01 -0700)] 
xfs: launder in-memory btree buffers before transaction commit

Source kernel commit: 0dc63c8a1ce39c1ac7da536ee9174cdc714afae2

As we've noted in various places, all current users of in-memory btrees
are online fsck.  Online fsck only stages a btree long enough to rebuild
an ondisk data structure, which means that the in-memory btree is
ephemeral.  Furthermore, if we encounter /any/ errors while updating an
in-memory btree, all we do is tear down all the staged data and return
an errno to userspace.  In-memory btrees need not be transactional, so
their buffers should not be committed to the ondisk log, nor should they
be checkpointed by the AIL.  That's just as well since the ephemeral
nature of the btree means that the buftarg and the buffers may disappear
quickly anyway.

Therefore, we need a way to launder the btree buffers that get attached
to the transaction by the generic btree code.  Because the buffers are
directly mapped to backing file pages, there's no need to bwrite them
back to the tmpfs file.  All we need to do is clean enough of the buffer
log item state so that the bli can be detached from the buffer, remove
the bli from the transaction's log item list, and reset the transaction
dirty state as if the laundered items had never been there.

For simplicity, create xfbtree transaction commit and cancel helpers
that launder the in-memory btree buffers for callers.  Once laundered,
call the write verifier on non-stale buffers to avoid integrity issues,
or punch a hole in the backing file for stale buffers.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago xfs: support in-memory btrees
Darrick J. Wong [Mon, 22 Apr 2024 17:01:11 +0000 (10:01 -0700)] 
xfs: support in-memory btrees

Source kernel commit: a095686a2383526d7315197e2419d84ee8470217

Adapt the generic btree cursor code to be able to create a btree whose
buffers come from a (presumably in-memory) buftarg with a header block
that's specific to in-memory btrees.  We'll connect this to other parts
of online scrub in the next patches.

Note that in-memory btrees always have a block size matching the system
memory page size for efficiency reasons.  There are also a few things we
need to do to finalize a btree update; that's covered in the next patch.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago xfs: add a xfs_btree_ptrs_equal helper
Christoph Hellwig [Mon, 22 Apr 2024 17:01:10 +0000 (10:01 -0700)] 
xfs: add a xfs_btree_ptrs_equal helper

Source kernel commit: 8c1771c45dfa9dddd4569727c48204b66073d2c2

This only has a single caller and thus might be a bit questionable,
but I think it really improves the readability of
xfs_btree_visit_block.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago libxfs: support in-memory buffer cache targets
Darrick J. Wong [Mon, 22 Apr 2024 17:01:10 +0000 (10:01 -0700)] 
libxfs: support in-memory buffer cache targets

Allow the buffer cache to target in-memory files by connecting it to
xfiles.  The next few patches will enable creating xfs_btrees in memory.
Unlike the kernel version of this patch, we use a partitioned xfile to
avoid overflowing the fd table instead of opening a separate memfd for
each target.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago xfs: teach buftargs to maintain their own buffer hashtable
Darrick J. Wong [Mon, 22 Apr 2024 17:01:10 +0000 (10:01 -0700)] 
xfs: teach buftargs to maintain their own buffer hashtable

Source kernel commit: e7b58f7c1be20550d4f51cec6307b811e7555f52

Currently, cached buffers are indexed by per-AG hashtables.  This works
great for the data device, but won't work for in-memory btrees.  To
handle that use case, buftargs will need to be able to index buffers
independently of other data structures.

We accomplish this by hoisting the rhashtable and its lock into a
separate xfs_buf_cache structure, making the buftarg point to the
_buf_cache structure, and reworking various functions to use it.  This
will enable the in-memory buftarg to come up with its own _buf_cache.
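
The shape of that change can be sketched with a toy cache.  Everything
below is an assumption made for illustration: a fixed-size chained hash
table stands in for the rhashtable, locking is omitted, and the
structure and function names are not the kernel's.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_BUCKETS	8

/* Toy cached buffer, keyed by its disk address. */
struct buf {
	uint64_t	daddr;
	struct buf	*next;		/* hash chain */
};

/* The index, hoisted into its own structure. */
struct buf_cache {
	struct buf	*buckets[CACHE_BUCKETS];
};

/* Each target points at whichever cache indexes its buffers. */
struct buftarg {
	struct buf_cache	*bt_cache;
};

static size_t bucket_of(uint64_t daddr)
{
	return (size_t)(daddr % CACHE_BUCKETS);
}

/* Add a buffer to the cache that belongs to this target. */
void buf_cache_insert(struct buftarg *t, struct buf *b)
{
	size_t i = bucket_of(b->daddr);

	b->next = t->bt_cache->buckets[i];
	t->bt_cache->buckets[i] = b;
}

/* Look a buffer up in this target's cache only. */
struct buf *buf_cache_lookup(const struct buftarg *t, uint64_t daddr)
{
	struct buf *b = t->bt_cache->buckets[bucket_of(daddr)];

	while (b != NULL && b->daddr != daddr)
		b = b->next;
	return b;
}
```

Because lookups go through the target's own pointer, an in-memory
target can bring its own cache and the same disk address in two targets
never collides.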

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago libxfs: partition memfd files to avoid using too many fds
Darrick J. Wong [Mon, 22 Apr 2024 17:01:10 +0000 (10:01 -0700)] 
libxfs: partition memfd files to avoid using too many fds

In a few patchsets from now, we'll transition xfs_repair to use
memfd-backed rmap and rcbag btrees for storing repair data instead of
heap allocations.  This allows repair to use libxfs code shared from the
online repair code, which reduces the size of the codebase.  It also
reduces heap fragmentation, which might be critical on 32-bit systems.

However, there's one hitch -- userspace xfiles naively allocate one
memfd per data structure, but there's only so many file descriptors that
a process can open.  If a filesystem has a lot of allocation groups, we
can run out of fds and fail.  xfs_repair already tries to increase
RLIMIT_NOFILE to the maximum (~1M) but this can fail due to system or
memory constraints.

Fortunately, it is possible to compute the upper bound of a memfd btree,
which implies that we can store multiple btrees per memfd.  Make it so
that we can partition a memfd file to avoid running out of file
descriptors.
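
A minimal sketch of the partitioning idea, assuming Linux and glibc's
memfd_create().  The structure and helper names are hypothetical and do
not match the libxfs xfile API: each partition is a fixed-size window
into one shared memfd, and every access is bounds-checked and then
offset by the partition's base.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for memfd_create() */
#endif
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* One window into a shared memfd; hypothetical structure. */
struct xfile_part {
	int	fd;	/* backing memfd, shared by all partitions */
	off_t	base;	/* where this partition starts in the file */
	off_t	size;	/* upper bound on this partition's contents */
};

/* Create the single backing file that all partitions share. */
int xfile_backing_create(void)
{
	return memfd_create("xfile-sketch", MFD_CLOEXEC);
}

/* Carve out partition number index, each max_bytes long. */
struct xfile_part xfile_part_make(int fd, unsigned int index, off_t max_bytes)
{
	struct xfile_part p = { fd, (off_t)index * max_bytes, max_bytes };

	return p;
}

/* Reject any access that would spill into a neighbouring partition. */
static int xfile_part_range_ok(const struct xfile_part *p, size_t len, off_t pos)
{
	return pos >= 0 && pos <= p->size && len <= (size_t)(p->size - pos);
}

ssize_t xfile_part_pwrite(const struct xfile_part *p, const void *buf,
			  size_t len, off_t pos)
{
	if (!xfile_part_range_ok(p, len, pos))
		return -1;
	return pwrite(p->fd, buf, len, p->base + pos);
}

ssize_t xfile_part_pread(const struct xfile_part *p, void *buf,
			 size_t len, off_t pos)
{
	if (!xfile_part_range_ok(p, len, pos))
		return -1;
	return pread(p->fd, buf, len, p->base + pos);
}
```

With this layout the number of open file descriptors no longer grows
with the number of data structures, at the cost of having to know each
structure's maximum size in advance.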

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago libxfs: add xfile support
Darrick J. Wong [Mon, 22 Apr 2024 17:01:09 +0000 (10:01 -0700)] 
libxfs: add xfile support

Port the xfile functionality (anonymous pageable file-index memory) from
the kernel.  In userspace, we try to use memfd_create() to create tmpfs files
that are not in any namespace, matching the kernel.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago libxfs: teach buftargs to maintain their own buffer hashtable
Darrick J. Wong [Mon, 22 Apr 2024 17:01:09 +0000 (10:01 -0700)] 
libxfs: teach buftargs to maintain their own buffer hashtable

Currently, cached buffers are indexed with a single global bcache
structure.  This works ok for the limited use case where we only support
reading from the data device, but will fail badly when we want to
support buffers from in-memory btrees.  Move the bcache structure into
the buftarg.

As a side effect, we don't need to compare buftarg->bt_bdev anymore
since libxfs is careful enough not to create more than one buftarg per
open fd.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago xfs: move and rename xfs_btree_read_bufl
Christoph Hellwig [Mon, 22 Apr 2024 17:01:09 +0000 (10:01 -0700)] 
xfs: move and rename xfs_btree_read_bufl

Source kernel commit: 6a701eb8fbbb5f500684947883fd77ed0475fa82

Despite its name, xfs_btree_read_bufl doesn't contain any btree-related
functionality and isn't used by the btree code.  Move it to xfs_bmap.c,
hard code the refval and ops arguments and rename it to
xfs_bmap_read_buf.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago xfs: remove xfs_btree_reada_bufs
Christoph Hellwig [Mon, 22 Apr 2024 17:01:09 +0000 (10:01 -0700)] 
xfs: remove xfs_btree_reada_bufs

Source kernel commit: 6324b00c9ecb8d11a157d2a4bc3e5a495534bdf1

xfs_btree_reada_bufs just wraps xfs_btree_readahead and an agblock
to daddr conversion.  Just open code its three callsites in the
two callers (one of which isn't even btree related).

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago xfs: remove xfs_btree_reada_bufl
Christoph Hellwig [Mon, 22 Apr 2024 17:01:08 +0000 (10:01 -0700)] 
xfs: remove xfs_btree_reada_bufl

Source kernel commit: 5eec8fa30dfa548d07332756101053f47f6ba26c

xfs_btree_reada_bufl just wraps xfs_btree_readahead and an fsblock
to daddr conversion.  Just open code its two callsites in the only
caller.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: factor out a __xfs_btree_check_lblock_hdr helper
Christoph Hellwig [Mon, 22 Apr 2024 17:01:08 +0000 (10:01 -0700)] 
xfs: factor out a __xfs_btree_check_lblock_hdr helper

Source kernel commit: 79e72304dcba471e5c0dea2f3c67fe1a0558c140

This will allow sharing code with the in-memory block checking helper.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: rename btree helpers that depends on the block number representation
Christoph Hellwig [Mon, 22 Apr 2024 17:01:08 +0000 (10:01 -0700)] 
xfs: rename btree helpers that depends on the block number representation

Source kernel commit: 5ef819c34f954fccfc42f79b9b0bea9b40cef9a1

All these helpers hardcode fsblocks or agblocks and not just the pointer
size.  Rename them so that the names are still fitting when we add the
long format in-memory blocks and adjust the checks when calling them to
check the btree types and not just pointer length.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: consolidate btree block verification
Christoph Hellwig [Mon, 22 Apr 2024 17:01:08 +0000 (10:01 -0700)] 
xfs: consolidate btree block verification

Source kernel commit: 4ce0c711d9ab3a435bc605cd2f36a3f6b4e12c05

Add a __xfs_btree_check_block helper that can be called by the scrub code
to validate a btree block of any form, and move the duplicate error
handling code from xfs_btree_check_sblock and xfs_btree_check_lblock into
xfs_btree_check_block and thus remove these two helpers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: tighten up validation of root block in inode forks
Christoph Hellwig [Mon, 22 Apr 2024 17:01:08 +0000 (10:01 -0700)] 
xfs: tighten up validation of root block in inode forks

Source kernel commit: d477f1749f00899c71605ea01aba0ce67e030471

Check that root blocks that sit in the inode fork and thus have a NULL
bp don't have siblings.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: remove the crc variable in __xfs_btree_check_lblock
Christoph Hellwig [Mon, 22 Apr 2024 17:01:07 +0000 (10:01 -0700)] 
xfs: remove the crc variable in __xfs_btree_check_lblock

Source kernel commit: bd45019d9aa942d1c2457d96a7dbf2ad3051754b

crc is only used once, just use the xfs_has_crc check directly.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: misc cleanups for __xfs_btree_check_sblock
Christoph Hellwig [Mon, 22 Apr 2024 17:01:07 +0000 (10:01 -0700)] 
xfs: misc cleanups for __xfs_btree_check_sblock

Source kernel commit: 43be09192ce1f3cf9c3d2073e822a1d0a42fe5b2

Remove the local crc variable that is only used once and remove the bp
NULL checking as it can't ever be NULL for short form blocks.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: consolidate btree ptr checking
Christoph Hellwig [Mon, 22 Apr 2024 17:01:07 +0000 (10:01 -0700)] 
xfs: consolidate btree ptr checking

Source kernel commit: 57982d6c835a71da5c66e6090680de1adf6e736a

Merge xfs_btree_check_sptr and xfs_btree_check_lptr into a single
__xfs_btree_check_ptr that can be shared between xfs_btree_check_ptr
and the scrub code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: open code xfs_btree_check_lptr in xfs_bmap_btree_to_extents
Christoph Hellwig [Mon, 22 Apr 2024 17:01:07 +0000 (10:01 -0700)] 
xfs: open code xfs_btree_check_lptr in xfs_bmap_btree_to_extents

Source kernel commit: fb0793f206701a68f8588a09bf32f7cf44878ea3

xfs_bmap_btree_to_extents always passes a level of 1 to
xfs_btree_check_lptr, thus making the level check redundant.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: simplify xfs_btree_check_lblock_siblings
Christoph Hellwig [Mon, 22 Apr 2024 17:01:07 +0000 (10:01 -0700)] 
xfs: simplify xfs_btree_check_lblock_siblings

Source kernel commit: 8b8ada973cacff338a0e817a97dd0afa301798c0

Stop using xfs_btree_check_lptr in xfs_btree_check_lblock_siblings,
as it only duplicates the xfs_verify_fsbno call in the other leg of
if / else besides adding a tautological level check.

With this the cur and level arguments can be removed as they are
now unused.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: simplify xfs_btree_check_sblock_siblings
Christoph Hellwig [Mon, 22 Apr 2024 17:01:06 +0000 (10:01 -0700)] 
xfs: simplify xfs_btree_check_sblock_siblings

Source kernel commit: 4bc94bf640e08cf970354036683ec143a7ae974e

Stop using xfs_btree_check_sptr in xfs_btree_check_sblock_siblings,
as it only duplicates the xfs_verify_agbno call in the other leg of
if / else besides adding a tautological level check.

With this the cur and level arguments can be removed as they are
now unused.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: remove xfs_btnum_t
Christoph Hellwig [Mon, 22 Apr 2024 17:01:06 +0000 (10:01 -0700)] 
xfs: remove xfs_btnum_t

Source kernel commit: ec793e690f801d97a7ae2a0d429fea1fee4d44aa

The last checks for bc_btnum can be replaced with helpers that check
the btree ops.  This allows adding new btrees to XFS without having
to update a global enum.
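
As a rough illustration of what "helpers that check the btree ops" can look
like, the sketch below identifies a btree by comparing the cursor's ops
pointer against a known ops structure instead of consulting an enum.  The
names (sketch_btree_ops, sketch_inobt_ops, sketch_finobt_ops,
sketch_is_finobt) are hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_btree_ops {
	const char	*name;
};

/* One ops structure per btree; adding a btree adds a structure, not an enum value. */
static const struct sketch_btree_ops sketch_inobt_ops = { .name = "ino" };
static const struct sketch_btree_ops sketch_finobt_ops = { .name = "fino" };

struct sketch_btree_cur {
	const struct sketch_btree_ops	*ops;
};

/* Predicate on the ops pointer rather than on a global btree number. */
static bool sketch_is_finobt(const struct sketch_btree_cur *cur)
{
	assert(cur != NULL);
	return cur->ops == &sketch_finobt_ops;
}
```

Because the predicate only needs the address of an ops structure, a new
btree type can be introduced without touching any shared enumeration.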

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: complete the ops predicates]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: pass a 'bool is_finobt' to xfs_inobt_insert
Christoph Hellwig [Mon, 22 Apr 2024 17:01:06 +0000 (10:01 -0700)] 
xfs: pass a 'bool is_finobt' to xfs_inobt_insert

Source kernel commit: fbeef4e061ab28bf556af4ee2a5a9848dc4616c5

This is one of the last users of xfs_btnum_t and can only designate
either the inobt or finobt.  Replace it with a simple bool.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: split xfs_inobt_init_cursor
Christoph Hellwig [Mon, 22 Apr 2024 17:01:06 +0000 (10:01 -0700)] 
xfs: split xfs_inobt_init_cursor

Source kernel commit: 14dd46cf31f4aaffcf26b00de9af39d01ec8d547

Split xfs_inobt_init_cursor into separate routines for the inobt and
finobt to prepare for the removal of the xfs_btnum global enumeration
of btree types.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: split xfs_inobt_insert_sprec
Christoph Hellwig [Mon, 22 Apr 2024 17:01:06 +0000 (10:01 -0700)] 
xfs: split xfs_inobt_insert_sprec

Source kernel commit: 8541a7d9da2dd6e44f401f2363b21749b7413fc9

Split the finobt version that never merges and uses a different cursor
out of xfs_inobt_insert_sprec to prepare for removing xfs_btnum_t.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: remove the btnum argument to xfs_inobt_count_blocks
Christoph Hellwig [Mon, 22 Apr 2024 17:01:05 +0000 (10:01 -0700)] 
xfs: remove the btnum argument to xfs_inobt_count_blocks

Source kernel commit: 4bfb028a4c00d0a079a625d7867325efb3c37de2

xfs_inobt_count_blocks is only used for the finobt.  Hardcode the btnum
argument and rename the function to match that.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: remove xfs_inobt_cur
Christoph Hellwig [Mon, 22 Apr 2024 17:01:05 +0000 (10:01 -0700)] 
xfs: remove xfs_inobt_cur

Source kernel commit: 3038fd8129384c64946c17198229ee61f6f2c8e1

This helper provides no real advantage over just open coding the two
calls it makes in the callers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: split xfs_allocbt_init_cursor
Christoph Hellwig [Mon, 22 Apr 2024 17:01:05 +0000 (10:01 -0700)] 
xfs: split xfs_allocbt_init_cursor

Source kernel commit: 1c8b9fd278c08e16c27a41be484b77383738de1f

Split xfs_allocbt_init_cursor into separate routines for the by-bno
and by-cnt btrees to prepare for the removal of the xfs_btnum global
enumeration of btree types.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: add a sick_mask to struct xfs_btree_ops
Christoph Hellwig [Mon, 22 Apr 2024 17:01:05 +0000 (10:01 -0700)] 
xfs: add a sick_mask to struct xfs_btree_ops

Source kernel commit: 7f47734ad61af77a001b1e24691dcbfcb008c938

Clean up xfs_btree_mark_sick by adding a sick_mask to the btree-ops
for all AG-root btrees.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: add a name field to struct xfs_btree_ops
Christoph Hellwig [Mon, 22 Apr 2024 17:01:05 +0000 (10:01 -0700)] 
xfs: add a name field to struct xfs_btree_ops

Source kernel commit: 77953b97bb19dc031673d055c811a5ba7df92307

The btnum in struct xfs_btree_ops is often used for printing a symbolic
name for the btree.  Add a name field to the ops structure and use that
directly.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: split the agf_roots and agf_levels arrays
Christoph Hellwig [Mon, 22 Apr 2024 17:01:04 +0000 (10:01 -0700)] 
xfs: split the agf_roots and agf_levels arrays

Source kernel commit: e45ea3645178c6db91aef4314945b05e4c6ee1fc

Using arrays of largely unrelated fields that use the btree number
as index is not very robust.  Split the arrays into three separate
fields instead.
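
A small sketch of the before and after shapes, assuming three btrees
indexed by a btree number.  The structure and field names here are
hypothetical and only illustrate the refactoring; they are not the AGF
definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum sketch_btnum { SKETCH_BNO, SKETCH_CNT, SKETCH_RMAP, SKETCH_BTNUM_MAX };

/* Before: unrelated roots and levels share arrays indexed by btree number. */
struct sketch_agf_old {
	uint32_t	roots[SKETCH_BTNUM_MAX];
	uint32_t	levels[SKETCH_BTNUM_MAX];
};

/* After: each array becomes three named fields, in the same order. */
struct sketch_agf_new {
	uint32_t	bno_root;
	uint32_t	cnt_root;
	uint32_t	rmap_root;
	uint32_t	bno_level;
	uint32_t	cnt_level;
	uint32_t	rmap_level;
};

static struct sketch_agf_new sketch_agf_split(const struct sketch_agf_old *old)
{
	struct sketch_agf_new	agf;

	assert(old != NULL);
	agf.bno_root = old->roots[SKETCH_BNO];
	agf.cnt_root = old->roots[SKETCH_CNT];
	agf.rmap_root = old->roots[SKETCH_RMAP];
	agf.bno_level = old->levels[SKETCH_BNO];
	agf.cnt_level = old->levels[SKETCH_CNT];
	agf.rmap_level = old->levels[SKETCH_RMAP];
	return agf;
}
```

Keeping the fields in the same order as the array slots leaves the size of
the structure unchanged in this sketch, while every access now names the
btree it refers to.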

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: remove xfs_bmbt_stage_cursor
Christoph Hellwig [Mon, 22 Apr 2024 17:01:04 +0000 (10:01 -0700)] 
xfs: remove xfs_bmbt_stage_cursor

Source kernel commit: 02f7ebf5f99c3776bbf048786885eeafeb2f21ca

Just open code the two calls in the callers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: fold xfs_bmbt_init_common into xfs_bmbt_init_cursor
Christoph Hellwig [Mon, 22 Apr 2024 17:01:04 +0000 (10:01 -0700)] 
xfs: fold xfs_bmbt_init_common into xfs_bmbt_init_cursor

Source kernel commit: 802f91f7b1d535ac975e2d696bf5b5dea82816e7

Make the levels initialization in xfs_bmbt_init_cursor conditional
and merge the two helpers.

This requires the fakeroot case to now pass a -1 whichfork directly
into xfs_bmbt_init_cursor, and some special casing for that, but
at least this scheme to deal with the fake btree root is handled and
documented in one place now.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: tidy up a multiline ternary]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: make staging file forks explicit
Darrick J. Wong [Mon, 22 Apr 2024 17:01:03 +0000 (10:01 -0700)] 
xfs: make staging file forks explicit

Source kernel commit: 42e357c806c8c0ffb9c5c2faa4ad034bfe950d77

Don't open-code "-1" for whichfork when we're creating a staging btree
for a repair; let's define an actual symbol to make grepping and
understanding easier.
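
A sketch of the pattern, assuming fork selectors are small integers and
that -1 marks a staging btree.  The macro and function names below are
hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_DATA_FORK	0
#define SKETCH_ATTR_FORK	1
/* Named sentinel instead of a bare -1 scattered through the callers. */
#define SKETCH_STAGING_FORK	(-1)

static bool sketch_fork_is_staging(int whichfork)
{
	/* Only the three selectors above are valid in this sketch. */
	assert(whichfork >= SKETCH_STAGING_FORK &&
	       whichfork <= SKETCH_ATTR_FORK);
	return whichfork == SKETCH_STAGING_FORK;
}
```

A named symbol can be found with a plain text search, whereas a literal -1
cannot be told apart from unrelated uses of the same value.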

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: make full use of xfs_btree_stage_ifakeroot in xfs_bmbt_stage_cursor
Christoph Hellwig [Mon, 22 Apr 2024 17:01:03 +0000 (10:01 -0700)] 
xfs: make full use of xfs_btree_stage_ifakeroot in xfs_bmbt_stage_cursor

Source kernel commit: 579d7022d1afea8f4475d1750224ec0b652febee

Remove the duplicate cur->bc_nlevels assignment in xfs_bmbt_stage_cursor,
and move the cur->bc_ino.forksize assignment into
xfs_btree_stage_ifakeroot as it is part of setting up the fake btree
root.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: remove xfs_rmapbt_stage_cursor
Christoph Hellwig [Mon, 22 Apr 2024 17:01:03 +0000 (10:01 -0700)] 
xfs: remove xfs_rmapbt_stage_cursor

Source kernel commit: 1317813290be04bc37196c4adf457712238c7faa

xfs_rmapbt_stage_cursor is currently unused, but future callers can
trivially open code the two calls.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: fold xfs_rmapbt_init_common into xfs_rmapbt_init_cursor
Christoph Hellwig [Mon, 22 Apr 2024 17:01:03 +0000 (10:01 -0700)] 
xfs: fold xfs_rmapbt_init_common into xfs_rmapbt_init_cursor

Source kernel commit: c49a4b2f0ef0ac5daee5c2a3cfd2b537345c34eb

Make the levels initialization in xfs_rmapbt_init_cursor conditional
and merge the two helpers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: remove xfs_refcountbt_stage_cursor
Christoph Hellwig [Mon, 22 Apr 2024 17:01:03 +0000 (10:01 -0700)] 
xfs: remove xfs_refcountbt_stage_cursor

Source kernel commit: a5c2194406f322e91b90fb813128541a9b4fed6a

Just open code the two calls in the callers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: fold xfs_refcountbt_init_common into xfs_refcountbt_init_cursor
Christoph Hellwig [Mon, 22 Apr 2024 17:01:02 +0000 (10:01 -0700)] 
xfs: fold xfs_refcountbt_init_common into xfs_refcountbt_init_cursor

Source kernel commit: 4f2dc69e4bcb4b3bfaea0a96ac6424b0ed998172

Make the levels initialization in xfs_refcountbt_init_cursor conditional
and merge the two helpers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: remove xfs_inobt_stage_cursor
Christoph Hellwig [Mon, 22 Apr 2024 17:01:02 +0000 (10:01 -0700)] 
xfs: remove xfs_inobt_stage_cursor

Source kernel commit: 6234dee7e6f58676379f3a2d8b0629a6e9a427fd

Just open code the two calls in the callers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: fold xfs_inobt_init_common into xfs_inobt_init_cursor
Christoph Hellwig [Mon, 22 Apr 2024 17:01:02 +0000 (10:01 -0700)] 
xfs: fold xfs_inobt_init_common into xfs_inobt_init_cursor

Source kernel commit: f6c98d921a9e5b753ac1a35d540a6487ee111a33

Make the levels initialization in xfs_inobt_init_cursor conditional
and merge the two helpers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: remove xfs_allocbt_stage_cursor
Christoph Hellwig [Mon, 22 Apr 2024 17:01:02 +0000 (10:01 -0700)] 
xfs: remove xfs_allocbt_stage_cursor

Source kernel commit: 91796b2eef8bd725873bec326a7be830a68a11ff

Just open code the two calls in the callers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: fold xfs_allocbt_init_common into xfs_allocbt_init_cursor
Christoph Hellwig [Mon, 22 Apr 2024 17:01:02 +0000 (10:01 -0700)] 
xfs: fold xfs_allocbt_init_common into xfs_allocbt_init_cursor

Source kernel commit: fb518f8eeb90197624b21a3429e57b6a65bff7bb

Make the levels initialization in xfs_allocbt_init_cursor conditional
and merge the two helpers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: don't override bc_ops for staging btrees
Christoph Hellwig [Mon, 22 Apr 2024 17:01:01 +0000 (10:01 -0700)] 
xfs: don't override bc_ops for staging btrees

Source kernel commit: 2b9e7f2668c540f18afd66a053ea78f3a629f8e2

Add a few conditionals for staging btrees to the core btree code instead
of overloading the bc_ops vector.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: add a xfs_btree_init_ptr_from_cur
Christoph Hellwig [Mon, 22 Apr 2024 17:01:01 +0000 (10:01 -0700)] 
xfs: add a xfs_btree_init_ptr_from_cur

Source kernel commit: f9c18129e57df7b33f4257340840525816481da6

Inode-rooted btrees don't need to initialize the root pointer in the
->init_ptr_from_cur method as the root is found by the
xfs_btree_get_iroot method later.  Make ->init_ptr_from_cur optional
for inode-rooted btrees by providing a helper that does the right
thing for the given btree type and also documents the semantics.
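
The shape of such a helper might look like the sketch below: callers go
through one function, which skips the method for inode-rooted btrees and
calls it otherwise.  Everything here is an assumption made for
illustration, including the names, the boolean inode_rooted field, and the
choice to zero the pointer in the inode-rooted case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_cur;

struct sketch_ops {
	bool	inode_rooted;
	/* May be NULL for inode-rooted btrees in this sketch. */
	void	(*init_ptr_from_cur)(const struct sketch_cur *cur,
				     uint64_t *ptr);
};

struct sketch_cur {
	const struct sketch_ops	*ops;
	uint64_t		ag_root;	/* root block of an AG-rooted tree */
};

/* Method for an AG-rooted btree: the root pointer comes from the header. */
static void sketch_ag_init_ptr(const struct sketch_cur *cur, uint64_t *ptr)
{
	*ptr = cur->ag_root;
}

/* Single entry point that hides whether the method is implemented. */
static void sketch_init_ptr_from_cur(const struct sketch_cur *cur,
				     uint64_t *ptr)
{
	if (cur->ops->inode_rooted) {
		/* The root sits in the inode fork and is found later. */
		*ptr = 0;
		return;
	}
	assert(cur->ops->init_ptr_from_cur != NULL);
	cur->ops->init_ptr_from_cur(cur, ptr);
}
```

Centralizing the special case means an inode-rooted btree can leave the
method unset rather than supply a stub.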

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: move comment about two 2 keys per pointer in the rmap btree
Christoph Hellwig [Mon, 22 Apr 2024 17:01:01 +0000 (10:01 -0700)] 
xfs: move comment about two 2 keys per pointer in the rmap btree

Source kernel commit: 72c2070f3f52196a2e8b4efced94390b62eb8ac4

Move it to the relevant initialization of the ops structure instead
of a place that has nothing to do with the key size.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: create predicate to determine if cursor is at inode root level
Darrick J. Wong [Mon, 22 Apr 2024 17:01:01 +0000 (10:01 -0700)] 
xfs: create predicate to determine if cursor is at inode root level

Source kernel commit: f73def90a7cd24a32a42f689efba6a7a35edeb7b

Create a predicate to decide if the given cursor and level point to the
root block in the inode immediate area instead of a disk block, and get
rid of the open-coded logic everywhere.
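
A minimal sketch of such a predicate, assuming levels are numbered from 0
at the leaves and that the root therefore sits at nlevels - 1.  The cursor
layout and the name sketch_at_iroot are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_cur {
	bool	inode_rooted;	/* root block lives in the inode fork */
	int	nlevels;	/* current height of the btree */
};

/* True only for the top level of an inode-rooted btree. */
static bool sketch_at_iroot(const struct sketch_cur *cur, int level)
{
	assert(cur != NULL);
	assert(level >= 0 && level < cur->nlevels);
	return cur->inode_rooted && level == cur->nlevels - 1;
}
```

Callers that previously repeated both conditions inline can test the one
predicate instead, which keeps the two checks from drifting apart.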

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: split the per-btree union in struct xfs_btree_cur
Christoph Hellwig [Mon, 22 Apr 2024 17:01:01 +0000 (10:01 -0700)] 
xfs: split the per-btree union in struct xfs_btree_cur

Source kernel commit: 88ee2f4849119b82b95d6e8e2d9daa81214eb080

Split up the union that encodes btree-specific fields in struct
xfs_btree_cur.  Most fields in there are specific to the btree type
encoded in xfs_btree_ops.type, and we can use the obviously named union
for that.  But one field is specific to the bmapbt and two are shared by
the refcount and rtrefcountbt.  Move those to a separate union to make
the usage clear and not need a separate struct for the refcount-related
fields.

This will also make unnecessary some very awkward btree cursor
refc/rtrefc switching logic in the rtrefcount patchset.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: split out a btree type from the btree ops geometry flags
Christoph Hellwig [Mon, 22 Apr 2024 17:01:00 +0000 (10:01 -0700)] 
xfs: split out a btree type from the btree ops geometry flags

Source kernel commit: 4f0cd5a555072e21fb589975607b70798e073f8f

Two of the btree cursor flags are always used together and encode
the fundamental btree type.  There currently are two such types:

1) an on-disk AG-rooted btree with 32-bit pointers
2) an on-disk inode-rooted btree with 64-bit pointers

and we're about to add:

3) an in-memory btree with 64-bit pointers

Introduce a new enum and a new type field in struct xfs_btree_geom
to encode this type directly instead of using flags and change most
code to switch on this enum.
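
The three types listed above could be encoded roughly as follows.  This is
a sketch under stated assumptions, not the kernel's definitions: the enum,
the structure, and the function are hypothetical, and the pointer length
is shown as an explicit field in bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum sketch_btree_type {
	SKETCH_BTREE_TYPE_AG,		/* on-disk, AG-rooted, 32-bit pointers */
	SKETCH_BTREE_TYPE_INODE,	/* on-disk, inode-rooted, 64-bit pointers */
	SKETCH_BTREE_TYPE_MEM,		/* in-memory, 64-bit pointers */
};

struct sketch_btree_ops {
	enum sketch_btree_type	type;
	size_t			ptr_len;	/* pointer length in bytes */
};

/* Switch on the type rather than testing a pair of flag bits. */
static const char *sketch_root_location(const struct sketch_btree_ops *ops)
{
	assert(ops != NULL);
	switch (ops->type) {
	case SKETCH_BTREE_TYPE_AG:
		return "ag header";
	case SKETCH_BTREE_TYPE_INODE:
		return "inode fork";
	case SKETCH_BTREE_TYPE_MEM:
		return "memory";
	}
	return "unknown";
}
```

With an enum, a compiler that warns about unhandled enumerators can point
at every switch that needs a new case when a fourth type is added, which a
pair of flag bits cannot do.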

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: make the pointer lengths explicit]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: store the btree pointer length in struct xfs_btree_ops
Darrick J. Wong [Mon, 22 Apr 2024 17:01:00 +0000 (10:01 -0700)] 
xfs: store the btree pointer length in struct xfs_btree_ops

Source kernel commit: 1a9d26291c68fbb8f8d24f9f694b32223a072745

Make the pointer length an explicit field in the btree operations
structure so that the next patch (which introduces an explicit btree
type enum) doesn't have to play a bunch of awkward games with inferring
the pointer length from the enumeration.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: factor out a btree block owner check
Darrick J. Wong [Mon, 22 Apr 2024 17:01:00 +0000 (10:01 -0700)] 
xfs: factor out a btree block owner check

Source kernel commit: 186f20c003199824eb3eb3b78e4eb7c2535a8ffc

Hoist the btree block owner check into a separate helper so that we
don't have an ugly multiline if statement.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: factor out a xfs_btree_owner helper
Darrick J. Wong [Mon, 22 Apr 2024 17:01:00 +0000 (10:01 -0700)] 
xfs: factor out a xfs_btree_owner helper

Source kernel commit: 2054cf051698d30cc9479678c2b807a364248f38

Split out a helper to calculate the owner for a given btree instead of
duplicating the logic in two places.  While we're at it, make the
bc_ag/bc_ino switch logic depend on the correct geometry flag.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: break this up into two patches for the owner check]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: move the btree stats offset into struct btree_ops
Christoph Hellwig [Mon, 22 Apr 2024 17:01:00 +0000 (10:01 -0700)] 
xfs: move the btree stats offset into struct btree_ops

Source kernel commit: 07b7f2e3172b97da2a7ac273ecbaf173cc09a9f4

The statistics offset is completely static, move it into the btree_ops
structure instead of the cursor.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: move lru refs to the btree ops structure
Darrick J. Wong [Mon, 22 Apr 2024 17:00:59 +0000 (10:00 -0700)] 
xfs: move lru refs to the btree ops structure

Source kernel commit: 90cfae818dac5227e94e21d0f5250e098432723e

Move the btree buffer LRU refcount to the btree ops structure so that we
can eliminate the last bc_btnum switch in the generic btree code.  We're
about to create repair-specific btree types, and we don't want that
stuff cluttering up libxfs.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: set btree block buffer ops in _init_buf
Darrick J. Wong [Mon, 22 Apr 2024 17:00:59 +0000 (10:00 -0700)] 
xfs: set btree block buffer ops in _init_buf

Source kernel commit: ad065ef0d2fcd787225bd8887b6b75c6eb4da9a1

Set the btree block buffer ops in xfs_btree_init_buf since we already
have access to that information through the btree ops.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: remove the unnecessary daddr paramter to _init_block
Darrick J. Wong [Mon, 22 Apr 2024 17:00:59 +0000 (10:00 -0700)] 
xfs: remove the unnecessary daddr paramter to _init_block

Source kernel commit: 11388f6581f40e7d5a69ce5f8b13264eca7c2c5c

Now that all of the callers pass XFS_BUF_DADDR_NULL as the daddr
parameter, we can elide that too.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: btree convert xfs_btree_init_block to xfs_btree_init_buf calls
Darrick J. Wong [Mon, 22 Apr 2024 17:00:59 +0000 (10:00 -0700)] 
xfs: btree convert xfs_btree_init_block to xfs_btree_init_buf calls

Source kernel commit: 7771f7030007e3faa6906864d01b504b590e1ca2

Convert any place we call xfs_btree_init_block with a buffer to use the
_init_buf function.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: rename btree block/buffer init functions
Darrick J. Wong [Mon, 22 Apr 2024 17:00:58 +0000 (10:00 -0700)] 
xfs: rename btree block/buffer init functions

Source kernel commit: 3c68858b264fac292f74733eeaf558595978a5e5

Rename xfs_btree_init_block_int to xfs_btree_init_block, and
xfs_btree_init_block to xfs_btree_init_buf so that the name suggests the
type that callers are supposed to pass in.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: initialize btree blocks using btree_ops structure
Darrick J. Wong [Mon, 22 Apr 2024 17:00:58 +0000 (10:00 -0700)] 
xfs: initialize btree blocks using btree_ops structure

Source kernel commit: c87e3bf7802477cb4500dfafe0ab039313aa2dda

Notice now that the btree ops structure encodes btree geometry flags and
the magic number through the buffer ops.  Refactor the btree block
initialization functions to use the btree ops so that we no longer have
to open code all that.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
14 months ago  xfs: extern some btree ops structures
Darrick J. Wong [Mon, 22 Apr 2024 17:00:58 +0000 (10:00 -0700)] 
xfs: extern some btree ops structures

Source kernel commit: d8d6df4253adcdb5862a9410d962e9168b973c88

Expose these static btree ops structures so that we can reference them
in the AG initialization code in the next patch.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>