git.ipfire.org Git - thirdparty/xfsprogs-dev.git/log
7 months ago xfs: add group based bno conversion helpers
Christoph Hellwig [Mon, 25 Nov 2024 21:14:15 +0000 (13:14 -0800)] 
xfs: add group based bno conversion helpers

Source kernel commit: 759cc1989a53024066b0f2ea52c206b4ff8f522c

Add/move the blocks, blklog and blkmask fields to the generic groups
structure so that code can work with AGs and RTGs by just using the
right index into the array.

Then, add convenience helpers to convert block numbers based on the
generic group.  This will allow writing code that doesn't care if it is
used on AGs or the upcoming realtime groups.
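
As a rough illustration of the idea, the sketch below keeps the per-type geometry in an array indexed by group type and derives a combined block address from it. All names (demo_mount, demo_groups, demo_gbno_to_fsb and so on) are hypothetical stand-ins, not the identifiers added by this commit, and the address layout is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical group types; the array below is indexed by these. */
enum demo_group_type { DEMO_AG, DEMO_RTG, DEMO_TYPE_MAX };

/* Per-type geometry, shared by every group of that type. */
struct demo_groups {
	uint32_t	blocks;		/* blocks per group */
	uint8_t		blklog;		/* log2 of blocks, rounded up */
	uint64_t	blkmask;	/* (1 << blklog) - 1 */
};

struct demo_mount {
	struct demo_groups	groups[DEMO_TYPE_MAX];
};

/* Generic group: knows its mount, its number and its type. */
struct demo_group {
	const struct demo_mount	*mount;
	uint32_t		index;
	enum demo_group_type	type;
};

/* Pack the group number and a group-relative block into one address. */
static inline uint64_t demo_gbno_to_fsb(const struct demo_group *xg, uint32_t gbno)
{
	const struct demo_groups *g = &xg->mount->groups[xg->type];

	assert(gbno <= g->blkmask);
	return ((uint64_t)xg->index << g->blklog) | gbno;
}

/* Extract the group-relative block from a packed address. */
static inline uint32_t demo_fsb_to_gbno(const struct demo_group *xg, uint64_t fsb)
{
	return (uint32_t)(fsb & xg->mount->groups[xg->type].blkmask);
}

/* Extract the group number from a packed address. */
static inline uint32_t demo_fsb_to_gno(const struct demo_group *xg, uint64_t fsb)
{
	return (uint32_t)(fsb >> xg->mount->groups[xg->type].blklog);
}
```

The same three functions serve both group types in this sketch; only the array slot selected by the type field differs.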

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: add a generic group pointer to the btree cursor
Christoph Hellwig [Mon, 25 Nov 2024 21:14:15 +0000 (13:14 -0800)] 
xfs: add a generic group pointer to the btree cursor

Source kernel commit: 77a530e6c49d22bd4a221d2f059db24fc30094db

Replace the pag pointers in the type specific union with a generic
xfs_group pointer.  This prepares for adding realtime group support.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: convert busy extent tracking to the generic group structure
Christoph Hellwig [Mon, 25 Nov 2024 21:14:15 +0000 (13:14 -0800)] 
xfs: convert busy extent tracking to the generic group structure

Source kernel commit: adbc76aa0fedcb6da2d1ceb1ce786d1f963afee8

Split busy extent tracking from struct xfs_perag into its own private
structure, which can be pointed to by the generic group structure.

Note that this structure is now dynamically allocated instead of embedded
as the upcoming zone XFS code doesn't need it and will also have an
unusually high number of groups due to hardware constraints.  Dynamically
allocating the structure is a big memory saver for this case.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: move the online repair rmap hooks to the generic group structure
Christoph Hellwig [Mon, 25 Nov 2024 21:14:15 +0000 (13:14 -0800)] 
xfs: move the online repair rmap hooks to the generic group structure

Source kernel commit: eb4a84a3c2bd09efe770fa940fb68e349f90c8c6

Prepare for the upcoming realtime groups feature by moving the online
repair rmap hooks to the generic xfs_group structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: move draining of deferred operations to the generic group structure
Christoph Hellwig [Mon, 25 Nov 2024 21:14:15 +0000 (13:14 -0800)] 
xfs: move draining of deferred operations to the generic group structure

Source kernel commit: 34cf3a6f3952ecabd54b4fe3d431aa44ce98fe45

Prepare for supporting the upcoming realtime groups feature by moving the
deferred operation draining to the generic xfs_group structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: move metadata health tracking to the generic group structure
Christoph Hellwig [Mon, 25 Nov 2024 21:14:14 +0000 (13:14 -0800)] 
xfs: move metadata health tracking to the generic group structure

Source kernel commit: 5c8483cec3fe261a5c1ede7430bab042ed156361

Prepare for also tracking the health status of the upcoming realtime
groups by moving the health tracking code to the generic xfs_group
structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: switch perag iteration from the for_each macros to a while based iterator
Christoph Hellwig [Mon, 25 Nov 2024 21:14:14 +0000 (13:14 -0800)] 
xfs: switch perag iteration from the for_each macros to a while based iterator

Source kernel commit: 86437e6abbd2ef040f42ef190264819db6118415

The current for_each_perag* macros are a bit annoying in that they
require the caller to both provide an object and an index iterator, and
also somewhat obfuscate the underlying control flow mechanism.

Switch to open coded while loops using new xfs_perag_next{,_from,_range}
helpers that return the next pag structure to iterate on based on the
previous one or NULL for the loop start.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: add a xfs_group_next_range helper
Christoph Hellwig [Mon, 25 Nov 2024 21:14:14 +0000 (13:14 -0800)] 
xfs: add a xfs_group_next_range helper

Source kernel commit: 819928770bd91960f88f5a4dfa21b35a1bade61b

Add a helper to iterate over all groups, which can be used as a simple
while loop:

	struct xfs_group		*xg = NULL;

	while ((xg = xfs_group_next_range(mp, xg, 0, MAX_GROUP))) {
		...
	}

This will be wrapped by the realtime group code first, and eventually
replace the for_each_rtgroup_from and for_each_rtgroup_range helpers.
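
A minimal, self-contained sketch of this "next" style of iteration is shown below. It assumes a fixed array of groups and an inclusive end index, and it ignores reference counting and lookup details that a real implementation would likely need. The demo_* names are hypothetical and are not the helpers added by this commit.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NR_GROUPS	4

struct demo_group {
	unsigned int	index;
};

struct demo_mount {
	struct demo_group	groups[DEMO_NR_GROUPS];
};

/*
 * Return the group following @prev within [start, end] (end inclusive), or
 * the group at @start when @prev is NULL.  Return NULL once the range or
 * the set of existing groups is exhausted.
 */
static inline struct demo_group *
demo_group_next_range(struct demo_mount *mp, struct demo_group *prev,
		unsigned int start, unsigned int end)
{
	unsigned int next = prev ? prev->index + 1 : start;

	assert(start <= end);
	if (next > end || next >= DEMO_NR_GROUPS)
		return NULL;
	return &mp->groups[next];
}

/* Count the groups visited in [start, end] using the while-loop idiom. */
static inline unsigned int
demo_count_groups(struct demo_mount *mp, unsigned int start, unsigned int end)
{
	struct demo_group *xg = NULL;
	unsigned int n = 0;

	while ((xg = demo_group_next_range(mp, xg, start, end)))
		n++;
	return n;
}
```

The caller only carries one cursor variable, and a plain break or continue behaves as it would in any other while loop.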

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: factor out a generic xfs_group structure
Christoph Hellwig [Mon, 25 Nov 2024 21:14:14 +0000 (13:14 -0800)] 
xfs: factor out a generic xfs_group structure

Source kernel commit: e9c4d8bfb26c13c41b73fdf4183d3df2d392101e

Split the lookup and refcount handling of struct xfs_perag into an
embedded xfs_group structure that can be reused for the upcoming
realtime groups.

It will be extended with more features later.

Note that the xg_type field will only need a single bit even with
realtime group support.  For now it fills a hole, but it might be
worth folding it into another field if we can use this space better.
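
The embedding pattern described above can be sketched as follows. This is an illustration under assumed names (demo_group, demo_perag, demo_to_perag); the field set and the plain integer refcount are simplifications, not the layout introduced by this commit.

```c
#include <assert.h>
#include <stddef.h>

/* Generic part: what lookup and refcount handling need to know. */
struct demo_group {
	unsigned int	index;
	int		refcount;
	unsigned char	type;		/* 0 = AG, 1 = realtime group */
};

/* AG-specific structure with the generic part embedded. */
struct demo_perag {
	struct demo_group	group;
	unsigned int		free_blocks;	/* example of AG-only state */
};

/* Recover the containing perag from a pointer to its embedded group. */
static inline struct demo_perag *demo_to_perag(struct demo_group *xg)
{
	return (struct demo_perag *)((char *)xg -
			offsetof(struct demo_perag, group));
}

/* Take a reference on any kind of group. */
static inline struct demo_group *demo_group_hold(struct demo_group *xg)
{
	xg->refcount++;
	return xg;
}

/* Drop a reference taken with demo_group_hold(). */
static inline void demo_group_rele(struct demo_group *xg)
{
	assert(xg->refcount > 0);
	xg->refcount--;
}
```

Code that only needs the generic part can take a struct demo_group pointer and never learn which kind of group it is holding.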

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: insert the pag structures into the xarray later
Christoph Hellwig [Mon, 25 Nov 2024 21:14:13 +0000 (13:14 -0800)] 
xfs: insert the pag structures into the xarray later

Source kernel commit: d66496578b2a099ea453f56782f1cd2bf63a8029

Cleaning up is much easier if a structure can't be looked up yet, so only
insert the pag once it is fully set up.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: split xfs_initialize_perag
Christoph Hellwig [Mon, 25 Nov 2024 21:14:13 +0000 (13:14 -0800)] 
xfs: split xfs_initialize_perag

Source kernel commit: 201c5fa342af75adaf762fd6c63380bb8001762d

Factor out a xfs_perag_alloc helper that allocates a single perag
structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: convert remaining trace points to pass pag structures
Christoph Hellwig [Mon, 25 Nov 2024 21:14:13 +0000 (13:14 -0800)] 
xfs: convert remaining trace points to pass pag structures

Source kernel commit: c4ae021bcb6bf8bbb329ce8ef947a43009bc2fe4

Convert all tracepoints that take [mp,agno] tuples to take a pag argument
instead so that decoding only happens when tracepoints are enabled and to
clean up the callers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: pass objects to the xfs_irec_merge_{pre,post} trace points
Christoph Hellwig [Mon, 25 Nov 2024 21:14:13 +0000 (13:14 -0800)] 
xfs: pass objects to the xfs_irec_merge_{pre,post} trace points

Source kernel commit: 487092ceaa72448ca3a82ea9fb89768c88f6abec

Pass the perag structure and the irec to these tracepoints so that the
decoding is only done when tracing is actually enabled and the call sites
look a lot neater.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: pass a perag structure to the xfs_ag_resv_init_error trace point
Christoph Hellwig [Mon, 25 Nov 2024 21:14:13 +0000 (13:14 -0800)] 
xfs: pass a perag structure to the xfs_ag_resv_init_error trace point

Source kernel commit: 835ddb592fab75ed96828ee3f12ea44496882d6b

And remove the single instance class indirection for it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: pass a pag to xfs_extent_busy_{search,reuse}
Christoph Hellwig [Mon, 25 Nov 2024 21:14:12 +0000 (13:14 -0800)] 
xfs: pass a pag to xfs_extent_busy_{search,reuse}

Source kernel commit: b6dc8c6dd2d3f230e1a554f869d6df4568a2dfbb

Replace the [mp,agno] tuple with the perag structure, which will become
more useful later.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: add a xfs_agino_to_ino helper
Christoph Hellwig [Mon, 25 Nov 2024 21:14:12 +0000 (13:14 -0800)] 
xfs: add a xfs_agino_to_ino helper

Source kernel commit: 6abd82ab6ea48430c13caebaad436ca6b5f2c34d

Add a helper to convert an agino to an ino based on a pag structure.

This provides a simpler conversion and better type safety compared to the
existing code that passes the mount structure and the agno separately.
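
To make the type-safety point concrete, here is a small sketch contrasting the two calling conventions. The structures and function names are hypothetical, and the bit layout (AG number shifted above the AG-relative inode number) is stated here as an assumption of the example.

```c
#include <assert.h>
#include <stdint.h>

struct demo_mount {
	uint8_t		agino_log;	/* bits used by the AG-relative inode number */
};

struct demo_perag {
	const struct demo_mount	*mount;
	uint32_t		agno;
};

/*
 * Old style: mount and AG number are passed separately.  agno and agino
 * are both 32-bit integers, so swapping them compiles without complaint.
 */
static inline uint64_t
demo_agino_to_ino_old(const struct demo_mount *mp, uint32_t agno, uint32_t agino)
{
	assert(agino < ((uint64_t)1 << mp->agino_log));
	return ((uint64_t)agno << mp->agino_log) | agino;
}

/* New style: the perag supplies both the mount and the AG number. */
static inline uint64_t
demo_agino_to_ino(const struct demo_perag *pag, uint32_t agino)
{
	return demo_agino_to_ino_old(pag->mount, pag->agno, agino);
}
```

With the second form there is only one integer argument left, so the compiler catches most argument mix-ups through the pointer type.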

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: add xfs_agbno_to_fsb and xfs_agbno_to_daddr helpers
Christoph Hellwig [Mon, 25 Nov 2024 21:14:12 +0000 (13:14 -0800)] 
xfs: add xfs_agbno_to_fsb and xfs_agbno_to_daddr helpers

Source kernel commit: 856a920ac2bbb2352ef6aa9e1e052f2e80677df7

Add helpers to convert an agbno to a daddr or fsbno based on a pag
structure.

This provides a simpler conversion and better type safety compared to the
existing code that passes the mount structure and the agno separately.
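
For the disk address case, a sketch under simplifying assumptions: AGs are laid out back to back on the data device, each agblocks long, and a disk address counts 512-byte sectors. The names below are illustrative only and do not match the helpers in this commit.

```c
#include <assert.h>
#include <stdint.h>

struct demo_mount {
	uint32_t	agblocks;	/* filesystem blocks per AG */
	uint8_t		blkbb_log;	/* log2(block size / 512) */
};

struct demo_perag {
	const struct demo_mount	*mount;
	uint32_t		agno;
};

/* Convert an AG-relative block number to a 512-byte sector address. */
static inline uint64_t
demo_agbno_to_daddr(const struct demo_perag *pag, uint32_t agbno)
{
	const struct demo_mount *mp = pag->mount;

	assert(agbno < mp->agblocks);
	/* linear block number on the device, then scale to sectors */
	return ((uint64_t)pag->agno * mp->agblocks + agbno) << mp->blkbb_log;
}
```

For example, with 1000 blocks per AG and 4k blocks (blkbb_log of 3), block 5 of AG 2 is linear block 2005, which is sector 16040.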

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: remove the agno argument to xfs_free_ag_extent
Christoph Hellwig [Mon, 25 Nov 2024 21:14:12 +0000 (13:14 -0800)] 
xfs: remove the agno argument to xfs_free_ag_extent

Source kernel commit: db129fa01113f767d5b7a6fd339114a962023464

xfs_free_ag_extent already has a pointer to the pag structure through
the agf buffer.  Use that instead of passing the redundant argument,
and do the same for the tracepoint.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: pass a pag to xfs_difree_inode_chunk
Christoph Hellwig [Mon, 25 Nov 2024 21:14:11 +0000 (13:14 -0800)] 
xfs: pass a pag to xfs_difree_inode_chunk

Source kernel commit: 67ce5ba575354da1542e0579fb8c7a871cbf57b3

We'll want to use more than just the agno field in a bit.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: remove the unused pag_active_wq field in struct xfs_perag
Christoph Hellwig [Mon, 25 Nov 2024 21:14:11 +0000 (13:14 -0800)] 
xfs: remove the unused pag_active_wq field in struct xfs_perag

Source kernel commit: 9943b45732905a70496fc44368ab85b230c70db4

pag_active_wq is only woken, but never waited for.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: remove the unused pagb_count field in struct xfs_perag
Christoph Hellwig [Mon, 25 Nov 2024 21:14:11 +0000 (13:14 -0800)] 
xfs: remove the unused pagb_count field in struct xfs_perag

Source kernel commit: 4e071d79e477189a6c318f598634799e50921994

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: sb_spino_align is not verified
Dave Chinner [Mon, 25 Nov 2024 21:14:11 +0000 (13:14 -0800)] 
xfs: sb_spino_align is not verified

Source kernel commit: 59e43f5479cce106d71c0b91a297c7ad1913176c

It's just read in from the superblock and used without doing any
validity checks at all on the value.

Fixes: fb4f2b4e5a82 ("xfs: add sparse inode chunk alignment superblock field")
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: remove the redundant xfs_alloc_log_agf
Long Li [Mon, 25 Nov 2024 21:14:11 +0000 (13:14 -0800)] 
xfs: remove the redundant xfs_alloc_log_agf

Source kernel commit: 8b9b261594d8ef218ef4d0e732dad153f82aab49

There are two invocations of xfs_alloc_log_agf in xfs_alloc_put_freelist.
The AGF does not change between the two calls. Although this does not pose
any practical problems, it seems like a small mistake. Therefore, fix it
by removing the first xfs_alloc_log_agf invocation.

Signed-off-by: Long Li <leo.lilong@huawei.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago man: document the -n parent mkfs option
Darrick J. Wong [Wed, 27 Nov 2024 22:13:50 +0000 (14:13 -0800)] 
man: document the -n parent mkfs option

Document the -n parent option to mkfs.xfs so that users will actually
know how to turn on directory parent pointers.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago man: fix ioctl_xfs_commit_range man page install
Jan Palus [Thu, 5 Dec 2024 20:43:54 +0000 (12:43 -0800)] 
man: fix ioctl_xfs_commit_range man page install

INSTALL_MAN uses the first symbol in the .SH NAME section for both the
source and destination filenames, hence it needs to match the current
filename.  Since ioctl_xfs_commit_range.2 documents both
ioctl_xfs_start_commit and ioctl_xfs_commit_range, ensure they are listed
in the order INSTALL_MAN expects.

Signed-off-by: Jan Palus <jpalus@fastmail.com>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
7 months ago xfs_repair: fix maximum file offset comparison
Darrick J. Wong [Thu, 21 Nov 2024 00:24:22 +0000 (16:24 -0800)] 
xfs_repair: fix maximum file offset comparison

When run with rtinherit=1 and rextsize=28k, generic/525 trips over the
following block mapping:

data offset 2251799813685247 startblock 7 (0/7) count 1 flag 0
data offset 2251799813685248 startblock 8 (0/8) count 6 flag 1

with this error:

inode 155 - extent exceeds max offset - start 2251799813685248, count 6,
physical block 8

This is due to an incorrect check in xfs_repair, which tries to validate
that a block mapping cannot exceed what it thinks is the maximum file
offset.  Unfortunately, the check is wrong, because only br_startoff is
subject to the 2^52-1 limit -- not br_startoff + br_blockcount.

Nowadays libxfs provides a symbol XFS_MAX_FILEOFF for the maximum
allowable file block offset that can be mapped into a file.  Use this
instead of the open-coded logic in versions.c and correct all the other
checks.  Note that this problem only surfaced when rtgroups were enabled
because hch changed xfs_repair to use the same tree-based block state
data structure that we use for AGs when rtgroups are enabled.
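
The difference between the two checks can be sketched as below. The limit is a small made-up constant rather than the real XFS_MAX_FILEOFF value, and the "strict" function is only a guess at the shape of the old comparison, not a copy of the xfs_repair code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limit for illustration; not the real XFS_MAX_FILEOFF. */
#define DEMO_MAX_FILEOFF	1000ULL

/* Overly strict: rejects mappings whose last block lies past the limit. */
static inline bool
demo_mapping_ok_strict(uint64_t startoff, uint64_t blockcount)
{
	assert(blockcount > 0);
	return startoff + blockcount - 1 <= DEMO_MAX_FILEOFF;
}

/* As described above: only the starting offset is subject to the limit. */
static inline bool
demo_mapping_ok(uint64_t startoff, uint64_t blockcount)
{
	(void)blockcount;
	return startoff <= DEMO_MAX_FILEOFF;
}
```

A mapping that starts exactly at the limit and is six blocks long passes the second check but fails the first, which mirrors the false positive reported above.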

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfsprogs: Release v6.12.0 (tag: v6.12.0)
Andrey Albershteyn [Mon, 2 Dec 2024 21:40:29 +0000 (22:40 +0100)] 
xfsprogs: Release v6.12.0

Update all the necessary files for a v6.12.0 release.

Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
8 months ago xfs_io: add support for atomic write statx fields
Catherine Hoang [Wed, 20 Nov 2024 02:35:44 +0000 (18:35 -0800)] 
xfs_io: add support for atomic write statx fields

Add support for the new atomic_write_unit_min, atomic_write_unit_max, and
atomic_write_segments_max fields in statx for xfs_io. In order to support builds
against old kernel headers, define our own internal statx structs. If the
system's struct statx does not have the required atomic write fields, override
the struct definitions with the internal definitions in statx.h.

Signed-off-by: Catherine Hoang <catherine.hoang@oracle.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: John Garry <john.g.garry@oracle.com>
8 months ago xfs_repair: synthesize incore inode tree records when required
Darrick J. Wong [Thu, 14 Nov 2024 23:53:24 +0000 (15:53 -0800)] 
xfs_repair: synthesize incore inode tree records when required

On a filesystem with 64k fsblock size, xfs/093 fails with the following:

Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
found inodes not in the inode allocation tree
found inodes not in the inode allocation tree
        - process known inodes and perform inode discovery...
        - agno = 0
xfs_repair: dino_chunks.c:1166: process_aginodes: Assertion `num_inos == igeo->ialloc_inos' failed.
./common/xfs: line 392: 361225 Aborted                 (core dumped) $XFS_REPAIR_PROG $SCRATCH_OPTIONS $* $SCRATCH_DEV

In this situation, the inode size is 512b, which means that two inobt
records map to a single fs block.  However, the inobt walk didn't find
the second record, so it didn't create a second incore ino_tree_node_t
object.  The assertion trips, and we fail to repair the filesystem.

To fix this, synthesize incore inode records when we know that they must
exist.  Mark the inodes as in use so that they will not be purged from
parent directories or moved to lost+found if the directory tree is also
compromised.
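
The arithmetic behind "two inobt records map to a single fs block" can be checked with a few lines of C. The sketch assumes that one inobt record covers 64 inodes; the function names are made up for this example.

```c
#include <assert.h>

/* Number of inodes described by a single inobt record. */
#define DEMO_INODES_PER_CHUNK	64

/* How many inodes fit in one filesystem block. */
static inline unsigned int
demo_inodes_per_block(unsigned int blocksize, unsigned int inodesize)
{
	assert(inodesize > 0 && blocksize >= inodesize);
	return blocksize / inodesize;
}

/* How many inobt records are needed to cover the inodes of one block. */
static inline unsigned int
demo_inobt_recs_per_block(unsigned int blocksize, unsigned int inodesize)
{
	unsigned int ipb = demo_inodes_per_block(blocksize, inodesize);

	return (ipb + DEMO_INODES_PER_CHUNK - 1) / DEMO_INODES_PER_CHUNK;
}
```

With 64k blocks and 512-byte inodes a block holds 128 inodes, so two records are needed; with 4k blocks a block holds 8 inodes and a single record is enough.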

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
8 months ago xfs_repair: fix crasher in pf_queuing_worker
Darrick J. Wong [Thu, 14 Nov 2024 23:53:24 +0000 (15:53 -0800)] 
xfs_repair: fix crasher in pf_queuing_worker

Don't walk off the end of the inode records when we're skipping inodes
for prefetching.  The skip loop doesn't make sense to me -- why do we
ignore the first N inodes without caring what number they are?  But
let's fix xfs/155 to crash less, eh?

Cc: <linux-xfs@vger.kernel.org> # v2.10.0
Fixes: 2556c98bd9e6b2 ("Perform true sequential bulk read prefetching in xfs_repair Merge of master-melb:xfs-cmds:29147a by kenmcd.")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
8 months ago xfs: Reduce unnecessary searches when searching for the best extents
Chi Zhiling [Thu, 14 Nov 2024 23:53:23 +0000 (15:53 -0800)] 
xfs: Reduce unnecessary searches when searching for the best extents

Source kernel commit: 3ef22684038aa577c10972ee9c6a2455f5fac941

Recently, we found that the CPU spent a lot of time in
xfs_alloc_ag_vextent_size when the filesystem has millions of fragmented
spaces.

The reason is that we conducted much extra searching for extents that
could not yield a better result, and these searches would cost a lot of
time when there were millions of extents to search through. Even if we
get the same result length, we don't switch our choice to the new one,
so we can definitely terminate the search early.

Since the result length cannot exceed the found length, when the found
length equals the best result length we already have, we can conclude
the search.
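
The early-exit argument can be sketched on a flat array instead of a btree. The sketch assumes candidates are visited in descending order of found length and that each candidate's usable (result) length never exceeds its found length; all names are hypothetical and this is not the kernel function.

```c
#include <assert.h>
#include <stddef.h>

struct demo_extent {
	unsigned int	found_len;	/* raw length of the free extent */
	unsigned int	usable_len;	/* length after trimming; <= found_len */
};

/*
 * Scan @n candidates sorted by descending found_len and return the index
 * of the best one.  *visited reports how many candidates were examined.
 * The caller must pass n > 0.
 */
static inline size_t
demo_pick_best(const struct demo_extent *ext, size_t n, size_t *visited)
{
	unsigned int best_len = 0;
	size_t best = 0, i;

	*visited = 0;
	for (i = 0; i < n; i++) {
		assert(ext[i].usable_len <= ext[i].found_len);
		/*
		 * Nothing from here on can be longer than best_len, and a
		 * tie never replaces the current choice, so stop.
		 */
		if (ext[i].found_len <= best_len)
			break;
		(*visited)++;
		if (ext[i].usable_len > best_len) {
			best_len = ext[i].usable_len;
			best = i;
		}
	}
	return best;
}
```

With a strict less-than comparison in the break condition, the loop would keep walking every remaining candidate of the same found length, which is the expensive case when almost all free extents share one size.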

We did a test in that filesystem:
[root@localhost ~]# xfs_db -c freesp /dev/vdb
   from      to extents  blocks    pct
      1       1     215     215   0.01
      2       3  994476 1988952  99.99

Before this patch:
0)               |  xfs_alloc_ag_vextent_size [xfs]() {
0) * 15597.94 us |  }

After this patch:
0)               |  xfs_alloc_ag_vextent_size [xfs]() {
0)   19.176 us   |  }

Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
8 months ago xfs_spaceman: add dependency on libhandle target
Jan Palus [Sat, 19 Oct 2024 18:23:19 +0000 (20:23 +0200)] 
xfs_spaceman: add dependency on libhandle target

Fixes: 764d8cb8 ("xfs_spaceman: report file paths")
Signed-off-by: Jan Palus <jpalus@fastmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
9 months ago mkfs: add a config file for 6.12 LTS kernels
Darrick J. Wong [Tue, 29 Oct 2024 00:03:34 +0000 (17:03 -0700)] 
mkfs: add a config file for 6.12 LTS kernels

We didn't add any new ondisk features in 2023, so the config file is the
same.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
9 months ago xfs_scrub_all: wait for services to start activating
Darrick J. Wong [Tue, 29 Oct 2024 00:03:34 +0000 (17:03 -0700)] 
xfs_scrub_all: wait for services to start activating

It seems that the function call to start a systemd unit completes
asynchronously from any change in that unit's active state.  On a
lightly loaded system, a Start() call followed by an ActiveState()
call actually sees the change in state from inactive to activating.

Unfortunately, on a heavily loaded system, the state change may take a
few seconds.  If this is the case, the wait() call can see that the unit
state is "inactive", decide that the service already finished, and exit
early, when in reality it hasn't even gotten to 'activating'.

Fix this by adding a second method that watches either for the inactive
-> activating state transition or for the last exit from inactivation
timestamp to change before waiting for the unit to reach inactive state.
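
The race can be illustrated with a simulated sequence of polled unit states. The sketch below is written in C purely for illustration and does not reflect how the tool talks to systemd; it models only the state transition, not the timestamp check mentioned above, and every name in it is made up.

```c
#include <assert.h>
#include <stddef.h>

enum demo_state { DEMO_INACTIVE, DEMO_ACTIVATING, DEMO_ACTIVE };

/*
 * Naive wait: return the index of the first poll that sees "inactive".
 * If the unit has not started activating yet, this returns immediately.
 */
static inline size_t
demo_wait_naive(const enum demo_state *polls, size_t n)
{
	size_t i;

	assert(polls != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (polls[i] == DEMO_INACTIVE)
			return i;
	return n;
}

/*
 * Two-step wait: first wait for the unit to leave "inactive", then wait
 * for it to come back to "inactive".
 */
static inline size_t
demo_wait_two_step(const enum demo_state *polls, size_t n)
{
	size_t i = 0;

	assert(polls != NULL || n == 0);
	while (i < n && polls[i] == DEMO_INACTIVE)
		i++;
	while (i < n && polls[i] != DEMO_INACTIVE)
		i++;
	return i;
}
```

On a sequence such as inactive, inactive, activating, active, inactive, the naive wait stops at the first poll while the two-step wait stops at the last one.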

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Cc: <linux-xfs@vger.kernel.org> # v6.10.0
Fixes: 6d831e770359ff ("xfs_scrub_all: convert systemctl calls to dbus")
Reviewed-by: Christoph Hellwig <hch@lst.de>
9 months ago xfs_repair: stop preallocating blocks in mk_rbmino and mk_rsumino
Christoph Hellwig [Tue, 29 Oct 2024 00:03:34 +0000 (17:03 -0700)] 
xfs_repair: stop preallocating blocks in mk_rbmino and mk_rsumino

Now that repair is using libxfs_rtfile_initialize_blocks to write to the
rtbitmap and rtsummary inodes, space allocation is already taken care of
by that helper and there is no need to preallocate it.  Remove the code
to do so.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
9 months ago xfs_repair: use libxfs_rtfile_initialize_blocks
Christoph Hellwig [Tue, 29 Oct 2024 00:03:34 +0000 (17:03 -0700)] 
xfs_repair: use libxfs_rtfile_initialize_blocks

Use libxfs_rtfile_initialize_blocks to write the re-computed rtbitmap
and rtsummary contents.  This removes duplicate code and prepares for
even more sharing once the rtgroup feature adds a metadata header to
the rtbitmap and rtsummary blocks.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
9 months ago mkfs: use xfs_rtfile_initialize_blocks
Christoph Hellwig [Tue, 29 Oct 2024 00:03:34 +0000 (17:03 -0700)] 
mkfs: use xfs_rtfile_initialize_blocks

Use the new libxfs helper for initializing the rtbitmap/summary files
for rtgroup-enabled file systems.  Also skip the zeroing of the blocks
for rtgroup file systems as we'll overwrite every block instantly.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
9 months ago mkfs: remove a pointless rtfreesp_init forward declaration
Christoph Hellwig [Tue, 29 Oct 2024 00:03:33 +0000 (17:03 -0700)] 
mkfs: remove a pointless rtfreesp_init forward declaration

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
9 months ago xfs_repair: use xfs_validate_rt_geometry
Christoph Hellwig [Tue, 29 Oct 2024 00:03:33 +0000 (17:03 -0700)] 
xfs_repair: use xfs_validate_rt_geometry

Use shared libxfs code with the kernel instead of reimplementing it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
9 months ago xfs_repair: checking rt free space metadata must happen during phase 4
Darrick J. Wong [Tue, 29 Oct 2024 00:03:33 +0000 (17:03 -0700)] 
xfs_repair: checking rt free space metadata must happen during phase 4

Back in the really old days, xfs_repair would generate the new free
space information for the realtime section during phase 5, and write the
contents to the rtbitmap and summary files during phase 6.  This was ok
because the incore information isn't used until phase 6.

Then I changed the behavior to check the generated information against
what was on disk and complain about the discrepancies.  Unfortunately,
there was a subtle flaw here -- for a non -n run, we'll have regenerated
the AG metadata before we actually check the rt free space information.
If the AG btree regeneration should clobber one of the old rtbitmap or
summary blocks, this will be reported as a corruption even though
nothing's wrong.

Move check_rtmetadata to the end of phase 4 so that this doesn't happen.

Cc: <linux-xfs@vger.kernel.org> # v5.19.0
Fixes: f2e388616d7491 ("xfs_repair: check free rt extent count")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
9 months ago xfs_db: allow setting current address to log blocks
Darrick J. Wong [Tue, 29 Oct 2024 00:03:33 +0000 (17:03 -0700)] 
xfs_db: allow setting current address to log blocks

Add commands so that users can target blocks on an external log device.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
9 months ago xfs_db: convert rtsummary geometry
Darrick J. Wong [Tue, 29 Oct 2024 00:03:33 +0000 (17:03 -0700)] 
xfs_db: convert rtsummary geometry

Teach the rtconvert command to be able to convert realtime blocks and
extents to locations within the rt summary.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
9 months ago xfs_db: convert rtbitmap geometry
Darrick J. Wong [Tue, 29 Oct 2024 00:03:32 +0000 (17:03 -0700)] 
xfs_db: convert rtbitmap geometry

Teach the rtconvert command to be able to convert realtime blocks and
extents to locations within the rt bitmap.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
9 months ago xfs_db: enable conversion of rt space units
Darrick J. Wong [Tue, 29 Oct 2024 00:03:32 +0000 (17:03 -0700)] 
xfs_db: enable conversion of rt space units

Teach the xfs_db convert function about rt extents, rt block numbers,
and how to compute offsets within the rt bitmap and summary files.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
9 months ago xfs_db: access arbitrary realtime blocks and extents
Darrick J. Wong [Tue, 29 Oct 2024 00:03:32 +0000 (17:03 -0700)] 
xfs_db: access arbitrary realtime blocks and extents

Add two commands to xfs_db so that we can point ourselves at any
arbitrary realtime block or extent.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
9 months ago xfs_db: access realtime file blocks
Darrick J. Wong [Tue, 29 Oct 2024 00:03:32 +0000 (17:03 -0700)] 
xfs_db: access realtime file blocks

Now that we have the ability to point the io cursor at the realtime
device, let's make it so that the "dblock" command can walk the contents
of realtime files.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
9 months ago xfs_db: make the daddr command target the realtime device
Darrick J. Wong [Tue, 29 Oct 2024 00:03:31 +0000 (17:03 -0700)] 
xfs_db: make the daddr command target the realtime device

Make it so that users can issue the command "daddr -r XXX" to select
disk block XXX on the realtime device.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
9 months ago xfs_db: report the realtime device when associated with each io cursor
Darrick J. Wong [Tue, 29 Oct 2024 00:03:31 +0000 (17:03 -0700)] 
xfs_db: report the realtime device when associated with each io cursor

When db is reporting on an io cursor and the cursor points to the
realtime device, print that fact.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
9 months ago xfs_db: support passing the realtime device to the debugger
Darrick J. Wong [Tue, 29 Oct 2024 00:03:31 +0000 (17:03 -0700)] 
xfs_db: support passing the realtime device to the debugger

Create a new -R flag so that sysadmins can pass the realtime device to
the xfs debugger.  Since we can now have superblocks on the rt device,
we need this to be able to inspect/dump/etc.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
9 months ago xfs_io: add atomic file update commands to exercise file commit range
Darrick J. Wong [Tue, 29 Oct 2024 00:03:31 +0000 (17:03 -0700)] 
xfs_io: add atomic file update commands to exercise file commit range

Add three commands to xfs_io so that we can exercise atomic file updates
as provided by reflink and the start-commit / commit-range functionality.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
9 months ago xfs_io: add a commitrange option to the exchangerange command
Darrick J. Wong [Tue, 29 Oct 2024 00:03:30 +0000 (17:03 -0700)] 
xfs_io: add a commitrange option to the exchangerange command

Teach the xfs_io exchangerange command to be able to use the commit
range functionality so that we can test it piece by piece.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
9 months ago xfs_fsr: port to new file exchange library function
Darrick J. Wong [Tue, 29 Oct 2024 00:03:30 +0000 (17:03 -0700)] 
xfs_fsr: port to new file exchange library function

Port fsr to use the new libfrog library functions to handle exchanging
mappings between the target and donor files.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
9 months ago libxfs: validate inumber in xfs_iget
Darrick J. Wong [Tue, 29 Oct 2024 00:03:30 +0000 (17:03 -0700)] 
libxfs: validate inumber in xfs_iget

Actually use the inumber validator to check the argument passed in here,
just like we now do in the kernel.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
9 months ago libxfs: remove unused xfs_inode fields
Darrick J. Wong [Tue, 29 Oct 2024 00:03:30 +0000 (17:03 -0700)] 
libxfs: remove unused xfs_inode fields

Remove these unused fields; on the author's system this reduces the
struct size from 560 bytes to 448.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
9 months agolibfrog: add support for commit range ioctl family
Darrick J. Wong [Tue, 29 Oct 2024 00:03:30 +0000 (17:03 -0700)] 
libfrog: add support for commit range ioctl family

Add some library code to support the new file range commit ioctls.  This
will be used to test the atomic file commit functionality in fstests.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
9 months agoman: document file range commit ioctls
Darrick J. Wong [Tue, 29 Oct 2024 00:03:30 +0000 (17:03 -0700)] 
man: document file range commit ioctls

Document the two new ioctls to support committing arbitrary dirty data
ranges of two files.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
9 months agoxfs: update the pag for the last AG at recovery time
Christoph Hellwig [Mon, 28 Oct 2024 05:11:49 +0000 (22:11 -0700)] 
xfs: update the pag for the last AG at recovery time

Source kernel commit: 4a201dcfa1ff0dcfe4348c40f3ad8bd68b97eb6c

Currently log recovery never updates the in-core perag values for the
last allocation group when they were grown by growfs.  This leads to
btree record validation failures for the alloc, ialloc or finotbt
trees if a transaction references this new space.

Found by Brian's new growfs recovery stress test.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
9 months agoxfs: don't use __GFP_RETRY_MAYFAIL in xfs_initialize_perag
Christoph Hellwig [Mon, 28 Oct 2024 05:11:48 +0000 (22:11 -0700)] 
xfs: don't use __GFP_RETRY_MAYFAIL in xfs_initialize_perag

Source kernel commit: 069cf5e32b700f94c6ac60f6171662bdfb04f325

__GFP_RETRY_MAYFAIL increases the likelihood of allocations failing,
which isn't really helpful during log recovery.  Remove the flag and
stick to the default GFP_KERNEL policies.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
9 months agoxfs: merge the perag freeing helpers
Christoph Hellwig [Mon, 28 Oct 2024 05:11:47 +0000 (22:11 -0700)] 
xfs: merge the perag freeing helpers

Source kernel commit: aa67ec6a25617e36eba4fb28a88159f500a6cac6

There is no good reason to have two different routines for freeing perag
structures for the unmount and error cases.  Add two arguments to specify
the range of AGs to free to xfs_free_perag, and use that to replace
xfs_free_unused_perag_range.

The additional RCU grace period for the error case is harmless, and the
extra check for the AG to actually exist is not required now that the
callers pass the exact known allocated range.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
9 months agoxfs: pass the exact range to initialize to xfs_initialize_perag
Christoph Hellwig [Mon, 28 Oct 2024 05:11:47 +0000 (22:11 -0700)] 
xfs: pass the exact range to initialize to xfs_initialize_perag

Source kernel commit: 82742f8c3f1a93787a05a00aca50c2a565231f84

Currently only the new agcount is passed to xfs_initialize_perag, which
requires lookups of existing AGs to skip them and complicates error
handling.  Also pass the previous agcount so that the range that
xfs_initialize_perag operates on is exactly defined.  That way the
extra lookups can be avoided, and error handling can clean up the
exact range from the old count to the last added perag structure.
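
A minimal sketch of the "exact range" idea, assuming a fixed-size table
of groups rather than the real perag lookup structure.  All sketch_*
names and the fail_at fault injection parameter are hypothetical and
exist only for illustration:

```c
#include <errno.h>
#include <stdlib.h>

#define SKETCH_MAX_GROUPS	16

struct sketch_group {
	unsigned int	index;
};

/* Hypothetical stand-in for the per-mount group lookup structure. */
struct sketch_group *sketch_groups[SKETCH_MAX_GROUPS];

/* Free exactly the groups in [first, end). */
void sketch_free_range(unsigned int first, unsigned int end)
{
	for (unsigned int i = first; i < end; i++) {
		free(sketch_groups[i]);
		sketch_groups[i] = NULL;
	}
}

/*
 * Initialize exactly the groups in [old_count, new_count).  Existing
 * groups below old_count are never looked up or touched.  fail_at
 * simulates an allocation failure at that index; pass
 * SKETCH_MAX_GROUPS to disable it.
 */
int sketch_init_range(unsigned int old_count, unsigned int new_count,
		      unsigned int fail_at)
{
	if (new_count > SKETCH_MAX_GROUPS || old_count > new_count)
		return -EINVAL;

	for (unsigned int i = old_count; i < new_count; i++) {
		struct sketch_group *g = NULL;

		if (i != fail_at)
			g = malloc(sizeof(*g));
		if (!g) {
			/* Unwind only what this call added. */
			sketch_free_range(old_count, i);
			return -ENOMEM;
		}
		g->index = i;
		sketch_groups[i] = g;
	}
	return 0;
}

/* Count how many table slots are currently populated. */
unsigned int sketch_count_present(void)
{
	unsigned int n = 0;

	for (unsigned int i = 0; i < SKETCH_MAX_GROUPS; i++)
		if (sketch_groups[i])
			n++;
	return n;
}
```

Because both ends of the range are known, the error path can free from
old_count up to the failing index without checking whether each entry
exists.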

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
9 months agoxfs: support lowmode allocations in xfs_bmap_exact_minlen_extent_alloc
Christoph Hellwig [Mon, 21 Oct 2024 00:10:48 +0000 (17:10 -0700)] 
xfs: support lowmode allocations in xfs_bmap_exact_minlen_extent_alloc

Source kernel commit: 6aac77059881e4419df499392c995bf02fb9630b

Currently the debug-only xfs_bmap_exact_minlen_extent_alloc allocation
variant fails to drop into the lowmode last resort allocator, and
thus can sometimes fail allocations for which the caller has a
transaction block reservation.

Fix this by using xfs_bmap_btalloc_low_space to do the actual allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
9 months agoxfs: call xfs_bmap_exact_minlen_extent_alloc from xfs_bmap_btalloc
Christoph Hellwig [Mon, 21 Oct 2024 00:10:47 +0000 (17:10 -0700)] 
xfs: call xfs_bmap_exact_minlen_extent_alloc from xfs_bmap_btalloc

Source kernel commit: 405ee87c6938f67e6ab62a3f8f85b3c60a093886

xfs_bmap_exact_minlen_extent_alloc duplicates the args setup in
xfs_bmap_btalloc.  Switch to call it from xfs_bmap_btalloc after
doing the basic setup.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
9 months agoxfs: don't ifdef around the exact minlen allocations
Christoph Hellwig [Mon, 21 Oct 2024 00:10:47 +0000 (17:10 -0700)] 
xfs: don't ifdef around the exact minlen allocations

Source kernel commit: b611fddc0435738e64453bbf1dadd4b12a801858

Exact minlen allocations only exist as an error injection tool for debug
builds.  Currently this is implemented using ifdefs, which means the code
isn't even compiled for non-XFS_DEBUG builds.  Enhance the compile test
coverage by always building the code and using the compiler's dead code
elimination to remove it from the generated binary instead.

The only downside is that the alloc_minlen_only field is unconditionally
added to struct xfs_alloc_args now, but by moving it around and packing
it tightly this doesn't actually increase the size of the structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
9 months agoxfs: fold xfs_bmap_alloc_userdata into xfs_bmapi_allocate
Christoph Hellwig [Mon, 21 Oct 2024 00:10:47 +0000 (17:10 -0700)] 
xfs: fold xfs_bmap_alloc_userdata into xfs_bmapi_allocate

Source kernel commit: 865469cd41bce2b04bef9539cbf70676878bc8df

Userdata and metadata allocations end up in the same allocation helpers.
Remove the separate xfs_bmap_alloc_userdata function to make this more
clear.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
9 months agoxfs: distinguish extra split from real ENOSPC from xfs_attr_node_try_addname
Christoph Hellwig [Mon, 21 Oct 2024 00:10:47 +0000 (17:10 -0700)] 
xfs: distinguish extra split from real ENOSPC from xfs_attr_node_try_addname

Source kernel commit: b3f4e84e2f438a119b7ca8684a25452b3e57c0f0

Just like xfs_attr3_leaf_split, xfs_attr_node_try_addname can return
-ENOSPC both for an actual failure to allocate a disk block and to
signal the caller to convert the format of the attr fork.  Use magic
1 to ask for the conversion here as well.

Note that unlike the similar issue in xfs_attr3_leaf_split, this one was
only found by code review.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
9 months agoxfs: distinguish extra split from real ENOSPC from xfs_attr3_leaf_split
Christoph Hellwig [Mon, 21 Oct 2024 00:10:46 +0000 (17:10 -0700)] 
xfs: distinguish extra split from real ENOSPC from xfs_attr3_leaf_split

Source kernel commit: a5f73342abe1f796140f6585e43e2aa7bc1b7975

xfs_attr3_leaf_split propagates the need for an extra btree split as
-ENOSPC to its only caller, but the same return value can also be
returned from xfs_da_grow_inode when it fails to find free space.

Distinguish the two cases by returning 1 for the extra split case instead
of overloading -ENOSPC.

This can be triggered relatively easily with the pending realtime group
support and a file system with a lot of small zones that use metadata
space on the main device.  In this case roughly every 5th to 10th run
of xfs/538 runs into the following assert:

ASSERT(oldblk->magic == XFS_ATTR_LEAF_MAGIC);

in xfs_attr3_leaf_split caused by an allocation failure.  Note that
the allocation failure is caused by another bug that will be fixed
subsequently, but this commit at least sorts out the error handling.
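
The return convention described above can be sketched as follows.  This
is an illustration only: the sketch_* functions are hypothetical
stand-ins, not the libxfs code, and the free block counter is an
assumed simplification of the allocator:

```c
#include <errno.h>
#include <stdbool.h>

enum sketch_action {
	SKETCH_DONE,		/* entry was added */
	SKETCH_SPLIT_AGAIN,	/* caller must do one more split */
	SKETCH_FAIL,		/* real error, e.g. out of space */
};

/* Stand-in for growing the inode: fails when no space is left. */
int sketch_grow(unsigned int *free_blocks)
{
	if (*free_blocks == 0)
		return -ENOSPC;
	(*free_blocks)--;
	return 0;
}

/*
 * Returns 0 if the entry fit after the split, 1 if an extra split is
 * needed, or a negative errno for a real failure.  The "extra split"
 * case no longer shares a value with -ENOSPC.
 */
int sketch_leaf_split(unsigned int *free_blocks, bool entry_fits)
{
	int error = sketch_grow(free_blocks);

	if (error)
		return error;
	return entry_fits ? 0 : 1;
}

/* The caller can now tell the three outcomes apart. */
enum sketch_action sketch_caller(unsigned int free_blocks, bool entry_fits)
{
	int ret = sketch_leaf_split(&free_blocks, entry_fits);

	if (ret < 0)
		return SKETCH_FAIL;
	return ret == 1 ? SKETCH_SPLIT_AGAIN : SKETCH_DONE;
}
```

With the old overloading, the last two outcomes both looked like
-ENOSPC to the caller.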

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
9 months agoxfs: return bool from xfs_attr3_leaf_add
Christoph Hellwig [Mon, 21 Oct 2024 00:10:46 +0000 (17:10 -0700)] 
xfs: return bool from xfs_attr3_leaf_add

Source kernel commit: 346c1d46d4c631c0c88592d371f585214d714da4

xfs_attr3_leaf_add only has two potential return values, indicating if the
entry could be added or not.  Replace the errno return with a bool so that
ENOSPC from it can't easily be confused with a real ENOSPC.

Remove the return value from the xfs_attr3_leaf_add_work helper entirely,
as it always returns 0.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
9 months agoxfs: merge xfs_attr_leaf_try_add into xfs_attr_leaf_addname
Christoph Hellwig [Mon, 21 Oct 2024 00:10:46 +0000 (17:10 -0700)] 
xfs: merge xfs_attr_leaf_try_add into xfs_attr_leaf_addname

Source kernel commit: b1c649da15c2e4c86344c8e5af69c8afa215efec

xfs_attr_leaf_try_add is only called by xfs_attr_leaf_addname, and
merging the two will simplify a following error handling fix.

To facilitate this, move the remote block state save/restore helpers up
in the file so that they don't need forward declarations.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
9 months agoxfs: enable block size larger than page size support
Pankaj Raghav [Mon, 21 Oct 2024 00:10:46 +0000 (17:10 -0700)] 
xfs: enable block size larger than page size support

Source kernel commit: 7df7c204c678e24cd32d33360538670b7b90e330

Page cache now has the ability to have a minimum order when allocating
a folio which is a prerequisite to add support for block size > page
size.

Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/20240827-xfs-fix-wformat-bs-gt-ps-v1-1-aec6717609e0@kernel.org
Link: https://lore.kernel.org/r/20240822135018.1931258-11-kernel@pankajraghav.com
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
9 months agoxfs: ensure st_blocks never goes to zero during COW writes
Christoph Hellwig [Mon, 21 Oct 2024 00:10:46 +0000 (17:10 -0700)] 
xfs: ensure st_blocks never goes to zero during COW writes

Source kernel commit: 90fa22da6d6b41dc17435aff7b800f9ca3c00401

COW writes remove the amount overwritten either directly for delalloc
reservations, or in earlier deferred transactions than adding the new
amount back in the bmap map transaction.  This means st_blocks on an
inode where all data is overwritten using the COW path can temporarily
show a 0 st_blocks.  This can easily be reproduced with the pending
zoned device support where all writes use this path and trips the
check in generic/615, but could also happen on a reflink file without
that.

Fix this by temporarily adding the pending blocks to be mapped to
i_delayed_blks while the item is queued.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
9 months agoxfs: convert perag lookup to xarray
Christoph Hellwig [Mon, 21 Oct 2024 00:10:45 +0000 (17:10 -0700)] 
xfs: convert perag lookup to xarray

Source kernel commit: 32fa4059fe6776d7db1e9058f360e06b36c9f2ce

Convert the perag lookup from the legacy radix tree to the xarray,
which allows for much nicer iteration and bulk lookup semantics.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
9 months agoxfs: move the tagged perag lookup helpers to xfs_icache.c
Christoph Hellwig [Mon, 21 Oct 2024 00:10:45 +0000 (17:10 -0700)] 
xfs: move the tagged perag lookup helpers to xfs_icache.c

Source kernel commit: f48f0a8e00b67028d4492e7656b346fa0d806570

The tagged perag helpers are only used in xfs_icache.c in the kernel code
and not at all in xfsprogs.  Move them to xfs_icache.c in preparation for
switching to an xarray, for which I have no plan to implement the tagged
lookup functions for userspace.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
9 months agoxfs: use kfree_rcu_mightsleep to free the perag structures
Christoph Hellwig [Mon, 21 Oct 2024 00:10:45 +0000 (17:10 -0700)] 
xfs: use kfree_rcu_mightsleep to free the perag structures

Source kernel commit: 4ef7c6d39dc72dae983b836c8b2b5de7128c0da3

Using kfree_rcu_mightsleep is simpler and removes the need for a
rcu_head in the perag structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
9 months agoxfs: remove unnecessary check
Dan Carpenter [Mon, 21 Oct 2024 00:10:45 +0000 (17:10 -0700)] 
xfs: remove unnecessary check

Source kernel commit: fb8b941c75bd70ddfb0a8a3bb9bb770ed1d648f8

We checked that "pip" is non-NULL at the start of the if else statement
so there is no need to check again here.  Delete the check.

Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
9 months agoxfs: use kvmalloc for xattr buffers
Dave Chinner [Mon, 21 Oct 2024 00:10:45 +0000 (17:10 -0700)] 
xfs: use kvmalloc for xattr buffers

Source kernel commit: de631e1a8b71017b8a12b57d07db82e4052555af

Pankaj Raghav reported that when filesystem block size is larger
than page size, the xattr code can use kmalloc() for high order
allocations. This triggers a useless warning in the allocator as it
is a __GFP_NOFAIL allocation here:

static inline
struct page *rmqueue(struct zone *preferred_zone,
			struct zone *zone, unsigned int order,
			gfp_t gfp_flags, unsigned int alloc_flags,
			int migratetype)
{
	struct page *page;

	/*
	 * We most definitely don't want callers attempting to
	 * allocate greater than order-1 page units with __GFP_NOFAIL.
	 */
>>>>	WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1));
...

Fix this by changing all these call sites to use kvmalloc(), which
will strip the NOFAIL from the kmalloc attempt and if that fails
will do a __GFP_NOFAIL vmalloc().

This is not an issue that production systems will see as
filesystems with block size > page size cannot be mounted by the
kernel; Pankaj is developing this functionality right now.

Reported-by: Pankaj Raghav <kernel@pankajraghav.com>
Fixes: f078d4ea8276 ("xfs: convert kmem_alloc() to kmalloc()")
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Link: https://lore.kernel.org/r/20240822135018.1931258-8-kernel@pankajraghav.com
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Pankaj Raghav <p.raghav@samsung.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Daniel Gomez <da.gomez@samsung.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
9 months agoxfs: standardize the btree maxrecs function parameters
Darrick J. Wong [Mon, 21 Oct 2024 00:10:44 +0000 (17:10 -0700)] 
xfs: standardize the btree maxrecs function parameters

Source kernel commit: 411a71256de6f5a0015a28929cfbe6bc36c503dc

Standardize the parameters in xfs_{alloc,bm,ino,rmap,refcount}bt_maxrecs
so that we have consistent calling conventions.  This doesn't affect the
kernel that much, but enables us to clean up userspace a bit.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
9 months agoxfs: replace shouty XFS_BM{BT,DR} macros
Darrick J. Wong [Mon, 21 Oct 2024 00:10:44 +0000 (17:10 -0700)] 
xfs: replace shouty XFS_BM{BT,DR} macros

Source kernel commit: 79124b3740063573312de4b225407ebdae219275

Replace all the shouty bmap btree and bmap disk root macros with actual
functions.

sed \
-e 's/XFS_BMBT_BLOCK_LEN/xfs_bmbt_block_len/g' \
-e 's/XFS_BMBT_REC_ADDR/xfs_bmbt_rec_addr/g' \
-e 's/XFS_BMBT_KEY_ADDR/xfs_bmbt_key_addr/g' \
-e 's/XFS_BMBT_PTR_ADDR/xfs_bmbt_ptr_addr/g' \
-e 's/XFS_BMDR_REC_ADDR/xfs_bmdr_rec_addr/g' \
-e 's/XFS_BMDR_KEY_ADDR/xfs_bmdr_key_addr/g' \
-e 's/XFS_BMDR_PTR_ADDR/xfs_bmdr_ptr_addr/g' \
-e 's/XFS_BMAP_BROOT_PTR_ADDR/xfs_bmap_broot_ptr_addr/g' \
-e 's/XFS_BMAP_BROOT_SPACE_CALC/xfs_bmap_broot_space_calc/g' \
-e 's/XFS_BMAP_BROOT_SPACE/xfs_bmap_broot_space/g' \
-e 's/XFS_BMDR_SPACE_CALC/xfs_bmdr_space_calc/g' \
-e 's/XFS_BMAP_BMDR_SPACE/xfs_bmap_bmdr_space/g' \
-i $(git ls-files fs/xfs/*.[ch] fs/xfs/libxfs/*.[ch] fs/xfs/scrub/*.[ch])

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
9 months agoxfs: fix a sloppy memory handling bug in xfs_iroot_realloc
Darrick J. Wong [Mon, 21 Oct 2024 00:10:44 +0000 (17:10 -0700)] 
xfs: fix a sloppy memory handling bug in xfs_iroot_realloc

Source kernel commit: de55149b6639e903c4d06eb0474ab2c05060e61d

While refactoring code, I noticed that when xfs_iroot_realloc tries to
shrink a bmbt root block, it allocates a smaller new block and then
copies "records" and pointers to the new block.  However, bmbt root
blocks cannot ever be leaves, which means that it's not technically
correct to copy records.  We /should/ be copying keys.

Note that this has never resulted in actual memory corruption because
sizeof(bmbt_rec) == (sizeof(bmbt_key) + sizeof(bmbt_ptr)).  However,
this will no longer be true when we start adding realtime rmap stuff,
so fix this now.
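
The size identity mentioned above can be checked with a small sketch.
The struct layouts here are simplified, host-endian stand-ins with the
sizes assumed from the description (a 16-byte record, an 8-byte key and
an 8-byte pointer); the sketch_* names are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the bmap btree record, key and pointer. */
struct sketch_bmbt_rec {
	uint64_t	l0;
	uint64_t	l1;
};

struct sketch_bmbt_key {
	uint64_t	br_startoff;
};

typedef uint64_t sketch_bmbt_ptr_t;

/* The identity that kept the old code from corrupting memory. */
_Static_assert(sizeof(struct sketch_bmbt_rec) ==
	       sizeof(struct sketch_bmbt_key) + sizeof(sketch_bmbt_ptr_t),
	       "rec size must equal key size plus ptr size");

/* Bytes touched when n entries are (wrongly) copied as records. */
size_t sketch_bytes_as_records(unsigned int n)
{
	return n * sizeof(struct sketch_bmbt_rec);
}

/* Bytes occupied by n keys plus n pointers in a node block. */
size_t sketch_bytes_keys_and_ptrs(unsigned int n)
{
	return n * (sizeof(struct sketch_bmbt_key) +
		    sizeof(sketch_bmbt_ptr_t));
}
```

If a btree type had a key and pointer whose combined size differed from
its record size, the static assertion would fail and the two byte
counts would diverge.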

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
9 months agoxfs: replace m_rsumsize with m_rsumblocks
Christoph Hellwig [Mon, 21 Oct 2024 00:10:44 +0000 (17:10 -0700)] 
xfs: replace m_rsumsize with m_rsumblocks

Source kernel commit: 33912286cb1956920712aba8cb6f38e434824357

Track the RT summary file size in blocks, just like the RT bitmap
file.  While we have users of both units, blocks are used slightly
more often and this matches the bitmap file for consistency.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
9 months agoxfs: remove xfs_{rtbitmap,rtsummary}_wordcount
Christoph Hellwig [Mon, 21 Oct 2024 00:10:43 +0000 (17:10 -0700)] 
xfs: remove xfs_{rtbitmap,rtsummary}_wordcount

Source kernel commit: 1fc51cf11dd8b26856ae1c4111e402caec73019c

xfs_rtbitmap_wordcount and xfs_rtsummary_wordcount are currently unused,
so remove them to simplify refactoring other rtbitmap helpers.  They
can be added back or simply open coded when actually needed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
9 months agoxfs: clean up the ISVALID macro in xfs_bmap_adjacent
Christoph Hellwig [Mon, 21 Oct 2024 00:10:43 +0000 (17:10 -0700)] 
xfs: clean up the ISVALID macro in xfs_bmap_adjacent

Source kernel commit: 1e21d1897f935815618d419c94e88452070ec8e5

Turn the ISVALID macro that is defined and used inside
xfs_bmap_adjacent and relies on implicit context into a proper inline
function.
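
As a rough illustration of the difference, the sketch below passes
everything the check needs as explicit parameters instead of picking up
local variables from the enclosing function.  The geometry struct, the
field names and the exact validity rules are assumptions made for this
example, not the real xfs_bmap_adjacent logic:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified geometry used only by this sketch. */
struct sketch_geom {
	uint64_t	rblocks;	/* size of the realtime device */
	uint64_t	agblocks;	/* blocks per allocation group */
	uint32_t	agcount;	/* number of allocation groups */
};

/*
 * Is block x a valid neighbour of block y?  On the realtime device
 * only the device size matters; otherwise both blocks must sit in
 * the same, existing allocation group.
 */
static inline bool sketch_isvalid(const struct sketch_geom *g, bool rt,
				  uint64_t x, uint64_t y)
{
	if (rt)
		return x < g->rblocks;
	return x / g->agblocks == y / g->agblocks &&
	       x / g->agblocks < g->agcount;
}
```

Unlike a function-local macro, the inline function is type checked and
its dependencies are visible in the signature.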

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
9 months agoxfs: simplify xfs_rtalloc_query_range
Christoph Hellwig [Mon, 21 Oct 2024 00:10:43 +0000 (17:10 -0700)] 
xfs: simplify xfs_rtalloc_query_range

Source kernel commit: df8b181f1551581e96076a653cdca43468093c0f

There isn't much of a good reason to pass the xfs_rtalloc_rec structures
that describe extents to xfs_rtalloc_query_range as we really just want
a lower and upper bound xfs_rtxnum_t.  Pass the rtxnum directly and
simplify the interface.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
9 months agoxfs: remove xfs_rtb_to_rtxrem
Christoph Hellwig [Mon, 21 Oct 2024 00:10:43 +0000 (17:10 -0700)] 
xfs: remove xfs_rtb_to_rtxrem

Source kernel commit: fa0fc38b255cc88aef31ff13b5593e27622204e1

Simplify the number of block number conversion helpers by removing
xfs_rtb_to_rtxrem.  Any recent compiler is smart enough to eliminate
the double divisions if using separate xfs_rtb_to_rtx and
xfs_rtb_to_rtxoff calls.
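
A minimal sketch of the two separate conversions, with hypothetical
sketch_* names and the extent size passed in directly rather than read
from the mount structure:

```c
#include <stdint.h>

/* Realtime block number -> realtime extent number. */
uint64_t sketch_rtb_to_rtx(uint64_t rtbno, uint32_t rextsize)
{
	return rtbno / rextsize;
}

/* Realtime block number -> block offset within its extent. */
uint64_t sketch_rtb_to_rtxoff(uint64_t rtbno, uint32_t rextsize)
{
	return rtbno % rextsize;
}
```

A caller that needs both values simply calls both helpers with the same
arguments; the combined quotient-and-remainder helper is not required
for that.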

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
9 months agoxfs: ensure rtx mask/shift are correct after growfs
Christoph Hellwig [Mon, 21 Oct 2024 00:10:42 +0000 (17:10 -0700)] 
xfs: ensure rtx mask/shift are correct after growfs

Source kernel commit: 86a0264ef26e90214a5bd74c72fb6e3455403bcf

When growfs sets an extent size, it doesn't update the m_rtxblklog and
m_rtxblkmask values, which could lead to incorrect usage of them if they
were set before and can't be used for the new extent size.

Add a xfs_mount_sb_set_rextsize helper that updates the two fields, and
also use it when calculating the new RT geometry instead of disabling
the optimization there.
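
The idea of keeping the cached shift and mask in sync with the extent
size can be sketched like this.  The struct and function names are
hypothetical, and the "-1 / 0 when not a power of two" encoding is an
assumption of this sketch:

```c
#include <stdint.h>

struct sketch_mount {
	uint32_t	rextsize;	/* blocks per realtime extent */
	int		rtxblklog;	/* log2(rextsize), or -1 */
	uint64_t	rtxblkmask;	/* rextsize - 1, or 0 */
};

/* Single place that sets the extent size and both derived fields. */
void sketch_set_rextsize(struct sketch_mount *mp, uint32_t rextsize)
{
	mp->rextsize = rextsize;
	mp->rtxblklog = -1;
	mp->rtxblkmask = 0;

	/* Shift and mask are only usable for power-of-two sizes. */
	if (rextsize != 0 && (rextsize & (rextsize - 1)) == 0) {
		int log = 0;

		while ((1u << log) < rextsize)
			log++;
		mp->rtxblklog = log;
		mp->rtxblkmask = (uint64_t)rextsize - 1;
	}
}

/* Use the shift when it is valid, plain division otherwise. */
uint64_t sketch_mount_rtb_to_rtx(const struct sketch_mount *mp,
				 uint64_t rtbno)
{
	if (mp->rtxblklog >= 0)
		return rtbno >> mp->rtxblklog;
	return rtbno / mp->rextsize;
}
```

If the extent size were changed without going through the setter, a
stale shift from the old size would silently produce wrong extent
numbers, which is the class of bug the helper is meant to rule out.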

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
9 months agoxfs: push transaction join out of xfs_rtbitmap_lock and xfs_rtgroup_lock
Christoph Hellwig [Mon, 21 Oct 2024 00:10:42 +0000 (17:10 -0700)] 
xfs: push transaction join out of xfs_rtbitmap_lock and xfs_rtgroup_lock

Source kernel commit: 0a59e4f3e1670bc49d60e1bd1a9b19ca156ae9cb

To prepare for being able to join an already locked rtbitmap inode to a
transaction split out separate helpers for joining the transaction from
the locking helpers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
9 months agoxfs: factor out rtbitmap/summary initialization helpers
Christoph Hellwig [Mon, 21 Oct 2024 00:10:42 +0000 (17:10 -0700)] 
xfs: factor out rtbitmap/summary initialization helpers

Source kernel commit: 2a95ffc44b610643c9d5d2665600d3fbefa5ec4f

Add helpers to libxfs that can be shared by growfs and mkfs for
initializing the rtbitmap and summary, and by passing the optional data
pointer also by repair for rebuilding them.  This will become even more
useful when the rtgroups feature adds a metadata header to each block,
which means even more shared code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: minor documentation and data advance tweaks]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
9 months agoxfs: add bounds checking to xfs_rt{bitmap,summary}_read_buf
Christoph Hellwig [Mon, 21 Oct 2024 00:10:42 +0000 (17:10 -0700)] 
xfs: add bounds checking to xfs_rt{bitmap,summary}_read_buf

Source kernel commit: b4781eea6872431840e53ffebb95a5614e6944b4

Add a corruption check for passing an invalid block number, which is a
lot easier to understand than the xfs_bmapi_read failure later on.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
9 months agoxfs: assert a valid limit in xfs_rtfind_forw
Christoph Hellwig [Mon, 21 Oct 2024 00:10:42 +0000 (17:10 -0700)] 
xfs: assert a valid limit in xfs_rtfind_forw

Source kernel commit: 6d2db12d56a389b3e8efa236976f8dc3a8ae00f0

Protect against developers passing stupid limits when refactoring the
RT code once again.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
9 months agoxfs: remove the limit argument to xfs_rtfind_back
Christoph Hellwig [Mon, 21 Oct 2024 00:10:41 +0000 (17:10 -0700)] 
xfs: remove the limit argument to xfs_rtfind_back

Source kernel commit: 119c65e56bc131b466a7cd958a4089e286ce3c4b

All callers pass a 0 limit to xfs_rtfind_back, so remove the argument
and hard code it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
9 months agoxfs: factor out a xfs_validate_rt_geometry helper
Christoph Hellwig [Mon, 21 Oct 2024 00:10:41 +0000 (17:10 -0700)] 
xfs: factor out a xfs_validate_rt_geometry helper

Source kernel commit: 6529eef810e2ded0e540162273ee31a41314ec4e

Split the RT geometry validation in the early mount code into a
helper that can be reused by repair (from which this code was
apparently originally stolen anyway).

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: u64 return value for calc_rbmblocks]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
9 months agoxfs: remove xfs_validate_rtextents
Christoph Hellwig [Mon, 21 Oct 2024 00:10:41 +0000 (17:10 -0700)] 
xfs: remove xfs_validate_rtextents

Source kernel commit: 021d9c107e29a598e51fb66a54b22e5416125408

Replace xfs_validate_rtextents with an open coded check for 0
rtextents.  The name for the function implies it does a lot more
than a zero check, which is more obvious when open coded.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
9 months agoxfs: pass the icreate args object to xfs_dialloc
Darrick J. Wong [Mon, 21 Oct 2024 00:10:41 +0000 (17:10 -0700)] 
xfs: pass the icreate args object to xfs_dialloc

Source kernel commit: 390b4775d6787706b1846f15623a68e576ec900c

Pass the xfs_icreate_args object to xfs_dialloc since we can extract the
relevant mode (really just the file type) and parent inumber from there.
This simplifies the calling convention in preparation for the next
patch.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
9 months agoxfs: introduce new file range commit ioctls
Darrick J. Wong [Mon, 21 Oct 2024 00:10:40 +0000 (17:10 -0700)] 
xfs: introduce new file range commit ioctls

Source kernel commit: 398597c3ef7fb1d8fa31491c8f4f3996cff45701

This patch introduces two more new ioctls to manage atomic updates to
file contents -- XFS_IOC_START_COMMIT and XFS_IOC_COMMIT_RANGE.  The
commit mechanism here is exactly the same as what XFS_IOC_EXCHANGE_RANGE
does, but with the additional requirement that file2 cannot have changed
since some sampling point.  The start-commit ioctl performs the sampling
of file attributes.

Note: This patch currently samples i_ctime during START_COMMIT and
checks that it hasn't changed during COMMIT_RANGE.  This isn't entirely
safe in kernels prior to 6.12 because ctime only had coarse grained
granularity and very fast updates could collide with a COMMIT_RANGE.
With the multi-granularity ctime introduced by Jeff Layton, it's now
possible to update ctime such that this does not happen.

It is critical, then, that this patch must not be backported to any
kernel that does not support fine-grained file change timestamps.
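
To illustrate why timestamp granularity matters for the freshness
check, here is a toy model of "sample, then compare".  It is not the
ioctl interface: the sketch_* types and functions are hypothetical, and
the -EBUSY return value is an assumption of this sketch:

```c
#include <errno.h>
#include <stdint.h>

struct sketch_file {
	uint64_t	ino;
	int64_t		ctime_ns;
};

struct sketch_sample {
	uint64_t	ino;
	int64_t		ctime_ns;
};

/* Model a change: ctime is rounded down to the clock granularity. */
void sketch_touch(struct sketch_file *f, int64_t now_ns,
		  int64_t granularity_ns)
{
	f->ctime_ns = now_ns - (now_ns % granularity_ns);
}

/* "Start commit": remember what the file looked like. */
struct sketch_sample sketch_start_commit(const struct sketch_file *f)
{
	struct sketch_sample s = { .ino = f->ino, .ctime_ns = f->ctime_ns };

	return s;
}

/* "Commit range": refuse if the file changed since the sample. */
int sketch_commit_range(const struct sketch_file *f,
			const struct sketch_sample *s)
{
	if (f->ino != s->ino || f->ctime_ns != s->ctime_ns)
		return -EBUSY;
	return 0;
}
```

With a coarse clock, two changes inside the same tick leave ctime
unchanged and the second one goes undetected; with a fine-grained clock
the same sequence is caught.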

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Acked-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
9 months agolibfrog: add xarray emulation
Christoph Hellwig [Mon, 21 Oct 2024 00:10:40 +0000 (17:10 -0700)] 
libfrog: add xarray emulation

Implement the simple parts of the kernel xarray API on-top of the libfrog
radix-tree.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
9 months agolibxfs: port IS_ENABLED from the kernel
Darrick J. Wong [Mon, 21 Oct 2024 00:10:40 +0000 (17:10 -0700)] 
libxfs: port IS_ENABLED from the kernel

Port the IS_ENABLED macro from the kernel so that it can be used in
libxfs.  This requires a bit of hygiene on our part -- any CONFIG_XFS_*
define in userspace that has a counterpart in the kernel must be defined
to 1 (and not simply define'd) so that the macro works, because the
kernel translates CONFIG_FOO=y in .config to #define CONFIG_FOO 1.
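
The reason a bare define is not enough can be seen in a simplified
version of the preprocessor trick.  This sketch only handles the
"defined to 1" case and uses hypothetical SKETCH_* and CONFIG_SKETCH_*
names; it is not the macro as ported:

```c
/* Token pasting only yields "0," when the option expands to 1. */
#define SKETCH_ARG_PLACEHOLDER_1		0,
#define SKETCH_TAKE_SECOND(ignored, val, ...)	val
#define SKETCH_IS_DEFINED3(arg1_or_junk) \
	SKETCH_TAKE_SECOND(arg1_or_junk 1, 0, 0)
#define SKETCH_IS_DEFINED2(val) \
	SKETCH_IS_DEFINED3(SKETCH_ARG_PLACEHOLDER_##val)
#define SKETCH_IS_DEFINED(x)		SKETCH_IS_DEFINED2(x)
#define SKETCH_IS_ENABLED(option)	SKETCH_IS_DEFINED(option)

#define CONFIG_SKETCH_ON	1	/* like CONFIG_FOO=y in .config */
#define CONFIG_SKETCH_BARE		/* simply define'd: not detected */
/* CONFIG_SKETCH_OFF is deliberately left undefined. */

/* Each function returns the constant the macro expands to. */
int sketch_on(void)
{
	return SKETCH_IS_ENABLED(CONFIG_SKETCH_ON);
}

int sketch_bare(void)
{
	return SKETCH_IS_ENABLED(CONFIG_SKETCH_BARE);
}

int sketch_off(void)
{
	return SKETCH_IS_ENABLED(CONFIG_SKETCH_OFF);
}
```

Only the option defined to 1 pastes into SKETCH_ARG_PLACEHOLDER_1 and
inserts the extra comma that shifts 1 into the second argument slot;
the bare and undefined options both evaluate to 0.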

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
9 months agolibxfs: test compiling public headers with a C++ compiler
Darrick J. Wong [Mon, 21 Oct 2024 00:10:40 +0000 (17:10 -0700)] 
libxfs: test compiling public headers with a C++ compiler

Apparently C++ compilers don't like the implicit void* casts that go on
in the system headers.  Compile a dummy program with the C++ compiler to
make sure this works.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Tested-by: Sam James <sam@gentoo.org>
Reviewed-by: Sam James <sam@gentoo.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
9 months agolibxfs: require -std=gnu11 for compilation by default
Darrick J. Wong [Mon, 21 Oct 2024 00:10:40 +0000 (17:10 -0700)] 
libxfs: require -std=gnu11 for compilation by default

The kernel now builds with -std=gnu11, so let's make xfsprogs do that by
default too.  Distributions can still override the parameters by passing
CFLAGS= and BUILD_CFLAGS= to configure, just as they always have.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
9 months agoxfsprogs: Release v6.11.0 v6.11.0
Andrey Albershteyn [Wed, 16 Oct 2024 18:13:22 +0000 (20:13 +0200)] 
xfsprogs: Release v6.11.0

Update all the necessary files for a v6.11.0 release.

Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
9 months agoxfsprogs: update gitignore
Andrey Albershteyn [Fri, 4 Oct 2024 11:57:04 +0000 (13:57 +0200)] 
xfsprogs: update gitignore

Building xfsprogs seems to produce many build artifacts which are
not tracked by git. Ignore them.

Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
9 months agoxfsprogs: fix permissions on files installed by libtoolize
Andrey Albershteyn [Fri, 4 Oct 2024 11:57:03 +0000 (13:57 +0200)] 
xfsprogs: fix permissions on files installed by libtoolize

Libtoolize installs some set of AUX files from its system package.
Not all distributions have the same permissions set on these files.
For example, a read-only libtoolize system package will copy those
files without write permissions. This causes the build to fail, as the
next line copies ./include/install-sh over ./install-sh, which is not
writable.

Fix this by setting permission explicitly on files copied by
libtoolize.

Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>