]> git.ipfire.org Git - thirdparty/xfsprogs-dev.git/log
thirdparty/xfsprogs-dev.git
3 years agoxfs: make the key parameters to all btree query range functions const
Darrick J. Wong [Mon, 31 Jan 2022 20:25:47 +0000 (15:25 -0500)] 
xfs: make the key parameters to all btree query range functions const

Source kernel commit: 04dcb47482a9d9e27feba48ca92613edced42ef9

Range query functions are not supposed to modify the query keys that are
being passed in, so mark them all const.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: make the key parameters to all btree key comparison functions const
Darrick J. Wong [Mon, 31 Jan 2022 20:25:47 +0000 (15:25 -0500)] 
xfs: make the key parameters to all btree key comparison functions const

Source kernel commit: d29d5577774d7d032da1343dba80be7423e307f9

The btree key comparison functions are not allowed to change the keys
that are passed in, so mark them const.  We'll need this for the next
patch, which adds const to the btree range query functions.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: make xfs_rtalloc_query_range input parameters const
Darrick J. Wong [Mon, 31 Jan 2022 20:25:47 +0000 (15:25 -0500)] 
xfs: make xfs_rtalloc_query_range input parameters const

Source kernel commit: c02f6529864a4f5f91d216d324bac4ba75415d19

In commit 8ad560d2565e, we changed xfs_rtalloc_query_range to constrain
the range of bits in the realtime bitmap file that would actually be
searched.  In commit a3a374bf1889, we changed the range again
(incorrectly), leading to the fix in commit d88850bd5516, which finally
corrected the range check code.  Unfortunately, the author never noticed
that the function modifies its input parameters, which is a totaly no-no
since none of the other range query functions change their input
parameters.

So, fix this function yet again to stash the upper end of the query
range (i.e. the high key) in a local variable and hope this is the last
time I have to fix my own function.  While we're at it, mark the key
inputs const so nobody makes this mistake again. :(

Fixes: 8ad560d2565e ("xfs: strengthen rtalloc query range checks")
Not-fixed-by: a3a374bf1889 ("xfs: fix off-by-one error in xfs_rtalloc_query_range")
Not-fixed-by: d88850bd5516 ("xfs: fix high key handling in the rt allocator's query_range function")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: Rename __xfs_attr_rmtval_remove
Allison Henderson [Mon, 31 Jan 2022 20:25:47 +0000 (15:25 -0500)] 
xfs: Rename __xfs_attr_rmtval_remove

Source kernel commit: 5e68b4c7fb6414c4a48b6d2988312e3b1f31978e

Now that xfs_attr_rmtval_remove is gone, rename __xfs_attr_rmtval_remove
to xfs_attr_rmtval_remove

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: add attr state machine tracepoints
Allison Henderson [Mon, 31 Jan 2022 20:25:47 +0000 (15:25 -0500)] 
xfs: add attr state machine tracepoints

Source kernel commit: df0826312a23e495faa91eee0d6ac31bca35dc09

This is a quick patch to add a new xfs_attr_*_return tracepoints.  We
use these to track when ever a new state is set or -EAGAIN is returned

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: allow setting and clearing of log incompat feature flags
Darrick J. Wong [Mon, 31 Jan 2022 20:25:47 +0000 (15:25 -0500)] 
xfs: allow setting and clearing of log incompat feature flags

Source kernel commit: 908ce71e54f8265fa909200410d6c50ab9a2d302

Log incompat feature flags in the superblock exist for one purpose: to
protect the contents of a dirty log from replay on a kernel that isn't
prepared to handle those dirty contents.  This means that they can be
cleared if (a) we know the log is clean and (b) we know that there
aren't any other threads in the system that might be setting or relying
upon a log incompat flag.

Therefore, clear the log incompat flags when we've finished recovering
the log, when we're unmounting cleanly, remounting read-only, or
freezing; and provide a function so that subsequent patches can start
using this.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: replace kmem_alloc_large() with kvmalloc()
Dave Chinner [Mon, 31 Jan 2022 20:25:47 +0000 (15:25 -0500)] 
xfs: replace kmem_alloc_large() with kvmalloc()

Source kernel commit: d634525db63e9e946c3229fb93c8d9b763afbaf3

There is no reason for this wrapper existing anymore. All the places
that use KM_NOFS allocation are within transaction contexts and
hence covered by memalloc_nofs_save/restore contexts. Hence we don't
need any special handling of vmalloc for large IOs anymore and
so special casing this code isn't necessary.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: remove the active vs running quota differentiation
Christoph Hellwig [Mon, 31 Jan 2022 20:25:47 +0000 (15:25 -0500)] 
xfs: remove the active vs running quota differentiation

Source kernel commit: 149e53afc851713a70b6a06ebfaf2ebf25975454

These only made a difference when quotaoff supported disabling quota
accounting on a mounted file system, so we can switch everyone to use
a single set of flags and helpers now. Note that the *QUOTA_ON naming
for the helpers is kept as it was the much more commonly used one.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: remove support for disabling quota accounting on a mounted file system
Christoph Hellwig [Mon, 31 Jan 2022 20:25:47 +0000 (15:25 -0500)] 
xfs: remove support for disabling quota accounting on a mounted file system

Source kernel commit: 40b52225e58cd3adf9358146b4b39dccfbfe5892

Disabling quota accounting is hairy, racy code with all kinds of pitfalls.
And it has a very strange mind set, as quota accounting (unlike
enforcement) really is a propery of the on-disk format.  There is no good
use case for supporting this.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs_{copy,db,logprint,repair}: pass xfs_mount pointers instead of xfs_sb pointers
Darrick J. Wong [Mon, 31 Jan 2022 20:25:41 +0000 (15:25 -0500)] 
xfs_{copy,db,logprint,repair}: pass xfs_mount pointers instead of xfs_sb pointers

Where possible, convert these four programs to pass a pointer to a
struct xfs_mount instead of the struct xfs_sb inside the mount.  This
will make it easier to convert some of the code to the new
xfs_has_FEATURE predicates later on.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfsprogs: fix static build problems caused by liburcu
Theodore Ts'o [Mon, 31 Jan 2022 20:24:54 +0000 (15:24 -0500)] 
xfsprogs: fix static build problems caused by liburcu

The liburcu library has a dependency on pthreads.  Hence, in order for
static builds of xfsprogs to work, $(LIBPTHREAD) needs to appear
*after* $(LUBURCU) in LLDLIBS.  Otherwise, static links of xfs_* will
fail due to undefined references of pthread_create, pthread_exit,
et. al.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfsprogs: Release v5.14.2 v5.14.2
Eric Sandeen [Mon, 6 Dec 2021 21:41:57 +0000 (16:41 -0500)] 
xfsprogs: Release v5.14.2

Update all the necessary files for a 5.14.2 release.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agolibxfs: hide the drainbamaged fallthrough macro from xfslibs
Darrick J. Wong [Mon, 6 Dec 2021 19:26:04 +0000 (14:26 -0500)] 
libxfs: hide the drainbamaged fallthrough macro from xfslibs

Back in mid-2021, Kees and Gustavo rammed into the kernel a bunch of
static checker "improvements" that redefined '/* fallthrough */'
comments for switch statements as a macro that virtualizes either that
same comment, a do-while loop, or a compiler __attribute__.  This was
necessary to work around the poor decision-making of the clang, gcc, and
C language standard authors, who collectively came up with four mutually
incompatible ways to document a lack of branching in a code flow.

Having received ZERO HELP porting this to userspace, Eric and I
foolishly dumped that crap into linux.h, which was a poor decision
because we keep forgetting that linux.h is exported as a userspace
header.  This has now caused downstream regressions in Debian[1] and
will probably cause more problems in the other distros.

Move it to platform_defs.h since that's not shipped publicly and leave a
warning to anyone else who dare modify linux.h.

[1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1000974

Fixes: df9c7d8d ("xfs: Fix fall-through warnings for Clang")
Cc: 1000974@bugs.debian.org, gustavoars@kernel.org, keescook@chromium.org
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
[sandeen: add comment to top of linux.h  per Dave's suggestion]
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfsprogs: Release v5.14.1 v5.14.1
Eric Sandeen [Thu, 2 Dec 2021 22:02:13 +0000 (17:02 -0500)] 
xfsprogs: Release v5.14.1

Update all the necessary files for a 5.14.1 release.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agolibxfs: fix atomic64_t poorly for 32-bit architectures
Darrick J. Wong [Thu, 2 Dec 2021 20:24:33 +0000 (15:24 -0500)] 
libxfs: fix atomic64_t poorly for 32-bit architectures

In commit de555f66, we converted the atomic64_t implementation to use
the liburcu uatomic_* functions.  Regrettably, nobody tried to build
xfsprogs on a 32-bit architecture (hint: maintainers don't scale well
anymore) so nobody noticed that the build fails due to the unknown
symbol _uatomic_link_error.  This is what happens when liburcu doesn't
know how to perform atomic updates to a variable of a certain size, due
to some horrid macro magic in urcu.h.

Rather than a strict revert to non-atomic updates for these platforms or
(which would introduce a landmine) or roll everything back for the sake
of older platforms, I went with providing a custom atomic64_t
implementation that uses a single pthread mutex.  This enables us to
work around the fact that the kernel atomic64_t API doesn't require a
special initializer function, and is probably good enough since there
are only a handful of atomic64_t counters in the kernel.

Clean up the type declarations of a couple of variables in libxlog to
match the kernel usage, though that's probably overkill.

Eventually we'll want to decide if we're deprecating 32-bit, but this
fixes them in the mean time.

[sandeen: make the custom atomic64_add() take an int64_t]

Fixes: de555f66 ("atomic: convert to uatomic")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agolibfrog: fix crc32c self test code on cross builds
Darrick J. Wong [Thu, 2 Dec 2021 20:24:32 +0000 (15:24 -0500)] 
libfrog: fix crc32c self test code on cross builds

Helmut Grohne reported that the crc32c self test program fails to cross
build on 5.14.0 if the build host doesn't have liburcu installed.  We
don't need userspace RCU functionality to test crc32 on the build host,
so twiddle the header files to include only the two header files that we
actually need.

Note: Build-time testing of crc32c is useful for upstream developers so
that we can check that we haven't broken the checksum code, but we
really ought to be testing this in mkfs and repair on the user's system
so that they don't end up with garbage filesystems.  A future patch will
introduce that.

Reported-by: Helmut Grohne <helmut@subdivi.de>
Cc: Bastian Germann <bage@debian.org>
Suggested-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfsprogs: Release v5.14.0 v5.14.0
Eric Sandeen [Fri, 19 Nov 2021 15:59:45 +0000 (10:59 -0500)] 
xfsprogs: Release v5.14.0

Update all the necessary files for a 5.14.0 release.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agodebian: Add changelog entry for 5.14.0-rc1-1
Bastian Germann [Fri, 19 Nov 2021 03:05:11 +0000 (22:05 -0500)] 
debian: Add changelog entry for 5.14.0-rc1-1

This holds all package changes since v5.13.0.
Add Closes tags that will autoclose related bugs.

Signed-off-by: Bastian Germann <bage@debian.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agodebian: Fix FTBFS
Boian Bonev [Fri, 19 Nov 2021 03:05:11 +0000 (22:05 -0500)] 
debian: Fix FTBFS

With newer autotools install-sh is regenerated by libtoolize.
Copy the package's version after autogen.

Link: https://bugs.debian.org/997656
Signed-off-by: Boian Bonev <bbonev@ipacct.com>
Signed-off-by: Bastian Germann <bage@debian.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agodebian: Pass --build and --host to configure
Bastian Germann [Fri, 19 Nov 2021 03:05:11 +0000 (22:05 -0500)] 
debian: Pass --build and --host to configure

xfsprogs fails to cross build because it fails to pass --host to configure.
Thus it selects the build architecture as host architecture and fails
configure, because the requested libraries are only installed for the host
architecture.

Link: https://bugs.debian.org/794158
Reported-by: Helmut Grohne <helmut@subdivi.de>
Signed-off-by: Bastian Germann <bage@debian.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agodebian: Update Uploaders list
Bastian Germann [Fri, 19 Nov 2021 03:05:11 +0000 (22:05 -0500)] 
debian: Update Uploaders list

Set Bastian's debian.org email address.

Signed-off-by: Bastian Germann <bage@debian.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfsprogs: Release v5.14.0-rc1 v5.14.0-rc1
Eric Sandeen [Fri, 12 Nov 2021 20:38:42 +0000 (15:38 -0500)] 
xfsprogs: Release v5.14.0-rc1

Update all the necessary files for a 5.14.0-rc1 release.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agomkfs: warn about V4 deprecation when creating new V4 filesystems
Darrick J. Wong [Fri, 12 Nov 2021 20:27:59 +0000 (15:27 -0500)] 
mkfs: warn about V4 deprecation when creating new V4 filesystems

The V4 filesystem format is deprecated in the upstream Linux kernel.  In
September 2025 it will be turned off by default in the kernel and five
years after that, support will be removed entirely.  Warn people
formatting new filesystems with the old format, particularly since V4 is
not the default.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: sync xfs_btree_split macros with userspace libxfs
Darrick J. Wong [Fri, 12 Nov 2021 20:20:04 +0000 (15:20 -0500)] 
xfs: sync xfs_btree_split macros with userspace libxfs

Source kernel commit: 4a6b35b3b3f28df81fea931dc77c4c229cbdb5b2

Sync this one last bit of discrepancy between kernel and userspace
libxfs.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: #ifdef out perag code for userspace
Eric Sandeen [Fri, 12 Nov 2021 20:17:14 +0000 (15:17 -0500)] 
xfs: #ifdef out perag code for userspace

Source kernel commit: 29f11fce211c7fcf32713457c031e71785fb6088

The xfs_perag structure and initialization is unused in userspace,
so #ifdef it out with __KERNEL__ to facilitate the xfsprogs sync
and build.

Signed-off-by: Eric Sandeen <esandeen@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs_db: convert the agresv command to use for_each_perag libxfs-5.14-sync
Darrick J. Wong [Fri, 15 Oct 2021 20:28:27 +0000 (16:28 -0400)] 
xfs_db: convert the agresv command to use for_each_perag

Convert the AG iteration loop for this debugger command to use
for_each_perag, since it's the only place in userspace that obvious
wants it.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: logging the on disk inode LSN can make it go backwards
Dave Chinner [Fri, 15 Oct 2021 20:28:27 +0000 (16:28 -0400)] 
xfs: logging the on disk inode LSN can make it go backwards

Source kernel commit: 32baa63d82ee3f5ab3bd51bae6bf7d1c15aed8c7

When we log an inode, we format the "log inode" core and set an LSN
in that inode core. We do that via xfs_inode_item_format_core(),
which calls:

xfs_inode_to_log_dinode(ip, dic, ip->i_itemp->ili_item.li_lsn);

to format the log inode. It writes the LSN from the inode item into
the log inode, and if recovery decides the inode item needs to be
replayed, it recovers the log inode LSN field and writes it into the
on disk inode LSN field.

Now this might seem like a reasonable thing to do, but it is wrong
on multiple levels. Firstly, if the item is not yet in the AIL,
item->li_lsn is zero. i.e. the first time the inode it is logged and
formatted, the LSN we write into the log inode will be zero. If we
only log it once, recovery will run and can write this zero LSN into
the inode.

This means that the next time the inode is logged and log recovery
runs, it will *always* replay changes to the inode regardless of
whether the inode is newer on disk than the version in the log and
that violates the entire purpose of recording the LSN in the inode
at writeback time (i.e. to stop it going backwards in time on disk
during recovery).

Secondly, if we commit the CIL to the journal so the inode item
moves to the AIL, and then relog the inode, the LSN that gets
stamped into the log inode will be the LSN of the inode's current
location in the AIL, not it's age on disk. And it's not the LSN that
will be associated with the current change. That means when log
recovery replays this inode item, the LSN that ends up on disk is
the LSN for the previous changes in the log, not the current
changes being replayed. IOWs, after recovery the LSN on disk is not
in sync with the LSN of the modifications that were replayed into
the inode. This, again, violates the recovery ordering semantics
that on-disk writeback LSNs provide.

Hence the inode LSN in the log dinode is -always- invalid.

Thirdly, recovery actually has the LSN of the log transaction it is
replaying right at hand - it uses it to determine if it should
replay the inode by comparing it to the on-disk inode's LSN. But it
doesn't use that LSN to stamp the LSN into the inode which will be
written back when the transaction is fully replayed. It uses the one
in the log dinode, which we know is always going to be incorrect.

Looking back at the change history, the inode logging was broken by
back in 2016 by a stupid idiot who thought he knew how this code
worked. i.e. me. That commit replaced an in memory di_lsn field that
was updated only at inode writeback time from the inode item.li_lsn
value - and hence always contained the same LSN that appeared in the
on-disk inode - with a read of the inode item LSN at inode format
time. CLearly these are not the same thing.

Before 93f958f9c41f, the log recovery behaviour was irrelevant,
because the LSN in the log inode always matched the on-disk LSN at
the time the inode was logged, hence recovery of the transaction
would never make the on-disk LSN in the inode go backwards or get
out of sync.

A symptom of the problem is this, caught from a failure of
generic/482. Before log recovery, the inode has been allocated but
never used:

xfs_db> inode 393388
xfs_db> p
core.magic = 0x494e
core.mode = 0
....
v3.crc = 0x99126961 (correct)
v3.change_count = 0
v3.lsn = 0
v3.flags2 = 0
v3.cowextsize = 0
v3.crtime.sec = Thu Jan  1 10:00:00 1970
v3.crtime.nsec = 0

After log recovery:

xfs_db> p
core.magic = 0x494e
core.mode = 020444
....
v3.crc = 0x23e68f23 (correct)
v3.change_count = 2
v3.lsn = 0
v3.flags2 = 0
v3.cowextsize = 0
v3.crtime.sec = Thu Jul 22 17:03:03 2021
v3.crtime.nsec = 751000000
...

You can see that the LSN of the on-disk inode is 0, even though it
clearly has been written to disk. I point out this inode, because
the generic/482 failure occurred because several adjacent inodes in
this specific inode cluster were not replayed correctly and still
appeared to be zero on disk when all the other metadata (inobt,
finobt, directories, etc) indicated they should be allocated and
written back.

The fix for this is two-fold. The first is that we need to either
revert the LSN changes in 93f958f9c41f or stop logging the inode LSN
altogether. If we do the former, log recovery does not need to
change but we add 8 bytes of memory per inode to store what is
largely a write-only inode field. If we do the latter, log recovery
needs to stamp the on-disk inode in the same manner that inode
writeback does.

I prefer the latter, because we shouldn't really be trying to log
and replay changes to the on disk LSN as the on-disk value is the
canonical source of the on-disk version of the inode. It also
matches the way we recover buffer items - we create a buf_log_item
that carries the current recovery transaction LSN that gets stamped
into the buffer by the write verifier when it gets written back
when the transaction is fully recovered.

However, this might break log recovery on older kernels even more,
so I'm going to simply ignore the logged value in recovery and stamp
the on-disk inode with the LSN of the transaction being recovered
that will trigger writeback on transaction recovery completion. This
will ensure that the on-disk inode LSN always reflects the LSN of
the last change that was written to disk, regardless of whether it
comes from log recovery or runtime writeback.

Fixes: 93f958f9c41f ("xfs: cull unnecessary icdinode fields")
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: correct the narrative around misaligned rtinherit/extszinherit dirs
Darrick J. Wong [Fri, 15 Oct 2021 20:28:27 +0000 (16:28 -0400)] 
xfs: correct the narrative around misaligned rtinherit/extszinherit dirs

Source kernel commit: 83193e5ebb0164d612aa620ceab7d3746e80e2a4

While auditing the realtime growfs code, I realized that the GROWFSRT
ioctl (and by extension xfs_growfs) has always allowed sysadmins to
change the realtime extent size when adding a realtime section to the
filesystem.  Since we also have always allowed sysadmins to set
RTINHERIT and EXTSZINHERIT on directories even if there is no realtime
device, this invalidates the premise laid out in the comments added in

In other words, this is not a case of inadequate metadata validation.
This is a case of nearly forgotten (and apparently untested) but
supported functionality.  Update the comments to reflect what we've
learned, and remove the log message about correcting the misalignment.

Fixes: 603f000b15f2 ("xfs: validate extsz hints against rt extent size when rtinherit is set")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: check for sparse inode clusters that cross new EOAG when shrinking
Darrick J. Wong [Fri, 15 Oct 2021 20:28:27 +0000 (16:28 -0400)] 
xfs: check for sparse inode clusters that cross new EOAG when shrinking

Source kernel commit: da062d16a897c0759ae907e786bc0bea950c0c9d

While running xfs/168, I noticed occasional write verifier shutdowns
involving inodes at the very end of the filesystem.  Existing inode
btree validation code checks that all inode clusters are fully contained
within the filesystem.

However, due to inadequate checking in the fs shrink code, it's possible
that there could be a sparse inode cluster at the end of the filesystem
where the upper inodes of the cluster are marked as holes and the
corresponding blocks are free.  In this case, the last blocks in the AG
are listed in the bnobt.  This enables the shrink to proceed but results
in a filesystem that trips the inode verifiers.  Fix this by disallowing
the shrink.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: Fix multiple fall-through warnings for Clang
Gustavo A. R. Silva [Fri, 15 Oct 2021 20:28:27 +0000 (16:28 -0400)] 
xfs: Fix multiple fall-through warnings for Clang

Source kernel commit: 5937e00017f1d1dd4551e723ebfa306671f27843

In preparation to enable -Wimplicit-fallthrough for Clang, fix
the following warnings by replacing /* fallthrough */ comments,
and its variants, with the new pseudo-keyword macro fallthrough:

fs/xfs/libxfs/xfs_attr.c:487:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/libxfs/xfs_attr.c:500:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/libxfs/xfs_attr.c:532:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/libxfs/xfs_attr.c:594:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/libxfs/xfs_attr.c:607:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/libxfs/xfs_attr.c:1410:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/libxfs/xfs_attr.c:1445:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/libxfs/xfs_attr.c:1473:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]

Notice that Clang doesn't recognize /* fallthrough */ comments as
implicit fall-through markings, so in order to globally enable
-Wimplicit-fallthrough for Clang, these comments need to be
replaced with fallthrough; in the whole codebase.

Link: https://github.com/KSPP/linux/issues/115
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: Initialize error in xfs_attr_remove_iter
Allison Henderson [Fri, 15 Oct 2021 20:28:27 +0000 (16:28 -0400)] 
xfs: Initialize error in xfs_attr_remove_iter

Source kernel commit: d3a3340b6af28ab79a66687973fb0287d976d490

A recent bug report generated a warning that a code path in
xfs_attr_remove_iter could potentially return error uninitialized in the
case of XFS_DAS_RM_SHRINK state.  Fix this by initializing error.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Bill O'Donnell <bodonnel@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: fix endianness issue in xfs_ag_shrink_space
Darrick J. Wong [Fri, 15 Oct 2021 20:28:27 +0000 (16:28 -0400)] 
xfs: fix endianness issue in xfs_ag_shrink_space

Source kernel commit: a8f3522c9a1f4a31e93b17f2b5310a2b615f5581

The AGI buffer is in big-endian format, so we must convert the
endianness to CPU format to do any comparisons.

Fixes: 46141dc891f7 ("xfs: introduce xfs_ag_shrink_space()")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: xfs_log_force_lsn isn't passed a LSN
Dave Chinner [Fri, 15 Oct 2021 20:28:27 +0000 (16:28 -0400)] 
xfs: xfs_log_force_lsn isn't passed a LSN

Source kernel commit: 5f9b4b0de8dc2fb8eb655463b438001c111570fe

In doing an investigation into AIL push stalls, I was looking at the
log force code to see if an async CIL push could be done instead.
This lead me to xfs_log_force_lsn() and looking at how it works.

xfs_log_force_lsn() is only called from inode synchronisation
contexts such as fsync(), and it takes the ip->i_itemp->ili_last_lsn
value as the LSN to sync the log to. This gets passed to
xlog_cil_force_lsn() via xfs_log_force_lsn() to flush the CIL to the
journal, and then used by xfs_log_force_lsn() to flush the iclogs to
the journal.

The problem is that ip->i_itemp->ili_last_lsn does not store a
log sequence number. What it stores is passed to it from the
->iop_committing method, which is called by xfs_log_commit_cil().
The value this passes to the iop_committing method is the CIL
context sequence number that the item was committed to.

As it turns out, xlog_cil_force_lsn() converts the sequence to an
actual commit LSN for the related context and returns that to
xfs_log_force_lsn(). xfs_log_force_lsn() overwrites it's "lsn"
variable that contained a sequence with an actual LSN and then uses
that to sync the iclogs.

This caused me some confusion for a while, even though I originally
wrote all this code a decade ago. ->iop_committing is only used by
a couple of log item types, and only inode items use the sequence
number it is passed.

Let's clean up the API, CIL structures and inode log item to call it
a sequence number, and make it clear that the high level code is
using CIL sequence numbers and not on-disk LSNs for integrity
synchronisation purposes.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: log stripe roundoff is a property of the log
Dave Chinner [Fri, 15 Oct 2021 20:28:27 +0000 (16:28 -0400)] 
xfs: log stripe roundoff is a property of the log

Source kernel commit: a6a65fef5ef8d0a6a0ce514eb66b2f3dfa777b48

We don't need to look at the xfs_mount and superblock every time we
need to do an iclog roundoff calculation. The property is fixed for
the life of the log, so store the roundoff in the log at mount time
and use that everywhere.

On a debug build:

$ size fs/xfs/xfs_log.o.*
text    data     bss     dec     hex filename
27360     560       8   27928    6d18 fs/xfs/xfs_log.o.orig
27219     560       8   27787    6c8b fs/xfs/xfs_log.o.patched

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: perag may be null in xfs_imap()
Dave Chinner [Fri, 15 Oct 2021 20:28:27 +0000 (16:28 -0400)] 
xfs: perag may be null in xfs_imap()

Source kernel commit: 90e2c1c20ac672756a2835b5a92a606dd48a4aa3

Dan Carpenter's static checker reported:

The patch 7b13c5155182: "xfs: use perag for ialloc btree cursors"
from Jun 2, 2021, leads to the following Smatch complaint:

fs/xfs/libxfs/xfs_ialloc.c:2403 xfs_imap()
error: we previously assumed 'pag' could be null (see line 2294)

And it's right. Fix it.

Fixes: 7b13c5155182 ("xfs: use perag for ialloc btree cursors")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: Make attr name schemes consistent
Allison Henderson [Fri, 15 Oct 2021 20:28:27 +0000 (16:28 -0400)] 
xfs: Make attr name schemes consistent

Source kernel commit: 816c8e39b7ea0875640312c9ed3be0d5a68d7183

This patch renames the following functions to make the nameing scheme more consistent:
xfs_attr_shortform_remove -> xfs_attr_sf_removename
xfs_attr_node_remove_name -> xfs_attr_node_removename
xfs_attr_set_fmt -> xfs_attr_sf_addname

Suggested-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: Fix default ASSERT in xfs_attr_set_iter
Allison Henderson [Fri, 15 Oct 2021 20:28:27 +0000 (16:28 -0400)] 
xfs: Fix default ASSERT in xfs_attr_set_iter

Source kernel commit: 4a4957c16dc674d1306a3b43d6b07ed93a7b7a14

This ASSERT checks for the state value of RM_SHRINK in the set path
which should never happen.  Change to ASSERT(0);

Suggested-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: drop the AGI being passed to xfs_check_agi_freecount
Dave Chinner [Fri, 15 Oct 2021 20:28:27 +0000 (16:28 -0400)] 
xfs: drop the AGI being passed to xfs_check_agi_freecount

Source kernel commit: 9ba0889e2272294bfbb5589b1b180ad2e782b2a4

Stephen Rothwell reported this compiler warning from linux-next:

fs/xfs/libxfs/xfs_ialloc.c: In function 'xfs_difree_finobt':
fs/xfs/libxfs/xfs_ialloc.c:2032:20: warning: unused variable 'agi' [-Wunused-variable]
2032 |  struct xfs_agi   *agi = agbp->b_addr;

Which is fallout from agno -> perag conversions that were done in
this function. xfs_check_agi_freecount() is the only user of "agi"
in xfs_difree_finobt() now, and it only uses the agi to get the
current free inode count. We hold that in the perag structure, so
there's not need to directly reference the raw AGI to get this
information.

The btree cursor being passed to xfs_check_agi_freecount() has a
reference to the perag being operated on, so use that directly in
xfs_check_agi_freecount() rather than passing an AGI.

Fixes: 7b13c5155182 ("xfs: use perag for ialloc btree cursors")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: fix radix tree tag signs
Darrick J. Wong [Fri, 15 Oct 2021 20:28:27 +0000 (16:28 -0400)] 
xfs: fix radix tree tag signs

Source kernel commit: 919a4ddb68413056ecb7c71d9d5465bb54c8032b

Radix tree tags are supposed to be unsigned ints, so fix the callers.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: mark xfs_bmap_set_attrforkoff static
Christoph Hellwig [Fri, 15 Oct 2021 20:28:27 +0000 (16:28 -0400)] 
xfs: mark xfs_bmap_set_attrforkoff static

Source kernel commit: 5a981e4ea8ff8062e7c7ea8fc4a1565e4820a08b

xfs_bmap_set_attrforkoff is only used inside of xfs_bmap.c, so mark it
static.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: Remove redundant assignment to busy
Jiapeng Chong [Fri, 15 Oct 2021 20:28:26 +0000 (16:28 -0400)] 
xfs: Remove redundant assignment to busy

Source kernel commit: 9673261c32dc2f30863b803374b726a72d16b07c

Variable busy is set to false, but this value is never read as it is
overwritten or not used later on, hence it is a redundant assignment
and can be removed.

Clean up the following clang-analyzer warning:

fs/xfs/libxfs/xfs_alloc.c:1679:2: warning: Value stored to 'busy' is
never read [clang-analyzer-deadcode.DeadStores].

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: sort variable alphabetically to avoid repeated declaration
Shaokun Zhang [Fri, 15 Oct 2021 20:28:26 +0000 (16:28 -0400)] 
xfs: sort variable alphabetically to avoid repeated declaration

Source kernel commit: 5f7fd75086203a8a4dd3e518976e52bcf24e8b22

Variable 'xfs_agf_buf_ops', 'xfs_agi_buf_ops', 'xfs_dquot_buf_ops' and
'xfs_symlink_buf_ops' are declared twice, so sort these variables
alphabetically and remove the repeated declaration.

Cc: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: remove xfs_perag_t
Dave Chinner [Fri, 15 Oct 2021 20:28:26 +0000 (16:28 -0400)] 
xfs: remove xfs_perag_t

Source kernel commit: 509201163fca3d4d906bd50a5320115d42818748

Almost unused, gets rid of another typedef.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: use perag through unlink processing
Dave Chinner [Fri, 15 Oct 2021 20:28:26 +0000 (16:28 -0400)] 
xfs: use perag through unlink processing

Source kernel commit: f40aadb2bb64fe0a3d9b59957e70796d629cdee2

Unlinked lists are held in the perag, and freeing of inodes needs to
be passed a perag, too, so look up the perag early in the unlink
processing and use it throughout.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: clean up and simplify xfs_dialloc()
Dave Chinner [Fri, 15 Oct 2021 20:28:26 +0000 (16:28 -0400)] 
xfs: clean up and simplify xfs_dialloc()

Source kernel commit: 8237fbf53d6fd2a3a248fc2a8608e047ef22316c

Because it's a mess.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: inode allocation can use a single perag instance
Dave Chinner [Fri, 15 Oct 2021 20:28:26 +0000 (16:28 -0400)] 
xfs: inode allocation can use a single perag instance

Source kernel commit: 309161f6603ce1a53b76a42817cde2a9bcd17e82

Now that we've internalised the two-phase inode allocation, we can
now easily make the AG selection and allocation atomic from the
perspective of a single perag context. This will ensure AGs going
offline/away cannot occur between the selection and allocation
steps.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: get rid of xfs_dir_ialloc()
Dave Chinner [Fri, 15 Oct 2021 20:28:26 +0000 (16:28 -0400)] 
xfs: get rid of xfs_dir_ialloc()

Source kernel commit: b652afd937033911944d7f681f2031b006961f1d

This is just a simple wrapper around the per-ag inode allocation
that doesn't need to exist. The internal mechanism to select and
allocate within an AG does not need to be exposed outside
xfs_ialloc.c, and it being exposed simply makes it harder to follow
the code and simplify it.

This is simplified by internalising xf_dialloc_select_ag() and
xfs_dialloc_ag() into a single xfs_dialloc() function and then
xfs_dir_ialloc() can go away.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: collapse AG selection for inode allocation
Dave Chinner [Fri, 15 Oct 2021 20:28:26 +0000 (16:28 -0400)] 
xfs: collapse AG selection for inode allocation

Source kernel commit: 89b1f55a2951bb89b7ae9f8cb3fd11513ff3f219

xfs_dialloc_select_ag() does a lot of repetitive work. It first
calls xfs_ialloc_ag_select() to select the AG to start allocation
attempts in, which can do up to two entire loops across the perags
that inodes can be allocated in. This is simply checking if there is
spce available to allocate inodes in an AG, and it returns when it
finds the first candidate AG.

xfs_dialloc_select_ag() then does it's own iterative walk across
all the perags locking the AGIs and trying to allocate inodes from
the locked AG. It also doesn't limit the search to mp->m_maxagi,
so it will walk all AGs whether they can allocate inodes or not.

Hence if we are really low on inodes, we could do almost 3 entire
walks across the whole perag range before we find an allocation
group we can allocate inodes in or report ENOSPC.

Because xfs_ialloc_ag_select() returns on the first candidate AG it
finds, we can simply do these checks directly in
xfs_dialloc_select_ag() before we lock and try to allocate inodes.
This reduces the inode allocation pass down to 2 perag sweeps at
most - one for aligned inode cluster allocation and if we can't
allocate full, aligned inode clusters anywhere we'll do another pass
trying to do sparse inode cluster allocation.

This also removes a big chunk of duplicate code.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: simplify xfs_dialloc_select_ag() return values
Dave Chinner [Fri, 15 Oct 2021 20:28:26 +0000 (16:28 -0400)] 
xfs: simplify xfs_dialloc_select_ag() return values

Source kernel commit: 4268547305c91b35ae7871374078de788a822ed1

The only caller of xfs_dialloc_select_ag() will always return
-ENOSPC to it's caller if the agbp returned from
xfs_dialloc_select_ag() is NULL. IOWs, failure to find a candidate
AGI we can allocate inodes from is always an ENOSPC condition, so
move this logic up into xfs_dialloc_select_ag() so we can simplify
the return logic in this function.

xfs_dialloc_select_ag() now only ever returns 0 with a locked
agbp, or an error with no agbp.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: remove agno from btree cursor
Dave Chinner [Fri, 15 Oct 2021 20:28:26 +0000 (16:28 -0400)] 
xfs: remove agno from btree cursor

Source kernel commit: 50f02fe3338d3fee6b298a1b262a4c562e7d84e0

Now that everything passes a perag, the agno is not needed anymore.
Convert all the users to use pag->pag_agno instead and remove the
agno from the cursor. This was largely done as an automated search
and replace.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: use perag for ialloc btree cursors
Dave Chinner [Fri, 15 Oct 2021 20:28:26 +0000 (16:28 -0400)] 
xfs: use perag for ialloc btree cursors

Source kernel commit: 7b13c515518264df0cb90d84fdab907a627c0fa9

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: convert allocbt cursors to use perags
Dave Chinner [Fri, 15 Oct 2021 20:28:26 +0000 (16:28 -0400)] 
xfs: convert allocbt cursors to use perags

Source kernel commit: 289d38d22cd88960cb648dc480c50de5102519bb

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: convert refcount btree cursor to use perags
Dave Chinner [Fri, 15 Oct 2021 20:28:26 +0000 (16:28 -0400)] 
xfs: convert refcount btree cursor to use perags

Source kernel commit: a81a06211fb43d80ee746e7a40a32ed812002f8e

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: convert rmap btree cursor to using a perag
Dave Chinner [Fri, 15 Oct 2021 20:28:26 +0000 (16:28 -0400)] 
xfs: convert rmap btree cursor to using a perag

Source kernel commit: fa9c3c197329fdab0efc48a8944d2c4a21c6a74f

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: add a perag to the btree cursor
Dave Chinner [Fri, 15 Oct 2021 20:28:26 +0000 (16:28 -0400)] 
xfs: add a perag to the btree cursor

Source kernel commit: be9fb17d88f08af648a89784d30dbac83d893154

Which will eventually completely replace the agno in it.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: pass perags around in fsmap data dev functions
Dave Chinner [Fri, 15 Oct 2021 20:28:25 +0000 (16:28 -0400)] 
xfs: pass perags around in fsmap data dev functions

Source kernel commit: 58d43a7e3263766ade4974c86118e6b5737ea259

Needs a [from, to] ranged AG walk, and the perag to be stuffed into
the info structure for callouts to use.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: push perags through the ag reservation callouts
Dave Chinner [Fri, 15 Oct 2021 20:28:25 +0000 (16:28 -0400)] 
xfs: push perags through the ag reservation callouts

Source kernel commit: 30933120ad79f4549d6e364df7eda474cc0d9c65

We currently pass an agno from the AG reservation functions to the
individual feature accounting functions, which in future may have to
do perag lookups to access per-AG state. Instead, pre-emptively
plumb the perag through from the highest AG reservation layer to the
feature callouts so they won't have to look it up again.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: pass perags through to the busy extent code
Dave Chinner [Fri, 15 Oct 2021 20:28:25 +0000 (16:28 -0400)] 
xfs: pass perags through to the busy extent code

Source kernel commit: 45d0662117565e6100f9e0cf356cd873542c95b1

All of the callers of the busy extent API either have perag
references available to use so we can pass a perag to the busy
extent functions rather than having them have to do unnecessary
lookups.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: convert secondary superblock walk to use perags
Dave Chinner [Fri, 15 Oct 2021 20:28:25 +0000 (16:28 -0400)] 
xfs: convert secondary superblock walk to use perags

Source kernel commit: 7f8d3b3ca6fe9269b3c5deee0dcea38499288e06

Clean up the last external manual AG walk.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: convert xfs_iwalk to use perag references
Dave Chinner [Fri, 15 Oct 2021 20:28:25 +0000 (16:28 -0400)] 
xfs: convert xfs_iwalk to use perag references

Source kernel commit: 6f4118fc6482b1989cdcb19a1a0ab53b2dca7ab9

Rather than manually walking the ags and passing agnunbers around,
pass the perag for the AG we are currently working on around in the
iwalk structure.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: convert raw ag walks to use for_each_perag
Dave Chinner [Fri, 15 Oct 2021 20:28:25 +0000 (16:28 -0400)] 
xfs: convert raw ag walks to use for_each_perag

Source kernel commit: 934933c3eec9e4a5826d3d7a47aca0742337fded

Convert the raw walks to an iterator, pulling the current AG out of
pag->pag_agno instead of the loop iterator variable.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: make for_each_perag... a first class citizen
Dave Chinner [Fri, 15 Oct 2021 20:28:25 +0000 (16:28 -0400)] 
xfs: make for_each_perag... a first class citizen

Source kernel commit: f250eedcf7621b9a56d563912b4eeacd524422c7

for_each_perag_tag() is defined in xfs_icache.c for local use.
Promote this to xfs_ag.h and define equivalent iteration functions
so that we can use them to iterate AGs instead to replace open coded
perag walks and perag lookups.

We also convert as many of the straight forward open coded AG walks
to use these iterators as possible. Anything that is not a direct
conversion to an iterator is ignored and will be updated in future

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: move perag structure and setup to libxfs/xfs_ag.[ch]
Dave Chinner [Fri, 15 Oct 2021 20:28:25 +0000 (16:28 -0400)] 
xfs: move perag structure and setup to libxfs/xfs_ag.[ch]

Source kernel commit: 07b6403a6873045344b0c18cbb4a4360854f6d76

Move the xfs_perag infrastructure to the libxfs files that contain
all the per AG infrastructure. This helps set up for passing perags
around all the code instead of bare agnos with minimal extra
includes for existing files.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: move xfs_perag_get/put to xfs_ag.[ch]
Dave Chinner [Fri, 15 Oct 2021 20:28:25 +0000 (16:28 -0400)] 
xfs: move xfs_perag_get/put to xfs_ag.[ch]

Source kernel commit: 9bbafc71919adfdf83fafd2ce909853b493e7d86

They are AG functions, not superblock functions, so move them to the
appropriate location.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: clean up open-coded fs block unit conversions
Darrick J. Wong [Fri, 15 Oct 2021 20:28:25 +0000 (16:28 -0400)] 
xfs: clean up open-coded fs block unit conversions

Source kernel commit: a7bcb147fef39054fe324a1a988470f5da127196

Replace some open-coded fs block unit conversions with the standard
conversion macro.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: Clean up xfs_attr_node_addname_clear_incomplete
Allison Henderson [Fri, 15 Oct 2021 20:28:25 +0000 (16:28 -0400)] 
xfs: Clean up xfs_attr_node_addname_clear_incomplete

Source kernel commit: 4fd084dbbd05402bb6e24782b8e9f9ea3e8ab3d6

We can use the helper function xfs_attr_node_remove_name to reduce
duplicate code in this function

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: Remove xfs_attr_rmtval_set
Allison Henderson [Fri, 15 Oct 2021 20:28:25 +0000 (16:28 -0400)] 
xfs: Remove xfs_attr_rmtval_set

Source kernel commit: 0e6acf29db6f463027d1ff7cea86a641da89f0d4

This function is no longer used, so it is safe to remove

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: Add delay ready attr set routines
Allison Henderson [Fri, 15 Oct 2021 20:28:25 +0000 (16:28 -0400)] 
xfs: Add delay ready attr set routines

Source kernel commit: 8f502a4009822a6972772ae65b34078645b3ba16

This patch modifies the attr set routines to be delay ready. This means
they no longer roll or commit transactions, but instead return -EAGAIN
to have the calling routine roll and refresh the transaction.  In this
series, xfs_attr_set_args has become xfs_attr_set_iter, which uses a
state machine like switch to keep track of where it was when EAGAIN was
returned. See xfs_attr.h for a more detailed diagram of the states.

Two new helper functions have been added: xfs_attr_rmtval_find_space and
xfs_attr_rmtval_set_blk.  They provide a subset of logic similar to
xfs_attr_rmtval_set, but they store the current block in the delay attr
context to allow the caller to roll the transaction between allocations.
This helps to simplify and consolidate code used by
xfs_attr_leaf_addname and xfs_attr_node_addname. xfs_attr_set_args has
now become a simple loop to refresh the transaction until the operation
is completed.  Lastly, xfs_attr_rmtval_remove is no longer used, and is
removed.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: Add delay ready attr remove routines
Allison Henderson [Fri, 15 Oct 2021 20:28:25 +0000 (16:28 -0400)] 
xfs: Add delay ready attr remove routines

Source kernel commit: 2b74b03c13c444cb5af56804cc975534e2058d06

This patch modifies the attr remove routines to be delay ready. This
means they no longer roll or commit transactions, but instead return
-EAGAIN to have the calling routine roll and refresh the transaction. In
this series, xfs_attr_remove_args is merged with
xfs_attr_node_removename become a new function, xfs_attr_remove_iter.
This new version uses a sort of state machine like switch to keep track
of where it was when EAGAIN was returned. A new version of
xfs_attr_remove_args consists of a simple loop to refresh the
transaction until the operation is completed. A new XFS_DAC_DEFER_FINISH
flag is used to finish the transaction where ever the existing code used
to.

Calls to xfs_attr_rmtval_remove are replaced with the delay ready
version __xfs_attr_rmtval_remove. We will rename
__xfs_attr_rmtval_remove back to xfs_attr_rmtval_remove when we are
done.

xfs_attr_rmtval_remove itself is still in use by the set routines (used
during a rename).  For reasons of preserving existing function, we
modify xfs_attr_rmtval_remove to call xfs_defer_finish when the flag is
set.  Similar to how xfs_attr_remove_args does here.  Once we transition
the set routines to be delay ready, xfs_attr_rmtval_remove is no longer
used and will be removed.

This patch also adds a new struct xfs_delattr_context, which we will use
to keep track of the current state of an attribute operation. The new
xfs_delattr_state enum is used to track various operations that are in
progress so that we know not to repeat them, and resume where we left
off before EAGAIN was returned to cycle out the transaction. Other
members take the place of local variables that need to retain their
values across multiple function calls.  See xfs_attr.h for a more
detailed diagram of the states.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: Hoist node transaction handling
Allison Henderson [Fri, 15 Oct 2021 20:28:25 +0000 (16:28 -0400)] 
xfs: Hoist node transaction handling

Source kernel commit: 3f562d092bb1edd39bfc0e6808d7108d47f8aa3a

This patch basically hoists the node transaction handling around the
leaf code we just hoisted.  This will helps setup this area for the
state machine since the goto is easily replaced with a state since it
ends with a transaction roll.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: Hoist xfs_attr_leaf_addname
Allison Henderson [Fri, 15 Oct 2021 20:28:24 +0000 (16:28 -0400)] 
xfs: Hoist xfs_attr_leaf_addname

Source kernel commit: 83c6e70789ff371c4eebc54f2c8d979305a1bae8

This patch hoists xfs_attr_leaf_addname into the calling function.  The
goal being to get all the code that will require state management into
the same scope. This isn't particularly aesthetic right away, but it is a
preliminary step to merging in the state machine code.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: Hoist xfs_attr_node_addname
Allison Henderson [Fri, 15 Oct 2021 20:28:24 +0000 (16:28 -0400)] 
xfs: Hoist xfs_attr_node_addname

Source kernel commit: 5d954cc09f6baed80458ea02ec092031608ea3fe

This patch hoists the later half of xfs_attr_node_addname into
the calling function.  We do this because it is this area that
will need the most state management, and we want to keep such
code in the same scope as much as possible

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: Add helper xfs_attr_node_addname_find_attr
Allison Henderson [Fri, 15 Oct 2021 20:28:24 +0000 (16:28 -0400)] 
xfs: Add helper xfs_attr_node_addname_find_attr

Source kernel commit: 6ca5a4a1f52952790a40099b79b5631d91163ba4

This patch separates the first half of xfs_attr_node_addname into a
helper function xfs_attr_node_addname_find_attr.  It also replaces the
restart goto with an EAGAIN return code driven by a loop in the calling
function.  This looks odd now, but will clean up nicly once we introduce
the state machine.  It will also enable hoisting the last state out of
xfs_attr_node_addname with out having to plumb in a "done" parameter to
know if we need to move to the next state or not.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: Separate xfs_attr_node_addname and xfs_attr_node_addname_clear_incomplete
Allison Henderson [Fri, 15 Oct 2021 20:28:24 +0000 (16:28 -0400)] 
xfs: Separate xfs_attr_node_addname and xfs_attr_node_addname_clear_incomplete

Source kernel commit: f0f7c502c728d0c6947219739631bad101f8737b

This patch separate xfs_attr_node_addname into two functions.  This will
help to make it easier to hoist parts of xfs_attr_node_addname that need
state management

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: Refactor xfs_attr_set_shortform
Allison Henderson [Fri, 15 Oct 2021 20:28:24 +0000 (16:28 -0400)] 
xfs: Refactor xfs_attr_set_shortform

Source kernel commit: 6286514b63e12d7bedc67e46aa1aeff9ed8378ce

This patch is actually the combination of patches from the previous
version (v18).  Initially patch 3 hoisted xfs_attr_set_shortform, and
the next added the helper xfs_attr_set_fmt. xfs_attr_set_fmt is similar
the old xfs_attr_set_shortform. It returns 0 when the attr has been set
and no further action is needed. It returns -EAGAIN when shortform has
been transformed to leaf, and the calling function should proceed the
set the attr in leaf form.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: Add xfs_attr_node_remove_name
Allison Henderson [Fri, 15 Oct 2021 20:28:24 +0000 (16:28 -0400)] 
xfs: Add xfs_attr_node_remove_name

Source kernel commit: a8490f699f6ec88843879b92cbb21953dab379ee

This patch pulls a new helper function xfs_attr_node_remove_name out
of xfs_attr_node_remove_step.  This helps to modularize
xfs_attr_node_remove_step which will help make the delayed attribute
code easier to follow

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: Reverse apply 72b97ea40d
Allison Henderson [Fri, 15 Oct 2021 20:28:24 +0000 (16:28 -0400)] 
xfs: Reverse apply 72b97ea40d

Source kernel commit: 4126c06e25b38842a254b2de6ffc3019a7b2f0ca

Originally we added this patch to help modularize the attr code in
preparation for delayed attributes and the state machine it requires.
However, later reviews found that this slightly alters the transaction
handling as the helper function is ambiguous as to whether the
transaction is diry or clean.  This may cause a dirty transaction to be
included in the next roll, where previously it had not.  To preserve the
existing code flow, we reverse apply this commit.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agolibxfs: fix crash on second attempt to initialize library
Darrick J. Wong [Fri, 15 Oct 2021 20:28:14 +0000 (16:28 -0400)] 
libxfs: fix crash on second attempt to initialize library

xfs_repair crashes when it tries to initialize the libxfs library but
the initialization fails because the fs is already mounted:

# xfs_repair /dev/sdd
xfs_repair: /dev/sdd contains a mounted filesystem
xfs_repair: urcu.c:553: urcu_memb_register_thread: Assertion `!URCU_TLS(rcu_reader).registered' failed.
Aborted

This is because libxfs_init() registers the main thread with liburcu,
but doesn't unregister the thread if libxfs library initialization
fails.  When repair sets more dangerous options and tries again, the
second initialization attempt causes liburcu to abort.  Fix this by
unregistering the thread with liburcu if libxfs initialization fails.

Observed by running xfs/284.

Fixes: e4da1b16 ("xfsprogs: introduce liburcu support")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: use xfs_buf_alloc_pages for uncached buffers
Dave Chinner [Thu, 14 Oct 2021 16:35:45 +0000 (12:35 -0400)] 
xfs: use xfs_buf_alloc_pages for uncached buffers

Source kernel commit: 07b5c5add42a0afccf79401b12d78043ed6b8240

Use the newly factored out page allocation code. This adds
automatic buffer zeroing for non-read uncached buffers.

This also allows us to greatly simply the error handling in
xfs_buf_get_uncached(). Because xfs_buf_alloc_pages() cleans up
partial allocation failure, we can just call xfs_buf_free() in all
error cases now to clean up after failures.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agolibxfs: fix call_rcu crash when unmounting the fake mount in mkfs
Darrick J. Wong [Thu, 14 Oct 2021 16:35:43 +0000 (12:35 -0400)] 
libxfs: fix call_rcu crash when unmounting the fake mount in mkfs

In commit a6fb6abe, we simplified the process by which mkfs.xfs computes
the minimum log size calculation by creating a dummy xfs_mount with the
draft superblock image, using the dummy to compute the log geometry, and
then unmounting the dummy.

Note that creating a dummy mount with no data device is supported by
libxfs, though with the caveat that we don't set up any perag structures
at all.  Up until this point this has worked perfectly well since free()
(and hence kmem_free()) are perfectly happy to ignore NULL pointers.

Unfortunately, this will cause problems with the upcoming patch to shift
per-AG setup and teardown to libxfs because call_rcu in the liburcu
library actually tries to access the rcu_head of the passed-in perag
structure, but they're all NULL in the dummy mount case.  IOWs,
xfs_free_perag requires that every AG have a per-AG structure, and it's
too late to change the 5.14 kernel libxfs now, so work around this by
altering libxfs_mount to remember when it has initialized the perag
structures and libxfs_umount to skip freeing them when the flag isn't
set.

Just to be clear: This fault has no user-visible consequences right now;
it's a fixup to avoid problems in the libxfs sync series for 5.14.

Fixes: a6fb6abe ("mkfs: simplify minimum log size calculation")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agomisc: convert utilities to use "fallthrough;"
Darrick J. Wong [Fri, 1 Oct 2021 20:04:54 +0000 (16:04 -0400)] 
misc: convert utilities to use "fallthrough;"

Now that we have a macro to virtualize switch statement fallthroughs for
lazy compiler linters, we might as well spread it elsewhere.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfs: Fix fall-through warnings for Clang
Gustavo A. R. Silva [Fri, 1 Oct 2021 20:03:54 +0000 (16:03 -0400)] 
xfs: Fix fall-through warnings for Clang

Source kernel commit: 53004ee78d6273c994534ccf79d993098ac89769

In preparation to enable -Wimplicit-fallthrough for Clang, fix
the following warnings by replacing /* fall through */ comments,
and its variants, with the new pseudo-keyword macro fallthrough:

fs/xfs/libxfs/xfs_alloc.c:3167:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/libxfs/xfs_da_btree.c:286:3: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/libxfs/xfs_ag_resv.c:346:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/libxfs/xfs_ag_resv.c:388:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/xfs_bmap_util.c:246:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/xfs_export.c:88:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/xfs_export.c:96:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/xfs_file.c:867:3: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/xfs_ioctl.c:562:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/xfs_ioctl.c:1548:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/xfs_iomap.c:1040:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/xfs_inode.c:852:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/xfs_log.c:2627:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/xfs_trans_buf.c:298:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/scrub/bmap.c:275:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/scrub/btree.c:48:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/scrub/common.c:85:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/scrub/common.c:138:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/scrub/common.c:698:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/scrub/dabtree.c:51:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/scrub/repair.c:951:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
fs/xfs/scrub/agheader.c:89:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]

Notice that Clang doesn't recognize /* fall through */ comments as
implicit fall-through markings, so in order to globally enable
-Wimplicit-fallthrough for Clang, these comments need to be
replaced with fallthrough; in the whole codebase.

Link: https://github.com/KSPP/linux/issues/115
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agolibxfs: fix whitespace inconsistencies with kernel
Darrick J. Wong [Fri, 1 Oct 2021 19:15:19 +0000 (15:15 -0400)] 
libxfs: fix whitespace inconsistencies with kernel

Fix a few places where the whitespace isn't an exact match for the
kernel.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agolibxfs: port xfs_set_inode_alloc from the kernel
Darrick J. Wong [Fri, 1 Oct 2021 18:00:34 +0000 (14:00 -0400)] 
libxfs: port xfs_set_inode_alloc from the kernel

To prepare to perag initialization code move to libxfs, port the
xfs_set_inode_alloc function from the kernel and make
libxfs_initialize_perag use it.  The code isn't 1:1 identical, but
AFAICT it behaves the same way.  In a future kernel release we'll
move the function into xfs_ag.c and update xfsprogs.

(sandeen: Note that this is in effect simply syncing up several
kernel commits in this code which was not shared via libxfs.)

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agolibfrog: move topology.[ch] to libxfs
Darrick J. Wong [Wed, 29 Sep 2021 20:57:10 +0000 (16:57 -0400)] 
libfrog: move topology.[ch] to libxfs

The topology code depends on a few libxfs structures and is only needed
by mkfs and xfs_repair.  Move this code to libxfs to reduce the size of
libfrog and to avoid build failures caused by "xfs: move perag structure
and setup to libxfs/xfs_ag.[ch]".

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agomkfs: move mkfs/proto.c declarations to mkfs/proto.h
Darrick J. Wong [Wed, 29 Sep 2021 20:57:05 +0000 (16:57 -0400)] 
mkfs: move mkfs/proto.c declarations to mkfs/proto.h

These functions are only used by mkfs, so move them to a separate header
file that isn't in an internal library.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
[sandeen: cosmetic tidyups as suggested by Christoph]
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoatomic: convert to uatomic
Dave Chinner [Wed, 29 Sep 2021 20:57:02 +0000 (16:57 -0400)] 
atomic: convert to uatomic

Now we have liburcu, we can make use of it's atomic variable
implementation. It is almost identical to the kernel API - it's just
got a "uatomic" prefix. liburcu also provides all the same aomtic
variable memory barriers as the kernel, so if we pull memory barrier
dependent kernel code across, it will just work with the right
barrier wrappers.

This is preparation the addition of more extensive atomic operations
the that kernel buffer cache requires to function correctly.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
[chandan.babu@oracle.com: Swap order of arguments provided to atomic[64]_[add|sub]()]
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
[sandeen: fix whitespace]
[sandeen: make #defines a little more consistent about (a,v)]
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agolibxfs: add spinlock_t wrapper
Dave Chinner [Wed, 29 Sep 2021 20:48:29 +0000 (16:48 -0400)] 
libxfs: add spinlock_t wrapper

These provide the kernel spinlock_t interface, but are *not*
spinlocks. Spinlocks cannot be used by general purpose userspace
processes due to the fact they cannot control task preemption and
scheduling reliability. Hence these are implemented as a
pthread_mutex_t, similar to the way the kernel RT build implements
spinlock_t as a kernel mutex.

Because the current libxfs spinlock "implementation" just makes
spinlocks go away, we have to also add initialisation to spinlocks
that libxfs uses that are missing from the userspace implementation.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
[chandan.babu@oracle.com: Initialize inode log item spin lock]
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
[sandeen: fix minor whitespace]
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfsprogs: introduce liburcu support
Dave Chinner [Wed, 29 Sep 2021 20:48:03 +0000 (16:48 -0400)] 
xfsprogs: introduce liburcu support

The upcoming buffer cache rework/kerenl sync-up requires atomic
variables. I could use C++11 atomics build into GCC, but they are a
pain to work with and shoe-horn into the kernel atomic variable API.

Much easier is to introduce a dependency on liburcu - the userspace
RCU library. This provides atomic variables that very closely match
the kernel atomic variable API, and it provides a very similar
memory model and memory barrier support to the kernel. And we get
RCU support that has an identical interface to the kernel and works
the same way.

Hence kernel code written with RCU algorithms and atomic variables
will just slot straight into the userspace xfsprogs code without us
having to think about whether the lockless algorithms will work in
userspace or not. This reduces glue and hoop jumping, and gets us
a step closer to having the entire userspace libxfs code MT safe.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
[chandan.babu@oracle.com: Add m4 macros to detect availability of liburcu]
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
[sandeen: use dchinner's m4 macros]
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
3 years agoxfsprogs: Release v5.13.0 v5.13.0
Eric Sandeen [Fri, 20 Aug 2021 16:03:57 +0000 (12:03 -0400)] 
xfsprogs: Release v5.13.0

Update all the necessary files for a 5.13.0 release.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfsprogs: Release v5.13.0-rc1 v5.13.0-rc1
Eric Sandeen [Mon, 2 Aug 2021 21:31:28 +0000 (17:31 -0400)] 
xfsprogs: Release v5.13.0-rc1

Update all the necessary files for a 5.13.0-rc1 release.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs_quota: allow users to truncate group and project quota files
Darrick J. Wong [Wed, 28 Jul 2021 23:05:51 +0000 (19:05 -0400)] 
xfs_quota: allow users to truncate group and project quota files

In commit 79ac1ae4, I /think/ xfsprogs gained the ability to deal with
project or group quotas.  For some reason, the quota remove command was
structured so that if the user passes both -g and -p, it will only ask
the kernel truncate the group quota file.  This is a strange behavior
since -ug results in truncation requests for both user and group quota
files, and the kernel is smart enough to return 0 if asked to truncate a
quota file that doesn't exist.

In other words, this is a seemingly arbitrary limitation of the command.
It's an unexpected behavior since we don't do any sort of parameter
validation to warn users when -p is silently ignored.  Modern V5
filesystems support both group and project quotas, so it's all the more
surprising that you can't do group and project all at once.  Remove this
pointless restriction.

Found while triaging xfs/007 regressions.

Fixes: 79ac1ae4 ("Fix xfs_quota disable, enable, off and remove commands Merge of master-melb:xfs-cmds:29395a by kenmcd.")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs_repair: invalidate dirhash entry when junking dirent
Darrick J. Wong [Wed, 28 Jul 2021 23:05:15 +0000 (19:05 -0400)] 
xfs_repair: invalidate dirhash entry when junking dirent

In longform_dir2_entry_check_data, we add the directory entries we find
to the incore dirent hash table after we've validated the name but
before we're totally done checking the entry.  This sequence is
necessary to detect all duplicated names in the directory.

Unfortunately, if we later decide to junk the ondisk dirent, we neglect
to mark the dirhash entry, so if the directory gets rebuilt, it will get
rebuilt with the entry that we rejected.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs_io: allow callers to dump fs stats individually
Darrick J. Wong [Wed, 28 Jul 2021 23:03:18 +0000 (19:03 -0400)] 
xfs_io: allow callers to dump fs stats individually

Enable callers to decide if they want to see statfs, fscounts, or
geometry information (or any combination) from the xfs_io statfs
command.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agomkfs: validate rt extent size hint when rtinherit is set
Darrick J. Wong [Wed, 28 Jul 2021 23:03:16 +0000 (19:03 -0400)] 
mkfs: validate rt extent size hint when rtinherit is set

Extent size hints exist to nudge the behavior of the file data block
allocator towards trying to make aligned allocations.  Therefore, it
doesn't make sense to allow a hint that isn't a multiple of the
fundamental allocation unit for a given file.

This means that if the sysadmin is formatting with rtinherit set on the
root dir, validate_extsize_hint needs to check the hint value on a
simulated realtime file to make sure that it's correct.  Unfortunately,
the gate check here was for a nonzero rt extent size, which is wrong
since we never format with rtextsize==0.  This leads to absurd failures
such as:

# mkfs.xfs -f /dev/sdf -r extsize=7b -d rtinherit=0,extszinherit=13
illegal extent size hint 13, must be less than 649088 and a multiple of 7.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs_repair: validate alignment of inherited rt extent hints
Darrick J. Wong [Wed, 28 Jul 2021 23:03:12 +0000 (19:03 -0400)] 
xfs_repair: validate alignment of inherited rt extent hints

If we encounter a directory that has been configured to pass on an
extent size hint to a new realtime file and the hint isn't an integer
multiple of the rt extent size, we should turn off the hint because that
is a misconfiguration.  Old kernels didn't check for this when copying
attributes into new files and would crash in the rt allocator.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
[sandeen: clarify that it is an extent size *hint*]
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs_io: don't count fsmaps before querying fsmaps
Darrick J. Wong [Wed, 28 Jul 2021 23:01:23 +0000 (19:01 -0400)] 
xfs_io: don't count fsmaps before querying fsmaps

There's a bunch of code in fsmap.c that tries to count the GETFSMAP
records so that it can size the fsmap array appropriately for the
GETFSMAP call.  It's pointless to iterate the entire result set /twice/
(unlike the bmap command where the extent count is actually stored in
the fs metadata), so get rid of the duplicate walk.

In other words: Iterate over the records using the default chunk size
instead of doing one call to find the size and doing a giant allocation
and GETFSMAP call.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs_io: only print the header once when dumping fsmap in csv format
Darrick J. Wong [Wed, 28 Jul 2021 23:01:23 +0000 (19:01 -0400)] 
xfs_io: only print the header once when dumping fsmap in csv format

Only print the column names once when we're dumping fsmap information in
csv format.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs_io: clean up the funshare command a bit
Darrick J. Wong [Wed, 28 Jul 2021 23:01:23 +0000 (19:01 -0400)] 
xfs_io: clean up the funshare command a bit

Add proper argument parsing to the funshare command so that when you
pass it nonexistent --help it will print the help instead of complaining
that it can't convert that to a number.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
4 years agoxfs_io: fix broken funshare_cmd usage
Darrick J. Wong [Wed, 28 Jul 2021 23:01:23 +0000 (19:01 -0400)] 
xfs_io: fix broken funshare_cmd usage

Create a funshare_cmd and use that to store information about the
xfs_io funshare command instead of overwriting the contents of
fzero_cmd.  This fixes confusing output like:

$ xfs_io -c 'fzero 2 3 --help' /
fzero: invalid option -- '-'
funshare off len -- unshares shared blocks within the range

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>