Darrick J. Wong [Mon, 9 Sep 2019 19:37:07 +0000 (15:37 -0400)]
libfrog: move workqueue.h to libfrog/
Move this header to libfrog since the code is there already.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:07 +0000 (15:37 -0400)]
libfrog: move path.h to libfrog/
Move this header to libfrog since the code is there already.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:07 +0000 (15:37 -0400)]
libfrog: move crc32c.h to libfrog/
Move this header to libfrog since the code is there already.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:06 +0000 (15:37 -0400)]
libfrog: move workqueue.h to libfrog/
Move this header to libfrog since the code is there already.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:06 +0000 (15:37 -0400)]
libfrog: move radix-tree.h to libfrog/
Move this header to libfrog since the code is there already.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:06 +0000 (15:37 -0400)]
libfrog: move ptvar.h to libfrog/
Move this header to libfrog since the code is there already.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:06 +0000 (15:37 -0400)]
libfrog: move fsgeom.h to libfrog/
Move this header to libfrog since the code is there already.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:06 +0000 (15:37 -0400)]
libfrog: move convert.h to libfrog/
Move this header to libfrog since the code is there already.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:06 +0000 (15:37 -0400)]
libfrog: move bitmap.h to libfrog/
Move this header to libfrog since the code is there already.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:06 +0000 (15:37 -0400)]
libfrog: move avl64.h to libfrog/
Move this header to libfrog since the code is there already.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:05 +0000 (15:37 -0400)]
libxfs: move topology declarations into separate header
The topology functions live in libfrog now, which means their
declarations don't belong in libxcmd.h. Create new header file for
them.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:05 +0000 (15:37 -0400)]
libfrog: refactor open-coded INUMBERS calls
Refactor all the INUMBERS ioctl callsites into helper functions.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:05 +0000 (15:37 -0400)]
libfrog: create xfd_open function
Create a helper to open a file and initialize the xfd structure.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:05 +0000 (15:37 -0400)]
libfrog: refactor open-coded bulkstat calls
Refactor the BULKSTAT_SINGLE and BULKSTAT ioctl callsites into helper
functions.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:05 +0000 (15:37 -0400)]
libfrog: create online fs geometry converters
Create helper functions to perform unit conversions against a runtime
filesystem, then remove the open-coded versions in scrub.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:05 +0000 (15:37 -0400)]
libfrog: store more inode and block geometry in struct xfs_fd
Move the extra AG geometry fields out of scrub and into the libfrog code
so that we can consolidate the geoemtry code in one place.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:04 +0000 (15:37 -0400)]
libfrog: introduce xfs_fd to wrap an fd to a file on an xfs filesystem
Introduce a new "xfs_fd" context structure where we can store a file
descriptor and all the runtime fs context (geometry, which ioctls work,
etc.) that goes with it. We're going to create wrappers for the
bulkstat and inumbers ioctls in subsequent patches; and when we
introduce the v5 bulkstat/inumbers ioctls we'll need all that context to
downgrade gracefully on old kernels. Start the transition by adopting
xfs_fd natively in scrub.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:36:57 +0000 (15:36 -0400)]
libfrog: refactor online geometry queries
Refactor all the open-coded XFS_IOC_FSGEOMETRY queries into a single
helper that we can use to standardize behaviors across mixed xfslibs
versions. This is the prelude to introducing a new FSGEOMETRY version
in 5.2 and needing to fix the (relatively few) client programs.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
[sandeen: remove 5.2.1 compat hack now that we have the wrapper] Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Fri, 6 Sep 2019 21:59:13 +0000 (17:59 -0400)]
xfsprogs: update spdx tags in LICENSES/
Update the GPL related SPDX tags in LICENSES to reflect the SPDX 3.0
tagging formats.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
The "obvious" cause is that the attr ifork is null despite the inode
claiming an attr fork having at least one extent, but it's not so
obvious why we ended up with an inode in that state.
Reported-by: Zorro Lang <zlang@redhat.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204031 Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Bill O'Donnell <billodo@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Continue our game of replacing ASSERTs for corrupt ondisk metadata with
EFSCORRUPTED returns.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Bill O'Donnell <billodo@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
When we're iterating all the attributes using the built-in xattr
iterator, we can use the seen_enough variable to pass error codes back
to the main scrub function instead of flattening them into 0/1. This
will be used in a more exciting fashion in upcoming patches.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Create a new bulk ireq flag that enables userspace to ask us for a
special inode number instead of interpreting @ino as a literal inode
number. This enables us to query the root inode easily.
The reason for adding the ability to query specifically the root
directory inode is that certain programs (xfsdump and xfsrestore) want
to confirm when they've been pointed to the root directory. The
userspace code assumes the root directory is always the first result
from calling bulkstat with lastino == 0, but this isn't true if the
(initial btree roots + initial AGFL + inode alignment padding) is itself
long enough to be allocated to new inodes if all of those blocks should
happen to be free at the same time. Rather than make userspace guess
at internal filesystem state, we provide a direct query.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Add a new xfs_bulk_ireq flag to constrain the iteration to a single AG.
If the passed-in startino value is zero then we start with the first
inode in the AG that the user passes in; otherwise, we iterate only
within the same AG as the passed-in inode.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Introduce a new "v5" inode group structure that fixes the alignment
and padding problems of the existing structure.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Introduce a new version of the in-core bulkstat structure that supports
our new v5 format features. This structure also fills the gaps in the
previous structure. We leave wiring up the ioctls for the next patch.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Remove xfs_bstat_t, xfs_fsop_bulkreq_t, xfs_inogrp_t, and similarly
named compat typedefs.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Create a new iterator function to simplify walking inodes in an XFS
filesystem. This new iterator will replace the existing open-coded
walking that goes on in various places.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Currently, xfs doesn't have generic error codes defined for "stop
iterating"; we just reuse the XFS_BTREE_QUERY_* return values. This
looks a little weird if we're not actually iterating a btree index.
Before we start adding more iterators, we should create general
XFS_ITER_{CONTINUE,ABORT} return values and define the XFS_BTREE_QUERY_*
ones from that.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Instead of a magic flag for xfs_trans_alloc, just ensure all callers
that can't relclaim through the file system use memalloc_nofs_save to
set the per-task nofs flag.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
There are many, many xfs header files which are included but
unneeded (or included twice) in the xfs code, so remove them.
nb: xfs_linux.h includes about 9 headers for everyone, so those
explicit includes get removed by this. I'm not sure what the
preference is, but if we wanted explicit includes everywhere,
a followup patch could remove those xfs_*.h includes from
xfs_linux.h and move them into the files that need them.
Or it could be left as-is.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
When we're writing out a fresh new AG, make sure that we don't list an
internal log as free and that we create the rmap for the region. growfs
never does this, but we will need it when we hook up mkfs.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Refactor the code that populates the free space btrees of a new AG so
that we can avoid code duplication once things start getting
complicated.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
xfs_alloc_ag_vextent_small() doesn't update the output parameters in
the event of an AGFL allocation. Instead, it updates the
xfs_alloc_arg structure directly to complete the allocation.
Update both args and the output params to provide consistent
behavior for future callers.
Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
The small allocation helper is implemented in a way that is fairly
tightly integrated to the existing allocation algorithms. It expects
a cntbt cursor beyond the end of the tree, attempts to locate the
last record in the tree and only attempts an AGFL allocation if the
cntbt is empty.
The upcoming generic algorithm doesn't rely on the cntbt processing
of this function. It will only call this function when the cntbt
doesn't have a big enough extent or is empty and thus AGFL
allocation is the only remaining option. Tweak
xfs_alloc_ag_vextent_small() to handle a NULL cntbt cursor and skip
the cntbt logic. This facilitates use by the existing allocation
code and new code that only requires an AGFL allocation attempt.
Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Move the small allocation helper further up in the file to avoid the
need for a function declaration. The remaining declarations will be
removed by followup patches. No functional changes.
Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
xfs_alloc_ag_vextent_small() is kind of a mess. Clean it up in
preparation for future changes. No functional changes.
Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
We need to derive the mount pointer from a buffer in a lot of place.
Add a direct pointer to short cut the pointer chasing.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
The inode geometry structure isn't related to ondisk format; it's
support for the mount structure. Move it to xfs_shared.h.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
There are several functions which take a flag argument that is
only ever passed as "0," so remove these arguments.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Bill O'Donnell <billodo@redhat.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
The flags value is always passed as 0 so remove the argument.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Finish converting all the old inode_cluster_size >> inopblog users to
inodes_per_cluster.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
inode_cluster_size is supposed to represent the size (in bytes) of an
inode cluster buffer. We avoid having to handle multiple clusters per
filesystem block on filesystems with large blocks by openly rounding
this value up to 1 FSB when necessary. However, we never reset
inode_cluster_size to reflect this new rounded value, which adds to the
potential for mistakes in calculating geometries.
Fix this by setting inode_cluster_size to reflect the rounded-up size if
needed, and special-case the few places in the sparse inodes code where
we actually need the smaller value to validate on-disk metadata.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Migrate all of the inode geometry setup code from xfs_mount.c into a
single libxfs function that we can share with xfsprogs.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Separate the inode geometry information into a distinct structure.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Eric Sandeen [Wed, 21 Aug 2019 15:41:03 +0000 (11:41 -0400)]
xfsprogs: fix geometry calls on older kernels for 5.2.1
I didn't think 5.2.0 through; the udpate of the geometry ioctl means
that the tools won't work on older kernels that don't support the
v5 ioctls, since I failed to merge Darrick's wrappers.
As a very quick one-off I'd like to merge this to just revert every
geometry call back to the original ioctl, so it keeps working on
older kernels and I'll release 5.2.1. This hack can go away when
Darrick's wrappers get merged.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
It turns out that the log can consume nearly all the space in an AG, and
when this happens this it's possible that there will be less free space
in the AG than the reservation would try to hide. On a debug kernel
this can trigger an ASSERT in xfs/250:
The log is permanently allocated, so we know we're never going to have
to expand the btrees to hold any records associated with the log space.
We therefore can treat the space as if it doesn't exist.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
There are several functions which have no opportunity to return
an error, and don't contain any ASSERTs which could be argued
to be better constructed as error cases. So, make them voids
to simplify the callers.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Teach online scrub how to check the filesystem summary counters. We use
the incore delalloc block counter along with the incore AG headers to
compute expected values for fdblocks, icount, and ifree, and then check
that the percpu counter is within a certain threshold of the expected
value. This is done to avoid having to freeze or otherwise lock the
filesystem, which means that we're only checking that the counters are
fairly close, not that they're exactly correct.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
During testing of xfs/141 on a V4 filesystem, I observed some
inconsistent behavior with regards to resources that are held (i.e.
remain locked) across a defer roll. The transaction roll always gives
the defer roll function a new transaction, even if committing the old
transaction fails. However, the defer roll function only rejoins the
held resources if the transaction commit succeedied. This means that
callers of defer roll have to figure out whether the held resources are
attached to the transaction being passed back.
Worse yet, if the defer roll was part of a defer finish call, we have a
third possibility: the defer finish could pass back a dirty transaction
with dirty held resources and an error code.
The only sane way to handle all of these scenarios is to require that
the code that held the resource either cancel the transaction before
unlocking and releasing the resources, or use functions that detach
resources from a transaction properly (e.g. xfs_trans_brelse) if they
need to drop the reference before committing or cancelling the
transaction.
In order to make this so, change the defer roll code to join held
resources to the new transaction unconditionally and fix all the bhold
callers to release the held buffers correctly.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Add a percpu counter to track the number of blocks directly reserved for
delayed allocations on the data device. This counter (in contrast to
i_delayed_blks) does not track allocated CoW staging extents or anything
going on with the realtime device. It will be used in the upcoming
summary counter scrub function to check the free block counts without
having to freeze the filesystem or walk all the inodes to find the
delayed allocations.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Block allocation requires a permanent transaction for deferred AGFL
frees. Add an assert in the block allocation path to make explicit and
obvious to future callers the requirement of a transaction with a
permanent reservation.
Reported-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
[darrick: split this out from the previous patch per hch request] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
The growdata transaction is used by growfs operations to increase
the data size of the filesystem. Part of this sequence involves
extending the size of the last preexisting AG in the fs, if
necessary. This is implemented by freeing the newly available
physical range to the AG.
tr_growdata is not a permanent transaction, however, and block
allocation transactions must be permanent to handle deferred frees
of AGFL blocks. If the grow operation extends an existing AG that
requires AGFL fixing, assert failures occur due to a populated dfops
list on a non-permanent transaction and the AGFL free does not
occur. This is reproduced (rarely) by xfs/104.
Change tr_growdata to a permanent transaction with a default log
count. This increases initial transaction reservation size, but
growfs is an infrequent and non-performance critical operation and
so should have minimal impact.
Reported-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
[darrick: add a comment to the assert] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Use space in the bulkstat ioctl structure to report any problems
observed with the inode.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Use the AG geometry info ioctl to report health status too.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Use our newly expanded geometry structure to report the overall fs and
realtime health status.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Add a new ioctl to describe an allocation group's geometry.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Unfortunately, the V4 XFS_IOC_FSGEOMETRY structure is out of space so we
can't just add a new field to it. Hence we need to bump the definition
to V5 and and treat the V4 ioctl and structure similar to v1 to v3.
While doing this, clean up all the definitions associated with the
XFS_IOC_FSGEOMETRY ioctl.
Signed-Off-By: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
[darrick: forward port to 5.1, expand structure size to 256 bytes] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
If we know the filesystem metadata isn't healthy during unmount, we want
to encourage the administrator to run xfs_repair right away. We can't
do this if BAD_SUMMARY will cause an unclean log unmount to force
summary recalculation, so turn it off if the fs is bad.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Replace the BAD_SUMMARY mount flag with calls to the equivalent health
tracking code.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Add the necessary in-core metadata fields to keep track of which parts
of the filesystem have been observed and which parts were observed to be
unhealthy, and print a warning at unmount time if we have unfixed
problems.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
The block allocation AG selection code has parameters that allow a
caller to perform multiple allocations from a single AG and
transaction (under certain conditions). The parameters specify the
total block allocation count required by the transaction and the AG
selection code selects and locks an AG that will be able to satisfy
the overall requirement. If the available block accounting
calculation turns out to be inaccurate and a subsequent allocation
call fails with -ENOSPC, the resulting transaction cancel leads to
filesystem shutdown because the transaction is dirty.
This exact problem can be reproduced with a highly parallel space
consumer and fsstress workload running long enough to a large
filesystem against -ENOSPC conditions. A bmbt block allocation
request made for inode extent to bmap format conversion after an
extent allocation is expected to be satisfied by the same AG and the
same transaction as the extent allocation. The bmbt block allocation
fails, however, because the block availability of the AG has changed
since the AG was selected (outside of the blocks used for the extent
itself).
The inconsistent block availability calculation is caused by the
deferred block freeing behavior of the AGFL. This immediately
removes extra blocks from the AGFL to free up AGFL slots, but rather
than immediately freeing such blocks as was done in the past, the
block free is deferred such that said blocks are not available for
allocation until the current transaction commits. The AG selection
logic currently considers all AGFL blocks as available and executes
shortly before any extra AGFL blocks are freed. This means the block
availability of the current AG can change before the first
allocation even occurs, but in practice a failure is more likely to
manifest via a subsequent allocation because extent allocation
usually has a contiguity requirement larger than a single block that
can't be satisfied from the AGFL.
In general, XFS prefers operational robustness to absolute
allocation efficiency. In other words, we prefer to return -ENOSPC
slightly earlier at the expense of not being able to allocate every
last block in an AG to avoid this kind of problem. As such, update
the AG block availability calculation to consider extra AGFL blocks
as unavailable since they are immediately removed following the
calculation and will not become available until the current
transaction commits.
Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
mkfs.xfs.8: Fix an inconsistency between the code and the man page.
The man page currently states that block and sector size units cannot
be used for other option values unless they are explicitly specified,
when in fact the default sizes will be used in that case.
Change the man page to clarify this.
Signed-off-by: Alvin Zheng <Alvin@linux.alibaba.com>
[sandeen: sector/block values do not need to be specified first] Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Wed, 10 Jul 2019 21:16:36 +0000 (17:16 -0400)]
man: create a separate xfs shutdown ioctl manpage
Create a separate manual page for the xfs shutdown ioctl so we can
document how it works.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
[sandeen: minor edits] Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Wed, 10 Jul 2019 21:16:36 +0000 (17:16 -0400)]
man: create a separate GETBMAPX/GETBMAPA/GETBMAP ioctl manpage
Create a separate manual page for the xfs BMAP ioctls so we can document
how they work.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
[sandeen: minor edits] Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Wed, 10 Jul 2019 21:16:36 +0000 (17:16 -0400)]
man: create a separate RESBLKS ioctl manpage
Create a separate manual page for the xfs RESBLKS ioctls so we can
document how it works.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
[sandeen: minor edits] Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Wed, 10 Jul 2019 21:16:36 +0000 (17:16 -0400)]
man: create a separate FSCOUNTS ioctl manpage
Create a separate manual page for the xfs FSCOUNTS ioctl so we can
document how it works.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Wed, 10 Jul 2019 21:16:36 +0000 (17:16 -0400)]
man: create a separate INUMBERS ioctl manpage
Create a separate manual page for the xfs INUMBERS ioctl so we can
document how it works.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Wed, 10 Jul 2019 21:16:36 +0000 (17:16 -0400)]
man: link to the SCRUB_METADATA ioctl manpage from xfsctl.3
Link to the scrub ioctl documentation from xfsctl.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Wed, 10 Jul 2019 21:16:36 +0000 (17:16 -0400)]
man: create a separate FSBULKSTAT ioctl manpage
Break out the xfs bulkstat ioctl into a separate manpage so that we can
document how it works.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
[sandeen: minor edits] Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Wed, 10 Jul 2019 21:16:36 +0000 (17:16 -0400)]
man: create a separate GEOMETRY ioctl manpage
Break out the xfs geometry ioctl into a separate manpage so that we can
document how it works.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
[sandeen: minor edits] Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Wed, 10 Jul 2019 21:16:32 +0000 (17:16 -0400)]
man: create a separate GETXATTR/SETXATTR ioctl manpage
Break out the xfs file attribute get and set ioctls into a separate
manpage to reduce clutter in xfsctl.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
[sandeen: minor edits] Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
While investigating another mkfs bug, noticed that cfg->lsunit is sometimes
left uninitialized when it should not. This is because calc_stripe_factors
in some cases needs cfg->loginternal to be set first. This is done in
validate_logdev. So move calc_stripe_factors below validate_logdev while
parsing configs.
Signed-off-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Eric Sandeen [Wed, 10 Jul 2019 15:37:25 +0000 (11:37 -0400)]
xfs_io: reorganize source file handling in copy_range
Rename and rearrange some of the vars related to using an open
file number as the source file, so that we don't temporarily
store a non-fd number in a var called "fd," and do the fd
assignment in a consistent code location.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Amir Goldstein [Wed, 10 Jul 2019 15:36:25 +0000 (11:36 -0400)]
xfs_io: allow passing an open file to copy_range
Commit 1a05efba ("io: open pipes in non-blocking mode")
addressed a specific copy_range issue with pipes by always opening
pipes in non-blocking mode.
This change takes a different approach and allows passing any
open file as the source file to copy_range. Besides providing
more flexibility to the copy_range command, this allows xfstests
to check if xfs_io supports passing an open file to copy_range.
Signed-off-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Yang Xu [Wed, 10 Jul 2019 15:35:25 +0000 (11:35 -0400)]
mkfs: remove useless log options in usage
Since commit 2cf637cf(mkfs: remove logarithm based CLI options),
xfsprogs has discarded log options in node_options, remove it in usage.
Fixes: 2cf637cf ("mkfs: remove logarithm based CLI options") Signed-off-by: Yang Xu <xuyang2018.jy@cn.fujitsu.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Eric Sandeen [Wed, 10 Jul 2019 15:35:07 +0000 (11:35 -0400)]
mkfs: don't use xfs_verify_fsbno() before m_sb is fully set up
Commit 8da5298 mkfs: validate start and end of aligned logs stopped
open-coding log end block checks, and used xfs_verify_fsbno() instead.
It also used xfs_verify_fsbno() to validate the log start. This
seemed to make sense, but then xfs/306 started failing on 4k sector
filesystems, which leads to a log striep unite being set on a single
AG filesystem.
As it turns out, if xfs_verify_fsbno() is testing a block in the
last AG, it needs to have mp->m_sb.sb_dblocks set, which isn't done
until later. With sb_dblocks unset we can't know how many blocks
are in the last AG, and hence can't validate it.
To fix all this, go back to open-coding the checks; note that this
/does/ rely on m_sb.sb_agblklog being set, but that /is/ already
done in the early call to start_superblock_setup().
Fixes: 8da5298 ("mkfs: validate start and end of aligned logs") Reported-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Eric Sandeen [Tue, 25 Jun 2019 21:04:42 +0000 (17:04 -0400)]
xfs_quota: fix built-in help for project setup
-s is used to set up a new project, not -c.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Tue, 25 Jun 2019 21:04:42 +0000 (17:04 -0400)]
xfs_io: repair_f should use its own name
If the repair command fails, it should tag the error message with its
own name ("repair").
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Tue, 25 Jun 2019 21:04:42 +0000 (17:04 -0400)]
mkfs: validate start and end of aligned logs
Validate that the start and end of the log stay within a single AG if
we adjust either end to align to stripe units.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Tue, 25 Jun 2019 21:04:42 +0000 (17:04 -0400)]
libfrog: cvt_u64 should use strtoull, not strtoll
cvt_u64 converts a string to an unsigned 64-bit number, so it should use
strtoull, not strtoll because we don't want negative numbers here.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Tue, 25 Jun 2019 21:04:42 +0000 (17:04 -0400)]
libfrog: don't set negative errno in conversion functions
Don't set errno to a negative value when we're converting integers.
That's a kernel thing; this is userspace.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Amir Goldstein [Tue, 25 Jun 2019 21:04:42 +0000 (17:04 -0400)]
xfs_info: limit findmnt to find mounted xfs filesystems
When running xfstests with -overlay, the xfs mount point
(a.k.a $OVL_BASE_SCRATCH_MNT) is used as the $SCRATCH_DEV argument
to the overlay mount, like this:
Ever since commit bbb43745, when xfs_info started using findmnt,
when calling the helper `_supports_filetype /vdf` it returns false,
and reports: "/vdf/ovl-mnt: Not on a mounted XFS filesystem".
Fix this ambiguity by preferring to query a mounted XFS filesystem,
if one can be found.
Fixes: bbb43745 ("xfs_info: use findmnt to handle mounted block devices") Signed-off-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Thu, 6 Jun 2019 13:58:55 +0000 (08:58 -0500)]
mkfs: enable reflink by default
Data block sharing (a.k.a. reflink) has been stable for a while, so turn
it on by default.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com>
[sandeen: update comments & man page] Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Thu, 6 Jun 2019 13:58:55 +0000 (08:58 -0500)]
xfs_scrub: fix background-mode sleep throttling
The comment preceding background_sleep() is wrong -- the function sleeps
100us, not 100ms, for every '-b' passed in after the first one. This is
really not obvious from the magic numbers, so fix the comment and use
symbolic constants for easier reading.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Thu, 6 Jun 2019 13:58:55 +0000 (08:58 -0500)]
xfs_repair: refactor namecheck functions
Now that we have name check functions in libxfs, use them instead of our
custom versions.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Thu, 6 Jun 2019 13:58:55 +0000 (08:58 -0500)]
libfrog: fix bitmap return values
Fix the return types of non-predicate bitmap functions to return the
usual negative error codes instead of the "moveon" boolean.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Thu, 6 Jun 2019 13:58:55 +0000 (08:58 -0500)]
misc: remove all use of xfs_fsop_geom_t
Remove all the uses of the old xfs_fsop_geom_t typedef.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>