git.ipfire.org Git - thirdparty/xfsprogs-dev.git/log

xfs_io: implement 'set_encpolicy' and 'get_encpolicy' commands

Add set_encpolicy and get_encpolicy commands to xfs_io so that xfstests
will be able to test filesystem encryption using the actual user API,
not just hacked in with a mount option.  These commands use the common
"fscrypt" API currently implemented by ext4 and f2fs, but it's also
under development for ubifs and planned for xfs.

Note that to get encrypted files to actually work, it's also necessary
to add a key to the kernel keyring.  This patch does not add a command
for this to xfs_io because it's possible to do it using keyctl.  keyctl
can also be used to remove keys, revoke keys, invalidate keys, etc.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_logprint: di_gen is unsigned

di_gen is unsigned:

__uint32_t di_gen; /* generation number */

but we print it as a signed int in logprint, so see oddities like:

forkoff:24 dmevmask:0x0 dmstate:0 flags:0x0 gen:-628807103

Fix this.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_io.8: remove duplicate .TP commands

The right margin of the xfs_io man page was being pushed inwards due to
extra .TP commands, making all the text below hard to read.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

build: add missing AC_PROG_INSTALL

$ autoreconf -fi
$ ./configure
configure: error: cannot find install-sh, install.sh, or shtool in . "."/.

Signed-off-by: Jan Engelhardt <jengelh@inai.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

build: resolve autoheader warning

$ autoreconf -fi
[...]
autoheader: warning: missing template: HAVE_UMODE_T
autoheader: Use AC_DEFINE([HAVE_UMODE_T], [], [Description])
autoreconf: /usr/bin/autoheader failed with exit status: 1

Signed-off-by: Jan Engelhardt <jengelh@inai.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_io: fix building with musl

The fallback in case the libc doesn't have or doesn't advertise the
existence of d_reclen in struct dirent uses d_namlen. Musl neither
advertises d_reclen nor does it have a d_namlen member.

Calculate the value for d_namlen from d_name in the fallback path.

Signed-off-by: Ralph Sennhauser <ralph.sennhauser@gmail.com>
Reviewed--by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

build: Allow compiling xfsprogs in a cross compile environment

Without this patch, we are using the same compiler and options for the host
compiler (BUILD_CC) and the target compiler (CC), and we would get error
messages at compilation:
x86_64-pc-linux-gnu-gcc -O2 -O2 -pipe -march=armv7-a -mtune=cortex-a15 ...
x86_64-pc-linux-gnu-gcc.real: error: unrecognized command line option
'-mfpu=neon'
'-mfloat-abi=hard'
'-clang-syntax'
'-mfpu=neon'
'-mfloat-abi=hard'
'-clang-syntax'

Add BUILD_CC and BUILD_CFLAGS as precious variables to allow setting it up
from the ebuild.

Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
Reviewed-by: Mike Frysinger <vapier@gentoo.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: don't rely on ->total in xfs_alloc_space_available

Source kernel commit: 12ef830198b0d71668eb9b59f9ba69d32951a48a

->total is a bit of an odd parameter passed down to the low-level
allocator all the way from the high-level callers.  It's supposed to
contain the maximum number of blocks to be allocated for the whole
transaction [1].

But in xfs_iomap_write_allocate we only convert existing delayed
allocations and thus only have a minimal block reservation for the
current transaction, so xfs_alloc_space_available can't use it for
the allocation decisions.  Use the maximum of args->total and the
calculated block requirement to make a decision.  We probably should
get rid of args->total eventually and instead apply ->minleft more
broadly, but that will require some extensive changes all over.

[1] which creates lots of confusion as most callers don't decrement it
once doing a first allocation.  But that's for a separate series.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: adjust allocation length in xfs_alloc_space_available

Source kernel commit: 54fee133ad59c87ab01dd84ab3e9397134b32acb

We must decide in xfs_alloc_fix_freelist if we can perform an
allocation from a given AG is possible or not based on the available
space, and should not fail the allocation past that point on a
healthy file system.

But currently we have two additional places that second-guess
xfs_alloc_fix_freelist: xfs_alloc_ag_vextent tries to adjust the
maxlen parameter to remove the reservation before doing the
allocation (but ignores the various minium freespace requirements),
and xfs_alloc_fix_minleft tries to fix up the allocated length
after we've found an extent, but ignores the reservations and also
doesn't take the AGFL into account (and thus fails allocations
for not matching minlen in some cases).

Remove all these later fixups and just correct the maxlen argument
inside xfs_alloc_fix_freelist once we have the AGF buffer locked.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: fix bogus minleft manipulations

Source kernel commit: 255c516278175a6dc7037d1406307f35237d8688

We can't just set minleft to 0 when we're low on space - that's exactly
what we need minleft for: to protect space in the AG for btree block
allocations when we are low on free space.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: bump up reserved blocks in xfs_alloc_set_aside

Source kernel commit: 5149fd327f16e393c1d04fa5325ab072c32472bf

Setting aside 4 blocks globally for bmbt splits isn't all that useful,
as different threads can allocate space in parallel. Bump it to 4
blocks per AG to allow each thread that is currently doing an
allocation to dip into it separately. Without that we may no have
enough reserved blocks if there are enough parallel transactions
in an almost out space file system that all run into bmap btree
splits.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: use the actual AG length when reserving blocks

Source kernel commit: 20e73b000bcded44a91b79429d8fa743247602ad

We need to use the actual AG length when making per-AG reservations,
since we could otherwise end up reserving more blocks out of the last
AG than there are actual blocks.

Complained-about-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: use GPF_NOFS when allocating btree cursors

Source kernel commit: b24a978c377be5f14e798cb41238e66fe51aab2f

Use NOFS for allocating btree cursors, since they can be called
under the ilock.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: ignore leaf attr ichdr.count in verifier during log replay

Source kernel commit: 2e1d23370e75d7d89350d41b4ab58c7f6a0e26b2

When we create a new attribute, we first create a shortform
attribute, and try to fit the new attribute into it.
If that fails, we copy the (empty) attribute into a leaf attribute,
and do the copy again. Thus there can be a transient state where
we have an empty leaf attribute.

If we encounter this during log replay, the verifier will fail.
So add a test to ignore this part of the leaf attr verification
during log replay.

Thanks as usual to dchinner for spotting the problem.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: optimise CRC updates

Source kernel commit: cae028df53449905c944603df624ac94bc619661

Nick Piggin reported that the CRC overhead in an fsync heavy
workload was higher than expected on a Power8 machine. Part of this
was to do with the fact that the power8 CRC implementation is not
efficient for CRC lengths of less than 512 bytes, and so the way we
split the CRCs over the CRC field means a lot of the CRCs are
reduced to being less than than optimal size.

To optimise this, change the CRC update mechanism to zero the CRC
field first, and then compute the CRC in one pass over the buffer
and write the result back into the buffer. We can do this safely
because anything writing a CRC has exclusive access to the buffer
the CRC is being calculated over.

We leave the CRC verify code the same - it still splits the CRC
calculation - because we do not want read-only operations modifying
the underlying buffer. This is because read-only operations may not
have an exclusive access to the buffer guaranteed, and so temporary
modifications could leak out to to other processes accessing the
buffer concurrently.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: make xfs btree stats less huge

Source kernel commit: 11ef38afe98cc7ad1a46ef24945232ec1760d5e2

Embedding a switch statement in every btree stats inc/add adds a lot
of code overhead to the core btree infrastructure paths. Stats are
supposed to be small and lightweight, but the btree stats have
become big and bloated as we've added more btrees. It needs fixing
because the reflink code will just add more overhead again.

Convert the v2 btree stats to arrays instead of independent
variables, and instead use the type to index the specific btree
array via an enum. This allows us to use array based indexing
to update the stats, rather than having to derefence variables
specific to the btree type.

If we then wrap the xfsstats structure in a union and place uint32_t
array beside it, and calculate the correct btree stats array base
array index when creating a btree cursor,  we can easily access
entries in the stats structure without having to switch names based
on the btree type.

We then replace with the switch statement with a simple set of stats
wrapper macros, resulting in a significant simplification of the
btree stats code, and:

text    data     bss     dec     hex filename
48905     144       8   49057    bfa1 fs/xfs/libxfs/xfs_btree.o.old
36793     144       8   36945    9051 fs/xfs/libxfs/xfs_btree.o

it reduces the core btree infrastructure code size by close to 25%!

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: don't allow di_size with high bit set

Source kernel commit: ef388e2054feedaeb05399ed654bdb06f385d294

The on-disk field di_size is used to set i_size, which is a signed
integer of loff_t. If the high bit of di_size is set, we'll end up with
a negative i_size, which will cause all sorts of problems. Since the
VFS won't let us create a file with such length, we should catch them
here in the verifier too.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: error out if trying to add attrs and anextents > 0

Source kernel commit: 0f352f8ee8412bd9d34fb2a6411241da61175c0e

We shouldn't assert if somehow we end up trying to add an attr fork to
an inode that apparently already has attr extents because this is an
indication of on-disk corruption. Instead, return an error code to
userspace.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: don't crash if reading a directory results in an unexpected hole

Source kernel commit: 96a3aefb8ffde23180130460b0b2407b328eb727

In xfs_dir3_data_read, we can encounter the situation where err == 0 and
*bpp == NULL if the given bno offset happens to be a hole; this leads to
a crash if we try to set the buffer type after the _da_read_buf call.
Holes can happen due to corrupt or malicious entries in the bmbt data,
so be a little more careful when we're handling buffers.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: complain if we don't get nextents bmap records

Source kernel commit: 356a3225222e5bc4df88aef3419fb6424f18ab69

When reading into memory all extents of a btree-format inode fork,
complain if the number of extents we find is not the same as the number
of extents reported in the inode core. This is needed to stop an IO
action from accessing the garbage areas of the in-core fork.

[dchinner: removed redundant assert]

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: check for bogus values in btree block headers

Source kernel commit: bb3be7e7c1c18e1b141d4cadeb98cc89ecf78099

When we're reading a btree block, make sure that what we retrieved
matches the owner and level; and has a plausible number of records.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: forbid AG btrees with level == 0

Source kernel commit: d2a047f31e86941fa896e0e3271536d50aba415e

There is no such thing as a zero-level AG btree since even a single-node
zero-records btree has one level. Btree cursor constructors read
cur_nlevels straight from disk and then access things like
cur_bufs[cur_nlevels - 1] which is /really/ bad if cur_nlevels is zero!
Therefore, strengthen the verifiers to prevent this possibility.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: several xattr functions can be void

Source kernel commit: f7a136aee3c1c3f7daf87197b3b3c361744a2812

There are a handful of xattr functions which now return
nothing but zero. They can be made void, chased through calling
functions, and error handling etc can be removed.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: handle cow fork in xfs_bmap_trace_exlist

Source kernel commit: c44a1f22626c153976289e1cd67bdcdfefc16e1f

By inspection, xfs_bmap_trace_exlist isn't handling cow forks,
and will trace the data fork instead.

Fix this by setting state appropriately if whichfork
== XFS_COW_FORK.

()___()
< @ @ >
| |
{o_o}
(|)

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: pass state not whichfork to trace_xfs_extlist

Source kernel commit: 7710517fc37b1899722707883b54694ea710b3c0

When xfs_bmap_trace_exlist called trace_xfs_extlist,
it sent in the "whichfork" var instead of the bmap "state"
as expected (even though state was already set up for this
purpose).

As a result, the xfs_bmap_class in tracing code used
"whichfork" not state in xfs_iext_state_to_fork(), and got
the wrong ifork pointer. It all goes downhill from
there, including an ASSERT when ifp_bytes is empty
by the time it reaches xfs_iext_get_ext():

XFS: Assertion failed: idx < ifp->if_bytes / sizeof(xfs_bmbt_rec_t)

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: Move AGI buffer type setting to xfs_read_agi

Source kernel commit: 200237d6746faaeaf7f4ff4abbf13f3917cee60a

We've missed properly setting the buffer type for
an AGI transaction in 3 spots now, so just move it
into xfs_read_agi() and set it if we are in a transaction
to avoid the problem in the future.

This is similar to how it is done in i.e. the dir3
and attr3 read functions.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: track preallocation separately in xfs_bmapi_reserve_delalloc()

Source kernel commit: 974ae922efd93b07b6cdf989ae959883f6f05fd8

Speculative preallocation is currently processed entirely by the callers
of xfs_bmapi_reserve_delalloc(). The caller determines how much
preallocation to include, adjusts the extent length and passes down the
resulting request.

While this works fine for post-eof speculative preallocation, it is not
as reliable for COW fork preallocation. COW fork preallocation is
implemented via the cowextszhint, which aligns the start offset as well
as the length of the extent. Further, it is difficult for the caller to
accurately identify when preallocation occurs because the returned
extent could have been merged with neighboring extents in the fork.

To simplify this situation and facilitate further COW fork preallocation
enhancements, update xfs_bmapi_reserve_delalloc() to take a separate
preallocation parameter to incorporate into the allocation request. The
preallocation blocks value is tacked onto the end of the request and
adjusted to accommodate neighboring extents and extent size limits.
Since xfs_bmapi_reserve_delalloc() now knows precisely how much
preallocation was included in the allocation, it can also tag the inodes
appropriately to support preallocation reclaim.

Note that xfs_bmapi_reserve_delalloc() callers are not yet updated to
use the preallocation mechanism. This patch should not change behavior
outside of correctly tagging reflink inodes when start offset
preallocation occurs (which the caller does not handle correctly).

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

fs: xfs: libxfs: constify xfs_nameops structures

Source kernel commit: cf7841c12d85d1fe0ad33fb8bc5746809a882010

Declare the structure xfs_nameops as const as it is only stored in the
m_dirnameops field of a xfs_mount structure. This field is of type
const struct xfs_nameops *, so xfs_nameops structures having this
property can be declared as const.
Done using Coccinelle:
@r1 disable optional_qualifier @
identifier i;
position p;
@@
static struct xfs_nameops i@p = {...};

@ok1@
identifier r1.i;
position p;
struct xfs_mount mp;
@@
mp.m_dirnameops=&i@p

@bad@
position p!={r1.p,ok1.p};
identifier r1.i;
@@
i@p

@depends on !bad disable optional_qualifier@
identifier r1.i;
@@
static
+const
struct xfs_nameops i={...};

@depends on !bad disable optional_qualifier@
identifier r1.i;
@@
+const
struct xfs_nameops i;

File size before:
text    data     bss     dec     hex filename
5302      85       0    5387    150b fs/xfs/libxfs/xfs_dir2.o

File size after:
text    data     bss     dec     hex filename
5318      69       0    5387    150b fs/xfs/libxfs/xfs_dir2.o

Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: factor rmap btree size into the indlen calculations

Source kernel commit: fd26a88093bab6529ea2de819114ca92dbd1d71d

When we're estimating the amount of space it's going to take to satisfy
a delalloc reservation, we need to include the space that we might need
to grow the rmapbt. This helps us to avoid running out of space later
when _iomap_write_allocate needs more space than we reserved. Eryu Guan
observed this happening on generic/224 when sunit/swidth were set.

Reported-by: Eryu Guan <eguan@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: remove NULLEXTNUM

Source kernel commit: 0e8d630ba039d9976d250eedb82c3a423ad15447

We only ever set a field to this constant for an impossible to reach
error case in xfs_bmap_search_extents. That functions has been removed,
so we can remove the constant as well.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: remove xfs_bmap_search_extents

Source kernel commit: 6edc977f775e5ac10655b03607ef091d2b06f2f6

Now that all users are gone.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

repair: use new extent lookup helpers in bmap_next_offset

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: remove prev argument to xfs_bmapi_reserve_delalloc

Source kernel commit: 65c5f419788d623a0410eca1866134f5e4628594

We can easily lookup the previous extent for the cases where we need it,
which saves the callers from looking it up for us later in the series.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: use new extent lookup helpers in __xfs_bunmapi

Source kernel commit: 7efc794561f6bfe34d26c3724289108f6cda3a4d

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: use new extent lookup helpers in xfs_bmapi_write

Source kernel commit: 2d58f6ef79db10257d77ae8cd8e9e8432816c9ac

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: use new extent lookup helpers in xfs_bmapi_read

Source kernel commit: 334f3423d6a6c39483aa0744ea044e105ca0e5ae

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: cleanup xfs_bmap_last_before

Source kernel commit: 86685f7ba5c0fdbcc159ecdbcc530b9b3509a675

Rewrite the function using xfs_iext_lookup_extent and xfs_iext_get_extent,
and massage the flow into something easily understandable.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: new inode extent list lookup helpers

Source kernel commit: 93533c7855c3c78c8a900cac65c8d669bb14935d

xfs_iext_lookup_extent looks up a single extent at the passed in offset,
and returns the extent covering the area, or the one behind it in case
of a hole, as well as the index of the returned extent in arguments,
as well as a simple bool as return value that is set to false if no
extent could be found because the offset is behind EOF. It is a simpler
replacement for xfs_bmap_search_extent that leaves looking up the rarely
needed previous extent to the caller and has a nicer calling convention.

xfs_iext_get_extent is a helper for iterating over the extent list,
it takes an extent index as input, and returns the extent at that index
in it's expanded form in an argument if it exists. The actual return
value is a bool whether the index is valid or not.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: check minimum block size for CRC filesystems

Source kernel commit: bec9d48d7a303a5bb95c05961ff07ec7eeb59058

[dchinner: cleaned up XFS_MIN_CRC_BLOCKSIZE check]

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: provide helper for counting extents from if_bytes

Source kernel commit: 5d829300bee000980a09ac2ccb761cb25867b67c

The open-coded pattern:

ifp->if_bytes / (uint)sizeof(xfs_bmbt_rec_t)

is all over the xfs code; provide a new helper
xfs_iext_count(ifp) to count the number of inline extents
in an inode fork.

[dchinner: pick up several missed conversions]

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: check return value of _trans_reserve_quota_nblks

Souce kernel commit: 4fd29ec47212c8cbf98916af519019ccc5e58e49

Check the return value of xfs_trans_reserve_quota_nblks for errors.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

libxfs: clean up _dir2_data_freescan

Source kernel commit: 523b2e76e3ecb54e0ec8651e32291bdaefc5f866

Refactor the implementations of xfs_dir2_data_freescan into a
routine that takes the raw directory block parameters and
a second function that figures out the raw parameters from the
directory inode. This enables us to use the exact same code
for both userspace and the kernel, since repair knows exactly
which directory block geometry parameters it needs.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

libxfs: synchronize dinode_verify with userspace

Source kernel commit: 420fbeb4bff483d89dd8de9242b8fd83e2eb3527

The userspace version of _dinode_verify takes a raw inode number
instead of an inode itself. Since neither version actually needs
the inode, port the changes to the kernel. This will also reduce
the libxfs diff noise.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: set XFS_DA_OP_OKNOENT in xfs_attr_get

Source kernel commit: c400ee3ed1b13d45adde68e12254dc6ab6977b59

It's entirely possible for userspace to ask for an xattr which
does not exist.

Normally, there is no problem whatsoever when we ask for such
a thing, but when we look at an obfuscated metadump image
on a debug kernel with selinux, we trip over this ASSERT in
xfs_da3_path_shift():

*result = -ENOENT;      /* we're out of our tree */
ASSERT(args->op_flags & XFS_DA_OP_OKNOENT);

It (more or less) only shows up in the above scenario, because
xfs_metadump obfuscates attr names, but chooses names which
keep the same hash value - and xfs_da3_node_lookup_int does:

if (((retval == -ENOENT) || (retval == -ENOATTR)) &&
(blk->hashval == args->hashval)) {
error = xfs_da3_path_shift(state, &state->path, 1, 1,
&retval);

IOWS, we only get down to the xfs_da3_path_shift() ASSERT
if we are looking for an xattr which doesn't exist, but we
find xattrs on disk which have the same hash, and so might be
a hash collision, so we try the path shift.  When *that*
fails to find what we're looking for, we hit the assert about
XFS_DA_OP_OKNOENT.

Simply setting XFS_DA_OP_OKNOENT in xfs_attr_get solves this
rather corner-case problem with no ill side effects.  It's
fine for an attr name lookup to fail.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: fix btree cursor error cleanups

Source kernel commit: f307080a626569f89bc8fbad9f936b307aded877

The btree cursor cleanup function takes an error parameter that
affects how buffers are released from the cursor. All buffers are
released in the event of error. Several callers do not specify the
XFS_BTREE_ERROR flag in the event of error, however. This can cause
buffers to hang around locked or with an elevated hold count and
thus lead to umount hangs in the event of errors.

Fix up the xfs_btree_del_cursor() callers to pass XFS_BTREE_ERROR if
the cursor is being torn down due to error.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

libxfs: fix line lengths

Fix some 80-char line length issues.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reported-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

libxfs: remove useless stuff from the kernel

Evidently the libxfs-apply script sucked in some fs/xfs/ content from
the kernel patches and an extra redefinition of _bmap_search_extents.
We don't need this, so get rid of it.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

libxfs: return bool from sb_version_hasmetauuid

The kernel's version of this function returns bool, so do so here too.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

libxfs: fix whitespace to match the kernel

Fix some minor whitespace errors so that my automated libxfs diff
scanning will stop reporting this.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

libxfs: refactor btree crc verifier

In a65d8d293b ("libxfs: validate metadata LSNs against log on v5
superblocks") the hascrc check was modified to use the helper mp
variable in the kernel. This was left out of the xfsprogs patch, so
change it here too.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

libxfs: remove unnecessary hascrc test in btree verifiers

xfs_btree_sblock_v5hdr_verify already checks _hascrc, so we can
remove it from the verifier functions. For whatever reason this
change made it into the kernel but not xfsprogs.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_io: fix libxfs naming violation

All the calls to libxfs code should start with 'libxfs'
per libxfs_api_defs.h.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

libxfs-apply: port to stgit

Teach libxfs-apply how to talk to a stgit repository
and fix a minor typo in the guilt hunk of apply_patch.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

tools: create libxfs-diff to compare libxfses

Create a script to compare every file in libxfs to the same files
in another libxfs. This is useful for comparing upstream kernel
and user progs to look for unported changes.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfsprogs: Release v4.9.0

Update all the necessary files for a 4.9.0 release.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfsprogs: Release v4.9.0-rc1

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

Fix building xfsprogs on 32-bit platforms

xfslibs now requires that its users enable transparent largefile
support.  This broke building xfsprogs on 32-bit Linux (with glibc)
because _FILE_OFFSET_BITS=64 was not getting defined.  Although the
autoconf macro AC_SYS_LARGEFILE was intended to define it, this didn't
work because AC_SYS_LARGEFILE will only define _FILE_OFFSET_BITS in a
config header, which doesn't work for xfsprogs because not all .c files
include platform_defs.h as their first include.  Also,
platform_defs.h.in is not generated by autoheader and didn't contain a
template for _FILE_OFFSET_BITS.

Therefore, to fix the problem remove the useless autoconf macros and
instead add -D_FILE_OFFSET_BITS=64 to CFLAGS in builddefs.in.  Use
CFLAGS rather than PCFLAGS because this definition could be needed by
platforms other than "linux", and it doesn't hurt to always define it.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Felix Janda <felix.janda@posteo.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_quota: Fix test for wrapped id from GETNEXTQUOTA

dump_file and report_mount can be called with null *oid if
we aren't asking for the GETNEXTQUOTA interface, so we
should only test for the GETNEXTQUOTA wrap if *oid is
non-null. Otherwise we'll deref a null pointer in the
test.

This only happens for certain invocations of reporting,
which apparently are not covered by any regression tests
at this point, at least on new kernels which contain
GETNEXTQUOTA.

Addresses-Coverity-ID: 1397415
Addresses-Coverity-ID: 1397416
Brown-paper-bag-worn-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Bill O'Donnell <billodo@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_quota: handle wrapped id from GETNEXTQUOTA

The GETNEXTQUOTA interface in the kernel had a bug
(at least in xfs) where if we pass in UINT_MAX as the
ID, it incremented, warpped, and returned 0 for the next
id. This would cause userspace to start querying
again at zero, and an xfs_quota "report" command would
loop forever. This occurred if a quota ID near
UINT max existed, and later offsets within the block
wrapped the xfs_dqid_t.

This will also be fixed in the kernel, but we should also
catch this in userspace, and stop the loop if it happens.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_repair: don't indicate dirtiness if FSGEOMETRY fails

Today, pointing repair at an image hosted on a non-xfs
filesystem will result in a XFS_IOC_FSGEOMETRY_V1 failure,
but repair generally proceeds without further problems.

However, calling do_warn() sets fs_is_dirty to 1, so
xfs_repair -n exits with non-zero status, indicating
corruption. This is incorrect.

Change the message to use do_log so that it does not
incorrectly indicate corruption.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_repair: junk leaf attribute if count == 0

We have recently seen a case where, during log replay, the
attr3 leaf verifier reported corruption when encountering a
leaf attribute with a count of 0 in the header.

We chalked this up to a transient state when a shortform leaf
was created, the attribute didn't fit, and we promoted the
(empty) attribute to the larger leaf form.

I've recently been given a metadump of unknown provenance which actually
contains a leaf attribute with count 0 on disk. This causes the
verifier to fire every time xfs_repair is run:

Metadata corruption detected at xfs_attr3_leaf block 0x480988/0x1000

If this 0-count state is detected, we should just junk the leaf, same
as we would do if the count was too high. With this change, we now
remedy the problem:

Metadata corruption detected at xfs_attr3_leaf block 0x480988/0x1000
bad attribute count 0 in attr block 0, inode 12587828
problem with attribute contents in inode 12587828
clearing inode 12587828 attributes
correcting nblocks for inode 12587828, was 2 - counted 1

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_repair: change null check to assertion

It /should/ be the case that we never run out of records
before we run out of btree blocks, so change the null check
(that was only to appease Coverity) to an assert.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_repair: fix some potential null pointer deferences

Fix some potential NULL pointer deferences that Coverity pointed out,
and remove a trivial dead integer check.

Coverity-id: 1375789, 1375790, 1375791, 1375792
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_repair: fix bogus rmapbt record owner check

Make the reverse mapping owner check actually validate inode numbers.

Coverity-id: 1371628
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

platform: remove use of off64_t

Since we force transparent LFS it can be replaced by off_t.

Signed-off-by: Felix Janda <felix.janda@posteo.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs.h: require transparent LFS for all users

Since our interfaces depend on the consistent use of a 64bit offset
type, force downstreams to use transparent LFS (_FILE_OFFSET_BITS=64),
so that it becomes impossible for them to use 32bit interfaces.

Signed-off-by: Felix Janda <felix.janda@posteo.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfsprogs: replace statvfs64 by equivalent statvfs

Signed-off-by: Felix Janda <felix.janda@posteo.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>

fsr: remove workaround for statvfs on Mac OS X

It can be removed since fsr is no longer built on Mac OS X.

Signed-off-by: Felix Janda <felix.janda@posteo.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>

Makefile: disable fsr for Mac OS X

Since its kernel does not support XFS anyway this utility is not
useful, and with its removal the portability framework can be
simplified.

Signed-off-by: Felix Janda <felix.janda@posteo.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_io: replace posix_fadvise64 by equivalent posix_fadvise

also fixes a compile failure on FreeBSD

Signed-off-by: Felix Janda <felix.janda@posteo.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfsprogs: replace sendfile64 by equivalent sendfile

Signed-off-by: Felix Janda <felix.janda@posteo.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfsprogs: replace pread64/pwrite64 by equivalent pread/pwrite

Signed-off-by: Felix Janda <felix.janda@posteo.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfsprogs: replace lseek64 by equivalent lseek

Signed-off-by: Felix Janda <felix.janda@posteo.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfsprogs: replace ftruncate64 by equivalent ftruncate

Signed-off-by: Felix Janda <felix.janda@posteo.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfsprogs: replace [fl]stat64 by equivalent [fl]stat

Signed-off-by: Felix Janda <felix.janda@posteo.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>

configure: remove unecessary definitions of _FILE_OFFSET_BITS

now that we use AC_SYS_LARGEFILE, there is no need to explicitly
define _FILE_OFFSET_BITS.

Signed-off-by: Felix Janda <felix.janda@posteo.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>

configure: error out when LFS does not work

Signed-off-by: Felix Janda <felix.janda@posteo.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>

configure: use AC_SYS_LARGEFILE

The autoconf macro AC_SYS_LARGEFILE defines _FILE_OFFSET_BITS=64
where necessary to ensure that off_t and all interfaces using off_t
are 64bit, even on 32bit systems.

Signed-off-by: Felix Janda <felix.janda@posteo.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_io: Fix initial -m option

Like "open -m mode", the initial -m option requires a mode argument.

Document these options correctly as well.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_io: add command line option -i to start an idle thread

xfs_io -i will start by spawning an idle thread.

The purpose of this idle thread is to test io from a multi threaded
process. With single threaded process, the file table is not shared
and file structs are not reference counted. Spawning an idle thread
can help detecting file struct reference leaks.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfsprogs: Update FSF address in COPYING file

The FSF address in doc/COPYING needs an update. This was caught and
reported by the openSUSE build service while building the xfsprogs
package. The new address is taken directly from FSF's license files
put on their site

Signed-off-by: Grozdan Nikolov <neutrino8@gmail.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
--

mkfs.xfs: format reflink enabled filesystems

Create the refcount btree at mkfs time and set the feature flag.

v2: Turn on the reflink feature when calculating the minimum log size.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

xfs_repair: use thread pools to sort rmap data

Since each slab is a collection of independent mini-slabs, we can
fire up a bunch of threads to sort the mini-slabs in parallel.
This speeds up the sorting phase of the rmapbt rebuilding if we
have a large number of mini slabs.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

xfs_repair: check for mergeable refcount records

Make sure there aren't adjacent refcount records that could be merged;
this is a sign that the refcount tree algorithms aren't working
correctly.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

xfs_repair: use range query when while checking rmaps

For shared extents, we ought to use a range query on the rmapbt to
find the corresponding rmap.  However, most of the time the observed
rmap will be an exact match for the rmapbt rmap, in which case we
could have used the (much faster) regular lookup.  Therefore, try the
regular lookup first and resort to the range lookup if that doesn't
get us what we want.  This can cut the run time of the rmap check of
xfs_repair in half.

Theoretically, the only reason why an observed rmap wouldn't be an
exact match for an rmapbt rmap is because we modified some file on
account of a metadata error.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

xfs_repair: check the CoW extent size hint

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

xfs_repair: complain about copy-on-write leftovers

Complain about leftover CoW allocations that are hanging off the
refcount btree.  These are cleaned out at mount time, but we could be
louder about flagging down evidence of trouble.

Since these extents aren't "owned" by anything, we'll free them up by
reconstructing the free space btrees.

v2: When we're processing rmap records, we inadvertently forgot to
handle the CoW owner, so the leftover CoW staging blocks got marked as
file data.  These blocks will just get freed later, so mark them
"CoW".  When we process the refcountbt, complain about leftovers if
the type is unknown or "CoW".

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

xfs_repair: rebuild the refcount btree

Rebuild the refcount btree with the reference count data we assembled
during phase 4.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

xfs_repair: check the refcount btree against our observed reference counts when -n

Check the observed reference counts against whatever's in the refcount
btree for discrepancies.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

xfs_repair: fix inode reflink flags

While we're computing reference counts, record which inodes actually
share blocks with other files and fix the flags as necessary.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

xfs_repair: record reflink inode state

Record the state of the per-inode reflink flag, so that we can
compare against the rmap data and update the flags accordingly.
Clear the (reflink) state if we clear the inode.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

xfs_repair: process reverse-mapping data into refcount data

Take all the reverse-mapping data we've acquired and use it to generate
reference count data. This data is used in phase 5 to rebuild the
refcount btree.

v2: Update to reflect separation of rmap_irec flags.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

xfs_repair: handle multiple owners of data blocks

If reflink is enabled, don't freak out if there are multiple owners of
a given block; that's just a sign that each of those owners are
reflink files.

v2: owner and offset are unsigned types, so use those for inorder
comparison.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

xfs_repair: check the existing refcount btree

Spot-check the refcount btree for obvious errors, and mark the
refcount btree blocks as such.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

xfs_repair: fix get_agino_buf to avoid corrupting inodes

The inode buffering code tries to read inodes in units of chunks,
which are the larger of 8K or 1 FSB.  Each chunk gets its own xfs_buf,
which means that get_agino_buf must calculate the disk address of the
chunk and feed that to libxfs_readbuf in order to find the inode data
correctly.  The current code simply grabs the chunk for the start
inode and indexes from that, which corrupts memory because the start
inode and the target inode could be in different inode chunks.  That
causes the assert in rmap.c to blow when we clear the reflink flag.

(Also fix some minor errors in the debugging printfs.)

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

man: document the inode cowextsize flags & fields

Document the new copy-on-write extent size fields and inode flags
available in struct fsxattr.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

xfs_logprint: support bmap redo items

Print block mapping update redo items.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

xfs_logprint: support refcount redo items

Print reference count update redo items.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

xfs_logprint: support cowextsize reporting in log contents

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

xfs_io: try to unshare copy-on-write blocks via fallocate

Wire up the "unshare" flag to the xfs_io fallocate command.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>