git.ipfire.org Git - thirdparty/xfsprogs-dev.git/log

xfs: don't fail when converting shortform attr to long form during ATTR_REPLACE

Source kernel commit: 7b38460dc8e4eafba06c78f8e37099d3b34d473c

Kanda Motohiro reported that expanding a tiny xattr into a large xattr
fails on XFS because we remove the tiny xattr from a shortform fork and
then try to re-add it after converting the fork to extents format having
not removed the ATTR_REPLACE flag. This fails because the attr is no
longer present, causing a fs shutdown.

This is derived from the patch in his bug report, but we really
shouldn't ignore a nonzero retval from the remove call.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199119
Reported-by: kanda.motohiro@gmail.com
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: set format back to extents if xfs_bmap_extents_to_btree

Source kernel commit: 2c4306f719b083d17df2963bc761777576b8ad1b

If xfs_bmap_extents_to_btree fails in a mode where we call
xfs_iroot_realloc(-1) to de-allocate the root, set the
format back to extents.

Otherwise we can assume we can dereference ifp->if_broot
based on the XFS_DINODE_FMT_BTREE format, and crash.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199423
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: enhance dinode verifier

Source kernel commit: b42db0860e13067fcc7cbfba3966c9e652668bbc

Add several more validations to xfs_dinode_verify:

- For LOCAL data fork formats, di_nextents must be 0.
- For LOCAL attr fork formats, di_anextents must be 0.
- For inodes with no attr fork offset,
- format must be XFS_DINODE_FMT_EXTENTS if set at all
- di_anextents must be 0.

Thanks to dchinner for pointing out a couple related checks I had
forgotten to add.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199377
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: non-scrub - remove unused function parameters

Source kernel commit: a1f69417c6f4d1c5280ffb795da7778cba1e1451

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>

xfs: clean up xfs_mount allocation and dynamic initializers

Source kernel commit: 72c44e35f02a1cb4032e476c398a7234badcf49f

Most of the generic data structures embedded in xfs_mount are
dynamically initialized immediately after mp is allocated. A few
fields are left out and initialized during the xfs_mountfs()
sequence, after mp has been attached to the superblock.

To clean this up and help prevent premature access of associated
fields, refactor xfs_mount allocation and all dependent init calls
into a new helper. This self-documents that all low level data
structures (i.e., locks, trees, etc.) should be initialized before
xfs_mount is attached to the superblock.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>

xfs: remove dead inode version setting code

Source kernel commit: fa4493f0d9b3ac8f36743f1a26e2318b449ee4c8

We can only get into the branch if CRCs are enabled, so there's no
need to check inside the branch for CRCs being enabled....

Signed-Off-By: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>

xfs: don't accept inode buffers with suspicious unlinked chains

Source kernel commit: 6a96c5650568a2218712d43ec16f3f82296a6c53

When we're verifying inode buffers, sanity-check the unlinked pointer.
We don't want to run the risk of trying to purge something that's
obviously broken.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>

xfs: move inode extent size hint validation to libxfs

Source kernel commit: 8bb82bc12a9e75dd47047d9a2e53135cc5e5787b

Extent size hint validation is used by scrub to decide if there's an
error, and it will be used by repair to decide to remove the hint.
Since these use the same validation functions, move them to libxfs.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>

xfs: refactor inode buffer verifier error logging

Source kernel commit: 6edb181053f067cee64d4239830062cb40ddab00

When the inode buffer verifier encounters an error, it's much more
helpful to print a buffer from the offending inode instead of just the
start of the inode chunk buffer.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>

xfs: refactor inode verifier error logging

Source kernel commit: 90a58f95717b46f67756580ad5f8b698304e4bad

Refactor some of the inode verifier failure logging call sites to use
the new xfs_inode_verifier_error method which dumps the offending buffer
as well as the code location of the failed check. This trims the
output, makes it clearer to the admin that repair must be run, and gives
the developers more details to work from.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>

xfs: refactor bmap record validation

Source kernel commit: 30b0984d9117dd14c895265886d34335856b712b

Refactor the bmap validator into a more complete helper that looks for
extents that run off the end of the device, overflow into the next AG,
or have invalid flag states.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>

xfs: sanity-check the unused space before trying to use it

Source kernel commit: 6915ef35c0350e87a104cb4c4ab2121c81ca7a34

In xfs_dir2_data_use_free, we examine on-disk metadata and ASSERT if
it doesn't make sense. Since a carefully crafted fuzzed image can cause
the kernel to crash after blowing a bunch of assertions, let's move
those checks into a validator function and rig everything up to return
EFSCORRUPTED to userspace. Found by lastbit fuzzing ltail.bestcount via
xfs/391.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>

xfs: detect agfl count corruption and reset agfl

Source kernel commit: a27ba2607e60312554cbcd43fc660b2c7f29dc9c

The struct xfs_agfl v5 header was originally introduced with
unexpected padding that caused the AGFL to operate with one less
slot than intended. The header has since been packed, but the fix
left an incompatibility for users who upgrade from an old kernel
with the unpacked header to a newer kernel with the packed header
while the AGFL happens to wrap around the end. The newer kernel
recognizes one extra slot at the physical end of the AGFL that the
previous kernel did not. The new kernel will eventually attempt to
allocate a block from that slot, which contains invalid data, and
cause a crash.

This condition can be detected by comparing the active range of the
AGFL to the count. While this detects a padding mismatch, it can
also trigger false positives for unrelated flcount corruption. Since
we cannot distinguish a size mismatch due to padding from unrelated
corruption, we can't trust the AGFL enough to simply repopulate the
empty slot.

Instead, avoid unnecessarily complex detection logic and and use a
solution that can handle any form of flcount corruption that slips
through read verifiers: distrust the entire AGFL and reset it to an
empty state. Any valid blocks within the AGFL are intentionally
leaked. This requires xfs_repair to rectify (which was already
necessary based on the state the AGFL was found in). The reset
mitigates the side effect of the padding mismatch problem from a
filesystem crash to a free space accounting inconsistency. The
generic approach also means that this patch can be safely backported
to kernels with or without a packed struct xfs_agfl.

Check the AGF for an invalid freelist count on initial read from
disk. If detected, set a flag on the xfs_perag to indicate that a
reset is required before the AGFL can be used. In the first
transaction that attempts to use a flagged AGFL, reset it to empty,
warn the user about the inconsistency and allow the freelist fixup
code to repopulate the AGFL with new blocks. The xfs_perag flag is
cleared to eliminate the need for repeated checks on each block
allocation operation.

This allows kernels that include the packing fix commit 96f859d52bcb
("libxfs: pack the agfl header structure so XFS_AGFL_SIZE is correct")
to handle older unpacked AGFL formats without a filesystem crash.

Suggested-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by Dave Chiluk <chiluk+linuxxfs@indeed.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>

xfs: account only rmapbt-used blocks against rmapbt perag res

Source kernel commit: 0ab32086d0becee56c75a8ba21f16ac08b80f304

The rmapbt perag metadata reservation reserves blocks for the
reverse mapping btree (rmapbt). Since the rmapbt uses blocks from
the agfl and perag accounting is updated as blocks are allocated
from the allocation btrees, the reservation actually accounts blocks
as they are allocated to (or freed from) the agfl rather than the
rmapbt itself.

While this works for blocks that are eventually used for the rmapbt,
not all agfl blocks are destined for the rmapbt. Blocks that are
allocated to the agfl (and thus "reserved" for the rmapbt) but then
used by another structure leads to a growing inconsistency over time
between the runtime tracking of rmapbt usage vs. actual rmapbt
usage. Since the runtime tracking thinks all agfl blocks are rmapbt
blocks, it essentially believes that less future reservation is
required to satisfy the rmapbt than what is actually necessary.

The inconsistency is rectified across mount cycles because the perag
reservation is initialized based on the actual rmapbt usage at mount
time. The problem, however, is that the excessive drain of the
reservation at runtime opens a window to allocate blocks for other
purposes that might be required for the rmapbt on a subsequent
mount. This problem can be demonstrated by a simple test that runs
an allocation workload to consume agfl blocks over time and then
observe the difference in the agfl reservation requirement across an
unmount/mount cycle:

mount ...: xfs_ag_resv_init: ... resv 3193 ask 3194 len 3194
...
... : xfs_ag_resv_alloc_extent: ... resv 2957 ask 3194 len 1
umount...: xfs_ag_resv_free: ... resv 2956 ask 3194 len 0
mount ...: xfs_ag_resv_init: ... resv 3052 ask 3194 len 3194

As the above tracepoints show, the reservation requirement reduces
from 3194 blocks to 2956 blocks as the workload runs. Without any
other changes in the filesystem, the same reservation requirement
jumps from 2956 to 3052 blocks over a umount/mount cycle.

To address this divergence, update the RMAPBT reservation to account
blocks used for the rmapbt only rather than all blocks filled into
the agfl. This patch makes several high-level changes toward that
end:

1.) Reintroduce an AGFL reservation type to serve as an accounting
no-op for blocks allocated to (or freed from) the AGFL.
2.) Invoke RMAPBT usage accounting from the actual rmapbt block
allocation path rather than the AGFL allocation path.

The first change is required because agfl blocks are considered free
blocks throughout their lifetime. The perag reservation subsystem is
invoked unconditionally by the allocation subsystem, so we need a
way to tell the perag subsystem (via the allocation subsystem) to
not make any accounting changes for blocks filled into the AGFL.

The second change causes the in-core RMAPBT reservation usage
accounting to remain consistent with the on-disk state at all times
and eliminates the risk of leaving the rmapbt reservation
underfilled.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>

xfs: rename agfl perag res type to rmapbt

Source kernel commit: 215928633502a7296fec42614463bb49859787d6

The AGFL perag reservation type accounts all allocations that feed
into (or are released from) the allocation group free list (agfl).
The purpose of the reservation is to support worst case conditions
for the reverse mapping btree (rmapbt). As such, the agfl
reservation usage accounting only considers rmapbt usage when the
in-core counters are initialized at mount time.

This implementation inconsistency leads to divergence of the in-core
and on-disk usage accounting over time. In preparation to resolve
this inconsistency and adjust the AGFL reservation into an rmapbt
specific reservation, rename the AGFL reservation type and
associated accounting fields to something more rmapbt-specific. Also
fix up a couple tracepoints that incorrectly use the AGFL
reservation type to pass the agfl state of the associated extent
where the raw reservation type is expected.

Note that this patch does not change perag reservation behavior.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>

xfs: convert XFS_AGFL_SIZE to a helper function

Source kernel commit: a78ee256c325ecfaec13cafc41b315bd4e1dd518

The AGFL size calculation is about to get more complex, so lets turn
the macro into a function first and remove the macro.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
[darrick: forward port to newer kernel, simplify the helper]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>

xfs: convert a few more directory asserts to corruption

Source kernel commit: 3f883f5bb197b6fe4e6f461362782aa7b0e89cb6

Yet another round of playing whack-a-mole with directory code that
asserts on corrupt on-disk metadata when it really should be returning
-EFSCORRUPTED instead of ASSERTing. Found by a xfs/391 crash while
lastbit fuzzing of ltail.bestcount.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>

Cleanup old XFS_BTREE_* traces

Source kernel commit: e157ebdcb3acd16221f1e5f84c6e371e15d37b6e

Remove unused legacy btree traces from IRIX era.

Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>

xfsprogs: Release v4.16.0-rc1

Update all the necessary files for a 4.16.0-rc1 release.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_repair: fix getsubopt name definitions to use enums

Convert the getsubopt usage in xfs_repair to use enums and explicitly
initialized array elements, similar to mkfs. This also fixes the hole
in the o_opts table caused by 42fa89bc1b8dc8 ("xfs_repair: remove
pre_65_beta option") that causes segfaults in xfs/179 and xfs/202.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Fixes: 42fa89bc1b ("xfs_repair: remove pre_65_beta option")
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_scrub_all: use system encoding for lsblk output decoding

Don't hardcode utf-8 as the decoding scheme for lsblk output, since the
system could set it to anything else.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_scrub_all: escape paths being passed to systemd service instances

systemd doesn't like unit instance names with slashes in them, so it
replaces them with dashes when it invokes the service.  However, it's
not smart enough to convert the dashes to something else, so when it
unescapes the instance name to feed to xfs_scrub, it turns all dashes
into slashes.  "/moo-cow" becomes "-moo-cow" becomes "/moo/cow", which
is wrong.  systemd actually /can/ escape the dashes correctly if it is
told that this is a path (and not a unit name), but it didn't do this
prior to January 2017, so fix this for them.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_scrub: disable private /tmp for scrub service

Don't make /tmp private when invoking xfs_scrub as a service, because
/tmp might contain or itself be an xfs filesystem mountpoint.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_scrub_all: report version

Make xfs_scrub_all -V report its version like the other xfs tools.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_scrub: refactor mountpoint finding code to use libfrog path code

Use the libfrog path finding code to determine if the argument being
passed in is a mountpoint, remove all mention of taking a block device
(we have never supported that) from the documentation, and fix some
potential memory leaks.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_scrub: don't warn about confusing names if dir/file only writable by root

If we are scanning the directory entries or attribute names of a
dir/file and the inode can only be written by root, don't warn about
Unicode confusable names by default because the system administrator
presumably made the system like that. Also don't warn about really
short confusable names because of the high chance of collisions. If
the caller really wants all the output, they can run in verbose mode.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_scrub: use Unicode skeleton function to find confusing names

Drop the weak normalization-based Unicode name collision detection in
favor of the confusable name guidelines provided in Unicode TR36 & TR39.
This means that we transform the original name into its Unicode skeleton
in order to do hashing-based collision detection.

The Unicode skeleton is defined as nfd(translation(nfd(string))), which
is to say that it flattens sequences that render ambiguously into a
unambiguous format.  For example, 'l' and '1' can render identically in
some typefaces, so they're both squashed to 'l'.  From the skeletons we
can figure out if two names will look the same, and thereby complain
about them.  The unicode spoofing is provided by libicu, hence the
switch away from libunistring.

Note that potentially confusable names are only worth an informational
warning, since it's entirely possible that with the system typefaces in
use, two names will render distinctly enough that users can tell the
difference.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_scrub: check name for suspicious characters

Look for suspicious characters in each name we process. This includes
control characters, text direction overrides, zero-width code points,
and names that mix characters from different directionalities.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_scrub: transition from libunistring to libicu for Unicode processing

Move off of libunistring and onto libicu for Unicode name scanning.
This will make it easy to warn about unicode code points that do not
belong in identifiers (directional overrides, zero width elements) and
warn about names that could render similarly enough to cause confusion.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_scrub: make name_entry a first class structure

Instead of open-coding the construction and hashtable insertion of name
entries, make name_entry a first class object. This means that we now
have name_entry_ prefix functions that take care of computing Unicode
normalized names as part of name_entry construction, and we pass around
the name_entries when we're looking for suspicious characters and
identically rendering names.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_scrub: communicate name problems via flagset instead of booleans

Use an unsigned int to pass around name error flags instead of booleans.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_scrub: don't complain about different normalization

Since there are different ways to normalize utf8 names, don't complain
when we find a name that is normalized in a different way than the NFKC
that we use to find duplicate names.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_scrub: only run ascii name checks if unicode name checker

Skip the ASCII name checks if the Unicode name checker is going to run,
since the latter covers everything that the former does.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_scrub: avoid buffer overflow when scanning attributes

Avoid a buffer overflow when we're formatting extended attribute names
for name checking. The kernel headers provide us with XATTR_NAME_MAX,
so we can rely on that.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

libxfs: warn about deprecation of irix, freebsd, darwin

It's not clear that anyone is using these platforms or if
they even build at this point. Get someone's attention if
they are trying to use it.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_repair: test XFS_SB_VERSION_SHAREDBIT only once

Remove 2 of the 3 identical tests for XFS_SB_VERSION_SHAREDBIT

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfsprogs: remove unused delete_attr_ok

delete_attr_ok is never set to anything but 1;
remove it and all associated code.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_repair: remove pre_65_beta option

Irix 6.5 was released 20 years ago. Remove this option from
the code. (nb: it's not present in the manpage.)

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_repair: remove unused fs_shared_allowed variable

The fs_shared_allowed global was set to 1 and then ignored, and
in fact the feature is never actualy allowed. Remove it and
the last stragglers of the old features comment.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_repair: remove unused fs_has_extflgbit_allowed

fs_has_extflgbit_allowed is never set to anything but 1;
remove it and all associated code.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_repair: remove unused fs_sb_feature_bits_allowed

fs_sb_feature_bits_allowed is never set to anything but 1;
remove it and all associated code.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_repair: remove unused fs_aligned_inodes_allowed

fs_aligned_inodes_allowed is never set to anything but 1;
remove it and all associated code.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_repair: remove unused fs_has_extflgbit_allowed

fs_has_extflgbit_allowed is never set to anything but 1;
remove it and all associated code.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_repair: remove unused fs_attributes2_allowed

fs_attributes2_allowed is never set to anything but 1;
remove it and all associated code.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_repair: remove unused fs_attributes_allowed

fs_attributes_allowed is never set to anything but 1;
remove it and all associated code.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

libfrog: handle NULL dir && blkdev in __fs_table_lookup_mount

If neither dir nor blkdev is set, dpath never gets set,
and then gets used (uninitalized) later on.

If we are asked where "nothing" is mounted, just return
"nowhere."

Fixes-coverity-id: 1433615
Fixes-coverity-id: 1433616
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_scrub: initialize movon in xfs_scrub_connections

Given the logic in xfs_scrub_connections, it's possible to
fail all 3 tests and wind up checking an uninitialized moveon
variable at the end. Start out with "true" to avoid this and
move on even if all the conditions in the function are false.

Fixes-coverity-id: 1433617
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_scrub: synchronize error levels & logging

Having only a subset of the five error_levels present in
the log_level[] array is asking for trouble when someone
tries to __str_log(S_PREEN ...) and overruns the array.

Tie it all together in a single structure that's
initialized together to make the mapping more obvious and
idiot-proof.

Fixes-coverity-id: 1433618
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_spaceman: remove incorrect linux/fs.h include

Remove the direct linux/fs.h include from spaceman because all xfs
utilities should include xfs.h so that we can wrap missing kernel header
declarations properly.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

fsck.xfs: allow forced repairs using xfs_repair

The fsck.xfs script did nothing, because xfs doesn't need a fsck to be
run on every unclean shutdown. However, sometimes it may happen that the
root filesystem really requires the usage of xfs_repair and then it is a
hassle. This patch makes the situation a bit easier by detecting forced
checks (/forcefsck or fsck.mode=force) and invoking xfs_repair.

Signed-off-by: Jan Tulak <jtulak@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_repair: add flag -e to modify exit code for corrected errors

xfs_repair without -n ends with a return code 0 if it finished ok,
no matter if there were some errors in the fs, or not. The new flag
-e means that we can avoid screenscraping and parsing text output to
detect if an error was found (and corrected).

If something could not be corrected or in any other case than the "found
something but fixed it all," the behaviour with this flag is unchanged.

Signed-off-by: Jan Tulak <jtulak@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
[sandeen: make -e and -n exclusivity clear in manpage synopsis]
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

metadump/restore: don't use errno after fwrite/fread failures

fread/fwrite don't set errno, so printing out strerror(errno)
after a failure leads to incorrect and confusing messages:

# xfs_mdrestore pre_repair.meta pre_repair.img
xfs_mdrestore: error reading from file: Success

Don't return unset errno from write_index, either.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

mkfs: enable sparse inodes by default

Enable the sparse inode feature by default.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_fsr: refactor mountpoint finding to use libfrog paths functions

Refactor the mount-point finding code in fsr to use the libfrog helpers
instead of open-coding yet another routine.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

libfrog: fs_table_lookup_mount should realpath the argument

Call realpath on the dir argument so that we're comparing canonical
paths when looking for the mountpoint. This fixes the problem where
'/home/' doesn't match '/home' even though they refer to the same thing.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_repair: use custom ifork verifier in mv_orphanage

Now that we have a custom verifier which can ignore parent
inode numbers, use it in mv_orphanage() as well; orphan inodes
may have invalid parents, and we're about to reconnect
them anyway, so override that test when we get them off disk.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_repair: implement custom ifork verifiers

There are a few cases where an early stage of xfs_repair will write an
invalid inode fork buffer to signal to a later stage that it needs to
correct the value. This happens in phase 4 when we detect an inline
format directory with an invalid .. pointer. To avoid triggering the
ifork verifiers on this, inject a custom verifier for phase 6 that lets
this pass for now.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_scrub: add missing paths header

Fix the following build failure with musl libc:

xfs_scrub.c: In function ‘main’:
xfs_scrub.c:670:11: error: ‘_PATH_MOUNTED’ undeclared (first use in this function)
mtab = _PATH_MOUNTED;
^~~~~~~~~~~~~

Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

workqueue: add missing pthreads header

Fix the following build failure with musl libc:

In file included from read_verify.c:25:0:
../include/workqueue.h:39:2: error: unknown type name 'pthread_t'
  pthread_t  *threads;
  ^~~~~~~~~
../include/workqueue.h:42:2: error: unknown type name 'pthread_mutex_t'
  pthread_mutex_t  lock;
  ^~~~~~~~~~~~~~~
../include/workqueue.h:43:2: error: unknown type name 'pthread_cond_t'
  pthread_cond_t  wakeup;
  ^~~~~~~~~~~~~~

Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_io: fix operation time reporting

CUrrently the 100th/sec units always report zero, such as:

32 MiB, 8192 ops; 0:00:21.00 (1.476 MiB/sec and 377.9260 ops/sec)
^^

This is incorrect. Fix the maths that is wrong by removing all the
unnecesary floating point maths and just using basic integer
division...

Signed-Off-By: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_repair: don't fail directory repairs when grabbing inodes

There are a few places where xfs_repair needs to be able to load a
damaged directory inode to perform repairs. Since inline data fork
verifiers can now be customized, refactor libxfs_iget to enable
repair to get at this so that we don't crash in phase 6.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_db: print transaction reservation type information

Create a new xfs_db command to print the transaction reservation info for
a given filesystem. This will make it easier to compare the calculations
made by the kernel and xfsprogs in case there is a discrepancy.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_scrub: don't try to scan xattrs if bstat says there aren't any

Only try to scan the extended attributes of a file if bstat says that
the file actually has any. Surprisingly, this reduces the phase 5
runtime by 40% if most of the files don't have attrs.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_scrub: fix #include ordering to avoid build failure

Fix the ordering of the header includes in all the scrub source.  We put
xfs.h first so that it will pull in include/linux.h which pulls in
linux/fs.h + whatever overrides are necessary (currently limited to
struct fsxattr) to make things work on this platform, and then we remove
the #includes for anything that will get pulled (directly or indirectly)
by xfs.h for cleanliness.  Without this, a user compiling new xfsprogs
on a system with a 4.7 kernel gets this:

Building scrub
    [CC]     disk.o
In file included from ../include/xfs.h:37:0,
                 from disk.c:40:
../include/xfs/linux.h:185:8: error: redefinition of 'struct fsxattr'
struct fsxattr {
        ^~~~~~~
In file included from disk.c:31:0:
/usr/include/linux/fs.h:155:8: note: originally defined here
struct fsxattr {
        ^~~~~~~
gmake[2]: *** [../include/buildrules:60: disk.o] Error 1
gmake[1]: *** [include/buildrules:36: scrub] Error 2
make: *** [Makefile:77: default] Error 2

Reported-by: Mikael Magnusson <mikachu@gmail.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_scrub: don't ask user to run xfs_repair for only warnings

Don't advise the user to run xfs_repair on a filesystem that triggers
warnings but no errors; there's no corruption for it to fix.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_scrub: log operational messages when interactive

Record the summary of an interactive session in the system log so that
future support requests can get a better picture of what happened.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_db: don't crash in ablock if there's no inode

Make sure we actually have an inode selected before trying to unwrap its
attribute fork. Found via a crash in xfs/288 with project quotas
enabled.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

misc: fix gcc 7.3 warnings

Fix various compiler warnings that pop up in 7.3.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfsprogs: new libxfs-apply option for Signed-off-by: tag

Technically when a maintainer moves a patch from another project,
they should add their Signed-off-by: tag. Get that info automatically
from git-config, and add an option to to override it if desired,
to make that easy when cross-porting libxfs patches

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfsprogs: call libxfs_destroy from other utilities

Call libxfs_destroy() from xfs_copy, xfs_db, mkfs.xfs, and
xfs_repair to allow us to detect leaked items in these
utilities as well.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

libxfs: Catch non-empty zones on destroy

Create and use a kmem_zone_destroy which warns if we are
releasing a non-empty zone when the LIBXFS_LEAK_CHECK
environment variable is set, wire this into libxfs_destroy(),
and call that when various tools exit.

The LIBXFS_LEAK_CHECK environment variable also causes
the program to exit with failure when a leak is detected,
useful for failing automated tests if leaks are encountered.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

libxfs: move xfs_inode_zone to rdwr.c

The zone itself is created in rdwr.c, so define it there as
well, and add it to the list of externs in manage_zones along
with all the rest, for consistency.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

libxfs: add function to free all buffers in bcache

libxfs_bcache_purge simply moves all "free" buffers
onto the xfs_buf_freelist mru list; add a new function to
actually free them when we tear everything down, so leak
checkers don't go nuts about lots of unfreed xfs_bufs
at exit.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

libxfs: Replace XFS_BUF_SET_PTR with xfs_buf_associate_memory

We test the return value of the macro, but it returns
returns a side-effect which looks like failure. Write
a userspace-libxfs-specific version of xfs_buf_associate_memory
to make this code a tad more like the kernel, with a proper
return value to boot.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_io: add RWF_DSYNC support to pwrite

Enable testing write behaviour with the per-io RWF_DSYNC flag.

Signed-Off-By: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_scrub: make interpreter explicit to python3

Using #!/usr/bin/env makes some package dependency tools
such as rpm complain given that it cannot verify package
dependencies. Making it explicit resolves this lint rant.

Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_repair: Add missing braces to allow zeroing of corrupt log

When xlog_find_tail() fails to find the head or the tail, the missing
braces leads that an unparseable log always exits with status 2, even
if we've asked for -n or -L which should proceed. We can expose this
issue by xfstests case xfs/098.

Fixes: commit b04647edea32 ("xfs_repair: exit with status 2 if log dirtiness is unknown")
Signed-off-by: Xiao Yang <yangx.jy@cn.fujitsu.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_io: support a basic extent swap command

Extent swap is a low level mechanism exported by XFS to facilitate
filesystem defragmentation. It is typically invoked by xfs_fsr under
conditions that will atomically adjust inode extent state without
loss of file data.

While xfs_fsr provides some debug capability to tailor its behavior,
it is not flexible enough to facilitate low level tests of the
extent swap mechanism. For example, xfs_fsr may skip swaps between
inodes that consist solely of preallocated extents because it
considers such files already 100% defragmented. Further, xfs_fsr
copies data between files where doing so may be unnecessary and thus
inefficient for lower level tests.

Add a basic swapext command to xfs_io that allows userspace
invocation of the command under more controlled conditions. This
facilites targeted tests without interference from xfs_fsr policy,
such as using files with only preallocated extents, known/expected
failure cases, etc. This command makes no effort to retain data
across the operation. As such, it is for testing purposes only.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs_io: Add missing perror for write_once (-O)

This got missed in the last set of patches.

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

misc: enable link time optimization, if requested

Enable link time optimization (LTO) if the builder requests it. The
extra link optimization results in smaller binaries.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

misc: enable retpolines across all xfsprogs utilities

Detect and enable retpolines for all code, to mitigate Spectre v2
(branch target injection) on x86.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: fix u32 type usage in sb validation function

Source kernel commit: 131fa58d391fc0939f6c66b23776ad5df5db20f9

Don't use u32, use uint32_t, because this won't work in xfsprogs.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
[sandeen: no-op commit, fixed previously to keep build working]
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: don't screw up direct writes when freesp is fragmented

Source kernel commit: 6d8a45ce29c7d67cc4fc3016dc2a07660c62482a

xfs_bmap_btalloc is given a range of file offset blocks that must be
allocated to some data/attr/cow fork.  If the fork has an extent size
hint associated with it, the request will be enlarged on both ends to
try to satisfy the alignment hint.  If free space is fragmentated,
sometimes we can allocate some blocks but not enough to fulfill any of
the requested range.  Since bmapi_allocate always trims the new extent
mapping to match the originally requested range, this results in
bmapi_write returning zero and no mapping.

The consequences of this vary -- buffered writes will simply re-call
bmapi_write until it can satisfy at least one block from the original
request.  Direct IO overwrites notice nmaps == 0 and return -ENOSPC
through the dio mechanism out to userspace with the weird result that
writes fail even when we have enough space because the ENOSPC return
overrides any partial write status.  For direct CoW writes the situation
was disastrous because nobody notices us returning an invalid zero-length
wrong-offset mapping to iomap and the write goes off into space.

Therefore, if free space is so fragmented that we managed to allocate
some space but not enough to map into even a single block of the
original allocation request range, we should break the alignment hint in
order to guarantee at least some forward progress for the direct write.
If we return a short allocation to iomap_apply it'll call back about the
remaining blocks.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: treat CoW fork operations as delalloc for quota accounting

Source kernel commit: 4b4c1326fd7c7210d23d9dd3bfc51f2b6477bb9e

Since the CoW fork only exists in memory, it is incorrect to update the
on-disk quota block counts when we modify the CoW fork. Unlike the data
fork, even real extents in the CoW fork are only delalloc-style
reservations (on-disk they're owned by the refcountbt) so they must not
be tracked in the on disk quota info. Ensure the i_delayed_blks
accounting reflects this too.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: refactor accounting updates out of xfs_bmap_btalloc

Source kernel commit: 751f3767c245f9adf4f0a4f8f04aae9ae1d675a0

Move all the inode and quota accounting updates out of xfs_bmap_btalloc
in preparation for fixing some quota accounting problems with copy on
write.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: refactor inode verifier corruption error printing

Source kernel commit: 22431bf3dfbf44d7356933776eb486a6a01dea6f

Refactor inode verifier error reporting into a non-libxfs function so
that we aren't encoding the message format in libxfs. This also
changes the kernel dmesg output to resemble buffer verifier errors
more closely.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: bmap code cleanup

Source kernel commit: 6ca30729c206d62d88730a904af7d543a56273d8

Remove the extent size hint and realtime inode relevant code from
the xfs_bmapi_reserve_delalloc since it is not called on the inode
with extent size hint set or on a realtime inode.

Signed-off-by: Shan Hai <shan.hai@oracle.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

Split buffer's b_fspriv field

Source kernel commit: fb1755a645972ed096047583600838f6cf414e2b

By splitting the b_fspriv field into two different fields (b_log_item
and b_li_list). It's possible to get rid of an old ABI workaround, by
using the new b_log_item field to store xfs_buf_log_item separated from
the log items attached to the buffer, which will be linked in the new
b_li_list field.

This way, there is no more need to reorder the log items list to place
the buf_log_item at the beginning of the list, simplifying a bit the
logic to handle buffer IO.

This also opens the possibility to change buffer's log items list into a
proper list_head.

b_log_item field is still defined as a void *, because it is still used
by the log buffers to store xlog_in_core structures, and there is no
need to add an extra field on xfs_buf just for xlog_in_core.

Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Bill O'Donnell <billodo@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
[darrick: minor style changes]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
[sandeen: b_li_list unused in userspace]
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: convert to new i_version API

Source kernel commit: f0e28280629e0ec7921f3179409a179b1ea41f24

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: check sb_agblocks and sb_agblklog when validating superblock

Source kernel commit: 4bb73d014785cc55225686f9f46e7192fb59d26b

Currently, we don't check sb_agblocks or sb_agblklog when we validate
the superblock, which means that we can fuzz garbage values into those
values and the mount succeeds. This leads to all sorts of UBSAN
warnings in xfs/350 since we can then coerce other parts of xfs into
shifting by ridiculously large values.

Once we've validated agblocks, make sure the agcount makes sense.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
[sandeen: fix up u32 usage now so we keep building]
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: recheck reflink / dirty page status before freeing CoW reservations

Source kernel commit: be78ff0e72778eb4df4aac66edb9e97462bfe00d

Eryu Guan reported seeing occasional hangs when running generic/269 with
a new fsstress that supports clonerange/deduperange.  The cause of this
hang is an infinite loop when we convert the CoW fork extents from
unwritten to real just prior to writing the pages out; the infinite
loop happens because there's nothing in the CoW fork to convert, and so
it spins forever.

The fundamental issue here is that when we go to perform these CoW fork
conversions, we're supposed to have an extent waiting for us, but the
low space CoW reaper has snuck in and blown them away!  There are four
conditions that can dissuade the reaper from touching our file -- no
reflink iflag; dirty page cache; writeback in progress; or directio in
progress.  We check the four conditions prior to taking the locks, but
we neglect to recheck them once we have the locks, which is how we end
up whacking the writeback that's in progress.

Therefore, refactor the four checks into a helper function and call it
once again once we have the locks to make sure we really want to reap
the inode.  While we're at it, add an ASSERT for this weird condition so
that we'll fail noisily if we ever screw this up again.

Reported-by: Eryu Guan <eguan@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Tested-by: Eryu Guan <eguan@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: btree format ifork loader should check for zero numrecs

Source kernel commit: 55e45429ce3e4ac9dd2bf4937b1a499a69ccc4ca

A btree format inode fork with zero records makes no sense, so reject it
if we see it, or else we can miscalculate memory allocations. Found by
zeroes fuzzing {a,u3}.bmbt.numrecs in xfs/{374,378,412} with KASAN.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: attr leaf verifier needs to check for obviously bad count

Source kernel commit: 79a69bf8dc240ebeb105226a8a8540df136bf987

In the attribute leaf verifier, we can check for obviously bad values of
firstused and count so that later attempts at lasthash don't run off the
end of the memory buffer. Found by ones fuzzing hdr.count in xfs/400 with
KASAN.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: directory scrubber must walk through data block to offset

Source kernel commit: ce92d29ddf9908d397895c46b7c78e9db8df414d

In xfs_scrub_dir_rec, we must walk through the directory block entries
to arrive at the offset given by the hash structure. If we blindly
trust the hash address, we can end up midway into a directory entry and
stray outside the block. Found by lastbit fuzzing lents[3].address in
xfs/390 with KASAN enabled.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: cross-reference the realtime bitmap

Source kernel commit: 46d9bfb5e706493777b9dfed666cd8967f69e6fd

While we're scrubbing various btrees, cross-reference the records
with the other metadata.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: add scrub cross-referencing helpers for the refcount btrees

Source kernel commit: 49db55eca5665e32c9d3e67a7d5694bcc6c274de

Add a couple of functions to the refcount btrees that will be used
to cross-reference metadata against the refcountbt.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: add scrub cross-referencing helpers for the rmap btrees

Source kernel commit: ed7c52d4bf92ac1f05b8c251a44a8bf4688f8786

Add a couple of functions to the rmap btrees that will be used
to cross-reference metadata against the rmapbt.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: add scrub cross-referencing helpers for the inode btrees

Source kernel commit: 2e001266b67c865ad904e1889658282d0773b207

Add a couple of functions to the inode btrees that will be used
to cross-reference metadata against the inobt.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: add scrub cross-referencing helpers for the free space btrees

Source kernel commit: ce1d802e6a889b8ee53b3444c6d7e8cfecadac50

Add a couple of functions to the free space btrees that will be used
to cross-reference metadata against the bnobt/cntbt, and a generic
btree function that provides the real implementation.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

xfs: cancel tx on xfs_defer_finish() error during xattr set/remove

Source kernel commit: c468562879a766de2c2fbedd41b653a7bf4c157d

Chris Dunlop reports a problem where an xattr operation fails,
reports the following error to syslog and hangs during unmount:

================================================
[ BUG: lock held when returning to user space! ]
...
------------------------------------------------
<PID> is leaving the kernel with locks still held!
1 lock held by <PID>:
#0: (sb_internal){......}, at: [<ffffffffa07692a3>] xfs_trans_alloc+0xe3/0x130 [xfs]

The failure/shutdown occurs during deferred ops processing which
leads to an error return from xfs_defer_finish() via
xfs_attr_leaf_addname(). While the root cause of the failure is
unknown corruption, the cause of the subsequent BUG above and
unmount hang is failure to cancel the transaction before returning
to userspace.

The transaction is not cancelled because the out_defer_cancel error
handling paths in the xfs_attr_[leaf|node]_[add|remove]name()
functions clear args.trans without releasing the transaction. The
callers therefore lose the reference to the transaction and fail to
cancel it.

Since xfs_attr_[set|remove]() always cancel args.trans when != NULL
and xfs_defer_finish()->...->xfs_trans_roll() should always return
with a valid transaction, update the leaf/node xattr functions to
not reset args.trans in the error path responsible for cancelling
deferred ops.

Reported-by: Chris Dunlop <chris@onthe.net.au>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>