git.ipfire.org Git - thirdparty/xfsprogs-dev.git/log

libxfs: disambiguate xfs.h

There are two "xfs.h" header files in xfsprogs. include/xfs.h
contains the userspace API definitions for the platform and ioctl
interfaces. libxfs/xfs.h contains support infrastructure
to allow kernel code to compile in userspace. They have different
purposes, and so we need to disambiguate them so it is clear what
header files are being included in which files.

We can't change include/xfs.h as it is exported and installed into
/usr/include/xfs, and that means we have to rename the libxfs
internal header file. Rename this to "libxfs_priv.h" so it is clear
that it is private to libxfs, and update all the libxfs code to
include it.

Signed-off-by: Dave Chinner <dchinner@redhat.com>

libxfs: directly include libxfs headers

As a step closer to the kernel code, have the libxfs code explicitly
include the headers they require to compile. This causes lots of
churn in the include files but starts to bring sanity to current
include mess.

IOWs, this is the first step towards libxfs having a clean
internal/external API demarkation for userspace. The internal libxfs
build is controlled through libxfs/xfs.h, and that defines the
functions that are exported by the library via the libxfs_*
namespace. This also starts moving the internal header structure to
the same layout and dependencies as the kernel code so that we can
separate out include/libxfs.h so that it only needs to include other
header files and doesn't ave to provide lots of "work around kernel
oddities" and export function prototypes that the internal libxfs
code does not define prototypes for.

There's still lots of follow patches to make this a reality, but
this is the first major step in cleaning up the mess.

Signed-off-by: Dave Chinner <dchinner@redhat.com>

libxfs: update to 4.1-rc2 code base

Signed-off-by: Dave Chinner <dchinner@redhat.com>

libxfs: update to match 3.19-rc1 kernel code

Signed-off-by: Dave Chinner <dchinner@redhat.com>

libxfs: restructure to match kernel layout

The kernel now has a libxfs directory, so we need to restructure the
userspace libxfs source tree to match the kernel layout. This
involves changing the location of libxfs include files and how they
are linked into include/xfs for all the subdirectories to see them.

This is a bit convoluted as an initial conversion step - we move all
the libxfs headers files from include/ to libxfs/ and then add build
rules to symlink the files directly to include/xfs so everything
still "sees" the same header files. This changes the build
dependencies slightly, as libxfs now must be built after include
(unchanged) but before anything else (new) so that we have a
complete include/xfs directory. This also affects install and
packaging rules, as header install rules are now split across
include and libxfs Makefiles.

Signed-off-by: Dave Chinner <dchinner@redhat.com>

libxfs: error negation rework

The libxfs core in the kernel now returns negative error numbers one
failure rather than positive numbers. This commit switches the
libxfs core to use negative error numbers and converts all
the libxfs function callers to negate the returned error so that
none of the other codeneeds tobe changed at this time.

This allows us to drive negative errors through the xfsprogs code
base at our leisure rather than having to do it all right now.

Signed-off-by: Dave Chinner <dchinner@redhat.com>

xfs: Nuke XFS_ERROR macro

XFS_ERROR was designed long ago to trap return values, but it's not
runtime configurable, it's not consistently used, and we can do
similar error trapping with ftrace scripts and triggers from
userspace.

Just nuke XFS_ERROR and associated bits.

Port of kernel commit b474c7ae4395ba684e85fde8f55c8cf44a39afaf
from Eric Sandeen.

Signed-off-by: Dave Chinner <dchinner@redhat.com>

xfs: return is not a function

return is not a function. "return(EIO);" is silly;
"return (EIO);" moreso. return is not a function.
Nuke the pointless parens.

Port of kernel commit d99831ff393ff2e28d6110b41f24d9fecf986222
from Eric Sandeen.

Signed-off-by: Dave Chinner <dchinner@redhat.com>

libxfs: update to 3.16 kernel code

Bulk update, including all the changes to the surrounding userspace
code due to API changes.

Signed-off-by: Dave Chinner <dchinner@redhat.com>

xfs: kill unsupported superblock versions

We don't support filesystems older than dir v2 support, so we
always know about v2 inodes and nlink and other feature bits.
Strip out all the old, unnecessary feature support stuff from
xfs_repair in preparation for merging the 3.16 libxfs code from
the kernel.

Signed-off-by: Dave Chinner <dchinner@redhat.com>

libxfs: do all xfs->libxfs defines inside libxfs/

Currently include/libxfs does some #define libxfs.... xfs...
conversions, allowing external functions to call xfs namespace
functions directly. All of the exported functions from libxfs shoul
dbe compiled in as libxfs_... namespace symbols, and so such defines
should be in libxfs/xfs.h and done the opposite way around.

This exposes an awful lot of incorrectly namespaced calls in the
userspace code and highlights functions that we weren't expressly
exporting from libxfs, so we need to fix them up at the same time.

As such, this patchset starts laying the groundwork for a more
formalised libxfs interface definition. The libxfs code will need
to be restructured to match the kernel code restructing that was
done in 3.17, so this interface will become more formalised and
defined as that work is done.

Signed-off-by: Dave Chinner <dchinner@redhat.com>

xfsprogs: Release v3.2.4

Update all the release files to 3.2.4.

Signed-off-by: Dave Chinner <david@fromorbit.com>

metadump: Zero unused portions of inode, BMAP, and allocation btree blocks

Zero out the unused regions of these btree blocks, as
they may contain stale data.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

metadump: Zero unused portions of DA_NODE blocks

Zero out unused portions of interior nodes of DA btrees

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

metadump: Zero unused portions of attribute blocks

Attribute blocks may contain stale disk data either
between names (and/or values), and between the entries
array and the actual names/values. Zero out both of
these regions.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

metadump: Zero unused tail of symlink blocks

Symlink blocks may contain stale data past the end
of their content; zero it out.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

metadump: Zero sparse/unused regions of dir2

dir2 data has explicitly unused regions as well as
padding between entries; zero this out to avoid copying
stale data.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

metadump: Zero unused portions of inode literal area

The data & attr regions of the inode literal area
are rarely full; unused portions should be zeroed
out so that we don't copy stale disk data.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

metadump: Zero literal area of unused inodes

When metadump copies unused inodes it should zero out
the literal area, which may contain stale on-disk data.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

metadump: Zero out unused portion of the AGFL

mkfs.xfs doesn't zero the AGFL, so if it hasn't been
entirely used, metadump can pick up stale data. Zero
the unused parts.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

metadump: Copy the log if not obfuscating or zeroing

If we're not obfuscating and we're not zeroing out
stale data, we're collecting maximal information. Keep
even a clean log intact in that case as well.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

metadump: Add option to copy metadata blocks intact

By default we zero out unused portions of metadata blocks.
This adds an option to keep blocks intact, to better investigate
corruptions.

Nothing in this patch reads it, but subsequent patches add
users of the option.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

metadump: handle multi-block directories

assembled a buffer from multiple dir blocks, and we can use that in
obfuscate_dir_data_block() to continue past the first filesystem block
and continue obfuscating the entire thing.

Without this, anything after the first block was skipped, and
remained as cleartext.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

metadump: Obfuscate the filesystem label

Not a lot of room for sensitive data, but while
we're at it...

There is no need to do the normal hashed name,
we just replace it with "L's" - and for kicks,
we preserve the length of the label.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

metadump: Fill attribute values with 'v' rather than NUL

Rather than memset attribute values to '\0', use the character 'v' -
otherwise in some cases we get attributes with a non-zero value
length which start with a NUL, and that makes some userspace tools
unhappy, yielding results like this:

security.oO^Lio.=0sAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

metadump: obfuscate attrs on CRC fs

Lots of issues in xfs_metadump obfuscation of extended
attributes with CRC filesystems; this fixes it up.

The main issues are that the headers differ, and the
space in the remote blocks differ.

Tested with a script I'm using to do other metadump
work; I still owe an xfstest for this and other bits.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_metadump: obfuscate remote symlinks on CRC filesystems

On CRC filesystems, the symlink block starts with a header,
which contains magic, "XLSM"

The code happens to "work" today w/o corrupting anything,
because it seems "XSLM" as a string, decides it's too short
to obfuscate, and leaves it alone.

But the real symlink target is untouched. Fix that by moving
the pointer to the string we want to obfuscate by the size
of the header, and shorten the length to obfuscate accordingly.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_metadump: don't zero log if not obfuscating

The earlier commit:

ec693e1 metadump: zero out clean log

ignored the "obfuscate" state, but there's no reason to
zero out the log if we're not obfuscating; all the other
metadata is in the clear, so we may as well keep it around
in the log as well.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

metadump: zero out clean log

When doing an xfs_metadump, if the log is clean, zero it out
for 2 reasons:

* It'll make the image more compressible
* It'll eliminate an un-obfuscated metadata source

If the log isn't clean, and the user expected obfuscation, warn
that metadata in the log will not be obfuscated.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

metadump: rename dont_obfuscate variable

Seeing "if (!dont_obfuscate)" hurts my brain; rename it to
"obfuscate" and invert the logic.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

libxlog: add xlog_is_empty() helper

xfs_repair and xfs_db both check for a dirty log, using roughly
the same cut and pasted code.

Add a new helper to do this, xlog_is_dirty(), and use it instead.

Note, the helper (and the code before it) is a little odd in that
it (re-)initializes some libxfs_init_t members before proceeding,
I haven't yet worked out if that's necessary or correct, so I've
just kept the same behavior for now. Still, it seems like this
should be done in libxfs_init, or not at all.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfsprogs: v3.2.3 release

Update changelog and version files for upcoming release.

Signed-off-by: Dave Chinner <david@fromorbit.com>

xfsprogs: update deb packaging for upcoming release

Update debian changelog file for upcoming release.

Signed-off-by: Nathan Scott <nathans@debian.org>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfsprogs: update for 3.2.3-rc2 release

Update changelog and version files for upcoming release.

Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: ensure .. is set sanely when rebuilding dir

When we're rebuilding a directory, ensure that we reinitialize the
directory with a sane parent ('..') inode value.  If we don't, the
error return from xfs_dir_init() is ignored, and the rebuild process
becomes confused and leaves the directory corrupt.  If repair later
discovers that the rebuilt directory is an orphan, it'll try to attach
it to lost+found and abort on the corrupted directory.  Also fix
ignoring the return value of xfs_dir_init().

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: better checking of v5 metadata fields

Check the UUID, owner, and block number fields during repair, looking for
blocks that fail either the checksum or the data structure verifiers. For
directories we can simply rebuild corrupt/broken index data, though for
anything else we have to toss out the broken object.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: validate superblock against known v5 feature flags

Apparently xfs_repair running on a v5 filesystem doesn't check the
compat, rocompat, or incompat feature flags for bits that it doesn't
know about, which means that old xfs_repairs can wreak havoc. So,
strengthen the checks to prevent repair from "repairing" anything it
doesn't understand.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Fanael Linithien <fanael4@gmail.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: properly detect reserved attribute names

This function in xfs_repair tries to make sure that if an attr
name reserved for acls exists in the root namespace, then its
value is a valid acl.

However, because it only compares up to the length of the
reserved name, superstrings may match and cause false positive
xfs_repair errors.

Ensure that both the length and the content match before
flagging it as an error.

Spotted-by: Zach Brown <zab@zabbo.net>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfsprogs: update for 3.2.3-rc1 release

Update changelog and version files for upcoming release.

Signed-off-by: Dave Chinner <david@fromorbit.com>

mkfs: inode/block size error messages confusing

As reported by Brain:

# ./mkfs/mkfs.xfs -f /dev/test/scratch -b size=512
illegal inode size 512
allowable inode size with 512 byte blocks is 256
# ./mkfs/mkfs.xfs -f /dev/test/scratch -b size=512 -i size=256
Minimum inode size for CRCs is 512 bytes
Usage: mkfs.xfs
...

Fix mkfs to catch the block size as being too small, rather than
leaving it for inode size detection to enforce.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

mkfs: default to CRC enabled filesystems

It's time to change the mkfs defaults to enable CRCs for all new
filesystems. While there, also enable the free inode btree by
default, too, as that functionality has also had long enough to make
it into distro kernels, too. Also update the man page to reflect the
change in defaults.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_db: report FINOBT in version output

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfsprogs: Removing deprecated _BSD_SOURCE definition.

In glibc 2.20, _BSD_SOURCE was deprecated in favor of _DEFAULT_SOURCE.
_DEFAULT_SOURCE is included with _GNU_SOURCE, as well as _XOPEN_SOURCE >= 500,
so the obsolete and unnecessary definitions are removed.

For more details, see man 7 feature_test_macros for glibc >= 2.20.

Signed-off-by: Jan Ťulák <jtulak@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

libhandle: add fd_to_handle to handle.h

It's on the man page but strangely missing from the header.

Signed-off-by: Sage Weil <sage@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfsprogs: remove unreachable code in libxfs_inode_alloc

This code does:

        if (!ialloc_context && !ip)
                return;

        // if !ip here, ialloc_context must be true

        if (ialloc_context) {
                ...
                if (!ip)
                        error = ENOSPC;
                if (error)
                        return error;
                // if !ip in this block we've returned
        }

        // so (!ip) cannot be true here
        if (!ip)
                error = ENOSPC;

(cherry picked this one out of Coverity reports)

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_db: disallow sb UUID write on v5 filesystems

Do not allow xfs_db (or the xfs_admin frontend) to change the UUID
of a V5 filesystem; this will cause UUID mismatches across the
filesystem, and we currently have no mechanism to update them all.
Changing only the superblock UUID makes all other metadata look
invalid, and xfs_repair reacts by junking everything.

Addresses-Debian-Bug: 782012
Reported-by: F. Stoyan <fstoyan@swapon.de>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

repair: only check secondary sb->sb_pquotino for v5 superblocks

xfs_repair scans for garbage data beyond the last valid superblock field
for the particular sb version in secondary_sb_wack(). If any non-zero
data is detected, the entire range is reset to zero. Subsequently,
various valid superblock fields are checked for valid/expected data.

The sb_pquotino field is checked unconditionally as part of this
sequence even though it is a v5 only field. As a result, repair
complains about a non-null project quota field if any garbage data
exists for a v4 secondary sb. This is reproduced by xfs/070 against a v4
superblock and is also easily reproduced manually as follows:

$ mkfs.xfs -f -m crc=0 <dev>
$ xfs_db -x -c "sb 3" -c "write lsn 1" <dev>
$ xfs_repair <dev>
...
zeroing unused portion of secondary superblock (AG #3)
non-null project quota inode field in superblock 3
...

This occurs because the garbage data detection mechanism has reset
sb->sb_pquotino to 0 while the validity check expects a value of
NULLFSINO.

Update secondary_sb_wack() to only check sb->sb_pquotino for validity on
supers where it is a valid field. If it is anything other than 0 on
pre-v5 superblocks, it is explicitly reset to 0 by the garbage data
checks earlier in the function.

Reported-by: Xing Gu <gux.fnst@cn.fujitsu.com>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

db/check: handle zero inoalignmt correctly for large block sizes

The check command prints a spurious error when sb_inoalignmt is zero but
the sb align feature bit is set:

$ mkfs.xfs -f -bsize=16k <dev>
$ xfs_db -c "check" <dev>
sb versionnum extra align bit 80

This occurs because check determines whether to expect the alignment
feature bit based on a non-zero inoalignmt (in init()). sb_inoalignmt of
0 is expected for block sizes that are large enough for at least one
full inode record (64 inodes), however. For example, when bsize >= 16k
on v4 filesystems or >=32k on v5 filesystems.

Update the init() logic in the check command to detect this particular
scenario. Set the in-memory feature bit if inoalignmt is zero and the
block size is large enough for full inode records such that blockget_f()
does not complain.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

mkfs: don't zero old superblocks if file was truncated

If the force overwrite option is passed to mkfs, we attempt to zero old
superblock metadata on the mkfs target. We attempt to read the primary
superblock to identify the secondary superblocks.

If the mkfs target is a regular file, it is truncated on open and the
secondary superblock zeroing operation returns a spurious and incorrect
error message due to a 0-byte read:

$ mkfs.xfs -f -d file=1,name=xfs.fs,size=32m
...
existing superblock read failed: Inappropriate ioctl for device

Fix the error reporting in zero_old_xfs_structures() to only print an
error string if the pread() call returns an error. Warn the user if the
read doesn't match the sector size. Finally, detect the case where we
know we've already truncated a regular file and skip the sb zeroing.

Reported-by: Alexander Tsvetkov <alexander.tsvetkov@oracle.com>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

libxfs: don't write uninitialized heap contents into new directory blocks

Clear the contents of the xfs buffer when we're initializing it to avoid
writing random heap contents (and CRC thereof) to disk.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: don't abort on bad directory leaf crc during leaf check

longform_dir2_check_leaf() checks a directory leaf block to help
decide if we need to rebuild the directory. If the verifier fails
with a CRC or corrupt structure error, rebuild the directory instead
of aborting.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: validate & fix inode CRCs

xfs_repair doesn't ever check an inode's CRC, so it never repairs
them. If the root inode or realtime inodes have bad crcs, the
fs won't even mount and can't be fixed (without using xfs_db).

It's fairly straightforward to just test the inode CRC before
we do any other checking or modification of the inode, and
just mark it dirty if it's wrong and needs to be re-written.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: don't clear . or .. in process_dir2_data

process_dir2_data() has special . and .. processing; it is able
to correct these inodes, so there is no reason to clear them.

Do this before we adjust a length 0 filename to length 1, so
that we don't take this action on an accidentally created "."
name from a hidden dotfile.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: set *parent if process_dir2_data() fixes root inode

process_dir2_data() may fix the root dir's parent inode:

"bad .. entry in root directory inode 6912, was 7159: correcting"

But we don't update the *parent passed in in that case; this then leads to
an assert later in process_dir2:

xfs_repair: dir2.c:2039: process_dir2:
  Assertion `(ino != mp->m_sb.sb_rootino && ino != *parent) ||
  (ino == mp->m_sb.sb_rootino && (ino == *parent || need_root_dotdot == 1))'
  failed.

Updating the value of *parent when we fix the parent value resolves this
problem.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: clear need_root_dotdot if we rebuild the root dir

It's possible to enter longform_dir2_rebuild with need_root_dotdot
set; rebuilding it will add "..", and if need_root_dotdot stays set,
we'll add another ".." entry, causing a second repair to find and
fix that problem.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

libxfs: remove ASSERT on ftype read from disk

This one is already fixed in the kernel, with
fb04013 xfs: don't ASSERT on corrupt ftype
but that kernel<->userspace merge is still pending.

In the meantime, just fix it as a one-off here, because ASSERTing
on bad on-disk values when running xfs_repair is a very unfriendly
thing to do.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: remove last-entry hack in process_sf_dir2

process_sf_dir2() tries to special-case the last entry in
a short form dir, and salvage it if the name length is wrong
by using the dir size as a clue to what the length should be.

However, this doesn't actually work; there is either a 32-bit
or 64-bit inode after the name, and with dirv3 there's a file
type in there as well. Extending the name length to the dir
size means it overlaps these fields, which often have a 0 in
the top bits, and then namecheck() fails the result and junks
it anyway:

> entry #1 is zero length in shortform dir 67, resetting to 29
> entry contains illegal character in shortform dir 67
> junking entry "bbbbbbbbbbbbbbbbbbbbbbbb" in directory inode 67

And depending on the corruption, the current code will set
a new negative namelen if it turns out that the name itself
starts beyond dir size; it can't be shortened enough.

So, we could fix this up, and choose a new namelen such that
the xfs_dir2_ino8_t and/or xfs_dir2_ino8_t and/or file type
still fits at the end, but we really seem to be reaching the
point of diminishing returns. The chances that nothing but
the name length is wrong, and that the remaining few
bytes at the end of the dir size are actually correct, seems
awfully small.

Therefore just remove this special case, which I think is
of questionable value.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: collapse 2 cases in process_sf_dir2

process_sf_dir2() has 2 cases for a bad namelen: on-disk namelen
is 0, or on-disk namelen extends beyond the directory i_size.

After the prior 2 patches, the code for each case is now essentially
the same, so collapse the two cases.

Note, this does slightly change some of the error messages.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: remove impossible tests in process_sf_dir2

We're testing for an impossible case in here:

                if (i == num_entries - 1)  {
...
                } else  {
...
                                if (i == num_entries - 1)
                                    /* can't happen! */
...
                }

So, remove the impossible case.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: dirty inode in process_sf_dir2 if we change namelen

There are two "fix sfep->namelen" cases, but we only mark
*dino_dirty = 1 in one of them. Add the other to ensure that
the change gets written out.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_db: nlink fields are valid for di_version == 3, too

Printing inodes with di_version == 3 skips the nlink
fields, because they are only printed if di_version == 2.
This was intended to separate them from di_version == 1,
but it mistakenly excluded di_version == 3, which also contains
these fields.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_db: fix inode CRC validity state, and warn on read if invalid

Currently, the "ino_crc_ok" field on the io cursor reflects
overall inode validity, not CRC correctness. Because it is
only used when printing CRC validity, change it to reflect
only that state - and update it whenever we re-write the
inode (thus updating the CRC).

In addition, when reading an inode, warn if the CRC is bad.

Note, when specifying an inode which doesn't actually exist,
this will claim corruption; I'm not sure if that's good or
bad. Today, it already issues corruption errors on the way;
this adds a new message as well:

xfs_db> inode 129
Metadata corruption detected at block 0x80/0x2000
Metadata corruption detected at block 0x80/0x2000
...
Metadata CRC error detected for ino 129

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_db: Allow writes of corrupted data

Being able to write corrupt data is handy if we wish to test
repair against specific types of corruptions.

Add a "-c" option to the write command which allows this by removing
the write verifier.

Note that this also skips CRC updates; it's not currently possible
to write invalid data with a valid CRC; CRC recalculation is
intertwined with validation.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

repair: superblock buffers need to be sector sized

In secondary_sb_wack() we zero the unused portion of both the
on-disk superblock and the in-memory copy that we have. When
the device sector size is 4k, this causes xfs_repair to crash like
so:

# xfs_repair /dev/ram1
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
bad magic number
bad on-disk superblock 3 - bad magic number
primary/secondary superblock 3 conflict - AG superblock geometry info conflicts with filesystem geometry
zeroing unused portion of secondary superblock (AG #3)
#

The stack trace is indicative:

#0  memset ()
#1  0x000000000040404b in secondary_sb_wack
#2  verify_set_agheader
#3  0x0000000000427b4b in scan_ag
#4  0x000000000042a2ca in worker_thread
#5  0x00007ffff77ba0a4 in start_thread
#6  0x00007ffff74efc2d in clone

Which points at memset overrunning the in memory buffer, as it is
only 512 bytes in length.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

mkfs: log stripe unit fails to influence default log size

This fails on a 4GB, 4k physical sector size device:

# mkfs.xfs -f -l version=2,su=256k /dev/ram1
log size 2560 blocks too small, minimum size is 3264 blocks
....

The combination of 4k sectors and a log stripe unit increase the
minimum size of the log. We should be automatically calculating an
appropriate, valid log size when the user does not specify it.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

metadump: check for non-zero inode alignment

The copy_inode_chunk() function performs some basic sanity checks on the
inode record, block number, etc. One of these checks includes whether
the inode chunk is aligned according to sb_inoalignmt. sb_inoalignment
can equal 0 with larger block sizes. This results in a mod-by-zero,
"badly aligned inode ..." warnings and skipped inodes in metadump
images. This can be reproduced with a '-m crc=1,finobt=1 -b size=64k' fs
on ppc64.

Update copy_inode_chunk() to only enforce the inode alignment check when
sb_inoalignmt is non-zero.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

metadump: include NULLFSINO check in inode copy code

The copy_ino() function includes a check for effectively NULL inode
numbers. It checks for 0 but does not include NULLFSINO. This leads to
spurious warnings in some instances. For example, copy_ino() is called
unconditionally for sb quota inodes from copy_sb_inodes(), values of
which can be NULLFSINO.

Check for NULLFSINO and return quietly from copy_ino().

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfsprog: xfsio: update xfs_io manpage for FALLOC_FL_INSERT_RANGE

Update xfs_io manpage for FALLOC_FL_INSERT_RANGE.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_quota: manpage - project command requires arguments

The xfs_quota man page states that the "project" command without
arguments will list all project names and identifiers, but it has
never done this; the project_f command has always been defined as
requiring at least one argument.

Fix the man page to reflect reality.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfsprogs: do not do any dynamic linking of libtool libraries

if --disable-static and --enable-shared are given on the command
line, the link with xfsprogs's internal libraries fail because
they have been dynamically compiled.

Hence the following error:
ld: attempted static link of dynamic object `../libxcmd/.libs/libxcmd.so'

xfsprogs rely on the original behaviour of -static which was modified in
Buildroot by [1]. But since commit [2] the build of xfsprogs tools is broken
because they try to link statically with the static libuuid library
(util-linux), which is not build for shared only build.

The use of -static-libtool-libs allows to fallback to the dynamic linking for
libuuid only:

LD_TRACE_LOADED_OBJECTS=1 xfs_copy
linux-gate.so.1 => (0xf7793000)
libuuid.so.1 => /lib/libuuid.so.1 (0x465e1000)
libpthread.so.0 => /lib/libpthread.so.0 (0x46db1000)
librt.so.1 => /lib/librt.so.1 (0x46f21000)
libc.so.6 => /lib/libc.so.6 (0x46bf1000)
/lib/ld-linux.so.2 (0x46bce000)

[1] http://git.buildroot.net/buildroot/commit/?id=97703978ac870ce2b14ad144f8e082de82aa2c64
[2] http://git.buildroot.net/buildroot/commit/?id=f1d3e09895b245da9d54bbaef36e5de95269034e

Signed-off-by: Romain Naour <romain.naour@openwide.fr>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_quota: fix typo in manpage

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

repair: check ino alignment value to avoid mod by zero

xfs_repair checks inode records for valid alignment according to the
alignment specified in the superblock. It currently performs the
alignment check whenever fs_aligned_inodes is set, which is determined
based on whether the fs supports the field.

Support for the field does not guarantee its value is non-zero, however.
For example, a large block size fs on a large page size arch (e.g.,
ppc64):

mkfs.xfs -f -m crc=1,finobt=1 -b size=64k <dev>

... can lead to incorrect badly aligned inode record messages from
xfs_repair and other problems.

Update the inobt and finobt checks to verify that alignment is a
non-zero value before attempting to use it to divide (mod) by zero.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

repair: fix v5 sb ino alignment calculation for large blocksizes

xfs_repair validates the superblock inode alignment field against
several possible valid values. On v5 superblocks, the inode alignment
can be scaled up based on the inode size in relation to the minimum
inode size.

If the block size is larger than the default cluster size (consider
large page size arches such as ppc64), the initial alignment value
calculates to zero. If the inode size is large enough such that
sb_inoalignmt is not zero, sb_validate_ino_align() scales the align
value by the factor of inode size increase. If align is zero, however,
we multiply by zero, the subsequent check incorrectly fails and the
overall superblock verification fails as well. To reproduce, format an
fs as follows on ppc64 and run xfs_repair:

mkfs.xfs -f -m crc=1 -b size=64k -i size=2k <dev>

Fix the scaled alignment calculation by scaling the default cluster size
appropriately to avoid a multiplication by zero.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

libxfs: report VERSION in libxfs_fs_repair_cmn_err()

Because this is usually the first question asked...

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_io: add finsert command for insert range

Add finsert command for fallocate FALLOC_FL_INSERT_RANGE flag.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

repair: remove unused strided secondary sb scan logic

verify_set_primary_sb() scans and verifies secondary superblocks based
on the primary sb. It currently defines a maximum number of 8
superblocks to scan per iteration. It also implements a strided
algorithm that appears intended to ultimately scan every secondary,
albeit in a strided order.

Given that the algorithm is written to hit every sb and the stride value
is initialized as follows:

num_sbs = MIN(NUM_SBS, rsb->sb_agcount);
skip = howmany(num_sbs, rsb->sb_agcount);

... which is guaranteed to be 1 since the howmany() parameters are
backwards, the strided algorithm doesn't appear to accomplish anything
that can't be done with a simple for loop. In other words, despite the
max value and strided algorithm, repair always scans all of the
secondary superblocks in incremental order.

Therefore, remove the strided algorithm bits and replace with a simple
for loop. As a result of this cleanup, also remove the 'checked' buffer
used to track repeated ag visits and the now unused NUM_SBS definition.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

repair: fix unnecessary secondary scan if only last sb is corrupt

verify_set_primary_sb() scans the secondary superbocks based on the
geometry specified in the primary and determines the most likely correct
geometry by tracking how many superblocks are consistent across the set.
The most frequent geometry is copied into the primary superblock. The
return value is checked by the caller (phase1()) to determine whether a
brute force secondary scan is necessary.

This generally occurs when not enough secondary sb's are consistent to
declare the geometry correct. If enough secondaries are consistent,
verify_set_primary_sb() returns the status of the last secondary sb that
was scanned. Corruptions to secondary supers other than the last are
thus resolved fine. If the last secondary is corrupt, however, an error
is returned to phase1(). This causes a brute force scan even if enough
supers were found to repair the last secondary.

Move the initialization of retval to after the sb scan to return an
error only if not enough secondary supers were found to declare a
correct geometry.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: fix max block offset test

Eryu pointed out that in fstest xfs/071, we find corruption
reported at the end. This test attempts to do IO at the
maximum possible offsets, and repair yields:

inode 1027 - extent offset too large - start 70, count 1, offset 2251799813685247
correcting nextents for inode 1027
bad data fork in inode 1027
would have cleared inode 1027

Repair is complaining that an extent *starts* at the maximum
block, but AFAICT, starting there is just fine, as long as
we also end there. i.e. a one-block extent at the limit
is just fine.

So change the xfs_repair test to allow this situation.

Also, the warning text is a bit unclear, mixing in the physical
block w/ the logical block... rearrange that a little to make
it obvious.

Reported-by: Eryu Guan <eguan@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: do not check symlink component lengths

As reported by Andy Grimm,

# ln -s $( python -c 'print "a" * 260' ) /mnt/foo

will succeed on xfs, but then xfs_repair will complain:

component of symlink in inode 131 too long
problem with symbolic link in inode 131
would have cleared inode 131

The kernel checks the total length of the symlink on both read
and write, but does not look at component paths.

Looking around the kernel, no other filesystem checks component
lengths, nor does the vfs. And as Andy points out, the target
could even be on a different filesystem, with different limitations.

And having a "too-long" component doesn't even seem like something
likely to stem from disk corruption anyway, so I'm not sure why repair
should care.

Therefore I propose removing the component length checks from xfs_repair.

Andy Grimm <agrimm@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfsprogs: v3.2.2 release

Update all the version and changelog files for release.

Signed-off-by: Dave Chinner <david@fromorbit.com>

repair: fix crash on zero record finobt reconstruction

The inode btrees are reconstructed in phase 5. init_ino_cursor() helps
determine the block requirements of the tree based on the number of
records. If the finobt is empty, we can crash in the btree blocks
calculation code due to a divide-by-zero error in the following line:

lptr->modulo = num_recs % lptr->num_blocks;

This occurs if num_recs and in-turn lptr->num_blocks evaluate to zero.

We already have an execution path for the zero record btree scenario.
However, it is only invoked when no records are found in the in-core
tree. The finobt zero-record scenario can occur with a populated in-core
tree provided that none of the existing records contain free inodes.

Move the zero-record handling code after the loop and use the record
count to trigger it. This is safe because the loop iterator checks for
ino_rec != NULL. This allows reuse of the same code regardless of
whether the in-core tree is empty or non-empty but contains no records
that meet the requirements for the particular on-disk tree under
reconstruction (e.g., finobt).

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_copy: simplify first_agbno calculation

After ffe9a9a xfsprogs: xfs_copy: fix data corruption of target,
xfs_copy started hitting an ASSERT for a 4k sector / 4k blocksize
filesystem:

# dd if=/dev/zero of=test.img bs=1M count=1024
# mkfs.xfs -s size=4096 test.img
# xfs_copy test.img xfs.img
xfs_copy: xfs_copy.c:720: main: Assertion `((((((xfs_daddr_t)(3 << (mp)->m_sectbb_log)) + 1) * (1<<9)) + first_residue) % source_blocksize) == 0' failed.
Aborted

I started digging through all the calculations below, and realized
that in the end, all it wants is the first filesystem block after
the AG header. XFS_AGFL_BLOCK(mp) + 1 suffices for this purpose;
rip out the rest which seems overly complex and apparently bug-prone.

I tested this by creating a 4g filesystem with combinations of
sector & block size between 512 and 4k, copying in /lib/modules,
running an xfs_copy of that, and running repair against the copy;
it all looks good. It took a long time, but I will create a
simpler/shorter xfstest based on this.

Reported-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

packaging: update deb changelog for upcoming release

Add list of resolved issues from merged patches into the
debian/changelog for processing with the next release.

Signed-off-by: Nathan Scott <nathans@debian.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

packaging: clarify xfs header licensing within deb builds

Tackles bug #751511 - ensuring the licensing information in
the debian/copyright file matches reality. Use explanation
that Christoph Hellwig came up with, pretty much verbatim.

Signed-off-by: Nathan Scott <nathans@debian.org>
Signed-off-by: Dave Chinner <david@fromorbit.com>

packaging: rework dh_autoreconf invocation for deb builds

Reviewed, tested and merged the final iteration of proposed
solutions to #757455 - resolving configure-script-generation
for clean ppc64le builds, originally. Many thanks to Andreas
Barth and Matthias Klose for coming up with this solution.

Signed-off-by: Nathan Scott <nathans@debian.org>
Signed-off-by: Dave Chinner <david@fromorbit.com>

repair: AGFL rebuild fails if btree split required

In phase 5 we rebuild the freelist btrees, the AGF and the AGFL from
all the free space information we have gathered and resolved during
pahses 3 and 4. If the freespace information is laid out just right,
we end up having to allocate free space for the AGFL blocks.

If the size of the free space we allocate from is larger than the
space we need, then we have to insert the remainder back into the
freespace btree. For the by-size tree, this means we are likely to
be removing a record from one leaf, and then inserting the remainder
- a smaller size - into another leaf.

The issue is that the leaf blocks to the left of the original leaf
block we removed the extent record from are full and hence require a
split to insert the new record. That, of course, requires a free
block in the AGFL to allocate from, and now we have a chicken and
egg situation: there are no free blocks in the AGFL because we are
setting it up.

As a result, setting up the free list silently fails, leaving the
freespace btrees in an inconsistent state and the AGFL in question
empty. When the filesystem is next mounted, the first allocation
from that AGF results in attempting to fix the AGFL, and it then
does exactly the same thing as the repair code, fails to allocate a
block during the split and fails. This results in an immediate
shutdown because the transaction doing the allocation is dirty by
this stage.

The fix for the problem is to make repair handle rebulding the btree
differently. If we leave ispace for a couple of records in each
btree leaf and node, there is never a need for a split to occur when
initially setting up the AGFL. This results in repair doing the
right thing, and hence the runtime problems after mount don't occur.
Further, add error checking the the AGFL setup code and abort repair
if we have a failure to correctly set up the AGFL so we catch this
problem at repair time, not mount time...

Reported-by: Barkley Vowk <bvowk@box.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

repair: fix XR_BLD_FREE_TRACE compilation errors

Obviously hasn't been used for quite some time, so fix the
build problems and make it useful again.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

mkfs: don't warn about log sunit size if it was auto-discovered

Today, users doing a bare mkfs on storage with a large default
stripe size may be surprised to get this warning:

log stripe unit (%d bytes) is too large (maximum is 256KiB
log stripe unit adjusted to 32KiB

through no fault of their own. The fallback is appropriate
and harmless, and there's no need to warn about this in the
defaults case.

However, we keep the warning if a large log stripe unit was
specified by the user on the commandline.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

mkfs: ignore stripe geom if sunit or swidth == physical sector size

Today, this geometry:

# modprobe scsi_debug opt_blks=2048 dev_size_mb=2048
# blockdev --getpbsz --getss --getiomin --getioopt /dev/sdd
512
512
512
1048576

will result in a warning at mkfs time, like this:

# mkfs.xfs -f -d su=64k,sw=12 -l su=64k /dev/sdd
mkfs.xfs: Specified data stripe width 1536 is not the same as the volume stripe width 2048

because our geometry discovery thinks it looks like a
valid striping setup which the commandline is overriding.
However, a stripe unit of 512 really isn't indicative of
a proper stripe geometry.

Prior to this patch, we reset only sunit *or* swidth,
if either was equal to physical block size, but not
necessarily both.

Change the heuristic so that if either the discovered
sunit or the discovered swidth is physical block size,
we reset *both* to zero and ignore the geom completely.

While we're at it, don't pass &dummy in for multiple
arguments to blkid_get_topology(); that'll mean that
inside the function, the last assignment wins, and could
lead to unexpected results.

Reported-by: Stan Hoeppner <stan@hardwarefreak.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_io: add sync and syncfs commands

There's no easy way to invoke syncfs from the commandline,
as far as I know, so add it to xfs_io to be handy.

Add sync while we're at it, just for completeness.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfsprogs: two more completely harmless sparse nits

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

libxfs: fix harmless sparse endian nit

h_crc is __le32 but cpu_to_be32() is... __be32. So sparse
complains, even though it's harmless.

Although sparse is smart about bare 0s, and we could
drop the swap, other places explicitly swap to keep
things clear (I guess?) so "swap" the 0 with the proper
routine.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

libxfs: fix endian mishap in xfs_dialloc_ag()

Fixes a regression introduced by:

88fc730 xfs: use and update the finobt on inode allocation

which passed the non-swapped version of agi->agi_newino to
xfs_inobt_lookup()

Caught by make C=2, ftw!

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfsctl.3: fix XFS_IOC_FSSETXATTR fields

The xfsctl manual page fails to mention that fsx_projid is a
setable field.

Signed-off-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

libxfs: use structure initializers for cache_operations

This makes it a lot easier for cscope etc.

Surely all modern compilers can cope?

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfsprogs: add supported file attributes to xfs.5 manpage

The chattr(1) manpage suffers from the same problems mount(1) had:
many options listed, not kept up to date for various filesystems.

I've submitted a manpage update for chattr(1) which says to refer to
filesystem-specific manpages for supported attributes; this patch
updates xfs(5) to list the attributes supported by xfs.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_io: add mremap command

This adds a simple mremap command to xfs_io.

It does not take a start address; it uses the existing start
address, so the sized passed will be the new total size of the
mapping.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: preserve error state in process_shortform_attr

process_shortform_attr uses the "junkit" error to track whether an
error was found, but by assigning it directly to the result of
valuecheck, previous errors are ignored, leading to unrepairable
errors of the form i.e.

"entry has INCOMPLETE flag on in shortform attribute"
or
"entry contains illegal character in shortform attribute name"

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: clear bad flags in process_dinode_int

process_dinode_int() reports bad flags if dino->di_flags &
~XFS_DIFLAG_ANY - i.e. if any flags are set outside the known set.
But then instead of clearing them, it does flags &= ~XFS_DIFLAG_ANY
which keeps *only* the bad flags. This leads to persistent,
unrepairable errors of the form:

"Bad flags set in inode XXX"

Fix this.

While we are at it, fix a couple lines which look like they used to
be continuation lines, but are no longer.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_db: free flist on error in write_struct()

One error path in write_struct() wasn't freeing the flist_t *fl
which was allocated, so it leaks.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>