
xfsprogs: make fsr use mntinfo when there is no mntent

For what fsr needs, mntinfo can be used instead of mntent on some
platforms. Extract the platform-specific code to platform headers.

Signed-off-by: Jan Tulak <jtulak@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

io: update reflink/dedupe ioctl definitions

The committed version was a pre-review version of the ioctl
interface. Update it to match the post-review version.

[dchinner: I've just applied the latest patches over the top of
what I originally committed, so the diff here isn't exactly what
Darrick originally sent. ]

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

libxcmd: provide a common function to report command runtimes

Refactor the open-coded runtime stats reporting into a library
command, then update xfs_io commands to use it.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>

headers: remove definition of ASSERT from xfs.h

If we define ASSERT() in the installed xfs.h header file, programs
including this header file will have their local definitions of
ASSERT screwed up. This is something internal to the xfsprogs build,
so move it to platform_defs.h.in where it will no longer be public.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

xfsprogs: Release v4.3.0-rc1

Update all the release files for a 4.3.0-rc1 release.

Signed-off-by: Dave Chinner <david@fromorbit.com>

db: fix AGI ops definition in CRC type table

The wrong buffer ops structure was added to the AGI field of the
type table when initially committed. This was not noticed because it
only affects manually setting the type of a buffer from xfs_db, e.g.:

xfs_db> agi 0
xfs_db> p
.....
crc = 0xbc58d757 (correct)
.....
xfs_db> fsb 2
xfs_db> type agi
Metadata CRC error detected at block 0x10/0x1000
xfs_db>

This is because (trimmed for clarity):

Breakpoint 1, xfs_verifier_error:
(gdb) bt
#0  xfs_verifier_error
#1  xfs_agfl_read_verify
#2  set_iocur_type
#3  type_f
#4  main

It's clear that the wrong verifier is being run (AGFL, not AGI).
The fix is simple.
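
As an illustration only, the bug class can be sketched as a table that
pairs a type name with a verifier. The struct, the table and every
function name below are hypothetical and are not the xfs_db definitions;
the toy verifiers only check a one-byte tag.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical verifier signature: returns 0 when the buffer looks valid. */
typedef int (*verify_fn)(const unsigned char *buf);

/* Toy verifiers that only check a one-byte magic tag. */
static int agi_verify(const unsigned char *buf)  { return buf[0] == 'I' ? 0 : -1; }
static int agfl_verify(const unsigned char *buf) { return buf[0] == 'L' ? 0 : -1; }

struct type_entry {
	const char *name;
	verify_fn   verify;
};

/* Each row must pair a type name with the verifier for that same type. */
static const struct type_entry type_table[] = {
	{ "agfl", agfl_verify },
	{ "agi",  agi_verify },   /* the bug class: agfl_verify listed on this row */
};

/* Look up a type by name and run its verifier; -2 means unknown type. */
static int verify_as(const char *name, const unsigned char *buf)
{
	size_t i;

	for (i = 0; i < sizeof(type_table) / sizeof(type_table[0]); i++) {
		if (strcmp(type_table[i].name, name) == 0)
			return type_table[i].verify(buf);
	}
	return -2;
}
```

With the wrong function on the "agi" row, a valid AGI buffer would fail
verification even though nothing is wrong with the buffer itself.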

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: fix left-shift overflows

pmask in struct parent_list is a __uint64_t, but in some places
we populated it with "1LL << shift" where shift could be up
to 63; this really needs to be a 1ULL type for this to be correct.

Also spotted by libubsan...
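
A minimal sketch of the difference, using a hypothetical helper name
(bit_mask) that does not appear in xfs_repair:

```c
#include <stdint.h>

/*
 * Build a single-bit mask for bit positions 0..63.
 * With a signed 1LL, shifting by 63 overflows the signed type, which is
 * undefined behaviour in C; an unsigned 1ULL makes every shift in this
 * range well defined.
 */
static uint64_t bit_mask(unsigned int shift)
{
	return 1ULL << shift;   /* caller must keep shift < 64 */
}
```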

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_metadump: Fix unaligned accesses

This fixes some unaligned accesses spotted by libubsan in
xfs_metadump.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_logprint: fix some unaligned accesses

This routine had a fair bit of gyration to avoid unaligned
accesses, but didn't fix them all.
Fix some more spotted at runtime by libubsan.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: fix unaligned accesses

This fixes some unaligned accesses spotted by libubsan in repair.

See Documentation/unaligned-memory-access.txt in the kernel
tree for why these can be a problem.
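
One common way to avoid such an access, shown here as a sketch rather
than as the code in this patch, is to copy the bytes out before
interpreting them. The helper name get_be32_unaligned is an assumption
made for the example.

```c
#include <stdint.h>
#include <string.h>

/*
 * Read a big-endian 32-bit value from an arbitrary byte offset.
 * Casting (buf + off) to uint32_t * and dereferencing it would be an
 * unaligned access when the address is not a multiple of 4; copying the
 * bytes out first avoids that.
 */
static uint32_t get_be32_unaligned(const unsigned char *buf, size_t off)
{
	unsigned char b[4];

	memcpy(b, buf + off, sizeof(b));
	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}
```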

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

libxfs: avoid negative (and full-width) shifts in radix-tree.c

Pull this commit over from the kernel, as libubsan spotted a
bad shift at runtime:

    commit 430d275a399175c7c0673459738979287ec1fd22
    Author: Peter Lund <firefly@vax64.dk>
    Date:   Tue Oct 16 23:29:35 2007 -0700

    avoid negative (and full-width) shifts in radix-tree.c

    Negative shifts are not allowed in C (the result is undefined).  Same thing
    with full-width shifts.

    It works on most platforms but not on the VAX with gcc 4.0.1 (it results in an
    "operand reserved" fault).

    Shifting by more than the width of the value on the left is also not
    allowed.  I think the extra '>> 1' tacked on at the end in the original
    code was an attempt to work around that.  Getting rid of that is an extra
    feature of this patch.

    Here's the chapter and verse, taken from the final draft of the C99
    standard ("6.5.7 Bitwise shift operators", paragraph 3):

      "The integer promotions are performed on each of the operands. The
      type of the result is that of the promoted left operand. If the
      value of the right operand is negative or is greater than or equal
      to the width of the promoted left operand, the behavior is
      undefined."

    Thank you to Jan-Benedict Glaw, Christoph Hellwig, Maciej Rozycki, Pekka
    Enberg, Andreas Schwab, and Christoph Lameter for review.  Special thanks
    to Andreas for spotting that my fix only removed half the undefined
    behaviour.
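
The rule quoted above can be illustrated with a small guard around the
shift. This is a sketch in the spirit of the fix, not the radix-tree.c
code itself; WORD_BITS and max_index_for_width are names invented for
the example.

```c
#include <limits.h>

#define WORD_BITS ((int)(sizeof(unsigned long) * CHAR_BIT))

/*
 * Largest index representable with `width` low bits.
 * The shift count is checked first, so it is never negative and never
 * equal to or larger than the width of unsigned long.
 */
static unsigned long max_index_for_width(unsigned int width)
{
	int shift = WORD_BITS - (int)width;

	if (shift < 0)
		return ~0UL;   /* more bits requested than the type holds */
	if (shift >= WORD_BITS)
		return 0UL;    /* width == 0: a full-width shift is undefined */
	return ~0UL >> shift;
}
```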

Signed-off-by: Peter Lund <firefly@vax64.dk>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_copy: format v5 sb logs correctly

xfs_copy formats the target filesystem log in non-duplicate copy mode to
stamp the new fs UUID into the log. The current format mechanism resets
the current LSN of the target filesystem to cycle 1, which is invalid
for v5 superblocks. The current LSN of v5 superblocks must always move
forward and remain ahead of metadata LSNs stored in filesystem metadata.

Update the log format helper to detect and use an alternate format
mechanism for v5 superblock logs. Allocate an independent log format
buffer based on the size of the log and format the buffer with an
incremented cycle count using the libxfs log format mechanism.

Note that the new libxfs log format mechanism could be used for both v5
and older superblock formats. The new mechanism requires a new, full log
sized buffer allocation as well as doing I/O to the entire log whereas
the pre-v5 sb mechanism only writes to the log head and tail. This is
due to how xfs_copy uses its own internal buffer data structure rather
than libxfs buftarg structures. As such, keep the original mechanism
around to avoid potential disruption for non-v5 users. The old mechanism
can be removed at some point in the future when the new mechanism is
shaken out and v5 filesystems tend to outnumber v4.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_copy: refactor log format code into new helper

The xfs_copy log format code is mostly open-coded into the main()
function of the application. Support for v5 superblock log formatting
will require a different mechanism from the one used for v4
superblocks and older.

As such, refactor the log formatting code into new helper functions. The
top-level helper iterates the copy target devices and another helper
implements the log format magic. This patch does not change existing
behavior.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_copy: store data buf alignment in buf data structure

The write buffer data structure stores various characteristics of the
write buffer, such as I/O alignment requirements, etc., but does not
include the required data buffer alignment.

Data buffer alignment is a required buffer initialization parameter and
the v5 log format support code would like to initialize an independent
log buffer based on the predetermined alignment constraints encoded into
the global write buffer. Update the write buffer data structure to store
the provided data alignment value such that it can be accessed
throughout the codebase. This patch does not change existing behavior.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_copy: genericize write helper to facilitate separate log buf

xfs_copy uses a fixed write buffer throughout the code for the various
targets. This is a global buffer and is assumed to be the write source
in the do_write() helper used by the log clearing code.

v5 superblock log formatting will require a larger, independent buffer
to format the log. Therefore, update do_write() to accept a write buffer
parameter to optionally override the global write buffer. This patch
does not change current behavior.
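
The shape of such an optional override can be sketched as follows. The
struct layout and the name do_write_sketch are hypothetical; the real
do_write() takes different arguments and performs actual I/O.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical write buffer; xfs_copy's real structure has more fields. */
struct wbuf {
	unsigned char data[64];
	size_t        length;
};

static struct wbuf global_wbuf;   /* shared default buffer */

/*
 * Copy the source buffer into `target` and return the number of bytes
 * written. Passing NULL for `override` selects the global buffer, which
 * keeps existing callers unchanged.
 */
static size_t do_write_sketch(unsigned char *target, size_t target_len,
			      const struct wbuf *override)
{
	const struct wbuf *src = override ? override : &global_wbuf;
	size_t n = src->length < target_len ? src->length : target_len;

	memcpy(target, src->data, n);
	return n;
}
```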

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_copy: check for dirty log on non-duplicate copies

xfs_copy either copies the log as-is or formats the log of the target(s)
based on whether duplicate copy mode is enabled. The target log is
formatted when non-duplicate mode is enabled because copies gain a new
fs UUID and the new UUID must be stamped into the log.

When non-duplicate mode is enabled, however, xfs_copy does not check
whether the source filesystem log is actually clean. If the source log
is dirty, the target filesystem is silently created with a clean log and
thus ends up in a potentially corrupted state.

Update xfs_copy to check the source log state and fail the copy if in
non-duplicate mode and the log is dirty. Encourage the user to mount the
filesystem or run xfs_repair to clear the log. Note that the log is
scanned unconditionally as opposed to only in non-duplicate mode because
v5 superblock log format support requires the current cycle number to
format the log correctly.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

db/metadump: bump lsn when log is cleared on v5 supers

xfs_metadump handles the log in different ways depending on the mode of
operation. If the log is dirty or obfuscation and stale data zeroing are
disabled, the log is copied as is. In all other scenarios, the log is
explicitly zeroed. This is incorrect for version 5 superblocks where the
current LSN is always expected to be ahead of all fs metadata.

Update metadump to use libxfs_log_clear() to format the log with an
elevated LSN rather than zero the log and reset the current LSN.
Metadump does not use buffers for the dump target, instead using a
cursor implementation to access the log via a single memory buffer.
Therefore, update libxfs_log_clear() to receive an optional (but
exclusive to the buftarg parameter) memory buffer pointer for the log.
If the pointer is provided, the log format is written out to this
buffer. Otherwise, fall back to the original behavior and access the log
through buftarg buffers.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_db: do not reset current lsn from uuid command on v5 supers

The xfs_db uuid command modifies the uuid of the filesystem. As part of
this process, it is required to clear the log and format it with records
that use the new uuid. It currently resets the log to cycle 1, which is
not safe for v5 superblocks.

Update the uuid log clearing implementation to bump the current cycle
when the log is formatted on v5 superblock filesystems. This ensures
that the current LSN remains ahead of metadata LSNs on the subsequent
mount. Since the log is checked and cleared across a couple different
functions, also add a new global xlog structure, associate it with the
preexisting global mount structure and reference it to get and use the
current log cycle.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: seed the max lsn from log state in phase 2

At this point, the log state is always available once repair has
progressed through phase 2 and the log is only ever zeroed when
absolutely necessary. This means that in the common case, repair runs
with the log in a non-initialized state. The libxfs max metadata LSN
tracking initializes the max LSN to zero, however, which will require
updates throughout the repair process even if all metadata LSNs are
behind the current LSN.

Since all metadata LSNs that are behind the current LSN are valid, seed
the libxfs maximum seen LSN value with the log state from phase 2. This
is a minor optimization to minimize global variable updates in the
common case where all (or most) metadata LSNs are valid.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: don't clear the log by default

xfs_repair currently clears the log regardless of whether it is
corrupted, clean or contains data. This is traditionally harmless but
now causes log recovery problems on v5 filesystems. v5 filesystems
expect a clean log to always have an LSN out ahead of the maximum last
modification LSN stamped on any bit of metadata throughout the fs. If
this is not the case, repair must reformat the log with a larger cycle
number after fs processing is complete.

Given that unconditional log clearing actually introduces a filesystem
inconsistency on v5 superblocks (that repair must subsequently recover
from) and provides no tangible benefit for v4 filesystems that otherwise
have a clean and covered log, it is more appropriate behavior to not
clear the log by default.

Update xfs_repair to always and only clear the log when the -L parameter
is specified. Retain the existing logic to require -L or otherwise exit
if the log appears to contain data. Adopt similar behavior if the log
appears to be corrupted.
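
The resulting policy can be summarised as a small decision function.
This is a sketch of the behaviour described above, with hypothetical
enum and function names; it is not the code in xfs_repair.

```c
enum log_state  { LOG_CLEAN, LOG_DIRTY, LOG_CORRUPT };
enum log_action { LOG_LEAVE, LOG_CLEAR, LOG_REFUSE };

/*
 * force_clear stands for the -L option.
 * The log is cleared only when -L is given; without it, a dirty or
 * corrupted log makes the tool stop, and a clean log is left untouched.
 */
static enum log_action decide_log_action(enum log_state state, int force_clear)
{
	if (force_clear)
		return LOG_CLEAR;
	if (state == LOG_DIRTY || state == LOG_CORRUPT)
		return LOG_REFUSE;
	return LOG_LEAVE;
}
```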

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: format the log with forward cycle number on v5 supers

v5 filesystems use the current LSN and the last modified LSN stored
within fs metadata to provide correct log recovery ordering. xfs_repair
historically clears the log in phase 2. This resets the current LSN
of the filesystem to the initial cycle, as if the fs had just been created.

This is problematic because the filesystem LSN is now behind many
pre-existing metadata structures on-disk until either the current
filesystem LSN catches up or those particular data structures are
modified and written out. If a filesystem crash occurs in the meantime,
log recovery can incorrectly skip log items and cause filesystem
corruption.

Update xfs_repair to check the maximum metadata LSN value against the
current log state once the filesystem has been processed. If the maximum
LSN exceeds the current LSN with respect to the log, reformat the log
with a cycle number that exceeds that of the maximum LSN.
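
A sketch of that check, assuming an LSN that carries the cycle in its
high 32 bits and the block in its low 32 bits. The names are
hypothetical, and the margin by which the real tool advances the cycle
is not stated here; the sketch uses the smallest cycle that is strictly
ahead.

```c
#include <stdint.h>

typedef int64_t lsn_t;   /* assumed layout: cycle in the high 32 bits, block in the low 32 */

static uint32_t lsn_cycle(lsn_t lsn) { return (uint32_t)((uint64_t)lsn >> 32); }
static uint32_t lsn_block(lsn_t lsn) { return (uint32_t)lsn; }

/*
 * Return 0 when the largest metadata LSN does not exceed the current log
 * position, otherwise a cycle number greater than that LSN's cycle.
 */
static uint32_t reformat_cycle(lsn_t max_lsn, uint32_t cur_cycle, uint32_t cur_block)
{
	uint32_t max_cycle = lsn_cycle(max_lsn);
	uint32_t max_block = lsn_block(max_lsn);

	if (max_cycle < cur_cycle ||
	    (max_cycle == cur_cycle && max_block <= cur_block))
		return 0;             /* metadata is not ahead of the log */
	return max_cycle + 1;         /* smallest cycle that is strictly ahead */
}
```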

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: process the log in no_modify mode

xfs_repair does not zero the log in no_modify mode. In doing so, it also
skips the function that scans the log, locates the head/tail blocks and
sets the current LSN. Now that the log state is used beyond phase 2, the
log scan must occur regardless of whether no_modify mode is enabled or
not.

Update phase 2 to always execute the log scanning code. Push down the
no_modify checks into the log clearing helper such that the log is still
not modified in no_modify mode.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: track log state throughout all recovery phases

xfs_repair examines and clears the log in phase 2. Phase 2 acquires the
log state in local data structures that are lost upon phase exit. v5
filesystems require that the log is formatted with a higher cycle number
after the fs is repaired. This requires assessment of the log state to
determine whether a reformat is necessary.

Rather than duplicate the log processing code, update phase 2 to
populate a globally available log data structure. Add a log pointer to
xfs_mount, as exists in kernel space, that repair uses to store a
reference to the log that is available to various phases. Note that this
patch simply plumbs through the global log data structure and does not
change behavior in any way.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

libxlog: pull struct xlog out of xlog_is_dirty()

The xlog_is_dirty() helper is called in various places to acquire the
current log head, tail and to determine whether the log is dirty. Some
callers will require additional information to deal with formatting the
log, such as the current LSN. xlog_is_dirty() already acquires this
information through existing sub-helpers, but it is not available to
callers as the xlog structure is allocated on the local stack.

Update xlog_is_dirty() to receive the xlog structure as a parameter and
pass it along such that additional information about the log is
available to callers. Update the existing callers to allocate the xlog
structure on the stack.
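
The pattern is a caller-owned structure that the helper fills in. The
sketch below uses a trimmed-down, hypothetical descriptor, and takes the
head, tail and cycle as arguments where the real helper would find them
by scanning the log.

```c
#include <stdint.h>

/* Hypothetical, trimmed-down log descriptor. */
struct log_info {
	uint32_t head_block;
	uint32_t tail_block;
	uint32_t curr_cycle;
};

/*
 * Fill in the caller-provided descriptor and report whether the log is
 * dirty (head and tail differ). Because the caller owns `log`, the cycle
 * number is still available after the call returns.
 */
static int log_is_dirty_sketch(struct log_info *log, uint32_t head,
			       uint32_t tail, uint32_t cycle)
{
	log->head_block = head;
	log->tail_block = tail;
	log->curr_cycle = cycle;
	return head != tail;
}
```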

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

libxfs: add ability to clear log to arbitrary log cycle

The libxfs_log_clear() helper currently zeroes the log and writes a
single log record such that the kernel code detects the log has been
zeroed and mounts successfully. This is not sufficient for v5
filesystems, which must have the log cleared to an LSN that is
guaranteed to be ahead of any LSN that has been previously stamped into
on-disk metadata.

Update libxfs_log_clear() to support the ability to format the log to an
arbitrary cycle number. First, the log is physically zeroed. A log
record is written to the first block of the log with the desired lsn and
refers to the tail_lsn as the last record of the previous cycle. The
rest of the log is filled with log records of the previous cycle. This
causes the kernel to set the current LSN to start of the desired cycle
number at mount time.
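
The layout described above can be sketched as a plan of per-block cycle
numbers. This is a simplification that assumes one record per block and
an LSN with the cycle in the high 32 bits; all names are hypothetical
and no I/O is performed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t lsn_t;

static lsn_t make_lsn(uint32_t cycle, uint32_t block)
{
	return ((lsn_t)cycle << 32) | block;
}

struct log_layout {
	lsn_t head_lsn;   /* LSN of the record in block 0 */
	lsn_t tail_lsn;   /* LSN that record names as the tail */
};

/*
 * Stamp each block with the cycle of the record it holds: block 0 gets
 * the requested cycle, every other block gets the previous cycle.
 */
static struct log_layout plan_log_format(uint32_t *block_cycle, size_t nblocks,
					 uint32_t cycle)
{
	struct log_layout l;
	size_t i;

	assert(nblocks >= 2 && cycle >= 2);   /* sketch covers this case only */
	block_cycle[0] = cycle;
	for (i = 1; i < nblocks; i++)
		block_cycle[i] = cycle - 1;

	l.head_lsn = make_lsn(cycle, 0);
	l.tail_lsn = make_lsn(cycle - 1, (uint32_t)(nblocks - 1));
	return l;
}
```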

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

libxfs: pass lsn param to log clear and record header logging helpers

In preparation to support the ability to format the log with an
arbitrary cycle number, the log clear and record logging helpers must be
updated to receive the desired cycle and LSN values as parameters.

Update libxfs_log_clear() to receive the desired cycle number to format
the log with. Define a preprocessor directive to represent the currently
hardcoded case of cycle 1. Update libxfs_log_header() to receive the lsn
and tail_lsn of the record to write. Use a NULL value LSN to represent
the currently hardcoded behavior.

All callers are updated to use the current default values. As such, this
patch does not change behavior in any way.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

libxfs: don't hardcode cycle 1 into unmount op header

The libxfs helper to write a log record after zeroing the log fills much
of the record header and unmount record with dummy data. It also
hardcodes the cycle number into the transaction oh_tid field as the
kernel expects to find the cycle stamped at the top of each block and
the original oh_tid value packed into h_cycle_data of the record header.

The log clearing code requires the ability to format the log to an
arbitrary cycle number to fix v5 superblock log recovery ordering
problems. As a result, the unmount record helper must not hardcode a
cycle of 1.

Fix up libxfs_log_header() to pack the unmount record appropriately, as
is already done for extra blocks that might exist beyond the record. Use
h_cycle_data for the original 32-bit word of the log record data block
and stamp the cycle number in its place. This allows unmount_record() to
work for arbitrary cycle numbers and libxfs_log_header() to pack a cycle
value that matches the lsn used in the record header. Note that this
patch does not change behavior as the lsn is still hardcoded to (1:0).
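
The save-and-stamp step can be sketched as below. The function names
are hypothetical, and the sketch works in host byte order and leaves out
the endian conversion that on-disk fields need.

```c
#include <stdint.h>
#include <string.h>

/*
 * Save the first 32-bit word of a data block and stamp the cycle number
 * in its place. Returns the saved word, which the caller keeps (in the
 * record header, in the scheme described above).
 */
static uint32_t pack_block(unsigned char *block, uint32_t cycle)
{
	uint32_t saved;

	memcpy(&saved, block, sizeof(saved));   /* original word */
	memcpy(block, &cycle, sizeof(cycle));   /* cycle goes at the top of the block */
	return saved;
}

/* Inverse step: restore the saved word and return the cycle that was stamped. */
static uint32_t unpack_block(unsigned char *block, uint32_t saved)
{
	uint32_t cycle;

	memcpy(&cycle, block, sizeof(cycle));
	memcpy(block, &saved, sizeof(saved));
	return cycle;
}
```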

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

libxfs: track largest metadata LSN in use via verifiers

The LSN validation helper is called in the I/O verifier codepath for
metadata that embed a last-modification LSN. While the codepath exists,
this is not used in userspace as in the kernel because the former
doesn't have an active log.

xfs_repair does need to check the validity of the LSN metadata with
respect to the on-disk log, however. Use the LSN validation mechanism to
track the largest LSN that has been seen. Export the value so repair can
use it once it has processed the entire filesystem. Note that the helper
continues to always return true to preserve existing behavior.
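
A minimal sketch of such a tracking hook, with hypothetical names and a
plain integer comparison standing in for the library's LSN comparison:

```c
#include <stdbool.h>
#include <stdint.h>

typedef int64_t lsn_t;

static lsn_t max_seen_lsn;   /* read by the caller once the scan is done */

/*
 * Record the largest LSN seen so far. Always reports success: the hook
 * only gathers information and never rejects a buffer.
 */
static bool track_lsn(lsn_t lsn)
{
	if (lsn > max_seen_lsn)
		max_seen_lsn = lsn;
	return true;
}
```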

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

libxfs: validate metadata LSNs against log on v5 superblocks

Backport the associated kernel commit. Replace the xfs_log_check_lsn()
helper with a stub.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

mkfs.xfs: option for using a pre-defined filesystem UUID

Usage: mkfs.xfs -m uuid=<uuid> <device>

The filesystem UUID can now be optionally specified during filesystem
creation. The default behavior is still to generate a random UUID.

Allows using pre-generated UUIDs for identifying a filesystem based
on the metadata stored inside the filesystem. Filesystem labels can
be used for the same purpose, but are limited by their length
(12 chars in the case of xfs) whereas the UUID field can store an
entire 128bit UUID, which is plenty for e.g. random ID collision
avoidance.

Generating a random UUID during the creation of the filesystem is not
always feasible when an external DB or other system is used to track
the created filesystem, e.g. in automated VM provisioning systems,
as this would require a feedback mechanism which is not always
available. In these cases the best approach often is to generate
a random UUID for the filesystem before the filesystem even exists,
store it in the tracking DB and later create the filesystem directly
with the correct UUID (instead of "mkfs.xfs + xfs_admin -U <new_uuid>").

Signed-off-by: Mika Eloranta <mel@ohmu.fi>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

platform: rename lstat64 to lstat for OS X

OS X has a different means of distinguishing between
32-bit and 64-bit calls; using the xxx64 variants is deprecated.

Signed-off-by: Jan Tulak <jtulak@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

io: Make mremap conditional

Don't build mremap on platforms where it has no support.

Signed-off-by: Jan Tulak <jtulak@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

platform: Add statvfs64 for OS X

Simply rename statvfs64 to statfs with a #define.
The OS X version of statvfs is missing some members, so if the renaming
is in effect (statvfs64 is defined), don't try to use them and go
directly for the other member value.

Signed-off-by: Jan Tulak <jtulak@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

platform: Add a timer implementation for OS X

OS X does not have the timer used in xfs_repair.
Add a simple implementation providing the required
capabilities.

Signed-off-by: Jan Tulak <jtulak@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_estimate: change nftw64 to nftw

There is only one usage of nftw64 in the entire xfsprogs tree, but
multiple usages of nftw. There seems to be no reason for the 64
variant, and it causes difficulties on platforms which have
only the nftw call.

Signed-off-by: Jan Tulak <jtulak@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

platform: Remove conflicting define for OS X

ENOATTR already exists in OS X.

Signed-off-by: Jan Tulak <jtulak@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>

platform: uuid changes for OS X

The UUID API has changed in OS X in the last few years, so fix the platform_ calls.

Signed-off-by: Jan Tulak <jtulak@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>

build: Add autoconf check for fsetxattr call

OS X has fsetxattr() in another header and with different arguments.
For now, check for the Linux variant and if not available, skip
the code using the call.

Signed-off-by: Jan Tulak <jtulak@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>

build: Add includes required for OS X

Delta patch: an older version, missing 3 includes, was merged
into 4.2.0-rc2.

Signed-off-by: Jan Tulak <jtulak@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>

libxfs: prefix XATTR_LIST_MAX with XFS_

As we depend on an XATTR_ value that is available only on some
platforms, prefix it with XFS_ and allow for an alternative value
in future.

Signed-off-by: Jan Tulak <jtulak@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

libxfs: avoid dependency on Linux XATTR_SIZE_MAX

Currently, we depend on the Linux XATTR value for on-disk
definitions. This causes trouble on other platforms, and
possibly also if this value were to change.

Fix it by creating a custom definition independent from
those in Linux (although with the same values), so it is OK
with the be16 fields used for holding these attributes.

Signed-off-by: Jan Tulak <jtulak@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

platform: Add XATTR_LIST_MAX to OS X headers

OS X has no XATTR_LIST_MAX value. So add it to the platform header.

Signed-off-by: Jan Tulak <jtulak@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

build: make libblkid usage optional

Because not all platforms have an up-to-date blkid with the required
functions, allow at least partial functionality by adding an
optional --enable-blkid=yes/no configure argument.

When blkid is disabled, signature detection and device geometry
detection don't work.

Signed-off-by: Jan Tulak <jtulak@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: Fix up warning strings in da_util.c

Switch the warning messages based on which fork has
encountered the problem.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: move common dir2 and attr_repair code to da_util.c

Now that dir2.c and attr_repair.c are functionally identical,
move the duplicate code into a new file da_util.c, with da_util.h
as a header file for the common functions.

The last step will be to fix up comments and printfs to be appropriate
for code that checks both dirs and attrs.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: whitespace & comments

This patch does nothing but fix up whitespace and comments
to match across dir2.c and attr_repair.c

At this point, a diff of repair/dir2.c and attr_repair.c
shows them to be identical in function.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: Remove more differences between attr & dir2

This is a hodgepodge of unrelated but not-completely-trivial
changes to both the dir2 and attr code to make their common
code more similar.

* It removes the whichfork checking in attr_repair, because we
only get there with XFS_ATTR_FORK.
* It changes the magic-checking logic slightly to match.
* It swaps some (bp == NULL) tests for (!bp)

These should be purely cosmetic changes.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: better checking of v5 attributes

The commit:

0519f66 xfs_repair: better checking of v5 metadata fields

added new corruption checks to dir2.c but missed the similar
code in attr_repair.c; add that here.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: catch bad level/depth in da node

Two tests added some time ago to dir2.c:

44dae5e xfs_repair: test for bad level in dir2 node
28148f6 xfs_repair: catch bad depth in traverse_int_dir2block

never made it to the similar tree-walking code in attr_repair.c;
fix that up here. The error string details will be fixed up
later.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: Remove BUF_PTR from attr_repair.c

The BUF_PTR macro was removed from kernelspace a while ago
(6292604 xfs: Remove the macro XFS_BUF_PTR) but it lives
on in some parts of xfsprogs. dir2.c doesn't use it,
but similar code in attr_repair.c does. Remove it from
attr_repair.c to converge the code.

Remove a related but unnecessary cast from a void *b_addr
in dir2.c while we're at it.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: add XR_DIR_TRACE to dir2.c

attr_repair.c has many printf-tracepoints under
#ifdef XR_DIR_TRACE, but the similar code in dir2.c does not.

Add these same tracepoints to remove more differences between
these two pieces of code.

Not all messages are quite correct; those will be fixed up last.
For now we just make the code more obviously similar.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: fix use-after-free in verify_final_dir2_path

Way back in 2002, commit 948ce18 fixed a potential use-after-free
in verify_final_da_path, but the same fix was not applied to
verify_final_dir2_path; apply it now.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: use multibuffer read routines in attr_repair.c

verify_da_path and traverse_int_dablock are similar to
verify_dir2_path and traverse_int_dir2block, but one
difference is that the dir2 code reads using the
multibuffer capable da_read_buf() routine, whereas
the attr code doesn't need to, and just calls
libxfs_readbuf.

The multibuffer code falls back just fine when the
geometry indicates that it's not needed, so use that
same code in the attribute routines, and remove
another dir2 / da difference. We make da_read_buf()
non-static to facilitate this.

Finally, add a local *geo to these routines,
to make the code even more similar at this point.
The geometry will get passed in later in the series.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: make CRC checking consistent in path verification

verify_da_path and verify_dir2_path both take steps to
re-compute the CRC of the block if it otherwise looks
ok and no other changes are needed. They do this inside
a loop, but the approach differs; verify_da_path expects
its caller to check the first buffer prior to the loop,
and verify_dir2_path expects its caller to check the last
buffer after the loop.

Make this consistent by semi-arbitrarily choosing to make
verify_da_path (and its caller) match the method used by
verify_dir2_path, and check the last buffer after the
loop is done.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: remove type from da & dir2 cursors

The type field in these cursors is only set (and only
in the attr code), and it's never read; just remove
it.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: remove trace-only 'n' member from da_level_state

The da_level_state structure contains an 'n' member
when XR_DIR_TRACE is enabled, which is a) write only, and
b) set by a macro which doesn't exist (XFS_BUF_TO_DA_INTNODE)

Removing this structure member fixes compilation with
XR_DIR_TRACE enabled, and also makes da_level_state identical
to dir2_level_state, so the two can be combined later.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_db: enable blockget for v5 filesystems

Plumb in the necessary magic number checks and other fixups required
to handle v5 filesystems.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_db: trash the block at the top of the cursor stack

Add a new -z option to blocktrash to make it trash the block that's at
the top of the stack, so that we can perform targeted fuzzing. While
we're at it, prevent fuzzing off the end of the buffer and add a -o
parameter so that we can specify an offset to start fuzzing from.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_db: enable blocktrash for checksummed filesystems

Disable the write verifiers when we're trashing a block.  With this
in place, create an xfs fuzzer script that formats, populates, corrupts,
tries to use, repairs, and tries again to use a crash test xfs image.
Hopefully this will shake out some v5 filesystem bugs.

This allows trashing of log blocks and symlinks, and requires the
caller to explicitly ask for trashing of log blocks and super
blocks.  Allowing log blocks by default skews the trashing heavily
in favor of (probably unused) log blocks, which doesn't help us with
fuzzing.  Furthermore, trashing the superblock results in a time
consuming sector by sector superblock hunt.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_io: support reflinking and deduping file ranges

Wire up xfs_io to use the XFS range clone and dedupe ioctls to make
files share data blocks.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfsprogs: Release v4.2.0

Update all the release files for a 4.2.0 release.

Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: release corrupt directory node buffer

If repair encounters a dir node block that fails checksum or
verification, free the buffer before the directory gets rebuilt.

Reported-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: update btree ptr when attr node level moves to next buffer

xfs_repair walks the attribute fork btree for files with a significant
number of extended attributes. It creates a cursor, walks the leaf
blocks, and verifies the path from each leaf block back to the root of
the tree. Eryu reports that the following test causes xfs_repair to
report corruption on 512b filesystems:

num_xattrs=577
for ((i = 1; i <= $num_xattrs; i++)); do
        name="user.attr_$(printf "%04d" $i)"
        setfattr -n $name -v "val_$(printf "%04d" $i)" <file>
done

xfs_repair complains that the block number of the leaf (level 0) does
not match the block number of the level 1 node block entry. This occurs
as soon as the left-most level 1 node block is completely processed and
the cursor is walked to the next level 1 block in the array. The problem
is that while verify_da_path() updates level 1 of the cursor to the next
level 1 buffer, it fails to correctly update the btree pointer to the
entry list of the new buffer. As a result, the child leaf block of the
next node block is incorrectly validated against the entry list of the
previous node block.

Update verify_da_path() to correctly update the btree pointer to the
entry list of the new node block when the cursor is walked forward at
higher (non-leaf) levels.

Reported-by: Eryu Guan <eguan@redhat.com>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

progs: use CURDIR instead of relative paths for header symlinks

This fixes broken header symlinks when make isn't triggered from the
xfsprogs source location, but as a recursion from another make in a
different directory. This is a common pattern found in cross build
systems.

Signed-off-by: Lucas Stach <dev@lynxeye.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: force not-so-bad bmbt blocks back through the verifier

If during prefetch we encounter a bmbt block that fails the CRC check
due to corruption in the unused part of the block, force the buffer
back through the non-prefetch verifiers later so that the CRC is
updated. Otherwise, the bad checksum goes unfixed and the kernel will
still flag the bmbt block as invalid.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: check v5 filesystem attr block header sanity

Check the v5 fields (uuid, blocknr, owner) of attribute blocks for
obvious errors while scanning xattr blocks. If the ownership info
is incorrect, kill the block.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: ignore "repaired" flag after we decide to clear xattr block

If in the course of examining extended attribute block contents we
first decide to repair an entry (*repair = 1) but secondly decide to
clear the whole block, set *repair = 0 because the clearing action
only happens if *repair == 0. Put another way, if we're nuking a
block, don't pretend like we've fixed it too.

v2: fix all the paths to clear the attr block if the processing
functions error out.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

libxfs: clear buffer state flags in libxfs_getbuf and variants

When we're running xfs_repair with prefetch enabled, it's possible
that repair will decide to clear an inode without examining all
metadata blocks owned by that inode.  This leaves the unreferenced
prefetched buffers marked UNCHECKED, which will cause a subsequent CRC
error if the block is reallocated to a different structure and read
more than once.  Typically this happens when a large directory is
corrupted and lost+found has to grow to accommodate all the
disconnected inodes.

In libxfs_getbuf*(), we're supposed to return an unused buffer which
has a clean state.  Unfortunately, things like UNCHECKED can hang
around to cause incorrect verifier errors later, so change those
functions to launder the state bits clean.

v2: Change the function name to reset_buf_state() to reflect what
the function is trying to accomplish.
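
As a rough illustration of the laundering described above, here is a
minimal sketch. The flag names, their bit values, and the demo_*
helpers are hypothetical stand-ins, not the real libxfs definitions.

```c
#include <stdint.h>

/* Hypothetical flag bits for illustration; the real libxfs values may differ. */
#define DEMO_BUF_DIRTY      (1u << 0)
#define DEMO_BUF_STALE      (1u << 1)
#define DEMO_BUF_UPTODATE   (1u << 2)
#define DEMO_BUF_UNCHECKED  (1u << 3)

struct demo_buf {
	uint32_t flags;
};

/* Clear per-use state bits so a recycled buffer starts out clean. */
void demo_reset_buf_state(struct demo_buf *bp)
{
	bp->flags &= ~(DEMO_BUF_DIRTY | DEMO_BUF_STALE |
		       DEMO_BUF_UPTODATE | DEMO_BUF_UNCHECKED);
}

/* Hypothetical getbuf: hand back a cached buffer, but never its old state. */
struct demo_buf *demo_getbuf(struct demo_buf *cached)
{
	demo_reset_buf_state(cached);
	return cached;
}
```

With this shape, a buffer that was prefetched and left UNCHECKED cannot
carry that bit into its next use.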

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

libxfs: fix XFS_WANT_CORRUPTED_* macros to return negative error codes

Since the rest of libxfs returns negative error codes, these two sanity
checking macros ought to have the same applied. While we're at it,
fix a couple more sign errors in the same file.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

libxfs: verifier should set buffer error when da block has a bad magic number

If xfs_da3_node_read_verify() doesn't recognize the magic number of a
buffer it's just read, set the buffer error to -EFSCORRUPTED so that
the error can be sent up to userspace. Without this patch we'll
notice the bad magic eventually while trying to traverse or change
the block, but we really ought to fail early in the verifier.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: set args.geo in longform_dir2_entry_check_data

Here's another one where we miss setting da_args->geo:

longform_dir2_entry_check_data
        struct xfs_da_args      da = {
                .dp = ip,
// .geo is unset
        };
...
libxfs_dir2_data_make_free(&da ...)
xfs_dir2_data_make_free
endptr = (char *)hdr + args->geo->blksize;
BOOM

Addresses-Coverity-Id: 1298008
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: set args.geo in dir2_kill_block

This path in xfs_repair:

dir2_kill_block
libxfs_da_shrink_inode
xfs_dir2_shrink_inode
xfs_dir2_db_to_da

segfaults, because dir2_kill_block() does not initialize
args.geo, and a null geometry winds up in xfs_dir2_db_to_da(),
which dereferences it.

Fix that.
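
The failure mode can be sketched as follows. The structures and demo_*
functions are trimmed-down, hypothetical stand-ins for the libxfs ones;
the point is only that a member left out of a designated initializer
is NULL, and that the callee dereferences it unconditionally.

```c
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for the libxfs structures. */
struct demo_geometry {
	unsigned int blksize;   /* directory block size in bytes */
};

struct demo_da_args {
	void                       *dp;   /* inode being operated on */
	const struct demo_geometry *geo;  /* must be set by the caller */
};

/* Callee that trusts args->geo, as the directory helpers do. */
const char *demo_block_end(const struct demo_da_args *args, const char *hdr)
{
	return hdr + args->geo->blksize;  /* crashes if geo was left NULL */
}

/* Caller-side fix: fill in .geo when building the args structure. */
struct demo_da_args demo_make_args(void *ip, const struct demo_geometry *geo)
{
	struct demo_da_args args = {
		.dp  = ip,
		.geo = geo,   /* members omitted from the initializer would be zeroed */
	};
	return args;
}
```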

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfsprogs: Release v4.2.0-rc3

Update all the release files for a 4.2.0-rc3 release.

Signed-off-by: Dave Chinner <david@fromorbit.com>

man pages: Minor fixes for xfs.5

Fix whitespace around logdev/rtdev mount option, and add missing
qnoenforce.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfsprogs: revert OS X dummy function changes

Commit ff6f019d ("xfsprogs: missing and dummy calls for OS X
support") was committed prematurely. Revert it for now so that
better solutions can be committed cleanly.

Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs: Use consistent logging message prefixes

The second and subsequent lines of multi-line logging messages
are not prefixed with the same information as the first line.

Separate messages with newlines into multiple calls to ensure
consistent prefixing and allow easier grep use.

Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs: remote attribute headers contain an invalid LSN

In recent testing, a system that crashed failed log recovery on
restart with a bad symlink buffer magic number:

XFS (vda): Starting recovery (logdev: internal)
XFS (vda): Bad symlink block magic!
XFS: Assertion failed: 0, file: fs/xfs/xfs_log_recover.c, line: 2060

On examination of the log via xfs_logprint, none of the symlink
buffers in the log had a bad magic number, nor were any other types
of buffer log format headers mis-identified as symlink buffers.
Tracing was used to find the buffer the kernel was tripping over,
and xfs_db identified its contents as:

000: 5841524d 00000000 00000346 64d82b48 8983e692 d71e4680 a5f49e2c b317576e
020: 00000000 00602038 00000000 006034ce d0020000 00000000 4d4d4d4d 4d4d4d4d
040: 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d
060: 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d
.....

This is a remote attribute buffer, which are notable in that they
are not logged but are instead written synchronously by the remote
attribute code so that they exist on disk before the attribute
transactions are committed to the journal.

The above remote attribute block has an invalid LSN in it - cycle
0xd0020000, block 0 - which means when log recovery comes along to
determine if the transaction that writes to the underlying block
should be replayed, it sees a block that has a future LSN and so
does not replay the buffer data in the transaction. Instead, it
validates the buffer magic number and attaches the buffer verifier
to it. It is this buffer magic number check that is failing in the
above assert, indicating that we skipped replay due to the LSN of
the underlying buffer.
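
To make the LSN reading concrete, here is a small sketch that decodes
the eight bytes at offset 0x30 of the dump above (d0020000 00000000).
It assumes the usual XFS packing of a 64-bit LSN, with the cycle in
the high 32 bits and the block in the low 32 bits; the demo_* names
and the simplified skip test are hypothetical.

```c
#include <stdint.h>

/* Assumed packing: log cycle in the high 32 bits, block in the low 32. */
uint32_t demo_lsn_cycle(uint64_t lsn) { return (uint32_t)(lsn >> 32); }
uint32_t demo_lsn_block(uint64_t lsn) { return (uint32_t)(lsn & 0xffffffffu); }

/* Assemble a big-endian 64-bit value from 8 on-disk bytes. */
uint64_t demo_be64(const unsigned char *p)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/*
 * Simplified recovery check: skip replay when the block on disk
 * claims to be at least as new as the transaction being recovered.
 */
int demo_skip_replay(uint64_t disk_lsn, uint64_t recover_lsn)
{
	return disk_lsn >= recover_lsn;
}
```

A garbage cycle this large compares as newer than any plausible
recovery LSN, which is why replay is skipped.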

The problem here is that the remote attribute buffers cannot have a
valid LSN placed into them, because the transaction that contains
the attribute tree pointer changes and the block allocation that the
attribute data is being written to hasn't yet been committed. Hence
the LSN field in the attribute block is completely unwritten,
thereby leaving the underlying contents of the block in the LSN
field. It could have any value, and hence a future overwrite of the
block by log recovery may or may not work correctly.

Fix this by always writing an invalid LSN to the remote attribute
block, so that any buffer replay in log recovery that needs to write
over the remote attribute block always occurs. We are protected from having old data
written over the attribute by the fact that freeing the block before
the remote attribute is written will result in the buffer being
marked stale in the log and so all changes prior to the buffer stale
transaction will be cancelled by log recovery.

Hence it is safe to ignore the LSN in the case of synchronously
written, unlogged metadata such as remote attribute blocks, and to
ensure we do that correctly, we need to write an invalid LSN to all
remote attribute blocks to trigger immediate recovery of metadata
that is written over the top.

As a further protection for filesystems that may already have remote
attribute blocks with bad LSNs on disk, change the log recovery code
to always trigger immediate recovery of metadata over remote
attribute blocks.

cc: <stable@vger.kernel.org>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs: Fix uninitialized return value in xfs_alloc_fix_freelist()

xfs_alloc_fix_freelist() can sometimes jump to out_agbp_relse
without ever setting the value of the 'error' variable, which is then
returned. This can happen e.g. when pag->pagf_init is set but AG is
for metadata and we want to allocate user data.

Fix the problem by initializing 'error' to 0, which is the desired
return value when we decide to skip this group.
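
The pattern can be sketched in isolation like this. The structure and
function are hypothetical and heavily simplified; only the shape of
the control flow (an early goto that never assigns 'error') mirrors
the description above.

```c
#include <errno.h>

/* Hypothetical stand-in for the per-AG state the function inspects. */
struct demo_ag {
	int initialised;     /* per-AG data already read in */
	int metadata_only;   /* AG is reserved for metadata */
	int read_fails;      /* pretend reading the AG header fails */
};

int demo_fix_freelist(const struct demo_ag *ag, int want_user_data)
{
	int error = 0;   /* the fix: skipping an AG is a success, so default to 0 */

	if (!ag->initialised) {
		error = ag->read_fails ? -EIO : 0;
		if (error)
			goto out_relse;
	}

	/* Early exit that never assigned 'error' before the fix. */
	if (ag->metadata_only && want_user_data)
		goto out_relse;

	/* ... freelist fix-up work would go here ... */

out_relse:
	return error;
}
```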

CC: xfs@oss.sgi.com
Coverity-id: 1309714
Signed-off-by: Jan Kara <jack@suse.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs: remote attributes need to be considered data

We don't log remote attribute contents, and instead write them
synchronously before we commit the block allocation and attribute
tree update transaction. As a result we are writing to the allocated
space before the allocation has been made permanent.

As a result, we cannot consider this allocation to be a metadata
allocation. Metadata allocation can take blocks from the free list
and so reuse them before the transaction that freed the block is
committed to disk. This behaviour is perfectly fine for journalled
metadata changes as log recovery will ensure the free operation is
replayed before the overwrite, but for remote attribute writes this
is not the case.

Hence we have to consider the remote attribute blocks to contain
data and allocate accordingly. We do this by dropping the
XFS_BMAPI_METADATA flag from the block allocation. This means the
allocation will not use blocks that are on the busy list without
first ensuring that the freeing transaction has been committed to
disk and the blocks removed from the busy list. This ensures we will
never overwrite a freed block without first ensuring that it is
really free.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs: xfs_bunmapi() does not need XFS_BMAPI_METADATA flag

xfs_bunmapi() doesn't care what type of extent is being freed and
does not look at the XFS_BMAPI_METADATA flag at all. As such we can
remove the XFS_BMAPI_METADATA from all callers that use it.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs: don't cast string literals

The commit:

a9273ca5 xfs: convert attr to use unsigned names

added these (unsigned char *) casts, but then the _SIZE macros
return "7" - size of a pointer minus one - not the length of
the string. This is harmless in the kernel, because the _SIZE
macros are not used, but as we sync up with userspace, this will
matter.

I don't think the cast is necessary; i.e. assigning the string
literal to an unsigned char *, or passing it to a function
expecting an unsigned char *, should be ok, right?
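
The sizeof behaviour is easy to reproduce in isolation. In this sketch
the macro names are made up and "SGI_ACL_FILE" is used only as an
example string; the cast turns the array operand of sizeof into a
pointer, so the _SIZE value stops tracking the string length.

```c
#include <stddef.h>

/* Plain literal: sizeof sees a char array, terminating NUL included. */
#define DEMO_NAME        "SGI_ACL_FILE"
#define DEMO_NAME_SIZE   (sizeof(DEMO_NAME) - 1)

/* Cast literal: sizeof now sees a pointer, not the array. */
#define DEMO_CAST        ((unsigned char *)"SGI_ACL_FILE")
#define DEMO_CAST_SIZE   (sizeof(DEMO_CAST) - 1)

size_t demo_name_size(void) { return DEMO_NAME_SIZE; }
size_t demo_cast_size(void) { return DEMO_CAST_SIZE; }
```

On a platform with 8-byte pointers demo_cast_size() yields 7, matching
the value quoted above, while demo_name_size() yields the real length.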

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

libxfs: Fix file type directory corruption for btree directories

Users have occasionally reported that file type for some directory
entries is wrong. This mostly happened after updating some
libraries. After some debugging the problem was traced down to
xfs_dir2_node_replace(). The function uses args->filetype as a file type
to store in the replaced directory entry however it also calls
xfs_da3_node_lookup_int() which will store file type of the current
directory entry in args->filetype. Thus we fail to change file type of a
directory entry to a proper type.

Fix the problem by storing new file type in a local variable before
calling xfs_da3_node_lookup_int().
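
A stripped-down sketch of the clobbering and the fix, using
hypothetical demo_* types rather than the real xfs_da_args and
directory entry structures:

```c
/* Hypothetical stand-ins: one directory entry and the shared args structure. */
struct demo_entry { unsigned char filetype; };
struct demo_args  { unsigned char filetype; struct demo_entry *found; };

/* Like the lookup described above: it reports the entry's current type in args. */
void demo_lookup(struct demo_args *args, struct demo_entry *entry)
{
	args->found = entry;
	args->filetype = entry->filetype;   /* clobbers the caller's requested type */
}

void demo_replace(struct demo_args *args, struct demo_entry *entry)
{
	unsigned char ftype = args->filetype;   /* the fix: save the new type first */

	demo_lookup(args, entry);
	args->found->filetype = ftype;          /* store the saved type, not args->filetype */
}
```

Without the local copy, the final store would write the entry's old
type back into itself.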

Reported-by: Giacomo Comes <comes@naic.edu>
Signed-off-by: Jan Kara <jack@suse.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

libxfs: remove self-assignment in libxfs/util.c

We don't have percpu counters in userspace, so libxfs plays
tricks.  Rather than calling percpu_counter_set() in
xfs_reinit_percpu_counters, we just directly assign
the values in mp->m_sb to the counters in mp.

But this was already handled by #defining the percpu counters
in the mount structure to those in the superblock, i.e.:

#define m_icount        m_sb.sb_icount
#define m_ifree         m_sb.sb_ifree
#define m_fdblocks      m_sb.sb_fdblocks

so we actually end up with pointless self-assignment.

Define away the xfs_reinit_percpu_counters() function,
because it's a no-op.
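
A small sketch of why the assignments collapse into self-assignment.
The demo_* structures are simplified stand-ins; the three #defines are
the ones quoted above.

```c
struct demo_sb    { unsigned long sb_icount, sb_ifree, sb_fdblocks; };
struct demo_mount { struct demo_sb m_sb; };

/* Userspace aliases the "counters" to superblock fields, as in the commit text. */
#define m_icount    m_sb.sb_icount
#define m_ifree     m_sb.sb_ifree
#define m_fdblocks  m_sb.sb_fdblocks

/* After preprocessing, every line reads mp->m_sb.X = mp->m_sb.X. */
void demo_reinit_counters(struct demo_mount *mp)
{
	mp->m_icount   = mp->m_sb.sb_icount;
	mp->m_ifree    = mp->m_sb.sb_ifree;
	mp->m_fdblocks = mp->m_sb.sb_fdblocks;
}
```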

Addresses-Coverity-Id: 1298009
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: unconditionally free blockmaps when threads complete

blkmap_free() doesn't actually free the block map unless it's
inordinately large; this keeps us from constantly freeing
and re-allocating blockmaps for each inode, which makes sense.

However, once the threads which have allocated these structures
exit, we should actually free them; they can grow up to 2MB
for each of the data and attr maps, for each thread, and not
be freed through the normal blkmap_free() test.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: call IRELE(ip) after libxfs_trans_iget calls

Commit 260c85e libxfs: dont free xfs_inode until complete
changed the alloc/free convention a bit:

    Originally, the xfs_inode are released upon the first
    call to xfs_trans_cancel, xfs_trans_commit, or
    inode_item_done.
    <snip>
    This patch does the following:
     1) Removes the iput from the transaction completion and
        requires that the xfs_inode allocators call IRELE()
        when they are done with the pointer.

But that change missed several callers in xfs_repair phase6;
fix that up.

Addresses-Coverity-Id: 1315100
Addresses-Coverity-Id: 1315101
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: free msgbuf on exit

Just to keep valgrind less noisy, and make it easier to spot
more things that actually matter ...

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

libxfs: fix memory leaks in libxfs_umount()

libxfs_umount was failing to free a handful of resources; fix that
up. Call it from xfs_copy as well, while we're at it; every other
libxfs_mount has a libxfs_umount counterpart, at least on a clean
exit.

[dchinner: fix superblock buffer leak uncovered by adding
libxfs_umount() to xfs_copy. ]

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_repair: fix broken EFSBADCRC/EFSCORRUPTED usage with buffer errors

When we encounter CRC or verifier errors, bp->b_error is set to
-EFSBADCRC and -EFSCORRUPTED; note the negative sign. For whatever
reason, repair and db use the positive versions, and therefore fail to
notice the error, so fix all the broken uses.

Note however that the db and repair turn the negative codes returned
by libxfs into positive codes that can be used with strerror.
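
The sign convention can be illustrated with a short sketch. It assumes
the XFS-private codes map onto the Linux errno values EBADMSG and
EUCLEAN; the DEMO_* macros and demo_* functions are hypothetical.

```c
#include <errno.h>
#include <string.h>

/* Assumption: XFS maps its private codes onto Linux errno values like this. */
#define DEMO_EFSBADCRC     EBADMSG
#define DEMO_EFSCORRUPTED  EUCLEAN

/* libxfs-style convention: errors are stored negated. */
int demo_is_verifier_error(int b_error)
{
	return b_error == -DEMO_EFSBADCRC || b_error == -DEMO_EFSCORRUPTED;
}

/* The broken pattern: comparing against the positive code never matches. */
int demo_broken_check(int b_error)
{
	return b_error == DEMO_EFSBADCRC || b_error == DEMO_EFSCORRUPTED;
}

/* strerror() wants the positive errno, so flip the sign only for display. */
const char *demo_describe(int b_error)
{
	return strerror(-b_error);
}
```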

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs_db: don't crash on a corrupt inode

If the user selects a corrupt inode via the 'inode XXX' command, the
read verifier will fail and the io cursor at the top of the ring will
not have any data attached. When this is the case, we cannot
dereference the NULL pointer or xfs_db will crash. Therefore, check
the buffer pointer before using it.

It's arguable that we ought to retry the read without the verifiers
if the inode is corrupt or fails CRC, since this /is/ a debugging
tool, and maybe you wanted the contents anyway.

[dchinner: fixes xfs/003 on 1k block size failure]

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

libxfs: readahead of dir3 data blocks should use the read verifier

In the dir3 data block readahead function, use the regular read
verifier to check the block's CRC and spot-check the block contents
instead of calling the spot-checking routine directly. This prevents
corrupted directory data blocks from being read into the kernel, which
can lead to garbage ls output and directory loops (if say one of the
entries contains invalid characters).

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

repair: fix wrong logic when validating node magic number

Magic number is wrong only when != XFS_DA_NODE_MAGIC and
!= XFS_DA3_NODE_MAGIC.

This is triggered by shared/002 when testing 512 block size XFS.

  Phase 1 - find and verify superblock...
  Phase 2 - using internal log
          - scan filesystem freespace and inode maps...
          - found root inode chunk
  Phase 3 - for each AG...
          - scan (but don't clear) agi unlinked lists...
          - process known inodes and perform inode discovery...
          - agno = 0
  bad magic number febe in block 64 (108) for directory inode 35
  ......

Fix it by changing "||" to "&&".
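
The logic error is easy to see in isolation. In this sketch the magic
values are what I understand the on-disk headers to use (0xfebe is the
value in the repair output above); the DEMO_* macros and demo_*
functions are illustrative only.

```c
#include <stdint.h>

/* Node magic values as assumed for this sketch. */
#define DEMO_DA_NODE_MAGIC   0xfebe   /* non-CRC node block */
#define DEMO_DA3_NODE_MAGIC  0x3ebe   /* CRC-enabled node block */

/* Buggy test: a value cannot equal both, so this is true for every input. */
int demo_bad_magic_buggy(uint16_t magic)
{
	return magic != DEMO_DA_NODE_MAGIC || magic != DEMO_DA3_NODE_MAGIC;
}

/* Fixed test: bad only when the value matches neither magic number. */
int demo_bad_magic_fixed(uint16_t magic)
{
	return magic != DEMO_DA_NODE_MAGIC && magic != DEMO_DA3_NODE_MAGIC;
}
```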

Signed-off-by: Eryu Guan <eguan@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfsprogs: Release v4.2.0-rc2

Update all the release files for a 4.2.0-rc2 release.

Signed-off-by: Dave Chinner <david@fromorbit.com>

libxfs: v3 inodes are only valid on crc-enabled filesystems

xfs_repair was not detecting that version 3 inodes are invalid
for non-CRC filesystems. The result is specific inode corruptions go
undetected and hence aren't repaired if only the version number is
out of range.

The core of the problem is that the XFS_DINODE_GOOD_VERSION() macro
doesn't know that valid inode versions are dependent on a superblock
version number. Fix this in libxfs, and propagate the new function
out into the rest of xfsprogs to fix the issue.
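
A minimal sketch of a version check that takes the filesystem format
into account. The demo_* names are hypothetical, and the rule that
CRC-enabled filesystems accept only v3 inodes is my assumption; the
text above only states that v3 is invalid without CRCs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the superblock feature check. */
struct demo_sb { bool has_crc; };

/* Old-style check: any of versions 1..3 passes, regardless of the filesystem. */
bool demo_good_version_old(uint8_t version)
{
	return version >= 1 && version <= 3;
}

/* Version-aware check: v3 inodes only with CRCs, v1/v2 only without. */
bool demo_good_version(const struct demo_sb *sb, uint8_t version)
{
	if (sb->has_crc)
		return version == 3;
	return version == 1 || version == 2;
}
```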

[dchinner: forward port from 3.2.4 to 4.2.0-rc1, move
xfs_dinode_good_version() to libxfs/xfs_inode-buf.c with all the
other dinode validation functions. ]

Reported-by: Leslie Rhorer <lrhorer@mygrande.net>
Signed-off-by: Roger Willcocks <roger@filmlight.ltd.uk>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

doc: Update OS X build info and limitations

Signed-off-by: Jan Tulak <jtulak@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>

build: Add fls check into autoconf

Signed-off-by: Jan Tulak <jtulak@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>

build: Add mntent.h check into autoconf

Signed-off-by: Jan Tulak <jtulak@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>

build: Change OS X-specific CFLAGS/LDFLAGS

OS X uses clang as a default compiler.
So remove incompatible options.

Signed-off-by: Jan Tulak <jtulak@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>

libxfs: Fix attr leaf block definition

struct xfs_attr_leafblock contains an 'entries' array which is declared
with size 1 although it can in fact contain many more entries. Since this
array is followed by further struct members, gcc (at least in version
4.8.3) thinks that the array has the fixed size of 1 element and thus
optimizes away all accesses beyond the end of array resulting in
non-working code. In particular this problem was seen with
xfsprogs-3.1.8.

Signed-off-by: Jan Kara <jack@suse.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

mkfs.xfs: fix ftype-vs-crc option combination testing

mkfs.xfs got weird along the way; today it has different outcomes
depending on the order of option specification:

$ mkfs/mkfs.xfs -n ftype=1 -m crc=0 -dfile,name=fsfile,size=16g
cannot specify both crc and ftype
$ mkfs/mkfs.xfs -m crc=0 -n ftype=1 -dfile,name=fsfile,size=16g
<succeeds>

Somehow the tests got written as being constrained on what options
are specified - and in what order! - vs actually testing for
incompatible feature sets.

It's fine to specify both crc & ftype options, as long as it's an
allowed combination, so just test for the incompatible combination
(crc=1 and ftype=0) after all options have been processed.
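
The parse-then-validate approach can be sketched as below. The option
syntax, the defaults, and the demo_* helpers are simplified
assumptions, not the real mkfs.xfs parser.

```c
#include <string.h>

struct demo_opts { int crc; int ftype; };

/* Record one "name=value" option; no cross-option checks happen here. */
void demo_parse(struct demo_opts *o, const char *opt)
{
	if (strncmp(opt, "crc=", 4) == 0)
		o->crc = opt[4] - '0';
	else if (strncmp(opt, "ftype=", 6) == 0)
		o->ftype = opt[6] - '0';
}

/* Validate the final feature set once, after every option has been seen. */
int demo_validate(const struct demo_opts *o)
{
	if (o->crc == 1 && o->ftype == 0)
		return -1;   /* the one incompatible combination */
	return 0;
}

/* Parse options in the order given, then validate; defaults are assumptions. */
int demo_mkfs(const char **opts, int n)
{
	struct demo_opts o = { .crc = 0, .ftype = 0 };

	for (int i = 0; i < n; i++)
		demo_parse(&o, opts[i]);
	return demo_validate(&o);
}
```

Because the check runs on the final state, the outcome no longer
depends on the order in which the options were given.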

[dchinner: fix dirftype init value so mkfs default config works]

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

libxfs: remove sparse inode mount warning

The sparse inodes experimental feature warning fires multiple times
during mkfs because the warning is emitted as part of the superblock
verifier codepath. The warning is intended as a mount-time warning only
and has been relocated as such in the kernel repo.

Remove the warning from libxfs such that it is not emitted from
userspace.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>