Now that reflink operations don't set the firstblock value we don't
need the workarounds for non-NULL firstblock values without a prior
allocation.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
The main thing that xfs_bmap_remap_alloc does is fixing the AGFL, similar
to what we do in the space allocator. But the reflink code doesn't touch
the allocation btree unlike the normal space allocator, so we couldn't
care less about the state of the AGFL.
So remove xfs_bmap_remap_alloc and just handle the di_nblocks update in
the caller.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Add a new helper to be used for reflink extent list additions instead of
funneling them through xfs_bmapi_write and overloading the firstblock
member in struct xfs_bmalloca and struct xfs_alloc_args.
With some small changes to xfs_bmap_remap_alloc this also means we do
not need a xfs_bmalloca structure for this case at all.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
For the reflink case we'd much rather pass the required arguments than
faking up a struct xfs_bmalloca.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
We never do COW operations for the attr fork, so don't pretend we handle
them.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
bno should be a xfs_fsblock_t, which is 64-bit wides instead of a
xfs_aglock_t, which truncates the value to 32 bits.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
ndquots is a 32-bit value, and we don't care
about the remainder; there is no reason to use do_div
here, it seems to be the result of a decade+ historical
accident.
Worse, the do_div implementation in userspace breaks
when fed a 32-bit dividend, so we commented it out there
in any case.
Change to simple division, and then we can change
userspace to match, and mandate a 64-bit dividend in
the do_div() in userspace as well.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Introduce a new ioctl that uses the reverse mapping btree to return
information about the physical layout of the filesystem.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Add _query_range and _query_all functions to the realtime bitmap
allocator. These two functions are similar in usage to the btree
functions with the same name and will be used for getfsmap and scrub.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Create a helper function that will query all records in a btree.
This will be used by the online repair functions to examine every
record in a btree to rebuild a second btree.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Implement a query_range function for the bnobt and cntbt. This will
be used for getfsmap fallback if there is no rmapbt and by the online
scrub and repair code.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Plumb in the pieces (init_high_key, diff_two_keys) necessary to call
query_range on the free space btrees. Remove the debugging asserts
so that we can make queries starting from block 0.
While we're at it, merge the redundant "if (btnum ==" hunks.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
"xfs_iread: validation failed for inode 96 failed"
One "failed" seems like enough.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Alex Elder <elder@linaro.org> Reviewed-by: Bill O'Donnell <billodo@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Opencoding the trivial checks makes it much easier to read (and grep..).
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
This checks for all the non-normal extent types, including handling both
encodings of delayed allocations.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Zirong Lang [Wed, 3 May 2017 19:54:26 +0000 (14:54 -0500)]
xfs_io: add missed quotation marks in man page
The description about set_encpolicy command in xfs_io man page missed
some quotation marks. It's a little picky, but it really causes the
quotation marks can't match each other.
Signed-off-by: Zorro Lang <zlang@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Zirong Lang [Wed, 3 May 2017 19:54:00 +0000 (14:54 -0500)]
xfs_io: add missed inode command into man page
There's an "inode" command in xfs_io, it's used to query physical
information about an inode. But there's not any information about
it in xfs_io and other related man pages. So document this command
in the xfs_io man page now.
[sandeen: include some of djwong's edits]
Signed-off-by: Zorro Lang <zlang@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Tue, 2 May 2017 16:12:57 +0000 (11:12 -0500)]
xfs_db: dump metadata btrees via 'btdump'
Introduce a new 'btdump' command that can print the contents of all
blocks of any metadata subtree in the filesystem. This enables
developers and forensic analyst to view a metadata structure without
having to navigate the btree manually.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Tue, 2 May 2017 16:12:55 +0000 (11:12 -0500)]
xfs_db: use iocursor type to guess btree geometry if bad magic
The function block_to_bt plays an integral role in determining the btree
geometry of a block that we want to manipulate with the debugger.
Normally we use the block magic to find the geometry profile, but if the
magic is bad we'll never find it and return NULL. The callers of this
function do not check for NULL and crash.
Therefore, if we can't find a geometry profile matching the magic
number, use the iocursor type to guess the profile and scowl about that
to stdout. This makes it so that even with a corrupt magic we can try
to print the fields instead of crashing the debugger.
[sandeen: comment changes, add magic ASSERT, keep other ASSERTs]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Tue, 2 May 2017 16:12:54 +0000 (11:12 -0500)]
xfs_db: don't print arrays off the end of a buffer
Before printing an array, clamp the array count against the size of the
buffer so that we don't print random heap contents.
[sandeen: re-use fsz variable in call to prfunc]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Eric Sandeen [Tue, 2 May 2017 16:12:53 +0000 (11:12 -0500)]
mkfs.xfs: Assign proper defaults to rmapbt and reflink flags
The "defaultval" field in the options structure was a bit confusing,
so when the rmapbt & reflink options got added, the desire was
to keep them off by default, and "defaultval = 0" got set.
However, the purpose of this field is to define the default value
when the flag is specified with no associated value, i.e.
-m rmapbt vs. -m rmapbt=0 or -m rmapbt=1
Today, the resulting behavior is unexpected, and different from any
other mkfs flags; specifying "-m rmapbt,reflink" results in a
filesystem /without/ those features.
Fix these to be consistent with every other boolean flag in the
mkfs options, so that specifying the flag with no value will
enable the feature.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
chandan [Tue, 2 May 2017 16:12:52 +0000 (11:12 -0500)]
xfs_io: Add statx support for PowerPC architecture
Linux kernel commit f717629c7f834ab2efa05c7dbf0826f1d7c32ade (powerpc:
Wire up statx() syscall) added support for statx syscall for PowerPC
architecture. This commit enables using 'statx' sub-command of xfs_io to
be used on PowerPC.
Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Eric Sandeen [Tue, 2 May 2017 16:12:51 +0000 (11:12 -0500)]
xfs_io: fix statx definition for non-x86 architecture
Apply the same fix to xfs_io as Gwendal did for fstests:
Fix a compilation error for ARM:
__ILP32__ is defined but not __X32_SYSCALL_BIT.
The check should only apply for x86_64 architecture, statx for other
architectures is not implemented yet - see commit 7acc839c9e57
"statx: Add a system call to make enhanced file info available".
Signed-off-by: Gwendal Grignou <gwendal@chromium.org> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Eric Sandeen [Tue, 2 May 2017 16:12:50 +0000 (11:12 -0500)]
xfs_db: allow write -d to dqblks
Allow write -d to write bad data and recalculate CRC
for dqblks.
Inspired-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Tue, 2 May 2017 16:12:31 +0000 (11:12 -0500)]
xfs_db: allow write -d to inodes
Add a helper function to xfs_db so that we can recalculate the CRC of an
inode whose field we just wrote. This enables us to write arbitrary
values with a good CRC for the purpose of checking the read verifiers on
a v5 filesystem.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Tue, 2 May 2017 16:12:31 +0000 (11:12 -0500)]
xfs_db.8: document write -d option
The xfs_db write "-d" option allows us to write bad
data with a good CRC, as added in 86769b3 xfs_db: allow recalculating CRCs on invalid metadata
but never documented, so do that now.
[sandeen: split out doc patch]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Eric Sandeen [Mon, 10 Apr 2017 22:40:58 +0000 (17:40 -0500)]
xfs.5: document barrier deprecation in xfs
Since kernel v4.10, the barrier mount option is deprecated:
2291dab2 xfs: Always flush caches when integrity is required 4cf4573d xfs: deprecate barrier/nobarrier mount option
Document this fact in the xfs(5) manpage.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Eric Sandeen [Mon, 10 Apr 2017 22:34:45 +0000 (17:34 -0500)]
xfs_io: hook up statx
Wire up the statx syscall to xfs_io.
xfs_io> help statx
statx [-v|-r][-m basic | -m all | -m <mask>][-FD] -- extended statistics on the currently open file
Display extended file status.
Options:
-v -- More verbose output
-r -- Print raw statx structure fields
-m mask -- Specify the field mask for the statx call
(can also be 'basic' or 'all'; default STATX_ALL)
-D -- Don't sync attributes with the server
-F -- Force the attributes to be sync'd with the server
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Eric Sandeen [Mon, 10 Apr 2017 22:33:38 +0000 (17:33 -0500)]
xfs_io: refactor stat functions, add raw dump
This adds a "-r" raw structure dump to stat options, and
factors the code a bit; statx will also use print_file_info
and print_xfs_info.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Eric Sandeen [Mon, 10 Apr 2017 22:33:15 +0000 (17:33 -0500)]
xfs_io: move stat functions to new file
Adding statx will add a bit of code, so break stat-related
functions out of open.c into their own new file.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Eric Sandeen [Mon, 10 Apr 2017 22:32:04 +0000 (17:32 -0500)]
xfsprogs: fix build dep on configure.ac
Zorro reported that this sequence:
# git checkout v4.9.0; make realclean; make
# git checkout v4.10.0; make clean; make
fails:
...
Building libxfs
[CC] gen_crc32table
gcc: error: @BUILD_CFLAGS@: No such file or directory
gmake[3]: *** No rule to make target `crc32table.h', needed by `crc32selftest'. Stop.
This is because
0a71e38 build: Allow compiling xfsprogs in a cross compile environment
added the new BUILD_CFLAGS to configure.ac, and unless we re-run
autotools, that variable does not get substituted when
include/builddefs gets built.
(This can be worked around by "make realclean" and then everything
gets regenerated.)
The configure script is generated from configure.ac, so adding
a Make dependency here should resolve such issues in the future.
Reported-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Eric Sandeen [Mon, 10 Apr 2017 22:32:04 +0000 (17:32 -0500)]
xfs_repair: pass btnum not magic to phase5 functions
When ed849ef xfs: remove boilerplate around xfs_btree_init_block
was merged from kernelspace, I made only minimal changes at the
libxfs boundary to accommodate the new libxfs_btree_init_block
interface.
We can chase that up a bit higher and remove more code by
passing in btnum from the start; we can also remove the
"finobt" argument from build_ino_tree() because that is
known from type of tree passed in.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Eric Sandeen [Mon, 10 Apr 2017 22:32:04 +0000 (17:32 -0500)]
xfs_repair: warn about dirty log with -n option
When looking at xfs_repair -n output today, we have no idea if
reported errors may be due to an un-replayed dirty log. If this
is the case, mention it in the output.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Mathias Troiden reproduced a filesystem corruption that resulted in
a zero-sized local format symlink inode. This is invalid state and
results in an inode that cannot be accessed or modified.
The kernel detects this problem on inode access, fails and warns the
user to umount and run xfs_repair. Unfortunately, xfs_repair doesn't
even detect the problem. Thus the user has no path to recovery.
Update xfs_repair to check for invalid zero-sized symlinks and flag
them as corrupted. This results in tossing the inode, but returns
the fs to a valid state.
Reported-by: Mathias Troiden <mathias.troiden@gmail.com> Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Eric Sandeen [Mon, 10 Apr 2017 22:31:55 +0000 (17:31 -0500)]
xfsprogs: remove unused libxfs helper #defines
There are several #defines which aren't used anywhere
in either xfsprogs or the kernel, so remove them from the
libxfs/libxfs_priv.h helper file.
This does leave a few (currently) unused defines in place
if they are still used in the kernel, in the hope that future
libxfs changes might Just Work.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
The inline directory verifiers should be called on the inode fork data,
which means after iformat_local on the read side, and prior to
ifork_flush on the write side. This makes the fork verifier more
consistent with the way buffer verifiers work -- i.e. they will operate
on the memory buffer that the code will be reading and writing directly.
Furthermore, revise the verifier function to return -EFSCORRUPTED so
that we don't flood the logs with corruption messages and assert
notices. This has been a particular problem with xfs/348, which
triggers the XFS_WANT_CORRUPTED_RETURN assertions, which halts the
kernel when CONFIG_XFS_DEBUG=y. Disk corruption isn't supposed to do
that, at least not in a verifier.
Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
xfs_extent_busy_flush is a void function, so don't reduce it to zero.
This shuts up gcc warnings about do-nothing statements.
[sandeen: switch to more common ((void)0) paradigm]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
When we're reading or writing the data fork of an inline directory,
check the contents to make sure we're not overflowing buffers or eating
garbage data. xfs/348 corrupts an inline symlink into an inline
directory, triggering a buffer overflow bug.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
When a reflink operation causes the bmap code to allocate a btree block
we're currently doing single-AG allocations due to having ->firstblock
set and then try any higher AG due a little reflink quirk we've put in
when adding the reflink code. But given that we do not have a minleft
reservation of any kind in this AG we can still not have any space in
the same or higher AG even if the file system has enough free space.
To fix this use a XFS_ALLOCTYPE_FIRST_AG allocation in this fall back
path instead.
[And yes, we need to redo this properly instead of piling hacks over
hacks. I'm working on that, but it's not going to be a small series.
In the meantime this fixes the customer reported issue]
Also add a warning for failing allocations to make it easier to debug.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Commit fa7f138 ("xfs: clear delalloc and cache on buffered write
failure") fixed one regression in the iomap error handling code and
exposed another. The fundamental problem is that if a buffered write
is a rewrite of preexisting delalloc blocks and the write fails, the
failure handling code can punch out preexisting blocks with valid
file data.
This was reproduced directly by sub-block writes in the LTP
kernel/syscalls/write/write03 test. A first 100 byte write allocates
a single block in a file. A subsequent 100 byte write fails and
punches out the block, including the data successfully written by
the previous write.
To address this problem, update the ->iomap_begin() handler to
distinguish newly allocated delalloc blocks from preexisting
delalloc blocks via the IOMAP_F_NEW flag. Use this flag in the
->iomap_end() handler to decide when a failed or short write should
punch out delalloc blocks.
This introduces the subtle requirement that ->iomap_begin() should
never combine newly allocated delalloc blocks with existing blocks
in the resulting iomap descriptor. This can occur when a new
delalloc reservation merges with a neighboring extent that is part
of the current write, for example. Therefore, drop the
post-allocation extent lookup from xfs_bmapi_reserve_delalloc() and
just return the record inserted into the fork. This ensures only new
blocks are returned and thus that preexisting delalloc blocks are
always handled as "found" blocks and not punched out on a failed
rewrite.
Reported-by: Xiong Zhou <xzhou@redhat.com> Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
XFS_ALLOCTYPE_ANY_AG was only used for the RT allocator and is unused
now, and XFS_ALLOCTYPE_START_AG has been unused for a while.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
In various places we currently assert that xfs_bmap_btalloc allocates
from the same as the firstblock value passed in, unless it's either
NULLAGNO or the dop_low flag is set. But the reflink code does not
fully follow this convention as it passes in firstblock purely as
a hint for the allocator without actually having previous allocations
in the transaction, and without having a minleft check on the current
AG, leading to the assert firing on a very full and heavily used
file system. As even the reflink code only allocates from equal or
higher AGs for now we can simply the check to always allow for equal
or higher AGs.
Note that we need to eventually split the two meanings of the firstblock
value. At that point we can also allow the reflink code to allocate
from any AG instead of limiting it in any way.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
When block size is larger than inode cluster size, the call to
XFS_B_TO_FSBT(mp, mp->m_inode_cluster_size) returns 0. Also, mkfs.xfs
would have set xfs_sb->sb_inoalignmt to 0. This causes
xfs_ialloc_cluster_alignment() to return 0. Due to this
args.minalignslop (in xfs_ialloc_ag_alloc()) gets the unsigned
equivalent of -1 assigned to it. This later causes alloc_len in
xfs_alloc_space_available() to have a value of 0. In such a scenario
when args.total is also 0, the assert statement "ASSERT(args->maxlen >
0);" fails.
This commit fixes the bug by replacing the call to XFS_B_TO_FSBT() in
xfs_ialloc_cluster_alignment() with a call to xfs_icluster_size_fsb().
Suggested-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Certain workoads that punch holes into speculative preallocation can
cause delalloc indirect reservation splits when the delalloc extent is
split in two. If further splits occur, an already short-handed extent
can be split into two in a manner that leaves zero indirect blocks for
one of the two new extents. This occurs because the shortage is large
enough that the xfs_bmap_split_indlen() algorithm completely drains the
requested indlen of one of the extents before it honors the existing
reservation.
This ultimately results in a warning from xfs_bmap_del_extent(). This
has been observed during file copies of large, sparse files using 'cp
--sparse=always.'
To avoid this problem, update xfs_bmap_split_indlen() to explicitly
apply the reservation shortage fairly between both extents. This smooths
out the overall indlen shortage and defers the situation where we end up
with a delalloc extent with zero indlen reservation to extreme
circumstances.
Reported-by: Patrick Dung <mpatdung@gmail.com> Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
When a delalloc extent is created, it can be merged with pre-existing,
contiguous, delalloc extents. When this occurs,
xfs_bmap_add_extent_hole_delay() merges the extents along with the
associated indirect block reservations. The expectation here is that the
combined worst case indlen reservation is always less than or equal to
the indlen reservation for the individual extents.
This is not always the case, however, as existing extents can less than
the expected indlen reservation if the extent was previously split due
to a hole punch. If a new extent merges with such an extent, the total
indlen requirement may be larger than the sum of the indlen reservations
held by both extents.
xfs_bmap_add_extent_hole_delay() assumes that the worst case indlen
reservation is always available and assigns it to the merged extent
without consideration for the indlen held by the pre-existing extent. As
a result, the subsequent xfs_mod_fdblocks() call can attempt an
unintentional allocation rather than a free (indicated by an ASSERT()
failure). Further, if the allocation happens to fail in this context,
the failure goes unhandled and creates a filesystem wide block
accounting inconsistency.
Fix xfs_bmap_add_extent_hole_delay() to function as designed. Cap the
indlen reservation assigned to the merged extent to the sum of the
indlen reservations held by each of the individual extents.
Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Currently we force the log and simply try again if we hit a busy extent,
but especially with online discard enabled it might take a while after
the log force for the busy extents to disappear, and we might have
already completed our second pass.
So instead we add a new waitqueue and a generation counter to the pag
structure so that we can do wakeups once we've removed busy extents,
and we replace the single retry with an unconditional one - after
all we hold the AGF buffer lock, so no other allocations or frees
can be racing with us in this AG.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
When we allocate COW fork blocks for direct I/O writes we currently first
create a delayed allocation, and then convert it to a real allocation
once we've got the delayed one.
As there is no good reason for that this patch instead makes use call
xfs_bmapi_write from the COW allocation path. The only interesting bits
are a few tweaks the low-level allocator to allow for this, most notably
the need to remove the call to xfs_bmap_extsize_align for the cowextsize
in xfs_bmap_btalloc - for the existing convert case it's a no-op, but
for the direct allocation case it would blow up our block reservation
way beyond what we reserved for the transaction.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
In the data fork, we only allow extents to perform the following state
transitions:
delay -> real <-> unwritten
There's no way to move directly from a delalloc reservation to an
/unwritten/ allocated extent. However, for the CoW fork we want to be
able to do the following to each extent:
delalloc -> unwritten -> written -> remapped to data fork
This will help us to avoid a race in the speculative CoW preallocation
code between a first thread that is allocating a CoW extent and a second
thread that is remapping part of a file after a write. In order to do
this, however, we need two things: first, we have to be able to
transition from da to unwritten, and second the function that converts
between real and unwritten has to be made aware of the cow fork. Do
both of those things.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
We can't handle a bmbt that's taller than BTREE_MAXLEVELS, and there's
no such thing as a zero-level bmbt (for that we have extents format),
so if we see this, send back an error code.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Don't let anybody load an obviously bad btree pointer. Since the values
come from disk, we must return an error, not just ASSERT.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
When we open a directory, we try to readahead block 0 of the directory
on the assumption that we're going to need it soon. If the bmbt is
corrupt, the directory will never be usable and the readahead fails
immediately, so we might as well prevent the directory from being opened
at all. This prevents a subsequent read or modify operation from
hitting it and taking the fs offline.
NOTE: We're only checking for early failures in the block mapping, not
the readahead directory block itself.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
We use di_format and if_flags to decide whether we're grabbing the ilock
in btree mode (btree extents not loaded) or shared mode (anything else),
but the state of those fields can be changed by other threads that are
also trying to load the btree extents -- IFEXTENTS gets set before the
_bmap_read_extents call and cleared if it fails.
We don't actually need to have IFEXTENTS set until after the bmbt
records are successfully loaded and validated, which will fix the race
between multiple threads trying to read the same directory. The next
patch strengthens directory bmbt validation by refusing to open the
directory if reading the bmbt to start directory readahead fails.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
After scratching my head looking for "xfs_busy_extent" I realized
it's not used; it's xfs_extent_busy, and the declaration for the
other name is bogus. Remove that and a few others as well.
(struct xfs_log_callback is used, but the 2nd declaration is
unnecessary).
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Now that xfs_btree_init_block_int is able to determine crc
status from the passed-in mp, we can determine the proper
magic as well if we are given a btree number, rather than
an explicit magic value.
Change xfs_btree_init_block[_int] callers to pass in the
btree number, and let xfs_btree_init_block_int use the
xfs_magics array via the xfs_btree_magic macro to determine
which magic value is needed. This makes all of the
if (crc) / else stanzas identical, and the if/else can be
removed, leading to a single, common init_block call.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Right now the xfs_btree_magic() define takes only a cursor;
change this to take crc and btnum args to make it more generically
useful, and move to a function.
This will allow xfs_btree_init_block_int callers which don't
have a cursor to make use of the xfs_magics array, which will
happen in the next patch.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
xfs_btree_init_block_int() can determine whether crcs are
in effect without the passed-in XFS_BTREE_CRC_BLOCKS flag;
the mp argument allows us to determine this from the
superblock. Remove the flag from callers, and use
xfs_sb_version_hascrc(&mp->m_sb) internally instead.
This removes one difference between the if & else cases
in the callers.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Wed, 22 Feb 2017 20:39:01 +0000 (14:39 -0600)]
include: don't collide __bitwise definitions in 4.10
Linux 4.10 changed the definition of __bitwise in such a way that
xfsprogs' definition is no longer a strict match for it. This causes
gcc to complain, so only #define it here if the system hasn't already
done it for us.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Eric Sandeen [Thu, 16 Feb 2017 03:48:31 +0000 (21:48 -0600)]
xfs_metadump: ignore attr leaf with 0 entries
Another in the ongoing saga of attribute leaves with zero
entries; in this case, if we try to metadump an inode with
a zero-entries attribute leaf, the zeroing code will go off
the rails and segfault at:
because first_name is null, and we try to memset a large
(negative) number.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Thu, 16 Feb 2017 03:04:03 +0000 (21:04 -0600)]
libxfs: sync up FSGETXATTR names and definitions with the kernel
The rest of xfsprogs uses FS_XFLAG values for FSGETXATTR as defined in
the kernel, so we should do the same in io/cowextsize.c. Also, move the
XFS_IOC_FSGETXATTR definition to the same part of xfs_fs.h as the
kernel.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Eric Biggers [Thu, 16 Feb 2017 03:04:03 +0000 (21:04 -0600)]
xfsprogs: Fix building xfsprogs on 32-bit platforms (again)
Building xfsprogs on 32-bit platforms was broken again by the recent
split of BUILD_CFLAGS from CFLAGS. -D_FILE_OFFSET_BITS=64 was not added
to BUILD_CFLAGS, but in fact BUILD_CFLAGS is used to compile
crc32selftest, which includes xfs.h and therefore requires this
declaration. Fix this by adding -D_FILE_OFFSET_BITS=64 to BUILD_CFLAGS.
Fixes: 0a71e3839630 ("build: Allow compiling xfsprogs in a cross compile environment") Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
With COW files they are the hotpath, just like for files with the
extent size hint attribute. We really shouldn't micro-manage anything
but failure cases with unlikely.
Additionally Arnd Bergmann recently reported that one of these two
unlikely annotations causes link failures together with an upcoming
kernel instrumentation patch, so let's get rid of it ASAP.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
xfs_attr_[get|remove]() have unlocked attribute fork checks to optimize
away a lock cycle in cases where the fork does not exist or is otherwise
empty. This check is not safe, however, because an attribute fork short
form to extent format conversion includes a transient state that causes
the xfs_inode_hasattr() check to fail. Specifically,
xfs_attr_shortform_to_leaf() creates an empty extent format attribute
fork and then adds the existing shortform attributes to it.
This means that lookup of an existing xattr can spuriously return
-ENOATTR when racing against a setxattr that causes the associated
format conversion. This was originally reproduced by an untar on a
particularly configured glusterfs volume, but can also be reproduced on
demand with properly crafted xattr requests.
The format conversion occurs under the exclusive ilock. xfs_attr_get()
and xfs_attr_remove() already have the proper locking and checks further
down in the functions to handle this situation correctly. Drop the
unlocked checks to avoid the spurious failure and rely on the existing
logic.
Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Currently we try to rely on the global reserved block pool for block
allocations for the free inode btree, but I have customer reports
(fairly complex workload, need to find an easier reproducer) where that
is not enough as the AG where we free an inode that requires a new
finobt block is entirely full. This causes us to cancel a dirty
transaction and thus a file system shutdown.
I think the right way to guard against this is to treat the finot the same
way as the refcount btree and have a per-AG reservations for the possible
worst case size of it, and the patch below implements that.
Note that this could increase mount times with large finobt trees. In
an ideal world we would have added a field for the number of finobt
fields to the AGI, similar to what we did for the refcount blocks.
We should do add it next time we rev the AGI or AGF format by adding
new fields.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Try to reserve the blocks first and only then update the fields in
or hanging off the mount structure. This way we can call __xfs_ag_resv_init
again after a previous failure.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
sb_dirblklog is added to sb_blocklog to compute the directory block size
in bytes. Therefore, we must compare the sum of both those values
against XFS_MAX_BLOCKSIZE_LOG, not just dirblklog.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Due to the way how xfs_iomap_write_allocate tries to convert the whole
found extents from delalloc to real space we can run into a race
condition with multiple threads doing writes to this same extent.
For the non-COW case that is harmless as the only thing that can happen
is that we call xfs_bmapi_write on an extent that has already been
converted to a real allocation. For COW writes where we move the extent
from the COW to the data fork after I/O completion the race is, however,
not quite as harmless. In the worst case we are now calling
xfs_bmapi_write on a region that contains hole in the COW work, which
will trip up an assert in debug builds or lead to file system corruption
in non-debug builds. This seems to be reproducible with workloads of
small O_DSYNC write, although so far I've not managed to come up with
a with an isolated reproducer.
The fix for the issue is relatively simple: tell xfs_bmapi_write
that we are only asked to convert delayed allocations and skip holes
in that case.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
fs/xfs/libxfs/xfs_dir2.h:40:8: error: type qualifiers ignored on function return type [-Werror=ignored-qualifiers]
Removing the 'const' modifier avoids the warning and has no
other effect.
Fixes: 1fc4d33fed12 ("xfs: replace xfs_mode_to_ftype table with switch statement") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Eric Sandeen [Thu, 26 Jan 2017 02:37:23 +0000 (20:37 -0600)]
xfs_repair: Fix uninit var in process_leaf_attr_level
My unreviewed maintainer adjustment on the way in gets a
brown paper bag.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Thu, 26 Jan 2017 02:02:43 +0000 (20:02 -0600)]
tools/find-api-violations: fix fs -> fsr in the directory list
Fix a stupid typo in the original commit.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
This is due to fp_dinode_fmt() interpretting the di_format field as
signed. This commit fixes the bug by passing BVUNSIGNED (instead of
BVSIGNED) as the argument to getbitval(). With this commit applied, we
now get,
Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Thu, 26 Jan 2017 02:02:43 +0000 (20:02 -0600)]
xfs_repair: trash dirattr btrees that cycle to the root
If xfs_repair detects a dir/attr btree that cycles back to the root, the
tree should be cleared and/or rebuilt instead of simply aborting the
repair program.
[sandeen: move check outside main loop]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Thu, 26 Jan 2017 02:02:43 +0000 (20:02 -0600)]
xfs_repair: zero shared_vn
Since shared_vn always has to be zero, zero it at the start of repair.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Thu, 26 Jan 2017 02:02:43 +0000 (20:02 -0600)]
xfs_repair: strengthen geometry checks
In xfs_repair, the inodelog, sectlog, and dirblklog values are read
directly into the xfs_mount structure without any sanity checking by the
verifier. This results in xfs_repair segfaulting when those fields have
ridiculously high values because the pointer arithmetic runs us off the
end of the metadata buffers. Therefore, reject the superblock if these
values are garbage and try to find one of the other ones. Clean up the
dblocks checking to use the relevant macros.
The superblock field fuzzer (xfs/1301) triggers all these segfaults.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Thu, 26 Jan 2017 02:02:43 +0000 (20:02 -0600)]
xfs_db: fix the 'source' command when passed as a -c option
The 'source' command is supposed to read commands out of a file and
execute them. This works great when done from an interactive command
line, but it doesn't work at all when invoked from the command line
because we never actually do anything with the opened file.
So don't load stdin into the input stack when we're only executing
command line options, and use that to decide if source_f is executing
from the command line so that we can actually run the input loop. We'll
use this for the per-field fuzzing xfstests.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Eric Sandeen [Thu, 26 Jan 2017 02:02:43 +0000 (20:02 -0600)]
libxfs: sanitize agcount on load
Before we get into libxfs_initialize_perag and try to blindly
allocate a perag struct for every (possibly corrupted number of)
AGs, see if we can read the last one. If not, assume it's corrupt,
and load only the first AG.
Do this only for an arbitrarily high-ish agcount, so that normal-ish
geometry on a possibly truncated file or device will still do
its best to make all readable AGs available.
Set xfs_db's exitcode to 1 if this happens.
Also teach metadump to detect this and exit appropriately if
truncated, as it resets exitcode to 0 for its own purposes internally.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Hou Tao [Thu, 26 Jan 2017 02:02:42 +0000 (20:02 -0600)]
xfs_logprint: handle log operation split of inode item correctly
If an inode log item has 4 log operations, and the 4th operation
(attr fork op) is splitted to the next log record due to the size
limitation of log record, xfs_logprint doesn't check whether or not
the 4th operation is in the current log record and print invalid data.
xfs_logprint also needs to calculate the count of splitted log
operations correctly instead of just returning 1.
The following is a diff of the output before and after the patch
is applied:
The port of the opensource xfsprogs to IRIX always was secondary to the
"real" tools in the IRIX tree. IRIX has effectively been EOLed, and
dropping support for it will allow cleaning up various things in the XFS
tree where IRIX was oddly different from the other ports. E.g. that the
xfsctl function needs a path and a fd, something we could replace with
just documenting to use ioctl in the future, or various ifdefs in the
headers shared with the kernel for structures provided by native headers
in IRIX.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Check for invalid file type in xfs_dinode_verify()
and fail to load the inode structure from disk.
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
The size of the xfs_mode_to_ftype[] conversion table
was too small to handle an invalid value of mode=S_IFMT.
Instead of fixing the table size, replace the conversion table
with a conversion helper that uses a switch statement.
Suggested-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
xfs_dir2.h dereferences some data types in inline functions
and fails to include those type definitions, e.g.:
xfs_dir2_data_aoff_t, struct xfs_da_geometry.
Signed-off-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
This changes fixes an assertion hit when fuzzing on-disk
i_mode values.
The easy case to fix is when changing an empty file
i_mode to S_IFDIR. In this case, xfs_dinode_verify()
detects an illegal zero size for directory and fails
to load the inode structure from disk.
For the case of non empty file whose i_mode is changed
to S_IFDIR, the ASSERT() statement in xfs_dir2_isblock()
is replaced with return -EFSCORRUPTED, to avoid interacting
with corrupted jusk also when XFS_DEBUG is disabled.
Suggested-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Thu, 12 Jan 2017 20:12:42 +0000 (14:12 -0600)]
xfs_repair.8: document dirty log conditions
Add a section describing what is a dirty log, why xfs_repair won't touch
such things, and what one can do to clear the condition and check the
filesystem.
[ sandeen: light editing, minor man-page-ification ]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Deepa Dinamani [Thu, 12 Jan 2017 20:12:41 +0000 (14:12 -0600)]
xfs_io: implement 'utimes' command
Add the utimes command to provide a way to utilize
the futimens C library call. This is the
interface to the utimensat system call, which updates
the mtime and atime of a file.
[ sandeen: minor merge fixups, re-alphabetization ]
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Dave Chinner [Thu, 12 Jan 2017 20:12:41 +0000 (14:12 -0600)]
libxcmd: add non-iterating user commands
Right now command iteration is not directly controllable by the
user; it is controlled entirely by the application command flag
setup. Sometimes we don't want commands to iterate but only operate
on the currently selected object.
To do this, add a function to supply a "non-iterating" user command
that will execute an iterating-capable command as though it
CMD_FLAG_ONESHOT was set. Add a new command line option to xfs_io to
drive it (-C <command>) and connect it all up. Document it in the
xfs_io man page, too.
Dave Chinner [Thu, 12 Jan 2017 20:12:41 +0000 (14:12 -0600)]
xfs_io: make various commands one-shot only
It makes no sense to iterate the file table for some xfs_io
commands. Some commands are already marked in this way, but lots of
them are not and this leads to bad behaviour. For example, the open
command will run until the process fd table is full and EMFILE is
returned rather than just opening the specified file once.
Signed-Off-By: Dave Chinner <dchinner@redhat.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>