Dave Chinner [Fri, 7 Jun 2013 00:25:55 +0000 (10:25 +1000)]
xfs_db: convert directory parsing to use libxfs structure
xfs_db rolls its own "opaque" directory types for the different
block formats. All it cares about is where the headers end and the
data starts, and none of the other details in the structures. Rather
than duplicate this for the dir3 format, we already have perfectly
good headers and abstraction functions for finding this information
in libxfs. Using these means that the dir2 code used for printing
fields, metadump and check needs to be modified to use libxfs
definitions.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
Dave Chinner [Fri, 7 Jun 2013 00:25:51 +0000 (10:25 +1000)]
libxfs: determine inode size from version number, not struct xfs_dinode
xfs_db does not use the same structure types as libxfs when checking
inodes, and so cannot determine the size of the inode core by
passing a struct xfs_dinode to a function. We do, however, know the
raw version number, so we can pass that instead. Convert the code to
passing the inode version rather than a structure.
Note that this should probably be converted in the kernel code as
well.
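As a rough sketch of the idea (not the libxfs code itself), a size lookup keyed on the raw version number could look like the following. The helper name and the two byte counts are assumptions made for illustration; the real values come from the on-disk struct xfs_dinode definition.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Assumed sizes, for illustration only. The real values are derived from
 * the on-disk inode structure in libxfs.
 */
#define SKETCH_DINODE_SIZE_V2	100	/* assumed: inode core without CRC fields */
#define SKETCH_DINODE_SIZE_V3	176	/* assumed: inode core with CRC fields */

/* Hypothetical helper: size of the inode core given only the raw version. */
static size_t sketch_dinode_size(int version)
{
	assert(version >= 1 && version <= 3);
	if (version == 3)
		return SKETCH_DINODE_SIZE_V3;
	return SKETCH_DINODE_SIZE_V2;
}
```

A caller such as xfs_db only needs the version byte it has already read from disk, not a fully typed inode structure.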
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
Dave Chinner [Fri, 7 Jun 2013 00:25:50 +0000 (10:25 +1000)]
xfs_db: disable modification for CRC enabled filesystems.
xfs_db does not have the IO infrastructure to calculate metadata
CRCs after modifying metadata. Hence xfs_db can only run in
read-only mode on filesystems with version 5 superblocks.
To fix this, xfs_db needs to have its IO engine converted to use
the buffer based IO provided by libxfs rather than rolling its own
IO routines. That is future work, so until this conversion is done,
only allow xfs_db to run in read-only mode on v5 filesystems.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
Dave Chinner [Fri, 7 Jun 2013 00:25:49 +0000 (10:25 +1000)]
xfsprogs: disable xfs_check for CRC enabled filesystems
Until xfs_db has full metadata CRC support, xfs_check will not be
able to fully verify filesystems in this format. Don't even
bother trying right now, and to make it simple to test full xfsprogs
installs with xfstests, just silently succeed.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
Dave Chinner [Fri, 7 Jun 2013 00:25:47 +0000 (10:25 +1000)]
xfsprogs: add crc format support to repair
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
[minor whitespace, spelling, and coarse language cleanup -bpm]
Dave Chinner [Fri, 7 Jun 2013 00:25:45 +0000 (10:25 +1000)]
xfsprogs: Add verifiers to libxfs buffer interfaces.
Verifiers need to be used everywhere to enable calculation of CRCs
during writeback of modified metadata. Add them to the libxfs buffer
interfaces and convert the internal use of devices to be buftarg aware.
Verifiers also require that the buffer has a back pointer to the
struct xfs_mount. To make this source level compatible between
kernel and userspace, convert userspace to pass struct xfs_buftargs
around rather than a "device".
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
Dave Chinner [Fri, 7 Jun 2013 00:25:44 +0000 (10:25 +1000)]
xfs: implement extended feature masks
The version 5 superblock has extended feature masks for compatible,
incompatible and read-only compatible feature sets. Implement the
masking and mount-time checking for these feature masks.
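A minimal sketch of what such mount-time checking might look like, assuming the usual semantics for these three classes: unknown incompatible bits refuse the mount, unknown read-only compatible bits permit only a read-only mount, and unknown compatible bits are ignored. All names and mask values here are hypothetical, not the ones the patch defines.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical masks of the feature bits this build understands. */
#define SKETCH_KNOWN_RO_COMPAT	0x1u
#define SKETCH_KNOWN_INCOMPAT	0x3u

struct sketch_sb_features {
	uint32_t compat;	/* unknown bits here are harmless */
	uint32_t ro_compat;	/* unknown bits: read-only mounts only */
	uint32_t incompat;	/* unknown bits: do not mount at all */
};

enum sketch_mount_result {
	SKETCH_MOUNT_OK,
	SKETCH_MOUNT_RO_ONLY,
	SKETCH_MOUNT_REFUSE
};

static enum sketch_mount_result
sketch_check_features(const struct sketch_sb_features *f, bool readonly)
{
	assert(f != NULL);

	/* Any incompatible feature we do not know about is fatal. */
	if (f->incompat & ~SKETCH_KNOWN_INCOMPAT)
		return SKETCH_MOUNT_REFUSE;

	/* Unknown ro-compat features are fine as long as we never write. */
	if (f->ro_compat & ~SKETCH_KNOWN_RO_COMPAT)
		return readonly ? SKETCH_MOUNT_OK : SKETCH_MOUNT_RO_ONLY;

	/* Unknown compat bits need no special handling. */
	return SKETCH_MOUNT_OK;
}
```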
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
Dave Chinner [Fri, 7 Jun 2013 00:25:43 +0000 (10:25 +1000)]
xfs: add CRC checks to the superblock
With the addition of CRCs, there is such a wide and varied change to
the on disk format that it makes sense to bump the superblock
version number rather than try to use feature bits for all the new
functionality.
This commit introduces all the new superblock fields needed for all
the new functionality: feature masks similar to ext4, separate
project quota inodes, a LSN field for recovery and the CRC field.
This commit does not bump the superblock version number, however.
That will be done as a separate commit at the end of the series
after all the new functionality is present so we switch it all on in
one commit. This means that we can slowly introduce the changes
without them being active and hence maintain bisectability of the
tree.
This patch is based on a patch originally written by myself back
from SGI days, which was subsequently modified by Christoph Hellwig.
There is relatively little of that patch remaining, but the history
of the patch still should be acknowledged here.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
Dave Chinner [Fri, 7 Jun 2013 00:25:42 +0000 (10:25 +1000)]
xfs: buffer type overruns blf_flags field
The buffer type passed to log recovery in the buffer log item
overruns the blf_flags field. I had assumed that flags field was a
32 bit value, and it turns out it is an unsigned short. Therefore
having 19 flags doesn't really work.
Convert the buffer type field to numeric value, and use the top 5
bits of the flags field for it. We currently have 17 types of
buffers, so using 5 bits gives us plenty of room for expansion in
future....
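The packing described above can be sketched as follows: a 5 bit numeric type stored in the top bits of a 16 bit flags word, leaving the low 11 bits for ordinary flags. The macro and function names are hypothetical stand-ins, not the identifiers used by the patch.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_BLFT_BITS	5
#define SKETCH_BLFT_SHIFT	11	/* 16 bit field minus the 5 type bits */
#define SKETCH_BLFT_MASK \
	(((1u << SKETCH_BLFT_BITS) - 1) << SKETCH_BLFT_SHIFT)

/* Store a numeric buffer type in the top 5 bits, keeping the other flags. */
static uint16_t sketch_blft_set(uint16_t flags, unsigned int type)
{
	assert(type < (1u << SKETCH_BLFT_BITS));
	return (uint16_t)((flags & ~SKETCH_BLFT_MASK) |
			  (type << SKETCH_BLFT_SHIFT));
}

/* Recover the numeric buffer type from the flags word. */
static unsigned int sketch_blft_get(uint16_t flags)
{
	return (flags & SKETCH_BLFT_MASK) >> SKETCH_BLFT_SHIFT;
}
```

Five bits give 32 distinct values, which is where the room for expansion beyond 17 types comes from.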
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
Dave Chinner [Fri, 7 Jun 2013 00:25:40 +0000 (10:25 +1000)]
xfs: add CRC protection to remote attributes
There are two ways of doing this - the first is to add a CRC to the
remote attribute entry in the attribute block. The second is to
treat them similar to the remote symlink, where each fragment has
its own header and identifies fragment location in the attribute.
The problem with the CRC in the remote attr entry is that we cannot
identify the owner of the metadata from the metadata blocks
themselves, or where the blocks fit into the remote attribute. The
down side to this approach is that we never know whether the attribute
has been read from disk and so we have to verify it every
time it is read, and we must calculate it during the create
transaction and log it. We do not log CRCs for any other metadata,
and so this creates a unique set of coherency problems that, in
general, are best avoided.
Adding an identifying header to each allocated block allows us to
identify each fragment and where in the attribute it is located. It
enables us to rebuild the remote attribute from just the raw blocks
containing the attribute. It also allows us to do per-block CRC
verification at IO time rather than during the transaction context
that creates it or every time it is read into a user buffer. Hence
it avoids all the problems that an external, logged CRC has, and
provides all the benefits of self identifying metadata.
The only complexity is that we have to add a header per fragment,
and we don't know how many fragments will be needed prior to
allocations. If we take the symlink example, the header is 56 bytes
and hence for a 4k block size filesystem, in the worst case 16
headers require 1 extra block for the 64k attribute data. For 512
byte filesystems the worst case is an extra block for every 9
fragments (i.e. 16 extra blocks in the worst case). This will be
very rare and so it's not really a major concern.
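The worst case arithmetic above can be checked with a small sketch. It assumes every block carries one header of the stated 56 bytes; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Blocks needed when every block gives up hdr_size bytes to a header. */
static uint32_t sketch_rmt_blocks(uint32_t attr_len, uint32_t blksize,
				  uint32_t hdr_size)
{
	uint32_t payload;

	assert(blksize > hdr_size);
	payload = blksize - hdr_size;
	return (attr_len + payload - 1) / payload;
}

/* Extra blocks compared with storing the same data without headers. */
static uint32_t sketch_rmt_extra_blocks(uint32_t attr_len, uint32_t blksize,
					uint32_t hdr_size)
{
	uint32_t plain = (attr_len + blksize - 1) / blksize;

	return sketch_rmt_blocks(attr_len, blksize, hdr_size) - plain;
}
```

For a 64k attribute this gives 17 blocks instead of 16 at a 4k block size, and 144 instead of 128 at a 512 byte block size, matching the 1 and 16 extra blocks quoted above.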
Because allocation is done in two steps - the first finds a hole
large enough in the attribute file, the second does the allocation -
we only need to find a hole big enough for a worst case allocation.
We only need to allocate enough extra blocks for the number of headers
required by the fragments, and we can calculate that as we go....
Hence it really only makes sense to use the same model as for
symlinks - it doesn't add that much complexity, does not require an
attribute tree format change, and does not require logging
calculated CRC values.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
Dave Chinner [Fri, 7 Jun 2013 00:25:39 +0000 (10:25 +1000)]
xfs: split remote attribute code out
Adding CRC support to remote attributes adds a significant amount of
remote attribute specific code. Split the existing remote attribute
code out into its own file so that all the relevant remote
attribute code is in a single, easy to find place.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
Dave Chinner [Fri, 7 Jun 2013 00:25:36 +0000 (10:25 +1000)]
xfs: shortform directory offsets change for dir3 format
Because the header size for the CRC enabled directory blocks is
larger, the offset of the first entry into a directory block is
different to the dir2 format. The shortform directory stores the
dirent's offset so that it doesn't change when moving from shortform
to block form and back again, and hence it needs to take into
account the different header sizes to maintain the correct offsets.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
Dave Chinner [Fri, 7 Jun 2013 00:25:35 +0000 (10:25 +1000)]
xfs: add CRC checking to dir2 leaf blocks
This addition follows the same pattern as the dir2 block CRCs.
Seeing as both LEAF1 and LEAFN types need to be changed at the same
time, this is a pretty large amount of change. Leaf block headers
need to be abstracted away from the on-disk structures (struct
xfs_dir3_icleaf_hdr), as do the base leaf entry locations.
This header abstraction allows the in-core header and leaf entry
location to be passed around instead of the leaf block itself. This
saves a lot of converting individual variables from on-disk format
to host format where they are used, so there's a good chance that
the compiler will be able to produce much more optimal code as it's
not having to byteswap variables all over the place.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
Dave Chinner [Fri, 7 Jun 2013 00:25:33 +0000 (10:25 +1000)]
xfs: add CRC checking to dir2 free blocks
This addition follows the same pattern as the dir2 block CRCs, but
with a few differences. The main difference is that the free block
header is different between the v2 and v3 formats, so an "in-core"
free block header has been added and _todisk/_from_disk functions
used to abstract the differences in structure format from the code.
This is similar to the on-disk superblock versus the in-core
superblock setup. The in-core structure is populated when the buffer
is read from disk, all the in memory checks and modifications are
done on the in-core version of the structure which is written back
to the buffer before the buffer is logged.
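The from-disk/to-disk pattern described here can be sketched roughly as below. The structure layout, field names and function names are hypothetical and much simpler than the real free block headers; the point is only that the big-endian on-disk bytes are converted once into a host-order in-core copy, and converted back once before logging.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical on-disk header: three big-endian 32 bit fields. */
struct sketch_free_hdr_disk {
	uint8_t magic[4];
	uint8_t firstdb[4];
	uint8_t nvalid[4];
};

/* Host-order in-core copy that all checks and modifications work on. */
struct sketch_free_hdr_incore {
	uint32_t magic;
	uint32_t firstdb;
	uint32_t nvalid;
};

static uint32_t sketch_get_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static void sketch_put_be32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)v;
}

/* Populate the in-core header when the buffer is read from disk. */
static void sketch_free_hdr_from_disk(struct sketch_free_hdr_incore *to,
				      const struct sketch_free_hdr_disk *from)
{
	assert(to != NULL && from != NULL);
	to->magic = sketch_get_be32(from->magic);
	to->firstdb = sketch_get_be32(from->firstdb);
	to->nvalid = sketch_get_be32(from->nvalid);
}

/* Write the in-core header back to the buffer before it is logged. */
static void sketch_free_hdr_to_disk(struct sketch_free_hdr_disk *to,
				    const struct sketch_free_hdr_incore *from)
{
	assert(to != NULL && from != NULL);
	sketch_put_be32(to->magic, from->magic);
	sketch_put_be32(to->firstdb, from->firstdb);
	sketch_put_be32(to->nvalid, from->nvalid);
}
```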
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
Dave Chinner [Fri, 7 Jun 2013 00:25:32 +0000 (10:25 +1000)]
xfs: add CRC checks to block format directory blocks
Now that directory buffers are made from a single struct xfs_buf, we
can add CRC calculation and checking callbacks. While there, add all
the fields to the on disk structures for future functionality such
as d_type support, uuids, block numbers, owner inode, etc.
To distinguish between the different on disk formats, change the
magic numbers for the new format directory blocks.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
Dave Chinner [Fri, 7 Jun 2013 00:25:28 +0000 (10:25 +1000)]
xfsprogs: Support new AGFL format
With the addition of CRCs to the filesystem format, the AGFL has a
new format structure definition. Existing code that pulls freelist
blocks out via dereferencing agfl->agfl_bno no longer works as the
location of the free list is now variable depending on the disk
format in use.
Hence all the users of agfl_bno need to be converted to extract the
location of the first free list entry from the AGFL and grab entries
relative to that first entry. It's a simple change, but needs to be
made in several places as there is very little code reuse within and
between the different utilities in xfsprogs.
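A minimal sketch of the conversion, assuming the new format simply places a fixed-size header in front of the free list array. The header size and all names below are assumptions for illustration, not the xfsprogs definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed header size of the CRC-enabled AGFL, for illustration only. */
#define SKETCH_AGFL_CRC_HDR_BYTES	36

/* Locate the first free list entry for the disk format in use. */
static const uint8_t *sketch_agfl_first_entry(const uint8_t *agfl_block,
					      bool has_crc)
{
	assert(agfl_block != NULL);
	return has_crc ? agfl_block + SKETCH_AGFL_CRC_HDR_BYTES : agfl_block;
}

/* Read the i-th entry (big-endian 32 bit) relative to the first entry. */
static uint32_t sketch_agfl_entry(const uint8_t *agfl_block, bool has_crc,
				  size_t i)
{
	const uint8_t *p = sketch_agfl_first_entry(agfl_block, has_crc) + 4 * i;

	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}
```

Callers then index entries relative to the returned pointer instead of dereferencing a fixed structure member.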
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
Dave Chinner [Fri, 17 May 2013 11:12:57 +0000 (21:12 +1000)]
logprint: fix wrapped log dump issue
When running xfs/295 on a 512 byte block size filesystem, logprint
fails during checking with a "Bad log record header" error. This is
due to the fact that the log has wrapped and there is a partial record
at the start of the log.
logprint doesn't check for this condition, and simply assumes that
the first block in the log contains a log header, and hence aborts
when this case occurs. So we now have a spurious test failure due to
logprint displaying how right this comment is:
  /*
   * This code is gross and needs to be rewritten.
   */
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
Eric Sandeen [Tue, 16 Jul 2013 02:16:52 +0000 (21:16 -0500)]
xfs_metadump: manpage fix regarding frozen fs
The xfs_metadump manpage states that metadump works
on a frozen filesystem; it does not. In fact, there is
no way to detect a frozen filesystem, so we can't make it
work, either.
So just remove this from the manpage; unmounted or RO
mounted is what is enforced by xfs_metadump.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
Dave Chinner [Fri, 7 Jun 2013 00:25:24 +0000 (10:25 +1000)]
mkfs: fix realtime device initialisation
The method that libxfs uses for logging inodes is not followed by rtinit().
It fails to join the realtime bitmap inode to the final extent free
transactions, and so mkfs.xfs dies when trying to log changes to the bitmap
inode. Fix it.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
Dave Chinner [Tue, 18 Jun 2013 03:40:53 +0000 (13:40 +1000)]
xfsprogs: fix make deb
Commit 48212a30 ("xfsprogs: update 'make deb' to use tarball") fixed
a bunch of problems with making the source tarball for releases.
However, it broke the debian package builds in a way I hadn't
noticed until I rewrote my CI system build script.
I noticed that the CI system wasn't building from a pristine
workarea, and instead was just updating the old workarea and running
'make deb'. I added a 'make realclean' to remove all previous state
from the workarea, and then 'make deb' started failing with errors
building the tarball because po/xfsprogs.pot didn't have a build
rule.
The above commit removed the pre-build of the translations target,
and instead made the translation build target a dependency of
building the tarball. Hence the lack of a build rule of the
translations causes the source tarball build to fail.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
Michael L. Semon [Fri, 14 Jun 2013 07:00:34 +0000 (03:00 -0400)]
xfsprogs: define umode_t for build if not defined already
umode_t has not been exported to the kernel private headers since
around kernel 3.2.0-rc7. Add a check for umode_t and define it
if it has not been defined already.
Signed-off-by: Michael L. Semon <mlsemon35@gmail.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
Dave Chinner [Thu, 9 May 2013 16:20:13 +0000 (11:20 -0500)]
xfs_logprint: fix continuation transactions
As demonstrated by xfs/295, continuation transactions cause
problems for xfs_logprint. The failure demonstrated by the test is
that the buffer log format structures are variable sized on disk -
the dirty bitmap is sized according to the buffer length, not fixed
to the length of the maximum supported buffer size.
xfs_logprint assumes that the buf log format records are of fixed
size, and so when a short buffer is found it fails to handle it
properly and treats it like a continuation record. This causes the
opheader pointer to be incremented incorrectly and then logprint
wanders off into a dark corner and gets eaten by a grue.
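To illustrate why the record size varies, here is a sketch of how a dirty bitmap could be sized from the buffer length. The chunk size of 128 bytes per bit and the 32 bit word width are assumptions, and the names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_BLF_CHUNK	128u	/* assumed: bytes tracked per bitmap bit */
#define SKETCH_NBWORD		32u	/* assumed: bits per bitmap word */

/* Number of bitmap words needed to cover a buffer of buf_bytes bytes. */
static uint32_t sketch_blf_map_words(uint32_t buf_bytes)
{
	uint32_t chunks;

	assert(buf_bytes > 0);
	chunks = (buf_bytes + SKETCH_BLF_CHUNK - 1) / SKETCH_BLF_CHUNK;
	return (chunks + SKETCH_NBWORD - 1) / SKETCH_NBWORD;
}
```

Under these assumptions a 512 byte buffer needs one word while a 64k buffer needs sixteen, so a parser that expects the maximum size for every record will misread short ones.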
While fixing this, make the xlog_print_record code that does the
transaction opheader walking a little easier to read and stop it
from outputting binary data direct to the console by converting the
no-data-print case to use a hex dumping loop.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Rich Johnston <rjohnston@sgi.com>
Dave Chinner [Thu, 9 May 2013 13:19:06 +0000 (08:19 -0500)]
xfsprogs: update libxlog to current kernel code
Update the libxlog log recovery code to match the current 3.8-rc2
kernel code and ensure all the callers work correctly.
Note: while this introduces CRC validation infrastructure, it is
currently short-circuited as it is not clear what to do with log
buffers that fail CRC checking. We're only reading the log to
determine it is clean/dirty or dumping the contents for analysis, so
it's not clear what to do with CRC validation errors yet, or even
if there is any commonality with the kernel handling. This will need
to be revisited as the situation clarifies.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Rich Johnston <rjohnston@sgi.com>
Dave Chinner [Thu, 9 May 2013 13:18:51 +0000 (08:18 -0500)]
xfsprogs: add CRC32c infrastructure
Pull the generic crc32(c) code from the kernel and add it to libxfs.
Modify it to build in the libxfs environment, and drop the bigendian
CRC version as it is unused by XFS, which uses the little endian
version so that it can be hardware accelerated using native
instructions on x86-64 CPUs.
Also wire up the self-test code in the crc32 module to the build
infrastructure and make passing the self test a build dependency.
This prevents xfsprogs from being built on platforms that the CRC
algorithm does not work on and hence ensures the tools do not write
bad CRCs to disk as a result of a broken calculation.
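For illustration, a self test of this kind can be sketched with a plain bitwise CRC32c. This is not the table-driven kernel code the commit imports; it is a simple stand-in using the reflected Castagnoli polynomial and the standard check value for the string "123456789". The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CRC32C_POLY_LE	0x82F63B78u	/* reflected Castagnoli polynomial */

/* Bitwise CRC32c with the conventional initial value and final inversion. */
static uint32_t sketch_crc32c(const void *data, size_t len)
{
	const uint8_t *p = data;
	uint32_t crc = 0xFFFFFFFFu;
	size_t i;
	int bit;

	for (i = 0; i < len; i++) {
		crc ^= p[i];
		for (bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^
			      (SKETCH_CRC32C_POLY_LE & (0u - (crc & 1u)));
	}
	return crc ^ 0xFFFFFFFFu;
}

/* Self test: abort if this platform computes the known check value wrongly. */
static void sketch_crc32c_selftest(void)
{
	assert(sketch_crc32c("123456789", 9) == 0xE3069283u);
}
```

Running a check like this at build time is one way to refuse to produce binaries on a platform where the calculation is broken.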
Also pull the XFS CRC helper functions across in preparation for
using the CRC functions in libxfs.
XXX: something in the CRC table generation breaks the debian package
build. It fails to build libxfs as a dependency of mkfs.xfs. Works
fine outside the debian build environment, so I'm not sure what the
issue is yet. Most likely it is the execution path of the
gen_crc32table binary that is built...
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Rich Johnston <rjohnston@sgi.com>
Dave Chinner [Thu, 9 May 2013 12:22:15 +0000 (07:22 -0500)]
xfsprogs: Die dir1 Die!
Version 1 directories have never been supported on linux, and we base the
default on pre-Irix 6.2 kernels (~1998). A recent xfs_repair debugging session
on #xfs determined that dir v1 support in xfs_repair has been broken since May
26, 2008 when the ascii case insensitivity support was added to userspace.
Seeing that the code has been broken for roughly 5 years and the first time that
it was noticed was a couple of days ago, it is clearly rarely required, rarely
used and completely untested.
Following the time-honoured X server deprecation model, if it's been broken for
several years and nobody has noticed, then it can and should be removed. So,
rather than trying to fix something we can't test and very, very few people care
about, let's just remove it.
For xfs_repair, some of the checking code is shared with the attribute repair
code. Once all the dirv1 specific code is removed, there isn't a whole lot left,
so move it to attr_repair.c and we can kill the dir.[ch] files completely.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Rich Johnston <rjohnston@sgi.com>
Dave Chinner [Thu, 9 May 2013 12:16:09 +0000 (07:16 -0500)]
xfs_fsr: file reads should be O_DIRECT
When running xfs_fsr on a sparse filesystem image containing
approximately 8 million extents and 80GB of data, I noticed that the
page cache grew and consumed all the memory in the machine. It turns
out that xfs_fsr is using direct IO to write data, but buffered IO
to read data. Convert the read side to use direct IO to prevent page
cache blowouts.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Rich Johnston <rjohnston@sgi.com>
Dave Chinner [Thu, 9 May 2013 12:11:50 +0000 (07:11 -0500)]
xfs_logprint: print all AGI unlinked buckets
When printing buffer contents, the AGI unlinked buckets are not
printed in transactional output. In normal dump format, they are
printed, but that format is generally not useful for log recovery
analysis. Add the same code to the transactional buffer dump.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Rich Johnston <rjohnston@sgi.com>
Dave Chinner [Thu, 2 May 2013 12:59:20 +0000 (07:59 -0500)]
xfs_repair: validate on-disk extent count better
When scanning a btree format inode, we trust the extent count to be
in range. However, values of the range 2^31 <= cnt < 2^32 are
invalid and can cause problems with signed range checks. This
results in assert failures while validating the extent count such
as:
Eric Sandeen [Thu, 25 Apr 2013 15:03:47 +0000 (15:03 +0000)]
xfsprogs: Fix manpages for missing or incorrect options
Add valid options which aren't in manpages, and
remove invalid options which are in manpages:
* Document -V (show version and exit) for many manpages.
* Remove -? option from xfs_estimate.8
* Document -p passes, -d (debug) and -g (syslog) in xfs_fsr.8
* Document -n (O_NONBLOCK) in xfs_io.8
* Document -v (print overwrite) in xfs_logprint.8
* Document -m max_extents in xfs_metadump.8
* Document -p (preallocate) in xfs_mkfile.8
Brian Foster [Tue, 19 Mar 2013 13:23:35 +0000 (13:23 +0000)]
xfsprogs: reduce bb_numrecs in bno/cnt btrees when log consumes all agf space
The mkfs code currently creates a single free space extent record
for each of the bno and cnt btrees in each AG. The start block of
the record is pushed forward on the AG that hosts an internal log.
If the log happens to consume all available space in the AG, the
start block becomes equal to sb->sb_agblocks and thus invalid.
This causes xfs_repair to complain.
Phase 1 - find and verify superblock...
Phase 2 - using internal log
- scan filesystem freespace and inode maps...
invalid start block 4096 in record 0 of bno btree block 1600/1
invalid start block 4096 in record 0 of cnt btree block 1600/2
- found root inode chunk
...
xfs_repair appears to correct the numrecs value such that subsequent
checks are successful. The sequence above is pulled from xfstests
test #250, which fails due to this behavior.
Modify mkfs.xfs such that we check the block count value of the
free space record for the log AG after the log is accounted for. If
no space is left for the record, reset the record count to 0.
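A rough sketch of that check, with hypothetical names and a simplified AG layout in which the log sits directly in front of the free space:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_free_rec {
	uint32_t startblock;
	uint32_t blockcount;
};

/*
 * Fill in the single initial free space record for an AG and return the
 * record count to store in the btree block: 1 normally, 0 when the log
 * has consumed all of the remaining space.
 */
static unsigned int sketch_init_free_rec(struct sketch_free_rec *rec,
					 uint32_t agblocks,
					 uint32_t first_free,
					 uint32_t log_blocks)
{
	assert(rec != NULL);
	assert(first_free <= agblocks);
	assert(log_blocks <= agblocks - first_free);

	rec->startblock = first_free + log_blocks;
	rec->blockcount = agblocks - rec->startblock;

	/* An empty record would start at agblocks, which is out of range. */
	return rec->blockcount ? 1 : 0;
}
```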
Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Rich Johnston <rjohnston@sgi.com>
Eric Sandeen [Sat, 9 Mar 2013 15:21:55 +0000 (15:21 +0000)]
xfsprogs: skip freelist scans of corrupt agf
If an agf has bad values in the freelist, this can wreak
havoc if, for example, first > last and the loop
never exits; we index agfl->agfl_bno[i] off into the weeds.
If they're off, warn about it and skip the scan.
This is done both in xfs_check and xfs_db's freespace cmd.
Also fix uninit'd variable "i" from previous, similar fix
for xfs_repair.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Rich Johnston <rjohnston@sgi.com>
Eric Sandeen [Sat, 2 Mar 2013 21:23:12 +0000 (21:23 +0000)]
xfsprogs: xfs_repair skip freelist scan of corrupt agf in no-modify mode
In xfs_repair's no-modify mode (-n), verify_set_agf doesn't fix up
bad freelist blocks that it finds. When we get to scan_freelist,
this can wreak havoc if, for example, first > last and the loop
never exits; we index agfl->agfl_bno[i] off into the weeds.
To fix this, re-check the values in no-modify mode, and if
they're off, warn about it and skip the scan.
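One plausible form of such a re-check is a bounds test on the first and last indices before walking the circular list, sketched below. The names and the exact conditions are assumptions, not the xfs_repair code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Walk a circular free list from first to last, copying entries to out
 * (which must hold agfl_size entries). Returns the number of entries
 * visited, or -1 if the indices are out of range and the scan was skipped.
 */
static int sketch_scan_freelist(const uint32_t *bno, uint32_t agfl_size,
				uint32_t first, uint32_t last,
				uint32_t count, uint32_t *out)
{
	uint32_t i;
	int seen = 0;

	assert(bno != NULL && out != NULL && agfl_size > 0);
	if (count == 0)
		return 0;

	/* Without this check the loop below may never reach 'last'. */
	if (first >= agfl_size || last >= agfl_size || count > agfl_size) {
		fprintf(stderr, "freelist indices out of range, skipping scan\n");
		return -1;
	}

	i = first;
	for (;;) {
		out[seen++] = bno[i];
		if (i == last)
			break;
		if (++i == agfl_size)
			i = 0;	/* wrap around the circular list */
	}
	return seen;
}
```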
Reported-by: Ole Tange <tange@binf.ku.dk> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Rich Johnston <rjohnston@sgi.com> Signed-off-by: Rich Johnston <rjohnston@sgi.com>
Eric Sandeen [Sat, 26 Jan 2013 22:40:31 +0000 (22:40 +0000)]
xfs_fsr: fix attribute no_change_count logic
As it stands today, if no_change_count++ isn't > 10,
we will reset it to 0. There's no way to get above 1
(let alone 10) so this isn't working as intended.
If we see progress (last_forkoff != tbstat.bs_forkoff)
*then* we should reset the no_change_count counter to 0.
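The intended logic can be sketched like this, with a hypothetical state structure and helper rather than the actual xfs_fsr variables:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_fork_state {
	long last_forkoff;
	int no_change_count;
};

/*
 * Record the latest fork offset. Returns true once more than 10
 * consecutive attempts have made no progress.
 */
static bool sketch_note_forkoff(struct sketch_fork_state *st, long forkoff)
{
	assert(st != NULL);

	if (forkoff != st->last_forkoff) {
		/* Progress was made: only now is the counter reset. */
		st->no_change_count = 0;
		st->last_forkoff = forkoff;
		return false;
	}
	return ++st->no_change_count > 10;
}
```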
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Mark Tinguely <tinguely@sgi.com>
Eric Sandeen [Sat, 26 Jan 2013 22:40:27 +0000 (22:40 +0000)]
libxfs: fix setup_cursor array allocation
setup_cursor() wants an array of xfs_agbno_t's, but
it allocated a multiple of *pointers* to xfs_agbno_t's.
xfs_agbno_t is 4 bytes, so this is harmless other than
allocating twice as much memory as needed on a 64-bit
machine.
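The difference is easy to avoid by sizing the allocation from the pointed-to element, as in this small sketch. The typedef is a stand-in for xfs_agbno_t and the helper name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t sketch_agbno_t;	/* stand-in for the 4 byte xfs_agbno_t */

/* Allocate a zeroed array of n block numbers; the caller frees it. */
static sketch_agbno_t *sketch_alloc_agbnos(size_t n)
{
	sketch_agbno_t *arr;

	assert(n > 0);
	/*
	 * sizeof(*arr) is the element size (4 bytes). Writing
	 * sizeof(sketch_agbno_t *) instead would use the pointer size,
	 * which is 8 bytes on a 64-bit machine.
	 */
	arr = calloc(n, sizeof(*arr));
	return arr;
}
```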
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Mark Tinguely <tinguely@sgi.com>
Eric Sandeen [Fri, 25 Jan 2013 21:10:22 +0000 (21:10 +0000)]
xfsprogs: Fix possible unallocated memory access in fiemap
(Based on original patch by Lukas Czerner & comments by Dave Chinner)
Currently we could access unallocated memory in fiemap because we're
using uninitialized variable 'fiemap' in fiemap_f(). In fact this has
been spotted on s390x machine where xfs_io would segfault.
The problem happens in the for cycle which seems to be intended to
compute the header item spacing. However at that point the fiemap
structure has just been allocated and does not contain any extents
yet, so it is entirely useless and it never actually worked.
This patch delays the format calculation until the first batch
of extents has come in for analysis.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Mark Tinguely <tinguely@sgi.com>
Jeff Liu [Tue, 19 Feb 2013 05:31:25 +0000 (05:31 +0000)]
xfsprogs: sync the changes in transaction log space reservations to user space
Sync the kernel code changes regarding transaction log space
reservations to user space.
As we have split the calculation of attrset log space reservations
into mount time and runtime in kernel code, here we need to fix
max_attrset_trans_res_adjust() to reflect this change.
Signed-off-by: Jie Liu <jeff.liu@oracle.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Mark Tinguely <tinguely@sgi.com>
Ben Myers [Thu, 14 Feb 2013 16:54:09 +0000 (10:54 -0600)]
xfsprogs: update 'make deb' to use tarball
This patch changes the build process so that 'make deb' uses the same
process of creating a source tree as the release script.
* Add a list of files which go in the release tarball in .gitcensus
This is needed so that you can create a tarball in a bare release
tree, when .git is not available.
* Modify the SRCTAR target to include files from .gitcensus and use tar
instead of git archive.
* Modify the SRCTARINC files to include .gitcensus, and include
.gitcensus in the 'make realclean' target.
* remove the 'make source-link' target.
Signed-off-by: Ben Myers <bpm@sgi.com> Reviewed-by: Nathan Scott <nathans@debian.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
Andrew Dahl [Mon, 14 Jan 2013 18:16:02 +0000 (12:16 -0600)]
xfsprogs: Refactor release scripts to conform to using git archive
Refactored release scripts to conform to using git archive
When generating a release, there is a risk of missing necessary
source files. This is fixed by using git archive, which also
fixes the lack of conformity between the xfs utilities. As well,
some files may be stale during packaging. This is fixed with a
clean at the beginning of release generation.
Signed-off-by: Andrew Dahl <adahl@sgi.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
Eric Sandeen [Wed, 2 Jan 2013 23:03:52 +0000 (17:03 -0600)]
xfs_logprint: Handle continued inode transactions
xlog_print_trans_inode() has a special case for 2
specific op_head->oh_len lengths. If it matches
sizeof(xfs_inode_log_format_32_t) or
sizeof(xfs_inode_log_format_64_t), it assumes that
it's got an inode, and attempts to convert it and
print it accordingly.
However, if we arrive here via an op header which
is continued, then the length is simply a continuation
of the previous op, and it might *randomly* match the
size of one of the inode log formats, and thus get parsed
incorrectly.
Change the caller to pass in whether or not it's a continued
op, so that it can be handled correctly.
Tested by running xfs_logprint of TEST_DEV in xfsprogs
after sequential tests; without this change it gets off
in the weeds eventually; with this fix, it lasts longer,
until it hits some other yet-unfixed logprint bug...
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
Eric Sandeen [Wed, 2 Jan 2013 23:02:17 +0000 (17:02 -0600)]
xfs_logprint: Handle multiply-logged inode fields
As xlog_print_trans_inode() stands today, it will error
out if more than one flag is set on f->ilf_fields:
xlog_print_trans_inode: illegal inode type
but this is a perfectly valid case, to have e.g. a data and
an attr flag set.
Following is a pretty big reworking of the function to
handle more than one field type set, mostly following
xlog_recover_inode_pass2() for logic.
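The shape of that change can be sketched as testing each flag bit on its own instead of matching the whole field against a single value. The flag names and values below are hypothetical, not the real ilf_fields bits.

```c
#include <assert.h>
#include <stdio.h>

#define SKETCH_ILOG_DATA	0x1u	/* hypothetical: data fork was logged */
#define SKETCH_ILOG_ATTR	0x2u	/* hypothetical: attr fork was logged */
#define SKETCH_ILOG_KNOWN	(SKETCH_ILOG_DATA | SKETCH_ILOG_ATTR)

/*
 * Print one line per logged fork and return how many regions were
 * handled, or -1 if an unknown bit is set.
 */
static int sketch_print_inode_fields(unsigned int fields)
{
	int regions = 0;

	if (fields & ~SKETCH_ILOG_KNOWN)
		return -1;

	/* Each bit is checked independently, so several may be set at once. */
	if (fields & SKETCH_ILOG_DATA) {
		puts("data fork region");
		regions++;
	}
	if (fields & SKETCH_ILOG_ATTR) {
		puts("attr fork region");
		regions++;
	}
	return regions;
}
```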
I've tested this by a simple test such as creating one
file on an selinux box, so that data+attr is set, and
logprinting; I've also tested by running logprint after
subsequent xfstest runs (although we hit other bugs that
way).
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
blkid_get_topology() ignores devices which report 512
as their minimum & optimal IO size, but we should ignore
anything up to the physical sector size; otherwise hard-4k
sector devices will report a "stripe size" of 4k, and warn
if anything larger is specified:
# modprobe scsi_debug physblk_exp=3 num_parts=2 dev_size_mb=128
# mdadm --create /dev/md1 --level=0 --raid-devices=2 -c 4 /dev/sdb1 /dev/sdb2
# mkfs.xfs -f -d su=16k,sw=2 /dev/md1
mkfs.xfs: Specified data stripe unit 32 is not the same as the volume stripe unit 8
mkfs.xfs: Specified data stripe width 64 is not the same as the volume stripe width 16
...
but a stripe unit of 4k is pretty nonsensical. And that's even chosen by
default in this case, which is maybe even worse?
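A hedged sketch of the filtering idea: treat any reported minimum and optimal IO size that does not exceed the physical sector size as "no stripe geometry". The function and parameter names are hypothetical and the real blkid_get_topology() logic may differ.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Zero the stripe unit and width (both in bytes) when they carry no more
 * information than the physical sector size.
 */
static void sketch_filter_topology(unsigned int *sunit, unsigned int *swidth,
				   unsigned int psectorsize)
{
	assert(sunit != NULL && swidth != NULL);

	if (*sunit <= psectorsize && *swidth <= psectorsize) {
		*sunit = 0;
		*swidth = 0;
	}
}
```

With this filter a hard-4k device that reports 4096 for both values no longer yields a 4k "stripe", while a genuine 16k/32k geometry is left untouched.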
Eric Sandeen [Tue, 10 Apr 2012 04:34:12 +0000 (23:34 -0500)]
metadump: obfuscate symlinks by path component
xfs_metadump currently obfuscates entire symlinks without regard
to path components; this can lead to a corrupt image when restoring
a metadump containing extremely long symlinks:
Phase 3 - for each AG...
- scan and clear agi unlinked lists...
- process known inodes and perform inode discovery...
- agno = 0
component of symlink in inode 145 too long
problem with symbolic link in inode 145
cleared inode 145
... <more trail of woe>
Fix this by consolidating symlink obfuscation into a new
function which obfuscates one path component at a time.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Alex Elder <elder@kernel.org> Signed-off-by: Ben Myers <bpm@sgi.com>
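A minimal sketch of obfuscating a symlink target one path component at a time, as the metadump commit above describes. It is not the xfs_metadump implementation: demo_obfuscate_component() is a hypothetical stand-in that simply overwrites each byte with 'x', whereas the real tool uses its own name obfuscation. The point is only that '/' separators are left untouched, so component boundaries and lengths are preserved.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for real name obfuscation: overwrite one
 * component with 'x' characters, keeping its length unchanged.
 */
static void demo_obfuscate_component(char *name, size_t len)
{
	memset(name, 'x', len);
}

/*
 * Obfuscate buf[0..len) in place, one path component at a time.
 * Separators are never modified, so no component can grow or merge
 * with its neighbour.
 */
void demo_obfuscate_path(char *buf, size_t len)
{
	size_t start = 0;

	assert(buf != NULL || len == 0);

	while (start < len) {
		size_t end = start;

		/* Find the end of the current component. */
		while (end < len && buf[end] != '/')
			end++;

		/* Empty components (leading or doubled '/') are skipped. */
		if (end > start)
			demo_obfuscate_component(buf + start, end - start);

		start = end + 1;	/* step over the separator */
	}
}
```

For a buffer holding "/usr/lib/foo", demo_obfuscate_path() leaves "/xxx/xxx/xxx": three components of the original lengths, with the separators intact.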
Eric Sandeen [Sat, 8 Dec 2012 21:03:10 +0000 (21:03 +0000)]
xfsprogs: remove setfl from xfs_io
Doesn't seem to have worked for ages, and is (therefore)
apparently not ever used:
xfs_io> setfl
xfs_io> help setfl
setfl [-adx] -- set/clear append/direct flags on the open file
xfs_io> setfl -a
bad argument count 1 to setfl, expected 0 arguments
xfs_io> setfl -d
bad argument count 1 to setfl, expected 0 arguments
xfs_io> setfl
xfs_io>
At best, it seems intended to toggle the flag state, but
gives no feedback about current state. -x is in help but
not implemented, etc.
Just remove it.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Mark Tinguely <tinguely@sgi.com>
Eric Sandeen [Sat, 8 Dec 2012 20:55:18 +0000 (20:55 +0000)]
xfsprogs: document all commands in xfs_io
Add missing command documentation to xfs_io(8) manpage.
fiemap, fpunch, chproj, lsproj, and setfl are all missing.
setfl seems to not work today in any case, and nothing
in xfstests uses it; I will send another patch to simply
remove it from xfs_io, as I don't think it's terribly useful,
and hasn't worked forever anyway.
Also fix references to the fallocate manpage, which is (now?)
in section 2, not section 3 of the man pages. (Since it's
a syscall, not a library function).
Eric Sandeen [Thu, 6 Dec 2012 21:52:54 +0000 (15:52 -0600)]
mkfs.xfs: go into multidisk mode when geometry is on cmdline
In the course of some other investigations, I found that
calc_default_ag_geometry() doesn't go into "multidisk" mode
unless stripe geometry is *detected* (i.e. by the blkid routines).
Specifying a geometry on the cmdline is *not* sufficient, because
we test (ft.dsunit | ft.dswidth) which are not set by the cmdline
options.
If we move the AG calculations to after we have set dsunit & dswidth,
then we'll pick up either cmdline-specified or blkid-detected
geometry, and go into "multidisk" mode for AG size/count
calculations in both cases.
So now for a ~5T fs, for example, we'd make several more
AGs:
Dave Chinner [Fri, 16 Nov 2012 01:14:48 +0000 (01:14 +0000)]
xfs_quota: correctly initialise the default path
When we initialise xfs_quota, we place lots of information into the
fs_table. This includes all the devices/mount points the user has
specified as a global command line parameter to report on, as well
as all the paths under project quota control.
There is a "current path" pointer (fs_path) maintained by the code
that points somewhere into the fs_table. After the table is
initialised, fs_path always points to the last entry in the table,
and hence has to be re-initialised to point at the desired entry
before it can be used properly.
In the case of xfs_quota, if the command passed on the command line
is a non-global command, the command is called multiple times, each
time after the libxcmd args_command() callback is run. That starts
with an index of 0, and until the callback returns zero it will keep
passing whatever the last returned value was into the callback.
xfs_quota supplies such a callback, and its purpose is to iterate
over the fs_table setting fs_path to the next mount point in the
table. IOWs, non-global quota functions get called once for each
mount point specified on the command line. However, it also means
that for global functions, the fs_path pointer is not
re-initialised and hence if there are project quotas configured the
fs_path pointer does not point to a mount point and hence commands
may malfunction.
The problem that demonstrated this is the report function. It does
its own fs_table iteration if the command requires it, and so only
should be called once to avoid outputting the same information
multiple times. That's what the previous patch fixed by making the
command global, but this now has the effect of making commands that
need to operate on the device specified on the global command rely
on the fs_path variable pointing at that device.
Further, commands executed by the interactive method are always
treated as global commands, so the report command never worked as a
global command in the presence of a configured project quota setup.
Fix the problem by initialising the fs_path pointer correctly.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Mark Tinguely <tinguely@sgi.com>
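The callback protocol described in the xfs_quota commit above can be sketched roughly as follows. This is an illustration under stated assumptions, not the libxcmd source: the demo_* names, the three-entry table and the callback body are all hypothetical. The driver starts with index 0, feeds each returned value back into the callback, and runs the command once per non-zero return; the callback plays the role of advancing fs_path through fs_table.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*demo_args_func)(int index);
typedef void (*demo_cmd_func)(void);

/* Hypothetical stand-ins for fs_table and the fs_path pointer. */
static const char *demo_table[] = { "/mnt/a", "/mnt/b", "/mnt/c" };
static const size_t demo_count = 3;
static const char *demo_current;
static int demo_calls;

/* Callback: select the next table entry, return 0 when exhausted. */
static int demo_next_path(int index)
{
	if ((size_t)index >= demo_count)
		return 0;
	demo_current = demo_table[index];
	return index + 1;
}

/* Command body: would act on demo_current; here it only counts calls. */
static void demo_command(void)
{
	demo_calls++;
}

/*
 * Run a non-global command: call the callback starting from index 0,
 * pass its last return value back in, and run the command after each
 * non-zero return. Returns the number of times the command ran.
 */
int demo_run_nonglobal(demo_args_func args, demo_cmd_func cmd)
{
	int index = 0;
	int runs = 0;

	assert(args != NULL && cmd != NULL);

	while ((index = args(index)) != 0) {
		cmd();
		runs++;
	}
	return runs;
}
```

With the table above, demo_run_nonglobal(demo_next_path, demo_command) runs the command three times. A global command skips this loop entirely, which is why nothing repositions the current-path pointer for it unless it is initialised explicitly.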
Dave Chinner [Fri, 9 Nov 2012 07:02:58 +0000 (07:02 +0000)]
xfs_quota: fix report command parsing
The report command line needs to be parsed as a whole, not as
individual elements - report_f() is set up to do this correctly.
When treated as a non-global command, the report function is
called once for each command line arg, resulting in reports being
issued multiple times.
Set the command to be a global command so that it is only called
once.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Mark Tinguely <tinguely@sgi.com>
Dave Chinner [Fri, 9 Nov 2012 07:02:57 +0000 (07:02 +0000)]
xfs_db: flush devices before exiting
Test 287 uses xfs_db to change 32-bit project ID support while the
filesystem is unmounted. On a large filesystem the test was failing
due to the mount not seeing the feature bit in the superblock.
xfs_db uses a different address space to the filesystem when it is
mounted by the kernel, so the only way to keep them coherent is to
ensure that all buffered data is written to disk before the other
entity tries to read it. xfs_db uses buffered IO, but does not close
the devices when it exits, thereby leaving changes it has written in
the block device cache rather than on disk. Hence when the kernel
tries to mount the filesystem, it reads what is on disk and does not
see xfs_db's changes.
Fix this by ensuring that xfs_db flushes its changes to disk before
it exits by calling libxfs_device_close(). This fsyncs the data and
flushes the caches to ensure that it is present on disk before
xfs_db exits.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Mark Tinguely <tinguely@sgi.com>
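The general pattern behind the xfs_db fix above, sketched with plain POSIX calls rather than libxfs: data written through buffered IO is only guaranteed to be on disk once fsync() has returned, so a tool that modifies a device should flush before it exits. demo_write_durably() is a hypothetical helper, and it covers only the fsync step; the commit notes that libxfs_device_close() also flushes the caches.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write len bytes to path with buffered IO, then force them to stable
 * storage before closing. Returns 0 on success, -1 on any failure.
 */
int demo_write_durably(const char *path, const void *buf, size_t len)
{
	const char *p = buf;
	int fd;

	assert(path != NULL && buf != NULL);

	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;

	/* write() may be short, so loop until everything is queued. */
	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			close(fd);
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}

	/* Without this, the data may sit only in the cache at exit. */
	if (fsync(fd) < 0) {
		close(fd);
		return -1;
	}
	return close(fd) < 0 ? -1 : 0;
}
```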
Mike Frysinger [Mon, 24 Sep 2012 23:39:38 +0000 (19:39 -0400)]
xfsprogs: install shared libs with +x bits
These are shared libs w/executable code, so make sure they have +x bits
set on them. Some kernels will proactively disallow executable mmaps if
the files lack +x bits. It's also the right thing to do.
Signed-off-by: Mike Frysinger <vapier@gentoo.org> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
Eric Sandeen [Wed, 10 Oct 2012 03:40:11 +0000 (03:40 +0000)]
xfs_io: include headers for preadv/pwritev
We need to include uio.h to avoid:
[CC] pread.o
pread.c: In function `do_pread':
pread.c:198: warning: implicit declaration of function `preadv'
[CC] pwrite.o
pwrite.c: In function `do_pwrite':
pwrite.c:85: warning: implicit declaration of function `pwritev'
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Mark Tinguely <tinguely@sgi.com>
Carlos Maiolino [Mon, 16 Apr 2012 20:56:56 +0000 (20:56 +0000)]
mkfs: Set a clean output in case of invalid inode size
Remove an unnecessary usage() call after a mkfs failure due to an invalid inode
size.
A call to usage() at this point confuses the output message, which may cause the
user to think they passed wrong arguments to mkfs, rather than an invalid inode size.
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Mark Tinguely <tinguely@sgi.com>
Mike Frysinger [Sat, 25 Aug 2012 23:07:30 +0000 (23:07 +0000)]
libxcmd: link against readline
This library uses readline funcs (the input.c file), so we need to link
this shared library against it.
URL: https://bugs.gentoo.org/432644 Reported-by: David Badia <dbadia@gmail.com> Signed-off-by: Mike Frysinger <vapier@gentoo.org> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Mark Tinguely <tinguely@sgi.com>
Dave Chinner [Wed, 25 Jul 2012 22:30:50 +0000 (22:30 +0000)]
xfs_io: implement pwritev for vectored writes
When looking at KVM based direct IO patterns, I noticed that it was
using preadv and pwritev, and I could not use xfs_io to simulate
these IO patterns. Extend the pwrite command to be able to issue
vectored write IO to enable us to simulate KVM style direct IO.
Also document the new parameters as well as all the missing pwrite
command parameters in the xfs_io(8) man page.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Mark Tinguely <tinguely@sgi.com>
Dave Chinner [Wed, 25 Jul 2012 22:30:49 +0000 (22:30 +0000)]
xfs_io: implement preadv for vectored reads
When looking at KVM based direct IO patterns, I noticed that it was
using preadv and pwritev, and I could not use xfs_io to simulate
these IO patterns. Extend the pread command to be able to issue
vectored read IO to enable us to simulate KVM style direct IO.
Also document the new parameters as well as all the missing pread
command parameters in the xfs_io(8) man page.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Mark Tinguely <tinguely@sgi.com>
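For readers unfamiliar with the system calls behind the two vectored IO commits above, here is a small round trip using pwritev() and preadv() directly. It is a sketch, not xfs_io code: the function name and buffer contents are made up, and it uses ordinary buffered IO rather than the direct IO pattern the commits mention. Note that <sys/uio.h> is the header that declares both calls.

```c
#define _GNU_SOURCE	/* make sure preadv()/pwritev() are declared */
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Write two buffers with a single pwritev() call, then read the same
 * 12 bytes back with preadv() into two differently sized buffers.
 * Returns 0 if the data round-trips, -1 otherwise. The file at path
 * is created and removed again.
 */
int demo_vectored_roundtrip(const char *path)
{
	char w1[] = "hello ";
	char w2[] = "world\n";
	char r1[4], r2[8];
	struct iovec wv[2] = {
		{ .iov_base = w1, .iov_len = 6 },
		{ .iov_base = w2, .iov_len = 6 },
	};
	struct iovec rv[2] = {
		{ .iov_base = r1, .iov_len = sizeof(r1) },
		{ .iov_base = r2, .iov_len = sizeof(r2) },
	};
	int ret = -1;
	int fd;

	assert(path != NULL);

	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;

	/* Gather write: both buffers land contiguously at offset 0. */
	if (pwritev(fd, wv, 2, 0) != 12)
		goto out;

	/* Scatter read: the same bytes are split 4 + 8 on the way back. */
	if (preadv(fd, rv, 2, 0) != 12)
		goto out;

	if (memcmp(r1, "hell", 4) == 0 && memcmp(r2, "o world\n", 8) == 0)
		ret = 0;
out:
	close(fd);
	unlink(path);
	return ret;
}
```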
Dave Chinner [Wed, 25 Jul 2012 22:30:48 +0000 (22:30 +0000)]
xfs_io: add sync_file_range support
Add sync_file_range support to xfs_io to allow fine grained control
of data writeback and syncing on a given file.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Mark Tinguely <tinguely@sgi.com>
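A minimal sketch of the Linux sync_file_range() call that the commit above exposes through xfs_io. The helper name and file layout are hypothetical. It dirties two 4096-byte blocks and then asks the kernel to write out and wait for only the first one, which is the kind of fine grained control the commit refers to. sync_file_range() does not write file metadata, so it is not a substitute for fsync() when durability matters.

```c
#define _GNU_SOURCE	/* sync_file_range() is a Linux-specific extension */
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create path, write two 4096-byte blocks through the page cache,
 * then start and wait for writeback of the first block only.
 * Returns 0 on success, -1 on failure. The file is removed again.
 */
int demo_sync_first_block(const char *path)
{
	char block[4096];
	int ret = -1;
	int fd;

	assert(path != NULL);

	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;

	memset(block, 0xab, sizeof(block));
	if (write(fd, block, sizeof(block)) != (ssize_t)sizeof(block) ||
	    write(fd, block, sizeof(block)) != (ssize_t)sizeof(block))
		goto out;

	/* Offset 0, length 4096: the second block is left untouched. */
	if (sync_file_range(fd, 0, sizeof(block),
			    SYNC_FILE_RANGE_WAIT_BEFORE |
			    SYNC_FILE_RANGE_WRITE |
			    SYNC_FILE_RANGE_WAIT_AFTER) == 0)
		ret = 0;
out:
	close(fd);
	unlink(path);
	return ret;
}
```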