Alex Elder [Fri, 30 Jul 2010 21:45:45 +0000 (21:45 +0000)]
xfsprogs: fix depend targets
There's no need to re-make the dependency files all the time. Make
it so the "depend" target rebuilds the ".dep" file only if necessary.
Also change the name of the dependency file created for "ltdepend"
to be ".ltdep".
Signed-off-by: Alex Elder <aelder@sgi.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
We need to substitute root_sbindir and root_libdir even for the case
where the prefix does not differ from the default, otherwise
xfsprogs won't build for that case due to rpath errors, and wouldn't
install correctly either.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Reported-by: Christian Kujau <lists@nerdbynature.de> Tested-by: Christian Kujau <lists@nerdbynature.de>
Peter Watkins [Fri, 9 Jul 2010 16:17:10 +0000 (09:17 -0700)]
xfs_db: validate btree block magic in the freesp command
Occasionally I've hit a SEGV while querying free space in xfs_db on a
mounted file system. In scanfunc_bno, block->bb_numrecs has crazy values.
And bb_magic is not XFS_ABTB_MAGIC.
Check for the correct magic number first, and return otherwise.
Signed-off-by: Peter Watkins <treestem@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
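A rough illustration of the check described above, not the actual xfs_db code. It decodes the start of a short-form btree block header and refuses to use bb_numrecs unless the magic matches. The header layout and the value 0x41425442 ("ABTB") for XFS_ABTB_MAGIC are assumptions taken from the on-disk format; the function name read_numrecs is hypothetical.

```python
import struct

# Assumed value of the by-block free space btree magic, ASCII "ABTB".
XFS_ABTB_MAGIC = 0x41425442


def read_numrecs(block: bytes):
    """Return bb_numrecs, or None if the block is not a bno btree block."""
    # Assumed header start: bb_magic (be32), bb_level (be16), bb_numrecs (be16).
    magic, _level, numrecs = struct.unpack_from(">IHH", block, 0)
    if magic != XFS_ABTB_MAGIC:
        # Stale or unrelated block (e.g. read from a mounted filesystem):
        # the record count cannot be trusted, so bail out early.
        return None
    return numrecs
```

Checking the magic first means a garbage bb_numrecs is never used as a loop bound.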
Dave Chinner [Thu, 8 Jul 2010 00:20:22 +0000 (10:20 +1000)]
xfs_db: check for valid inode data pointer before dereferencing
When processing an inode, the code checks various flags to determine
whether to output messages or not. When checking the CLI provided
inode numbers to be verbose about, we fail to check if the inode
data structure returned is valid or not before dereferencing it.
Hence when running xfs_check with the "serious errors only" flag, xfs_db
will crash. Fix up the "should we output" logic to be safe.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
Dave Chinner [Tue, 6 Apr 2010 09:19:01 +0000 (19:19 +1000)]
xfs_fsr: Improve handling of attribute forks V2
If the file being defragmented has attributes, then fsr puts a dummy
attribute on the temporary file to try to ensure that the inode
attribute fork offset is set correctly. This works perfectly well
for the old style of attributes that use a fixed fork offset - the
presence of any attribute of any size or shape will result in fsr
doing the correct thing.
However, for attr2 filesystems, the attribute fork offset is
dependent on the size and shape of both the data and attribute
forks. Hence setting a small attribute on the file does not
guarantee that the two inodes have the same fork offset and
are therefore compatible for a data fork swap.
This patch improves the attribute fork handling of fsr. It checks
the filesystem version to see if the old style attributes are in
use, and if so uses the current method.
If attr2 is in use, fsr uses bulkstat output to determine what the
fork offset is. If the attribute fork offsets differ then fsr will
try to create attributes that will result in the correct offset. If
that fails, or the attribute fork is too large, it will give up and just
attempt the swap.
This fork offset value in bulkstat is new functionality in the kernel,
so if there are attributes and a zero fork offset, then the kernel
does not support this feature and we simply fall back to the existing,
less effective code.
Version 2:
- simplify the attribute creation to use a small fixed size attribute
- handle the fork offset not changing as attributes are added - it can take a
few attributes to move it from one offset to another
- comment the code better
- passes test 226 and reduces the number of unswappable inode pairs passed to
the (fixed) kernel to zero
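The decision flow described above can be sketched roughly as follows, assuming the source file is already known to have attributes. This is not the fsr code: the function name, the callbacks get_forkoff and add_small_attr, the max_attrs limit and the returned strings are all hypothetical stand-ins for the bulkstat query and the attribute creation on the temporary file.

```python
def match_fork_offset(is_attr2, target_forkoff, get_forkoff, add_small_attr,
                      max_attrs=16):
    """Return the path taken: "legacy", "fallback", "matched" or "gave-up"."""
    if not is_attr2:
        # Old-style attributes use a fixed fork offset: one dummy attribute is enough.
        add_small_attr()
        return "legacy"
    if target_forkoff == 0:
        # Attributes exist but bulkstat reports no fork offset:
        # the kernel lacks the feature, so use the old method.
        add_small_attr()
        return "fallback"
    for _ in range(max_attrs):
        if get_forkoff() == target_forkoff:
            return "matched"
        # The offset may not move until several small attributes were added.
        add_small_attr()
    # Out of attempts: the caller just tries the swap anyway.
    return "matched" if get_forkoff() == target_forkoff else "gave-up"
```

In every case the caller still attempts the swap; the return value only says how confident it can be that the fork offsets agree.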
Wengang Wang [Mon, 26 Apr 2010 17:49:41 +0000 (12:49 -0500)]
xfsprogs: mkfs manpage fix for -nsize/log
There are two limitations for the mkfs.xfs -nsize/log option:
1) directory block size must be a power of 2.
2) it can't be less than a file system block size.
The current man page doesn't include the above information. Users could
be confused by errors such as "Illegal value xxx for -n size option", and
they can't find out the cause by checking the man page.
The patch adds the two limitations to the manpage.
Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
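For illustration, the two documented limitations amount to a check like the one below. This is a minimal sketch with a hypothetical function name; any other limits mkfs.xfs may enforce on the directory block size are not modelled.

```python
def valid_dir_block_size(dir_bsize: int, fs_bsize: int) -> bool:
    """Check the two limitations listed for the -n size/log option."""
    # 1) must be a power of 2: exactly one bit set
    is_power_of_two = dir_bsize > 0 and (dir_bsize & (dir_bsize - 1)) == 0
    # 2) must not be smaller than the filesystem block size
    return is_power_of_two and dir_bsize >= fs_bsize
```

With a 4096 byte filesystem block size, 8192 passes while 6000 (not a power of 2) and 2048 (too small) are both rejected.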
Petr Salinger [Wed, 24 Mar 2010 03:21:15 +0000 (14:21 +1100)]
Resolve build issues on Debian GNU/kFreeBSD port.
Additional platform target added to build system, with similar
build options to Linux but ultimately making BSD syscalls (and
hence leveraging the existing FreeBSD port in places too).
Reviewed-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Nathan Scott <nathans@debian.org> Signed-off-by: Nathan Scott <nathans@debian.org>
Dave Chinner [Sun, 14 Mar 2010 22:52:08 +0000 (09:52 +1100)]
xfsprogs: duplicate extent btrees in xfs_repair need locking
The per-ag duplicate extent btrees can be searched concurrently from multiple
threads. This occurs when inode extent lists are being processed and inodes
with extents in the same AG are checked concurrently. The btrees have an
internal traversal cursor, so doing concurrent searches can result in the
cursor being corrupted for both searches.
Add an external lock for each duplicate extent tree and use it for searches,
inserts and deletes to ensure that we don't trash the state of any operation.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
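The problem and the fix can be illustrated with a toy structure. This is only a sketch: DupExtentTree is a hypothetical stand-in for the per-AG tree, with a sorted list in place of the btree and a shared _cursor field playing the role of the internal traversal cursor.

```python
import threading


class DupExtentTree:
    """Toy per-AG tree whose searches share one internal cursor."""

    def __init__(self):
        self._starts = []             # sorted start blocks
        self._cursor = 0              # shared traversal state
        self.lock = threading.Lock()  # one lock per tree

    def insert(self, start):
        with self.lock:
            self._starts.append(start)
            self._starts.sort()

    def search(self, start):
        # Without the lock, two threads would move _cursor under each other.
        with self.lock:
            self._cursor = 0
            while self._cursor < len(self._starts):
                if self._starts[self._cursor] == start:
                    return True
                self._cursor += 1
            return False
```

Because there is one lock per tree, threads working on different AGs do not serialise against each other.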
Dave Chinner [Wed, 17 Feb 2010 05:19:17 +0000 (16:19 +1100)]
xfsprogs: clean up make install build V2
The install targets did not get the silent treatment like the
normal build targets. Shut them up.
Also, remove the top level install target dependency on the default
target. Each sub-directory already defines the correct dependencies
for the install targets and so all the rebuilds can be done in one
traversal of the subdirectories via the install rules.
Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
We currently fail to detect that a device does indeed not contain any
signature and we are indeed fine to proceed with it due to mishandling
the return value of blkid_do_fullprobe. Fix that up and add some
better diagnostics of the blkid detection.
from RH bugzilla https://bugzilla.redhat.com/show_bug.cgi?id=561870
# dd if=/dev/zero of=k bs=1MB count=2 seek=20; mkfs.xfs k
# mkfs.xfs: probe of k failed, cannot detect existing filesystem.
# mkfs.xfs: Use the -f option to force overwrite
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
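The distinction being fixed is a three-way one. The sketch below assumes the libblkid convention that blkid_do_fullprobe() returns 0 when something was detected, 1 when nothing was detected and a negative value on error. The function name and the returned strings are hypothetical, and the real mkfs code is not reproduced here.

```python
def classify_probe_result(rc: int) -> str:
    """Map a blkid_do_fullprobe()-style return code to what mkfs should do."""
    if rc < 0:
        return "probe-failed"     # cannot tell: report it and require -f
    if rc == 1:
        return "no-signature"     # nothing on the device: fine to proceed
    return "signature-found"      # rc == 0: existing data, require -f
```

Treating the "nothing detected" value as an error is what produces the spurious "probe of k failed" message in the report above.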
Only negative return values from open mean we failed to open the device.
Without this check we do not print the usage message when no device is
specified. This leads to a weird failure in xfstests 122.
Reviewed-by: Eric Sandeen <sandeen@sandeen.ent> Signed-off-by: Christoph Hellwig <hch@lst.de>
Dave Chinner [Thu, 21 Jan 2010 07:43:18 +0000 (07:43 +0000)]
xfsprogs: Automatic build dependency calculations
Currently the xfsprogs builds do not have any automatic dependency
calculations. They rely on a separate make depend run to build or
update dependency information. They also rely on an external
makedepend binary. If that binary does not exist, the dependencies
do not get calculated.
To remove the dependency on makedepend, gcc can be used instead as
it has a command to generate dependency information. This patch
changes the dependency rule building to use gcc.
In case anyone uses an old (several years) gcc compiler or a
compiler that doesn't support gcc compatible dependency generation,
a new configure check is added to turn off dependency checking so
builds can still be done.
To use the dependencies automatically, we need to use a special
include makefile directive to include the build dependencies into
the current makefile. Essentially once the dependencies are
calculated, they can be included into the makefile and make will
recalculate the build dependencies automatically based on that
information.
Hence we get a build that automatically calculates and keeps
dependencies up to date without dependence on any external tools.
Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
Dave Chinner [Tue, 19 Jan 2010 22:13:15 +0000 (09:13 +1100)]
xfsprogs: fix build warnings in repair V2
Rewrite the loop in btree_get_prev() so that the compiler
can see that it returns if the cur->index is zero so it
doesn't complain about possible array bound underflows
when getting the key out of the buffer. Version 2 fixes
a height overflow in the reworked loop.
Fix the directory name sign warnings by casting to (uchar_t *)
appropriately.
Dave Chinner [Mon, 18 Jan 2010 00:09:17 +0000 (11:09 +1100)]
xfsprogs: fix warning in adfs superblock probe
The probe gets an array subscript warning because gcc is not smart
enough to realise that a structure made up of multiple byte arrays
in it can be referenced as a flat buffer and it is valid to access
bytes beyond the first array in the structure....
Fix it by passing the adfs superblock in and using the internal
checksum array to get the checksum value.
Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
Dave Chinner [Mon, 18 Jan 2010 00:08:50 +0000 (11:08 +1100)]
xfsprogs: fix missing error check in xfs_rtfree_range in libxfs
When xfs_rtfind_forw() returns an error, the block is returned
uninitialised. xfs_rtfree_range() is not checking the error return,
so could be using an uninitialised block number for modifying bitmap
summary info.
Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
Dave Chinner [Tue, 19 Jan 2010 22:05:49 +0000 (09:05 +1100)]
xfsprogs: Make the compile output cleaner V3
We don't need to see every compiler command line for every file that
is compiled. This makes it hard to see warnings and errors during
compile. For progress notification, we really only need to see the
directory/file being operated on.
Turn down the verbosity of output by suppressing various make output
and provide better overall visibility of which directory is being
operated on, what the operation is and what is being done to the
files by the build/clean process.
Eric Sandeen [Fri, 15 Jan 2010 19:25:20 +0000 (13:25 -0600)]
mkfs: fix mkfs.xfs -dfile,name=$NAME for new files
# /sbin/mkfs.xfs -dfile,name=grrr,size=100g
mkfs.xfs: Use the -f option to force overwrite.
check_overwrite is failing, because blkid_new_probe_from_filename()
is failing, because the (new) image file is 0 length.
It's easy to test for 0 length, and if found, there is
nothing to overwrite so return 0.
Also, if testing itself failed for some reason, print
a message to that effect:
# mkfs/mkfs.xfs -dfile,name=newfile,size=1g
mkfs.xfs: probe of newfile failed, cannot detect existing filesystem.
mkfs.xfs: Use the -f option to force overwrite.
Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Alex Elder <aelder@sgi.com>
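The zero-length test is simple enough to sketch. The function name is hypothetical, and restricting the test to regular files is an assumption made here so that device nodes, whose stat size is not meaningful, are still probed.

```python
import os
import stat


def nothing_to_overwrite(path: str) -> bool:
    """True if path is an empty regular file, so no old filesystem can be on it."""
    st = os.stat(path)
    return stat.S_ISREG(st.st_mode) and st.st_size == 0
```

A freshly created image file returns True and skips the probe; once any data has been written to it, the normal signature probe applies.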
Eric Sandeen [Tue, 12 Jan 2010 16:22:43 +0000 (10:22 -0600)]
mkfs: handle 4k sector devices more cleanly
Trying to mkfs a 4k sector device today fails w/o manually specifying
sector size:
# modprobe scsi_debug sector_size=4096 dev_size_mb=32
# mkfs.xfs -f /dev/sdc
mkfs.xfs: warning - cannot set blocksize on block device /dev/sdc: Invalid argument
Warning: the data subvolume sector size 512 is less than the sector size
reported by the device (4096).
... <fail>
Add sectorsize to the device topology info, and use that if present.
Also check that explicitly requested sector sizes are not smaller
than the hardware size. This already fails today, but with the more
cryptic "cannot set blocksize" ioctl error above.
With a few more suggested comments & cleanups from Christoph.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
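A minimal sketch of the selection logic described above. The function name, the default of 512 and the error text are assumptions for illustration; the actual mkfs.xfs messages and option parsing are not reproduced.

```python
def choose_sector_size(device_sector_size, requested=None, default=512):
    """Pick a sector size, or raise ValueError for a request the device cannot honour."""
    if requested is None:
        # Use what the device topology reports; fall back if it reports nothing.
        return device_sector_size or default
    if device_sector_size and requested < device_sector_size:
        raise ValueError(
            f"requested sector size {requested} is smaller than "
            f"the device sector size {device_sector_size}")
    return requested
```

On the scsi_debug device from the example, choose_sector_size(4096) picks 4096 without any manual option, while an explicit request for 512 is rejected up front.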
Bill Kendall [Wed, 6 Jan 2010 17:38:41 +0000 (18:38 +0100)]
libhandle: always use a good path for by-handle ioctls
We can't open symbolic links to perform the by-handle XFS ioctls, and
while we can open special files they won't end up calling into the
XFS ioctl method. So before calling into the handle ioctls generate
a fspath that always points to a regular file or directory that we can
call the ioctl on.
Signed-off-by: Bill Kendall <wkendall@sgi.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
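The path selection can be sketched as below. This is not the libhandle code: the function name is hypothetical, and using the filesystem's mount point as the fallback is an assumption about what a "good" fspath would be.

```python
import os
import stat


def ioctl_path(path: str, fs_path: str) -> str:
    """Return a path that can be opened to reach the XFS ioctl method."""
    mode = os.lstat(path).st_mode   # lstat: do not follow symlinks
    if stat.S_ISREG(mode) or stat.S_ISDIR(mode):
        return path
    # Symlinks cannot be opened as such, and special files would route
    # the ioctl to their driver, so use a path on the filesystem itself.
    return fs_path
```

The handle still identifies the original object; only the file descriptor used to issue the ioctl comes from the substitute path.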
Eric Sandeen [Wed, 6 Jan 2010 17:32:00 +0000 (18:32 +0100)]
detect blkid topology support in autoconf
Here's some autoconf fu to check for blkid topo support; this changes it to
default to using blkid, optionally disable-able, and disables it automatically
if the topo stuff isn't found (I think ;)
Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Christoph Hellwig <hch@lst.de>
repair: compare superblock / AG headers fields against manual counts
Compare the free block / inode counters in the superblock and AG headers
against the values we get from a manual btree traversal. Ported over from
xfs_db to get the same amount of superblock / AG header checking as in
xfs_check.
Note: this causes additional output in the xfstests 030 and 178 which will need
some adjustments in the testcases.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com>
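In outline, the comparison looks like the sketch below. The dictionaries, the function name and the message format are hypothetical; the field names in the usage note are only examples of the kind of counters involved.

```python
def check_counters(header: dict, counted: dict) -> list:
    """Return one message per header field that disagrees with the counted value."""
    problems = []
    for field, expected in counted.items():
        found = header.get(field)
        if found != expected:
            problems.append(f"{field} {found}, counted {expected}")
    return problems
```

For example, a header claiming agf_freeblks = 100 when the btree walk counted 98 free blocks yields a single message, and matching fields produce none.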
MAXEXTLEN is a limit for the bmap btree extent length, not for the freespace
btrees which can fill out the whole u32. Remove the check which makes
repair skip too large freespace extents. Also add warnings for freespace
btree records that fail the remaining validations.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com>
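The field-width argument can be made concrete. This sketch assumes MAXEXTLEN is the 21-bit limit of the bmap extent length field and that a free space record length is a plain 32-bit value. The helper names and the positive-length check are assumptions; the other validations repair performs are not modelled.

```python
MAXEXTLEN = (1 << 21) - 1   # assumed bmap btree extent length limit (21-bit field)
U32_MAX = (1 << 32) - 1     # free space btree record length is a 32-bit field


def bmap_len_ok(length: int) -> bool:
    return 0 < length <= MAXEXTLEN


def freespace_len_ok(length: int) -> bool:
    return 0 < length <= U32_MAX
```

A free extent of 1 << 22 blocks is therefore valid for the free space btrees even though it could never appear as a single bmap btree extent.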
I think that this mode is left over from when xfs_fsr used to
fork into the uid of the file's owner, and so needed somewhere
it was guaranteed to be able to write.
This behavior was removed in commit d51b892411c8d33374a02e20c5888df280811549
(in the xfsdump tree, before xfs_fsr got moved) and so these
wide-open permissions should no longer be needed.
Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Nathan Scott <nscott@aconex.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
xfs_db uses the xfs_bmbt_rec_32_t type to pass around extent information in a
few places. But everywhere where we actually use it we use the normal
xfs_bmbt_rec_t just casting from/to xfs_bmbt_rec_32_t to pass it around.
Just pass the xfs_bmbt_rec_t directly and thus get rid of the last use
of xfs_bmbt_rec_32_t in xfsprogs.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Eric Sandeen [Wed, 2 Dec 2009 17:19:48 +0000 (11:19 -0600)]
xfs_db: modify bad_features2 when modifying features2
The "attr1" command in xfs_db, for example, only modifies the features2
field; when mounted, the kernel will find a mismatch between features2
and bad_features2, and attr2 gets turned back on.
I think the simplest fix is to modify do_version to modify both fields,
but not if there is an existing mismatch that should be investigated
first.
Any mismatch can be fixed up by writing directly to the superblock
fields.
Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Christoph Hellwig <hch@lst.de>
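A sketch of the proposed rule, with the superblock reduced to a dictionary. The function name is hypothetical and this is not the do_version code. In the usage note, 0x8 is assumed to be the attr2 feature bit.

```python
def set_features2(sb: dict, new_features2: int) -> bool:
    """Update features2 and bad_features2 together; refuse on an existing mismatch."""
    if sb["features2"] != sb["bad_features2"]:
        # Existing mismatch: leave it alone so it can be investigated first.
        return False
    sb["features2"] = new_features2
    sb["bad_features2"] = new_features2
    return True
```

Starting from features2 == bad_features2 == 0x8, clearing the bit through this helper leaves both fields at 0, so the kernel sees no mismatch at mount time and has no reason to turn attr2 back on.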
Alex Elder [Wed, 25 Nov 2009 23:44:40 +0000 (17:44 -0600)]
Revert "3.0.5 release" and some of its preceding commits.
This reverts 11 commits that followed merge 15a60a5...:
- b0567f1 3.0.5 release
- 24d9757 add lpath_to_handle to libhandle
- bad0fe5 repair: add missing locking in scanfunc_bmap
- 2098754 repair: optimize duplicate extent tracking
- 241ea1c repair: switch block usage bitmap to a btree
- af20fe6 repair: cleanup alloc/free/reset of the block...
- add8f66 repair: cleanup helpers for tracking block usage
- da9398d repair: track logical to physical block mapping...
- d081a36 repair: clean up prefetch tracing
- d93f8b2 repair: use single prefetch queue
- eb26465 repair: use a btree instead of a radix tree for...
Using off64_t may require special headers or compiler flags that aren't
always available, e.g. in the configure check in xfstests. Revert to a plain
uint64_t to make apps compile as before.
While we're at it also rename the second argument of platform_discard_blocks
from end to len as that's what the BLKDISCARD ioctl expects - we currently
always discard the whole device so it doesn't matter in practice.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
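To make the "len, not end" point concrete: the ioctl argument is a pair of 64-bit values, a byte offset followed by a byte length. The sketch below only builds that argument and does not touch a device. The helper name is hypothetical, and the request number 0x1277 is the assumed Linux value of BLKDISCARD.

```python
import struct

BLKDISCARD = 0x1277  # assumed Linux value, _IO(0x12, 119)


def discard_range_arg(start: int, length: int) -> bytes:
    """Pack the uint64_t range[2] argument: byte offset, then byte length."""
    return struct.pack("=QQ", start, length)   # native byte order, no padding
```

On Linux the buffer would be passed as fcntl.ioctl(fd, BLKDISCARD, arg) on an open block device. When the whole device is discarded the offset is 0, so end and length are the same number, which is why the misnamed argument never caused a visible problem.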
Since xfsprogs 3.0 libdisk is intended to be private to xfsprogs and we do
not install the headers anymore. But we kept installing the static library,
which doesn't make sense.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Add symbol versioning for libhandle. For now version 1.0.3 contains all
pre-existing symbols; any new addition needs both a minor version bump
and an entry in libhandle.sym.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Barry Naujok [Thu, 12 Nov 2009 10:28:34 +0000 (11:28 +0100)]
repair: optimize duplicate extent tracking
Switch the duplicate extent tracking from an avl tree to our new btree
implementation. Modify search_dup_extent to find overlapping extents
with differing start blocks instead of having the caller walk every
possible start block.
Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com>
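The idea of answering an overlap query with a single ordered lookup, instead of probing every possible start block, can be sketched with a sorted list in place of the btree. The function name and the tuple representation are hypothetical; this is not the search_dup_extent implementation.

```python
import bisect


def overlaps_any(extents, start, length):
    """extents: sorted, non-overlapping (start, length) tuples with length > 0.

    Return True if [start, start + length) overlaps any stored extent.
    """
    end = start + length
    # Index of the first extent starting at or after `end`; it cannot overlap.
    i = bisect.bisect_left(extents, (end, 0))
    if i == 0:
        return False
    # The last extent starting before `end` is the only candidate that matters.
    prev_start, prev_len = extents[i - 1]
    return prev_start + prev_len > start
```

Since the stored extents do not overlap each other, the last one starting before the end of the query range also has the furthest end, so checking it alone is sufficient.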
Barry Naujok [Thu, 12 Nov 2009 10:28:31 +0000 (11:28 +0100)]
repair: switch block usage bitmap to a btree
Using a btree representing the extents is much more space efficient than
using a bitmap tracking every single block. In addition it also allows
for more optimal algorithms checking range overlaps instead of walking
every block in various places.
Also move the RT tracking bitmap into incore.c instead of leaving it
as macros - this keeps the implementation contained.
Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com>
Barry Naujok [Thu, 12 Nov 2009 10:28:25 +0000 (11:28 +0100)]
repair: cleanup alloc/free/reset of the block usage tracking
Currently the code to allocate, free and reset the block usage bitmaps
is a complete mess. This patch reorganizes it into logical helpers.
Details:
- the current incore_init code is called just before phase2 is called,
which then marks the log and the AG headers used.
- we get rid of incore_init, and replace it with direct calls to the
unchanged incore_ino_init/incore_ext_init functions and our new init_bmaps
which does all the allocations for the block usage tracking, as well
as a call to reset_bmaps to initialize it to the default values.
- reset_bmaps is also called from early phase4 code to reset all state
instead of opencoding it.
- there is a new free_bmaps helper which we call to free our block usage
bitmaps when we don't need them anymore after phase5. The current
code frees some of it a bit early in phase5, but needs to take care of it
in phase6 in case we didn't call phase5 due to nomodify mode, and leaks
it if we don't call phase 6, which might happen in case of a bad inode
allocation btree.
Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com>
Barry Naujok [Thu, 12 Nov 2009 10:27:33 +0000 (11:27 +0100)]
repair: cleanup helpers for tracking block usage
Rename get_agbno_state/set_agbno_state to get_bmap/set_bmap because
those names are more self-descriptive. Remove the superfluous mount
argument to them as the current filesystem is a global in repair.
Remove the fsbno taking variant as they just complicated the code.
Bring all uses of them into the canonical form.
Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com>
Barry Naujok [Thu, 12 Nov 2009 10:27:28 +0000 (11:27 +0100)]
repair: track logical to physical block mapping more efficiently
Currently we track the logical to physical block mapping by a structure which
contains an array of physical blocks. This is extremely inefficient and is
replaced with the normal startblock, length extent descriptors.
In addition also use thread-local storage for the block map, this is possible
because repair only processes one inode at a given time per thread, and the
block map does not have to outlive the processing of a single inode.
The combination of those factors means we can use pthread thread-local
storage to store the block map, and we can re-use the allocation over
and over again.
This should be ported over to xfs_db eventually, or even better we could try
to share the code.
[hch: added a small fix in blkmap_set_ext to not call memmove unless needed]
Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com>
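The reuse pattern can be sketched with Python's thread-local storage standing in for pthread thread-local storage. The names get_blkmap and add_extent are hypothetical, and a list of (startoff, startblock, length) tuples stands in for the extent descriptors.

```python
import threading

_tls = threading.local()


def get_blkmap():
    """Return this thread's block map, allocated on first use and emptied on reuse."""
    extents = getattr(_tls, "extents", None)
    if extents is None:
        extents = _tls.extents = []   # allocated once per thread
    extents.clear()                   # reset for the next inode, keep the object
    return extents


def add_extent(blkmap, startoff, startblock, length):
    """Record one extent instead of `length` individual block numbers."""
    blkmap.append((startoff, startblock, length))
```

This is safe only because each thread processes one inode at a time and the map is not needed once that inode is done, as the message notes.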
Barry Naujok [Thu, 12 Nov 2009 10:27:23 +0000 (11:27 +0100)]
repair: clean up prefetch tracing
Define a dummy pftrace macro for the non-tracing case to reduce the ifdef hell,
clean up a few trace calls and add proper init/exit handlers for the tracing
setup and teardown.
Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com>
Barry Naujok [Thu, 12 Nov 2009 10:27:11 +0000 (11:27 +0100)]
repair: use a btree instead of a radix tree for the prefetch queue
Currently the prefetch queue in xfs_repair uses a radix tree implementation
derived from the Linux kernel one to manage its prefetch queue.
The radix tree implementation is not very memory efficient for sparse indices,
so replace it with a btree implementation that is much more efficient.
This is not that important for the prefetch queue but will be very important
for the next memory optimization patches which need a tree to store things
like the block map which are very sparse, and we do not want to deal with
two tree implementations (or rather three given that we still have avl.c
around)
Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com>
Currently the build/ directory can create rpm, debian and source / binary
tar packages. The RPM generation is not used as all distributions prefer
their own spec files, and the binary tarball is not used at all as it's
not a very useful format. Reimplement the generation of the source
tarballs to use the source-link method used for the debian packages and
get rid of the whole old package generation machinery. Also fix a small
bug in the link-based source directory creation which was not including
the .pot file for gettext.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Nathan Scott <nathans@debian.org>
Bill Kendall [Thu, 22 Oct 2009 16:52:23 +0000 (16:52 +0000)]
add lpath_to_handle to libhandle
path_to_handle() is not reliable when called on a path which
is a symlink. If the symlink is dangling, or if it points
to a non-XFS filesystem then path_to_handle() will fail. The
reason is that path_to_handle() must open the path in order
to obtain an fd for the xfsctl call.
It's common during xfsrestore to have dangling symlinks since
the target of the link may not be restored before the symlink.
This patch adds a new function to libhandle, lpath_to_handle.
It is just like path_to_handle, except it takes a filesystem
path in addition to the path which you want to convert to a
handle.
Alex Elder is going to take care of bumping the libhandle
minor number, and adjusting the xfsdump/xfsprogs version numbers
and dependencies to ensure a compatible libhandle is installed
for xfsdump.
Signed-off-by: Bill Kendall <wkendall@sgi.com> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Alex Elder <aelder@sgi.com>
Make sure to protect access to the block usage tracking btree with
the ag_lock.
Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Alex Elder <aelder@sgi.com>
Switch the duplicate extent tracking from an avl tree to our new btree
implementation. Modify search_dup_extent to find overlapping extents
with differing start blocks instead of having the caller walk every
possible start block.
Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Alex Elder <aelder@sgi.com>
Using a btree representing the extents is much more space efficient than
using a bitmap tracking every single block. In addition it also allows
for more optimal algorithms checking range overlaps instead of walking
every block in various places.
Also move the RT tracking bitmap into incore.c instead of leaving it
as macros - this keeps the implementation contained.
Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Alex Elder <aelder@sgi.com>
repair: cleanup alloc/free/reset of the block usage tracking
Currently the code to allocate, free and reset the block usage bitmaps
is a complete mess. This patch reorganizes it into logical helpers.
Details:
- the current incore_init code is called just before phase2 is called,
which then marks the log and the AG headers used.
- we get rid of incore_init, and replace it with direct calls to the
unchanged incore_ino_init/incore_ext_init functions and our new init_bmaps
which does all the allocations for the block usage tracking, as well
as a call to reset_bmaps to initialize it to the default values.
- reset_bmaps is also called from early phase4 code to reset all state
instead of opencoding it.
- there is a new free_bmaps helper which we call to free our block usage
bitmaps when we don't need them anymore after phase5. The current
code frees some of it a bit early in phase5, but needs to take care of it
in phase6 in case we didn't call phase5 due to nomodify mode, and leaks
it if we don't call phase 6, which might happen in case of a bad inode
allocation btree.
Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Alex Elder <aelder@sgi.com>
Rename get_agbno_state/set_agbno_state to get_bmap/set_bmap because
those names are more self-descriptive. Remove the superfluous mount
argument to them as the current filesystem is a global in repair.
Remove the fsbno taking variant as they just complicated the code.
Bring all uses of them into the canonical form.
Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Alex Elder <aelder@sgi.com>
repair: track logical to physical block mapping more efficiently
Currently we track the logical to physical block mapping by a structure which
contains an array of physical blocks. This is extremely inefficient and is
replaced with the normal startblock storage we use in the kernel and on disk
in this patch.
In addition also use thread-local storage for the block map, this is possible
because repair only processes one inode at a given time per thread, and the
block map does not have to outlive the processing of a single inode.
The combination of those factors means we can use pthread thread-local
storage to store the block map, and we can re-use the allocation over
and over again.
This should be ported over to xfs_db eventually, or even better we could try
to share the code.
[hch: added a small fix in blkmap_set_ext to not call memmove unless needed]
Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Alex Elder <aelder@sgi.com>
Define a dummy pftrace macro for the non-tracing case to reduce the ifdef hell,
clean up a few trace calls and add proper init/exit handlers for the tracing
setup and teardown.
Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Alex Elder <aelder@sgi.com>
We don't need two prefetch queues as we guarantee execution in order anyway.
XXX: description could use some more details.
Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Alex Elder <aelder@sgi.com>
repair: use a btree instead of a radix tree for the prefetch queue
Currently the prefetch queue in xfs_repair uses a radix tree implementation
derived from the Linux kernel one to manage its prefetch queue.
The radix tree implementation is not very memory efficient for sparse indices,
so replace it with a btree implementation that is much more efficient.
This is not that important for the prefetch queue but will be very important
for the next memory optimization patches which need a tree to store things
like the block map which are very sparse, and we do not want to deal with
two tree implementations (or rather three given that we still have avl.c
around)
Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Alex Elder <aelder@sgi.com>
Those two functions are almost identical. The big difference is that we only
move blocks from XR_E_FREE1 to XR_E_FREE state when processing the cnt btree.
Besides that we print bno vs cnt in the messages and obviously validate a
slightly different magic number in the header.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Call the BLKDISCARD ioctl to mark the whole disk as unused before creating
a new filesystem. This will allow SSDs, Arrays with thin provisioning support
and virtual machines to make smarter allocation decisions.
Add a new -K option to prevent mkfs from discarding blocks to aid
trouble-shooting or specialized requirements.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Andi Kleen <andi@firstfloor.org>
Add a new --enable-blkid switch to use libblkid from util-linux to detect
the device geometry and check for existing partitions or filesystem on a
device. Note that this requires the latest blkid from util-linux-ng git
for the topology calls, older ones won't work. If I had a little more
autoconf fu we might be able to detect a too early one, but right now it
just fails if it's too old and --enable-blkid is specified. We also
stop building libdisk in the blkid case as it's an internal static library
not otherwise used.
For the actual checks I tried to stay as close as possible to the old
code, so we still don't check topology for external log devices. I hope
to add this at a later stage.
As a small addition we also print a warning if trying to create a filesystem
on a partition that is not properly aligned.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net>