Eric Sandeen [Mon, 18 May 2009 22:02:32 +0000 (17:02 -0500)]
resize2fs: fix minimum size calculations
The extra padding added to the minimum size calculations:
/*
* We need to reserve a few extra blocks if extents are
* enabled, in case we need to grow the extent tree. The more
* we shrink the file system, the more space we need.
*/
if (fs->super->s_feature_incompat & EXT3_FEATURE_INCOMPAT_EXTENTS)
blks_needed += (fs->super->s_blocks_count - blks_needed)/500;
can go quite wrong if we've already added up more "blks_needed"
than our current size, and the above subtraction wraps. This can
easily happen for a filesystem which is almost completely full.
In this case, just return the current fs size as the minimum and
be done with it.
With this fix we could probably call calculate_minimum_resize_size()
for each resize2fs invocation and refuse to resize smaller than that?
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Eric Sandeen [Wed, 20 May 2009 21:36:26 +0000 (16:36 -0500)]
resize2fs: fix ENOSPC corruption case
http://people.redhat.com/esandeen/livecd-creator-imagefile.bz2
contains an image (for now) which, when resized to 578639, corrupts
the filesystem.
This is a bit crazy, I guess, because the fs currently has only
1 free block, but still, we should be graceful about the failure.
Perhaps it would make sense to check the requested valuea against
the minimum value resize2fs would compute for "-P" and fail (at
least without a force).
But in any case, this exposed 2 bugs when moving that one block
required an extent split, which is what hit the ENOSPC.
For starters, ext2fs_extent_set_bmap() in the "(re/un)mapping last
block in extent" case was replacing the old extent before the
new one was created; when the new extent creation failed, it
left us in an inconsistent state. Simply changing the order of
the two should fix this problem.
Next, ext2fs_extent_insert was calling ext2fs_extent_delete()
on *any* error, including one caused by failure to allocate a new
block to split the node to hold that extent ... the handle was left
unchanged, and we deleted the -original- extent.
As a quick fix for this, just don't do the delete if we fail the split,
though this may need to be smarter. I don't think we have terribly
consistent behavior about where a handle is left on various errors.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Tue, 26 May 2009 00:14:04 +0000 (20:14 -0400)]
e2fsck: Fix journal replay bug which reverts changes to the bg descriptors
Fix a regression in e2fsprogs 1.41.5 which would undo updates to the
block group descriptors after a journal replay, caused by commit b7c5b403. We now use ext2fs_free() instead of ext2fs_close() to make
sure we the library will never try to write out superblock or block
group descriptors.
Andreas Dilger [Mon, 18 May 2009 03:03:04 +0000 (23:03 -0400)]
e2fsck: initialize error handling before journal replay
One of our customers hit a temporary IO error during an e2fsck run during
the read from the journal. It seems that the read error resulted in
e2fsck automatically discarding the journals and recreating them on several
filesystems on this node without any prompting from the user:
end_request: I/O error, dev sdg, sector 484832
Buffer I/O error on device sdg, logical block 60604
fsck-sdg[8276]: ls2-OST024c: Superblock has an invalid ext3 journal (inode 8).
fsck-sdg[8276]: CLEARED.
fsck-sdg[8276]: *** ext3 journal has been deleted - filesystem is now ext2
only ***
fsck-sdg[8276]: ls2-OST024c was not cleanly unmounted, check forced.
fsck-sdg[8276]: ls2-OST024c: Journal inode is not in use, but contains data.
CLEARED.
fsck-sdg[8276]: ls2-OST024c: Recreate journal to make the filesystem ext3
again?
fsck-sdg[8276]: FIXED.
fsck-sdg[8276]: Creating journal (32768 blocks): Done.
fsck-sdg[8276]:
fsck-sdg[8276]: *** journal has been re-created - filesystem is now ext3 again
***
fsck-sdg[8276]: ls2-OST024c: 39818/20183248 files (8.2% non-contiguous), 222122257/779902976 blocks
fsck-sdg[8276]: exit code 1 (file system errors corrected)
The following patch moves the e2fsck error handler initialization earlier
in the e2fsck startup code before the journal is processed, so that the
user will be prompted for an action. This is the first IO that is not
part of ext2fs_open() where fs->io is first initialized.
It doesn't seem possible to initialize the error handlers for the initial
filesystem open without changing the prototype for ext2fs_open2(). If we
are getting a new ext2fs_open3() prototype for 64-bit it might make sense
to add at least "read_error" as a parameter ("write_error" is not strictly
necessary for the open and could be set afterward).
Signed-off-by: Andreas Dilger <adilger@sun.com> Signed-off-by: Jim Garlick <garlick@llnl.gov> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Theodore Ts'o [Sun, 17 May 2009 12:42:52 +0000 (08:42 -0400)]
e2fsck: Don't crash if an inode with a bad extent header is not cleared
If ext2fs_extent_open() fails due to a corrupt extent header, and the
user declines to clear the inode, check_blocks_extents() should bail
out; otherwise, it will cause a core dump due a null pointer
dereference.
Karel Zak [Mon, 27 Apr 2009 13:00:57 +0000 (15:00 +0200)]
blkid: use /dev/mapper/<name> rather than /dev/dm-<N>
The libblkid (since v1.41.1) returns private device-mapper names (e.g.
/dev/dm-0). It's because the probe_one() function scans /dev before
/dev/mapper.
brw-rw---- 1 root disk 253, 0 2009-04-27 13:41 /dev/dm-0
brw-rw---- 1 root disk 253, 0 2009-04-27 13:41 /dev/mapper/TestVolGroup-TestLogVolume
Old version:
# blkid -t LABEL="TEST-LABEL" -o device
/dev/dm-0
Andreas Dilger [Tue, 28 Apr 2009 18:59:07 +0000 (12:59 -0600)]
e2fsck: cleanup whitespace in problem.c and problem.h
Cleanup whitespace in the problem.h and problem.c files. Removes a
bunch of places where tabs follow spaces, whitespace on empty lines, etc.
I didn't reformat the indenting of the entire problem.h error codes,
but there is some room for doing this...
Signed-off-by: Andreas Dilger <adilger@sun.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Andreas Dilger [Mon, 27 Apr 2009 22:59:24 +0000 (16:59 -0600)]
e2fsck: Add test code in problem.c to verify problem codes
We've hit a number of cases where the error codes in problem.h have
been assigned duplicate values compared to problems in our own e2fsck
patches, and this can lead to confusing and difficult to find bugs
in e2fsck (e.g. wrong problem messages, incorrect repair action, etc).
Attached is a test case for the problem.c file to ensure that the
problem table is sorted and does not contain any duplicate values.
Having the problem table sorted allows the correctness checking to be
very simple, and if it ever became important for performance we could
use binary searching of the problem table for the specific problem code.
Signed-off-by: Andreas Dilger <adilger@sun.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
e2fsck: Skip journal checks if the fs is mounted and doesn't need recovery
If we are checking a mounted filesystem (typically the root
filesystem, mounted read/only) and the NEEDS_RECOVERY flag is not set,
skip all of the checks associated with making sure the journal is
consistent. There is the very slight possibility we could lose if the
NEEDS_RECOVERY flag was somehow cleared even though there was data in
the journal, but this has practically never happend in practice, and
it reduces the number of reads required at boot-time, which is a big
deal when trying to reduce boot times with HDD's.
libext2fs: read the block group descriptors more efficiently
When opening a filesystem, make ext2fs_open2() much more efficient by
reading the normal block group descriptors all at once, instead of one
block at a time.
e2fsck: Reduce unnecessary I/O when checking backup superblock
E2fsck needs to check to see if the backup superblock differs from the
primary superblock. Previously it was doing so by calling
ext2fs_open(), which does a lot of unnecessary work, including reading
all of the backup block group descriptors. Avoid this by reading in
the backup superblock directly.
e2fsck: Don't test the resize_inode if the filesystem is clean
Move check_resize_inode() out of check_super_block(), since we only
need to test the resize_inode for correctness only if the filesystem
requires checking. This change avoids a lot of I/O operations which
slows down a 1 second boot.
Eric Sandeen [Thu, 23 Apr 2009 03:51:51 +0000 (22:51 -0500)]
blkid: remove whole-disk entries from cache when partitions are found
We can get into a situation in blkid where whole disks remain
in the cache, even though partitions are found. For labels
such as sun disklabels which may have the first partition
beginning at sector 0, this is even somewhat likely.
1) create a sun disklabel w/partitions
2) mkfs the first partition (at sector 0)
3) remove the partition table
4) run blkid - this finds the fs on the whole disk, places in cache
5) recreate the partition table
6) run blkid - this finds the partition, places in cache
And now we have both /dev/sda and /dev/sda1 in cache.
There are heuristics in probe_all to avoid putting the whole disk
in cache if it has partitions, but there is nothing to remove the
whole-disk entry in the above case. I think the below patch
suffices, although I haven't quite convinced myself that setting
the lens[which]=0; is the right logic for that bit of state...
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
If the device name doesn't start with a slash, ignore the /etc/mtab
entry, so that relative pathnames passed into functions such as
ext2fs_check_mount_point() or ext2fs_check_if_mounted() don't return
false positives.
resize2fs: Print a warning message if the ftruncate system call fails
Resize2fs will attempt to truncate an image file of a filesystem down
to size for the convenience of the system administrator. If the
truncate operation fails, print a warning message. This also avoids a
gcc warning message.
Fixed a potential bug where by partial returns from the write(2)
system call could lost characters to be sent to external progress bar
display program.
libss: ss_execute_line: reflect any error codes from system() to the caller
This is primarily to silence a gcc warning, but it's better to reflect
the error from system() up to the caller. In this case we don't
actually use it for anything, but that's OK.
libcom_err: Declare prototypes for et_list_lock/unlock in com_err.h
Define the prototypes for et_list_lock() and et_list_unlock() in
com_err.h. This promotes better error checking and avoids warnings
when compiling the library and programs that call these functions.
libe2p: Declare prototypes for the journal feature name functions in e2p.h
Define the prototypes for e2p_jrnl_feature2string() and
e2p_jrnl_string2feature() in e2p.h. This promotes better error
checking and avoids warnings when compiling the library and programs
that call these functions.
Add an option to switch between the private (in-tree) libblkid and
public (in-system installed) library. The private version is still
enabled by default.
If --disable-libblkid is specified the findfs(8) program, which is a
variant of tune2fs, is also not built or installed.
Signed-off-by: Karel Zak <kzak@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
debian: Update copyright files to point the correct common license files
This fixes the Lintian warning:
The copyright file refers to the versionless symlink in
/usr/share/common-licenses for the full text of the GPL, LGPL, or GFDL
license, but the package does not appear to allow distribution under
later versions of the license. This symlink will change with each
release of a new version of the license and may therefore point to a
different version than the package is released under. debian/copyright
should instead refers to the specific version of the license that the
package references.
For example, if the package says something like "you can redistribute it
and/or modify it under the terms of the GNU General Public License as
published by the Free Software Foundation; version 2 dated June, 1991,"
the debian/copyright file should refer to
/usr/share/common-licenses/GPL-2, not /GPL.
resize2fs: Fix corruption bug impacting ext4 filesystems with uninit_bg
Due to a fencepost bug, when skipping a block group whose block bitmap
was uninitialized (and hence could not contain any blocks eligible for
relaocation), the block immediately following the block group wasn't
checked as well. If it was in use and required relocation, it
wouldn't get properly relocated, with the result that an inode using
such a block would end up, post resize, with a pointer to a block now
outside the bounds of the filesystem.
resize2fs: Fix data corruption bug when shrinking the inode table for ext4
If we need to shrink the inode table, we need to make sure the inodes
contained in the part of the inode table we are vacating don't get
reused as part of the filesystem shrink operation. This wasn't a
problem with ext3 filesystems, since the inode table was located in
the block group that was going away, so that location was not eligible
for reallocation.
However with ext4 filesystems with flex_bg enabled, it's possible for
a portion of the inode table in the last flex_bg group to be
deallocated, but in a part of the filesystem which could be used as
data blocks. So we must mark those blocks as reserved to prevent
their reuse, and adjust the minimum filesystem size calculation to
assure that we don't shrink a filesystem too small for the resize
operation to succeed.
resize2fs: Fix data corruption bug when growing an ext4 filesystem off-line
When allocating a new set of block group metadata as part of growing
the filesystem, the resize2fs code assumes that the bitmap and inode
table blocks are in their own block group; an assumption which is
changed by the flex_bg feature. This commit works around the problem
by temporarily turning off flex_bg while allocating the new block
group metadata, to avoid potentially overwriting previously allocated
data blocks.
Andreas Dilger [Tue, 28 Oct 2008 20:44:21 +0000 (14:44 -0600)]
e2fsck: print unsigned RAM usage statistics
Running e2fsck against a 14.5TB filesystem with -tt it reported
-200904kB for RAM usage in pass3 instead of the correct 2300773kB.
The RAM usage statistics were being printed with %d instead of %u.
Also fix a few places using %ld for inode numbers instead of %lu.
Signed-off-by: Andreas Dilger <adilger@sun.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Eric Sandeen [Sat, 22 Nov 2008 15:02:48 +0000 (09:02 -0600)]
e2fsck: ignore differing NEEDS_RECOVERY flag on backup sbs
When we resize online, the primary superblock gets copied to all
the backups, and of course since we're mounted the NEEDS_RECOVERY
flag is set. A subsequent fsck will find the backups have the
NEEDS_RECOVERY flag set while the primary does not, and this
forces a full fsck pass.
I think this flag can be safely ignored in the flag comparisons.
Addresses-Red-Hat-Bugzilla: #471925
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Theodore Ts'o [Tue, 17 Mar 2009 02:16:44 +0000 (22:16 -0400)]
libext2fs: external journal devices should not cause ext2fs_open2 to fail
This fixes a regression introduced in commit 79a9ab14 which caused
attempts to open external journals to fail due to overly strict
filesystem consistency checks.
Jim Meyering [Mon, 23 Feb 2009 17:07:50 +0000 (18:07 +0100)]
remove useless if-before-free tests
In case you're wondering about whether this change is safe from a
portability standpoint, fear not. This has been beaten to death
in other forums. Here are a few threads:
There has been debate about whether it's a good idea from a
performance standpoint, too, but imho you'll have a hard time
finding an instance where this sort of change induces a
measurable performance penalty. If you do, please let me know.
Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Eric Sandeen [Thu, 29 Jan 2009 22:52:13 +0000 (17:52 -0500)]
fix libext2fs info page name
In Red Hat bug #481620 Jerry reported that the libext2fs info page
is not accesable via "info libext2fs" but is via "info libext2fs.info"
and suggested that the following change should fix it.
Additional info from Jerry:
The problem is that makeinfo 4.12 interprets the dot in "libext2fs.info"
to be the end of the description portion of the info entry, even though
it hasn't seen the closing parenthesis yet. Making the reference be to
just "libext2fs" works.
Addresses-Red-Hat-Bugzilla: #481620
Reported-by: Jerry James <loganjerry@gmail.com> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Eric Sandeen [Thu, 29 Jan 2009 22:46:28 +0000 (17:46 -0500)]
debugfs: fix segfault on "stat" command with no open fs
This is a regression from commit 8fdf29117f922419bd5b3f741e5d554b1d5b8893, which attempts to access
current_fs via a feature check before we check that it's open.
Just moving the feature check below the open check should fix it.
Reported-by: Andrew Hecox <ahecox@redhat.com> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Sun, 8 Mar 2009 22:56:41 +0000 (18:56 -0400)]
blkid: Add fallback to ext4 for 2.6.29+ kernels if ext2 is not present
Starting in 2.6.29, ext4 can be used to support filesystems without a
journal. So if ext2 is not present, and the kernel version is greater
than 2.6.29, and ext4 is present, return a filesystme type of ext4.
Theodore Ts'o [Fri, 6 Mar 2009 00:40:20 +0000 (19:40 -0500)]
Add support for a new superblock field: s_kbytes_written
This field tracks the lifetime amount of writes to the filesystem. It
will be updated by the kernel as well as by e2fsprogs programs which
write to the filesystem.
Theodore Ts'o [Thu, 22 Jan 2009 21:05:29 +0000 (16:05 -0500)]
e2fsck: Change PR_3_CREATE_LPF_ERROR to be a non-fatal problem
The other problem codes associated with failing to create the
lost+found directory are non-fatal, and this one should be non-fatal
as well. The two places which call e2fsck_get_lost_and_found()
already deal with a failure to create the directory, so there's no
point making this be a fatal error.
Theodore Ts'o [Thu, 22 Jan 2009 21:01:38 +0000 (16:01 -0500)]
libext2fs: Add sanity checks to ext2fs_{block,inode}_alloc_stats
If ext2fs_inode_alloc_stats2() or ext2fs_block_alloc_stats() is passed
an insanely large inode or block number, it's possible for these
functions to overrun an array boundary and cause the calling program
to crash with a memory error.
Detect this case, and since these functions don't return an error
code, print a warning message, much like we do in ext2fs_warn_bitmap2().
Theodore Ts'o [Thu, 22 Jan 2009 20:55:49 +0000 (15:55 -0500)]
ext2fs_new_inode(): Add sanity check to assure a valid inode number
Add a sanity check to makesure that even if the superblock field
s_first_inode is insane, that we won't return an invalid inode number.
(The function will return the error EXT2_ET_INODE_ALLOC_FAIL in that
case.)
Theodore Ts'o [Tue, 20 Jan 2009 18:37:47 +0000 (13:37 -0500)]
ext2fs_get_device_size: Fix error handling
The previous patch would return EFBIG for any failure called from
ext2fs_get_device_size2(). (I didn't merge this fix with the
preceeding commit to allow merges to happen more easily.)
Theodore Ts'o [Tue, 20 Jan 2009 16:49:17 +0000 (11:49 -0500)]
tune2fs: Fix tune2fs -I so it won't corrupt RAID filesystems
If a filesystem is built with the stride extended-option (which is
often used in RAID filesystems to make sure the block and inode
allocation bitmaps don't end up hitting one disk platter harder than
the rest), this can cause tune2fs -I to corrupt the filesystem because
it fails to handle the case where the allocation bitmaps are located
after the inode table, where the inode table needs to grow. Handle
this case correctly.
Theodore Ts'o [Tue, 20 Jan 2009 06:50:07 +0000 (01:50 -0500)]
tune2fs: Don't allow the -I option if the flex_bg feature is enabled
With flex_bg usually the inode table for most block groups are packed
right against each other, so expanding the inode table size needs
special handling that's not currently in tune2fs.
Theodore Ts'o [Tue, 20 Jan 2009 05:46:06 +0000 (00:46 -0500)]
resize2fs: Reserve some extra space for -P/-M for ext4 filesystems
Some extra blocks may be needed to expand some extent allocation trees
while we are shrinking the filesystem. We don't know exactly how
much, so we use a hueristic.
Theodore Ts'o [Tue, 20 Jan 2009 00:30:59 +0000 (19:30 -0500)]
ext2fs_block_iterate2: Preserve the uninit flag in extents
When modifying a block via the block_iterate interface, preserve the
uninit flag in the extent. Resize2fs uses this interface, so we have
to preserve the uninit status when relocating a block.
Theodore Ts'o [Mon, 19 Jan 2009 19:22:52 +0000 (14:22 -0500)]
ext2fs_block_iterate2: Reflect errors from ext2fs_extent_set_bmap to caller
If the callback function tries to change a block, and
ext2fs_extent_set_bmap() fails for some reason (for example, there
isn't enough disk space to split a node and expand the extent tree,
make sure that error is reflected back up to the caller.