debian: clean up conditional fuse2fs package build rules
There's no point conditionalizing fuse2fs in the control file, since
we control whether or not fuse2fs is built in the rules file, and the
control file is going to be the same when built on all of the debian
build servers.
Also key off of DEB_HOST_ARCH_OS to determine whether or not we are
building for the Hurd, and define it so the right thing happens if
./debian/rules is run by hand on a Hurd system.
Andreas Dilger [Wed, 30 Aug 2017 06:05:25 +0000 (02:05 -0400)]
tests: report if a test is taking a long time
Print out a message if a test takes longer than 60s, with a
reminder to potentially add is_slow_test to that test, so
that it is easier to find which tests are taking a long time,
especially if running with "make -j check" or similar.
Add an exclusion for r_expand_full on MacOS since HFS+ does
not support sparse devices and doesn't like large test images.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Andreas Dilger [Wed, 30 Aug 2017 05:56:08 +0000 (01:56 -0400)]
tune2fs: quiet llvm build warning
Quiet a relatively harmless build warning:
tune2fs.c:928:18: warning: '&&' within '||' [-Wlogical-op-parentheses]
if (pass == 1 && (inode->i_flags & EXT4_EA_INODE_FL) ||
~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~
tune2fs.c:928:18: place parentheses around '&&' to silence warning
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
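The warning exists because `&&` binds tighter than `||` in C, so the two possible groupings of a mixed expression give different results. A minimal sketch of the difference, using hypothetical function names that are not part of tune2fs:

```c
#include <stdbool.h>

/* Grouping the compiler already applies to "a && b || c". */
bool and_binds_first(bool a, bool b, bool c)
{
	return (a && b) || c;
}

/* The other grouping, which a reader might wrongly assume. */
bool or_binds_first(bool a, bool b, bool c)
{
	return a && (b || c);
}
```

The two functions disagree when `a` is false and `c` is true, which is why spelling out the parentheses helps readers even though it does not change behavior.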
Matthias Andree [Tue, 29 Aug 2017 05:02:36 +0000 (01:02 -0400)]
Search for GNU-compatible dd for self-tests.
This checks for a dd that supports iflag=fullblock oflag=append,
and looks at gdd and dd for now, and warns of failing self-tests
if neither supports these two flags.
Theodore Ts'o [Sat, 26 Aug 2017 17:42:30 +0000 (13:42 -0400)]
Silence valgrind warnings
Valgrind doesn't understand that the kernel will be initializing the
struct termios and struct loop_info64 structures. Since they occur in
functions which are not in the hot path, preinitialize to zero to
prevent valgrind from producing a huge number of false positives.
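A minimal sketch of the preinitialization idea, assuming a hypothetical helper name (`get_termios_zeroed` is not an e2fsprogs function); the struct is cleared before the kernel is asked to fill it in:

```c
#include <string.h>
#include <termios.h>
#include <unistd.h>

/* Hypothetical helper: query terminal settings into a pre-zeroed struct. */
int get_termios_zeroed(int fd, struct termios *t)
{
	/* Every byte is defined before the call, so a memory checker
	 * has nothing to flag afterwards. */
	memset(t, 0, sizeof(*t));
	return tcgetattr(fd, t);
}
```

The extra memset costs a few cycles per call, which is acceptable here because the source notes these calls are not in the hot path.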
Jaco Kroon [Wed, 23 Aug 2017 18:21:43 +0000 (14:21 -0400)]
e2fsck: add optimization for heavily hard-linked file systems
In the case of file system with large number of hard links, e2fsck can
take a large amount of time in pass 2 due to binary search lookup of
inode numbers. This implements a memory trade-off (storing 2 bytes
in-memory for each inode to store inode counts).
For a 40TB filesystem with 2.8bn inodes this map alone requires 5.7GB
of RAM. For this reason, we don't enable this optimization by
default. It can be enabled using either an extended option to e2fsck
or via a setting in e2fsck.conf.
Even when the fullmap optimization is enabled, we don't use this for
the icount structure in pass 1. This is because the CPU gain is
nearly nil for that pass and the sacrificed memory does not justify
the increase in RAM.
(It could be that during pass 1, if more than 17% of possible inodes
have link_count>1 (466m inodes in the 40TB with 2.8bn possible inodes
case) then it becomes more memory efficient to use the full map
implementation. However, this is extremely
unlikely given that most file systems are heavily over-provisioned in
terms of the number of inodes in the system.)
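The memory cost of the full map follows directly from the 2 bytes per inode mentioned above. A small sketch of the arithmetic, using a hypothetical helper name:

```c
#include <stdint.h>

/* Bytes needed for a full map holding one 16-bit counter per inode. */
uint64_t fullmap_bytes(uint64_t inode_count)
{
	return inode_count * sizeof(uint16_t);
}
```

For exactly 2.8 billion inodes this gives 5.6 billion bytes, which is in line with the 5.7GB quoted above if the "2.8bn" figure is a rounded inode count.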
Jaco Kroon [Wed, 23 Aug 2017 17:54:25 +0000 (13:54 -0400)]
e2fsck: optimize out the use of region_t in scan_extent_node()
Since extents have a guarantee of being monotonically increasing we
merely need to check that block n+1 starts after block n. This is a
simple enough check and we can perform this by calculating the next
expected logical block instead of using the region usage tracking data
abstraction.
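A minimal sketch of the "next expected logical block" check described above. The struct and function names are hypothetical simplifications, not the actual scan_extent_node() code:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified extent: logical start block and length. */
struct simple_extent {
	uint64_t lblk;
	uint32_t len;
};

/* Returns true if no extent starts before the end of the previous one. */
bool extents_do_not_overlap(const struct simple_extent *ex, size_t n)
{
	uint64_t next_lblk = 0;	/* next expected logical block */
	size_t i;

	for (i = 0; i < n; i++) {
		if (ex[i].lblk < next_lblk)
			return false;
		next_lblk = ex[i].lblk + ex[i].len;
	}
	return true;
}
```

Only a single integer of state is carried between extents, which is what makes this cheaper than a general region tracker.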
Theodore Ts'o [Wed, 23 Aug 2017 14:57:25 +0000 (10:57 -0400)]
tune2fs: explain why an fsck is needed
Currently tune2fs just says without any explanation, "run fsck -f".
Add a short explanation that a freshly checked file system is required
to reduce user confusion. (We could add even more details, but
hopefully this is enough.)
Theodore Ts'o [Wed, 23 Aug 2017 14:30:09 +0000 (10:30 -0400)]
mke2fs: automatically use 256 byte inodes if project feature enabled
If the inode size is not explicitly requested on the command line, and
it is too small to support the project feature, automatically promote
the inode size to be 256 bytes so that the project feature will work.
Note the previous test to check for a too-small inode size didn't work
because it checked before inode size was set in fs_param. Hence, it
was possible to create file systems with a 128 byte inode and the
project feature enabled.
Theodore Ts'o [Tue, 22 Aug 2017 16:15:26 +0000 (12:15 -0400)]
debian: remove support for pre-multiarch versions of Debian
All versions of Debian after Wheezy support Multiarch, so we can
simplify the Debian control.in and rules file by removing support for
older versions of Debian without Multiarch support.
Theodore Ts'o [Tue, 22 Aug 2017 15:23:21 +0000 (11:23 -0400)]
libext2fs: avoid potential out-of-bounds write if pread/pread64 fails
In unix_io.c's raw_read_block(), if the initial attempt to call
pread/pread64 fails because the offset is insane, the variable
"actual" is left at -1, and then when lseek fails, the cleanup
function will try to clear (as an out-of-bounds write) a single byte
before the buffer. Fix this.
Addresses-Debian-Bug: #871539
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reported-by: Jakub Wilk <jwilk@jwilk.net>
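The failure mode is a cleanup step that uses a negative byte count as a buffer offset. The following is a sketch of a guarded cleanup under that assumption; `zero_unread_tail` is a hypothetical helper and not the actual change made to raw_read_block():

```c
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: zero the part of buf that a read did not fill. */
void zero_unread_tail(char *buf, size_t size, ssize_t actual)
{
	if (actual < 0)
		actual = 0;	/* failed read: clear from the start of buf */
	if ((size_t)actual < size)
		memset(buf + actual, 0, size - (size_t)actual);
}
```

Without the clamp, an `actual` of -1 would make the memset start one byte before the buffer.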
Theodore Ts'o [Tue, 22 Aug 2017 14:37:10 +0000 (10:37 -0400)]
debian: remove support for libuuid/libblkid packages
Remove support for util-linux prior to 2.16, when e2fsprogs provided
its own copy of libuuid and libblkid. This is only needed for Debian
distributions prior to Wheezy, which is no longer supported.
Theodore Ts'o [Tue, 22 Aug 2017 04:54:15 +0000 (00:54 -0400)]
libsupport: don't try accessing the project quota for 128 byte inodes
If the file system has 128 byte inodes, it's not possible to access the
project quota id, since the inode is too small. So prevent a
potential out-of-bounds read in get_qid().
The problem was found by valgrind and American Fuzzy Lop.
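A sketch of the kind of size check that avoids the out-of-bounds read. The struct layout and names below are hypothetical simplifications; the real ext4 inode layout and get_qid() differ:

```c
#include <stddef.h>
#include <stdint.h>

#define GOOD_OLD_INODE_SIZE 128	/* size of the original ext2 inode */

/* Hypothetical, simplified large inode: 128 byte base plus extra fields. */
struct large_inode {
	uint8_t  base[GOOD_OLD_INODE_SIZE];
	uint16_t extra_isize;	/* bytes of extra fields in use */
	uint16_t pad;
	uint32_t projid;
};

/* Returns 1 and stores the project id only if the inode can hold it. */
int get_projid(const struct large_inode *inode, size_t inode_size,
	       uint32_t *projid)
{
	size_t end = offsetof(struct large_inode, projid) + sizeof(uint32_t);

	/* Check the on-disk inode size first, so that extra_isize is
	 * never read from a 128 byte inode. */
	if (inode_size < end ||
	    GOOD_OLD_INODE_SIZE + (size_t)inode->extra_isize < end)
		return 0;
	*projid = inode->projid;
	return 1;
}
```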
Theodore Ts'o [Tue, 22 Aug 2017 01:20:38 +0000 (21:20 -0400)]
e2fsck: in ask_yn() fall back to English yes/no characters
If the translation for y/n is missing because of fuzzy translations,
so that the user is told to use <y/n>, fall back to accepting the
English characters so that they work correctly.
Theodore Ts'o [Mon, 14 Aug 2017 23:52:39 +0000 (19:52 -0400)]
e2fsck: add optimization for large, fragmented sparse files
The code which checks for overlapping logical blocks in an extent tree
is O(h*e) in time, where h is the number of holes in the file, and e
is the number of extents in the file. So a file with a large number
of holes can take e2fsck a long time to process. Optimize this by taking
advantage of the fact that the vast majority of the time, region_allocate()
is called with increasing logical block numbers, so we are almost
always appending onto the end of the region list.
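A minimal sketch of an append fast path for a sorted region list. The types and the `region_add` name are hypothetical; the real region_allocate() differs in detail (for example, this sketch does not merge adjacent ranges):

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified region list kept sorted by start block. */
struct region_el {
	uint64_t start, end;	/* half-open range [start, end) */
	struct region_el *next;
};

struct region {
	struct region_el *head;
	struct region_el *tail;	/* cached last element for the fast path */
};

/* Returns 1 on overlap, 0 if recorded, -1 on allocation failure. */
int region_add(struct region *r, uint64_t start, uint64_t len)
{
	uint64_t end = start + len;
	struct region_el *el, *prev = NULL, *cur = NULL;

	if (r->tail && start >= r->tail->end) {
		/* Fast path: range lies past everything recorded so far. */
		prev = r->tail;
	} else {
		/* Slow path: walk the list to find the insertion point. */
		for (cur = r->head; cur; prev = cur, cur = cur->next) {
			if (start < cur->end && cur->start < end)
				return 1;
			if (end <= cur->start)
				break;
		}
	}
	el = malloc(sizeof(*el));
	if (!el)
		return -1;
	el->start = start;
	el->end = end;
	el->next = cur;
	if (prev)
		prev->next = el;
	else
		r->head = el;
	if (!cur)
		r->tail = el;
	return 0;
}
```

When calls arrive in increasing block order, every insertion takes the fast path and costs constant time instead of a walk over all earlier elements.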
Theodore Ts'o [Mon, 14 Aug 2017 01:07:21 +0000 (21:07 -0400)]
mke2fs: fix UI problem caused by fuzzy translations
When the original message was changed from "(y, n)" to "(y, N)", this
caused the translations to be marked as "fuzzy". For those
translations that use different characters for yes and no --- for
example, German, which uses j and n for "ja" and "nein" --- not having
the translation can cause user confusion since the user will type 'y',
and it will be interpreted as "No", since mke2fs is expecting that the
user will type some other character, such as 'j' or 'J' for "Ja" in
the German locale.
Theodore Ts'o [Sun, 13 Aug 2017 18:45:27 +0000 (14:45 -0400)]
libsupport: fix 32-bit quota test failures
On 32-bit platforms some of the util_dqblk structures have a type of
long long. So we need to use %lld and casts to make sure the right
thing happens on both 32-bit and 64-bit platforms.
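A short sketch of the cast-plus-%lld pattern, assuming a hypothetical helper name. Casting to `long long` makes the argument match the format specifier regardless of the width of the underlying type:

```c
#include <stdio.h>

/* Format a counter whose underlying type may differ between platforms. */
int format_counter(char *buf, size_t size, long value)
{
	/* The cast makes the argument match %lld on 32-bit and 64-bit alike. */
	return snprintf(buf, size, "%lld", (long long)value);
}
```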
Theodore Ts'o [Fri, 4 Aug 2017 06:01:43 +0000 (02:01 -0400)]
Remove special mips libraries from Debian build
These libraries were needed to support arcboot, which is obsolete and
no longer part of Debian. So drop these non-standard, legacy special
libraries that were only built on the mips platform.
Theodore Ts'o [Tue, 1 Aug 2017 14:26:11 +0000 (10:26 -0400)]
e2fsck: fix e2fsck -D for encrypted directories
If the directory entry is encrypted there may be embedded NUL
characters; hence, we should use memcmp instead of strncmp when
comparing strings. Otherwise, e2fsck can erroneously report that a
directory has duplicate entries when doing an e2fsck -D check.
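A minimal sketch of a length-aware name comparison, using a hypothetical helper name. Unlike strncmp(), memcmp() does not stop at the first NUL byte:

```c
#include <string.h>

/* Compare two directory entry names of known lengths, byte for byte. */
int names_equal(const char *a, size_t a_len, const char *b, size_t b_len)
{
	return a_len == b_len && memcmp(a, b, a_len) == 0;
}
```

Two names that differ only after an embedded NUL compare as equal under strncmp() but as different here.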
libsupport: fix error handling in quota_write_inode
The error return value of quota_file_create() is no longer < 0,
and the error handling in quota_write_inode() is incorrect;
fix both. This also fixes a tune2fs segfault that currently
occurs when we add the project and quota features to an ext4
filesystem that has run out of inodes.
debugfs: fix "ls -p" to avoid printing garbage after the file name
In commit 68a1de3df3 (debugfs: pretty print encrypted filenames in the
ls command), a change was introduced in debugfs/ls.c which instead of
copying dirent->name and 0-terminating it, dirent->name is used
directly in printf.
However, instead of using the precision to limit the number of
characters output, the code uses the field width. As a result,
characters are output until a 0 is read, which results in garbage
after the file name.
Also fix two other instances of this in debugging messages that aren't
used, but fixing them will avoid potential future copypasta bugs.
Reported-by: Christian Gabriel <ch_gabriel@web.de>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
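The difference between field width and precision can be sketched as follows. The helper name is hypothetical and this is not the debugfs code itself:

```c
#include <stdio.h>

/* Format a name that is NOT NUL-terminated, using its explicit length. */
int format_name(char *out, size_t out_size, const char *name, int name_len)
{
	/* "%.*s" sets the precision: at most name_len bytes are read.
	 * "%*s" would only set a minimum field width and keep reading
	 * until a NUL byte is found. */
	return snprintf(out, out_size, "%.*s", name_len, name);
}
```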
Extended attribute inodes have a link count of 1 but they are not
attached to any directories. When an xattr inode with zero ea
references is found, the remedy is to reconnect it to lost+found dir.
Since reconnect operation increments the link count, it would normally
become 2 but to avoid that, check_ea_inode() sets the link count to
zero in anticipation of the reconnect operation. It does this even when
e2fsck is invoked with the -n option, which causes a fatal e2fsck failure
as can be demonstrated with the following test script:
resize2fs: add support for resizing filesystems with ea_inode feature
Resizing filesystems with ea_inode feature was disallowed so far
because the code for updating the ea entries was missing. This patch
adds that support.
This patch is a major update to how we decide where to put extended
attributes. The main motivation is to enable creating values in
extended attribute inodes. While doing this, we want to implement a
behavior that is as close to kernel as possible.
Existing set ea code deviates from kernel behavior which makes it harder
to implement ea_inode feature:
- kernel only sorts ea entries in xattr block, e2fsprogs implementation
sorts all entries on every update.
- e2fsprogs implementation shuffled things on every update so the order
of updates does not matter. Kernel does not reshuffle things.
- e2fsprogs could evacuate entries from inode body to xattr block and
vice versa. This behavior does not exist in kernel.
Such differences could lead to inconsistent behavior between fuse2fs and
a kernel mount.
With ea_inode feature, we also need to decide whether to put a value
in an inode or keep it 'inline'. In kernel implementation this
depends on current placement of entries.
To close the behavioral gap, ext2fs_xattr_set() now takes over the
decision about where to place ea values. This also allows it to raise
errors early instead of delaying them to a separate
ext2fs_xattrs_write() call later.
libext2fs: eliminate empty element holes in ext2_xattr_handle->attrs
When an extended attribute is removed, its array element is emptied.
This creates holes in the array so various places that want to walk
filled elements have to do an empty element check.
Have remove operation shift remaining filled elements to the left.
This allows a simple iteration up to ext2_xattr_handle->count to walk
all filled entries, and so empty element checks become unnecessary.
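A minimal sketch of a remove operation that shifts the remaining elements left. The struct and function names are hypothetical simplifications of ext2_xattr_handle:

```c
#include <string.h>

/* Hypothetical, simplified attribute slot and handle. */
struct attr {
	const char *name;
};

struct attr_handle {
	struct attr attrs[8];
	int count;	/* number of filled elements, all at the front */
};

/* Remove element idx and close the gap; returns 0, or -1 on a bad index. */
int attr_remove(struct attr_handle *h, int idx)
{
	if (idx < 0 || idx >= h->count)
		return -1;
	memmove(&h->attrs[idx], &h->attrs[idx + 1],
		(size_t)(h->count - idx - 1) * sizeof(h->attrs[0]));
	h->count--;
	return 0;
}
```

After the shift, a plain loop from 0 to `count` visits only filled elements, so no per-element emptiness check is needed.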
libext2fs: rename ext2_xattr_handle->length to capacity
ext2_xattr_handle has two fields, 'count' and 'length', which
represent the number of filled elements and the total element count.
Their meanings are close enough to be easily confused, which makes the
code less readable. Rename length to capacity.
Eric Sandeen [Sun, 23 Jul 2017 22:34:57 +0000 (18:34 -0400)]
tune2fs: edit dire warning about check intervals
Time & mount-count based checks have been off by default for quite some
time now, but the dire warning about disabling them remains in the
tune2fs manpage, which is confusing. We did "strongly consider
the consequences" and disabled it by default, no need to scare the
user about it now. Inform the user of the consequences in a more
measured tone.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
resize2fs: sanity check the free blocks and inode counts
If the free blocks count or the free inodes count is larger than the
number of blocks or inodes in the system, request that the file system be
checked. Otherwise it's possible for calculate_minimum_resize_size()
to hang in an infinite loop.
This problem was found using American Fuzzy Lop.
Reported-by: Adam Buchbinder <abuchbinder@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
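The sanity check amounts to comparing each free count against its total. A sketch with a hypothetical function name, not the actual resize2fs code:

```c
#include <stdint.h>

/* Returns 1 if the free counts are plausible, 0 if the fs needs a check. */
int free_counts_sane(uint64_t blocks, uint64_t free_blocks,
		     uint32_t inodes, uint32_t free_inodes)
{
	return free_blocks <= blocks && free_inodes <= inodes;
}
```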
E2fsck checks block numbers against the block_metadata_map before it
checks to see whether or not the block numbers are valid. So suppress
these harmless warnings.
If the superblock has invalid inode numbers for the user, group, or
project quota inodes, e2fsck should notice and offer to fix things by
zeroing out the invalid superblock field.
libext2fs: fix the s_log_block_size check in ext2fs_open()
The s_log_block_size check can fail to detect an invalid value if it is
between UINT_MAX-9 and UINT_MAX, which can lead to ext2fs_open()
crashing with a division by zero error.
This bug was found using American Fuzzy Lop: http://lcamtuf.coredump.cx/afl/
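The range quoted above is consistent with an unsigned addition that wraps around. The sketch below illustrates this under the assumption that the minimum and maximum block size logs are 10 and 16; the constants and function names are hypothetical stand-ins, not the ext2fs_open() code:

```c
#include <stdint.h>

#define MIN_BLOCK_LOG_SIZE 10u	/* assumed: 1 KiB minimum block size */
#define MAX_BLOCK_LOG_SIZE 16u	/* assumed: 64 KiB maximum block size */

/* Wrapping check: log + 10 overflows to a small value near UINT32_MAX. */
int log_block_size_ok_wrapping(uint32_t log)
{
	return (uint32_t)(log + MIN_BLOCK_LOG_SIZE) <= MAX_BLOCK_LOG_SIZE;
}

/* Overflow-free check: compare against the difference instead. */
int log_block_size_ok(uint32_t log)
{
	return log <= MAX_BLOCK_LOG_SIZE - MIN_BLOCK_LOG_SIZE;
}
```

The wrapping version accepts exactly the ten values from UINT32_MAX-9 to UINT32_MAX in addition to the valid ones, while the second version rejects them.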
Tahsin Erdogan [Fri, 30 Jun 2017 01:31:59 +0000 (18:31 -0700)]
Use i_size to determine whether a symlink is a fast symlink
The current way of determining whether a symlink is in fast symlink
format is to call ext2fs_inode_data_blocks2(). If number of data
blocks is zero and EXT4_INLINE_DATA_FL flag is not set, then symlink
data must be in inode->i_block.
This heuristic is becoming increasingly hard to maintain because
inode->i_blocks count can also be incremented for blocks used by
extended attributes. Before the ea_inode feature, an extra block could
come from the xattr block; now more blocks can be added because of xattr
inodes.
To address the issue, add an ext2fs_is_fast_symlink() function that
gives a direct answer based on inode->i_size field. This is
equivalent to kernel's ext4_inode_is_fast_symlink() function.
This patch also fixes a few issues related to fast symlink handling:
- Both rdump_symlink() and follow_link() interpreted symlinks with
0 data blocks to always mean fast symlinks. This is incorrect
because symlinks that are stored as inline data also have
0 data blocks. Thus, they try to read everything from
inode->i_block and miss the symlink suffix in inode extra area.
- e2fsck_pass1_check_symlink() had code to handle inode with
EXT4_INLINE_DATA_FL flag twice. The first if block always returns
from the function so the second one is unreachable code.
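A sketch of an i_size-based test, assuming the rule is that a symlink is "fast" when its non-empty target is shorter than the i_block[] array. The struct, constants, and function name are hypothetical simplifications, not the libext2fs definitions:

```c
#include <stdint.h>

#define S_IFMT_MASK  0170000
#define S_IFLNK_MODE 0120000
#define N_BLOCKS     15	/* i_block[] holds 15 32-bit entries, 60 bytes */

/* Hypothetical, simplified inode with only the fields used here. */
struct mini_inode {
	uint16_t i_mode;
	uint64_t i_size;
	uint32_t i_block[N_BLOCKS];
};

/* A symlink is "fast" if its non-empty target fits inside i_block[]. */
int is_fast_symlink(const struct mini_inode *inode)
{
	return (inode->i_mode & S_IFMT_MASK) == S_IFLNK_MODE &&
	       inode->i_size != 0 &&
	       inode->i_size < sizeof(inode->i_block);
}
```

Because the test depends only on the mode and size, it is unaffected by blocks that extended attributes add to the inode's block count.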
In some cases, resize2fs needs to move inodes because their inode
number is greater than the maximum allowed. Moving extended attribute
inodes would require updating all the references to them. This
is currently not supported.
ext2fs_xattr_set() currently does not support creating xattr inodes,
so allowing fuse2fs to mount a filesystem with ea_inode feature could
lead to corruption. Refuse to mount if the ea_inode feature is set.
tune2fs: update ea_inode hashes when fs uuid changes
Extended attribute inodes maintain a crc32c hash that is used for
deduplication. The crc seed derives from uuid so ea_inode hashes
must be updated when uuid changes.
The ea_inode hash is also incorporated into the xattr entry e_hash
so the entries that reference the inode also must be updated.
When check_inode_extra_space() detects a problem with the value of
i_extra_isize, it adjusts it and then returns without further validation
of contents in the inode body. Change this so that it will proceed to
check inline extended attributes.
In original implementation of ea_inode feature, each xattr inode had
a single parent. Child inode tracked the parent by storing its inode
number in i_mtime field. Also, child's i_generation matched parent's
i_generation.
With deduplication support, xattr inodes can now be shared so a single
backpointer is not sufficient to achieve strong binding. This is now
replaced by hash validation.