Theodore Ts'o [Tue, 9 Aug 2022 00:17:40 +0000 (20:17 -0400)]
libext2fs: in ext2fs_open[2](), return an error if s_desc_size is too large
Previously, ext2fs_open() and ext2fs_open2() would return an error if
s_desc_size is too small. Add a check so it will return an error if
s_desc_size is too large, as well.
These checks will be skipped for e2fsck when it uses the flag
EXT2_FLAG_IGNORE_SB_ERRORS.
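As an illustration only, the two-sided range check described above might look like the sketch below. The function name and the two bound constants are hypothetical stand-ins (the real limits are defined in the libext2fs headers), and the ignore flag is modelled as a plain integer.

```c
#include <stdint.h>

/* Hypothetical bounds for illustration; the real limits come from libext2fs. */
#define SKETCH_MIN_DESC_SIZE 64u
#define SKETCH_MAX_DESC_SIZE 1024u

/* Returns 0 if s_desc_size is acceptable, -1 if the superblock looks corrupt. */
int check_desc_size(uint16_t s_desc_size, int ignore_sb_errors)
{
    if (ignore_sb_errors)
        return 0;               /* e2fsck-style callers skip the checks */
    if (s_desc_size < SKETCH_MIN_DESC_SIZE)
        return -1;              /* the pre-existing "too small" check */
    if (s_desc_size > SKETCH_MAX_DESC_SIZE)
        return -1;              /* the newly added "too large" check */
    return 0;
}
```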
Theodore Ts'o [Sun, 7 Aug 2022 23:47:25 +0000 (19:47 -0400)]
Fix UBSAN if s_log_groups_per_flex is 31
It is legal (albeit rare) for the number of block groups per flex_bg
to be 2**31 (which effectively means to put all of the block groups into
a single flex_bg). However, in that case "1 << 31" is undefined on
architectures with a 32-bit integer. Fix this UBSAN complaint by
using "1U << 31" instead.
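A minimal sketch of the difference, assuming unsigned int is at least 32 bits wide. The helper name is hypothetical, and the caller is assumed to have already rejected exponents above 31.

```c
#include <stdint.h>

/* Hypothetical helper: block groups per flex_bg for a given log2 value.
 * Precondition (assumed): log_groups_per_flex <= 31. */
uint32_t flex_bg_size(uint8_t log_groups_per_flex)
{
    /* 1U is unsigned, so a shift by 31 is well defined; "1 << 31" would
     * overflow a signed 32-bit int, which is undefined behavior. */
    return 1U << log_groups_per_flex;
}
```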
Theodore Ts'o [Sat, 6 Aug 2022 06:21:49 +0000 (02:21 -0400)]
libext2fs: teach ext2fs_open() to reject file systems with an invalid flex_bg size
If s_log_groups_per_flex is greater than 31, it will result in an
UBSAN error, since it will result in an invalid shift exponent when
calculating the flex_bg size. So reject such file systems when they
are opened. (The mke2fs program will not allow the creation of such
file systems, so they can only occur due to corruption.)
Theodore Ts'o [Sat, 6 Aug 2022 05:37:20 +0000 (01:37 -0400)]
libext2fs: teach ext2fs_open() to reject file systems with an invalid cluster size
If the cluster size is smaller than the block size, this can result in
a negative shift, which is undefined. When such a file system is
opened, immediately return an error indicating that the file system is
corrupted.
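The check can be sketched as follows, assuming the cluster-to-block ratio is derived by subtracting the log2 block size from the log2 cluster size. The function and parameter names are hypothetical, not the library's.

```c
#include <stdint.h>

/* Stores the shift that converts blocks to clusters in *bits_out.
 * Returns 0 on success, -1 if the superblock fields are inconsistent. */
int cluster_ratio_bits(uint32_t log_block_size, uint32_t log_cluster_size,
                       unsigned int *bits_out)
{
    /* A cluster smaller than a block would give a negative shift count. */
    if (log_cluster_size < log_block_size)
        return -1;
    *bits_out = log_cluster_size - log_block_size;
    return 0;
}
```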
Theodore Ts'o [Thu, 4 Aug 2022 19:18:15 +0000 (15:18 -0400)]
resize2fs: fix to respect the environment variable E2FSPROGS_FAKE_TIME
When performing an off-line resize, if an inode's block map needs to
be updated, resize2fs will update the inode's ctime. In addition, if
inode numbers need to be renumbered due to the file system shrinking
forcing the inode table to be shrunk, any directories which need to be
modified will have their ctime and mtime updated.
If the E2FSPROGS_FAKE_TIME environment variable is set, when the file
system is opened, fs->now will be set to this value, and resize2fs
needs to use it instead of calling time(0) to get the current time.
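The pattern can be sketched roughly as below. The struct and both function names are hypothetical stand-ins for the library's filesystem handle, and treating a zero value as "no override" is an assumption of this sketch.

```c
#include <stdlib.h>
#include <time.h>

/* Stand-in for the library's filesystem handle; only the field we need. */
struct sketch_fs {
    time_t now;   /* 0 means "no override" in this sketch */
};

/* At open time, record the fake time if the environment variable is set. */
void sketch_open(struct sketch_fs *fs)
{
    const char *s = getenv("E2FSPROGS_FAKE_TIME");

    fs->now = 0;
    if (s != NULL)
        fs->now = (time_t) strtoul(s, NULL, 0);
}

/* Callers use this instead of calling time(0) directly. */
time_t sketch_now(const struct sketch_fs *fs)
{
    return fs->now ? fs->now : time(NULL);
}
```

With this shape, any code that stamps ctime or mtime calls sketch_now(fs), so a test harness that sets the variable gets reproducible timestamps.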
Theodore Ts'o [Tue, 7 Jun 2022 02:44:35 +0000 (22:44 -0400)]
e2fsck: avoid out-of-bounds write for very deep extent trees
The kernel doesn't support extent trees deeper than 5
(EXT4_MAX_EXTENT_DEPTH). For this reason we only maintain the extent
tree statistics for 5 levels. Avoid out-of-bounds writes and reads if
the extent tree is deeper than this.
We keep these statistics to determine whether we should rebuild the
extent tree. If the extent tree is too deep, we don't need the
statistics because we should always rebuild it.
Reported-by: Nils Bars <nils.bars@rub.de> Reported-by: Moritz Schlögel <moritz.schloegel@rub.de> Reported-by: Nico Schiller <nico.schiller@rub.de> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
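A small sketch of the bounds guard, not the actual e2fsck code. The struct, the too_deep flag, and the function name are hypothetical; only the depth limit of 5 comes from the text above.

```c
#define SKETCH_MAX_EXTENT_DEPTH 5   /* mirrors the limit named above */

struct extent_stats {
    unsigned int count[SKETCH_MAX_EXTENT_DEPTH];
    int too_deep;                   /* hypothetical "always rebuild" flag */
};

/* Record one extent block seen at the given tree depth. */
void note_extent_block(struct extent_stats *st, int depth)
{
    if (depth < 0 || depth >= SKETCH_MAX_EXTENT_DEPTH) {
        /* Out of range: skip the array entirely and just remember
         * that the tree needs rebuilding. */
        st->too_deep = 1;
        return;
    }
    st->count[depth]++;
}
```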
Theodore Ts'o [Mon, 6 Jun 2022 17:34:08 +0000 (13:34 -0400)]
e2fsck: check for xattr value size integer wraparound
When checking an extended attribute block for correctness, we check if
the starting offset plus the value size exceeds the end of the block.
However, we weren't checking if the size was too large, and if it is
so large that it triggers a wraparound when we added the starting
offset, we won't notice the problem. Add the missing check.
Reported-by: Nils Bars <nils.bars@rub.de> Reported-by: Moritz Schlögel <moritz.schloegel@rub.de> Reported-by: Nico Schiller <nico.schiller@rub.de> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
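To make the wraparound concrete, here is a hedged sketch of an overflow-safe bounds test using 32-bit fields. The function name is hypothetical and the real e2fsck check is structured differently.

```c
#include <stdint.h>

/* Returns 1 if a value of `size` bytes starting at `offset` lies entirely
 * within a block of `block_size` bytes, 0 otherwise. */
int xattr_value_fits(uint32_t offset, uint32_t size, uint32_t block_size)
{
    /* Check the size on its own first; with a huge size the naive test
     * "offset + size > block_size" can wrap around and pass. */
    if (size > block_size)
        return 0;
    /* block_size - size cannot underflow here. */
    if (offset > block_size - size)
        return 0;
    return 1;
}
```

For example, offset 16 with size 0xFFFFFFF8 sums to 8 in 32-bit arithmetic, so the naive comparison against a 4096-byte block would accept it.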
Theodore Ts'o [Mon, 6 Jun 2022 16:03:36 +0000 (12:03 -0400)]
libext2fs: add check for too-short directory blocks
If there is an inline data directory which is smaller than 8 bytes
(which should never happen but for corrupted or fuzzed file systems),
ext2fs_process_dir_block() will now abort with EXT2_ET_DIR_CORRUPTED to
avoid an out-of-bounds read.
Reported-by: Nils Bars <nils.bars@rub.de> Reported-by: Moritz Schlögel <moritz.schloegel@rub.de> Reported-by: Nico Schiller <nico.schiller@rub.de> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Mon, 30 May 2022 23:17:30 +0000 (19:17 -0400)]
e2fsck: sanity check the journal inode number
E2fsck replays the journal before sanity checking the full superblock.
So it's possible that the journal inode number is not valid relative
to the number of block groups. To avoid a potential array
bounds overrun, sanity check this before trying to find the journal
inode.
Reported-by: Nils Bars <nils.bars@rub.de> Reported-by: Moritz Schlögel <moritz.schloegel@rub.de> Reported-by: Nico Schiller <nico.schiller@rub.de> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Andreas Dilger [Mon, 13 Dec 2021 06:35:30 +0000 (23:35 -0700)]
e2fsck: no parent lookup in disconnected dir
Don't call into ext2fs_get_pathname() to do a name lookup for a
disconnected directory, since the directory block traversal in
pass1 has already scanned all of the leaf blocks and never finds
the entry, always printing "???". If the name entry had been
found earlier, the directory would not be disconnected in pass3.
Instead, lookup ".." and print the parent name in the prompt, and
then do not search for the current directory name at all. This
avoids a useless full directory scan for each disconnected entry,
which can potentially be slow if the parent directory is large.
Separate the recursively looped directory case to a new error code,
since it is a different problem that should use its own descriptive
text, and a proper pathname can be shown in this case.
Andreas Dilger [Wed, 8 Dec 2021 07:51:12 +0000 (00:51 -0700)]
e2fsck: map PROMPT_* values to prompt messages
It isn't totally clear when searching the code for PROMPT_*
constants from problem codes where these messages come from.
Similarly, there isn't a direct mapping from the prompt string
to the constant.
Add comments that make this mapping more clear.
Signed-off-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Lukas Czerner [Thu, 17 Feb 2022 09:25:00 +0000 (10:25 +0100)]
Use mallinfo2 instead of mallinfo if available
mallinfo has been deprecated with GNU C library version 2.33 in favor of
mallinfo2 which works exactly the same as mallinfo but with larger field
widths. Use mallinfo2 if available.
Lukas Czerner [Thu, 17 Feb 2022 09:24:59 +0000 (10:24 +0100)]
libss: fix possible NULL pointer dereference on allocation failure
Currently in ss_execute_command() we're missing a check to see if the
memory allocation was successful. Fix it by checking the return from
malloc and returning ENOMEM if it had failed.
[ Removed addition of the SS_ET_ENOMEM entry to the libss error
table. -TYT ]
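The general shape of such a fix, shown on a hypothetical argv-copying helper rather than the real ss_execute_command() body:

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: duplicate a NULL-terminated argv array.
 * Returns 0 on success or ENOMEM if the allocation fails. */
int copy_argv(char **argv, char ***out)
{
    size_t argc = 0;
    char **copy;

    while (argv[argc] != NULL)
        argc++;

    copy = malloc((argc + 1) * sizeof(*copy));
    if (copy == NULL)
        return ENOMEM;          /* report failure instead of dereferencing NULL */

    memcpy(copy, argv, (argc + 1) * sizeof(*copy));
    *out = copy;
    return 0;
}
```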
libext2fs: add sanity check to extent manipulation
It is possible to have a corrupted extent tree in such a way that a leaf
node contains zero extents in it. Currently if that happens and we try
to traverse the tree we can end up accessing wrong data, or possibly
even uninitialized memory. Make sure we don't do that.
Additionally make sure that we have a sane number of bytes passed to
memmove() in ext2fs_extent_delete().
Note that e2fsck is currently unable to spot and fix such corruption in
pass1.
Theodore Ts'o [Tue, 4 Jan 2022 05:02:22 +0000 (00:02 -0500)]
setup-schroot: install the udev and systemd packages separately
On non-Linux Debian ports (e.g., GNU/Hurd and GNU/kFreeBSD) the udev
and systemd packages don't exist. So try to install them separately,
so they can fail on their own on those platforms.
Theodore Ts'o [Tue, 4 Jan 2022 03:45:37 +0000 (22:45 -0500)]
tests: support older versions of timeout in r_corrupt_fs
Older versions of the timeout program in coreutils don't support the
-v option. (This is apparently still in use in the GNU/kFreeBSD Debian
port since coreutils hasn't built successfully since Coreutils version
8.28.)
Theodore Ts'o [Tue, 28 Dec 2021 17:33:15 +0000 (12:33 -0500)]
resize2fs: sanity check free block group counts when calculating minimum size
If one or more block group descriptor's free blocks count is insane,
it's possible this can lead to an infinite loop in the function
calculate_minimum_resize_size(), which is called by resize2fs -P or
resize2fs -M.
Add some sanity checks to avoid this. In the case where the file
system is corrupt, this will result in resize2fs -P reporting an
incorrect value, but that's OK, since when we try to do an actual
resize operation, resize2fs requires that the file system be freshly
checked using e2fsck.
https://github.com/tytso/e2fsprogs/issues/94
Fixes: ac94445fc01f ("resize2fs: make minimum size estimates more reliable for mounted fs") Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Tue, 21 Dec 2021 19:55:32 +0000 (14:55 -0500)]
setup-schroot: add some additional packages needed to build debian packages
On older Debian systems, "apt-get build-dep e2fsprogs" might not bring
in all of the packages needed to build the most recent versions of
e2fsprogs. So explicitly try to install some additional packages
including dh-exec, udev, systemd, and cron.
Theodore Ts'o [Tue, 21 Dec 2021 19:28:51 +0000 (14:28 -0500)]
libuuid: try to use getrandom() or getentropy() if available
If getrandom() or getentropy() is available, use these interfaces in
favor of opening /dev/[u]random. This avoids a potential TSAN problem
that could cause a fd leak when trying to open
/dev/urandom. (Which is not a disaster, but these interfaces are more
foolproof and avoid needing to open a file descriptor in a library,
which is a good thing.)
Theodore Ts'o [Sat, 11 Dec 2021 03:40:40 +0000 (22:40 -0500)]
e2fsck: update the bg_checksum after fixing problems in the bg descriptor
Otherwise, we break the block group descriptor's checksum, and while
this gets fixed by e2fsck, it results in unnecessary messages being
printed or questions being asked of the system administrator.
Theodore Ts'o [Thu, 9 Dec 2021 15:55:54 +0000 (10:55 -0500)]
libext2fs: don't hold the CACHE_MTX while doing I/O
A report of a deadlock problem caused by I/O errors (caused by e2fsck's
error handler trying to write to a bad block to perform a forced
rewrite) uncovered that we were holding the CACHE_MTX while doing read
operations. This serialized read operations which destroyed the
performance benefits from doing parallel bitmap loading (or the
parallel e2fsck processing under development).
So restructure the code in unix_read_blk64() so that the read is
always done into the user-provided buffer, and then copied into the
cache afterwards.
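A much-simplified sketch of that restructuring, assuming a single-entry cache and a fixed block size. All names here are hypothetical, and the real unix_read_blk64() also consults the cache before reading, which is omitted.

```c
#include <pthread.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define SKETCH_BLK 4096

/* One-entry block cache protected by a mutex (illustrative only). */
static pthread_mutex_t cache_mtx = PTHREAD_MUTEX_INITIALIZER;
static char cache_buf[SKETCH_BLK];
static long long cache_blk = -1;

/* Read one block into the caller's buffer, then publish it to the cache. */
int sketch_read_blk(int fd, long long blk, void *user_buf)
{
    /* The I/O happens with no lock held, so readers can run in parallel. */
    ssize_t n = pread(fd, user_buf, SKETCH_BLK, (off_t) blk * SKETCH_BLK);

    if (n != SKETCH_BLK)
        return -1;

    /* Only the short memory copy is serialized. */
    pthread_mutex_lock(&cache_mtx);
    memcpy(cache_buf, user_buf, SKETCH_BLK);
    cache_blk = blk;
    pthread_mutex_unlock(&cache_mtx);
    return 0;
}
```

Because an I/O error is now detected before the mutex is taken, an error handler that itself performs I/O cannot deadlock on the cache lock in this sketch.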
e2fsck: skip sorting extents if there are no valid extents
At the end of a fast commit replay, e2fsck tries merging extents in an
inode. This patch fixes a bug in this logic where we were continuing
this action even if there were no extents to merge, resulting in
accessing illegal memory.
Speed up an off-line resize of a 10GB file system to 64TB located on
tmpfs from 90 seconds to 16 seconds by extracting block group bitmaps
and using a population count function to count the blocks in use,
instead of checking each bit in the block bitmap.
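The idea can be sketched with a portable byte-wise population count. The function names are hypothetical; a real implementation would more likely use a word-sized or compiler-provided popcount.

```c
#include <stddef.h>
#include <stdint.h>

/* Count the set bits in one byte by clearing the lowest set bit each pass. */
unsigned int popcount8(uint8_t b)
{
    unsigned int n = 0;

    while (b) {
        b &= (uint8_t) (b - 1);
        n++;
    }
    return n;
}

/* Count in-use blocks in a bitmap range without testing each bit index. */
size_t count_blocks_in_use(const uint8_t *bitmap, size_t nbytes)
{
    size_t used = 0;

    for (size_t i = 0; i < nbytes; i++)
        used += popcount8(bitmap[i]);
    return used;
}
```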
resize2fs: adjust new size of the file system to allow a successful resize
The previous commit in this series (commit 50088b1996cc: "resize2fs:
attempt to keep the # of inodes valid by removing the last bg") allows
a successful off-line resize of a file system with the default 16k
inode ratio to be grown to support a 64TB storage device by dropping
the last block group so the number of inodes is just below the maximum
2**32-1 number of inodes.
However, this is not a complete solution, for two reasons. First,
this adjustment happens after resize2fs has started potentially making
changes to the file system in the off-line (unmounted) case, which
means resize2fs will do a lot of unnecessary work. Secondly, in the
on-line resize case, passing the original requested size to the kernel
causes the kernel to fail the online resize request.
So teach resize2fs to adjust the new size of the file system much
earlier, which avoids both problems.
resize2fs: attempt to keep the # of inodes valid by removing the last bg
If a 10GB file system (with the default inode ratio size of 16k)
is resized to 64TB, the number of inodes will become 2**32 --- one
above the maximum allowed number of inodes of 2**32-1. In
adjust_fs_info(), we already try to drop the last block group if there
isn't sufficient space in the last block group to support the metadata
for that block group. So if dropping the last block group allows the
number of inodes to be valid, we should try that as well. In some cases
this will mean resizing a file system to 64TB will result in it being
resized to a size of 64TB - 128MB, which is close enough for
government work.
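The arithmetic can be checked with a short sketch. It assumes 128MB block groups (4k blocks, 32768 blocks per group) and the 16k inode ratio mentioned above. The helper names are hypothetical, and the real adjust_fs_info() logic is more involved.

```c
#include <stdint.h>

#define SKETCH_GROUP_BYTES (UINT64_C(128) << 20)   /* assumed 128MB groups */
#define SKETCH_INODE_RATIO UINT64_C(16384)         /* one inode per 16k */

/* Total inodes for a file system made of `groups` full block groups. */
uint64_t total_inodes(uint64_t groups)
{
    return groups * (SKETCH_GROUP_BYTES / SKETCH_INODE_RATIO);
}

/* Drop the last group if that alone brings the inode count within 2**32-1. */
uint64_t fit_groups(uint64_t groups)
{
    if (groups > 0 &&
        total_inodes(groups) > UINT32_MAX &&
        total_inodes(groups - 1) <= UINT32_MAX)
        groups--;
    return groups;
}
```

Under these assumptions a 64TB device holds 524288 groups of 8192 inodes each, which is exactly 2**32 inodes. Dropping one group leaves 2**32 - 8192 inodes and a file system 128MB smaller.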
Jan Kara [Mon, 23 Aug 2021 15:41:27 +0000 (17:41 +0200)]
debugfs: Fix headers for quota commands
list_quota and get_quota commands have 'blocks' header while what they
actually show is the used space in bytes. Fix the header to state 'space'
instead.
Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Jan Kara [Mon, 23 Aug 2021 15:41:25 +0000 (17:41 +0200)]
e2fsck: Do not trash user limits when processing orphan list
When e2fsck was loading quotas to process orphan list, it was loading
only quota usage. However, the subsequent quota writeout effectively
overwrote the quota limits, losing them forever. Make sure quota limits
are preserved over orphan replay.
Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Jan Kara [Mon, 23 Aug 2021 15:41:24 +0000 (17:41 +0200)]
tune2fs: Fix conversion of quota files
When tune2fs is enabling quota feature, it looks for old-style quota
files and tries to transfer limits stored in these files into newly
created hidden quota files. However, the code doing the transfer sets up
the quota scan incorrectly, and instead of transferring limits we transfer
usage. So not only are quota limits lost (at least they can still be
recovered from the old quota files) but also usage information may be
wrong if the accounting in e2fsprogs does not exactly match the
accounting in quota-tools (which is actually the case). Fix the setup of
the quota scan.
Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Jan Kara [Mon, 23 Aug 2021 15:41:23 +0000 (17:41 +0200)]
quota: Rename quota_update_limits() to quota_read_all_dquots()
quota_update_limits() is a misnomer because what it actually does is
that it updates 'usage' counters and leaves 'limit' counters intact.
Rename quota_update_limits() to quota_read_all_dquots() and while
changing prototype also add a flags argument so that callers can control
which quota information is actually updated from the disk.
Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Jan Kara [Mon, 23 Aug 2021 15:41:21 +0000 (17:41 +0200)]
quota: Add support to version 0 quota format
Version 0 quota format differs from version 1 by having only 32-bit
counters for inodes and block limits. For many installations this is not
limiting and thus the format is widely used. Also quota tools still
create quota files with this format by default. Add support for this
quota format to e2fsprogs so that we can seamlessly convert quota files
in this format into our internal quota files.
Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Sun, 22 Aug 2021 14:07:15 +0000 (10:07 -0400)]
tests: update expect file for u_direct_io
The u_direct_io test is normally not run (since it requires root
privileges); as a result, when the mke2fs.conf defaults were changed,
I didn't notice that the expected output for u_direct_io test needed
to be updated.
Fixes: d730be5ceeba ("tests: update mke2fs.conf to create 256 byte inodes by default") Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Mon, 16 Aug 2021 12:18:59 +0000 (08:18 -0400)]
debian: switch to using build dependency on debhelper-compat
This is preferred over using the debhelper/compat file, and we
no longer worry about supporting Debian Jessie or Debian Stretch
(at least without Stretch Backports).
Theodore Ts'o [Sat, 14 Aug 2021 21:07:53 +0000 (17:07 -0400)]
tests: update mke2fs.conf to create 256 byte inodes by default
The regression tests have their own private copy of mke2fs which is
used when tests create file systems. Since we are now using 256 byte
inodes by default, the tests should reflect this.
While we're at it, modify the r_move_itable test so it actually tests
moving the inode table.
Theodore Ts'o [Sat, 14 Aug 2021 14:39:13 +0000 (10:39 -0400)]
mke2fs: warn that bigalloc is experimental only for large cluster sizes
Since we have done a lot of testing with a cluster size equal to 64k
(or 16 times the default 4k block size), mke2fs will only warn for
bigalloc file systems where the cluster size is greater than 16 times
the block size.
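The threshold amounts to a one-line predicate. The function name below is hypothetical and both sizes are taken in bytes.

```c
/* Returns nonzero if a bigalloc file system with these sizes should still
 * trigger the "experimental" warning: cluster size above 16x the block size. */
int bigalloc_needs_warning(unsigned int block_size, unsigned int cluster_size)
{
    return cluster_size > 16u * block_size;
}
```

So with 4k blocks, a 64k cluster size is accepted silently while 128k still warns.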
Darrick J. Wong [Thu, 12 Aug 2021 23:22:22 +0000 (16:22 -0700)]
mke2fs: warn about missing y2038 support when formatting fresh ext4 fs
Filesystems with 128-byte inodes do not support timestamps beyond the
year 2038. Since we're now less than 16.5 years away from that point,
it's time to start warning users about this lack of support when they
format an ext4 filesystem with small inodes.
(Note that even for ext2 and ext3, we changed the default for
non-small file systems in 2008 in commit b1631cce648e ("Create
new filesystems with 256-byte inodes by default").)
So change the mke2fs.conf file to specify 256-byte inodes even for
small filesystems, and then add a warning to mke2fs itself if someone
is trying to make us format a file system with 128-byte inodes. This
can be suppressed by setting the boolean option warn_y2038_dates in
the mke2fs.conf file to false, which we do in the case of GNU Hurd,
since it only supports 128 byte inodes as of this writing.
[ Patch reworked by tytso to only warn in the case of GNU Hurd, since
the default for ext2/ext3 was changed for all but small file systems
in 2008. ]
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Lukas Czerner [Fri, 6 Aug 2021 09:58:17 +0000 (11:58 +0200)]
libext2fs: remove augmented rbtree functionality
The rbtree code was originally taken from the Linux kernel. This includes
the augmented rbtree functionality; however, this was never intended to be
used and is still not used. Just remove it.
Lukas Czerner [Fri, 6 Aug 2021 09:58:16 +0000 (11:58 +0200)]
libext2fs: fix unexpected NULL variable
The ext2fs_check_mount_point() function can be called with mtpt being
NULL, for example from ext2fs_check_if_mounted(). However, in the
is_swap_device condition we use mtpt in strncpy without checking
whether it is non-null first.
This should not be a problem on Linux since the previous attempt to open
the device exclusively would have prevented us from ever reaching the
problematic strncpy. However, it's still a bug and can cause problems on
other systems. Fix it by conditioning strncpy on mtpt not being null.
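A sketch of the guarded copy, with a hypothetical function name and an illustrative placeholder string; the actual code in the library may differ in detail.

```c
#include <stddef.h>
#include <string.h>

/* Fill in the caller's mount-point buffer for a swap device, but only if
 * the caller actually supplied a buffer. */
void report_swap_mount(char *mtpt, size_t mtlen)
{
    if (mtpt != NULL && mtlen > 0) {
        strncpy(mtpt, "<swap>", mtlen);
        mtpt[mtlen - 1] = '\0';   /* strncpy may not terminate on truncation */
    }
}
```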