Emil Velikov [Sun, 21 Nov 2021 18:05:19 +0000 (18:05 +0000)]
tar: demote -xa from error to a warning
It's fairly common for people to use caf and xaf on Linux. The former in
itself being GNU tar specific - libarchive tar does not allow xa.
While it makes little sense to use xaf with libarchive tar, that is an
implementation detail which gets in the way when trying to write trivial
tooling/scripts.
For the sake of compatibility, reduce the error to a warning and augment
the message itself, making it clear that the option makes little sense.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Jonas Witschel [Sun, 21 Nov 2021 09:07:52 +0000 (10:07 +0100)]
test_sparse_basic: do not assume that holes can be read in one go
verify_sparse_file() assumes that every hole will be fully contained in only
one archive_read_data_block(). This is a reasonable assumption if the file is
indeed sparsely encoded in the archive because archive_read_data_block() will
just skip the hole and return the offset of the next data block.
However, if the file is not sparsely encoded in the archive, a hole consists of
a lot of zeroes that need to be read byte by byte. In this case, the archive
contains no information on where this block of zeroes ends and where actual
data begins. Therefore it can happen that a single archive_read_data_block()
contains both zeroes from a hole and actual data.
If this happens, assert(sparse->type == HOLE) fails. This assertion is
reasonable for sparsely encoded files because archive_read_data_block() will
never only read part of a hole (since it does not really "read" a hole at all,
it just returns a higher offset accounting for the size of the hole).
However, we want to start testing files with verify_sparse_file() that are
explicitly not sparsely encoded. In this case, the assertion does not
necessarily hold any more. Therefore we need to account for the case where the
overlapping block consists of data. To make sure the file contents are
correctly encoded in the archive, we need to test the contents of the data
block, like it is already done for blocks completely contained in the data read
by archive_read_data_block().
Note that this modification does not change the way sparsely encoded files are
verified, it just relaxes an edge case that cannot happen with sparsely encoded
files to make it possible to test any kind of file, whether sparsely encoded or
not.
Theo Buehler [Fri, 19 Nov 2021 17:55:29 +0000 (18:55 +0100)]
Remove OpenSSL compat code that misuses the API
Immediately after EVP_CIPHER_CTX_new() neither EVP_CIPHER_CTX_init()
nor EVP_CIPHER_CTX_reset() should be called: the purpose of the init
function is to initialize a context on the stack while reset clears
a used context for reuse. Neither situation is the case here.
Removing the code also fixes a potential NULL dereference because an
error of reset is not signaled to the caller. Fortunately reset doesn't
currently fail in this situation in current OpenSSL and LibreSSL.
Martin Matuska [Wed, 17 Nov 2021 20:06:00 +0000 (21:06 +0100)]
archive_write_disk_posix: fix writing fflags broken in 8a1bd5c
The fixup list was erroneously assumed to be directories only.
Only in the case of critical file flags modification (e.g. SF_IMMUTABLE
on BSD systems), other file types (e.g. regular files or symbolic links)
may be added to the fixup list. We still need to verify that we are writing
to the correct file type, so compare the archive entry file type with
the file type of the file to be modified.
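A minimal sketch of such a file type comparison, assuming a hypothetical
helper path_has_file_type() (this is not the function used in
archive_write_disk_posix). It uses lstat() so that a symbolic link is
reported as a link and not as its target:

```c
#define _XOPEN_SOURCE 700	/* for lstat() and S_IFMT under strict -std modes */
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: return 1 if the object at "path" has the file
 * type given in "expected_type" (S_IFDIR, S_IFREG, S_IFLNK, ...),
 * and 0 if the type differs or lstat() fails.
 */
int
path_has_file_type(const char *path, mode_t expected_type)
{
	struct stat st;

	/* lstat() does not follow a symlink in the final component. */
	if (lstat(path, &st) != 0)
		return (0);
	return ((st.st_mode & S_IFMT) == (expected_type & S_IFMT));
}
```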
Jonas Witschel [Sun, 14 Nov 2021 17:56:49 +0000 (18:56 +0100)]
Add ARCHIVE_READDISK_NO_SPARSE to suppress reading sparse file info
Sparse file information depends on the file system and can therefore be a
source of unreproducibility for the generated archives, e.g. if the same
content is compressed on a file system with and without sparse file support.
Add an option to suppress reading this information from disk entirely.
Emil Velikov [Sat, 23 Oct 2021 14:32:12 +0000 (15:32 +0100)]
editorconfig: add simple top-level file
Add a simple top-level .editorconfig file to manage common attributes
such as indentation style, trailing whitespace and newline at end of
file. The format is widespread and has support for nearly every editor
out there - see https://editorconfig.org/ for more.
The majority of the project is C, which uses tabs, although there are some
CMake files using 2 space indent and shell scripts predominantly using
4 space indent.
This makes it harder for casual contributors to butcher things :-)
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Remove the excessive fallthrough chain in parse_keyword(). Even though
it is in the else/error path, there is no point in comparing the key
another dozen (or more) times when we know it will fail.
Just use an early return (OK) or break respectively.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Samanta Navarro [Sat, 28 Aug 2021 11:58:00 +0000 (11:58 +0000)]
Fix size_t cast in read_mac_metadata_blob
The size_t data type on 32 bit systems is smaller than int64_t. Check
the int64_t value before casting to size_t. If the value is too large,
stop instead of continuing with a truncated value.
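The check can be sketched as follows. The helper name checked_size_cast()
is an assumption for illustration and is not the code in
read_mac_metadata_blob:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert an int64_t length to size_t only if
 * the value fits. Returns 0 on success and -1 otherwise.
 */
int
checked_size_cast(int64_t value, size_t *out)
{
	/* A negative length is never valid. */
	if (value < 0)
		return (-1);
	/* On 32-bit systems SIZE_MAX is smaller than INT64_MAX. */
	if ((uint64_t)value > (uint64_t)SIZE_MAX)
		return (-1);
	*out = (size_t)value;
	return (0);
}
```

A caller would treat -1 as a fatal condition rather than proceeding with
a truncated size.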
Martin Matuska [Fri, 27 Aug 2021 08:56:28 +0000 (10:56 +0200)]
Fix following symlinks when processing the fixup list
The previous fix in b41daecb5 was incomplete. Fixup entries are
given the original path without calling cleanup_pathname().
To make sure we don't follow a symlink, we must strip trailing
slashes from the path.
The fixup entries are always directories. Make sure we only try to
modify directories by passing O_DIRECTORY to open() (if supported)
and, failing that, by checking for a directory via lstat().
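A rough sketch of the idea, not the actual fixup code. The helper names
are hypothetical. Trailing slashes are stripped first because lstat() and
open() resolve a symlink when the path ends in "/". The lstat() check
runs on every platform here, and O_NOFOLLOW and O_DIRECTORY are added
where the headers define them:

```c
#define _XOPEN_SOURCE 700	/* for lstat(), O_NOFOLLOW, O_DIRECTORY */
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Remove trailing slashes in place, but keep a lone "/". */
void
strip_trailing_slashes(char *path)
{
	size_t len = strlen(path);

	while (len > 1 && path[len - 1] == '/')
		path[--len] = '\0';
}

/*
 * Hypothetical helper: open "path" only if it is a real directory.
 * Returns a file descriptor, or -1 if the path is a symlink, is not
 * a directory, or cannot be opened. Modifies "path" in place.
 */
int
open_directory_nofollow(char *path)
{
	struct stat st;
	int flags = O_RDONLY;

	strip_trailing_slashes(path);
	/* lstat() reports a symlink as a link, not as its target. */
	if (lstat(path, &st) != 0 || !S_ISDIR(st.st_mode))
		return (-1);
#ifdef O_NOFOLLOW
	flags |= O_NOFOLLOW;
#endif
#ifdef O_DIRECTORY
	flags |= O_DIRECTORY;
#endif
	return (open(path, flags));
}
```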
Martin Matuska [Sun, 22 Aug 2021 01:53:28 +0000 (03:53 +0200)]
Never follow symlinks when setting file flags on Linux
When opening a file descriptor to set file flags on Linux, ensure
no symbolic links are followed. This fixes the case when an archive
contains a directory entry followed by a symlink entry with the same
path. The fixup code would modify file flags of the symlink target.
Martin Matuska [Sat, 21 Aug 2021 07:07:54 +0000 (09:07 +0200)]
write_disk_posix: rename variable in check_symlinks_fsobj()
Rename the flag "extracting_hardlink" to "checking_linkname" to
be more accurate about its use. If the variable is non-zero it
means that check_symlinks_fsobj() is called on the linkname
when a hardlink is going to be created.
Samanta Navarro [Tue, 1 Jun 2021 11:26:30 +0000 (11:26 +0000)]
Fix mutual check in tar sparse handling
GNU.sparse.numbytes and GNU.sparse.offset both have to be set before
gnu_add_sparse_entry can be called.
The GNU.sparse.numbytes parser checks for tar->sparse_numbytes.
This has to be tar->sparse_offset instead to work just like the
GNU.sparse.offset parser.
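The mutual check can be illustrated with a simplified model. The struct
and function names here are hypothetical stand-ins for the tar reader
state, and counting entries stands in for gnu_add_sparse_entry. Each
setter stores its own value and then tests the other field:

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for the tar reader state. */
struct sparse_state {
	int64_t offset;		/* -1 means "not seen yet" */
	int64_t numbytes;	/* -1 means "not seen yet" */
	int entries;		/* completed (offset, numbytes) pairs */
};

/* Record one sparse entry and reset both fields for the next pair. */
static void
complete_entry(struct sparse_state *s)
{
	s->entries++;
	s->offset = -1;
	s->numbytes = -1;
}

void
set_sparse_offset(struct sparse_state *s, int64_t value)
{
	s->offset = value;
	/* Check the other field: has numbytes been seen already? */
	if (s->numbytes != -1)
		complete_entry(s);
}

void
set_sparse_numbytes(struct sparse_state *s, int64_t value)
{
	s->numbytes = value;
	/* Check the other field (offset), not the one just assigned. */
	if (s->offset != -1)
		complete_entry(s);
}
```

With this arrangement an entry is only recorded once both values are
present, regardless of the order in which the two keys arrive.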
Samanta Navarro [Tue, 1 Jun 2021 11:25:03 +0000 (11:25 +0000)]
Handle all negative int64_t values in mtree/tar
The variable last_digit_limit is negative since INT64_MIN itself is
negative as well. This means that the last digit after "limit" always
leads to maxval.
Turning last_digit_limit positive in itself is not sufficient because
it would lead to a signed integer overflow during shift operation.
If limit is reached and the last digit is last_digit_limit, the number
is at least maxval. The already existing if condition for even larger
(or smaller) values can be reused to prevent the last shift.
In my humble opinion it might make sense to reduce duplicated code and
keep it separated in a utility source file for shared use.
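A minimal sketch of this kind of clamped parsing, limited to base 10 and
NUL-terminated input. The function name parse_int64_clamped() is an
assumption, and this is not the mtree/tar code itself. The limits for
negative input are derived by dividing first and negating afterwards, so
they stay positive and no intermediate value overflows:

```c
#include <stdint.h>

/*
 * Hypothetical helper: parse an optionally negative decimal string,
 * returning INT64_MIN or INT64_MAX instead of overflowing. Parsing
 * stops at the first non-digit character.
 */
int64_t
parse_int64_clamped(const char *p)
{
	int64_t maxval = INT64_MAX;
	int64_t limit = INT64_MAX / 10;
	int64_t last_digit_limit = INT64_MAX % 10;
	int64_t l = 0;
	int negative = 0;

	if (*p == '-') {
		negative = 1;
		p++;
		maxval = INT64_MIN;
		/* Negate after dividing so both limits are positive. */
		limit = -(INT64_MIN / 10);
		last_digit_limit = -(INT64_MIN % 10);
	}
	while (*p >= '0' && *p <= '9') {
		int digit = *p++ - '0';

		/*
		 * If this digit would reach or exceed maxval, return
		 * maxval directly and skip the final multiply-and-add.
		 */
		if (l > limit || (l == limit && digit >= last_digit_limit))
			return (maxval);
		l = l * 10 + digit;
	}
	return (negative ? -l : l);
}
```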
Owen W. Taylor [Wed, 12 May 2021 20:26:24 +0000 (16:26 -0400)]
On close, handle short writes from archive_write_callback
The archive_write_callback passed to archive_write_open() is documented as:
"each call to the write callback function should translate to a single write(2) system call.
On success, the write callback should return the number of bytes actually written"
And in most places, the code repeatedly calls the write callback, but when flushing
the buffer at close, the write callback was called once, assuming it would write everything.
This could result in a truncated archive.
A test is added to test short writes in different code paths.
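The general pattern for coping with short writes can be sketched like
this. The callback type and the names write_fully() and short_sink_write()
are hypothetical and simplified compared to the real
archive_write_callback signature. A return value of zero is treated as an
error here to avoid looping forever:

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical callback: returns bytes written, or <= 0 on failure. */
typedef ssize_t (*write_cb)(void *ctx, const void *buf, size_t len);

/* Call the callback until all of "buf" is written. 0 on success. */
int
write_fully(write_cb cb, void *ctx, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = cb(ctx, p, len);

		if (n <= 0)
			return (-1);
		/* A short write just advances the cursor. */
		p += n;
		len -= (size_t)n;
	}
	return (0);
}

/* Demo sink accepting at most 3 bytes per call to force short writes. */
struct short_sink {
	char data[64];
	size_t used;
	int calls;
};

ssize_t
short_sink_write(void *ctx, const void *buf, size_t len)
{
	struct short_sink *s = ctx;

	if (len > 3)
		len = 3;
	if (len > sizeof(s->data) - s->used)
		len = sizeof(s->data) - s->used;
	memcpy(s->data + s->used, buf, len);
	s->used += len;
	s->calls++;
	return ((ssize_t)len);
}
```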
Owen W. Taylor [Wed, 12 May 2021 20:37:16 +0000 (16:37 -0400)]
Avoid getcwd(0, PATH_MAX) for GNU libc
Recent versions of GNU libc and GCC produce a warning on getcwd(0, PATH_MAX):
test_main.c: In function ‘get_refdir’:
test_main.c:3684:8: error: argument 1 is null but the corresponding size argument 2 value is 4096 [-Werror=nonnull]
3684 | pwd = getcwd(NULL, PATH_MAX);/* Solaris getcwd needs the size. */
This is because getcwd() is marked with the 'write_only (1, 2)' attribute.
Using the alternate getcwd(NULL, 0) path which is supported by GNU libc avoids this.
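One way to express this choice, as a sketch only. The helper name
current_directory() is hypothetical, and the non-glibc branch swaps in a
caller-allocated buffer, which differs from the getcwd(NULL, PATH_MAX)
call quoted above. It also assumes PATH_MAX is defined on the platform:

```c
#define _XOPEN_SOURCE 700	/* for getcwd() and PATH_MAX under strict -std modes */
#include <limits.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: return the current working directory in a
 * malloc()ed buffer, or NULL on failure. The caller must free() it.
 */
char *
current_directory(void)
{
#if defined(__GLIBC__)
	/* GNU libc extension: size 0 lets getcwd() allocate as needed. */
	return (getcwd(NULL, 0));
#else
	/* Fallback: supply our own buffer of PATH_MAX bytes. */
	char *buf = malloc(PATH_MAX);

	if (buf != NULL && getcwd(buf, PATH_MAX) == NULL) {
		free(buf);
		buf = NULL;
	}
	return (buf);
#endif
}
```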
Wei-Cheng Pan [Tue, 9 Mar 2021 16:34:55 +0000 (16:34 +0000)]
fix rar header skimming
The available size returned from `__archive_read_ahead` can be larger
than the required size. Subtracting the available size may underflow `skip`,
which will reach EOF too soon.
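The guard can be sketched as below, assuming an unsigned counter and a
hypothetical helper consume_for_skip(). The actual change in the rar
reader may be structured differently. The point is to clamp the amount
consumed to what is still required before subtracting:

```c
#include <stddef.h>

/*
 * Hypothetical helper: given the bytes still to skip and the bytes
 * reported as available by read-ahead, return how many bytes to
 * consume now and reduce *skip by that amount.
 */
size_t
consume_for_skip(size_t *skip, size_t avail)
{
	/* Never consume more than is still required. */
	size_t n = (avail < *skip) ? avail : *skip;

	*skip -= n;	/* cannot wrap around, since n <= *skip */
	return (n);
}
```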