Mike Frysinger [Mon, 27 Mar 2017 00:29:34 +0000 (20:29 -0400)]
support reading metadata from compressed files
The raw format provides very little metadata. Allow filters to pass
back state that they know about. With gzip, we know the original file
name, mtime, and file size. For now, we only pull out the first two
as those are available in the file header. The file size is in the file
trailer, so we'll have to add support for that later (if we can seek
the input).
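A minimal sketch of pulling the name and mtime out of a gzip header, following the field layout in RFC 1952. The struct gz_meta type and the gz_read_header_meta() function are hypothetical names for illustration only; they are not the filter interface this commit adds.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical result holder; not a libarchive type. */
struct gz_meta {
	uint32_t mtime;   /* seconds since the Unix epoch, 0 if unset */
	char name[256];   /* original file name, empty if absent */
};

/*
 * Parse the gzip member header (RFC 1952) found in buf[0..len).
 * Returns 0 on success, -1 if the buffer does not hold a complete header.
 */
static int
gz_read_header_meta(const unsigned char *buf, size_t len, struct gz_meta *out)
{
	size_t pos = 10;  /* size of the fixed part of the header */
	unsigned char flags;

	if (len < 10 || buf[0] != 0x1f || buf[1] != 0x8b || buf[2] != 8)
		return -1;
	flags = buf[3];

	/* MTIME is stored little-endian in bytes 4..7. */
	out->mtime = (uint32_t)buf[4] | ((uint32_t)buf[5] << 8) |
	    ((uint32_t)buf[6] << 16) | ((uint32_t)buf[7] << 24);
	out->name[0] = '\0';

	if (flags & 0x04) {  /* FEXTRA: 2-byte length, then payload */
		size_t xlen;
		if (len - pos < 2)
			return -1;
		xlen = (size_t)buf[pos] | ((size_t)buf[pos + 1] << 8);
		pos += 2;
		if (len - pos < xlen)
			return -1;
		pos += xlen;
	}
	if (flags & 0x08) {  /* FNAME: NUL-terminated original name */
		const unsigned char *end = memchr(buf + pos, 0, len - pos);
		size_t n;
		if (end == NULL)
			return -1;
		n = (size_t)(end - (buf + pos));
		if (n >= sizeof(out->name))
			n = sizeof(out->name) - 1;  /* truncate long names */
		memcpy(out->name, buf + pos, n);
		out->name[n] = '\0';
	}
	return 0;
}
```

The uncompressed size is not read here because it sits in the last four bytes of the member, which matches the reason given above for deferring it.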
Patrick Ohly [Mon, 24 Oct 2016 11:10:48 +0000 (13:10 +0200)]
test_option_n.c: cover non-recursive extract/list
Testing uses only listing because extraction uses the same code
paths. This also indirectly covers the new API call.
Some corner cases get special attention:
- archive where a file in a directory is present without the
directory
- the error when asking to extract a directory which is not
present
Martin Matuska [Sun, 14 Apr 2019 23:50:29 +0000 (01:50 +0200)]
Windows symlinks: new functions and extended tar header
New functions:
archive_entry_symlink_type()
archive_entry_set_symlink_type()
Supported value constants:
AE_SYMLINK_TYPE_UNDEFINED
AE_SYMLINK_TYPE_FILE
AE_SYMLINK_TYPE_DIRECTORY
New extended tar header:
LIBARCHIVE.symlinktype
The function archive_entry_symlink_type() retrieves and the function
archive_entry_set_symlink_type() sets the symbolic link type of an archive
entry. The information about the symbolic link type is required to properly
restore symbolic links on Microsoft Windows. If the symlink type is set
to AE_SYMLINK_TYPE_FILE or AE_SYMLINK_TYPE_DIRECTORY and a tar archive
is written, an extended tar header LIBARCHIVE.symlinktype is stored with
the value "file" or "dir". When reading symbolic links on Windows, the
link type is automatically stored in the archive_entry structure.
On Unix systems, the symlink type has no effect when reading or writing
symbolic links.
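The mapping between the symlink type and the header value described above could look roughly like the sketch below. The enum is a local stand-in for the AE_SYMLINK_TYPE_* constants (its numeric values are an assumption), and both function names are hypothetical rather than taken from libarchive.

```c
#include <stddef.h>
#include <string.h>

/* Local stand-ins for AE_SYMLINK_TYPE_*; the numeric values are assumed. */
enum symlink_type {
	SYMLINK_TYPE_UNDEFINED = 0,
	SYMLINK_TYPE_FILE = 1,
	SYMLINK_TYPE_DIRECTORY = 2
};

/*
 * Value to store in LIBARCHIVE.symlinktype, or NULL when no header
 * should be written (undefined type).
 */
static const char *
symlink_type_to_header(enum symlink_type t)
{
	switch (t) {
	case SYMLINK_TYPE_FILE:
		return "file";
	case SYMLINK_TYPE_DIRECTORY:
		return "dir";
	default:
		return NULL;
	}
}

/* Inverse mapping used when reading; unknown values become undefined. */
static enum symlink_type
symlink_type_from_header(const char *value)
{
	if (value != NULL && strcmp(value, "file") == 0)
		return SYMLINK_TYPE_FILE;
	if (value != NULL && strcmp(value, "dir") == 0)
		return SYMLINK_TYPE_DIRECTORY;
	return SYMLINK_TYPE_UNDEFINED;
}
```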
Patrick Ohly [Mon, 24 Oct 2016 10:54:48 +0000 (12:54 +0200)]
non-recursive extract and list
Sometimes it makes sense to extract or list a directory contained in
an archive without also doing the same for the content of the
directory, i.e. allowing -n (= --no-recursion) in combination with the
x and t modes.
bsdtar uses the match functionality in libarchive to track include
matches. A new libarchive API call,
archive_match_set_inclusion_recursion(), is introduced to
influence the matching behavior; the default behavior is unchanged.
Non-recursive matching can be achieved by anchoring the path match at
both start and end. Asking for a directory which itself isn't in the
archive when in non-recursive mode is an error and handled by the
existing mechanism for tracking unused inclusion entries.
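To illustrate the difference between the two modes, here is a simplified sketch that compares literal paths only. The real matcher works on patterns, and inclusion_matches() is a hypothetical helper, not part of the libarchive API.

```c
#include <string.h>

/*
 * Return 1 if `path` is selected by the literal inclusion `incl`.
 * recursive != 0: `incl` selects itself and everything below it.
 * recursive == 0: the match is anchored at both ends, so only the
 *                 exact entry is selected.
 */
static int
inclusion_matches(const char *incl, const char *path, int recursive)
{
	size_t n = strlen(incl);

	if (strncmp(incl, path, n) != 0)
		return 0;  /* anchored at the start in both modes */
	if (path[n] == '\0')
		return 1;  /* exact match */
	/* Only the recursive mode accepts a longer path below incl. */
	return recursive && path[n] == '/';
}
```

With this helper, "dir" selects "dir/file" only in recursive mode, and never selects an unrelated entry such as "dirty".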
Martin Matuska [Sat, 13 Apr 2019 19:51:03 +0000 (21:51 +0200)]
Windows symlink bugfixes and improvements
Treat targets ending with /. and /.. as directory symlinks
Explicitly test for file and directory symlinks
Improve debug output on test failure
Fix two memory allocations
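The first item can be pictured with a small suffix check. This is only a sketch: target_looks_like_directory() is a hypothetical name, and the actual code may handle further cases (for example other path separators) that are not described here.

```c
#include <string.h>

/* Return 1 if the symlink target ends with "/." or "/..", else 0. */
static int
target_looks_like_directory(const char *target)
{
	size_t n = strlen(target);

	if (n >= 2 && strcmp(target + n - 2, "/.") == 0)
		return 1;
	if (n >= 3 && strcmp(target + n - 3, "/..") == 0)
		return 1;
	return 0;
}
```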
Andrew Gierth [Sat, 30 Mar 2019 15:01:41 +0000 (15:01 +0000)]
Update tests for platforms on which success is impossible.
If a platform lacks O_EXEC and we get a failure when starting the
traverse from within an unreadable directory, then don't score that as
a failure, since with the current code it can never succeed. But if
O_EXEC exists, the failure still counts.
Andrew Gierth [Fri, 29 Mar 2019 08:22:46 +0000 (08:22 +0000)]
Fix bugs related to unreadable directories.
1. Don't try to open ".." for reading as part of the process of
ascending out of an initially specified directory; it's wrong, and if
the directory is not readable it causes a spurious error.
2. If opening "." initially for reading fails, then open it for
execute instead, if O_EXEC exists. This avoids spurious and unhelpful
failures when the current directory is not readable.
Add test cases for the above.
At least the first of these issues is ancient; it was reported against
FreeBSD in 2014.
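A sketch of the fallback described in item 2, assuming a POSIX environment. open_current_dir() is a hypothetical name; on platforms without O_EXEC the function simply reports the original failure.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Open "." so the traversal can later return to it with fchdir().
 * Try read access first; if that fails and the platform provides
 * O_EXEC, retry with execute (search) access instead.
 * Returns a file descriptor, or -1 on failure.
 */
static int
open_current_dir(void)
{
	int fd = open(".", O_RDONLY);

#if defined(O_EXEC)
	if (fd < 0)
		fd = open(".", O_EXEC);
#endif
	return fd;
}
```

The caller owns the returned descriptor and should close() it once the traversal is finished.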
Martin Matuska [Tue, 19 Mar 2019 16:17:51 +0000 (17:17 +0100)]
archive_entry.c: changes in file flags code
Use "undel" for FS_UNRM_FL file flag
Drop compat of UF_NOUNLINK and FS_UNRM_FL
Use "secdel" for FS_SECRM_FL and "journal-data" for FS_JOURNAL_DATA_FL
Martin Matuska [Wed, 27 Feb 2019 21:22:46 +0000 (22:22 +0100)]
Travis CI Windows fixes
- MS Visual Studio: use cmake's interface to build system
- disable Windows tests (test only the build) due to timeouts and failures
Patrick Cheng [Sun, 24 Feb 2019 19:32:06 +0000 (11:32 -0800)]
fix dereferencing null pointer in file_new()
file_new() sets file to NULL first.
When file_new() fails, file is left as NULL if it doesn't need to be freed,
so only free it when needed; otherwise we would dereference a null pointer.
Martin Matuska [Sun, 3 Feb 2019 22:47:42 +0000 (23:47 +0100)]
POSIX reader: more next_entry() fixes
- if not descending, fail if tree_current_lstat() returns ENOENT
- fix the "File removed before we read it" error message if processing multiple files at a time.
Markus Elfring [Mon, 4 Feb 2019 18:00:32 +0000 (19:00 +0100)]
Bug #1128: Deletion of unnecessary checks before calls of the function “archive_entry_free”
The function “archive_entry_free” only calls two functions, both of
which tolerate null pointers.
Callers therefore do not need to repeat the corresponding check.
This issue was fixed by using the software “Coccinelle 1.0.7”.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Markus Elfring [Mon, 4 Feb 2019 17:38:18 +0000 (18:38 +0100)]
Bug #1128: Deletion of unnecessary checks before calls of the function “free”
The function “free” is documented to perform no action when passed
a null pointer. Callers therefore do not need to repeat the
corresponding check.
https://stackoverflow.com/questions/18775608/free-a-null-pointer-anyway-or-check-first
This issue was fixed by using the software “Coccinelle 1.0.7”.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
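As a small illustration of the pattern these two commits remove, the sketch below releases a buffer without a preceding null check. struct buffer and buffer_release() are hypothetical names used only for this example.

```c
#include <stdlib.h>

/* Hypothetical owner of a heap allocation. */
struct buffer {
	char *data;
	size_t len;
};

/*
 * Release the buffer contents. No "if (b->data != NULL)" guard is
 * needed: the C standard defines free(NULL) as a no-op.
 */
static void
buffer_release(struct buffer *b)
{
	free(b->data);
	b->data = NULL;  /* makes a second release harmless as well */
	b->len = 0;
}
```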
RAR5 reader: Removed a memory leak in process_head_file
The process_head_file function was using memset() to clear the
archive_entry structure. The problem was that this structure could
contain pointers to allocated blocks of memory, and removing those
pointers with memset() resulted in a memory leak.
Switching to archive_entry_clear() still clears the structure, but
also releases any allocated memory blocks. This removes the memory
leak.
The commit also changes the way a temporary archive_entry instance is
being created when skipping a base block after block merge; instead of
directly creating a new instance on the stack, a constructor function
archive_entry_new() is used to ensure the new archive_entry instance is
not in an inconsistent state. This is needed because the fix described
in the first half of this commit message depends on the archive_entry
instance being in a consistent state due to the call of the
archive_entry_clear() function.
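The difference between the two ways of clearing can be shown with a stand-in type. struct entry and the entry_*() helpers below are hypothetical and much simpler than archive_entry; they only mirror the idea that clearing must release owned memory first, and that the object must come from a constructor so its pointers are valid.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for archive_entry: owns one heap string. */
struct entry {
	char *pathname;
	long size;
};

/* Constructor: every field starts out in a known (zeroed) state. */
static struct entry *
entry_new(void)
{
	return calloc(1, sizeof(struct entry));
}

/* Replace the owned pathname with a copy of `path`. */
static int
entry_set_pathname(struct entry *e, const char *path)
{
	size_t n = strlen(path) + 1;
	char *copy = malloc(n);

	if (copy == NULL)
		return -1;
	memcpy(copy, path, n);
	free(e->pathname);
	e->pathname = copy;
	return 0;
}

/*
 * Clear the entry. A bare memset() would drop e->pathname without
 * freeing it, which is the kind of leak described above.
 */
static void
entry_clear(struct entry *e)
{
	free(e->pathname);
	memset(e, 0, sizeof(*e));
}

static void
entry_free(struct entry *e)
{
	if (e != NULL) {
		entry_clear(e);
		free(e);
	}
}
```

Note that entry_clear() is only safe on an entry created by entry_new(); calling it on an uninitialized stack object would free a garbage pointer, which is why the constructor matters.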
RAR5 reader: Fixed a read from invalid memory block
In multi-file RAR5 archives, if a block spans from one file to another,
the RAR5 reader merges both blocks into one, and feeds this merged block
to the decompressor function. The problem is that the block merge
function allocates the exact number of bytes for this block. This is
problematic because when trying to read the last byte from this new
block with bit reader functions, the bit reader functions will reference
a few additional bytes right after the byte the caller is trying to read,
resulting in an out of bounds read.
The commit increases the allocation size for the new merged block. This
ensures that bit reader functions will never perform any out of bounds
reads. Additional space is zeroed out to prevent errors from
instrumentation tools like ASan or Valgrind.
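A sketch of the allocation change, under stated assumptions: READ_AHEAD_PAD is a made-up constant (the commit message does not say how many extra bytes are reserved) and merge_blocks() is a hypothetical helper, not the RAR5 reader's function.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed padding; the real amount is not stated in the commit. */
#define READ_AHEAD_PAD 8

/*
 * Concatenate two block fragments into one buffer that has
 * READ_AHEAD_PAD zeroed bytes after the payload, so a reader that
 * looks slightly past the last byte stays inside the allocation.
 * Returns NULL on overflow or allocation failure.
 */
static unsigned char *
merge_blocks(const unsigned char *a, size_t alen,
    const unsigned char *b, size_t blen)
{
	unsigned char *out;

	if (blen > SIZE_MAX - READ_AHEAD_PAD ||
	    alen > SIZE_MAX - READ_AHEAD_PAD - blen)
		return NULL;  /* total size would overflow size_t */

	out = malloc(alen + blen + READ_AHEAD_PAD);
	if (out == NULL)
		return NULL;
	memcpy(out, a, alen);
	memcpy(out + alen, b, blen);
	/* Zero the tail so tools like ASan or Valgrind see defined memory. */
	memset(out + alen + blen, 0, READ_AHEAD_PAD);
	return out;
}
```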
Daniel Axtens [Tue, 1 Jan 2019 05:01:40 +0000 (16:01 +1100)]
7zip: fix crash when parsing certain archives
Fuzzing with CRCs disabled revealed that a call to get_uncompressed_data()
would sometimes fail to return at least 'minimum' bytes. This can cause
the crc32() invocation in header_bytes to read off into invalid memory.
A specially crafted archive can use this to cause a crash.
An ASAN trace is below, but ASAN is not required - an uninstrumented
binary will also crash.
==7719==ERROR: AddressSanitizer: SEGV on unknown address 0x631000040000 (pc 0x7fbdb3b3ec1d bp 0x7ffe77a51310 sp 0x7ffe77a51150 T0)
==7719==The signal is caused by a READ memory access.
#0 0x7fbdb3b3ec1c in crc32_z (/lib/x86_64-linux-gnu/libz.so.1+0x2c1c)
#1 0x84f5eb in header_bytes (/tmp/libarchive/bsdtar+0x84f5eb)
#2 0x856156 in read_Header (/tmp/libarchive/bsdtar+0x856156)
#3 0x84e134 in slurp_central_directory (/tmp/libarchive/bsdtar+0x84e134)
#4 0x849690 in archive_read_format_7zip_read_header (/tmp/libarchive/bsdtar+0x849690)
#5 0x5713b7 in _archive_read_next_header2 (/tmp/libarchive/bsdtar+0x5713b7)
#6 0x570e63 in _archive_read_next_header (/tmp/libarchive/bsdtar+0x570e63)
#7 0x6f08bd in archive_read_next_header (/tmp/libarchive/bsdtar+0x6f08bd)
#8 0x52373f in read_archive (/tmp/libarchive/bsdtar+0x52373f)
#9 0x5257be in tar_mode_x (/tmp/libarchive/bsdtar+0x5257be)
#10 0x51daeb in main (/tmp/libarchive/bsdtar+0x51daeb)
#11 0x7fbdb27cab96 in __libc_start_main /build/glibc-OTsEL5/glibc-2.27/csu/../csu/libc-start.c:310
#12 0x41dd09 in _start (/tmp/libarchive/bsdtar+0x41dd09)
This was primarily done with afl and FairFuzz. Some early corpus entries
may have been generated by qsym.
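The general shape of such a guard is to compare the number of bytes actually available with the number requested before checksumming them. The sketch below is self-contained and uses a small bitwise CRC-32 instead of zlib; header_crc_checked() and crc32_update() are hypothetical names, not the functions in the 7zip reader.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320), as used by zlib. */
static uint32_t
crc32_update(uint32_t crc, const unsigned char *p, size_t n)
{
	crc = ~crc;
	while (n--) {
		crc ^= *p++;
		for (int k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
	}
	return ~crc;
}

/*
 * Fold `want` header bytes into *crc, but only if the decoder really
 * produced that many (`have`). Returns 0 on success, -1 on a short
 * read, in which case *crc is left untouched and nothing is read.
 */
static int
header_crc_checked(const unsigned char *buf, size_t have, size_t want,
    uint32_t *crc)
{
	if (have < want)
		return -1;  /* truncated data: do not read past `have` */
	*crc = crc32_update(*crc, buf, want);
	return 0;
}
```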
Daniel Axtens [Tue, 1 Jan 2019 06:10:49 +0000 (17:10 +1100)]
iso9660: Fail when expected Rockridge extension is missing
A corrupted or malicious ISO9660 image can cause read_CE() to loop
forever.
read_CE() calls parse_rockridge(), expecting a Rockridge extension
to be read. However, parse_rockridge() is structured as a while
loop starting with a sanity check, and if the sanity check fails
before the loop has run, the function returns ARCHIVE_OK without
advancing the position in the file. This causes read_CE() to retry
indefinitely.
Make parse_rockridge() return ARCHIVE_WARN if it didn't read an
extension. As someone with no real knowledge of the format, this
seems more apt than ARCHIVE_FATAL, but both the call-sites escalate
it to a fatal error immediately anyway.
Found with a combination of AFL, afl-rb (FairFuzz) and qsym.
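The control-flow problem and its fix can be sketched as follows. The entry layout is reduced to "two signature bytes, one length byte, one version byte", and PARSE_OK, PARSE_WARN and parse_extensions() are hypothetical stand-ins for ARCHIVE_OK, ARCHIVE_WARN and parse_rockridge(); the real parser checks more than this.

```c
#include <stddef.h>

#define PARSE_OK    0
#define PARSE_WARN  (-20)

/*
 * Walk extension entries in [p, end). Each entry is assumed to have a
 * 4-byte header whose third byte is the total entry length.
 * Returns PARSE_OK only if at least one entry was consumed; otherwise
 * PARSE_WARN, so a caller that loops until progress is made cannot
 * spin forever on input that fails the initial sanity check.
 */
static int
parse_extensions(const unsigned char *p, const unsigned char *end,
    size_t *consumed)
{
	const unsigned char *start = p;
	int entry_seen = 0;

	/* Sanity check: header present and declared length fits. */
	while (end - p >= 4 && p[2] >= 4 && p[2] <= end - p) {
		entry_seen = 1;
		p += p[2];
	}
	*consumed = (size_t)(p - start);
	return entry_seen ? PARSE_OK : PARSE_WARN;
}
```

Before the fix, the equivalent of the final line returned success unconditionally, which is what let the caller retry indefinitely.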