RAR5 reader: fix invalid type used for dictionary size mask.
This commit fixes places where the window_mask variable, which is needed
to perform operations on the dictionary circular buffer, was cast to
an int variable.
In files that declare a dictionary buffer size of 4GB, window_mask has a
value of 0xFFFFFFFF. If this value is assigned to an int variable, the
variable ends up holding -1. Any later conversion to a 64-bit type then
sign-extends it to 0xFFFFFFFFFFFFFFFF. This was happening during a read
operation from the dictionary. Such an invalid window_mask did not guard
against buffer underflow.
This commit should fix the OSSFuzz issue #14537.
The commit also contains a test case for this issue.
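
The sign-extension problem can be reproduced in isolation. The sketch
below is not the libarchive code; both helper names are hypothetical and
only illustrate how a mask stored in an int stops limiting a 64-bit
position, while a mask kept in an unsigned 64-bit variable still does.

```c
#include <stdint.h>

/* Hypothetical helper: mask a 64-bit position with a mask that was
 * stored in a signed int first (the buggy pattern). */
static uint64_t wrap_with_int_mask(uint64_t pos, uint32_t window_mask)
{
    /* Out-of-range conversion is implementation-defined; on common
     * two's complement platforms 0xFFFFFFFF becomes -1. */
    int mask = (int)window_mask;

    /* Converting -1 to uint64_t yields 0xFFFFFFFFFFFFFFFF, so the
     * mask no longer clears the upper 32 bits. */
    return pos & (uint64_t)mask;
}

/* Hypothetical helper: keep the mask unsigned and wide enough. */
static uint64_t wrap_with_wide_mask(uint64_t pos, uint32_t window_mask)
{
    uint64_t mask = window_mask; /* zero-extended */
    return pos & mask;
}
```

With a mask of 0xFFFFFFFF and a position above 4GB, the first helper
returns the position unchanged, while the second one wraps it back into
the window.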
RAR5 reader: handle a case with truncated huffman tables.
RAR5 reader assumed that the block contains full Huffman table data.
In invalid files that declare the existence of Huffman tables, but also
declare a block size too small to fit them, RAR5 reader was
interpreting memory beyond the allocated block.
The commit adds the necessary buffer overflow checks and makes the
Huffman table reading function fail when truncated data is detected.
The commit also provides a unit test for this case.
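
A minimal sketch of this kind of check, assuming the table data is
copied out of a block of known size. The function name and parameters
are hypothetical and do not mirror the reader's internals.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy table_len bytes of table data that start
 * at `offset` inside a block of block_size bytes. Returns 0 on success
 * and -1 if the block is too small to hold the requested data. */
static int read_table_bytes(const uint8_t *block, size_t block_size,
    size_t offset, uint8_t *dst, size_t table_len)
{
    /* Written as a subtraction so the check itself cannot overflow. */
    if (offset > block_size || table_len > block_size - offset)
        return -1;

    memcpy(dst, block + offset, table_len);
    return 0;
}
```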
This commit fixes some undefined shift-left operations where the shift
count was too large for the operand type. These invalid shift
operations were triggered by invalid files produced by fuzzing.
The commit also contains two unit tests that ensure such problems won't
arise in the future.
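
For illustration only: in C, shifting a 32-bit value by 32 or more bits
is undefined behavior, so a shift count taken from file data has to be
validated first. The helper below is a hypothetical sketch of such a
guard, not the change made in the commit.

```c
#include <stdint.h>

/* Hypothetical helper: shift left only when the count is valid for a
 * 32-bit operand. Returns 0 on success, -1 for an invalid count. */
static int checked_shl32(uint32_t value, unsigned count, uint32_t *out)
{
    if (count >= 32)
        return -1; /* value << count would be undefined */

    *out = value << count;
    return 0;
}
```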
RAR5 reader: fix buffer overflow when parsing huffman tables.
RAR5 compresses its Huffman tables by using an algorithm similar to Run
Length Encoding. During decompression of those tables, RAR5 reader
didn't perform enough checks to prevent a buffer overflow in some
cases.
This commit adds an additional check that prevents the buffer overflow
in such files.
The commit also adds a unit test to guard against regression of this
issue.
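
As a rough sketch of the idea, expanding a run during RLE-style
decoding should refuse to write past the end of the destination table.
The function below is hypothetical and much simpler than the real table
decoder.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: append `count` copies of `value` to `table`
 * starting at *pos. Returns -1 without writing anything if the run
 * would not fit into table_size entries. */
static int fill_run(uint8_t *table, size_t table_size, size_t *pos,
    uint8_t value, size_t count)
{
    size_t i;

    if (*pos > table_size || count > table_size - *pos)
        return -1;

    for (i = 0; i < count; i++)
        table[(*pos)++] = value;
    return 0;
}
```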
This commit fixes a memory leak which is triggered by invalid files.
Sample test case that triggers the leak is provided by OSSFuzz #14470.
If the ZIPX file contains an LZMA stream, and this stream is invalid,
the reader was allocating an LZMA decoding context which wasn't freed.
Later, when trying to unpack another LZMA stream, the context was
re-initialized by allocating a new context and overwriting the old
pointers to the unfreed memory, causing a memory leak.
After applying this commit, the LZMA stream context initialization
function checks whether a previous context is still in use and has not
been freed. If so, the reader frees that memory before allocating a
new LZMA unpacking context.
The commit also contains a test case with OSSFuzz sample #14470.
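
The free-before-reinitialize pattern can be sketched without liblzma.
The structure and function names below are hypothetical stand-ins for
the reader's LZMA context handling.

```c
#include <stdlib.h>

/* Hypothetical decoding context; `valid` records whether `stream`
 * currently owns an allocation. */
struct decoder_ctx {
    int valid;
    void *stream;
};

static void ctx_free(struct decoder_ctx *ctx)
{
    if (ctx->valid) {
        free(ctx->stream);
        ctx->stream = NULL;
        ctx->valid = 0;
    }
}

/* Release any previous context before allocating a new one, so the
 * old pointer is never overwritten while it still owns memory. */
static int ctx_init(struct decoder_ctx *ctx, size_t size)
{
    ctx_free(ctx);

    ctx->stream = calloc(1, size);
    if (ctx->stream == NULL)
        return -1;
    ctx->valid = 1;
    return 0;
}
```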
RAR5 reader: add support for 'version' extra field and ignore unknown fields.
This commit adds support for the VERSION extra field appended to the
FILE base block. This field adds version support for files inside the
archive. If the file name is 'abc' and its version is 15, libarchive
will unpack this file as 'abc;15'. Changing the file name is needed
because the archive can contain multiple files with the same name and
different versions. So that the user is not confused about which file
is which, RAR5 reader changes the name.
This commit also contains a unit test for VERSION extra field support.
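
The naming rule from the 'abc;15' example can be sketched as follows.
The helper is hypothetical; it only shows the "name;version" format and
is not how the reader builds the entry path.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write "<name>;<version>" into dst. Returns 0 on
 * success, -1 if the result does not fit into dst_size bytes. */
static int versioned_name(char *dst, size_t dst_size, const char *name,
    unsigned long version)
{
    int n = snprintf(dst, dst_size, "%s;%lu", name, version);

    if (n < 0 || (size_t)n >= dst_size)
        return -1;
    return 0;
}
```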
Another change this commit introduces is ignoring unknown extra
fields. Before this commit, RAR5 reader failed to unpack the file if an
unknown field was encountered. But since the reader knows the unknown
field's size, it can skip the field and proceed with parsing the
structure. After applying this commit, RAR5 reader skips and ignores
unknown fields.
Unknown fields that are skipped include fields in FILE's extra header,
as well as unsupported REDIR types.
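
Skipping by size can be sketched with a simplified record layout. The
layout here (one type byte, one size byte, then the payload) and the
type constant are assumptions made for the example; the real RAR5
format encodes these values differently.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_FIELD_VERSION 0x04 /* hypothetical type id */

/* Walk records of the form [type:1][size:1][payload:size]. Known
 * fields are counted, unknown ones are skipped using their size.
 * Returns the number of known fields, or -1 on truncated input. */
static int count_known_fields(const uint8_t *buf, size_t len)
{
    size_t pos = 0;
    int known = 0;

    while (pos < len) {
        uint8_t type, size;

        if (len - pos < 2)
            return -1; /* truncated record header */
        type = buf[pos];
        size = buf[pos + 1];
        pos += 2;

        if (size > len - pos)
            return -1; /* payload runs past the buffer */
        if (type == SKETCH_FIELD_VERSION)
            known++;
        /* Any other type is unknown: skip it instead of failing. */
        pos += size;
    }
    return known;
}
```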
RAR5 reader: fix ASan errors, fix OSSFuzz samples, add a unit test
This commit fixes errors reported by ASan, as well as fixes runtime
behavior of RAR5 reader on OSSFuzz sample files:
#12999, #13029, #13144, #13478, #13490
The root cause is that the merge_block() function was sometimes called
recursively. The function must not be used this way, because a
recursive call overwrites the global state that the function relies on.
The commit therefore ensures the function is not called recursively.
There is also one fix that changes some tabs to spaces, because whole
file originally used space indentation.
Mike Frysinger [Mon, 27 Mar 2017 00:29:34 +0000 (20:29 -0400)]
support reading metadata from compressed files
The raw format provides very little metadata. Allow filters to pass
back state that they know about. With gzip, we know the original file
name, mtime, and file size. For now, we only pull out the first two
as those are available in the file header. The latter is in the file
trailer, so we'll have to add support for that later (if we can seek
the input).
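
For reference, the gzip header layout (RFC 1952) places MTIME at bytes
4..7 in little-endian order and, when the FNAME flag is set, the
original file name as a NUL-terminated string after the optional extra
field. The parser below is an independent sketch of that layout with a
hypothetical function name; it is not the filter code from this commit.

```c
#include <stddef.h>
#include <stdint.h>

#define GZ_FEXTRA 0x04
#define GZ_FNAME  0x08

/* Hypothetical helper: extract mtime and the original name from a gzip
 * member header. *name is set to NULL when no name is stored.
 * Returns 0 on success, -1 on a malformed or truncated header. */
static int gzip_header_info(const uint8_t *h, size_t len,
    uint32_t *mtime, const char **name)
{
    size_t pos = 10; /* fixed part of the header */

    if (len < 10 || h[0] != 0x1f || h[1] != 0x8b)
        return -1;

    *mtime = (uint32_t)h[4] | ((uint32_t)h[5] << 8) |
        ((uint32_t)h[6] << 16) | ((uint32_t)h[7] << 24);
    *name = NULL;

    if (h[3] & GZ_FEXTRA) {
        size_t xlen;

        if (len - pos < 2)
            return -1;
        xlen = (size_t)h[pos] | ((size_t)h[pos + 1] << 8);
        pos += 2;
        if (xlen > len - pos)
            return -1;
        pos += xlen;
    }
    if (h[3] & GZ_FNAME) {
        size_t end = pos;

        while (end < len && h[end] != 0)
            end++;
        if (end == len)
            return -1; /* name is not terminated */
        *name = (const char *)(h + pos);
    }
    return 0;
}
```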
RAR5 reader: invalid window buffer read in E8E9 filter
The E8E9 filter was accessing the window buffer with a direct memory
read. But since the window buffer is a circular buffer, some of its data
can wrap from the end of the buffer to its beginning. This means that
the window buffer must always be accessed through a reading function
that is aware that the buffer is circular.
The commit changes the direct memory read to an access through the
circular_memcpy() function.
This fixes edge cases where the E8E9 filter data (4 bytes) is split
between the end of the window buffer and its beginning. This situation
can happen in archives compressed with a small dictionary size.
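
A wrap-aware read can be sketched as two copies. This is a hypothetical
stand-in, not the reader's circular_memcpy(); it assumes the window
size is a power of two and that len does not exceed the window size.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy len bytes starting at `start` out of a
 * circular window, continuing from index 0 when the end is reached.
 * Assumes window_size is a power of two and len <= window_size. */
static void circular_read(uint8_t *dst, const uint8_t *window,
    size_t window_size, size_t start, size_t len)
{
    size_t mask = window_size - 1;
    size_t first;

    start &= mask;
    first = window_size - start; /* bytes available before the wrap */
    if (first > len)
        first = len;

    memcpy(dst, window + start, first);
    memcpy(dst + first, window, len - first);
}
```

A 4-byte read that starts 2 bytes before the end of the window takes 2
bytes from the tail and 2 bytes from the head of the buffer.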
Patrick Ohly [Mon, 24 Oct 2016 11:10:48 +0000 (13:10 +0200)]
test_option_n.c: cover non-recursive extract/list
Testing uses only listing because extraction uses the same code
paths. This also indirectly covers the new API call.
Some corner cases get special attention:
- archive where a file in a directory is present without the
directory
- the error when asking to extract a directory which is not
present
Martin Matuska [Sun, 14 Apr 2019 23:50:29 +0000 (01:50 +0200)]
Windows symlinks: new functions and extended tar header
New functions:
archive_entry_symlink_type()
archive_entry_set_symlink_type()
Supported value constants:
AE_SYMLINK_TYPE_UNDEFINED
AE_SYMLINK_TYPE_FILE
AE_SYMLINK_TYPE_DIRECTORY
New extended tar header:
LIBARCHIVE.symlinktype
The function archive_entry_symlink_type() retrieves and the function
archive_entry_set_symlink_type() sets the symbolic link type of an archive
entry. The information about the symbolic link type is required to properly
restore symbolic links on Microsoft Windows. If the symlink type is set
to AE_SYMLINK_TYPE_FILE or AE_SYMLINK_TYPE_DIRECTORY and a tar archive
is written, an extended tar header LIBARCHIVE.symlinktype is stored with
the value "file" or "dir". When reading symbolic links on Windows, the
link type is automatically stored in the archive_entry structure.
On Unix systems, the symlink type has no effect when reading or writing
symbolic links.
Patrick Ohly [Mon, 24 Oct 2016 10:54:48 +0000 (12:54 +0200)]
non-recursive extract and list
Sometimes it makes sense to extract or list a directory contained in
an archive without also doing the same for the content of the
directory, i.e. allowing -n (= --no-recursion) in combination with the
x and t modes.
bsdtar uses the match functionality in libarchive to track include
matches. A new libarchive API call
archive_match_set_inclusion_recursion() gets introduced to
influence the matching behavior, with the default behavior as before.
Non-recursive matching can be achieved by anchoring the path match at
both start and end. Asking for a directory which itself isn't in the
archive when in non-recursive mode is an error and handled by the
existing mechanism for tracking unused inclusion entries.
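
The effect of anchoring the match at both ends can be illustrated with
a plain string comparison. This sketch is hypothetical: it ignores
wildcards and does not use the libarchive match API. It only shows the
difference between matching a directory together with its content and
matching the directory entry alone.

```c
#include <string.h>

/* Hypothetical helper: decide whether `path` is selected by the
 * inclusion `pattern`. With recursive != 0 the pattern also selects
 * everything below it; with recursive == 0 the match is anchored at
 * both start and end, so only the exact entry is selected. */
static int path_included(const char *pattern, const char *path,
    int recursive)
{
    size_t n = strlen(pattern);

    if (strncmp(pattern, path, n) != 0)
        return 0;
    if (path[n] == '\0')
        return 1; /* exact match */
    return recursive && path[n] == '/';
}
```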
Martin Matuska [Sat, 13 Apr 2019 19:51:03 +0000 (21:51 +0200)]
Windows symlink bugfixes and improvements
Treat targets ending with /. and /.. as directory symlinks
Explicitly test for file and directory symlinks
Improve debug output on test failure
Fix two memory allocations
Andrew Gierth [Sat, 30 Mar 2019 15:01:41 +0000 (15:01 +0000)]
Update tests for platforms on which success is impossible.
If a platform lacks O_EXEC and we get a failure when starting the
traverse from within an unreadable directory, then don't score that as
a failure, since with the current code it can never succeed. But if
O_EXEC exists, the failure still counts.
Andrew Gierth [Fri, 29 Mar 2019 08:22:46 +0000 (08:22 +0000)]
Fix bugs related to unreadable directories.
1. Don't try to open ".." for reading as part of the process of
ascending out of an initially specified directory; it's wrong, and if
the directory is not readable it causes a spurious error.
2. If opening "." initially for reading fails, then open it for
execute instead, if O_EXEC exists. This avoids spurious and unhelpful
failures when the current directory is not readable.
Add test cases for the above.
At least the first of these issues is ancient; it was reported against
FreeBSD in 2014.
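
The fallback described in item 2 can be sketched on a POSIX system as
follows. The function name is hypothetical, and the O_EXEC branch is
only compiled on platforms that define that flag.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open a directory for reading and, if that
 * fails and the platform provides O_EXEC, retry with O_EXEC so an
 * unreadable but searchable directory can still be used.
 * Returns a file descriptor, or -1 on failure. */
static int open_dir_fd(const char *path)
{
    int fd = open(path, O_RDONLY);

#ifdef O_EXEC
    if (fd < 0)
        fd = open(path, O_EXEC);
#endif
    return fd;
}
```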
Martin Matuska [Tue, 19 Mar 2019 16:17:51 +0000 (17:17 +0100)]
archive_entry.c: changes in file flags code
Use "undel" for FS_UNRM_FL file flag
Drop compat of UF_NOUNLINK and FS_UNRM_FL
Use "secdel" for FS_SECRM_FL and "journal-data" for FS_JOURNAL_DATA_FL
Martin Matuska [Wed, 27 Feb 2019 21:22:46 +0000 (22:22 +0100)]
Travis CI Windows fixes
- MS Visual Studio: use cmake's interface to build system
- disable Windows tests (test only the build) due to timeout and fail issues
Patrick Cheng [Sun, 24 Feb 2019 19:32:06 +0000 (11:32 -0800)]
fix dereferencing null pointer in file_new()
file_new() sets file to NULL first.
When file_new() fails, file is left as NULL if it doesn't need to be
freed. So only free it when needed; otherwise a null pointer would be
dereferenced.
Martin Matuska [Sun, 3 Feb 2019 22:47:42 +0000 (23:47 +0100)]
POSIX reader: more next_entry() fixes
- if not descending, fail if tree_current_lstat() returns ENOENT
- fix the "File removed before we read it" error message if processing multiple files at a time.