Martin Matuska [Sat, 28 Dec 2019 21:58:08 +0000 (22:58 +0100)]
Fix a possible heap-buffer-overflow in archive_string_append_from_wcs()
When we grow the archive_string buffer, we have to make sure it fits
at least one maximum-sized multibyte character in the current locale
and the null character.
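The growth rule above can be sketched as follows. This is a minimal illustration, not the actual libarchive code: `struct strbuf`, `strbuf_ensure()` and `strbuf_append_wcs()` are hypothetical names, and only the standard `wcrtomb()` and `MB_CUR_MAX` are assumed. Before each conversion the buffer is grown to hold one maximum-sized multibyte character plus the null character.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical growable byte string; not libarchive's struct archive_string. */
struct strbuf {
	char	*s;
	size_t	 length;	/* bytes in use, excluding the terminator */
	size_t	 capacity;	/* bytes allocated */
};

/* Grow the buffer so that it holds at least `need` bytes. */
static int
strbuf_ensure(struct strbuf *b, size_t need)
{
	size_t cap = b->capacity ? b->capacity : 16;
	char *p;

	if (need <= b->capacity)
		return (0);
	while (cap < need)
		cap *= 2;	/* overflow checks omitted in this sketch */
	p = realloc(b->s, cap);
	if (p == NULL)
		return (-1);
	b->s = p;
	b->capacity = cap;
	return (0);
}

/* Append up to n wide characters, converted using the current locale. */
static int
strbuf_append_wcs(struct strbuf *b, const wchar_t *w, size_t n)
{
	mbstate_t st;
	size_t r;

	assert(b != NULL && w != NULL);
	memset(&st, 0, sizeof(st));
	for (; n > 0 && *w != L'\0'; w++, n--) {
		/* Room for one maximum-sized character and the null character. */
		if (strbuf_ensure(b, b->length + MB_CUR_MAX + 1) != 0)
			return (-1);
		r = wcrtomb(b->s + b->length, *w, &st);
		if (r == (size_t)-1)
			return (-1);	/* character not representable */
		b->length += r;
	}
	if (strbuf_ensure(b, b->length + 1) != 0)
		return (-1);
	b->s[b->length] = '\0';
	return (0);
}
```

Checking the capacity before every wcrtomb() call keeps the write in bounds regardless of how many bytes the character actually needs.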
ZIP reader: support LZMA_STREAM_END marker in 'lzma alone' files
It appears that ZIPX files with type 14 stream ('lzma alone') can
contain LZMA_STREAM_END markers at the end of the stream. The ZIP reader
was displaying an "unknown error 1" status after encountering such a
marker.
The fix handles this case, and the reader no longer returns an error
status. Thus it should be possible to unpack files that contain the
LZMA_STREAM_END marker at the end of the stream.
RAR5 reader: verify window size for multivolume archives
RAR5 archives can contain files that span multiple .rar files. If the
archive contains a big file that doesn't fit in the first .rar file, then this
file is continued in another .rar file.
In this case, the RAR compressor first emits the FILE base block for this big
file in the first .rar file. Then it finishes the first .rar file and creates a
new .rar file. In this new file, it emits the continuation FILE block that
marks the start of the continuation data for the rest of the huge file.
The problem was that the RAR5 reader didn't ignore the window size declaration
when parsing the continuation FILE base block. A malicious file could
declare a different window size inside the continuation base block than was
declared in the primary FILE base block in the previous volume. The window size
from the continuation block was applied, but the actual window buffer was not
reallocated. This resulted in a potential SIGSEGV error, since boundary checks
for accessing the window buffer were working incorrectly (the window size
variable didn't match the actual window buffer size).
The commit fixes the issue by ignoring the window size declaration in the
continuation FILE base block when switching volumes.
The commit also contains a test case and OSSFuzz sample #19509.
When the initial archive open for write fails, explicitly free filters.
This provides a defense-in-depth against programming errors due to the
partial state. Based on a report from Airbus Security - Vulnerability
Management.
Martin Matuska [Thu, 21 Nov 2019 02:08:40 +0000 (03:08 +0100)]
Bugfix and optimize archive_wstring_append_from_mbs()
The call to mbrtowc() or mbtowc() should read up to mbs_length
bytes and not wcs_length. This avoids out-of-bounds reads.
mbrtowc() and mbtowc() return (size_t)-1 with errno EILSEQ when
they encounter an invalid multibyte character and (size_t)-2 when
they encounter an incomplete multibyte character. As we return
failure and all our callers error out, it makes no sense to continue
parsing mbs.
As we allocate `len` wchars at the beginning and each wchar has
at least one byte, there will never be need to grow the buffer,
so the code can be left out. On the other hand, we are always
allocating more memory than we need.
As long as wcs_length == mbs_length == len we can omit wcs_length.
We keep the old code commented if we decide to save memory and
use autoexpanding wcs_length in the future.
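A minimal sketch of the conversion loop described above, assuming only the standard mbrtowc(). The helper name `mbs_to_wcs()` and its parameters are hypothetical and do not match the libarchive function. The read is bounded by the remaining input bytes, and the loop stops at the first invalid or incomplete character.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper: convert at most mbs_len bytes of mbs into out,
 * which has room for out_cap wide characters including the terminator.
 * Returns the number of wide characters written, or (size_t)-1 on failure.
 */
static size_t
mbs_to_wcs(wchar_t *out, size_t out_cap, const char *mbs, size_t mbs_len)
{
	mbstate_t st;
	size_t n = 0, r;

	assert(out != NULL && mbs != NULL);
	if (out_cap == 0)
		return ((size_t)-1);
	memset(&st, 0, sizeof(st));
	while (mbs_len > 0 && *mbs != '\0') {
		if (n + 1 >= out_cap)
			return ((size_t)-1);	/* no room left for the terminator */
		/* Bound the read by the remaining input bytes. */
		r = mbrtowc(&out[n], mbs, mbs_len, &st);
		if (r == (size_t)-1 || r == (size_t)-2)
			return ((size_t)-1);	/* invalid or incomplete: stop parsing */
		if (r == 0)
			break;			/* converted a null character */
		mbs += r;
		mbs_len -= r;
		n++;
	}
	out[n] = L'\0';
	return (n);
}
```

Since every wide character consumes at least one input byte, an output buffer of mbs_len + 1 wide characters is always sufficient, which is why no growth step appears in the loop.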
Claybird [Mon, 14 Oct 2019 13:32:54 +0000 (22:32 +0900)]
This adds UNICODE filename support for lha.
The latest lha format supports UNICODE filenames in its content, using extended headers (EXT_UTF16_FILENAME and EXT_UTF16_DIRECTORY).
However, libarchive currently ignores them.
This modification handles these extensions.
Daniel Verkamp [Fri, 4 Oct 2019 19:31:32 +0000 (12:31 -0700)]
Fix sparse file offset overflow on 32-bit systems
On architectures where ssize_t is 32 bits but file offsets are 64 bits
(such as 32-bit Linux with _FILE_OFFSET_BITS=64), the POSIX disk reader
would incorrectly skip large sparse regions due to a 32-bit integer
overflow in _archive_read_data_block(). This can result in the reader
failing with "Encountered out-of-order sparse blocks", since the
overflowed value is interpreted as a signed number and added to the
current offset.
The bytes variable was used to store the difference between two 64-bit
integers, but bytes is a ssize_t. Since this value of bytes was not
used after the block handling sparse offsets (it is always overwritten
in the block below), replace it with an int64_t sparse_bytes variable
that can always represent the difference without truncation.
Signed-off-by: Daniel Verkamp <dverkamp@chromium.org>
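The arithmetic behind the overflow can be sketched as follows. The function names are hypothetical and model only the offset computation, not the disk reader itself. An int32_t stands in for a 32-bit ssize_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; models the arithmetic only, not libarchive's code. */

/* Old behaviour on a 32-bit ssize_t target: the gap is narrowed to 32 bits. */
static int64_t
next_offset_narrow(int64_t current_offset, int64_t sparse_offset)
{
	int32_t bytes;	/* stands in for a 32-bit ssize_t */

	assert(sparse_offset >= current_offset);
	/* Out-of-range conversion is implementation-defined; it typically wraps. */
	bytes = (int32_t)(sparse_offset - current_offset);
	return (current_offset + bytes);
}

/* Fixed behaviour: the gap is kept in a 64-bit variable. */
static int64_t
next_offset_wide(int64_t current_offset, int64_t sparse_offset)
{
	int64_t sparse_bytes;

	assert(sparse_offset >= current_offset);
	sparse_bytes = sparse_offset - current_offset;
	return (current_offset + sparse_bytes);
}
```

For a 3 GiB hole starting at offset 0, the wide version lands on the start of the next data block, while the narrow version cannot represent the gap and ends up at a wrong offset.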
RAR5 archives can contain files compressed independently of each other,
and files that share a common window buffer, that is, files which are
compressed using the 'solid' method. In the latter case, all files
are required to use the same window buffer, so the window size should also
be the same.
OSSFuzz sample #15482 declares a different window size for multiple
solid files. The RAR5 reader doesn't reallocate the window buffer when
decompressing solid files, so it was possible to perform an
out-of-bounds read by declaring two solid files, where the second solid
file declared a window size parameter that was bigger than the window size
used in the first solid file.
This commit introduces additional checks to ensure all solid files are
using the same window size.
The commit also adds a test case using OSSFuzz sample #15482 to hunt
down regressions in the future.
Some other test cases had to be adjusted as well, because other OSSFuzz
samples were also declaring different window sizes for solid files. So
this commit has changed the error reporting for those invalid sample files.
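The kind of check described above might look like the following sketch. The structure and function names are hypothetical, and the real reader's condition may differ; the sketch simply rejects a solid file whose declared window size does not match the buffer that is already allocated.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state; not the RAR5 reader's actual structure. */
struct window_state {
	size_t	window_size;	/* size of the allocated window buffer */
	int	have_window;	/* non-zero once a buffer is allocated */
};

/* Returns 0 if the declared size is acceptable, -1 if it must be rejected. */
static int
accept_window_size(struct window_state *st, size_t declared, int is_solid)
{
	assert(st != NULL);
	if (is_solid && st->have_window) {
		/* Solid files reuse the existing buffer: sizes must match. */
		if (declared != st->window_size)
			return (-1);
		return (0);
	}
	/* Non-solid file: a real reader would (re)allocate the buffer here. */
	st->window_size = declared;
	st->have_window = 1;
	return (0);
}
```

Rejecting the mismatch up front keeps the size variable and the real buffer size identical, so later bounds checks remain meaningful.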
Minor corrections to the formatting of manual page.
Found with mandoc -Tlint; fixing the following messages:
WARNING: bad NAME section content
WARNING: missing comma before name
WARNING: new sentence, new line
WARNING: parenthesis in function name
WARNING: skipping no-space macro
WARNING: skipping paragraph macro
WARNING: unusual Xr order
WARNING: unusual Xr punctuation
STYLE: no blank before trailing delimiter
STYLE: possible typo in section name
STYLE: trailing delimiter
STYLE: whitespace at end of input line
For the meaning of the messages, see:
https://man.openbsd.org/mandoc#DIAGNOSTICS
Raise compression on the second test to level=20, and perform a
third at level=1. Expect the output archive sizes to line up
based on compression level. Reduces test susceptibility to small
output size variations from different libzstd releases.
Dmitry Torokhov [Tue, 25 Jun 2019 22:17:52 +0000 (15:17 -0700)]
archive_read_next_header2: clean old entry data
We need to clean old entry data in archive_read_next_header2 in Windows
and Posix disk readers to ensure consistent results. One possible
failure mode: sparse data from the previous entry is carried over to
next non-sparse file entry, causing it to be mishandled.
Dmitry Torokhov [Tue, 25 Jun 2019 17:09:44 +0000 (10:09 -0700)]
archive_read: fix handling of sparse files
If a file ends with a sparse "hole" that is larger than the buffer supplied
to archive_read(), then archive_read() will return prematurely because
archive_read_data_block() will return ARCHIVE_EOF as there is no more
"real" data. We can fix that by not trying to refill the data buffer until
we exhaust the hole range.
RAR5 reader: fix ARM filter going beyond window buffer boundary
RAR5 uses filters in order to mutate data just before compression, to
achieve a better compression ratio. After decompression, this mutation
needs to be reversed by processing various filters that the compressor
uses.
One such filter is an ARM executable file filter, which changes some
bytes in the input stream if the stream is recognized as an executable
file with ARM native code.
This commit fixes the situation in which the decompressor using an ARM
filter was referencing a byte outside the current window buffer. Such an
access is invalid and can produce segmentation faults.
This commit also adds a test using OSSFuzz sample #15431.
RAR5 reader: window_mask was not updated correctly
The `window_mask` variable should always be in sync with the
`window_size` variable.
The commit fixes a bug in which there was one place where `window_size`
was modified, but `window_mask` wasn't updated. This was leading to a
SIGSEGV error, because with a wrong `window_mask`, the RAR5 reader was
accessing memory outside the current window buffer.
The commit also adds a test for this issue, together with OSSFuzz
sample #15278.
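One way to keep the two variables in sync is to update them only through a single helper, as in this sketch. The names are hypothetical, and it assumes the window size is a power of two with the mask equal to the size minus one, which is the usual arrangement for a circular window buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state; assumes a power-of-two window size. */
struct window {
	size_t	window_size;
	size_t	window_mask;	/* always window_size - 1 */
};

/* The only place where the size changes, so the mask cannot go stale. */
static void
window_set_size(struct window *w, size_t size)
{
	assert(w != NULL);
	assert(size != 0 && (size & (size - 1)) == 0);
	w->window_size = size;
	w->window_mask = size - 1;
}

/* Map a running stream position to an index inside the window buffer. */
static size_t
window_index(const struct window *w, size_t pos)
{
	return (pos & w->window_mask);
}
```

With a stale mask from a larger window, window_index() could return an index past the end of a smaller buffer, which matches the failure described in the commit message.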
Martin Matuska [Sat, 15 Jun 2019 20:32:35 +0000 (22:32 +0200)]
RAR reader: extend fix use after free
If read_data_compressed() returns ARCHIVE_FAILED, the caller is allowed
to continue with next archive headers. In addition to
rar->start_new_table=1 we need to set rar->ppmd_valid=0.
Martin Matuska [Mon, 3 Jun 2019 21:33:49 +0000 (23:33 +0200)]
Minor bsdtar.1 manpage fixes
- the -p option does not restore owner by default.
- the -n option was listed twice
- file flags are called file attributes on Linux and are platform-specific
Mike Frysinger [Wed, 22 May 2019 04:04:35 +0000 (09:49 +0545)]
simplify gitignore a bit
Let's ignore autotool generated files (.la .dirstamp .deps) everywhere
rather than hardcoding specific subdirs. We'll never add files with
those names to the source repo, so that should be OK.
We're already ignoring CMakeFiles/ everywhere (since the rule lacks
a leading / anchor), so we can delete the redundant paths.
Rather than hardcode every possible unittest and related files, add
globs that ignore all *_test related paths in the topdir. We won't
be adding paths like that to the source repo, so it should be OK.