Martin Matuska [Mon, 20 Jan 2020 10:41:18 +0000 (11:41 +0100)]
Implement ARCHIVE_EXTRACT_SAFE_WRITES on Windows
As Windows does not support atomic rename/_wrename
we have to unlink the file right before calling _wrename
leaving a short unsafe window where a different file with the
same name can be created and make the rename operation fail.
zoulasc [Mon, 20 Jan 2020 12:20:42 +0000 (13:20 +0100)]
Introduce archive_write_disk(3) flag ARCHIVE_EXTRACT_SAFE_WRITES
This flag changes the way that regular files are extracted:
Instead of removing existing files first and re-creating them in
order to replace their contents, a temporary file is created and
when writing to the temporary file is completed, the file is
rename(2)d to the final destination name.
This has the effect of presenting a consistent view of the file to
the system (either the file with the new contents or the file with
the old contents). Removing and overwriting the file has the
undesired side effect that the the system can either not see the
file at all (from the time it is being removed till the time it is
being re-created), or worse it can see partial file contents. This
is problematic when extracting system files (for example shared
libraries).
If the existing file that is going to be overwritten is a hard link,
for now we unlink it before calling rename(2). This can be done
correctly by creating a hardlink to a tmpnam(3) generated file
and then use rename(2), but that is fairly intrusive and requires
refactoring.
archive_read_support_format_rar.c:
Fix invalid nested loop break.
Comment out dead code sections.
Simplify size comparsions of lensymbol and offsymbol.
Martin Matuska [Sat, 28 Dec 2019 21:58:08 +0000 (22:58 +0100)]
Fix a possible heap-buffer-overflow in archive_string_append_from_wcs()
When we grow the archive_string buffer, we have to make sure it fits
at least one maximum-sized multibyte character in the current locale
and the null character.
ZIP reader: support LZMA_STREAM_END marker in 'lzma alone' files
It appears that ZIPX files with type 14 stream ('lzma alone') can
contain LZMA_STREAM_END markers at the end of the stream. The ZIP reader
was displaying an "unknown error 1" status after encountering such
marker.
The fix handles such case, and the reader doesn't return error status
anymore. Thus it should be possible to unpack files that contain the
LZMA_STREAM_END marker at the end of the stream.
RAR5 reader: verify window size for multivolume archives
RAR5 archives can contain files that span across multiple .rar files. If the
archive contains a big file that doesn't fit to first .rar file, then this file
is continued in another .rar file.
In this case, the RAR compressor first emits the FILE base block for this big
file in the first .rar file. Then, it finishes first .rar file, and creates the
new .rar file. In this new file, it emits the continuation FILE block that
marks start of the continuation data for the rest of the huge file.
The problem was that the RAR5 reader didn't ignore the window size declaration
when parsing through the continuation FILE base block. The malicious file could
declare a different window size inside the continuation base block than was
declared in the primary FILE base block in the previous volume. The window size
from continuation block was applied, but the actual window buffer was not
reallocated. This resulted in a potential SIGSEGV error, since bounary checks
for accessing the window buffer were working incorrectly (the window size
variable didn't match the actual window buffer size).
The commit fixes the issue by ignoring the window size declaration in the
continuation FILE base block when switching volumes.
The commit also contains a test case and OSSFuzz sample #19509.
When the initial archive open for write fails, explicitly free filters.
This provides a defense-in-depth against programming errors due to the
partial state. Based on a report from Airbus Security - Vulnerability
Management.
Martin Matuska [Thu, 21 Nov 2019 02:08:40 +0000 (03:08 +0100)]
Bugfix and optimize archive_wstring_append_from_mbs()
The cal to mbrtowc() or mbtowc() should read up to mbs_length
bytes and not wcs_length. This avoids out-of-bounds reads.
mbrtowc() and mbtowc() return (size_t)-1 wit errno EILSEQ when
they encounter an invalid multibyte character and (size_t)-2 when
they they encounter an incomplete multibyte character. As we return
failure and all our callers error out it makes no sense to continue
parsing mbs.
As we allocate `len` wchars at the beginning and each wchar has
at least one byte, there will never be need to grow the buffer,
so the code can be left out. On the other hand, we are always
allocatng more memory than we need.
As long as wcs_length == mbs_length == len we can omit wcs_length.
We keep the old code commented if we decide to save memory and
use autoexpanding wcs_length in the future.
Claybird [Mon, 14 Oct 2019 13:32:54 +0000 (22:32 +0900)]
This adds UNICODE filename support for lha.
The lastest lha format supports UNICODE filenames on its content, using extended headers(EXT_UTF16_FILENAME and EXT_UTF16_DIRECTORY).
However, currently libarchive ignores them.
This modification is to handle these extensions.
Daniel Verkamp [Fri, 4 Oct 2019 19:31:32 +0000 (12:31 -0700)]
Fix sparse file offset overflow on 32-bit systems
On architectures where ssize_t is 32 bits but file offsets are 64 bits
(such as 32-bit Linux with _FILE_OFFSET_BITS=64), the POSIX disk reader
would incorrectly skip large sparse regions due to a 32-bit integer
overflow in _archive_read_data_block(). This can result in the reader
failing with "Encountered out-of-order sparse blocks", since the
overflowed value is interpreted as a signed number and added to the
current offset.
The bytes variable was used to store the difference between two 64-bit
integers, but bytes is a ssize_t. Since this value of bytes was not
used after the block handling sparse offsets (it is always overwritten
in the block below), replace it with an int64_t sparse_bytes variable
that can always represent the difference without truncation.
Signed-off-by: Daniel Verkamp <dverkamp@chromium.org>
RAR5 archives can contain files compressed independently of each other,
and files that share a common window buffer, so files which are
compressed using 'solid' method. In the latter case, all files
are required to use the same window buffer, so window size should also
be the same.
OSSFuzz sample #15482 declares a different window size for multiple
solid files. RAR5 reader doesn't reallocate window buffer when
decompressing solid files, so it was possible to perform an
out-of-bounds read by declaring two solid files, where the second solid
file declared the window size parameter that was bigger than window size
used in first solid file.
This commit introduces additional checks to ensure all solid files are
using the same window size.
The commit also adds a test case using OSSFuzz sample #15482 to hunt
down regressions in the future.
Some other test cases had to be adjusted as well, because other OSSFuzz
samples were also declaring different window sizes for solid files. So
this commit has changed the error reporting for those invalid sample files.
Minor corrections to the formatting of manual page.
Found with mandoc -Tlint; fixing the following messages:
WARNING: bad NAME section content
WARNING: missing comma before name
WARNING: new sentence, new line
WARNING: parenthesis in function name
WARNING: skipping no-space macro
WARNING: skipping paragraph macro
WARNING: unusual Xr order
WARNING: unusual Xr punctuation
STYLE: no blank before trailing delimiter
STYLE: possible typo in section name
STYLE: trailing delimiter
STYLE: whitespace at end of input line
For the meaning of the messages, see:
https://man.openbsd.org/mandoc#DIAGNOSTICS
Raise compression on the second test to level=20, and perform a
third at level=1. Expect the output archive sizes to line up
based on compression level. Reduces test susceptibility to small
output size variations from different libzstd releases.