Brad King [Thu, 6 Feb 2020 19:28:02 +0000 (14:28 -0500)]
Fix possible heap-buffer-overflow in archive_string_append_from_wcs on Windows
Fix `archive_string_append_from_wcs_in_codepage` to account for the
already-used portion of the buffer when computing the size of the
remaining buffer for `WideCharToMultiByte` output.
RAR5 reader: reject files that declare invalid header flags
One of the fields in RAR5's base block structure is the size of the
header. Some invalid files declare a 0 header size setting, which can
confuse the unpacker. Minimum header size for RAR5 base blocks is 7
bytes (4 bytes for CRC, and 3 bytes for the rest), so block size of 0
bytes should be rejected at header parsing stage.
The fix adds an error condition if header size of 0 bytes is detected.
In this case, the unpacker will not attempt to unpack the file, as the
header is corrupted.
The commit also adds OSSFuzz #20459 sample to test further regressions
in this area.
Martin Matuska [Thu, 30 Jan 2020 23:30:34 +0000 (00:30 +0100)]
Update filter and format options in manual pages.
The list of format and filter options in bsdtar(1) is incomplete.
Add reference to manual pages archive_write_set_options(3) and
archive_read_set_options(3) for a complete list of supported options.
Brad King [Thu, 30 Jan 2020 14:49:22 +0000 (09:49 -0500)]
libarchive: Fix detection of 'major' on Solaris 11.4
In `archive_pack_dev.c` there is code checking the `HAVE_MAJOR` macro,
but it is not computed. Port the equivalent logic from
`archive_entry.c` to define the macro.
Martin Matuska [Mon, 20 Jan 2020 10:41:18 +0000 (11:41 +0100)]
Implement ARCHIVE_EXTRACT_SAFE_WRITES on Windows
As Windows does not support atomic rename/_wrename
we have to unlink the file right before calling _wrename
leaving a short unsafe window where a different file with the
same name can be created and make the rename operation fail.
zoulasc [Mon, 20 Jan 2020 12:20:42 +0000 (13:20 +0100)]
Introduce archive_write_disk(3) flag ARCHIVE_EXTRACT_SAFE_WRITES
This flag changes the way that regular files are extracted:
Instead of removing existing files first and re-creating them in
order to replace their contents, a temporary file is created and
when writing to the temporary file is completed, the file is
rename(2)d to the final destination name.
This has the effect of presenting a consistent view of the file to
the system (either the file with the new contents or the file with
the old contents). Removing and overwriting the file has the
undesired side effect that the the system can either not see the
file at all (from the time it is being removed till the time it is
being re-created), or worse it can see partial file contents. This
is problematic when extracting system files (for example shared
libraries).
If the existing file that is going to be overwritten is a hard link,
for now we unlink it before calling rename(2). This can be done
correctly by creating a hardlink to a tmpnam(3) generated file
and then use rename(2), but that is fairly intrusive and requires
refactoring.
archive_read_support_format_rar.c:
Fix invalid nested loop break.
Comment out dead code sections.
Simplify size comparsions of lensymbol and offsymbol.
Martin Matuska [Sat, 28 Dec 2019 21:58:08 +0000 (22:58 +0100)]
Fix a possible heap-buffer-overflow in archive_string_append_from_wcs()
When we grow the archive_string buffer, we have to make sure it fits
at least one maximum-sized multibyte character in the current locale
and the null character.
ZIP reader: support LZMA_STREAM_END marker in 'lzma alone' files
It appears that ZIPX files with type 14 stream ('lzma alone') can
contain LZMA_STREAM_END markers at the end of the stream. The ZIP reader
was displaying an "unknown error 1" status after encountering such
marker.
The fix handles such case, and the reader doesn't return error status
anymore. Thus it should be possible to unpack files that contain the
LZMA_STREAM_END marker at the end of the stream.
RAR5 reader: verify window size for multivolume archives
RAR5 archives can contain files that span across multiple .rar files. If the
archive contains a big file that doesn't fit to first .rar file, then this file
is continued in another .rar file.
In this case, the RAR compressor first emits the FILE base block for this big
file in the first .rar file. Then, it finishes first .rar file, and creates the
new .rar file. In this new file, it emits the continuation FILE block that
marks start of the continuation data for the rest of the huge file.
The problem was that the RAR5 reader didn't ignore the window size declaration
when parsing through the continuation FILE base block. The malicious file could
declare a different window size inside the continuation base block than was
declared in the primary FILE base block in the previous volume. The window size
from continuation block was applied, but the actual window buffer was not
reallocated. This resulted in a potential SIGSEGV error, since bounary checks
for accessing the window buffer were working incorrectly (the window size
variable didn't match the actual window buffer size).
The commit fixes the issue by ignoring the window size declaration in the
continuation FILE base block when switching volumes.
The commit also contains a test case and OSSFuzz sample #19509.
When the initial archive open for write fails, explicitly free filters.
This provides a defense-in-depth against programming errors due to the
partial state. Based on a report from Airbus Security - Vulnerability
Management.
Martin Matuska [Thu, 21 Nov 2019 02:08:40 +0000 (03:08 +0100)]
Bugfix and optimize archive_wstring_append_from_mbs()
The cal to mbrtowc() or mbtowc() should read up to mbs_length
bytes and not wcs_length. This avoids out-of-bounds reads.
mbrtowc() and mbtowc() return (size_t)-1 wit errno EILSEQ when
they encounter an invalid multibyte character and (size_t)-2 when
they they encounter an incomplete multibyte character. As we return
failure and all our callers error out it makes no sense to continue
parsing mbs.
As we allocate `len` wchars at the beginning and each wchar has
at least one byte, there will never be need to grow the buffer,
so the code can be left out. On the other hand, we are always
allocatng more memory than we need.
As long as wcs_length == mbs_length == len we can omit wcs_length.
We keep the old code commented if we decide to save memory and
use autoexpanding wcs_length in the future.