Merge pull request #3116 from stoeckmann/hardening
This PR does not fix any reachable issue, but fixes the code in question nonetheless to prevent regressions in the future:
- Do not call `archive_copy_error` after `archive_read_free` to prevent a user after free bug
- Reset `vtable` to `NULL` to prevent `close` from being called after filter initialization error, since `data` is already freed and set to `NULL`, preventing a `NULL` pointer dereference
If a system with sizeof(wchar_t)=2 (e.g. Cygwin) tries to convert a wide
character string into a multi byte string representation, it
precalculates the required length with sizeof(wchar_t) instead of
MB_LEN_MAX. This can lead to short memory allocation for filenames which
have a shorter representation in wchar_t than in UTF-8.
A system with sizeof(wchar_t)=2 (Cygwin on Windows) can trigger an out
of boundary write in archive_read_open_filenames_w when converting the
wide character string into a multi byte string.
The current finite state machine carefully handles short reads, i.e. the
loop can enter as often as needed until enough bytes arrive for the
current state to perform its actions.
This can be simplified by relying on __archive_filter_read_ahead to
return the amount of bytes actually needed. I assume that this did not
happen in the original code due to its age (2009) and evolution of
libarchive's internals over time.
Also, headers are only skipped at the beginning. As soon as the reader
starts returning data (ST_ARCHIVE reached), the filter pretty much
becomes a pass-through filter.
Split the initial lead and header skipping into its own function and
only keep track if the initial skipping was performed or not. This
greatly simplifies the reader function.
Also, it avoids book keeping of internal states and "total_in" tracking,
which I don't have to properly audit for edge cases anymore.
Last but not least, this refactoring properly reports truncated streams
now.
00redbeer [Sun, 7 Jun 2026 12:23:25 +0000 (14:23 +0200)]
rar5: fix integer underflow in bytes_remaining
A malformed RAR5 archive with data_size=1 forces bytes_remaining
(ssize_t) to wrap to -2 when a compressed block header consumes
to_skip=3 bytes (CWE-191). That negative value is then implicitly
cast to size_t ~0 inside malloc(), requesting a ~16-exabyte
allocation — confirmed heap buffer overflow via ASAN/UBSan on a
48-byte crafted archive requiring no authentication.
Three guards added to archive_read_support_format_rar5.c:
1. Reject data_size > SSIZE_MAX before assigning to bytes_remaining
(CWE-195, unsafe unsigned-to-signed conversion)
2. Reject to_skip > bytes_remaining in process_block() before the
subtraction — this is the primary fix for the underflow (CWE-191)
3. Change cur_block_size == 0 to cur_block_size <= 0 in merge_block()
as defense-in-depth so that any negative bytes_remaining reaching
read_ahead() is caught before it becomes a malloc size (CWE-122)
00redbeer [Sun, 7 Jun 2026 12:14:34 +0000 (14:14 +0200)]
rar5: check integer overflow in bytes_remaining
A malformed RAR5 archive with data_size=1 forces bytes_remaining
(ssize_t) to wrap to -2 when a compressed block header consumes
to_skip=3 bytes (CWE-191). That negative value is then implicitly
cast to size_t ~0 inside malloc(), requesting a ~16-exabyte
allocation — confirmed heap buffer overflow via ASAN/UBSan on a
48-byte crafted archive requiring no authentication.
Reproducer: 48-byte crafted RAR5 archive; ASAN confirms
"allocation-size-too-big 0xfffffffffffffffe".
If vtable is not set to NULL, close function would be called during
shutdown. Since data is already freed and set to NULL, this would lead
to a NULL pointer dereference later on.
The called library functions should never fail though, so this is a
purely defensive measure against future lzma changes.
If archive_read_next_header in add_pattern_from_file would ever return
anything but ARCHIVE_OK or ARCHIVE_EOF, a use after free would occur
when copying error information.
Since this is impossible with current setup (format raw without any
further filter, thus only open_filename code), this change is a purely
defensive measure against future changes.
If the original name cannot be duplicated, return ARCHIVE_FAILED instead
of ARCHIVE_WARN. The latter implies that the option is unknown, which is
not the case.
All arithmetical operations are unsigned, and it makes sense to keep it
unsigned: The total_in value is written at the end of the stream and if
the value overflows, it's pretty much expected to be % UINT32_MAX.
Very unlikely that int64_t will ever overflow, but the fix is cheap.
The CreateSymbolicLinkW function is available since 0x0600 and is also
part of the Nano Server APIs. On earlier systems, don't even try.
Otherwise use it directly to simplify code.
- Use a stack array for 22 bytes
- Entering the if-branch already implies that we will add data
- Use snprintf instead of strcpy
Even though snprintf is slower than strcpy, it's easier to verify and
since nobody complained so far about the malloc overhead, this should be
okay (for now).
As a bonus, this code cannot fail anymore, which previously meant that
file attributes were silently ignored.
windows: Define WINAPI_FAMILY_PARTITION if missing
Define the macro WINAPI_FAMILY_PARTITION if it's missing, which can
happen on old Windows versions like Windows XP. Also add other missing
definitions to support compilation.
windows: Fix concurrency in __archive_create_child
Use proper memory barrier while checking for availability of function
WaitForInputIdle.
Since Vista, InitOnceExecuteOnce is a very simple idiom to handle such a
singleton setup. For earlier setups, accept a possible data race and
recover gracefully if two concurrent threads performed the initial
setup.
It is possible to trigger an out of boundary write on 32 bit systems
with around 1 GB of data (with a line consuming most of that data) when
opened with archive_read_open_memory.
Cap the amount of data read at once at 2 * UUENCODE_BID_MAX_READ to
allow range checks to take place before a possible SSIZE_MAX overflow
can occur through avail_in. Also, discard any line longer than
UUENCODE_BID_MAX_READ since this should definitely be more than
enough, especially since in_cnt check already takes care of that.
Merge pull request #3102 from DHowett/bug/7z-build
7zip: fix a number of issues in zstd detection
- -Wunused-function when ZSTD_compressStream is unavailable
- Incorrect automatic selection of 7Z_ZSTD when ZSTD_compressStream is unavailable
- Other instances of HAVE_ZSTD_H not matching HAVE_LIBZSTD
7zip: only fall back to 7Z_ZSTD if we can actually use zstd
Without this fix, the 7zip writer will fall back to zstd (when it is the
last available option) even if it could not be linked, then fail at
runtime with an unexpected error message.