Tim Kientzle [Sat, 25 Apr 2026 18:58:17 +0000 (14:58 -0400)]
[7zip] Sanity-check the number of files
We allocate space early on to support the advertised number of
files. A malicious archive can set a nonsensical value here to exhaust
memory. This adds a check comparing the number of files to the number
of streams and the size of the total header.
Note that the just-added test does not actually fail without this.
The existing code recovers if the allocation fails, which it typically
will. The new check tightens the limit so that we reject nonsensical
file counts and avoid problems from large memory allocations.
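As a rough illustration of the kind of plausibility check described
above, the sketch below rejects a file count that cannot be backed by
the streams and header bytes actually present. The helper name
file_count_is_plausible and the exact bound (every file beyond the
stream count must cost at least one header byte) are assumptions made
for this example, not the rule used in the 7zip reader.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper. Assumption: a file that has no data stream
 * still needs at least one byte of header to describe it.
 */
static bool
file_count_is_plausible(uint64_t num_files, uint64_t num_streams,
    uint64_t header_bytes)
{
	/* Files fully covered by streams are always plausible. */
	if (num_files <= num_streams)
		return true;
	/* The subtraction cannot underflow because of the test above. */
	return (num_files - num_streams) <= header_bytes;
}
```

Running such a check before the allocation means a nonsensical count
is rejected without ever asking the allocator for the memory.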
François Degros [Fri, 24 Apr 2026 07:34:10 +0000 (17:34 +1000)]
Add tests for appending various filters before archive open
Extend test coverage to ensure all supported filters can be appended
to an archive reader before it is opened, matching the behavior
required to fix #2514.
François Degros [Fri, 24 Apr 2026 07:09:13 +0000 (17:09 +1000)]
Fix SIGSEGV in compress filter when appended before open
Calling archive_read_append_filter(a, ARCHIVE_FILTER_COMPRESS) would
previously trigger a crash because compress_bidder_init() attempted to
read header bits from the upstream filter immediately. If the archive
was not yet opened (common when setting up filters), the upstream filter
state was not ready for reading.
This commit defers the header reading and decompressor initialization
until the first read operation (lazy initialization), consistent with
other filter implementations in libarchive.
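The lazy-initialization pattern described above can be sketched as
follows. This is a self-contained model, not the compress filter
itself: struct upstream, struct lazy_filter and both functions are
hypothetical names, and the two-byte header is an assumption chosen
only to keep the example small.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; none of these are libarchive types. */
struct upstream {
	const unsigned char *data;
	size_t len;
	size_t pos;
	int opened;		/* set once the archive has been opened */
};

struct lazy_filter {
	struct upstream *up;
	int header_read;	/* 0 until the first successful read */
	unsigned char header[2];
};

/* Init only records state; it must not touch the upstream data. */
static void
lazy_filter_init(struct lazy_filter *f, struct upstream *up)
{
	f->up = up;
	f->header_read = 0;
	memset(f->header, 0, sizeof(f->header));
}

/* Returns bytes copied, or -1 if upstream is not ready or too short. */
static long
lazy_filter_read(struct lazy_filter *f, unsigned char *out, size_t n)
{
	struct upstream *up = f->up;
	size_t avail;

	if (!up->opened)
		return -1;
	if (!f->header_read) {
		/* Deferred step: consume the header on first use. */
		if (up->len - up->pos < sizeof(f->header))
			return -1;
		memcpy(f->header, up->data + up->pos, sizeof(f->header));
		up->pos += sizeof(f->header);
		f->header_read = 1;
	}
	avail = up->len - up->pos;
	if (n > avail)
		n = avail;
	memcpy(out, up->data + up->pos, n);
	up->pos += n;
	return (long)n;
}
```

Because lazy_filter_init() never dereferences the upstream data, it is
safe to call while the upstream is still unopened.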
cab reader: Fix use of uninitialized values from Huffman table
Initialize the Huffman table to invalid values, which doesn't otherwise
affect the computation but avoids use of uninitialized values upon
extraction of some archives (as reported by `valgrind`).
libarchive: fix Windows compilation with ENABLE_CNG=OFF
Currently, libarchive_{random,util}.c use a couple of bcrypt functions
regardless of whether HAVE_BCRYPT_H is defined as there are no other
implementations for Windows, but the actual <bcrypt.h> header is
included only under this macro.
To be able to build libarchive with ENABLE_CNG=OFF (for example, to
prefer a different crypto/digest engine) on Windows, don't guard
the include in these two files. In that case, bcrypt will still be
used, but only as an RNG.
This won't break anything because, as mentioned above, bcrypt is
used unconditionally here and if it's not present in the system,
the library won't build either way, with or without the change.
At least until we implement an RNG for Windows based on something
else.
Signed-off-by: Alexander Lobakin <alobakin@mailbox.org>
The function isofile_gen_utility_names could resolve ".." directory
entries in such a way that dirname starts with "../". If this happens,
the while-loop fails to detect it because it skips ahead until the
cursor reaches the next slash.
Fix this by also taking "../" at the beginning into account. Such an
entry can happen if "../../" points before the top directory.
The isofile_gen_utility_names function normalizes directories, including
dot dot directory entries. If such an entry has multiple slashes and leads
to the top directory, then the new path erroneously becomes absolute.
Skip multiple slashes.
If rp is not NULL, then it points to a slash already. Take this into
account to unify the rp and dirname cases a bit more.
Resolving paths like "dir/../filename" to "filename" can lead
to a strcpy call with overlapping memory. Use memmove instead,
which already happens at times in isofile_gen_utility_names.
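To make the overlap concrete: collapsing "dir/../filename" in place
copies the tail of a buffer onto an earlier part of the same buffer,
which strcpy does not permit but memmove does. The helper remove_span
below is a hypothetical name used only for this sketch; it is not part
of isofile_gen_utility_names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Remove `count` bytes starting at offset `off` from the
 * NUL-terminated string `s`, shifting the tail left in place.
 * Source and destination overlap, so memmove is required.
 */
static void
remove_span(char *s, size_t off, size_t count)
{
	size_t len = strlen(s);

	assert(off <= len && count <= len - off);
	/* The +1 also moves the terminating NUL. */
	memmove(s + off, s + off + count, len - off - count + 1);
}
```

For example, removing the first seven bytes of "dir/../filename"
leaves "filename" in the same buffer.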
Benjamin Gilbert [Sun, 19 Apr 2026 04:05:06 +0000 (23:05 -0500)]
Have `make distcheck` verify CMake build succeeds
There have been multiple instances of test cases being added to the CMake
build but not the Autotools one, thus omitting them from the released dist
tarball. Prevent this by testing the CMake build during `make distcheck`.
Remove an #include controlled by a preprocessor symbol that nothing
defines. I'm not sure if this has ever been needed, or what for, but
it serves no purpose today.
Tim Kientzle [Tue, 14 Apr 2026 02:38:07 +0000 (19:38 -0700)]
Fix a double-free in the link resolver
The link resolver is a helper utility that tracks linked
entries so they can be correctly restored. Clients add link information
to the link resolver and incrementally query it to correctly
link entries as they are restored to disk. The link resolver
incrementally releases entries as they are consumed in order
to minimize memory usage.
The `archive_entry_linkresolver_free()` method cleans up
by repeatedly querying the cache and freeing each entry.
But this conflicted with the incremental clean up,
leading to double-frees of leftover items.
The easy fix here is to have `archive_entry_linkresolver_free()`
just repeatedly query the list without trying to free, relying
on the incremental clean up mechanism.
Credit: tianshuo han reported the issue and suggested the fix.
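The ownership rule behind this fix can be modelled in a few lines. The
sketch below is a simplified stand-in, not the link resolver's real
data structure: struct resolver, resolver_next and resolver_drain are
hypothetical names, and the assumption is that each query frees the
entry handed out by the previous query.

```c
#include <stdlib.h>

struct node {
	struct node *next;
	int id;
};

struct resolver {
	struct node *head;	/* entries not yet handed out */
	struct node *spare;	/* last entry handed out */
	int frees;		/* counts free() calls, for illustration */
};

static int
resolver_add(struct resolver *r, int id)
{
	struct node *n = malloc(sizeof(*n));

	if (n == NULL)
		return -1;
	n->id = id;
	n->next = r->head;
	r->head = n;
	return 0;
}

/* Incremental cleanup: release the previous entry, hand out the next. */
static struct node *
resolver_next(struct resolver *r)
{
	struct node *n;

	if (r->spare != NULL) {
		free(r->spare);
		r->spare = NULL;
		r->frees++;
	}
	n = r->head;
	if (n != NULL) {
		r->head = n->next;
		r->spare = n;
	}
	return n;
}

/* Teardown: only query; the query path already owns the freeing. */
static void
resolver_drain(struct resolver *r)
{
	while (resolver_next(r) != NULL)
		;	/* calling free() here would free each entry twice */
}
```

In this model every entry is freed exactly once, by the query that
follows the one that returned it; the final query that returns NULL
releases the last entry.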
elhananhaenel [Thu, 19 Mar 2026 14:43:29 +0000 (16:43 +0200)]
Add regression test for zisofs 32-bit heap overflow
A crafted ISO with pz_log2_bs=2 and pz_uncompressed_size=0xFFFFFFF9
causes an integer overflow in the block pointer allocation in
zisofs_read_data(). On 32-bit, (ceil+1)*4 wraps size_t to 0, malloc(0)
returns a tiny buffer, and the code writes ~4GB past it.
The pz_log2_bs validation fix prevents this. Add a regression test with
a crafted 48KB ISO that triggers the overflow on unfixed 32-bit builds.
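The arithmetic can be checked by hand. With pz_log2_bs=2 the block size
is 4, so the block count is ceil(0xFFFFFFF9 / 4) = 0x3FFFFFFF, and
(0x3FFFFFFF + 1) * 4 = 0x100000000, which does not fit in 32 bits and
wraps to 0. The sketch below models a 32-bit size_t with uint32_t; the
function name is hypothetical and the code is not taken from
zisofs_read_data().

```c
#include <stdint.h>

/*
 * Model of the block pointer table size, (ceil + 1) * 4, as a
 * 32-bit size_t would compute it. The math is done in 64 bits and
 * then truncated, which gives the same result as 32-bit wraparound.
 */
static uint32_t
block_pointer_bytes_32(uint32_t uncompressed_size, unsigned log2_bs)
{
	uint64_t bs = (uint64_t)1 << log2_bs;
	uint64_t ceil_blocks =
	    ((uint64_t)uncompressed_size + bs - 1) >> log2_bs;

	return (uint32_t)((ceil_blocks + 1) * 4);
}
```

With the values from the crafted ISO the function returns 0, which is
the size the unfixed 32-bit code would pass to malloc().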
libarchive/ppmd8: mark the remaining functions static
Those 9 are not used anywhere outside the file (the actual
functionality is exported as a callback structure).
Make them static for slightly better compiler optimization
opportunities and, more importantly, to avoid symbol conflicts
when static linking libarchive and any library which uses
the original Ppmd*.c from the LZMA SDK (like minizip-ng).
Also remove a couple of declarations and macros not used
anywhere at all while we're here.
Signed-off-by: Alexander Lobakin <alobakin@mailbox.org>
The anchor characters ^ and $ have special meaning only if they are
located at the beginning (^) or at the end ($) of the pattern. And even
then they are supposed to be special only if flags are set.
If they are located within the pattern itself, they are regular
characters regardless of flags.
Periods were removed from error messages only in Windows-specific code,
not in the POSIX counterpart, so the test fails on Windows but not on
POSIX systems.
Fix this by removing the period in the test and in the POSIX error
messages.
Fixes: 3e0819b59e ("libarchive: Remove period from error messages")
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
Calling archive_set_error with a message and errno already indicates
that a failure occurred. Only a minority of functions did that: Unify
with the rest.
The error messages are mostly written without a period. This makes
sense, because they can be accompanied by a strerror(errno) string
giving more information, most likely appended after a colon.
The buff variable is only used in entry_to_archive. Moving it into the
specific code block where it is actually used reduces its visibility and
thus makes it easier to read the code.
Since Windows indeed uses unsigned for read, this makes it much easier
to verify that buff never grows and cannot be too large.
The API allows setting int64_t uid/gid values. When writing pax archives,
such large values are properly stored in base256 in the USTAR header of
the actual data, i.e. everything works.
The pax header entries might be missing though because the check
truncates these values to unsigned int. Larger values could be truncated
in a way that they seem smaller than (1 << 18).
The check in line 1427 which sets the uid/gid values into the PAX header
block is correct, truncating the actual value to the maximum octal
representation.
This is a purely defensive change to support parsers which actually
allow such large uid/gid values but do not understand base256 encoding.
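A small example shows how the truncation hides a large value. The two
helpers below are hypothetical and only model the comparison against
(1 << 18) described above; the example also assumes a 32-bit unsigned
int, which holds on common platforms but is not guaranteed by C.

```c
#include <stdbool.h>
#include <stdint.h>

/* Truncating form: the cast drops the upper 32 bits before comparing. */
static bool
needs_pax_id_truncating(int64_t id)
{
	return (unsigned int)id >= (1u << 18);
}

/* Full-width form: compares the complete 64-bit value. */
static bool
needs_pax_id(int64_t id)
{
	return id >= ((int64_t)1 << 18);
}
```

For id = 2^32 + 5 the truncating form sees 5 and reports that no pax
entry is needed, while the full-width form correctly reports that one
is.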
If a filter option is recognized but its value is invalid, return
ARCHIVE_FAILED instead of ARCHIVE_WARN. The latter is used for unknown
options, e.g. at the end of the option setter functions.
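The distinction can be sketched with a toy option setter. The function
and the "compression-level" key are illustrative assumptions, and the
SKETCH_* constants are local stand-ins for ARCHIVE_OK, ARCHIVE_WARN and
ARCHIVE_FAILED so that the example compiles without archive.h.

```c
#include <stddef.h>
#include <string.h>

/* Local stand-ins for the libarchive return codes. */
enum { SKETCH_OK = 0, SKETCH_WARN = -20, SKETCH_FAILED = -25 };

/* Hypothetical setter that accepts a single digit 0-9 as the level. */
static int
set_option(int *level, const char *key, const char *value)
{
	if (strcmp(key, "compression-level") == 0) {
		/* Known option, bad value: a hard failure. */
		if (value == NULL || value[0] < '0' || value[0] > '9' ||
		    value[1] != '\0')
			return SKETCH_FAILED;
		*level = value[0] - '0';
		return SKETCH_OK;
	}
	/* Unknown option: warn so another handler may still claim it. */
	return SKETCH_WARN;
}
```

A caller can then tell "this option is not mine" apart from "this
option is mine, but the value is wrong".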
The buffer has a fixed 1024 limit, which can be smaller than the maximum
allowed path length. Increasing the limit to 4096 would partially help,
but since leading spaces are stripped, input lines could be valid and
longer.