The archive_utility_string_sort function won't be part of the 4.0.0 API
anymore. No users were found and such a task should be done outside of
the library.
The utility function "archive_utility_string_sort" is a custom qsort
implementation. Since qsort is specified in C11 and POSIX.1-2008,
which libarchive is based on, use the system's qsort directly.
The function is not used directly in libarchive, so this is a good
way to save around 500 bytes in the resulting library without breaking
compatibility for any user of this function (none found).
Also allows more than UINT_MAX entries which previously were limited
by data type and (way earlier) due to recursion.
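As an illustration of relying on the system's qsort, here is a minimal sketch that sorts a NULL-terminated array of C strings. The names `cmp_strings` and `sort_strings` are hypothetical and are not taken from libarchive.

```c
#include <stdlib.h>
#include <string.h>

/* Comparator for qsort: each element is a pointer to a C string. */
int
cmp_strings(const void *a, const void *b)
{
	const char *const *sa = a;
	const char *const *sb = b;

	return strcmp(*sa, *sb);
}

/* Sort a NULL-terminated array of strings in place (hypothetical helper). */
void
sort_strings(char **strings)
{
	size_t n = 0;

	/* Count entries with size_t, so the count is not capped at UINT_MAX. */
	while (strings[n] != NULL)
		n++;
	qsort(strings, n, sizeof(strings[0]), cmp_strings);
}
```

Counting with size_t mirrors the point above about no longer being limited by the data type.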
Test cases already get a C locale, which is sufficient for this test.
If LC_TIME was not previously set, the used en_US.UTF-8 would stay
as an environment variable, possibly affecting other test cases.
Since en_US.UTF-8 is not guaranteed to be available, C is a better
choice.
Reset the current locale settings through setlocale, and also reset all
environment variables that might affect test cases which spawn
children through systemf. Those children, e.g. bsdtar, call setlocale
on their own.
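A rough sketch of such a reset on a POSIX system follows. The function name `reset_locale_to_c` and the exact list of variables are assumptions for illustration. The actual test harness may handle a different set.

```c
#include <locale.h>
#include <stdlib.h>

/*
 * Force the "C" locale for this process and for children that read the
 * environment. The variable list is an assumption, not libarchive's.
 * Returns 0 on success, -1 on failure.
 */
int
reset_locale_to_c(void)
{
	static const char *const vars[] = {
		"LANG", "LC_ALL", "LC_COLLATE", "LC_CTYPE", "LC_MESSAGES",
		"LC_MONETARY", "LC_NUMERIC", "LC_TIME"
	};
	size_t i;

	for (i = 0; i < sizeof(vars) / sizeof(vars[0]); i++) {
		/* setenv is POSIX; overwrite any inherited value. */
		if (setenv(vars[i], "C", 1) != 0)
			return -1;
	}
	/* Apply the same choice to the running process. */
	return setlocale(LC_ALL, "C") != NULL ? 0 : -1;
}
```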
Explicitly use goto to turn a recursive call into an iterative one.
Most compilers do this on their own with default settings, but MSVC
with default settings would create a binary which actually performs
recursive calls.
Fixes call stack overflow in binaries compiled with low optimization.
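The technique can be sketched on a small quicksort in which the second recursive call is replaced by a goto. This is a generic example, not the libarchive code that was changed. `sort_ints` is a hypothetical name.

```c
#include <stddef.h>

/* Sort v[0..n-1] in ascending order. */
void
sort_ints(int *v, size_t n)
{
	size_t i, store;
	int pivot, tmp;

tail_call:
	if (n < 2)
		return;

	/* Lomuto partition around the last element. */
	pivot = v[n - 1];
	store = 0;
	for (i = 0; i + 1 < n; i++) {
		if (v[i] < pivot) {
			tmp = v[i];
			v[i] = v[store];
			v[store] = tmp;
			store++;
		}
	}
	tmp = v[store];
	v[store] = v[n - 1];
	v[n - 1] = tmp;

	/* The left part is still handled by a real recursive call. */
	sort_ints(v, store);

	/* Was: sort_ints(v + store + 1, n - store - 1); now iterative. */
	v += store + 1;
	n -= store + 1;
	goto tail_call;
}
```

With the explicit jump, the right-hand part no longer adds a stack frame, regardless of whether the compiler performs tail-call optimization.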
tar: Handle many sparse comments on 32 bit systems
The sparse 1.0 parser skips lines with comments. The number of skipped
bytes is stored in an ssize_t variable, although common 32 bit systems
allow files larger than 4 GB.
Gracefully handle files with more than 2 GB of comments to
prevent integer truncation.
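One way to avoid the truncation is to keep the running total in a 64-bit type and check each addition. The sketch below assumes a non-negative total. `add_skipped` is a hypothetical helper, not the function used in the tar reader.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Add the length of one skipped line to a running 64-bit total.
 * Assumes *total >= 0. Returns 0 on success, -1 if the sum would
 * exceed INT64_MAX (the total is left unchanged in that case).
 */
int
add_skipped(int64_t *total, size_t line_len)
{
	if ((uint64_t)line_len > (uint64_t)(INT64_MAX - *total))
		return -1;
	*total += (int64_t)line_len;
	return 0;
}
```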
If a pax global header specifies a negative size, it is possible to
reduce variable `unconsumed` by 512 bytes, leading to a re-reading
of the pax global header. Fortunately the loop verifies that only one
global header per entry is allowed, leading to a later ARCHIVE_FATAL.
Avoid any form of negative size handling and fail early.
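A minimal sketch of failing early: reject a negative size before it is rounded up to the 512-byte tar block size. `padded_entry_size` and `TAR_BLOCK_SIZE` are illustrative names, not identifiers from the tar reader.

```c
#include <stdint.h>

#define TAR_BLOCK_SIZE 512

/*
 * Round a header-supplied size up to a multiple of the tar block size.
 * Returns -1 for negative sizes and for sizes that cannot be rounded
 * without overflowing int64_t.
 */
int64_t
padded_entry_size(int64_t size)
{
	if (size < 0 || size > INT64_MAX - (TAR_BLOCK_SIZE - 1))
		return -1;
	return (size + TAR_BLOCK_SIZE - 1) / TAR_BLOCK_SIZE * TAR_BLOCK_SIZE;
}
```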
Steve Lhomme [Mon, 26 May 2025 08:44:49 +0000 (10:44 +0200)]
[cmake] add uuid library when using xmllite
Follow-up to 16fd043f51d911b106f2a7834ad8f08f65051977
IID_ISequentialStream is required by the code.
This GUID is defined in uuid.lib or libuuid.a in mingw-w64. It is required
to link with that library to get the definition of the GUID. Some toolchains
add it by default but not all.
If a pax attribute has a 0 length value and no newline, the tar reader
gets out of sync with block alignment.
This happens because the pax parser assumes that variable value_length
(which includes the terminating newline) is at least 1. To get the
real value length, 1 is subtracted. This result is subtracted from
extsize, which in this case would lead to `extsize -= -1`, i.e.
the remaining byte count is increased.
Such an unexpected calculation leads to an off-by-one when skipping
to the next block. In the supplied test case, bsdtar complains that the
checksum of the next block is wrong. Since the tar parser was not
properly 512 bytes aligned, this is no surprise.
Gracefully handle such a case like GNU tar does and warn the user that
an invalid attribute has been encountered.
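The missing check can be sketched as follows: validate that the value (including its terminating newline) is at least one byte long before subtracting 1. `pax_value_length` is a hypothetical helper. The real parser is structured differently.

```c
#include <stddef.h>

/*
 * value points at the bytes after '=' in a pax record, and value_length
 * counts them including the terminating newline.
 * Returns the length without the newline, or -1 for an invalid record.
 */
long
pax_value_length(const char *value, size_t value_length)
{
	/* A zero-length value without newline would yield "0 - 1". */
	if (value_length == 0 || value[value_length - 1] != '\n')
		return -1;
	return (long)(value_length - 1);
}
```

A caller would emit a warning for the -1 case and must not adjust the remaining byte count by the bogus length.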
Zhaofeng Li [Sat, 24 May 2025 19:45:18 +0000 (13:45 -0600)]
tar: Reset accumulated header state after reading macOS metadata blob
AppleDouble extension entries are present as separate files immediately
preceding the corresponding real files. In libarchive, we process the
entire metadata file (headers + data) as if it were a header in the real
file. However, the code forgets to reset the accumulated header state
before parsing the real file's headers. In one code path, this causes
the metadata file's name to be used as the real file's name.
Specifically, this can be triggered with a tar containing two files:
1. A file named `._badname` with pax header containing the `path` attribute
2. A file named `goodname` _with_ a pax header but _without_ the `path` attribute
libarchive will list one file, `._badname` containing the data of `goodname`.
This code is pretty brittle and we really should let the client deal with
it :(
Pax extended headers may specify negative time values for files older
than the epoch.
Adjust the code to clear values to 0.0 more often and set ps to
INT64_MIN to have a proper error specifier, because the parser does
not allow anything below -INT64_MAX.
The count fields are merely used to check if a list is empty or not.
A check for first being not NULL is sufficient and is already in
place while iterating over the linked elements (count is not used).
The operations for key and node comparison depend on the platform
libarchive is compiled for. Since these values do not change
during runtime, set them only once during initialisation.
Further simplify the code by declaring only one "rb_ops" with
required functions based on platform.
The Cygwin FAQ states that __CYGWIN__ is defined when building for a
Cygwin environment. Only a few test files check (inconsistently) for
CYGWIN, so adjust them to the recommended __CYGWIN__ definition.
Cast address of "version" to BYTE pointer for CryptGetProvParam.
Fix "major" variable assignment for picky compilers like MSVC.
The "length" variable is an in/out variable. It must be set to the size
of available memory within "version". Right now it is undefined behavior
and 0 would crash during runtime.
dependabot[bot] [Tue, 20 May 2025 08:19:56 +0000 (10:19 +0200)]
CI: Bump the all-actions group across 1 directory with 4 updates (#2623)
Bumps the all-actions group with 4 updates:
`actions/checkout` from 4.2.1 to 4.2.2
`actions/upload-artifact` from 4.4.3 to 4.6.2
`github/codeql-action` from 3.26.12 to 3.28.18
`ossf/scorecard-action` from 2.4.0 to 2.4.1
Rose [Sat, 17 May 2025 23:35:22 +0000 (19:35 -0400)]
Fatal if field[0].start or field[0].end is null
We should not get here, but since the check exists, treat a NULL pointer as fatal; otherwise it is simply dereferenced later on.
Nicholas Vinson [Sun, 13 Apr 2025 11:33:43 +0000 (07:33 -0400)]
Copy ae digests to mtree_entry
Copy ae digests to mtree_entry. This simplifies porting non-archive
formats to archive formats while preserving supported message
digests specifically in cases where recomputing digests is not
viable.
Signed-off-by: Nicholas Vinson <nvinson234@gmail.com>
The size_t to int conversion is especially required on Windows systems
to support their int-based functions. These variables should be properly
checked before casts. This avoids integer truncations with large
strings.
I prefer size_t over int for sizes and adjusted variables to size_t
where possible to avoid casts.
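A checked conversion of this kind might look like the sketch below. `size_to_int` is a hypothetical helper name.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Convert a size_t length to int for int-based APIs.
 * Returns 0 and stores the value in *out on success, -1 if len does not
 * fit into an int (*out is left untouched).
 */
int
size_to_int(size_t len, int *out)
{
	if (len > (size_t)INT_MAX)
		return -1;
	*out = (int)len;
	return 0;
}
```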
If vsnprintf fails with errno EOVERFLOW, the results are very platform
dependent but never useful. The implementation in glibc fills bytes with
blanks, FreeBSD fills them with zeros, OpenBSD and Windows set first
byte to '\0'.
Just stop processing and don't print anything, which makes it follow
the OpenBSD and Windows approach.
The stack buffer is never cleared, which can become an issue depending
on the vsnprintf implementation's behavior if -1 is returned. The code
would eventually fall back to the stack buffer, which might not be
NUL-terminated.
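A simplified sketch of the defensive pattern: terminate the buffer before the call and produce nothing if vsnprintf reports an error. `format_or_empty` is a hypothetical helper. Unlike real code, which could retry with a larger buffer, it also treats truncation as a failure.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format into buf. On error or truncation, leave an empty string in buf
 * and return -1. Otherwise return the number of characters written.
 */
int
format_or_empty(char *buf, size_t size, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (size == 0)
		return -1;
	buf[0] = '\0';	/* defined contents even if vsnprintf fails */

	va_start(ap, fmt);
	n = vsnprintf(buf, size, fmt, ap);
	va_end(ap);

	if (n < 0 || (size_t)n >= size) {
		buf[0] = '\0';	/* do not expose platform-dependent debris */
		return -1;
	}
	return n;
}
```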
Zhaofeng Li [Thu, 15 May 2025 12:08:14 +0000 (06:08 -0600)]
bsdtar: Support `--mtime` and `--clamp-mtime` (#2601)
Hi,
This PR adds support for setting a forced mtime on all written files
(`--mtime` and `--clamp-mtime`) in bsdtar.
The end goal will be to support all functionalities in
<https://reproducible-builds.org/docs/archives/#full-example>, namely
`--sort` and disabling other attributes (atime, ctime, etc.).
Fixes #971.
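The intended semantics can be sketched as below, assuming `--clamp-mtime` behaves as in GNU tar, where the forced time is applied only to files newer than it. `apply_forced_mtime` is a hypothetical function and not the code added to bsdtar.

```c
#include <stdbool.h>
#include <time.h>

/*
 * Pick the mtime to write for one entry.
 * Without clamping, the forced mtime always wins. With clamping, it only
 * replaces timestamps that are newer than the forced value.
 */
time_t
apply_forced_mtime(time_t file_mtime, time_t forced_mtime, bool clamp)
{
	if (clamp && file_mtime <= forced_mtime)
		return file_mtime;
	return forced_mtime;
}
```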
## History
- [v1](https://github.com/zhaofengli/libarchive/tree/forced-mtime-v1):
Added `archive_read_disk_set_forced_mtime` in libarchive. As a result,
it was only applied when reading from the filesystem and not from other
archives.
- [v2](https://github.com/zhaofengli/libarchive/tree/forced-mtime-v2):
Refactored to apply the forced mtime in `archive_write`.
- v3 (current): Reduced libarchive change to exposing
`archive_parse_date`, moved clamping logic into bsdtar.
---------
Signed-off-by: Zhaofeng Li <hello@zhaofeng.li>
Co-authored-by: Dustin L. Howett <dustin@howett.net>
A filter block size must not be larger than the lzss window, which is
defined by dictionary size, which in turn can be derived from unpacked
file size.
While at it, improve error messages and fix lzss window wrap around
logic.
rar: Fix double free with over 4 billion nodes (#2598)
If a system is capable of handling 4 billion nodes in memory, a double
free could occur because of an unsigned integer overflow leading to a
realloc call with size argument of 0. Eventually, the client will
release that memory again, triggering a double free.
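A generic sketch of guarding against this class of bug: reject a zero or wrapping size before calling realloc. `grow_array` is a hypothetical helper and not the rar reader's code.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Resize an array to new_count elements of elem_size bytes each.
 * Returns the new pointer, or NULL on failure. On failure the original
 * block is still valid and still owned by the caller.
 */
void *
grow_array(void *array, size_t new_count, size_t elem_size)
{
	/* Never pass 0 to realloc: it may free the block. */
	if (new_count == 0 || elem_size == 0)
		return NULL;
	/* Reject counts whose byte size would wrap around. */
	if (new_count > SIZE_MAX / elem_size)
		return NULL;
	return realloc(array, new_count * elem_size);
}
```

Because the helper never reaches realloc with a size of 0, the caller's later free remains the only release of the block.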
mehrabiworkmail [Fri, 9 May 2025 17:21:32 +0000 (10:21 -0700)]
7z sfx overlay detection (#2088)
To detect 7z SFX files, libarchive currently searches for the 7z header
in a hard-coded addr range of the PE/ELF file
(specified via macros SFX_MIN_ADDR and SFX_MAX_ADDR). This causes it to
miss SFX files that may stray outside these values (libarchive fails to
extract 7z SFX ELF files created by recent versions of 7z tool because
of this issue). This patch fixes the issue by finding a more robust
starting point for the 7z header search: overlay in PE or the .data
section in ELF. This patch also adds 3 new test cases for 7z SFX to
libarchive.
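The search itself can be sketched as a scan for the 6-byte 7z signature from a given starting offset. How that offset is derived (PE overlay or ELF .data section) is left out here. `find_7z_signature` is a hypothetical name.

```c
#include <stddef.h>
#include <string.h>

/* The 7z signature bytes: '7' 'z' BC AF 27 1C. */
static const unsigned char sig7z[6] = { '7', 'z', 0xBC, 0xAF, 0x27, 0x1C };

/*
 * Return the offset of the first 7z signature at or after start,
 * or -1 if none is found in buf[0..len-1].
 */
long
find_7z_signature(const unsigned char *buf, size_t len, size_t start)
{
	size_t i;

	if (len < sizeof(sig7z))
		return -1;
	for (i = start; i <= len - sizeof(sig7z); i++) {
		if (memcmp(buf + i, sig7z, sizeof(sig7z)) == 0)
			return (long)i;
	}
	return -1;
}
```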
7zip reader: add test for POWERPC filter support for LZMA compressor (#2460)
This new test archive contains a C hello world executable built like so
on an Ubuntu 24.04 machine:
```
#include <stdio.h>

int main(int argc, char *argv[]) {
	printf("hello, world\n");
	return 0;
}
```
`powerpc-linux-gnu-gcc hw.c -o hw-powerpc -Wall`
The test archive that contains this executable was created like so,
using 7-Zip 24.08: `7zz a -t7z -m0=lzma2 -mf=ppc
libarchive/test/test_read_format_7zip_lzma2_powerpc.7z hw-powerpc`
The new test archive is required because the powerpc filter for lzma is
implemented in liblzma rather than in libarchive.