Jia Tan [Mon, 28 Aug 2023 13:50:16 +0000 (21:50 +0800)]
liblzma: Update assert in vli_ceil4().
The argument to vli_ceil4() should always guarantee the return value
is also a valid lzma_vli. Thus the highest three valid lzma_vli values
are invalid arguments. All uses of the function ensure this so the
assert is updated to match this.
Jia Tan [Mon, 28 Aug 2023 13:31:25 +0000 (21:31 +0800)]
liblzma: Add overflow check for Unpadded size in lzma_index_append().
This was not a security bug since there was no path to overflow
UINT64_MAX in lzma_index_append() or when it calls index_file_size().
The bug was discovered by a failing assert() in vli_ceil4() when called
from index_file_size() when unpadded_sum (the sum of the compressed size
of current Stream and the unpadded_size parameter) exceeds LZMA_VLI_MAX.
Previously, the unpadded_size parameter was checked to be not greater
than UNPADDED_SIZE_MAX, but no check was done once compressed_base was
added.
This could not have caused an integer overflow in index_file_size() when
called by lzma_index_append(). The calculation for file_size breaks down
into the sum of:
- Compressed base from all previous Streams
- 2 * LZMA_STREAM_HEADER_SIZE (size of the current Streams header and
footer)
- stream_padding (can be set by lzma_index_stream_padding())
- Compressed base from the current Stream
- Unpadded size (parameter to lzma_index_append())
The sum of everything except for Unpadded size must be less than
LZMA_VLI_MAX. This is guarenteed by overflow checks in the functions
that can set these values including lzma_index_stream_padding(),
lzma_index_append(), and lzma_index_cat(). The maximum value for
Unpadded size is enforced by lzma_index_append() to be less than or
equal UNPADDED_SIZE_MAX. Thus, the sum cannot exceed UINT64_MAX since
LZMA_VLI_MAX is half of UINT64_MAX.
Jia Tan [Wed, 9 Aug 2023 12:35:16 +0000 (20:35 +0800)]
Build: Conditionally allow win95 threads and --enable-small.
When the compiler supports __attribute__((__constructor__))
mythread_once() is never used, even with --enable-small. A configuration
with win95 threads and --enable-small will compile and be thread safe so
it can be allowed.
This isn't a very common configuration since MSVC does not support
__attribute__((__constructor__)), but MINGW32 and CLANG32 environments
for MSYS2 can use win95 threads and have
__attribute__((__constructor__)) support.
Jamaika1 [Tue, 8 Aug 2023 12:07:59 +0000 (14:07 +0200)]
mythread.h: Fix typo error in Vista threads mythread_once().
The "once_" variable was accidentally referred to as just "once". This
prevented building with Vista threads when
HAVE_FUNC_ATTRIBUTE_CONSTRUCTOR was not defined.
Jia Tan [Fri, 4 Aug 2023 14:17:11 +0000 (22:17 +0800)]
codespell: Add .codespellrc to set default options.
The .codespellrc allows setting default options to avoid false positive
matches, set additional dictionaries, etc. For now, codespell can be
used locally before committing doc and comment changes.
It should help prevent silly errors and fix up commits in the future.
Lasse Collin [Wed, 2 Aug 2023 12:19:43 +0000 (15:19 +0300)]
build-aux/manconv.sh: Fix US-ASCII and UTF-8 output.
groff defaults to SGR escapes. Using -P-c passes -c to grotty
which restores the old behavior. Perhaps there is a better way to
get pure plain text output but this works for now.
ChanTsune [Tue, 1 Aug 2023 15:17:17 +0000 (18:17 +0300)]
mythread.h: Disable signal functions in builds targeting Wasm + WASI.
signal.h in WASI SDK doesn't currently provide sigprocmask()
or sigset_t. liblzma doesn't need them so this change makes
liblzma and xzdec build against WASI SDK. xz doesn't build yet
and the tests don't either as tuktest needs setjmp() which
isn't (yet?) implemented in WASI SDK.
Closes: https://github.com/tukaani-project/xz/pull/57
See also: https://github.com/tukaani-project/xz/pull/56
(The original commit was edited a little by Lasse Collin.)
Jia Tan [Fri, 28 Jul 2023 14:03:08 +0000 (22:03 +0800)]
CMake: Conditionally allow the creation of broken symlinks.
The CMake build will try to create broken symlinks on Unix and Unix-like
platforms. Cygwin and MSYS2 are Unix-like, but may not be able to create
broken symlinks. The value of the CYGWIN or MSYS environment variables
determine if broken symlinks are valid.
Jia Tan [Fri, 28 Jul 2023 13:56:48 +0000 (21:56 +0800)]
CI: Fix windows-ci dependency installation.
All of the MSYS2 environments need make, and it does not come with the
toolchain package. The toolchain package will install the needed
compiler toolchains since without this package CMake cannot properly
generate the Makefiles.
Jia Tan [Fri, 28 Jul 2023 13:54:22 +0000 (21:54 +0800)]
CI: Update ci_build.sh CMake to always make Unix Makefiles.
The default for many of the MSYS2 environments is for CMake to create
Ninja build files. This would complicate the build script since we would
need a different command to run the tests. Its simpler to always use
Unix Makefiles so that "make test" is always a usable target for
testing.
Jia Tan [Mon, 24 Jul 2023 13:43:44 +0000 (21:43 +0800)]
liblzma: Prevent an empty translation unit in Windows builds.
To workaround Automake lacking Windows resource compiler support, an
empty source file is compiled to overwrite the resource files for static
library builds. Translation units without an external declaration are
not allowed by the C standard and result in a warning when used with
-Wempty-translation-unit (Clang) or -pedantic (GCC).
Jia Tan [Sat, 22 Jul 2023 06:55:42 +0000 (14:55 +0800)]
CI: Add Windows runner for Autotools builds with MSYS2.
Only a subset of the tests run by the Linux and MacOS Autotools builds
are run. The most interesting tests are the ones that disable threads,
encoders, and decoders.
The Windows runner will only be run manually since these tests will
likely take much longer than the Linux and MacOS runners. This runner
should be used before merging any large features and before releases.
Currently the clang64 environment fails to due to a warning and
-Werror is enabled for the CI tests. This is still an early version
since the CMake build can be done for MSVC and optionally each of the
MSYS2 environments. GitHub does not allow manually running the CI tests
unless the workflow is checked on the default branch so checking in a
minimum version is a good idea.
Thanks to Arthur S for the original proposing the original patch.
Jia Tan [Wed, 19 Jul 2023 15:36:00 +0000 (23:36 +0800)]
liblzma: Suppress -Wunused-function warning.
Clang 16.0.0 and earlier have a bug that the ifunc resolver function
triggers the -Wunused-function warning. The resolver function is static
and only "used" by the __attribute__((__ifunc()__)).
At this time, the bug is still unresolved, but has been reported:
https://github.com/llvm/llvm-project/issues/63957
This further improves the documentation from commit f36ca7982f6bd5e9827219ed4f3c5a1fbf5d7bdf. The previous wording of
"supported options" was slightly misleading since the options that are
printed are the ones that are relevant for encoding/decoding. It is not
about which options can or must be specified.
Jia Tan [Fri, 14 Jul 2023 15:20:33 +0000 (23:20 +0800)]
Docs: Add a new section to INSTALL for Tests.
The new Tests section describes basic information about the tests, how
to run them, and important details when cross compiling. We have had a
few questions about how to compile the tests without running them, so
hopefully this information will help others with the same question in the
future.
Jia Tan [Thu, 13 Jul 2023 15:32:10 +0000 (23:32 +0800)]
xz: Fix typo in man page.
The Memory limit information section described three output
columns when it actually has six. This was reworded to
"multiple" to make it more future proof.
Jia Tan [Thu, 13 Jul 2023 13:46:12 +0000 (21:46 +0800)]
xz: Minor clean up for coder.c
* Moved max_block_list_size from a global to local variable.
* Reworded error message in validate_block_list_filter().
* Removed helper function filter_chain_error().
* Changed 1 << X to 1U << X in many places
Jia Tan [Mon, 19 Jun 2023 15:07:10 +0000 (23:07 +0800)]
xz: Reorder robot mode subsections in the man page.
The order is now consistent with the order the command line arguments
are documented earlier in the man page. The new order is:
1. --list
2. --info-memory
3. --version
Instead of the previous order:
1. --version
2. --info-memory
3. --list
Jia Tan [Fri, 12 May 2023 16:44:41 +0000 (00:44 +0800)]
xz: Add a new --filters-help option.
The --filters-help can be used to help create filter chains with the
--filters and --filtersX options. The message in --long-help is too
short to fully explain the syntax to construct complex filter chains.
In --robot mode, xz will only print the output from liblzma function
lzma_str_list_filters.
Jia Tan [Fri, 21 Apr 2023 12:28:11 +0000 (20:28 +0800)]
xz: Update the man page for --block-list and --filtersX
The --block-list option description needed updating since the new
--filtersX option changes how it can be used. The new entry for
--filters1=FILTERS ... --filter9=FILTERS was created right after
the --filters option.
Jia Tan [Sat, 17 Jun 2023 12:46:21 +0000 (20:46 +0800)]
xz: Ignore filter chains that are set but never used in --block-list.
If a filter chain is set but not used in --block-list, it introduced
unexpected behavior such as requiring an unneeded amount of memory to
compress, reducing the number of threads in multi-threaded encoding, and
printing an incorrect amount of memory needed to decompress.
This also renames filters_init_mask => filters_used_mask. A filter is
assumed to be used if it is specified in --filtersX until
coder_set_compression_settings() determines which filters are referenced
in --block-list.
Jia Tan [Sat, 13 May 2023 12:11:13 +0000 (20:11 +0800)]
xz: Set the Block size for mt encoding correctly.
When opt_block_size is not used, the Block size for mt encoder is
derived from the minimum of the largest Block specified by
--block-list and the recommended Block size on all filter chains
calculated by lzma_mt_block_size(). This avoids using unnecessary
memory and ensures that all Blocks are large enough for the most memory
needy filter chain.
Jia Tan [Sat, 13 May 2023 11:54:33 +0000 (19:54 +0800)]
xz: Allows --block-list filters to scale down memory usage.
Previously, only the default filter chain could have its memory usage
adjusted. The filter chains specified with --filtersX were not checked
for memory usage. Now, all used filter chains will be adjusted if
necessary.
Jia Tan [Wed, 10 May 2023 13:50:33 +0000 (21:50 +0800)]
xz: Do not include block splitting if encoders are disabled.
The block splitting logic and split_block() function are not needed if
encoders are disabled. This will help slightly reduce the binary size
when built without encoders and allow split_block() to use functions
that require encoders being enabled.
Jia Tan [Wed, 10 May 2023 14:38:59 +0000 (22:38 +0800)]
xz: Free filters[] in debug mode.
This will only free filter chains created with --filters1-9 since the
default filter chain may be set from a static function variable. The
complexity to free the default filter chain is not worth the burden on
code maintenance.
Jia Tan [Tue, 18 Apr 2023 12:29:09 +0000 (20:29 +0800)]
xz: Create command line options for filters[1-9].
The new command line options are meant to be combined with --block-list.
They work as an optional extension to --block-list to specify a custom
filter chain for each block listed. The new options allow the creation
of up to 9 reusable filter chains. For instance:
Will create the following blocks:
1. A block of size 10 MiB with filter chain delta, lzma2.
2. A block of size 5 MiB with filter chain arm64, lzma2.
3. A block of size 5 MiB with filter chain arm64, lzma2.
4. A block of size 5 MiB with filter chain x86, lzma2.
5. A block containing the rest of the file contents with filter chain
delta, lzma2.
Jia Tan [Sat, 13 May 2023 11:36:09 +0000 (19:36 +0800)]
xz: Use lzma_filters_free() in forget_filter_chain().
This is a little cleaner than the previous implementation of
forget_filter_chain(). It is also more consistent since
lzma_str_to_filters() will always terminate the filter chain so there
is no need to terminate it later in coder_set_compression_settings().
Jia Tan [Thu, 5 Jan 2023 16:02:29 +0000 (00:02 +0800)]
xz: Add --filters option to CLI.
The --filters option uses the new lzma_str_to_filters() function
to convert a string into a full filter chain. Using this option
will reset all previous filters set by --preset, --[filter], or
--filters.
Jia Tan [Fri, 14 Jul 2023 13:30:25 +0000 (21:30 +0800)]
Tests: Improve feature testing for skipping.
Fixed a bug where test_compress_* would all fail if arm64 or armthumb
filters were enabled for compression but arm was disabled. Since the
grep tests only checked for "define HAVE_ENCODER_ARM", this would match
on HAVE_ENCODER_ARM64 or HAVE_ENCODER_ARMTHUMB.
Now the config.h feature test requires " 1" at the end to prevent the
prefix problem. have_feature() was also updated for this even though
there were known current bugs affecting it. This is just in case future
features have a similar prefix problem.
Jia Tan [Sat, 8 Jul 2023 13:24:19 +0000 (21:24 +0800)]
liblzma: Remove non-portable empty initializer.
Commit 78704f36e74205857c898a351c757719a6c8b666 added an empty
initializer {} to prevent a warning. The empty initializer is a GNU
extension and results in a build failure on MSVC. The -wpedantic flag
warns about empty initializers.
Jia Tan [Wed, 28 Jun 2023 12:46:31 +0000 (20:46 +0800)]
Tests: Fix memory leaks in test_index.
Several tests were missing calls to lzma_index_end() to clean up the
lzma_index structs. The memory leaks were discovered by using
-fsanitize=address with GCC.
Jia Tan [Wed, 28 Jun 2023 12:43:29 +0000 (20:43 +0800)]
Tests: Fix memory leaks in test_block_header.
test_block_header was not properly freeing the filter options between
calls to lzma_block_header_decode(). The memory leaks were discovered by
using -fsanitize=address with GCC.
Jia Tan [Wed, 28 Jun 2023 12:31:11 +0000 (20:31 +0800)]
liblzma: Prevent uninitialzed warning in mt stream encoder.
This change only impacts the compiler warning since it was impossible
for the wait_abs struct in stream_encode_mt() to be used before it was
initialized since mythread_condtime_set() will always be called before
mythread_cond_timedwait().
Since the mythread.h code is different between the POSIX and
Windows versions, this warning was only present on Windows builds.
Thanks to Arthur S for reporting the warning and providing an initial
patch.
Jia Tan [Wed, 28 Jun 2023 12:22:38 +0000 (20:22 +0800)]
liblzma: Prevent warning for MSYS2 Windows build.
In lzma_memcmplen(), the <intrin.h> header file is only included if
_MSC_VER and _M_X64 are both defined but _BitScanForward64() was
previously used if _M_X64 was defined. GCC for MSYS2 defines _M_X64 but
not _MSC_VER so _BitScanForward64() was used without including
<intrin.h>.
Now, lzma_memcmplen() will use __builtin_ctzll() for MSYS2 GCC builds as
expected.
Jia Tan [Wed, 28 Jun 2023 13:01:22 +0000 (21:01 +0800)]
CI: Add test with -fsanitize=address,undefined.
ci_build.sh was updated to accept disabling of __attribute__ ifunc
and CLMUL. This will allow -fsanitize=address to pass because ifunc
is incompatible with -fsanitize=address. The CLMUL implementation has
optimizations that potentially read past the buffer and mask out the
unwanted bytes.
Lasse Collin [Tue, 27 Jun 2023 14:05:23 +0000 (17:05 +0300)]
liblzma: Add ifunc implementation to crc64_fast.c.
The ifunc method avoids indirection via the function pointer
crc64_func. This works on GNU/Linux and probably on FreeBSD too.
The previous __attribute((__constructor__)) method is kept for
compatibility with ELF platforms which do support ifunc.
The ifunc method has some limitations, for example, building
liblzma with -fsanitize=address will result in segfaults.
The configure option --disable-ifunc must be used for such builds.
Thanks to Hans Jansen for the original patch. Closes: https://github.com/tukaani-project/xz/pull/53
Hans Jansen [Thu, 22 Jun 2023 17:49:30 +0000 (19:49 +0200)]
Add ifunc check to CMakeLists.txt
CMake build system will now verify if __attribute__((__ifunc__())) can be
used in the build system. If so, HAVE_FUNC_ATTRIBUTE_IFUNC will be
defined to 1.
Benjamin Buch [Tue, 6 Jun 2023 13:32:45 +0000 (15:32 +0200)]
CMake: Protects against double find_package
Boost iostream uses `find_package` in quiet mode and then again uses
`find_package` with required. This second call triggers a
`add_library cannot create imported target "ZLIB::ZLIB" because another
target with the same name already exists.`
This can simply be fixed by skipping the alias part on secondary
`find_package` runs.