Lasse Collin [Tue, 30 Apr 2024 19:22:45 +0000 (22:22 +0300)]
liblzma: Fix incorrect function type error from sanitizer
Clang 17 with -fsanitize=address,undefined:
src/liblzma/common/filter_common.c:366:8: runtime error:
call to function encoder_find through pointer to incorrect
function type 'const lzma_filter_coder *(*)(unsigned long)'
src/liblzma/common/filter_encoder.c:187: note:
encoder_find defined here
Use a wrapper function to get the correct type neatly.
This reduces the number of casts needed too.
This issue could be a problem with control flow integrity (CFI)
methods that check the function type on indirect function calls.
Lasse Collin [Sat, 27 Apr 2024 11:56:16 +0000 (14:56 +0300)]
Tests: test_index: Make it clear that my_alloc() has no integer overflows
liblzma guarantees that the product of the allocation size arguments
will fit in size_t.
Putting the pre-increment in the if-statement was clearly wrong
although in practice it didn't matter here as the function is
called only a couple of times.
Lasse Collin [Sat, 27 Apr 2024 11:33:38 +0000 (14:33 +0300)]
liblzma: index_decoder: Fix missing initializations on LZMA_PROG_ERROR
If the arguments to lzma_index_decoder() or lzma_index_buffer_decode()
were such that LZMA_PROG_ERROR was returned, the lzma_index **i
argument wasn't touched even though the API docs say that *i = NULL
is done if an error occurs. This obviously won't be done even now
if i == NULL but otherwise it is best to do it due to the wording
in the API docs.
In practice this matters very little: The problem can occur only
if the functions are called with invalid arguments, that is,
the calling application must already have a bug.
Lasse Collin [Mon, 22 Apr 2024 18:54:39 +0000 (21:54 +0300)]
liblzma: lzma_str_to_filters: Set *error_pos on all errors
The API docs clearly say that if error_pos isn't NULL then *error
is always set on any error. However, it wasn't touched if str == NULL
or filters == NULL or unsupported flags were specified.
Lasse Collin [Fri, 15 Mar 2024 15:15:50 +0000 (17:15 +0200)]
Build: Use only the generic symbol versioning with NVIDIA HPC Compiler.
This does the previous commit with CMake.
AC_EGREP_CPP uses AC_REQUIRE so the outermost if-commands must
be changed to AS_IF to ensure that things wont break some day.
See 5a5bd7f871818029d5ccbe189f087f591258c294.
Lasse Collin [Fri, 15 Mar 2024 14:36:35 +0000 (16:36 +0200)]
CMake: Use only the generic symbol versioning with NVIDIA HPC Compiler.
It doesn't support the __symver__ attribute or __asm__(".symver ...").
The generic symbol versioning can still be used since it only needs
linker support.
Sergey Kosukhin [Wed, 13 Mar 2024 12:07:13 +0000 (13:07 +0100)]
liblzma: Fix building with NVHPC (NVIDIA HPC SDK).
NVHPC compiler has several issues that make it impossible to
build liblzma:
- the compiler cannot handle unions that contain pointers that
are not the first members;
- the compiler fails to produce valid code for delta_decode if the
vectorization is enabled, which results in failed tests.
This introduces NVHPC-specific workarounds that address the issues.
(This commit was contributed under 0BSD but the author confirmed
that it is fine to backport it to the public domain branches. See
https://github.com/tukaani-project/xz/pull/90#issuecomment-2100185936
and the next two messages.)
Sergey Kosukhin [Tue, 12 Mar 2024 19:03:49 +0000 (20:03 +0100)]
Build: Let the users override the symbol versioning variant.
There are cases when the users want to decide themselves whether
they want to have the generic (even on GNU/Linux) or the linux
(even if we do not recommend that) symbol versioning variant.
The former might be needed to circumvent compiler issues (i.e.
the compiler does not support all features that are required
for the linux versioning), the latter might help in overriding
the assumptions made in the configure script.
(This commit was contributed under 0BSD but the author confirmed
that it is fine to backport it to the public domain branches. See
https://github.com/tukaani-project/xz/pull/90#issuecomment-2100185936
and the next two messages.)
Lasse Collin [Sat, 17 Feb 2024 21:07:35 +0000 (23:07 +0200)]
xz: Fix message_init() description.
Also explicitly initialize progress_automatic to make it clear
that it can be read before message_init() sets it. Static variable
was initialized to false by default already so this is only for
clarity.
Lasse Collin [Wed, 14 Feb 2024 17:11:48 +0000 (19:11 +0200)]
Docs: Include doc/examples/11_file_info.c in tarballs.
It was added in 2017 in c2e29f06a7d1e3ba242ac2fafc69f5d6e92f62cd
but it never got into any release tarballs because it was
forgotten to be added to Makefile.am.
Lasse Collin [Fri, 9 Feb 2024 15:20:31 +0000 (17:20 +0200)]
Fix SHA-256 authors.
The initial commit 5d018dc03549c1ee4958364712fb0c94e1bf2741
in 2007 had a comment in sha256.c that the code is based on
Crypto++ Library 5.5.1. In 2009 the Authors list in sha256.c
and the AUTHORS file was updated with information that the
code had come from Crypto++ but via 7-Zip. I know I had viewed
7-Zip's SHA-256 code but back then the C code has been identical
enough with Crypto++, so I don't why I thought the author info
would need that extra step via 7-Zip for this single file.
Another error is that I had mixed sha.* and shacal2.* files
when checking for author info in Crypto++. The shacal2.* files
aren't related to liblzma's sha256.c and thus Kevin Springle's
code in Crypto++ isn't either.
Jia Tan [Sat, 16 Dec 2023 12:51:38 +0000 (20:51 +0800)]
liblzma: Set all values in lzma_lz_encoder to NULL after allocation.
The first member of lzma_lz_encoder doesn't necessarily need to be set
to NULL since it will always be set before anything tries to use it.
However the function pointer members must be set to NULL since other
functions rely on this NULL value to determine if this behavior is
supported or not.
This fixes a somewhat serious bug, where the options_update() and
set_out_limit() function pointers are not set to NULL. This seems to
have been forgotten since these function pointers were added many years
after the original two (code() and end()).
The problem is that by not setting this to NULL we are relying on the
memory allocation to zero things out if lzma_filters_update() is called
on a LZMA1 encoder. The function pointer for set_out_limit() is less
serious because there is not an API function that could call this in an
incorrect way. set_out_limit() is only called by the MicroLZMA encoder,
which must use LZMA1 where set_out_limit() is always set. Its currently
not possible to call set_out_limit() on an LZMA2 encoder at this time.
So calling lzma_filters_update() on an LZMA1 encoder had undefined
behavior since its possible that memory could be manipulated so the
options_update member pointed to a different instruction sequence.
This is unlikely to be a bug in an existing application since it relies
on calling lzma_filters_update() on an LZMA1 encoder in the first place.
For instance, it does not affect xz because lzma_filters_update() can
only be used when encoding to the .xz format.
Jia Tan [Sat, 16 Dec 2023 12:28:21 +0000 (20:28 +0800)]
liblzma: Make parameter names in function definition match declaration.
lzma_raw_encoder() and lzma_raw_encoder_init() used "options" as the
parameter name instead of "filters" (used by the declaration). "filters"
is more clear since the parameter represents the list of filters passed
to the raw encoder, each of which contains filter options.
Jia Tan [Sat, 16 Dec 2023 12:18:47 +0000 (20:18 +0800)]
liblzma: Improve lzma encoder init function consistency.
lzma_encoder_init() did not check for NULL options, but
lzma2_encoder_init() did. This is more of a code style improvement than
anything else to help make lzma_encoder_init() and lzma2_encoder_init()
more similar.
Jia Tan [Thu, 23 Nov 2023 14:04:35 +0000 (22:04 +0800)]
xz: Create separate is_tty() function.
The new is_tty() will report if a file descriptor is a terminal or not.
On POSIX systems, it is a wrapper around isatty(). However, the native
Windows implementation of isatty() will return true for all character
devices, not just terminals. So is_tty() has a special case for Windows
so it can use alternative Windows API functions to determine if a file
descriptor is a terminal.
This fixes a bug with MSVC and MinGW-w64 builds that refused to read from
or write to non-terminal character devices because xz thought it was a
terminal. For instance:
xz foo -c > /dev/null
would fail because /dev/null was assumed to be a terminal.
Jia Tan [Fri, 17 Nov 2023 12:35:11 +0000 (20:35 +0800)]
Tests: Create test_suffix.sh.
This tests some complicated interactions with the --suffix= option.
The suffix option must be used with --format=raw, but can optionally
be used to override the default .xz suffix.
This test also verifies some recent bugs have been correctly solved
and to hopefully avoid further regressions in the future.
Jia Tan [Tue, 14 Nov 2023 12:27:04 +0000 (20:27 +0800)]
xz: Fix suffix check.
The suffix refactor done in 99575947a58a60416c570eb78038d18a1ea4cef4
had a small regression where raw format compression to standard out
failed if a suffix was not set. In this case, setting the suffix did
not make sense since a file is not created.
Now, xz should only fail when a suffix is not provided when it is
actually needed.
For instance:
echo "foo" | xz --format=raw --lzma2 | wc -c
does not need a suffix check since it creates no files. But:
xz --format=raw --lzma2 --suffix=.bar foo
Needs the suffix to be set since it must create foo.bar.
Lasse Collin [Tue, 31 Oct 2023 19:41:09 +0000 (21:41 +0200)]
liblzma: Fix compilation of fastpos_tablegen.c.
The macro lzma_attr_visibility_hidden has to be defined to make
fastpos.h usable. The visibility attribute is irrelevant to
fastpos_tablegen.c so simply #define the macro to an empty value.
fastpos_tablegen.c is never built by the included build systems
and so the problem wasn't noticed earlier. It's just a standalone
program for generating fastpos_table.c.
Fixes: https://github.com/tukaani-project/xz/pull/69
Thanks to GitHub user Jamaika1.
Lasse Collin [Sun, 22 Oct 2023 14:08:39 +0000 (17:08 +0300)]
liblzma: #define lzma_attr_visibility_hidden in common.h.
In ELF shared libs:
-fvisibility=hidden affects definitions of symbols but not
declarations.[*] This doesn't affect direct calls to functions
inside liblzma as a linker can replace a call to lzma_foo@plt
with a call directly to lzma_foo when -fvisibility=hidden is used.
[*] It has to be like this because otherwise every installed
header file would need to explictly set the symbol visibility
to default.
When accessing extern variables that aren't defined in the
same translation unit, compiler assumes that the variable has
the default visibility and thus indirection is needed. Unlike
function calls, linker cannot optimize this.
Using __attribute__((__visibility__("hidden"))) with the extern
variable declarations tells the compiler that indirection isn't
needed because the definition is in the same shared library.
About 15+ years ago, someone told me that it would be good if
the CRC tables would be defined in the same translation unit
as the C code of the CRC functions. While I understood that it
could help a tiny amount, I didn't want to change the code because
a separate translation unit for the CRC tables was needed for the
x86 assembly code anyway. But when visibility attributes are
supported, simply marking the extern declaration with the
hidden attribute will get identical result. When there are only
a few affected variables, this is trivial to do. I wish I had
understood this back then already.
Lasse Collin [Sat, 30 Sep 2023 19:54:28 +0000 (22:54 +0300)]
liblzma: Refer to MinGW-w64 instead of MinGW in the API headers.
MinGW (formely a MinGW.org Project, later the MinGW.OSDN Project
at <https://osdn.net/projects/mingw/>) has GCC 9.2.0 as the
most recent GCC package (released 2021-02-02). The project might
still be alive but majority of people have switched to MinGW-w64.
Thus it seems clearer to refer to MinGW-w64 in our API headers too.
Building with MinGW is likely to still work but I haven't tested it
in the recent years.
Lasse Collin [Fri, 29 Sep 2023 23:14:25 +0000 (02:14 +0300)]
CMake: Generate and install liblzma.pc if not using MSVC.
Autotools based build uses -pthread and thus adds it to Libs.private
in liblzma.pc. CMake doesn't use -pthread at all if pthread functions
are available in libc so Libs.private doesn't get -pthread either.
Lasse Collin [Fri, 29 Sep 2023 17:46:11 +0000 (20:46 +0300)]
liblzma: Add Cflags.private to liblzma.pc.in for MSYS2.
It properly adds -DLZMA_API_STATIC when compiling code that
will be linked against static liblzma. Having it there on
systems other than Windows does no harm.