Lasse Collin [Sun, 15 Dec 2024 17:08:32 +0000 (19:08 +0200)]
CMake: Bump maximum policy version to 3.31
With CMake 3.31, there were a few warnings from
CMP0177 "install() DESTINATION paths are normalized".
These occurred because the install(FILES) command in
my_install_man_lang() is called with a DESTINATION path
that contains two consecutive slashes, for example,
"share/man//man1". Such a path is for the English man pages.
With translated man pages, the language code goes between
the slashes. The warning was probably triggered because the
extra slash gets removed by the normalization.
Lasse Collin [Sun, 1 Dec 2024 19:38:17 +0000 (21:38 +0200)]
doc/SHA256SUMS: Add the list of SHA-256 hashes of release files
The release files are signed but verifying the signatures cannot
catch certain types of attacks:
1. A malicious maintainer could make more than one variant of
a package. One could be for general distribution. Another
with malicious content could be targeted to specific users,
for example, distributing the malicious version on a mirror
controlled by the attacker.
2. If the signing key of an honest maintainer was compromised
without being detected, a similar situation as described
above could occur.
SHA256SUMS could be put on the project website but having it in
the Git repository makes it obvious that old lines aren't modified
when the file is updated.
Hashes of uncompressed files are included too. This way tarballs
can be recompressed and the hashes can still be verified.
Lasse Collin [Sat, 30 Nov 2024 10:05:59 +0000 (12:05 +0200)]
Docs: Remove .github/SECURITY.md
One of the reasons to have this file in the xz repository was to
show vulnerability reporting info in the Security section on GitHub.
On 2024-11-25, I added SECURITY.md to the tukaani-project organization
on GitHub:
GitHub shows that file in all projects in the organization unless
overridden by a project-specific SECURITY.md. Thus, removing
the file from the xz repo makes GitHub show the organization-wide
text instead.
Maintaining a single copy for the whole GitHub organization makes
things simpler. It's also nicer to have fewer GitHub-specific files
in the xz repo. Information how to report bugs (including security
issues) is available in README and on the home page too.
The OpenSSF Scorecard tool didn't find .github/SECURITY.md from the
xz repository. There was a suggestion to move the file to the top-level
directory where Scorecard should find it. However, Scorecard does find
the organization-wide SECURITY.md. Thus, the file isn't needed in the
xz repository to score points in the Scorecard game:
Lasse Collin [Tue, 1 Oct 2024 09:10:23 +0000 (12:10 +0300)]
Windows: Embed an application manifest in the EXE files
IMPORTANT: This includes a security fix to command line tool
argument handling.
Some toolchains embed an application manifest by default to declare
UAC-compliance. Some also declare compatibility with Vista/8/8.1/10/11
to let the app access features newer than those of Vista.
We want all the above but also two more things:
- Declare that the app is long path aware to support paths longer
than 259 characters (this may also require a registry change).
- Force the code page to UTF-8. This allows the command line tools
to access files whose names contain characters that don't exist
in the current legacy code page (except unpaired surrogates).
The UTF-8 code page also fixes security issues in command line
argument handling which can be exploited with malicious filenames.
See the new file w32_application.manifest.comments.txt.
Thanks to Orange Tsai and splitline from DEVCORE Research Team
for discovering this issue.
Thanks to Vijay Sarvepalli for reporting the issue to me.
Thanks to Kelvin Lee for testing with MSVC and helping with
the required build system fixes.
Lasse Collin [Sun, 29 Sep 2024 11:46:52 +0000 (14:46 +0300)]
Windows: Set DLL name accurately in StringFileInfo on Cygwin and MSYS2
Now the information in the "Details" tab in the file properties
dialog matches the naming convention of Cygwin and MSYS2. This
is only a cosmetic change.
Lasse Collin [Sat, 28 Sep 2024 17:09:50 +0000 (20:09 +0300)]
CMake: Add the resource files to the Cygwin and MSYS2 builds
Autotools-based build has always done this so this is for consistency.
However, the CMake build won't create the DEF file when building
for Cygwin or MSYS2 because in that context it should be useless.
(If Cygwin or MSYS2 is used to host building of normal Windows
binaries then the DEF file is still created.)
"xzdec -M123" exited with exit status 1 without printing
any messages. The "M:" entry should have been removed when
the memory usage limiter support was removed from xzdec.
Yifeng Li [Thu, 22 Aug 2024 02:18:49 +0000 (02:18 +0000)]
liblzma: Fix x86-64 movzw compatibility in range_decoder.h
Support for instruction "movzw" without suffix in "GNU as" was
added in commit [1] and stabilized in binutils 2.27, released
in August 2016. Earlier systems don't accept this instruction
without a suffix, making range_decoder.h's inline assembly
unable to build on old systems such as Ubuntu 16.04, creating
error messages like:
lzma_decoder.c: Assembler messages:
lzma_decoder.c:371: Error: no such instruction: `movzw 2(%r11),%esi'
lzma_decoder.c:373: Error: no such instruction: `movzw 4(%r11),%edi'
lzma_decoder.c:388: Error: no such instruction: `movzw 6(%r11),%edx'
lzma_decoder.c:398: Error: no such instruction: `movzw (%r11,%r14,4),%esi'
OpenBSD 7.6 will support elf_aux_info(3), and the detection code used
on FreeBSD will work on OpenBSD 7.6 too. Keep things simpler and drop
the OpenBSD-specific sysctl() method.
Lasse Collin [Sat, 6 Jul 2024 11:04:48 +0000 (14:04 +0300)]
xz: Remove the TODO comment about --recursive
It won't be implemented. find + xargs is more flexible, for example,
it allows compressing small files in parallel. An example for that
has been included in the xz man page since 2010.
Lasse Collin [Wed, 3 Jul 2024 17:45:48 +0000 (20:45 +0300)]
CMake: Link xz against Threads::Threads if using pthreads
The liblzma target was recently changed to link against Threads::Threads
with the PRIVATE keyword. I had forgotten that xz itself depends on
pthreads too due to pthread_sigmask(). Thus, the build broke when
building shared liblzma and pthread_sigmask() wasn't in libc.
Lasse Collin [Tue, 2 Jul 2024 17:19:47 +0000 (20:19 +0300)]
CMake: Update the comment at the top of CMakeLists.txt
While po/*.gmo files won't be used from the release tarball,
the generated translated man pages will be used still. Those
are text files and po4a has slightly more dependencies than
gettext tools so installing po4a might be a bit more challenging
in some situations.
Lasse Collin [Tue, 2 Jul 2024 17:12:40 +0000 (20:12 +0300)]
CMake: Drop support for pre-generated po/*.gmo files
When a release tarball is created using Autotools, the tarball includes
po/*.gmo files which are binary files generated from po/*.po. Other
tarball creation methods don't and won't create the .gmo files.
It feels clearer if CMake will never install pre-generated binary files
from the source package. If people are able to install CMake, they
likely are able to install gettext tools as well (assuming they want
translations).
Lasse Collin [Tue, 2 Jul 2024 16:14:50 +0000 (19:14 +0300)]
CMake: Make XZ_NLS handling more robust
If a user set XZ_NLS=ON but find_package(Intl) failed or CMake version
wasn't at least 3.20, the configuration would fail in a cryptic way.
If XZ_NLS is enabled, require that CMake is new enough and that either
gettext tools or pre-generated .gmo files are available. Otherwise fail
the configuration. Previously missing gettext tools and .gmo files would
only result in a warning.
Missing man page translations are still only a warning.
Lasse Collin [Tue, 2 Jul 2024 15:02:50 +0000 (18:02 +0300)]
CMake: The compile definition is ENABLE_NLS, not XZ_NLS
The CMake variables were renamed and accidentally also
the compile definition was renamed. As a result, translation
support wasn't actually enabled in the executables.
Xi Ruoyao [Fri, 28 Jun 2024 10:36:43 +0000 (13:36 +0300)]
liblzma: Speed up CRC32 calculation on 64-bit LoongArch
The crc.w.{b/h/w/d}.w instructions in LoongArch can calculate the CRC32
result for 1/2/4/8 bytes in a single operation. Using these is much
faster compared to the generic method.
Optimized CRC32 is enabled unconditionally on 64-bit LoongArch because
the LoongArch specification says that CRC32 instructions shall be
implemented for 64-bit processors. Optimized CRC32 isn't enabled for
32-bit LoongArch processors because not enough information is available
about them.
Co-authored-by: Lasse Collin <lasse.collin@tukaani.org> Closes: https://github.com/tukaani-project/xz/pull/86
Sam James [Fri, 28 Jun 2024 11:18:35 +0000 (14:18 +0300)]
CI: Speed up Valgrind job by using --trace-children-skip-by-arg=...
This addresses the issue I mentioned in 6c095a98fbec70b790253a663173ecdb669108c4 and speeds up the Valgrind
job a bit, because non-xz tools aren't run unnecessarily with
Valgrind by the script tests.
Lasse Collin [Tue, 25 Jun 2024 13:00:22 +0000 (16:00 +0300)]
Build: Prepend, not append, PTHREAD_CFLAGS to LIBS
It shouldn't make any difference because LIBS should be empty
at that point in configure. But prepending is the correct way
because in general the libraries being added might require other
libraries that come later on the command line.
Lasse Collin [Tue, 25 Jun 2024 11:24:29 +0000 (14:24 +0300)]
Build: Use AC_LINK_IFELSE to handle implicit function declarations
It's more robust in case the compiler allows pre-C99 implicit function
declarations. If an x86 intrinsic is missing and gets treated as
implicit function, the linking step will very probably fail. This
isn't the only way to workaround implicit function declarations but
it might be the simplest and cleanest.
The problem hasn't been observed in the wild.
There are a couple more AC_COMPILE_IFELSE uses in configure.ac.
Of these, Landlock check calls prctl() and in theory could have
the same problem. In practice it doesn't as the check program
looks for several other things too. However, it was changed to
AC_LINK_IFELSE still to look more correct.
Similarly, m4/tuklib_cpucores.m4 and m4/tuklib_physmem.m4 were
updated although they haven't given any trouble either. They
have worked all these years because those check programs rely
on specific headers and types: if headers or types are missing,
compilation will fail. Using the linker makes these checks more
similar to the ones in cmake/tuklib_*.cmake which always link.
Lasse Collin [Mon, 24 Jun 2024 20:35:59 +0000 (23:35 +0300)]
Build: Use AC_LINK_IFELSE instead of -Werror
AC_COMPILE_IFELSE needed -Werror because Clang <= 14 would merely
warn about the unsupported attribute and implicit function declaration.
Changing to AC_LINK_IFELSE handles the implicit declaration because
the symbol __crc32d is unlikely to exist in libc.
Note that the other part of the check is that #include <arm_acle.h>
must work. If the header is missing, most compilers give an error
and the linking step won't be attempted.
Avoiding -Werror makes the check more robust in case CFLAGS contains
warning flags that break -Werror anyway (but this isn't the only check
in configure.ac that has this problem). Using AC_LINK_IFELSE also makes
the check more similar to how it is done in CMakeLists.txt.
Lasse Collin [Mon, 24 Jun 2024 17:14:43 +0000 (20:14 +0300)]
CMake: Not experimental anymore
While the CMake support has gotten a lot less testing than
the Autotools-based build, the supported features should now
be equal. The output may differ slightly, for example,
liblzma.pc may have
Libs.private: -pthread -lpthread
with Autotools on GNU/Linux. CMake doesn't put any options
in Libs.private because on modern glibc the pthread functions
are in libc. The options options aren't required to link static
liblzma into an application.
Autotools-based build doesn't generate or install
lib/cmake/liblzma-*.cmake files. This means that on most
platforms one cannot rely on
Lasse Collin [Tue, 25 Jun 2024 12:51:48 +0000 (15:51 +0300)]
CMake: Always add pthread flags into CMAKE_REQUIRED_LIBRARIES
It was weird to add CMAKE_THREAD_LIBS_INIT in CMAKE_REQUIRED_LIBRARIES
only if CLOCK_MONOTONIC is available. Alternative would be to remove
the thread libs from CMAKE_REQUIRED_LIBRARIES after the check for
pthread_condattr_setclock() but keeping the libs should be fine too.
Then it's ready in case more pthread functions were wanted some day.
Lasse Collin [Mon, 24 Jun 2024 19:41:10 +0000 (22:41 +0300)]
CMake: Fix three checks if building with -flto
In CMake, check_c_source_compiles() always links too. With
link-time optimization, unused functions may get omitted if
main() doesn't depend on them. Consider the following which
tries to check if somefunction() is available when <someheader.h>
has been included:
#include <someheader.h>
int foo(void) { return somefunction(); }
int main(void) { return 0; }
LTO may omit foo() completely because the program as a whole doesn't
need it and then the program will link even if the symbol somefunction
isn't available in libc or other library being linked in, and then
the test may pass when it shouldn't.
What happens if <someheader.h> doesn't declare somefunction()?
Shouldn't the test fail in the compilation phase already? It should
but many compilers don't follow the C99 and later standards that
prohibit implicit function declarations. Instead such compilers
assume that somefunction() exists, compilation succeeds (with a
warning), and then linker with LTO omits the call to somefunction().
Change the tests so that they are part of main(). If compiler accepts
implicitly declared functions, LTO cannot omit them because it has to
assume that they might have side effects and thus linking will fail.
On the other hand, if the functions/intrinsics being used are supported,
they might get optimized away but in that case it's fine because they
really are supported.
It is fine to use __attribute__((target(...))) for main(). At least
it works with GCC 4.9 to 14.1 on x86-64.
Lasse Collin [Mon, 24 Jun 2024 14:39:54 +0000 (17:39 +0300)]
CI: Workaround buggy config.guess on Ubuntu 22.04LTS and 24.04LTS
Check for the wrong triplet from config.guess and override it with
the --build option on the configure command line. Then i386 assembly
autodetection will work.
These Ubuntu versions (and as of writing, also Debian unstable)
ship config.guess version 2022-01-09 which contains a bug that
was fixed in version 2022-05-08. It results in a wrong configure
triplet when using CC="gcc -m32" to build i386 binaries.
Lasse Collin [Sun, 23 Jun 2024 12:35:35 +0000 (15:35 +0300)]
liblzma: Tidy up crc_common.h
Prefix ARM64_RUNTIME_DETECTION with CRC_ and reorder it to be with
the other ARM64-specific lines. That macro isn't used outside this
file.
ARM64 CLMUL implementation doesn't exist yet and thus CRC64_ARM64_CLMUL
isn't used anywhere yet.
It's not ideal that the single-letter CRC utility macros are here
as they pollute the namespace of the LZ encoder files. Those could
be moved their own crc_macros.h like they were in 5.2.x but in practice
this is fine enough already.
Lasse Collin [Sun, 23 Jun 2024 11:22:08 +0000 (14:22 +0300)]
liblzma: Move lzma_crcXX_table[][] declarations to crc_common.h
LZ encoder needs lzma_crc32_table[0] but otherwise those tables
are private to the CRC code. In contrast, the other things in
check.h are needed in several places.
Lasse Collin [Wed, 19 Jun 2024 15:38:22 +0000 (18:38 +0300)]
liblzma: Make 32-bit x86 CRC assembly co-exist with CLMUL
Now runtime detection of CLMUL support can pick between the CLMUL and
the generic assembly implementations. Whatever overhead this has for
builds that omit CLMUL completely isn't important because builds for
any non-ancient system is likely to include the CLMUL code too.
Handle the CRC tables in crcXX_fast.c files because now these files
are built even when assembly code is used.
If 32-bit x86 assembly is enabled then it will always be built even
if compiler flags were such that CLMUL would be allowed unconditionally.
That is, runtime detection will be used anyway. This keeps the build
rules simpler.
In LZ encoder, build and use lzma_lz_hash_table[256] if CLMUL CRC
is used without runtime detection. Previously this wasn't needed
because crc32_table.c included the lzma_crc32_table[][] in the build
unless encoder support had been disabled. Including an 8 KiB table
was silly when only 1 KiB is actually used. So now liblzma is 7 KiB
smaller if CLMUL is enabled without runtime detection.
Lasse Collin [Thu, 20 Jun 2024 15:12:21 +0000 (18:12 +0300)]
CMake: Move threading detection a few lines up
It feels clearer this way, and when support for external SHA-256
is added, this will keep the order of the library detection the
same as in configure.ac (check for pthreads before libmd) although
it shouldn't matter in practice.
Lasse Collin [Thu, 20 Jun 2024 18:53:03 +0000 (21:53 +0300)]
CMake: Refactor XZ_SYMBOL_VERSIONING to match configure.ac
Make the available options and their behavior match
--enable-symbol-versions in configure.ac.
Don't enable symbol versions on Linux if not using glibc. Previously
the generic variant was selected on Microblaze or if using NVHPC
without checking that libc is glibc.
Leave the cache variable to "auto" or "yes" if that was specified
instead of setting it to the autodetected value by default. A downside
is that one cannot easily see which variant the autodetection code
has selected. The same applies to XZ_SANDBOX and XZ_THREADS though.