Lasse Collin [Tue, 25 Mar 2025 13:18:31 +0000 (15:18 +0200)]
Translations: Run "make -C po update-po"
POT-Creation-Date is set to match the timestamp in 5.7.2beta which
in the Translation Project is known as 5.8.0-pre1. The strings
haven't changed since 5.7.1alpha but a few comments have.
This is a very noisy commit, but it helps keep the PO files
similar between the Git repository and stable release tarballs.
Lasse Collin [Tue, 25 Mar 2025 13:18:31 +0000 (15:18 +0200)]
Translations: Partially fix overtranslation in Serbian man pages
Names of environment variables and some other strings must be present
in the original form. The translator couldn't be reached so I'm
changing some of the strings myself. In the "Robot mode" section,
occurrences in the middle of sentences weren't changed to reduce
the chance of grammar breakage, but I kept the translated strings in
parentheses in the headings. It's not ideal, but now people shouldn't
need to look at the English man page to find the English strings.
Lasse Collin [Tue, 25 Mar 2025 13:18:31 +0000 (15:18 +0200)]
liblzma: Use SSE2 intrinsics instead of memcpy() in dict_repeat()
SSE2 is supported on every x86-64 processor. The SSE2 code is used on
32-bit x86 if compiler options permit unconditional use of SSE2.
dict_repeat() copies short random-sized unaligned buffers. At least
on glibc, FreeBSD, and Windows (MSYS2, UCRT, MSVCRT), memcpy() is
clearly faster than byte-by-byte copying in this use case. Compared
to the memcpy() version, the new SSE2 version reduces decompression
time by 0-5 % depending on the machine and libc. It should never be
slower than the memcpy() version.
However, on musl 1.2.5 on x86-64, the memcpy() version is the slowest.
Compared to the memcpy() version:
- The byte-by-byte version takes 6-7 % less time to decompress.
- The SSE2 version takes 16-18 % less time to decompress.
The numbers are from decompressing a Linux kernel source tarball in
single-threaded mode on older AMD and Intel systems. The tarball
compresses well, and thus dict_repeat() performance matters more
than with some other files.
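As an illustration of the idea only, and not the actual dict_repeat() code, a short unaligned copy with SSE2 intrinsics could look roughly like the sketch below. The names sketch_copy_short and SKETCH_SLACK are hypothetical. The sketch assumes that the two regions don't overlap and that the caller keeps SKETCH_SLACK extra bytes available after both buffers, because the copy rounds the length up to a multiple of 16 bytes. When SSE2 cannot be used unconditionally, it falls back to memcpy().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// SSE2 is always available on x86-64. On 32-bit x86 it is used only
// if the compiler options permit unconditional use of SSE2.
#if defined(__SSE2__) || defined(_M_X64) \
        || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#   include <emmintrin.h>
#   define SKETCH_HAVE_SSE2 1
#endif

// Hypothetical: number of extra bytes that must be readable after
// src + len and writable after dst + len.
#define SKETCH_SLACK 15

// Copy len bytes from src to dst. The regions (including the slack)
// must not overlap. Up to SKETCH_SLACK bytes past the end may be
// read and written in the SSE2 version.
static void
sketch_copy_short(uint8_t *dst, const uint8_t *src, size_t len)
{
    assert(dst != NULL && src != NULL);

#ifdef SKETCH_HAVE_SSE2
    // Unaligned 16-byte loads and stores. There is no tail handling:
    // the last chunk may go past len, which the slack makes safe.
    for (size_t i = 0; i < len; i += 16) {
        const __m128i x = _mm_loadu_si128((const __m128i *)(src + i));
        _mm_storeu_si128((__m128i *)(dst + i), x);
    }
#else
    memcpy(dst, src, len);
#endif
}
```

Skipping the exact-length tail handling is what makes this kind of loop cheap for short random-sized copies. The cost is that the buffers need a few bytes of padding.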
Lasse Collin [Tue, 25 Mar 2025 13:18:31 +0000 (15:18 +0200)]
liblzma: Add "restrict" to a few functions in lz_decoder.h
This doesn't make any difference in practice because compilers can
already see that writing through the dict->buf pointer cannot modify
the contents of *dict itself: The LZMA decoder makes a local copy of
the lzma_dict structure, and even if it didn't, the pointer to
lzma_dict in the LZMA decoder is already "restrict".
It's nice to add "restrict" anyway. uint8_t is typically unsigned char
which can alias anything. Without the above conditions or "restrict",
compilers could need to assume that writing through dict->buf might
modify *dict. This would matter in dict_repeat() because the loops
refer to dict->buf and dict->pos instead of making local copies of
those members for the duration of the loops. If compilers had to
assume that writing through dict->buf can affect *dict, then compilers
would need to emit code that reloads dict->buf and dict->pos after
every write through dict->buf.
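The aliasing concern can be shown with a simplified stand-in for the real code. In the sketch below, sketch_dict and sketch_repeat are hypothetical names, and the struct has only a few members compared to lzma_dict. The loop refers to dict->buf and dict->pos on every iteration. Because dict is "restrict" and *dict is modified through it, compilers may assume that the byte stores through dict->buf don't change dict->buf or dict->pos.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical, simplified stand-in for lzma_dict.
typedef struct {
    uint8_t *buf;   // History buffer
    size_t pos;     // Next write position in buf
    size_t limit;   // Writing must stop before this position
} sketch_dict;

// Repeat count bytes starting from distance + 1 bytes before
// dict->pos. A distance of 0 repeats the previous byte.
static void
sketch_repeat(sketch_dict *restrict dict, size_t distance, size_t count)
{
    assert(distance < dict->pos);
    assert(count <= dict->limit - dict->pos);

    // uint8_t is typically unsigned char, which can alias anything.
    // Without "restrict", each store below could in principle modify
    // *dict, forcing a reload of dict->buf and dict->pos.
    while (count-- > 0) {
        dict->buf[dict->pos] = dict->buf[dict->pos - distance - 1];
        ++dict->pos;
    }
}
```

For example, with "abc" in the buffer and pos == 3, calling sketch_repeat(&d, 2, 6) extends the buffer contents to "abcabcabc".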
Lasse Collin [Mon, 10 Mar 2025 11:13:30 +0000 (13:13 +0200)]
CMake: Fix tuklib_use_system_extensions
Revert to a macro so that list(APPEND CMAKE_REQUIRED_DEFINITIONS)
will affect the calling scope. I had forgotten that while CMake
functions inherit the variables from the parent scope, the changes
to them are local unless using set(... PARENT_SCOPE).
This also means that the commit message in 5bb77d0920dc is wrong. The
commit itself is still fine, making it clearer that -DHAVE_SYS_PARAM_H
is only needed for specific check_c_source_compiles() calls.
Lasse Collin [Sun, 9 Mar 2025 12:06:35 +0000 (14:06 +0200)]
CMake: Revise tuklib_use_system_extensions
Define NetBSD and Darwin/macOS feature test macros. Autoconf defines
these too (and a few others).
Define the macros on Windows except with MSVC. The _GNU_SOURCE macro
makes a difference with mingw-w64.
Use a function instead of a macro. Don't take the TARGET_OR_ALL argument
because there's always a global effect: the global variable
CMAKE_REQUIRED_DEFINITIONS is modified.
Lasse Collin [Thu, 6 Mar 2025 15:37:39 +0000 (17:37 +0200)]
Docs: Add a few TRANSLATORS comments to man pages
All translators know that --command-line-options must not be translated.
With some other strings it's not obvious when the untranslated string
must be preserved. These comments hopefully help.
Lasse Collin [Thu, 6 Mar 2025 14:34:32 +0000 (16:34 +0200)]
Scripts: Mark the LZMA Utils script aliases as deprecated
The deprecated aliases are lzcmp, lzdiff, lzless, lzmore,
lzgrep, lzegrep, and lzfgrep. The commands that start with
the xz prefix have identical behavior, for example, both
lzgrep and xzgrep handle all supported file formats.
This doesn't affect lzma, unlzma, lzcat, lzmadec, or lzmainfo.
The last release of LZMA Utils was made in 2008, but the lzma
compatibility alias for the gzip-like tool is still in common use.
Deprecating it would cause unnecessary breakage.
Lasse Collin [Mon, 17 Feb 2025 19:46:15 +0000 (21:46 +0200)]
Build: Fix out-of-tree builds when using the replacement getopt_long
Nowadays $(top_builddir)/lib/getopt.h depends on headers in
$(top_srcdir)/lib, so both have to be in the include path.
CMake-based build already did this.
Lasse Collin [Mon, 17 Feb 2025 16:11:58 +0000 (18:11 +0200)]
Build: Allow forcing the use of the replacement getopt_long
Now one can pass gl_replace_getopt=yes to configure to force the use
of GNU getopt_long from the lib directory. This only checks that the
value of gl_replace_getopt is non-empty, so one cannot force the
replacement to be disabled.
Lasse Collin [Mon, 3 Feb 2025 14:15:38 +0000 (16:15 +0200)]
Translations: Update Chinese (traditional) translation
Since there are no spaces between words, the unsophisticated automatic
word wrapping code needs some help. Compared to the version in the
Translation Project, I added a few \t characters which the word
wrapping code interprets as zero width spaces (hopefully they are
placed correctly). These edits can be seen with this command:
Lasse Collin [Sun, 2 Feb 2025 12:15:07 +0000 (14:15 +0200)]
Build: Update posix-shell.m4 from Gnulib
Tabs have been converted to spaces and a "serial" number has been
added. The previous version was from 2008/2009. There are no functional
changes since then but now it's clearer that the copy in XZ Utils
isn't outdated.
The new file was picked from the Gnulib commit
81a4c1e3b7692e95c0806d948cbab9148ad85ef2. A later commit adds
a warranty disclaimer to the license, which obviously is fine,
but I didn't find an SPDX license identifier for the new license,
so for simplicity I used the earlier commit.
Lasse Collin [Sun, 2 Feb 2025 10:51:03 +0000 (12:51 +0200)]
Build: Check for -fsanitize= also in $CC
People may put -fsanitize in CC instead of CFLAGS so check both.
Landlock sandbox isn't compatible with sanitizers so it's nice
to catch the incompatible options at configure time.
Don't attempt to do the same in CMakeLists.txt; the check for
CMAKE_C_FLAGS / CFLAGS shall be enough there. The extra flags from
the CC environment variable go into the undocumented internal variable
CMAKE_C_COMPILER_ARG1 (all flags from CC go into that same variable).
Peeking at the internal variable merely for improved diagnostics isn't
worth it.
Lasse Collin [Tue, 28 Jan 2025 14:28:18 +0000 (16:28 +0200)]
Windows: Avoid an error message on broken pipe
Also make xz not process more input files after a broken pipe has
been detected. This matches the behavior on POSIX. If all files
are being written to standard output, trying with the next file is
pointless when it's known that standard output won't accept more data.
xzdec already stopped after the first error. It does so with all
errors, so it differs from xz:
$ xz -dc not_found_1 not_found_2
xz: not_found_1: No such file or directory
xz: not_found_2: No such file or directory
$ xzdec not_found_1 not_found_2
xzdec: not_found_1: No such file or directory
Lasse Collin [Wed, 22 Jan 2025 13:03:55 +0000 (15:03 +0200)]
Windows: Disable MinGW-w64's stdio functions in size-optimized builds
This only affects builds with UCRT. With legacy MSVCRT, the replacement
functions are always enabled.
Omitting the MinGW-w64 replacements saves over 20 KiB per executable.
The downside is that --enable-small or XZ_SMALL=ON disables thousand
separator support in xz messages. If someone is OK with the slower
speed of slightly smaller builds, lack of thousand separators won't
matter.
Don't override __USE_MINGW_ANSI_STDIO if it is already defined (via
CPPFLAGS or such method).