Yann Collet [Wed, 8 Dec 2021 23:05:17 +0000 (15:05 -0800)]
remove offending static assert lines
no idea why visual + clang-cl + appveyor don't like them,
I've not been able to reproduce the issue locally,
but these static assert are very unlikely to deliver a useful signal,
I can't imagine a situation where they will be wrong,
and if they are, then a ton of other things will be broken way before reaching that point.
Nick Terrell [Wed, 1 Dec 2021 20:52:23 +0000 (12:52 -0800)]
[asm] Share portability macros and restrict ASM further
Move portability macros to `lib/common/portability_macros.h`. This file
only contains platform/feature detection (e.g. 0/1 macros). This file is
shared between C and ASM code, so it cannot include any C code.
Rename `HUF_` ASM macros to be `ZSTD_` prefixed, and move to the new
header.
Restrict `ZSTD_ASM_SUPPORTED` to `__GNUC__`, because we need the GAS
assembler.
Finally, only include the ASM code if we are actually going to use it.
This disables it on all Windows platforms, which should resolve the
problem brought up in Issue #2789.
Nick Terrell [Thu, 2 Dec 2021 21:32:53 +0000 (13:32 -0800)]
Improve zstd_opt build speed and size
Use the same trick as we did for zstd_lazy in PR #2828:
* Create one search function specialization for each (dictMode, mls).
* Select the search function pointer at the top of the match finder.
Additionally, we no longer inline `ZSTD_compressBlock_opt_generic` into
every function, since `dictMode` is no longer used as a template. Create
two specializations, for opt levels 0 and 2, and call one of the two
specializations.
Lastly, remove the hack that disabled inlining for zstd_opt for the
Linux Kernel, as we've gotten most of the benefit already.
W. Felix Handte [Thu, 2 Dec 2021 19:25:33 +0000 (14:25 -0500)]
Avoid Using Deprecated Functions in Deprecated Code
`lib/deprecated` is no longer built by zstd's bundled build files. However,
users may try to build these files when they import the source tree into
their own build systems. And if they have `-Wdeprecated-declarations` on,
this can produce warnings.
This PR migrates these files away from using deprecated declarations.
Nick Terrell [Wed, 1 Dec 2021 19:49:58 +0000 (11:49 -0800)]
[CircleCI] Fix short-tests-0
short-tests-0 were silently failing. I think because of the && make clean construction. Switch to ; instead.
Also fix all the test failures that were exposed.
`make all` is failing on CircleCI because it is missing Docker. Move that test
to GitHub actions, and switch the pedantic CircleCI test to `make allmost`.
Nick Terrell [Wed, 1 Dec 2021 02:26:11 +0000 (18:26 -0800)]
[contrib][pzstd] Fix build issue with gcc-5
gcc-5 didn't like the l-value overload for defaulted operator=. There is
no reason it needs to be l-value overloaded, so just remove it.
I'm not sure why the build broke for @mckaygerhard in Issue #2811, since
this code hasn't changed since it was added. But, there is no harm in
fixing it.
Nick Terrell [Tue, 30 Nov 2021 20:29:55 +0000 (12:29 -0800)]
[test] Test that the exec-stack bit isn't set on libzstd.so
Tests that libzstd.so doesn't have the exec-stack bit set using
readelf. If the stack is marked executable systemd will refuse
to link against zstd. We now test that it isn't set on every PR.
Nick Terrell [Wed, 1 Dec 2021 00:51:16 +0000 (16:51 -0800)]
[zdict] Remove ZDICT_CONTENTSIZE_MIN restriction for ZDICT_finalizeDictionary
Allow the `dictContentSize` to be any size. The finalized dictionary
content size must be at least as large as the maximum repcode (8). So we
add zero bytes to the dictionary to ensure that we meet that
requirement.
I've removed this restriction because its been causing us headaches when
people complain that dictionary training failed. It fails because there
isn't enough useful content to put in the dictionary. Either because
every sample is exactly the same and less than ZDICT_CONTENTSIZE_MIN bytes,
or there isn't enough content. Instead, we should succeed in creating
the dictionary, and it is up to the user to decide if it is worthwhile.
It is possible that the tables alone provide enough value.
NOTE: This allows us to produce dictionaries with finalized
`dictContentSize < ZDICT_CONTENTSIZE_MIN`. But, they are still valid
zstd dictionaries. We could remove the `ZDICT_CONTENTSIZE_MIN` macro,
but I've decided to leave that for now, so we don't break users.
Nick Terrell [Wed, 1 Dec 2021 01:43:28 +0000 (17:43 -0800)]
[bmi2] Add lzcnt and bmi target attributes
* When dynamic dispatching to bmi2 add lzcnt and bmi to the
TARGET_ATTRIBUTE.
* Centralize the bmi2 TARGET_ATTRIBUTE definition to
BMI2_TARGET_ATTRIBUTE so we can change it in the future.
* Only enable bmi2 when both bmi1 & bmi2 are supported. There shouldn't
be any cases where bmi2 is supported but bmi1 isn't. But, since we are
using the instruction we should check bmi1 as well.
W. Felix Handte [Wed, 17 Nov 2021 23:09:18 +0000 (18:09 -0500)]
Determinism: Avoid Mapping Window into Reserved Indices during Reduction
PR #2850 attempted to fix a determinism bug that was uncovered by OSS-Fuzz. It
succeeded in addressing that source of non-determinism, but introduced a new
one: it was possible, when index reduction occurred, to map indices in the
window to the reserved value, which would cause them to be zeroed, potentially
altering parsing of the input.
This PR addresses this issue. It makes sure that the bottom of the window is
always `>= ZSTD_WINDOW_START_INDEX`.
I'm not sure if this makes #2850 redundant. I think it's probably still
valuable to have that protection as well.
Nick Terrell [Tue, 16 Nov 2021 22:25:18 +0000 (14:25 -0800)]
[linux-kernel] Don't add -O3 to CFLAGS
It is no longer necessary to get good performance, there is only a small
speed difference between -O2 and -O3, so just stick to the default of
-O2. I've measured neutral compression speed and a ~3% decompression
speed loss in userspace with clang & gcc. I've also measured neutral
compression speed and a ~1% decompression speed loss in the kernel
benchmarks.
This also fixes the stack space usage on parisc. The compiler was buggy
for -O3 and used ~3KB of stack space for several functions. With -O2 the
problem is completely resolved, and stack space is back to a few hundred
bytes.
Additionally, we get a large code size win on gcc:
Nick Terrell [Tue, 16 Nov 2021 00:57:00 +0000 (16:57 -0800)]
[linux-kernel] Don't inline function in zstd_opt.c
The optimal parser is unlikely to be used in the linux kernel in
practice. There is no reason these functions should be force inlined,
since we aren't gaining anything, and are losing build size.
Nick Terrell [Tue, 16 Nov 2021 01:25:24 +0000 (17:25 -0800)]
Reduce function size in fast & dfast
Take the same approach as in PR #2828 [0] to remove functions that force
inline many function bodies and `switch`. Instead, create one function per
"template" combination, and then switch between these functions. This
allows the compiler to break the large function into many small
functions, which generally helps codegen.
Also, in the `extDict` modes when there is no ext-dict, call the top
level function instead of the force inlined one, to save on code size.
I'm specifically doing this because gcc on the parisc architecture doesn't
handle the large function body well, and ends up using a lot of excess
stack space. Outlining these functions fixes it.
ko-zu [Sat, 13 Nov 2021 13:48:33 +0000 (22:48 +0900)]
Remove executable flag from GNU_STACK section
Putting stack marking into every assembly files is required to indicate
that the stack does not need to be executable.
Executable flag on stack conflicts with some security measures, Systemd
MemoryDenyWriteExecute=yes for example.