Eli Schwartz [Tue, 2 Jan 2024 06:36:45 +0000 (01:36 -0500)]
CI: meson: use builtin handling for MSVC
This avoids downloading -- and periodically bumping the checksum for --
a third-party action that isn't strictly required, and thus helps keep
down dependencies and reduce update churn.
Fix a nullptr dereference in ZSTD_createCDict_advanced2()
If the relevant allocation returns NULL, ZSTD_createCDict_advanced_internal()
will return NULL. But ZSTD_createCDict_advanced2() doesn't check for
this and attempts to use the returned pointer anyway, which leads to
a segfault.
Nick Terrell [Tue, 21 Nov 2023 21:26:25 +0000 (13:26 -0800)]
Modernize macros to use `do { } while (0)`
This PR introduces no functional changes. It attempts to change all
macros currently using `{ }` or some variant of that to to
`do { } while (0)`, and introduces trailing `;` where necessary.
There were no bugs found during this migration.
The bug in Visual Studios warning on this has been fixed since VS2015.
Additionally, we have several instances of `do { } while (0)` which have
been present for several releases, so we don't have to worry about
breaking peoples builds.
Nick Terrell [Mon, 20 Nov 2023 20:04:30 +0000 (12:04 -0800)]
[huf] Fix null pointer addition
`HUF_DecompressFastArgs_init()` was adding 0 to NULL. Fix it by exiting
early for empty outputs. This is no change in behavior, because the
function was already exiting 0 in this case, just slightly later.
Nick Terrell [Mon, 20 Nov 2023 19:33:57 +0000 (11:33 -0800)]
[huf] Improve fast C & ASM performance on small data
* Rename `ilimit` to `ilowest` and set it equal to `src` instead of
`src + 6 + 8`. This is safe because the fast decoding loops guarantee
to never read below `ilowest` already. This allows the fast decoder to
run for at least two more iterations, because it consumes at most 7
bytes per iteration.
* Continue the fast loop all the way until the number of safe iterations
is 0. Initially, I thought that when it got towards the end, the
computation of how many iterations of safe might become expensive. But
it ends up being slower to have to decode each of the 4 streams
individually, which makes sense.
This drastically speeds up the Huffman decoder on the `github` dataset
for the issue raised in #3762, measured with `zstd -b1e1r github/`.
| Decoder | Speed before | Speed after |
|----------|--------------|-------------|
| Fallback | 477 MB/s | 477 MB/s |
| Fast C | 384 MB/s | 492 MB/s |
| Assembly | 385 MB/s | 501 MB/s |
We can also look at the speed delta for different block sizes of silesia
using `zstd -b1e1r silesia.tar -B#`.
Nick Terrell [Sat, 18 Nov 2023 02:20:19 +0000 (18:20 -0800)]
[huf] Improve fast huffman decoding speed in linux kernel
gcc in the linux kernel was not unrolling the inner loops of the Huffman
decoder, which was destroying decoding performance. The compiler was
generating crazy code with all sorts of branches. I suspect because of
Spectre mitigations, but I'm not certain. Once the loops were manually
unrolled, performance was restored.
Additionally, when gcc couldn't prove that the variable left shift in
the 4X2 decode loop wasn't greater than 63, it inserted checks to verify
it. To fix this, mask `entry.nbBits & 0x3F`, which allows gcc to eliete
this check. This is a no op, because `entry.nbBits` is guaranteed to be
less than 64.
Lastly, introduce the `HUF_DISABLE_FAST_DECODE` macro to disable the
fast C loops for Issue #3762. So if even after this change, there is a
performance regression, users can opt-out at compile time.
Nick Terrell [Fri, 17 Nov 2023 01:15:25 +0000 (17:15 -0800)]
[debug] Don't define g_debuglevel in the kernel
We only use this constant when `DEBUGLEVEL>=2`, but we get
-Werror=pedantic errors for empty translation units, so still define it
except in kernel environments.
More recent versions of CMake emit the following warning:
CMake Deprecation Warning at cmake/CMakeLists.txt:10 (cmake_minimum_required):
Compatibility with CMake < 3.5 will be removed from a future version of
CMake.
Update the VERSION argument <min> value or use a ...<max> suffix to tell
CMake that the project does not need compatibility with older versions.
Nick Terrell [Wed, 27 Sep 2023 00:53:26 +0000 (17:53 -0700)]
Stop suppressing pointer-overflow UBSAN errors
* Remove all pointer-overflow suppressions from our UBSAN builds/tests.
* Add `ZSTD_ALLOW_POINTER_OVERFLOW_ATTR` macro to suppress
pointer-overflow at a per-function level. This is a superior approach
because it also applies to users who build zstd with UBSAN.
* Add `ZSTD_wrappedPtr{Diff,Add,Sub}()` that use these suppressions.
The end goal is to only tag these functions with
`ZSTD_ALLOW_POINTER_OVERFLOW`. But we can start by annoting functions
that rely on pointer overflow, and gradually transition to using
these.
* Add `ZSTD_maybeNullPtrAdd()` to simplify pointer addition when the
pointer may be `NULL`.
* Fix all the fuzzer issues that came up. I'm sure there will be a lot
more, but these are the ones that came up within a few minutes of
running the fuzzers, and while running GitHub CI.
To the best of my knowledge:
* `_WIN32` and `_WIN64` are defined by the compiler,
* `WIN32` and `WIN64` are defined by the user, to indicate whatever
the user chooses them to indicate. They mean 32-bit and 64-bit Windows
compilation by convention only.
Windows compilers in general, and MSVC in particular, have been defining
`_WIN32` and `_WIN64` for a long time, provably at least since Visual Studio
2015, and in practice as early as in the days of 16-bit Windows.
So, do not prefix `BINDIR` with `DESTDIR`, and adapt all paths for
installation. This is more common, and, for example, `programs/Makefile`
does the same.
Nick Terrell [Thu, 24 Aug 2023 21:41:21 +0000 (14:41 -0700)]
Fix & refactor Huffman repeat tables for dictionaries
The Huffman repeat mode checker assumed that the CTable was zeroed in the region `[maxSymbolValue + 1, 256)`.
This assumption didn't hold for tables built in the dictionaries, because it didn't go through the same codepath.
Since this code was originally written, we added a header to the CTable that specifies the `tableLog`.
Add `maxSymbolValue` to that header, and check that the table's `maxSymbolValue` is at least the block's `maxSymbolValue`.
This solution is cleaner because we write this header for every CTable we build, so it can't be missed in any code path.
Dmitry V. Levin [Tue, 8 Aug 2023 08:00:00 +0000 (08:00 +0000)]
zdictlib: fix prototype mismatch
Fix the following warnings reported by the compiler when
ZDICTLIB_STATIC_API is not defined to ZDICTLIB_API:
lib/dictBuilder/cover.c:1122:21: warning: redeclaration of 'ZDICT_optimizeTrainFromBuffer_cover' with different visibility (old visibility
preserved)
lib/dictBuilder/cover.c:736:21: warning: redeclaration of 'ZDICT_trainFromBuffer_cover' with different visibility (old visibility
+preserved)
lib/dictBuilder/fastcover.c:549:1: warning: redeclaration of 'ZDICT_trainFromBuffer_fastCover' with different visibility (old visibility
preserved)
lib/dictBuilder/fastcover.c:618:1: warning: redeclaration of 'ZDICT_optimizeTrainFromBuffer_fastCover' with different visibility (old
visibility preserved)
Nick Terrell [Mon, 21 Aug 2023 18:33:29 +0000 (11:33 -0700)]
No longer reject dictionaries with literals maxSymbolValue < 255
We already have logic in our Huffman encoder to validate Huffman tables with missing symbols.
We use this for higher compression levels to re-use the previous blocks statistics, or when the dictionaries table has zero-weighted symbols.
This check was leftover as an oversight from before we added validation for Huffman tables.
I validated that the `dictionary_loader` fuzzer has coverage of every line in the `ZSTD_loadCEntropy()` function to validate that it is correctly testing this function.
W. Felix Handte [Wed, 16 Aug 2023 16:09:12 +0000 (12:09 -0400)]
Unpoison Workspace Memory Before Freeing to Custom Free
MSAN is hooked into the system malloc, but when the user provides a custom
allocator, it may not provide the same cleansing behavior. So if we leave
memory poisoned and return it to the user's allocator, where it is re-used
elsewhere, our poisoning can blow up in some other context.