Yonatan Komornik [Mon, 13 Mar 2023 20:20:49 +0000 (13:20 -0700)]
Add init once memory (#3528) (#3529)
- Adds memory type that is guaranteed to have been initialized at least once in the workspace's lifetime.
- Changes tag space in row hash to be based on init once memory.
Yonatan Komornik [Mon, 13 Mar 2023 17:00:03 +0000 (10:00 -0700)]
[Bugfix] row hash tries to match position 0 (#3548)
#3543 decreases the size of the tagTable by a factor of 2, which requires using the first tag position in each row for head position instead of a tag.
Although position 0 stopped being a valid match, it still persisted in mask calculation resulting in the matches loops possibly terminating before it should have. The fix skips position 0 to solve this problem.
Yonatan Komornik [Fri, 10 Mar 2023 22:15:04 +0000 (14:15 -0800)]
Reduce RowHash's tag space size by x2 (#3543)
Allocate half the memory for tag space, which means that we get one less slot for an actual tag (needs to be used for next position index).
The results is a slight loss in compression ratio (up to 0.2%) and some regressions/improvements to speed depending on level and sample. In turn, we get to save 16% of the hash table's space (5 bytes per entry instead of 6 bytes per entry).
Yann Collet [Fri, 10 Mar 2023 01:48:35 +0000 (17:48 -0800)]
Improved seekable format ingestion speed for small frame size
As reported by @P-E-Meunier in https://github.com/facebook/zstd/issues/2662#issuecomment-1443836186,
seekable format ingestion speed can be particularly slow
when selected `FRAME_SIZE` is very small,
especially in combination with the recent row_hash compression mode.
The specific scenario mentioned was `pijul`,
using frame sizes of 256 bytes and level 10.
This is improved in this PR,
by providing approximate parameter adaptation to the compression process.
Tested locally on a M1 laptop,
ingestion of `enwik8` using `pijul` parameters
went from 35sec. (before this PR) to 2.5sec (with this PR).
For the specific corner case of a file full of zeroes,
this is even more pronounced, going from 45sec. to 0.5sec.
These benefits are unrelated to (and come on top of) other improvement efforts currently being made by @yoniko for the row_hash compression method specifically.
The `seekable_compress` test program has been updated to allows setting compression level,
in order to produce these performance results.
Nick Terrell [Tue, 7 Mar 2023 23:15:40 +0000 (15:15 -0800)]
Add ZSTD_set{C,F,}Params() helper functions
* Add ZSTD_setFParams() and ZSTD_setParams()
* Modify ZSTD_setCParams() to use ZSTD_setParameter() to avoid a second path setting parameters
* Add unit tests
* Update documentation to suggest using them to replace deprecated functions
Adds initialization of clevel to static cdict (#3525) (#3527)
- Initializes clevel in `ZSTD_CCtxParams_init`
- Adds CI workflow for msan fuzzers runs without optimization (`-O0`)
- Fixes Makefile to correctly pass on user defined `MOREFLAGS` and `FUZZER_FLAGS` in cases they have been overwritten
Nick Terrell [Wed, 22 Feb 2023 22:49:08 +0000 (14:49 -0800)]
[bug-fix] Fix rare corruption bug affecting the block splitter
The block splitter confuses sequences with literal length == 65536 that use a
repeat offset code. It interprets this as literal length == 0 when deciding the
meaning of the repeat offset, and corrupts the repeat offset history. This is
benign, merely causing suboptimal compression performance, if the confused
history is flushed before the end of the block, e.g. if there are 3 consecutive
non-repeat code sequences after the mistake. It also is only triggered if the
block splitter decided to split the block.
All that to say: This is a rare bug, and requires quite a few conditions to
trigger. However, the good news is that if you have a way to validate that the
decompressed data is correct, e.g. you've enabled zstd's checksum or have a
checksum elsewhere, the original data is very likely recoverable. So if you were
affected by this bug please reach out.
The fix is to remind the block splitter that the literal length is actually 64K.
The test case is a bit tricky to set up, but I've managed to reproduce the issue.
Thanks to @danlark1 for alerting us to the issue and providing us a reproducer!
Sutou Kouhei [Tue, 14 Feb 2023 21:38:50 +0000 (06:38 +0900)]
Don't require CMake 3.18 or later
fix #3500
CMake 3.18 or later was required by #3392. Because it uses
`CheckLinkerFlag`. But requiring CMake 3.18 or later is a bit
aggressive. Because Ubuntu 20.04 LTS still uses CMake 3.16.3:
https://packages.ubuntu.com/search?keywords=cmake
This change disables `-z noexecstack` check with old CMake. This will
not break any existing users. Because users who need `-z noexecstack`
must already use CMake 3.18 or later.
Yonatan Komornik [Tue, 14 Feb 2023 02:00:13 +0000 (18:00 -0800)]
CI workflow to test external compressors dependencies
Implemented CI workflow for testing compilation with external compressors and without them. This serves as a sanity check to avoid any code dependencies on libraries that may not always be present. (Reference: #3497 for a bug fix related to this issue.)
Yonatan Komornik [Sun, 12 Feb 2023 20:32:31 +0000 (12:32 -0800)]
Fix zstd-dll build missing dependencies (#3496)
* Fixes zstd-dll build (https://github.com/facebook/zstd/issues/3492):
- Adds pool.o and threading.o dependency to the zstd-dll target
- Moves custom allocation functions into header to avoid needing to add dependency on common.o
- Adds test target for zstd-dll
- Adds github workflow that buildis zstd-dll
Eli Schwartz [Fri, 10 Feb 2023 04:55:09 +0000 (23:55 -0500)]
meson: always build the zstd binary when tests are enabled
We need to run it for the tests, even if programs are disabled. So if
they are disabled, create a build rule for the program, but don't
install it. Just make it available for the test itself.
Eli Schwartz [Fri, 10 Feb 2023 05:28:47 +0000 (00:28 -0500)]
meson: correctly specify the dependency relationship for playtests
It depends on the zstd program being built, and passes it as an env
variable. Just like datagen. But for datagen, we explicitly depend on
it, while for zstd, we assume it's built as part of "all".
This can be wrong in two cases:
- when running individual tests, meson can (re)build just what is needed
for that one test
- a later patch will handle building zstd but not by default
Nick Terrell [Wed, 8 Feb 2023 19:42:46 +0000 (11:42 -0800)]
Fix empty-block.zst golden decompression file
This frame is invalid because the `Window_Size = 0`, and the
`Block_Maximum_Size = min(128 KB, Window_Size) = 0`. But the empty
compressed block has a `Block_Content` size of 2, which is invalid.
The fix is to switch to using a `Window_Descriptor` instead of the
`Single_Segment_Flag`. This sets the `Window_Size = 1024`.
Hexdump before this PR: `28b5 2ffd 2000 1500 0000 00`
Hexdump after this PR: `28b5 2ffd 0000 1500 0000 00`
Yann Collet [Tue, 7 Feb 2023 08:35:51 +0000 (00:35 -0800)]
return error code when benchmark fails
such scenario can happen, for example,
when trying a decompression-only benchmark on invalid data.
Other possibilities include an allocation error in an intermediate step.
So far, the benchmark would return immediately, but still return 0.
On command line, this would be confusing, as the program appears successful (though it does not display any successful message).
Now it returns !0, which can be interpreted as an error by command line.
W. Felix Handte [Mon, 6 Feb 2023 16:05:47 +0000 (08:05 -0800)]
Use File Descriptor in Setting Stat on Output File
Note that the `fd` is only valid while the file is still open. So we need to
move the setting calls to before we close the file. However! We cannot do so
with the `utime()` call (even though `futimens()` exists) because the follow-
ing `close()` call to the `fd` will reset the atime of the file. So it seems
the `utime()` call has to happen after the file is closed.
W. Felix Handte [Fri, 3 Feb 2023 21:48:34 +0000 (13:48 -0800)]
Introduce Variants of Some Functions that Take Optional File Descriptors
Somewhat surprisingly, calling `fchmod()` is non-trivially faster than calling
`chmod()`, and so on.
This commit introduces alternate variants to some common file util functions
that take an optional fd. If present, they call the `f`-variant of the
underlying function. Otherwise, they fall back to the regular filename-taking
version of the function.
Nick Terrell [Thu, 2 Feb 2023 18:53:08 +0000 (10:53 -0800)]
Fix ZSTD_getOffsetInfo() when nbSeq == 0
In 32-bit mode, ZSTD_getOffsetInfo() can be called when nbSeq == 0, and
in this case the offset table is uninitialized. The function should just
return 0 for both values, because there are no sequences.
Nick Terrell [Thu, 2 Feb 2023 00:38:36 +0000 (16:38 -0800)]
Fix 32-bit decoding with large dictionary
The 32-bit decoder could corrupt the regenerated data by using regular
offset mode when there were actually long offsets. This is because we
were only considering the window size in the calculation, not the
dictionary size. So a large dictionary could allow longer offsets.
Fix this in two ways:
1. Instead of looking at the window size, look at the total referencable
bytes in the history buffer. Use this in the comparison instead of
the window size. Additionally, we were comparing against the wrong
value, it was too low. Fix that by computing exactly the maximum
offset for regular sequence decoding.
2. If it is possible that we have long offsets due to (1), then check
the offset code decoding table, and if the decoding table's maximum
number of additional bits is no more than STREAM_ACCUMULATOR_MIN,
then we can't have long offsets.
This gates us to be using the long offsets decoder only when we are very
likely to actually have long offsets.
Note that this bug only affects the decoding of the data, and the
original compressed data, if re-read with a patched decoder, will
correctly regenerate the orginal data. Except that the encoder also had
the same issue previously.