W. Felix Handte [Tue, 10 May 2022 21:29:39 +0000 (14:29 -0700)]
ZSTD_fast_noDict: Minimize Checks When Writing Hash Table for ip1
This commit avoids checking whether a hashtable write is safe in two of the
three match-found paths in `ZSTD_compressBlock_fast_noDict_generic`. This pro-
duces a ~0.5% speed-up in compression.
A comment in the code describes why we can skip this check in the other two
paths (the repcode check and the first match check in the unrolled loop).
A downside is that in the new position where we make this check, we have not
yet computed `mLength`. We therefore have to avoid writing *possibly* dangerous
positions, rather than the old check which only avoids writing *actually*
dangerous positions. This leads to a miniscule loss in ratio (remember that
this scenario can only been triggered in very negative levels or under incomp-
ressibility acceleration).
Eli Schwartz [Thu, 28 Apr 2022 22:22:55 +0000 (18:22 -0400)]
meson: for internal linkage, link to both libzstd and a static copy of it
Partial, Meson-only implementation of #2976 for non-MSVC builds.
Due to the prevalence of private symbol reuse, linking to a shared
library is simply utterly unreliable, but we still want to defer to the
shared library for installable applications. By linking to both, we can
share symbols where possible, and statically link where needed.
This means we no longer need to manually track every file that needs to
be extracted and reused.
The flip side is that MSVC completely does not support this, so for MSVC
builds we just link to a full static copy even where
-Ddefault_library=shared.
As a side benefit, by using library inclusion rather than including
extra explicit object files, the zstd program shrinks in size slightly
(~4kb).
Dirk Müller [Thu, 10 Mar 2022 00:34:12 +0000 (01:34 +0100)]
Split help in long and short version, cleanup formatting
Adopt the more standard Usage: formatting style
List short and long options alongside where available
Print lists as a table
Use command style description
Dirk Müller [Wed, 9 Mar 2022 23:25:05 +0000 (00:25 +0100)]
Handle newer less versions in zstdless testing
Newer less versions appear to have changed how stderr
and stdout are showing error messages. hardcode the
expected behavior to make the tests pass with any less version.
Also set locale to C so that the strings are matching.
Cyber Knight [Thu, 10 Mar 2022 07:32:13 +0000 (15:32 +0800)]
[contrib][linux] Make zstd_reset_cstream() functionally identical to ZSTD_resetCStream()
- As referenced by Nick Terrelln ~ the ZSTD maintainer in the linux kernel, making zstd_reset_cstream() functionally identical to ZSTD_resetCStream() would be the perfect way to fix the warning without touching any core functions or breaking other parts of the code.
Suggested-by: Nick Terrell <terrelln@fb.com> Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Dirk Müller [Mon, 7 Feb 2022 19:39:15 +0000 (20:39 +0100)]
Keep original file if -c or --stdout is given
Set removeSrcFile back to false when -c or --stdout is used to improve
compatibility with gzip(1) behavior.
gzip(1) is removing the original file on compression unless --stdout or
/-c is used. zstd is defaulting to keep the file unless --rm is used or
when it is called via a gzip symlink, in which it is removing by
default. Specifying -c/--stdout turns this behavior off.
Cyber Knight [Tue, 8 Mar 2022 05:38:06 +0000 (13:38 +0800)]
[contrib][linux] Use ZSTD_CCtx_setPledgedSrcSize() instead of ZSTD_CCtx_reset()
- The previous patch throws the following warning:
../linux/lib/zstd/zstd_compress_module.c: In function ‘zstd_reset_cstream’:
../linux/lib/zstd/zstd_compress_module.c:136:34: error: enum conversion when passing argument 2 of ‘ZSTD_CCtx_reset’ is invalid in C++ [-Werror=c++-compat]
136 | return ZSTD_CCtx_reset(cstream, pledged_src_size);
| ^~~~~~~~~~~~~~~~
In file included from ../linux/include/linux/zstd.h:26,
from ../linux/lib/zstd/zstd_compress_module.c:15:
../linux/include/linux/zstd_lib.h:501:20: note: expected ‘ZSTD_ResetDirective’ {aka ‘enum <anonymous>’} but argument is of type ‘long long unsigned int’
501 | ZSTDLIB_API size_t ZSTD_CCtx_reset(ZSTD_CCtx* cctx, ZSTD_ResetDirective reset);
| ^~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
Since we have a choice to either use ZSTD_CCtx_reset or ZSTD_CCtx_setPledgedSrcSize instead of ZSTD_resetCStream, let's switch to ZSTD_CCtx_setPledgedSrcSize to not have any unnecessary warns alongside the kernel build and CI test build.
Cyber Knight [Mon, 7 Mar 2022 03:55:33 +0000 (11:55 +0800)]
[contrib][linux] Fix a warning in zstd_reset_cstream()
- This fixes the below warning:
../lib/zstd/zstd_compress_module.c: In function 'zstd_reset_cstream':
../lib/zstd/zstd_compress_module.c:136:9: warning: 'ZSTD_resetCStream' is deprecated [-Wdeprecated-declarations]
136 | return ZSTD_resetCStream(cstream, pledged_src_size);
| ^~~~~~
In file included from ../include/linux/zstd.h:26,
from ../lib/zstd/zstd_compress_module.c:15:
../include/linux/zstd_lib.h:2277:8: note: declared here
2277 | size_t ZSTD_resetCStream(ZSTD_CStream* zcs, unsigned long long pledgedSrcSize);
| ^~~~~~~~~~~~~~~~~
ZSTD_resetCstream is deprecated and zstd_CCtx_reset is suggested to use hence let's switch to it.
Ilya Tokar [Wed, 23 Feb 2022 22:59:56 +0000 (17:59 -0500)]
Use helper function for bit manipulations.
We already have BIT_getLowerBits, so use it. Benefits are 2fold:
1) Somewhat cleaner code
2) We are now using bzhi instructions, when available. Performance
delta is too small for microbenchmarks, but avoiding load still helps
larger applications, by reducing data cache pressure.
Nick Terrell [Mon, 7 Feb 2022 23:20:42 +0000 (15:20 -0800)]
[cli-tests] Fix zstd symlinks
The zstd symlinks, notably `zstdcat`, weren't working as expected
because only the `tests/cli-tests/bin/zstd` wrapper was symlinked. We
still invoked `zstd` with the name `zstd`. The fix is to create a
directory of zstd symlinks in `tests/cli-tests/bin/symlinks` for each
name that zstd recognizes. And when `tets/cli-tests/bin/zstd` is
invoked, it selects the correct symlink to call.
See the test `zstd-cli/zstdcat.sh` for an example of how it would work.
* playtests.sh: fix for a bug in macos' /bin/sh that persists temporary env vars when introduced before function calls
* cli-tests/run.py: Do not use existing ZSTD* envvars
Fix required decompression memory usage reported by -vv + --long
The use of --long alters the window size internally in the underlying
library (lib/compress/zstd_compress.c:ZSTD_getCParamsFromCCtxParams),
which changes the memory required for decompression. This means that the
reported requirement from the zstd binary when -vv is specified is
incorrect.
A full fix for this would be to add an API call to be able to retrieve
the required decompression memory from the library, but as a
lighterweight fix we can just take account of the fact we've enabled
long mode and update our verbose output appropriately.
Yann Collet [Mon, 31 Jan 2022 17:34:04 +0000 (09:34 -0800)]
fix 44239
credit to oss-fuzz
This issue could happen when using the new Sequence Compression API in Explicit Delimiter Mode
with a too small dstCapacity.
In which case, there was one place where the buffer size wasn't checked.