[zstd][dict] Ensure that dictionary training functions are fully reentrant
The two main functions used for dictionary training using the COVER
algorithm require initialization of a COVER_ctx_t where a call
to qsort() is performed.
The issue is that the standard C99 qsort() function doesn't offer
a way to pass an extra parameter for the comparison function callback
(e.g. a pointer to a context) and currently zstd relies on a *global*
static variable to hold a pointer to a context needed to perform
the sort operation.
If a zstd library user invokes either ZDICT_trainFromBuffer_cover or
ZDICT_optimizeTrainFromBuffer_cover from multiple threads, the
global context may be overwritten before or during the call to qsort()
in the initialization of the COVER_ctx_t, leading to crashes
and other bad things (Tm) as reported in issue #4045.
Enter qsort_r(): it was designed to address precisely this situation.
To quote from the documentation [1]: "the comparison function does not need to
use global variables to pass through arbitrary arguments, and is therefore
reentrant and safe to use in threads."
It is available with small variations for multiple OSes (GNU, BSD[2],
Windows[3]), and the ISO C11 [4] standard features qsort_s() in annex B-21 as
part of <stdlib.h>. Let's hope that compilers eventually catch up
with it.
For now, we have to handle the small variations in function parameters
for each platform.
The current fix solves the problem by allowing each executing thread
to pass its own COVER_ctx_t instance to qsort_r(), removing the use of
a global pointer and allowing the code to be reentrant.
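As a minimal sketch of the idea (not the actual zstd code), the comparator below receives its context through the extra qsort_r() argument instead of a global. It assumes the GNU variant of qsort_r(), where the context is the last parameter. `sketch_ctx_t`, `sketch_cmp` and `sketch_sort_suffixes` are hypothetical names, and the struct is a simplified stand-in for COVER_ctx_t.

```c
#define _GNU_SOURCE /* assumption: glibc, which exposes qsort_r() this way */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for COVER_ctx_t: only what the comparator needs. */
typedef struct {
    const unsigned char *samples; /* buffer the suffix positions index into */
    size_t d;                     /* number of bytes compared per suffix */
} sketch_ctx_t;

/* GNU-style comparator: the context arrives as the third argument,
 * so no global variable is needed and each thread can use its own. */
static int sketch_cmp(const void *lp, const void *rp, void *opaque)
{
    const sketch_ctx_t *ctx = (const sketch_ctx_t *)opaque;
    unsigned lhs = *(const unsigned *)lp;
    unsigned rhs = *(const unsigned *)rp;
    int r = memcmp(ctx->samples + lhs, ctx->samples + rhs, ctx->d);
    if (r != 0) return r;
    /* Tie-break on position so the result is deterministic. */
    return (lhs < rhs) ? -1 : (lhs > rhs);
}

/* Sorts suffix positions using a caller-owned context. */
static void sketch_sort_suffixes(sketch_ctx_t *ctx, unsigned *suffix, size_t count)
{
    assert(ctx != NULL && suffix != NULL);
    qsort_r(suffix, count, sizeof(suffix[0]), sketch_cmp, ctx);
}
```

Because the context travels with the call, two threads can each run sketch_sort_suffixes() on their own sketch_ctx_t without interfering. Other platforms order the arguments differently, which is why the per-platform handling is needed.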
Unfortunately for *BSD, we cannot leverage qsort_r() given that its API
has changed on newer versions of FreeBSD (14.0) and the other BSD variants
(e.g. NetBSD, OpenBSD) don't implement it.
For such cases we provide a fallback that only requires a compiler
with C90 support.
Quentin Boswank [Wed, 5 Jun 2024 16:21:34 +0000 (18:21 +0200)]
Fix $filter and Msys/Cygwin
- switched the pattern and input of $filter into the right places
- added pattern wildcard to MSYS_NT & CYGWIN_NT as they change with Windows versions
- correctly identify MSYS2, even in an env like MINGW64
[fix] Add check on failed allocation in legacy/zstd_v06
As reported by Ben Hawkes in #4026, a failure to allocate a zstd context
would lead to a dereference of a NULL pointer due to a missing check
on the returned result of ZSTDv06_createDCtx().
This patch fixes the issue by adding a check for a valid returned pointer.
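The pattern can be sketched as follows. This is an illustration only, not the legacy zstd code: `sketch_dctx_t`, `sketch_create_dctx` and `sketch_use_dctx` are hypothetical names standing in for the context type and for a creation function that, like ZSTDv06_createDCtx(), may return NULL.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical context type; the real decompression context is opaque. */
typedef struct { int initialized; } sketch_dctx_t;

/* Hypothetical creator that may fail; the flag simulates a failed malloc(). */
static sketch_dctx_t *sketch_create_dctx(int simulate_failure)
{
    sketch_dctx_t *dctx;
    if (simulate_failure) return NULL;
    dctx = (sketch_dctx_t *)malloc(sizeof(*dctx));
    if (dctx != NULL) dctx->initialized = 1;
    return dctx;
}

/* Returns 0 on success, -1 if the context could not be allocated. */
static int sketch_use_dctx(int simulate_failure)
{
    sketch_dctx_t *dctx = sketch_create_dctx(simulate_failure);
    if (dctx == NULL) return -1; /* the check that was missing */
    assert(dctx->initialized == 1); /* safe: dctx is known to be non-NULL */
    free(dctx);
    return 0;
}
```

Reporting the failure to the caller, rather than aborting, keeps the decision about how to recover outside the library code.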
meson: don't add -pthread to static linking flags on Windows
Meson always returns -pthread in dependency('threads') on non-MSVC
compilers. On Windows we use Windows threading primitives, so we don't
need this. Avoid adding -pthread to libzstd's link flags, either as a
Meson subproject or via pkg-config Libs.private, so the application
doesn't inadvertently depend on winpthreads.
Add a Meson MinGW cross-compile CI test that checks for this. It turns
out that pzstd fails to build in that environment, so have the test
skip building contrib for now.
Provide variant pkg-config file for multi-threaded static lib
A multi-threaded static library requires -pthread to link and work correctly.
However, the pkg-config file we provide only works with the dynamic multi-threaded
library and won't provide the correct libs and cflags values if lib-mt is used.
To handle this, introduce an env variable MT to permit advanced users to
install and generate a correct pkg-config file for lib-mt, or detect if
the lib-mt target is called.
With MT env set on calling make install-pc, libzstd.pc.in is a
pkg-config file for a multi-threaded static library.
On calling make lib-mt, a libzstd.pc is generated for a multi-threaded
static library, as that is what the user asked for by forcing it.
libzstd.pc is changed to PHONY to force its regeneration on calling
lib targets or install-pc, to handle the case where the same directory is
used for mixed compilation.
This was noticed while migrating from meson to the make build system, where
meson generates a correct .pc file while make doesn't.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
~/zstd/zlibWrapper$ cc -c zstd_zlibwrapper.o gz*.c -lz -lzstd -DSTDC
gzwrite.c: In function ‘gz_write’:
gzwrite.c:226:43: error: ‘z_uInt’ undeclared (first use in this
function); did you mean ‘uInt’?
226 | state.state->strm.avail_in = (z_uInt)n;
| ^~~~~~
| uInt
gzwrite.c:226:43: note: each undeclared identifier is reported only
once for each function it appears in
gzwrite.c:226:50: error: expected ‘;’ before ‘n’
226 | state.state->strm.avail_in = (z_uInt)n;
| ^
| ;
z_uInt is never used directly; zconf.h redefines uInt to z_uInt under
the condition that Z_PREFIX is set. All examples use uInt, and the type
of avail_in is also uInt.
In this commit I modify the cast to refer to the same type as the type
of the lvalue.
Arguably, the real fix here is to handle possible overflows, but that's
beyond the scope of this commit.
Nick Terrell [Tue, 26 Mar 2024 17:01:19 +0000 (10:01 -0700)]
[fuzz] Turn off -Werror by default
This was causing OSS-Fuzz errors, due to compiler differences.
* Fix the issue
* Also turn off -Werror so we don't fail fuzzer builds for warnings
* Turn on -Werror in our CI
Nick Terrell [Tue, 19 Mar 2024 19:37:55 +0000 (12:37 -0700)]
Fix & fuzz ZSTD_generateSequences
This function was seriously flawed:
* It didn't do output bounds checks
* It produced invalid sequences when an uncompressed or RLE block was emitted
* It produced invalid sequences when the block splitter was enabled
* It produced invalid sequences when ZSTD_c_targetCBlockSize was enabled
I've attempted to fix these issues, but this function is just a bad idea,
so I've marked it as deprecated and unsafe. We should replace it with
`ZSTD_extractSequences()` which operates on a compressed frame.
Yonatan Komornik [Mon, 18 Mar 2024 22:25:22 +0000 (15:25 -0700)]
Fail on errors when building fuzzers
Fails on errors when building fuzzers with `fuzz.py` (adds `Werror`).
Currently allows `declaration-after-statement`, `c++-compat` and
`deprecated` as they are abundant in code (some fixes to
`declaration-after-statement` are presented in this commit).
Yonatan Komornik [Mon, 18 Mar 2024 22:36:40 +0000 (15:36 -0700)]
Fix bugs in simple decompression fuzzer (#3978)
Fixes 2 issues in `simple_decompress.c`:
1. Wrong type used for storing the result of `ZSTD_findDecompressedSize`, so it never matched `ZSTD_CONTENTSIZE_ERROR` or `ZSTD_CONTENTSIZE_UNKNOWN`.
2. Experimental API is used (`ZSTD_findDecompressedSize`) without defining `ZSTD_STATIC_LINKING_ONLY`.
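The first issue can be illustrated with a small sketch. It does not reproduce the fuzzer code: the `SKETCH_` macros mirror the way zstd.h defines its sentinels as `unsigned long long` values, `sketch_find_size` is a hypothetical stand-in for a call that reports an error, and `uint32_t` is used here only as an example of a too-narrow type.

```c
#include <assert.h>
#include <stdint.h>

/* Mirror the zstd.h sentinels, which are unsigned long long values. */
#define SKETCH_CONTENTSIZE_UNKNOWN (0ULL - 1)
#define SKETCH_CONTENTSIZE_ERROR   (0ULL - 2)

/* Hypothetical stand-in for a size query that reports an error. */
static unsigned long long sketch_find_size(void)
{
    return SKETCH_CONTENTSIZE_ERROR;
}

/* Stores the result in a narrower type: the value is truncated, so the
 * comparison against the 64-bit sentinel can never be true. */
static int sketch_detects_error_narrow(void)
{
    uint32_t size = (uint32_t)sketch_find_size();
    return size == SKETCH_CONTENTSIZE_ERROR;
}

/* Stores the result in the matching type: the sentinel is preserved. */
static int sketch_detects_error_wide(void)
{
    unsigned long long size = sketch_find_size();
    assert(sizeof(size) >= 8); /* the sentinels need at least 64 bits */
    return size == SKETCH_CONTENTSIZE_ERROR;
}
```

With the narrow variable the error path is silently skipped, which is why the variable's type has to match the function's return type.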
Nick Terrell [Thu, 14 Mar 2024 19:12:55 +0000 (12:12 -0700)]
[cmake] Fix up PR #3716
* Make a variable `PublicHeaders` for Zstd's public headers
* Add `PublicHeaders` to `Headers`, which was missing
* Only export `${LIBRARY_DIR}` publicly, not `common/`
* Switch the `target_include_directories()` to `INTERFACE` because zstd uses relative includes internally, so doesn't need any include directories to build
* Switch installation to use the `PublicHeaders` variable, and test that the right headers are installed