git.ipfire.org Git - thirdparty/zstd.git/log
2 years agobuild(cmake): improve pkg-config generation 3252/head
Andrea Pappacoda [Sun, 28 Aug 2022 11:01:20 +0000 (13:01 +0200)] 
build(cmake): improve pkg-config generation

With this patch the pkg-config generation when using the CMake build
system is improved in the following ways:

- Libs.private is now filled when needed
- The JoinPaths module is now used to join paths, leading to simpler
  code
- The .pc file is always generated, regardless of the platform, as it
  can also be consumed on Windows

Here's how the .pc file is affected by these changes, in comparison to
the one generated with the official Makefiles:

    $ diff -s lib/libzstd.pc build/cmake/build-old/lib/libzstd.pc
    15c15
    < Libs.private: -pthread
    ---
    > Libs.private:

    $ diff -s lib/libzstd.pc build/cmake/build-new/lib/libzstd.pc
    Files lib/libzstd.pc and build/cmake/build-new/lib/libzstd.pc are
    identical

2 years agoMerge pull request #3241 from wahern/wahern-combine-sh-faster
Yann Collet [Tue, 16 Aug 2022 23:09:43 +0000 (16:09 -0700)] 
Merge pull request #3241 from wahern/wahern-combine-sh-faster

restore combine.sh bash performance while still sticking to POSIX

2 years agoMerge pull request #3230 from grossws/fix3229-docs
Elliot Gorokhovsky [Tue, 16 Aug 2022 16:48:23 +0000 (12:48 -0400)] 
Merge pull request #3230 from grossws/fix3229-docs

Add description for ZSTD_decompressStream and ZSTD_initDStream

2 years agoDocument pass-through behavior (#3242)
Chris Burgess [Mon, 15 Aug 2022 17:29:54 +0000 (13:29 -0400)] 
Document pass-through behavior (#3242)

Adds documentation to help and man pages for legacy pass-through behavior
when force is set and destination is stdout. Documents --pass-through in
man pages

2 years agoescape glob pattern special characters in subject string before generating search... 3241/head
William Ahern [Thu, 11 Aug 2022 03:58:55 +0000 (20:58 -0700)] 
escape glob pattern special characters in subject string before generating search patterns in combine.sh list_has_item

2 years agorestore combine.sh bash performance while still sticking to POSIX
William Ahern [Wed, 10 Aug 2022 23:51:17 +0000 (16:51 -0700)] 
restore combine.sh bash performance while still sticking to POSIX

2 years agoMerge pull request #3235 from facebook/docTraining
Yann Collet [Mon, 8 Aug 2022 19:06:49 +0000 (12:06 -0700)] 
Merge pull request #3235 from facebook/docTraining

[easy] added a few documentation words about dictionary training

2 years agoAdd description for ZSTD_decompressStream and ZSTD_initDStream 3230/head
Konstantin Gribov [Mon, 1 Aug 2022 20:50:54 +0000 (23:50 +0300)] 
Add description for ZSTD_decompressStream and ZSTD_initDStream

With that these functions become visible in generated docs.

Fixes #3229

3 years agoadded a few documentation words about dictionary training 3235/head
Yann Collet [Fri, 5 Aug 2022 15:09:22 +0000 (17:09 +0200)] 
added a few documentation words about dictionary training

partially answering questions such as #3233
which looks for guidance within `examples/`.

3 years agoAdd explicit --pass-through flag and default to enabled for *cat (#3223)
Nick Terrell [Fri, 5 Aug 2022 00:15:59 +0000 (17:15 -0700)] 
Add explicit --pass-through flag and default to enabled for *cat (#3223)

Fixes #3211.

Adds the `--[no-]pass-through` flag which enables/disables pass-through mode.

* `zstdcat`, `zcat`, and `gzcat` default to `--pass-through`.
  Pass-through mode can be disabled by passing `--no-pass-through`.
* All other binaries default to not setting pass-through mode.
  However, we preserve the legacy behavior of enabling pass-through
  mode when writing to stdout with `-f` set, unless pass-through
  mode is explicitly disabled with `--no-pass-through`.

Adds a new test for this behavior that should codify the behavior we want.
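
The defaulting rules above can be sketched as a small decision function. This is only an illustration of the behavior described in this message: the names `pass_through_t`, `is_cat_alias` and `resolve_pass_through` are hypothetical and are not taken from the zstd sources.

```c
#include <assert.h>
#include <string.h>

/* Tri-state for the user's choice: flag absent, --no-pass-through, --pass-through.
 * (Hypothetical type, for illustration only.) */
typedef enum { PT_UNSET = -1, PT_DISABLED = 0, PT_ENABLED = 1 } pass_through_t;

/* Returns 1 when the program was invoked under one of the *cat names. */
static int is_cat_alias(const char *prog_name)
{
    return strcmp(prog_name, "zstdcat") == 0
        || strcmp(prog_name, "zcat") == 0
        || strcmp(prog_name, "gzcat") == 0;
}

/* Resolves the effective pass-through mode (1 = enabled, 0 = disabled). */
static int resolve_pass_through(const char *prog_name, pass_through_t user_choice,
                                int force, int dst_is_stdout)
{
    assert(prog_name != NULL);
    if (user_choice != PT_UNSET)
        return user_choice == PT_ENABLED;   /* an explicit flag always wins */
    if (is_cat_alias(prog_name))
        return 1;                           /* *cat binaries default to enabled */
    return force && dst_is_stdout;          /* legacy: -f while writing to stdout */
}
```

With this shape, `--no-pass-through` overrides both the `*cat` default and the legacy `-f` behavior, which matches the rules listed above.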

3 years agofix issue #3144 (#3226)
zengyijing [Thu, 4 Aug 2022 20:51:14 +0000 (16:51 -0400)] 
fix issue #3144 (#3226)

* fix issue #3144

* add test case for verbose-wlog

Co-authored-by: zengyijing <yijingzeng@fb.com>

3 years agoMerge pull request #3232 from facebook/fileiotypes_nomemh
Yann Collet [Wed, 3 Aug 2022 20:57:16 +0000 (22:57 +0200)] 
Merge pull request #3232 from facebook/fileiotypes_nomemh

fileio_types.h : avoid dependency on mem.h

3 years agoMerge pull request #3231 from facebook/fileio_missingInclude
Yann Collet [Wed, 3 Aug 2022 20:48:05 +0000 (22:48 +0200)] 
Merge pull request #3231 from facebook/fileio_missingInclude

[easy] fixed missing include

3 years agofileio_types.h : avoid dependency on mem.h 3232/head
Yann Collet [Wed, 3 Aug 2022 19:39:35 +0000 (21:39 +0200)] 
fileio_types.h : avoid dependency on mem.h

fileio_types.h cannot be parsed by itself
because it relies on basic types defined in `lib/common/mem.h`.
As for #3231, it likely wasn't detected because `mem.h` was probably included before within target files.
But this is not proper.

An "easy" solution would be to add the missing include,
but each dependency should be considered "bad" by default,
and only allowed if it brings some tangible value.

In this case, since these types are only used to declare internal structure variables
which are effectively only flags,
I believe it's really not valuable to add a dependency on `mem.h` for this purpose
while the standard `int` type can do the same job.

I was expecting some compiler warnings following this change,
but it turns out we don't use `-Wconversion` by default on `zstd` source code,
so there is none.

Nevertheless, I enabled `-Wconversion` locally and proceeded to fix a few conversion warnings in the process.

Adding `-Wconversion` to the list of flags used for `zstd` is something I would favor over the long term,
but it cannot be done overnight,
because the number of places where this warning is triggered is daunting.
It is better to progressively reduce the number of triggered `-Wconversion` warnings before enabling this flag by default.

3 years agominor : fixed missing include 3231/head
Yann Collet [Wed, 3 Aug 2022 18:52:15 +0000 (20:52 +0200)] 
minor : fixed missing include

I presume it was not detected before
because "fileio.h" is probably always included after "util.h".

3 years agoFix off-by-one error in superblock mode (#3221)
Nick Terrell [Wed, 3 Aug 2022 18:28:39 +0000 (11:28 -0700)] 
Fix off-by-one error in superblock mode (#3221)

Fixes #3212.

Long literal and match lengths had an off-by-one error in ZSTD_getSequenceLength.
Fix the off-by-one error, and add a golden compression test that catches the bug.
Also run all the golden tests in the cli-tests framework.
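
To illustrate how this kind of off-by-one can arise, here is a minimal sketch. It assumes that lengths are stored in 16-bit fields and that a separate marker records which field of a sequence overflowed; the types, names and `MIN_MATCH` value are hypothetical and this is not the actual `ZSTD_getSequenceLength` code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical compact sequence: both lengths are truncated to 16 bits. */
typedef struct { uint16_t lit_length; uint16_t match_base; } compact_seq_t;
typedef struct { uint32_t lit_length; uint32_t match_length; } seq_len_t;

enum { MIN_MATCH = 3 };   /* assumed minimum match length */
enum { LONG_NONE = 0, LONG_LITERAL = 1, LONG_MATCH = 2 };

/* Stores only the low 16 bits; the caller remembers which field was "long". */
static compact_seq_t store_seq(uint32_t lit_length, uint32_t match_length)
{
    compact_seq_t s;
    assert(match_length >= MIN_MATCH);
    s.lit_length = (uint16_t)lit_length;
    s.match_base = (uint16_t)(match_length - MIN_MATCH);
    return s;
}

/* Rebuilds the full lengths from the compact form and the long-length marker. */
static seq_len_t get_seq_len(compact_seq_t s, int long_type)
{
    seq_len_t r;
    r.lit_length = s.lit_length;
    r.match_length = (uint32_t)s.match_base + MIN_MATCH;
    /* The truncated bit is worth 0x10000; adding 0xFFFF instead is off by one. */
    if (long_type == LONG_LITERAL) r.lit_length += 0x10000;
    if (long_type == LONG_MATCH)   r.match_length += 0x10000;
    return r;
}
```

A round trip through `store_seq` and `get_seq_len` only returns the original length when the constant added back equals the value of the dropped bit, which is the property a golden test can pin down.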

3 years agoMerge pull request #3196 from mileshu/dev
Felix Handte [Tue, 2 Aug 2022 16:34:04 +0000 (12:34 -0400)] 
Merge pull request #3196 from mileshu/dev

[T124890272] Mark 2 Obsolete Functions(ZSTD_copy*Ctx) Deprecated in Zstd

3 years agoMerge branch 'dev' of https://github.com/mileshu/zstd into dev 3196/head
Miles Hu [Tue, 2 Aug 2022 05:52:47 +0000 (22:52 -0700)] 
Merge branch 'dev' of https://github.com/mileshu/zstd into dev

3 years ago[T124890272] Mark 2 Obsolete Functions(ZSTD_copy*Ctx) Deprecated in Zstd
Miles HU [Wed, 13 Jul 2022 18:00:05 +0000 (11:00 -0700)] 
[T124890272] Mark 2 Obsolete Functions(ZSTD_copy*Ctx) Deprecated in Zstd

The discussion for this task is here: facebook/zstd#3128.

This task can probably be scoped to the first part: marking these functions deprecated.
We'll later look at removal when we roll out v1.6.0.

3 years agoDeprecate ZSTD_getDecompressedSize() (#3225)
Nick Terrell [Mon, 1 Aug 2022 18:52:14 +0000 (11:52 -0700)] 
Deprecate ZSTD_getDecompressedSize() (#3225)

Fixes #3158.

Mark ZSTD_getDecompressedSize() as deprecated and replaced by ZSTD_getFrameContentSize().

3 years agoMerge pull request #3220 from embg/issue3200
Elliot Gorokhovsky [Mon, 1 Aug 2022 18:04:57 +0000 (14:04 -0400)] 
Merge pull request #3220 from embg/issue3200

Disallow empty string as argument for --output-dir-flat and --output-dir-mirror

3 years agoFix hash4Ptr for big endian (#3227)
Qiongsi Wu [Mon, 1 Aug 2022 17:41:24 +0000 (13:41 -0400)] 
Fix hash4Ptr for big endian (#3227)

3 years agostdin multiple file fixes (#3222)
Yonatan Komornik [Fri, 29 Jul 2022 23:13:07 +0000 (16:13 -0700)] 
stdin multiple file fixes (#3222)

* Fixes for https://github.com/facebook/zstd/issues/3206 - bugs when handling stdin as part of multiple files.

* new line at end of multiple-files.sh

3 years agoDisallow empty output directory 3220/head
Elliot Gorokhovsky [Fri, 29 Jul 2022 21:44:22 +0000 (14:44 -0700)] 
Disallow empty output directory

3 years agoAdd warning when multi-thread decompression is requested (#3208)
Tom Wang [Fri, 29 Jul 2022 19:51:58 +0000 (12:51 -0700)] 
Add warning when multi-thread decompression is requested (#3208)

When the user passes arguments for both decompression and multi-threading, print a warning message
to indicate that multi-threaded decompression is not supported.

* Add warning when multi-thread decompression is requested
* add test case for multi-threaded decoding warning
   The expectation is that -d -T0 does not produce any warning,
   while any other -d -T(>1) input does

3 years agoFix small file passthrough (#3215)
Chris Burgess [Fri, 29 Jul 2022 19:22:46 +0000 (15:22 -0400)] 
Fix small file passthrough (#3215)

3 years agozlibWrapper: Update for zlib 1.2.12 (#3217)
orbea [Fri, 29 Jul 2022 19:22:10 +0000 (12:22 -0700)] 
zlibWrapper: Update for zlib 1.2.12 (#3217)

In zlib 1.2.12 the OF macro was changed to _Z_OF breaking any
project that used zlibWrapper. To fix this the OF has been
changed to _Z_OF everywhere and _Z_OF is defined as OF in the
case it is not yet defined for zlib 1.2.11 and older.
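
The fallback described above can be sketched as follows. The `OF` definition here is a stand-in for what an older zlib header would provide, and `wrapper_version` is a hypothetical function used only to show a prototype written against the new macro name.

```c
#include <assert.h>

/* Stand-in for zlib <= 1.2.11, which only provides OF().
 * A real build would get this macro from zlib's own headers. */
#ifndef OF
#  define OF(args) args
#endif

/* Compatibility shim: code uses _Z_OF everywhere, and _Z_OF maps to OF
 * when the installed zlib predates the rename. */
#ifndef _Z_OF
#  define _Z_OF OF
#endif

/* Hypothetical prototype declared through the new macro name. */
static int wrapper_version _Z_OF((int major, int minor));

static int wrapper_version(int major, int minor)
{
    assert(major >= 0 && minor >= 0);
    return major * 100 + minor;
}
```

Because `_Z_OF` is only defined when missing, the same source compiles against both older and newer zlib headers.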

Fixes: https://github.com/facebook/zstd/issues/3216

3 years ago[AIX] Fix Compiler Flags and Bugs on AIX to Pass All Tests (#3219)
Qiongsi Wu [Fri, 29 Jul 2022 19:21:59 +0000 (15:21 -0400)] 
[AIX] Fix Compiler Flags and Bugs on AIX to Pass All Tests (#3219)

* Fixing compiler warnings

* Replace the old -s flag with the -Wl,-s flag

* Fixing compiler warnings

* Fixing the linker strip flag and tests/code not working as expected on AIX

3 years agoFix buffer underflow for null dir1
Elliot Gorokhovsky [Fri, 29 Jul 2022 18:10:47 +0000 (11:10 -0700)] 
Fix buffer underflow for null dir1

3 years agolib: add hint to generate more pipeline friendly code (#3138)
Jun He [Fri, 29 Jul 2022 17:28:04 +0000 (01:28 +0800)] 
lib: add hint to generate more pipeline friendly code (#3138)

According to statistics gathered on the silesia test files,
the chance of position going beyond highThreshold is very
low (~1.3%@L8 in most cases, all <2.5%), and it falls in the
"lowprob area". Add a branch hint so the compiler can
generate more pipeline-friendly code.
With this change, an uplift of ~1% is observed on mozilla and
xml, and a slight (0.3%~0.8%) but consistent uplift on
other files on Arm N1.
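
A minimal sketch of such a branch hint, assuming a GCC/Clang-style `__builtin_expect`. The `UNLIKELY` macro, the `spread_symbols` function and the step formula are illustrative assumptions, loosely modeled on a table-spreading loop, and are not copied from the patch.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__GNUC__) || defined(__clang__)
#  define UNLIKELY(x) (__builtin_expect(!!(x), 0))
#else
#  define UNLIKELY(x) (x)
#endif

/* Spreads symbols over a power-of-two table, skipping the slots above
 * high_threshold, which are reserved for low-probability symbols.
 * The counts must sum to high_threshold + 1. */
static void spread_symbols(uint8_t *table, uint32_t table_log,
                           const uint16_t *counts, uint32_t nb_symbols,
                           uint32_t high_threshold)
{
    uint32_t const table_size = 1u << table_log;
    uint32_t const table_mask = table_size - 1;
    uint32_t const step = (table_size >> 1) + (table_size >> 3) + 3;  /* odd step */
    uint32_t position = 0;
    uint32_t s, i;
    for (s = 0; s < nb_symbols; s++) {
        for (i = 0; i < counts[s]; i++) {
            table[position] = (uint8_t)s;
            position = (position + step) & table_mask;
            /* Rarely taken according to the measurements above, hence the hint. */
            while (UNLIKELY(position > high_threshold))
                position = (position + step) & table_mask;
        }
    }
    assert(position == 0);   /* every usable slot was visited exactly once */
}
```

The hint does not change the result; it only tells the compiler which side of the branch to lay out as the fall-through path.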

Signed-off-by: Jun He <jun.he@arm.com>
Change-Id: Id9ba1d5c767e975290b5c1bf0ecce906544f4ade

3 years agodecomp: add prefetch for matched seq on aarch64 (#3164)
Jun He [Fri, 29 Jul 2022 17:27:20 +0000 (01:27 +0800)] 
decomp: add prefetch for matched seq on aarch64 (#3164)

match is used for the following sequence copy. It is
only updated when extDict is needed, which is a
low-probability case, so it can be prefetched to
reduce cache misses.
Benchmarks on various Arm platforms showed an
uplift of 1% ~ 14% with gcc-11/clang-14.
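
As a rough illustration of the idea, the sketch below issues a prefetch for the match address as soon as it is known, then copies literals before reading from the match. It assumes GCC/Clang's `__builtin_prefetch`; `toy_seq_t` and `exec_sequences` are hypothetical and much simpler than the real decoder.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined(__GNUC__) || defined(__clang__)
#  define PREFETCH_READ(ptr) __builtin_prefetch((ptr), 0, 3)
#else
#  define PREFETCH_READ(ptr) ((void)(ptr))
#endif

/* Hypothetical sequence: lit_length literals, then match_length bytes
 * copied from `offset` bytes before the current output position. */
typedef struct { size_t lit_length; size_t offset; size_t match_length; } toy_seq_t;

/* Executes the sequences into dst starting at pos; returns the new position. */
static size_t exec_sequences(unsigned char *dst, size_t pos,
                             const unsigned char *lits,
                             const toy_seq_t *seqs, size_t nb_seqs)
{
    size_t i, k;
    for (i = 0; i < nb_seqs; i++) {
        const unsigned char *match;
        assert(seqs[i].offset >= 1 && seqs[i].offset <= pos + seqs[i].lit_length);
        match = dst + pos + seqs[i].lit_length - seqs[i].offset;
        PREFETCH_READ(match);                    /* hint issued before the literal copy */
        memcpy(dst + pos, lits, seqs[i].lit_length);
        lits += seqs[i].lit_length;
        pos  += seqs[i].lit_length;
        for (k = 0; k < seqs[i].match_length; k++)   /* byte-wise, so overlap is safe */
            dst[pos + k] = match[k];
        pos += seqs[i].match_length;
    }
    return pos;
}
```

The prefetch is purely a hint: removing it leaves the output unchanged, so any gain has to be confirmed by benchmarks such as the ones quoted above.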

Signed-off-by: Jun He <jun.he@arm.com>
Change-Id: If201af4799d2455d74c79f8387404439d7f684ae

3 years agoAdd transparency and optimize logo (#3218)
Mathew R Gordon [Fri, 29 Jul 2022 17:17:31 +0000 (11:17 -0600)] 
Add transparency and optimize logo (#3218)

Make the front page look better in dark GH themes

3 years agoMerge pull request #3197 from embg/docstring_clarify
Elliot Gorokhovsky [Tue, 26 Jul 2022 17:26:15 +0000 (13:26 -0400)] 
Merge pull request #3197 from embg/docstring_clarify

Clarify benchmark chunking docstring

3 years agoMerge pull request #3209 from zhuhan0/dev
Elliot Gorokhovsky [Tue, 26 Jul 2022 17:19:38 +0000 (13:19 -0400)] 
Merge pull request #3209 from zhuhan0/dev

[largeNbDicts] Second try at fixing decompression segfault to always create compressInstructions

3 years ago[largeNbDicts] Second try at fixing decompression segfault to always create compressI... 3209/head
Han Zhu [Wed, 20 Jul 2022 23:01:32 +0000 (16:01 -0700)] 
[largeNbDicts] Second try at fixing decompression segfault to always create compressInstructions

Summary:
Freeing an uninitialized pointer is undefined behavior. This caused a segfault
when compiling the benchmark with Clang -O3 and benching decompression.

V2: always create compressInstructions but check if cctxParams is NULL before
setting CCtx params to avoid segfault.
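
The V2 approach can be sketched like this: the object is always created, so the cleanup path never sees an uninitialized pointer, and the parameters are only read when they exist. `params_t`, `instructions_t`, `create_instructions` and `run_bench` are hypothetical stand-ins, not the benchmark's real types.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { int level; } params_t;                        /* hypothetical */
typedef struct { int level; int has_params; } instructions_t;  /* hypothetical */

/* Always returns an object (or NULL on allocation failure);
 * params may be NULL when only decompression is benchmarked. */
static instructions_t *create_instructions(const params_t *params)
{
    instructions_t *ci = calloc(1, sizeof *ci);
    if (ci == NULL) return NULL;
    if (params != NULL) {          /* guard: never dereference missing params */
        ci->level = params->level;
        ci->has_params = 1;
    }
    return ci;
}

/* Returns the level that was applied, 0 if none, or -1 on allocation failure. */
static int run_bench(const params_t *params)
{
    instructions_t *ci = create_instructions(params);
    int level;
    if (ci == NULL) return -1;
    level = ci->has_params ? ci->level : 0;
    assert(level >= 0);
    free(ci);                      /* the pointer is always initialized here */
    return level;
}
```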

Test Plan:
make and run

3 years agoMerge pull request #3205 from zhuhan0/dev
Elliot Gorokhovsky [Wed, 20 Jul 2022 20:07:04 +0000 (16:07 -0400)] 
Merge pull request #3205 from zhuhan0/dev

[contrib][largeNbDicts] Fix decompression segfault; Add additional benchmark metrics

3 years ago[largeNbDicts] Add an option to print out median speed 3205/head
Han Zhu [Wed, 20 Jul 2022 18:14:51 +0000 (11:14 -0700)] 
[largeNbDicts] Add an option to print out median speed

Summary:
Added an option -p# where -p0 (default) sets the aggregation method to fastest
speed while -p1 sets the aggregation method to median. Also added a new column
in the csv file to report this option's value.
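
The two aggregation methods can be sketched as below. `aggregate_speed` is a hypothetical helper written for this illustration and is not the code from `largeNbDicts`; `method` mirrors the `-p0`/`-p1` values.

```c
#include <assert.h>
#include <stdlib.h>

/* qsort comparator for doubles, ascending. */
static int cmp_double(const void *a, const void *b)
{
    double const x = *(const double *)a;
    double const y = *(const double *)b;
    return (x > y) - (x < y);
}

/* method 0: fastest run; method 1: median run. Sorts `speeds` in place. */
static double aggregate_speed(double *speeds, size_t n, int method)
{
    assert(speeds != NULL && n > 0);
    qsort(speeds, n, sizeof speeds[0], cmp_double);
    if (method == 0) return speeds[n - 1];
    if (n % 2 == 1) return speeds[n / 2];
    return (speeds[n / 2 - 1] + speeds[n / 2]) / 2.0;
}
```

The median is less sensitive than the fastest run to a single unusually quick round, which is why it can be a useful second view of the same measurements.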

Test Plan:
```
$ ./largeNbDicts -1 --nbDicts=1 -D ~/benchmarks/html/html_8_16K.32K.dict
~/benchmarks/html/html_8_16K/*
loading 7450 files...
created src buffer of size 83.4 MB
split input into 7450 blocks
loading dictionary /home/zhuhan/benchmarks/html/html_8_16K.32K.dict
compressing at level 1 without dictionary : Ratio=3.03  (28827863 bytes)
compressed using a 32768 bytes dictionary : Ratio=4.28  (20410262 bytes)
generating 1 dictionaries, using 0.1 MB of memory
Compression Speed : 306.0 MB/s
Fastest Speed : 310.6 MB/s

$ ./largeNbDicts -1 --nbDicts=1 -p1 -D ~/benchmarks/html/html_8_16K.32K.dict
~/benchmarks/html/html_8_16K/*
loading 7450 files...
created src buffer of size 83.4 MB
split input into 7450 blocks
loading dictionary /home/zhuhan/benchmarks/html/html_8_16K.32K.dict
compressing at level 1 without dictionary : Ratio=3.03  (28827863 bytes)
compressed using a 32768 bytes dictionary : Ratio=4.28  (20410262 bytes)
generating 1 dictionaries, using 0.1 MB of memory
Compression Speed : 306.9 MB/s
Median Speed : 298.4 MB/s
```

3 years ago[largeNbDicts] Print more metrics into csv file
Han Zhu [Tue, 19 Jul 2022 23:50:28 +0000 (16:50 -0700)] 
[largeNbDicts] Print more metrics into csv file

Summary:
Add column headers and data for whether it's a compression or a decompression
run, compression level, nbDicts and dictAttachPref in addition to
compr/decompr speed.

Test Plan:
Example output:

```
./largeNbDicts
Compression/Decompression,Level,nbDicts,dictAttachPref,Speed
Compression,1,1,0,300.9
Compression,1,1,1,296.4
Compression,1,1,2,307.8
Compression,1,10,0,292.3
Compression,1,100,0,293.3
Compression,3,110,0,106.0
Decompression,-1,110,-1,155.6
Decompression,-1,110,-1,709.4
Decompression,-1,120,-1,709.1
Decompression,-1,120,-1,734.6
```

3 years ago[largeNbDicts] Fix decompression segfault in createCompressInstructions
Han Zhu [Tue, 19 Jul 2022 20:55:48 +0000 (13:55 -0700)] 
[largeNbDicts] Fix decompression segfault in createCompressInstructions

Benchmarking decompression results in a segfault in `createCompressInstructions`
because `cctxParams` is NULL. Skip running that function if we are not benching
compression.

3 years agoIntial commit to address 3090. Added support to decompress empty block. (#3118)
udayanbapat [Thu, 14 Jul 2022 18:54:34 +0000 (11:54 -0700)] 
Intial commit to address 3090. Added support to decompress empty block. (#3118)

* Intial commit to address 3090. Added support to decompress empty block

* Update zstd_decompress_block.c

Addressed review comments for the case of 'set_basic'

* Update lib/decompress/zstd_decompress_block.c

Co-authored-by: Nick Terrell <nickrterrell@gmail.com>
* Update lib/decompress/zstd_decompress_block.c

Co-authored-by: Nick Terrell <nickrterrell@gmail.com>
Co-authored-by: Nick Terrell <nickrterrell@gmail.com>

3 years agoClarify -B docstring 3197/head
Elliot Gorokhovsky [Wed, 13 Jul 2022 20:54:29 +0000 (16:54 -0400)] 
Clarify -B docstring

3 years ago[T124890272] Mark 2 Obsolete Functions(ZSTD_copy*Ctx) Deprecated in Zstd 3187/head
Miles HU [Wed, 13 Jul 2022 18:00:05 +0000 (11:00 -0700)] 
[T124890272] Mark 2 Obsolete Functions(ZSTD_copy*Ctx) Deprecated in Zstd

The discussion for this task is here: facebook/zstd#3128.

This task can probably be scoped to the first part: marking these functions deprecated.
We'll later look at removal when we roll out v1.6.0.

3 years agoRevert "T119975957"
Miles HU [Tue, 12 Jul 2022 18:17:25 +0000 (11:17 -0700)] 
Revert "T119975957"

This reverts commit 962746edffa5340315136af34ac3331eba82c3c8.

3 years agoT119975957
Miles HU [Fri, 8 Jul 2022 22:01:36 +0000 (15:01 -0700)] 
T119975957

Signed-off-by: Miles HU <yuanpu@fb.com>

3 years agoMerge pull request #3184 from htnhan/features/list_verbose_to_show_dictionary_id
Felix Handte [Fri, 8 Jul 2022 20:04:39 +0000 (16:04 -0400)] 
Merge pull request #3184 from htnhan/features/list_verbose_to_show_dictionary_id

zstd -lv <file> to show dictID

3 years agoDetect multiple dictIDs in one file 3184/head
htnhan [Fri, 8 Jul 2022 17:20:50 +0000 (12:20 -0500)] 
Detect multiple dictIDs in one file

3 years agozstd -lv <file> to show dictID
htnhan [Wed, 6 Jul 2022 02:28:33 +0000 (21:28 -0500)] 
zstd -lv <file> to show dictID

3 years agoMerge pull request #3180 from nocnokneo/MSVCBuildTests
Elliot Gorokhovsky [Tue, 5 Jul 2022 17:13:34 +0000 (13:13 -0400)] 
Merge pull request #3180 from nocnokneo/MSVCBuildTests

Fix ZSTD_BUILD_TESTS=ON with MSVC

3 years agoFix ZSTD_BUILD_TESTS=ON build with MSVC 3180/head
Taylor Braun-Jones [Thu, 30 Jun 2022 17:20:42 +0000 (13:20 -0400)] 
Fix ZSTD_BUILD_TESTS=ON build with MSVC

Fixes:

    Command line error D8021 : invalid numeric argument '/Wno-deprecated-declarations'

3 years agoMerge pull request #3179 from embg/1.5.3_bump
Elliot Gorokhovsky [Wed, 29 Jun 2022 20:03:52 +0000 (13:03 -0700)] 
Merge pull request #3179 from embg/1.5.3_bump

Prepare v1.5.3

3 years agomake -C programs zstd.1 3179/head
Elliot Gorokhovsky [Wed, 29 Jun 2022 18:55:14 +0000 (14:55 -0400)] 
make -C programs zstd.1

3 years ago1.5.3 version bump
Elliot Gorokhovsky [Wed, 29 Jun 2022 17:11:13 +0000 (13:11 -0400)] 
1.5.3 version bump

3 years agoMerge pull request #3177 from embg/dms_prefetch2
Elliot Gorokhovsky [Fri, 24 Jun 2022 15:24:43 +0000 (08:24 -0700)] 
Merge pull request #3177 from embg/dms_prefetch2

Add prefetchCDictTables CCtxParam (+10-20% cold dict compression speed)

3 years agoNits 3177/head
Elliot Gorokhovsky [Thu, 23 Jun 2022 20:58:03 +0000 (16:58 -0400)] 
Nits

3 years agoUpdate README.md for fuzzers (#3174)
Elliot Gorokhovsky [Thu, 23 Jun 2022 01:02:07 +0000 (18:02 -0700)] 
Update README.md for fuzzers (#3174)

* Update README.md for fuzzers

* Add ls corpora/*crash command

* nit

* Clarify wording and add Nick's command

* Minor clarification

3 years agoAdd tests
Elliot Gorokhovsky [Wed, 22 Jun 2022 21:05:23 +0000 (17:05 -0400)] 
Add tests

3 years agoadd prefetchCDictTables to largeNbDicts
Elliot Gorokhovsky [Wed, 22 Jun 2022 15:59:28 +0000 (08:59 -0700)] 
add prefetchCDictTables to largeNbDicts

3 years agoAdd docs
Elliot Gorokhovsky [Tue, 21 Jun 2022 22:06:48 +0000 (18:06 -0400)] 
Add docs

3 years agoAdd prefetchCDictTables CCtxParam
Elliot Gorokhovsky [Tue, 21 Jun 2022 15:59:27 +0000 (11:59 -0400)] 
Add prefetchCDictTables CCtxParam

3 years agoMerge pull request #3175 from facebook/fix3169
Yann Collet [Wed, 22 Jun 2022 18:21:09 +0000 (11:21 -0700)] 
Merge pull request #3175 from facebook/fix3169

Streaming decompression can detect incorrect header ID sooner

3 years agoStreaming decompression can detect incorrect header ID sooner 3175/head
Yann Collet [Wed, 22 Jun 2022 01:14:11 +0000 (18:14 -0700)] 
Streaming decompression can detect incorrect header ID sooner

Streaming decompression used to wait for a minimum of 5 bytes before attempting decoding.
This meant that, in the case that only a few bytes (<5) were provided,
and assuming these bytes are incorrect,
there would be no error reported.
The streaming API would simply request more data, waiting for at least 5 bytes.

This PR makes it possible to detect incorrect Frame IDs as soon as the first byte is provided.
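
The early detection can be sketched as a prefix check against the frame magic number. This is a simplified illustration: it only knows the standard Zstandard magic (`0xFD2FB528`, stored little-endian) and ignores skippable and legacy frames; `frame_check_t` and `check_magic_prefix` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { FRAME_NEED_MORE, FRAME_OK, FRAME_BAD_MAGIC } frame_check_t;

/* Compares however many bytes are available (up to 4) with the magic number,
 * so a wrong first byte is rejected without waiting for more input. */
static frame_check_t check_magic_prefix(const unsigned char *src, size_t size)
{
    static const unsigned char magic[4] = { 0x28, 0xB5, 0x2F, 0xFD };
    size_t const n = size < 4 ? size : 4;
    size_t i;
    assert(src != NULL || size == 0);
    for (i = 0; i < n; i++)
        if (src[i] != magic[i]) return FRAME_BAD_MAGIC;
    return size >= 4 ? FRAME_OK : FRAME_NEED_MORE;
}
```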

Fix #3169

3 years ago"Short cache" optimization for level 1-4 DMS (+5-30% compression speed) (#3152)
Elliot Gorokhovsky [Tue, 21 Jun 2022 21:27:19 +0000 (14:27 -0700)] 
"Short cache" optimization for level 1-4 DMS (+5-30% compression speed) (#3152)

* first attempt at fast DMS short cache

* significant wins for some scenarios

* fix all clang regressions

* nits

* fix 1.5% gcc11 regression on hot 110Kdict scenario

* fix CI

* nit

* Add tags to doublefast hash table

* use tags in doublefast DMS

* Fix CI

* Clean up some hardcoded logic / constants

* Switch forCCtx to an enum

* nit

* add short cache to ip+1 long search

* Move tag size into hashLog

* Minor nits

* Truncate dictionaries greater than 16MB in short cache mode

* Helper function for tag comparison

* Cap short cache hashLog at 24 to prevent overflow

* size_t dictTagsMatch -> int dictTagsMatch

* nit

* Clean up and comment dictionary truncation

* Move ZSTD_tableFillPurpose_e next to ZSTD_dictTableLoadMethod_e

* Comment and expand helper functions

* Asserts and documentation

* nit
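
One way to picture the tag mechanism behind several of the items above: each 32-bit table entry holds a position in its upper bits and a small tag in its lower bits, so a candidate can be rejected before the dictionary content is touched. The 8-bit tag width and all names below are assumptions made for this sketch. With 8 tag bits, 24 bits remain for the position, which is consistent with the 16MB truncation and the hashLog cap of 24 mentioned in the list.

```c
#include <assert.h>
#include <stdint.h>

enum { TAG_BITS = 8 };                       /* assumed tag width */
#define TAG_MASK  ((1u << TAG_BITS) - 1)
#define MAX_INDEX (1u << (32 - TAG_BITS))    /* 2^24 positions, i.e. 16 MB */

/* Packs a position and the low bits of a hash into one 32-bit entry. */
static uint32_t pack_entry(uint32_t index, uint32_t hash_and_tag)
{
    assert(index < MAX_INDEX);               /* larger positions would overflow */
    return (index << TAG_BITS) | (hash_and_tag & TAG_MASK);
}

/* Recovers the position stored in an entry. */
static uint32_t entry_index(uint32_t entry)
{
    return entry >> TAG_BITS;
}

/* Cheap filter: 1 when the stored tag equals the tag of the probing hash. */
static int tags_match(uint32_t entry, uint32_t hash_and_tag)
{
    return (entry & TAG_MASK) == (hash_and_tag & TAG_MASK);
}
```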

3 years agoMerge pull request #3170 from facebook/mesongnu99
Yann Collet [Tue, 21 Jun 2022 17:17:36 +0000 (10:17 -0700)] 
Merge pull request #3170 from facebook/mesongnu99

removed gnu99 statement from meson recipe

3 years agoremoved gnu99 statement from meson recipe 3170/head
Yann Collet [Mon, 20 Jun 2022 22:02:41 +0000 (15:02 -0700)] 
removed gnu99 statement from meson recipe

3 years agoMerge pull request #3167 from facebook/cmake_std
Yann Collet [Sun, 19 Jun 2022 23:49:21 +0000 (16:49 -0700)] 
Merge pull request #3167 from facebook/cmake_std

remove explicit standard setting from cmake script

3 years agoremoved explicit compilation standard from cmake script 3167/head
Yann Collet [Sun, 19 Jun 2022 21:52:32 +0000 (14:52 -0700)] 
removed explicit compilation standard from cmake script

it's not expected to be useful
and can actually lead to subtle side effects
such as #3163.

3 years agoMerge pull request #3166 from facebook/warning_clockt
Yann Collet [Sun, 19 Jun 2022 21:45:49 +0000 (14:45 -0700)] 
Merge pull request #3166 from facebook/warning_clockt

display a warning message when using C90 clock_t

3 years agodisplay a warning message when using C90 clock_t for MT speed measurements. 3166/head
Yann Collet [Sun, 19 Jun 2022 18:38:06 +0000 (11:38 -0700)] 
display a warning message when using C90 clock_t for MT speed measurements.

3 years agoupdated documentation regarding build systems
Yann Collet [Sun, 19 Jun 2022 18:12:16 +0000 (11:12 -0700)] 
updated documentation regarding build systems

3 years agoMerge pull request #3161 from embg/largeNbDictsImprovements
Elliot Gorokhovsky [Wed, 15 Jun 2022 14:39:50 +0000 (07:39 -0700)] 
Merge pull request #3161 from embg/largeNbDictsImprovements

[contrib] largeNbDicts bugfix + improvements

3 years agofix typo 3161/head
Elliot Gorokhovsky [Tue, 14 Jun 2022 23:18:49 +0000 (19:18 -0400)] 
fix typo

Co-authored-by: Nick Terrell <nickrterrell@gmail.com>

3 years agoFix FILE handle leak
Elliot Gorokhovsky [Tue, 14 Jun 2022 21:57:54 +0000 (14:57 -0700)] 
Fix FILE handle leak

3 years agoSupport advanced API so forceCopy/forceAttach works properly
Elliot Gorokhovsky [Tue, 14 Jun 2022 21:52:51 +0000 (14:52 -0700)] 
Support advanced API so forceCopy/forceAttach works properly

3 years agolargeNbDicts bugfix + improvements
Elliot Gorokhovsky [Tue, 14 Jun 2022 00:23:33 +0000 (17:23 -0700)] 
largeNbDicts bugfix + improvements

3 years agoMerge pull request #3160 from danlark1/patch-1
Elliot Gorokhovsky [Mon, 13 Jun 2022 18:01:43 +0000 (14:01 -0400)] 
Merge pull request #3160 from danlark1/patch-1

Fix big endian ARM NEON path

3 years agoFix big endian ARM NEON path 3160/head
Daniel Kutenin [Mon, 13 Jun 2022 08:16:24 +0000 (09:16 +0100)] 
Fix big endian ARM NEON path

The big endian path does not use the NEON acceleration, but the bit grouping was still applied

3 years agoMerge pull request #3141 from JunHe77/seqDec
Nick Terrell [Thu, 9 Jun 2022 20:40:51 +0000 (13:40 -0700)] 
Merge pull request #3141 from JunHe77/seqDec

dec: adjust seqSymbol load on aarch64

3 years agoMerge pull request #3145 from JunHe77/wildcopy
Nick Terrell [Thu, 9 Jun 2022 20:38:30 +0000 (13:38 -0700)] 
Merge pull request #3145 from JunHe77/wildcopy

common: apply two stage copy to aarch64

3 years agoMerge pull request #3157 from embg/huge_dict_bugfix
Elliot Gorokhovsky [Thu, 9 Jun 2022 19:35:29 +0000 (15:35 -0400)] 
Merge pull request #3157 from embg/huge_dict_bugfix

Bugfix for huge dictionaries

3 years agoBugfix for huge dictionaries 3157/head
Elliot Gorokhovsky [Thu, 9 Jun 2022 15:39:30 +0000 (11:39 -0400)] 
Bugfix for huge dictionaries

3 years agoupdated --single-thread man
Yann Collet [Wed, 8 Jun 2022 00:44:20 +0000 (17:44 -0700)] 
updated --single-thread man

3 years agoMerge pull request #3154 from terrelln/rsyncable-speed-fix
Nick Terrell [Mon, 6 Jun 2022 23:07:20 +0000 (16:07 -0700)] 
Merge pull request #3154 from terrelln/rsyncable-speed-fix

Remove expensive assert in --rsyncable hot loop

3 years agoRemove expensive assert in --rsyncable hot loop 3154/head
Nick Terrell [Mon, 6 Jun 2022 18:56:13 +0000 (11:56 -0700)] 
Remove expensive assert in --rsyncable hot loop

This assert slows the loop down by 10x. We can get similar
coverage by asserting at the beginning & end of the loop.
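
The pattern can be sketched with a generic rolling hash. This is not the `--rsyncable` code: `PRIME`, `WINDOW` and the function names are hypothetical, and the point is only where the expensive check sits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { WINDOW = 32 };
#define PRIME 0x9E3779B185EBCA87ULL   /* arbitrary odd multiplier */

/* Slow reference: hashes p[0..WINDOW) from scratch, O(WINDOW). */
static uint64_t hash_window(const unsigned char *p)
{
    uint64_t h = 0;
    size_t i;
    for (i = 0; i < WINDOW; i++) h = h * PRIME + p[i];
    return h;
}

/* PRIME raised to the power WINDOW - 1, used to drop the oldest byte. */
static uint64_t prime_power(void)
{
    uint64_t r = 1;
    size_t i;
    for (i = 0; i + 1 < WINDOW; i++) r *= PRIME;
    return r;
}

/* Rolls h, the hash of the WINDOW bytes ending at `from`, forward to `to`. */
static uint64_t roll(uint64_t h, const unsigned char *buf, size_t from, size_t to)
{
    uint64_t const pw = prime_power();
    size_t pos;
    assert(from >= WINDOW && from <= to);
    assert(h == hash_window(buf + from - WINDOW));   /* checked once, on entry */
    for (pos = from; pos < to; pos++) {
        /* No assert in here: an O(WINDOW) check per byte would dominate the loop. */
        h = (h - buf[pos - WINDOW] * pw) * PRIME + buf[pos];
    }
    assert(h == hash_window(buf + to - WINDOW));     /* checked once, on exit */
    return h;
}
```

If the update inside the loop were wrong, the exit assert would still catch it in assert-enabled builds, at a cost that no longer grows with the number of bytes processed.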

We need this fix because Debian compiles zstd with asserts
enabled. Separately, we should ask them why, and whether they would
consider disabling asserts in their builds, since we don't
optimize for assert-enabled builds.

Fixes Issue #3150.

3 years agoMerge pull request #3147 from animalize/dev
Nick Terrell [Thu, 2 Jun 2022 17:04:55 +0000 (10:04 -0700)] 
Merge pull request #3147 from animalize/dev

fix leaking thread handles on Windows

3 years agoMerge pull request #3148 from ihsinme/patch-1
Yann Collet [Thu, 2 Jun 2022 16:58:45 +0000 (09:58 -0700)] 
Merge pull request #3148 from ihsinme/patch-1

simple fix

3 years agodec: adjust seqSymbol load on aarch64 3141/head
Jun He [Mon, 23 May 2022 06:25:10 +0000 (14:25 +0800)] 
dec: adjust seqSymbol load on aarch64

ZSTD_seqSymbol is a structure with a total width of
64 bits, so on aarch64 it can be loaded in one operation
and its fields extracted by simple shifts or bit-field
extraction.
GCC doesn't recognize this and generates
unnecessary ldr/ldrb/ldrh operations that cause
a performance drop.
With this change, an uplift of 2~4% is observed on
silesia and 2.5~6% on cantrbry @L8 on Arm N1.
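
The idea can be sketched in portable C as a single 64-bit load followed by shifts. The struct below assumes an 8-byte layout (16-bit, 8-bit, 8-bit, 32-bit fields) with illustrative field names, and the shift-based path is only taken on little-endian hosts; it is not the code from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed 8-byte layout; field names are illustrative. */
typedef struct {
    uint16_t next_state;
    uint8_t  nb_additional_bits;
    uint8_t  nb_bits;
    uint32_t base_value;
} seq_symbol_t;

/* Compile-time check that the struct really is 64 bits wide. */
typedef char seq_symbol_is_64_bits[sizeof(seq_symbol_t) == 8 ? 1 : -1];

/* Runtime byte-order probe. */
static int is_little_endian(void)
{
    uint16_t const probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Loads table[state] with one 64-bit read, then extracts fields by shifting. */
static seq_symbol_t load_symbol(const seq_symbol_t *table, size_t state)
{
    seq_symbol_t out;
    assert(table != NULL);
    if (is_little_endian()) {
        uint64_t raw;
        memcpy(&raw, table + state, sizeof raw);          /* single 8-byte load */
        out.next_state         = (uint16_t)(raw & 0xFFFF);
        out.nb_additional_bits = (uint8_t)((raw >> 16) & 0xFF);
        out.nb_bits            = (uint8_t)((raw >> 24) & 0xFF);
        out.base_value         = (uint32_t)(raw >> 32);
    } else {
        out = table[state];                               /* plain struct copy */
    }
    return out;
}
```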

Signed-off-by: Jun He <jun.he@arm.com>
Change-Id: I7748909204cf78a17eb9d4f2333692d53239daa8

3 years agoUpdate zstd_compress.c 3148/head
ihsinme [Mon, 30 May 2022 11:08:19 +0000 (14:08 +0300)] 
Update zstd_compress.c

3 years agofix leaking thread handles on Windows 3147/head
Ma Lin [Mon, 30 May 2022 00:18:54 +0000 (08:18 +0800)] 
fix leaking thread handles on Windows

On Windows, thread handle should be closed explicitly.

Co-authored-by: luben karavelov <luben@users.noreply.github.com>

3 years agocommon: apply two stage copy to aarch64 3145/head
Jun He [Wed, 25 May 2022 14:26:41 +0000 (22:26 +0800)] 
common: apply two stage copy to aarch64

On aarch64, ZSTD_wildcopy uses a simple loop to do
16B-based memory copies. An optimized two-stage copy
already exists and can achieve better performance.
Applying it to aarch64 gives an observed ~1%
uplift on the silesia corpus.
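
A two-stage copy of this kind can be sketched as follows: one unconditional 16-byte copy that finishes short lengths, then a loop moving 32 bytes per iteration. The sketch assumes non-overlapping buffers and that the caller leaves at least `WILDCOPY_OVERLENGTH` bytes of slack after both buffers; the names and the constant are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { WILDCOPY_OVERLENGTH = 32 };   /* slack required after dst and src */

/* Fixed-size copy that compilers typically turn into one vector load/store. */
static void copy16(unsigned char *dst, const unsigned char *src)
{
    memcpy(dst, src, 16);
}

/* Copies at least `length` bytes; may read and write up to
 * WILDCOPY_OVERLENGTH - 1 bytes past the end. Buffers must not overlap. */
static void wildcopy_two_stage(unsigned char *dst, const unsigned char *src,
                               size_t length)
{
    unsigned char *const end = dst + length;
    assert(length > 0);
    copy16(dst, src);                 /* stage 1: short copies finish here */
    if (length <= 16) return;
    dst += 16;
    src += 16;
    do {                              /* stage 2: 32 bytes per iteration */
        copy16(dst, src);
        copy16(dst + 16, src + 16);
        dst += 32;
        src += 32;
    } while (dst < end);
}
```

The first stage keeps the common short case branch-light, while the second stage halves the number of loop iterations compared with a plain 16-byte loop.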

Signed-off-by: Jun He <jun.he@arm.com>
Change-Id: Ic1253308e7a8a7df2d08963ba544e086c81ce8be

3 years agoMerge pull request #3143 from facebook/fixdoc_3142
Yann Collet [Tue, 24 May 2022 17:19:14 +0000 (10:19 -0700)] 
Merge pull request #3143 from facebook/fixdoc_3142

fix small error in format documentation example

3 years agoMerge pull request #3139 from danlark1/dev
Nick Terrell [Tue, 24 May 2022 15:10:26 +0000 (11:10 -0400)] 
Merge pull request #3139 from danlark1/dev

[lazy] Optimize ZSTD_row_getMatchMask for levels 8-10 for ARM

3 years agofix small error in format documentation example 3143/head
Yann Collet [Tue, 24 May 2022 11:47:49 +0000 (04:47 -0700)] 
fix small error in format documentation example

reported by @dkcasset
fix #3142

3 years agoAgain unused error warning. Fixed 3139/head
Danila Kutenin [Mon, 23 May 2022 14:51:47 +0000 (14:51 +0000)] 
Again unused error warning. Fixed

3 years agoMove NEON version to a separate function and fix indentation
Danila Kutenin [Mon, 23 May 2022 14:49:35 +0000 (14:49 +0000)] 
Move NEON version to a separate function and fix indentation

3 years agoDisable unused variable warning
Danila Kutenin [Sun, 22 May 2022 10:50:33 +0000 (10:50 +0000)] 
Disable unused variable warning

3 years ago[lazy] Optimize ZSTD_row_getMatchMask for level 8-10
Danila Kutenin [Sun, 22 May 2022 10:34:33 +0000 (10:34 +0000)] 
[lazy] Optimize ZSTD_row_getMatchMask for level 8-10

We found that movemask is not used properly or consumes too much CPU.
This effort helps to optimize the movemask emulation on ARM.

For levels 8-9 we saw 3-5% improvements. For level 10 we saw a 1.5%
improvement.

The key idea is not to use pure movemasks but to have groups of bits.
For rowEntries == 16, 32 we are going to have groups of size 4 and 2
respectively. It means that each bit will be duplicated within the group.

Then we do an AND to keep only one bit set in each group, so that iterating
by clearing the lowest set bit, `a &= (a - 1)`, still works.

Also, aarch64 does not have rotate instructions for 16 bit, only for 32
and 64, that's why we see more improvements for level 8-9.

vshrn_n_u16 instruction is used to achieve that: vshrn_n_u16 shifts by
4 every u16 and narrows to 8 lower bits. See the picture below. It's
also used in
[Folly](https://github.com/facebook/folly/blob/c5702590080aa5d0e8d666d91861d64634065132/folly/container/detail/F14Table.h#L446).
It also uses 2 cycles according to Neoverse-N{1,2} guidelines.

The 64 bit movemask is already well optimized. We have ongoing experiments
but have not been able to validate that other implementations are reliably faster.

3 years agoMerge pull request #3135 from averred/dev
Yann Collet [Fri, 20 May 2022 17:05:16 +0000 (10:05 -0700)] 
Merge pull request #3135 from averred/dev

Typo in man

3 years agoTypo in man 3135/head
Talha Khan [Fri, 20 May 2022 08:53:48 +0000 (16:53 +0800)] 
Typo in man

3 years agoMerge pull request #3127 from embg/repcode_history
Elliot Gorokhovsky [Thu, 12 May 2022 17:50:15 +0000 (13:50 -0400)] 
Merge pull request #3127 from embg/repcode_history

Correct and clarify repcode offset history logic

3 years agoNits 3127/head
Elliot Gorokhovsky [Thu, 12 May 2022 16:53:15 +0000 (12:53 -0400)] 
Nits