Fangrui Song [Thu, 22 Sep 2022 19:30:21 +0000 (12:30 -0700)]
Move ZSTD_DEPRECATED before ZSTDLIB_API/ZSTDLIB_STATIC_API
Clang doesn't allow [[deprecated(...)]] attribute after __attribute__.
Move [[deprecated(...)]] before __attribute__ to fix C++14/C++17 uses
with Clang.
Andrea Pappacoda [Tue, 20 Sep 2022 20:13:23 +0000 (22:13 +0200)]
ci: test pkg-config file
As mentioned in
https://github.com/facebook/zstd/pull/3252#issuecomment-1251733791 ,
this patch adds a CI job that builds and installs libzstd on the job
runner, and then compiles a sample binary linking against the installed
library; the needed build flags are passed by invoking pkg-config.
Jun He [Fri, 24 Jun 2022 09:07:06 +0000 (17:07 +0800)]
compress:check more bytes to reduce ZSTD_count call
Comparing 4B instead of comparing 1B in ZSTD_noDict
mode, thus it can avoid cases like match in match[ml]
but mismatch in match[ml-3]..match[ml-1]. So the call
count of ZSTD_count can be reduced.
Signed-off-by: Jun He <jun.he@arm.com>
Change-Id: I3449ea423d5c8e8344f75341f19a2d1643c703f6
When creating a new `Makefile` target to build,
it's also necessary to update the `clean` target,
which purpose is to remove built targets when they are present.
This process is simple, but it's also easy to forget :
since there is a large distance between the position in the `Makefile` where the new built target is added,
and the place where the list of files to `clean` is defined.
Moreover, the list of files becomes pretty long over time,
hence it's difficult to visually ensure that all built targets are present there,
or that no old target (no longer produced) is no longer in the list
This PR tries to improve this process by adding a CLEAN variable.
Now, when a new built target is added to the `Makefile`,
it should preceded by :
```
CLEAN += newTarget
newTarget:
<TAB> ...recipe...
```
This new requirement is somewhat similar to `.PHONY: newTarget` for non-built targets.
This new method offers the advantage of locality :
there is no separate place in the file to maintain a list of files to clean.
This makes maintenance of `make clean` easier.
Andrea Pappacoda [Sun, 28 Aug 2022 11:01:20 +0000 (13:01 +0200)]
build(cmake): improve pkg-config generation
With this patch the pkg-config generation when using the CMake build
system is improved in the following ways:
- Libs.private is now filled when needed
- The JoinPaths module is now used to join paths, leading to simpler
code
- The .pc file is always generated, regardless of the platform, as it
can also be consumed on Windows
Here's how the .pc file is affected by these changes, in comparison to
the one generated with the official Makefiles:
Chris Burgess [Mon, 15 Aug 2022 17:29:54 +0000 (13:29 -0400)]
Document pass-through behavior (#3242)
Adds documentation to help and man pages for legacy pass-through behavior
when force is set and destination is stdout. Documents --pass-through in
man pages
Nick Terrell [Fri, 5 Aug 2022 00:15:59 +0000 (17:15 -0700)]
Add explicit --pass-through flag and default to enabled for *cat (#3223)
Fixes #3211.
Adds the `--[no-]pass-through` flag which enables/disables pass-through mode.
* `zstdcat`, `zcat`, and `gzcat` default to `--pass-through`.
Pass-through mode can be disabled by passing `--no-pass-through`.
* All other binaries default to not setting pass-through mode.
However, we preserve the legacy behavior of enabling pass-through
mode when writing to stdout with `-f` set, unless pass-through
mode is explicitly disabled with `--no-pass-through`.
Adds a new test for this behavior that should codify the behavior we want.
Yann Collet [Wed, 3 Aug 2022 19:39:35 +0000 (21:39 +0200)]
fileio_types.h : avoid dependency on mem.h
fileio_types.h cannot be parsed by itself
because it relies on basic types defined in `lib/common/mem.h`.
As for #3231, it likely wasn't detected because `mem.h` was probably included before within target files.
But this is not proper.
A "easy" solution would be to add the missing include,
but each dependency should be considered "bad" by default,
and only allowed if it brings some tangible value.
In this case, since these types are only used to declare internal structure variables
which are effectively only flags,
I believe it's really not valuable to add a dependency on `mem.h` for this purpose
while the standard `int` type can do the same job.
I was expecting some compiler warnings following this change,
but it turns out we don't use `-Wconversion` by default on `zstd` source code,
so there is none.
Nevertheless, I enabled `-Wconversion` locally and proceeded to fix a few conversion warnings in the process.
Adding `-Wconversion` to the list of flags used for `zstd` is something I would be favorable over the long term,
but it cannot be done overnight,
because the nb of places where this warning is triggered is daunting.
Better progressively reduce the nb of triggered `-Wconversion` warnings before enabling this flag by default.
Nick Terrell [Wed, 3 Aug 2022 18:28:39 +0000 (11:28 -0700)]
Fix off-by-one error in superblock mode (#3221)
Fixes #3212.
Long literal and match lengths had an off-by-one error in ZSTD_getSequenceLength.
Fix the off-by-one error, and add a golden compression test that catches the bug.
Also run all the golden tests in the cli-tests framework.
Tom Wang [Fri, 29 Jul 2022 19:51:58 +0000 (12:51 -0700)]
Add warning when multi-thread decompression is requested (#3208)
When user pass in argument for both decompression and multi-thread, print a warning message
to indicate that multi-threaded decompression is not supported.
* Add warning when multi-thread decompression is requested
* add test case for multi-threaded decoding warning
Expectation is for -d -T0 we will not throw any warning,
and see warning for any other -d -T(>1) inputs
In zlib 1.2.12 the OF macro was changed to _Z_OF breaking any
project that used zlibWrapper. To fix this the OF has been
changed to _Z_OF everywhere and _Z_OF is defined as OF in the
case it is not yet defined for zlib 1.2.11 and older.
Jun He [Fri, 29 Jul 2022 17:28:04 +0000 (01:28 +0800)]
lib: add hint to generate more pipeline friendly code (#3138)
With statistic data of test data files of silesia
the chance of position beyond highThreshold is very
low (~1.3%@L8 in most cases, all <2.5%), and is in
"lowprob area". Add the branch hint so compiler can
get better pipiline codegen.
With this change it is observed ~1% of mozilla and
xml, and slight (0.3%~0.8%) but consistent uplift on
other files on Arm N1.
Signed-off-by: Jun He <jun.he@arm.com>
Change-Id: Id9ba1d5c767e975290b5c1bf0ecce906544f4ade
Jun He [Fri, 29 Jul 2022 17:27:20 +0000 (01:27 +0800)]
decomp: add prefetch for matched seq on aarch64 (#3164)
match is used for following sequence copy. It is
only updated when extDict is needed, which is a
low probability case. So it can be prefetched to
reduce cache miss.
The benchmarks on various Arm platforms showed
uplift from 1% ~ 14% with gcc-11/clang-14.
Signed-off-by: Jun He <jun.he@arm.com>
Change-Id: If201af4799d2455d74c79f8387404439d7f684ae
Han Zhu [Wed, 20 Jul 2022 23:01:32 +0000 (16:01 -0700)]
[largeNbDicts] Second try at fixing decompression segfault to always create compressInstructions
Summary:
Freeing an uninitialized pointer is undefined behavior. This caused a segfault
when compiling the benchmark with Clang -O3 and benching decompression.
V2: always create compressInstructions but check if cctxParams is NULL before
setting CCtx params to avoid segfault.
Han Zhu [Wed, 20 Jul 2022 18:14:51 +0000 (11:14 -0700)]
[largeNbDicts] Add an option to print out median speed
Summary:
Added an option -p# where -p0 (default) sets the aggregation method to fastest
speed while -p1 sets the aggregation method to median. Also added a new column
in the csv file to report this option's value.
Test Plan:
``
$ ./largeNbDicts -1 --nbDicts=1 -D ~/benchmarks/html/html_8_16K.32K.dict
~/benchmarks/html/html_8_16K/*
loading 7450 files...
created src buffer of size 83.4 MB
split input into 7450 blocks
loading dictionary /home/zhuhan/benchmarks/html/html_8_16K.32K.dict
compressing at level 1 without dictionary : Ratio=3.03 (28827863 bytes)
compressed using a 32768 bytes dictionary : Ratio=4.28 (20410262 bytes)
generating 1 dictionaries, using 0.1 MB of memory
Compression Speed : 306.0 MB/s
Fastest Speed : 310.6 MB/s
$ ./largeNbDicts -1 --nbDicts=1 -p1 -D ~/benchmarks/html/html_8_16K.32K.dict
~/benchmarks/html/html_8_16K/*
loading 7450 files...
created src buffer of size 83.4 MB
split input into 7450 blocks
loading dictionary /home/zhuhan/benchmarks/html/html_8_16K.32K.dict
compressing at level 1 without dictionary : Ratio=3.03 (28827863 bytes)
compressed using a 32768 bytes dictionary : Ratio=4.28 (20410262 bytes)
generating 1 dictionaries, using 0.1 MB of memory
Compression Speed : 306.9 MB/s
Median Speed : 298.4 MB/s
```
Han Zhu [Tue, 19 Jul 2022 23:50:28 +0000 (16:50 -0700)]
[largeNbDicts] Print more metrics into csv file
Summary:
Add column headers and data for whether it's a compression or a decompression
run, compression level, nbDicts and dictAttachPref in additional to
compr/decompr speed.
Han Zhu [Tue, 19 Jul 2022 20:55:48 +0000 (13:55 -0700)]
[largeNbDicts] Fix decompression segfault in createCompressInstructions
Benchmarking decompression results in a segfault in `createCompressInstructions`
because `cctxParams` is NULL. Skip running that function if we are not benching
compression.