git.ipfire.org Git - thirdparty/zlib-ng.git/log
thirdparty/zlib-ng.git
2 years agoUse static inline in x86_features.c
Cameron Cawley [Sun, 25 Sep 2022 20:42:36 +0000 (21:42 +0100)] 
Use static inline in x86_features.c

2 years agoCheck that the OS supports saving the YMM registers before enabling AVX2
Cameron Cawley [Sun, 25 Sep 2022 19:11:14 +0000 (20:11 +0100)] 
Check that the OS supports saving the YMM registers before enabling AVX2
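
A minimal sketch of the kind of check this title describes, assuming the usual x86 convention: CPUID leaf 1 reports OSXSAVE in ECX bit 27, CPUID leaf 7 reports AVX2 in EBX bit 5, and XCR0 bits 1 and 2 must both be set before the OS can be relied on to save XMM and YMM state. The function name and its register-value parameters are hypothetical; a real implementation would obtain the values from the cpuid and xgetbv instructions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: decides whether AVX2 code paths may be enabled,
 * given raw register values the caller obtained from CPUID and XGETBV. */
bool avx2_usable(uint32_t cpuid1_ecx, uint32_t cpuid7_ebx, uint64_t xcr0) {
    const uint32_t OSXSAVE_BIT = 1u << 27; /* CPUID.1:ECX, OS has enabled XSAVE */
    const uint32_t AVX2_BIT    = 1u << 5;  /* CPUID.(7,0):EBX, CPU implements AVX2 */
    const uint64_t XCR0_SSE    = 1u << 1;  /* OS saves the XMM registers */
    const uint64_t XCR0_AVX    = 1u << 2;  /* OS saves the upper halves of YMM */
    const uint64_t need        = XCR0_SSE | XCR0_AVX;

    if (!(cpuid1_ecx & OSXSAVE_BIT))
        return false;               /* XGETBV is not usable, so XCR0 is unknown */
    if ((xcr0 & need) != need)
        return false;               /* YMM state would be lost on a context switch */
    return (cpuid7_ebx & AVX2_BIT) != 0;
}
```

Taking the register values as parameters keeps the decision logic testable on machines without AVX2.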

2 years agoUse short decompress option name for gzip compatibility. #1347
Nathan Moinvaziri [Wed, 28 Sep 2022 16:38:06 +0000 (09:38 -0700)] 
Use short decompress option name for gzip compatibility. #1347

Long option names in BusyBox are an optional feature, so use short option
names by default.

2 years agoRemove unused tryboth() function
Hans Kristian Rosbach [Sun, 25 Sep 2022 15:59:44 +0000 (17:59 +0200)] 
Remove unused tryboth() function

2 years agovpclmulqdq compilation fails without avx512f also enabled
Hans Kristian Rosbach [Wed, 14 Sep 2022 19:30:43 +0000 (21:30 +0200)] 
vpclmulqdq compilation fails without avx512f also enabled

2 years agoMake visibility tests run the same way as the other tests.
Hans Kristian Rosbach [Wed, 14 Sep 2022 19:29:41 +0000 (21:29 +0200)] 
Make visibility tests run the same way as the other tests.
Fix indentation.

2 years agoDon't try to link gz* objects twice.
Hans Kristian Rosbach [Wed, 14 Sep 2022 19:27:48 +0000 (21:27 +0200)] 
Don't try to link gz* objects twice.

2 years agoRemove errant space in cmake posix specifier
Hans Kristian Rosbach [Wed, 14 Sep 2022 19:26:32 +0000 (21:26 +0200)] 
Remove errant space in cmake posix specifier

2 years agoIBM Z DFLTCC: Fix updating strm.adler with inflate()
Ilya Leoshkevich [Sat, 17 Sep 2022 13:32:29 +0000 (15:32 +0200)] 
IBM Z DFLTCC: Fix updating strm.adler with inflate()

inflate() does not update strm.adler with DFLTCC.
Add a missing assignment to dfltcc_inflate() to fix this.
Note that deflate() is not affected.
Also add a test to prevent regressions.

2 years ago[Compat] Don't use uint32_t for z_crc_t
Mika Lindqvist [Sun, 11 Sep 2022 13:15:10 +0000 (16:15 +0300)] 
[Compat] Don't use uint32_t for z_crc_t
* We don't include stdint.h as it must be included before stdarg.h and other headers might include stdarg.h before us

See #1342

2 years agomsvc/armv7: disable crc32_acle
Shawn Hoffman [Tue, 6 Sep 2022 20:11:16 +0000 (13:11 -0700)] 
msvc/armv7: disable crc32_acle
msvc compiler targeting 32bit arm supports
only armv7 and lacks these intrinsics

2 years agoFixed casting warning from uint64_t to size_t in adler32_copy benchmarks
Nathan Moinvaziri [Sun, 11 Sep 2022 19:36:59 +0000 (15:36 -0400)] 
Fixed casting warning from uint64_t to size_t in adler32_copy benchmarks

2 years agoFixed undefined variable random_ints in adler32_copy benchmarks.
Nathan Moinvaziri [Sun, 11 Sep 2022 19:36:33 +0000 (15:36 -0400)] 
Fixed undefined variable random_ints in adler32_copy benchmarks.

2 years agoPin Google Benchmark to v1.7.0 since master tag has been removed.
Nathan Moinvaziri [Sun, 11 Sep 2022 19:35:24 +0000 (15:35 -0400)] 
Pin Google Benchmark to v1.7.0 since master tag has been removed.

2 years agoInflate: Increase max root table sizes to 10 and 9
Dougall Johnson [Mon, 22 Aug 2022 00:57:39 +0000 (10:57 +1000)] 
Inflate: Increase max root table sizes to 10 and 9

This increases the size of the `codes` array by 1920 bytes (33%), but
improves performance a little. Root table size is still limited by the
maximum code length in use, so tiny files typically see no change to
table-building time, as they don't use longer codes.
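
The quoted figures can be checked with a little arithmetic. The sketch below assumes 4-byte table entries, worst-case table sizes of 852 and 592 entries for the previous 9-bit length root and 6-bit distance root, and 1332 and 592 entries for the new 10-bit and 9-bit roots. These constants are not stated in this commit and should be treated as assumptions; the function names are hypothetical.

```c
#include <stddef.h>

enum {
    ENTRY_BYTES = 4,    /* assumed size of one table entry */
    OLD_LENS    = 852,  /* assumed worst case, 9-bit length root */
    OLD_DISTS   = 592,  /* assumed worst case, 6-bit distance root */
    NEW_LENS    = 1332, /* assumed worst case, 10-bit length root */
    NEW_DISTS   = 592   /* assumed worst case, 9-bit distance root */
};

/* Size in bytes of a codes array holding both tables. */
size_t codes_bytes(size_t lens, size_t dists) {
    return (lens + dists) * ENTRY_BYTES;
}

/* Extra bytes needed by the larger root tables. */
size_t codes_growth_bytes(void) {
    return codes_bytes(NEW_LENS, NEW_DISTS) - codes_bytes(OLD_LENS, OLD_DISTS);
}

/* Growth relative to the old array, as a whole percentage rounded down. */
size_t codes_growth_percent(void) {
    return codes_growth_bytes() * 100 / codes_bytes(OLD_LENS, OLD_DISTS);
}
```

Under these assumptions the array grows from 5776 to 7696 bytes, which agrees with the 1920 bytes and 33% given in the commit message.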

2 years agoFix build failure introduced by 8df665005952cdbe7dc995d409ffe4f861e7a83e
Mika Lindqvist [Sun, 11 Sep 2022 15:57:02 +0000 (18:57 +0300)] 
Fix build failure introduced by 8df665005952cdbe7dc995d409ffe4f861e7a83e
* We need absolute path for libz.a

2 years agoFix wine32 dependencies for MinGW i686.
Mika Lindqvist [Mon, 29 Aug 2022 07:50:05 +0000 (10:50 +0300)] 
Fix wine32 dependencies for MinGW i686.
* Downgrade broken libgd3

2 years agoAdd CVE-2022-37434 test.
Vladislav Shchapov [Fri, 19 Aug 2022 11:33:59 +0000 (16:33 +0500)] 
Add CVE-2022-37434 test.

Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>

2 years agoIf the extra field was larger than the space the user provided with
Mika Lindqvist [Fri, 19 Aug 2022 12:00:21 +0000 (15:00 +0300)] 
If the extra field was larger than the space the user provided with
inflateGetHeader(), and if multiple calls of inflate() delivered
the extra header data, then there could be a buffer overflow of the
provided space. This commit ensures that the provided space is not
exceeded.

See #1323.
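
The idea behind the fix can be sketched as a clamped copy. This is an illustration only, not the code from the commit: the helper name and its parameters are hypothetical, with `seen` standing for the number of extra-field bytes delivered by earlier inflate() calls and `extra_max` for the capacity of the user's buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store up to `n` newly arrived extra-field bytes
 * without ever writing past `extra_max`. Returns the number of bytes kept. */
size_t store_extra_bounded(uint8_t *extra, size_t extra_max, size_t seen,
                           const uint8_t *src, size_t n) {
    if (extra == NULL || seen >= extra_max)
        return 0;                       /* no buffer, or buffer already full */
    size_t room = extra_max - seen;     /* space left after earlier calls */
    size_t keep = n < room ? n : room;  /* clamp to the remaining space */
    memcpy(extra + seen, src, keep);
    return keep;
}
```

The important detail is that the clamp uses the running total across calls, so a field split over several inflate() calls cannot exceed the buffer either.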

2 years agoformat Vcpkg
FrankXie [Fri, 26 Aug 2022 02:19:03 +0000 (19:19 -0700)] 
format Vcpkg

2 years agoformat and add vcpkg headings.
FrankXie [Fri, 26 Aug 2022 02:16:00 +0000 (19:16 -0700)] 
format and add vcpkg headings.

2 years agoAdd vcpkg installation instructions
FrankXie [Thu, 25 Aug 2022 03:02:12 +0000 (20:02 -0700)] 
Add vcpkg installation instructions

2 years agoAdd add_subdirectory test.
Vladislav Shchapov [Wed, 17 Aug 2022 16:06:37 +0000 (21:06 +0500)] 
Add add_subdirectory test.

Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>

2 years agoFix build as subproject.
Vladislav Shchapov [Wed, 17 Aug 2022 13:13:39 +0000 (18:13 +0500)] 
Fix build as subproject.

Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>

2 years agofix ACLE detection on msvc/arm64
Shawn Hoffman [Sun, 24 Jul 2022 04:01:04 +0000 (21:01 -0700)] 
fix ACLE detection on msvc/arm64

2 years agomsvc/arm64: fix narrowing/signed conversion warning in NEON_accum32
Shawn Hoffman [Sun, 24 Jul 2022 04:01:47 +0000 (21:01 -0700)] 
msvc/arm64: fix narrowing/signed conversion warning in NEON_accum32

2 years agoRemove extra compiler names
Long Nguyen [Sat, 13 Aug 2022 03:38:02 +0000 (10:38 +0700)] 
Remove extra compiler names

2 years agoSet `CMAKE_*_COMPILER_TARGET` to full triplet for mingw toolchains
Long Nguyen [Tue, 12 Jul 2022 09:27:52 +0000 (16:27 +0700)] 
Set `CMAKE_*_COMPILER_TARGET` to full triplet for mingw toolchains

2 years agoRemove ZLIB_DUAL_LINK option to simplify dual link tests.
Nathan Moinvaziri [Sat, 20 Aug 2022 16:55:49 +0000 (09:55 -0700)] 
Remove ZLIB_DUAL_LINK option to simplify dual link tests.

2 years agoActually run `configure` CI on macOS with GCC
Mosè Giordano [Wed, 24 Aug 2022 19:39:03 +0000 (20:39 +0100)] 
Actually run `configure` CI on macOS with GCC

`gcc` is an alias for Apple Clang on macOS.  See for example https://github.com/zlib-ng/zlib-ng/runs/7963904948?check_suite_focus=true
```
gcc -O2  -std=c11 -Wall -fPIC -DNDEBUG -DHAVE_POSIX_MEMALIGN -DWITH_GZFILEOP -fno-semantic-interposition -DHAVE_VISIBILITY_HIDDEN -DHAVE_VISIBILITY_INTERNAL -DHAVE_BUILTIN_CTZ -DHAVE_BUILTIN_CTZLL -DX86_FEATURES -DX86_AVX2 -DX86_AVX2_ADLER32 -DX86_AVX_CHUNKSET -DX86_AVX512 -DX86_AVX512_ADLER32 -DX86_MASK_INTRIN -DX86_AVX512VNNI -DX86_AVX512VNNI_ADLER32 -DX86_SSE41 -DX86_SSE42_CRC_HASH -DX86_SSE42_ADLER32 -DX86_SSE42_CRC_INTRIN -DX86_SSE2 -DX86_SSE2_CHUNKSET -DX86_SSSE3 -DX86_SSSE3_ADLER32 -DX86_PCLMULQDQ_CRC -DPIC -I/Users/runner/work/zlib-ng/zlib-ng -c -o adler32.lo /Users/runner/work/zlib-ng/zlib-ng/adler32.c
clang: warning: argument unused during compilation: '-fno-semantic-interposition' [-Wunused-command-line-argument]
```

In order to use the real GCC you have to call `gcc-9`, `gcc-10`, or `gcc-11`: https://github.com/actions/runner-images/blob/06dd4c14e4aa8c14febdd8d6cf123b8d770b4e4a/images/macos/macos-11-Readme.md#language-and-runtime.

2 years agoFixed content already populated error in CMake scripts. #1327
Nathan Moinvaziri [Sun, 21 Aug 2022 17:03:32 +0000 (10:03 -0700)] 
Fixed content already populated error in CMake scripts. #1327

Only one of the two approaches should be needed: either
FetchContent_MakeAvailable, or FetchContent_GetProperties together with
FetchContent_Populate, but not both. We use the latter for compatibility
with older CMake versions.

3 years agocmake: respect custom `RC` flags and delete `GCC_WINDRES`
Viktor Szakats [Sun, 17 Jul 2022 19:33:01 +0000 (19:33 +0000)] 
cmake: respect custom `RC` flags and delete `GCC_WINDRES`

Before this patch, `zlib.rc` was compiled using a manual command [1] when
using the MinGW (and MSYS/Cygwin) toolchains. This method ignores
`CMAKE_RC_FLAGS` and offers no other way to pass a custom flag, breaking
the build in cases where a custom `windres` option is required, e.g.
`--target=` or `-I` on some platforms and configurations, in particular
with `llvm-windres`.

This patch deletes the special case for these toolchains and lets CMake
compile the `.rc` file the default way used for all Windows targets.

I'm not entirely sure why this special case was added back in 2011. The
need to pass `-DGCC_WINDRES` is my suspect. We could resolve this much
more simply by adding this line for the targets that require it:
   set(CMAKE_RC_FLAGS "${CMAKE_RC_FLAGS} -DGCC_WINDRES")

However, the `.rc` line protected by `GCC_WINDRES` works just fine with
`windres` these days. Moreover, that protected line contains obsolete
flags from the 16-bit era, which have had no effect for a long time, as
documented here:
<https://docs.microsoft.com/windows/win32/menurc/common-resource-attributes>

So, this patch deletes `GCC_WINDRES` from the project entirely.

[1] dc5a43e

3 years agoReorganize cmake scripts for tests.
Nathan Moinvaziri [Mon, 4 Jul 2022 18:53:46 +0000 (11:53 -0700)] 
Reorganize cmake scripts for tests.

* Moves cmake scripts for testing into test/cmake.
* Separates out related add_tests into separate cmake scripts.
* Moves building test binaries into CMakeLists.txt in test directory.

3 years agoFix inflateBack to detect invalid input with distances too far.
Mark Adler [Thu, 30 Jun 2022 19:04:27 +0000 (12:04 -0700)] 
Fix inflateBack to detect invalid input with distances too far.

3 years agoDon't use unaligned access for memcpy instructions due to GCC 11 assuming it is align...
Nathan Moinvaziri [Wed, 29 Jun 2022 15:57:11 +0000 (08:57 -0700)] 
Don't use unaligned access for memcpy instructions due to GCC 11 assuming it is aligned in certain instances.

3 years agoTreat arm64 as aarch64 for Apple M1.
Mika Lindqvist [Sat, 9 Jul 2022 09:33:02 +0000 (12:33 +0300)] 
Treat arm64 as aarch64 for Apple M1.

3 years agoFixed functions declared without a prototype warning in tools.
Nathan Moinvaziri [Sat, 2 Jul 2022 20:59:19 +0000 (13:59 -0700)] 
Fixed functions declared without a prototype warning in tools.

  tools/maketrees.c:101:29: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
static void gen_trees_header()

  tools/makecrct.c:65:27: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
static void make_crc_table()
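
For context, a minimal sketch of the change this kind of warning calls for. In C17 and earlier, empty parentheses leave the parameter list unspecified, while `(void)` declares a proper prototype. The function below is hypothetical and only illustrates the syntax.

```c
/* Before: `int count_calls()` says nothing about the parameters, so the
 * compiler cannot reject a mistaken call such as count_calls(1). */

/* After: `(void)` states that the function takes no arguments, which
 * satisfies -Wstrict-prototypes. */
int count_calls(void) {
    static int calls = 0;   /* persists across invocations */
    return ++calls;
}
```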

3 years agoDon't use zlib fork identifier in copyright statement.
Nathan Moinvaziri [Thu, 30 Jun 2022 18:23:23 +0000 (11:23 -0700)] 
Don't use zlib fork identifier in copyright statement.

3 years agoUse 15 (0xf) for ZLIB_VER_SUBREVISION to indicate zlib fork.
Nathan Moinvaziri [Thu, 30 Jun 2022 18:23:37 +0000 (11:23 -0700)] 
Use 15 (0xf) for ZLIB_VER_SUBREVISION to indicate zlib fork.

3 years agoSet max time for job to complete to 60 minutes.
Nathan Moinvaziri [Sun, 19 Jun 2022 21:42:57 +0000 (14:42 -0700)] 
Set max time for job to complete to 60 minutes.

3 years agoUse msvc-dev-cmd to set the development environment.
Nathan Moinvaziri [Sun, 19 Jun 2022 20:33:56 +0000 (13:33 -0700)] 
Use msvc-dev-cmd to set the development environment.

3 years agoSplit long workflow commands into separate lines for easier readability.
Nathan Moinvaziri [Fri, 24 Jun 2022 18:22:41 +0000 (11:22 -0700)] 
Split long workflow commands into separate lines for easier readability.

3 years agoAdded whitespace separation between steps in fuzzer workflow.
Nathan Moinvaziri [Sun, 19 Jun 2022 20:19:29 +0000 (13:19 -0700)] 
Added whitespace separation between steps in fuzzer workflow.

3 years agoRemove unused sanitizer options in configure workflow.
Nathan Moinvaziri [Sun, 19 Jun 2022 20:19:11 +0000 (13:19 -0700)] 
Remove unused sanitizer options in configure workflow.

3 years agoUse working-directory property for run actions instead of changing directory.
Nathan Moinvaziri [Sun, 19 Jun 2022 20:19:53 +0000 (13:19 -0700)] 
Use working-directory property for run actions instead of changing directory.

3 years agoCollapse workflow multiline run actions into single line.
Nathan Moinvaziri [Sun, 19 Jun 2022 20:17:02 +0000 (13:17 -0700)] 
Collapse workflow multiline run actions into single line.

3 years agoClean up workflow and job names. Remove CI prefix because it is redundant.
Nathan Moinvaziri [Fri, 24 Jun 2022 14:38:40 +0000 (07:38 -0700)] 
Clean up workflow and job names. Remove CI prefix because it is redundant.

3 years agoUpgrade to actions/checkout@v3.
Nathan Moinvaziri [Sun, 19 Jun 2022 19:44:47 +0000 (12:44 -0700)] 
Upgrade to actions/checkout@v3.

3 years agoFixed conversion warning with level in compress_bound unit tests.
Nathan Moinvaziri [Wed, 29 Jun 2022 16:00:53 +0000 (09:00 -0700)] 
Fixed conversion warning with level in compress_bound unit tests.

  test_compress_bound.cc(43,1): warning C4267: 'argument': conversion from 'size_t' to 'int32_t', possible loss of data

3 years agoUninstall strawberryperl
Ilya Leoshkevich [Mon, 27 Jun 2022 18:16:54 +0000 (20:16 +0200)] 
Uninstall strawberryperl

strawberryperl installs /c/Strawberry/c/bin/libstdc++-6.dll, which is
incompatible with the mingw64 one. zlib-ng does not need perl, so
simply remove it.

3 years agoearly return as requested
Lucy Phipps [Wed, 8 Jun 2022 17:48:19 +0000 (18:48 +0100)] 
early return as requested

3 years agoremove UNROLL_MORE as suggested
Lucinda May Phipps [Tue, 7 Jun 2022 13:59:39 +0000 (14:59 +0100)] 
remove UNROLL_MORE as suggested

3 years agocrc32_acle.c: make logic more consistent
Lucinda May Phipps [Fri, 13 May 2022 07:48:17 +0000 (08:48 +0100)] 
crc32_acle.c: make logic more consistent

3 years agoBump _POSIX_C_SOURCE to 200809 for strdup()
Ilya Leoshkevich [Tue, 28 Jun 2022 08:51:39 +0000 (10:51 +0200)] 
Bump _POSIX_C_SOURCE to 200809 for strdup()

Google Test uses strdup(), which makes building tests fail on a fresh
MSYS2 setup:

In file included from zlib-ng/_deps/googletest-src/googletest/include/gtest/internal/gtest-internal.h:40,
                 from zlib-ng/_deps/googletest-src/googletest/include/gtest/gtest.h:62,
                 from zlib-ng/test/test_compress.cc:17:
zlib-ng/_deps/googletest-src/googletest/include/gtest/internal/gtest-port.h: In function ‘char* testing::internal::posix::StrDup(const char*)’:
zlib-ng/_deps/googletest-src/googletest/include/gtest/internal/gtest-port.h:2046:47: error: ‘strdup’ was not declared in this scope; did you mean ‘StrDup’?
 2046 | inline char* StrDup(const char* src) { return strdup(src); }
      |                                               ^~~~~~
      |                                               StrDup

Bump _POSIX_C_SOURCE to enable this function. An alternative solution
would be to define _POSIX_C_SOURCE in test/CMakeLists.txt, but having a
bigger value for zlib-ng itself should not hurt.

Include zbuild.h earlier in minideflate.c in order to make the new
setting take effect for this file.

3 years agoImprove the swizzle of the memory magazine fed in a chunk copy for neon
Adam Stylinski [Thu, 2 Jun 2022 22:46:56 +0000 (18:46 -0400)] 
Improve the swizzle of the memory magazine fed in a chunk copy for neon

Like the x86 variant, we can leverage the same tables to load a vector
register worth of values. This shows a vast improvement in places where
very large run length encodes can be found in the lz runs.

3 years agoAdded workflow to test linking zlib and zlib-ng compat against native zlib-ng.
Nathan Moinvaziri [Sun, 19 Jun 2022 15:52:38 +0000 (08:52 -0700)] 
Added workflow to test linking zlib and zlib-ng compat against native zlib-ng.

3 years agoImprove dual link test to compile against zlib. Previously we were only linking again...
Nathan Moinvaziri [Sun, 19 Jun 2022 16:02:27 +0000 (09:02 -0700)] 
Improve dual link test to compile against zlib. Previously we were only linking against it.

3 years agoIBM Z DFLTCC: Simplify includes in dfltcc_detail.h
Ilya Leoshkevich [Wed, 15 Jun 2022 17:10:43 +0000 (19:10 +0200)] 
IBM Z DFLTCC: Simplify includes in dfltcc_detail.h

Include zbuild.h instead of the standard headers. Keep stdio.h, since
it's provided only conditionally.

Suggested-by: Nathan Moinvaziri <nathan@nathanm.com>

3 years agoIBM Z DFLTCC: Test with MSan
Ilya Leoshkevich [Tue, 14 Jun 2022 09:19:29 +0000 (11:19 +0200)] 
IBM Z DFLTCC: Test with MSan

* Add a CI job.
* Do not collect coverage: LLVM's gcov support (part of compiler-rt)
  cannot be built with the MSan instrumentation, which means that
  whenever it's called (in particular, in order to write the results to
  a file at the end), there is a risk of false positives.
* Add __msan_unpoison() calls to DFLTCC inline assembly.
* Make parameter block sizes symbolic constants.
* Move dfltcc() definition after struct dfltcc_param_v0 definition.

3 years agoAdd a test for concurrently modifying deflate() input
Ilya Leoshkevich [Fri, 3 Jun 2022 13:38:19 +0000 (15:38 +0200)] 
Add a test for concurrently modifying deflate() input

The test simulates what one of the QEMU live migration tests is doing:
increments each buffer byte by 1 while deflate()ing it.

The test tries to produce a race condition and is therefore
probabilistic. The longer it runs, the better the chances of catching
an issue. The scenario in question is known to be broken on IBM Z
with DFLTCC, where it is caught within 100ms most of the time. The
run time is therefore set to 1 second in order to balance usability and
reliability.

3 years agoUsed fixed width uint8_t for crc32 and adler32 function declarations.
Nathan Moinvaziri [Sun, 5 Jun 2022 20:59:44 +0000 (13:59 -0700)] 
Used fixed width uint8_t for crc32 and adler32 function declarations.

3 years agoUse uint64_t instead of size_t for len in adler32 to be consistent with crc32.
Nathan Moinvaziri [Mon, 6 Jun 2022 04:28:49 +0000 (21:28 -0700)] 
Use uint64_t instead of size_t for len in adler32 to be consistent with crc32.

3 years agoFix MSVC possible loss of data warning in crc32_pclmulqdq by converting len types...
Nathan Moinvaziri [Sun, 5 Jun 2022 23:54:32 +0000 (16:54 -0700)] 
Fix MSVC possible loss of data warning in crc32_pclmulqdq by converting len types to use uint64_t.

arch\x86\crc32_fold_pclmulqdq.c(604,43): warning C4244: 'function':
  conversion from 'uint64_t' to 'size_t', possible loss of data

3 years agoPrint gtest_zlib test results using color.
Nathan Moinvaziri [Tue, 21 Jun 2022 03:27:00 +0000 (20:27 -0700)] 
Print gtest_zlib test results using color.

3 years agoIn compatibility mode, always define z_crc_t as uint32_t for backwards compatibility.
Mika Lindqvist [Fri, 17 Jun 2022 10:06:56 +0000 (13:06 +0300)] 
In compatibility mode, always define z_crc_t as uint32_t for backwards compatibility.

3 years agoFix typo
Tobias Stoeckmann [Sat, 18 Jun 2022 19:00:10 +0000 (21:00 +0200)] 
Fix typo

Typo found with codespell.

3 years agoAdd public compile definition for zlib-ng API so that other projects that use CMake...
Nathan Moinvaziri [Sun, 12 Jun 2022 16:01:15 +0000 (09:01 -0700)] 
Add public compile definition for zlib-ng API so that other projects that use CMake and link against the zlib project can easily determine whether or not to include "zlib-ng.h" or "zlib.h".

3 years agoHandle invalid windowBits in init functions
Tobias Stoeckmann [Mon, 13 Jun 2022 16:43:16 +0000 (18:43 +0200)] 
Handle invalid windowBits in init functions

Negative windowBits arguments are eventually turned positive in
deflateInit2_ and inflateInit2_ (more precisely in inflateReset2).
Such values are used to indicate that raw deflate/inflate should
be performed.

If a user supplies INT32_MIN for windowBits, the code will perform
-INT32_MIN which does not fit into int32_t. In fact, this is
undefined behavior in C and should be avoided.

Clearly this is a user error, but given the careful validation of
input arguments a few lines later in deflateInit2_ I think this
might be of interest.

Proof of Concept:

- Compile zlib-ng with gcc -ftrapv or -fsanitize=undefined
- Compile and run this program:

```
#include <stdint.h>   /* INT32_MIN */
#include <stdio.h>
#include <zlib-ng.h>

int main(void) {
    zng_stream de_stream = { 0 }, in_stream = { 0 };
    int result;

    /* INT32_MIN has no positive counterpart in int32_t, so negating it
     * inside the init functions is undefined behavior. */
    result = zng_deflateInit2(&de_stream, 0, Z_DEFLATED, INT32_MIN,
        MAX_MEM_LEVEL, Z_DEFAULT_STRATEGY);
    printf("zng_deflateInit2: %d\n", result);

    result = zng_inflateInit2(&in_stream, INT32_MIN);
    printf("zng_inflateInit2: %d\n", result);

    return 0;
}
```
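
One way to avoid the overflow is to range-check the value before negating it. The following is a minimal sketch under stated assumptions, not the code from this commit: the helper name is hypothetical, it accepts magnitudes from 8 to 15 only, and it ignores the gzip and auto-detect encodings that add 16 or 32 to windowBits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns false for invalid input. On success it
 * stores the positive window size in *bits and sets *raw when a negative
 * value requested raw deflate/inflate. */
bool normalize_window_bits(int32_t window_bits, int32_t *bits, bool *raw) {
    *raw = false;
    if (window_bits < 0) {
        if (window_bits < -15)
            return false;            /* rejects INT32_MIN before any negation */
        *raw = true;
        window_bits = -window_bits;  /* safe: value is within [-15, -1] */
    }
    if (window_bits < 8 || window_bits > 15)
        return false;                /* assumed valid range for this sketch */
    *bits = window_bits;
    return true;
}
```

Because the lower bound is tested first, the negation can only ever see values whose positive counterpart fits in int32_t.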

3 years agoExtend GZIP conditional
Tobias Stoeckmann [Mon, 13 Jun 2022 16:46:00 +0000 (18:46 +0200)] 
Extend GZIP conditional

If gzip support has been disabled during compilation then also
consider gzip relevant states as invalid in deflateStateCheck.

Also the gzip state definitions can be removed.

This change leads to failure in test/example, and I am not sure
what the GZIP conditional is trying to achieve. All gzip related
functions are still defined in zlib.h.

An alternative approach is to remove the GZIP define.

3 years agoFixed conversion warnings for wsize in slide_hash_c.
Nathan Moinvaziri [Tue, 7 Jun 2022 15:54:26 +0000 (08:54 -0700)] 
Fixed conversion warnings for wsize in slide_hash_c.

  slide_hash.c(50,44): warning C4244: 'function': conversion from 'unsigned int' to 'uint16_t', possible loss of data
  slide_hash.c(51,40): warning C4244: 'function': conversion from 'unsigned int' to 'uint16_t', possible loss of data

3 years agoFixed inflate size conversion warning in chunkcopy_safe.
Nathan Moinvaziri [Mon, 6 Jun 2022 23:22:26 +0000 (16:22 -0700)] 
Fixed inflate size conversion warning in chunkcopy_safe.

  inflate_p.h(142,27): warning C4244: 'function': conversion from 'uint64_t' to 'size_t', possible loss of data

3 years agozlib 1.2.12
Nathan Moinvaziri [Tue, 7 Jun 2022 21:21:48 +0000 (14:21 -0700)] 
zlib 1.2.12

3 years agoRemove unused chunkcopy_safe function prototypes.
Nathan Moinvaziri [Mon, 6 Jun 2022 23:20:17 +0000 (16:20 -0700)] 
Remove unused chunkcopy_safe function prototypes.

3 years agoFixed signed warnings in GZ unit tests.
Nathan Moinvaziri [Mon, 6 Jun 2022 23:18:19 +0000 (16:18 -0700)] 
Fixed signed warnings in GZ unit tests.

  test_gzio.cc(44): warning C4389: '!=': signed/unsigned mismatch
  test_gzio.cc(72): warning C4389: '!=': signed/unsigned mismatch

3 years agoAdded Intel's Fast CRC Computation for Generic Polynomials Using PCLMULQDQ Instructio...
Nathan Moinvaziri [Sun, 5 Jun 2022 20:38:51 +0000 (13:38 -0700)] 
Added Intel's Fast CRC Computation for Generic Polynomials Using PCLMULQDQ Instruction paper to docs folder.

3 years agoUpload abi files when pkgcheck failure occurs.
Nathan Moinvaziri [Sat, 4 Jun 2022 16:20:46 +0000 (09:20 -0700)] 
Upload abi files when pkgcheck failure occurs.

3 years agoabicheck.sh: update reference versions.
Dan Kegel [Sat, 4 Jun 2022 16:44:10 +0000 (09:44 -0700)] 
abicheck.sh: update reference versions.

Co-authored-by: Nathan Moinvaziri <nathan@nathanm.com>

3 years agoMove crc32 fold functions into templates. Don't store xmm_crc_part between runs becau...
Nathan Moinvaziri [Fri, 15 Apr 2022 02:49:32 +0000 (19:49 -0700)] 
Move crc32 fold functions into templates. Don't store xmm_crc_part between runs because it is automatically folded into the checksum in partial_fold.

Co-authored-by: Adam Stylinski <kungfujesus06@gmail.com>

3 years agoRemove zng_gzgetc_ function from zlib-ng native API.
Hans Kristian Rosbach [Fri, 11 Feb 2022 12:49:28 +0000 (13:49 +0100)] 
Remove zng_gzgetc_ function from zlib-ng native API.
It exists in zlib for backwards compatibility, but has never been
documented/advertised for use in zlib-ng's native API.

3 years agoSimplify version and struct size checking, and ensure we do it the same way everywhere.
Hans Kristian Rosbach [Fri, 11 Feb 2022 11:08:43 +0000 (12:08 +0100)] 
Simplify version and struct size checking, and ensure we do it the same way everywhere.

3 years agoSimplify zlib-ng native API by removing version and struct size checks.
Hans Kristian Rosbach [Fri, 11 Feb 2022 10:11:04 +0000 (11:11 +0100)] 
Simplify zlib-ng native API by removing version and struct size checks.
This should be backwards compatible with applications compiled for 2.0.x.

3 years ago[ARM] We need to include NEON headers when testing for -mfpu=neon.
Mika Lindqvist [Tue, 31 May 2022 12:58:17 +0000 (15:58 +0300)] 
[ARM] We need to include NEON headers when testing for -mfpu=neon.
* If -mfpu is already specified in C_FLAGS, it can disable NEON support.

3 years agoCMakeLists.txt: fix version in zlib.pc when building statically
Fabrice Fontaine [Fri, 27 May 2022 21:25:21 +0000 (23:25 +0200)] 
CMakeLists.txt: fix version in zlib.pc when building statically

When building statically (i.e. with BUILD_SHARED_LIBS=OFF),
ZLIB_FULL_VERSION is not set resulting in an empty version in zlib.pc
and the following build failure with transmission:

checking for ZLIB... configure: error: Package requirements (zlib >= 1.2.3) were not met:

Package dependency requirement 'zlib >= 1.2.3' could not be satisfied.
Package 'zlib' has version '', required version is '>= 1.2.3'

Fixes:
 - http://autobuild.buildroot.org/results/b3b882482f517726e5c780ba4c37818bd379df82

Signed-off-by: Fabrice Fontaine <fontaine.fabrice@gmail.com>

3 years agoAllow external gtest
Vladislav Shchapov [Thu, 26 May 2022 18:39:04 +0000 (23:39 +0500)] 
Allow external gtest

Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>

3 years agoCorrect incorrect inputs provided to the CRC functions.
Mark Adler [Tue, 10 May 2022 15:11:32 +0000 (08:11 -0700)] 
Correct incorrect inputs provided to the CRC functions.
The previous releases of zlib were not sensitive to incorrect CRC
inputs with bits set above the low 32. This commit restores that
behavior, so that applications with such bugs will continue to
operate as before.
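
The behavior described above can be illustrated with a plain bitwise CRC-32 that masks its starting value. This is a sketch, not the implementation from the commit: the function name is hypothetical, and the starting value is declared as a 64-bit integer purely to model a caller that passes stray bits above the low 32.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 over the reflected polynomial 0xEDB88320. Slow, but easy
 * to verify. Bits of `crc` above the low 32 are ignored. */
uint32_t crc32_masked(uint64_t crc, const uint8_t *buf, size_t len) {
    uint32_t c = (uint32_t)(crc & 0xffffffffu); /* discard the high bits */
    c = ~c;                                     /* pre-conditioning */
    for (size_t i = 0; i < len; i++) {
        c ^= buf[i];
        for (int k = 0; k < 8; k++)             /* one shift per input bit */
            c = (c >> 1) ^ (0xEDB88320u & (0u - (c & 1u)));
    }
    return ~c;                                  /* post-conditioning */
}
```

With the mask in place, continuing a checksum from a value polluted with high bits gives the same result as continuing from the clean 32-bit value.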

Co-authored-by: Nathan Moinvaziri <nathan@nathanm.com>

3 years agoCorrect comment for x2nmodp.
Mark Adler [Tue, 10 May 2022 15:04:47 +0000 (08:04 -0700)] 
Correct comment for x2nmodp.

3 years agoDefine W = 8 on powerpc64 for braided crc32 generation.
Nathan Moinvaziri [Tue, 10 May 2022 06:42:00 +0000 (23:42 -0700)] 
Define W = 8 on powerpc64 for braided crc32 generation.

3 years agoSpeed up software CRC-32 computation by a factor of 1.5 to 3.
Nathan Moinvaziri [Tue, 24 May 2022 18:44:20 +0000 (11:44 -0700)] 
Speed up software CRC-32 computation by a factor of 1.5 to 3.
Use the interleaved method of Kadatch and Jenkins in order to make
use of pipelined instructions through multiple ALUs in a single
core. This also speeds up and simplifies the combination of CRCs,
and updates the functions to pre-calculate and use an operator for
CRC combination.

Co-authored-by: Nathan Moinvaziri <nathan@nathanm.com>

3 years agoAdding avx512_vnni inline + copy elision
Adam Stylinski [Fri, 8 Apr 2022 17:24:21 +0000 (13:24 -0400)] 
Adding avx512_vnni inline + copy elision

Interesting revelation while benchmarking all of this is that our
chunkmemset_avx seems to be slower in a lot of use cases than
chunkmemset_sse.  That will be an interesting function to attempt to
optimize.

Right now though, we're basically beating google for all PNG decode and
encode benchmarks.  There are some variations of flags that can
basically have us trading blows, but we're about as much as 14% faster
than chromium's zlib patches.

While we're here, add a more direct benchmark of the folded copy method
versus the explicit copy + checksum.

3 years agoAdded inlined AVX512 adler checksum + copy
Adam Stylinski [Fri, 8 Apr 2022 02:57:09 +0000 (22:57 -0400)] 
Added inlined AVX512 adler checksum + copy

While we're here, also simplify the "fold" signature, as reducing the
number of rebases and horizontal sums did not prove to be meaningfully
faster (slower in many circumstances).

3 years agoAdd AVX2 inline copy + adler implementation
Adam Stylinski [Wed, 6 Apr 2022 19:38:20 +0000 (15:38 -0400)] 
Add AVX2 inline copy + adler implementation

This was pretty much across the board wins for performance, but the wins
are very data dependent and it sort of depends on what copy runs look
like.  On our less than realistic data in benchmark_zlib_apps, the
decode test saw some of the bigger gains, ranging anywhere from 6 to 11%
when compiled with AVX2 on a Cascade Lake CPU (and with only AVX2
enabled).  The decode on realistic imagery enjoyed smaller gains,
somewhere between 2 and 4%.

Interestingly, there was one outlier on encode, at level 5.  The best
theory for this is that the copy runs for that particular compression
level were such that glibc's ERMS aware memmove implementation managed
to marginally outpace the copy during the checksum with the move rep str
sequence thanks to clever microcoding on Intel's part. It's hard to say
for sure but the most standout difference between the two perf profiles
was more time spent in memmove (which is expected, as it's calling
memcpy instead of copying the bytes during the checksum).

There's the distinct possibility that the AVX2 checksums could be
marginally improved by one level of unrolling (like what's done in the
SSE3 implementation).  The AVX512 implementations are certainly getting
gains from this but it's not appropriate to append this optimization in
this series of commits.

3 years agoAdding an SSE42 optimized copy + adler checksum implementation
Adam Stylinski [Sun, 3 Apr 2022 16:18:12 +0000 (12:18 -0400)] 
Adding an SSE42 optimized copy + adler checksum implementation

We are protecting its usage around a lot of preprocessor macros as the
other methods are not yet implemented and calling this version bypasses
the faster adler implementations implicitly.

When more versions are written for faster vectorizations, the functable
entries will be populated and preprocessor macros removed. This round,
the copy + checksum is not employing as many tricks as one would hope
with a "folded" checksum routine.  The reason for this is the
particularly tricky case of dealing with unaligned buffers.  The
implementations which don't have CPUs in the mix that have a huge
penalty for unaligned loads will have a much faster implementation.

Fancier methods that minimized rebasing, while having the potential to
be faster, ended up being slower because the compiler structured the
code in a way that ended up either spilling to the stack or trampolining
out of a loop and back in it instead of just jumping over the first load
and store.

Revisiting this for AVX512, where more registers are abundant and more
advanced loads exist, may be prudent.

3 years agoCreate adler32_fold_c* functions
Adam Stylinski [Fri, 1 Apr 2022 23:02:05 +0000 (19:02 -0400)] 
Create adler32_fold_c* functions

These are very simple wrappers that do nothing clever but serve as a
shim interface for implementing versions which do cleverly track the
number of scalar sums performed so that we can minimize rebasing and
also have an efficient copy elision.

This serves as the baseline as each vectorization gets its own commit.
That way the PR will be bisectable.
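
To make the idea concrete, here is a scalar sketch of a checksum that copies the data in the same pass. It is not the wrapper added by this commit: the function name and signature are hypothetical. The constants 65521 and 5552 are the standard Adler-32 modulus and the largest block length for which the 32-bit sums cannot overflow before the modulo reduction (the "rebase").

```c
#include <stddef.h>
#include <stdint.h>

#define ADLER_BASE 65521u /* largest prime below 65536 */
#define ADLER_NMAX 5552u  /* bytes that can be summed before reducing */

/* Hypothetical helper: copies `len` bytes from src to dst and returns the
 * updated Adler-32 value, touching each byte only once. */
uint32_t adler32_copy_sketch(uint32_t adler, uint8_t *dst,
                             const uint8_t *src, size_t len) {
    uint32_t a = adler & 0xffff;         /* low half: sum of bytes */
    uint32_t b = (adler >> 16) & 0xffff; /* high half: sum of the sums */

    while (len > 0) {
        size_t n = len < ADLER_NMAX ? len : ADLER_NMAX;
        len -= n;
        while (n--) {
            uint8_t byte = *src++;
            *dst++ = byte;               /* the copy rides along with the sum */
            a += byte;
            b += a;
        }
        a %= ADLER_BASE;                 /* one rebase per block, not per byte */
        b %= ADLER_BASE;
    }
    return (b << 16) | a;
}
```

Starting from the initial value 1, the string "Wikipedia" yields 0x11E60398, which is a convenient known answer for checking such a routine.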

3 years agoImproved chunkset substantially where it's heavily used
Adam Stylinski [Sun, 10 Apr 2022 17:01:22 +0000 (13:01 -0400)] 
Improved chunkset substantially where it's heavily used

For most realistic use cases, this doesn't make a ton of difference.
However, for things which are highly compressible and enjoy very large
run length encodes in the window, this is a huge win.

We leverage a permutation table to swizzle the contents of the memory
chunk into a vector register and then splat that over memory with a fast
copy loop.

In essence, where this helps, it helps a lot.  Where it doesn't, it does
no measurable damage to the runtime.

This commit also simplifies a chunkcopy_safe call for determining a
distance.  Using labs is enough to give the same behavior as before,
with the added benefit that no predication is required _and_, most
importantly, static analysis by GCC's string fortification can't throw a
fit because it conveys better to the compiler that the input into
builtin_memcpy will always be in range.
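
The permutation-table approach described above can be approximated in scalar code. The sketch below is an illustration under stated assumptions, not the SIMD implementation: the function name, the 16-byte chunk size, and the restriction to distances between 1 and 16 are all hypothetical choices. The index computation `i % dist` plays the role of the permutation table, and a local array stands in for the vector register.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { CHUNK = 16 }; /* assumed vector width in bytes */

/* Hypothetical helper: `out` points just past at least `dist` bytes of
 * history. Writes `len` bytes that repeat the last `dist` bytes. */
void chunk_fill_sketch(uint8_t *out, size_t dist, size_t len) {
    if (dist == 0 || dist > CHUNK)
        return; /* larger distances would use an ordinary copy instead */

    uint8_t chunk[CHUNK];
    const uint8_t *from = out - dist;
    for (size_t i = 0; i < CHUNK; i++)
        chunk[i] = from[i % dist];        /* swizzle the pattern into the chunk */

    /* Advance by a multiple of dist so the pattern phase is preserved. */
    size_t step = CHUNK - (CHUNK % dist);
    while (len >= step) {
        memcpy(out, chunk, step);
        out += step;
        len -= step;
    }
    memcpy(out, chunk, len);              /* tail shorter than one step */
}
```

For example, with "abc" as history and a distance of 3, filling 10 bytes produces "abcabcabca". The sketch stores only `step` bytes per iteration to stay within the requested length; a vectorized version would typically store whole registers and rely on slack space in the output buffer.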

3 years agoMake directory for output files.
Vladislav Shchapov [Thu, 12 May 2022 13:36:42 +0000 (18:36 +0500)] 
Make directory for output files.

Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>

3 years agoFixed incorrect case for __clang__ preprocessor macro in zbuild.h.
Nathan Moinvaziri [Tue, 10 May 2022 06:32:32 +0000 (23:32 -0700)] 
Fixed incorrect case for __clang__ preprocessor macro in zbuild.h.

3 years agoFixed issue #1264: Use fallback for _mm256_zextsi128_si256 on Xcode < 9.3
Sean McBride [Mon, 9 May 2022 20:02:41 +0000 (16:02 -0400)] 
Fixed issue #1264: Use fallback for _mm256_zextsi128_si256 on Xcode < 9.3

3 years agoRemove extra CMake messages from ARM toolchains.
Nathan Moinvaziri [Mon, 9 May 2022 16:53:43 +0000 (09:53 -0700)] 
Remove extra CMake messages from ARM toolchains.

3 years agoImplement power9 version of compare256.
Matheus Castanho [Sun, 17 Apr 2022 00:12:53 +0000 (17:12 -0700)] 
Implement power9 version of compare256.

Co-authored-by: Nathan Moinvaziri <nathan@nathanm.com>

3 years agoThe names CMAKE_INTERPROCEDURAL_OPTIMIZATION_* must be uppercase.
Vladislav Shchapov [Thu, 5 May 2022 11:51:17 +0000 (16:51 +0500)] 
The names CMAKE_INTERPROCEDURAL_OPTIMIZATION_* must be uppercase.

Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>