git.ipfire.org Git - thirdparty/zlib-ng.git/log
Ref: hinting
Hans Kristian Rosbach [Sat, 7 May 2022 17:10:29 +0000 (19:10 +0200)] 
Add nonnull attributes to trees.c functions.

Matheus Castanho [Sun, 17 Apr 2022 00:12:53 +0000 (17:12 -0700)] 
Implement power9 version of compare256.

Co-authored-by: Nathan Moinvaziri <nathan@nathanm.com>

Vladislav Shchapov [Thu, 5 May 2022 11:51:17 +0000 (16:51 +0500)]
The names CMAKE_INTERPROCEDURAL_OPTIMIZATION_* must be uppercase.

Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>

Nathan Moinvaziri [Mon, 18 Apr 2022 01:47:07 +0000 (18:47 -0700)]
Implement neon version of compare256.

Co-authored-by: Adam Stylinski <kungfujesus06@gmail.com>

Nathan Moinvaziri [Tue, 3 May 2022 18:12:46 +0000 (11:12 -0700)]
Fixed warning about strict prototypes for cpu_check_features.

Nathan Moinvaziri [Tue, 3 May 2022 22:40:44 +0000 (15:40 -0700)] 
Remove unused chunkmemset_1 code.

Mika Lindqvist [Tue, 3 May 2022 00:47:52 +0000 (03:47 +0300)] 
Disable redirection to 64-bit function variants when Z_SOLO is defined

See #1262.

Nathan Moinvaziri [Sun, 10 Apr 2022 05:20:39 +0000 (22:20 -0700)] 
Improve deflateBound unit test to test a range of small buffer lengths with various deflateBound initialization values.

Co-authored-by: Mika T. Lindqvist <postmaster@raasu.org>

Nathan Moinvaziri [Sun, 10 Apr 2022 05:19:56 +0000 (22:19 -0700)]
Added compressBound tests for small buffers.

Co-authored-by: Mika T. Lindqvist <postmaster@raasu.org>

Nathan Moinvaziri [Wed, 27 Apr 2022 16:26:31 +0000 (09:26 -0700)]
Fixed MSVC warning about unknown option for AVX512 flag.

cl : command line warning D9002: ignoring unknown option '/ARCH:AVX512'

Josh Triplett [Wed, 27 Apr 2022 00:21:13 +0000 (17:21 -0700)] 
Add test for stream adler after inflateSetDictionary

This test passes prior to commit
8550a90de4dcb8589a7d48fe308c4c45bba5a466 ("Leverage inline CRC + copy")
and fails after that commit.

Adam Stylinski [Wed, 27 Apr 2022 01:53:30 +0000 (21:53 -0400)] 
Fixed regression introduced by inlining CRC + copy

Almost every call to updatewindow implicitly performed a checksum,
except on s/390 or when (state->wrap & 4) == 0. The
inflateSetDictionary function instead computes this checksum separately
before invoking updatewindow, and compares it against the initial
checksum (a value obtained by parsing the DICTID section of the
headers).

Instead, we can make updatewindow have a "copy" parameter, which is the
state->wrap value that is being checked anyway.  We instead move the 3rd
bit check to be checked by the caller rather than the callee.
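The condition quoted in this message is easy to misread when written without parentheses, because in C `==` binds tighter than `&`. The sketch below is not zlib-ng code; `wrap_wants_checksum` and `wrap_unparenthesized` are hypothetical helpers that only illustrate the precedence rule, assuming the checksum flag is the bit with value 4 as in the message.

```c
/* Hypothetical helper, not zlib-ng code: reports whether the bit with
 * value 4 (the one this commit message tests) is set in a flag word. */
static int wrap_wants_checksum(int wrap) {
    /* Parentheses are required around the mask. */
    return (wrap & 4) != 0;
}

/* What the compiler sees for `wrap & 4 == 0`: the comparison runs first,
 * so the expression is `wrap & 0`, which is 0 for every input. */
static int wrap_unparenthesized(int wrap) {
    return wrap & (4 == 0);
}
```

With these definitions, `wrap_wants_checksum(5)` is 1 and `wrap_wants_checksum(1)` is 0, while `wrap_unparenthesized` returns 0 for both inputs.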

Ilya Leoshkevich [Wed, 13 Apr 2022 11:46:24 +0000 (13:46 +0200)] 
IBM Z DFLTCC: Split deflate and inflate states

Currently deflate and inflate both use a common state struct. There are
several variables in this struct that we don't need for inflate, and
more may be coming in the future. Therefore split them in two separate
structs. This in turn requires splitting ZALLOC_STATE and ZCOPY_STATE
macros.

David CARLIER [Fri, 18 Mar 2022 21:01:00 +0000 (21:01 +0000)] 
macOS M1 build fix for ARM CPU checks.

Dženan Zukić [Mon, 25 Apr 2022 18:42:11 +0000 (14:42 -0400)] 
Remove trailing whitespace in several source code files

Nathan Moinvaziri [Fri, 22 Apr 2022 21:58:24 +0000 (14:58 -0700)] 
Fixed forcing CMAKE_BUILD_TYPE to Release on multi config generators such as Xcode or MSVC.

Co-authored-by: Sergey Markelov <sergio-nsk@users.noreply.github.com>

Nathan Moinvaziri [Sat, 9 Apr 2022 20:43:53 +0000 (13:43 -0700)]
Remove CVE testing from configure script.

Ilya Leoshkevich [Tue, 12 Apr 2022 13:16:20 +0000 (15:16 +0200)] 
Use PREFIX() for some of the Z_INTERNAL symbols

https://github.com/powturbo/TurboBench links zlib and zlib-ng into the
same binary, causing non-static symbol conflicts. Fix by using PREFIX()
for flush_pending(), bi_reverse(), inflate_ensure_window() and all of
the IBM Z symbols.

Note: do not use an explicit zng_, since one of the long-term goals is
to be able to link two versions of zlib-ng into the same binary for
benchmarking [1].

[1] https://github.com/zlib-ng/zlib-ng/pull/1248#issuecomment-1096648932
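A prefix macro of this kind can be sketched with token pasting. This is a simplified stand-in, not the project's actual definition: `SYMBOL_PREFIX`, `PASTE` and `PASTE2` are names assumed for illustration, and the function body is only a small bit-reversal routine used to give the symbol something to do.

```c
#include <assert.h>

/* Assumption: the prefix is selected once at build time. Changing this one
 * define renames every symbol declared through PREFIX(). */
#ifndef SYMBOL_PREFIX
#  define SYMBOL_PREFIX zng_
#endif

/* Two-level expansion so SYMBOL_PREFIX is expanded before pasting. */
#define PASTE2(a, b) a##b
#define PASTE(a, b)  PASTE2(a, b)
#define PREFIX(name) PASTE(SYMBOL_PREFIX, name)

/* Declared through PREFIX(), so no explicit zng_ appears in the source.
 * Reverses the low `len` bits of `code` (illustrative body). */
unsigned PREFIX(bi_reverse)(unsigned code, int len) {
    unsigned res = 0;
    assert(len >= 1 && len <= 16);
    do {
        res |= code & 1;
        code >>= 1;
        res <<= 1;
    } while (--len > 0);
    return res >> 1;
}
```

Because callers also write `PREFIX(bi_reverse)(...)`, two builds with different `SYMBOL_PREFIX` values could in principle be linked into one binary without clashing, which is the long-term goal the note above mentions.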

Mika Lindqvist [Wed, 13 Apr 2022 01:24:12 +0000 (04:24 +0300)] 
Don't try to build tests or benchmarks if cmake is too old for them.

Mika Lindqvist [Tue, 12 Apr 2022 22:22:29 +0000 (01:22 +0300)] 
Check that sys/auxv.h exists at configure time and add preprocessor define for it.
* Protect including sys/auxv.h in all relevant files with the new preprocessor define
* Test for the existence of both sys/auxv.h and getauxval() with both cmake and configure
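The guard described in these bullets might look roughly like the following. The message does not name the define, so `HAVE_SYS_AUXV_H` is an assumed name, and `read_hwcap` is a hypothetical helper; only `getauxval()` and `AT_HWCAP` are real glibc/Linux interfaces.

```c
/* Assumption: the build system defines HAVE_SYS_AUXV_H when the header
 * and getauxval() were both found at configure time. */
#ifdef HAVE_SYS_AUXV_H
#  include <sys/auxv.h>
#endif

/* Hypothetical helper: returns the hardware capability bits, or 0 when
 * the platform offers no way to query them. */
static unsigned long read_hwcap(void) {
#if defined(HAVE_SYS_AUXV_H) && defined(AT_HWCAP)
    return getauxval(AT_HWCAP);
#else
    return 0;
#endif
}
```

On a platform without the header the code still compiles and simply reports no capabilities.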

Nathan Moinvaziri [Sat, 9 Apr 2022 04:43:54 +0000 (21:43 -0700)] 
Fixed failed tools tests when source directory is read-only.

Mika Lindqvist [Wed, 6 Apr 2022 20:49:24 +0000 (23:49 +0300)] 
Add test for issue #1235.
* Test both compressBound() and deflateBound() as those share same code fragment.

Mika Lindqvist [Tue, 5 Apr 2022 21:04:45 +0000 (00:04 +0300)] 
Add one extra byte to return value of compressBound and deflateBound for small lengths due to shift returning 0.
* Treat 0 byte input as 1 byte input when calculating compressBound and deflateBound
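To see why a shift-based bound can add nothing for small inputs, consider the length-dependent terms of the classic zlib `compressBound` formula. Using that formula here is an assumption made for illustration; zlib-ng's own calculation may differ, and `shift_overhead_padded` is a hypothetical fix in the spirit of this commit, not the actual patch.

```c
#include <stddef.h>

/* Assumption: the shift terms of the classic zlib bound. For any len
 * below 4096 every shift yields 0, so no overhead is added at all. */
static size_t shift_overhead(size_t len) {
    return (len >> 12) + (len >> 14) + (len >> 25);
}

/* Hypothetical padding: make sure tiny inputs still get one extra byte. */
static size_t shift_overhead_padded(size_t len) {
    size_t extra = shift_overhead(len);
    return extra == 0 ? 1 : extra;
}
```

For example, `shift_overhead(9)` is 0 while `shift_overhead(4096)` is 1; the padded variant returns 1 in both cases.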

Vladislav Shchapov [Mon, 4 Apr 2022 08:37:12 +0000 (13:37 +0500)] 
Disable LTO in CMake

Nathan Moinvaziri [Mon, 11 Apr 2022 02:35:12 +0000 (19:35 -0700)] 
Use __msan_unpoison to unpoison the end of the window for when it needs to read the past < chunksize bytes in the window. See #1245.

Co-authored-by: Adam Stylinski <kungfujesus06@gmail.com>

Nathan Moinvaziri [Mon, 11 Apr 2022 02:20:23 +0000 (19:20 -0700)]
Rename memory alignment functions: they take a custom allocator as the first parameter, so having calloc and cfree (c = custom) in the name is confusing.

Nathan Moinvaziri [Sat, 9 Apr 2022 14:22:01 +0000 (07:22 -0700)] 
Fixed off-by-one error when benchmarking compare256 resulting in heap-buffer-overflow.

Nathan Moinvaziri [Fri, 8 Apr 2022 03:39:21 +0000 (20:39 -0700)] 
Mimic minigzip behavior and only unlink files if not using -c copy out argument.

Adam Stylinski [Wed, 6 Apr 2022 22:15:57 +0000 (18:15 -0400)] 
Fix the custom PNG image based benchmark

The height parameter was using a fixed macro, written at a time when the
test imagery was fully synthetic.  Because of this, images smaller than
our in-memory generated imagery will artificially throw a CRC
error.

Nathan Moinvaziri [Sat, 2 Apr 2022 21:49:28 +0000 (14:49 -0700)] 
Remove sanitizer support from configure since it is better supported in cmake. Anybody who still needs it can use cmake or manually set CFLAGS and LDFLAGS.

Dan Kegel [Sun, 3 Apr 2022 18:27:39 +0000 (11:27 -0700)] 
abicheck.sh: don't export CHOST if it wasn't already exported.

This fixes https://github.com/zlib-ng/zlib-ng/issues/1219,
a regression when running abicheck.sh with default compiler.

Nathan Moinvaziri [Sun, 3 Apr 2022 00:31:45 +0000 (17:31 -0700)] 
Fixed missing crc32_combine exports for zlib 1.2.12.

Dan Kegel [Fri, 1 Apr 2022 19:42:28 +0000 (19:42 +0000)] 
abicheck.sh: zlib-ng is a bash script, not a sh script, don't hardcode shell when running configure

Nathan Moinvaziri [Sat, 2 Apr 2022 22:41:21 +0000 (15:41 -0700)] 
Fixed building tests when -DWITH_GZFILEOP=OFF

Nathan Moinvaziri [Sat, 2 Apr 2022 22:31:22 +0000 (15:31 -0700)] 
Move fuzzer cmake into fuzz directory.

Nathan Moinvaziri [Sat, 2 Apr 2022 22:13:16 +0000 (15:13 -0700)] 
Use latest stable version of google test instead of unstable main branch.

Dan Kegel [Fri, 1 Apr 2022 19:38:36 +0000 (19:38 +0000)] 
abicheck.sh: implement --refresh-if option as documented.

Also change exit status to nonzero if there is no abifile and --refresh or --refresh-if were not specified.

Shlomi Fish [Sat, 2 Apr 2022 10:16:47 +0000 (13:16 +0300)] 
Grammar fixes

Nathan Moinvaziri [Thu, 31 Mar 2022 16:50:52 +0000 (09:50 -0700)] 
Remove support for building fuzzers from configure.

Nathan Moinvaziri [Thu, 31 Mar 2022 17:04:49 +0000 (10:04 -0700)] 
Test CVE-2018-25032 against the default level and levels 1 and 2.

Nathan Moinvaziri [Mon, 28 Mar 2022 14:53:55 +0000 (07:53 -0700)] 
Added unit test against CVE-2018-25032 with default strategy.

Co-authored-by: Eric Biggers <ebiggers@kernel.org>

Nathan Moinvaziri [Sun, 27 Mar 2022 00:49:49 +0000 (17:49 -0700)]
Added unit test against CVE-2018-25032.
Sample input from https://www.openwall.com/lists/oss-security/2022/03/26/1.

Co-authored-by: Tavis Ormandy <taviso@users.noreply.github.com>

Nathan Moinvaziri [Sun, 27 Mar 2022 00:26:16 +0000 (17:26 -0700)]
Added missing -F argument for Z_FIXED strategy in minideflate.

Adam Stylinski [Sun, 27 Mar 2022 23:20:08 +0000 (19:20 -0400)] 
Use size_t types for len arithmetic, matching signature

This suppresses a warning and keeps everything safely the same type.
While it's unlikely that the input for any of this will exceed the size
of an unsigned 32 bit integer, this approach is cleaner than casting and
should not result in a performance degradation.

Nathan Moinvaziri [Mon, 28 Mar 2022 23:52:02 +0000 (16:52 -0700)] 
Use standalone fuzzing runner only when fuzzing engine is not found.

Nathan Moinvaziri [Sun, 27 Mar 2022 20:18:03 +0000 (13:18 -0700)] 
Allow SSE2 and AVX2 functions with -DWITH_UNALIGNED=OFF. Even though they use unaligned loads, they don't result in undefined behavior.

Adam Stylinski [Sat, 12 Mar 2022 21:09:02 +0000 (16:09 -0500)] 
Leverage inline CRC + copy

This brings back a bit of the performance that may have been sacrificed
by reverting the reorganized inflate window. Doing a copy at the same
time as a CRC is basically free.

Nathan Moinvaziri [Sun, 27 Mar 2022 20:44:58 +0000 (13:44 -0700)] 
Fixed clang signed/unsigned warning in chunkcopy_safe.

inflate_p.h:159:18: warning: comparison of integers of different signs: 'int32_t' (aka 'int') and 'size_t' (aka 'unsigned long') [-Wsign-compare]
        tocopy = MIN(non_olap_size, len);
                 ^   ~~~~~~~~~~~~~  ~~~
zbuild.h:74:24: note: expanded from macro 'MIN'
#define MIN(a, b) ((a) > (b) ? (b) : (a))
                    ~  ^  ~
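The warning above comes from comparing an `int32_t` with a `size_t` inside `MIN`. A minimal sketch of the hazard follows; `min_mixed` and `min_same` are hypothetical names, and the explicit cast in `min_mixed` only spells out the conversion the compiler would otherwise apply silently.

```c
#include <stddef.h>
#include <stdint.h>

#define MIN(a, b) ((a) > (b) ? (b) : (a))

/* Mixed signs: the int32_t operand is converted to size_t before the
 * comparison, so a negative value turns into a very large unsigned one. */
static size_t min_mixed(int32_t a, size_t b) {
    return MIN((size_t)a, b);
}

/* Same type on both sides: no conversion, no warning, no surprise. */
static size_t min_same(size_t a, size_t b) {
    return MIN(a, b);
}
```

Here `min_mixed(-1, 8)` yields 8 rather than anything resembling -1, which is why keeping both operands as `size_t` is the safer choice.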

Nathan Moinvaziri [Sun, 27 Mar 2022 16:14:45 +0000 (09:14 -0700)] 
Use specific gcovr version 5.0 due to parser errors with 5.1.
https://github.com/gcovr/gcovr/issues/583

Nathan Moinvaziri [Sat, 26 Mar 2022 17:37:29 +0000 (10:37 -0700)] 
Remove unistd.h include from gzguts.h which is already included from zconf.h via zlib.h.

Nathan Moinvaziri [Sat, 26 Mar 2022 15:47:01 +0000 (08:47 -0700)] 
Use HAVE instead of HAS for variable name for consistency.

Nathan Moinvaziri [Sat, 26 Mar 2022 15:05:19 +0000 (08:05 -0700)] 
Remove detect_leaks=0 from non-ASAN cmake jobs.

Nathan Moinvaziri [Thu, 24 Mar 2022 20:01:21 +0000 (13:01 -0700)] 
Fixed error with compare256_unaligned_avx2 undefined if unaligned access is disabled.

Nathan Moinvaziri [Sun, 20 Mar 2022 02:41:24 +0000 (19:41 -0700)] 
Fixed signed comparison warning in zng_calloc_aligned.

zutil.c: In function ‘zng_calloc_aligned’:
zutil.c:133:20: warning: comparison of integer expressions of different signedness: ‘int32_t’ {aka ‘int’} and ‘long unsigned int’ [-Wsign-compare]

Nathan Moinvaziri [Sun, 20 Mar 2022 02:39:06 +0000 (19:39 -0700)] 
Fixed unused opaque variable in aligned alloc test.

test_aligned_alloc.cc: In function ‘void* zng_calloc_unaligned(void*, unsigned int, unsigned int)’:
test_aligned_alloc.cc:14:34: warning: unused parameter ‘opaque’ [-Wunused-parameter]
test_aligned_alloc.cc: In function ‘void zng_cfree_unaligned(void*, void*)’:
test_aligned_alloc.cc:28:32: warning: unused parameter ‘opaque’ [-Wunused-parameter]

Nathan Moinvaziri [Wed, 16 Mar 2022 21:21:33 +0000 (14:21 -0700)] 
Fixed operator precedence warnings in slide_hash_sse2.

slide_hash_sse2.c(58,5): warning C4554: '&': check operator precedence for possible error; use parentheses to clarify precedence
slide_hash_sse2.c(59,5): warning C4554: '&': check operator precedence for possible error; use parentheses to clarify precedence

Nathan Moinvaziri [Wed, 16 Mar 2022 21:20:23 +0000 (14:20 -0700)] 
Fixed signed/unsigned warning in chunkmemset.

chunkset_tpl.h(107,24): warning C4018: '>': signed/unsigned mismatch

Nathan Moinvaziri [Wed, 16 Mar 2022 21:10:14 +0000 (14:10 -0700)] 
Fixed MSVC warnings in chunkcopy_safe.

inflate_p.h(244,18): warning C4018: '>': signed/unsigned mismatch
inflate_p.h(234,38): warning C4244: 'initializing': conversion from '__int64' to 'int', possible loss of data
inffast.c
inflate_p.h(244,18): warning C4018: '>': signed/unsigned mismatch
inflate_p.h(234,38): warning C4244: 'initializing': conversion from '__int64' to 'int', possible loss of data
inflate.c
inflate_p.h(244,18): warning C4018: '>': signed/unsigned mismatch
inflate_p.h(234,38): warning C4244: 'initializing': conversion from '__int64' to 'int', possible loss of data

Rich Ercolani [Fri, 25 Mar 2022 14:06:39 +0000 (10:06 -0400)] 
Correct typo in functable

Now, I could be wrong about this being an error, but I don't see any discussion suggesting this was intended, so...

Nathan Moinvaziri [Fri, 18 Mar 2022 19:10:27 +0000 (12:10 -0700)] 
Add unit tests for compare256 variants.

Adam Stylinski [Tue, 22 Mar 2022 23:39:41 +0000 (19:39 -0400)] 
Fixed a warning about a comparison of an unsigned with a signed type

Adam Stylinski [Fri, 18 Mar 2022 23:18:10 +0000 (19:18 -0400)] 
Fix an issue with the ubsan for overflow

While this didn't _actually_ cause any issues for us, technically the
_mm512_reduce_add_epi32() intrinsic returns a signed integer and it
does the very last summation in scalar GPRs as signed integers. While
the ALU still did the math properly (the negative representation is the
same addition in hardware, just interpreted differently), the sanitizer
caught a window of inputs here that is definitely outside the range of a
signed integer for this immediate operation.

The solution, as silly as it may seem, is to implement our own 32
bit horizontal sum function that does all of the work in vector
registers. This allows us to implicitly keep things in the vector
register domain and convert at the very end, after the summation is
complete.

The compiler's sanitizer is none the wiser and the result is still
correct.
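The underlying point is that summing 32-bit lanes as unsigned values is well defined even when the total exceeds the signed range. The sketch below is a scalar stand-in, not the vector implementation this commit describes; `hsum_u32` and `LANES` are hypothetical names, with 16 chosen because a 512-bit register holds sixteen 32-bit lanes.

```c
#include <stdint.h>

#define LANES 16  /* sixteen 32-bit lanes in one 512-bit register */

/* Sum the lanes as unsigned values. Unsigned arithmetic wraps modulo
 * 2^32 by definition, so there is no signed overflow for UBSAN to flag. */
static uint32_t hsum_u32(const uint32_t lanes[LANES]) {
    uint32_t sum = 0;
    for (int i = 0; i < LANES; i++)
        sum += lanes[i];
    return sum;
}
```

Sixteen lanes of `0x0C000000` add up to `0xC0000000`, which does not fit in a signed 32-bit integer but is an ordinary value for `uint32_t`.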

Nathan Moinvaziri [Mon, 21 Mar 2022 18:58:07 +0000 (11:58 -0700)] 
Update language around ABI compatibility with zlib. #1081

Adam Stylinski [Sun, 20 Mar 2022 15:44:32 +0000 (11:44 -0400)] 
Rename adler32_sse41 to adler32_ssse3

As it turns out, the sum of absolute differences instruction _did_ exist
in SSSE3 all along. SSE41 introduced a stranger, less commonly used
variation of the sum of absolute difference instruction.  Knowing this,
the old SSSE3 method can be axed entirely and the SSE41 method can now
be used on CPUs only having SSSE3.

Removing this extra functable entry shrinks the code and allows for a
simpler planned refactor later for the adler checksum and copy elision.

Nathan Moinvaziri [Sat, 19 Mar 2022 22:53:30 +0000 (15:53 -0700)] 
Fixed missing checks around compare256 and longest_match definitions.

Nathan Moinvaziri [Sat, 19 Mar 2022 22:34:09 +0000 (15:34 -0700)] 
Use zmemcmp_2 in 16-bit unaligned compare256 variant.

Nathan Moinvaziri [Mon, 7 Mar 2022 03:20:29 +0000 (19:20 -0800)] 
Revert "Reorganize inflate window layout"

This reverts commit dc3b60841dbfa9cf37be3efb4568f055b4e15580.

Nathan Moinvaziri [Mon, 7 Mar 2022 03:16:49 +0000 (19:16 -0800)] 
Revert "Add back original version of inflate_fast for use with inflateBack."

This reverts commit 2d2dde43b11c40cb58a339ff4a8425bca0091c31.

Nathan Moinvaziri [Mon, 7 Mar 2022 03:16:37 +0000 (19:16 -0800)] 
Revert "DFLTCC update for window optimization from Jim & Nathan"

This reverts commit b4ca25afabba7b4bf74d36e26728006d28df891d.

Nathan Moinvaziri [Fri, 18 Mar 2022 23:05:18 +0000 (16:05 -0700)] 
Added common sanitizer flags for getting optimal stack traces.

Nathan Moinvaziri [Fri, 18 Mar 2022 18:05:58 +0000 (11:05 -0700)] 
Use halt_on_error in sanitizer options.
https://lists.llvm.org/pipermail/cfe-dev/2015-October/045710.html

Nathan Moinvaziri [Thu, 17 Mar 2022 21:39:12 +0000 (14:39 -0700)] 
Make symbolic prefix instance names consistent in NMake GHA workflow.

Nathan Moinvaziri [Thu, 17 Mar 2022 20:01:38 +0000 (13:01 -0700)] 
Fixed misspelling in NO_UNALIGNED flag.

Nathan Moinvaziri [Thu, 17 Mar 2022 19:57:41 +0000 (12:57 -0700)] 
Wrong variable used when detecting unaligned support for sanitize

Nathan Moinvaziri [Thu, 17 Mar 2022 19:22:22 +0000 (12:22 -0700)] 
Added sanitizer tests in configure GitHub Actions workflow.
Added missing OPTIONS environment variable for UBSAN in CMake GitHub workflow.

Nathan Moinvaziri [Fri, 18 Mar 2022 00:16:11 +0000 (17:16 -0700)] 
Use zutil.h which already includes zlib headers.

Nathan Moinvaziri [Wed, 16 Mar 2022 21:38:37 +0000 (14:38 -0700)] 
Remove unused zutil header.

Adam Stylinski [Fri, 18 Mar 2022 00:22:56 +0000 (20:22 -0400)] 
Fix a latent issue with chunkmemset

It would seem that on some platforms, namely those which are
!UNALIGNED64_OK, there was a likelihood of chunkmemset_safe_c copying all
the bytes before passing control flow to chunkcopy, a function which is
explicitly unsafe to be called with a zero length copy.

This fixes that bug for those platforms.

Adam Stylinski [Thu, 17 Mar 2022 02:52:44 +0000 (22:52 -0400)] 
Fix UBSAN's cry afoul

Technically, we weren't actually doing this the way C wants us to,
legally.  The zmemcpy's turn into NOPs for pretty much all > 0
optimization levels and this gets us defined behavior with the
sanitizer, putting the optimized load by arbitrary alignment into the
compiler's hands instead of ours.
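The technique referred to here, reading through `memcpy` instead of dereferencing a cast pointer, can be sketched as follows. `load_u32` and `equal_u32` are hypothetical helpers and not the project's zmemcpy or zmemcmp macros; the sketch only assumes a compiler that folds a fixed-size `memcpy` into a plain load when optimizing.

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from an address of arbitrary alignment. The memcpy
 * keeps the access defined in C; optimizing compilers typically reduce it
 * to a single load instruction. */
static uint32_t load_u32(const unsigned char *p) {
    uint32_t v;
    memcpy(&v, p, sizeof(v));
    return v;
}

/* Compare four bytes at two possibly unaligned positions. */
static int equal_u32(const unsigned char *a, const unsigned char *b) {
    return load_u32(a) == load_u32(b);
}
```

A direct `*(const uint32_t *)p` would be flagged by UBSAN for misaligned addresses, whereas this form leaves the choice of load instruction to the compiler.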

Nathan Moinvaziri [Tue, 15 Mar 2022 23:54:44 +0000 (16:54 -0700)] 
Added check for UNALIGNED64_OK when defining zmemcpy_8 and zmemcmp_8.

Nathan Moinvaziri [Tue, 15 Mar 2022 17:24:47 +0000 (10:24 -0700)] 
Added 32-bit GCC build to CMake GitHub Actions.

Nathan Moinvaziri [Tue, 15 Mar 2022 01:30:51 +0000 (18:30 -0700)] 
Make disabling unaligned access configurable via build scripts.

Nathan Moinvaziri [Tue, 18 Jan 2022 02:47:23 +0000 (18:47 -0800)] 
Move UNALIGNED_OK detection to compile time instead of configure time.

Mika T. Lindqvist [Tue, 15 Mar 2022 16:33:17 +0000 (18:33 +0200)] 
Explicitly install dependencies for wine32.
* Allow downgrading packages to resolve conflicts

Mika Lindqvist [Tue, 15 Mar 2022 15:19:58 +0000 (17:19 +0200)] 
Don't use -mtune with ClangCl.

Mika Lindqvist [Mon, 14 Mar 2022 18:02:01 +0000 (20:02 +0200)] 
[README] Add missing FORCE_SSE2 for CMake.

Mika Lindqvist [Sun, 13 Mar 2022 15:12:42 +0000 (17:12 +0200)] 
Allow bypassing runtime feature check of TZCNT instructions.
* This avoids conditional branch when it's known at build time that TZCNT instructions are always supported

Nathan Moinvaziri [Fri, 11 Mar 2022 23:45:06 +0000 (15:45 -0800)] 
Throw an error when input is raw deflate stream but window_bits is not supplied.

Nathan Moinvaziri [Fri, 11 Mar 2022 23:42:14 +0000 (15:42 -0800)] 
Print help when no arguments supplied to minideflate.

Nathan Moinvaziri [Fri, 11 Mar 2022 23:31:57 +0000 (15:31 -0800)] 
Append extension to output file path based on window_bits when compressing and remove extension from output file path when decompressing.

Nathan Moinvaziri [Fri, 11 Mar 2022 22:53:47 +0000 (14:53 -0800)] 
Added support for -k keep argument to minideflate. By default minideflate will now delete the input file.

Nathan Moinvaziri [Thu, 10 Mar 2022 16:57:13 +0000 (08:57 -0800)] 
Use large default buffer size for minideflate to match minigzip use of GZBUFSIZE.

Nathan Moinvaziri [Thu, 10 Mar 2022 16:53:44 +0000 (08:53 -0800)] 
Include zutil.h for definition of DEF_MEM_LEVEL.

Nathan Moinvaziri [Mon, 28 Feb 2022 17:00:26 +0000 (09:00 -0800)] 
Auto-detect wrapper when inflating and no window_bits specified.

Nathan Moinvaziri [Sun, 27 Feb 2022 18:06:13 +0000 (10:06 -0800)] 
Updated help usage with correct values for window_bits.

Nathan Moinvaziri [Wed, 23 Feb 2022 20:07:01 +0000 (12:07 -0800)] 
Fixed wrong error name when calling inflate in minideflate.

Adam Stylinski [Mon, 21 Feb 2022 21:52:17 +0000 (16:52 -0500)] 
Speed up chunkcopy and memset

This was found to have a significant impact on a highly compressible PNG
for both the encode and decode.  Some deltas show performance improving
as much as 60%+.

For the scenarios where the "dist" is not an even modulus of our chunk
size, we simply repeat the bytes as many times as possible into our
vector registers.  We then copy the entire vector and then advance the
quotient of our chunksize divided by our dist value.

If dist happens to be 1, there's no reason to not just call memset from
libc (this is likely to be just as fast if not faster).
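The idea in the two paragraphs above can be sketched in scalar C. This is an illustration under stated assumptions, not the project's chunkmemset: `pattern_fill` and `CHUNK` are hypothetical, a 16-byte array stands in for a vector register, and each step copies only the whole repetitions of the pattern rather than the full chunk, which keeps the sketch free of writes past the requested length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK 16  /* assumed chunk size, e.g. one 128-bit vector register */

/* Write `len` bytes at `out`, repeating the `dist` bytes that immediately
 * precede it. Requires 1 <= dist <= CHUNK. */
static void pattern_fill(unsigned char *out, size_t dist, size_t len) {
    unsigned char chunk[CHUNK];
    const unsigned char *from = out - dist;
    size_t whole, i;

    assert(dist >= 1 && dist <= CHUNK);
    if (dist == 1) {                 /* one repeated byte: plain memset */
        memset(out, *from, len);
        return;
    }
    whole = (CHUNK / dist) * dist;   /* largest multiple of dist that fits */
    for (i = 0; i < whole; i++)      /* repeat the pattern into the chunk */
        chunk[i] = from[i % dist];
    while (len >= whole) {           /* every step stays in phase */
        memcpy(out, chunk, whole);
        out += whole;
        len -= whole;
    }
    memcpy(out, chunk, len);         /* tail shorter than one whole step */
}
```

With `dist` equal to 3 and a 16-byte chunk, each step advances 15 bytes (five repetitions), so the pattern never drifts out of phase between copies.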

Adam Stylinski [Mon, 24 Jan 2022 04:32:46 +0000 (23:32 -0500)] 
Improve SSE2 slide hash performance

At least on pre-nehalem CPUs, we get a > 50% improvement. This is
mostly due to the fact that we're opportunistically doing aligned loads
instead of unaligned loads.  This is something that is very likely to be
possible, given that the deflate stream initialization uses the zalloc
function, which most libraries don't override.  Our allocator aligns to
64 byte boundaries, meaning we can do aligned loads on even AVX512 for
the zstream->prev and zstream->head pointers. However, only pre-nehalem
CPUs _actually_ benefit from explicitly aligned load instructions.

The other thing being done here is unrolling the loop by a factor
of 2 so that we can get a tiny bit more ILP.  This improved performance
by another 5%-7%.
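The unrolling mentioned above can be illustrated in scalar form. This is a hypothetical sketch rather than the SSE2 code from the commit: `slide_one` and `slide_hash_unrolled` are assumed names, and each scalar update stands in for what a vector instruction would do to several entries at once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar stand-in for one vector lane: subtract wsize, clamping at 0. */
static uint16_t slide_one(uint16_t pos, uint16_t wsize) {
    return (uint16_t)(pos >= wsize ? pos - wsize : 0);
}

/* Slide every table entry down by wsize, two entries per iteration.
 * `entries` must be even, as it is for power-of-two hash tables. */
static void slide_hash_unrolled(uint16_t *table, size_t entries, uint16_t wsize) {
    size_t i;
    assert(entries % 2 == 0);
    for (i = 0; i < entries; i += 2) {
        /* The two updates are independent, which gives the CPU more ILP. */
        table[i]     = slide_one(table[i], wsize);
        table[i + 1] = slide_one(table[i + 1], wsize);
    }
}
```

Entries below `wsize` collapse to 0 and the rest shift down, so a table of `{0, 100, 32768, 40000}` with `wsize` 32768 becomes `{0, 0, 0, 7232}`.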

Nathan Moinvaziri [Mon, 14 Mar 2022 18:46:02 +0000 (11:46 -0700)] 
Added unit test for zng_calloc_aligned to ensure that it always returns 64-byte aligned memory allocation when requested.

Nathan Moinvaziri [Tue, 25 Jan 2022 04:58:54 +0000 (20:58 -0800)] 
Bypass memory alignment compensation if not using custom allocator.