git.ipfire.org Git - thirdparty/zlib-ng.git/log

remove 16-byte alignment from deflate_state::crc0

We noticed recently on the Skia tree that if we build Chromium's zlib
with GCC, -O3, -m32, and -msse2, deflateInit2_() crashes.  Might also
need -fPIC... not sure.

I tracked this down to a `movaps` (16-byte aligned store) to an address
that was only 8-byte aligned.  This address was somewhere in the middle
of the deflate_state struct that deflateInit2_()'s job is to initialize.

That deflate_state struct `s` is allocated using ZALLOC, which calls any
user supplied zalloc if set, or the default if not.  Neither one of
these has any special alignment contract, so generally they'll tend to
be 2*sizeof(void*) aligned.  On 32-bit builds, that's 8-byte aligned.

But because we've annotated crc0 as zalign(16), the natural alignment of
the whole struct is 16-byte, and a compiler like GCC can feel free to
use 16-byte aligned stores to parts of the struct that are 16-byte
aligned, like the beginning, crc0, or any other part before or after
crc0 that happens to fall on a 16-byte boundary.  With -O3 and -msse2,
GCC does exactly that, writing a few of the fields with one 16-byte
store.
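
To illustrate the alignment effect described above, here is a minimal C11 sketch. It is not the zlib code: `demo_state`, `demo_state_plain` and `is_aligned` are hypothetical stand-ins. It shows that one over-aligned member raises the alignment requirement of the whole struct, which a plain allocator does not promise to honor.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for deflate_state: one over-aligned member. */
struct demo_state {
    int status;
    alignas(16) unsigned crc0[4 * 5];
    int level;
};

/* Same fields without the alignment annotation. */
struct demo_state_plain {
    int status;
    unsigned crc0[4 * 5];
    int level;
};

/* Returns nonzero if p satisfies the given power-of-two alignment. */
static int is_aligned(const void *p, size_t alignment) {
    return ((uintptr_t)p & (alignment - 1)) == 0;
}
```

With the annotation, the compiler may assume any `struct demo_state *` is 16-byte aligned; an allocation that is only 8-byte aligned breaks that assumption.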

The fix is simply to remove zalign(16).  All the code that manipulates
this field was actually already using unaligned loads and stores.  You
can see it all right at the top of crc_folding.c, CRC_LOAD and CRC_SAVE.

This bug comes from the Intel performance patches we landed a few years
ago, and isn't present in upstream zlib, Android's zlib, or Google's
internal zlib.

It doesn't seem to be tickled by Clang, and won't happen on 64-bit GCC
builds: zalloc is likely 16-byte aligned there.  I _think_ it's possible
for it to trigger on non-x86 32-bit builds with GCC, but haven't tested
that.  I also have not tested MSVC.

Reviewed-on: https://chromium-review.googlesource.com/1236613

Fix a bug that can crash deflate on some input when using Z_FIXED.

This bug was reported by Danilo Ramos of Eideticom, Inc. It has
lain in wait 13 years before being found! The bug was introduced
in zlib 1.2.2.2, with the addition of the Z_FIXED option. That
option forces the use of fixed Huffman codes. For rare inputs with
a large number of distant matches, the pending buffer into which
the compressed data is written can overwrite the distance symbol
table which it overlays. That results in corrupted output due to
invalid distances, and can result in out-of-bound accesses,
crashing the application.

The fix here combines the distance buffer and literal/length
buffers into a single symbol buffer. Now three bytes of pending
buffer space are opened up for each literal or length/distance
pair consumed, instead of the previous two bytes. This assures
that the pending buffer cannot overwrite the symbol table, since
the maximum fixed code compressed length/distance is 31 bits, and
since there are four bytes of pending space for every three bytes
of symbol space.
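
The space argument above can be checked with a small model. This is a sketch under stated assumptions, not the zlib source: it assumes the symbol buffer starts `lit_bufsize` bytes into a pending buffer of `4 * lit_bufsize` bytes, that each symbol takes 3 bytes, and that every symbol compresses to the 31-bit worst case after a 3-bit block header. All function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Worst case from the commit message: 31 bits per length/distance pair. */
#define MAX_FIXED_PAIR_BITS 31u

/* Bytes of pending output after emitting `symbols` worst-case pairs,
 * plus a 3-bit block header, rounded up to whole bytes. */
static size_t worst_case_pending_bytes(size_t symbols) {
    return (3u + MAX_FIXED_PAIR_BITS * symbols + 7u) / 8u;
}

/* Offset of the next unread symbol under the assumed layout:
 * the symbol buffer begins at lit_bufsize, 3 bytes per symbol. */
static size_t next_symbol_offset(size_t lit_bufsize, size_t symbols) {
    return lit_bufsize + 3u * symbols;
}

/* Returns 1 if the write position never passes the read position. */
static int overlay_is_safe(size_t lit_bufsize) {
    for (size_t k = 0; k < lit_bufsize; k++) {
        if (worst_case_pending_bytes(k) > next_symbol_offset(lit_bufsize, k))
            return 0;
    }
    return 1;
}
```

Each consumed symbol frees 3 bytes while writing fewer than 4, and the initial `lit_bufsize` bytes of headroom cover the difference.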

fix oss-fuzz/11323: clear out s->prev buffer

zlib-ng compiled with MSAN used to fail with:

SUMMARY: MemorySanitizer: use-of-uninitialized-value /src/zlib-ng/match.c:473:60 in longest_match
Exiting

  Uninitialized value was stored to memory at
    #0 0x7fcaced77645 in fill_window_sse /src/zlib-ng/arch/x86/fill_window_sse.c:84:17
    #1 0x7fcaced7d3d4 in deflate_quick /src/zlib-ng/arch/x86/deflate_quick.c:230:13
    #2 0x7fcaced2f54b in zng_deflate /src/zlib-ng/deflate.c:951:18
    #3 0x4a04e9 in test_large_deflate /src/zlib-ng/test/example.c:266:11
    #4 0x4a38d2 in main /src/zlib-ng/test/example.c:539:5
    #5 0x7fcace96a82f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)

  Uninitialized value was created by a heap allocation
    #0 0x45bf70 in malloc /src/llvm/projects/compiler-rt/lib/msan/msan_interceptors.cc:910
    #1 0x7fcaced26cd9 in zng_deflateInit2_ /src/zlib-ng/deflate.c:315:26
    #2 0x7fcaced2605a in zng_deflateInit_ /src/zlib-ng/deflate.c:224:12
    #3 0x4a03c5 in test_large_deflate /src/zlib-ng/test/example.c:255:11
    #4 0x4a38d2 in main /src/zlib-ng/test/example.c:539:5
    #5 0x7fcace96a82f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)

fix bug #215: use the proper intrinsic __crc32w for words (#217)

integration of oss-fuzz in make test #204 (#206)

The requirements for an ideal integration of a project in oss-fuzz are:
https://github.com/google/oss-fuzz/blob/master/docs/ideal_integration.md
- Is maintained by code owners in their RCS (Git, SVN, etc).
- Is built with the rest of the tests - no bit rot!
- Has a seed corpus with good code coverage.
- Is continuously tested on the seed corpus with ASan/UBSan/MSan
- Is fast and has no OOMs
- Has a fuzzing dictionary, if applicable

Fix test/example.c when compiled with ASAN

Before this patch

cmake -DWITH_SANITIZERS=1
make
make test

used to fail with:

Running tests...
Test project /home/hansr/github/zlib/zlib-ng
    Start 1: example
1/2 Test #1: example ..........................***Failed    0.14 sec
    Start 2: example64
2/2 Test #2: example64 ........................***Failed    0.13 sec

==11605==ERROR: AddressSanitizer: memcpy-param-overlap: memory ranges [0x62e000000595,0x62e0000053b5) and [0x62e000000400, 0x62e000005220) overlap
    #0 0x7fab3bcc9662 in __asan_memcpy (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x8c662)
    #1 0x40f936 in memcpy /usr/include/x86_64-linux-gnu/bits/string3.h:53
    #2 0x40f936 in read_buf /home/spop/s/zlib-ng/deflate.c:1122
    #3 0x410458 in deflate_stored /home/spop/s/zlib-ng/deflate.c:1394
    #4 0x4133d7 in zng_deflate /home/spop/s/zlib-ng/deflate.c:945
    #5 0x402253 in test_large_deflate /home/spop/s/zlib-ng/test/example.c:275
    #6 0x4014e8 in main /home/spop/s/zlib-ng/test/example.c:536
    #7 0x7fab3b89382f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
    #8 0x4018e8 in _start (/work/spop/zlib-ng/example+0x4018e8)

0x62e000000595 is located 405 bytes inside of 40000-byte region [0x62e000000400,0x62e00000a040)
allocated by thread T0 here:
    #0 0x7fab3bcd579a in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x9879a)
    #1 0x40147a in main /home/spop/s/zlib-ng/test/example.c:516

0x62e000000400 is located 0 bytes inside of 40000-byte region [0x62e000000400,0x62e00000a040)
allocated by thread T0 here:
    #0 0x7fab3bcd579a in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x9879a)
    #1 0x40147a in main /home/spop/s/zlib-ng/test/example.c:516

SUMMARY: AddressSanitizer: memcpy-param-overlap ??:0 __asan_memcpy
==11605==ABORTING

fix bug #183 following recommendations of Mika Lindqvist

> the problem is in line c_stream.avail_in = (unsigned int)comprLen/2;
> which feeds it too much data ... it should cap it to
> c_stream.next_out - compr instead.

fix ASAN crash on test/minigzip

Before this patch, when configuring with address sanitizer:

./configure --with-sanitizers
make
make test

used to fail with the following error:

$ echo hello world | ./minigzip
ASAN:SIGSEGV
=================================================================
==17466==ERROR: AddressSanitizer: SEGV on unknown address 0x00000000fc80 (pc 0x7fcacddd46f8 bp 0x7ffd01ceb310 sp 0x7ffd01ceb290 T0)
    #0 0x7fcacddd46f7 in _IO_fwrite (/lib/x86_64-linux-gnu/libc.so.6+0x6e6f7)
    #1 0x402602 in zng_gzwrite /home/spop/s/zlib-ng/test/minigzip.c:180
    #2 0x403445 in gz_compress /home/spop/s/zlib-ng/test/minigzip.c:305
    #3 0x404724 in main /home/spop/s/zlib-ng/test/minigzip.c:509
    #4 0x7fcacdd8682f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
    #5 0x4018d8 in _start (/work/spop/zlib-ng/minigzip+0x4018d8)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV ??:0 _IO_fwrite
==17466==ABORTING

During compilation the following warnings point to a missing definition:

/home/spop/s/zlib-ng/test/minigzip.c:154:31: warning: implicit declaration of function 'fdopen' is invalid in C99 [-Wimplicit-function-declaration]
    gz->file = path == NULL ? fdopen(fd, gz->write ? "wb" : "rb") :
                              ^
/home/spop/s/zlib-ng/test/minigzip.c:154:29: warning: pointer/integer type mismatch in conditional expression ('int' and 'FILE *' (aka 'struct _IO_FILE *')) [-Wconditional-type-mismatch]
    gz->file = path == NULL ? fdopen(fd, gz->write ? "wb" : "rb") :
                            ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/spop/s/zlib-ng/test/minigzip.c:504:36: warning: implicit declaration of function 'fileno' is invalid in C99 [-Wimplicit-function-declaration]
            file = PREFIX(gzdopen)(fileno(stdin), "rb");
                                   ^
/home/spop/s/zlib-ng/test/minigzip.c:508:36: warning: implicit declaration of function 'fileno' is invalid in C99 [-Wimplicit-function-declaration]
            file = PREFIX(gzdopen)(fileno(stdout), outmode);
                                   ^
/home/spop/s/zlib-ng/test/minigzip.c:534:48: warning: implicit declaration of function 'fileno' is invalid in C99 [-Wimplicit-function-declaration]
                        file = PREFIX(gzdopen)(fileno(stdout), outmode);
                                               ^
5 warnings generated.

and looking at stdio.h that defines fdopen we see that it is only defined under
__USE_POSIX:

#ifdef __USE_POSIX
/* Create a new stream that refers to an existing system file descriptor.  */
extern FILE *fdopen (int __fd, const char *__modes) __THROW __wur;
#endif

This patch fixes the compiler warnings and the runtime ASAN error.
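
For reference, one common way to make these declarations visible in strict C99 mode is to define a POSIX feature-test macro before the first include. This is a minimal sketch assuming a POSIX system; it may differ from what the patch actually does, and `stream_fd` is a hypothetical helper. Without a declaration the compiler assumes the function returns int, which presumably truncated the FILE pointer and led to the SEGV above.

```c
/* Must come before any system header is included. */
#define _POSIX_C_SOURCE 200112L

#include <assert.h>
#include <stdio.h>

/* Returns the file descriptor behind a stdio stream. fileno() is POSIX,
 * not ISO C, so it is hidden under -std=c99 without the macro above. */
static int stream_fd(FILE *stream) {
    return fileno(stream);
}
```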

Revert "[ARM/AArch64] Add run-time detection of ACLE and NEON instructions under Linux. * Use getauxval() to check support for ACLE CRC32 instructions * Allow disabling CRC32 instruction check"

This reverts commit e7e80f4cd22346a2ea3cadad57ed574078aa5576.

[ARM/AArch64] Add run-time detection of ACLE and NEON instructions under Linux.
* Use getauxval() to check support for ACLE CRC32 instructions
* Allow disabling CRC32 instruction check

Update zlib.h

Tell compiler to adhere to C99 standards.
Exception being newer cmake versions that will decay to gnu99 in
certain situations. This decay currently hides a warning in minigzip,
but using C99 with C_STANDARD_REQUIRED on could potentially introduce
unknown problems on other platforms, so for now we will allow this decay.

fix bug #207: avoid undefined integer overflow

zlib-ng used to fail when compiled with UBSan with this error:
deflate_slow.c:112:21: runtime error: unsigned integer overflow: 45871 - 45872 cannot be represented in type 'unsigned int'

The bug occurs in code added to zlib-ng under `#ifndef NOT_TWEAK_COMPILER`.
The original code of zlib contains a loop with two induction variables:

  s->prev_length -= 2;
  do {
      if (++s->strstart <= max_insert) {
          functable.insert_string(s, s->strstart, 1);
      }
  } while (--s->prev_length != 0);

The function insert_string is not executed when
  !(++s->strstart <= max_insert)
i.e., when
  !(s->strstart + 1 <= max_insert)
  !(s->strstart < max_insert)
  max_insert <= s->strstart

The function insert_string is executed when
  ++s->strstart <= max_insert
i.e., when
  s->strstart + 1 <= max_insert
  s->strstart < max_insert

The function is executed at most `max_insert - s->strstart` times, following the
exit condition of the do-while `(--s->prev_length != 0)`.  If the loop exits
after evaluating the exit condition once, the function is executed once
independently of `max_insert - s->strstart`.  The number of times the function
executes is the minimum between the number of iterations in the do-while loop
and `max_insert - s->strstart`.

The number of iterations of the loop is `mov_fwd = s->prev_length - 2`, and we
know that this is at least one as otherwise `--s->prev_length` would overflow.

The number of times the function insert_string is called is
  `min(mov_fwd, max_insert - s->strstart)`
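
The derivation above can be checked with a small sketch that counts calls instead of inserting strings. Both function names are hypothetical; the closed form guards the subtraction so it is only evaluated when it cannot wrap.

```c
#include <assert.h>

/* Reference: the original do-while, counting calls to insert_string.
 * Requires prev_length >= 3 so that the loop runs at least once. */
static unsigned count_inserts_loop(unsigned strstart, unsigned prev_length,
                                   unsigned max_insert) {
    unsigned calls = 0;
    prev_length -= 2;
    do {
        if (++strstart <= max_insert)
            calls++;
    } while (--prev_length != 0);
    return calls;
}

/* Closed form: min(mov_fwd, max_insert - strstart), or 0 when
 * max_insert <= strstart (the subtraction would wrap otherwise). */
static unsigned count_inserts_closed(unsigned strstart, unsigned prev_length,
                                     unsigned max_insert) {
    unsigned mov_fwd = prev_length - 2;
    if (max_insert <= strstart)
        return 0;
    unsigned room = max_insert - strstart;
    return mov_fwd < room ? mov_fwd : room;
}
```

The case from the UBSan report, `max_insert` one below `strstart`, falls into the guarded branch and yields zero calls.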

Fix clang scan-build "zlib-ng/memcopy.h:298:5: warning: Value stored to 'from' is never read"

fix #187: remove errors exposed by undefined behavior sanitizer

Move decrement in loop to avoid the following errors:
adler32.c:91:19: runtime error: unsigned integer overflow: 0 - 1 cannot be represented in type 'size_t' (aka 'unsigned long')
adler32.c:136:19: runtime error: unsigned integer overflow: 0 - 1 cannot be represented in type 'size_t' (aka 'unsigned long')
inflate.c:972:32: runtime error: unsigned integer overflow: 0 - 1 cannot be represented in type 'unsigned int'

Fix the following bugs as recommended by Mika Lindqvist:
arch/x86/deflate_quick.c:233:22: runtime error: unsigned integer overflow: 0 - 1 cannot be represented in type 'unsigned int'
arch/x86/fill_window_sse.c:52:28: runtime error: unsigned integer overflow: 1 - 8192 cannot be represented in type 'unsigned int'
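
A minimal sketch of what moving the decrement looks like, assuming the flagged loops had the shape `while (len--)`; `sum_bytes` is a hypothetical example, not one of the functions named above.

```c
#include <assert.h>
#include <stddef.h>

/* A loop written as `while (len--) { ... }` still evaluates 0 - 1 on the
 * final test, which the unsigned-overflow check reports. Testing first and
 * decrementing inside the body never computes 0 - 1. */
static unsigned sum_bytes(const unsigned char *buf, size_t len) {
    unsigned sum = 0;
    while (len) {
        sum += *buf++;
        len--;
    }
    return sum;
}
```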

Allow compiling with gzfileops from configure too

Add more --help output to configure

Fix ZLIB_COMPAT=OFF and WITH_GZFILEOP=ON compilation failure.
Also add this combination to travis testing.

Remove --native testing from travis, since they somehow make this fail very often,
probably due to caching or running the executables on a different platform than
the compiler thinks it is running on.

Make functable thread-local.

fix bug #184: clear out buf to avoid msan use-of-uninitialized-value

Do not use bzero as suggested by Mika Lindqvist:
> You shouldn't use bzero() in new code as some compilers, like Visual C++,
> don't have it... New code should just use memset().

fix bug #192, oss-fuzz/9827 : MemorySanitizer:DEADLYSIGNAL

==4908==ERROR: MemorySanitizer: SEGV on unknown address 0x730fffffffff (pc 0x0000004b1b97 bp 0x7ffd4bf59a00 sp 0x7ffd4bf598a0 T4908)
==4908==The signal is caused by a READ memory access.
  #0 0x5a0599 in fizzle_matches zlib-ng/deflate_medium.c:168:12
  #1 0x59ea27 in deflate_medium zlib-ng/deflate_medium.c:296:21
  #2 0x5901c5 in zng_deflate zlib-ng/deflate.c:951:18
  #3 0x586955 in zng_compress2 zlib-ng/compress.c:59:15
  #4 0x5861eb in LLVMFuzzerTestOneInput zlib-ng/test/fuzz/compress_fuzzer.c:18:3
  #5 0x4e9b48 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /src/libfuzzer/FuzzerLoop.cpp:575:15
  #6 0x4a2f66 in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) /src/libfuzzer/FuzzerDriver.cpp:280:6
  #7 0x4b3adb in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /src/libfuzzer/FuzzerDriver.cpp:715:9
  #8 0x4a2091 in main /src/libfuzzer/FuzzerMain.cpp:20:10
  #9 0x7fa3d7ff582f in __libc_start_main /build/glibc-Cl5G7W/glibc-2.23/csu/libc-start.c:291
  #10 0x41ec68 in _start

fix bugs #186 and #191, oss-fuzz/9831: use-of-uninitialized-value

==1==WARNING: MemorySanitizer: use-of-uninitialized-value
  #0 0x59fa93 in deflate_medium zlib-ng/deflate_medium.c:259:21
  #1 0x590905 in zng_deflate zlib-ng/deflate.c:951:18
  #2 0x587095 in zng_compress2 zlib-ng/compress.c:59:15
  #3 0x5866e3 in check_compress_level zlib-ng/test/fuzz/compress_fuzzer.c:18:3
  #4 0x5862fd in LLVMFuzzerTestOneInput zlib-ng/test/fuzz/compress_fuzzer.c:38:3
  #5 0x4e9b48 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /src/libfuzzer/FuzzerLoop.cpp:575:15
  #6 0x4a2f66 in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) /src/libfuzzer/FuzzerDriver.cpp:280:6
  #7 0x4b3adb in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /src/libfuzzer/FuzzerDriver.cpp:715:9
  #8 0x4a2091 in main /src/libfuzzer/FuzzerMain.cpp:20:10
  #9 0x7fea2fea482f in __libc_start_main /build/glibc-Cl5G7W/glibc-2.23/csu/libc-start.c:291
  #10 0x41ec68 in _start
Uninitialized value was created by a heap allocation
  #0 0x45f2a0 in malloc /src/llvm/projects/compiler-rt/lib/msan/msan_interceptors.cc:910
  #1 0x587d42 in zng_deflateInit2_ zlib-ng/deflate.c:284:27
  #2 0x5874fa in zng_deflateInit_ zlib-ng/deflate.c:224:12
  #3 0x586c95 in zng_compress2 zlib-ng/compress.c:41:11
  #4 0x5866e3 in check_compress_level zlib-ng/test/fuzz/compress_fuzzer.c:18:3
  #5 0x5862fd in LLVMFuzzerTestOneInput zlib-ng/test/fuzz/compress_fuzzer.c:38:3
  #6 0x4e9b48 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /src/libfuzzer/FuzzerLoop.cpp:575:15
  #7 0x4a2f66 in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) /src/libfuzzer/FuzzerDriver.cpp:280:6
  #8 0x4b3adb in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /src/libfuzzer/FuzzerDriver.cpp:715:9
  #9 0x4a2091 in main /src/libfuzzer/FuzzerMain.cpp:20:10
  #10 0x7fea2fea482f in __libc_start_main /build/glibc-Cl5G7W/glibc-2.23/csu/libc-start.c:291

fix #197, oss-fuzz/10036: only write 4 bytes per iteration in deflate_quick

By aggregating the two consecutive values to be written by static_emit_ptr to
s->pending_buf and writing the two values at once in a 4 byte store, we avoid
running out of the allocated buffer. We used to call quick_send_bits twice and
bumped the counter s->pending in the first call, which made the second call
write to memory beyond the safe 4 bytes that were guaranteed by the following
condition in the enclosing loop in deflate_quick:

  if (s->pending + 4 >= s->pending_buf_size) {
    flush_pending(s->strm);
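
The aggregation step can be sketched as follows. This is an illustration only: `struct bit_code` and `combine_codes` are hypothetical names, and the sketch assumes the first length is below 32 and the combined length is at most 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical (code, length) pair; not a zlib-ng type. */
struct bit_code {
    uint32_t bits; /* code value, first emitted bit in the lowest position */
    unsigned len;  /* number of valid bits */
};

/* Concatenates two codes so they can be emitted with a single store.
 * Assumes first.len < 32 and first.len + second.len <= 32. */
static struct bit_code combine_codes(struct bit_code first,
                                     struct bit_code second) {
    struct bit_code out;
    out.bits = first.bits | (second.bits << first.len);
    out.len = first.len + second.len;
    return out;
}
```

Emitting the combined value once means the pending counter is bumped only after the whole 4-byte store, so the `s->pending + 4` check covers it.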

The bug was exposed by the memory sanitizer like so:

MemorySanitizer:DEADLYSIGNAL
--
  | ==1==ERROR: MemorySanitizer: SEGV on unknown address 0x730000020000 (pc 0x0000005b6ce4 bp 0x7fff59adb5e0 sp 0x7fff59adb570 T1)
  | ==1==The signal is caused by a WRITE memory access.
  | #0 0x5b6ce3 in quick_send_bits zlib-ng/arch/x86/deflate_quick.c:134:48
  | #1 0x5b5752 in deflate_quick zlib-ng/arch/x86/deflate_quick.c:243:21
  | #2 0x590a15 in zng_deflate zlib-ng/deflate.c:952:18
  | #3 0x587165 in zng_compress2 zlib-ng/compress.c:59:15
  | #4 0x5866d3 in check_compress_level zlib-ng/test/fuzz/compress_fuzzer.c:22:3
  | #5 0x5862d8 in LLVMFuzzerTestOneInput zlib-ng/test/fuzz/compress_fuzzer.c:74:3
  | #6 0x4e9b48 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /src/libfuzzer/FuzzerLoop.cpp:575:15
  | #7 0x4a2f66 in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) /src/libfuzzer/FuzzerDriver.cpp:280:6
  | #8 0x4b3adb in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /src/libfuzzer/FuzzerDriver.cpp:715:9
  | #9 0x4a2091 in main /src/libfuzzer/FuzzerMain.cpp:20:10
  | #10 0x7fb8919b082f in __libc_start_main /build/glibc-Cl5G7W/glibc-2.23/csu/libc-start.c:291
  | #11 0x41ec68 in _start
  | MemorySanitizer can not provide additional info.
  | SUMMARY: MemorySanitizer: SEGV (/mnt/scratch0/clusterfuzz/slave-bot/builds/clusterfuzz-builds_zlib-ng_7ead0a3e4980f024583384fd355b6e3ddd4b2ca2/revisions/compress_fuzzer+0x5b6ce3)

replaced include_directories() with target_include_directories()

Using target_include_directories() with the zlib libraries means people no longer have to manually include those directories when linking to those libraries.

Reset CMAKE_REQUIRED_FLAGS

Reset CMAKE_REQUIRED_FLAGS after each check to avoid the following
checks being influenced by the previous results.

Change-Id: I2e34f6127ef1c617f4eea363a2cb80bc49b3bcab
Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>

Add check for -mfpu=neon (#171)

For 64bit armv8-a there's no need to use "-mfpu=neon" to enable NEON.
But for 32bit system "-mfpu=neon" is required.

This patch adds the detection for -mfpu=neon flag.

Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>

travis: add linux-ppc64le

Add the support for some missing cross-compile tool chains in zlib-ng

Change-Id: I7b5c9acd0b3e43079e59c3da9eac161475408f83
Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>

Separate feature checks for x86 and x86_64
* Don't check for SSE2 on anything else than i686
* Don't check for PCLMULQDQ on anything else than i686 or x86_64
* Check for SSE4.2 CRC intrinsics

CMake: don't assume i[3456]86 if others don't match

Match these architectures explicitly and fall back to generic.

[ARM] Disable ACLE support if uname returns "eabi".
* Warn if current processor doesn't support ACLE or NEON.

Add ARM cross build and validation

Update configure and test scripts to cross compile
and validate arm build-outs on x86 by using qemu

Change-Id: I183d003ebafcf686de26fd1705704ded4b344580
Signed-off-by: Jun He <jun.he@arm.com>

Move private defines from zconf.h and zconf-ng.h to zbuild.h
* move definition of z_size_t to zbuild.h

[MSVC] Fix size_t/ssize_t when using ZLIB_COMPAT. (#161)

* zconf.h.in wasn't including Windows.h, which is the correct header to pull in the definitions from BaseTsd.h, so the type definition required for ssize_t was missing when compiling with MS Visual C++.
* Various places need implicit casting to z_size_t to get around compatibility issues. This will truncate the result when ZLIB_COMPAT is defined, so calling code should check for truncation.
* Add ZLIB_COMPAT flag to nmake Makefile and use it to determine correct
filenames instead of WITH_GZFILEOP

[configure] Fix creating import libraries and coverity build.

Fix build on ARM and gcc 4.x.

Fix the problem about rule to make target "zconf.h" on Arm platforms

If building zlib-ng with the --acle option on Arm platforms, the build
stops partway with the message "No rule to make target zconf.h needed
by crc32_acle.o".

This patch fixes the problem by including zconf.h or zconf-ng.h in
crc32_acle.c, depending on whether ZLIB_COMPAT is defined.

Change-Id: Ib050c5b0e65d86210c8babdff5dbe670729fc63a

Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>

Align in 16-byte boundary when UNALIGNED_OK is undefined.

[Issue #140] Use "/usr/bin/env bash" instead of /bin/sh.

[compat] Don't check for ZLIB_COMPAT
* ZLIB_COMPAT is always implied if using zlib.h
* Revert z_stream->adler to "unsigned long" to enforce correct alignment
of struct members

[compat] Use unsigned long for size parameters of compress/compressBound/uncompress

Assume WITH_GZFILEOP is defined when compatibility mode is enabled.

Fix build problems about NEON (#149)

* Fix build problems about NEON on AArch64

NEON is enabled by default on armv8-a platforms, and so NEON related
objects should be included when the platform is armv8-a. Errors about
adler32_neon will occur when you run ./configure on armv8-a platforms
without --neon option, because zlib-ng uses --neon option to include
NEON related objects regardless of Arm architecture.
You will have similar issue when you build the project with cmake.

This patch fixes the problem by including NEON related objects when
the platform is armv8-a(including aarch64).

Use adler32_neon only when zlib-ng is configured with --neon (or
-DWITH_NEON=ON if using cmake), or else use the default adler32
no matter what Arm architecture is.

Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>

Prefer memcpy and memcmp over direct memory read/comparisons. (#135)

* Prefer memcpy and memcmp over direct memory read/comparisons.

* Some platforms have alignment requirements and unaligned direct memory
read/comparisons may result in undefined behaviour.
* Prefer memcpy and memcmp which are lowered to efficient assembly where
possible.
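
A minimal sketch of the memcpy/memcmp pattern described above; `load_u32` and `equal4` are hypothetical helper names, not functions from this patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reads 32 bits from a possibly unaligned address without casting the
 * pointer; compilers typically lower this to a single load where legal. */
static uint32_t load_u32(const unsigned char *p) {
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Compares two 4-byte windows, e.g. the start of two match candidates. */
static int equal4(const unsigned char *a, const unsigned char *b) {
    return memcmp(a, b, 4) == 0;
}
```

Dereferencing `*(const uint32_t *)p` instead would be undefined behaviour when `p` is misaligned, even on platforms where it happens to work.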

Add option to disable test binaries ZLIB_ENABLE_TESTS

wrap crc32 in functable (#145)

* wrap crc32 in functable
* change internal crc32 api to use uint64_t rather than size_t for length

Use CMake to generate cmakein file (#146)

* Use CMake to generate cmakein file

Fix the bug in crc32_acle

On armv8-a platforms if --acle is enabled, zlib-ng will use crc32_acle
instead of the default crc32. However, in crc32_acle __crc32b() is
used to calculate the crc result of two variables with types uint32_t
and uint64_t, which gives a wrong result. The correct function to use
is __crc32d().

Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>

Fix dependency problem about cmake options

According to the content of CMakeLists.txt, if building with "-DZLIB_COMPAT=ON",
the value of WITH_GZFILEOP should be ON too. However, WITH_GZFILEOP is actually
OFF when you run "cmake .. -DZLIB_COMPAT=ON", which will cause errors if you
use gzfile related functions.

This patch fixes the problem by adjusting the position of WITH_GZFILEOP
option.

Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>

[ARM/AArch64] Allow disabling NEON support in adler32_stub.

Merge pull request #148 from Dead2/renamelib2

Rename library when compiled without --zlib-compat

Fix dynamic versioning of library

Adapt code to support PREFIX macros and update build scripts

Copy zconf.h.in to zconf-ng.h.in and add relevant processing in
configure and CMakeLists.txt

Add function prefix (zng_) to all exported functions to allow zlib-ng
to co-exist in an application that has been linked to something that
depends on stock zlib. Previously, that would cause random problems
since there is no way to guarantee what zlib version is being used
for each dynamically linked function.
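
The prefixing scheme can be sketched with a token-pasting macro. This is a minimal illustration: `demo_answer` is a hypothetical function, and the real headers may define the macro differently.

```c
#include <assert.h>

/* In compat mode names stay as in stock zlib; otherwise every exported
 * name gains a zng_ prefix. */
#ifdef ZLIB_COMPAT
#  define PREFIX(x) x
#else
#  define PREFIX(x) zng_ ## x
#endif

/* Hypothetical exported function, used only for illustration. */
int PREFIX(demo_answer)(void);

int PREFIX(demo_answer)(void) {
    return 42;
}
```

Built without ZLIB_COMPAT, the symbol is `zng_demo_answer`, so it cannot collide with a `demo_answer` from another library loaded into the same process.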

Add the corresponding zlib-ng.h.

Tests, example and minigzip will not compile before they have been
adapted to use the correct functions as well.
Either duplicate them, so we have minigzip-ng.c for example, or add
compile-time detection in the source code.

Rename library based on zlib-compat setting
If zlib-compat is enabled, keep libz name.
If zlib-compat is disabled, use libz-ng name.

Add new .map file for -ng mode, with prefixed function names.

This commit only contains preparatory changes to
the central parts of the build system.

Fix .so library permissions on install, needs to be executable for packaging
systems such as RPM to pick them up as provided libraries.

Fix make distclean with non-standard make (#134)

Like mingw32-make

Merge pull request #131 from mtl1979/patch-1

Fix compiler warning and spelling mistake in zlib.h

Update zlib.h

Fix compiler warning and spelling mistake

Merge pull request #130 from mtl1979/sync

Sync with zlib development branch

Fix deflateEnd() to not report an error at start of raw deflate.

Avoid an undefined behavior of memcpy() in _tr_stored_block().

Allegedly the behavior of memcpy() is undefined if the source
pointer is NULL, even if the number of bytes to copy is zero.

Merge pull request #129 from mtl1979/patch-1

[arm] Fix insert_string_acle.c

Merge pull request #128 from jserv/develop

aarch64: Build fix

Merge pull request #127 from mtl1979/typefix

[Issue #126] Fix implicit cast from unsigned char to signed int.

Update insert_string_acle.c

Build fix.

aarch64: Build fix

[Issue #126] Fix implicit cast from unsigned char to int.

Merge pull request #115 from Dead2/hacknslash7

Merge Hacknslash7

Fix that s->prev is not used uninitialized in insert_string_*

Revert "x86: use TZCNT (#113)"

Reverted after objections to its inclusion.

This reverts commit a7271104bf9a2d82dc6a69090c12442eacd2fd71.

Make code an int in compress_block()

send_code() expects int instead of unsigned. Other procedures do pass
int.

configure: For Windows builds, add the CROSS_PREFIX to $RC and $STRIP.

zlib's original win32/Makefile.gcc did the same, but this was removed in
7d17132436431d5f62cf5089623073d72d07deb0. It is kind of essential for
cross-compiling a Win32 build on Linux, since `windres` most certainly
doesn't exist, and the regular `strip` may not be able to handle DLLs.

It should probably actually be something like

RC="${RC-${CROSS_PREFIX}windres}"

and

STRIP="${STRIP-${CROSS_PREFIX}strip}"

to be consistent with the assignments of $AR, $RANLIB and $NM, but this
didn't work for some reason.

ZLIB_COMPAT: add an extra 32 bits of padding in z_stream

zlib "stock" uses an "uLong" for zstream::adler, meaning 8 bytes in 64-bit
builds where long is 64 bits wide. The padding makes zlib-ng a drop-in
replacement for libz; without it, the deflateInit2_() function returns a
version error when called from dependents that were built against "stock" zlib.

Committed from host : Portia.local

various CMake fixes:

- on Mac, builds can target 1 or more architectures that are not the host
  architecture. Pick the first from the list and ignore the others.
  A more complete implementation would warn if i386 and x86_64 builds are
  mixed via the compiler options.
- use CMake's compiler IDs to detect GCC and Clang (should be applied to
  icc too but I can't test)
- disable PCLMUL optimisation in 32bit Mac builds. It crashes and provides
  very little gain (to builds that are probably increasingly rare)

Committed from host : Portia.local

Apply trivial CMake fixes based on feedback from RJVB in issue #110

Make sure we don't export internal functions

x86: use TZCNT (#113)

x86: use TZCNT instruction
On processors that do not support TZCNT, the instruction byte encoding is executed as BSF.
TZCNT is faster on AMD than BSF.

Fix: wrong register for BMI1 bit (#112)

The BMI1 bit is in the ebx register and not in ecx.
See reference: https://software.intel.com/sites/default/files/article/405250/how-to-detect-new-instruction-support-in-the-4th-generation-intel-core-processor-family.pdf
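
As a small sketch of the corrected check: BMI1 is reported in bit 3 of EBX for CPUID leaf 7, sub-leaf 0. The helper below is hypothetical and takes the register value as an argument; obtaining it is compiler specific and left out here.

```c
#include <assert.h>
#include <stdint.h>

/* CPUID leaf 7, sub-leaf 0 reports BMI1 in bit 3 of EBX. */
#define CPUID7_EBX_BMI1 (UINT32_C(1) << 3)

/* Takes the EBX value returned by CPUID(EAX=7, ECX=0). */
static int has_bmi1(uint32_t leaf7_ebx) {
    return (leaf7_ebx & CPUID7_EBX_BMI1) != 0;
}
```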

Lazily initialize functable members. (#108)

- Split functableInit() function as separate functions for each functable member, so we don't need to initialize full functable in multiple places in the zlib-ng code, or to check for NULL on every invocation.
- Optimized function for each functable member is detected on first invocation and the functable item is updated for subsequent invocations.
- Remove NULL check in adler32() and adler32_z() as it is no longer needed.

- Add adler32 to functable
- Add missing call to functableinit from inflateinit
- Fix external direct calls to adler32 functions without calling functableinit
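
The lazy initialization described above can be sketched with a self-replacing function pointer. All names here (`demo_functable`, `add_stub`, `add_c`) are hypothetical, and the sketch is not thread safe as written.

```c
#include <assert.h>
#include <stdint.h>

struct demo_functable {
    uint32_t (*add)(uint32_t a, uint32_t b);
};

static uint32_t add_stub(uint32_t a, uint32_t b);

/* Every member starts out pointing at its own stub. */
static struct demo_functable demo_table = { add_stub };
static int stub_calls = 0;

/* Stand-in for a generic C implementation. */
static uint32_t add_c(uint32_t a, uint32_t b) {
    return a + b;
}

/* First call: choose an implementation (a real build would consult CPU
 * feature flags here), patch the table, then forward the call. */
static uint32_t add_stub(uint32_t a, uint32_t b) {
    stub_calls++;
    demo_table.add = add_c;
    return demo_table.add(a, b);
}
```

After the first call the table points straight at the chosen implementation, so later calls pay for one indirect call and no NULL check.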

Style cleanup

Add missing functable.h dependencies in arch makefiles

Merge branch 'hacknslash7' of github.com:Dead2/zlib-ng into hacknslash7

ARM optimizations part 2 (#107)

*  add adler32_neon to main dependency checking and ARM/Windows Makefile
*  split non-optimized adler32 to adler32_c so we can test/compare both without recompiling.
*  add detection of default floating point ABI in gcc
    NOTE: This should avoid build error when gcc supports both ABIs but header for just one ABI is installed.

Make -warn use the pedantic parameter that makes warnings, not errors.

Style cleanup

Add a struct func_table and function functableInit.
The struct contains pointers to select functions to be used by the
rest of zlib, and the init function selects what functions will be
used depending on what optimizations have been compiled in and what
instruction sets are available at runtime.

Tests done on a Haswell CPU running minigzip -6 compression of a
40M file show a 2.5% decrease in branches, and a 25-30% reduction
in iTLB-loads. The reduction in iTLB-loads is likely mostly due to
the inability to inline functions. This also causes a slight
performance regression of around 1%, but it might still be worth it
to make it much easier to implement new optimized functions for
various architectures and instruction sets.

The performance penalty will get smaller for functions that get more
alternative implementations to choose from, since there is no need
to add more branches to every call of the function.
Today insert_string has 1 branch to choose insert_string_sse
or insert_string_c, but if we also add for example insert_string_sse4
then that would have needed another branch, and it would probably
at some point hinder effective inlining too.

Implementing NEON-ized Adler32 checksum (#102)

The checksum is calculated over the uncompressed PNG data and can be
made much faster by using SIMD. Tests on ARMv8 yielded an improvement
of about 3x (e.g. walltime was 350 ms vs. 125 ms for 4096x4096 bytes,
executed 30 times).

This yields an improvement in image decoding in Chromium around 18%
(see https://bugs.chromium.org/p/chromium/issues/detail?id=688601).

Neon-Optimized hash chain rebase. (#106)

* Neon-Optimized hash chain rebase.

Signed-off-by: Jun He <jun.he@arm.com>

CMakeLists.txt: Fix cross-compiling. (#104)

Merge pull request #93 from sebpop/develop

inflate: improve performance of memory copy operations

Merge pull request #88 from mtl1979/arm

Implement ACLE/NEON optimizations for ARM/AARCH64

CMakeLists.txt: We can't use check_c_source_runs() when cross-compiling.

Add initial support for ARM NEON vector instructions.

Optimize fill_window_c.

inflate: improve performance of memory copy operations

When memory copy operations happen byte by byte, the processors are unable to
fuse the loads and stores together because of aliasing issues.  This patch
clusters some of the memory copy operations in chunks of 16 and 8 bytes.

For byte memset, the compiler knows how to prepare the chunk to be stored.
When the memset pattern is larger than a byte, this patch builds the pattern for
chunk memset using the same technique as in Simon Hosie's patch
https://codereview.chromium.org/2722063002
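
A minimal sketch of the chunk-memset idea for a 2-byte pattern, storing 8 bytes at a time. `fill_pattern2` is a hypothetical helper written for illustration; it is not the function from this patch or from Simon Hosie's.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fills out[0..len) by repeating the 2-byte pattern pat, storing 8 bytes
 * at a time where possible. */
static void fill_pattern2(unsigned char *out, const unsigned char pat[2],
                          size_t len) {
    unsigned char chunk[8];
    size_t i = 0;

    /* Build one 8-byte chunk holding four copies of the pattern. */
    for (size_t k = 0; k < sizeof chunk; k += 2)
        memcpy(chunk + k, pat, 2);

    /* Bulk part: one fixed-size memcpy per chunk. */
    for (; i + sizeof chunk <= len; i += sizeof chunk)
        memcpy(out + i, chunk, sizeof chunk);

    /* Tail: i is a multiple of 8 here, so i % 2 gives the pattern phase. */
    for (; i < len; i++)
        out[i] = pat[i % 2];
}
```

Fixed-size memcpy calls give the compiler a chance to emit single wide stores instead of a byte loop.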

This patch improves by 50% the performance of zlib decompression of a 50K PNG on
aarch64-linux and x86_64-linux when compiled with gcc-7 or llvm-5.

The number of executed instructions reported by valgrind --tool=cachegrind
on the decompression of a 50K PNG file on aarch64-linux:
- before the patch:
I   refs:      3,783,757,451
D   refs:      1,574,572,882  (869,116,630 rd   + 705,456,252 wr)

- with the patch:
I   refs:      2,391,899,214
D   refs:        899,359,836  (516,666,051 rd   + 382,693,785 wr)

Decompressing with minigzip -d a 35MB tar.gz made from a 260MB directory
containing the code of llvm takes:
on i7-4790K x86_64-linux, 0.533s before the patch and 0.493s with the patch,
on Juno-r0 aarch64-linux A57, 2.796s before the patch and 2.467s with the patch,
on Juno-r0 aarch64-linux A53, 4.055s before the patch and 3.604s with the patch.

Add initial support for AARCH64.

Add support for ARM ACLE instructions.

Add ARM implementation of CTZL for Visual C++.