lawadr [Thu, 30 Mar 2023 19:37:14 +0000 (20:37 +0100)]
Check for attribute aligned compiler support
Check for compiler support in CMake and the configure script. This
allows ALIGNED_ to be defined for more compilers so that more than
just Clang, GCC and MSVC can build the project.
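A minimal sketch of how such a probe result might feed the macro definition. `HAVE_ATTRIBUTE_ALIGNED` is an assumed name for the define produced by the build-system check, and `aligned_buffer` and `buffer_is_aligned` are hypothetical helpers for illustration only.

```c
#include <stdint.h>

/* HAVE_ATTRIBUTE_ALIGNED is an assumed name for the define that the
 * CMake/configure probe would set when the attribute compiles. */
#if defined(HAVE_ATTRIBUTE_ALIGNED)
#  define ALIGNED_(x) __attribute__((aligned(x)))
#elif defined(_MSC_VER)
#  define ALIGNED_(x) __declspec(align(x))
#elif defined(__GNUC__) || defined(__clang__)
   /* Compiler-ID fallback, kept here only so the sketch builds without the probe. */
#  define ALIGNED_(x) __attribute__((aligned(x)))
#else
#  define ALIGNED_(x)
#endif

/* Hypothetical buffer that requests 16-byte alignment. */
static ALIGNED_(16) unsigned char aligned_buffer[64];

/* Returns 1 if the buffer's address is a multiple of 16. */
static int buffer_is_aligned(void) {
    return ((uintptr_t)aligned_buffer % 16) == 0;
}
```

Keying the macro on a probed feature rather than on compiler identity is what lets compilers outside the usual three pick up the attribute.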
The header locations are OS specific and not architecture specific. The
previous behaviour was to always include machine/endian.h for ARM and
AArch64 architectures on non-Windows and non-Linux OSs, causing build
failures if the OS uses other locations defined further down the
conditional block.
Georgiy Manuilov [Sun, 12 Mar 2023 13:45:53 +0000 (14:45 +0100)]
Enable using AVX512 intrinsics with GCC <9
Replace the missing '_mm512_set_epi8' with
'_mm512_set_epi32' in the configure-time test code.
Add a fallback for the '-mtune=cascadelake' flag used
when AVX512 is enabled.
Georgiy Manuilov [Sun, 12 Mar 2023 13:45:05 +0000 (14:45 +0100)]
Add fallback function for '_mm512_set_epi8' intrinsic
The '_mm512_set_epi8' intrinsic is missing in GCC <9.
However, its usage can be easily eliminated in
favor of '_mm512_set_epi32' with no loss in
performance, enabling older GCC to benefit from
AVX512-optimized codepaths.
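The replacement relies on packing four byte arguments into each 32-bit argument. The sketch below shows only that packing arithmetic in portable C, without the intrinsics, so it runs on machines that lack AVX512. `pack_epi8x4` is a hypothetical helper name, not a function from the commit.

```c
#include <stdint.h>

/* Pack four bytes into one 32-bit lane value. b3 becomes the most
 * significant byte and b0 the least significant, matching the
 * high-to-low argument order of the set-style intrinsics. */
static int32_t pack_epi8x4(uint8_t b3, uint8_t b2, uint8_t b1, uint8_t b0) {
    uint32_t packed = ((uint32_t)b3 << 24) |
                      ((uint32_t)b2 << 16) |
                      ((uint32_t)b1 << 8)  |
                       (uint32_t)b0;
    return (int32_t)packed;
}
```

With a helper like this, a 64-argument '_mm512_set_epi8' call can be rewritten as a 16-argument '_mm512_set_epi32' call in which each argument packs four consecutive bytes.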
lawadr [Mon, 20 Mar 2023 17:46:35 +0000 (17:46 +0000)]
Add member to cpu_features struct if empty
When WITH_OPTIM is off, the cpu_features struct is empty. This is not
allowed in standard C and causes a build failure with various compilers,
including MSVC.
This adds a dummy char member to the struct if it would otherwise be
empty.
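A minimal sketch of the pattern, using hypothetical names (`cpu_features_sketch`, `WITH_OPTIM_SKETCH`, `has_simd`) rather than the project's actual definitions:

```c
#include <stddef.h>

/* Hypothetical stand-in for the feature struct. When the optimization
 * switch is not defined, a placeholder member keeps the struct
 * non-empty, as standard C requires. */
struct cpu_features_sketch {
#ifdef WITH_OPTIM_SKETCH
    int has_simd;   /* illustrative feature flag */
#else
    char empty;     /* dummy member, never read */
#endif
};
```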
lawadr [Fri, 17 Mar 2023 16:35:13 +0000 (16:35 +0000)]
Fix regex for visibility attribute tests
The previous regex of `not supported` was very specific to a particular
compiler (Clang 3.4+). As Clang isn't the only compiler that throws a
warning (but otherwise succeeds) when a visibility isn't supported, make
the regex more generic to hit all such cases.
Testing on Compiler Explorer shows that looking for the string
`visibility` has a better hit rate. `attribute` is perhaps more
dangerously generic, and `hidden`/`internal` don't always show up in
warning messages when the visibility attribute itself isn't available.
Reduce the number of different defines required for arch-specific optimizations.
Also removed a reference to a nonexistent adler32_sse41 in test/test_adler32.cc.
Combine some of the checks that were not identical.
Made longest_match and compare256 use the X86_NOCHECK_SSE2 override,
so they are now also enabled automatically on x86_64.
Ilya Leoshkevich [Fri, 10 Feb 2023 12:41:07 +0000 (13:41 +0100)]
Fix warnings in benchmarks
1. Initialize len in benchmark_compare256.cc.
In function ‘typename std::enable_if<(std::is_trivially_copyable<_Tp>::value && (sizeof (Tp) <= sizeof (Tp*)))>::type benchmark::DoNotOptimize(Tp&) [with Tp = unsigned int]’,
inlined from ‘void compare256::Bench(benchmark::State&, compare256_func)’ at /zlib-ng/test/benchmarks/benchmark_compare256.cc:44:33,
inlined from ‘virtual void compare256_c_Benchmark::BenchmarkCase(benchmark::State&)’ at /zlib-ng/test/benchmarks/benchmark_compare256.cc:62:1:
/zlib-ng/_deps/benchmark-src/include/benchmark/benchmark.h:480:3: warning: ‘len’ may be used uninitialized [-Wmaybe-uninitialized]
480 | asm volatile("" : "+m,r"(value) : : "memory");
| ^~~
/zlib-ng/test/benchmarks/benchmark_compare256.cc: In member function ‘virtual void compare256_c_Benchmark::BenchmarkCase(benchmark::State&)’:
/zlib-ng/test/benchmarks/benchmark_compare256.cc:36:18: note: ‘len’ was declared here
36 | uint32_t len;
| ^~~
2. Make the loop counter unsigned in benchmark_slidehash.cc.
/zlib-ng/test/benchmarks/benchmark_slidehash.cc: In member function ‘virtual void slide_hash::SetUp(const benchmark::State&)’:
/zlib-ng/test/benchmarks/benchmark_slidehash.cc:29:31: warning: comparison of integer expressions of different signedness: ‘int32_t’ {aka ‘int’} and ‘unsigned int’ [-Wsign-compare]
29 | for (int32_t i = 0; i < HASH_SIZE; i++) {
Adjust thread counts for compiles and tests to avoid under-utilization and congestion.
The free GitHub Actions VMs have 2 cores; the dedicated s390x VM has 4 cores.
Disable zlib-ng internal tests when BUILD_SHARED_LIBS=ON.
When BUILD_SHARED_LIBS=ON, some zlib-ng internal functions used by
gtest_zlib and benchmark_zlib are not exported. Therefore, we must disable
those tests/projects.
Replace __builtin_ctz[ll] fallback functions with branchless implementations.
Added debug assert check for value = 0.
Added more details to the comment to avoid future confusion.
Added fallback logic for older MSVC versions, just in case.
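One way such a fallback can be written without explicit branches is sketched below. This is an illustrative version under the assumption of a 32-bit input, not necessarily the implementation in the commit, and `ctz32_branchless` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Count trailing zero bits without explicit branches.
 * The result is undefined for 0, so a debug assert guards that case. */
static int ctz32_branchless(uint32_t value) {
    assert(value != 0);
    /* Isolate the lowest set bit: exactly one bit remains set. */
    uint32_t lowest = value & (0u - value);
    int count = 0;
    /* Each mask tests one bit of the position of that single set bit. */
    count += ((lowest & 0xFFFF0000u) != 0) << 4;
    count += ((lowest & 0xFF00FF00u) != 0) << 3;
    count += ((lowest & 0xF0F0F0F0u) != 0) << 2;
    count += ((lowest & 0xCCCCCCCCu) != 0) << 1;
    count += ((lowest & 0xAAAAAAAAu) != 0);
    return count;
}
```

Because only one bit survives the isolation step, the five mask tests read off the binary digits of its index directly, so no loop or conditional jump is needed in the source.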
This should reduce the cost of indirection that occurs when calling functable
chunk copying functions inside inflate_fast. It should also allow the compiler
to optimize the inflate fast path for the specific architecture.
Mark Adler [Thu, 15 Dec 2022 17:07:13 +0000 (09:07 -0800)]
Fix bug in deflateBound() for level 0 and memLevel 9.
memLevel 9 would cause deflateBound() to assume the use of fixed
blocks, even if the compression level was 0, which forces stored
blocks. That could result in a bound less than the size of the
compressed data. Now level 0 always uses the stored blocks bound.
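To illustrate why stored output needs its own bound, the sketch below computes the size of a raw DEFLATE stream made only of maximum-size stored blocks, using the framing from RFC 1951 (up to 65535 data bytes per block, 5 bytes of header per byte-aligned block). This is not the formula used by deflateBound(), and real streams may use smaller blocks and therefore carry more overhead. `stored_stream_size` and the two constants are hypothetical names.

```c
#include <stddef.h>

#define STORED_BLOCK_MAX    65535u  /* LEN is a 16-bit field */
#define STORED_BLOCK_HEADER 5u      /* 1 header byte + LEN + NLEN */

/* Size of a raw DEFLATE stream that holds source_len bytes in
 * maximum-size stored blocks. An empty input still needs one block. */
static size_t stored_stream_size(size_t source_len) {
    size_t blocks = source_len == 0
        ? 1
        : (source_len + STORED_BLOCK_MAX - 1) / STORED_BLOCK_MAX;
    return source_len + blocks * STORED_BLOCK_HEADER;
}
```

Stored output is always larger than the input, so any bound derived for a different block type can fall short when level 0 forces stored blocks.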
Mika Lindqvist [Sat, 21 Jan 2023 23:16:11 +0000 (01:16 +0200)]
Allow disabling visibility attribute with configure
* Disable the visibility check for Cygwin, MinGW and MSYS, as the compiler only issues a warning instead of an error for unsupported attributes.
Fix ABI checking...
* Ubuntu 22.04 uses a different format for ABI files, so the old ones need to be removed
* Use a more recent zlib-ng commit to avoid issues with internal adler32 and crc32 functions
Pavel P [Fri, 13 Jan 2023 18:31:48 +0000 (21:31 +0300)]
Fix compilation error where `crc32_fold` type matches field name in struct functable_s
If functable.h is included by a C++ compiler, the compiler issues the following error (VS 2022):
```
zlib-ng/functable.h(20,49): error C2327: 'functable_s::crc32_fold': is not a type name, static, or enumerator
```
The error happens on line 20 because on the previous line crc32_fold is declared as a struct member. Using `struct crc32_fold_s` instead of `crc32_fold` fixes the error.
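A simplified sketch of the pattern, with hypothetical names and signatures (`functable_sketch`, `fold_stub`, `final_stub`) standing in for the real header. Once a member named `crc32_fold` exists, a C++ compiler resolves the bare name to that member in later declarations, so the sketch spells the type with its struct tag.

```c
#include <stddef.h>
#include <stdint.h>

/* The typedef name collides with a member name below. */
typedef struct crc32_fold_s {
    uint32_t value;
} crc32_fold;

/* Hypothetical, cut-down function table. Both members refer to the
 * type as "struct crc32_fold_s", which stays unambiguous in C++. */
struct functable_sketch {
    uint32_t (*crc32_fold)(struct crc32_fold_s *crc, const uint8_t *src, size_t len);
    uint32_t (*crc32_fold_final)(struct crc32_fold_s *crc);
};

/* Placeholder that sums bytes; it is not a CRC computation. */
static uint32_t fold_stub(struct crc32_fold_s *crc, const uint8_t *src, size_t len) {
    for (size_t i = 0; i < len; i++)
        crc->value += src[i];
    return crc->value;
}

/* Placeholder that returns the accumulated value. */
static uint32_t final_stub(struct crc32_fold_s *crc) {
    return crc->value;
}
```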