git.ipfire.org Git - thirdparty/zlib-ng.git/log

]> git.ipfire.org Git - thirdparty/zlib-ng.git/log

projects / thirdparty / zlib-ng.git / log

commit | commitdiff | tree

Nathan Moinvaziri [Tue, 18 Jan 2022 02:47:23 +0000 (18:47 -0800)]

Move UNALIGNED_OK detection to compile time instead of configure time.

commit | commitdiff | tree

Mika T. Lindqvist [Tue, 15 Mar 2022 16:33:17 +0000 (18:33 +0200)]

Explicitly install dependencies for wine32.
* Allow downgrading packages to resolve conflicts

commit | commitdiff | tree

Mika Lindqvist [Tue, 15 Mar 2022 15:19:58 +0000 (17:19 +0200)]

Don't use -mtune with ClangCl.

commit | commitdiff | tree

Mika Lindqvist [Mon, 14 Mar 2022 18:02:01 +0000 (20:02 +0200)]

[README] Add missing FORCE_SSE2 for CMake.

commit | commitdiff | tree

Mika Lindqvist [Sun, 13 Mar 2022 15:12:42 +0000 (17:12 +0200)]

Allow bypassing runtime feature check of TZCNT instructions.
* This avoids conditional branch when it's known at build time that TZCNT instructions are always supported

commit | commitdiff | tree

Nathan Moinvaziri [Fri, 11 Mar 2022 23:45:06 +0000 (15:45 -0800)]

Throw an error when input is raw deflate stream but window_bits is not supplied.

commit | commitdiff | tree

Nathan Moinvaziri [Fri, 11 Mar 2022 23:42:14 +0000 (15:42 -0800)]

Print help when no arguments supplied to minideflate.

commit | commitdiff | tree

Nathan Moinvaziri [Fri, 11 Mar 2022 23:31:57 +0000 (15:31 -0800)]

Append extension to output file path based on window_bits when compressing and remove extension from output file path when decompressing.

commit | commitdiff | tree

Nathan Moinvaziri [Fri, 11 Mar 2022 22:53:47 +0000 (14:53 -0800)]

Added support for -k keep argument to minideflate. By default minideflate will now delete the input file.

commit | commitdiff | tree

Nathan Moinvaziri [Thu, 10 Mar 2022 16:57:13 +0000 (08:57 -0800)]

Use large default buffer size for minideflate to match minigzip use of GZBUFSIZE.

commit | commitdiff | tree

Nathan Moinvaziri [Thu, 10 Mar 2022 16:53:44 +0000 (08:53 -0800)]

Include zutil.h for definition of DEF_MEM_LEVEL.

commit | commitdiff | tree

Nathan Moinvaziri [Mon, 28 Feb 2022 17:00:26 +0000 (09:00 -0800)]

Auto-detect wrapper when inflating and no window_bits specified.

commit | commitdiff | tree

Nathan Moinvaziri [Sun, 27 Feb 2022 18:06:13 +0000 (10:06 -0800)]

Updated help usage with correct values for window_bits.

commit | commitdiff | tree

Nathan Moinvaziri [Wed, 23 Feb 2022 20:07:01 +0000 (12:07 -0800)]

Fixed wrong error name when calling inflate in minideflate.

commit | commitdiff | tree

Adam Stylinski [Mon, 21 Feb 2022 21:52:17 +0000 (16:52 -0500)]

Speed up chunkcopy and memset

This was found to have a significant impact on a highly compressible PNG
for both the encode and decode. Some deltas show performance improving
as much as 60%+.

For the scenarios where the "dist" is not an even modulus of our chunk
size, we simply repeat the bytes as many times as possible into our
vector registers. We then copy the entire vector and then advance the
quotient of our chunksize divided by our dist value.

If dist happens to be 1, there's no reason to not just call memset from
libc (this is likely to be just as fast if not faster).

commit | commitdiff | tree

Adam Stylinski [Mon, 24 Jan 2022 04:32:46 +0000 (23:32 -0500)]

Improve SSE2 slide hash performance

At least on pre-nehalem CPUs, we get a > 50% improvement. This is
mostly due to the fact that we're opportunistically doing aligned loads
instead of unaligned loads.  This is something that is very likely to be
possible, given that the deflate stream initialization uses the zalloc
function, which most libraries don't override.  Our allocator aligns to
64 byte boundaries, meaning we can do aligned loads on even AVX512 for
the zstream->prev and zstream->head pointers. However, only pre-nehalem
CPUs _actually_ benefit from explicitly aligned load instructions.

The other thing being done here is we're unrolling the loop by a factor
of 2 so that we can get a tiny bit more ILP.  This improved performance
by another 5%-7% gain.

commit | commitdiff | tree

Nathan Moinvaziri [Mon, 14 Mar 2022 18:46:02 +0000 (11:46 -0700)]

Added unit test for zng_calloc_aligned to ensure that it always returns 64-byte aligned memory allocation when requested.

commit | commitdiff | tree

Nathan Moinvaziri [Tue, 25 Jan 2022 04:58:54 +0000 (20:58 -0800)]

Bypass memory alignment compensation if not using custom allocator.

commit | commitdiff | tree

Nathan Moinvaziri [Tue, 4 Jan 2022 18:30:54 +0000 (10:30 -0800)]

Added memory alignment compensation functions for users who may be using custom allocators that don't align on the same boundary zlib-ng expects.

commit | commitdiff | tree

Ilya Leoshkevich [Tue, 15 Mar 2022 12:09:04 +0000 (08:09 -0400)]

IBM Z: Delete stale self-hosted builder containers

Due to things like power outage ExecStop may not run, resulting in a
stale actions-runner container. This would prevent ExecStart from
succeeding, so try deleting such stale containers in ExecStartPre.

commit | commitdiff | tree

Nathan Moinvaziri [Mon, 7 Mar 2022 16:25:42 +0000 (08:25 -0800)]

Use older version of Google test to support older versions of GCC.
Allow specifying alternative Google test repository and tag.

commit | commitdiff | tree

Nathan Moinvaziri [Fri, 25 Feb 2022 01:56:09 +0000 (17:56 -0800)]

Ignore code coverage for files in _dep directory.

commit | commitdiff | tree

Nathan Moinvaziri [Thu, 3 Feb 2022 22:37:12 +0000 (14:37 -0800)]

Compile MSAN instrumented C++ libraries for MSAN build with googletest.

commit | commitdiff | tree

Nathan Moinvaziri [Mon, 31 Jan 2022 23:59:42 +0000 (15:59 -0800)]

Prefer posix versions of MinGW for compiling against googletest.

commit | commitdiff | tree

Nathan Moinvaziri [Mon, 31 Jan 2022 23:50:27 +0000 (15:50 -0800)]

Added static versions of c++ libraries on S390X, MinGW, and ppc.

commit | commitdiff | tree

Nathan Moinvaziri [Thu, 27 Jan 2022 02:27:20 +0000 (18:27 -0800)]

Specify c++ compiler, packages, and flags for Google Test in cmake workflow.

commit | commitdiff | tree

Nathan Moinvaziri [Sun, 6 Feb 2022 17:51:06 +0000 (09:51 -0800)]

Move CVE-2003-0107 test to Google Tests.

commit | commitdiff | tree

Nathan Moinvaziri [Sun, 6 Feb 2022 17:52:27 +0000 (09:52 -0800)]

Implement unit testing using Google Test framework.

commit | commitdiff | tree

Nathan Moinvaziri [Wed, 9 Mar 2022 22:57:22 +0000 (14:57 -0800)]

Added ClangCl instances to GitHub Actions workflow.

commit | commitdiff | tree

Nathan Moinvaziri [Wed, 9 Mar 2022 22:50:41 +0000 (14:50 -0800)]

Fixed inftrees.c should be compiled with infcover when zlib is a shared library.

commit | commitdiff | tree

Adam Stylinski [Mon, 21 Feb 2022 05:17:07 +0000 (00:17 -0500)]

Adding some application-specific benchmarks

So far there's only added png encode and decode with predictably
compressible bytes. This gives us a rough idea of more holistic
impacts of performance improvements (and regressions).

An interesting thing found with this, when compared with stock zlib,
we're slower for png decoding at levels 8 & 9. When we are slower, we
are spending a fair amount of time in the chunk copy function. This
probably merits a closer look.

This code creates optionally an alternative benchmark binary that links
with an alternative static zlib implementation. This can be used to
quickly compare between different forks.

commit | commitdiff | tree

Adam Stylinski [Tue, 8 Feb 2022 22:09:30 +0000 (17:09 -0500)]

Use pclmulqdq accelerated CRC for exported function

We were already using this internally for our CRC calculations, however
the exported function to CRC checksum any arbitrary stream of bytes was
still using a generic C based version that leveraged tables. This
function is now called when len is at least 64 bytes.

commit | commitdiff | tree

Adam Stylinski [Sat, 12 Feb 2022 15:26:50 +0000 (10:26 -0500)]

Improved adler32 NEON performance by 30-47%

We unlocked some ILP by allowing for independent sums in the loop and
reducing these sums outside of the loop. Additionally, the multiplication
by 32 (now 64) is moved outside of this loop. Similar to the chromium
implementation, this code does straight 8 bit -> 16 bit additions and defers
the fused multiply accumulate outside of the loop. However, by unrolling by
another factor of 2, the code is measurably faster. The code does fused multiply
accmulates back to as many scratch registers we have room for in order to maximize
ILP for the 16 integer FMAs that need to occur. The compiler seems to order them
such that the destination register is the same register as the previous instruction,
so perhaps it's not actually able to overlap or maybe the -A73's pipeline is reordering
these instructions, anyway.

On the Odroid-N2, the Cortex-A73 cores are ~30-44% faster on the adler32 benchmark,
and the Cortex-A53 cores are anywhere from 34-47% faster.

commit | commitdiff | tree

Adam Stylinski [Wed, 16 Feb 2022 14:42:40 +0000 (09:42 -0500)]

Unlocked more ILP in SSE variant of adler checksum

This helps uarchs such as sandybridge more than Yorkfield, but there
were some measurable gains on a Core 2 Quad Q9650 as well. We can sum
to two separate vs2 variables and add them back together at the end,
allowing for some overlapping multiply-adds. This was only about a 9-12%
gain on the Q9650 but it nearly doubled performance on cascade lake and
is likely to have appreciable gains on everything in between those two.

commit | commitdiff | tree

Adam Stylinski [Sat, 5 Feb 2022 21:15:46 +0000 (16:15 -0500)]

Improve sse41 adler32 performance

Rather than doing opportunistic aligned loads, we can do scalar
unaligned loads into our two halves of the checksum until we hit
alignment. Then, we can subtract from the max number of sums for the
first run through the loop.

This allows us to force aligned loads for unaligned buffers (likely a
common case for arbitrary runs of memory). This is not meaningful after
Nehalem but pre-Nehalem architectures it makes a substantial difference
to performance and is more foolproof than hoping for an aligned buffer.

Improvement is around 44-50% for unaligned worst case scenarios.

commit | commitdiff | tree

Hans Kristian Rosbach [Wed, 23 Feb 2022 20:31:11 +0000 (21:31 +0100)]

Run libpng tests on push in addition to pull-requests.
Also run oss-fuzz on push to certain branches.

commit | commitdiff | tree

Hans Kristian Rosbach [Fri, 11 Feb 2022 14:53:00 +0000 (15:53 +0100)]

Fix compilation of benchmark when compiler supports, but does not default to enable C++11 or higher.

commit | commitdiff | tree

Nathan Moinvaziri [Wed, 23 Feb 2022 17:44:00 +0000 (09:44 -0800)]

Use multiple threads when running gcovr.

commit | commitdiff | tree

Nathan Moinvaziri [Wed, 23 Feb 2022 15:16:19 +0000 (07:16 -0800)]

Exclude unreachable branches from code coverage report.

commit | commitdiff | tree

Nathan Moinvaziri [Sat, 12 Feb 2022 16:10:17 +0000 (08:10 -0800)]

Added codecov yaml configuration to repository.

commit | commitdiff | tree

Nathan Moinvaziri [Sat, 12 Feb 2022 01:23:31 +0000 (17:23 -0800)]

Switch to using Codecov GitHub Action.

commit | commitdiff | tree

Ilya Leoshkevich [Sun, 6 Feb 2022 20:00:49 +0000 (21:00 +0100)]

IBM Z: Install Codecov dependencies on the self-hosted builder

commit | commitdiff | tree

Adam Stylinski [Mon, 21 Feb 2022 21:46:18 +0000 (16:46 -0500)]

Prevent stale stub functions from being called in deflate_slow

Just in case this is the very first call to longest match, we should
instead assign the function pointer instead of the function itself. This
way, by the time it leaves the stub, the function pointer gets
reassigned. This was found incidentally while debugging something else.

commit | commitdiff | tree

Mika Lindqvist [Fri, 18 Feb 2022 06:27:18 +0000 (08:27 +0200)]

Don't use -march=native when doing LD4 test for ARM/AArch64.

commit | commitdiff | tree

Mika Lindqvist [Thu, 17 Feb 2022 23:42:59 +0000 (01:42 +0200)]

[AArch64] Add missing LD4 test for configure.

commit | commitdiff | tree

Adam Stylinski [Sun, 23 Jan 2022 05:18:17 +0000 (00:18 -0500)]

Axe the SSE4 compare256 functions

commit | commitdiff | tree

Adam Stylinski [Sun, 23 Jan 2022 03:49:04 +0000 (22:49 -0500)]

Write an SSE2 optimized compare256

The SSE4 variant uses the unfortunate string comparison instructions from
SSE4.2 which not only don't work on as many CPUs but, are often slower
than the SSE2 counterparts except in very specific circumstances.

This version should be ~2x faster than unaligned_64 for larger strings
and about half the performance of AVX2 comparisons on identical
hardware.

This version is meant to supplement pre AVX hardware. Because of this,
we're performing 1 extra load + compare at the beginning. In the event
that we're doing a full 256 byte comparison (completely equal strings),
this will result in 2 extra SIMD comparisons if the inputs are unaligned.
Given that the loads will be absorbed by L1, this isn't super likely to
be a giant penalty but for something like a core-i first or second gen,
where unaligned loads aren't nearly as expensive, this going to be
_marginally_ slower in the worst case. This allows us to have half the
loads be aligned, so that the compiler can elide the load and compare by
using a register relative pcmpeqb.

commit | commitdiff | tree

Nathan Moinvaziri [Wed, 26 Jan 2022 18:51:23 +0000 (10:51 -0800)]

Introduce zmemcmp to use unaligned access for architectures we know support unaligned access, otherwise use memcmp.

commit | commitdiff | tree

Nathan Moinvaziri [Sun, 9 Jan 2022 23:01:23 +0000 (15:01 -0800)]

Introduce zmemcpy to use unaligned access for architectures we know support unaligned access, otherwise use memcpy.

commit | commitdiff | tree

Nathan Moinvaziri [Sun, 9 Jan 2022 22:58:53 +0000 (14:58 -0800)]

Simplify chunk_t type to uint64_t with memcpy calls.

commit | commitdiff | tree

Deniz Bahadir [Thu, 23 Dec 2021 00:13:03 +0000 (01:13 +0100)]

Fix compilation with clang-cl on windows

Do not include (system) headers when processing these headers with the
resource compiler, because it might trip over the headers coming from
LLVM.

commit | commitdiff | tree

Mika Lindqvist [Mon, 7 Feb 2022 15:54:41 +0000 (17:54 +0200)]

Update NMake GitHub Actions to use Visual Studio 2022 Enterprise.

commit | commitdiff | tree

Nathan Moinvaziri [Sun, 6 Feb 2022 18:06:02 +0000 (10:06 -0800)]

Remove unnecessary zutil.h includes.

commit | commitdiff | tree

Nathan Moinvaziri [Fri, 4 Feb 2022 03:13:59 +0000 (19:13 -0800)]

Fixed short name for CPU features header guard.

commit | commitdiff | tree

Nathan Moinvaziri [Fri, 4 Feb 2022 01:57:31 +0000 (17:57 -0800)]

Remove duplicate header includes.

commit | commitdiff | tree

Nathan Moinvaziri [Thu, 27 Jan 2022 01:58:52 +0000 (17:58 -0800)]

Only define CPU variants that require deflate_state when deflate.h has previously been included. This allows us to include cpu_features.h without including zlib.h or name mangling.

commit | commitdiff | tree

Nathan Moinvaziri [Sat, 5 Feb 2022 19:47:03 +0000 (11:47 -0800)]

Rename CPU feature header and source files for consistency.

commit | commitdiff | tree

Nathan Moinvaziri [Sun, 30 Jan 2022 17:34:02 +0000 (09:34 -0800)]

Added power8 cpu flag to powerpc64 CMake toolchain files.

commit | commitdiff | tree

Adam Stylinski [Thu, 3 Feb 2022 23:23:45 +0000 (18:23 -0500)]

Obtained more ILP with VMX by breaking a data dependency

By unrolling and finding the equivalent recurrence relation here, we can
do more independent sums, maximizing ILP. For when the data size fits
into cache, we get a sizable return. For when we don't, it's minor but
still measurable.

Testing was done on a quad Powermac G5 at 2.5Ghz.

commit | commitdiff | tree

Nathan Moinvaziri [Fri, 4 Feb 2022 16:47:54 +0000 (08:47 -0800)]

Install codecov tools only after successful build and test.

commit | commitdiff | tree

Nathan Moinvaziri [Thu, 3 Feb 2022 23:14:56 +0000 (15:14 -0800)]

Add wine repositories only when needed.

commit | commitdiff | tree

Nathan Moinvaziri [Thu, 3 Feb 2022 23:02:14 +0000 (15:02 -0800)]

Remove code coverage libraries which introduce memory leak on exit.

commit | commitdiff | tree

Nathan Moinvaziri [Thu, 3 Feb 2022 22:31:10 +0000 (14:31 -0800)]

Enable code coverage build only when code coverage report name specified.

commit | commitdiff | tree

Mika T. Lindqvist [Tue, 1 Feb 2022 14:20:38 +0000 (16:20 +0200)]

[Benchmarks] Fix adler32/vmx benchmark not found under 32-bit PowerPC

commit | commitdiff | tree

Adam Stylinski [Sun, 23 Jan 2022 16:59:57 +0000 (11:59 -0500)]

Marginal improvement by pipelining loads on NEON

The ld1{4 reg} variant saves us instructions
and only adds 3 cycles of latency to load 3
more neon/asimd registers worth of data.

commit | commitdiff | tree

Adam Stylinski [Fri, 28 Jan 2022 15:00:07 +0000 (10:00 -0500)]

More than double adler32 performance with altivec

Bits of low hanging and high hanging fruit in this round of
optimization.  Altivec has a sum characters into 4 lanes of integers
instructions (intrinsic vec_sum4s) that seems basically made for this
algorithm.  Additionally, there's a similar multiply-accumulate routine
that takes two character vectors for input and outputs a vector of 4
ints for their respective adjacent sums.  This alone was a good amount
of the performance gains.

Additionally, the shifting by 4 was still done in the loop when it was
easy to roll outside of the loop and do only once.  This removed some
latency for a dependent operand to be ready.  We also unrolled the loop
with independent sums, though, this only seems to help for much larger
input sizes.

Additionally, we reduced feeding the two 16 bit halves of the sum simply
by packing them into an aligned allocation in the stack next to each
other.  Then, when loaded, we permute and shift the values to two
separate vector registers from the same input registers.  The separation
of these scalars probably could have been done in vector registers
through some tricks but we need them in scalar GPRs anyhow every time
they leave the loop so it was naturally better to keep those separate
before hitting the vectorized code.

For the horizontal addition, the code was modified to use a sequence of
shifts and adds to produce a vector sum in the first lane.  Then, the
much cheaper vec_ste was used to store the value into a general purpose
register rather than vec_extract.

Lastly, instead of doing the relatively expensive modulus in GPRs after
we perform the scalar operations to align all of the loads in the loop,
we can instead reduce "n" here for the first round to be n minus the
alignment offset.

commit | commitdiff | tree

Nathan Moinvaziri [Mon, 31 Jan 2022 15:54:52 +0000 (07:54 -0800)]

Added sanitizer identification to CMake CI instance names.

commit | commitdiff | tree

Nathan Moinvaziri [Mon, 31 Jan 2022 15:51:01 +0000 (07:51 -0800)]

Fixed S390X CI instance code coverage report names not unique.

commit | commitdiff | tree

Nathan Moinvaziri [Sun, 30 Jan 2022 16:30:34 +0000 (08:30 -0800)]

Remove qemu-run variable which is already defined in the toolchain files. Only configure script would need this variable defined.

commit | commitdiff | tree

Nathan Moinvaziri [Fri, 21 Jan 2022 20:15:36 +0000 (12:15 -0800)]

Fixed typo when undefined behavior sanitizer is not supported.

commit | commitdiff | tree

Nathan Moinvaziri [Thu, 27 Jan 2022 00:07:31 +0000 (16:07 -0800)]

Remove unnecessary compiler specification from mingw configs.

commit | commitdiff | tree

Nathan Moinvaziri [Wed, 26 Jan 2022 22:36:26 +0000 (14:36 -0800)]

Removed unused CMake includes in benchmark cmake.

commit | commitdiff | tree

Nathan Moinvaziri [Wed, 26 Jan 2022 22:36:07 +0000 (14:36 -0800)]

Fixed wrong cpu check variable used for ARM in adler benchmarks.

commit | commitdiff | tree

Ilya Leoshkevich [Mon, 31 Jan 2022 12:19:18 +0000 (13:19 +0100)]

IBM Z: Upgrade self-hosted builder to v2.287.1

commit | commitdiff | tree

Mika T. Lindqvist [Tue, 25 Jan 2022 14:11:39 +0000 (16:11 +0200)]

[ARM] rename cmake/configure macros check_{acle,neon}_intrinsics to check_{acle,neon}_compiler_flag
* Currently these macros only check that the compiler flag(s) are supported, not that the compiler supports the actual intrinsics

commit | commitdiff | tree

Mika T. Lindqvist [Mon, 24 Jan 2022 06:21:10 +0000 (08:21 +0200)]

[ARM] Use armv8-crc+simd when compiling ACLE code on toolchains that don't enable FPU by default.

commit | commitdiff | tree

Nathan Moinvaziri [Wed, 26 Jan 2022 19:30:34 +0000 (11:30 -0800)]

Move _POSIX_C_SOURCE define before first stdlib.h include in zbuild.h for posix_memalign.

commit | commitdiff | tree

Nathan Moinvaziri [Sat, 22 Jan 2022 18:52:12 +0000 (10:52 -0800)]

Remove zutil.h includes from many files to prevent zlib.h being included.

commit | commitdiff | tree

Nathan Moinvaziri [Sat, 22 Jan 2022 17:57:24 +0000 (09:57 -0800)]

Move build basics to zbuild.h.

commit | commitdiff | tree

Nathan Moinvaziri [Tue, 25 Jan 2022 23:28:56 +0000 (15:28 -0800)]

Added CMake status message when code coverage is enabled.

commit | commitdiff | tree

Nathan Moinvaziri [Tue, 25 Jan 2022 05:28:13 +0000 (21:28 -0800)]

Use NATIVEFLAG in intrinsic checks that is added whenever it is enabled.

commit | commitdiff | tree

Nathan Moinvaziri [Mon, 24 Jan 2022 03:53:22 +0000 (19:53 -0800)]

Added fallback macros for add_compile_options and add_link_options.

commit | commitdiff | tree

Nathan Moinvaziri [Wed, 26 Jan 2022 01:56:12 +0000 (17:56 -0800)]

Use add_compile_options in cmake which sets both C and CXX flags.

It is necessary to break apart strings that contain multiple flags so they are processed by add_compile_options properly.

commit | commitdiff | tree

Nathan Moinvaziri [Wed, 26 Jan 2022 01:49:45 +0000 (17:49 -0800)]

Use add_link_options in cmake sanitizer detection.

commit | commitdiff | tree

Nathan Moinvaziri [Sun, 23 Jan 2022 21:26:36 +0000 (13:26 -0800)]

Use add_link_options in cmake code coverage detection.

commit | commitdiff | tree

Nathan Moinvaziri [Sun, 23 Jan 2022 21:48:45 +0000 (13:48 -0800)]

Update CMake toolchains for alternative gcc/gcc++ names.
Remove compiler attribute in GHA when using CMake toolchain files.

commit | commitdiff | tree

Mika T. Lindqvist [Fri, 28 Jan 2022 22:49:14 +0000 (00:49 +0200)]

[PowerPC] Default CPU to 7400 to enable VMX support in qemu-ppc

commit | commitdiff | tree

Nathan Moinvaziri [Sun, 23 Jan 2022 20:59:01 +0000 (12:59 -0800)]

Use static keyword for vec_sumsu to prevent undefined reference error when g++ linking.

commit | commitdiff | tree

Michael Hirsch [Tue, 25 Jan 2022 00:22:01 +0000 (19:22 -0500)]

Intel compilers: update deprecated -wn to -Wall style

This removes warnings on every single target like:
icx: command line warning #10430: Unsupported command line options encountered
These options as listed are not supported.
For more information, use '-qnextgen-diag'.
option list:
-w3

Signed-off-by: Michael Hirsch <michael@scivision.dev>

commit | commitdiff | tree

Nathan Moinvaziri [Mon, 24 Jan 2022 04:15:40 +0000 (20:15 -0800)]

Fixed GCC warning about unused variable in longest_match.

match_tpl.h:47:13: warning: unused variable ‘scan_start’ [-Wunused-variable]
47 | uint8_t scan_start[8], scan_end[8];

commit | commitdiff | tree

Adam Stylinski [Tue, 25 Jan 2022 05:16:37 +0000 (00:16 -0500)]

Make cmake and configure release flags consistent

CMake sufficiently appends -DNDEBUG to the preprocessor macros when not
compiling with debug symbols. This turns off debug level assertions and
has some other side effects. As such, we should equally append this
define to the configure scripts' CFLAGS.

commit | commitdiff | tree

Nathan Moinvaziri [Mon, 24 Jan 2022 00:18:05 +0000 (16:18 -0800)]

Remove unused fdopen define for MSVC.

commit | commitdiff | tree

Nathan Moinvaziri [Sat, 22 Jan 2022 21:06:14 +0000 (13:06 -0800)]

Use pigz version 2.6 due to bug in NOTHREADS support.
https://github.com/madler/pigz/issues/97

commit | commitdiff | tree

Nathan Moinvaziri [Sat, 22 Jan 2022 21:05:44 +0000 (13:05 -0800)]

Allow setting of version when building with pigz.

commit | commitdiff | tree

Nathan Moinvaziri [Sat, 22 Jan 2022 17:49:13 +0000 (09:49 -0800)]

Move cpu feature variant callback typedefs to cpu_features header.