Simplify zng_calloc and zng_cfree.
Make new static functions zng_alloc and zng_free available to other parts of the code.
Always request aligned allocations, even if UNALIGNED_OK is set.
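
For illustration only, here is a minimal sketch of one common way to hand out aligned memory on top of plain malloc. The names sketch_alloc_aligned, sketch_free_aligned and SKETCH_ALIGN are hypothetical and are not taken from zlib-ng; the actual zng_alloc and zng_free may work quite differently.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical alignment for this sketch; must be a power of two. */
#define SKETCH_ALIGN 64

/* Over-allocate, round the address up to SKETCH_ALIGN, and stash the
 * pointer returned by malloc just below the aligned block so that it
 * can be recovered when freeing. */
static void *sketch_alloc_aligned(size_t size) {
    const size_t overhead = SKETCH_ALIGN + sizeof(void *);
    if (size > SIZE_MAX - overhead)
        return NULL;                       /* size + overhead would overflow */

    void *raw = malloc(size + overhead);
    if (raw == NULL)
        return NULL;

    uintptr_t base = (uintptr_t)raw + sizeof(void *);
    uintptr_t aligned = (base + SKETCH_ALIGN - 1) & ~(uintptr_t)(SKETCH_ALIGN - 1);

    ((void **)aligned)[-1] = raw;          /* remember the original pointer */
    return (void *)aligned;
}

/* Free a block obtained from sketch_alloc_aligned; NULL is accepted. */
static void sketch_free_aligned(void *ptr) {
    if (ptr != NULL)
        free(((void **)ptr)[-1]);
}
```

A block returned by sketch_alloc_aligned must only be released with sketch_free_aligned, never with free directly.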
Dan Kegel [Thu, 16 Jul 2020 16:59:52 +0000 (09:59 -0700)]
test/abicheck.sh: new script to verify ABI compatibility with older versions
Verifies that zlib-ng's ABI has not changed since the reference commit
(indicated by variables ABI_URL and ABI_COMMIT in the script).
If --zlib-compat is given, the reference commit is zlib's 1.2.11;
otherwise, it's zlib-ng's 1d2504ddc489 (for now).
Ignores new symbols entirely, as they probably don't break backwards compatibility.
Ignores warnings listed in test/abi/ignore; currently, these are just the ones related to internal_state or z_stream*.
If --refresh is given, actually checks out the reference commit,
builds it, and stores its ABI description into an .abi file;
otherwise just uses the .abi file saved in git for that CHOST.
(--refresh_if is similar, but only does the above if the file
is not present.)
Known issues:
- pkgcheck.yml skips -m32 abicheck failures loudly until #705 is fixed
- although abicheck.sh supports -m32 (if in both CFLAGS and LDFLAGS), it doesn't yet support -32 in CONFIGURE_ARGS.
- only includes a few ABI definitions; the rest can be added in a follow-on commit (see --refresh).
Changes to deflate's internal_state struct members:
- Change window_size from unsigned long to unsigned int
- Change block_start from long to int
- Change high_water from unsigned long to unsigned int
- Reorder to promote cache locality in hot code and decrease holes.
On x86_64 this means the struct goes from:
/* size: 6008, cachelines: 94, members: 57 */
/* sum members: 5984, holes: 6, sum holes: 24 */
/* last cacheline: 56 bytes */
To:
/* size: 5984, cachelines: 94, members: 57 */
/* sum members: 5972, holes: 3, sum holes: 8 */
/* padding: 4 */
/* last cacheline: 32 bytes */
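
The effect of member ordering on holes can be sketched with a toy struct. The two structs below are hypothetical and much smaller than internal_state; exact sizes depend on the ABI, so the sizes mentioned afterwards are an assumption about a typical x86_64 target.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout with small members scattered between larger ones.
 * The compiler has to insert padding so that b and d stay aligned. */
typedef struct {
    uint8_t  a;
    uint64_t b;
    uint8_t  c;
    uint32_t d;
} original_sketch;

/* Same members, ordered from largest to smallest alignment requirement,
 * which leaves no room for holes between them. */
typedef struct {
    uint64_t b;
    uint32_t d;
    uint8_t  a;
    uint8_t  c;
} reordered_sketch;

/* Number of bytes saved by the reordering on the current target. */
static size_t sketch_bytes_saved(void) {
    return sizeof(original_sketch) - sizeof(reordered_sketch);
}
```

On a typical x86_64 ABI the first struct occupies 24 bytes and the second 16, so sketch_bytes_saved would report 8; other targets may differ.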
value, which can be uint64_t, is printed using %llx; strictly speaking
this is not correct, and it triggers -Wformat.
Since we don't really know what type value can have (send_bits_trace
is a macro), don't use <inttypes.h>, but rather cast it to long long.
Also cast length to int in order to prevent similar issues in the
future.
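
A minimal sketch of the casting approach follows. The helper sketch_format_bits and its format string are hypothetical and not the real send_bits_trace macro. The commit message describes a cast to long long; this sketch casts to unsigned long long instead, which is the type %llx formally expects.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format a bit length and a value whose exact
 * integer types are not known at the point of the printf call.
 * Casting to fixed standard types keeps the arguments in agreement
 * with the format string regardless of what the caller passes. */
static int sketch_format_bits(char *buf, size_t buflen,
                              uint64_t value, uint32_t length) {
    return snprintf(buf, buflen, "%d:%llx",
                    (int)length, (unsigned long long)value);
}
```

The same pattern works inside a macro, where the argument types are whatever the call site supplies.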
Some gcc versions complain that parameter c is always less than
MAX_MATCH-MIN_MATCH, which makes the assertion that checks for this
look useless. In reality, MIN_MATCH and MAX_MATCH may change some day.
Increase hash table size from 15 to 16 bits.
This gives a good performance increase, and usually also improves compression.
Make separate define HASH_SLIDE for fallback version of UPDATE_HASH.
Check for match length exceeding lookahead each time a new best match is found. This reduces some code complexity in GOTO_NEXT_CHAIN by removing the need for RETURN_BEST_LEN.
Use unaligned 32-bit and 64-bit compare based on best match length when searching for matches.
Move TRIGGER_LEVEL to match_tpl.h since it is only used in longest match.
Use early return inside match loops instead of cont variable.
Added back two-variable check for platforms that don't support unaligned access.
Fix switching compression levels on older SystemZ machines
When switching to a compression level that is in general supported by
the hardware accelerator, the code doesn't check whether acceleration is
available or enabled.
Ilya Leoshkevich [Fri, 14 Aug 2020 11:50:39 +0000 (13:50 +0200)]
Fix DFLTCC detection
On some Z machines ARCH is determined to be s390, not s390x, which
prevents DFLTCC support from being built. In general, it is safe to
build zlib-ng with DFLTCC support on any SystemZ machine, because its
usage is guarded by STFLE.
Josh Triplett [Sun, 16 Aug 2020 23:12:13 +0000 (16:12 -0700)]
Fix testsuite warnings on Windows, using PRIu64
zlib-ng already counts on inttypes.h and stdint.h, so use those to avoid
printf-related warnings by casting integer fields whose size may vary to
uint64_t and printing them that way.
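
As a rough illustration of this pattern, the sketch below prints a size_t field through PRIu64. The helper name sketch_format_total and the message text are hypothetical and do not come from the test suite.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: size_t has different widths on different
 * platforms (and Windows uses different underlying types than most
 * Unix ABIs), so cast to uint64_t and let PRIu64 supply the matching
 * conversion specifier. */
static int sketch_format_total(char *buf, size_t buflen, size_t total) {
    return snprintf(buf, buflen, "total: %" PRIu64, (uint64_t)total);
}
```

PRIu64 expands to a string literal, so it is concatenated with the rest of the format string at compile time.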
Josh Triplett [Fri, 14 Aug 2020 03:59:17 +0000 (20:59 -0700)]
Optimize inflate_fast for a 0.8% speedup
When inflate_fast checks for extra length bits, it first checks if the
number of extra length bits (in op) is non-zero. However, if the number
of extra length bits is 0, the `bits < op` check will be false, BITS(op)
will be 0, and DROPBITS(op) will do nothing. So, drop the conditional,
for a speedup of about 0.8%.
This makes the handling of extra length bits match the handling of extra
dist bits, which already lacks the extra conditional.
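
The equivalence can be sketched with a simplified bit buffer. The struct and function names below are hypothetical, the refill step guarded by `bits < op` is left out, and the sketch assumes op is at most 15 and that enough bits are already buffered.

```c
#include <stdint.h>

/* Hypothetical stand-in for the hold/bits pair used by inflate_fast. */
typedef struct {
    uint64_t hold;   /* bit accumulator, least significant bit first */
    unsigned bits;   /* number of valid bits in hold */
} sketch_bitbuf;

/* Variant with the conditional: only touch the buffer when op != 0. */
static unsigned sketch_extra_guarded(sketch_bitbuf *b, unsigned op) {
    unsigned extra = 0;
    if (op) {
        extra = (unsigned)(b->hold & ((1U << op) - 1));  /* BITS(op) */
        b->hold >>= op;                                  /* DROPBITS(op) */
        b->bits -= op;
    }
    return extra;
}

/* Variant without the conditional: for op == 0 the mask is 0, the
 * shift is by 0 and the subtraction is of 0, so nothing changes. */
static unsigned sketch_extra_unguarded(sketch_bitbuf *b, unsigned op) {
    unsigned extra = (unsigned)(b->hold & ((1U << op) - 1));
    b->hold >>= op;
    b->bits -= op;
    return extra;
}
```

Both variants return the same value and leave the buffer in the same state for every op in range, which is why the branch can be dropped.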
Fixed warning about conversion from unsigned int to short in slide_hash_sse2 and slide_hash_avx2.
slide_sse.c(20,51): warning C4244: 'function': conversion from 'unsigned int' to 'short', possible loss of data
slide_avx.c(21,54): warning C4244: 'function': conversion from 'unsigned int' to 'short', possible loss of data
Fixed warning about indexing a string literal with GCC on macOS
infcover.c:378:56: warning: adding 'int' to a string does not append to the string [-Wstring-plus-int]
infcover.c:378:56: note: use array indexing to silence this warning
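
For context, the kind of rewrite the compiler note suggests might look like the sketch below. The function name and the literal are hypothetical and are not the code at infcover.c:378.

```c
/* Hypothetical helper returning a suffix of a string literal.
 *
 * Writing the expression as  "abcdef" + n  is valid C, but it looks
 * like an attempted string concatenation, which is what
 * -Wstring-plus-int warns about. Taking the address of an indexed
 * element yields the same pointer and states the intent clearly. */
static const char *sketch_suffix(int n) {
    return &"abcdef"[n];
}
```

Both spellings produce the same pointer; only the second keeps the warning quiet.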
Remove ARM crc instruction set from NEONFLAG. The crc instruction set is not used for NEON source files. It may have been necessary back when NEONFLAG was applied to all source files in CMake. The configure script does not apply the crc instruction set when setting NEON flags.