git.ipfire.org Git - thirdparty/zlib-ng.git/log
4 hours agoFix CI configure workflow failure artifact upload coverage-enablement 2096/head
Hans Kristian Rosbach [Tue, 13 Jan 2026 21:01:44 +0000 (22:01 +0100)] 
Fix CI configure workflow failure artifact upload

4 hours agoExpand configure script testing.
Hans Kristian Rosbach [Sun, 11 Jan 2026 21:50:21 +0000 (22:50 +0100)] 
Expand configure script testing.
- Remove 2 redundant jobs.
- Add 1 new job.
- Test multiple non-conflicting options in more of the jobs.

4 hours agoEnable codecov for more CI jobs.
Hans Kristian Rosbach [Sun, 11 Jan 2026 14:14:11 +0000 (15:14 +0100)] 
Enable codecov for more CI jobs.
Disable codecov where -O1 or higher is requested, since codecov sets -O0.
Disable codecov where tests are not run.
Add comments for jobs where codecov is not enabled.

4 hours agoUse default clang version for most builds.
Hans Kristian Rosbach [Sun, 11 Jan 2026 17:17:26 +0000 (18:17 +0100)] 
Use default clang version for most builds.
Let one job use clang-15, and a few clang-20

4 hours agoCombine ARM CI jobs testing non-NEON with non-ARMv8, as these have no common optimized
Hans Kristian Rosbach [Sun, 11 Jan 2026 22:05:51 +0000 (23:05 +0100)] 
Combine ARM CI jobs testing non-NEON with non-ARMv8, as these have no common optimized
functions. For Aarch64, use no-opt config for testing both without Neon/Armv8.
This reduced cmake and configure jobs by 3 each.
Also reorder and rename a few other jobs to try to use a common style.

4 hours agoRemove separate MMAP CI job by folding into another.
Hans Kristian Rosbach [Sun, 11 Jan 2026 18:03:27 +0000 (19:03 +0100)] 
Remove separate MMAP CI job by folding into another.
Remove separate REDUCED_MEM CI job by folding into another.
Make sure both are present for both GCC and Clang.
Add ZLIB_COMPAT to clang debug job.

4 hours agoFix codecov parameter placement warnings.
Hans Kristian Rosbach [Sun, 11 Jan 2026 18:16:28 +0000 (19:16 +0100)] 
Fix codecov parameter placement warnings.

4 hours agoAdd configured compiler defines to informational output,
Hans Kristian Rosbach [Sun, 11 Jan 2026 13:51:31 +0000 (14:51 +0100)] 
Add configured compiler defines to informational output,
this eases debugging, especially in CI where further inspection is hard.

4 hours agoImprove detection of compiler code coverage support.
Hans Kristian Rosbach [Sun, 11 Jan 2026 13:49:06 +0000 (14:49 +0100)] 
Improve detection of compiler code coverage support.

4 days agoAdd fallback for __has_builtin to prevent unmatched parenthesis warning
Nathan Moinvaziri [Sun, 11 Jan 2026 21:11:08 +0000 (13:11 -0800)] 
Add fallback for __has_builtin to prevent unmatched parenthesis warning

Occurs on MSVC.
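
A minimal sketch of the kind of fallback this commit describes, assuming the common idiom of defining `__has_builtin` to 0 when the compiler does not provide it. Without such a definition, a compiler that lacks `__has_builtin` sees an unknown identifier followed by a parenthesised argument inside `#if`, which is where a parenthesis diagnostic can come from. `EXAMPLE_HAVE_BITREVERSE16` is a hypothetical name used only for illustration.

```c
/* Fallback: compilers without __has_builtin get a macro that always
 * evaluates to 0, so the function-like use below stays well-formed. */
#ifndef __has_builtin
#  define __has_builtin(x) 0
#endif

/* Hypothetical feature macro, not taken from zlib-ng. */
#if __has_builtin(__builtin_bitreverse16)
#  define EXAMPLE_HAVE_BITREVERSE16 1
#else
#  define EXAMPLE_HAVE_BITREVERSE16 0
#endif
```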

4 days agoAdd ARM __builtin_bitreverse16 fallback implementation for GCC.
Nathan Moinvaziri [Tue, 13 Jan 2026 17:01:11 +0000 (09:01 -0800)] 
Add ARM __builtin_bitreverse16 fallback implementation for GCC.

4 days agoRemove compiler check for builtin_bitreverse16 since we check in code
Nathan Moinvaziri [Sun, 11 Jan 2026 00:22:43 +0000 (16:22 -0800)] 
Remove compiler check for builtin_bitreverse16 since we check in code

And we have a generic fallback

4 days ago__builtin_bitreverse16 CMake compiler check fails for GCC 13
Nathan Moinvaziri [Sat, 10 Jan 2026 01:38:55 +0000 (17:38 -0800)] 
__builtin_bitreverse16 CMake compiler check fails for GCC 13

Provide a final check for __builtin_bitreverse16 in code.
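
As an illustration of an in-code check with a generic fallback, the sketch below uses the builtin when the compiler reports it and otherwise reverses the 16 bits with shifts and masks. The helper name `example_bitrev16` is hypothetical, and the fallback shown is a standard bit-swapping technique, not necessarily the one zlib-ng uses.

```c
#include <stdint.h>

#ifndef __has_builtin
#  define __has_builtin(x) 0
#endif

/* Hypothetical helper: reverse the bit order of a 16-bit value. */
static inline uint16_t example_bitrev16(uint16_t v) {
#if __has_builtin(__builtin_bitreverse16)
    return __builtin_bitreverse16(v);
#else
    /* Portable fallback: swap adjacent bits, then pairs, nibbles, bytes. */
    v = (uint16_t)(((v & 0x5555u) << 1) | ((v >> 1) & 0x5555u));
    v = (uint16_t)(((v & 0x3333u) << 2) | ((v >> 2) & 0x3333u));
    v = (uint16_t)(((v & 0x0F0Fu) << 4) | ((v >> 4) & 0x0F0Fu));
    v = (uint16_t)((v << 8) | (v >> 8));
    return v;
#endif
}
```

For example, `example_bitrev16(0x0001)` yields `0x8000`, and applying the function twice returns the original value.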

5 days agoUse GCC's may_alias attribute for access to buffers in crc32_chorba
Vladislav Shchapov [Wed, 7 Jan 2026 19:30:19 +0000 (00:30 +0500)] 
Use GCC's may_alias attribute for access to buffers in crc32_chorba

Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>
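
A minimal sketch of how GCC's `may_alias` type attribute can be applied when reading a byte buffer through a wider type. The typedef and function names are hypothetical and this is not the code from crc32_chorba; the sketch also assumes the caller passes an 8-byte aligned buffer whose length is a multiple of 8.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical typedef: a 64-bit word that is allowed to alias other types.
 * Compilers without the attribute fall back to a plain typedef here. */
#if defined(__GNUC__) || defined(__clang__)
typedef uint64_t __attribute__((__may_alias__)) example_u64_alias;
#else
typedef uint64_t example_u64_alias;
#endif

/* XOR together all 64-bit words in buf.
 * Assumes buf is 8-byte aligned and len is a multiple of 8. */
static uint64_t example_xor_words(const unsigned char *buf, size_t len) {
    const example_u64_alias *words = (const example_u64_alias *)buf;
    uint64_t acc = 0;
    for (size_t i = 0; i < len / 8; i++)
        acc ^= words[i];
    return acc;
}
```
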
6 days agoAdd Z_UNREACHABLE compiler hint
Hans Kristian Rosbach [Tue, 13 Jan 2026 14:32:53 +0000 (15:32 +0100)] 
Add Z_UNREACHABLE compiler hint
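
The commit does not show how the hint is defined. One common way to build such a macro is sketched below under the hypothetical name `EXAMPLE_UNREACHABLE`, using `__builtin_unreachable()` on GCC and Clang, `__assume(0)` on MSVC, and `abort()` elsewhere.

```c
#include <stdlib.h>

/* Hypothetical macro; the real Z_UNREACHABLE definition may differ. */
#if defined(__GNUC__) || defined(__clang__)
#  define EXAMPLE_UNREACHABLE() __builtin_unreachable()
#elif defined(_MSC_VER)
#  define EXAMPLE_UNREACHABLE() __assume(0)
#else
#  define EXAMPLE_UNREACHABLE() abort()
#endif

/* Count the set bits in the low two bits of v. The masked value can only
 * be 0..3, so the default branch is marked as unreachable. */
static int example_two_bit_weight(unsigned v) {
    switch (v & 3u) {
        case 0: return 0;
        case 1: return 1;
        case 2: return 1;
        case 3: return 2;
        default: EXAMPLE_UNREACHABLE();
    }
}
```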

6 days agoFix (impossible) infinite loop in gz_fetch() detected by GCC-14 static analyzer.
Hans Kristian Rosbach [Tue, 13 Jan 2026 13:55:37 +0000 (14:55 +0100)] 
Fix (impossible) infinite loop in gz_fetch() detected by GCC-14 static analyzer.
According to the comment, gz_fetch() also assumes that state->x.have == 0, so
let's add an Assert to that effect.

6 days agoUpdate static analyzer from targeting GCC v10 to v14
Hans Kristian Rosbach [Mon, 12 Jan 2026 19:52:47 +0000 (20:52 +0100)] 
Update static analyzer from targeting GCC v10 to v14

6 days agoFix symbol mangling so symbols in shared library are exported correctly
Mika T. Lindqvist [Mon, 12 Jan 2026 01:42:13 +0000 (03:42 +0200)] 
Fix symbol mangling so symbols in shared library are exported correctly
* We need to mangle symbols in the map file, otherwise none of the symbols are exported
* Fix gz_error name conflict with zlib-ng API

6 days agoRemove extra indirection calling into crc32_z functions.
Nathan Moinvaziri [Mon, 12 Jan 2026 22:57:50 +0000 (14:57 -0800)] 
Remove extra indirection calling into crc32_z functions.

This also prevents the double-checking of buf == NULL.

6 days agoClean up buf == NULL handling on adler32 functions and test strings.
Nathan Moinvaziri [Mon, 12 Jan 2026 19:18:56 +0000 (11:18 -0800)] 
Clean up buf == NULL handling on adler32 functions and test strings.

6 days agoFixed UB in adler32_avx512_copy storemask when len is 0.
Nathan Moinvaziri [Sun, 11 Jan 2026 00:13:54 +0000 (16:13 -0800)] 
Fixed UB in adler32_avx512_copy storemask when len is 0.

6 days agoRename and reorder properties in hash_test.
Nathan Moinvaziri [Thu, 8 Jan 2026 19:02:08 +0000 (11:02 -0800)] 
Rename and reorder properties in hash_test.

6 days agoMerge adler32 and crc32 hash test strings.
Nathan Moinvaziri [Sat, 10 Jan 2026 18:29:05 +0000 (10:29 -0800)] 
Merge adler32 and crc32 hash test strings.

6 days agoAdd adler32_copy unit test
Nathan Moinvaziri [Wed, 7 Jan 2026 08:39:25 +0000 (00:39 -0800)] 
Add adler32_copy unit test

6 days agoSeparate adler32 test strings into their own source header
Nathan Moinvaziri [Wed, 7 Jan 2026 08:34:02 +0000 (00:34 -0800)] 
Separate adler32 test strings into their own source header

8 days agoSimplify the gzread.c name mangling workaround by splitting out just
Hans Kristian Rosbach [Sat, 10 Jan 2026 21:08:13 +0000 (22:08 +0100)] 
Simplify the gzread.c name mangling workaround by splitting out just
the workaround into a separate file. This allows us to browse gzread.c
with code highlighting and it allows codecov to record coverage data.

8 days agoDon't count tests/tools towards overall project coverage.
Hans Kristian Rosbach [Sat, 10 Jan 2026 22:32:49 +0000 (23:32 +0100)] 
Don't count tests/tools towards overall project coverage.
Set project coverage target to 80%.
Loosen project coverage reduction threshold to 10% to avoid failing coverage
tests when CI happens to run on hosts that do not support AVX-512.
Set component coverage reduction thresholds low, except for common and
arch_x86 that need higher limits due to the AVX-512 CI hosts.

9 days agoUpdate to GoogleTest 1.16.0.
Vladislav Shchapov [Fri, 9 Jan 2026 20:02:11 +0000 (01:02 +0500)] 
Update to GoogleTest 1.16.0.
This requires a minimum of CMake 3.13 and C++14, which matches nicely with zlib-ng 2.3.x requirements.

Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>
9 days agoReplace deprecated FetchContent_Populate with FetchContent_MakeAvailable
Vladislav Shchapov [Fri, 9 Jan 2026 19:47:20 +0000 (00:47 +0500)] 
Replace deprecated FetchContent_Populate with FetchContent_MakeAvailable

Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>
9 days agoRemove always TRUE or FALSE CMake version checks
Vladislav Shchapov [Fri, 9 Jan 2026 19:01:03 +0000 (00:01 +0500)] 
Remove always TRUE or FALSE CMake version checks

Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>
9 days agoSet minimum and upper compatible CMake version
Vladislav Shchapov [Fri, 9 Jan 2026 18:55:13 +0000 (23:55 +0500)] 
Set minimum and upper compatible CMake version

Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>
9 days agodeflateinit was still checking for failed secondary allocations, this is
Hans Kristian Rosbach [Sat, 10 Jan 2026 20:31:06 +0000 (21:31 +0100)] 
deflateinit was still checking for failed secondary allocations, this is
no longer necessary as we only allocate a single buffer, which has already
been checked for failure before this.

9 days agoExplicitly define the __SSE__ and __SSE2__ macros, since starting with MSVS 2012...
Vladislav Shchapov [Thu, 8 Jan 2026 19:27:55 +0000 (00:27 +0500)] 
Explicitly define the __SSE__ and __SSE2__ macros, since starting with MSVS 2012 the default instruction set is SSE2

Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>
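
A sketch of how such definitions could look, assuming they are derived from MSVC's own predefined macros `_M_X64` and `_M_IX86_FP` (the latter is 1 for `/arch:SSE` and 2 for `/arch:SSE2` or higher). The guard conditions and the helper `example_has_sse2` are assumptions for illustration, not the commit's actual code.

```c
/* Sketch only: map MSVC's macros onto the GCC/Clang-style names. */
#if defined(_MSC_VER) && !defined(__clang__)
#  if defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#    ifndef __SSE__
#      define __SSE__ 1
#    endif
#  endif
#  if defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#    ifndef __SSE2__
#      define __SSE2__ 1
#    endif
#  endif
#endif

/* Hypothetical helper: later code can test one macro on every compiler. */
static int example_has_sse2(void) {
#ifdef __SSE2__
    return 1;
#else
    return 0;
#endif
}
```
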
10 days agoCleanup preprocessor indents in fallback_builtins.
Nathan Moinvaziri [Thu, 8 Jan 2026 19:13:00 +0000 (11:13 -0800)] 
Cleanup preprocessor indents in fallback_builtins.

10 days agoAdd missing compiler preprocessor defines for 32-bit architectures
Nathan Moinvaziri [Thu, 8 Jan 2026 00:47:06 +0000 (16:47 -0800)] 
Add missing compiler preprocessor defines for 32-bit architectures

10 days agoAdd ARCH defines to code to make the ifdef logic easier
Nathan Moinvaziri [Thu, 8 Jan 2026 00:50:11 +0000 (16:50 -0800)] 
Add ARCH defines to code to make the ifdef logic easier

10 days agoAdd ARCH_32BIT and ARCH_64BIT defines for better code clarity
Nathan Moinvaziri [Thu, 8 Jan 2026 00:47:54 +0000 (16:47 -0800)] 
Add ARCH_32BIT and ARCH_64BIT defines for better code clarity

10 days agoIgnore benchmarks in codecov coverage reports.
Hans Kristian Rosbach [Sat, 10 Jan 2026 12:54:23 +0000 (13:54 +0100)] 
Ignore benchmarks in codecov coverage reports.
We already avoid collecting coverage when running benchmarks because the
benchmarks do not perform most error checking, thus even though they might
increase code coverage, they won't detect most bugs unless it actually
crashes the whole benchmark.

10 days agoAdd missing resets of compiler flags after completing each test,
Hans Kristian Rosbach [Fri, 9 Jan 2026 14:17:58 +0000 (15:17 +0100)] 
Add missing resets of compiler flags after completing each test,
avoids the next test inheriting the previous flags.

10 days agoAdded separate components.
Hans Kristian Rosbach [Fri, 9 Jan 2026 19:58:17 +0000 (20:58 +0100)] 
Added separate components.
Wait for CI completion before posting status report; this avoids emailing an initial report with very low coverage based on pigz tests only.
Make report informational, low coverage will not be a CI failure.
Disable Github Annotations, these are deprecated due to API limits.

10 days agoDisable downloading extra test corpora for WITH_SANITIZER builds,
Hans Kristian Rosbach [Fri, 9 Jan 2026 20:45:05 +0000 (21:45 +0100)] 
Disable downloading extra test corpora for WITH_SANITIZER builds,
those tests are much too slow, upwards of 1 hour or more.

10 days agoResolve merge conflicts in coverage data, instead of aborting.
Hans Kristian Rosbach [Fri, 9 Jan 2026 15:08:38 +0000 (16:08 +0100)] 
Resolve merge conflicts in coverage data, instead of aborting.

11 days agoFix possible loss of data warning in benchmark_inflate on MSVC 2026
Nathan Moinvaziri [Thu, 8 Jan 2026 18:31:35 +0000 (10:31 -0800)] 
Fix possible loss of data warning in benchmark_inflate on MSVC 2026

benchmark_inflate.cc(131,51): warning C4267: '=': conversion from 'size_t' to 'uint32_t', possible loss of data
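
The commit does not show how the warning was resolved. One generic way to make a `size_t` to `uint32_t` narrowing explicit is sketched below with a hypothetical helper that clamps before casting; a plain cast is also common when the value is known to fit.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: narrow a size_t to uint32_t without an implicit
 * conversion, clamping values that do not fit. */
static uint32_t example_size_to_u32(size_t n) {
    return (n > UINT32_MAX) ? UINT32_MAX : (uint32_t)n;
}
```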

13 days agoFix warning: 'sprintf' is deprecated
Vladislav Shchapov [Wed, 31 Dec 2025 10:57:08 +0000 (15:57 +0500)] 
Fix warning: 'sprintf' is deprecated

Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>
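
A typical way to address this kind of deprecation warning is to switch to the bounded `snprintf`. The sketch below assumes that approach; the function name and format string are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format "level-<n>" into out.
 * Returns 0 on success, -1 if the output was truncated or an error occurred. */
static int example_format_label(char *out, size_t out_size, int level) {
    int n = snprintf(out, out_size, "level-%d", level);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```
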
13 days agoRebalance benchmark_compress size ranges
Hans Kristian Rosbach [Sun, 28 Dec 2025 18:05:31 +0000 (19:05 +0100)] 
Rebalance benchmark_compress size ranges

13 days agoImprove benchmark_compress and benchmark_uncompress.
Hans Kristian Rosbach [Sat, 27 Dec 2025 21:53:46 +0000 (22:53 +0100)] 
Improve benchmark_compress and benchmark_uncompress.
- These now use the same generated data as benchmark_inflate.
- benchmark_uncompress now also uses level 9 for compression, so that
  we also get 3-byte matches to uncompress.
- Improve error checking
- Unify code with benchmark_inflate

13 days agoAdd new benchmark inflate_nocrc. This lets us benchmark just the
Hans Kristian Rosbach [Sat, 27 Dec 2025 21:51:22 +0000 (22:51 +0100)] 
Add new benchmark inflate_nocrc. This lets us benchmark just the
inflate process more accurately. Also adds a new shared function for
generating highly compressible data that avoids very long matches.

13 days agoUse Z_FORCEINLINE for all adler32 or crc32 implementation functions
Nathan Moinvaziri [Thu, 1 Jan 2026 03:50:10 +0000 (19:50 -0800)] 
Use Z_FORCEINLINE for all adler32 or crc32 implementation functions

13 days agoSimplify crc32 pre/post conditioning for consistency
Nathan Moinvaziri [Sun, 4 Jan 2026 07:54:18 +0000 (23:54 -0800)] 
Simplify crc32 pre/post conditioning for consistency

13 days agoSimplify alignment checks in crc32_loongarch64
Nathan Moinvaziri [Sun, 4 Jan 2026 07:22:39 +0000 (23:22 -0800)] 
Simplify alignment checks in crc32_loongarch64

13 days agoSimplify alignment checks in crc32_armv8_pmull_eor3
Nathan Moinvaziri [Sun, 4 Jan 2026 07:09:13 +0000 (23:09 -0800)] 
Simplify alignment checks in crc32_armv8_pmull_eor3

13 days agoSimplify alignment checks in crc32_armv8
Nathan Moinvaziri [Sun, 4 Jan 2026 07:09:25 +0000 (23:09 -0800)] 
Simplify alignment checks in crc32_armv8

13 days agoRemove unnecessary buf variables in crc32_armv8.
Nathan Moinvaziri [Sun, 4 Jan 2026 04:46:57 +0000 (20:46 -0800)] 
Remove unnecessary buf variables in crc32_armv8.

13 days agoRemove unnecessary buf variables in crc32_loongarch64.
Nathan Moinvaziri [Sun, 4 Jan 2026 04:46:57 +0000 (20:46 -0800)] 
Remove unnecessary buf variables in crc32_loongarch64.

13 days agoAdd ALIGN_DIFF to perform alignment needed to next boundary
Nathan Moinvaziri [Sun, 4 Jan 2026 07:52:27 +0000 (23:52 -0800)] 
Add ALIGN_DIFF to perform alignment needed to next boundary
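
The macro body is not shown in this log. A sketch of one way to compute the number of bytes from a pointer to the next boundary follows, assuming the alignment is a power of two; `EXAMPLE_ALIGN_DIFF` is a hypothetical stand-in and may differ from zlib-ng's definition.

```c
#include <stdint.h>

/* Hypothetical macro: bytes to advance ptr to reach the next multiple of
 * align (0 if already aligned). align must be a power of two. */
#define EXAMPLE_ALIGN_DIFF(ptr, align) \
    (((uintptr_t)(align) - ((uintptr_t)(ptr) & ((uintptr_t)(align) - 1))) \
     & ((uintptr_t)(align) - 1))
```

With a 16-byte boundary, an address ending in 0x3 gives 13 and an already aligned address gives 0.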

13 days agoConsume bits before branches in inflate_fast.
Dougall Johnson [Sun, 28 Dec 2025 23:41:02 +0000 (15:41 -0800)] 
Consume bits before branches in inflate_fast.

13 days agoUnroll some of the adler checksum for LASX
Vladislav Shchapov [Sat, 27 Dec 2025 19:58:55 +0000 (00:58 +0500)] 
Unroll some of the adler checksum for LASX

Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>
2 weeks ago[CI] Add workflow with no AVX512VNNI
Mika Lindqvist [Mon, 5 Jan 2026 00:08:42 +0000 (02:08 +0200)] 
[CI] Add workflow with no AVX512VNNI
* This adds coverage with optimizations that have versions for both AVX512 and AVX512VNNI

2 weeks agoUse bitrev instruction on LoongArch.
Vladislav Shchapov [Sat, 20 Dec 2025 14:06:59 +0000 (19:06 +0500)] 
Use bitrev instruction on LoongArch.

Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>
2 weeks agoBump actions/upload-artifact from 5 to 6
dependabot[bot] [Thu, 1 Jan 2026 07:04:31 +0000 (07:04 +0000)] 
Bump actions/upload-artifact from 5 to 6

Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 5 to 6.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](https://github.com/actions/upload-artifact/compare/v5...v6)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2 weeks agoBump actions/download-artifact from 6 to 7
dependabot[bot] [Thu, 1 Jan 2026 07:04:21 +0000 (07:04 +0000)] 
Bump actions/download-artifact from 6 to 7

Bumps [actions/download-artifact](https://github.com/actions/download-artifact) from 6 to 7.
- [Release notes](https://github.com/actions/download-artifact/releases)
- [Commits](https://github.com/actions/download-artifact/compare/v6...v7)

---
updated-dependencies:
- dependency-name: actions/download-artifact
  dependency-version: '7'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
3 weeks agoCheck CPU info for fast PMULL support.
Nathan Moinvaziri [Mon, 8 Dec 2025 02:44:30 +0000 (18:44 -0800)] 
Check CPU info for fast PMULL support.

armv8_pmull_eor3 is beneficial only if the CPU has multiple PMULL
execution units.

Co-authored-by: Adam Stylinski <kungfujesus06@gmail.com>
3 weeks agoIntegrate ARMv8 PMULL+EOR3 crc32 algorithm from Peter Cawley
Nathan Moinvaziri [Sun, 28 Dec 2025 22:47:44 +0000 (14:47 -0800)] 
Integrate ARMv8 PMULL+EOR3 crc32 algorithm from Peter Cawley

https://github.com/corsix/fast-crc32
https://github.com/zlib-ng/zlib-ng/pull/2023#discussion_r2573303259

Co-authored-by: Peter Cawley <corsix@corsix.org>
3 weeks agoLoongArch64 and e2k have 8-byte general-purpose registers.
Vladislav Shchapov [Thu, 25 Dec 2025 09:40:17 +0000 (14:40 +0500)] 
LoongArch64 and e2k have 8-byte general-purpose registers.

Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>
3 weeks agoSimplify LoongArch64 assembler. GCC 16, LLVM 22 have LASX and LSX conversion intrinsics.
Vladislav Shchapov [Sat, 20 Dec 2025 22:38:50 +0000 (03:38 +0500)] 
Simplify LoongArch64 assembler. GCC 16, LLVM 22 have LASX and LSX conversion intrinsics.

Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>
3 weeks agoImprove LoongArch64 toolchain file.
Vladislav Shchapov [Sat, 20 Dec 2025 20:30:38 +0000 (01:30 +0500)] 
Improve LoongArch64 toolchain file.

Use COMPILER_SUFFIX variable to set gcc name suffix.

Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>
3 weeks agoForce purely aligned loads in inflate_table code length counting
Adam Stylinski [Fri, 12 Dec 2025 21:23:27 +0000 (16:23 -0500)] 
Force purely aligned loads in inflate_table code length counting

At the expense of some extra stack space and eating about 4 more cache
lines, let's make these loads purely aligned. On potato CPUs such as the
Core 2, unaligned loads in a loop are not ideal. Additionally some SBC
based ARM chips (usually the little in big.little variants) suffer a
penalty for unaligned loads. This also paves the way for a trivial
altivec implementation, for which unaligned loads don't exist and need
to be synthesized with permutation vectors.

3 weeks agoOptimize code length counting in inflate_table using intrinsics.
Dougall Johnson [Wed, 10 Dec 2025 03:06:06 +0000 (19:06 -0800)] 
Optimize code length counting in inflate_table using intrinsics.

https://github.com/dougallj/zlib-dougallj/commit/f23fa25aa168ef782bab5e7cd6f9df50d7bb5eb2
https://godbolt.org/z/fojxrEo4T

Co-authored-by: Nathan Moinvaziri <nathan@nathanm.com>
3 weeks agoAdd missing adler32_copy_power8 implementation
Nathan Moinvaziri [Fri, 26 Dec 2025 16:50:44 +0000 (08:50 -0800)] 
Add missing adler32_copy_power8 implementation

3 weeks agoAdd missing adler32_copy_ssse3 implementation
Nathan Moinvaziri [Thu, 18 Dec 2025 00:35:18 +0000 (16:35 -0800)] 
Add missing adler32_copy_ssse3 implementation

3 weeks agoAdd missing adler32_copy_vmx implementation
Nathan Moinvaziri [Fri, 26 Dec 2025 16:56:41 +0000 (08:56 -0800)] 
Add missing adler32_copy_vmx implementation

3 weeks agoAdd comment to adler32_copy_avx512_vnni about lower vector width usage
Nathan Moinvaziri [Thu, 18 Dec 2025 00:12:30 +0000 (16:12 -0800)] 
Add comment to adler32_copy_avx512_vnni about lower vector width usage

3 weeks agoAdd static inline/Z_FORCEINLINE to crc32_(v)pclmulqdq functions.
Nathan Moinvaziri [Fri, 26 Dec 2025 16:39:04 +0000 (08:39 -0800)] 
Add static inline/Z_FORCEINLINE to crc32_(v)pclmulqdq functions.

3 weeks agoUse tail optimization in final barrett reduction
Nathan Moinvaziri [Fri, 26 Dec 2025 08:30:58 +0000 (00:30 -0800)] 
Use tail optimization in final barrett reduction

Fold 4x128-bit into a single 128-bit value using k1/k2 constants, then reduce
128-bits to 32-bits.

https://www.corsix.org/content/alternative-exposition-crc32_4k_pclmulqdq

3 weeks agoMove COPY out of fold_16 inline with other fold_# functions.
Nathan Moinvaziri [Fri, 26 Dec 2025 08:15:20 +0000 (00:15 -0800)] 
Move COPY out of fold_16 inline with other fold_# functions.

3 weeks agoMove fold calls closer to last change in xmm_crc# variables.
Nathan Moinvaziri [Fri, 26 Dec 2025 07:47:14 +0000 (23:47 -0800)] 
Move fold calls closer to last change in xmm_crc# variables.

3 weeks agoHandle initial crc only at the beginning of crc32_(v)pclmulqdq
Nathan Moinvaziri [Fri, 26 Dec 2025 07:14:21 +0000 (23:14 -0800)] 
Handle initial crc only at the beginning of crc32_(v)pclmulqdq

3 weeks agoFix initial crc value loading in crc32_(v)pclmulqdq
Nathan Moinvaziri [Sun, 14 Dec 2025 08:57:37 +0000 (00:57 -0800)] 
Fix initial crc value loading in crc32_(v)pclmulqdq

In main function, alignment diff processing was getting in the way of XORing
the initial CRC, because it does not guarantee at least 16 bytes have been
loaded.

In fold_16, src data modified by initial crc XORing before being stored to dst.

3 weeks agoRename crc32_fold_pclmulqdq_tpl.h to crc32_pclmulqdq_tpl.h
Nathan Moinvaziri [Thu, 11 Dec 2025 07:21:47 +0000 (23:21 -0800)] 
Rename crc32_fold_pclmulqdq_tpl.h to crc32_pclmulqdq_tpl.h

3 weeks agoMerged crc32_fold functions save, load, reset
Nathan Moinvaziri [Thu, 11 Dec 2025 06:59:50 +0000 (22:59 -0800)] 
Merged crc32_fold functions save, load, reset

3 weeks agoMove crc32_fold_s struct into x86 implementation.
Nathan Moinvaziri [Sun, 14 Dec 2025 18:32:02 +0000 (10:32 -0800)] 
Move crc32_fold_s struct into x86 implementation.

3 weeks agoUpdate crc32_fold test and benchmarks for crc32_copy
Nathan Moinvaziri [Fri, 19 Dec 2025 00:37:34 +0000 (16:37 -0800)] 
Update crc32_fold test and benchmarks for crc32_copy

3 weeks agoRefactor crc32_fold functions into single crc32_copy
Nathan Moinvaziri [Fri, 19 Dec 2025 00:17:18 +0000 (16:17 -0800)] 
Refactor crc32_fold functions into single crc32_copy

3 weeks agoRemove redundant instructions in 256 bit wide chunkset on LoongArch64
Vladislav Shchapov [Sat, 27 Dec 2025 10:58:03 +0000 (15:58 +0500)] 
Remove redundant instructions in 256 bit wide chunkset on LoongArch64

Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>
3 weeks agoSmall optimization in 256 bit wide chunkset
Adam Stylinski [Tue, 23 Dec 2025 23:58:10 +0000 (18:58 -0500)] 
Small optimization in 256 bit wide chunkset

It turns out Intel only parses the bottom 4 bits of the shuffle vector.
This makes it already a sufficient permutation vector and saves us a
small bit of latency.

3 weeks agoUse different bit accumulator type for x86 compiler optimization
Nathan Moinvaziri [Sat, 13 Dec 2025 01:50:15 +0000 (17:50 -0800)] 
Use different bit accumulator type for x86 compiler optimization

3 weeks agoFix bits var warning conversion from unsigned int to uint8_t in MSVC
Nathan Moinvaziri [Wed, 10 Dec 2025 21:34:31 +0000 (13:34 -0800)] 
Fix bits var warning conversion from unsigned int to uint8_t in MSVC

3 weeks agoChange code table access from pointer to value in inflate_fast.
Dougall Johnson [Wed, 3 Dec 2025 07:44:56 +0000 (23:44 -0800)] 
Change code table access from pointer to value in inflate_fast.

+r doesn't appear to work on MIPS or RISC-V architectures

Co-authored-by: Nathan Moinvaziri <nathan@nathanm.com>

3 weeks agoApply consistent use of UNLIKELY across adler32 variants
Nathan Moinvaziri [Thu, 18 Dec 2025 00:05:55 +0000 (16:05 -0800)] 
Apply consistent use of UNLIKELY across adler32 variants

3 weeks agoClean up adler32 short length functions
Nathan Moinvaziri [Wed, 17 Dec 2025 02:00:11 +0000 (18:00 -0800)] 
Clean up adler32 short length functions

4 weeks agoImprove cmake/detect-arch.cmake to also provide bitness.
Hans Kristian Rosbach [Fri, 5 Dec 2025 19:04:14 +0000 (20:04 +0100)] 
Improve cmake/detect-arch.cmake to also provide bitness.
Rewrite checks in CMakeLists.txt and cmake/detect-intrinsics.cmake
to utilize the new variables.

4 weeks agoReorder deflate.h variables to improve cache locality
Hans Kristian Rosbach [Wed, 10 Dec 2025 19:27:46 +0000 (20:27 +0100)] 
Reorder deflate.h variables to improve cache locality

4 weeks agoUse uint32_t for hash_head in update_hash/insert_string
Hans Kristian Rosbach [Thu, 11 Dec 2025 19:34:05 +0000 (20:34 +0100)] 
Use uint32_t for hash_head in update_hash/insert_string

4 weeks agoUse uint32_t for Pos in match_tpl.h
Hans Kristian Rosbach [Thu, 11 Dec 2025 16:24:59 +0000 (17:24 +0100)] 
Use uint32_t for Pos in match_tpl.h

4 weeks ago- Reorder variables in longest_match, reducing gaps.
Hans Kristian Rosbach [Mon, 8 Dec 2025 13:30:05 +0000 (14:30 +0100)] 
- Reorder variables in longest_match, reducing gaps.
- Make window-based pointers in match_tpl.h const, only the
  pointers move, never the data.

4 weeks agoUse pointer arithmetic to access window in deflate_quick/deflate_fast
Hans Kristian Rosbach [Mon, 8 Dec 2025 13:30:05 +0000 (14:30 +0100)] 
Use pointer arithmetic to access window in deflate_quick/deflate_fast

4 weeks ago- Add local window pointer to:
Hans Kristian Rosbach [Mon, 8 Dec 2025 12:18:24 +0000 (13:18 +0100)] 
- Add local window pointer to:
  deflate_quick, deflate_fast, deflate_medium and fill_window.
- Add local strm pointer in fill_window.
- Fix missed change to use local lookahead variable in match_tpl

4 weeks agoDeflate_state changes:
Hans Kristian Rosbach [Mon, 8 Dec 2025 12:09:42 +0000 (13:09 +0100)] 
Deflate_state changes:
- Reduce opt_len/static_len sizes.
- Move matches/insert closer to their related variables.
  These now fill an 8-byte hole in the struct on 64-bit platforms.
- Exclude compressed_len and bits_sent if ZLIB_DEBUG is
  not enabled. Also move them to the end.
- Remove x86 MSVC-specific padding
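
To illustrate why member order affects struct size, the sketch below compares two hypothetical layouts (not the real deflate_state). On typical 64-bit ABIs the first is padded on both sides of the 8-byte member, while the second packs the small members into a single padded slot.

```c
#include <stdint.h>

/* Hypothetical layouts for illustration only. */
struct example_loose {
    uint8_t  flag_a;  /* followed by padding up to the alignment of total */
    uint64_t total;
    uint8_t  flag_b;  /* followed by tail padding */
};

struct example_tight {
    uint64_t total;
    uint8_t  flag_a;  /* both small members share one padded slot */
    uint8_t  flag_b;
};
```

Comparing `sizeof(struct example_loose)` with `sizeof(struct example_tight)` on the target platform shows how much padding the reordering removes.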

4 weeks ago- Minor inlining changes in trees_emit.h:
Hans Kristian Rosbach [Mon, 8 Dec 2025 12:03:33 +0000 (13:03 +0100)] 
- Minor inlining changes in trees_emit.h:
  - Inline the small bi_windup function
  - Don't attempt inlining for the big zng_emit_dist
- Don't check for too long match in deflate_quick, it cannot happen.
- Move GOTO_NEXT_CHAIN macro outside of LONGEST_MATCH function to
  improve readability.

4 weeks agoFix warnings: unused parameter state, comparison of integer expressions of different...
Vladislav Shchapov [Sat, 20 Dec 2025 14:31:01 +0000 (19:31 +0500)] 
Fix warnings: unused parameter state, comparison of integer expressions of different signedness: size_t and int64_t.

Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>