Ignore benchmarks in codecov coverage reports.
We already avoid collecting coverage when running benchmarks because the
benchmarks skip most error checking; even though they might increase code
coverage, they won't detect most bugs unless one actually crashes the whole
benchmark.
Added separate codecov components.
Wait for CI completion before posting the status report; this avoids emailing an initial report with very low coverage based on the pigz tests only.
Make the report informational; low coverage will not be a CI failure.
Disable GitHub Annotations; these are deprecated due to API limits.
Improve benchmark_compress and benchmark_uncompress.
- These now use the same generated data as benchmark_inflate.
- benchmark_uncompress now also uses level 9 for compression, so that
we also get 3-byte matches to uncompress.
- Improve error checking.
- Unify code with benchmark_inflate.
Add new benchmark inflate_nocrc. This lets us benchmark just the
inflate process more accurately. Also adds a new shared function for
generating highly compressible data that avoids very long matches.
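As a rough sketch of one such generator (the helper name and constants here are assumptions, not the shared function actually added):

    #include <stdint.h>
    #include <stddef.h>

    /* Fill buf with a short repeating alphabet, which compresses very well,
     * but drop in a pseudo-random byte every 32 positions so LZ77 matches
     * cannot grow into the hundreds of bytes. */
    static void gen_compressible_data(uint8_t *buf, size_t len) {
        for (size_t i = 0; i < len; i++) {
            buf[i] = (uint8_t)('A' + (i % 8));
            if (i % 32 == 31)
                buf[i] = (uint8_t)((i * 2654435761u) >> 24);  /* match breaker */
        }
    }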
Adam Stylinski [Fri, 12 Dec 2025 21:23:27 +0000 (16:23 -0500)]
Force purely aligned loads in inflate_table code length counting
At the expense of some extra stack space and eating about 4 more cache
lines, let's make these loads purely aligned. On potato CPUs such as the
Core 2, unaligned loads in a loop are not ideal. Additionally, some SBC-based
ARM chips (usually the little cores in big.LITTLE designs) suffer a
penalty for unaligned loads. This also paves the way for a trivial
AltiVec implementation, where unaligned loads don't exist and need
to be synthesized with permutation vectors.
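A rough sketch of the idea (buffer size, alignment, and names are assumptions, not the actual inflate_table code):

    #include <stdint.h>
    #include <string.h>

    /* Copy the code lengths into an over-aligned stack buffer so the counting
     * loop only ever issues aligned loads, at the cost of a few extra cache
     * lines of stack.  n is at most 288 + 32 here. */
    static void count_code_lengths(const uint16_t *lens, unsigned n,
                                   uint16_t count[16]) {
        _Alignas(64) uint16_t aligned_lens[288 + 32];

        memcpy(aligned_lens, lens, n * sizeof(uint16_t));
        memset(count, 0, 16 * sizeof(uint16_t));

        /* The real code does the counting with SIMD compares; reading from the
         * aligned copy lets those vector loads be purely aligned. */
        for (unsigned i = 0; i < n; i++)
            count[aligned_lens[i]]++;
    }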
Fix initial crc value loading in crc32_(v)pclmulqdq
In the main function, the alignment-diff processing was getting in the way of
XORing in the initial CRC, because it does not guarantee that at least 16 bytes
have been loaded.
In fold_16, the src data was modified by the initial-CRC XOR before being stored to dst.
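A sketch of the intended order of operations for the non-copying path (not the actual crc32_pclmulqdq code); the copying fold_16 must additionally store the original src bytes to dst, not the CRC-seeded block:

    #include <emmintrin.h>  /* SSE2 intrinsics */
    #include <stdint.h>

    /* Only XOR the initial CRC into the data stream once a full 16-byte block
     * has actually been loaded, and do it before any folding uses that block. */
    static __m128i seed_first_block(const uint8_t *src, uint32_t init_crc) {
        __m128i block = _mm_loadu_si128((const __m128i *)src);  /* full 16 bytes */
        __m128i crc   = _mm_cvtsi32_si128((int)init_crc);       /* crc in low 32 bits */
        return _mm_xor_si128(block, crc);                       /* seed before folding */
    }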
Adam Stylinski [Tue, 23 Dec 2025 23:58:10 +0000 (18:58 -0500)]
Small optimization in the 256-bit wide chunkset
It turns out Intel only parses the bottom 4 bits of each byte in the shuffle
vector. This makes it already a sufficient permutation vector and saves us a
small bit of latency.
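A standalone demonstration of the property relied on here (not the chunkset code itself): vpshufb ignores bits 4..6 of each control byte, so control values 16..31 shuffle exactly like 0..15 within a 128-bit lane. Build with -mavx2:

    #include <immintrin.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    int main(void) {
        uint8_t src[32], ctl_lo[32], ctl_hi[32], a[32], b[32];
        for (int i = 0; i < 32; i++) {
            src[i]    = (uint8_t)(i * 3 + 1);
            ctl_lo[i] = (uint8_t)(i & 15);         /* indices 0..15        */
            ctl_hi[i] = (uint8_t)((i & 15) + 16);  /* same low 4 bits, +16 */
        }
        __m256i v = _mm256_loadu_si256((const __m256i *)src);
        _mm256_storeu_si256((__m256i *)a,
            _mm256_shuffle_epi8(v, _mm256_loadu_si256((const __m256i *)ctl_lo)));
        _mm256_storeu_si256((__m256i *)b,
            _mm256_shuffle_epi8(v, _mm256_loadu_si256((const __m256i *)ctl_hi)));
        printf("identical: %d\n", memcmp(a, b, 32) == 0);  /* prints 1 */
        return 0;
    }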
Improve cmake/detect-arch.cmake to also provide bitness.
Rewrite checks in CMakeLists.txt and cmake/detect-intrinsics.cmake
to utilize the new variables.
- Add a local window pointer to deflate_quick, deflate_fast, deflate_medium
  and fill_window (see the sketch below).
- Add a local strm pointer in fill_window.
- Fix a missed change to use the local lookahead variable in match_tpl.
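A minimal sketch of the local-pointer pattern, using a reduced stand-in struct rather than the real deflate_state:

    /* Sketch only: the struct is reduced to the fields needed to show the
     * pattern, and the loop body stands in for the real match-search code. */
    typedef struct {
        unsigned char *window;
        unsigned int   strstart;
        unsigned int   lookahead;
        unsigned int   hash;
    } state_sketch;

    static void scan_block(state_sketch *s) {
        unsigned char *window   = s->window;  /* loaded once, then kept in a register */
        unsigned int   strstart = s->strstart;

        while (s->lookahead > 0) {
            /* Reads go through the local pointer, so the compiler does not have
             * to reload s->window after every store through s. */
            s->hash = (s->hash << 5) ^ window[strstart];
            strstart++;
            s->lookahead--;
        }
        s->strstart = strstart;               /* write back once at the end */
    }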
Deflate_state changes:
- Reduce opt_len/static_len sizes.
- Move matches/insert closer to their related variables.
  These now fill an 8-byte hole in the struct on 64-bit platforms (see the
  sketch after this list).
- Exclude compressed_len and bits_sent if ZLIB_DEBUG is not enabled.
  Also move them to the end.
- Remove x86 MSVC-specific padding.
- Minor inlining changes in trees_emit.h:
  - Inline the small bi_windup function.
  - Don't attempt inlining for the big zng_emit_dist.
- Don't check for a too-long match in deflate_quick; it cannot happen.
- Move the GOTO_NEXT_CHAIN macro outside of the LONGEST_MATCH function to
  improve readability.
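A generic illustration of the 8-byte-hole point above; the layouts are assumptions for demonstration, not the real deflate_state:

    #include <stdint.h>
    #include <stdio.h>

    /* On a typical LP64 ABI, 'holey' is 24 bytes: the lone uint32_t before the
     * pointer leaves 4 bytes of padding, and 4 bytes of tail padding follow the
     * last field.  Grouping the two 4-byte fields lets them share one 8-byte
     * slot, shrinking the struct to 16 bytes. */
    struct holey {
        uint32_t matches;   /* 4 bytes + 4 bytes of padding */
        void    *head;      /* 8 bytes, 8-byte aligned      */
        uint32_t insert;    /* 4 bytes + 4 bytes tail pad   */
    };

    struct packed_better {
        uint32_t matches;   /* the two 4-byte fields now    */
        uint32_t insert;    /* fill a single 8-byte slot    */
        void    *head;
    };

    int main(void) {
        printf("%zu %zu\n", sizeof(struct holey), sizeof(struct packed_better));
        return 0;           /* prints "24 16" on LP64 targets */
    }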
Dougall Johnson [Mon, 8 Dec 2025 04:11:52 +0000 (20:11 -0800)]
Reorder code struct fields for better access patterns
Place the bits field before the op field in the code struct to optimize memory
access. The bits field is accessed first in the hot path, so placing
it at offset 0 may improve code generation on some architectures.
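A sketch of the reordered struct; the field set follows zlib's classic code struct (op/bits/val), though zlib-ng's exact comments and any further differences are not reproduced here:

    typedef struct {
        unsigned char  bits;   /* bits in this part of the code -- now at offset 0 */
        unsigned char  op;     /* operation, extra bits, table bits                */
        unsigned short val;    /* offset in table or code value                    */
    } code;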
[configure] Fix detecting -fno-lto support
* Previously, -fno-lto was assumed to be supported on non-GCC-compatible or unsupported compilers,
  even though support was never tested in those cases. Set the default to not supported.
Inline all uses of quick_insert_string*/quick_insert_value*.
Inline all uses of update_hash*.
Inline insert_string into deflate_quick, deflate_fast and deflate_medium.
Remove insert_string from deflate_state.
Use a local function pointer for insert_string (see the sketch below).
Fix the level check to actually check the level and not `s->max_chain_length <= 1024`.
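A minimal sketch of the local-function-pointer pattern; the callback type, names, and signature are illustrative, not zlib-ng's actual insert_string interface:

    /* Holding the dispatch target in a local means the indirect call target is
     * resolved once per block, rather than reloaded from a struct field inside
     * the hot loop. */
    typedef void (*insert_string_cb)(void *state, unsigned pos, unsigned count);

    static void deflate_body(void *state, unsigned start, unsigned end,
                             insert_string_cb insert_string) {
        for (unsigned pos = start; pos < end; pos++)
            insert_string(state, pos, 1);  /* local pointer, not a struct member */
    }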
There are no folding techniques in the adler32 implementations; they simply
hash while copying.
- Rename adler32_fold_copy to adler32_copy.
- Remove the unnecessary adler32_fold.c file.
- Reorder the adler32_copy functions to be last in the source file for consistency.
- Rename adler32_rvv_impl to adler32_copy_impl for consistency.
- Replace `dst != NULL` with 1 in adler32_copy_neon to remove branching.
Adam Stylinski [Fri, 21 Nov 2025 15:02:14 +0000 (10:02 -0500)]
Conditionally shortcut via the chorba polynomial based on compile flags
As it turns out, the copying CRC32 variant _is_ slower when compiled
with generic flags. The reason for this is mainly extra stack spills and
the lack of operations we can overlap with the moves. However, when
compiling for an architecture with more registers, such as AVX-512, we no
longer have to eat all these costly stack spills and we can overlap with
a 3-operand XOR. Conditionally guarding this means that if a Linux
distribution wants to compile with -march=x86-64-v4 they get all the
upsides of this.
This code notably is not actually used if you happen to have something
that supports 512-bit wide CLMUL, so this does help a somewhat narrow
range of targets (most of the earlier AVX-512 implementations, pre-Ice
Lake).
We also must guard with AVX512VL, as just specifying AVX512F makes GCC
generate vpternlog instructions of 512-bit width only, so a bunch of
packing and unpacking between 512-bit and 256-bit registers has to
occur, absolutely killing runtime. It's only with AVX512VL that a
128-bit wide vpternlog is available.
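A sketch of the kind of compile-time guard described; the macro name is illustrative, while __AVX512F__/__AVX512VL__ are the standard GCC/Clang predefined macros:

    #if defined(__AVX512F__) && defined(__AVX512VL__)
       /* Built with e.g. -march=x86-64-v4: enough registers to hide the copy,
        * and 128/256-bit vpternlog is available, so take the chorba shortcut. */
    #  define USE_CHORBA_SHORTCUT 1
    #else
       /* Generic builds: the copying CRC32 variant eats stack spills here,
        * so keep the plain path. */
    #  define USE_CHORBA_SHORTCUT 0
    #endif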
Adam Stylinski [Fri, 21 Nov 2025 14:45:48 +0000 (09:45 -0500)]
Use aligned loads in the chorba portions of the clmul crc routines
We already go through the trouble of aligning the data, so we may as well let
the compiler know the loads are definitely aligned. We can't guarantee an
aligned store, but at least with an aligned load the compiler can fold the load
into a subsequent XOR or carry-less multiplication when not copying.
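A sketch of the codegen difference, not the actual chorba/clmul routines:

    #include <emmintrin.h>  /* SSE2 intrinsics */

    /* With an aligned load the compiler may fold the load straight into the
     * memory operand of the following pxor (SSE memory operands must be
     * 16-byte aligned); an unaligned load forces a separate movdqu first. */
    static __m128i xor_block_aligned(const __m128i *aligned_src, __m128i acc) {
        return _mm_xor_si128(acc, _mm_load_si128(aligned_src));
    }

    static __m128i xor_block_unaligned(const unsigned char *src, __m128i acc) {
        return _mm_xor_si128(acc, _mm_loadu_si128((const __m128i *)src));
    }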
Mika Lindqvist [Mon, 17 Nov 2025 17:15:03 +0000 (19:15 +0200)]
Fix build using configure
* "\i" is not valid escape code in BSD sed
* Some x86 shared sources were missing -fPIC due to using wrong variable in build rule
Brad Smith [Mon, 17 Nov 2025 05:50:47 +0000 (00:50 -0500)]
configure: Determine system architecture properly on *BSD systems
uname -m on a BSD system will provide the port architecture, e.g.
arm64, macppc, octeon, instead of the machine architecture, e.g.
aarch64, powerpc, mips64. uname -p will provide the machine
architecture. NetBSD uses x86_64, OpenBSD uses amd64, and FreeBSD
is a mix between uname -p and the compiler output.
Mika Lindqvist [Mon, 17 Nov 2025 10:28:21 +0000 (12:28 +0200)]
[CI] Downgrade "Windows GCC Native Instructions (AVX)" workflow
* The Windows Server 2025 runner has a broken GCC, so use the Windows Server 2022 runner instead until the fix is propagated to all runners