git.ipfire.org Git - thirdparty/zlib-ng.git/log
21 hours agoChange bi_reverse to use uint16_t code arg. develop
Nathan Moinvaziri [Fri, 12 Dec 2025 01:28:12 +0000 (17:28 -0800)] 
Change bi_reverse to use uint16_t code arg.

21 hours agoUse __builtin_bitreverse16 in inflate_table
Nathan Moinvaziri [Sun, 7 Dec 2025 07:56:21 +0000 (23:56 -0800)] 
Use __builtin_bitreverse16 in inflate_table

https://github.com/dougallj/zlib-dougallj/commit/f23fa25aa168ef782bab5e7cd6f9df50d7bb5eb2

21 hours agoUse __builtin_bitreverse16 in bi_reverse if available.
Nathan Moinvaziri [Sat, 6 Dec 2025 15:55:07 +0000 (07:55 -0800)] 
Use __builtin_bitreverse16 in bi_reverse if available.
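
The three commits above move code reversal onto a 16-bit bit-reverse builtin where the compiler offers one. A minimal sketch of that pattern follows, assuming a Clang-style `__has_builtin` probe and a portable fallback; the names `bitreverse16` and `reverse_code` are hypothetical and are not taken from zlib-ng.

```c
#include <assert.h>
#include <stdint.h>

/* Detect the builtin without breaking compilers that lack __has_builtin. */
#if defined(__has_builtin)
#  if __has_builtin(__builtin_bitreverse16)
#    define HAVE_BUILTIN_BITREVERSE16 1
#  endif
#endif

/* Reverse all 16 bits of x (hypothetical helper, not zlib-ng's bi_reverse). */
static uint16_t bitreverse16(uint16_t x) {
#ifdef HAVE_BUILTIN_BITREVERSE16
    return __builtin_bitreverse16(x);
#else
    /* Portable fallback: swap progressively larger groups of bits. */
    x = (uint16_t)(((x & 0x5555u) << 1) | ((x >> 1) & 0x5555u));
    x = (uint16_t)(((x & 0x3333u) << 2) | ((x >> 2) & 0x3333u));
    x = (uint16_t)(((x & 0x0F0Fu) << 4) | ((x >> 4) & 0x0F0Fu));
    return (uint16_t)((x << 8) | (x >> 8));
#endif
}

/* Reverse only the low `len` bits of a code, 1 <= len <= 16. */
static uint16_t reverse_code(uint16_t code, int len) {
    assert(len >= 1 && len <= 16);
    return (uint16_t)(bitreverse16(code) >> (16 - len));
}
```

Taking the code as `uint16_t` keeps the argument width equal to the builtin's operand width, so no masking is needed before the call.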

5 days agoReorder code struct fields for better access patterns
Dougall Johnson [Mon, 8 Dec 2025 04:11:52 +0000 (20:11 -0800)] 
Reorder code struct fields for better access patterns

Place bits field before op field in code struct to optimize memory
access. The bits field is accessed first in the hot path, so placing
it at offset 0 may improve code generation on some architectures.
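
As a rough illustration of the reordering described above, the sketch below places `bits` at offset 0 and `op` after it. The `val` field and the exact types are assumptions based on the classic zlib table entry; `code_sketch` and `code_bits` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative table entry; only the bits/op order comes from the commit text. */
typedef struct {
    uint8_t  bits;  /* read first in the hot path, so it sits at offset 0 */
    uint8_t  op;    /* operation / extra-bits field, now second */
    uint16_t val;   /* assumed payload field */
} code_sketch;

/* Compile-time checks that the layout is what the sketch claims. */
static_assert(offsetof(code_sketch, bits) == 0, "bits must be the first field");
static_assert(sizeof(code_sketch) == 4, "entry is assumed to stay 4 bytes");

/* Hot-path style accessor: a load from the entry's base address. */
static unsigned code_bits(const code_sketch *entry) {
    return entry->bits;
}
```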

5 days agoRemove COPY ifdef from crc32 (v)pclmulqdq.
Nathan Moinvaziri [Wed, 3 Dec 2025 03:36:54 +0000 (19:36 -0800)] 
Remove COPY ifdef from crc32 (v)pclmulqdq.

5 days agoAdd padding to deflate_struct until can be cleaned up along cachelines
Nathan Moinvaziri [Mon, 8 Dec 2025 03:54:41 +0000 (19:54 -0800)] 
Add padding to deflate_struct until can be cleaned up along cachelines

5 days agoCompute w_bits rather than storing it in the deflate_state structure
Nathan Moinvaziri [Mon, 8 Dec 2025 03:59:36 +0000 (19:59 -0800)] 
Compute w_bits rather than storing it in the deflate_state structure

Co-authored-by: Brian Pane <brianp@brianp.net>
5 days agoCompute w_mask rather than storing it in the deflate_state structure
Nathan Moinvaziri [Sat, 6 Dec 2025 01:52:47 +0000 (17:52 -0800)] 
Compute w_mask rather than storing it in the deflate_state structure

Co-authored-by: Brian Pane <brianp@brianp.net>
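
The two commits above stop storing `w_bits` and `w_mask` and derive them on demand. The sketch below shows one way such values can be derived, assuming the window size is a power of two; `window_mask` and `window_bits` are hypothetical helpers, and the real zlib-ng expressions may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Mask for wrapping positions into a power-of-two window. */
static uint32_t window_mask(uint32_t w_size) {
    assert(w_size != 0 && (w_size & (w_size - 1)) == 0);
    return w_size - 1;
}

/* log2 of a power-of-two window size. */
static unsigned window_bits(uint32_t w_size) {
    unsigned bits = 0;
    assert(w_size != 0 && (w_size & (w_size - 1)) == 0);
    while ((w_size >> bits) != 1)
        bits++;
    return bits;
}
```

Recomputing a subtraction is typically cheaper than an extra load from the state structure, which is presumably the motivation here.
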
7 days ago[configure] Fix detecting -fno-lto support
Mika T. Lindqvist [Sat, 6 Dec 2025 21:52:57 +0000 (23:52 +0200)] 
[configure] Fix detecting -fno-lto support
* Previously, -fno-lto was assumed to be supported on non-GCC-compatible or otherwise unsupported compilers,
  even though support was never tested in those cases. Set the default to not supported.

7 days agoMicro-optimization for in pointer calculation for inflate_fast REFILL
Nathan Moinvaziri [Sat, 6 Dec 2025 03:59:06 +0000 (19:59 -0800)] 
Micro-optimization for in pointer calculation for inflate_fast REFILL

trifectatechfoundation/zlib-rs#320

Co-authored-by: Brian Pane <brianp@brianp.net>
7 days agoFix for potentially uninitialized local variable ft used.
Vladislav Shchapov [Sat, 6 Dec 2025 15:17:40 +0000 (20:17 +0500)] 
Fix for potentially uninitialized local variable ft used.

Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>
8 days agoUse local copies of s->level and s->window in deflate_slow
Hans Kristian Rosbach [Wed, 3 Dec 2025 12:10:06 +0000 (13:10 +0100)] 
Use local copies of s->level and s->window in deflate_slow

8 days agoInline all uses of quick_insert_string*/quick_insert_value*.
Hans Kristian Rosbach [Sun, 30 Nov 2025 21:31:49 +0000 (22:31 +0100)] 
Inline all uses of quick_insert_string*/quick_insert_value*.
Inline all uses of update_hash*.
Inline insert_string into deflate_quick, deflate_fast and deflate_medium.
Remove insert_string from deflate_state
Use local function pointer for insert_string.
Fix level check to actually check level and not `s->max_chain_length <= 1024`.

12 days agoWrap _cond in Assert macro in case complex statement used.
Nathan Moinvaziri [Wed, 3 Dec 2025 06:51:05 +0000 (22:51 -0800)] 
Wrap _cond in Assert macro in case complex statement used.
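
Why the parentheses matter can be shown with a small, hypothetical pair of macros (these are not the zlib-ng `Assert` definition): without them, `!` applies only to the first operand of a compound condition.

```c
#include <assert.h>

/* Hypothetical macros that evaluate to 1 when the check should fail. */
#define CHECK_FAILS_UNWRAPPED(_cond) (!_cond)    /* buggy: ! binds to the first operand only */
#define CHECK_FAILS_WRAPPED(_cond)   (!(_cond))  /* correct: negates the whole expression */

/* 1 || 1 is true, so a correct check must not fail. */
static_assert(CHECK_FAILS_WRAPPED(1 || 1) == 0, "wrapped form negates the full condition");

/* The unwrapped form expands to (!1 || 1), which is 1: a spurious failure. */
static_assert(CHECK_FAILS_UNWRAPPED(1 || 1) == 1, "unwrapped form misfires");

/* Runtime use of the wrapped form with a compound condition. */
static int check_fails(int a, int b) {
    return CHECK_FAILS_WRAPPED(a || b);
}
```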

12 days agoWrap support_flag for cpu features in benchmark and test macros.
Nathan Moinvaziri [Wed, 3 Dec 2025 05:23:20 +0000 (21:23 -0800)] 
Wrap support_flag for cpu features in benchmark and test macros.

12 days agoFixed casting warning in benchmark_uncompress on MSVC
Nathan Moinvaziri [Wed, 3 Dec 2025 05:19:45 +0000 (21:19 -0800)] 
Fixed casting warning in benchmark_uncompress on MSVC

benchmark_uncompress.cc(55,93): warning C4244: 'argument': conversion from 'int64_t' to 'size_t', possible loss of data

12 days agoRename adler32_fold_copy to adler32_copy (#2026)
Nathan Moinvaziri [Tue, 2 Dec 2025 23:25:56 +0000 (15:25 -0800)] 
Rename adler32_fold_copy to adler32_copy (#2026)

There are no folding techniques in adler32 implementations. It is simply hashing while copying.
- Rename adler32_fold_copy to adler32_copy.
- Remove unnecessary adler32_fold.c file.
- Reorder adler32_copy functions last in source file for consistency.
- Rename adler32_rvv_impl to adler32_copy_impl for consistency.
- Replace dst != NULL with 1 in adler32_copy_neon to remove branching.
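
"Hashing while copying" can be sketched in scalar C as below. This is a deliberately simple reference loop, not any of the optimized zlib-ng variants; `adler32_copy_sketch` is a hypothetical name and the per-byte modulo is chosen for clarity rather than speed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ADLER_BASE 65521u  /* largest prime below 65536 */

/* Update an Adler-32 checksum over src while copying src into dst. */
static uint32_t adler32_copy_sketch(uint32_t adler, uint8_t *dst,
                                    const uint8_t *src, size_t len) {
    uint32_t s1 = adler & 0xFFFFu;
    uint32_t s2 = (adler >> 16) & 0xFFFFu;

    assert(len == 0 || (dst != NULL && src != NULL));
    for (size_t i = 0; i < len; i++) {
        dst[i] = src[i];                   /* the copy */
        s1 = (s1 + src[i]) % ADLER_BASE;   /* the hash, in the same pass */
        s2 = (s2 + s1) % ADLER_BASE;
    }
    return (s2 << 16) | s1;
}
```

There is no fold step to merge partial states afterwards, which is the point the rename makes.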

12 days agoUse elf_aux_info() on FreeBSD and OpenBSD ARM / AArch64
Brad Smith [Fri, 14 Nov 2025 11:45:41 +0000 (06:45 -0500)] 
Use elf_aux_info() on FreeBSD and OpenBSD ARM / AArch64

Use elf_aux_info() as the preferred API for modern FreeBSD and OpenBSD
ARM and AArch64. This adds 32-bit ARM support.

12 days agoChorba: Add test cases for #2029
Sam Russell [Tue, 2 Dec 2025 19:12:17 +0000 (20:12 +0100)] 
Chorba: Add test cases for #2029

Add a test case from @KungFuJesus and a few others with similar data lengths

12 days agoChorba: Fix edge case bug for >256KB input
Sam Russell [Tue, 2 Dec 2025 13:46:33 +0000 (14:46 +0100)] 
Chorba: Fix edge case bug for >256KB input

2 weeks agoBump actions/checkout from 5 to 6
dependabot[bot] [Mon, 1 Dec 2025 07:20:48 +0000 (07:20 +0000)] 
Bump actions/checkout from 5 to 6

Bumps [actions/checkout](https://github.com/actions/checkout) from 5 to 6.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v5...v6)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2 weeks agoAdd quick_insert_value for optimized hash insertion
Nathan Moinvaziri [Sat, 29 Nov 2025 03:31:10 +0000 (19:31 -0800)] 
Add quick_insert_value for optimized hash insertion

Reduces the number of reads by two

Co-authored-by: Brian Pane <brianp@brianp.net>
trifectatechfoundation/zlib-rs#374
trifectatechfoundation/zlib-rs#375

2 weeks agodeflate_stored: use local copy of s->w_size
Hans Kristian Rosbach [Fri, 28 Nov 2025 23:50:40 +0000 (18:50 -0500)] 
deflate_stored: use local copy of s->w_size

2 weeks agoMinor cleanups of some variables in deflate functions
Hans Kristian Rosbach [Fri, 28 Nov 2025 23:49:37 +0000 (18:49 -0500)] 
Minor cleanups of some variables in deflate functions

2 weeks ago2.3.1 Release 2.3.1
Hans Kristian Rosbach [Tue, 25 Nov 2025 11:18:52 +0000 (12:18 +0100)] 
2.3.1 Release

3 weeks agoConditionally shortcut via the chorba polynomial based on compile flags
Adam Stylinski [Fri, 21 Nov 2025 15:02:14 +0000 (10:02 -0500)] 
Conditionally shortcut via the chorba polynomial based on compile flags

As it turns out, the copying CRC32 variant _is_ slower when compiled
with generic flags. The reason for this is mainly extra stack spills and
the lack of operations we can overlap with the moves. However, when
compiling for an architecture with more registers, such as avx512, we no
longer have to eat all these costly stack spills and we can overlap with
a 3 operand XOR. Conditionally guarding this means that if a Linux
distribution wants to compile with -march=x86-64-v4 they get all the
upsides to this.

This code notably is not actually used if you happen to have something
that supports 512-bit-wide clmul, so this does help a somewhat narrow
range of targets (most of the earlier AVX512 implementations, pre-Ice
Lake).

We also must guard with AVX512VL, as just specifying AVX512F makes GCC
generate vpternlog instructions of 512-bit width only, so a bunch of
packing and unpacking of 512-bit to 256-bit registers and vice versa has
to occur, absolutely killing runtime. Only with AVX512VL is there a
128-bit-wide vpternlog.
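
The compile-time guard described above might look roughly like the following sketch. `__AVX512F__` and `__AVX512VL__` are the macros GCC and Clang predefine for those target features; `HAVE_TERNLOG128`, `vec128` and `xor3` are hypothetical names, and the scalar branch stands in for whatever the real code does without AVX512VL.

```c
#include <assert.h>
#include <stdint.h>

/* Only take the 3-operand XOR path when a 128-bit ternary-logic op exists. */
#if defined(__AVX512F__) && defined(__AVX512VL__)
#  include <immintrin.h>
#  define HAVE_TERNLOG128 1
#endif

typedef struct { uint64_t lo, hi; } vec128;
static_assert(sizeof(vec128) == 16, "vec128 must match a 128-bit register");

/* XOR three 128-bit values together. */
static vec128 xor3(vec128 a, vec128 b, vec128 c) {
    vec128 r;
#ifdef HAVE_TERNLOG128
    __m128i va = _mm_loadu_si128((const __m128i *)&a);
    __m128i vb = _mm_loadu_si128((const __m128i *)&b);
    __m128i vc = _mm_loadu_si128((const __m128i *)&c);
    /* Truth table 0x96 is a ^ b ^ c, done in a single instruction. */
    _mm_storeu_si128((__m128i *)&r, _mm_ternarylogic_epi64(va, vb, vc, 0x96));
#else
    /* Generic build: two dependent XORs per lane. */
    r.lo = a.lo ^ b.lo ^ c.lo;
    r.hi = a.hi ^ b.hi ^ c.hi;
#endif
    return r;
}
```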

3 weeks agoUse aligned loads in the chorba portions of the clmul crc routines
Adam Stylinski [Fri, 21 Nov 2025 14:45:48 +0000 (09:45 -0500)] 
Use aligned loads in the chorba portions of the clmul crc routines

We go to the trouble of doing aligned loads, so we may as well let the
compiler know the alignment is guaranteed. We can't guarantee an aligned
store but at least with an aligned load the compiler can elide a load
with a subsequent xor multiplication when not copying.

3 weeks agoFix build using configure
Mika Lindqvist [Mon, 17 Nov 2025 17:15:03 +0000 (19:15 +0200)] 
Fix build using configure
* "\i" is not valid escape code in BSD sed
* Some x86 shared sources were missing -fPIC due to using wrong variable in build rule

Fixes #2015.

3 weeks agoUpdate Google Benchmark to v1.9.4
Mika Lindqvist [Mon, 17 Nov 2025 08:21:36 +0000 (10:21 +0200)] 
Update Google Benchmark to v1.9.4
* Require CMake 3.13

3 weeks agoconfigure: Determine system architecture properly on *BSD systems
Brad Smith [Mon, 17 Nov 2025 05:50:47 +0000 (00:50 -0500)] 
configure: Determine system architecture properly on *BSD systems

uname -m on a BSD system will provide the architecture port, e.g.
arm64, macppc, octeon, instead of the machine architecture, e.g.
aarch64, powerpc, mips64. uname -p will provide the machine
architecture. NetBSD uses x86_64, OpenBSD uses amd64, FreeBSD
is a mix between uname -p and the compiler output.

3 weeks ago[CI] Downgrade "Windows GCC Native Instructions (AVX)" workflow
Mika Lindqvist [Mon, 17 Nov 2025 10:28:21 +0000 (12:28 +0200)] 
[CI] Downgrade "Windows GCC Native Instructions (AVX)" workflow
* Windows Server 2025 runner has broken GCC, so use Windows Server 2022 runner instead until fix is propagated to all runners

4 weeks ago2.3.0 RC2 2.3.0-rc2
Hans Kristian Rosbach [Sun, 16 Nov 2025 18:41:18 +0000 (19:41 +0100)] 
2.3.0 RC2

4 weeks agoAdd benchmark for crc32 fold copy implementations
Hans Kristian Rosbach [Fri, 14 Nov 2025 14:33:32 +0000 (15:33 +0100)] 
Add benchmark for crc32 fold copy implementations
Uses local functions for benchmarking some of the run-time selected variants.

4 weeks agoDisable benchmark for slide_hash_c with Visual C++ too.
Mika Lindqvist [Sun, 16 Nov 2025 12:49:44 +0000 (14:49 +0200)] 
Disable benchmark for slide_hash_c with Visual C++ too.

4 weeks agoAdd tests for crc32_fold_copy functions
Hans Kristian Rosbach [Thu, 13 Nov 2025 21:54:25 +0000 (22:54 +0100)] 
Add tests for crc32_fold_copy functions

4 weeks agoUse CTest to simplify testing options
Hans Kristian Rosbach [Tue, 11 Nov 2025 16:24:26 +0000 (17:24 +0100)] 
Use CTest to simplify testing options
Add CMake variable TEST_STOCK_ZLIB to disable some tests if attempting
to run our testsuite on stock zlib.
PR depends on CMP0077, introduced by CMake 3.13.
Upped minimum compatible CMake version to 3.13, same as we have
actually been telling people was the minimum for years on the wiki.
Upped upper compatible CMake version to 3.31, my current version.

4 weeks agoUse elf_aux_info() on OpenBSD PowerPC
Brad Smith [Fri, 14 Nov 2025 01:02:25 +0000 (20:02 -0500)] 
Use elf_aux_info() on OpenBSD PowerPC

4 weeks ago- Unify crc32_chorba, chorba_sse2 and chorba_sse41 dispatch functions.
Hans Kristian Rosbach [Tue, 11 Nov 2025 21:47:52 +0000 (22:47 +0100)] 
- Unify crc32_chorba, chorba_sse2 and chorba_sse41 dispatch functions.
- Fixed alignment diff calculation in crc32_chorba.
- Fixed length check to happen early, avoiding extra branches for too short lengths,
this also allows removing one function call to crc32_braid_internal to handle those.
Gbench shows ~0.15-0.25ns saved per call for lengths shorter than CHORBA_SMALL_THRESHOLD.
- Avoid calculating aligned len if buffer is already aligned

4 weeks agoReorganize Chorba activation.
Hans Kristian Rosbach [Tue, 11 Nov 2025 19:23:24 +0000 (20:23 +0100)] 
Reorganize Chorba activation.
Now WITHOUT_CHORBA will only disable the crc32_chorba C fallback.

SSE2, SSE41 and pclmul variants will still be able to use their Chorba-algorithm based code,
but their fallback to the generic crc32_chorba C code in SSE2 and SSE41 will be disabled,
reducing their performance on really big input buffers (not used during deflate/inflate,
only when calling crc32 directly).

Remove the crc32_c function (and its file crc32_c.c), instead use the normal functable
routing to select between crc32_braid and crc32_chorba.

Disable sse2 and sse4.1 variants of Chorba-crc32 on MSVC older than 2022 due to code
generation bug in 2019 causing segfaults.

Compile either crc32_chorba_small_nondestructive or crc32_chorba_small_nondestructive_32bit,
not both. Don't compile crc32_chorba_32768_nondestructive on 32bit arch.

4 weeks agoriscv: features: test HWCAP regardless of kernel versions
Icenowy Zheng [Tue, 11 Nov 2025 14:47:55 +0000 (22:47 +0800)] 
riscv: features: test HWCAP regardless of kernel versions

The HWCAP facility has been available since day 1 of Linux RISC-V support
(dating back to 4.15); only the V bit definition was added in 6.5 (because
proper vector support was added in that version too).

There should be no need to test the kernel version number before accessing
hwcap; the V bit will simply never be present on kernels older than 6.5
(except dirty patched downstream ones).

For Xtheadvector systems that bogusly announce the V bit in HWCAP, the
assembly code should be able to filter them out. This was tested on
a Sophgo SG2042 machine with a 6.1 kernel.

Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
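
For orientation, the HWCAP test discussed above reduces to a single bit check: RISC-V single-letter extensions map to bit `letter - 'A'`. The sketch below takes the HWCAP word as a parameter so it stays portable; on Linux the value would typically come from `getauxval(AT_HWCAP)`. `RISCV_HWCAP_ISA_V` and `hwcap_has_vector` are hypothetical names.

```c
#include <assert.h>

/* Bit for the single-letter 'V' extension in the RISC-V HWCAP word. */
#define RISCV_HWCAP_ISA_V (1UL << ('V' - 'A'))
static_assert(RISCV_HWCAP_ISA_V == (1UL << 21), "'V' is letter index 21");

/* No kernel-version check: on kernels older than 6.5 the bit is simply clear. */
static int hwcap_has_vector(unsigned long hwcap) {
    return (hwcap & RISCV_HWCAP_ISA_V) != 0;
}
```
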
4 weeks agoUpdate README.md, add a lot of missing info, and reorder some of it.
Hans Kristian Rosbach [Tue, 11 Nov 2025 16:17:35 +0000 (17:17 +0100)] 
Update README.md, add a lot of missing info, and reorder some of it.
Add missing parameter to configure help text.
Update descriptions and reorganize some options in CMake

5 weeks ago2.3.0 RC1 2.3.0-rc1
Hans Kristian Rosbach [Fri, 31 Oct 2025 22:38:52 +0000 (23:38 +0100)] 
2.3.0 RC1

5 weeks agoInitial support for nVidia toolchain
Mika Lindqvist [Sun, 2 Nov 2025 16:57:16 +0000 (18:57 +0200)] 
Initial support for nVidia toolchain
* Supports native and non-native builds for x86_64 using CMake

6 weeks agoBump github/codeql-action from 3 to 4
dependabot[bot] [Sat, 1 Nov 2025 07:04:15 +0000 (07:04 +0000)] 
Bump github/codeql-action from 3 to 4

Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3 to 4.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](https://github.com/github/codeql-action/compare/v3...v4)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: '4'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
6 weeks agoBump actions/upload-artifact from 4 to 5
dependabot[bot] [Sat, 1 Nov 2025 07:04:10 +0000 (07:04 +0000)] 
Bump actions/upload-artifact from 4 to 5

Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4 to 5.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](https://github.com/actions/upload-artifact/compare/v4...v5)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-version: '5'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
6 weeks agoBump actions/download-artifact from 5 to 6
dependabot[bot] [Sat, 1 Nov 2025 07:04:03 +0000 (07:04 +0000)] 
Bump actions/download-artifact from 5 to 6

Bumps [actions/download-artifact](https://github.com/actions/download-artifact) from 5 to 6.
- [Release notes](https://github.com/actions/download-artifact/releases)
- [Commits](https://github.com/actions/download-artifact/compare/v5...v6)

---
updated-dependencies:
- dependency-name: actions/download-artifact
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
6 weeks agorename cmake config target files to avoid illegal overwrite of PACKAGE_VERSION
Benjamin Buch [Thu, 23 Oct 2025 17:17:29 +0000 (19:17 +0200)] 
rename cmake config target files to avoid illegal overwrite of PACKAGE_VERSION

6 weeks agoRename CMake targets to avoid clashes when used as a subproject (#1970)
Cameron Cawley [Tue, 28 Oct 2025 22:34:56 +0000 (22:34 +0000)] 
Rename CMake targets to avoid clashes when used as a subproject (#1970)

7 weeks agoFix type mismatch on platforms where int32_t and uint32_t use long instead of int
Mika Lindqvist [Thu, 9 Oct 2025 08:40:16 +0000 (11:40 +0300)] 
Fix type mismatch on platforms where int32_t and uint32_t use long instead of int
* Based on PR #1934

7 weeks agoImprove resilience of the functable initialization; during functable init,
Hans Kristian Rosbach [Fri, 10 Oct 2025 11:33:53 +0000 (13:33 +0200)] 
Improve resilience of the functable initialization; during functable init,
make sure none of the function pointers are nullpointers.

Up until now, zlib-ng and the application would have segfaulted either at the start
of processing, or at some point later depending on when a nullpointer call would happen
in the processing. In any case most likely after accepting data from the application.

Now, the deflateinit/inflateinit functions will error with Z_VERSION_ERROR, and
gzopen will return Z_STREAM_ERROR before actually processing any data.

Direct calls to functions like adler32 or crc32 will however print an error message
and call abort(), as these functions have no actual way of reporting errors.

Note: This should never happen with default builds of zlib-ng, only if it is run on
a cpu that is missing both the matching optimized and the generic fallback functions.
This can currently only happen if zlib-ng is compiled using custom cflags or by
editing the code.
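
The init-time check described above can be sketched as a scan for null entries before any data is accepted. The two-entry table, `functable_validate` and `dummy_checksum` are hypothetical and only illustrate the idea; the real functable has more members, and the init functions map a failure to the error codes named in the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*checksum_fn)(uint32_t, const uint8_t *, size_t);

/* Reduced stand-in for a table of run-time selected functions. */
struct functable_sketch {
    checksum_fn adler32;
    checksum_fn crc32;
};

/* Returns 0 when every entry is populated, -1 if any pointer is null. */
static int functable_validate(const struct functable_sketch *ft) {
    assert(ft != NULL);
    if (ft->adler32 == NULL || ft->crc32 == NULL)
        return -1;
    return 0;
}

/* Placeholder implementation, used only to populate the table. */
static uint32_t dummy_checksum(uint32_t seed, const uint8_t *buf, size_t len) {
    (void)buf;
    (void)len;
    return seed;
}
```

Failing at init is what lets callers see an error code instead of a crash in the middle of processing.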

2 months agoDon't build C-fallback functions that never get used on x86_64
Hans Kristian Rosbach [Fri, 10 Oct 2025 12:52:21 +0000 (14:52 +0200)] 
Don't build C-fallback functions that never get used on x86_64

2 months agoRemove force-sse2 config option from x86 builds.
Hans Kristian Rosbach [Fri, 10 Oct 2025 11:26:12 +0000 (13:26 +0200)] 
Remove force-sse2 config option from x86 builds.
Due to major refactoring done long ago, this option no longer avoids a branch
in a hot path, it currently only removes a single if check during init.

2 months agoUpdate s390x actions runner.
Hans Kristian Rosbach [Fri, 10 Oct 2025 11:15:38 +0000 (13:15 +0200)] 
Update s390x actions runner.
- Update to EL10
- Update URL to s390x runner patch

2 months ago📝 Add docstrings to `cleanup3`
coderabbitai[bot] [Mon, 6 Oct 2025 18:36:46 +0000 (18:36 +0000)] 
📝 Add docstrings to `cleanup3`

Docstrings generation was requested by @mtl1979.

* https://github.com/zlib-ng/zlib-ng/pull/1978#issuecomment-3373304629

The following files were modified:

* `test/benchmarks/benchmark_slidehash.cc`

2 months agoBump actions/checkout from 4 to 5
dependabot[bot] [Wed, 8 Oct 2025 14:06:54 +0000 (14:06 +0000)] 
Bump actions/checkout from 4 to 5

Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to 5.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v4...v5)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: '5'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2 months agoFix cast and truncation warnings.
Mika Lindqvist [Mon, 6 Oct 2025 18:22:40 +0000 (21:22 +0300)] 
Fix cast and truncation warnings.

2 months agoUpdate terms in txtvsbin.txt
Jeff Handley [Fri, 3 Oct 2025 16:36:43 +0000 (12:36 -0400)] 
Update terms in txtvsbin.txt

2 months agoUse 'block-list' and 'allow-list' terms
Jeff Handley [Thu, 2 Oct 2025 23:22:46 +0000 (16:22 -0700)] 
Use 'block-list' and 'allow-list' terms

2 months agoIncrease minimum supported CMake version from 3.5.1 to 3.12
Hans Kristian Rosbach [Thu, 2 Oct 2025 12:16:53 +0000 (14:16 +0200)] 
Increase minimum supported CMake version from 3.5.1 to 3.12

2 months agoInline the CHUNKSIZE function
Cameron Cawley [Thu, 2 Oct 2025 16:14:09 +0000 (17:14 +0100)] 
Inline the CHUNKSIZE function

2 months agoUpdate macOS CI images
Cameron Cawley [Sat, 27 Sep 2025 12:26:12 +0000 (13:26 +0100)] 
Update macOS CI images

2 months agoSynchronise ARMv8 and Loongarch CRC32 implementations
Cameron Cawley [Thu, 25 Sep 2025 15:30:53 +0000 (16:30 +0100)] 
Synchronise ARMv8 and Loongarch CRC32 implementations

2 months agoFix -Wstrict-prototypes warnings
Cameron Cawley [Thu, 25 Sep 2025 14:11:14 +0000 (15:11 +0100)] 
Fix -Wstrict-prototypes warnings

2 months agoFix -Wunused-command-line-argument warnings on Mac OS X
Cameron Cawley [Thu, 25 Sep 2025 13:59:10 +0000 (14:59 +0100)] 
Fix -Wunused-command-line-argument warnings on Mac OS X

3 months agoAllow C17 for newer CMake versions (#1958)
Alexander Vieth [Thu, 11 Sep 2025 10:27:49 +0000 (12:27 +0200)] 
Allow C17 for newer CMake versions (#1958)

3 months ago[CI] Use MacOS 14 for GCC UBSAN.
Mika Lindqvist [Sat, 6 Sep 2025 15:32:52 +0000 (18:32 +0300)] 
[CI] Use MacOS 14 for GCC UBSAN.

3 months ago[CI] Install Windows 11 SDK 10.0.22621 for 32-bit ARM.
Mika Lindqvist [Sat, 6 Sep 2025 16:59:46 +0000 (19:59 +0300)] 
[CI] Install Windows 11 SDK 10.0.22621 for 32-bit ARM.

3 months agoFix type mismatch with Windows GCC.
Mika Lindqvist [Sat, 6 Sep 2025 17:25:54 +0000 (20:25 +0300)] 
Fix type mismatch with Windows GCC.

3 months ago[CI] Update MacOS toolchain.
Mika Lindqvist [Sat, 6 Sep 2025 12:32:48 +0000 (15:32 +0300)] 
[CI] Update MacOS toolchain.
* Use Xcode 16.4 as Xcode 15.2 is no longer supported

3 months agoBump actions/download-artifact from 4 to 5
dependabot[bot] [Mon, 1 Sep 2025 12:50:15 +0000 (12:50 +0000)] 
Bump actions/download-artifact from 4 to 5

Bumps [actions/download-artifact](https://github.com/actions/download-artifact) from 4 to 5.
- [Release notes](https://github.com/actions/download-artifact/releases)
- [Commits](https://github.com/actions/download-artifact/compare/v4...v5)

---
updated-dependencies:
- dependency-name: actions/download-artifact
  dependency-version: '5'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
3 months agoPartially inline flush_pending
Hans Kristian Rosbach [Mon, 18 Aug 2025 13:20:05 +0000 (15:20 +0200)] 
Partially inline flush_pending

3 months agoOptimize read_buf by removing indirection penalty
Hans Kristian Rosbach [Thu, 21 Aug 2025 18:36:53 +0000 (20:36 +0200)] 
Optimize read_buf by removing indirection penalty

3 months agoInline read_buf
Hans Kristian Rosbach [Mon, 18 Aug 2025 13:19:50 +0000 (15:19 +0200)] 
Inline read_buf

3 months agoMove "Architecture-specific hooks" from deflate.c to the more appropriate deflate.h
Hans Kristian Rosbach [Sat, 16 Aug 2025 21:09:59 +0000 (23:09 +0200)] 
Move "Architecture-specific hooks" from deflate.c to the more appropriate deflate.h

3 months agoUse aligned load/store in AVX2 Slide Hash.
Hans Kristian Rosbach [Sun, 3 Aug 2025 21:04:20 +0000 (23:04 +0200)] 
Use aligned load/store in AVX2 Slide Hash.
Also test slidehash from 512 bytes, the minimum window size we use.

3 months agoAdd benchmark for insert_string.
Nathan Moinvaziri [Fri, 22 Aug 2025 17:34:15 +0000 (10:34 -0700)] 
Add benchmark for insert_string.

3 months agoInline bi_reverse
Hans Kristian Rosbach [Fri, 22 Aug 2025 07:39:25 +0000 (09:39 +0200)] 
Inline bi_reverse

3 months agoAdd error propagation to gzread/gzwrite
Hans Kristian Rosbach [Wed, 20 Aug 2025 14:24:16 +0000 (16:24 +0200)] 
Add error propagation to gzread/gzwrite

3 months agoSplit out gz_read_init() from gzlook(), and rename gz_init() to gz_write_init().
Hans Kristian Rosbach [Fri, 4 Jul 2025 19:15:35 +0000 (21:15 +0200)] 
Split out gz_read_init() from gzlook(), and rename gz_init() to gz_write_init().
This makes gzread.c more like gzwrite.c, and fits in with the new code in gzlib.c.

3 months agoReorganize initialization and use a single malloc call for both
Hans Kristian Rosbach [Fri, 4 Jul 2025 15:53:09 +0000 (17:53 +0200)] 
Reorganize initialization and use a single malloc call for both
in and outbuffers in gzopen/gzread/gzwrite.
Also start aligning the allocation to 64 bytes (on a cacheline border).

3 months agoInline pqdownheap
Hans Kristian Rosbach [Mon, 18 Aug 2025 20:05:20 +0000 (22:05 +0200)] 
Inline pqdownheap

3 months agoReorder functions related to build_tree to more closely match order of
Hans Kristian Rosbach [Mon, 18 Aug 2025 19:25:29 +0000 (21:25 +0200)] 
Reorder functions related to build_tree to more closely match order of
actual usage in the code. Could also help with cache locality.

3 months agoUse local pointers to avoid indirection penalties in pqdownheap and build_tree
Hans Kristian Rosbach [Mon, 18 Aug 2025 18:54:08 +0000 (20:54 +0200)] 
Use local pointers to avoid indirection penalties in pqdownheap and build_tree

3 months agoUse local pointers to avoid indirection penalty in compress_block
Hans Kristian Rosbach [Mon, 18 Aug 2025 18:53:23 +0000 (20:53 +0200)] 
Use local pointers to avoid indirection penalty in compress_block

3 months agoMinor optimization of insert_string
Hans Kristian Rosbach [Fri, 15 Aug 2025 13:59:46 +0000 (15:59 +0200)] 
Minor optimization of insert_string

3 months agoUnroll some of the adler checksum for avx2
Adam Stylinski [Sat, 16 Aug 2025 20:04:30 +0000 (16:04 -0400)] 
Unroll some of the adler checksum for avx2

Similar to what's done for vmx, avx512, and sse4, let's unroll some
of this checksum since it's a commutative checksum. We take advantage
of ILP and do more intermediate sums before rolling them back together
for the finalization of the checksum.

3 months agoCheck the proper bit for BMI2
Adam Stylinski [Sat, 16 Aug 2025 15:35:33 +0000 (11:35 -0400)] 
Check the proper bit for BMI2

We were actually checking for BMI1 support here. This is unlikely to have
caused any issues because to date there have not been any x86 CPUs with
AVX2 support but no BMI2 support.
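
The two feature bits are easy to confuse because both live in EBX of CPUID leaf 7, sub-leaf 0. The sketch below works on an already-fetched EBX value so it does not depend on a particular CPUID wrapper; the macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* CPUID leaf 7, sub-leaf 0, EBX feature bits. */
#define CPUID7_EBX_BMI1 (1u << 3)
#define CPUID7_EBX_AVX2 (1u << 5)
#define CPUID7_EBX_BMI2 (1u << 8)
static_assert(CPUID7_EBX_BMI1 != CPUID7_EBX_BMI2, "BMI1 and BMI2 are distinct bits");

/* Test the BMI2 bit itself rather than the BMI1 bit. */
static int has_bmi2(uint32_t ebx) {
    return (ebx & CPUID7_EBX_BMI2) != 0;
}
```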

4 months agoDisable NEON workaround on Clang 20 and above, and enable it for non-mobile platforms...
Un1q32 [Sun, 3 Aug 2025 18:46:52 +0000 (14:46 -0400)] 
Disable NEON workaround on Clang 20 and above, and enable it for non-mobile platforms (#1942)

4 months agostyle
Harmen Stoppels [Wed, 30 Jul 2025 09:47:41 +0000 (11:47 +0200)] 
style

4 months agoX86_AVX512VNNI: test _mm256_dpbusd_epi32 too
Harmen Stoppels [Tue, 29 Jul 2025 12:25:16 +0000 (14:25 +0200)] 
X86_AVX512VNNI: test _mm256_dpbusd_epi32 too

On RHEL9 the GCC is new enough to support AVX512-VNNI, but its assembler
(binutils) is not and errors with

```
Error: unsupported instruction vpdpbusd
```

This was already addressed earlier in
https://github.com/zlib-ng/zlib-ng/pull/1562 to some extent, except that
a check for `_mm256_dpbusd_epi32` was not added, which is what the
assembler errors over.

4 months agoMake test options dependent on ZLIB_ENABLE_TESTS
Cameron Cawley [Fri, 11 Jul 2025 12:19:02 +0000 (13:19 +0100)] 
Make test options dependent on ZLIB_ENABLE_TESTS

4 months agoUpdate incorrect comment
Hans Kristian Rosbach [Sat, 12 Jul 2025 11:08:05 +0000 (13:08 +0200)] 
Update incorrect comment

4 months agoRevert "Inserting strings is not slow any longer, remove bypass in deflate_medium()."
Hans Kristian Rosbach [Sat, 12 Jul 2025 10:48:10 +0000 (12:48 +0200)] 
Revert "Inserting strings is not slow any longer, remove bypass in deflate_medium()."

This reverts commit 322753f36e833343ae030e499564691da15eef32.

4 months agoRevert "Clean up insert_match() in deflate_medium"
Hans Kristian Rosbach [Sat, 12 Jul 2025 10:48:03 +0000 (12:48 +0200)] 
Revert "Clean up insert_match() in deflate_medium"

This reverts commit 56d3d9851a824aeb921c9853042776866aa195a3.

4 months agoPrep for 2.3.0 beta
Hans Kristian Rosbach [Tue, 15 Jul 2025 20:50:46 +0000 (22:50 +0200)] 
Prep for 2.3.0 beta

5 months agoRemove usage of aligned alloc implementations and instead use malloc
Hans Kristian Rosbach [Fri, 4 Jul 2025 10:40:33 +0000 (12:40 +0200)] 
Remove usage of aligned alloc implementations and instead use malloc
and handle alignment internally. We already always have to do those
checks because we have to support external alloc implementations.
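
Handling alignment on top of plain `malloc` is commonly done by over-allocating and remembering the original pointer. The sketch below shows that general technique under the assumption that the alignment is a power of two no smaller than a pointer; it is not the zlib-ng implementation, and all three function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Over-allocate, round up, and stash malloc's pointer just below the result. */
static void *alloc_aligned_sketch(size_t size, size_t align) {
    assert(align >= sizeof(void *) && (align & (align - 1)) == 0);
    void *raw = malloc(size + align + sizeof(void *));
    if (raw == NULL)
        return NULL;
    uintptr_t start = (uintptr_t)raw + sizeof(void *);
    uintptr_t aligned = (start + align - 1) & ~(uintptr_t)(align - 1);
    ((void **)aligned)[-1] = raw;   /* remembered for the matching free */
    return (void *)aligned;
}

/* Release a block obtained from alloc_aligned_sketch. */
static void free_aligned_sketch(void *ptr) {
    if (ptr != NULL)
        free(((void **)ptr)[-1]);
}

/* Usage example: allocate, check alignment, touch both ends, free. */
static int aligned_roundtrip_ok(size_t size, size_t align) {
    unsigned char *p = alloc_aligned_sketch(size, align);
    if (p == NULL)
        return 0;
    int ok = ((uintptr_t)p & (align - 1)) == 0;
    if (size > 0) {
        p[0] = 1;
        p[size - 1] = 2;
    }
    free_aligned_sketch(p);
    return ok;
}
```

Because the same rounding is needed for caller-supplied allocators anyway, a single code path can serve both cases.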

5 months agoRewrite LoongArch64 CRC32 implementation based on ARMv8 with manual alignment
Vladislav Shchapov [Thu, 3 Jul 2025 15:58:16 +0000 (20:58 +0500)] 
Rewrite LoongArch64 CRC32 implementation based on ARMv8 with manual alignment

Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>
5 months agoLoongArch64 micro-optimizations
Vladislav Shchapov [Fri, 20 Jun 2025 16:56:47 +0000 (21:56 +0500)] 
LoongArch64 micro-optimizations

Co-authored-by: junchao-zhao <zhaojunchao@loongson.cn>
Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>
5 months agoUpdate README.md
Vladislav Shchapov [Wed, 18 Jun 2025 20:22:09 +0000 (01:22 +0500)] 
Update README.md

Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>
5 months agoAdd LoongArch64 (LASX) adler32, adler32_fold_copy implementation
Vladislav Shchapov [Sat, 14 Jun 2025 20:44:38 +0000 (01:44 +0500)] 
Add LoongArch64 (LASX) adler32, adler32_fold_copy implementation

Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>