git.ipfire.org Git - thirdparty/zlib-ng.git/log
7 weeks ago Take account of use-case where there is an empty git tree object when reading the...
Paul Marquess [Sun, 15 Feb 2026 12:11:33 +0000 (12:11 +0000)] 
Take account of use-case where there is an empty git tree object when reading the BASE_SHA

7 weeks ago add workflow_dispatch to most of the workflow files
Paul Marquess [Sat, 14 Feb 2026 16:29:40 +0000 (16:29 +0000)] 
add workflow_dispatch to most of the workflow files

7 weeks ago Add .vscode to .gitignore
Nathan Moinvaziri [Tue, 17 Feb 2026 01:50:11 +0000 (17:50 -0800)] 
Add .vscode to .gitignore

7 weeks ago Fixed unused function warning for arm_has_cpuid
Nathan Moinvaziri [Sat, 14 Feb 2026 22:24:33 +0000 (14:24 -0800)] 
Fixed unused function warning for arm_has_cpuid

7 weeks ago Remove unnecessary ARCH_ARM in arm_features.c
Nathan Moinvaziri [Sat, 14 Feb 2026 22:24:11 +0000 (14:24 -0800)] 
Remove unnecessary ARCH_ARM in arm_features.c

7 weeks ago Fix building on RISC-V without RVV.
Mika T. Lindqvist [Fri, 13 Feb 2026 01:10:55 +0000 (03:10 +0200)] 
Fix building on RISC-V without RVV.

7 weeks ago deflate_medium: more readability initialize structs match
Herman Semenoff [Mon, 9 Feb 2026 14:19:55 +0000 (17:19 +0300)] 
deflate_medium: more readability initialize structs match

8 weeks ago [configure] Add initial support for NVHPC toolchain.
Mika T. Lindqvist [Sat, 6 Dec 2025 21:52:57 +0000 (23:52 +0200)] 
[configure] Add initial support for NVHPC toolchain.
* Improve detecting default compiler

8 weeks ago [zconf] Fix LFS support on Windows
Mika Lindqvist [Mon, 9 Feb 2026 10:48:21 +0000 (12:48 +0200)] 
[zconf] Fix LFS support on Windows
* Windows doesn't have unistd.h, so z_off_t declaration only can depend on value of Z_HAVE_UNISTD_H.

2 months ago Cleanup formatting for crc32_chorba files
Nathan Moinvaziri [Fri, 6 Feb 2026 06:25:07 +0000 (22:25 -0800)] 
Cleanup formatting for crc32_chorba files

2 months ago Add e2k codecov build
Vladislav Shchapov [Sat, 31 Jan 2026 17:08:29 +0000 (22:08 +0500)] 
Add e2k codecov build

Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>

2 months ago Add e2k CI
Vladislav Shchapov [Tue, 27 Jan 2026 15:13:57 +0000 (20:13 +0500)] 
Add e2k CI

Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>

2 months ago Add e2k support
Vladislav Shchapov [Sun, 25 Jan 2026 17:52:15 +0000 (22:52 +0500)] 
Add e2k support

Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>

2 months ago Remove build script compiler checks for ctz builtins
Nathan Moinvaziri [Mon, 26 Jan 2026 08:45:07 +0000 (00:45 -0800)] 
Remove build script compiler checks for ctz builtins

2 months ago Cleanup and rename bitreverse16 after ctz builtin refactoring
Nathan Moinvaziri [Mon, 26 Jan 2026 08:35:21 +0000 (00:35 -0800)] 
Cleanup and rename bitreverse16 after ctz builtin refactoring

2 months ago Cleanup compare256 and compare256_rle implementations.
Nathan Moinvaziri [Mon, 2 Feb 2026 21:25:08 +0000 (13:25 -0800)] 
Cleanup compare256 and compare256_rle implementations.

We no longer need to check for HAVE_BUILTIN_CTZ or HAVE_BUILTIN_CTZLL,
since that logic is now handled in zng_ctz32/zng_ctz64.

2 months ago Always use zng_ctz32 for W_BITS calculation
Nathan Moinvaziri [Mon, 26 Jan 2026 08:43:10 +0000 (00:43 -0800)] 
Always use zng_ctz32 for W_BITS calculation

2 months ago Refactor ctz builtins while always providing fallback.
Nathan Moinvaziri [Mon, 2 Feb 2026 21:43:09 +0000 (13:43 -0800)] 
Refactor ctz builtins while always providing fallback.

Centralize count trailing zeros logic in fallback_builtins.h with
zng_ctz32/zng_ctz64 that use hardware intrinsics when available and
De Bruijn multiplication as portable fallback.
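
As a rough illustration of the approach this message describes, the sketch below pairs a compiler builtin with a De Bruijn multiplication fallback for 32-bit values. The names `sketch_ctz32` and `sketch_ctz32_fallback` are hypothetical stand-ins (the commit names the real helpers `zng_ctz32`/`zng_ctz64`), and the table is the widely published 32-bit De Bruijn lookup, not necessarily the one used in fallback_builtins.h. The input is assumed to be non-zero.

```c
#include <stdint.h>

/* Portable fallback: count trailing zeros via De Bruijn multiplication.
 * Hypothetical name; the input must be non-zero. */
static inline uint32_t sketch_ctz32_fallback(uint32_t value) {
    static const uint8_t debruijn_table[32] = {
         0,  1, 28,  2, 29, 14, 24,  3, 30, 22, 20, 15, 25, 17,  4,  8,
        31, 27, 13, 23, 21, 19, 16,  7, 26, 12, 18,  6, 11,  5, 10,  9
    };
    /* Isolate the lowest set bit; (0u - value) is the two's complement negation. */
    uint32_t lowest = value & (0u - value);
    /* Multiplying by the De Bruijn constant moves a unique 5-bit pattern
     * into the top bits, which indexes the lookup table. */
    return debruijn_table[(uint32_t)(lowest * 0x077CB531u) >> 27];
}

/* Use the hardware-backed builtin when the compiler offers one,
 * otherwise fall back to the portable version above. */
static inline uint32_t sketch_ctz32(uint32_t value) {
#if defined(__GNUC__) || defined(__clang__)
    return (uint32_t)__builtin_ctz(value);
#else
    return sketch_ctz32_fallback(value);
#endif
}
```

Keeping the fallback always compiled means callers never need their own `HAVE_BUILTIN_CTZ`-style checks; they call one helper and the selection happens in a single place.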

2 months ago Change "None" to "OFF" for sanitizer option (#2141)
Pavel P [Fri, 6 Feb 2026 00:17:14 +0000 (02:17 +0200)] 
Change "None" to "OFF" for sanitizer option (#2141)

* Change "None" to "OFF" for sanitizer option

Co-authored-by: Benoit Pierre <benoit.pierre@gmail.com>

2 months ago Rename chorba SSE files to crc32_chorba for consistency
Nathan Moinvaziri [Sun, 1 Feb 2026 20:11:27 +0000 (12:11 -0800)] 
Rename chorba SSE files to crc32_chorba for consistency

2 months ago Simplify logic in INFLATE_ALLOW_INVALID_DISTANCE_TOOFAR_ARRR
Nathan Moinvaziri [Tue, 20 Jan 2026 00:45:53 +0000 (16:45 -0800)] 
Simplify logic in INFLATE_ALLOW_INVALID_DISTANCE_TOOFAR_ARRR

2 months ago Slide 32 hash entries per loop iteration when using AVX2.
Mika T. Lindqvist [Sat, 31 Jan 2026 19:44:33 +0000 (21:44 +0200)] 
Slide 32 hash entries per loop iteration when using AVX2.

2 months ago Clean up crc32_braid/chorba calls.
Nathan Moinvaziri [Thu, 15 Jan 2026 07:46:28 +0000 (23:46 -0800)] 
Clean up crc32_braid/chorba calls.

2 months ago riscv: features: add support for detecting V/Zbc via hwprobe
Icenowy Zheng [Wed, 28 Jan 2026 08:13:56 +0000 (16:13 +0800)] 
riscv: features: add support for detecting V/Zbc via hwprobe

Adding support for riscv_hwprobe and detecting V/Zbc via it.

The needed macros should be in Linux 6.12 UAPI headers, which are
shipped by Debian Trixie.

Tested via qemu-user that the Zbc codepath is examined by adding some
code there.

Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
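
For readers unfamiliar with hwprobe, the following is a minimal sketch of how V and Zbc might be queried through the `riscv_hwprobe` syscall on Linux. It is not the code from this commit: the struct and function names prefixed with `sketch_` are hypothetical, and the sketch assumes the UAPI header `asm/hwprobe.h` provides `RISCV_HWPROBE_KEY_IMA_EXT_0`, `RISCV_HWPROBE_IMA_V` and `RISCV_HWPROBE_EXT_ZBC`. On any other platform, or with older headers, it compiles to a stub that reports nothing detected.

```c
#include <stddef.h>

/* Only pull in the hwprobe UAPI header when building for Linux/RISC-V
 * and the header is actually present. */
#if defined(__linux__) && defined(__riscv) && defined(__has_include)
#  if __has_include(<asm/hwprobe.h>)
#    include <asm/hwprobe.h>
#    include <sys/syscall.h>
#    include <unistd.h>
#    define SKETCH_HAVE_HWPROBE 1
#  endif
#endif

/* Hypothetical result struct for this sketch. */
struct sketch_riscv_features {
    int has_v;    /* vector extension reported by the kernel */
    int has_zbc;  /* carry-less multiply extension reported by the kernel */
};

/* Returns 1 if hwprobe answered, 0 if it is unavailable or failed.
 * The feature flags are always initialized to 0 first. */
static int sketch_probe_riscv(struct sketch_riscv_features *features) {
    features->has_v = 0;
    features->has_zbc = 0;
#if defined(SKETCH_HAVE_HWPROBE) && defined(__NR_riscv_hwprobe) && \
    defined(RISCV_HWPROBE_IMA_V) && defined(RISCV_HWPROBE_EXT_ZBC)
    struct riscv_hwprobe pair = { .key = RISCV_HWPROBE_KEY_IMA_EXT_0, .value = 0 };
    /* One key/value pair, no CPU set restriction, no flags. */
    if (syscall(__NR_riscv_hwprobe, &pair, 1, 0, NULL, 0) != 0 || pair.key < 0)
        return 0;
    features->has_v = (pair.value & RISCV_HWPROBE_IMA_V) != 0;
    features->has_zbc = (pair.value & RISCV_HWPROBE_EXT_ZBC) != 0;
    return 1;
#else
    return 0;
#endif
}
```

When the probe returns 0, a caller would presumably fall back to whatever detection it used before, which is why the flags are cleared up front.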

2 months ago riscv: features: add check for asm/hwprobe.h in build systems
Icenowy Zheng [Wed, 28 Jan 2026 15:34:45 +0000 (23:34 +0800)] 
riscv: features: add check for asm/hwprobe.h in build systems

Currently the check follows the practice of arm_acle.h. It's checked in
the configure script only when riscv_features is built, but always
checked for CMake.

Signed-off-by: Icenowy Zheng <uwu@icenowy.me>

2 months ago riscv: features: prepare for more runtime detection facilities
Icenowy Zheng [Tue, 27 Jan 2026 15:12:15 +0000 (23:12 +0800)] 
riscv: features: prepare for more runtime detection facilities

As hwprobe support is going to be added, do some preparation for it.

Signed-off-by: Icenowy Zheng <uwu@icenowy.me>

2 months ago Fix building on FreeBSD/OpenBSD
Brad Smith [Wed, 28 Jan 2026 11:16:00 +0000 (06:16 -0500)] 
Fix building on FreeBSD/OpenBSD

Put the checks in the right order. Newer before older.

2 months ago Remove unnecessary string.h include in x86_features
Nathan Moinvaziri [Thu, 29 Jan 2026 03:48:43 +0000 (19:48 -0800)] 
Remove unnecessary string.h include in x86_features

2 months ago Use index-based CRC macros and inline memcpy.
Nathan Moinvaziri [Mon, 26 Jan 2026 22:53:43 +0000 (14:53 -0800)] 
Use index-based CRC macros and inline memcpy.

2 months ago Loop unroll for len >= 8 in crc32_copy_small.
Nathan Moinvaziri [Thu, 15 Jan 2026 04:17:53 +0000 (20:17 -0800)] 
Loop unroll for len >= 8 in crc32_copy_small.

2 months ago Move crc32_copy_small to shared private header.
Nathan Moinvaziri [Thu, 15 Jan 2026 04:15:05 +0000 (20:15 -0800)] 
Move crc32_copy_small to shared private header.

2 months ago Combine Huffman code and extra bits into single shift operation
Dougall Johnson [Sun, 25 Jan 2026 18:34:14 +0000 (10:34 -0800)] 
Combine Huffman code and extra bits into single shift operation

This changes the "code" structure so that "bits" contains the total
number of bits, and "op & 15" contains the non-extra bit count.

Based on https://github.com/dougallj/zlib-dougallj/commit/34b9fc457b5247d7d2d732e6f28c9a80ff16abd7

Co-authored-by: Nathan Moinvaziri <nathan@nathanm.com>
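
To make the layout described in this message concrete, here is a small sketch of decoding a length or distance when "bits" holds the total bit count and "op & 15" holds the Huffman code length. The struct `sketch_code` and function `sketch_decode_value` are hypothetical and simplified; the real "code" structure and inflate loop carry more state than shown here.

```c
#include <stdint.h>

/* Simplified, hypothetical table entry mirroring the description above. */
typedef struct {
    uint8_t  bits;  /* total bits to consume: Huffman code plus extra bits */
    uint8_t  op;    /* low 4 bits: length of the Huffman code alone */
    uint16_t val;   /* base length or distance */
} sketch_code;

/* Consume one code plus its extra bits from the bit buffer and
 * return the decoded value. Assumes here.bits is well below 64. */
static uint32_t sketch_decode_value(uint64_t *hold, unsigned *bit_count, sketch_code here) {
    unsigned code_bits = here.op & 15u;
    /* Keep only the bits belonging to this symbol... */
    uint64_t chunk = *hold & ((UINT64_C(1) << here.bits) - 1);
    /* ...then one shift drops the Huffman code and leaves the extra bits. */
    uint32_t value = here.val + (uint32_t)(chunk >> code_bits);
    /* A single shift of the bit buffer discards code and extra bits together. */
    *hold >>= here.bits;
    *bit_count -= here.bits;
    return value;
}
```

With a 3-bit code followed by 2 extra bits holding the value 3 and a base of 11, the function returns 14 and removes 5 bits from the buffer in one step, instead of dropping the code and the extra bits separately.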

2 months ago Check cpuid availability on FreeBSD/OpenBSD when detecting fast/pmull
Nathan Moinvaziri [Sun, 25 Jan 2026 02:00:50 +0000 (18:00 -0800)] 
Check cpuid availability on FreeBSD/OpenBSD when detecting fast/pmull

2 months ago Move cleanup to inffast_tpl.h, clean up INFLATE_FAST
Pavel P [Wed, 21 Jan 2026 12:30:11 +0000 (14:30 +0200)] 
Move cleanup to inffast_tpl.h, clean up INFLATE_FAST

2 months ago Add cleanup for defines that might be set before inclusion of chunkset_tpl.h
Pavel P [Wed, 21 Jan 2026 11:10:03 +0000 (13:10 +0200)] 
Add cleanup for defines that might be set before inclusion of chunkset_tpl.h

 + remove unused `HAVE_CHUNKUNROLL`
 + typo `utilisation` => `utilization`

2 months ago Read architecture name from binary in detect-arch.
Nathan Moinvaziri [Sat, 17 Jan 2026 18:10:49 +0000 (10:10 -0800)] 
Read architecture name from binary in detect-arch.

2 months ago Remove architecture-specific compile definitions from build system
Nathan Moinvaziri [Sat, 17 Jan 2026 16:43:26 +0000 (08:43 -0800)] 
Remove architecture-specific compile definitions from build system

2 months ago Convert arch detection from preprocessor errors to runtime with header
Nathan Moinvaziri [Sun, 18 Jan 2026 03:12:44 +0000 (19:12 -0800)] 
Convert arch detection from preprocessor errors to runtime with header

2 months ago Move DoNotOptimize in benchmark loop to prevent it being optimized away
Nathan Moinvaziri [Thu, 22 Jan 2026 01:10:14 +0000 (17:10 -0800)] 
Move DoNotOptimize in benchmark loop to prevent it being optimized away

In some cases I've noticed incorrect 0 benchmark results from compiler
optimizing away len during hash benchmarks.

2 months ago Use MIN macro in a few more instances throughout the code
Nathan Moinvaziri [Tue, 20 Jan 2026 01:06:02 +0000 (17:06 -0800)] 
Use MIN macro in a few more instances throughout the code

2 months ago Add ALIGN_UP and ALIGN_DOWN macros for readability
Nathan Moinvaziri [Thu, 8 Jan 2026 02:23:23 +0000 (18:23 -0800)] 
Add ALIGN_UP and ALIGN_DOWN macros for readability
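
The commit does not show the macro bodies, so the following is only a common way such helpers are written, assuming the alignment is a power of two. The `SKETCH_` prefixed names are hypothetical and are not the definitions added by this commit.

```c
#include <stdint.h>

/* Round down to the previous multiple of `align`.
 * Assumption: `align` is a power of two. */
#define SKETCH_ALIGN_DOWN(value, align) \
    ((uintptr_t)(value) & ~((uintptr_t)(align) - 1))

/* Round up to the next multiple of `align` by adding (align - 1)
 * and then rounding down. Same power-of-two assumption. */
#define SKETCH_ALIGN_UP(value, align) \
    SKETCH_ALIGN_DOWN((uintptr_t)(value) + ((uintptr_t)(align) - 1), (align))
```

For example, with a 16-byte alignment, 13 rounds down to 0 and up to 16, while a value that is already aligned is returned unchanged by both macros.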

2 months ago Remove redundant include directives
Robert Kausch [Fri, 23 Jan 2026 13:16:46 +0000 (14:16 +0100)] 
Remove redundant include directives

2 months ago Make feature macro tests more consistent
Robert Kausch [Wed, 21 Jan 2026 23:15:08 +0000 (00:15 +0100)] 
Make feature macro tests more consistent

- Test feature macros before any includes
- Use #ifdef over #if defined() for single macro tests
- Always include zbuild.h first

2 months ago Test feature macros in all implementation files
Robert Kausch [Mon, 19 Jan 2026 15:41:14 +0000 (16:41 +0100)] 
Test feature macros in all implementation files

2 months ago Reduce code size and directly call in adler32_copy
Nathan Moinvaziri [Thu, 15 Jan 2026 07:53:18 +0000 (23:53 -0800)] 
Reduce code size and directly call in adler32_copy

There is no need for _impl when there is no const int COPY being used

2 months ago Fix oversized pair allocation in adler32_vmx
Nathan Moinvaziri [Wed, 14 Jan 2026 23:39:34 +0000 (15:39 -0800)] 
Fix oversized pair allocation in adler32_vmx

2 months ago Skip redundant literal checks in inflate_fast
Nathan Moinvaziri [Fri, 16 Jan 2026 18:36:57 +0000 (10:36 -0800)] 
Skip redundant literal checks in inflate_fast

When we know a code is not a literal we can skip the redundant check for
op == 0.

2 months ago Shared macros for inflate decoding trace statements
Nathan Moinvaziri [Fri, 16 Jan 2026 18:50:48 +0000 (10:50 -0800)] 
Shared macros for inflate decoding trace statements

Previously, trace statements for some literal decodes were not being
reported in inflate_fast.

2 months ago Replace conditional byte swapping with portable host/LE conversion
Nathan Moinvaziri [Fri, 16 Jan 2026 19:49:10 +0000 (11:49 -0800)] 
Replace conditional byte swapping with portable host/LE conversion
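
The commit title does not say how the conversion is implemented. One portable way to move between host order and little-endian without any endianness conditionals is to assemble values byte by byte, as in this sketch. The function names are hypothetical and this is not necessarily the technique the commit uses.

```c
#include <stdint.h>

/* Read a 32-bit little-endian value from memory. The shifts define the
 * byte order explicitly, so the result is the same on any host. */
static inline uint32_t sketch_load_le32(const uint8_t *src) {
    return (uint32_t)src[0]
         | ((uint32_t)src[1] << 8)
         | ((uint32_t)src[2] << 16)
         | ((uint32_t)src[3] << 24);
}

/* Write a 32-bit host value to memory in little-endian order. */
static inline void sketch_store_le32(uint8_t *dst, uint32_t value) {
    dst[0] = (uint8_t)(value & 0xFFu);
    dst[1] = (uint8_t)((value >> 8) & 0xFFu);
    dst[2] = (uint8_t)((value >> 16) & 0xFFu);
    dst[3] = (uint8_t)((value >> 24) & 0xFFu);
}
```

Because no `#if` on byte order is involved, the same source compiles for little-endian and big-endian targets, and modern compilers typically reduce the load to a single move (plus a swap on big-endian hosts).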

2 months ago Remove unnecessary CHUNK_SIZE define - rm last ref
Pavel P [Thu, 22 Jan 2026 16:43:43 +0000 (18:43 +0200)] 
Remove unnecessary CHUNK_SIZE define - rm last ref

2 months ago Remove unnecessary CHUNK_SIZE define
Pavel P [Thu, 22 Jan 2026 09:04:33 +0000 (11:04 +0200)] 
Remove unnecessary CHUNK_SIZE define

2 months ago Don't run testing with the extra corpora for RISC-V builds,
Hans Kristian Rosbach [Thu, 22 Jan 2026 15:32:46 +0000 (16:32 +0100)] 
Don't run testing with the extra corpora for RISC-V builds,
they take ~15min to finish a single job and they easily bog down
the whole CI queue when multiple pushes/PRs are queued.

2 months ago Also upload coverage reports to coveralls.io
Hans Kristian Rosbach [Wed, 21 Jan 2026 17:39:28 +0000 (18:39 +0100)] 
Also upload coverage reports to coveralls.io

2 months ago - Merge -O3 test into prefix test.
Hans Kristian Rosbach [Thu, 22 Jan 2026 12:51:44 +0000 (13:51 +0100)] 
- Merge -O3 test into prefix test.
- Enable benchmarks in both prefix tests, to make sure they work with prefixes.

2 months ago Split OSB builds out into separate workflow.
Hans Kristian Rosbach [Thu, 22 Jan 2026 10:39:49 +0000 (11:39 +0100)] 
Split OSB builds out into separate workflow.
Enable verbose for cmake build stage.

2 months ago Add Windows ARM support for EOR3 feature detection and MSVC intrinsics
Mika Lindqvist [Tue, 13 Jan 2026 21:07:56 +0000 (13:07 -0800)] 
Add Windows ARM support for EOR3 feature detection and MSVC intrinsics

2 months ago Check for null return value from getauxval(AT_PLATFORM)
Nathan Moinvaziri [Tue, 13 Jan 2026 03:11:00 +0000 (19:11 -0800)] 
Check for null return value from getauxval(AT_PLATFORM)

2 months ago Guard FreeBSD/OpenBSD auxv calls with HAVE_SYS_AUXV_H check
Brad Smith [Tue, 13 Jan 2026 02:38:35 +0000 (18:38 -0800)] 
Guard FreeBSD/OpenBSD auxv calls with HAVE_SYS_AUXV_H check

2 months ago Add elf_aux_info() support on FreeBSD/OpenBSD for PMULL and EOR3
Brad Smith [Tue, 13 Jan 2026 02:38:35 +0000 (18:38 -0800)] 
Add elf_aux_info() support on FreeBSD/OpenBSD for PMULL and EOR3

Use elf_aux_info() to detect PMULL and EOR3 on FreeBSD and OpenBSD
aarch64.

2 months ago Remove unnecessary ARM_AUXV_HAS_NEON preprocessor check
Nathan Moinvaziri [Tue, 13 Jan 2026 03:25:57 +0000 (19:25 -0800)] 
Remove unnecessary ARM_AUXV_HAS_NEON preprocessor check

2 months ago Remove unnecessary ARM_AUXV_HAS_CRC32 preprocessor check
Nathan Moinvaziri [Tue, 13 Jan 2026 02:39:13 +0000 (18:39 -0800)] 
Remove unnecessary ARM_AUXV_HAS_CRC32 preprocessor check

2 months ago Use ARCH_64BIT preprocessor define in arm feature checks.
Nathan Moinvaziri [Tue, 13 Jan 2026 03:13:22 +0000 (19:13 -0800)] 
Use ARCH_64BIT preprocessor define in arm feature checks.

2 months ago Clean up arm feature check return values.
Nathan Moinvaziri [Tue, 13 Jan 2026 03:35:34 +0000 (19:35 -0800)] 
Clean up arm feature check return values.

2 months ago Pre-calculate last vector check ptr in compare256 for sse2 and lsx
Nathan Moinvaziri [Thu, 8 Jan 2026 07:51:08 +0000 (23:51 -0800)] 
Pre-calculate last vector check ptr in compare256 for sse2 and lsx

2 months ago Add "None" for sanitizer option
Pavel P [Tue, 20 Jan 2026 19:52:13 +0000 (21:52 +0200)] 
Add "None" for sanitizer option

2 months ago Move cleanup undefs to insert_string_tpl.h
Pavel P [Tue, 20 Jan 2026 19:50:58 +0000 (21:50 +0200)] 
Move cleanup undefs to insert_string_tpl.h

2 months ago Fix integer overflow in gz_compress_mmap
Vladislav Shchapov [Sat, 17 Jan 2026 13:46:50 +0000 (18:46 +0500)] 
Fix integer overflow in gz_compress_mmap

Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>

2 months ago Fix CI configure workflow failure artifact upload
Hans Kristian Rosbach [Tue, 13 Jan 2026 21:01:44 +0000 (22:01 +0100)] 
Fix CI configure workflow failure artifact upload

2 months ago Expand configure script testing.
Hans Kristian Rosbach [Sun, 11 Jan 2026 21:50:21 +0000 (22:50 +0100)] 
Expand configure script testing.
- Remove 2 redundant jobs.
- Add 1 new job.
- Test multiple non-conflicting options in more of the jobs.

2 months ago Enable codecov for more CI jobs.
Hans Kristian Rosbach [Sun, 11 Jan 2026 14:14:11 +0000 (15:14 +0100)] 
Enable codecov for more CI jobs.
Disable codecov where -O1 or higher is requested, since codecov sets -O0.
Disable codecov where tests are not run.
Add comments for jobs where codecov is not enabled.

2 months ago Use default clang version for most builds.
Hans Kristian Rosbach [Sun, 11 Jan 2026 17:17:26 +0000 (18:17 +0100)] 
Use default clang version for most builds.
Let one job use clang-15, and a few clang-20

2 months ago Combine ARM CI jobs testing non-NEON with non-ARMv8, as these have no common optimized
Hans Kristian Rosbach [Sun, 11 Jan 2026 22:05:51 +0000 (23:05 +0100)] 
Combine ARM CI jobs testing non-NEON with non-ARMv8, as these have no common optimized
functions. For Aarch64, use no-opt config for testing both without Neon/Armv8.
This reduced cmake and configure jobs by 3 each.
Also reorder and rename a few other jobs to try to use a common style.

2 months ago Remove separate MMAP CI job by folding into another.
Hans Kristian Rosbach [Sun, 11 Jan 2026 18:03:27 +0000 (19:03 +0100)] 
Remove separate MMAP CI job by folding into another.
Remove separate REDUCED_MEM CI job by folding into another.
Make sure both are present for both GCC and Clang.
Add ZLIB_COMPAT to clang debug job.

2 months ago Fix codecov parameter placement warnings.
Hans Kristian Rosbach [Sun, 11 Jan 2026 18:16:28 +0000 (19:16 +0100)] 
Fix codecov parameter placement warnings.

2 months ago Add configured compiler defines to informational output,
Hans Kristian Rosbach [Sun, 11 Jan 2026 13:51:31 +0000 (14:51 +0100)] 
Add configured compiler defines to informational output,
this eases debugging, especially in CI where further inspection is hard.

2 months ago Improve detection of compiler code coverage support.
Hans Kristian Rosbach [Sun, 11 Jan 2026 13:49:06 +0000 (14:49 +0100)] 
Improve detection of compiler code coverage support.

2 months ago Unify baseline benchmarking for both adler32 and crc32.
Hans Kristian Rosbach [Thu, 15 Jan 2026 22:42:19 +0000 (23:42 +0100)] 
Unify baseline benchmarking for both adler32 and crc32.
Fix missing benchmarks of _copy functions for some platforms.

2 months ago Unify compare256/compare256_rle benchmarks and add rolling misalignment
Hans Kristian Rosbach [Wed, 14 Jan 2026 20:18:56 +0000 (21:18 +0100)] 
Unify compare256/compare256_rle benchmarks and add rolling misalignment

2 months ago Unify adler32/crc32 benchmarks and add rotating misalignment
Hans Kristian Rosbach [Wed, 14 Jan 2026 16:18:02 +0000 (17:18 +0100)] 
Unify adler32/crc32 benchmarks and add rotating misalignment
Add aligned benchmarks for adler32/crc32

2 months ago Use aligned alloc for insert_string benchmark
Hans Kristian Rosbach [Wed, 14 Jan 2026 20:20:47 +0000 (21:20 +0100)] 
Use aligned alloc for insert_string benchmark

2 months ago Fix name collision in inflate benchmark
Hans Kristian Rosbach [Wed, 14 Jan 2026 22:37:34 +0000 (23:37 +0100)] 
Fix name collision in inflate benchmark

2 months ago Make deflate output deterministic if PREFIX3(stream) is reused after deflateReset
Vladislav Shchapov [Tue, 13 Jan 2026 20:02:47 +0000 (01:02 +0500)] 
Make deflate output deterministic if PREFIX3(stream) is reused after deflateReset

Co-authored-by: Marcin Kowalczyk <QrczakMK@gmail.com>
Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>

2 months ago Prefix macros with z in crc32_vpclmulqdq for clarity
Nathan Moinvaziri [Tue, 13 Jan 2026 18:04:55 +0000 (10:04 -0800)] 
Prefix macros with z in crc32_vpclmulqdq for clarity

2 months ago Use epi32 variants for older MSVC (v141/v140) to avoid cast warnings
Nathan Moinvaziri [Tue, 13 Jan 2026 16:43:26 +0000 (08:43 -0800)] 
Use epi32 variants for older MSVC (v141/v140) to avoid cast warnings

2 months ago Fix cast truncates constant value warnings with ternarylogic on Win v141
Nathan Moinvaziri [Mon, 12 Jan 2026 01:16:36 +0000 (17:16 -0800)] 
Fix cast truncates constant value warnings with ternarylogic on Win v141

2 months ago Use epi64 intrinsics for VPCLMULQDQ operations
Nathan Moinvaziri [Sun, 11 Jan 2026 22:53:45 +0000 (14:53 -0800)] 
Use epi64 intrinsics for VPCLMULQDQ operations

PCLMULQDQ operates on 64-bit polynomial elements, so use epi64 intrinsics
throughout to provide accurate type information to the compiler.

2 months ago Use masked load/store in partial folding in crc32_vpclmulqdq.
Nathan Moinvaziri [Sun, 11 Jan 2026 20:05:46 +0000 (12:05 -0800)] 
Use masked load/store in partial folding in crc32_vpclmulqdq.

2 months ago Combine final_fold function to remove extra len branch
Nathan Moinvaziri [Sun, 11 Jan 2026 20:03:26 +0000 (12:03 -0800)] 
Combine final_fold function to remove extra len branch

2 months ago Eliminate extra vmovdqu instruction folding xmm into zmm.
Nathan Moinvaziri [Sun, 11 Jan 2026 19:32:44 +0000 (11:32 -0800)] 
Eliminate extra vmovdqu instruction folding xmm into zmm.

Fixed by using _mm512_castsi128_si512() and removing redundant insert.

2 months ago Clean up variable names for readability in zmm path.
Nathan Moinvaziri [Sat, 3 Jan 2026 07:47:34 +0000 (23:47 -0800)] 
Clean up variable names for readability in zmm path.

2 months ago Don't compile in Chorba for vpclmulqdq because it is never used
Nathan Moinvaziri [Sat, 3 Jan 2026 02:14:24 +0000 (18:14 -0800)] 
Don't compile in Chorba for vpclmulqdq because it is never used

By the time Chorba if statement is hit, len is already reduced to < 256.

2 months ago Combine partial and final fold and reduce the number of operations
Nathan Moinvaziri [Sun, 11 Jan 2026 19:34:45 +0000 (11:34 -0800)] 
Combine partial and final fold and reduce the number of operations

We do the partial fold after we have folded the crc32 state into a single
128-bit value.

2 months ago Generate shuffle masks in registers for partial_fold.
Nathan Moinvaziri [Fri, 2 Jan 2026 23:03:33 +0000 (15:03 -0800)] 
Generate shuffle masks in registers for partial_fold.

Faster than loading table into memory.

2 months ago Use mm_blend_epi16 in crc32_(v)pclmulqdq final reduction
Nathan Moinvaziri [Fri, 2 Jan 2026 22:47:53 +0000 (14:47 -0800)] 
Use mm_blend_epi16 in crc32_(v)pclmulqdq final reduction

This is the preferred operation mentioned in
https://www.corsix.org/content/alternative-exposition-crc32_4k_pclmulqdq

2 months ago Use ternarylogic when available in crc32_vpclmulqdq.
Nathan Moinvaziri [Sun, 11 Jan 2026 20:17:34 +0000 (12:17 -0800)] 
Use ternarylogic when available in crc32_vpclmulqdq.

2 months ago Hoist folding constants to function scope to avoid repeated loads
Nathan Moinvaziri [Sat, 3 Jan 2026 02:26:16 +0000 (18:26 -0800)] 
Hoist folding constants to function scope to avoid repeated loads

2 months ago Batch PCLMULQDQ operations to reduce latency
Nathan Moinvaziri [Sun, 11 Jan 2026 21:28:20 +0000 (13:28 -0800)] 
Batch PCLMULQDQ operations to reduce latency

2 months ago Move remaining fold calls before load to hide latency
Nathan Moinvaziri [Fri, 2 Jan 2026 08:46:36 +0000 (00:46 -0800)] 
Move remaining fold calls before load to hide latency

All fold calls are now consistent in this respect.

2 months ago Revert "Move fold calls closer to last change in xmm_crc# variables."
Nathan Moinvaziri [Fri, 2 Jan 2026 03:26:14 +0000 (19:26 -0800)] 
Revert "Move fold calls closer to last change in xmm_crc# variables."

The fold calls were in a better spot before being located after loads to
reduce latency.

This reverts commit cda0827b6d522acdb2656114e2c4b7b18b6c1c20.

2 months ago Remove old comments about crc32 folding from crc32 benchmark.
Nathan Moinvaziri [Wed, 31 Dec 2025 23:18:38 +0000 (15:18 -0800)] 
Remove old comments about crc32 folding from crc32 benchmark.