git.ipfire.org Git - thirdparty/zlib-ng.git/commit
Add AVX512 version of compare256 (refs: compare256-avx512, 1901/head)
author Hans Kristian Rosbach <hk-git@circlestorm.org>
Thu, 10 Apr 2025 22:46:06 +0000 (00:46 +0200)
committer Hans Kristian Rosbach <hk-git@circlestorm.org>
Mon, 14 Apr 2025 10:32:16 +0000 (12:32 +0200)
commit e098ae80d4923e11f720bddd735cd0f3d6d9807b
tree c72d9d62aa7023f4ff12142daf631375baa5ae81
parent eb76eca8c70f1132735d05766681f7593fb91321
Add AVX512 version of compare256

Improve the speed of sub-16 byte matches by first using a
128-bit intrinsic; after that, use only 512-bit intrinsics.
This requires us to overlap on the last run, but that is cheaper than
processing the tail using a 256-bit and then a 128-bit run.
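
The chunking described above can be sketched in portable C. This is only an illustration of the overlap idea, not the code added in arch/x86/compare256_avx512.c: the names chunk_mismatch, compare256_overlap_sketch and compare256_sketch_demo are hypothetical, and the byte loop stands in for what an intrinsic-based version would do with one vector compare followed by a trailing-zero count on the mismatch mask. The offsets assume one 16-byte step followed by 64-byte steps over a 256-byte window.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: index of the first differing byte within an n-byte
 * chunk, or n if the chunks are equal. Scalar stand-in for a vector compare. */
static size_t chunk_mismatch(const uint8_t *a, const uint8_t *b, size_t n) {
    size_t i = 0;
    while (i < n && a[i] == b[i])
        i++;
    return i;
}

/* Number of leading bytes (0..256) that are equal in two 256-byte buffers. */
static uint32_t compare256_overlap_sketch(const uint8_t *src0, const uint8_t *src1) {
    /* Step 1: one 16-byte chunk, so short matches exit early. */
    size_t m = chunk_mismatch(src0, src1, 16);
    if (m != 16)
        return (uint32_t)m;

    /* Step 2: full 64-byte chunks at offsets 16, 80 and 144 (bytes 16..207). */
    for (size_t off = 16; off < 208; off += 64) {
        m = chunk_mismatch(src0 + off, src1 + off, 64);
        if (m != 64)
            return (uint32_t)(off + m);
    }

    /* Step 3: 48 bytes remain (208..255). Back up to offset 192 so a single
     * 64-byte chunk covers them. Bytes 192..207 are compared a second time,
     * but they are already known to be equal, so the result is unaffected. */
    m = chunk_mismatch(src0 + 192, src1 + 192, 64);
    if (m != 64)
        return (uint32_t)(192 + m);
    return 256;
}

/* Small usage example: a mismatch inside the overlapped tail is reported
 * at its true position. */
static void compare256_sketch_demo(void) {
    uint8_t a[256], b[256];
    memset(a, 0x5a, sizeof a);
    memcpy(b, a, sizeof b);
    assert(compare256_overlap_sketch(a, b) == 256);
    b[210] ^= 0xff;
    assert(compare256_overlap_sketch(a, b) == 210);
}
```

In this sketch the overlap costs one extra compare over 16 already-verified bytes, in place of two narrower compares (32 bytes, then 16 bytes) for the 48-byte tail.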

Change the benchmark steps so that they hit the chunk boundaries of one
function or the other less often; this gives fairer benchmarks.
CMakeLists.txt
arch/x86/Makefile.in
arch/x86/compare256_avx512.c [new file with mode: 0644]
arch/x86/x86_functions.h
configure
functable.c
test/benchmarks/benchmark_compare256.cc
test/test_compare256.cc