git.ipfire.org Git - thirdparty/zlib-ng.git/commit
Add AVX512 version of compare256
author Hans Kristian Rosbach <hk-git@circlestorm.org>
Thu, 10 Apr 2025 22:46:06 +0000 (00:46 +0200)
committer Hans Kristian Rosbach <hk-github@circlestorm.org>
Mon, 14 Apr 2025 21:28:38 +0000 (23:28 +0200)
commit 00a3168d5dd2e93ae65f83b91228f00f0bf507be
tree 91756787eaf42d219dc504255188bb0e0a341563
parent cfd90c7e1ace237b271ad000826051a4571af170

Improve the speed of sub-16-byte matches by first using a
128-bit intrinsic, then using only 512-bit intrinsics for the rest.
This requires overlapping on the last run, but that is cheaper than
processing the tail with a 256-bit run followed by a 128-bit run.
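
The chunking strategy described above can be sketched in portable C. This is
not the code from arch/x86/compare256_avx512.c: the names compare256_sketch
and first_mismatch are hypothetical, a scalar loop stands in for the 128-bit
and 512-bit intrinsics, and the chunk offsets (16, 80, 144, then an
overlapping final chunk at 192) are an assumption inferred from the commit
message rather than taken from the source.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper standing in for one vector compare:
 * returns the index of the first differing byte, or len if all match. */
static size_t first_mismatch(const uint8_t *a, const uint8_t *b, size_t len) {
    size_t i = 0;
    while (i < len && a[i] == b[i])
        i++;
    return i;
}

/* Returns the number of leading bytes (0..256) that are equal. */
static uint32_t compare256_sketch(const uint8_t *src0, const uint8_t *src1) {
    /* One 16-byte step first, so short matches exit early. */
    size_t n = first_mismatch(src0, src1, 16);
    if (n < 16)
        return (uint32_t)n;

    /* Three full 64-byte steps cover bytes 16..207. */
    for (size_t off = 16; off < 208; off += 64) {
        n = first_mismatch(src0 + off, src1 + off, 64);
        if (n < 64)
            return (uint32_t)(off + n);
    }

    /* Only 48 bytes remain (208..255). Instead of a 32-byte step plus a
     * 16-byte step, run one more 64-byte step starting at 192. It re-checks
     * bytes 192..207, which are already known to be equal. */
    n = first_mismatch(src0 + 192, src1 + 192, 64);
    if (n < 64)
        return (uint32_t)(192 + n);

    return 256;
}
```

Because the overlapped bytes have already compared equal, a mismatch found in
the final step always lies at offset 208 or later, so the returned length
stays correct.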

Change the benchmark steps so that they hit the chunk boundaries of
one function or the other less often; this gives fairer benchmarks.
CMakeLists.txt
arch/x86/Makefile.in
arch/x86/compare256_avx512.c [new file with mode: 0644]
arch/x86/x86_functions.h
configure
functable.c
test/benchmarks/benchmark_compare256.cc
test/test_compare256.cc