git.ipfire.org Git - thirdparty/zlib-ng.git/commit


author	Adam Stylinski <kungfujesus06@gmail.com>
	Sun, 16 Feb 2025 17:13:00 +0000 (12:13 -0500)
committer	Hans Kristian Rosbach <hk-github@circlestorm.org>
	Fri, 28 Mar 2025 19:43:59 +0000 (20:43 +0100)
commit	724dc0cfb4805dfd57983080ec4d2b3c53262e87
tree	b9bd4347f3059cb5976ec37c9ad25535bd56b9e8
parent	2bba7e8468e808b7a7d5c1045d339eb5ffd12591

Explicit SSE2 vectorization of Chorba CRC method

The version that's currently in the generic implementation for
32768-byte buffers leverages the stack. It manages to autovectorize, but
unfortunately the trips to the stack hurt its performance on the CPUs
that need this the most. This version is explicitly SIMD vectorized and
avoids those trips to the stack. In my testing it's ~10% faster than the
"small" variant, and about 42% faster than the "32768" variant.
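
As a rough illustration of what "explicitly SIMD vectorized" means here
(this is not the code from arch/x86/chorba_sse2.c), the sketch below XORs
one buffer into another 16 bytes at a time with SSE2 intrinsics. The
helper name xor_into_sse2 is hypothetical. The point is only that each
128-bit value is loaded, combined, and stored directly from XMM
registers, with no intermediate scratch array on the stack.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of zlib-ng: dst[i] ^= src[i] for len bytes. */
static void xor_into_sse2(uint8_t *dst, const uint8_t *src, size_t len) {
    assert(dst != NULL && src != NULL);
    size_t i = 0;

    /* Main loop: one unaligned 128-bit load per operand, XOR, store back. */
    for (; i + 16 <= len; i += 16) {
        __m128i a = _mm_loadu_si128((const __m128i *)(dst + i));
        __m128i b = _mm_loadu_si128((const __m128i *)(src + i));
        _mm_storeu_si128((__m128i *)(dst + i), _mm_xor_si128(a, b));
    }

    /* Scalar tail for the remaining 0..15 bytes. */
    for (; i < len; i++)
        dst[i] ^= src[i];
}
```

A real CRC kernel does considerably more per iteration than a plain XOR,
but the same load/compute/store shape is what lets the compiler keep the
working set in registers.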

CMakeLists.txt
arch/x86/Makefile.in
arch/x86/chorba_sse2.c	[new file with mode: 0644]
arch/x86/x86_functions.h
arch/x86/x86_intrins.h
configure
crc32.h
functable.c
test/benchmarks/benchmark_crc32.cc
test/test_crc32.cc
win32/Makefile.msc

Mirror of https://github.com/zlib-ng/zlib-ng.git
