git.ipfire.org Git - thirdparty/zlib-ng.git/commit

author	shuxinyang <syang@shuxinyangs-mbp.gateway.2wire.net>
	Mon, 10 Mar 2014 00:20:02 +0000 (17:20 -0700)
committer	hansr <hansr@hk.drivdigital.no>
	Tue, 7 Oct 2014 12:45:00 +0000 (14:45 +0200)
commit	5545321e6144c36842e4bdb717000c0098f5698c
tree	48509f489d68adb71424c9fb506f1a8b0a4d2016
parent	e176b3c23ace88d5ded5b8f8371bbab6d7b02ba8

  Rewrite the loops so that gcc can vectorize them using saturated
subtraction on the x86-64 architecture. This speeds up performance by some
7% on my Linux box with corei7 architecture.
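
  For illustration only, a minimal sketch of the loop shape this refers to.
It is not the code from this commit: the name slide_table, the 16-bit Pos
typedef and the use of 0 as the empty value are assumptions. Each entry has
the window size subtracted and is clamped at zero, which is an unsigned
saturating subtract. Walking forward with a loop-local temporary of the
element type is the kind of form a vectorizer may map to a packed
saturating-subtract instruction (for example PSUBUSW on SSE2), though whether
it does depends on the compiler version and flags.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint16_t Pos;   /* assumption: 16-bit table entries */

/* Hypothetical helper: subtract wsize from every entry, clamping at 0.
 * The forward index and the loop-local temporary `m` of type Pos keep the
 * body a plain element-wise unsigned saturating subtract. */
static void slide_table(Pos *table, size_t n, Pos wsize)
{
    size_t i;
    for (i = 0; i < n; i++) {
        Pos m = table[i];
        table[i] = (Pos)(m >= wsize ? m - wsize : 0);
    }
}
```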

  The original loop is legal to vectorize, but gcc 4.7.* and 4.8.*
somehow fail to catch this case. There is still room to squeeze more out
of the vectorized code. However, since these loops now account for only about
1.5% of execution time, it is not worthwhile to squeeze out more performance
by hand-writing assembly.

  The original loops are guarded with "#ifdef NOT_TWEAK_COMPILER". By
default, the modified version is picked up unless the code is compiled
explicitly with -DNOT_TWEAK_COMPILER.
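
  A hedged sketch of how such a guard selects between the two variants at
compile time. Only the macro name NOT_TWEAK_COMPILER comes from this message;
the function name slide_entries, the Pos typedef and both loop bodies are
assumptions written for illustration. Both branches compute the same result,
so the macro only changes the shape of the loop the compiler sees.

```c
#include <stdint.h>

typedef uint16_t Pos;   /* assumption: 16-bit table entries */

/* Hypothetical helper; requires n > 0 in both variants. */
static void slide_entries(Pos *table, unsigned n, unsigned wsize)
{
#ifdef NOT_TWEAK_COMPILER
    /* Original-style loop: walk backward, with an unsigned int temporary. */
    Pos *p = table + n;
    unsigned m;
    do {
        m = *--p;
        *p = (Pos)(m >= wsize ? m - wsize : 0);
    } while (--n);
#else
    /* Default: walk forward, with temporaries of the element type. */
    unsigned i;
    Pos t = (Pos)wsize;
    for (i = 0; i < n; i++) {
        Pos m = table[i];
        table[i] = (Pos)(m >= t ? m - t : 0);
    }
#endif
}
```

  Building with "gcc -O3 file.c" takes the default branch, while
"gcc -O3 -DNOT_TWEAK_COMPILER file.c" takes the original-style one.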