From: Sebastian Pop
Date: Mon, 27 Feb 2017 17:21:59 +0000 (-0600)
Subject: call memset for read after write dependences at distance 1
X-Git-Tag: 1.9.9-b1~667
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=fe8b3cf0a69f2eb4d6933fec9741457bd3394bf8;p=thirdparty%2Fzlib-ng.git

call memset for read after write dependences at distance 1

On a benchmark using zlib to decompress a PNG image this change shows a 20%
speedup. It makes sense to special case distance = 1 of read after write
dependences because it is possible to replace the loop kernel with a memset
which is usually implemented in assembly in the libc, and because of the
frequency at which distance = 1 appears during the PNG decompression:

Distance  Frequency
       1    1009001
       6      64500
       9      29000
       3      25500
     144      14500
      12      10000
      15       3500
       7       2000
      24       1000
      21       1000
      18       1000
      87        500
      22        500
     192        500
---

diff --git a/inffast.c b/inffast.c
index 2e5108010..1d4048def 100644
--- a/inffast.c
+++ b/inffast.c
@@ -244,16 +244,21 @@ void ZLIB_INTERNAL inflate_fast(z_stream *strm, unsigned long start) {
                 }
             } else {
                 from = out - dist;          /* copy direct from output */
-                do {                        /* minimum length is three */
-                    *out++ = *from++;
-                    *out++ = *from++;
-                    *out++ = *from++;
-                    len -= 3;
-                } while (len > 2);
-                if (len) {
-                    *out++ = *from++;
-                    if (len > 1)
+                if (dist == 1) {
+                    memset (out, *from, len);
+                    out += len;
+                } else {
+                    do {                    /* minimum length is three */
+                        *out++ = *from++;
+                        *out++ = *from++;
                         *out++ = *from++;
+                        len -= 3;
+                    } while (len > 2);
+                    if (len) {
+                        *out++ = *from++;
+                        if (len > 1)
+                            *out++ = *from++;
+                    }
                 }
             }
         } else if ((op & 64) == 0) {          /* 2nd level distance code */
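
A note on why the memset substitution is safe: with dist == 1 every byte the
loop reads is the byte it wrote on the previous step, so all len output bytes
end up equal to the single byte at out[-1]. The sketch below illustrates that
equivalence outside of inflate_fast. The helper names copy_match_bytewise and
copy_match_memset are hypothetical and do not exist in zlib-ng; the bytewise
version is a simplified stand-in for the unrolled loop in the patch, not a copy
of it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not zlib-ng code): reference behaviour of an
   overlapping match copy, one byte at a time, reading dist bytes behind
   the write position. */
static void copy_match_bytewise(unsigned char *out, size_t dist, size_t len) {
    const unsigned char *from = out - dist;
    assert(dist >= 1);
    while (len--)
        *out++ = *from++;
}

/* Hypothetical helper (not zlib-ng code): same result, but a distance of 1
   is handled by filling the output with the previous byte via memset. */
static void copy_match_memset(unsigned char *out, size_t dist, size_t len) {
    assert(dist >= 1);
    if (dist == 1) {
        memset(out, out[-1], len);   /* every copied byte equals out[-1] */
    } else {
        copy_match_bytewise(out, dist, len);
    }
}
```

Calling both helpers on identical buffers with dist == 1 should leave the
buffers byte-for-byte equal; for any other distance the memset variant simply
falls back to the bytewise copy.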