git.ipfire.org Git - thirdparty/zstd.git/commitdiff
Fix ZSTD_execSequence() performance regression 431/head
author Nick Terrell <terrelln@fb.com>
Thu, 27 Oct 2016 23:19:54 +0000 (16:19 -0700)
committer Nick Terrell <terrelln@fb.com>
Thu, 27 Oct 2016 23:19:57 +0000 (16:19 -0700)
Commit ae1cb3b3d07024618269b89e3421d828adfd34d9 caused the regression.
It is an instruction alignment issue: if the loop index is declared as
`U64 i` instead of `U32 i`, the regression returns.  This patch fixes the
regression in gcc, but recovers only part of the clang performance.
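
For reference, the two loop shapes involved can be compared in isolation.
This is only a sketch: the helper names `copy_ptr`, `copy_idx` and
`copies_agree` are hypothetical, and the typedefs stand in for zstd's own
`BYTE` and `U32`.  It shows that both forms copy the same bytes, including
when the match overlaps the output, so the change should only affect the
code the compiler generates, not the result.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for zstd's own typedefs (assumption for this sketch). */
typedef uint8_t  BYTE;
typedef uint32_t U32;

/* Pointer-bumping form, as in the removed line of the patch. */
static void copy_ptr(BYTE* op, const BYTE* match, BYTE* const oMatchEnd)
{
    while (op < oMatchEnd) *op++ = *match++;
}

/* Indexed form with a 32-bit counter, as in the added lines. */
static void copy_idx(BYTE* op, const BYTE* match, size_t matchLength)
{
    U32 i;
    for (i = 0; i < matchLength; ++i) op[i] = match[i];
}

/* Runs both loops on an overlapping copy (offset 2, length 6) and
 * returns 1 if the two outputs are byte-for-byte identical. */
static int copies_agree(void)
{
    BYTE a[8] = { 'a', 'b', 0, 0, 0, 0, 0, 0 };
    BYTE b[8] = { 'a', 'b', 0, 0, 0, 0, 0, 0 };
    copy_ptr(a + 2, a, a + 8);
    copy_idx(b + 2, b, 6);
    return memcmp(a, b, sizeof a) == 0 && memcmp(a, "abababab", 8) == 0;
}
```

Both loops copy forward one byte at a time, so a match that overlaps the
destination repeats the pattern in the same way in either form.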

Benchmarks:
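Run on `silesia.tar`.  Only levels 1-5 are shown because the performance
regression was uniform across all levels.  I did one run on levels
1-19 and it looked good.  "Before" is before the regressing commit,
"While" is with the regression present, and "After" is with this patch.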

| Build | Level | Before | While | After |
|-------|-------|-------:|------:|------:|
| gcc   |     1 |  931.4 | 904.4 | 932.8 |
| gcc   |     2 |  849.1 | 822.6 | 851.2 |
| gcc   |     3 |  815.6 | 790.6 | 818.9 |
| gcc   |     4 |  794.1 | 770.7 | 798.0 |
| gcc   |     5 |  785.7 | 760.7 | 788.8 |
| clang |     1 |  705.5 | 683.2 | 693.8 |
| clang |     2 |  670.0 | 649.2 | 660.7 |
| clang |     3 |  659.6 | 639.8 | 651.4 |
| clang |     4 |  652.5 | 634.7 | 645.9 |
| clang |     5 |  646.9 | 625.5 | 637.7 |
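
One way to read the table is as the fraction of the lost speed that the
patch restores, `(After - While) / (Before - While)`.  The helper below is a
hypothetical sketch of that arithmetic (the name `recovered` is not from
zstd).  For level 1 it gives a ratio slightly above 1 for gcc (fully
recovered) and roughly 0.48 for clang (about half recovered).

```c
/* Fraction of the regression undone by the patch.
 * 1.0 means fully recovered, 0.0 means no improvement.
 * Hypothetical helper, for illustration only. */
static double recovered(double before, double during, double after)
{
    return (after - during) / (before - during);
}

/* Level 1 rows from the table above. */
static double gcc_level1(void)   { return recovered(931.4, 904.4, 932.8); }
static double clang_level1(void) { return recovered(705.5, 683.2, 693.8); }
```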

lib/decompress/zstd_decompress.c

index f3ff4ebff413a61cb861380802bc184d37579bad..ed68d888bfe94aa3e6657c0ce1bc1b375e83c022 100644
@@ -887,7 +887,8 @@ size_t ZSTD_execSequence(BYTE* op,
             sequence.matchLength -= length1;
             match = base;
             if (op > oend_w) {
-              while (op < oMatchEnd) *op++ = *match++;
+              U32 i;
+              for (i = 0; i < sequence.matchLength; ++i) op[i] = match[i];
               return sequenceLength;
             }
     }   }