git.ipfire.org Git - thirdparty/zstd.git/commit
[LDM] Speed optimization on repetitive data 2602/head
author Nick Terrell <terrelln@fb.com>
Mon, 3 May 2021 21:32:15 +0000 (14:32 -0700)
committer Nick Terrell <terrelln@fb.com>
Tue, 4 May 2021 17:57:42 +0000 (10:57 -0700)
commit 32823bc150fe27b89d5c08e8b9483e98f2c4cb59
tree e53f951c9d4766e97b3b50f9cb484c1f8d12906a
parent 0e2345b8594feccb9e63b8e400a94ef5bacb5a72
[LDM] Speed optimization on repetitive data

LDM does especially poorly on repetitive data when that data's hash happens
to have `(hash & stopMask) == 0`, either because `stopMask == 0` or by
random chance. Optimize this case by skipping over repetitive patterns.
The detection is very simplistic, but should catch most of the offending
cases.
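
To illustrate the idea only, here is a minimal sketch of skipping over a
repetitive run once a candidate position is hit. It is not the code in
`lib/compress/zstd_ldm.c`: `toy_hash`, `repeated_run_length`,
`count_candidates`, and the `window`, `stop_mask`, and `skip_repeats`
parameters are hypothetical names, and the hash is a toy polynomial hash
rather than the one LDM uses. It assumes the simplest possible detection,
a run of one repeated byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of positions n, starting at pos, for which
 * data[pos + n] == data[pos + n + period], i.e. how far the pattern with
 * the given period keeps repeating. */
static size_t repeated_run_length(const uint8_t *data, size_t size,
                                  size_t pos, size_t period)
{
    size_t n = 0;
    assert(period > 0);
    while (pos + period + n < size && data[pos + n] == data[pos + period + n])
        n++;
    return n;
}

/* Toy polynomial hash over one window (an assumption, not LDM's hash). */
static uint32_t toy_hash(const uint8_t *p, size_t len)
{
    uint32_t h = 0;
    for (size_t i = 0; i < len; i++)
        h = h * 31u + p[i];
    return h;
}

/* Counts the positions that satisfy (hash & stop_mask) == 0 and would
 * therefore trigger the expensive per-candidate work. With skip_repeats
 * set, a candidate inside a run of one repeated byte jumps past every
 * window that is byte-for-byte identical to the current one. */
static size_t count_candidates(const uint8_t *data, size_t size,
                               size_t window, uint32_t stop_mask,
                               int skip_repeats)
{
    size_t visited = 0;
    size_t pos = 0;
    while (pos + window <= size) {
        if ((toy_hash(data + pos, window) & stop_mask) == 0) {
            visited++;
            if (skip_repeats) {
                size_t run = repeated_run_length(data, size, pos, 1);
                if (run >= window) {
                    /* Bytes pos..pos+run are all equal, so the first
                     * window with different content starts here. */
                    pos += run - window + 2;
                    continue;
                }
            }
        }
        pos++;
    }
    return visited;
}
```

With `stop_mask == 0` on an all-zero buffer, every position is a candidate
without skipping, while the skipping variant handles the whole run as a
single candidate. On data with no repeated adjacent bytes the two variants
visit the same positions.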

```
head -c 1G /dev/zero | perf stat -- ./zstd -1 -o /dev/null -v --zstd=ldmHashRateLog=1 --long
      21.187881087 seconds time elapsed

head -c 1G /dev/zero | perf stat -- ./zstd -1 -o /dev/null -v --zstd=ldmHashRateLog=1 --long
       1.149707921 seconds time elapsed

```
lib/compress/zstd_ldm.c