git.ipfire.org Git - thirdparty/zstd.git/commit

author	Nick Terrell <terrelln@fb.com>
	Mon, 3 May 2021 21:32:15 +0000 (14:32 -0700)
committer	Nick Terrell <terrelln@fb.com>
	Tue, 4 May 2021 17:57:42 +0000 (10:57 -0700)
commit	32823bc150fe27b89d5c08e8b9483e98f2c4cb59
tree	e53f951c9d4766e97b3b50f9cb484c1f8d12906a
parent	0e2345b8594feccb9e63b8e400a94ef5bacb5a72

[LDM] Speed optimization on repetitive data

LDM does especially poorly on repetitive data when that data's hash happens
to have `(hash & stopMask) == 0`, either because `stopMask == 0` or by
random chance. Optimize this case by skipping over repetitive patterns.
The detection is very simplistic, but should catch most of the offending
cases.
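
To illustrate why skipping helps, here is a minimal sketch in C. It is not
zstd's LDM code: `toy_roll`, `count_candidates`, and `skipRuns` are
hypothetical names, the rolling hash is a toy stand-in, and the run detection
(identical adjacent bytes) is an assumption chosen for brevity. Each counted
candidate stands in for the hash-table work done when `(hash & stopMask) == 0`.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy rolling hash, loosely gear-like. NOT zstd's actual hash function. */
static uint64_t toy_roll(uint64_t hash, uint8_t byte)
{
    return (hash << 1) + (uint64_t)byte + 1;
}

/* Counts positions treated as candidates, i.e. where (hash & stopMask) == 0.
 * Each candidate stands in for an expensive hash-table lookup and insert.
 * If skipRuns is nonzero, then after a candidate we step over the rest of a
 * run of identical bytes without treating those positions as candidates. */
static size_t count_candidates(const uint8_t *data, size_t size,
                               uint64_t stopMask, int skipRuns)
{
    uint64_t hash = 0;
    size_t candidates = 0;
    size_t i = 0;

    while (i < size) {
        hash = toy_roll(hash, data[i]);
        if ((hash & stopMask) == 0) {
            candidates++;
            if (skipRuns) {
                /* Keep hashing so the state stays consistent,
                 * but do no candidate work inside the run. */
                while (i + 1 < size && data[i + 1] == data[i]) {
                    i++;
                    hash = toy_roll(hash, data[i]);
                }
            }
        }
        i++;
    }
    return candidates;
}
```

With `stopMask == 0` and an all-zero buffer, every position is a candidate
unless runs are skipped, in which case only the first position of the run is.
On data without repeated adjacent bytes, both modes do the same work.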

```
# Before this change
head -c 1G /dev/zero | perf stat -- ./zstd -1 -o /dev/null -v --zstd=ldmHashRateLog=1 --long
21.187881087 seconds time elapsed

# After this change
head -c 1G /dev/zero | perf stat -- ./zstd -1 -o /dev/null -v --zstd=ldmHashRateLog=1 --long
1.149707921 seconds time elapsed
```