From: Yann Collet
Date: Wed, 16 Oct 2024 19:13:57 +0000 (-0700)
Subject: slightly improved compression ratio at levels 3 & 4
X-Git-Tag: v1.5.7^2~73^2~4
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=632677516616434c312d2fff2d84dcf0c9b78012;p=thirdparty%2Fzstd.git

slightly improved compression ratio at levels 3 & 4

The compression ratio benefits are small but consistent, i.e. always positive.
On `silesia.tar` corpus, this modification saves ~75 KB at level 3.
The measured speed cost is negligible, i.e. below noise level, between 0 and -1%.
---

diff --git a/lib/compress/zstd_double_fast.c b/lib/compress/zstd_double_fast.c
index e2b3b4a14..72b541ea6 100644
--- a/lib/compress/zstd_double_fast.c
+++ b/lib/compress/zstd_double_fast.c
@@ -252,19 +252,23 @@ _cleanup:

 _search_next_long:

-        /* check prefix long +1 match */
+        /* short match found: let's check for a longer one */
+        mLength = ZSTD_count(ip+4, matchs0+4, iend) + 4;
+
+        /* check long match at +1 position */
         if (idxl1 > prefixLowestIndex) {
             if (MEM_read64(matchl1) == MEM_read64(ip1)) {
-                ip = ip1;
-                mLength = ZSTD_count(ip+8, matchl1+8, iend) + 8;
-                offset = (U32)(ip-matchl1);
-                while (((ip>anchor) & (matchl1>prefixLowest)) && (ip[-1] == matchl1[-1])) { ip--; matchl1--; mLength++; } /* catch up */
-                goto _match_found;
-            }
+                size_t const llen = ZSTD_count(ip1+8, matchl1+8, iend) + 8;
+                if (llen > mLength) {
+                    ip = ip1;
+                    mLength = llen;
+                    offset = (U32)(ip-matchl1);
+                    while (((ip>anchor) & (matchl1>prefixLowest)) && (ip[-1] == matchl1[-1])) { ip--; matchl1--; mLength++; } /* catch up */
+                    goto _match_found;
+                }
             }
         }

-        /* if no long +1 match, explore the short match we found */
-        mLength = ZSTD_count(ip+4, matchs0+4, iend) + 4;
+        /* validate short match previously found */
         offset = (U32)(ip - matchs0);
         while (((ip>anchor) & (matchs0>prefixLowest)) && (ip[-1] == matchs0[-1])) { ip--; matchs0--; mLength++; } /* catch up */
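
Note (illustration only, not part of the patch): the selection rule changed by this hunk can be sketched in isolation. Before the change, any 8-byte match found at ip+1 was taken unconditionally. After it, the short match at ip is measured first and the +1 candidate is taken only when it is strictly longer. The sketch below is a simplified stand-in under stated assumptions: `count_equal`, `match_t` and `select_match` are hypothetical names, not zstd functions. Candidates are passed in as plain indices instead of coming from hash tables, and the backward "catch up" extension is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not ZSTD_count): number of equal bytes starting at
 * a and b, where b precedes a, stopping when a reaches end. */
static size_t count_equal(const unsigned char* a, const unsigned char* b,
                          const unsigned char* end)
{
    size_t n = 0;
    while (a + n < end && a[n] == b[n]) n++;
    return n;
}

typedef struct {
    size_t pos;     /* index where the selected match starts */
    size_t offset;  /* distance back to the match source */
    size_t length;  /* match length in bytes */
} match_t;

/* Assumed selection rule, mirroring the patched logic: measure the short
 * match at ip first, then switch to the long candidate at ip+1 only if it
 * is at least 8 bytes long and strictly longer than the short match. */
static match_t select_match(const unsigned char* base, size_t size, size_t ip,
                            size_t shortCand, size_t longCand)
{
    const unsigned char* const end = base + size;
    match_t m;

    assert(shortCand < ip && longCand < ip + 1 && ip + 1 <= size);

    m.pos = ip;
    m.offset = ip - shortCand;
    m.length = count_equal(base + ip, base + shortCand, end);

    {   size_t const llen = count_equal(base + ip + 1, base + longCand, end);
        if (llen >= 8 && llen > m.length) {
            m.pos = ip + 1;
            m.offset = (ip + 1) - longCand;
            m.length = llen;
        }
    }
    return m;
}
```

For example, in the buffer "12345678#0123456789abcdef!0123456789abcdef" with ip = 26, a short candidate at index 9 matches 16 bytes while a long candidate at index 0 matches only 8 bytes from ip+1. The pre-patch rule would have taken the 8-byte match; the rule sketched here keeps the 16-byte one.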