git.ipfire.org Git - thirdparty/zstd.git/commit
improve compression ratio of small alphabets
author Yann Collet <cyan@fb.com>
Wed, 21 Dec 2022 22:58:53 +0000 (14:58 -0800)
committer Yann Collet <cyan@fb.com>
Tue, 3 Jan 2023 20:22:37 +0000 (12:22 -0800)
commit 5434de01e21672cdd3ac111a99a969d8c5079297
tree d7f1f7aaee3383e486cb20165f136920061ef479
parent 1c818e3a0a5d492178c26b8964111b5d8d34ed6d

fix #3328

When the alphabet is very small,
the Optimal Parser's initial estimate of literal costs is incorrect.
The estimate takes some time to converge, during which compression is less efficient.
This matters most for small files,
which do not contain enough data for the estimate to converge,
so most parsing decisions are based on incorrect metrics.
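
The convergence problem can be illustrated with a toy model. The sketch below is not taken from zstd: the helper names (`floor_log2`, `init_flat`, `observe`, `literal_cost_bits`), the flat starting prior, and the whole-bit cost approximation are all assumptions made for illustration. It estimates the cost of a literal as log2(total / freq) and shows how slowly that estimate moves toward the true cost when the input only uses two symbols.

```c
#include <assert.h>
#include <stddef.h>

#define ALPHABET_SIZE 256

/* Hypothetical helper: floor(log2(v)) for v > 0. */
static unsigned floor_log2(unsigned v)
{
    unsigned r = 0;
    assert(v > 0);
    while (v >>= 1) r++;
    return r;
}

/* Assumed starting point: every byte value counted once (flat prior). */
static void init_flat(unsigned freq[ALPHABET_SIZE], unsigned *total)
{
    for (size_t i = 0; i < ALPHABET_SIZE; i++) freq[i] = 1;
    *total = ALPHABET_SIZE;
}

/* Fold observed literals into the running statistics. */
static void observe(unsigned freq[ALPHABET_SIZE], unsigned *total,
                    const unsigned char *data, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        freq[data[i]]++;
        (*total)++;
    }
}

/* Whole-bit approximation of log2(total / freq[symbol]). */
static unsigned literal_cost_bits(const unsigned freq[ALPHABET_SIZE],
                                  unsigned total, unsigned char symbol)
{
    assert(freq[symbol] > 0 && total >= freq[symbol]);
    return floor_log2(total) - floor_log2(freq[symbol]);
}
```

In this model, input that alternates between two byte values has a true cost of about 1 bit per literal. The flat prior prices each literal at 8 bits, the estimate is still about 4 bits after 32 bytes, and it only approaches 1 bit after a few kilobytes. A small file ends before the statistics catch up.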

After this patch, the scenario in #3328 is fixed,
delivering the expected compressed size of 29 bytes (the smallest known compressed size).
lib/compress/huf_compress.c
lib/compress/zstd_compress.c
lib/compress/zstd_compress_literals.c
lib/compress/zstd_compress_literals.h
lib/compress/zstd_opt.c