From: W. Felix Handte
Date: Wed, 17 Jul 2019 21:30:09 +0000 (-0400)
Subject: [doc] Remove Limitation that Compressed Block is Smaller than Uncompressed Content
X-Git-Tag: v1.4.1^2~2
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=c05b270edc293cfd337176f9efdf9d16904229b8;p=thirdparty%2Fzstd.git

[doc] Remove Limitation that Compressed Block is Smaller than Uncompressed Content

This changes the size limit on compressed blocks to match those of the
other block types: they may not be larger than the
`Block_Maximum_Decompressed_Size`, which is the smaller of the
`Window_Size` and 128 KB, removing the additional restriction that had
been placed on `Compressed_Block`s, that they be smaller than the
decompressed content they represent.

Several things motivate removing this restriction. On the one hand, this
restriction is not useful for decoders: the decoder must nonetheless be
prepared to accept compressed blocks that are the full
`Block_Maximum_Decompressed_Size`. And on the other, this bound is
actually artificially limiting. If block representations were entirely
independent, a compressed representation of a block that is larger than
the contents of the block would be ipso facto useless, and it would be
strictly better to send it as a `Raw_Block`. However, blocks are not
entirely independent, and it can make sense to pay the cost of encoding
custom entropy tables in a block, even if that pushes that block size
over the size of the data it represents, because those tables can be
re-used by subsequent blocks.

Finally, as far as I can tell, this restriction in the spec is not
currently enforced in any Zstandard implementation, nor has it ever
been. This change should therefore be safe to make.
---

diff --git a/doc/zstd_compression_format.md b/doc/zstd_compression_format.md
index ed758cf59..ad5a61ec7 100644
--- a/doc/zstd_compression_format.md
+++ b/doc/zstd_compression_format.md
@@ -390,9 +390,7 @@ A block can contain any number of bytes (even zero), up to
 - Window_Size
 - 128 KB
-A `Compressed_Block` has the extra restriction that `Block_Size` is always
-strictly less than the decompressed size.
-If this condition cannot be respected,
+If this condition cannot be respected when generating a `Compressed_Block`,
 the block must be sent uncompressed instead (`Raw_Block`).
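
The bound described in the commit message can be sketched in C as follows. This is only an illustration under stated assumptions, not code from any Zstandard implementation: the names `BLOCK_SIZE_CAP`, `block_maximum_decompressed_size`, and `compressed_block_size_is_valid` are hypothetical. It shows that, after this change, the `Block_Size` of a `Compressed_Block` is checked only against `Block_Maximum_Decompressed_Size`, and no longer against the size of the content it decodes to.

```c
#include <stddef.h>

/* The 128 KB cap named in the format description. */
#define BLOCK_SIZE_CAP ((size_t)128 * 1024)

/* Hypothetical helper: Block_Maximum_Decompressed_Size is the smaller
 * of Window_Size and 128 KB. */
static size_t block_maximum_decompressed_size(size_t window_size)
{
    return window_size < BLOCK_SIZE_CAP ? window_size : BLOCK_SIZE_CAP;
}

/* Hypothetical check for a Compressed_Block after this change: the
 * block may be as large as Block_Maximum_Decompressed_Size. Its size
 * is deliberately NOT compared with the decompressed size, so a block
 * that carries reusable entropy tables may exceed the data it encodes. */
static int compressed_block_size_is_valid(size_t block_size,
                                          size_t window_size)
{
    return block_size <= block_maximum_decompressed_size(window_size);
}
```

With a `Window_Size` of 8 MB, for example, a compressed block of exactly 128 KB passes this check even if it decodes to fewer bytes than it occupies.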