### Version
-0.3.7 (2020-12-09)
+0.3.8 (2023-02-18)
Introduction
repeated `Regenerated_Size` times.
- `Compressed_Literals_Block` - This is a standard Huffman-compressed block,
starting with a Huffman tree description.
+ In this mode, there are at least 2 different literals represented in the Huffman tree description.
See details below.
- `Treeless_Literals_Block` - This is a Huffman-compressed block,
using Huffman tree _from previous Huffman-compressed literals block_.
### `Huffman_Tree_Description`
This section is only present when `Literals_Block_Type` type is `Compressed_Literals_Block` (`2`).
+The tree describes the weights of all literals symbols that can be present in the literals block, at least 2 and up to 256.
The format of the Huffman tree description can be found at [Huffman Tree description](#huffman-tree-description).
The size of `Huffman_Tree_Description` is determined during decoding process,
it must be used to determine where streams begin.
--------------
Zstandard Huffman-coded streams are read backwards,
similar to the FSE bitstreams.
-Therefore, to find the start of the bitstream, it is therefore to
+Therefore, to find the start of the bitstream, it is required to
know the offset of the last byte of the Huffman-coded stream.
After writing the last bit containing information, the compressor
```
Number_of_Bits = Weight ? (Max_Number_of_Bits + 1 - Weight) : 0
```
-The last symbol's `Weight` is deduced from previously decoded ones,
-by completing to the nearest power of 2.
-This power of 2 gives `Max_Number_of_Bits`, the depth of the current tree.
+When a literal value is not present, it receives a `Weight` of 0.
+The least frequent symbol receives a `Weight` of 1.
+Consequently, the `Weight` 1 is necessarily present.
+The most frequent symbol receives a `Weight` anywhere between 1 and 11 (max).
+The last symbol's `Weight` is deduced from previously retrieved Weights,
+by completing to the nearest power of 2. It's necessarily non 0.
+If it's not possible to reach a clean power of 2 with a single `Weight` value,
+the Huffman Tree Description is considered invalid.
+This final power of 2 gives `Max_Number_of_Bits`, the depth of the current tree.
`Max_Number_of_Bits` must be <= 11,
otherwise the representation is considered corrupted.
The tree depth is 4, since its longest elements uses 4 bits
(longest elements are the one with smallest frequency).
-Value `5` will not be listed, as it can be determined from values for 0-4,
+Literal value `5` will not be listed, as it can be determined from previous values 0-4,
nor will values above `5` as they are all 0.
Values from `0` to `4` will be listed using `Weight` instead of `Number_of_Bits`.
Weight formula is :
The sum of `2^(Weight-1)` (excluding 0's) is :
`8 + 4 + 2 + 0 + 1 = 15`.
Nearest larger power of 2 value is 16.
-Therefore, `Max_Number_of_Bits = 4` and `Weight[5] = 16-15 = 1`.
+Therefore, `Max_Number_of_Bits = 4` and `Weight[5] = log_2(16 - 15) + 1 = 1`.
#### Huffman Tree header
Version changes
---------------
+- 0.3.8 : clarifications for Huffman Blocks and Huffman Tree descriptions.
- 0.3.7 : clarifications for Repeat_Offsets, matching RFC8878
- 0.3.6 : clarifications for Dictionary_ID
- 0.3.5 : clarifications for Block_Maximum_Size