updated doc

author Yann Collet <yann.collet.73@gmail.com>

Fri, 8 Jul 2016 08:42:59 +0000 (10:42 +0200)

committer Yann Collet <yann.collet.73@gmail.com>

Fri, 8 Jul 2016 09:45:08 +0000 (11:45 +0200)
diff --git a/lib/README.md b/lib/README.md

index 45e8e6fdcf707a5ada4d3f9efaf3702f607eb857..9357065061dc64612aa4e68d1e61e812bd9f38c3 100644 (file)
--- a/lib/README.md
+++ b/lib/README.md
@@ -45,14 +45,19 @@ It is used by `zstd` command line utility, and [7zip plugin](http://mcmilk.de/pr
  - compress/zbuff_compress.c
  - decompress/zbuff_decompress.c
  
+
  #### Dictionary builder
  
-To create dictionaries from training sets :
+To create dictionaries from training sets,
+include all files from the [dictBuilder directory](dictBuilder/).
+
+
+#### Legacy support
+
+Zstandard can decode previous formats, starting from v0.1.
+Support for these formats is provided in [folder legacy](legacy/).
+It's also required to compile the library with `ZSTD_LEGACY_SUPPORT = 1`.
  
-- dictBuilder/divsufsort.c
-- dictBuilder/divsufsort.h
-- dictBuilder/zdict.c
-- dictBuilder/zdict.h
  
  #### Miscellaneous
  
diff --git a/zstd_compression_format.md b/zstd_compression_format.md

index 2fbe3fa4c018312c4b83f207a66a73439208349b..75cf4a83bc0cc846ecd188aebcc49ff8a28bba47 100644 (file)
--- a/zstd_compression_format.md
+++ b/zstd_compression_format.md
@@ -565,37 +565,46 @@ which tells how to decode the list of weights.
  | Nb of 1s | 1 | 2 | 3 | 4 | 7 | 8 | 15| 16| 31| 32| 63| 64|127|128|
  |Complement| 1 | 2 | 1 | 4 | 1 | 8 | 1 | 16| 1 | 32| 1 | 64| 1 |128|
  
-_Note_ : complement is by using the "join to nearest power of 2" rule.
+_Note_ : complement is found by using "join to nearest power of 2" rule.
  
  - if headerByte >= 128 : this is a direct representation,
    where each weight is written directly as a 4 bits field (0-15).
    The full representation occupies `((nbSymbols+1)/2)` bytes,
    meaning it uses a last full byte even if nbSymbols is odd.
-  `nbSymbols = headerByte - 127;`
+  `nbSymbols = headerByte - 127;`.
+  Note that maximum nbSymbols is 241-127 = 114.
+  A larger serie must necessarily use FSE compression.
  
  - if headerByte < 128 :
    the serie of weights is compressed by FSE.
-  The length of the compressed serie is `headerByte` (0-127).
+  The length of the FSE-compressed serie is `headerByte` (0-127).
  
  ##### FSE (Finite State Entropy) compression of huffman weights
  
-The serie of weights is compressed using standard FSE compression.
+The serie of weights is compressed using FSE compression.
  It's a single bitstream with 2 interleaved states,
-using a single distribution table.
+sharing a single distribution table.
  
  To decode an FSE bitstream, it is necessary to know its compressed size.
  Compressed size is provided by `headerByte`.
-It's also necessary to know its maximum decompressed size.
-In this case, it's `255`, since literal values range from `0` to `255`,
+It's also necessary to know its maximum decompressed size,
+which is `255`, since literal values span from `0` to `255`,
  and last symbol value is not represented.
  
  An FSE bitstream starts by a header, describing probabilities distribution.
  It will create a Decoding Table.
-It is necessary to know the maximum accuracy of distribution
-to properly allocate space for the Table.
-For a list of huffman weights, this maximum is 7 bits.
+The table must be pre-allocated, which requires knowing the maximum supported accuracy.
+For a list of huffman weights, recommended maximum is 7 bits.
+
+FSE header is [described in relevant chapter](#fse-distribution-table--condensed-format),
+and so is [FSE bitstream](#bitstream).
+The main difference is that Huffman header compression uses 2 states,
+which share the same FSE distribution table.
+Bitstream contains only FSE symbols, there are no interleaved "raw bitfields".
+The number of symbols to decode is discovered
+by tracking bitStream overflow condition.
+When both states have overflowed the bitstream, end is reached.
  
-FSE header and bitstreams are described in a separated chapter.
  
  ##### Conversion from weights to huffman prefix codes
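
The direct representation described in the second hunk (`headerByte >= 128`, `nbSymbols = headerByte - 127`, one 4-bit field per weight) can be illustrated with a minimal C sketch. It is not taken from the zstd sources: `read_direct_weights` is a hypothetical helper, and the nibble order (first weight of each byte in the high 4 bits) is an assumption that the text above does not state. Header values outside 128..241 are rejected, since this sketch covers only the direct mode.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the zstd API.
 * Reads Huffman weights stored in "direct representation":
 *   - src[0] is headerByte, expected in [128, 241]
 *   - nbSymbols = headerByte - 127 (so 1..114)
 *   - weights follow as 4-bit fields, occupying (nbSymbols+1)/2 bytes
 * ASSUMPTION: within each byte, the high nibble holds the earlier weight.
 * Returns the number of weights written, or -1 on error. */
static int read_direct_weights(const uint8_t *src, size_t srcSize,
                               uint8_t *weights, size_t weightsCapacity)
{
    if (srcSize < 1) return -1;

    unsigned headerByte = src[0];
    if (headerByte < 128 || headerByte > 241) return -1; /* other modes not handled */

    size_t nbSymbols = (size_t)headerByte - 127;
    size_t nbBytes   = (nbSymbols + 1) / 2;  /* last byte is full even if nbSymbols is odd */

    if (srcSize < 1 + nbBytes) return -1;        /* input too short */
    if (weightsCapacity < nbSymbols) return -1;  /* output too small */

    for (size_t n = 0; n < nbSymbols; n++) {
        uint8_t b = src[1 + n / 2];
        weights[n] = (n % 2 == 0) ? (uint8_t)(b >> 4) : (uint8_t)(b & 0x0F);
    }
    return (int)nbSymbols;
}
```

For example, the input bytes `{130, 0x12, 0x30}` describe 3 weights in 2 payload bytes; under the nibble-order assumption above they decode to weights 1, 2, 3, and the low nibble of the last byte is padding.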