From: Yann Collet Date: Mon, 28 Oct 2019 18:15:41 +0000 (-0700) Subject: updated API inline doc and manual X-Git-Tag: v1.4.4~1^2~9 X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=74065da4c5ea14a9755982e5ca0a72bf2d4f10d5;p=thirdparty%2Fzstd.git updated API inline doc and manual regarding ZSTD_CDict created without a dictBuffer. --- diff --git a/doc/zstd_manual.html b/doc/zstd_manual.html index 0021eec28..43c5555b8 100644 --- a/doc/zstd_manual.html +++ b/doc/zstd_manual.html @@ -692,12 +692,17 @@ size_t ZSTD_freeDStream(ZSTD_DStream* zds);
ZSTD_CDict* ZSTD_createCDict(const void* dictBuffer, size_t dictSize,
                              int compressionLevel);
-

When compressing multiple messages / blocks using the same dictionary, it's recommended to load it only once. - ZSTD_createCDict() will create a digested dictionary, ready to start future compression operations without startup cost. +

When compressing multiple messages or blocks using the same dictionary, + it's recommended to digest the dictionary only once, since it's a costly operation. + ZSTD_createCDict() will create a state from digesting a dictionary. + The resulting state can be used for future compression operations with very limited startup cost. ZSTD_CDict can be created once and shared by multiple threads concurrently, since its usage is read-only. - `dictBuffer` can be released after ZSTD_CDict creation, because its content is copied within CDict. - Consider experimental function `ZSTD_createCDict_byReference()` if you prefer to not duplicate `dictBuffer` content. - Note : A ZSTD_CDict can be created from an empty dictBuffer, but it is inefficient when used to compress small data. + @dictBuffer can be released after ZSTD_CDict creation, because its content is copied within CDict. + Note 1 : Consider experimental function `ZSTD_createCDict_byReference()` if you prefer to not duplicate @dictBuffer content. + Note 2 : A ZSTD_CDict can be created from an empty @dictBuffer, + in which case the only thing that it transports is the @compressionLevel. + This can be useful in a pipeline featuring ZSTD_compress_usingCDict() exclusively, + expecting a ZSTD_CDict parameter with any data, including those without a known dictionary.


size_t      ZSTD_freeCDict(ZSTD_CDict* CDict);
@@ -947,7 +952,7 @@ size_t ZSTD_sizeof_DDict(const ZSTD_DDict* ddict);
      * to evolve and should be considered only in the context of extremely
      * advanced performance tuning.
      *
-     * Zstd currently supports the use of a CDict in two ways:
+     * Zstd currently supports the use of a CDict in three ways:
      *
      * - The contents of the CDict can be copied into the working context. This
      *   means that the compression can search both the dictionary and input
@@ -963,6 +968,12 @@ size_t ZSTD_sizeof_DDict(const ZSTD_DDict* ddict);
      *   working context's tables can be reused). For small inputs, this can be
      *   faster than copying the CDict's tables.
      *
+     * - The CDict's tables are not used at all, and instead we use the working
+     *   context alone to reload the dictionary and use params based on the source
+     *   size. See ZSTD_compress_insertDictionary() and ZSTD_compress_usingDict().
+     *   This method is effective when the dictionary sizes are very small relative
+     *   to the input size, and the input size is fairly large to begin with.
+     *
      * Zstd has a simple internal heuristic that selects which strategy to use
      * at the beginning of a compression. However, if experimentation shows that
      * Zstd is making poor choices, it is possible to override that choice with
@@ -970,7 +981,8 @@ size_t ZSTD_sizeof_DDict(const ZSTD_DDict* ddict);
      */
     ZSTD_dictDefaultAttach = 0, /* Use the default heuristic. */
     ZSTD_dictForceAttach   = 1, /* Never copy the dictionary. */
-    ZSTD_dictForceCopy     = 2  /* Always copy the dictionary. */
+    ZSTD_dictForceCopy     = 2, /* Always copy the dictionary. */
+    ZSTD_dictForceLoad     = 3  /* Always reload the dictionary */
 } ZSTD_dictAttachPref_e;
 

typedef enum {
diff --git a/lib/zstd.h b/lib/zstd.h
index 24d23ff81..72080ea87 100644
--- a/lib/zstd.h
+++ b/lib/zstd.h
@@ -808,12 +808,17 @@ ZSTDLIB_API size_t ZSTD_decompress_usingDict(ZSTD_DCtx* dctx,
 typedef struct ZSTD_CDict_s ZSTD_CDict;
 
 /*! ZSTD_createCDict() :
- *  When compressing multiple messages / blocks using the same dictionary, it's recommended to load it only once.
- *  ZSTD_createCDict() will create a digested dictionary, ready to start future compression operations without startup cost.
+ *  When compressing multiple messages or blocks using the same dictionary,
+ *  it's recommended to digest the dictionary only once, since it's a costly operation.
+ *  ZSTD_createCDict() will create a state from digesting a dictionary.
+ *  The resulting state can be used for future compression operations with very limited startup cost.
  *  ZSTD_CDict can be created once and shared by multiple threads concurrently, since its usage is read-only.
- * `dictBuffer` can be released after ZSTD_CDict creation, because its content is copied within CDict.
- *  Consider experimental function `ZSTD_createCDict_byReference()` if you prefer to not duplicate `dictBuffer` content.
- *  Note : A ZSTD_CDict can be created from an empty dictBuffer, but it is inefficient when used to compress small data. */
+ * @dictBuffer can be released after ZSTD_CDict creation, because its content is copied within CDict.
+ *  Note 1 : Consider experimental function `ZSTD_createCDict_byReference()` if you prefer to not duplicate @dictBuffer content.
+ *  Note 2 : A ZSTD_CDict can be created from an empty @dictBuffer,
+ *      in which case the only thing that it transports is the @compressionLevel.
+ *      This can be useful in a pipeline featuring ZSTD_compress_usingCDict() exclusively,
+ *      expecting a ZSTD_CDict parameter with any data, including those without a known dictionary. */
 ZSTDLIB_API ZSTD_CDict* ZSTD_createCDict(const void* dictBuffer, size_t dictSize,
                                          int compressionLevel);
 
@@ -1167,7 +1172,7 @@ typedef enum {
      *   tables. However, this model incurs no start-up cost (as long as the
      *   working context's tables can be reused). For small inputs, this can be
      *   faster than copying the CDict's tables.
-     * 
+     *
      * - The CDict's tables are not used at all, and instead we use the working
      *   context alone to reload the dictionary and use params based on the source
      *   size. See ZSTD_compress_insertDictionary() and ZSTD_compress_usingDict().