Update documentation for the --train-cover split parameter

author Jennifer Liu <jenniferliu620@fb.com>

Mon, 2 Jul 2018 02:59:37 +0000 (19:59 -0700)

committer Jennifer Liu <jenniferliu620@fb.com>

Mon, 2 Jul 2018 02:59:37 +0000 (19:59 -0700)
diff --git a/programs/README.md b/programs/README.md
index a308fccf9ea3b8232becbc25958cb30cc3a22969..2833875e5dce19c70333ade3fceb56d2dd285d66 100644
--- a/programs/README.md
+++ b/programs/README.md
@@ -150,7 +150,7 @@ Advanced arguments :
  
  Dictionary builder :
  --train ## : create a dictionary from a training set of files
---train-cover[=k=#,d=#,steps=#] : use the cover algorithm with optional args
+--train-cover[=k=#,d=#,steps=#,split=#] : use the cover algorithm with optional args
  --train-legacy[=s=#] : use the legacy algorithm with selectivity (default: 9)
   -o file : `file` is dictionary name (default: dictionary)
  --maxdict=# : limit dictionary to specified size (default: 112640)
diff --git a/programs/dibio.c b/programs/dibio.c
index 112259ddcd054364df7172e7413808d9974c8979..5d1f6d6c46162188d4fd2fd529271d7b2d6f1213 100644
--- a/programs/dibio.c
+++ b/programs/dibio.c
@@ -323,7 +323,8 @@ int DiB_trainFromFiles(const char* dictFileName, unsigned maxDictSize,
                                                             srcBuffer, sampleSizes, fs.nbSamples,
                                                             coverParams);
              if (!ZDICT_isError(dictSize)) {
-                DISPLAYLEVEL(2, "k=%u\nd=%u\nsteps=%u\n", coverParams->k, coverParams->d, coverParams->steps);
+                unsigned splitPercentage = (unsigned)(coverParams->splitPoint * 100);
+                DISPLAYLEVEL(2, "k=%u\nd=%u\nsteps=%u\nsplit=%u\n", coverParams->k, coverParams->d, coverParams->steps, splitPercentage);
              }
          } else {
              dictSize = ZDICT_trainFromBuffer_cover(dictBuffer, maxDictSize, srcBuffer,
diff --git a/programs/zstd.1 b/programs/zstd.1
index 507933c97a994f8a15e6f2cdafbad09991b3f8e7..e1ebd297e7c698d152d6175f2480f64f7c2b4527 100644
--- a/programs/zstd.1
+++ b/programs/zstd.1
@@ -217,8 +217,8 @@ Split input files in blocks of size # (default: no split)
  A dictionary ID is a locally unique ID that a decoder can use to verify it is using the right dictionary\. By default, zstd will create a 4\-bytes random number ID\. It\'s possible to give a precise number instead\. Short numbers have an advantage : an ID < 256 will only need 1 byte in the compressed frame header, and an ID < 65536 will only need 2 bytes\. This compares favorably to 4 bytes default\. However, it\'s up to the dictionary manager to not assign twice the same ID to 2 different dictionaries\.
  .
  .TP
-\fB\-\-train\-cover[=k#,d=#,steps=#]\fR
-Select parameters for the default dictionary builder algorithm named cover\. If \fId\fR is not specified, then it tries \fId\fR = 6 and \fId\fR = 8\. If \fIk\fR is not specified, then it tries \fIsteps\fR values in the range [50, 2000]\. If \fIsteps\fR is not specified, then the default value of 40 is used\. Requires that \fId\fR <= \fIk\fR\.
+\fB\-\-train\-cover[=k#,d=#,steps=#,split=#]\fR
+Select parameters for the default dictionary builder algorithm named cover\. If \fId\fR is not specified, then it tries \fId\fR = 6 and \fId\fR = 8\. If \fIk\fR is not specified, then it tries \fIsteps\fR values in the range [50, 2000]\. If \fIsteps\fR is not specified, then the default value of 40 is used\. If \fIsplit\fR is not specified, then the default value of 80 is used\. Requires that \fId\fR <= \fIk\fR\.
  .
  .IP
  Selects segments of size \fIk\fR with highest score to put in the dictionary\. The score of a segment is computed by the sum of the frequencies of all the subsegments of size \fId\fR\. Generally \fId\fR should be in the range [6, 8], occasionally up to 16, but the algorithm will run faster with d <= \fI8\fR\. Good values for \fIk\fR vary widely based on the input data, but a safe range is [2 * \fId\fR, 2000]\. Supports multithreading if \fBzstd\fR is compiled with threading support\.
diff --git a/programs/zstdcli.c b/programs/zstdcli.c
index 74dc607a331aabfdbc1885aab20f70b6c97cd20f..28bed2309e2d674bf5e1d772e2ce3a4c4cb40b3f 100644
--- a/programs/zstdcli.c
+++ b/programs/zstdcli.c
@@ -170,7 +170,7 @@ static int usage_advanced(const char* programName)
      DISPLAY( "\n");
      DISPLAY( "Dictionary builder : \n");
      DISPLAY( "--train ## : create a dictionary from a training set of files \n");
-    DISPLAY( "--train-cover[=k=#,d=#,steps=#] : use the cover algorithm with optional args\n");
+    DISPLAY( "--train-cover[=k=#,d=#,steps=#,split=#] : use the cover algorithm with optional args\n");
      DISPLAY( "--train-legacy[=s=#] : use the legacy algorithm with selectivity (default: %u)\n", g_defaultSelectivityLevel);
      DISPLAY( " -o file : `file` is dictionary name (default: %s) \n", g_defaultDictName);
      DISPLAY( "--maxdict=# : limit dictionary to specified size (default: %u) \n", g_defaultMaxDictSize);
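
Note on the `split` value: the dibio.c hunk prints `split` as a whole percentage derived from the fractional `splitPoint` field. The sketch below is an illustration only, not code from this commit. The struct `example_cover_params` and the helpers `split_percentage`, `training_sample_count`, and `d_fits_in_k` are hypothetical names. The idea that `splitPoint` is the fraction of samples used for training, and that it lies in (0, 1], is an assumption that the diff itself does not state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the cover parameters printed in dibio.c.
 * Only the four fields that appear in the DISPLAYLEVEL call are mirrored. */
typedef struct {
    unsigned k;
    unsigned d;
    unsigned steps;
    double   splitPoint;  /* ASSUMPTION: fraction in (0, 1] */
} example_cover_params;

/* Same arithmetic as the patched line: fraction -> whole percent.
 * The cast truncates toward zero, so fractional percents are dropped. */
static unsigned split_percentage(const example_cover_params* p)
{
    return (unsigned)(p->splitPoint * 100);
}

/* ASSUMPTION: "split" selects how many of nbSamples are used for training.
 * This reading is not stated in the diff. */
static size_t training_sample_count(const example_cover_params* p,
                                    size_t nbSamples)
{
    assert(p->splitPoint > 0.0 && p->splitPoint <= 1.0);
    return (size_t)((double)nbSamples * p->splitPoint);
}

/* The man page text requires d <= k; returns 1 when that holds. */
static int d_fits_in_k(const example_cover_params* p)
{
    return p->d <= p->k;
}
```

With `splitPoint` set to 0.80, `split_percentage` yields 80, which matches the default the man page hunk documents for `split`.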