Yann Collet [Mon, 30 Jan 2017 21:35:45 +0000 (13:35 -0800)]
zstdmt : section size is set to be a minimum of overlapSize
the minimum size condition size is applied transparently (no warning, no error)
like previous minimum section size condition (1 KB) which still applies.
Yann Collet [Fri, 27 Jan 2017 23:55:30 +0000 (15:55 -0800)]
reduced zstdmt latency when using small custom section sizes with high compression levels
Previous version was requiring a fairly large initial amount of input data
before starting to create compression jobs.
This new version starts the process much sooner.
Nick Terrell [Fri, 27 Jan 2017 23:42:36 +0000 (15:42 -0800)]
Fix segfault in zstreamtest MT
It was reading beyond the end of the input buffer because no errors were
detected. Once that was fixed, it wasn't making forward progress because
no errors were detected and it was waiting for input.
Yann Collet [Fri, 27 Jan 2017 21:30:18 +0000 (13:30 -0800)]
Fixed status display for zstdmt
There is a large buffering effect when using zstdmt in MT mode.
Consequently, data is read first, pushed to workers,
and only later will the compressed result come out.
That means there is no longer immediate correlation
between amount of data read, and amount of data written.
This patch disables the displaying of % compression
when multi-threading is enabled.
It adds the displaying of total size when it can be determined
(it usually can be determined for files, but not for stdin)
so the user has a sense of "how far from the end" the compression compressed is.
There is no modification to decompression side,
since decompression is only single-threaded for now.
Nick Terrell [Fri, 27 Jan 2017 20:05:48 +0000 (12:05 -0800)]
Merge branch 'dev' into buck
* dev:
updated NEWS
fixed MSAN warnings in legacy decoders
Fix cmake build
updated NEWS
Edits as per comments, and change wildcard 'X' to '?'
Fix Visual Studios project
Fix pool.c threading.h import
Fix zstdmt_compress.h include
Fixed commented issues
Updated format specification to be easier to understand
improved #232 fix
Fixed https://github.com/facebook/zstd/issues/232
.travis.yml: different tests for "master" branch
.travis.yml: optimized order of short tests
.travis.yml: test jobs 12-15
JOB_NUMBER -eq 9
improved ZSTD_compressBlock_opt_extDict_generic
Yann Collet [Fri, 27 Jan 2017 18:44:03 +0000 (10:44 -0800)]
fixed MSAN warnings in legacy decoders
In some extraordinary circumstances,
*Length field can be generated from reading a partially uninitialized memory segment.
Data is correctly identified as corrupted later on,
but the read taints some later pointer arithmetic operation.
Yann Collet [Wed, 25 Jan 2017 20:31:07 +0000 (12:31 -0800)]
fixed zstdmt cli freeze issue with large nb of threads
fileio.c was continually pushing more content without giving a chance to flush compressed one.
It would block the job queue when input data was accumulated too fast (requiring to define many threads).
Fixed : fileio flushes whatever it can after each input attempt.
Yann Collet [Wed, 25 Jan 2017 01:02:26 +0000 (17:02 -0800)]
zstdmt cli and API allow selection of section sizes
By default, section sizes are 4x window size.
This new setting allow manual selection of section sizes.
The larger they are, the (slightly) better the compression ratio,
but also the higher the memory allocation cost,
and eventually the lesser the nb of possible threads,
since each section is compressed by a single thread.
It also introduces a prototype to set generic parameters,
ZSTDMT_setMTCtxParameter()
The idea is that it's possible to add enums
to extend the list of parameters that can be set this way.
This is more long-term oriented than a fixed-size struct.
Consider it as a test.
Yann Collet [Mon, 23 Jan 2017 07:49:52 +0000 (23:49 -0800)]
ZSTDMT_compressStream() becomes blocking when required to ensure forward progresses
In some (rare) cases, job list could be blocked by a first job still being processed,
while all following ones are completed, waiting to be flushed.
In such case, the current job-table implementation is unable to accept new job.
As a consequence, a call to ZSTDMT_compressStream() can be useless (nothing read, nothing flushed),
with the risk to trigger a busy-wait on the caller side
(needlessly loop over ZSTDMT_compressStream() ).
In such a case, ZSTDMT_compressStream() will block until the first job is completed and ready to flush.
It ensures some forward progress by guaranteeing it will flush at least a part of the completed job.
Energy-wasting busy-wait is avoided.
Yann Collet [Mon, 23 Jan 2017 00:40:06 +0000 (16:40 -0800)]
ZSTDMT_initCStream_usingDict() can outlive dict
Like ZSTD_initCStream_usingDict(),
ZSTDMT_initCStream_usingDict() now keep a copy of dict internally.
This way, dict can be released :
it does not longer have to outlive all future compression sessions.
Yann Collet [Thu, 19 Jan 2017 23:32:07 +0000 (15:32 -0800)]
Added ZSTDMT_initCStream_advanced() variant
Correctly compress with custom params and dictionary
Added relevant fuzzer test in zstreamtest
Also :
new macro ZSTDMT_SECTION_LOGSIZE_MIN, which sets a minimum size for a full job
(note : a flush() command can still generate a partial job anytime)