Sean Purcell [Mon, 30 Jan 2017 22:42:21 +0000 (14:42 -0800)]
Added ZSTD_get_decompressed_size
Since this implementation handles multiple concatenated frames,
to determine decompressed size we must traverse the entire input,
checking each frame's frame_content_size field
Nick Terrell [Fri, 27 Jan 2017 23:42:36 +0000 (15:42 -0800)]
Fix segfault in zstreamtest MT
It was reading beyond the end of the input buffer because no errors were
detected. Once that was fixed, it wasn't making forward progress because
no errors were detected and it was waiting for input.
Nick Terrell [Fri, 27 Jan 2017 20:05:48 +0000 (12:05 -0800)]
Merge branch 'dev' into buck
* dev:
updated NEWS
fixed MSAN warnings in legacy decoders
Fix cmake build
updated NEWS
Edits as per comments, and change wildcard 'X' to '?'
Fix Visual Studios project
Fix pool.c threading.h import
Fix zstdmt_compress.h include
Fixed commented issues
Updated format specification to be easier to understand
improved #232 fix
Fixed https://github.com/facebook/zstd/issues/232
.travis.yml: different tests for "master" branch
.travis.yml: optimized order of short tests
.travis.yml: test jobs 12-15
JOB_NUMBER -eq 9
improved ZSTD_compressBlock_opt_extDict_generic
Yann Collet [Fri, 27 Jan 2017 18:44:03 +0000 (10:44 -0800)]
fixed MSAN warnings in legacy decoders
In some extraordinary circumstances,
*Length field can be generated from reading a partially uninitialized memory segment.
Data is correctly identified as corrupted later on,
but the read taints some later pointer arithmetic operation.
Yann Collet [Wed, 25 Jan 2017 20:31:07 +0000 (12:31 -0800)]
fixed zstdmt cli freeze issue with large nb of threads
fileio.c was continually pushing more content without giving a chance to flush compressed one.
It would block the job queue when input data was accumulated too fast (requiring to define many threads).
Fixed : fileio flushes whatever it can after each input attempt.
Yann Collet [Wed, 25 Jan 2017 01:02:26 +0000 (17:02 -0800)]
zstdmt cli and API allow selection of section sizes
By default, section sizes are 4x window size.
This new setting allow manual selection of section sizes.
The larger they are, the (slightly) better the compression ratio,
but also the higher the memory allocation cost,
and eventually the lesser the nb of possible threads,
since each section is compressed by a single thread.
It also introduces a prototype to set generic parameters,
ZSTDMT_setMTCtxParameter()
The idea is that it's possible to add enums
to extend the list of parameters that can be set this way.
This is more long-term oriented than a fixed-size struct.
Consider it as a test.
Yann Collet [Mon, 23 Jan 2017 07:49:52 +0000 (23:49 -0800)]
ZSTDMT_compressStream() becomes blocking when required to ensure forward progresses
In some (rare) cases, job list could be blocked by a first job still being processed,
while all following ones are completed, waiting to be flushed.
In such case, the current job-table implementation is unable to accept new job.
As a consequence, a call to ZSTDMT_compressStream() can be useless (nothing read, nothing flushed),
with the risk to trigger a busy-wait on the caller side
(needlessly loop over ZSTDMT_compressStream() ).
In such a case, ZSTDMT_compressStream() will block until the first job is completed and ready to flush.
It ensures some forward progress by guaranteeing it will flush at least a part of the completed job.
Energy-wasting busy-wait is avoided.
Yann Collet [Mon, 23 Jan 2017 00:40:06 +0000 (16:40 -0800)]
ZSTDMT_initCStream_usingDict() can outlive dict
Like ZSTD_initCStream_usingDict(),
ZSTDMT_initCStream_usingDict() now keep a copy of dict internally.
This way, dict can be released :
it does not longer have to outlive all future compression sessions.
Yann Collet [Thu, 19 Jan 2017 23:32:07 +0000 (15:32 -0800)]
Added ZSTDMT_initCStream_advanced() variant
Correctly compress with custom params and dictionary
Added relevant fuzzer test in zstreamtest
Also :
new macro ZSTDMT_SECTION_LOGSIZE_MIN, which sets a minimum size for a full job
(note : a flush() command can still generate a partial job anytime)
Yann Collet [Thu, 19 Jan 2017 20:12:50 +0000 (12:12 -0800)]
added streaming fuzzer tests for MT API
Also : fixed corner case, where nb of jobs completed becomes > jobQueueSize
which is possible when many flushes are issued
while there is not enough dst buffer to flush completed ones.
Yann Collet [Thu, 19 Jan 2017 18:32:55 +0000 (10:32 -0800)]
fixed Multi-threaded compression
MT compression generates a single frame.
Multi-threading operates by breaking the frames into independent sections.
But from a decoder perspective, there is no difference :
it's just a suite of blocks.
Problem is, decoder preserves repCodes from previous block to start decoding next block.
This is also valid between sections, since they are no different than changing block.
Previous version would incorrectly initialize repcodes to their default value at the beginning of each section.
When using them, there was a mismatch between encoder (default values) and decoder (values from previous block).
This change ensures that repcodes won't be used at the beginning of a new section.
It works by setting them to 0.
This only works with regular (single segment) variants : extDict variants will fail !
Fortunately, sections beyond the 1st one belong to this category.
To be checked : btopt strategy.
This change was only validated from fast to btlazy2 strategies.