Yann Collet [Fri, 26 Jan 2018 20:15:43 +0000 (12:15 -0800)]
zstdmt : flush() only lock to read shared job members
Other job members are accessed directly.
This avoids a full job copy, which would access everything,
including a few members that are supposed to be used by worker only,
uselessly requiring additional locks to avoid race conditions.
Yann Collet [Fri, 26 Jan 2018 01:35:49 +0000 (17:35 -0800)]
zstdmt: removed job->jobCompleted
replaced by equivalent signal job->consumer == job->srcSize.
created additional functions
ZSTD_writeLastEmptyBlock()
and
ZSTDMT_writeLastEmptyBlock()
required when it's necessary to finish a frame with a last empty job, to create an "end of frame" marker.
Yann Collet [Tue, 23 Jan 2018 23:19:11 +0000 (15:19 -0800)]
zstdmt : fix end condition (ZSTD_e_end)
When ZSTD_e_end directive is provided,
the question is not only "are internal buffers completely flushed",
it is also "is current frame completed".
In some rare cases,
it was possible for internal buffers to be completely flushed,
triggering a @return == 0,
but frame was not completed as it needed a last null-size block to mark the end,
resulting in an unfinished frame.
Yann Collet [Tue, 23 Jan 2018 22:03:07 +0000 (14:03 -0800)]
zstdmt: fixed minor race condition
no real consequence, but pollute tsan tests :
job->dstBuff is being modified inside worker,
while main thread might read it accidentally
because it copies whole job.
But since it doesn't used dstBuff, there is no real consequence.
Other potential solution : only copy useful data, instead of whole job
Nick Terrell [Thu, 18 Jan 2018 21:28:30 +0000 (13:28 -0800)]
Set repcodes for empty ZSTD_CDict
When the dictionary is <= 8 bytes, no data is loaded from the dictionary.
In this case the repcodes weren't set, because they were inserted after the
size check. Fix this problem in general by first setting the cdict state to
a clean state of an empty dictionary, then filling the state from there.
Yann Collet [Wed, 17 Jan 2018 20:39:58 +0000 (12:39 -0800)]
zstdmt : fixed very large window sizes
would create too large buffers,
since default job size == window size * 4.
This would crash on 32-bit systems.
Also : jobSize being a 32-bit unsigned, it cannot be >= 4 GB,
so the formula was failing for large window sizes >= 1 GB.
Fixed now : max job Size is 2 GB, whatever the window size.
Yann Collet [Wed, 17 Jan 2018 01:35:00 +0000 (17:35 -0800)]
fix fileio progression status update
The compression % is no longer correct,
since it's no longer possible to make direct correlation
between nb bytes read and nb bytes written
due to large internal buffer inside CCtx
(exacerbated with --long).
The current "fix" is to no longer display the %.
A more complex solution will have to count exactly how much data has been consumed and compressed internally, within CCtx buffers.
Yann Collet [Wed, 17 Jan 2018 00:15:47 +0000 (16:15 -0800)]
introduced parameter ZSTD_p_nonBlockingMode
This new parameter makes it possible to call
streaming ZSTDMT with a single thread set
which is non blocking.
It makes it possible for the main thread to do other tasks in parallel
while the worker thread does compression.
Typically, for zstd cli, it means it can do I/O stuff.
Applied within fileio.c, this patch provides non-negligible gains during compression.
Tested on my laptop, with enwik9 (1000000000 bytes) : time zstd -f enwik9
With traditional single-thread blocking mode :
real 0m9.557s
user 0m8.861s
sys 0m0.538s
With new single-worker non blocking mode :
real 0m7.938s
user 0m8.049s
sys 0m0.514s
Yann Collet [Thu, 11 Jan 2018 19:16:32 +0000 (11:16 -0800)]
fixed #304
Pathological samples may result in literal section being incompressible.
This case is now detected,
and literal distribution is replaced by one that can be written into the dictionary.
Yann Collet [Thu, 11 Jan 2018 04:33:45 +0000 (20:33 -0800)]
fixed bug #976, reported by @indygreg
constants in zstd.h should not depend on MIN() macro which existence is not guaranteed.
Added a test to check the specific constants.
The test is a bit too specific.
But I have found no way to control a more generic "are all macro already defined" condition,
especially as this is a valid construction (the missing macro might be defined later, intentionnally).
Yann Collet [Sat, 30 Dec 2017 14:12:59 +0000 (15:12 +0100)]
improved btlazy2 : list of unsorted candidates can reach extDict
It used to stop on reaching extDict, for simplification.
As a consequence, there was a small loss of performance each time the round buffer would restart from beginning.
It's not a large difference though, just several hundreds of bytes on silesia.
This patch fixes it.
Yann Collet [Sat, 30 Dec 2017 10:37:18 +0000 (11:37 +0100)]
updated compression level table for btlazy2
now selected for levels 13, 14 and 15.
Also : dropped the requirement for monotonic memory budget increase of compression levels,,
which was required for ZSTD_estimateCCtxSize()
in order to ensure that a memory budget for level L is large enough for any level <= L.
This condition is now ensured at run time inside ZSTD_estimateCCtxSize().
Yann Collet [Sat, 30 Dec 2017 10:37:18 +0000 (11:37 +0100)]
updated compression level table for btlazy2
now selected for levels 13, 14 and 15.
Also : dropped the requirement for monotonic memory budget increase of compression levels,,
which was required for ZSTD_estimateCCtxSize()
in order to ensure that a memory budget for level L is large enough for any level <= L.
This condition is now ensured at run time inside ZSTD_estimateCCtxSize().