git.ipfire.org Git - thirdparty/zstd.git/log
Dmitriy Titarenko [Sun, 22 Nov 2020 18:45:18 +0000 (23:45 +0500)]
Pass dictBufferCapacity to COVER_selectDict()
closes #2371
sen [Fri, 20 Nov 2020 21:54:20 +0000 (16:54 -0500)]
Merge pull request #2387 from senhuang42/compress_sequence_API
[RFC] New sequence compression API
senhuang42 [Fri, 20 Nov 2020 16:23:22 +0000 (11:23 -0500)]
Add experimental param for sequence validation
senhuang42 [Fri, 20 Nov 2020 15:07:55 +0000 (10:07 -0500)]
Remove unnecessary repcode backup, apply style choices, use function pointer
sen [Thu, 19 Nov 2020 23:26:42 +0000 (18:26 -0500)]
Merge pull request #2395 from senhuang42/is_rle_speedup
10x speedup for ZSTD_isRLE()
sen [Thu, 19 Nov 2020 22:32:40 +0000 (17:32 -0500)]
Explicit cast for visual warnings
GitHub has automatic commits now! Cool
Co-authored-by: Nick Terrell <nickrterrell@gmail.com>
senhuang42 [Thu, 19 Nov 2020 16:56:16 +0000 (11:56 -0500)]
Unroll isRLE loop
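For context, an unrolled run-length check can be sketched as below. This is an illustration only, not the code from this commit: the helper name `is_rle_sketch` and the choice of four machine words per iteration are assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not zstd's ZSTD_isRLE()): returns 1 when all
 * `length` bytes of `src` hold the same value, 0 otherwise. */
static int is_rle_sketch(const unsigned char* src, size_t length)
{
    size_t expected;   /* one machine word filled with the first byte */
    size_t i = 0;

    if (length < 2) return 1;              /* 0 or 1 byte is trivially a run */
    memset(&expected, src[0], sizeof(expected));

    /* Unrolled part: test four words per iteration with a single branch. */
    while (length - i >= 4 * sizeof(size_t)) {
        size_t w[4];
        memcpy(w, src + i, sizeof(w));     /* safe unaligned load */
        if ((w[0] ^ expected) | (w[1] ^ expected) |
            (w[2] ^ expected) | (w[3] ^ expected)) {
            return 0;
        }
        i += sizeof(w);
    }

    /* Tail: fewer than four words remain, compare byte by byte. */
    for (; i < length; i++) {
        if (src[i] != src[0]) return 0;
    }
    return 1;
}
```

Comparing whole words cuts the number of branches per byte, which is the usual motivation for this kind of unrolling.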
senhuang42 [Wed, 18 Nov 2020 15:01:30 +0000 (10:01 -0500)]
Clean up visual conversion warnings
senhuang42 [Wed, 18 Nov 2020 14:52:24 +0000 (09:52 -0500)]
Improve documentation on ZSTD_compressSequences()
senhuang42 [Tue, 17 Nov 2020 15:13:22 +0000 (10:13 -0500)]
Modification to offset validation to include entire sequence
senhuang42 [Mon, 16 Nov 2020 23:05:35 +0000 (18:05 -0500)]
Fix assert edge case, improve documentation in zstd.h
senhuang42 [Mon, 16 Nov 2020 16:47:27 +0000 (11:47 -0500)]
Fix warnings and make validation enabled by default
senhuang42 [Mon, 16 Nov 2020 15:47:26 +0000 (10:47 -0500)]
Fix unit tests to agree with new changes
senhuang42 [Mon, 16 Nov 2020 15:44:57 +0000 (10:44 -0500)]
Add new sequence format as an experimental CCtx param
senhuang42 [Mon, 16 Nov 2020 15:36:06 +0000 (10:36 -0500)]
Overhaul logic to simplify, add in proper validations, fix match splitting
senhuang42 [Wed, 11 Nov 2020 21:28:17 +0000 (16:28 -0500)]
Add new sequence compress api params to cctx
senhuang42 [Tue, 10 Nov 2020 20:33:33 +0000 (15:33 -0500)]
Fix srcSize=0 edge case
senhuang42 [Tue, 10 Nov 2020 18:48:02 +0000 (13:48 -0500)]
Fix literals length calculation
senhuang42 [Fri, 6 Nov 2020 18:24:43 +0000 (13:24 -0500)]
Adjust unit tests to agree with new sequence generation API
senhuang42 [Fri, 6 Nov 2020 18:09:15 +0000 (13:09 -0500)]
Remove dstCapacity error check
senhuang42 [Fri, 6 Nov 2020 16:35:10 +0000 (11:35 -0500)]
Remove extraneous function in this API
senhuang42 [Thu, 5 Nov 2020 17:36:08 +0000 (12:36 -0500)]
Add check comparing offset to window size
senhuang42 [Wed, 4 Nov 2020 17:42:19 +0000 (12:42 -0500)]
Fix MSAN errors
senhuang42 [Wed, 4 Nov 2020 16:05:41 +0000 (11:05 -0500)]
Address edge case with endPosInSequence
senhuang42 [Wed, 4 Nov 2020 15:43:35 +0000 (10:43 -0500)]
Change debug levels to appropriate ones
senhuang42 [Wed, 4 Nov 2020 14:36:36 +0000 (09:36 -0500)]
Add RLE support
senhuang42 [Wed, 4 Nov 2020 00:00:02 +0000 (19:00 -0500)]
Fix various build warnings
senhuang42 [Tue, 3 Nov 2020 23:53:44 +0000 (18:53 -0500)]
Add test case to roundtrip using ZSTD_getSequences() and ZSTD_compressSequences()
senhuang42 [Tue, 3 Nov 2020 18:52:21 +0000 (13:52 -0500)]
Add documentation for new api functions
senhuang42 [Tue, 3 Nov 2020 18:41:20 +0000 (13:41 -0500)]
Refactor for enhanced code clarity
senhuang42 [Tue, 3 Nov 2020 18:37:50 +0000 (13:37 -0500)]
Rename internal function compressSequences(), and promote new *_ext() functions to their actual name
senhuang42 [Tue, 3 Nov 2020 18:31:07 +0000 (13:31 -0500)]
Add another API function to compress with existing CCTX
senhuang42 [Tue, 3 Nov 2020 18:05:57 +0000 (13:05 -0500)]
More adjustments to improve code clarity
senhuang42 [Tue, 3 Nov 2020 16:59:07 +0000 (11:59 -0500)]
Pull compressStream2() transparent initialization into its own function
senhuang42 [Mon, 2 Nov 2020 18:29:56 +0000 (13:29 -0500)]
Add initial support for new ZSTD_Sequence mode
senhuang42 [Mon, 2 Nov 2020 15:01:18 +0000 (10:01 -0500)]
Add sequence compression format param
senhuang42 [Mon, 2 Nov 2020 14:52:52 +0000 (09:52 -0500)]
Always ensure sequenceRange updates properly, add more error forwarding
senhuang42 [Mon, 2 Nov 2020 14:40:26 +0000 (09:40 -0500)]
Various minor logical refactors to improve clarity
senhuang42 [Fri, 30 Oct 2020 15:55:20 +0000 (11:55 -0400)]
Fix cSize calculation for noCompressBlocks
senhuang42 [Thu, 29 Oct 2020 18:58:06 +0000 (14:58 -0400)]
Rebased, roundtrips silesia.tar
senhuang42 [Thu, 29 Oct 2020 18:47:36 +0000 (14:47 -0400)]
Refactor for better debugging info
senhuang42 [Thu, 29 Oct 2020 16:52:58 +0000 (12:52 -0400)]
Corrections and edge-case fixes to be able to roundtrip dickens
senhuang42 [Thu, 29 Oct 2020 15:01:04 +0000 (11:01 -0400)]
Multi-block compression scaffolding - works on single-block files
senhuang42 [Thu, 29 Oct 2020 14:24:45 +0000 (10:24 -0400)]
Add support for uncompressible blocks
senhuang42 [Wed, 28 Oct 2020 17:40:37 +0000 (13:40 -0400)]
Enable usage of ZSTD_sequenceRange for single-block compression
senhuang42 [Wed, 28 Oct 2020 17:28:27 +0000 (13:28 -0400)]
Add logic to handle ZSTD_sequenceRange
senhuang42 [Wed, 28 Oct 2020 15:57:21 +0000 (11:57 -0400)]
Add last literals handling like getSequences()
senhuang42 [Wed, 28 Oct 2020 15:50:38 +0000 (11:50 -0400)]
Pull block compression out of main compressSequences() function
senhuang42 [Wed, 28 Oct 2020 15:28:12 +0000 (11:28 -0400)]
Implement ZSTD_updateSequenceRange
senhuang42 [Wed, 28 Oct 2020 15:07:36 +0000 (11:07 -0400)]
Modify SequenceRange to have posInSequence
senhuang42 [Wed, 28 Oct 2020 15:04:44 +0000 (11:04 -0400)]
Add function definition for sequenceRange updater
senhuang42 [Wed, 28 Oct 2020 15:04:18 +0000 (11:04 -0400)]
Add ZSTD_SequenceRange to count ranges in array of ZSTD_Sequence
senhuang42 [Tue, 27 Oct 2020 15:02:58 +0000 (11:02 -0400)]
Add support for repcodes
senhuang42 [Mon, 26 Oct 2020 16:33:58 +0000 (12:33 -0400)]
Code cleanup, add debuglog statements
senhuang42 [Wed, 30 Sep 2020 21:18:20 +0000 (17:18 -0400)]
Implement first pass at compressSequences()
senhuang42 [Fri, 13 Nov 2020 14:55:05 +0000 (09:55 -0500)]
Add initial function prototype for ZSTD_compressSequences_ext (to be renamed later)
sen [Sun, 15 Nov 2020 23:29:52 +0000 (18:29 -0500)]
Merge pull request #2393 from senhuang42/fix_sequence_extractions_api
Improve repcode handling in sequence extraction API
senhuang42 [Fri, 13 Nov 2020 14:41:44 +0000 (09:41 -0500)]
Reduce number of memcpy() calls
senhuang42 [Thu, 12 Nov 2020 21:38:23 +0000 (16:38 -0500)]
Use existing repcode update function to implement updates
senhuang42 [Thu, 12 Nov 2020 19:37:47 +0000 (14:37 -0500)]
Add in proper block repcode histories
senhuang42 [Thu, 12 Nov 2020 17:22:58 +0000 (12:22 -0500)]
Let block reps persist
senhuang42 [Thu, 12 Nov 2020 16:57:01 +0000 (11:57 -0500)]
Fix incorrect repcode setting
senhuang42 [Thu, 12 Nov 2020 16:09:01 +0000 (11:09 -0500)]
Improve unit test
senhuang42 [Thu, 12 Nov 2020 15:59:35 +0000 (10:59 -0500)]
Overhaul repcode handling logic
Yann Collet [Fri, 6 Nov 2020 19:38:08 +0000 (11:38 -0800)]
Merge pull request #2388 from facebook/fix2386
fix incorrect assert
sen [Fri, 6 Nov 2020 18:00:31 +0000 (13:00 -0500)]
Merge pull request #2381 from senhuang42/expand_sequence_extraction_api
Add enum to define ZSTD_Sequence type and update sequence extraction API
Yann Collet [Fri, 6 Nov 2020 17:57:05 +0000 (09:57 -0800)]
fix multiple minor conversion warnings
unrelated to #2386, just cleaning up while I'm updating this file ...
Yann Collet [Fri, 6 Nov 2020 17:44:04 +0000 (09:44 -0800)]
fix incorrect assert
fix #2386, reported by @Neumann-A
senhuang42 [Fri, 6 Nov 2020 15:56:56 +0000 (10:56 -0500)]
Update unit tests
senhuang42 [Fri, 6 Nov 2020 15:55:46 +0000 (10:55 -0500)]
Implement mergeGeneratedSequences()
senhuang42 [Fri, 6 Nov 2020 15:53:22 +0000 (10:53 -0500)]
Rename getSequences() to generateSequences()
senhuang42 [Fri, 6 Nov 2020 15:52:34 +0000 (10:52 -0500)]
Add new mergeGeneratedSequences() function
Nick Terrell [Thu, 5 Nov 2020 18:36:13 +0000 (10:36 -0800)]
Merge pull request #2385 from LuAPi/add-ZSTD_getDictID_fromCDict-single-commit
Add ZSTD_getDictID_fromCDict function to experimental section
Luke Pitt [Wed, 4 Nov 2020 11:37:37 +0000 (11:37 +0000)]
Add ZSTD_getDictID_fromCDict function to experimental section
senhuang42 [Mon, 2 Nov 2020 21:59:16 +0000 (16:59 -0500)]
Change block delimiter removing to linear time approach
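A linear-time delimiter removal can be sketched as a single in-place compaction pass. This is a sketch under stated assumptions: the `SeqSketch` type and function name are hypothetical, and it assumes a block delimiter is encoded as a sequence with `offset == 0` and `matchLength == 0` whose `litLength` counts the block's trailing literals.

```c
#include <stddef.h>

/* Hypothetical stand-in for a sequence record (not the real ZSTD_Sequence). */
typedef struct {
    unsigned offset;
    unsigned litLength;
    unsigned matchLength;
} SeqSketch;

/* Removes delimiter entries in one pass and returns the new count.
 * A delimiter's literals are folded into the next sequence so that the
 * total literal count is preserved. In this sketch a trailing delimiter
 * is simply dropped, along with its literal count. */
static size_t merge_delimiters_sketch(SeqSketch* seqs, size_t n)
{
    size_t in;
    size_t out = 0;
    for (in = 0; in < n; in++) {
        if (seqs[in].offset == 0 && seqs[in].matchLength == 0) {
            if (in + 1 < n) {
                seqs[in + 1].litLength += seqs[in].litLength;
            }
        } else {
            seqs[out++] = seqs[in];   /* keep real sequences, compacting left */
        }
    }
    return out;
}
```

Each entry is read once and written at most once, so the pass is O(n), whereas shifting the array after every delimiter would be quadratic in the worst case.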
senhuang42 [Mon, 2 Nov 2020 16:53:04 +0000 (11:53 -0500)]
Remove trailing comma
senhuang42 [Mon, 2 Nov 2020 16:43:19 +0000 (11:43 -0500)]
Use ZSTD_memmove() instead of memmove()
senhuang42 [Mon, 2 Nov 2020 16:35:27 +0000 (11:35 -0500)]
Revert compressibility change
senhuang42 [Mon, 2 Nov 2020 16:32:56 +0000 (11:32 -0500)]
Update name of enum, clarify documentation
senhuang42 [Mon, 2 Nov 2020 16:30:31 +0000 (11:30 -0500)]
Update unit test
senhuang42 [Mon, 2 Nov 2020 15:59:06 +0000 (10:59 -0500)]
Revert unnecessary seqCollector adjustment
senhuang42 [Mon, 2 Nov 2020 15:58:18 +0000 (10:58 -0500)]
Fix incorrect index increment in merge algorithm
senhuang42 [Mon, 2 Nov 2020 15:46:52 +0000 (10:46 -0500)]
Add algorithm to remove all delimiters
senhuang42 [Mon, 2 Nov 2020 15:19:26 +0000 (10:19 -0500)]
Update seqCollector definition
senhuang42 [Mon, 2 Nov 2020 15:17:59 +0000 (10:17 -0500)]
Update ZSTD_getSequences function signature
senhuang42 [Mon, 2 Nov 2020 15:15:53 +0000 (10:15 -0500)]
Add new enum for different sequence formats for ingestion/extraction
sen [Mon, 2 Nov 2020 01:33:25 +0000 (20:33 -0500)]
Merge pull request #2378 from senhuang42/free_cress_ptr
[minor] Pass cRess_t by const ptr instead of by value
Nick Terrell [Fri, 30 Oct 2020 22:09:38 +0000 (15:09 -0700)]
Merge pull request #2379 from terrelln/regression-test
[regression] Updates results.csv & add README
Nick Terrell [Fri, 30 Oct 2020 22:06:56 +0000 (15:06 -0700)]
Merge pull request #2354 from terrelln/stable-buffer
Add ZSTD_c_stable{In,Out}Buffer and optimize when set
Nick Terrell [Fri, 30 Oct 2020 20:55:52 +0000 (13:55 -0700)]
[regression] Add README explaining the test
Nick Terrell [Fri, 30 Oct 2020 20:54:30 +0000 (13:54 -0700)]
[regression] Update results.csv
https://github.com/facebook/zstd/pull/2339 removes the single-pass zstdmt API.
This changes the compressed size, because we no longer take the # of threads into
account when deciding the job size.
sen [Fri, 30 Oct 2020 19:47:25 +0000 (15:47 -0400)]
Merge pull request #2376 from senhuang42/clarify_sequence_extraction_api
Refine external ZSTD_Sequence API
Nick Terrell [Tue, 13 Oct 2020 01:40:14 +0000 (18:40 -0700)]
[test] Add unit tests for ZSTD_c_stable{In,Out}Buffer
Nick Terrell [Mon, 12 Oct 2020 21:47:55 +0000 (14:47 -0700)]
[lib] Avoid allocating the input buffer when ZSTD_c_stableInBuffer is set
We don't use it when we have a stable input buffer, so don't allocate
it. I had to slightly modify `ZSTD_copyCCtx()` by storing the
`ZSTD_buffered_policy_e` in the `ZSTD_CCtx`, since `inBuffSize > 0` is
no longer the correct signal for the buffered mode.
Nick Terrell [Mon, 12 Oct 2020 21:36:30 +0000 (14:36 -0700)]
[lib] Skip the input window buffer when ZSTD_c_stableInBuffer is set
Compress directly from the `ZSTD_inBuffer`. We still allocate the input
buffer. A following commit will remove that allocation.
Nick Terrell [Mon, 12 Oct 2020 21:19:04 +0000 (14:19 -0700)]
[cwksp] Return NULL when 0 bytes are requested
This ensures that the buffer is never used.
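The idea can be illustrated with a minimal bump allocator. The `WorkspaceSketch` type and `ws_reserve_sketch` function are hypothetical and do not reproduce the actual cwksp code; the point is only that a zero-byte request yields NULL rather than a valid-looking pointer.

```c
#include <stddef.h>

/* Hypothetical bump-allocator workspace (not zstd's ZSTD_cwksp). */
typedef struct {
    unsigned char* next;   /* first free byte */
    unsigned char* end;    /* one past the last usable byte */
} WorkspaceSketch;

/* Reserves `bytes` from the workspace. A request for 0 bytes returns
 * NULL and consumes nothing, so any accidental use of the "buffer"
 * fails loudly. This sketch also returns NULL when space runs out. */
static void* ws_reserve_sketch(WorkspaceSketch* ws, size_t bytes)
{
    void* p;
    if (bytes == 0) return NULL;
    if ((size_t)(ws->end - ws->next) < bytes) return NULL;
    p = ws->next;
    ws->next += bytes;
    return p;
}
```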
Nick Terrell [Mon, 12 Oct 2020 21:17:22 +0000 (14:17 -0700)]
[lib] Avoid allocating output buffer when ZSTD_c_stableOutBuffer is set
We compress directly to the `ZSTD_outBuffer` so we don't need to
allocate it.
Nick Terrell [Mon, 12 Oct 2020 21:12:23 +0000 (14:12 -0700)]
[lib] Compress directly into output when ZSTD_c_stableOutBuffer is set
When we have a stable output buffer always compress directly into the
`ZSTD_outBuffer`. We are allowed to return `dstSizeTooSmall`.
Nick Terrell [Mon, 12 Oct 2020 21:09:12 +0000 (14:09 -0700)]
[lib] Take the shortcut when ZSTD_c_stableOutBuffer is set
When we have a stable output buffer take the single-pass shortcut.
It is okay to return `dstSizeTooSmall` if the output buffer isn't
big enough, because we know it will never grow.
Nick Terrell [Mon, 12 Oct 2020 20:51:35 +0000 (13:51 -0700)]
[lib] Set ZSTD_c_stable{In,Out}Buffer in ZSTD_compress2()
Sets these parameters in ZSTD_compress2() then resets them to their
original values after the compression call.
An alternative design could be to add a flush mode `ZSTD_e_singlePass`
which implies `ZSTD_c_stable{In,Out}Buffer` but only for a single
compression call, by directly setting the applied parameters. I've opted
for the smaller change, but this is open for discussion.
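The save, set, restore pattern described above can be sketched as follows. The `CtxSketch` type and both functions are hypothetical stand-ins, not the real `ZSTD_CCtx` or `ZSTD_compress2()`; the inner call only reports whether it saw both flags set.

```c
/* Hypothetical context holding the two flags (not the real ZSTD_CCtx). */
typedef struct {
    int stableInBuffer;
    int stableOutBuffer;
} CtxSketch;

/* Stand-in for the streaming call: returns 1 if both buffers were
 * declared stable when it ran, 0 otherwise. */
static int compress_stream_sketch(const CtxSketch* ctx)
{
    return ctx->stableInBuffer && ctx->stableOutBuffer;
}

/* One-shot wrapper: force both flags on for this call only, then put
 * the caller's original settings back before returning. */
static int compress2_sketch(CtxSketch* ctx)
{
    int const savedIn  = ctx->stableInBuffer;
    int const savedOut = ctx->stableOutBuffer;
    int result;

    ctx->stableInBuffer  = 1;
    ctx->stableOutBuffer = 1;
    result = compress_stream_sketch(ctx);

    ctx->stableInBuffer  = savedIn;    /* restore on every path */
    ctx->stableOutBuffer = savedOut;
    return result;
}
```

The alternative mentioned in the message, a dedicated flush mode, would avoid touching the requested parameters at all, at the cost of a larger change.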