]> git.ipfire.org Git - thirdparty/zstd.git/log
thirdparty/zstd.git
7 years agochanged a few variable names 1162/head
Yann Collet [Tue, 5 Jun 2018 00:10:50 +0000 (17:10 -0700)] 
changed a few variable names

to unify naming convention

7 years agoFixed a nasty corruption bug
Yann Collet [Fri, 1 Jun 2018 22:18:32 +0000 (15:18 -0700)] 
Fixed a nasty corruption bug

recently introduce into the new dictionary mode.
The bug could be reproduced with this command :
./zstreamtest -v --opaqueapi --no-big-tests -s4092 -t639

error was in function ZSTD_count_2segments() :
the beginning of the 2nd segment corresponds to prefixStart
and not the beginning of the current block (istart == src).
This would result in comparing the wrong byte.

7 years agoMerge pull request #1157 from facebook/decompressedSize
Yann Collet [Fri, 1 Jun 2018 17:28:17 +0000 (10:28 -0700)] 
Merge pull request #1157 from facebook/decompressedSize

minor : improved zstd.h API code comment

7 years agoMerge pull request #1151 from felixhandte/zstd-dfast-in-place-dict-goto
Yann Collet [Fri, 1 Jun 2018 00:37:09 +0000 (17:37 -0700)] 
Merge pull request #1151 from felixhandte/zstd-dfast-in-place-dict-goto

ZSTD_dfast: Support Searching the Dictionary Context In-Place (Alternate `goto` Implementation)

7 years agoAllow Different Dict Attachment Cut-Offs for Different Strategies 1151/head
W. Felix Handte [Fri, 25 May 2018 19:19:37 +0000 (15:19 -0400)] 
Allow Different Dict Attachment Cut-Offs for Different Strategies

7 years agoRemove Incorrect and Extraneous Repcode Bounds Check
W. Felix Handte [Thu, 31 May 2018 20:55:50 +0000 (16:55 -0400)] 
Remove Incorrect and Extraneous Repcode Bounds Check

7 years agominor : improved API code comment 1157/head
Yann Collet [Thu, 31 May 2018 18:12:18 +0000 (11:12 -0700)] 
minor : improved API code comment

Extend guarantee that ZSTD_getFrameContentSize() will delivering the decompressed size
to any single-pass compression function.

Answer #1156

7 years agoMerge pull request #1154 from facebook/altOpt
Yann Collet [Wed, 30 May 2018 20:05:51 +0000 (13:05 -0700)] 
Merge pull request #1154 from facebook/altOpt

minor optimal parser optimization

7 years agominor update of literal cost function 1154/head
Yann Collet [Tue, 29 May 2018 22:29:55 +0000 (15:29 -0700)] 
minor update of literal cost function

just assert() there is no negative cost evaluation for literals

7 years agosimplified optimal parser
Yann Collet [Tue, 29 May 2018 21:07:25 +0000 (14:07 -0700)] 
simplified optimal parser

removed "cached" structure.
prices are now saved in the optimal table.

Primarily done for simplification.
Might improve speed by a little.
But actually, and surprisingly, also improves ratio in some circumstances.

7 years agoMerge pull request #1153 from facebook/dynThreshold
Yann Collet [Sat, 26 May 2018 15:43:45 +0000 (08:43 -0700)] 
Merge pull request #1153 from facebook/dynThreshold

changed dynamic fse threshold for offset

7 years agofixed minor visual warning 1153/head
Yann Collet [Sat, 26 May 2018 03:43:09 +0000 (20:43 -0700)] 
fixed minor visual warning

7 years agoMerge pull request #1152 from facebook/lowCompression
Yann Collet [Sat, 26 May 2018 00:46:11 +0000 (17:46 -0700)] 
Merge pull request #1152 from facebook/lowCompression

btultra accepts blocks with poorer compression ratio

7 years agochanged dynamic fse threshold for offset
Yann Collet [Sat, 26 May 2018 00:41:16 +0000 (17:41 -0700)] 
changed dynamic fse threshold for offset

recent experienced showed that
default distribution table for offset
can get it wrong pretty quickly with the nb of symbols,
while it remains a reasonable choice much longer for lengths symbols.

Changed the formula,
so that dynamic threshold is now 32 symbols for offsets.
It remains at 64 symbols for lengths.

Detection based on defaultNormLog

7 years agoMerge branch 'dev' into lowCompression 1152/head
Yann Collet [Fri, 25 May 2018 22:45:03 +0000 (15:45 -0700)] 
Merge branch 'dev' into lowCompression

7 years agoMerge pull request #1141 from facebook/staticDictCost
Yann Collet [Fri, 25 May 2018 22:43:32 +0000 (15:43 -0700)] 
Merge pull request #1141 from facebook/staticDictCost

Random stuff on High Compression mode

7 years agobtultra accepts blocks with poorer compression ratio
Yann Collet [Fri, 25 May 2018 22:19:52 +0000 (15:19 -0700)] 
btultra accepts blocks with poorer compression ratio

zstd rejects blocks which do not compress by at least a certain amount.
In which case, such block is simply emitted uncompressed (even if a little bit of compression could be achieved).
This is better for decompression speed, hence for energy.

The logic is controlled by ZSTD_minGain().
The rule is applied uniformly, at all compression levels.

This change makes btultra accepts blocks with poor compression ratios.
We presume that users of btultra mode prefers compression ratio over some decompress speed gains.

The threshold for minimum gain is lowered for btultra
from s>>6 (~1.5% minimum gain)
to s>>7 (~0.8% minimum gain).

This is a prudent change.
Not sure if it's large enough.

7 years agoslightly nudge choices towards less sequences 1141/head
Yann Collet [Fri, 25 May 2018 21:52:21 +0000 (14:52 -0700)] 
slightly nudge choices towards less sequences

also slightly improve some strange detrimental corner cases.

7 years agoCheck Long + 1 Matches in Both Prefix and Dict in Bothe Short Match Paths
W. Felix Handte [Wed, 23 May 2018 20:43:33 +0000 (16:43 -0400)] 
Check Long + 1 Matches in Both Prefix and Dict in Bothe Short Match Paths

7 years agoInterleave Prefix and Dict Searches
W. Felix Handte [Wed, 16 May 2018 19:20:16 +0000 (15:20 -0400)] 
Interleave Prefix and Dict Searches

7 years agoRefactor ZSTD_dfast to Use `goto`s
W. Felix Handte [Wed, 16 May 2018 18:55:20 +0000 (14:55 -0400)] 
Refactor ZSTD_dfast to Use `goto`s

7 years ago... When I Said "HashTable", I Meant "ChainTable" 1138/head
W. Felix Handte [Wed, 23 May 2018 21:37:47 +0000 (17:37 -0400)] 
... When I Said "HashTable", I Meant "ChainTable"

7 years agoFix Off-By-One Error
W. Felix Handte [Wed, 23 May 2018 20:45:58 +0000 (16:45 -0400)] 
Fix Off-By-One Error

7 years agoDisallow Too-Long Repcodes When Using an Attached Dict
W. Felix Handte [Wed, 23 May 2018 16:51:09 +0000 (12:51 -0400)] 
Disallow Too-Long Repcodes When Using an Attached Dict

7 years agoPort Changes Made to ZSTD_fast to ZSTD_dfast
W. Felix Handte [Tue, 15 May 2018 22:40:07 +0000 (18:40 -0400)] 
Port Changes Made to ZSTD_fast to ZSTD_dfast

7 years agoImplement Second Repcode Check
W. Felix Handte [Wed, 9 May 2018 22:46:33 +0000 (18:46 -0400)] 
Implement Second Repcode Check

7 years agoImplement First Repcode Check
W. Felix Handte [Wed, 9 May 2018 22:37:24 +0000 (18:37 -0400)] 
Implement First Repcode Check

7 years agoFind Dict Hash Table Matches
W. Felix Handte [Wed, 9 May 2018 22:31:51 +0000 (18:31 -0400)] 
Find Dict Hash Table Matches

7 years agoExisting Repcode Check Only Applies to noDict Case
W. Felix Handte [Wed, 9 May 2018 20:46:57 +0000 (16:46 -0400)] 
Existing Repcode Check Only Applies to noDict Case

7 years agoProperly Initialize Repcode Values
W. Felix Handte [Wed, 9 May 2018 19:23:22 +0000 (15:23 -0400)] 
Properly Initialize Repcode Values

7 years agoAdd Necessary Dict Variables
W. Felix Handte [Wed, 9 May 2018 19:28:47 +0000 (15:28 -0400)] 
Add Necessary Dict Variables

7 years agoRename 'lowest' to 'localLowest' to Prepare to Introduce Dict Indices
W. Felix Handte [Wed, 9 May 2018 19:22:36 +0000 (15:22 -0400)] 
Rename 'lowest' to 'localLowest' to Prepare to Introduce Dict Indices

7 years agoSkeleton for In-Place Impl for ZSTD_dfast
W. Felix Handte [Fri, 4 May 2018 19:59:27 +0000 (15:59 -0400)] 
Skeleton for In-Place Impl for ZSTD_dfast

7 years agoMerge branch 'dev' into staticDictCost
Yann Collet [Thu, 24 May 2018 23:21:02 +0000 (16:21 -0700)] 
Merge branch 'dev' into staticDictCost

7 years agoMerge pull request #1149 from terrelln/fuzz-py
Yann Collet [Thu, 24 May 2018 22:56:09 +0000 (15:56 -0700)] 
Merge pull request #1149 from terrelln/fuzz-py

Small fixes to fuzz.py

7 years agoImprove compiler detection to work on Mac 1149/head
Nick Terrell [Thu, 24 May 2018 01:46:38 +0000 (18:46 -0700)] 
Improve compiler detection to work on Mac

7 years agoDefine BIT_DEBUG for --debug
Nick Terrell [Thu, 24 May 2018 01:25:26 +0000 (18:25 -0700)] 
Define BIT_DEBUG for --debug

7 years agoIncrease the maximum file size
Nick Terrell [Thu, 24 May 2018 01:22:32 +0000 (18:22 -0700)] 
Increase the maximum file size

7 years agoSmall fixes to fuzz.py
Nick Terrell [Thu, 24 May 2018 01:04:52 +0000 (18:04 -0700)] 
Small fixes to fuzz.py

7 years agoMerge pull request #1150 from facebook/fracFse
Yann Collet [Thu, 24 May 2018 21:19:30 +0000 (14:19 -0700)] 
Merge pull request #1150 from facebook/fracFse

fix corner case when requiring cost of an FSE symbol

7 years agoMerge branch 'dev' into fracFse 1150/head
Yann Collet [Thu, 24 May 2018 21:09:49 +0000 (14:09 -0700)] 
Merge branch 'dev' into fracFse

7 years agofix corner case when requiring cost of an FSE symbol
Yann Collet [Thu, 24 May 2018 20:59:11 +0000 (13:59 -0700)] 
fix corner case when requiring cost of an FSE symbol

ensure that, when frequency[symbol]==0,
result is (tableLog + 1) bits
with both upper-bit and fractional-bit estimates.

Also : enable BIT_DEBUG in /tests

7 years agoMerge pull request #1117 from felixhandte/zstd-fast-in-place-dict
Yann Collet [Thu, 24 May 2018 02:32:25 +0000 (19:32 -0700)] 
Merge pull request #1117 from felixhandte/zstd-fast-in-place-dict

ZSTD_fast: Support Searching the Dictionary Context In-Place

7 years agoWork around bug in zstd decoder (#1147)
Nick Terrell [Thu, 24 May 2018 01:02:30 +0000 (18:02 -0700)] 
Work around bug in zstd decoder (#1147)

Work around bug in zstd decoder

Pull request #1144 exercised a new path in the zstd decoder that proved to
be buggy. Avoid the extremely rare bug by emitting an uncompressed block.

7 years agoMerge pull request #1146 from terrelln/fse-fix
Yann Collet [Wed, 23 May 2018 23:41:42 +0000 (16:41 -0700)] 
Merge pull request #1146 from terrelln/fse-fix

[zstd] Fix decompression edge case

7 years agoVariable declarations 1146/head
Nick Terrell [Wed, 23 May 2018 21:58:58 +0000 (14:58 -0700)] 
Variable declarations

7 years agoAssert that Dict and Current Window are Adjacent in Index Space 1117/head
W. Felix Handte [Wed, 23 May 2018 20:00:17 +0000 (16:00 -0400)] 
Assert that Dict and Current Window are Adjacent in Index Space

7 years agoMake loadedDictEnd an Index, not the Dict Len
W. Felix Handte [Tue, 22 May 2018 00:12:11 +0000 (20:12 -0400)] 
Make loadedDictEnd an Index, not the Dict Len

7 years agoFixes in re Comments
W. Felix Handte [Mon, 21 May 2018 22:27:08 +0000 (18:27 -0400)] 
Fixes in re Comments

7 years agoDon't Attach Empty Dict Contents
W. Felix Handte [Tue, 15 May 2018 21:23:16 +0000 (17:23 -0400)] 
Don't Attach Empty Dict Contents

In weird corner cases, they produce unexpected results...

7 years agoAvoid Undefined Behavior in Match Ptr Calculation
W. Felix Handte [Tue, 15 May 2018 19:45:37 +0000 (15:45 -0400)] 
Avoid Undefined Behavior in Match Ptr Calculation

7 years agoRemove Out-of-Date Comment
W. Felix Handte [Tue, 15 May 2018 19:41:37 +0000 (15:41 -0400)] 
Remove Out-of-Date Comment

7 years agoMoar Renames
W. Felix Handte [Tue, 15 May 2018 17:16:50 +0000 (13:16 -0400)] 
Moar Renames

7 years agoAlso Attach Dict When Source Size is Unknown
W. Felix Handte [Tue, 15 May 2018 17:13:19 +0000 (13:13 -0400)] 
Also Attach Dict When Source Size is Unknown

7 years agoClear the Dictionary When Sliding the Window
W. Felix Handte [Tue, 15 May 2018 17:08:03 +0000 (13:08 -0400)] 
Clear the Dictionary When Sliding the Window

7 years agoRefine ip Initialization to Avoid ARM Weirdness
W. Felix Handte [Tue, 15 May 2018 05:15:33 +0000 (01:15 -0400)] 
Refine ip Initialization to Avoid ARM Weirdness

7 years agoUse New Index Invariant to Simplify Conditionals
W. Felix Handte [Thu, 10 May 2018 21:18:08 +0000 (17:18 -0400)] 
Use New Index Invariant to Simplify Conditionals

7 years agoForce Working Context Indices Greater than Dict Indices
W. Felix Handte [Thu, 10 May 2018 21:17:10 +0000 (17:17 -0400)] 
Force Working Context Indices Greater than Dict Indices

7 years agoWhitespace Fix
W. Felix Handte [Thu, 10 May 2018 17:46:19 +0000 (13:46 -0400)] 
Whitespace Fix

7 years agoSwitch to Original Match Calc for noDict Repcode Check
W. Felix Handte [Wed, 9 May 2018 22:40:23 +0000 (18:40 -0400)] 
Switch to Original Match Calc for noDict Repcode Check

7 years agoRename 'hasDict' to 'dictMode'
W. Felix Handte [Wed, 9 May 2018 19:14:12 +0000 (15:14 -0400)] 
Rename 'hasDict' to 'dictMode'

7 years agoRespond to PR Comments; Formatting/Style/Lint Fixes
W. Felix Handte [Wed, 9 May 2018 17:14:20 +0000 (13:14 -0400)] 
Respond to PR Comments; Formatting/Style/Lint Fixes

7 years agoRename and Reformat
W. Felix Handte [Fri, 4 May 2018 17:08:07 +0000 (13:08 -0400)] 
Rename and Reformat

7 years agoChange Cut-Off to 8 KB
W. Felix Handte [Thu, 3 May 2018 02:28:29 +0000 (22:28 -0400)] 
Change Cut-Off to 8 KB

7 years agoFix Rep Code Initialization
W. Felix Handte [Thu, 3 May 2018 00:30:03 +0000 (20:30 -0400)] 
Fix Rep Code Initialization

7 years agoCoalesce hasDictMatchState and extDict Checks into One Enum and Rename Stuff
W. Felix Handte [Wed, 2 May 2018 21:34:34 +0000 (17:34 -0400)] 
Coalesce hasDictMatchState and extDict Checks into One Enum and Rename Stuff

7 years agoSplit Wrapper Functions to Cause Inlining
W. Felix Handte [Wed, 2 May 2018 21:10:51 +0000 (17:10 -0400)] 
Split Wrapper Functions to Cause Inlining

7 years agoAdd bounds check in repcode tests
W. Felix Handte [Wed, 2 May 2018 19:12:18 +0000 (15:12 -0400)] 
Add bounds check in repcode tests

7 years agoInitial Repcode Check Support for Ext Dict Ctx
W. Felix Handte [Tue, 1 May 2018 20:21:18 +0000 (16:21 -0400)] 
Initial Repcode Check Support for Ext Dict Ctx

7 years agoPreliminary Support in ZSTD_compressBlock_fast_generic() for Ext Dict Ctx
W. Felix Handte [Sat, 28 Apr 2018 04:42:37 +0000 (00:42 -0400)] 
Preliminary Support in ZSTD_compressBlock_fast_generic() for Ext Dict Ctx

7 years agoRefer to the Dictionary Match State In-Place (Sometimes)
W. Felix Handte [Fri, 27 Apr 2018 22:46:59 +0000 (18:46 -0400)] 
Refer to the Dictionary Match State In-Place (Sometimes)

7 years agoError if reported size is too large in edge case
Nick Terrell [Wed, 23 May 2018 21:47:20 +0000 (14:47 -0700)] 
Error if reported size is too large in edge case

7 years ago[zstd] Fix decompression edge case
Nick Terrell [Wed, 23 May 2018 19:16:00 +0000 (12:16 -0700)] 
[zstd] Fix decompression edge case

This edge case is only possible with the new optimal encoding selector,
since before zstd would always choose `set_basic` for small numbers of
sequences.

Fix `FSE_readNCount()` to support buffers < 4 bytes.

Credit to OSS-Fuzz

7 years agoMerge pull request #1144 from terrelln/fse-entropy
Yann Collet [Wed, 23 May 2018 02:25:37 +0000 (19:25 -0700)] 
Merge pull request #1144 from terrelln/fse-entropy

Approximate FSE encoding costs for selection

7 years agoMerge pull request #1145 from terrelln/spec
Yann Collet [Tue, 22 May 2018 23:21:40 +0000 (16:21 -0700)] 
Merge pull request #1145 from terrelln/spec

Clarify what happens when Number_of_Sequences == 0

7 years agoClarify what happens when Number_of_Sequences == 0 1145/head
Nick Terrell [Tue, 22 May 2018 23:12:33 +0000 (16:12 -0700)] 
Clarify what happens when Number_of_Sequences == 0

7 years agoFixes 1144/head
Nick Terrell [Tue, 22 May 2018 23:06:33 +0000 (16:06 -0700)] 
Fixes

7 years agoMerge branch 'dev' into staticDictCost
Yann Collet [Tue, 22 May 2018 22:10:05 +0000 (15:10 -0700)] 
Merge branch 'dev' into staticDictCost

7 years agodisable 2-passes strategy
Yann Collet [Tue, 22 May 2018 22:06:36 +0000 (15:06 -0700)] 
disable 2-passes strategy

7 years agoApproximate FSE encoding costs for selection
Nick Terrell [Mon, 16 Apr 2018 22:37:27 +0000 (15:37 -0700)] 
Approximate FSE encoding costs for selection

Estimate the cost for using FSE modes `set_basic`, `set_compressed`, and
`set_repeat`, and select the one with the lowest cost.

* The cost of `set_basic` is computed using the cross-entropy cost
  function `ZSTD_crossEntropyCost()`, using the normalized default count
  and the count.
* The cost of `set_repeat` is computed using `FSE_bitCost()`. We check the
  previous table to see if it is able to represent the distribution.
* The cost of `set_compressed` is computed with the entropy cost function
  `ZSTD_entropyCost()`, together with the cost of writing the normalized
  count `ZSTD_NCountCost()`.

7 years agoMerge pull request #1143 from facebook/tableLevels
Yann Collet [Sat, 19 May 2018 21:40:37 +0000 (14:40 -0700)] 
Merge pull request #1143 from facebook/tableLevels

Update table of compression levels

7 years agoMerge branch 'tableLevels' of github.com:facebook/zstd into tableLevels 1143/head
Yann Collet [Sat, 19 May 2018 01:23:40 +0000 (18:23 -0700)] 
Merge branch 'tableLevels' of github.com:facebook/zstd into tableLevels

7 years agoMerge branch 'dev' into tableLevels
Yann Collet [Sat, 19 May 2018 00:17:45 +0000 (17:17 -0700)] 
Merge branch 'dev' into tableLevels

7 years agoMerge pull request #1142 from terrelln/better-dict
Yann Collet [Sat, 19 May 2018 00:19:13 +0000 (17:19 -0700)] 
Merge pull request #1142 from terrelln/better-dict

[cover] Small compression ratio improvement

7 years agoMerge branch 'dev' into tableLevels
Yann Collet [Sat, 19 May 2018 00:17:45 +0000 (17:17 -0700)] 
Merge branch 'dev' into tableLevels

7 years agoupdated compression levels for blocks of 256KB
Yann Collet [Sat, 19 May 2018 00:17:12 +0000 (17:17 -0700)] 
updated compression levels for blocks of 256KB

7 years ago[cover] Small compression ratio improvement 1142/head
Nick Terrell [Fri, 18 May 2018 22:25:10 +0000 (15:25 -0700)] 
[cover] Small compression ratio improvement

The cover algorithm selects one segment per epoch, and it selects the epoch
size such that `epochs * segmentSize ~= dictSize`. Selecting less epochs
gives the algorithm more candidates to choose from for each segment it
selects, and then it will loop back to the first epoch when it hits the
last one.

The trade off is that now it takes longer to select each segment, since it
has to look at more data before making a choice.

I benchmarked on the following data sets using this command:

```sh
$ZSTD -T0 -3 --train-cover=d=8,steps=256 $DIR -r -o dict && $ZSTD -3 -D dict -rc $DIR | wc -c
```

| Data set     | k (approx) |  Before  |  After   | % difference |
|--------------|------------|----------|----------|--------------|
| GitHub       | ~1000      |   738138 |   746610 |       +1.14% |
| hg-changelog | ~90        |  4295156 |  4285336 |       -0.23% |
| hg-commands  | ~500       |  1095580 |  1079814 |       -1.44% |
| hg-manifest  | ~400       | 16559892 | 16504346 |       -0.34% |

There is some noise in the measurements, since small changes to `k` can
have large differences, which is why I'm using `steps=256`, to try to
minimize the noise. However, the GitHub data set still has some noise.

If I run the GitHub data set on my Mac, which presumably lists directory
entries in a different order, so the dictionary builder sees the files in
a different order, or I use `steps=1024` I see these results.

| Run        | Before | After  | % difference |
|------------|--------|--------|--------------|
| steps=1024 | 738138 | 734470 |       -0.50% |
| MacBook    | 738451 | 737132 |       -0.18% |

Question: Should we expose this as a parameter? I don't think it is
necessary. Someone might want to turn it up to exchange a much longer
dictionary building time in exchange for a slightly better dictionary.
I tested `2`, `4`, and `16`, and `4` got most of the benefit of `16`
with a faster running time.

7 years agoMerge branch 'dev' into staticDictCost
Yann Collet [Fri, 18 May 2018 23:03:06 +0000 (16:03 -0700)] 
Merge branch 'dev' into staticDictCost

7 years agoadding some debug functions to observe statistics
Yann Collet [Fri, 18 May 2018 21:09:42 +0000 (14:09 -0700)] 
adding some debug functions to observe statistics

7 years agoMerge pull request #1139 from fbrosson/prefetch
Yann Collet [Fri, 18 May 2018 20:23:35 +0000 (13:23 -0700)] 
Merge pull request #1139 from fbrosson/prefetch

__builtin_prefetch did probably not exist before gcc 3.1.

7 years ago__builtin_prefetch did probably not exist before gcc 3.1. 1139/head
fbrosson [Fri, 18 May 2018 18:40:11 +0000 (18:40 +0000)] 
__builtin_prefetch did probably not exist before gcc 3.1.

7 years agoMerge pull request #1140 from fbrosson/cpu-asm
Yann Collet [Fri, 18 May 2018 17:32:16 +0000 (10:32 -0700)] 
Merge pull request #1140 from fbrosson/cpu-asm

Drop colon in asm snippet to make old versions of gcc happy.

7 years agoDrop colon in asm snippet to make old versions of gcc happy. 1140/head
fbrosson [Fri, 18 May 2018 17:05:36 +0000 (17:05 +0000)] 
Drop colon in asm snippet to make old versions of gcc happy.

7 years agofixed minor conversion warning
Yann Collet [Fri, 18 May 2018 00:27:27 +0000 (17:27 -0700)] 
fixed minor conversion warning

7 years agofixed a pretty complex bug when combining ldm + btultra
Yann Collet [Thu, 17 May 2018 23:13:53 +0000 (16:13 -0700)] 
fixed a pretty complex bug when combining ldm + btultra

7 years agocollect statistics for first block in ultra mode
Yann Collet [Thu, 17 May 2018 19:19:37 +0000 (12:19 -0700)] 
collect statistics for first block in ultra mode

this patch makes btultra do 2 passes on the first block,
the first one being dedicated to collecting statistics
so that the 2nd pass is more accurate.

It translates into a very small compression ratio gain :

enwik7, level 20:
blocks  4K : 2.142 -> 2.153
blocks 16K : 2.447 -> 2.457
blocks 64K : 2.716 -> 2.726

On the other hand, the cpu cost is doubled.

The trade off looks bad.
Though, that's ultimately a price to pay to reach better compression ratio.
So it's only enabled when setting btultra.

7 years agoslightly improved weight calculation
Yann Collet [Thu, 17 May 2018 18:19:05 +0000 (11:19 -0700)] 
slightly improved weight calculation

translating into a tiny compression ratio improvement

7 years agoupdate table levels for blocks <= 16K
Yann Collet [Wed, 16 May 2018 23:13:37 +0000 (16:13 -0700)] 
update table levels for blocks <= 16K

also : allow hlog to be slighly larger than windowlog,
as it's apparently good for both speed and compression ratio.

7 years agointroduced bit-fractional cost evaluation
Yann Collet [Wed, 16 May 2018 21:53:35 +0000 (14:53 -0700)] 
introduced bit-fractional cost evaluation

this improves compression ratio by a *tiny* amount.
It also reduces speed by a small amount.

Consequently, bit-fractional evaluation is only turned on for btultra.

7 years agoMerge pull request #1135 from facebook/frameCSize
Yann Collet [Tue, 15 May 2018 18:02:53 +0000 (11:02 -0700)] 
Merge pull request #1135 from facebook/frameCSize

decompress: changed error code when input is too large