Yann Collet [Wed, 26 Dec 2018 23:51:34 +0000 (15:51 -0800)]
fixed detection of input==output on Visual
due to bad support of inode identifiers.
On Visual, option is limited to same file name,
which is imperfect, but way better than disabling the feature entirely.
Yann Collet [Wed, 26 Dec 2018 23:19:09 +0000 (15:19 -0800)]
fixed playTests.sh for minGW
On Windows, the equivalent of `/dev/null` is `NUL`.
When tests are run under msys2/minGW,
the environment identifies itself as Windows,
hence the script uses `NUL` instead of `/dev/null`
but the environment will consider `NUL` to be a regular file name.
Consequently, `NUL` will be overwritten during tests,
triggering an error.
This patch uses flag `-f` to force such overwrite
passing the test.
Yann Collet [Mon, 24 Dec 2018 10:56:21 +0000 (02:56 -0800)]
fixed tests for minimal decoder builds
It's incorrect to mix targets `all` and `check` with directive -j.
They will be build in parallel
and resulting artifacts will fight each other
with different sets of build options (such as DEBUGLEVEL).
Ancient versions of CMake required else(), endif(), and similar block
termination commands to have arguments matching the command starting the block.
This is no longer the preferred style.
Yann Collet [Sat, 22 Dec 2018 00:19:44 +0000 (16:19 -0800)]
fix confusion between unsigned <-> U32
as suggested in #1441.
generally U32 and unsigned are the same thing,
except when they are not ...
case : 32-bit compilation for MIPS (uint32_t == unsigned long)
A vast majority of transformation consists in transforming U32 into unsigned.
In rare cases, it's the other way around (typically for internal code, such as seeds).
Among a few issues this patches solves :
- some parameters were declared with type `unsigned` in *.h,
but with type `U32` in their implementation *.c .
- some parameters have type unsigned*,
but the caller user a pointer to U32 instead.
These fixes are useful.
However, the bulk of changes is about %u formating,
which requires unsigned type,
but generally receives U32 values instead,
often just for brevity (U32 is shorter than unsigned).
These changes are generally minor, or even annoying.
As a consequence, the amount of code changed is larger than I would expect for such a patch.
Testing is also a pain :
it requires manually modifying `mem.h`,
in order to lie about `U32`
and force it to be an `unsigned long` typically.
On a 64-bit system, this will break the equivalence unsigned == U32.
Unfortunately, it will also break a few static_assert(), controlling structure sizes.
So it also requires modifying `debug.h` to make `static_assert()` a noop.
And then reverting these changes.
So it's inconvenient, and as a consequence,
this property is currently not checked during CI tests.
Therefore, these problems can emerge again in the future.
I wonder if it is worth ensuring proper distinction of U32 != unsigned in CI tests.
It's another restriction for coding, adding more frustration during merge tests,
since most platforms don't need this distinction (hence contributor will not see it),
and while this can matter in theory, the number of platforms impacted seems minimal.
Yann Collet [Thu, 20 Dec 2018 00:54:15 +0000 (16:54 -0800)]
fixed OSSfuzz 11849
The problem was already masked,
due to no longer accepting tiny blocks for statistics.
But in case it could still happen with not-so-tiny blocks,
there is a stricter control which ensures that
nothing was already loaded prior to statistics collection.
Nick Terrell [Thu, 20 Dec 2018 00:24:59 +0000 (16:24 -0800)]
[regression] Add more configs
* Add configs that test multithreading, LDM, and setting explicit
parameters.
* Update the `compress cctx` method to accept `ZSTD_parameters`.
* Compile against the multithreaded `libzstd.a`.
* Update `results.csv` for the new configs.
Unless you think there are more configs/methods I should test, I think
we have a fairly wide set of configs/methods, so I'll pause adding
more for now.
Yann Collet [Wed, 19 Dec 2018 18:11:06 +0000 (10:11 -0800)]
fixed: compression ratio discrepancy
depending on initialization,
the first byte of a new frame was invalidated or not.
As a consequence, one match opportunity was available or not,
resulting in slightly different compressed sizes
(on average, 1 or 2 bytes once every 20 frames).
It impacted ratio comparison between one-shot and streaming modes.
This fix makes the first byte of a new frame always a valid match.
Now compressed size is always the same.
It also improves compressed size by a negligible amount.
Nick Terrell [Tue, 18 Dec 2018 22:24:49 +0000 (14:24 -0800)]
[libzstd] Fix estimate with negative levels
* Fix `ZSTD_estimateCCtxSize()` with negative levels.
* Fix `ZSTD_estimateCStreamSize()` with negative levels.
* Add a unit test to test for this error.
Yann Collet [Tue, 18 Dec 2018 20:32:58 +0000 (12:32 -0800)]
btultra2 and very small srcSize
When srcSize is small,
the nb of symbols produced is likely too small to warrant dedicated probability tables.
In which case, predefined distribution tables will be used instead.
There is a cheap algorithm in btultra initialization :
it presumes default distribution will be used if srcSize <= 1024.
btultra2 now uses the same threshold to shut down probability estimation,
since measured frequencies won't be used at entropy stage,
and therefore relying on them to determine sequence cost is misleading,
resulting in worse compression ratios.
This fixes btultra2 performance issue on very small input.
Note that, a proper way should be
to determine which symbol is going to use predefined probaility
and which symbol is going to use dynamic ones.
But the current algorithm is unable to make a "per-symbol" decision.
So this will require significant modifications.
Yi Jin [Tue, 18 Dec 2018 00:54:55 +0000 (16:54 -0800)]
break loadFile_orDie() into 2: loadFile_orDie() loads file into a pre-allocated memory buffer, mallocAndLoadFile_orDie() allocates memory first, then calls loadFile_orDie()