From: Yann Collet Date: Mon, 5 Jun 2023 16:51:52 +0000 (-0700) Subject: fix a minor inefficiency in compress_superblock X-Git-Tag: v1.5.6^2~157^2 X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=1f83b7cfc459c2dbef00dc6276f790370e17aef6;p=thirdparty%2Fzstd.git fix a minor inefficiency in compress_superblock and in `decodecorpus`: the specific case `nbSeq=127` can be represented using the 1-byte format. Note that both the 1-byte and the 2-bytes formats are valid to represent this case, so there was no "error", produced data remains valid, it's just that the 1-byte format is more efficient. fix #3667 Credit to @ip7z for finding this issue. --- diff --git a/doc/zstd_compression_format.md b/doc/zstd_compression_format.md index 3843bf390..2a69c4c30 100644 --- a/doc/zstd_compression_format.md +++ b/doc/zstd_compression_format.md @@ -655,8 +655,9 @@ Let's call its first byte `byte0`. Decompressed content is defined entirely as Literals Section content. The FSE tables used in `Repeat_Mode` aren't updated. - `if (byte0 < 128)` : `Number_of_Sequences = byte0` . Uses 1 byte. -- `if (byte0 < 255)` : `Number_of_Sequences = ((byte0-128) << 8) + byte1` . Uses 2 bytes. -- `if (byte0 == 255)`: `Number_of_Sequences = byte1 + (byte2<<8) + 0x7F00` . Uses 3 bytes. +- `if (byte0 < 255)` : `Number_of_Sequences = ((byte0 - 0x80) << 8) + byte1`. Uses 2 bytes. + Note that the 2 bytes format fully overlaps the 1 byte format. +- `if (byte0 == 255)`: `Number_of_Sequences = byte1 + (byte2<<8) + 0x7F00`. Uses 3 bytes. __Symbol compression modes__ diff --git a/lib/compress/zstd_compress_superblock.c b/lib/compress/zstd_compress_superblock.c index 638c4acbe..dacaf85db 100644 --- a/lib/compress/zstd_compress_superblock.c +++ b/lib/compress/zstd_compress_superblock.c @@ -180,7 +180,7 @@ ZSTD_compressSubBlock_sequences(const ZSTD_fseCTables_t* fseTables, /* Sequences Header */ RETURN_ERROR_IF((oend-op) < 3 /*max nbSeq Size*/ + 1 /*seqHead*/, dstSize_tooSmall, ""); - if (nbSeq < 0x7F) + if (nbSeq < 128) *op++ = (BYTE)nbSeq; else if (nbSeq < LONGNBSEQ) op[0] = (BYTE)((nbSeq>>8) + 0x80), op[1] = (BYTE)nbSeq, op+=2; diff --git a/tests/decodecorpus.c b/tests/decodecorpus.c index e48eccd6d..a440ae38a 100644 --- a/tests/decodecorpus.c +++ b/tests/decodecorpus.c @@ -825,7 +825,7 @@ static size_t writeSequences(U32* seed, frame_t* frame, seqStore_t* seqStorePtr, /* Sequences Header */ if ((oend-op) < 3 /*max nbSeq Size*/ + 1 /*seqHead */) return ERROR(dstSize_tooSmall); - if (nbSeq < 0x7F) *op++ = (BYTE)nbSeq; + if (nbSeq < 128) *op++ = (BYTE)nbSeq; else if (nbSeq < LONGNBSEQ) op[0] = (BYTE)((nbSeq>>8) + 0x80), op[1] = (BYTE)nbSeq, op+=2; else op[0]=0xFF, MEM_writeLE16(op+1, (U16)(nbSeq - LONGNBSEQ)), op+=3;