From: elasota <1137273+elasota@users.noreply.github.com>
Date: Sun, 19 Nov 2023 20:33:37 +0000 (-0500)
Subject: Specify offset 0 as invalid
X-Git-Tag: v1.5.6^2~50^2
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=f06b18b3ff009ef7dc90294fca674658ddf139bf;p=thirdparty%2Fzstd.git

Specify offset 0 as invalid
---

diff --git a/doc/decompressor_accepted_invalid_data.md b/doc/decompressor_accepted_invalid_data.md
new file mode 100644
index 000000000..f08f963d9
--- /dev/null
+++ b/doc/decompressor_accepted_invalid_data.md
@@ -0,0 +1,14 @@
+Decompressor Accepted Invalid Data
+==================================
+
+This document describes the behavior of the reference decompressor in cases
+where it accepts an invalid frame instead of reporting an error.
+
+Zero offsets converted to 1
+---------------------------
+If a sequence is decoded with `literals_length = 0` and `offset_value = 3`
+while `Repeated_Offset_1 = 1`, the computed offset will be `0`, which is
+invalid.
+
+The reference decompressor will process this case as if the computed
+offset was `1`, including inserting `1` into the repeated offset list.
\ No newline at end of file
diff --git a/doc/zstd_compression_format.md b/doc/zstd_compression_format.md
index 4a8d338b7..a8d4a0f35 100644
--- a/doc/zstd_compression_format.md
+++ b/doc/zstd_compression_format.md
@@ -929,7 +929,10 @@ There is an exception though, when current sequence's `literals_length = 0`.
 In this case, repeated offsets are shifted by one,
 so an `offset_value` of 1 means `Repeated_Offset2`,
 an `offset_value` of 2 means `Repeated_Offset3`,
-and an `offset_value` of 3 means `Repeated_Offset1 - 1_byte`.
+and an `offset_value` of 3 means `Repeated_Offset1 - 1`.
+
+In the final case, if `Repeated_Offset1 - 1` evaluates to 0, then the
+data is considered corrupted.
 
 For the first block, the starting offset history is populated with following values :
 `Repeated_Offset1`=1, `Repeated_Offset2`=4, `Repeated_Offset3`=8,
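
The rule this patch documents can be sketched in C as follows. This is a minimal illustration, not the reference decoder's code: `RepHistory`, `resolve_offset`, and the `lenient` flag are hypothetical names introduced here. The history update (leave unchanged, swap, or insert at the front) is assumed to follow the repeat-offset rules described elsewhere in `zstd_compression_format.md`. With `lenient` set, the sketch mimics the behavior described in `decompressor_accepted_invalid_data.md` by treating a computed offset of 0 as 1; otherwise it reports corruption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for the offset history:
 * rep[0] = Repeated_Offset1, rep[1] = Repeated_Offset2, rep[2] = Repeated_Offset3. */
typedef struct { size_t rep[3]; } RepHistory;

/* Resolve one sequence's offset_value into an actual offset and update the
 * history. Returns 0 when the data must be considered corrupted (a valid
 * offset is never 0). `lenient` is an illustrative flag, not a zstd option:
 * when non-zero, a computed offset of 0 is processed as if it were 1. */
static size_t resolve_offset(RepHistory *h, size_t offset_value,
                             size_t literals_length, int lenient)
{
    assert(offset_value >= 1);  /* assumption: caller passes a decoded value >= 1 */

    if (offset_value > 3) {
        /* Regular offset: insert at the front of the history. */
        size_t off = offset_value - 3;
        h->rep[2] = h->rep[1];
        h->rep[1] = h->rep[0];
        h->rep[0] = off;
        return off;
    }

    /* Repeat code 1..3, shifted by one when literals_length == 0. */
    size_t idx = offset_value - 1 + (literals_length == 0);  /* 0..3 */
    if (idx == 0)
        return h->rep[0];  /* Repeated_Offset1: history unchanged */

    /* idx == 3 is the special case Repeated_Offset1 - 1. */
    size_t off = (idx == 3) ? h->rep[0] - 1 : h->rep[idx];
    if (off == 0) {
        if (!lenient)
            return 0;      /* format spec: data is considered corrupted */
        off = 1;           /* accepted-invalid-data behavior: treat as 1 */
    }

    /* idx == 1 swaps the first two entries; idx 2 and 3 insert at the front. */
    if (idx != 1)
        h->rep[2] = h->rep[1];
    h->rep[1] = h->rep[0];
    h->rep[0] = off;
    return off;
}
```

With the starting history 1, 4, 8 quoted in the hunk context, a first sequence with `literals_length = 0` and `offset_value = 3` hits exactly this case: the strict path returns 0 and leaves the history untouched, while the lenient path returns 1 and leaves the history as 1, 1, 4.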