lz4 has to decompress a whole "sequence" at a time. When the compressed
data is composed of a repeating pattern, the whole set of repeats has
do be docompressed, and the output buffer has to be big enough.
This is unfortunate, because potentially the slowdown is very big. We
are only interested in the field name, but we might have to decompress
the whole thing. But the full cost will be borne out only when the
full entry is a repeating pattern. In practice this shouldn't happen
(apart from tests and the like). Hopefully lz4 will be fixed to avoid
this problem, or it will grow a new function which we can use [1], so
this fix should be remporary.
[1] https://groups.google.com/d/msg/lz4c/_3kkz5N6n00/oTahzqErCgAJ
* prefix */
int r;
+ size_t size;
assert(src);
assert(src_size > 0);
r = LZ4_decompress_safe_partial(src + 8, *buffer, src_size - 8,
prefix_len + 1, *buffer_size);
+ if (r >= 0)
+ size = (unsigned) r;
+ else {
+ /* lz4 always tries to decode full "sequence", so in
+ * pathological cases might need to decompress the
+ * full field. */
+ r = decompress_blob_lz4(src, src_size, buffer, buffer_size, &size, 0);
+ if (r < 0)
+ return r;
+ }
- if (r < 0)
- return -EBADMSG;
- if ((unsigned) r >= prefix_len + 1)
+ if (size >= prefix_len + 1)
return memcmp(*buffer, prefix, prefix_len) == 0 &&
((const uint8_t*) *buffer)[prefix_len] == extra;
else