perf: faster huffman decoder, using x64 assembly, by @terrelln
perf: slightly faster high speed modes (strategies fast & dfast), by @felixhandte
perf: improved binary size and faster compilation times, by @terrelln
-perf: new row64 mode, used notably in levels 11 & 12, by @senhuang42
+perf: new row64 mode, used notably in level 12, by @senhuang42
+perf: faster mid-level compression speed in presence of highly repetitive patterns, by @senhuang42
perf: minor compression ratio improvements for small data at high levels, by @cyan4973
perf: reduced stack usage (mostly useful for Linux Kernel), by @terrelln
perf: faster compression speed on incompressible data, by @bindhvo