Previously it was enabled only on x86-64 and ARM64 when also support
for unaligned access was detected or manually enabled at built time.
In the default build configuration, the 8-byte method is now enabled
also on 64-bit RISC-V and 64-bit PowerPC (both endiannesses). It was
reported that on big endian POWER9, encoding time may reduce 12-13 %.
This change only affects builds with GCC and Clang because the code
uses __builtin_ctzll or __builtin_clzll.
Thanks to Marcus Comstedt for testing on POWER9.
#if defined(TUKLIB_FAST_UNALIGNED_ACCESS) \
&& (((TUKLIB_GNUC_REQ(3, 4) || defined(__clang__)) \
- && (defined(__x86_64__) \
- || defined(__aarch64__))) \
+ && SIZE_MAX == UINT64_MAX) \
|| (defined(__INTEL_COMPILER) && defined(__x86_64__)) \
|| (defined(__INTEL_COMPILER) && defined(_M_X64)) \
|| (defined(_MSC_VER) && (defined(_M_X64) \