From 796ad10efb749d264fe262b624842a921f05e45c Mon Sep 17 00:00:00 2001 From: Mike Klein Date: Thu, 20 Sep 2018 20:34:42 +0000 Subject: [PATCH] remove 16-byte alignment from deflate_state::crc0 We noticed recently on the Skia tree that if we build Chromium's zlib with GCC, -O3, -m32, and -msse2, deflateInit2_() crashes. Might also need -fPIC... not sure. I tracked this down to a `movaps` (16-byte aligned store) to an address that was only 8-byte aligned. This address was somewhere in the middle of the deflate_state struct that deflateInit2_()'s job is to initialize. That deflate_state struct `s` is allocated using ZALLOC, which calls any user supplied zalloc if set, or the default if not. Neither one of these has any special alignment contract, so generally they'll tend to be 2*sizeof(void*) aligned. On 32-bit builds, that's 8-byte aligned. But because we've annotated crc0 as zalign(16), the natural alignment of the whole struct is 16-byte, and a compiler like GCC can feel free to use 16-byte aligned stores to parts of the struct that are 16-byte aligned, like the beginning, crc0, or any other part before or after crc0 that happens to fall on a 16-byte boundary. With -O3 and -msse2, GCC does exactly that, writing a few of the fields with one 16-byte store. The fix is simply to remove zalign(16). All the code that manipulates this field was actually already using unaligned loads and stores. You can see it all right at the top of crc_folding.c, CRC_LOAD and CRC_SAVE. This bug comes from the Intel performance patches we landed a few years ago, and isn't present in upstream zlib, Android's zlib, or Google's internal zlib. It doesn't seem to be tickled by Clang, and won't happen on 64-bit GCC builds: zalloc is likely 16-byte aligned there. I _think_ it's possible for it to trigger on non-x86 32-bit builds with GCC, but haven't tested that. I also have not tested MSVC. Reviewed-on: https://chromium-review.googlesource.com/1236613 --- deflate.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/deflate.h b/deflate.h index 81b97008..02d8f5a6 100644 --- a/deflate.h +++ b/deflate.h @@ -116,7 +116,7 @@ typedef struct internal_state { int last_flush; /* value of flush param for previous deflate call */ #ifdef X86_PCLMULQDQ_CRC - unsigned ALIGNED_(16) crc0[4 * 5]; + unsigned crc0[4 * 5]; #endif /* used by deflate.c: */ -- 2.47.2