This commit reduces the overhead of each heap allocation done by
Valgrind's allocator, by overlapping the redzones (used while a block
is in use) with the prev/next pointers (used while it is free).
This cuts the overhead for a heap block allocated by the core from
32B to 16B on 32-bit machines, and from 48B to 32B on 64-bit machines.
The only conceivable downside is that on 64-bit machines, if
the client frees a block and then writes past the start/end of it,
it will corrupt the metadata after overwriting only 8 bytes, rather than
16 bytes. Memcheck will have squealed to kingdom come by this time anyway.
(This won't happen on 32-bit machines, because the overhead of client
blocks allocated by Memcheck is unchanged there.)
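
For illustration only, here is a minimal C sketch of the overlap described
above. All names (rz_or_link, overhead_separate, overhead_overlapped) are
hypothetical and not taken from Valgrind's source. The field breakdown
(two size fields, two free-list pointers, two redzones) is an assumption,
chosen because it reproduces the 64-bit figures; the 32-bit breakdown is
not modelled here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: one slot that is a redzone while the block is in use
   and a free-list link while the block is free. The two roles never
   coexist, so they can share storage. */
union rz_or_link {
    unsigned char redzone[sizeof(void *)]; /* in-use: guard bytes   */
    void *link;                            /* free: prev or next    */
};

/* Sharing storage means the slot costs one pointer, not two. */
static_assert(sizeof(union rz_or_link) == sizeof(void *),
              "overlapped slot should be pointer-sized");

/* Assumed old layout: two size fields, prev/next pointers, and
   separate low/high redzones of rz bytes each. */
static size_t overhead_separate(size_t word, size_t rz)
{
    return 2 * word + 2 * word + 2 * rz;
}

/* Assumed new layout: two size fields plus two shared slots. Each
   slot must be at least pointer-sized so it can hold a link. */
static size_t overhead_overlapped(size_t word, size_t rz)
{
    size_t slot = rz > word ? rz : word;
    return 2 * word + 2 * slot;
}
```

Under these assumptions, with 8-byte words and 8-byte redzones,
overhead_separate(8, 8) gives 48 and overhead_overlapped(8, 8) gives 32,
matching the 64-bit numbers quoted above.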