All ABIs, except alpha, powerpc, and x86_64, define it to
atomic_full_barrier/__sync_synchronize, which can be mapped to
__atomic_thread_fence (__ATOMIC_SEQ_CST) in most cases, with the
exception of aarch64 (where the acquire fence is generated as
'dmb ishld' instead of 'dmb ish').
For s390x, it defaults to a memory barrier where __sync_synchronize
emits a 'bcr 15,0' (which the manual describes as pipeline
synchronization).
For PowerPC, it allows the use of lwsync for additional chips
(since _ARCH_PWR4 does not cover all chips that support it).
Tested on aarch64-linux-gnu, where the acquire fence produces a
different instruction than the current code.
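
To compare the generated barriers, a minimal sketch along these lines
can be compiled with 'gcc -O2 -S' for each target.  The function names
are hypothetical and exist only for illustration; only the compiler
builtins are real.  The instructions emitted depend on the target and
compiler version, so the output should be inspected rather than assumed.

```c
/* Hypothetical helpers, not part of glibc; they only wrap the compiler
   builtins discussed above so the emitted instructions can be compared.  */

/* Legacy full barrier, what atomic_full_barrier expands to.  */
void
fence_sync_synchronize (void)
{
  __sync_synchronize ();
}

/* C11-style sequentially consistent fence.  */
void
fence_seq_cst (void)
{
  __atomic_thread_fence (__ATOMIC_SEQ_CST);
}

/* C11-style acquire fence, weaker than the full barrier.  */
void
fence_acquire (void)
{
  __atomic_thread_fence (__ATOMIC_ACQUIRE);
}

/* Typical consumer pattern: a relaxed load followed by an acquire
   fence, so later accesses are ordered after the load.  */
int
load_then_acquire (const int *flag)
{
  int v = __atomic_load_n (flag, __ATOMIC_RELAXED);
  __atomic_thread_fence (__ATOMIC_ACQUIRE);
  return v;
}
```

Comparing the assembly for fence_seq_cst and fence_acquire on aarch64
should show the difference described above.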