Remove obsolete catomic_* locking primitives which don't map
to standard compiler builtins.
There are still a couple of places in the tree that use them
(malloc/arena.c and malloc/malloc.c).
x86 didn't define __arch_c_compare_and_exchange_bool_* primitives,
so the fallback code used the __arch_c_compare_and_exchange_val_*
primitives instead.  This resulted in suboptimal code for
catomic_compare_and_exchange_bool_acq, where a superfluous
CMP was emitted after CMPXCHG, e.g. in arena_get2.
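As an illustrative sketch (not the actual sysdeps definitions), the
problem is that implementing the bool variant on top of the val variant
forces an explicit comparison of the returned old value, whereas mapping
it to the compiler builtin lets the flags set by CMPXCHG be used directly:

  /* Hypothetical fallback: bool CAS built on top of val CAS.  The
     "!= oldval" test is what shows up as a CMP after CMPXCHG.  */
  static inline int
  cas_bool_via_val (long *mem, long newval, long oldval)
  {
    return __sync_val_compare_and_swap (mem, oldval, newval) != oldval;
  }

  /* Mapping to the bool builtin consumes ZF from CMPXCHG directly.  */
  static inline int
  cas_bool_direct (long *mem, long newval, long oldval)
  {
    return !__atomic_compare_exchange_n (mem, &oldval, newval, 0,
                                         __ATOMIC_ACQUIRE, __ATOMIC_ACQUIRE);
  }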
OTOH, catomic_decrement does not fall back to the atomic_fetch_add (..., -1)
builtin but to a CMPXCHG loop, so the generated code in arena_get2
regresses a bit from using a single LOCK DECQ insn.
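For illustration only (an assumed shape of the generic fallback, not the
exact macro expansion), the difference is roughly:

  /* CMPXCHG-loop fallback: what the generic catomic_decrement path
     roughly expands to.  */
  static inline void
  decrement_via_cas_loop (long *mem)
  {
    long old = *mem;
    while (!__atomic_compare_exchange_n (mem, &old, old - 1, 0,
                                         __ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
      ;  /* OLD is refreshed by the builtin on failure.  */
  }

  /* Fetch-and-add builtin: lets the compiler emit a single
     LOCK DECQ (or LOCK SUBQ $1) on x86-64.  */
  static inline void
  decrement_via_fetch_add (long *mem)
  {
    __atomic_fetch_add (mem, -1, __ATOMIC_ACQUIRE);
  }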
Depending on the target processor, the compiler may emit either a
'LOCK ADD/SUB $1, m' or a 'LOCK INC/DEC m' instruction, because INC/DEC
update only part of the flags register and can cause a partial flag
register stall on some processors.
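For example, a plain atomic increment such as the sketch below
(illustrative, not from the patch) can be compiled to either form
depending on the selected -mtune:

  /* GCC may compile this to "lock addq $1, (%rdi)" or to
     "lock incq (%rdi)", depending on the target tuning.  */
  static inline void
  increment (long *mem)
  {
    __atomic_fetch_add (mem, 1, __ATOMIC_ACQUIRE);
  }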