aarch64: Reimplement vqmovun_high* intrinsics using builtins

This is another transition from inline asm to builtins.
Only three intrinsics are converted this time, but their inline asm uses the "+w"
constraint, which makes them more likely to generate redundant moves, so they
benefit more from the reimplementation.
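
For reference, a minimal scalar sketch of what vqmovun_high_s16 computes, to show why the destination is both read and written: the low half of the result is an input that must be preserved, and only the high half is filled with the saturated narrowed values. The struct types and the model_vqmovun_high_s16 and sat_s16_to_u8 names are hypothetical stand-ins for illustration; they are not the arm_neon.h definitions or the builtins used by this patch.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar stand-ins for uint8x8_t, int16x8_t and uint8x16_t. */
typedef struct { uint8_t v[8]; } u8x8;
typedef struct { int16_t v[8]; } s16x8;
typedef struct { uint8_t v[16]; } u8x16;

/* Saturate a signed 16-bit value to the unsigned 8-bit range [0, 255]. */
static uint8_t sat_s16_to_u8(int16_t x)
{
    if (x < 0)
        return 0;
    if (x > 255)
        return 255;
    return (uint8_t)x;
}

/* Scalar model (an assumption-labelled sketch, not the real intrinsic):
   the low 8 lanes are copied from r unchanged, the high 8 lanes receive
   the saturated, narrowed lanes of a.  */
static u8x16 model_vqmovun_high_s16(u8x8 r, s16x8 a)
{
    u8x16 out;
    memcpy(out.v, r.v, sizeof r.v);
    for (int i = 0; i < 8; i++)
        out.v[8 + i] = sat_s16_to_u8(a.v[i]);
    return out;
}
```

Because the low half flows through the destination, an inline asm version has to tie it to a read-write operand, which hides the data flow from the register allocator; a builtin lets the compiler see both inputs and the result.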