Like other loads in AArch64, the LDARB, LDARH and LDAR instructions clear the upper bits of their destination register, so we can avoid having to explicitly zero-extend their result.
We were missing a combine pattern for this, which this patch adds.
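For reference, the shape of source that triggers this is an acquire load of a narrow type whose result is used as a wider type. A sketch of the relevant testcase function, reconstructed from the assembly below (the global variable name is an assumption on my part):

#include <stdint.h>

uint8_t u8;

uint16_t
load_uint8_t_ext_uint16_t (void)
{
  /* 8-bit acquire load; the result is implicitly zero-extended
     to the 16-bit return type.  */
  return __atomic_load_n (&u8, __ATOMIC_ACQUIRE);
}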
For one of the examples in the testcase we generated:
load_uint8_t_ext_uint16_t:
        adrp    x0, .LANCHOR0
        add     x0, x0, :lo12:.LANCHOR0
        ldarb   w0, [x0]
        and     w0, w0, 255
        ret
but now generate:
load_uint8_t_ext_uint16_t:
        adrp    x0, .LANCHOR0
        add     x0, x0, :lo12:.LANCHOR0
        ldarb   w0, [x0]
        ret
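The same reasoning should apply to the other narrower-to-wider combinations the pattern's mode iterators allow. As a hypothetical illustration (not necessarily one of the testcase functions), zero-extending an 8-bit acquire load all the way to 64 bits also needs no explicit extension, since a write to a W register clears bits 63:32 as well:

#include <stdint.h>

uint8_t u8;

uint64_t
load_uint8_t_ext_uint64_t (void)
{
  /* A single LDARB should suffice here too: it zeroes bits 31:8
     of the W register, and the W-register write zeroes bits 63:32.  */
  return __atomic_load_n (&u8, __ATOMIC_ACQUIRE);
}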
Bootstrapped and tested on aarch64-none-linux-gnu.
gcc/ChangeLog:

	* config/aarch64/atomics.md (*atomic_load<ALLX:mode>_zext<SD_HSDI:mode>):
	New pattern.