target/arm: Correct LDRD atomicity and fault behaviour
Our LDRD implementation is wrong in two respects:
* if the address is 4-aligned and the load crosses a page boundary
and the second load faults and the first load was to the
base register (as in cases like "ldrd r2, r3, [r2]", then we
must not update the base register before taking the fault
* if the address is 8-aligned the access must be a 64-bit
single-copy atomic access, not two 32-bit accesses
Rewrite the handling of the loads in LDRD to use a single
tcg_gen_qemu_ld_i64() and split the result into the destination
registers. This allows us to get the atomicity requirements
right, and also implicitly means that we won't update the
base register too early for the page-crossing case.
Note that because we no longer increment 'addr' by 4 in the course of
performing the LDRD we must change the adjustment value we pass to
op_addr_ri_post() and op_addr_rr_post(): it no longer needs to
subtract 4 to get the correct value to use if doing base register
writeback.
STRD has the same problem with not getting the atomicity right;
we will deal with that in the following commit.
Cc: qemu-stable@nongnu.org
Reported-by: Stu Grossman <stu.grossman@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id:
20250227142746.
1698904-2-peter.maydell@linaro.org