git.ipfire.org Git - thirdparty/gcc.git/commit
arm: Stop vadcq, vsbcq intrinsics from overwriting the FPSCR NZ flags
author Stam Markianos-Wright <stam.markianos-wright@arm.com>
Thu, 27 Apr 2023 14:51:14 +0000 (15:51 +0100)
committer Stam Markianos-Wright <stam.markianos-wright@arm.com>
Thu, 18 May 2023 10:12:16 +0000 (11:12 +0100)
commit 8eedd1e1d6ab14ab2e394a692cd0b6edb5262dd1
tree 58fee8ed1552e2c5510e5958913a33d131350242
parent f2dd012ae6cd1f488103e6c17b46fef64d1b96fd

Hi all,

We noticed that calls to the vadcq and vsbcq intrinsics, both of
which use __builtin_arm_set_fpscr_nzcvqc to set the Carry flag in
the FPSCR, would produce the following code:

```
< r2 is the *carry input >
vmrs r3, FPSCR_nzcvqc
bic r3, r3, #536870912
orr r3, r3, r2, lsl #29
vmsr FPSCR_nzcvqc, r3
```

whereas the MVE ACLE specifies a different instruction sequence:
```
< Rt is the *carry input >
VMRS Rs,FPSCR_nzcvqc
BFI Rs,Rt,#29,#1
VMSR FPSCR_nzcvqc,Rs
```

The bic + orr pair is slower, and it is also wrong: the whole *carry
word is shifted left by 29, so if the *carry input is greater than 1 we
risk overwriting the top two bits of the FPSCR register (the N and Z
flags).
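
To make the failure mode concrete, here is a minimal host-side sketch in
plain C. It is not the header code: the function names are made up for
illustration, and the only facts it relies on are the flag positions
(N = bit 31, Z = bit 30, C = bit 29) and the two instruction sequences
quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Flag positions in FPSCR_nzcvqc.  */
#define FPSCR_N (1u << 31)
#define FPSCR_Z (1u << 30)
#define FPSCR_C (1u << 29)

/* Models the bic + orr pair: clear C, then OR in the whole carry word
   shifted left by 29, so bits 1 and 2 of carry land on Z and N.  */
static inline uint32_t
set_carry_bic_orr (uint32_t fpscr, uint32_t carry)
{
  return (fpscr & ~FPSCR_C) | (carry << 29);
}

/* Models BFI Rs, Rt, #29, #1: only bit 0 of carry is inserted at bit 29.  */
static inline uint32_t
set_carry_bfi (uint32_t fpscr, uint32_t carry)
{
  return (fpscr & ~FPSCR_C) | ((carry & 1u) << 29);
}

/* Hypothetical self-check: the two models agree for carry 0 and 1 and
   diverge once carry has higher bits set.  */
static inline void
check_carry_models (void)
{
  assert (set_carry_bic_orr (0u, 1u) == set_carry_bfi (0u, 1u));
  assert (set_carry_bic_orr (0u, 2u) == FPSCR_Z);   /* Z clobbered */
  assert (set_carry_bfi (0u, 2u) == 0u);            /* flags untouched */
}
```

Calling check_carry_models () from a test program should pass silently;
a *carry value of 2 is enough to set Z through the bic + orr model while
the BFI model leaves every flag alone.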

This turned out to be a problem in the header file, and the solution was
simply to add `& 0x1u` to the `*carry` input: the compiler then knows
that we only care about the lowest bit and can optimise to a BFI.
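
As a rough illustration of the masked update (a sketch only, assuming a
plain variable in place of the real register; `model_fpscr` and the
`model_*` helpers are hypothetical names, not anything in arm_mve.h):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the FPSCR_nzcvqc register.  */
static uint32_t model_fpscr;

/* Carry-in with the mask applied: only bit 29 can change.  */
static inline void
model_set_carry (const unsigned *carry)
{
  model_fpscr = (model_fpscr & ~0x20000000u) | ((*carry & 0x1u) << 29);
}

/* Carry-out: read bit 29 back as 0 or 1.  */
static inline unsigned
model_get_carry (void)
{
  return (model_fpscr >> 29) & 0x1u;
}

/* Hypothetical self-check: a carry word with extra bits set must not
   disturb N or Z.  */
static inline void
check_masked_carry (void)
{
  unsigned carry = 3u;             /* bit 1 set as well as bit 0 */
  model_fpscr = 0x80000000u;       /* N set, Z and C clear */
  model_set_carry (&carry);
  assert (model_fpscr == 0xA0000000u);  /* N kept, C set, Z untouched */
  assert (model_get_carry () == 1u);
}
```

With the mask in place the expression can only ever touch bit 29, which
is the single-bit insert that a BFI performs.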

Ok for trunk?

Thanks,
Stam Markianos-Wright

gcc/ChangeLog:

* config/arm/arm_mve.h (__arm_vadcq_s32): Fix arithmetic.
(__arm_vadcq_u32): Likewise.
(__arm_vadcq_m_s32): Likewise.
(__arm_vadcq_m_u32): Likewise.
(__arm_vsbcq_s32): Likewise.
(__arm_vsbcq_u32): Likewise.
(__arm_vsbcq_m_s32): Likewise.
(__arm_vsbcq_m_u32): Likewise.
* config/arm/mve.md (get_fpscr_nzcvqc): Make unspec_volatile.

gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/mve_vadcq_vsbcq_fpscr_overwrite.c: New.
gcc/config/arm/arm_mve.h
gcc/config/arm/mve.md
gcc/testsuite/gcc.target/arm/mve/mve_vadcq_vsbcq_fpscr_overwrite.c [new file with mode: 0644]