git.ipfire.org Git - thirdparty/gcc.git/commit
aarch64: add zeroing forms for predicated SVE top FP conversions
author Artemiy Volkov <artemiy.volkov@arm.com>
Sat, 10 Jan 2026 07:40:59 +0000 (07:40 +0000)
committer Artemiy Volkov <artemiy.volkov@arm.com>
Fri, 29 May 2026 11:33:18 +0000 (11:33 +0000)
commit 2b371286fd58c8b59b9de2f062ae51e29a2d4e8d
tree a8013f763a27744b0974c9818e1a46bbe3a175d5
parent 7c7ace08e4aab91ca0ce4bbd318af181da7c460f

SVE2.2 (or in streaming mode, SME2.2) adds support for zeroing
predication for the following SVE FP conversion instructions:

SVE1:
- BFCVTNT (Single-precision convert to BFloat16 (top, predicated))

SVE2:
- FCVTLT (Floating-point widening convert (top, predicated))
- FCVTNT (Floating-point narrowing convert (top, predicated))
- FCVTXNT (Double-precision convert to single-precision, rounding
  to odd (top, predicated))

Additionally, this patch implements the corresponding intrinsics documented
in the ACLE manual [0], with the following signatures:

svfloat{32,64}_t svcvtlt_{f32[_f16],f64[_f32]}_z
  (svbool_t pg, svfloat{16,32}_t op);

sv{bfloat16,float16,float32}_t svcvtnt_{bf16[_f32],f16[_f32],f32[_f64]}_z
  (sv{bfloat16,float16,float32}_t even, svbool_t pg, svfloat{32,32,64}_t op);

svfloat32_t svcvtxnt_f32[_f64]_z
  (svfloat32_t even, svbool_t pg, svfloat64_t op);
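
As a rough illustration of what the "_z" form of a widening top conversion
computes, here is a scalar sketch in plain C.  It is not taken from the patch
and does not use the intrinsics themselves: model_cvtlt_f64_z is a
hypothetical helper, the predicate is simplified to one flag per 64-bit
result lane, and float/double stand in for the f32 -> f64 case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Scalar model (not the GCC implementation) of a zeroing-predicated
   "top" widening conversion from float to double.
   Assumption: pg[i] is one flag per 64-bit result lane.  */
static void
model_cvtlt_f64_z (double *dst, const bool *pg, const float *src,
                   size_t n_wide)
{
  assert (dst != NULL && pg != NULL && src != NULL);
  for (size_t i = 0; i < n_wide; i++)
    {
      /* Result lane i reads the odd-numbered ("top") narrow element.  */
      float top = src[2 * i + 1];
      /* Zeroing predication: inactive lanes are set to 0.0 rather than
         keeping their previous contents.  */
      dst[i] = pg[i] ? (double) top : 0.0;
    }
}
```

With src = {1.0f, 2.5f, 3.0f, 4.5f} and pg = {true, false}, the model writes
2.5 to lane 0 and 0.0 to lane 1, whatever lane 1 held before.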

This patch adds a new alternative to the corresponding RTL patterns that
emits the zeroing-predication form of each instruction above as a single
instruction (as long as the sve2p2_or_sme2p2 condition holds).  The
narrowing conversions ([B]FCVTNT and FCVTXNT) require an additional merge
operand that controls the values of inactive lanes, so their intrinsics have
been changed to use the new narrowing_top_convert SVE function base class,
which injects a const_vector selector operand at expand time.  Depending on
the value of this operand, either the destination vector or a constant zero
vector is used to supply values for inactive lanes.
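
The effect of the selector on a narrowing top conversion can be sketched
with a scalar model in plain C.  This is an illustration under stated
assumptions, not code from the patch: model_cvtnt_f32 is a hypothetical
helper, the const_vector selector is reduced to a single bool
(zero_inactive), the predicate is one flag per 64-bit source lane, and only
the f64 -> f32 case is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Scalar model (not the GCC implementation) of a predicated "top"
   narrowing conversion from double to float.  DST holds 2 * n_wide
   floats and plays the role of the tied destination: even-numbered
   elements are never written.  */
static void
model_cvtnt_f32 (float *dst, const bool *pg, const double *src,
                 size_t n_wide, bool zero_inactive)
{
  assert (dst != NULL && pg != NULL && src != NULL);
  for (size_t i = 0; i < n_wide; i++)
    {
      float *top = &dst[2 * i + 1];
      if (pg[i])
        *top = (float) src[i];  /* Active lane: narrowed result.  */
      else if (zero_inactive)
        *top = 0.0f;            /* Selector picks the zero vector.  */
      /* Otherwise the selector picks the destination: keep *top.  */
    }
}
```

Starting from dst = {10, 11, 12, 13} with pg = {true, false} and
src = {1.5, 2.5}, the zeroing selector yields {10, 1.5, 12, 0}, while the
merging selector yields {10, 1.5, 12, 13}.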

The new tests all have "_z" in their names since they only cover the
zeroing-predication versions of their respective intrinsics.

[0] https://github.com/ARM-software/acle

gcc/ChangeLog:

* config/aarch64/aarch64-sve-builtins-base.cc (class svcvtnt_impl):
Remove.
(svcvtnt): Redefine using narrowing_top_convert.
* config/aarch64/aarch64-sve-builtins-functions.h
(class narrowing_top_convert): New SVE function base class.
(NARROWING_TOP_CONVERT0): New function-like macro for specializing
narrowing_top_convert.
(NARROWING_TOP_CONVERT1): Likewise.
* config/aarch64/aarch64-sve-builtins-sve2.cc (class svcvtxnt_impl):
Remove.
(svcvtxnt): Redefine using narrowing_top_convert.
* config/aarch64/aarch64-sve-builtins-sve2.def (svcvtlt): Allow
zeroing predication.
(svcvtnt): Likewise.
(svcvtxnt): Likewise.
* config/aarch64/aarch64-sve.md (@aarch64_sve_cvtnt<mode>):
Convert to compact syntax. Add operand 4 for values of
inactive lanes.  New alternative for zeroing predication.
* config/aarch64/aarch64-sve2.md
(*cond_<sve_fp_op><mode>_relaxed): Convert to compact syntax.
New alternative for zeroing predication.
(*cond_<sve_fp_op><mode>_strict): Likewise.
(@aarch64_sve_cvtnt<mode>): Convert to compact syntax. Add
operand 4 for values of inactive lanes.  New alternative for
zeroing predication.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve2/acle/asm/cvtlt_f32_z.c: New test.
* gcc.target/aarch64/sve2/acle/asm/cvtlt_f64_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/cvtnt_bf16_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/cvtnt_f16_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/cvtnt_f32_z.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/cvtxnt_f32_z.c: Likewise.
12 files changed:
gcc/config/aarch64/aarch64-sve-builtins-base.cc
gcc/config/aarch64/aarch64-sve-builtins-functions.h
gcc/config/aarch64/aarch64-sve-builtins-sve2.cc
gcc/config/aarch64/aarch64-sve-builtins-sve2.def
gcc/config/aarch64/aarch64-sve.md
gcc/config/aarch64/aarch64-sve2.md
gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cvtlt_f32_z.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cvtlt_f64_z.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cvtnt_bf16_z.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cvtnt_f16_z.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cvtnt_f32_z.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cvtxnt_f32_z.c [new file with mode: 0644]