This patch would like to allow the VLS mode autovec for the
floating-point binary operation MAX/MIN.
Given below code example:
void test(float * restrict out, float * restrict in1, float * restrict in2)
{
for (int i = 0; i < 128; i++)
out[i] = __builtin_copysignf (in1[i], in2[i]);
}
Before this patch:
test:
csrr a4,vlenb
slli a4,a4,1
li a5,128
bleu a5,a4,.L2
mv a5,a4
.L2:
vsetvli zero,a5,e32,m8,ta,ma
vle32.v v8,0(a1)
vle32.v v16,0(a2)
vsetvli a4,zero,e32,m8,ta,ma
vfsgnj.vv v8,v8,v16
vsetvli zero,a5,e32,m8,ta,ma
vse32.v v8,0(a0)
ret
After this patch:
test:
li a5,128
vsetvli zero,a5,e32,m1,ta,ma
vle32.v v1,0(a1)
vle32.v v2,0(a2)
vfsgnj.vv v1,v1,v2
vse32.v v1,0(a0)
ret
Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:
* config/riscv/autovec-vls.md (copysign<mode>3): New pattern.
* config/riscv/vector.md: Extend iterator for VLS.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vls/def.h: New macro.
* gcc.target/riscv/rvv/autovec/vls/floating-point-sgnj-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/floating-point-sgnj-2.c: New test.