RISC-V: Support RVV VLA SLP auto-vectorization
author Juzhe-Zhong <juzhe.zhong@rivai.ai>
Wed, 7 Jun 2023 03:19:15 +0000 (11:19 +0800)
committer Pan Li <pan2.li@intel.com>
Wed, 7 Jun 2023 06:02:56 +0000 (14:02 +0800)
commit 631e86b7adb55fb5ce418ce4cb5a59a1a3a6faa7
tree c0cc992280066a5cca9b8a5483ffe746f2eae2b8
parent 3f085e45755643f13d4fa45a12a6ade45be98f95

This patch enables basic VLA SLP auto-vectorization.
Consider the following case:
#include <stdint.h>

void
f (uint8_t *restrict a, uint8_t *restrict b)
{
  for (int i = 0; i < 100; ++i)
    {
      a[i * 8 + 0] = b[i * 8 + 7] + 1;
      a[i * 8 + 1] = b[i * 8 + 7] + 2;
      a[i * 8 + 2] = b[i * 8 + 7] + 8;
      a[i * 8 + 3] = b[i * 8 + 7] + 4;
      a[i * 8 + 4] = b[i * 8 + 7] + 5;
      a[i * 8 + 5] = b[i * 8 + 7] + 6;
      a[i * 8 + 6] = b[i * 8 + 7] + 7;
      a[i * 8 + 7] = b[i * 8 + 7] + 3;
    }
}

To enable VLA SLP auto-vectorization, we need to be able to handle the following const vectors:

1. NPATTERNS = 8, NELTS_PER_PATTERN = 3.
{ 0, 0, 0, 0, 0, 0, 0, 0, 8, 8, 8, 8, 8, 8, 8, 8, 16, 16, 16, 16, 16, 16, 16, 16, ... }

2. NPATTERNS = 8, NELTS_PER_PATTERN = 1.
{ 1, 2, 8, 4, 5, 6, 7, 3, ... }

These vectors can be generated in the prologue.
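
As a rough illustration of how NPATTERNS and NELTS_PER_PATTERN describe the two vectors above, the sketch below expands such an encoding element by element. The helper name expand_encoded and its flat array layout are hypothetical and are not the interface used in riscv-v.cc. It assumes the usual stepped-encoding rule: patterns are interleaved, and with three encoded elements per pattern the series continues with the step between the second and third elements.

```c
#include <stdint.h>

/* Hypothetical helper, for illustration only.
   Return element I of a vector made of NPATTERNS interleaved patterns,
   each described by NELTS_PER_PATTERN encoded elements.  ENCODED holds
   the first NPATTERNS * NELTS_PER_PATTERN elements of the vector.  */
static int64_t
expand_encoded (const int64_t *encoded, unsigned npatterns,
                unsigned nelts_per_pattern, unsigned i)
{
  unsigned pattern = i % npatterns;   /* Which pattern element I is in.  */
  unsigned k = i / npatterns;         /* Position within that pattern.  */

  /* Elements inside the encoded prefix are stored explicitly.  */
  if (k < nelts_per_pattern)
    return encoded[k * npatterns + pattern];

  int64_t last = encoded[(nelts_per_pattern - 1) * npatterns + pattern];

  /* With 1 or 2 encoded elements, the last one simply repeats.  */
  if (nelts_per_pattern < 3)
    return last;

  /* With 3 encoded elements, continue the linear series whose step is
     the difference between the third and second elements.  */
  int64_t prev = encoded[(nelts_per_pattern - 2) * npatterns + pattern];
  int64_t step = last - prev;
  return last + (int64_t) (k - (nelts_per_pattern - 1)) * step;
}
```

With the first 24 elements of vector 1 as ENCODED (NPATTERNS = 8, NELTS_PER_PATTERN = 3), element i expands to (i / 8) * 8. With the 8 elements of vector 2 (NPATTERNS = 8, NELTS_PER_PATTERN = 1), element i expands to the (i % 8)-th addend.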

After this patch, we end up with the following codegen:

Prologue:
...
        vsetvli a7,zero,e16,m2,ta,ma
        vid.v   v4
        vsrl.vi v4,v4,3
        li      a3,8
        vmul.vx v4,v4,a3  ===> v4 = { 0, 0, 0, 0, 0, 0, 0, 0, 8, 8, 8, 8, 8, 8, 8, 8, 16, 16, 16, 16, 16, 16, 16, 16, ... }
...
        li      t1,67633152
        addi    t1,t1,513
        li      a3,50790400
        addi    a3,a3,1541
        slli    a3,a3,32
        add     a3,a3,t1
        vsetvli t1,zero,e64,m1,ta,ma
        vmv.v.x v3,a3   ===> v3 = { 1, 2, 8, 4, 5, 6, 7, 3, ... }
...
LoopBody:
...
        min     a3,...
        vsetvli zero,a3,e8,m1,ta,ma
        vle8.v  v2,0(a6)
        vsetvli a7,zero,e8,m1,ta,ma
        vrgatherei16.vv v1,v2,v4
        vadd.vv v1,v1,v3
        vsetvli zero,a3,e8,m1,ta,ma
        vse8.v  v1,0(a2)
        add     a6,a6,a4
        add     a2,a2,a4
        mv      a3,a5
        add     a5,a5,t1
        bgtu    a3,a4,.L3
...
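
To make the prologue and loop body above easier to check by hand, here is a scalar C model of them. It is only a sketch: the function names pack_addends and slp_step are hypothetical, and the model takes the source pointer as given, since the elided parts of the listing do not show how the load base is set up. Passing b + 7 as the source reproduces the C loop from the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack the eight byte addends little-endian into one 64-bit scalar,
   mirroring the value built by li/addi/slli/add before vmv.v.x.  */
static uint64_t
pack_addends (const uint8_t addends[8])
{
  uint64_t x = 0;
  for (int i = 0; i < 8; i++)
    x |= (uint64_t) addends[i] << (8 * i);
  return x;
}

/* Scalar model of one loop-body step over VL byte elements.  */
static void
slp_step (uint8_t *dst, const uint8_t *src, size_t vl,
          const uint8_t addends[8])
{
  for (size_t i = 0; i < vl; i++)
    {
      /* vid.v; vsrl.vi 3; vmul.vx 8  ==>  { 0 x 8, 8 x 8, 16 x 8, ... }  */
      uint16_t index = (uint16_t) ((i >> 3) * 8);
      /* vrgatherei16.vv followed by vadd.vv with the repeating addends.  */
      dst[i] = (uint8_t) (src[index] + addends[i % 8]);
    }
}
```

For the addends { 1, 2, 8, 4, 5, 6, 7, 3 }, pack_addends returns 0x0307060504080201. This matches the listing: 67633152 + 513 is 0x04080201 and 50790400 + 1541 is 0x03070605, which becomes the upper half after the shift by 32.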

Note: for SEW = 8 we need to use "vrgatherei16.vv" instead of "vrgather.vv".
With SEW = 8, "vrgather.vv" uses 8-bit indices and so can only address element
indices up to 255, whereas "vrgatherei16.vv" uses 16-bit indices and covers a
larger range.
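
The index-range limit can be seen with a small model of how an element index is stored in an index vector of a given element width. The helper names below are hypothetical and only illustrate the arithmetic: 8-bit index elements lose any index above 255, which matters for VLA code because the number of byte elements in a vector is not known at compile time.

```c
#include <stdint.h>

/* Largest element index representable with EEW-bit index elements
   (8 for vrgather.vv at SEW = 8, 16 for vrgatherei16.vv).  */
static uint32_t
max_gather_index (unsigned eew)
{
  return (UINT32_C (1) << eew) - 1;
}

/* Model storing index I into an EEW-bit index element: the upper
   bits are lost.  */
static uint32_t
stored_index (uint32_t i, unsigned eew)
{
  return i & max_gather_index (eew);
}
```

In this model, index 256 stored with 8-bit elements becomes 0, so the gather would read the wrong element, while with 16-bit elements it stays 256.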

Epilogue:
        lbu     a5,799(a1)
        addiw   a4,a5,1
        sb      a4,792(a0)
        addiw   a4,a5,2
        sb      a4,793(a0)
        addiw   a4,a5,8
        sb      a4,794(a0)
        addiw   a4,a5,4
        sb      a4,795(a0)
        addiw   a4,a5,5
        sb      a4,796(a0)
        addiw   a4,a5,6
        sb      a4,797(a0)
        addiw   a4,a5,7
        sb      a4,798(a0)
        addiw   a5,a5,3
        sb      a5,799(a0)
        ret

One thing remains to be done: "Epilogue auto-vectorization", which needs
VLS modes support. I will support VLS modes for "Epilogue auto-vectorization"
in the future.

gcc/ChangeLog:

* config/riscv/riscv-protos.h (expand_vec_perm_const): New function.
* config/riscv/riscv-v.cc
(rvv_builder::can_duplicate_repeating_sequence_p): Support POLY
handling.
(rvv_builder::single_step_npatterns_p): New function.
(rvv_builder::npatterns_all_equal_p): Ditto.
(const_vec_all_in_range_p): Support POLY handling.
(gen_const_vector_dup): Ditto.
(emit_vlmax_gather_insn): Add vrgatherei16.
(emit_vlmax_masked_gather_mu_insn): Ditto.
(expand_const_vector): Add VLA SLP const vector support.
(expand_vec_perm): Support POLY.
(struct expand_vec_perm_d): New struct.
(shuffle_generic_patterns): New function.
(expand_vec_perm_const_1): Ditto.
(expand_vec_perm_const): Ditto.
* config/riscv/riscv.cc (riscv_vectorize_vec_perm_const): Ditto.
(TARGET_VECTORIZE_VEC_PERM_CONST): New target hook.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/scalable-1.c: Adapt testcase for VLA
vectorizer.
* gcc.target/riscv/rvv/autovec/v-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/zve32f_zvl128b-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/zve32x_zvl128b-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/zve64d-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/zve64d_zvl128b-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/zve64f-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/zve64f_zvl128b-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/zve64x_zvl128b-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/partial/slp-1.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp-2.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp-3.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp-4.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp-5.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp-6.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp-7.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp_run-1.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp_run-2.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp_run-3.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp_run-4.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp_run-5.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp_run-6.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp_run-7.c: New test.
26 files changed:
gcc/config/riscv/riscv-protos.h
gcc/config/riscv/riscv-v.cc
gcc/config/riscv/riscv.cc
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/slp-1.c [new file with mode: 0644]
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/slp-2.c [new file with mode: 0644]
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/slp-3.c [new file with mode: 0644]
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/slp-4.c [new file with mode: 0644]
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/slp-5.c [new file with mode: 0644]
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/slp-6.c [new file with mode: 0644]
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/slp-7.c [new file with mode: 0644]
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/slp_run-1.c [new file with mode: 0644]
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/slp_run-2.c [new file with mode: 0644]
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/slp_run-3.c [new file with mode: 0644]
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/slp_run-4.c [new file with mode: 0644]
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/slp_run-5.c [new file with mode: 0644]
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/slp_run-6.c [new file with mode: 0644]
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/slp_run-7.c [new file with mode: 0644]
gcc/testsuite/gcc.target/riscv/rvv/autovec/scalable-1.c
gcc/testsuite/gcc.target/riscv/rvv/autovec/v-1.c
gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f_zvl128b-1.c
gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x_zvl128b-1.c
gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d-1.c
gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d_zvl128b-1.c
gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f-1.c
gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f_zvl128b-1.c
gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x_zvl128b-1.c