From: Robin Dapp
Date: Wed, 17 May 2023 12:38:18 +0000 (+0200)
Subject: RISC-V: Add autovec sign/zero extension and truncation.
X-Git-Tag: basepoints/gcc-15~8837
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=25907509787e3ef68cd8054460893cd316a8186a;p=thirdparty%2Fgcc.git

RISC-V: Add autovec sign/zero extension and truncation.

This patch implements the autovec expanders for sign and zero extension
patterns as well as the accompanying truncations.  In order to use them,
additional mode_attr iterators as well as vectorizer hooks are required.
Using these hooks we can e.g. vectorize with VNx4QImode as base mode
and extend VNx4SI to VNx4DI.  The hooks are still going to be extended
in the future.

vf4 and vf8 truncations are emulated by truncating two and three times
respectively.

The patch also adds tests and changes some expectations for already
existing ones.

Combine does not yet handle binary operations of two widened operands
as we are missing the necessary split/rewrite patterns.  These will be
added at a later time.

Co-authored-by: Juzhe Zhong

gcc/ChangeLog:

	* config/riscv/autovec.md (<optab><v_double_trunc><mode>2): New
	expander.
	(<optab><v_quad_trunc><mode>2): Ditto.
	(<optab><v_oct_trunc><mode>2): Ditto.
	(trunc<mode><v_double_trunc>2): Ditto.
	(trunc<mode><v_quad_trunc>2): Ditto.
	(trunc<mode><v_oct_trunc>2): Ditto.
	* config/riscv/riscv-protos.h (vectorize_related_mode): Define.
	(autovectorize_vector_modes): Define.
	* config/riscv/riscv-v.cc (vectorize_related_mode): Implement
	hook.
	(autovectorize_vector_modes): Implement hook.
	* config/riscv/riscv.cc (riscv_autovectorize_vector_modes):
	Implement target hook.
	(riscv_vectorize_related_mode): Implement target hook.
	(TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): Define.
	(TARGET_VECTORIZE_RELATED_MODE): Define.
	* config/riscv/vector-iterators.md: Add lowercase versions of
	mode_attr iterators.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/binop/shift-rv32gcv.c: Adjust
	expectation.
	* gcc.target/riscv/rvv/autovec/binop/shift-rv64gcv.c: Ditto.
	* gcc.target/riscv/rvv/autovec/binop/vdiv-run.c: Ditto.
	* gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv.c: Ditto.
	* gcc.target/riscv/rvv/autovec/binop/vdiv-rv64gcv.c: Ditto.
	* gcc.target/riscv/rvv/autovec/binop/vdiv-template.h: Ditto.
	* gcc.target/riscv/rvv/autovec/binop/vrem-rv32gcv.c: Ditto.
	* gcc.target/riscv/rvv/autovec/binop/vrem-rv64gcv.c: Ditto.
	* gcc.target/riscv/rvv/autovec/zve32f_zvl128b-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/zve32x_zvl128b-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/zve64d-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/zve64f-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/zve64x-2.c: Ditto.
	* gcc.target/riscv/rvv/rvv.exp: Add new conversion tests.
	* gcc.target/riscv/rvv/vsetvl/avl_single-38.c: Do not vectorize.
	* gcc.target/riscv/rvv/vsetvl/avl_single-47.c: Ditto.
	* gcc.target/riscv/rvv/vsetvl/avl_single-48.c: Ditto.
	* gcc.target/riscv/rvv/vsetvl/avl_single-49.c: Ditto.
	* gcc.target/riscv/rvv/vsetvl/imm_switch-8.c: Ditto.
	* gcc.target/riscv/rvv/autovec/conversions/vncvt-run.c: New test.
	* gcc.target/riscv/rvv/autovec/conversions/vncvt-rv32gcv.c: New test.
	* gcc.target/riscv/rvv/autovec/conversions/vncvt-rv64gcv.c: New test.
	* gcc.target/riscv/rvv/autovec/conversions/vncvt-template.h: New test.
	* gcc.target/riscv/rvv/autovec/conversions/vsext-run.c: New test.
	* gcc.target/riscv/rvv/autovec/conversions/vsext-rv32gcv.c: New test.
	* gcc.target/riscv/rvv/autovec/conversions/vsext-rv64gcv.c: New test.
	* gcc.target/riscv/rvv/autovec/conversions/vsext-template.h: New test.
	* gcc.target/riscv/rvv/autovec/conversions/vzext-run.c: New test.
	* gcc.target/riscv/rvv/autovec/conversions/vzext-rv32gcv.c: New test.
	* gcc.target/riscv/rvv/autovec/conversions/vzext-rv64gcv.c: New test.
	* gcc.target/riscv/rvv/autovec/conversions/vzext-template.h: New test.
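Illustration only, not part of the patch: the emulation of vf4 and vf8 truncations by repeated halving can be sketched on scalars in plain C. Each `narrow_*` function stands in for one narrowing step on a single element; all function names here are hypothetical and do not appear in GCC.

```c
#include <assert.h>
#include <stdint.h>

/* One halving step per element: keep the low half of the value.  This
   is the scalar analogue of a single narrowing conversion.  */
uint32_t narrow_64_to_32 (uint64_t x) { return (uint32_t) x; }
uint16_t narrow_32_to_16 (uint32_t x) { return (uint16_t) x; }
uint8_t narrow_16_to_8 (uint16_t x) { return (uint8_t) x; }

/* Quarter-width truncation (32 -> 8 bits) as two halving steps.  */
uint8_t
trunc_quad (uint32_t x)
{
  return narrow_16_to_8 (narrow_32_to_16 (x));
}

/* Eighth-width truncation (64 -> 8 bits) as three halving steps.  */
uint8_t
trunc_oct (uint64_t x)
{
  return narrow_16_to_8 (narrow_32_to_16 (narrow_64_to_32 (x)));
}

/* Apply trunc_oct to N elements and check each result against a direct
   cast, which is what a single-step truncation would produce.  */
void
trunc_oct_array (uint8_t *dst, const uint64_t *src, int n)
{
  for (int i = 0; i < n; i++)
    {
      dst[i] = trunc_oct (src[i]);
      assert (dst[i] == (uint8_t) src[i]);
    }
}
```

Because every step only discards high bits, the chained result equals the direct truncation; the cost is one or two extra instructions and temporaries.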
---

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 7fe4d94de39a..64433c72ea87 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -373,3 +373,107 @@
     DONE;
   }
 )
+
+;; -------------------------------------------------------------------------
+;; ---- [INT] Sign and zero extension
+;; -------------------------------------------------------------------------
+;; Includes:
+;; - vzext.vf[2|4|8]
+;; - vsext.vf[2|4|8]
+;; -------------------------------------------------------------------------
+
+(define_expand "<optab><v_double_trunc><mode>2"
+  [(set (match_operand:VWEXTI 0 "register_operand")
+    (any_extend:VWEXTI
+     (match_operand:<V_DOUBLE_TRUNC> 1 "register_operand")))]
+  "TARGET_VECTOR"
+{
+  insn_code icode = code_for_pred_vf2 (<CODE>, <MODE>mode);
+  riscv_vector::emit_vlmax_insn (icode, riscv_vector::RVV_UNOP, operands);
+  DONE;
+})
+
+(define_expand "<optab><v_quad_trunc><mode>2"
+  [(set (match_operand:VQEXTI 0 "register_operand")
+    (any_extend:VQEXTI
+     (match_operand:<V_QUAD_TRUNC> 1 "register_operand")))]
+  "TARGET_VECTOR"
+{
+  insn_code icode = code_for_pred_vf4 (<CODE>, <MODE>mode);
+  riscv_vector::emit_vlmax_insn (icode, riscv_vector::RVV_UNOP, operands);
+  DONE;
+})
+
+(define_expand "<optab><v_oct_trunc><mode>2"
+  [(set (match_operand:VOEXTI 0 "register_operand")
+    (any_extend:VOEXTI
+     (match_operand:<V_OCT_TRUNC> 1 "register_operand")))]
+  "TARGET_VECTOR"
+{
+  insn_code icode = code_for_pred_vf8 (<CODE>, <MODE>mode);
+  riscv_vector::emit_vlmax_insn (icode, riscv_vector::RVV_UNOP, operands);
+  DONE;
+})
+
+;; -------------------------------------------------------------------------
+;; ---- [INT] Truncation
+;; -------------------------------------------------------------------------
+;; - vncvt.x.x.w
+;; -------------------------------------------------------------------------
+(define_expand "trunc<mode><v_double_trunc>2"
+  [(set (match_operand:<V_DOUBLE_TRUNC> 0 "register_operand")
+    (truncate:<V_DOUBLE_TRUNC>
+     (match_operand:VWEXTI 1 "register_operand")))]
+  "TARGET_VECTOR"
+{
+  insn_code icode = code_for_pred_trunc (<MODE>mode);
+  riscv_vector::emit_vlmax_insn (icode, riscv_vector::RVV_UNOP, operands);
+  DONE;
+})
+
+;; -------------------------------------------------------------------------
+;; Truncation to a mode whose inner mode size is a quarter of mode's.
+;; We emulate this with two consecutive vncvts.
+;; -------------------------------------------------------------------------
+(define_expand "trunc<mode><v_quad_trunc>2"
+  [(set (match_operand:<V_QUAD_TRUNC> 0 "register_operand")
+    (truncate:<V_QUAD_TRUNC>
+     (match_operand:VQEXTI 1 "register_operand")))]
+  "TARGET_VECTOR"
+{
+  rtx half = gen_reg_rtx (<V_DOUBLE_TRUNC>mode);
+  rtx opshalf[] = {half, operands[1]};
+  insn_code icode = code_for_pred_trunc (<MODE>mode);
+  riscv_vector::emit_vlmax_insn (icode, riscv_vector::RVV_UNOP, opshalf);
+
+  rtx ops[] = {operands[0], half};
+  icode = code_for_pred_trunc (<V_DOUBLE_TRUNC>mode);
+  riscv_vector::emit_vlmax_insn (icode, riscv_vector::RVV_UNOP, ops);
+  DONE;
+})
+
+;; -------------------------------------------------------------------------
+;; Truncation to a mode whose inner mode size is an eighth of mode's.
+;; We emulate this with three consecutive vncvts.
+;; -------------------------------------------------------------------------
+(define_expand "trunc<mode><v_oct_trunc>2"
+  [(set (match_operand:<V_OCT_TRUNC> 0 "register_operand")
+    (truncate:<V_OCT_TRUNC>
+     (match_operand:VOEXTI 1 "register_operand")))]
+  "TARGET_VECTOR"
+{
+  rtx half = gen_reg_rtx (<V_DOUBLE_TRUNC>mode);
+  rtx opshalf[] = {half, operands[1]};
+  insn_code icode = code_for_pred_trunc (<MODE>mode);
+  riscv_vector::emit_vlmax_insn (icode, riscv_vector::RVV_UNOP, opshalf);
+
+  rtx quarter = gen_reg_rtx (<V_QUAD_TRUNC>mode);
+  rtx opsquarter[] = {quarter, half};
+  icode = code_for_pred_trunc (<V_DOUBLE_TRUNC>mode);
+  riscv_vector::emit_vlmax_insn (icode, riscv_vector::RVV_UNOP, opsquarter);
+
+  rtx ops[] = {operands[0], quarter};
+  icode = code_for_pred_trunc (<V_QUAD_TRUNC>mode);
+  riscv_vector::emit_vlmax_insn (icode, riscv_vector::RVV_UNOP, ops);
+  DONE;
+})
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 36419c95bbd8..07689977f6db 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -253,6 +253,10 @@ enum frm_field_enum
   FRM_RMM = 0b100,
   FRM_DYN = 0b111
 };
+
+opt_machine_mode vectorize_related_mode (machine_mode, scalar_mode,
+                                         poly_uint64);
+unsigned int autovectorize_vector_modes (vec<machine_mode> *, bool);
 }
 
 /* We classify builtin types into two classes:
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index f71ad9e46a1b..3dd7ee84484d 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -1399,6 +1399,88 @@ get_cmp_insn_code (rtx_code code, machine_mode mode)
   return icode;
 }
 
+/* This hook gives the vectorizer more vector mode options.  We want it to not
+   only try modes with the maximum number of units a full vector can hold but
+   for example also half the number of units for a smaller element size.
+   Such vectors can be promoted to a full vector of widened elements
+   (still with the same number of elements, essentially vectorizing at a
+   fixed number of units rather than a fixed number of bytes).  */
+unsigned int
+autovectorize_vector_modes (vector_modes *modes, bool)
+{
+  if (autovec_use_vlmax_p ())
+    {
+      /* TODO: We will support RVV VLS auto-vectorization mode in the future. */
+      poly_uint64 full_size
+        = BYTES_PER_RISCV_VECTOR * ((int) riscv_autovec_lmul);
+
+      /* Start with a VNxYYQImode where YY is the number of units that
+         fit a whole vector.
+         Then try YY = nunits / 2, nunits / 4 and nunits / 8 which
+         is guided by the extensions we have available (vf2, vf4 and vf8).
+
+         - full_size: Try using full vectors for all element types.
+         - full_size / 2:
+           Try using 16-bit containers for 8-bit elements and full vectors
+           for wider elements.
+         - full_size / 4:
+           Try using 32-bit containers for 8-bit and 16-bit elements and
+           full vectors for wider elements.
+         - full_size / 8:
+           Try using 64-bit containers for all element types.  */
+      static const int rvv_factors[] = {1, 2, 4, 8};
+      for (unsigned int i = 0; i < sizeof (rvv_factors) / sizeof (int); i++)
+        {
+          poly_uint64 units;
+          machine_mode mode;
+          if (can_div_trunc_p (full_size, rvv_factors[i], &units)
+              && get_vector_mode (QImode, units).exists (&mode))
+            modes->safe_push (mode);
+        }
+    }
+  return 0;
+}
+
+/* If the given VECTOR_MODE is an RVV mode, first get the largest number
+   of units that fit into a full vector at the given ELEMENT_MODE.
+   We will have the vectorizer call us with a successively decreasing
+   number of units (as specified in autovectorize_vector_modes).
+   The starting mode is always the one specified by preferred_simd_mode. */
+opt_machine_mode
+vectorize_related_mode (machine_mode vector_mode, scalar_mode element_mode,
+                        poly_uint64 nunits)
+{
+  /* TODO: We will support RVV VLS auto-vectorization mode in the future. */
+  poly_uint64 min_units;
+  if (autovec_use_vlmax_p () && riscv_v_ext_vector_mode_p (vector_mode)
+      && multiple_p (BYTES_PER_RISCV_VECTOR * ((int) riscv_autovec_lmul),
+                     GET_MODE_SIZE (element_mode), &min_units))
+    {
+      machine_mode rvv_mode;
+      if (maybe_ne (nunits, 0U))
+        {
+          /* If we were given a number of units NUNITS, try to find an
+             RVV vector mode of inner mode ELEMENT_MODE with the same
+             number of units.  */
+          if (multiple_p (min_units, nunits)
+              && get_vector_mode (element_mode, nunits).exists (&rvv_mode))
+            return rvv_mode;
+        }
+      else
+        {
+          /* Look for a vector mode with the same number of units as the
+             VECTOR_MODE we were given.  We keep track of the minimum
+             number of units so far which determines the smallest necessary
+             but largest possible, suitable mode for vectorization.  */
+          min_units = ordered_min (min_units, GET_MODE_SIZE (vector_mode));
+          if (get_vector_mode (element_mode, min_units).exists (&rvv_mode))
+            return rvv_mode;
+        }
+    }
+
+  return default_vectorize_related_mode (vector_mode, element_mode, nunits);
+}
+
 /* Expand an RVV comparison.  */
 
 void
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index b16c60df6a75..92aaa9e93911 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -7605,6 +7605,28 @@ riscv_mode_priority (int, int n)
   return n;
 }
 
+/* Implement TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES.  */
+unsigned int
+riscv_autovectorize_vector_modes (vector_modes *modes, bool all)
+{
+  if (TARGET_VECTOR)
+    return riscv_vector::autovectorize_vector_modes (modes, all);
+
+  return default_autovectorize_vector_modes (modes, all);
+}
+
+/* Implement TARGET_VECTORIZE_RELATED_MODE.  */
+opt_machine_mode
+riscv_vectorize_related_mode (machine_mode vector_mode, scalar_mode element_mode,
+                              poly_uint64 nunits)
+{
+  if (TARGET_VECTOR)
+    return riscv_vector::vectorize_related_mode (vector_mode, element_mode,
+                                                 nunits);
+  return default_vectorize_related_mode (vector_mode, element_mode, nunits);
+}
+
+
 /* Initialize the GCC target structure.  */
 #undef TARGET_ASM_ALIGNED_HI_OP
 #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t"
@@ -7896,6 +7918,13 @@ riscv_mode_priority (int, int n)
 #undef TARGET_MODE_PRIORITY
 #define TARGET_MODE_PRIORITY riscv_mode_priority
 
+#undef TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES
+#define TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES \
+  riscv_autovectorize_vector_modes
+
+#undef TARGET_VECTORIZE_RELATED_MODE
+#define TARGET_VECTORIZE_RELATED_MODE riscv_vectorize_related_mode
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-riscv.h"
diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index 3096ac5be3cf..70fb5b80b1b9 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -1118,8 +1118,10 @@
   (VNx16HI "VNx16QI") (VNx32HI "VNx32QI") (VNx64HI "VNx64QI")
   (VNx1SI "VNx1HI") (VNx2SI "VNx2HI") (VNx4SI "VNx4HI") (VNx8SI "VNx8HI")
   (VNx16SI "VNx16HI") (VNx32SI "VNx32HI")
-  (VNx1DI "VNx1SI") (VNx2DI "VNx2SI") (VNx4DI "VNx4SI") (VNx8DI "VNx8SI") (VNx16DI "VNx16SI")
-  (VNx1DF "VNx1SF") (VNx2DF "VNx2SF") (VNx4DF "VNx4SF") (VNx8DF "VNx8SF") (VNx16DF "VNx16SF")
+  (VNx1DI "VNx1SI") (VNx2DI "VNx2SI") (VNx4DI "VNx4SI") (VNx8DI "VNx8SI")
+  (VNx16DI "VNx16SI")
+  (VNx1DF "VNx1SF") (VNx2DF "VNx2SF") (VNx4DF "VNx4SF") (VNx8DF "VNx8SF")
+  (VNx16DF "VNx16SF")
 ])
 
 (define_mode_attr V_QUAD_TRUNC [
@@ -1130,7 +1132,32 @@
 ])
 
 (define_mode_attr V_OCT_TRUNC [
-  (VNx1DI "VNx1QI") (VNx2DI "VNx2QI") (VNx4DI "VNx4QI") (VNx8DI "VNx8QI") (VNx16DI "VNx16QI")
+  (VNx1DI "VNx1QI") (VNx2DI "VNx2QI") (VNx4DI "VNx4QI") (VNx8DI "VNx8QI")
+  (VNx16DI "VNx16QI")
+])
+
+; Again in lower case.
+(define_mode_attr v_double_trunc [
+  (VNx1HI "vnx1qi") (VNx2HI "vnx2qi") (VNx4HI "vnx4qi") (VNx8HI "vnx8qi")
+  (VNx16HI "vnx16qi") (VNx32HI "vnx32qi") (VNx64HI "vnx64qi")
+  (VNx1SI "vnx1hi") (VNx2SI "vnx2hi") (VNx4SI "vnx4hi") (VNx8SI "vnx8hi")
+  (VNx16SI "vnx16hi") (VNx32SI "vnx32hi")
+  (VNx1DI "vnx1si") (VNx2DI "vnx2si") (VNx4DI "vnx4si") (VNx8DI "vnx8si")
+  (VNx16DI "vnx16si")
+  (VNx1DF "vnx1sf") (VNx2DF "vnx2sf") (VNx4DF "vnx4sf") (VNx8DF "vnx8sf")
+  (VNx16DF "vnx16sf")
+])
+
+(define_mode_attr v_quad_trunc [
+  (VNx1SI "vnx1qi") (VNx2SI "vnx2qi") (VNx4SI "vnx4qi") (VNx8SI "vnx8qi")
+  (VNx16SI "vnx16qi") (VNx32SI "vnx32qi")
+  (VNx1DI "vnx1hi") (VNx2DI "vnx2hi") (VNx4DI "vnx4hi") (VNx8DI "vnx8hi")
+  (VNx16DI "vnx16hi")
+])
+
+(define_mode_attr v_oct_trunc [
+  (VNx1DI "vnx1qi") (VNx2DI "vnx2qi") (VNx4DI "vnx4qi") (VNx8DI "vnx8qi")
+  (VNx16DI "vnx16qi")
 ])
 
 (define_mode_attr VINDEX_DOUBLE_TRUNC [
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/shift-rv32gcv.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/shift-rv32gcv.c
index d98100b32762..557a7c82531a 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/shift-rv32gcv.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/shift-rv32gcv.c
@@ -10,4 +10,3 @@
 /* { dg-final { scan-assembler {\tvsll\.vv} } } */
 /* { dg-final { scan-assembler {\tvsrl\.vv} } } */
 /* { dg-final { scan-assembler {\tvsra\.vv} } } */
-
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/shift-rv64gcv.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/shift-rv64gcv.c
index d9109fd8774f..01a9cb21efc0 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/shift-rv64gcv.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/shift-rv64gcv.c
@@ -3,9 +3,6 @@
 
 #include "shift-template.h"
 
-/* TODO: For int16_t and uint16_t we need widening/promotion patterns.
-   Therefore, expect only 4 vsll.vv instead of 6 for now.  */
-
-/* { dg-final { scan-assembler-times {\tvsll\.vv} 4 } } */
+/* { dg-final { scan-assembler-times {\tvsll\.vv} 6 } } */
 /* { dg-final { scan-assembler-times {\tvsrl\.vv} 3 } } */
 /* { dg-final { scan-assembler-times {\tvsra\.vv} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-run.c
index aa9a3c55abe6..5de339172fc2 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-run.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-run.c
@@ -15,7 +15,7 @@
     a##TYPE[i] = VAL * 3; \
     b##TYPE[i] = VAL; \
   } \
-  vadd_##TYPE (a##TYPE, a##TYPE, b##TYPE, SZ); \
+  vdiv_##TYPE (a##TYPE, a##TYPE, b##TYPE, SZ); \
   for (int i = 0; i < SZ; i++) \
     assert (a##TYPE[i] == 3);
 
@@ -23,7 +23,7 @@
   TYPE as##TYPE[SZ]; \
   for (int i = 0; i < SZ; i++) \
     as##TYPE[i] = VAL * 5; \
-  vadds_##TYPE (as##TYPE, as##TYPE, VAL, SZ); \
+  vdivs_##TYPE (as##TYPE, as##TYPE, VAL, SZ); \
   for (int i = 0; i < SZ; i++) \
     assert (as##TYPE[i] == 5);
 
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv.c
index 9759401b9efd..1dce9dd562ef 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv.c
@@ -3,7 +3,8 @@
 
 #include "vdiv-template.h"
 
-/* TODO: Implement vector type promotion.  We should have 6 vdiv.vv here.  */
+/* Currently we use an epilogue loop which also contains vdivs.  Therefore we
+   expect 10 vdiv[u]s instead of 6.  */
 
-/* { dg-final { scan-assembler-times {\tvdiv\.vv} 4 } } */
-/* { dg-final { scan-assembler-times {\tvdivu\.vv} 6 } } */
+/* { dg-final { scan-assembler-times {\tvdiv\.vv} 10 } } */
+/* { dg-final { scan-assembler-times {\tvdivu\.vv} 10 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-rv64gcv.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-rv64gcv.c
index 7d9b75ae0b19..16a18c466e05 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-rv64gcv.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-rv64gcv.c
@@ -3,7 +3,8 @@
 
 #include "vdiv-template.h"
 
-/* TODO: Implement vector type promotion.  We should have 6 vdiv.vv here.  */
+/* Currently we use an epilogue loop which also contains vdivs.  Therefore we
+   expect 10 vdiv[u]s instead of 6.  */
 
-/* { dg-final { scan-assembler-times {\tvdiv\.vv} 4 } } */
-/* { dg-final { scan-assembler-times {\tvdivu\.vv} 6 } } */
+/* { dg-final { scan-assembler-times {\tvdiv\.vv} 10 } } */
+/* { dg-final { scan-assembler-times {\tvdivu\.vv} 10 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-template.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-template.h
index 12a1de328744..f8d3bfde4ed6 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-template.h
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-template.h
@@ -2,7 +2,7 @@
 
 #define TEST_TYPE(TYPE) \
   __attribute__((noipa)) \
-  void vadd_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n) \
+  void vdiv_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n) \
   { \
     for (int i = 0; i < n; i++) \
       dst[i] = a[i] / b[i]; \
@@ -10,13 +10,12 @@
 
 #define TEST2_TYPE(TYPE) \
   __attribute__((noipa)) \
-  void vadds_##TYPE (TYPE *dst, TYPE *a, TYPE b, int n) \
+  void vdivs_##TYPE (TYPE *dst, TYPE *a, TYPE b, int n) \
   { \
     for (int i = 0; i < n; i++) \
       dst[i] = a[i] / b; \
   }
 
-/* *int8_t not autovec currently.  */
 #define TEST_ALL() \
  TEST_TYPE(int16_t) \
  TEST_TYPE(uint16_t) \
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrem-rv32gcv.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrem-rv32gcv.c
index 28cba510a930..df99f5019fb7 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrem-rv32gcv.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrem-rv32gcv.c
@@ -3,7 +3,8 @@
 
 #include "vrem-template.h"
 
-/* TODO: Implement vector type promotion.  We should have 6 vrem.vv here.  */
+/* Currently we use an epilogue loop which also contains vrems.  Therefore we
+   expect 10 vrem[u]s instead of 6.  */
 
-/* { dg-final { scan-assembler-times {\tvrem\.vv} 5 } } */
-/* { dg-final { scan-assembler-times {\tvremu\.vv} 6 } } */
+/* { dg-final { scan-assembler-times {\tvrem\.vv} 10 } } */
+/* { dg-final { scan-assembler-times {\tvremu\.vv} 10 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrem-rv64gcv.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrem-rv64gcv.c
index 5b6961d1f63b..3cff13a47e4a 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrem-rv64gcv.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vrem-rv64gcv.c
@@ -3,7 +3,8 @@
 
 #include "vrem-template.h"
 
-/* TODO: Implement vector type promotion.  We should have 6 vrem.vv here.  */
+/* Currently we use an epilogue loop which also contains vrems.  Therefore we
+   expect 10 vrem[u]s instead of 6.  */
 
-/* { dg-final { scan-assembler-times {\tvrem\.vv} 5 } } */
-/* { dg-final { scan-assembler-times {\tvremu\.vv} 6 } } */
+/* { dg-final { scan-assembler-times {\tvrem\.vv} 10 } } */
+/* { dg-final { scan-assembler-times {\tvremu\.vv} 10 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vncvt-run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vncvt-run.c
new file mode 100644
index 000000000000..f55d2dfce7f1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vncvt-run.c
@@ -0,0 +1,35 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model --param=riscv-autovec-preference=fixed-vlmax" } */
+
+#include "vncvt-template.h"
+
+#include <assert.h>
+
+#define SZ 256
+
+#define RUN(TYPE1,TYPE2) \
+  TYPE1 src##TYPE1##TYPE2[SZ]; \
+  TYPE2 dst##TYPE1##TYPE2[SZ]; \
+  for (int i = 0; i < SZ; i++) \
+  { \
+    src##TYPE1##TYPE2[i] = i; \
+    dst##TYPE1##TYPE2[i] = -1; \
+  } \
+  vncvt_##TYPE1##TYPE2 (dst##TYPE1##TYPE2, \
+                        src##TYPE1##TYPE2, SZ); \
+  for (int i = 0; i < SZ; i++) \
+    assert (dst##TYPE1##TYPE2[i] == i);
+
+
+#define RUN_ALL() \
+ RUN(uint16_t, uint8_t) \
+ RUN(uint32_t, uint8_t) \
+ RUN(uint64_t, uint8_t) \
+ RUN(uint32_t, uint16_t) \
+ RUN(uint64_t, uint16_t) \
+ RUN(uint64_t, uint32_t) \
+
+int main ()
+{
+  RUN_ALL()
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vncvt-rv32gcv.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vncvt-rv32gcv.c
new file mode 100644
index 000000000000..2b5aa0051cf4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vncvt-rv32gcv.c
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model -march=rv32gcv -mabi=ilp32d --param=riscv-autovec-preference=fixed-vlmax" } */
+
+#include "vncvt-template.h"
+
+/* { dg-final { scan-assembler-times {\tvncvt.x.x.w} 10 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vncvt-rv64gcv.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vncvt-rv64gcv.c
new file mode 100644
index 000000000000..29349b33da6c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vncvt-rv64gcv.c
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model -march=rv64gcv -mabi=lp64d --param=riscv-autovec-preference=fixed-vlmax" } */
+
+#include "vncvt-template.h"
+
+/* { dg-final { scan-assembler-times {\tvncvt.x.x.w} 10 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vncvt-template.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vncvt-template.h
new file mode 100644
index 000000000000..6b19ff12abe8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vncvt-template.h
@@ -0,0 +1,19 @@
+#include <stdint.h>
+
+#define TEST(TYPE1, TYPE2) \
+  __attribute__((noipa)) \
+  void vncvt_##TYPE1##TYPE2 (TYPE2 *dst, TYPE1 *a, int n) \
+  { \
+    for (int i = 0; i < n; i++) \
+      dst[i] = (TYPE1)a[i]; \
+  }
+
+#define TEST_ALL() \
+ TEST(uint16_t, uint8_t) \
+ TEST(uint32_t, uint8_t) \
+ TEST(uint32_t, uint16_t) \
+ TEST(uint64_t, uint8_t) \
+ TEST(uint64_t, uint16_t) \
+ TEST(uint64_t, uint32_t) \
+
+TEST_ALL()
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vsext-run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vsext-run.c
new file mode 100644
index 000000000000..d5f0190957a2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vsext-run.c
@@ -0,0 +1,35 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model --param=riscv-autovec-preference=fixed-vlmax" } */
+
+#include "vsext-template.h"
+
+#include <assert.h>
+
+#define SZ 256
+
+#define RUN(TYPE1,TYPE2) \
+  TYPE1 src##TYPE1##TYPE2[SZ]; \
+  TYPE2 dst##TYPE1##TYPE2[SZ]; \
+  for (int i = 0; i < SZ; i++) \
+  { \
+    src##TYPE1##TYPE2[i] = i - 128; \
dst##TYPE1##TYPE2[i] = 0; \ + } \ + vsext_##TYPE1##TYPE2 (dst##TYPE1##TYPE2, \ + src##TYPE1##TYPE2, SZ); \ + for (int i = 0; i < SZ; i++) \ + assert (dst##TYPE1##TYPE2[i] == i - 128); + + +#define RUN_ALL() \ + RUN(int8_t, int16_t) \ + RUN(int8_t, int32_t) \ + RUN(int8_t, int64_t) \ + RUN(int16_t, int32_t) \ + RUN(int16_t, int64_t) \ + RUN(int32_t, int64_t) \ + +int main () +{ + RUN_ALL() +} diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vsext-rv32gcv.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vsext-rv32gcv.c new file mode 100644 index 000000000000..538216ab9c34 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vsext-rv32gcv.c @@ -0,0 +1,8 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-std=c99 -fno-vect-cost-model -march=rv32gcv -mabi=ilp32d --param=riscv-autovec-preference=fixed-vlmax" } */ + +#include "vsext-template.h" + +/* { dg-final { scan-assembler-times {\tvsext\.vf2} 3 } } */ +/* { dg-final { scan-assembler-times {\tvsext\.vf4} 2 } } */ +/* { dg-final { scan-assembler-times {\tvsext\.vf8} 1 } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vsext-rv64gcv.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vsext-rv64gcv.c new file mode 100644 index 000000000000..29348cc67e54 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vsext-rv64gcv.c @@ -0,0 +1,8 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-std=c99 -fno-vect-cost-model -march=rv64gcv -mabi=lp64d --param=riscv-autovec-preference=fixed-vlmax" } */ + +#include "vsext-template.h" + +/* { dg-final { scan-assembler-times {\tvsext\.vf2} 3 } } */ +/* { dg-final { scan-assembler-times {\tvsext\.vf4} 2 } } */ +/* { dg-final { scan-assembler-times {\tvsext\.vf8} 1 } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vsext-template.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vsext-template.h new file mode 100644 index 
000000000000..c2f5fc92c99f --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vsext-template.h @@ -0,0 +1,19 @@ +#include + +#define TEST(TYPE1, TYPE2) \ + __attribute__((noipa)) \ + void vsext_##TYPE1##TYPE2 (TYPE2 *dst, TYPE1 *a, int n) \ + { \ + for (int i = 0; i < n; i++) \ + dst[i] = (TYPE1)a[i]; \ + } + +#define TEST_ALL() \ + TEST(int8_t, int16_t) \ + TEST(int8_t, int32_t) \ + TEST(int8_t, int64_t) \ + TEST(int16_t, int32_t) \ + TEST(int16_t, int64_t) \ + TEST(int32_t, int64_t) \ + +TEST_ALL() diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vzext-run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vzext-run.c new file mode 100644 index 000000000000..9d1c259f5921 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vzext-run.c @@ -0,0 +1,35 @@ +/* { dg-do run { target { riscv_vector } } } */ +/* { dg-additional-options "-std=c99 -fno-vect-cost-model --param=riscv-autovec-preference=fixed-vlmax" } */ + +#include "vzext-template.h" + +#include + +#define SZ 256 + +#define RUN(TYPE1,TYPE2) \ + TYPE1 src##TYPE1##TYPE2[SZ]; \ + TYPE2 dst##TYPE1##TYPE2[SZ]; \ + for (int i = 0; i < SZ; i++) \ + { \ + src##TYPE1##TYPE2[i] = i; \ + dst##TYPE1##TYPE2[i] = -1; \ + } \ + vzext_##TYPE1##TYPE2 (dst##TYPE1##TYPE2, \ + src##TYPE1##TYPE2, SZ); \ + for (int i = 0; i < SZ; i++) \ + assert (dst##TYPE1##TYPE2[i] == i); + + +#define RUN_ALL() \ + RUN(uint8_t, uint16_t) \ + RUN(uint8_t, uint32_t) \ + RUN(uint8_t, uint64_t) \ + RUN(uint16_t, uint32_t) \ + RUN(uint16_t, uint64_t) \ + RUN(uint32_t, uint64_t) \ + +int main () +{ + RUN_ALL() +} diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vzext-rv32gcv.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vzext-rv32gcv.c new file mode 100644 index 000000000000..3e92843a5c2d --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vzext-rv32gcv.c @@ -0,0 +1,8 @@ +/* { dg-do compile } */ +/* { dg-additional-options 
"-std=c99 -fno-vect-cost-model -march=rv32gcv -mabi=ilp32d --param=riscv-autovec-preference=fixed-vlmax" } */ + +#include "vzext-template.h" + +/* { dg-final { scan-assembler-times {\tvzext\.vf2} 3 } } */ +/* { dg-final { scan-assembler-times {\tvzext\.vf4} 2 } } */ +/* { dg-final { scan-assembler-times {\tvzext\.vf8} 1 } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vzext-rv64gcv.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vzext-rv64gcv.c new file mode 100644 index 000000000000..cee0012d58cb --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vzext-rv64gcv.c @@ -0,0 +1,8 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-std=c99 -fno-vect-cost-model -march=rv64gcv -mabi=lp64d --param=riscv-autovec-preference=fixed-vlmax" } */ + +#include "vzext-template.h" + +/* { dg-final { scan-assembler-times {\tvzext\.vf2} 3 } } */ +/* { dg-final { scan-assembler-times {\tvzext\.vf4} 2 } } */ +/* { dg-final { scan-assembler-times {\tvzext\.vf8} 1 } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vzext-template.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vzext-template.h new file mode 100644 index 000000000000..847905b690c1 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vzext-template.h @@ -0,0 +1,19 @@ +#include + +#define TEST(TYPE1, TYPE2) \ + __attribute__((noipa)) \ + void vzext_##TYPE1##TYPE2 (TYPE2 *dst, TYPE1 *a, int n) \ + { \ + for (int i = 0; i < n; i++) \ + dst[i] = (TYPE1)a[i]; \ + } + +#define TEST_ALL() \ + TEST(uint8_t, uint16_t) \ + TEST(uint8_t, uint32_t) \ + TEST(uint8_t, uint64_t) \ + TEST(uint16_t, uint32_t) \ + TEST(uint16_t, uint64_t) \ + TEST(uint32_t, uint64_t) \ + +TEST_ALL() diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f_zvl128b-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f_zvl128b-2.c index c53975e9b012..7f499befa82f 100644 --- 
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f_zvl128b-2.c +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f_zvl128b-2.c @@ -3,4 +3,4 @@ #include "template-1.h" -/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 4 "vect" } } */ +/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 5 "vect" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x_zvl128b-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x_zvl128b-2.c index 80a4796fd38b..2de09a29f02a 100644 --- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x_zvl128b-2.c +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x_zvl128b-2.c @@ -3,4 +3,4 @@ #include "template-1.h" -/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 2 "vect" } } */ +/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 3 "vect" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d-2.c index 5e38b41a5c3d..95d54d7b281c 100644 --- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d-2.c +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d-2.c @@ -3,4 +3,4 @@ #include "template-1.h" -/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 3 "vect" } } */ +/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 6 "vect" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f-2.c index ee37282f1f89..f9f44a949027 100644 --- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f-2.c +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f-2.c @@ -3,4 +3,4 @@ #include "template-1.h" -/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 3 "vect" } } */ +/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 5 "vect" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x-2.c index 
6a64a1a1fdfd..12703a7e0368 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x-2.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x-2.c
@@ -3,4 +3,4 @@
 
 #include "template-1.h"
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 2 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 3 "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp b/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
index 9809a421fc80..9466c2032e38 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
+++ b/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
@@ -65,6 +65,8 @@ foreach op $AUTOVEC_TEST_OPTS {
     "" "$op"
   dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/autovec/cmp/*.\[cS\]]] \
     "" "$op"
+  dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/autovec/conversions/*.\[cS\]]] \
+    "" "$op"
 }
 
 # VLS-VLMAX tests
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-38.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-38.c
index 34f1cd43a214..8606b10268d3 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-38.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-38.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-schedule-insns -fno-schedule-insns2 -fno-tree-vectorize" } */
 
 #include "riscv_vector.h"
 
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-47.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-47.c
index 935e1b10630d..11403203aa9d 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-47.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-47.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-schedule-insns -fno-schedule-insns2 -fno-tree-vectorize" } */
 
 #include 
"riscv_vector.h"
 
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-48.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-48.c
index a1f4b70d4b47..79af2ef450a9 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-48.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-48.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-schedule-insns -fno-schedule-insns2 -fno-tree-vectorize" } */
 
 #include "riscv_vector.h"
 
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-49.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-49.c
index 4fa68627cc0d..77fe05b68a72 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-49.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-49.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-schedule-insns -fno-schedule-insns2 -fno-tree-vectorize" } */
 
 #include "riscv_vector.h"
 
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/imm_switch-8.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/imm_switch-8.c
index 4a1ef3ce5e98..f8568a8c8981 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/imm_switch-8.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/imm_switch-8.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-schedule-insns -fno-schedule-insns2 -fno-tree-vectorize" } */
 
 #include "riscv_vector.h"