git.ipfire.org Git - thirdparty/gcc.git/commit

author	Ju-Zhe Zhong <juzhe.zhong@rivai.ai>
	Fri, 14 Jul 2023 23:45:00 +0000 (07:45 +0800)
committer	Pan Li <pan2.li@intel.com>
	Wed, 19 Jul 2023 13:36:56 +0000 (21:36 +0800)
commit	ba49332baba622cb9af8e34629636f2586664c7e
tree	55b32252c1641d98571fb7b2374c9d3e1781d4a3
parent	e029635cb72e6db72f1826b6b43fa4b299b2145f

VECT: Add mask_len_fold_left_plus for in-order floating-point reduction

Hi, Richard and Richi.

This patch adds the mask_len_fold_left_plus pattern to support in-order
floating-point reduction on targets that use length-based loop control.

Consider the following case:
double
foo2 (double *__restrict a,
     double init,
     int *__restrict cond,
     int n)
{
    for (int i = 0; i < n; i++)
      if (cond[i])
        init += a[i];
    return init;
}

ARM SVE:

...
vec_mask_and_60 = loop_mask_54 & mask__23.33_57;
vect__ifc__35.37_64 = .VCOND_MASK (vec_mask_and_60, vect__8.36_61, { 0.0, ... });
_36 = .MASK_FOLD_LEFT_PLUS (init_20, vect__ifc__35.37_64, loop_mask_54);
...

For RVV, we want to see:
...
_36 = .MASK_LEN_FOLD_LEFT_PLUS (init_20, vect__ifc__35.37_64, control_mask, loop_len, bias);
...
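
As a rough illustration of what the new internal function is expected to compute, here is a scalar model in C. It is only a sketch, not GCC code: the function name, the byte-per-element mask representation, and the exact rule for the active range (an element takes part only if its index is below len + bias and its mask element is nonzero) are assumptions made for illustration, modeled on the operand order shown above.

```c
#include <stddef.h>

/* Hypothetical scalar model of .MASK_LEN_FOLD_LEFT_PLUS; not GCC code.
   Assumption: element i is active only when i < len + bias and mask[i]
   is nonzero.  Active elements are added strictly left to right, so the
   result matches the floating-point rounding of the scalar loop.  */
double
mask_len_fold_left_plus_model (double init, const double *vec,
                               const unsigned char *mask,
                               size_t len, long bias)
{
  double acc = init;
  /* The bias is assumed to be 0 or -1, so len + bias is the number of
     leading elements that may take part; clamp it at zero.  */
  long biased = (long) len + bias;
  size_t active = biased > 0 ? (size_t) biased : 0;

  for (size_t i = 0; i < active; i++)
    if (mask[i])
      acc += vec[i];	/* In-order accumulation, no reassociation.  */

  return acc;
}
```

Inactive elements are skipped here rather than replaced by 0.0 and added, so they cannot affect the accumulated value.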

gcc/ChangeLog:

* doc/md.texi: Add mask_len_fold_left_plus.
* internal-fn.cc (mask_len_fold_left_direct): Ditto.
(expand_mask_len_fold_left_optab_fn): Ditto.
(direct_mask_len_fold_left_optab_supported_p): Ditto.
* internal-fn.def (MASK_LEN_FOLD_LEFT_PLUS): Ditto.
* optabs.def (OPTAB_D): Ditto.

gcc/doc/md.texi
gcc/internal-fn.cc
gcc/internal-fn.def
gcc/optabs.def