PPC64: Add libmvec SIMD single-precision power function [BZ #24210]
Based on the ./sysdeps/ieee754/flt-32/powf.c implementation,
and therefore produces results identical to it.
Unlike other libmvec functions, this one sets the underflow and overflow
exception flags. The caller can check these flags and, if either is set,
re-run the calculation with scalar powf to find out which input is causing
the overflow or underflow.
I may not have normalized the benchmark data properly,
but operating only on floats between 0.5 and 1 I get the following:
Running 20 times over 32MiB
vector: mean 307.659767 (sd 0.203217)
scalar: mean 221.837088 (sd 0.032256)
With random data, however, there is a decrease in performance:
vector: mean 265.366371 (sd 0.000626)
scalar: mean 279.598078 (sd 0.025592)
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>