AArch32: Add support for sign-differing dot-product usdot for NEON.
This adds optabs implementing usdot_prod.
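For reference, here is a minimal scalar sketch of what one 32-bit lane of a mixed-sign dot product computes: four unsigned 8-bit values are multiplied by four signed 8-bit values and the products are summed into a 32-bit accumulator. The helper name usdot_lane is hypothetical and exists only for illustration; it is not part of the patch, and it does not model wraparound on overflow.

```c
#include <stdint.h>

/* Hypothetical helper, not part of the patch: scalar model of one 32-bit
   lane of an unsigned-by-signed dot product.  Each of the four unsigned
   8-bit elements is multiplied by the matching signed 8-bit element and
   the products are added to the accumulator.  Overflow wraparound is not
   modelled here.  */
static int32_t
usdot_lane (int32_t acc, const uint8_t u[4], const int8_t s[4])
{
  for (int i = 0; i < 4; ++i)
    acc += (int32_t) u[i] * (int32_t) s[i];
  return acc;
}
```

With u = { 255, 1, 2, 3 }, s = { -1, 2, -3, 4 } and an initial accumulator of 10, the helper returns -237.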
The following testcase exercises the new pattern:
#define N 480
#define SIGNEDNESS_1 unsigned
#define SIGNEDNESS_2 signed
#define SIGNEDNESS_3 signed
#define SIGNEDNESS_4 unsigned
SIGNEDNESS_1 int __attribute__ ((noipa))
f (SIGNEDNESS_1 int res, SIGNEDNESS_3 char *restrict a,
   SIGNEDNESS_4 char *restrict b)
{
  for (__INTPTR_TYPE__ i = 0; i < N; ++i)
    {
      int av = a[i];
      int bv = b[i];
      SIGNEDNESS_2 short mult = av * bv;
      res += mult;
    }
  return res;
}
This patch covers NEON only. I couldn't figure out whether the MVE
instruction vmlaldav.s16 could be used to emulate this. Since it would
require additional widening to work, I left MVE out of this patch set, but
perhaps someone should take a look.