The compiler can match mpyd.eq r0,r1,r0 as a predicated instruction,
which is incorrect. The mpyd(u) instruction takes as input two 32-bit
registers, returning into a double 64-bit even-odd register pair. For
the predicated case, the ARC instruction decoder expects the
destination register to be the same as the first input register. In
the big-endian case the result is swaped in the destination register
pair, however, the instruction encoding remains the same. Refurbish
the mpyd(u) patterns to take into account the above observation.
* config/arc/arc.md (mpyd<su_optab>_arcv2hs): New template
pattern.
(*pmpyd<su_optab>_arcv2hs): Likewise.
(*pmpyd<su_optab>_imm_arcv2hs): Likewise.
(mpyd_arcv2hs): Moved into above template.
(mpyd_imm_arcv2hs): Moved into above template.
(mpydu_arcv2hs): Likewise.
(mpydu_imm_arcv2hs): Likewise.
(su_optab): New optab prefix for sign/zero-extending operations.