There are two main benefits:
1. Reducing the number of instructions.
- lu32i.d $r12,0
- bstrpick.d $r4,$r4,31,0
- mul.d $r4,$r4,$r12
- srli.d $r4,$r4,33
---
+ mulh.wu $r4,$r4,$r12
+ bstrpick.d $r4,$r4,31,1
2. Helping to replace the high-latency div.w instruction.
- addi.w $r12,$r0,3
- div.w $r4,$r4,$r12
---
+ lu12i.w $r13,349525
+ ori $r13,$r13,1366
+ mulw.d.w $r12,$r4,$r13
+ srai.w $r4,$r4,31
+ srli.d $r12,$r12,32
+ sub.w $r4,$r12,$r4
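
For reference, both sequences above match the usual multiply-by-reciprocal
expansion of a 32-bit division by 3. The C sketch below is only an
illustration of that arithmetic: the function names are hypothetical, the
source operations are inferred from the constants in the assembly, and it is
not the content of the new tests.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned x / 3 as a widening multiply plus shift.
   0xAAAAAAAB == (2^33 + 1) / 3, so the quotient is (x * 0xAAAAAAAB) >> 33. */
uint32_t udiv3_by_mul(uint32_t x)
{
    uint64_t prod = (uint64_t)x * 0xAAAAAAABu;
    uint32_t hi = (uint32_t)(prod >> 32);   /* high half, as from mulh.wu */
    return hi >> 1;                         /* bits 31..1, as from bstrpick.d 31,1 */
}

/* Signed x / 3 (truncating), matching the div.w replacement.
   0x55555556 == (349525 << 12) | 1366, the lu12i.w/ori constant. */
int32_t sdiv3_by_mul(int32_t x)
{
    int64_t prod = (int64_t)x * 0x55555556LL;   /* as from mulw.d.w */
    /* Assumption: >> on a negative value is an arithmetic shift, which is
       what GCC implements; ISO C leaves it implementation-defined. */
    int32_t hi = (int32_t)(prod >> 32);
    int32_t sign = x < 0 ? -1 : 0;              /* as from srai.w 31 */
    return hi - sign;                           /* as from sub.w */
}

/* Spot-check the identities against the division operator. */
void check_div3_identities(void)
{
    assert(udiv3_by_mul(7u) == 7u / 3u);
    assert(udiv3_by_mul(UINT32_MAX) == UINT32_MAX / 3u);
    assert(sdiv3_by_mul(-7) == -7 / 3);
    assert(sdiv3_by_mul(INT32_MIN) == INT32_MIN / 3);
    assert(sdiv3_by_mul(INT32_MAX) == INT32_MAX / 3);
}
```

Whether the compiler picks the multiply form over div.w depends on the
relative RTX costs, which is why the cost of the widening multiply matters
here.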

gcc/ChangeLog:

	* config/loongarch/loongarch.cc (loongarch_rtx_costs):
	Ignore the cost impact of SIGN_EXTEND/ZERO_EXTEND.

gcc/testsuite/ChangeLog:

	* gcc.target/loongarch/widen-mul-rtx-cost-signed.c: New test.
	* gcc.target/loongarch/widen-mul-rtx-cost-unsigned.c: New test.