Following the instruction cost fix, we are generating

    alsl.w $a0, $a0, $a0, 4

instead of

    li.w  $t0, 17
    mul.w $a0, $a0, $t0

for "x * 17", because alsl.w is 4 times faster than mul.w.  But we
didn't have a sign-extending pattern for alsl.w, causing an extra
slli.w instruction to be generated to sign-extend $a0.  Add the
pattern to remove the redundant extension.
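For illustration, a function like this one (hypothetical, not taken
from the patch) exposes the problem: the SImode product feeds a DImode
value, so the alsl.w result must be sign-extended, and without the new
pattern that extension survives as a separate slli.w:

    /* The 32-bit multiply result is widened to 64 bits on return,
       which is where the redundant sign-extension appeared.  */
    long
    mul17 (int x)
    {
      return x * 17;
    }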
gcc/ChangeLog:

	* config/loongarch/loongarch.md (alslsi3_extend): New
	define_insn.

---
   [(set_attr "type" "arith")
    (set_attr "mode" "<MODE>")])
+(define_insn "alslsi3_extend"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(sign_extend:DI
+	  (plus:SI
+	    (ashift:SI (match_operand:SI 1 "register_operand" "r")
+		       (match_operand 2 "const_immalsl_operand" ""))
+	    (match_operand:SI 3 "register_operand" "r"))))]
+  ""
+  "alsl.w\t%0,%1,%3,%2"
+  [(set_attr "type" "arith")
+   (set_attr "mode" "SI")])
+
 \f
 ;; Reverse the order of bytes of operand 1 and store the result in operand 0.
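
Not part of this patch: a regression-test sketch (hypothetical file
name and options) that would catch the redundant extension, by checking
that alsl.w is emitted and no slli.w remains in the output:

    /* { dg-do compile } */
    /* { dg-options "-O2" } */
    /* { dg-final { scan-assembler "alsl\\.w" } } */
    /* { dg-final { scan-assembler-not "slli\\.w" } } */

    long
    mul17 (int x)
    {
      return x * 17;
    }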