The ctz.w instruction writes to the entire $rd.  Because the CTZ result
is in the range [0, 32], $rd is already both sign-extended and
zero-extended from its lower half.  But in pr90838.c we can see two
redundant sign-extensions (slli.w ...,0).  Get rid of them.
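
The range argument can be checked with a small C sketch.  The helper
names model_ctz_w and extensions_agree are hypothetical and only model
the behavior described above (a 32-bit count of trailing zeros that
yields 32 for zero input); they are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical software model of ctz.w: count the trailing zero bits
   of a 32-bit value, yielding 32 when the input is zero.  */
static uint32_t
model_ctz_w (uint32_t x)
{
  uint32_t n = 0;

  if (x == 0)
    return 32;
  while ((x & 1) == 0)
    {
      x >>= 1;
      n++;
    }
  return n;
}

/* Return 1 if sign-extending and zero-extending the 32-bit result to
   64 bits give the same value.  Bit 31 of any value in [0, 32] is
   clear, so the two extensions cannot differ.  */
static int
extensions_agree (uint32_t x)
{
  uint32_t r = model_ctz_w (x);
  int64_t sext = (int64_t) (int32_t) r;
  uint64_t zext = (uint64_t) r;

  return (uint64_t) sext == zext;
}
```

Since both extensions agree for every possible result, a single
pattern covering either extension of the SImode CTZ can emit a bare
ctz.w.
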
The "andi" instructions in pr90838.c are really needed, because the
source code logic should produce 0 for zero input, but the ctz.[dw]
instructions actually produce 64/32.
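
A minimal sketch of why the masking cannot be dropped.  It assumes the
"andi" masks are 31 for the 32-bit case and 63 for the 64-bit case,
which this message does not state; the model_* and masked_* names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of ctz.w: 32 for zero input.  */
static uint32_t
model_ctz_w (uint32_t x)
{
  uint32_t n = 0;

  if (x == 0)
    return 32;
  while ((x & 1) == 0)
    {
      x >>= 1;
      n++;
    }
  return n;
}

/* Hypothetical model of ctz.d: 64 for zero input.  */
static uint32_t
model_ctz_d (uint64_t x)
{
  uint32_t n = 0;

  if (x == 0)
    return 64;
  while ((x & 1) == 0)
    {
      x >>= 1;
      n++;
    }
  return n;
}

/* Assumed effect of the "andi": 32 & 31 and 64 & 63 are both 0, so
   zero input maps to 0, while results in [0, 31] and [0, 63] pass
   through unchanged.  */
static uint32_t
masked_ctz_w (uint32_t x)
{
  return model_ctz_w (x) & 31;
}

static uint32_t
masked_ctz_d (uint64_t x)
{
  return model_ctz_d (x) & 63;
}
```

Without the mask, zero input would yield 32 or 64 instead of the 0
that the source code logic expects.
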

gcc/

	* config/loongarch/loongarch.md (*ctzsi2_extend): New
	define_insn.

gcc/testsuite/

	* gcc.dg/pr90838.c: Adjust expectation for LoongArch.

[(set_attr "type" "clz")
(set_attr "mode" "<MODE>")])
+(define_insn "*ctzsi2_extend"
+ [(set (match_operand:DI 0 "register_operand" "=r")
+ (any_extend:DI
+ (ctz:SI (match_operand:SI 1 "register_operand" "r"))))]
+ "TARGET_64BIT"
+ "ctz.w\t%0,%1"
+ [(set_attr "type" "clz")
+ (set_attr "mode" "SI")])
+
;;
;; ....................
;;
/* { dg-final { scan-tree-dump-times {= \.CTZ} 4 "forwprop2" { target { loongarch64*-*-* } } } } */
/* { dg-final { scan-assembler-times "ctz.d\t" 1 { target { loongarch64*-*-* } } } } */
/* { dg-final { scan-assembler-times "ctz.w\t" 3 { target { loongarch64*-*-* } } } } */
-/* { dg-final { scan-assembler-times "\(andi|slli.w\)\t" 4 { target { loongarch64*-*-* } } } } */
+/* { dg-final { scan-assembler-times "andi\t" 2 { target { loongarch64*-*-* } } } } */