LoongArch: Use bstrins for "value & (-1u << const)"
A move/bstrins pair is as fast as a (addi.w|lu12i.w|lu32i.d|lu52i.d)/and
pair, and twice fast as a srli/slli pair. When the src reg and the dst
reg happens to be the same, the move instruction can be optimized away.
gcc/ChangeLog:
* config/loongarch/predicates.md (high_bitmask_operand): New
predicate.
* config/loongarch/constraints.md (Yy): New constriant.
* config/loongarch/loongarch.md (and<mode>3_align): New
define_insn_and_split.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/bstrins-1.c: New test.
* gcc.target/loongarch/bstrins-2.c: New test.