The following testcase ICEs because the splitters into vptest create
an invalid instruction. All the instructions using UNSPEC_PTEST use
the register_operand predicate for one operand and vector_operand for
the other. These splitters use the vector_operand predicate but create
a vptest instruction which has the same operand twice, so that operand
needs to be in a register.
The following patch keeps the vector_operand predicate on the splitters
but uses force_reg to force the operand into a REG if it was a MEM. That
results in better code generation, e.g. on the included testcase, because
combine can still match those splitters when the operand is a MEM.
The difference on the testcase is
- vpxor %xmm0, %xmm0, %xmm0
- vpcmpeqb (%rdi), %xmm0, %xmm0
- vpmovmskb %xmm0, %eax
- cmpl $65535, %eax
+ vmovdqa (%rdi), %xmm0
+ vptest %xmm0, %xmm0
(- lines come from a patch which instead changes the splitters with
s/vector_operand/register_operand/, + lines come from this patch).
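For readers who want to convince themselves that the two sequences above
test the same condition, here is a minimal portable sketch in scalar C.
It is only a model of the instruction semantics, not code from the patch,
and the function names (movemask_of_cmpeq_zero, ptest_same_operand_zf,
conditions_agree, check_examples) are hypothetical helpers introduced
for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of vpcmpeqb against zero followed by vpmovmskb:
   bit I of the result is set when byte I of V is zero.  */
static unsigned
movemask_of_cmpeq_zero (const uint8_t v[16])
{
  unsigned mask = 0;
  for (int i = 0; i < 16; i++)
    {
      /* vpcmpeqb yields 0xff for equal bytes and 0x00 otherwise.  */
      uint8_t cmp = v[i] == 0 ? 0xff : 0x00;
      /* vpmovmskb collects the most significant bit of each byte.  */
      mask |= (unsigned) (cmp >> 7) << i;
    }
  return mask;
}

/* Scalar model of the ZF result of vptest V, V: ZF is set when
   (V & V) is all zeros, i.e. when V itself is all zeros.  */
static int
ptest_same_operand_zf (const uint8_t v[16])
{
  uint8_t acc = 0;
  for (int i = 0; i < 16; i++)
    acc |= (uint8_t) (v[i] & v[i]);
  return acc == 0;
}

/* Return 1 when "mask == 0xffff" and "ZF set" agree for V.  */
static int
conditions_agree (const uint8_t v[16])
{
  return (movemask_of_cmpeq_zero (v) == 0xffff) == ptest_same_operand_zf (v);
}

/* Check the all-zero vector and every vector with a single nonzero
   byte; return the number of vectors checked.  */
static int
check_examples (void)
{
  uint8_t v[16];
  int checked = 0;

  memset (v, 0, sizeof v);
  assert (conditions_agree (v));
  checked++;

  for (int i = 0; i < 16; i++)
    {
      memset (v, 0, sizeof v);
      v[i] = 0x80;
      assert (conditions_agree (v));
      checked++;
    }
  return checked;
}
```

Calling check_examples () from a test driver exercises both models. The
model also shows why vptest needs the same value twice: the instruction
ANDs its two operands, so passing the vector as both operands tests the
vector itself for zero.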
2025-03-19 Jakub Jelinek <jakub@redhat.com>
PR target/119357
* config/i386/sse.md (pmovmskb 0xffff to ptest splitter,
*pmovsk_ptest_<mode>_avx512): Force operands[0] into a REG.
* gcc.target/i386/avx512vlbw-pr119357.c: New test.
[(set (reg:CCZ FLAGS_REG)
(unspec:CCZ [(match_dup 0)
(match_dup 0)]
- UNSPEC_PTEST))])
+ UNSPEC_PTEST))]
+ "operands[0] = force_reg (<MODE>mode, operands[0]);")
(define_insn_and_split "*pmovsk_mask_cmp_<mode>_avx512"
[(set (reg:CCZ FLAGS_REG)
[(set (reg:CCZ FLAGS_REG)
(unspec:CCZ [(match_dup 0)
(match_dup 0)]
- UNSPEC_PTEST))])
+ UNSPEC_PTEST))]
+ "operands[0] = force_reg (<MODE>mode, operands[0]);")
(define_expand "sse2_maskmovdqu"
[(set (match_operand:V16QI 0 "memory_operand")
--- /dev/null
+/* PR target/119357 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512bw -mavx512vl" } */
+
+#include <x86intrin.h>
+
+typedef char V __attribute__((vector_size (16)));
+
+void
+foo (V *p)
+{
+ if (_mm_movemask_epi8 ((__m128i) (*p == 0)) != 65535)
+ __builtin_abort ();
+}