This patch, inspired by PR target/90483 and libstdc++/118416, implements
some RTL expansion-time simplifications of ptest. A common idiom for
testing a vector against zero is to use ptestz(mask,-1). Alas, the code
generated for this is suboptimal, requiring materialization of an all-ones
vector. Given that ptestz(x,y) is defined as (x & y) == 0, an equivalent
form is ptestz(mask,mask), which saves an instruction whenever an all-ones
vector isn't already available in a register.
Consider the function:

typedef long long v2di __attribute__ ((__vector_size__ (16)));
int foo (v2di x)
{
  return __builtin_ia32_ptestz128 (x, ~(v2di){0,0});
}

With this patch, no all-ones vector is materialized and we generate:

foo:	xorl	%eax, %eax
	vptest	%xmm0, %xmm0
	sete	%al
	ret
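
The identity behind the ptestz(x,-1) to ptestz(x,x) transformation can be
checked with a small scalar model. This is only an illustrative sketch, not
code from the patch: vec128, ptestz_model and ptestz_identity_holds are
hypothetical names, and the model simply assumes the definition given above,
ptestz(x,y) = ((x & y) == 0), applied to two 64-bit lanes.

```c
#include <stdint.h>

/* Hypothetical scalar stand-in for a 128-bit vector: two 64-bit lanes.  */
typedef struct { uint64_t lo, hi; } vec128;

/* Model of ptestz as defined above: returns 1 when (x & y) == 0
   across the whole 128 bits, and 0 otherwise.  */
static int
ptestz_model (vec128 x, vec128 y)
{
  return ((x.lo & y.lo) | (x.hi & y.hi)) == 0;
}

/* Returns 1 when ptestz(x,-1) and ptestz(x,x) agree for this x.
   Since x & -1 == x and x & x == x, both reduce to x == 0.  */
static int
ptestz_identity_holds (vec128 x)
{
  const vec128 all_ones = { ~UINT64_C (0), ~UINT64_C (0) };
  return ptestz_model (x, all_ones) == ptestz_model (x, x);
}
```

Both forms reduce to a test of x against zero, so the second operand can be
replaced by the first without changing the result.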

2026-05-19  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	PR target/90483
	PR libstdc++/118416
	* config/i386/i386-expand.cc (ix86_expand_sse_ptest): Refactor
	with optimizations for PTESTZ*, PTESTC* and PTESTNZC*, including
	transforming ptestz(x,-1) into ptestz(x,x).

gcc/testsuite/ChangeLog
	PR target/90483
	PR libstdc++/118416
	* gcc.target/i386/sse4_1-ptest-8.c: New test case.
	* gcc.target/i386/sse4_1-ptest-9.c: Likewise.