commit
eb4031cb20aa710834be891f8638e04dbba81edc
Author: Jan Beulich <jbeulich@suse.com>
Date: Tue Jul 4 17:07:26 2023 +0200
x86: optimize 128-bit VPBROADCASTQ to VPUNPCKLQDQ
was supposed to optimize
vpbroadcastq %xmmN, %xmmM -> vpunpcklqdq %xmmN, %xmmN, %xmmM (N < 8)
But it didn't check whether the destination operand is an XMM register.
As a result, it turned:
vpbroadcastq %xmmN, %ymmM
into
vpunpcklqdq %xmmN, %xmmN, %xmmM
Fix this by checking that the destination operand is an XMM register.
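As an illustration only, the corrected condition can be modelled in isolation. This is a minimal sketch, not the code in optimize_encoding: struct reg_desc, its fields, and can_use_vpunpcklqdq are hypothetical names invented here, and gas's real operand structures carry much more state.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a register operand.
   Not the real gas data structure. */
struct reg_desc {
    bool is_xmm;     /* operand is a 128-bit XMM register */
    bool needs_rex;  /* register number is 8 or higher */
};

/* Return true when a two-operand VPBROADCASTQ may be rewritten as
   VPUNPCKLQDQ src, src, dst.  Mirrors the shape of the condition
   in the patch under the assumptions stated above. */
static bool can_use_vpunpcklqdq(const struct reg_desc *src,
                                const struct reg_desc *dst,
                                bool is_vex, bool forced_vex3)
{
    return is_vex
        && !src->needs_rex   /* N < 8 */
        && src->is_xmm
        && dst->is_xmm       /* the check added by this fix */
        && !forced_vex3;
}
```

With this model, an XMM source paired with a YMM destination is rejected, so the 256-bit broadcast is left as is.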
PR gas/34171
* config/tc-i386.c (optimize_encoding): Check XMM destination
when optimizing 128-bit VPBROADCASTQ.
* testsuite/gas/i386/optimize-2.d: Updated.
* testsuite/gas/i386/optimize-2.s: Add 256-bit vpbroadcastq.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
&& i.tm.opcode_modifier.vex
&& !(i.op[0].regs->reg_flags & RegRex)
&& i.op[0].regs->reg_type.bitfield.xmmword
+ && i.op[1].regs->reg_type.bitfield.xmmword
&& pp.encoding != encoding_vex3)
{
/* Optimize: -Os:
+[a-f0-9]+: c5 .* vpaddq %xmm2,%xmm2,%xmm3
+[a-f0-9]+: 62 .* vpaddq %zmm2,%zmm2,%zmm3
+[a-f0-9]+: c5 .* vpunpcklqdq %xmm2,%xmm2,%xmm0
+ +[a-f0-9]+: c4 .* vpbroadcastq %xmm2,%ymm0
#pass
vpsllq $1, %zmm2, %zmm3
vpbroadcastq %xmm2, %xmm0
+ vpbroadcastq %xmm2, %ymm0