As is already the case for the AVX/AVX2 form, VMOVDDUP, which operates
on double-precision floating-point values, is the more appropriate insn
to use here. It can also result in shorter insn encodings when the
source is memory or %xmm0...%xmm7 and no masking is applied, since a
2-byte VEX prefix can then be used instead of a 3-byte one.
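
For illustration only, the prefix length argument can be sketched as
below. This is not GCC or binutils code; vex_prefix_len and the enum
names are hypothetical. It relies on the 2-byte VEX form (0xC5) having
no fields for the opcode map, VEX.W, VEX.X or VEX.B, so that form is
usable only for map 0F with those three bits clear. VMOVDDUP lives in
map 0F, whereas VPBROADCASTQ lives in map 0F38.

```c
#include <assert.h>
#include <stdbool.h>

/* Opcode maps selectable through a VEX prefix.  */
enum vex_map { MAP_0F = 1, MAP_0F38 = 2, MAP_0F3A = 3 };

/* Hypothetical helper: length in bytes of the VEX prefix needed for an
   insn in opcode map MAP with the given VEX.W, VEX.X and VEX.B bits.
   The 2-byte form cannot express a map other than 0F, nor set W, X
   or B, so anything else needs the 3-byte form.  */
static int
vex_prefix_len (enum vex_map map, bool w, bool x, bool b)
{
  assert (map >= MAP_0F && map <= MAP_0F3A);
  return (map == MAP_0F && !w && !x && !b) ? 2 : 3;
}
```

Under this model, VMOVDDUP with a source of %xmm0...%xmm7 (VEX.B
clear) gets a 2-byte prefix, the same insn with a source of
%xmm8...%xmm15 (VEX.B set) gets a 3-byte one, and VPBROADCASTQ always
gets a 3-byte one. A memory source behaves like the first case only as
long as its base and index registers do not need VEX.B or VEX.X.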
gcc/
* config/i386/sse.md (<avx512>_vec_dup<mode><mask_name>): Use
vmovddup.
"TARGET_AVX512F"
{
/* There is no DF broadcast (in AVX-512*) to 128b register.
- Mimic it with integer variant. */
+ Mimic it with vmovddup, just like vec_dupv2df<mask_name> does. */
if (<MODE>mode == V2DFmode)
- return "vpbroadcastq\t{%1, %0<mask_operand2>|%0<mask_operand2>, %q1}";
+ return "vmovddup\t{%1, %0<mask_operand2>|%0<mask_operand2>, %q1}";
return "v<sseintprefix>broadcast<bcstscalarsuff>\t{%1, %0<mask_operand2>|%0<mask_operand2>, %<iptr>1}";
}