AVX512FP16: Fix wrong code for _mm_mask_f[c]madd.*sch [PR 104978]
author Hongyu Wang <hongyu.wang@intel.com>
Fri, 18 Mar 2022 17:16:29 +0000 (01:16 +0800)
committer Hongyu Wang <hongyu.wang@intel.com>
Tue, 22 Mar 2022 03:48:38 +0000 (11:48 +0800)
commit 7bce0be03b857eefe5990c3ef0af06ea8f8ae04e
tree 5d2ba232d0294f28781f9b0690d9d7cf72f33127
parent d156bb870225f442b32983983f94e731397fdb6e

For complex scalar intrinsics like _mm_mask_fcmadd_sch, the
mask should be ANDed with 1 so that only its lowest bit takes effect.
Use a masked vmovss to perform the same operation, since it ignores the
higher bits of the mask.
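
The intended mask semantics can be sketched with a scalar model in plain C.
This is an illustration only, not code from GCC or from the intrinsics
headers: the type cplx and the function model_mask_fmadd_sc are hypothetical
names, and float stands in for _Float16. It assumes the documented behaviour
of the non-conjugate intrinsic, where the low complex element is a * b + c
when mask bit 0 is set and is copied from a otherwise.

```c
#include <stdint.h>

/* Hypothetical scalar model, not GCC code: one complex number. */
typedef struct { float re, im; } cplx;

/* Model of the low complex element of a masked scalar complex FMA.
   Only bit 0 of k is consulted; all higher mask bits are ignored.
   When bit 0 is clear, the low element is taken from a unchanged.  */
static cplx model_mask_fmadd_sc(cplx a, uint8_t k, cplx b, cplx c)
{
    cplx r = a;
    if (k & 1) {
        /* (a.re + i*a.im) * (b.re + i*b.im) + c */
        r.re = a.re * b.re - a.im * b.im + c.re;
        r.im = a.re * b.im + a.im * b.re + c.im;
    }
    return r;
}
```

With a = 1+2i, b = 3+4i and c = 5+6i, a mask of 2 has bit 0 clear and so
returns a unchanged, while masks of 1 and 3 both return 0+16i.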

gcc/ChangeLog:

PR target/104978
* config/i386/sse.md
(avx512fp16_fmaddcsh_v8hf_mask1<round_expand_name>):
Use avx512f_movsf_mask instead of vmovaps or vblend, and
force_reg before lowpart_subreg.
(avx512fp16_fcmaddcsh_v8hf_mask1<round_expand_name>): Likewise.

gcc/testsuite/ChangeLog:

PR target/104978
* gcc.target/i386/avx512fp16-vfcmaddcsh-1a.c: Adjust asm scan.
* gcc.target/i386/avx512fp16-vfmaddcsh-1a.c: Ditto.
* gcc.target/i386/avx512fp16-vfcmaddcsh-1c.c: Removed.
* gcc.target/i386/avx512fp16-vfmaddcsh-1c.c: Ditto.
* gcc.target/i386/pr104978.c: New test.
gcc/config/i386/sse.md
gcc/testsuite/gcc.target/i386/avx512fp16-vfcmaddcsh-1a.c
gcc/testsuite/gcc.target/i386/avx512fp16-vfcmaddcsh-1c.c [deleted file]
gcc/testsuite/gcc.target/i386/avx512fp16-vfmaddcsh-1a.c
gcc/testsuite/gcc.target/i386/avx512fp16-vfmaddcsh-1c.c [deleted file]
gcc/testsuite/gcc.target/i386/pr104978.c [new file with mode: 0644]