git.ipfire.org Git - thirdparty/gcc.git/commit

author	Pranav Gorantla <Pranav.Gorantla@amd.com>
	Thu, 29 May 2025 13:02:24 +0000 (15:02 +0200)
committer	Jan Hubicka <hubicka@ucw.cz>
	Thu, 29 May 2025 13:03:23 +0000 (15:03 +0200)
commit	5080d98a383de244a7b78ae50456fd41881268c2
tree	8e5f1088673eb346ac6643dbd4f27d162427e25f
parent	6df697847773d21ad8276de38131413aa5c5e3b0

i386: Use Shuffles instead of shifts for Reduction in AMD znver4/5

On AMD znver4 and znver5 targets, vpshufd has latency 1 and throughput 4
(2 on znver4), while vpsrldq has latency 2 and throughput 2. It is
therefore better to generate shuffles instead of shifts wherever
possible. This patch generates an appropriate shuffle instruction to
copy the upper half of the vector to the lower half, instead of a simple
right shift, during horizontal vector reduction.
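
Why the substitution is safe can be sketched with a small scalar model
of a four-lane vector. This is an illustration only, not code from the
patch: the type v4si, the helper names, and the shuffle selectors
{2,3,2,3} and {1,1,1,1} are assumptions chosen for the example. The
point is that a horizontal reduction only reads lane 0 of the final
vector, so it does not matter whether the upper lanes are zero-filled
(shift) or hold copies (shuffle).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 128-bit V4SI vector; lane[0] is the lowest lane.
   Unsigned lanes are used so that addition wraps without undefined behavior. */
typedef struct { uint32_t lane[4]; } v4si;

/* Models a whole-register right shift by n lanes (vpsrldq by 4*n bytes):
   upper lanes move down and the vacated upper lanes become zero. */
static v4si shift_right_lanes(v4si v, int n)
{
    v4si r = {{0, 0, 0, 0}};
    assert(n >= 0 && n <= 4);
    for (int i = 0; i + n < 4; i++)
        r.lane[i] = v.lane[i + n];
    return r;
}

/* Models a dword shuffle (vpshufd): result lane i is v.lane[sel[i]]. */
static v4si shuffle_lanes(v4si v, const int sel[4])
{
    v4si r;
    for (int i = 0; i < 4; i++) {
        assert(sel[i] >= 0 && sel[i] < 4);
        r.lane[i] = v.lane[sel[i]];
    }
    return r;
}

/* Lane-wise addition. */
static v4si add_lanes(v4si a, v4si b)
{
    v4si r;
    for (int i = 0; i < 4; i++)
        r.lane[i] = a.lane[i] + b.lane[i];
    return r;
}

/* Horizontal sum where each "reduce half" step is a right shift. */
static uint32_t reduce_add_shift(v4si v)
{
    v4si t = add_lanes(v, shift_right_lanes(v, 2)); /* fold upper half */
    t = add_lanes(t, shift_right_lanes(t, 1));      /* fold lane 1 */
    return t.lane[0];
}

/* Same reduction where each step is a shuffle (selectors are assumed). */
static uint32_t reduce_add_shuffle(v4si v)
{
    static const int upper_half[4] = {2, 3, 2, 3};
    static const int lane_one[4]   = {1, 1, 1, 1};
    v4si t = add_lanes(v, shuffle_lanes(v, upper_half)); /* fold upper half */
    t = add_lanes(t, shuffle_lanes(t, lane_one));        /* fold lane 1 */
    return t.lane[0];
}
```

Both functions return the same value for any input, because the lanes
that differ between the shift and the shuffle never feed lane 0.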

gcc/ChangeLog:

* config/i386/i386-expand.cc (emit_reduc_half): Use shuffles to
generate reduc half for V4SI and similar modes.
* config/i386/i386.h (TARGET_SSE_REDUCTION_PREFER_PSHUF): New macro.
* config/i386/x86-tune.def (X86_TUNE_SSE_REDUCTION_PREFER_PSHUF):
New tuning.

gcc/testsuite/ChangeLog:

* gcc.target/i386/reduc-pshuf.c: New test.

gcc/config/i386/i386-expand.cc
gcc/config/i386/i386.h
gcc/config/i386/x86-tune.def
gcc/testsuite/gcc.target/i386/reduc-pshuf.c	[new file with mode: 0644]