Add an optimization to aarch64 SIMD converting mvn+shrn into mvni+subhn when
possible. This allows for better optimization when the code is inside a loop:
the all-ones constant materialized by mvni is loop-invariant and can be
hoisted out of the loop, whereas mvn must execute on every iteration.
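As an illustration (a hypothetical loop, not taken from the patch), the
hoisting matters in code like the following sketch:

#include <arm_neon.h>
#include <stddef.h>

/* Hypothetical example: after the conversion, the all-ones constant set
   up by mvni is loop-invariant, so the compiler can move it out of the
   loop; mvn, by contrast, depends on each iteration's data. */
void not_narrow_all(uint8_t *dst, const uint16_t *src, size_t n) {
  for (size_t i = 0; i + 8 <= n; i += 8) {
    uint16x8_t a = vld1q_u16(src + i);
    vst1_u8(dst + i, vshrn_n_u16(vmvnq_u16(a), 8));
  }
}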
The conversion is based on the fact that for an unsigned integer:
  -x = ~x + 1  =>  ~x = -1 - x
thus '(u8)(~x >> imm)' is equivalent to '(u8)(((u16)-1 - x) >> imm)'.
Since subhn returns the most significant half of each difference,
subhn(-1, x) computes exactly '(u8)(((u16)-1 - x) >> 8)', i.e. the
imm == 8 case above.
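A minimal scalar sanity check of that identity (illustrative only, not
part of the patch):

#include <assert.h>
#include <stdint.h>

int main(void) {
  /* Check (u8)(~x >> 8) == (u8)(((u16)-1 - x) >> 8) for all 16-bit x. */
  for (uint32_t x = 0; x <= 0xFFFF; x++) {
    uint8_t lhs = (uint8_t)((uint16_t)~x >> 8);
    uint8_t rhs = (uint8_t)((uint16_t)(0xFFFFu - x) >> 8);
    assert(lhs == rhs);
  }
  return 0;
}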
For the following function:
#include <arm_neon.h>

uint8x8_t neg_narrow_v8hi(uint16x8_t a) {
  uint16x8_t b = vmvnq_u16(a);   /* bitwise NOT (mvn): b = ~a */
  return vshrn_n_u16(b, 8);      /* shift right by 8, narrow to u8 (shrn) */
}
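In intrinsics, the transformed computation corresponds roughly to the
following sketch (not compiler output; vdupq_n_u16(0xFFFF) stands in for
the constant set up by mvni):

#include <arm_neon.h>

/* Sketch: vsubhn_u16 returns the most significant byte of each 16-bit
   difference, so subhn(-1, a) yields (u8)((0xFFFF - a) >> 8) per lane,
   which equals (u8)(~a >> 8). */
uint8x8_t neg_narrow_subhn_v8hi(uint16x8_t a) {
  uint16x8_t all_ones = vdupq_n_u16(0xFFFF);
  return vsubhn_u16(all_ones, a);
}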
Without this patch the assembly looks like:
not v0.16b, v0.16b
shrn v0.8b, v0.8h, 8
After the patch it becomes:
mvni v31.4s, 0
subhn v0.8b, v31.8h, v0.8h
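Here mvni v31.4s, 0 materializes ~0, i.e. -1 in every 16-bit lane. A small
self-contained harness to cross-check the two forms on sample inputs
(illustrative only; assumes an aarch64 target):

#include <arm_neon.h>
#include <assert.h>
#include <stdint.h>

int main(void) {
  uint16_t in[8] = {0, 1, 0x00FF, 0x0100, 0x7FFF, 0x8000, 0xFFFE, 0xFFFF};
  uint16x8_t a = vld1q_u16(in);
  uint8x8_t before = vshrn_n_u16(vmvnq_u16(a), 8);        /* mvn + shrn   */
  uint8x8_t after  = vsubhn_u16(vdupq_n_u16(0xFFFF), a);  /* mvni + subhn */
  uint8_t r1[8], r2[8];
  vst1_u8(r1, before);
  vst1_u8(r2, after);
  for (int i = 0; i < 8; i++)
    assert(r1[i] == r2[i]);
  return 0;
}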