git.ipfire.org Git - thirdparty/gcc.git/commit - gcc/expmed.h
middle-end: Use subregs to expand COMPLEX_EXPR to set the lowpart.
author Tamar Christina <tamar.christina@arm.com>
Fri, 8 Jul 2022 06:37:20 +0000 (07:37 +0100)
committer Tamar Christina <tamar.christina@arm.com>
Fri, 8 Jul 2022 06:39:33 +0000 (07:39 +0100)
commit 13f44099bcc64ddb50a6dbd462bf79b258dfd02c
tree 7ea840cf4915eabc7761e0c1bc3989a9c3859bba
parent bf3695691f4fc964a3b1c8274a6949d844e3edff

When lowering COMPLEX_EXPR we currently emit two VEC_EXTRACTs.  One for the
lowpart and one for the highpart.

The problem with this is that in RTL the lvalue of the RTX is the only thing
tying the two instructions together.

This means that passes such as combine cannot merge the two instructions that
set the lowpart and the highpart.

For ISAs that have bit extract instructions we can eliminate one of the extracts
if, and only if, we're setting the entire complex number.

This change updates the expand code so that, when we're setting the entire
complex number, it generates a subreg for the lowpart instead of a vec_extract.

This allows us to optimize sequences such as:

_Complex int f(int a, int b) {
    _Complex int t = a + b * 1i;
    return t;
}

from:

f:
        bfi     x2, x0, 0, 32
        bfi     x2, x1, 32, 32
        mov     x0, x2
        ret

into:

f:
        bfi     x0, x1, 32, 32
        ret
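
The difference between the two sequences can be modelled in plain C.  The
sketch below is only an illustration and is not GCC code: bit_insert,
pack_two_inserts and pack_move_then_insert are hypothetical names, and a
64-bit integer stands in for the register holding the _Complex int.  It shows
that when the destination has no meaningful prior contents, writing the
lowpart as a whole-value move and inserting only the highpart gives the same
result as two inserts.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: replace `width` bits of `dst`, starting at bit `pos`,
   with the low `width` bits of `val`.  This is a read-modify-write of `dst`,
   similar in spirit to an AArch64 BFI.  */
static uint64_t bit_insert(uint64_t dst, uint64_t val, unsigned pos, unsigned width)
{
    assert(width >= 1 && width <= 64 && pos <= 64 - width);
    uint64_t field = (width == 64) ? ~UINT64_C(0) : ((UINT64_C(1) << width) - 1);
    uint64_t mask = field << pos;
    return (dst & ~mask) | ((val << pos) & mask);
}

/* Models the old expansion: both parts are inserted into a temporary, so
   each insert depends on the previous contents of that temporary.  The
   zero initialiser stands in for an undefined register.  */
static uint64_t pack_two_inserts(uint32_t re, uint32_t im)
{
    uint64_t t = 0;
    t = bit_insert(t, re, 0, 32);
    t = bit_insert(t, im, 32, 32);
    return t;
}

/* Models the new expansion: because the whole value is being set, the
   lowpart is written as a plain move (the role the subreg plays in RTL)
   and only the highpart needs a bit insert.  */
static uint64_t pack_move_then_insert(uint32_t re, uint32_t im)
{
    uint64_t t = re;
    return bit_insert(t, im, 32, 32);
}
```

Calling both functions with the same arguments, for example (3, 4), yields the
same packed value.  The second form needs one fewer insert, which mirrors the
assembly above.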

I have also confirmed the codegen for x86_64 did not change.

gcc/ChangeLog:

* expmed.cc (store_bit_field_1): Add parameter that indicates if value is
still undefined and if so emit a subreg move instead.
(store_integral_bit_field): Likewise.
(store_bit_field): Likewise.
* expr.h (write_complex_part): Likewise.
* expmed.h (store_bit_field): Add new parameter.
* builtins.cc (expand_ifn_atomic_compare_exchange_into_call): Use new
parameter.
(expand_ifn_atomic_compare_exchange): Likewise.
* calls.cc (store_unaligned_arguments_into_pseudos): Likewise.
* emit-rtl.cc (validate_subreg): Likewise.
* expr.cc (emit_group_store): Likewise.
(copy_blkmode_from_reg): Likewise.
(copy_blkmode_to_reg): Likewise.
(clear_storage_hints): Likewise.
(write_complex_part): Likewise.
(emit_move_complex_parts): Likewise.
(expand_assignment): Likewise.
(store_expr): Likewise.
(store_field): Likewise.
(expand_expr_real_2): Likewise.
* ifcvt.cc (noce_emit_move_insn): Likewise.
* internal-fn.cc (expand_arith_set_overflow): Likewise.
(expand_arith_overflow_result_store): Likewise.
(expand_addsub_overflow): Likewise.
(expand_neg_overflow): Likewise.
(expand_mul_overflow): Likewise.
(expand_arith_overflow): Likewise.

gcc/testsuite/ChangeLog:

* g++.target/aarch64/complex-init.C: New test.
gcc/builtins.cc
gcc/calls.cc
gcc/emit-rtl.cc
gcc/expmed.cc
gcc/expmed.h
gcc/expr.cc
gcc/expr.h
gcc/ifcvt.cc
gcc/internal-fn.cc
gcc/testsuite/g++.target/aarch64/complex-init.C [new file with mode: 0644]