git.ipfire.org Git - thirdparty/gcc.git/commit
AArch64: take gather/scatter decode overhead into account
author Tamar Christina <tamar.christina@arm.com>
Tue, 6 Aug 2024 21:41:10 +0000 (22:41 +0100)
committer Tamar Christina <tamar.christina@arm.com>
Tue, 6 Aug 2024 21:41:10 +0000 (22:41 +0100)
commit a50916a6c0a6c73c1537d033509d4f7034341f75
tree 49b0ea6cd0545f510b63ac075bf40b0e081a7adc
parent 77d232522d3eb7a6541fc91c3092c115cc535275

Gathers and scatters are not usually beneficial when the loop count is small.
This is because there is not only a cost to executing them within the loop but
also a cost to entering a loop that contains them.

As such, this patch models this overhead.  For generic tuning, however, we
still prefer gathers/scatters when the loop costs work out.
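
To illustrate why a one-time entry cost matters most at small loop counts, here
is a minimal sketch of the break-even arithmetic.  It is not the GCC cost model:
the type names, the helper functions and every number in it are hypothetical and
chosen only to make the effect visible.

```cpp
#include <cstdint>

// Hypothetical costs for a scalar loop, in arbitrary units.
struct ScalarLoop {
  uint64_t cost_per_element;  // cost of processing one element
};

// Hypothetical costs for a vector loop that uses gathers/scatters.
struct GatherLoop {
  uint64_t init_cost;           // one-time cost paid when the loop is entered
  uint64_t cost_per_vector_iter;  // cost of one vector iteration
  uint64_t vf;                  // elements handled per vector iteration
};

// Total cost of the scalar loop over n elements.
inline uint64_t scalar_total(const ScalarLoop &s, uint64_t n) {
  return s.cost_per_element * n;
}

// Total cost of the gather loop over n elements: the entry cost is paid
// once, the body cost once per (rounded-up) vector iteration.
inline uint64_t gather_total(const GatherLoop &g, uint64_t n) {
  uint64_t vector_iters = (n + g.vf - 1) / g.vf;
  return g.init_cost + g.cost_per_vector_iter * vector_iters;
}

// True if the gather loop is strictly cheaper than the scalar loop.
inline bool gather_is_profitable(const GatherLoop &g, const ScalarLoop &s,
                                 uint64_t n) {
  return gather_total(g, n) < scalar_total(s, n);
}
```

With the made-up values `ScalarLoop{4}` and `GatherLoop{12, 10, 4}`, the gather
loop loses at 4 elements (22 against 16) and wins at 16 elements (52 against
64).  Setting `init_cost` to 0 would make it win at 4 elements as well, which is
the kind of misjudgement an unmodelled entry cost can cause.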

gcc/ChangeLog:

* config/aarch64/aarch64-protos.h (struct sve_vec_cost): Add
gather_load_x32_init_cost and gather_load_x64_init_cost.
* config/aarch64/aarch64.cc (aarch64_vector_costs): Add
m_sve_gather_scatter_init_cost.
(aarch64_vector_costs::add_stmt_cost): Use them.
(aarch64_vector_costs::finish_cost): Likewise.
* config/aarch64/tuning_models/a64fx.h: Update.
* config/aarch64/tuning_models/cortexx925.h: Update.
* config/aarch64/tuning_models/generic.h: Update.
* config/aarch64/tuning_models/generic_armv8_a.h: Update.
* config/aarch64/tuning_models/generic_armv9_a.h: Update.
* config/aarch64/tuning_models/neoverse512tvb.h: Update.
* config/aarch64/tuning_models/neoversen2.h: Update.
* config/aarch64/tuning_models/neoversen3.h: Update.
* config/aarch64/tuning_models/neoversev1.h: Update.
* config/aarch64/tuning_models/neoversev2.h: Update.
* config/aarch64/tuning_models/neoversev3.h: Update.
* config/aarch64/tuning_models/neoversev3ae.h: Update.
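
The ChangeLog above suggests a record-then-apply flow: per-CPU init costs live
in the tuning structure, are noted while statements are costed, and are used
again when costing finishes.  The sketch below shows one way such a flow could
look.  It is a simplified stand-in, not the code from aarch64.cc: the field and
method names are borrowed from the ChangeLog, while the types, the signatures,
the choice to accumulate one init cost per gather/scatter statement, and the
choice to charge it to a prologue total are all assumptions.

```cpp
#include <cstdint>

// Simplified stand-in for a per-CPU tuning table.  Field names follow the
// ChangeLog; the struct itself and any values given to it are hypothetical.
struct SveVecCostSketch {
  unsigned gather_load_x32_init_cost;  // entry cost for 32-bit elements
  unsigned gather_load_x64_init_cost;  // entry cost for 64-bit elements
};

// Hypothetical statement classification used only by this sketch.
enum class StmtKind { Other, GatherScatter };

class VectorCostsSketch {
public:
  explicit VectorCostsSketch(const SveVecCostSketch &tuning)
      : m_tuning(tuning) {}

  // Called once per statement in the loop body.  Besides adding the
  // statement's own cost, remember the entry cost if it is a gather/scatter.
  void add_stmt_cost(StmtKind kind, unsigned element_bits, unsigned cost) {
    m_body_cost += cost;
    if (kind == StmtKind::GatherScatter)
      m_sve_gather_scatter_init_cost +=
          (element_bits == 64) ? m_tuning.gather_load_x64_init_cost
                               : m_tuning.gather_load_x32_init_cost;
  }

  // Called once after all statements are costed.  The remembered entry
  // cost is charged outside the loop body (assumption: to the prologue).
  void finish_cost() {
    m_prologue_cost += m_sve_gather_scatter_init_cost;
    m_sve_gather_scatter_init_cost = 0;
  }

  unsigned body_cost() const { return m_body_cost; }
  unsigned prologue_cost() const { return m_prologue_cost; }

private:
  SveVecCostSketch m_tuning;
  unsigned m_sve_gather_scatter_init_cost = 0;
  unsigned m_body_cost = 0;
  unsigned m_prologue_cost = 0;
};
```

Keeping the entry cost separate from the body cost means it is counted once per
loop rather than once per iteration, so it weighs heavily only when the loop
runs few iterations.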
14 files changed:
gcc/config/aarch64/aarch64-protos.h
gcc/config/aarch64/aarch64.cc
gcc/config/aarch64/tuning_models/a64fx.h
gcc/config/aarch64/tuning_models/cortexx925.h
gcc/config/aarch64/tuning_models/generic.h
gcc/config/aarch64/tuning_models/generic_armv8_a.h
gcc/config/aarch64/tuning_models/generic_armv9_a.h
gcc/config/aarch64/tuning_models/neoverse512tvb.h
gcc/config/aarch64/tuning_models/neoversen2.h
gcc/config/aarch64/tuning_models/neoversen3.h
gcc/config/aarch64/tuning_models/neoversev1.h
gcc/config/aarch64/tuning_models/neoversev2.h
gcc/config/aarch64/tuning_models/neoversev3.h
gcc/config/aarch64/tuning_models/neoversev3ae.h