AArch64: Implement target hooks for dispatch scheduling.
This patch adds dispatch scheduling for AArch64 by implementing the two target
hooks TARGET_SCHED_DISPATCH and TARGET_SCHED_DISPATCH_DO.
The motivation for this is that out-of-order cores do most of the
reordering needed to avoid pipeline hazards in hardware, using large
reorder buffers. For such cores, rather than scheduling around
instruction latencies and throughputs, the compiler should aim to
maximize the utilized dispatch bandwidth by feeding a suitable
instruction mix into the frontend dispatch window.
In the following, we describe the overall implementation:
Recall that the Haifa scheduler makes the following six types of queries
to a dispatch scheduling model:
1) targetm.sched.dispatch (NULL, IS_DISPATCH_ON)
2) targetm.sched.dispatch_do (NULL, DISPATCH_INIT)
3) targetm.sched.dispatch (insn, FITS_DISPATCH_WINDOW)
4) targetm.sched.dispatch_do (insn, ADD_TO_DISPATCH_WINDOW)
5) targetm.sched.dispatch (NULL, DISPATCH_VIOLATION)
6) targetm.sched.dispatch (insn, IS_CMP)
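The two hooks can be thought of as dispatchers over these action codes.
The following is only an illustrative sketch: the enum is redefined
locally (GCC defines these values centrally), the predicates are
placeholders, and the function bodies are not the actual
aarch64-sched-dispatch.cc code:

```cpp
#include <cassert>

// Action codes used by the scheduler's queries (illustrative
// redefinition for this standalone sketch).
enum dispatch_action
{
  IS_DISPATCH_ON,
  IS_CMP,
  DISPATCH_VIOLATION,
  FITS_DISPATCH_WINDOW,
  ADD_TO_DISPATCH_WINDOW,
  DISPATCH_INIT
};

struct rtx_insn;  // Opaque here.

// Placeholder state standing in for the real model.
static bool dispatch_tuning_enabled = true;
static bool window_has_room = true;
static bool saw_violation = false;

// TARGET_SCHED_DISPATCH: the boolean queries 1), 3), 5) and 6).
bool
aarch64_dispatch (rtx_insn * /* insn */, int action)
{
  switch (action)
    {
    case IS_DISPATCH_ON:
      return dispatch_tuning_enabled;  // 1) Tune flag set?
    case FITS_DISPATCH_WINDOW:
      return window_has_room;          // 3) Ask the dispatch_window.
    case DISPATCH_VIOLATION:
      return saw_violation;            // 5) Any constraint overrun?
    case IS_CMP:
      return false;                    // 6) Not used on AArch64.
    default:
      return false;
    }
}

// TARGET_SCHED_DISPATCH_DO: the side-effecting queries 2) and 4).
void
aarch64_dispatch_do (rtx_insn * /* insn */, int action)
{
  if (action == DISPATCH_INIT)
    {
      // 2) Construct the dispatch_window from the tuning model.
    }
  else if (action == ADD_TO_DISPATCH_WINDOW)
    {
      // 4) Deduct the insn's slots from the current window.
    }
}
```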
For 1), we created the new tune flag AARCH64_EXTRA_TUNE_DISPATCH_SCHED.
For 2)-5), we modeled dispatch scheduling using the class dispatch_window.
A dispatch_window object represents the window of operations that is
dispatched per cycle. It contains the two arrays max_slots and free_slots,
whose length is the number of dispatch constraints specified for a core,
to keep track of the available slots.
The dispatch_window class exposes functions to query whether a given
instruction would fit into the window and to add an instruction to it.
The model operates using only one dispatch_window object that is constructed
when 2) is called. Upon construction, it copies the number of available slots
given in the tuning model (more details on the changes to tune_params below).
During scheduling, instructions are added according to the dispatch
constraints. For that, the dispatch_window queries the tuning model using a
callback function that takes an insn as input and returns a vector of
pairs (a, b), where a is the index of the constraint and b is the number of
slots occupied.
The dispatch_window checks whether the instruction fits into the current
window. If it does not, i.e. the current window is full, the free_slots
array is reset to max_slots. The dispatch_window then deducts b slots
from free_slots[a] for each pair (a, b) in the vector returned by the
callback.
A dispatch violation occurs when the number of free slots becomes negative
for any dispatch_constraint.
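The bookkeeping described above can be sketched as a simplified
standalone model. The names dispatch_window, max_slots and free_slots
follow the patch description, but the callback type, member layout and
use of a plain int for the insn are assumptions for illustration, not
the actual implementation:

```cpp
#include <functional>
#include <utility>
#include <vector>

// Illustrative stand-in for the tuning-model callback: maps an insn
// (here just an opaque id) to (constraint index, slots occupied) pairs.
using slot_fn = std::function<std::vector<std::pair<int, int>> (int)>;

class dispatch_window
{
  std::vector<int> max_slots;   // Per-constraint slot capacity per cycle.
  std::vector<int> free_slots;  // Slots still available in this window.
  slot_fn slots_for_insn;       // Callback from the tuning model.
  bool violation = false;       // Set when free slots go negative.

public:
  dispatch_window (std::vector<int> max, slot_fn fn)
    : max_slots (max), free_slots (max), slots_for_insn (fn) {}

  // DISPATCH_VIOLATION query.
  bool dispatch_violation () const { return violation; }

  // FITS_DISPATCH_WINDOW query: would INSN still fit into this window?
  bool fits (int insn) const
  {
    for (auto [a, b] : slots_for_insn (insn))
      if (free_slots[a] < b)
        return false;
    return true;
  }

  // ADD_TO_DISPATCH_WINDOW: open a fresh window if INSN no longer
  // fits, then deduct its slots; a negative count marks a violation
  // (the insn needs more slots than a full window provides).
  void add (int insn)
  {
    if (!fits (insn))
      free_slots = max_slots;
    for (auto [a, b] : slots_for_insn (insn))
      {
        free_slots[a] -= b;
        if (free_slots[a] < 0)
          violation = true;
      }
  }
};
```

For example, with a single constraint offering two slots per cycle, two
one-slot instructions fill the window; adding a third opens a fresh
window rather than raising a violation.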
For 6), we return false (see the comment in aarch64-sched-dispatch.cc).
Dispatch information for a core can be added in its tuning model. We added
the new field *dispatch_constraint to struct tune_params; it holds a
pointer to a struct dispatch_constraint_info.
All current tuning models initialize it to nullptr.
(In the next patch, dispatch information will be added for Neoverse V2.)
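A sketch of how a tuning model might carry this information. The field
and struct names follow the patch description, but the member layout and
the slot counts are invented for illustration:

```cpp
#include <cstddef>

// Hypothetical layout of the per-core dispatch information
// (member names assumed, not taken from the actual patch).
struct dispatch_constraint_info
{
  int num_constraints;   // Length of the slot array.
  const int *max_slots;  // Slots available per constraint per cycle.
};

struct tune_params
{
  // ... existing tuning fields elided ...
  const dispatch_constraint_info *dispatch_constraint; // nullptr: disabled.
};

// Most cores leave the field unset, as in this patch.
static const tune_params generic_tunings = { nullptr };

// A hypothetical core with two constraints, e.g. 8 total dispatch
// slots and 4 of some narrower resource (numbers invented).
static const int example_slots[] = { 8, 4 };
static const dispatch_constraint_info example_dispatch
  = { 2, example_slots };
static const tune_params example_tunings = { &example_dispatch };
```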
The patch was bootstrapped and tested on aarch64-linux-gnu, with no
regressions.
Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>
gcc/ChangeLog:
* config.gcc: Add aarch64-sched-dispatch.o to extra_objs.
* config/aarch64/aarch64-protos.h (struct tune_params): New
field for dispatch scheduling.
(struct dispatch_constraint_info): New struct for dispatch scheduling.
* config/aarch64/aarch64-tuning-flags.def
(AARCH64_EXTRA_TUNING_OPTION): New flag to enable dispatch scheduling.
* config/aarch64/aarch64.cc (TARGET_SCHED_DISPATCH): Implement
target hook.
(TARGET_SCHED_DISPATCH_DO): Likewise.
(aarch64_override_options_internal): Add check for definition of
dispatch constraints if dispatch-scheduling tune flag is set.
* config/aarch64/t-aarch64: Add aarch64-sched-dispatch.o.
* config/aarch64/tuning_models/a64fx.h: Initialize fields for
dispatch scheduling in tune_params.
* config/aarch64/tuning_models/ampere1.h: Likewise.
* config/aarch64/tuning_models/ampere1a.h: Likewise.
* config/aarch64/tuning_models/ampere1b.h: Likewise.
* config/aarch64/tuning_models/cortexa35.h: Likewise.
* config/aarch64/tuning_models/cortexa53.h: Likewise.
* config/aarch64/tuning_models/cortexa57.h: Likewise.
* config/aarch64/tuning_models/cortexa72.h: Likewise.
* config/aarch64/tuning_models/cortexa73.h: Likewise.
* config/aarch64/tuning_models/cortexx925.h: Likewise.
* config/aarch64/tuning_models/emag.h: Likewise.
* config/aarch64/tuning_models/exynosm1.h: Likewise.
* config/aarch64/tuning_models/fujitsu_monaka.h: Likewise.
* config/aarch64/tuning_models/generic.h: Likewise.
* config/aarch64/tuning_models/generic_armv8_a.h: Likewise.
* config/aarch64/tuning_models/generic_armv9_a.h: Likewise.
* config/aarch64/tuning_models/neoverse512tvb.h: Likewise.
* config/aarch64/tuning_models/neoversen1.h: Likewise.
* config/aarch64/tuning_models/neoversen2.h: Likewise.
* config/aarch64/tuning_models/neoversen3.h: Likewise.
* config/aarch64/tuning_models/neoversev1.h: Likewise.
* config/aarch64/tuning_models/neoversev2.h: Likewise.
* config/aarch64/tuning_models/neoversev3.h: Likewise.
* config/aarch64/tuning_models/neoversev3ae.h: Likewise.
* config/aarch64/tuning_models/olympus.h: Likewise.
* config/aarch64/tuning_models/qdf24xx.h: Likewise.
* config/aarch64/tuning_models/saphira.h: Likewise.
* config/aarch64/tuning_models/thunderx.h: Likewise.
* config/aarch64/tuning_models/thunderx2t99.h: Likewise.
* config/aarch64/tuning_models/thunderx3t110.h: Likewise.
* config/aarch64/tuning_models/thunderxt88.h: Likewise.
* config/aarch64/tuning_models/tsv110.h: Likewise.
* config/aarch64/tuning_models/xgene1.h: Likewise.
* config/aarch64/aarch64-sched-dispatch.cc: New file for
dispatch scheduling for aarch64.
* config/aarch64/aarch64-sched-dispatch.h: New header file.