..
  Copyright 1988-2022 Free Software Foundation, Inc.
  This is part of the GCC manual.
  For copying conditions, see the copyright.rst file.

.. _ipa:

Using summary information in IPA passes
***************************************

Programs are represented internally as a *callgraph* (a
multi-graph where nodes are functions and edges are call sites)
and a *varpool* (a list of static and external variables in
the program).

The inter-procedural optimization is organized as a sequence of
individual passes, which operate on the callgraph and the
varpool. To make the implementation of WHOPR possible, every
inter-procedural optimization pass is split into several stages
that are executed at different times during WHOPR compilation:
* LGEN time

  * *Generate summary* (``generate_summary`` in
    ``struct ipa_opt_pass_d``). This stage examines every function
    body and variable initializer and stores relevant
    information into a pass-specific data structure.

  * *Write summary* (``write_summary`` in
    ``struct ipa_opt_pass_d``). This stage writes all the
    pass-specific information generated by ``generate_summary``.
    Summaries go into their own ``LTO_section_*`` sections that
    have to be declared in :samp:`lto-streamer.h`: ``enum
    lto_section_type``. A new section is created by calling
    ``create_output_block`` and data can be written using the
    ``lto_output_*`` routines.

* WPA time

  * *Read summary* (``read_summary`` in
    ``struct ipa_opt_pass_d``). This stage reads all the
    pass-specific information in exactly the same order that it was
    written by ``write_summary``.

  * *Execute* (``execute`` in ``struct
    opt_pass``). This performs inter-procedural propagation. This
    must be done without actual access to the individual function
    bodies or variable initializers. Typically, this results in a
    transitive closure operation over the summary information of all
    the nodes in the callgraph.

  * *Write optimization summary*
    (``write_optimization_summary`` in ``struct
    ipa_opt_pass_d``). This writes the result of the inter-procedural
    propagation into the object file. This can use the same data
    structures and helper routines used in ``write_summary``.

* LTRANS time

  * *Read optimization summary*
    (``read_optimization_summary`` in ``struct
    ipa_opt_pass_d``). The counterpart to
    ``write_optimization_summary``. This reads the inter-procedural
    optimization decisions in exactly the same format emitted by
    ``write_optimization_summary``.

  * *Transform* (``function_transform`` and
    ``variable_transform`` in ``struct ipa_opt_pass_d``).
    The actual function bodies and variable initializers are updated
    based on the information passed down from the *Execute* stage.

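The stage split above can be sketched with a toy model. Everything
below is invented for illustration (the real hooks live in
``struct ipa_opt_pass_d`` and use GCC's streamer routines, not
string streams); the summary here is simply each function's set of
direct callees, and *Execute* computes a transitive closure over it:

```cpp
// Toy model of the WHOPR stage split. None of these types or
// functions are real GCC interfaces.
#include <cassert>
#include <map>
#include <set>
#include <sstream>
#include <string>

// Callgraph and per-function summary: function name -> direct callees.
using Summary = std::map<std::string, std::set<std::string>>;

// LGEN: examine each function "body" and record pass-specific data
// (here the summary is simply the set of direct callees).
Summary generate_summary (const Summary &callgraph)
{
  return callgraph;
}

// LGEN: serialize the summary, standing in for an LTO_section_* section.
std::string write_summary (const Summary &s)
{
  std::ostringstream out;
  for (const auto &entry : s)
    {
      out << entry.first << ':';
      for (const auto &callee : entry.second)
        out << ' ' << callee;
      out << '\n';
    }
  return out.str ();
}

// WPA: read the summary back in exactly the order it was written.
Summary read_summary (const std::string &bytes)
{
  Summary s;
  std::istringstream in (bytes);
  std::string line;
  while (std::getline (in, line))
    {
      std::string::size_type colon = line.find (':');
      std::istringstream rest (line.substr (colon + 1));
      std::set<std::string> &callees = s[line.substr (0, colon)];
      std::string callee;
      while (rest >> callee)
        callees.insert (callee);
    }
  return s;
}

// WPA: inter-procedural propagation without access to bodies --
// a transitive closure over the per-function callee sets.
Summary execute (Summary s)
{
  bool changed = true;
  while (changed)
    {
      changed = false;
      for (auto &entry : s)
        {
          std::set<std::string> indirect;
          for (const auto &callee : entry.second)
            if (s.count (callee))
              indirect.insert (s[callee].begin (), s[callee].end ());
          for (const auto &callee : indirect)
            changed |= entry.second.insert (callee).second;
        }
    }
  return s;
}
```

The *Transform* stage would then consume the propagated result per
function at LTRANS time, after the summaries have been streamed out
again by *Write optimization summary*.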
The implementation of the inter-procedural passes is shared
between LTO, WHOPR and classic non-LTO compilation.

* During the traditional file-by-file mode every pass executes its
  own *Generate summary*, *Execute*, and *Transform*
  stages within the single execution context of the compiler.

* In LTO compilation mode, every pass uses the *Generate
  summary* and *Write summary* stages at compilation time,
  while the *Read summary*, *Execute*, and
  *Transform* stages are executed at link time.

* In WHOPR mode all stages are used.

To simplify development, the GCC pass manager differentiates
between normal inter-procedural passes (see :ref:`regular-ipa-passes`),
small inter-procedural passes (see :ref:`small-ipa-passes`)
and late inter-procedural passes (see :ref:`late-ipa-passes`).
A small or late IPA pass (``SIMPLE_IPA_PASS``) does
everything at once and thus cannot be executed during WPA in
WHOPR mode. It defines only the *Execute* stage and during
this stage it accesses and modifies the function bodies. Such
passes are useful for optimization at LGEN or LTRANS time and are
used, for example, to implement early optimization before writing
object files. The simple inter-procedural passes can also be used
for easier prototyping and development of a new inter-procedural
pass.

Virtual clones
^^^^^^^^^^^^^^

One of the main challenges of introducing the WHOPR compilation
mode was addressing the interactions between optimization passes.
In LTO compilation mode, the passes are executed in sequence;
each pass consists of analysis (or *Generate summary*),
propagation (or *Execute*) and *Transform* stages.
Once the work of one pass is finished, the next pass sees the
updated program representation and can execute. This makes the
individual passes dependent on each other.
In WHOPR mode all passes first execute their *Generate
summary* stage. Summary writing then marks the end of the LGEN
stage. At WPA time,
the summaries are read back into memory and all passes run the
*Execute* stage. Optimization summaries are streamed and
sent to LTRANS, where all the passes execute the *Transform*
stage.
Most optimization passes split naturally into analysis,
propagation and transformation stages. But some do not. The
main problem arises when one pass performs changes and the
following pass gets confused by seeing different callgraphs
at the *Transform* stage than at the *Generate summary*
or *Execute* stage. This means that the passes are required
to communicate their decisions to each other.
To facilitate this communication, the GCC callgraph
infrastructure implements *virtual clones*, a method of
representing the changes performed by the optimization passes in
the callgraph without needing to update function bodies.

A *virtual clone* in the callgraph is a function that has no
associated body, just a description of how to create its body based
on a different function (which itself may be a virtual clone).
The description of function modifications includes adjustments to
the function's signature (which allows, for example, removing or
adding function arguments), substitutions to perform on the
function body, and, for inlined functions, a pointer to the
function that it will be inlined into.

It is also possible to redirect any edge of the callgraph from a
function to its virtual clone. This implies updating the call
site to conform to the new function signature.
Most of the transformations performed by inter-procedural
optimizations can be represented via virtual clones. For
instance, a constant propagation pass can produce a virtual clone
of a function which replaces one of its arguments by a
constant. The inliner can represent its decisions by producing a
clone of a function whose body will later be integrated into
a given function.

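The constant-propagation example can be sketched as follows. The
types below are invented for illustration and do not match GCC's
actual ``cgraph`` clone representation; the point is only that a
clone is a base function plus a recipe, with no body of its own
until it is materialized:

```cpp
// Illustrative model of a virtual clone: no body of its own, only a
// recipe for building one from a base function.
#include <cassert>
#include <functional>
#include <map>

// A concrete function of two parameters.
static int base_fn (int a, int b)
{
  return 10 * a + b;
}

struct virtual_clone
{
  std::function<int (int, int)> base;
  std::map<int, int> const_args; // parameter index -> known constant

  // Materialization (done at LTRANS time in WHOPR) turns the
  // description into a real function over the remaining parameter.
  std::function<int (int)> materialize () const
  {
    std::function<int (int, int)> f = base;
    if (const_args.count (0))
      {
        int c = const_args.at (0);
        return [f, c] (int b) { return f (c, b); };
      }
    int c = const_args.at (1);
    return [f, c] (int a) { return f (a, c); };
  }
};
```

Until ``materialize`` runs, the clone is just metadata, which is why
callgraph edges can be redirected to it during *Execute* without
touching any function body.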
Using *virtual clones*, the program can be easily updated
during the *Execute* stage, solving most of the pass interaction
problems that would otherwise occur during *Transform*.

Virtual clones are later materialized in the LTRANS stage and
turned into real functions. Passes executed after the virtual
clone was introduced also perform their *Transform* stage
on the new functions, so for a pass there is no significant
difference between operating on a real function or on a virtual
clone introduced before its *Execute* stage.

Optimization passes then work on virtual clones introduced before
their *Execute* stage as if they were real functions. The
only difference is that clones are not visible during the
*Generate Summary* stage.

To keep function summaries updated, the callgraph interface
allows an optimizer to register a callback that is called every
time a new clone is introduced as well as when the actual
function or variable is generated or when a function or variable
is removed. These hooks are registered in the *Generate
summary* stage and allow the pass to keep its information intact
until the *Execute* stage. The same hooks can also be
registered during the *Execute* stage to keep the
optimization summaries updated for the *Transform* stage.

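A minimal sketch of this hook mechanism follows. The ``callgraph``
and ``size_pass`` types are invented; GCC's real hooks are registered
through the cgraph interfaces, but the bookkeeping pattern is the
same: on clone insertion, copy the summary; on removal, drop it:

```cpp
// Illustrative sketch of clone/removal hooks that keep a pass's
// summaries in sync with the callgraph.
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

struct callgraph
{
  std::vector<std::function<void (const std::string &,
                                  const std::string &)>> clone_hooks;
  std::vector<std::function<void (const std::string &)>> removal_hooks;

  void introduce_clone (const std::string &original,
                        const std::string &clone)
  {
    for (auto &hook : clone_hooks)
      hook (original, clone);
  }

  void remove_node (const std::string &fn)
  {
    for (auto &hook : removal_hooks)
      hook (fn);
  }
};

struct size_pass
{
  std::map<std::string, int> summary; // e.g. an estimated body size

  // Registered during the Generate summary stage so the data stays
  // valid until Execute even as clones come and go.
  void register_hooks (callgraph &cg)
  {
    cg.clone_hooks.push_back (
      [this] (const std::string &o, const std::string &c)
      { summary[c] = summary[o]; });  // a clone inherits its summary
    cg.removal_hooks.push_back (
      [this] (const std::string &fn) { summary.erase (fn); });
  }
};
```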
IPA references
^^^^^^^^^^^^^^

GCC represents IPA references in the callgraph. For a function
or variable ``A``, the *IPA reference* is a list of all
locations where the address of ``A`` is taken and, when
``A`` is a variable, a list of all direct stores and reads
to/from ``A``. References represent an oriented multi-graph on
the union of nodes of the callgraph and the varpool. See
:samp:`ipa-reference.cc`: ``ipa_reference_write_optimization_summary``
and
:samp:`ipa-reference.cc`: ``ipa_reference_read_optimization_summary``
for details.

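The shape of this data can be sketched as below. This is an invented
model, much simpler than GCC's real reference machinery, but it shows
why the structure is a multi-graph: several parallel edges may connect
the same pair of nodes, one per reference site:

```cpp
// Illustrative model of IPA references as an oriented multi-graph
// over callgraph and varpool nodes.
#include <cassert>
#include <string>
#include <vector>

enum class ref_kind { addr, load, store };

struct reference
{
  std::string referring; // node containing the use
  std::string referred;  // the A being referenced
  ref_kind kind;
};

// All references to one node; parallel edges between the same pair
// of nodes are allowed, which makes this a multi-graph.
static std::vector<reference>
refs_to (const std::vector<reference> &all, const std::string &node)
{
  std::vector<reference> out;
  for (const reference &r : all)
    if (r.referred == node)
      out.push_back (r);
  return out;
}
```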
Jump functions
^^^^^^^^^^^^^^

Suppose that an optimization pass sees a function ``A`` and it
knows the values of (some of) its arguments. The *jump
function* describes the value of a parameter of a given function
call in function ``A`` based on this knowledge.

Jump functions are used by several optimizations, such as the
inter-procedural constant propagation pass and the
devirtualization pass. The inliner also uses jump functions to
perform inlining of callbacks.