..
  Copyright 1988-2022 Free Software Foundation, Inc.
  This is part of the GCC manual.
  For copying conditions, see the copyright.rst file.

.. _ipa:

Using summary information in IPA passes
***************************************

Programs are represented internally as a *callgraph* (a
multi-graph where nodes are functions and edges are call sites)
and a *varpool* (a list of static and external variables in
the program).

The inter-procedural optimization is organized as a sequence of
individual passes, which operate on the callgraph and the
varpool.  To make the implementation of WHOPR possible, every
inter-procedural optimization pass is split into several stages
that are executed at different times during WHOPR compilation:

* LGEN time

  * *Generate summary* (``generate_summary`` in
    ``struct ipa_opt_pass_d``).  This stage examines every
    function body and variable initializer and stores the
    relevant information in a pass-specific data structure.

  * *Write summary* (``write_summary`` in
    ``struct ipa_opt_pass_d``).  This stage writes all the
    pass-specific information generated by ``generate_summary``.
    Summaries go into their own ``LTO_section_*`` sections that
    have to be declared in :samp:`lto-streamer.h`: ``enum
    lto_section_type``.  A new section is created by calling
    ``create_output_block`` and data can be written using the
    ``lto_output_*`` routines.

* WPA time

  * *Read summary* (``read_summary`` in
    ``struct ipa_opt_pass_d``).  This stage reads all the
    pass-specific information in exactly the same order that it was
    written by ``write_summary``.

  * *Execute* (``execute`` in ``struct
    opt_pass``).  This performs inter-procedural propagation.  This
    must be done without actual access to the individual function
    bodies or variable initializers.  Typically, this results in a
    transitive closure operation over the summary information of all
    the nodes in the callgraph.

  * *Write optimization summary*
    (``write_optimization_summary`` in ``struct
    ipa_opt_pass_d``).  This writes the result of the inter-procedural
    propagation into the object file.  This can use the same data
    structures and helper routines used in ``write_summary``.

* LTRANS time

  * *Read optimization summary*
    (``read_optimization_summary`` in ``struct
    ipa_opt_pass_d``).  The counterpart to
    ``write_optimization_summary``.  This reads the inter-procedural
    optimization decisions in exactly the same format emitted by
    ``write_optimization_summary``.

  * *Transform* (``function_transform`` and
    ``variable_transform`` in ``struct ipa_opt_pass_d``).
    The actual function bodies and variable initializers are updated
    based on the information passed down from the *Execute* stage.

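The stage split above can be illustrated with a self-contained sketch.
Everything below is invented for illustration only (the types, the
per-function "call count" summary, and the textual section format are
not GCC interfaces); the point is the data flow: the summary is
generated and serialized at LGEN time, then read back and propagated at
WPA time without touching any function body.

.. code-block:: c++

  // Hypothetical sketch, not GCC code: models how one IPA pass is split
  // into Generate summary / Write summary (LGEN) and Read summary /
  // Execute (WPA) stages that communicate only through a serialized
  // summary section.
  #include <cassert>
  #include <map>
  #include <sstream>
  #include <string>

  // Imagined pass-specific summary: a per-function call count.
  using Summary = std::map<std::string, int>;

  // LGEN: generate_summary analyzes "function bodies" (faked here).
  Summary generate_summary() {
    return {{"bar", 1}, {"foo", 2}};
  }

  // LGEN: write_summary streams the summary into a "section" (a string).
  std::string write_summary(const Summary &s) {
    std::ostringstream os;
    for (const auto &kv : s)
      os << kv.first << ' ' << kv.second << '\n';
    return os.str();
  }

  // WPA: read_summary reads it back in exactly the order it was written.
  Summary read_summary(const std::string &section) {
    Summary s;
    std::istringstream is(section);
    std::string name;
    int n;
    while (is >> name >> n)
      s[name] = n;
    return s;
  }

  // WPA: execute propagates over summaries only; no bodies are visible.
  int execute(const Summary &s) {
    int total = 0;
    for (const auto &kv : s)
      total += kv.second;
    return total;
  }

  int main() {
    Summary s = generate_summary();          // LGEN
    std::string section = write_summary(s);  // LGEN, into the object file
    Summary t = read_summary(section);       // WPA
    assert(t == s);                          // the round trip is lossless
    assert(execute(t) == 3);                 // propagation over summaries
    return 0;
  }

In the real compiler the "section" is an ``LTO_section_*`` section
written with the ``lto_output_*`` routines, and the read side must
consume it in exactly the order it was written.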
The implementation of the inter-procedural passes is shared
between LTO, WHOPR and classic non-LTO compilation.

* During the traditional file-by-file mode every pass executes its
  own *Generate summary*, *Execute*, and *Transform*
  stages within the single execution context of the compiler.

* In LTO compilation mode, every pass uses the *Generate
  summary* and *Write summary* stages at compilation time,
  while the *Read summary*, *Execute*, and
  *Transform* stages are executed at link time.

* In WHOPR mode all stages are used.

To simplify development, the GCC pass manager differentiates
between normal inter-procedural passes (see :ref:`regular-ipa-passes`),
small inter-procedural passes (see :ref:`small-ipa-passes`)
and late inter-procedural passes (see :ref:`late-ipa-passes`).
A small or late IPA pass (``SIMPLE_IPA_PASS``) does
everything at once and thus cannot be executed during WPA in
WHOPR mode.  It defines only the *Execute* stage, and during
this stage it accesses and modifies the function bodies.  Such
passes are useful for optimization at LGEN or LTRANS time and are
used, for example, to implement early optimization before writing
object files.  The simple inter-procedural passes can also be used
for easier prototyping and development of a new inter-procedural
pass.

Virtual clones
^^^^^^^^^^^^^^

One of the main challenges of introducing the WHOPR compilation
mode was addressing the interactions between optimization passes.
In LTO compilation mode, the passes are executed in sequence,
each consisting of analysis (or *Generate summary*),
propagation (or *Execute*) and *Transform* stages.
Once the work of one pass is finished, the next pass sees the
updated program representation and can execute.  This makes the
individual passes dependent on each other.

In WHOPR mode all passes first execute their *Generate
summary* stage.  Summary writing then marks the end of the LGEN
stage.  At WPA time,
the summaries are read back into memory and all passes run their
*Execute* stage.  Optimization summaries are streamed and
sent to LTRANS, where all the passes execute their *Transform*
stage.

Most optimization passes split naturally into analysis,
propagation and transformation stages.  But some do not.  The
main problem arises when one pass performs changes and the
following pass gets confused by seeing different callgraphs
between the *Transform* stage and the *Generate summary*
or *Execute* stage.  This means that the passes are required
to communicate their decisions to each other.

To facilitate this communication, the GCC callgraph
infrastructure implements *virtual clones*, a method of
representing the changes performed by the optimization passes in
the callgraph without needing to update function bodies.

A *virtual clone* in the callgraph is a function that has no
associated body, just a description of how to create its body based
on a different function (which itself may be a virtual clone).

The description of function modifications includes adjustments to
the function's signature (which allows, for example, removing or
adding function arguments), substitutions to perform on the
function body, and, for inlined functions, a pointer to the
function that it will be inlined into.

It is also possible to redirect any edge of the callgraph from a
function to its virtual clone.  This implies updating the call
site to adjust for the new function signature.

Most of the transformations performed by inter-procedural
optimizations can be represented via virtual clones.  For
instance, a constant propagation pass can produce a virtual clone
of a function which replaces one of its arguments with a
constant.  The inliner can represent its decisions by producing a
clone of a function whose body will later be integrated into
a given function.

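The constant-propagation case can be sketched with a toy clone
descriptor.  This is not the real ``cgraph`` clone API; the descriptor
structure, the field names, and the ``materialize`` helper are all
invented for illustration.  The essential idea it mirrors is that a
clone is only a description (origin function, dropped parameter,
substituted constant) until a body is actually built from it.

.. code-block:: c++

  // Hypothetical sketch, not the real cgraph API: a "virtual clone" as
  // a description of how to derive a body from another function, here
  // by substituting a constant for one parameter.
  #include <cassert>
  #include <functional>

  // The original two-parameter function.
  int add(int a, int b) { return a + b; }

  // A clone descriptor: the function the clone derives from, which
  // parameter is removed, and the constant substituted for it.
  struct clone_desc {
    std::function<int(int, int)> origin;  // function to derive from
    int replaced_arg;                     // index of the dropped parameter
    int constant;                         // value substituted for it
  };

  // "Materialization": turn the description into an actual callable
  // body with the adjusted (one-parameter) signature.
  std::function<int(int)> materialize(const clone_desc &c) {
    if (c.replaced_arg == 0)
      return [c](int b) { return c.origin(c.constant, b); };
    return [c](int a) { return c.origin(a, c.constant); };
  }

  int main() {
    // Constant propagation decided add()'s first argument is always 40.
    clone_desc add_40{add, /*replaced_arg=*/0, /*constant=*/40};
    auto clone = materialize(add_40);  // in WHOPR, happens at LTRANS
    assert(clone(2) == 42);
    return 0;
  }

Redirecting a call edge to such a clone corresponds to rewriting the
call site for the new signature: callers stop passing the removed
argument.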
Using *virtual clones*, the program can be easily updated
during the *Execute* stage, solving most of the pass-interaction
problems that would otherwise occur during *Transform*.

Virtual clones are later materialized in the LTRANS stage and
turned into real functions.  Passes executed after a virtual
clone was introduced also perform their *Transform* stage
on the new functions, so for a pass there is no significant
difference between operating on a real function and on a virtual
clone introduced before its *Execute* stage.

Optimization passes thus work on virtual clones introduced before
their *Execute* stage as if they were real functions.  The
only difference is that clones are not visible during the
*Generate summary* stage.

To keep function summaries up to date, the callgraph interface
allows an optimizer to register a callback that is called every
time a new clone is introduced, as well as when the actual
function or variable is generated or when a function or variable
is removed.  These hooks are registered in the *Generate
summary* stage and allow the pass to keep its information intact
until the *Execute* stage.  The same hooks can also be
registered during the *Execute* stage to keep the
optimization summaries updated for the *Transform* stage.

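The hook mechanism can be sketched as follows.  The ``callgraph``
structure, the ``on_clone`` registration function, and the string-keyed
summary table are all invented for this illustration (the real hooks
live in the cgraph interface); what the sketch shows is why a pass
registers a callback: so that its summary data stays valid when a clone
appears between *Generate summary* and *Execute*.

.. code-block:: c++

  // Hypothetical sketch, not the real cgraph hook API: a pass registers
  // a callback so its per-function summary stays consistent whenever a
  // new clone is introduced in the callgraph.
  #include <cassert>
  #include <functional>
  #include <map>
  #include <string>
  #include <vector>

  using clone_hook =
      std::function<void(const std::string &orig, const std::string &clone)>;

  struct callgraph {
    std::vector<clone_hook> hooks;
    void on_clone(clone_hook h) { hooks.push_back(std::move(h)); }
    void introduce_clone(const std::string &orig, const std::string &clone) {
      for (auto &h : hooks)
        h(orig, clone);  // notify every registered pass
    }
  };

  int main() {
    callgraph cg;
    // Summary built during the Generate summary stage.
    std::map<std::string, int> summary{{"foo", 7}};
    // Registered hook: a new clone inherits its origin's summary, so
    // the table stays intact until the Execute stage.
    cg.on_clone([&](const std::string &o, const std::string &c) {
      summary[c] = summary[o];
    });
    cg.introduce_clone("foo", "foo.constprop.0");
    assert(summary.at("foo.constprop.0") == 7);
    return 0;
  }
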
IPA references
^^^^^^^^^^^^^^

GCC represents IPA references in the callgraph.  For a function
or variable ``A``, the *IPA reference* is a list of all
locations where the address of ``A`` is taken and, when
``A`` is a variable, a list of all direct stores and reads
to/from ``A``.  References represent an oriented multi-graph on
the union of nodes of the callgraph and the varpool.  See
:samp:`ipa-reference.cc`: ``ipa_reference_write_optimization_summary``
and
:samp:`ipa-reference.cc`: ``ipa_reference_read_optimization_summary``
for details.

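A minimal model of this multi-graph follows.  The ``ipa_ref`` record,
the ``ref_kind`` values and the ``count_stores`` query are invented for
illustration (the real representation lives in the cgraph/varpool
infrastructure); the sketch only shows the shape of the data: directed
edges between callgraph and varpool nodes, labeled address-taken, load,
or store.

.. code-block:: c++

  // Hypothetical sketch, not GCC's representation: IPA references as
  // labeled edges of a multi-graph over functions and variables.
  #include <cassert>
  #include <string>
  #include <vector>

  enum ref_kind { REF_ADDR, REF_LOAD, REF_STORE };

  struct ipa_ref {
    std::string referring;  // node containing the reference
    std::string referred;   // the A whose address or value is used
    ref_kind kind;
  };

  // The kind of query a pass might make: count direct stores to a
  // variable across the whole program.
  int count_stores(const std::vector<ipa_ref> &refs, const std::string &var) {
    int n = 0;
    for (const auto &r : refs)
      if (r.referred == var && r.kind == REF_STORE)
        ++n;
    return n;
  }

  int main() {
    std::vector<ipa_ref> refs = {
        {"main", "counter", REF_STORE},
        {"main", "counter", REF_LOAD},
        {"helper", "counter", REF_STORE},
        {"main", "helper", REF_ADDR},  // address of a function taken
    };
    assert(count_stores(refs, "counter") == 2);
    return 0;
  }
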
Jump functions
^^^^^^^^^^^^^^

Suppose that an optimization pass sees a function ``A`` and it
knows the values of (some of) its arguments.  The *jump
function* describes the value of a parameter of a given function
call in ``A`` based on this knowledge.

Jump functions are used by several optimizations, such as the
inter-procedural constant propagation pass and the
devirtualization pass.  The inliner also uses jump functions to
perform inlining of callbacks.
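
Two common cases can be sketched concretely: an argument that is a
known constant, and an argument that simply passes through one of
``A``'s own parameters.  The ``jump_func`` record and ``apply`` helper
below are invented for illustration and are not the representation in
:samp:`ipa-prop.h`.

.. code-block:: c++

  // Hypothetical sketch, not GCC's jump-function representation: a jump
  // function describes an argument of a call inside A in terms of what
  // is known about A -- a constant, or a pass-through of A's parameter.
  #include <cassert>
  #include <vector>

  enum jf_kind { JF_CONST, JF_PASS_THROUGH, JF_UNKNOWN };

  struct jump_func {
    jf_kind kind;
    int value;      // constant value when kind == JF_CONST
    int formal_id;  // A's parameter index when kind == JF_PASS_THROUGH
  };

  // Evaluate a callee argument given known values of A's parameters,
  // as inter-procedural constant propagation would.
  int apply(const jump_func &jf, const std::vector<int> &known_params) {
    switch (jf.kind) {
    case JF_CONST:
      return jf.value;
    case JF_PASS_THROUGH:
      return known_params[jf.formal_id];
    default:
      return -1;  // nothing known
    }
  }

  int main() {
    // Inside A(x, y) there is a call B(5, x): argument 0 of the call is
    // the constant 5, argument 1 passes through A's parameter 0.
    jump_func arg0{JF_CONST, 5, -1};
    jump_func arg1{JF_PASS_THROUGH, 0, 0};
    std::vector<int> known{10, 20};  // suppose A was called as A(10, 20)
    assert(apply(arg0, known) == 5);
    assert(apply(arg1, known) == 10);
    return 0;
  }
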