..
  Copyright 1988-2022 Free Software Foundation, Inc.
  This is part of the GCC manual.
  For copying conditions, see the copyright.rst file.

.. _ipa:

Using summary information in IPA passes
***************************************

Programs are represented internally as a *callgraph* (a
multi-graph where nodes are functions and edges are call sites)
and a *varpool* (a list of static and external variables in
the program).

The inter-procedural optimization is organized as a sequence of
individual passes, which operate on the callgraph and the
varpool. To make the implementation of WHOPR possible, every
inter-procedural optimization pass is split into several stages
that are executed at different times during WHOPR compilation
(a sketch of a pass descriptor wiring these stages together
follows the list):

* LGEN time

  * *Generate summary* (``generate_summary`` in
    ``struct ipa_opt_pass_d``). This stage examines every function
    body and variable initializer and stores relevant
    information into a pass-specific data structure.

  * *Write summary* (``write_summary`` in
    ``struct ipa_opt_pass_d``). This stage writes all the
    pass-specific information generated by ``generate_summary``.
    Summaries go into their own ``LTO_section_*`` sections that
    have to be declared in :samp:`lto-streamer.h`: ``enum
    lto_section_type``. A new section is created by calling
    ``create_output_block`` and data can be written using the
    ``lto_output_*`` routines (see the streaming sketch after
    the list).

* WPA time

  * *Read summary* (``read_summary`` in
    ``struct ipa_opt_pass_d``). This stage reads all the
    pass-specific information in exactly the same order that it was
    written by ``write_summary``.

  * *Execute* (``execute`` in ``struct
    opt_pass``). This performs inter-procedural propagation. This
    must be done without actual access to the individual function
    bodies or variable initializers. Typically, this results in a
    transitive closure operation over the summary information of all
    the nodes in the callgraph.

  * *Write optimization summary*
    (``write_optimization_summary`` in ``struct
    ipa_opt_pass_d``). This writes the result of the inter-procedural
    propagation into the object file. This can use the same data
    structures and helper routines used in ``write_summary``.

* LTRANS time

  * *Read optimization summary*
    (``read_optimization_summary`` in ``struct
    ipa_opt_pass_d``). The counterpart to
    ``write_optimization_summary``. This reads the inter-procedural
    optimization decisions in exactly the same format emitted by
    ``write_optimization_summary``.

  * *Transform* (``function_transform`` and
    ``variable_transform`` in ``struct ipa_opt_pass_d``).
    The actual function bodies and variable initializers are updated
    based on the information passed down from the *Execute* stage.

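
The following sketch shows how such a pass descriptor could tie the
stages together. Everything named ``my_pass_*``, as well as the pass
name, is a hypothetical example rather than an existing pass; only the
general shape of the ``pass_data`` initializer and of the
``ipa_opt_pass_d`` constructor follows :samp:`tree-pass.h`, and both
should be checked against the current sources.

.. code-block:: c++

  /* Illustrative sketch of an IPA pass descriptor; all my_pass_* names
     are hypothetical.  Within the GCC tree the usual include preamble
     would be roughly the following.  */
  #include "config.h"
  #include "system.h"
  #include "coretypes.h"
  #include "tree.h"
  #include "cgraph.h"
  #include "tree-pass.h"

  /* Stage implementations, with the signatures expected by
     struct ipa_opt_pass_d.  */
  static void my_pass_generate_summary (void);             /* LGEN */
  static void my_pass_write_summary (void);                /* LGEN */
  static void my_pass_read_summary (void);                 /* WPA */
  static void my_pass_write_optimization_summary (void);   /* WPA */
  static void my_pass_read_optimization_summary (void);    /* LTRANS */
  static unsigned int my_pass_transform (cgraph_node *);   /* LTRANS */
  static unsigned int my_pass_propagate (void);            /* WPA */

  static const pass_data pass_data_my_pass =
  {
    IPA_PASS,          /* type */
    "my-pass",         /* name */
    OPTGROUP_NONE,     /* optinfo_flags */
    TV_NONE,           /* tv_id */
    0,                 /* properties_required */
    0,                 /* properties_provided */
    0,                 /* properties_destroyed */
    0,                 /* todo_flags_start */
    0,                 /* todo_flags_finish */
  };

  class pass_my_pass : public ipa_opt_pass_d
  {
  public:
    pass_my_pass (gcc::context *ctxt)
      : ipa_opt_pass_d (pass_data_my_pass, ctxt,
                        my_pass_generate_summary,
                        my_pass_write_summary,
                        my_pass_read_summary,
                        my_pass_write_optimization_summary,
                        my_pass_read_optimization_summary,
                        NULL,               /* stmt_fixup */
                        0,                  /* function_transform_todo_flags_start */
                        my_pass_transform,  /* function_transform */
                        NULL)               /* variable_transform */
    {}

    /* The Execute stage: inter-procedural propagation at WPA time,
       working only on the summaries, never on function bodies.  */
    virtual unsigned int execute (function *) { return my_pass_propagate (); }
  };

The *Write summary* stage streams its data into the pass's own
``LTO_section_*`` section. The sketch below is modeled loosely on the
pattern used by existing passes such as :samp:`ipa-prop.cc`;
``LTO_section_my_pass``, the ``my_pass_summary`` record and its
accessor are again assumptions made up for this example.

.. code-block:: c++

  /* Hypothetical per-function data built by my_pass_generate_summary.  */
  struct my_pass_summary { unsigned int argument_count; };
  static my_pass_summary *my_pass_get_summary (cgraph_node *);

  static void
  my_pass_write_summary (void)
  {
    /* The streaming helpers come from lto-streamer.h and data-streamer.h.
       LTO_section_my_pass would have to be added to enum lto_section_type.  */
    struct output_block *ob = create_output_block (LTO_section_my_pass);
    lto_symtab_encoder_t encoder = ob->decl_state->symtab_node_encoder;
    lto_symtab_encoder_iterator lsei;

    /* First walk over the functions in this partition: count the nodes
       that have a summary to stream.  */
    unsigned int count = 0;
    for (lsei = lsei_start_function_in_partition (encoder);
         !lsei_end_p (lsei); lsei_next_function_in_partition (&lsei))
      if (my_pass_get_summary (lsei_cgraph_node (lsei)))
        count++;
    streamer_write_uhwi (ob, count);

    /* Second walk: one record per function, a reference to the callgraph
       node followed by the pass-specific payload.  */
    for (lsei = lsei_start_function_in_partition (encoder);
         !lsei_end_p (lsei); lsei_next_function_in_partition (&lsei))
      {
        cgraph_node *node = lsei_cgraph_node (lsei);
        if (my_pass_summary *sum = my_pass_get_summary (node))
          {
            streamer_write_uhwi (ob, lto_symtab_encoder_encode (encoder, node));
            streamer_write_uhwi (ob, sum->argument_count);
          }
      }

    produce_asm (ob, NULL);
    destroy_output_block (ob);
  }
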
The implementation of the inter-procedural passes is shared
between LTO, WHOPR and classic non-LTO compilation.

* During the traditional file-by-file mode every pass executes its
  own *Generate summary*, *Execute*, and *Transform*
  stages within the single execution context of the compiler.

* In LTO compilation mode, every pass uses *Generate
  summary* and *Write summary* stages at compilation time,
  while the *Read summary*, *Execute*, and
  *Transform* stages are executed at link time.

* In WHOPR mode all stages are used.

To simplify development, the GCC pass manager differentiates
between normal inter-procedural passes (see :ref:`regular-ipa-passes`),
small inter-procedural passes (see :ref:`small-ipa-passes`)
and late inter-procedural passes (see :ref:`late-ipa-passes`).
A small or late IPA pass (``SIMPLE_IPA_PASS``) does
everything at once and thus cannot be executed during WPA in
WHOPR mode. It defines only the *Execute* stage and during
this stage it accesses and modifies the function bodies. Such
passes are useful for optimization at LGEN or LTRANS time and are
used, for example, to implement early optimization before writing
object files. The simple inter-procedural passes can also be used
for easier prototyping and development of a new inter-procedural
pass.

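A simple IPA pass therefore needs no summary hooks at all; a minimal
sketch might look as follows, where the pass name and the body of
``execute`` are illustrative assumptions and ``pass_data_my_simple_pass``
would be defined like the ``pass_data`` initializer shown earlier, but
with type ``SIMPLE_IPA_PASS``.

.. code-block:: c++

  class pass_my_simple_pass : public simple_ipa_opt_pass
  {
  public:
    pass_my_simple_pass (gcc::context *ctxt)
      : simple_ipa_opt_pass (pass_data_my_simple_pass, ctxt)
    {}

    /* Everything happens in the Execute stage: the pass may walk the
       callgraph and touch function bodies directly, which is why it
       cannot run at WPA time in WHOPR mode.  */
    virtual unsigned int execute (function *)
    {
      cgraph_node *node;
      FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node)
        {
          /* Inspect or rewrite the body of NODE here.  */
        }
      return 0;
    }
  };
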
Virtual clones
^^^^^^^^^^^^^^

One of the main challenges of introducing the WHOPR compilation
mode was addressing the interactions between optimization passes.
In LTO compilation mode, the passes are executed in a sequence,
each of which consists of analysis (or *Generate summary*),
propagation (or *Execute*) and *Transform* stages.
Once the work of one pass is finished, the next pass sees the
updated program representation and can execute. This makes the
individual passes dependent on each other.

In WHOPR mode all passes first execute their *Generate
summary* stage. Then summary writing marks the end of the LGEN
stage. At WPA time,
the summaries are read back into memory and all passes run the
*Execute* stage. Optimization summaries are streamed and
sent to LTRANS, where all the passes execute the *Transform*
stage.

Most optimization passes split naturally into analysis,
propagation and transformation stages. But some do not. The
main problem arises when one pass performs changes and the
following pass gets confused by seeing different callgraphs
between the *Transform* stage and the *Generate summary*
or *Execute* stage. This means that the passes are required
to communicate their decisions with each other.

To facilitate this communication, the GCC callgraph
infrastructure implements *virtual clones*, a method of
representing the changes performed by the optimization passes in
the callgraph without needing to update function bodies.

A *virtual clone* in the callgraph is a function that has no
associated body, just a description of how to create its body based
on a different function (which itself may be a virtual clone).

The description of function modifications includes adjustments to
the function's signature (which allows, for example, removing or
adding function arguments), substitutions to perform on the
function body, and, for inlined functions, a pointer to the
function that it will be inlined into.

It is also possible to redirect any edge of the callgraph from a
function to its virtual clone. This implies updating the call
site to adjust for the new function signature.

Most of the transformations performed by inter-procedural
optimizations can be represented via virtual clones. For
instance, a constant propagation pass can produce a virtual clone
of the function which replaces one of its arguments by a
constant. The inliner can represent its decisions by producing a
clone of a function whose body will be later integrated into
a given function.

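As a source-level illustration of the constant-propagation case (the
function names and the commentary below are only for exposition; the
clone exists in the callgraph long before any body like this is
created):

.. code-block:: c++

  /* Original function; suppose every caller passes 4 as FACTOR.  */
  static int
  scale (int x, int factor)
  {
    return x * factor;
  }

  /* IPA constant propagation can create a virtual clone of scale in
     which FACTOR is removed from the signature and replaced by the
     constant 4 in the body.  During WPA the clone is only a callgraph
     node plus that description; a body equivalent to the following is
     produced only when the clone is materialized at LTRANS time.  */
  static int
  scale_constprop (int x)
  {
    return x * 4;
  }

  /* Callgraph edges that used to point to scale are redirected to the
     clone, and each call site is adjusted for the new signature:
     scale (v, 4) becomes scale_constprop (v).  */
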
Using *virtual clones*, the program can be easily updated
during the *Execute* stage, solving most of the pass interaction
problems that would otherwise occur during *Transform*.

Virtual clones are later materialized in the LTRANS stage and
turned into real functions. Passes executed after the virtual
clone was introduced also perform their *Transform* stage
on the new functions, so for a pass there is no significant
difference between operating on a real function and a virtual
clone introduced before its *Execute* stage.

Optimization passes then work on virtual clones introduced before
their *Execute* stage as if they were real functions. The
only difference is that clones are not visible during the
*Generate summary* stage.

To keep function summaries updated, the callgraph interface
allows an optimizer to register a callback that is called every
time a new clone is introduced as well as when the actual
function or variable is generated or when a function or variable
is removed. These hooks are registered in the *Generate
summary* stage and allow the pass to keep its information intact
until the *Execute* stage. The same hooks can also be
registered during the *Execute* stage to keep the
optimization summaries updated for the *Transform* stage.

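A rough sketch of how such callbacks might be registered from a
*Generate summary* stage is shown below. The ``my_pass_*`` callbacks
and the summary bookkeeping they perform are hypothetical; the
``add_cgraph_*_hook`` registration functions are the hook interface of
the symbol table (see :samp:`cgraph.h` for the exact set of hooks,
which also includes insertion hooks for newly generated functions and
the corresponding varpool hooks).

.. code-block:: c++

  /* Called when NODE is duplicated into a (virtual) clone NODE2: copy or
     remap the pass's summary so the clone is covered at Execute time.  */
  static void
  my_pass_duplication_hook (cgraph_node *node, cgraph_node *node2, void *)
  {
    my_pass_copy_summary (node, node2);   /* hypothetical */
  }

  /* Called when NODE is removed from the callgraph.  */
  static void
  my_pass_removal_hook (cgraph_node *node, void *)
  {
    my_pass_release_summary (node);       /* hypothetical */
  }

  static void
  my_pass_generate_summary (void)
  {
    symtab->add_cgraph_duplication_hook (my_pass_duplication_hook, NULL);
    symtab->add_cgraph_removal_hook (my_pass_removal_hook, NULL);

    /* ... analyze function bodies and fill in the summaries ...  */
  }
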
IPA references
^^^^^^^^^^^^^^

GCC represents IPA references in the callgraph. For a function
or variable ``A``, the *IPA reference* is a list of all
locations where the address of ``A`` is taken and, when
``A`` is a variable, a list of all direct stores and reads
to/from ``A``. References represent an oriented multi-graph on
the union of nodes of the callgraph and the varpool. See
:samp:`ipa-reference.cc`: ``ipa_reference_write_optimization_summary``
and
:samp:`ipa-reference.cc`: ``ipa_reference_read_optimization_summary``
for details.

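As an illustration, a translation unit such as the following would,
roughly speaking, give the function node three references to the
varpool node of ``counter`` (the reference kinds named in the comments
are those of ``enum ipa_ref_use`` in :samp:`ipa-ref.h`; the exact set
of references depends on earlier optimizations):

.. code-block:: c++

  int counter;              /* represented by a varpool node */

  int *
  bump (void)
  {
    counter = counter + 1;  /* IPA_REF_LOAD and IPA_REF_STORE of counter */
    return &counter;        /* IPA_REF_ADDR: the address of counter is taken */
  }
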
Jump functions
^^^^^^^^^^^^^^

Suppose that an optimization pass sees a function ``A`` and it
knows the values of (some of) its arguments. The *jump
function* describes the value of a parameter of a given function
call in function ``A`` based on this knowledge.

Jump functions are used by several optimizations, such as the
inter-procedural constant propagation pass and the
devirtualization pass. The inliner also uses jump functions to
perform inlining of callbacks.
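
For example, in the (purely illustrative) code below the call to
``callee`` gives rise to one jump function per argument; the kinds
named in the comments are those of ``enum jump_func_type`` in
:samp:`ipa-prop.h`:

.. code-block:: c++

  struct S { int field; };

  static int
  callee (int x, int y, struct S *p)
  {
    return x + y + p->field;
  }

  int
  caller (int a, struct S *s)
  {
    /* The jump functions for this call site describe each argument of
       callee in terms of the environment of caller:
         x: pass-through of caller's parameter a  (IPA_JF_PASS_THROUGH)
         y: known constant 7                      (IPA_JF_CONST)
         p: pass-through of caller's parameter s  (IPA_JF_PASS_THROUGH)  */
    return callee (a, 7, s);
  }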