]>
Commit | Line | Data |
---|---|---|
7adcbafe | 1 | @c Copyright (C) 2004-2022 Free Software Foundation, Inc. |
6de9cd9a DN |
2 | @c This is part of the GCC manual. |
3 | @c For copying conditions, see the file gcc.texi. | |
4 | ||
5 | @c --------------------------------------------------------------------- | |
6 | @c Tree SSA | |
7 | @c --------------------------------------------------------------------- | |
8 | ||
9 | @node Tree SSA | |
e6c99067 | 10 | @chapter Analysis and Optimization of GIMPLE tuples |
6de9cd9a DN |
11 | @cindex Tree SSA |
12 | @cindex Optimization infrastructure for GIMPLE | |
13 | ||
14 | GCC uses three main intermediate languages to represent the program | |
15 | during compilation: GENERIC, GIMPLE and RTL@. GENERIC is a | |
16 | language-independent representation generated by each front end. It | |
17 | is used to serve as an interface between the parser and optimizer. | |
18 | GENERIC is a common representation that is able to represent programs | |
19 | written in all the languages supported by GCC@. | |
20 | ||
21 | GIMPLE and RTL are used to optimize the program. GIMPLE is used for | |
22 | target and language independent optimizations (e.g., inlining, | |
23 | constant propagation, tail call elimination, redundancy elimination, | |
24 | etc). Much like GENERIC, GIMPLE is a language independent, tree based | |
8a36672b | 25 | representation. However, it differs from GENERIC in that the GIMPLE |
6de9cd9a DN |
26 | grammar is more restrictive: expressions contain no more than 3 |
27 | operands (except function calls), it has no control flow structures | |
df18c24a | 28 | and expressions with side effects are only allowed on the right hand |
6de9cd9a DN |
29 | side of assignments. See the chapter describing GENERIC and GIMPLE |
30 | for more details. | |
31 | ||
32 | This chapter describes the data structures and functions used in the | |
33 | GIMPLE optimizers (also known as ``tree optimizers'' or ``middle | |
34 | end''). In particular, it focuses on all the macros, data structures, | |
35 | functions and programming constructs needed to implement optimization | |
36 | passes for GIMPLE@. | |
37 | ||
38 | @menu | |
e6c99067 | 39 | * Annotations:: Attributes for variables. |
02a9370c | 40 | * SSA Operands:: SSA names referenced by GIMPLE statements. |
6ccde948 RW |
41 | * SSA:: Static Single Assignment representation. |
42 | * Alias analysis:: Representing aliased loads and stores. | |
4d7a65ea | 43 | * Memory model:: Memory model used by the middle-end. |
6de9cd9a DN |
44 | @end menu |
45 | ||
6de9cd9a DN |
46 | @node Annotations |
47 | @section Annotations | |
48 | @cindex annotations | |
49 | ||
e6c99067 DN |
50 | The optimizers need to associate attributes with variables during the |
51 | optimization process. For instance, we need to know whether a | |
52 | variable has aliases. All these attributes are stored in data | |
53 | structures called annotations which are then linked to the field | |
54 | @code{ann} in @code{struct tree_common}. | |
6de9cd9a | 55 | |
6de9cd9a | 56 | |
e6c99067 DN |
57 | @node SSA Operands |
58 | @section SSA Operands | |
6de9cd9a DN |
59 | @cindex operands |
60 | @cindex virtual operands | |
61 | @cindex real operands | |
43ae1e1c | 62 | @findex update_stmt |
6de9cd9a DN |
63 | |
64 | Almost every GIMPLE statement will contain a reference to a variable | |
65 | or memory location. Since statements come in different shapes and | |
66 | sizes, their operands are going to be located at various spots inside | |
67 | the statement's tree. To facilitate access to the statement's | |
f47c96aa AM |
68 | operands, they are organized into lists associated inside each |
69 | statement's annotation. Each element in an operand list is a pointer | |
6de9cd9a DN |
70 | to a @code{VAR_DECL}, @code{PARM_DECL} or @code{SSA_NAME} tree node. |
71 | This provides a very convenient way of examining and replacing | |
f0eb93a8 | 72 | operands. |
6de9cd9a DN |
73 | |
74 | Data flow analysis and optimization is done on all tree nodes | |
75 | representing variables. Any node for which @code{SSA_VAR_P} returns | |
76 | nonzero is considered when scanning statement operands. However, not | |
77 | all @code{SSA_VAR_P} variables are processed in the same way. For the | |
78 | purposes of optimization, we need to distinguish between references to | |
79 | local scalar variables and references to globals, statics, structures, | |
80 | arrays, aliased variables, etc. The reason is simple, the compiler | |
81 | can gather complete data flow information for a local scalar. On the | |
82 | other hand, a global variable may be modified by a function call, it | |
83 | may not be possible to keep track of all the elements of an array or | |
84 | the fields of a structure, etc. | |
85 | ||
86 | The operand scanner gathers two kinds of operands: @dfn{real} and | |
87 | @dfn{virtual}. An operand for which @code{is_gimple_reg} returns true | |
88 | is considered real, otherwise it is a virtual operand. We also | |
89 | distinguish between uses and definitions. An operand is used if its | |
90 | value is loaded by the statement (e.g., the operand at the RHS of an | |
91 | assignment). If the statement assigns a new value to the operand, the | |
92 | operand is considered a definition (e.g., the operand at the LHS of | |
93 | an assignment). | |
94 | ||
95 | Virtual and real operands also have very different data flow | |
96 | properties. Real operands are unambiguous references to the | |
97 | full object that they represent. For instance, given | |
98 | ||
99 | @smallexample | |
100 | @{ | |
101 | int a, b; | |
102 | a = b | |
103 | @} | |
104 | @end smallexample | |
105 | ||
106 | Since @code{a} and @code{b} are non-aliased locals, the statement | |
107 | @code{a = b} will have one real definition and one real use because | |
f133d4a2 EF |
108 | variable @code{a} is completely modified with the contents of |
109 | variable @code{b}. Real definition are also known as @dfn{killing | |
110 | definitions}. Similarly, the use of @code{b} reads all its bits. | |
6de9cd9a | 111 | |
a32b97a2 | 112 | In contrast, virtual operands are used with variables that can have |
8a36672b JM |
113 | a partial or ambiguous reference. This includes structures, arrays, |
114 | globals, and aliased variables. In these cases, we have two types of | |
115 | definitions. For globals, structures, and arrays, we can determine from | |
f0eb93a8 | 116 | a statement whether a variable of these types has a killing definition. |
a32b97a2 | 117 | If the variable does, then the statement is marked as having a |
8a36672b | 118 | @dfn{must definition} of that variable. However, if a statement is only |
431ae0bf | 119 | defining a part of the variable (i.e.@: a field in a structure), or if we |
a32b97a2 | 120 | know that a statement might define the variable but we cannot say for sure, |
f0eb93a8 | 121 | then we mark that statement as having a @dfn{may definition}. For |
a32b97a2 | 122 | instance, given |
6de9cd9a DN |
123 | |
124 | @smallexample | |
125 | @{ | |
126 | int a, b, *p; | |
127 | ||
923158be | 128 | if (@dots{}) |
6de9cd9a DN |
129 | p = &a; |
130 | else | |
131 | p = &b; | |
132 | *p = 5; | |
133 | return *p; | |
134 | @} | |
135 | @end smallexample | |
136 | ||
137 | The assignment @code{*p = 5} may be a definition of @code{a} or | |
138 | @code{b}. If we cannot determine statically where @code{p} is | |
139 | pointing to at the time of the store operation, we create virtual | |
140 | definitions to mark that statement as a potential definition site for | |
141 | @code{a} and @code{b}. Memory loads are similarly marked with virtual | |
142 | use operands. Virtual operands are shown in tree dumps right before | |
143 | the statement that contains them. To request a tree dump with virtual | |
144 | operands, use the @option{-vops} option to @option{-fdump-tree}: | |
145 | ||
146 | @smallexample | |
147 | @{ | |
148 | int a, b, *p; | |
149 | ||
923158be | 150 | if (@dots{}) |
6de9cd9a DN |
151 | p = &a; |
152 | else | |
153 | p = &b; | |
38635499 DN |
154 | # a = VDEF <a> |
155 | # b = VDEF <b> | |
6de9cd9a DN |
156 | *p = 5; |
157 | ||
158 | # VUSE <a> | |
159 | # VUSE <b> | |
160 | return *p; | |
161 | @} | |
162 | @end smallexample | |
163 | ||
38635499 | 164 | Notice that @code{VDEF} operands have two copies of the referenced |
6de9cd9a DN |
165 | variable. This indicates that this is not a killing definition of |
166 | that variable. In this case we refer to it as a @dfn{may definition} | |
167 | or @dfn{aliased store}. The presence of the second copy of the | |
38635499 | 168 | variable in the @code{VDEF} operand will become important when the |
6de9cd9a DN |
169 | function is converted into SSA form. This will be used to link all |
170 | the non-killing definitions to prevent optimizations from making | |
171 | incorrect assumptions about them. | |
172 | ||
1e6a5d3c KH |
173 | Operands are updated as soon as the statement is finished via a call |
174 | to @code{update_stmt}. If statement elements are changed via | |
175 | @code{SET_USE} or @code{SET_DEF}, then no further action is required | |
1eaf20ec | 176 | (i.e., those macros take care of updating the statement). If changes |
1e6a5d3c | 177 | are made by manipulating the statement's tree directly, then a call |
c3689350 | 178 | must be made to @code{update_stmt} when complete. Calling one of the |
1e6a5d3c KH |
179 | @code{bsi_insert} routines or @code{bsi_replace} performs an implicit |
180 | call to @code{update_stmt}. | |
6de9cd9a | 181 | |
f47c96aa | 182 | @subsection Operand Iterators And Access Routines |
ff2ce160 | 183 | @cindex Operand Iterators |
f47c96aa AM |
184 | @cindex Operand Access Routines |
185 | ||
186 | Operands are collected by @file{tree-ssa-operands.c}. They are stored | |
187 | inside each statement's annotation and can be accessed through either the | |
188 | operand iterators or an access routine. | |
189 | ||
190 | The following access routines are available for examining operands: | |
191 | ||
192 | @enumerate | |
ff2ce160 MS |
193 | @item @code{SINGLE_SSA_@{USE,DEF,TREE@}_OPERAND}: These accessors will return |
194 | NULL unless there is exactly one operand matching the specified flags. If | |
195 | there is exactly one operand, the operand is returned as either a @code{tree}, | |
f47c96aa AM |
196 | @code{def_operand_p}, or @code{use_operand_p}. |
197 | ||
198 | @smallexample | |
199 | tree t = SINGLE_SSA_TREE_OPERAND (stmt, flags); | |
200 | use_operand_p u = SINGLE_SSA_USE_OPERAND (stmt, SSA_ALL_VIRTUAL_USES); | |
201 | def_operand_p d = SINGLE_SSA_DEF_OPERAND (stmt, SSA_OP_ALL_DEFS); | |
202 | @end smallexample | |
203 | ||
ff2ce160 | 204 | @item @code{ZERO_SSA_OPERANDS}: This macro returns true if there are no |
f47c96aa AM |
205 | operands matching the specified flags. |
206 | ||
207 | @smallexample | |
208 | if (ZERO_SSA_OPERANDS (stmt, SSA_OP_ALL_VIRTUALS)) | |
209 | return; | |
210 | @end smallexample | |
aca2bd7c | 211 | |
ff2ce160 MS |
212 | @item @code{NUM_SSA_OPERANDS}: This macro Returns the number of operands |
213 | matching 'flags'. This actually executes a loop to perform the count, so | |
f47c96aa AM |
214 | only use this if it is really needed. |
215 | ||
216 | @smallexample | |
217 | int count = NUM_SSA_OPERANDS (stmt, flags) | |
218 | @end smallexample | |
219 | @end enumerate | |
220 | ||
221 | ||
222 | If you wish to iterate over some or all operands, use the | |
223 | @code{FOR_EACH_SSA_@{USE,DEF,TREE@}_OPERAND} iterator. For example, to print | |
224 | all the operands for a statement: | |
aca2bd7c AM |
225 | |
226 | @smallexample | |
227 | void | |
228 | print_ops (tree stmt) | |
229 | @{ | |
230 | ssa_op_iter; | |
231 | tree var; | |
232 | ||
aca2bd7c | 233 | FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, SSA_OP_ALL_OPERANDS) |
f47c96aa | 234 | print_generic_expr (stderr, var, TDF_SLIM); |
aca2bd7c AM |
235 | @} |
236 | @end smallexample | |
237 | ||
238 | ||
f47c96aa AM |
239 | How to choose the appropriate iterator: |
240 | ||
aca2bd7c AM |
241 | @enumerate |
242 | @item Determine whether you are need to see the operand pointers, or just the | |
6ccde948 | 243 | trees, and choose the appropriate macro: |
aca2bd7c AM |
244 | |
245 | @smallexample | |
f0eb93a8 JM |
246 | Need Macro: |
247 | ---- ------- | |
248 | use_operand_p FOR_EACH_SSA_USE_OPERAND | |
249 | def_operand_p FOR_EACH_SSA_DEF_OPERAND | |
250 | tree FOR_EACH_SSA_TREE_OPERAND | |
aca2bd7c AM |
251 | @end smallexample |
252 | ||
253 | @item You need to declare a variable of the type you are interested | |
6ccde948 RW |
254 | in, and an ssa_op_iter structure which serves as the loop controlling |
255 | variable. | |
aca2bd7c AM |
256 | |
257 | @item Determine which operands you wish to use, and specify the flags of | |
6ccde948 RW |
258 | those you are interested in. They are documented in |
259 | @file{tree-ssa-operands.h}: | |
aca2bd7c AM |
260 | |
261 | @smallexample | |
12bcfaa1 JM |
262 | #define SSA_OP_USE 0x01 /* @r{Real USE operands.} */ |
263 | #define SSA_OP_DEF 0x02 /* @r{Real DEF operands.} */ | |
264 | #define SSA_OP_VUSE 0x04 /* @r{VUSE operands.} */ | |
50e6a148 | 265 | #define SSA_OP_VDEF 0x08 /* @r{VDEF operands.} */ |
12bcfaa1 JM |
266 | |
267 | /* @r{These are commonly grouped operand flags.} */ | |
50e6a148 AH |
268 | #define SSA_OP_VIRTUAL_USES (SSA_OP_VUSE) |
269 | #define SSA_OP_VIRTUAL_DEFS (SSA_OP_VDEF) | |
270 | #define SSA_OP_ALL_VIRTUALS (SSA_OP_VIRTUAL_USES | SSA_OP_VIRTUAL_DEFS) | |
271 | #define SSA_OP_ALL_USES (SSA_OP_VIRTUAL_USES | SSA_OP_USE) | |
272 | #define SSA_OP_ALL_DEFS (SSA_OP_VIRTUAL_DEFS | SSA_OP_DEF) | |
273 | #define SSA_OP_ALL_OPERANDS (SSA_OP_ALL_USES | SSA_OP_ALL_DEFS) | |
aca2bd7c AM |
274 | @end smallexample |
275 | @end enumerate | |
276 | ||
277 | So if you want to look at the use pointers for all the @code{USE} and | |
f0eb93a8 | 278 | @code{VUSE} operands, you would do something like: |
aca2bd7c AM |
279 | |
280 | @smallexample | |
f0eb93a8 JM |
281 | use_operand_p use_p; |
282 | ssa_op_iter iter; | |
aca2bd7c AM |
283 | |
284 | FOR_EACH_SSA_USE_OPERAND (use_p, stmt, iter, (SSA_OP_USE | SSA_OP_VUSE)) | |
f0eb93a8 JM |
285 | @{ |
286 | process_use_ptr (use_p); | |
287 | @} | |
aca2bd7c AM |
288 | @end smallexample |
289 | ||
f47c96aa | 290 | The @code{TREE} macro is basically the same as the @code{USE} and |
aca2bd7c | 291 | @code{DEF} macros, only with the use or def dereferenced via |
8a36672b | 292 | @code{USE_FROM_PTR (use_p)} and @code{DEF_FROM_PTR (def_p)}. Since we |
f0eb93a8 | 293 | aren't using operand pointers, use and defs flags can be mixed. |
aca2bd7c AM |
294 | |
295 | @smallexample | |
296 | tree var; | |
297 | ssa_op_iter iter; | |
298 | ||
e9dc9c4e | 299 | FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, SSA_OP_VUSE) |
aca2bd7c AM |
300 | @{ |
301 | print_generic_expr (stderr, var, TDF_SLIM); | |
302 | @} | |
303 | @end smallexample | |
304 | ||
38635499 | 305 | @code{VDEF}s are broken into two flags, one for the |
e9dc9c4e | 306 | @code{DEF} portion (@code{SSA_OP_VDEF}) and one for the USE portion |
50e6a148 | 307 | (@code{SSA_OP_VUSE}). |
aca2bd7c | 308 | |
50e6a148 AH |
309 | There are many examples in the code, in addition to the documentation |
310 | in @file{tree-ssa-operands.h} and @file{ssa-iterators.h}. | |
aca2bd7c | 311 | |
f47c96aa AM |
312 | There are also a couple of variants on the stmt iterators regarding PHI |
313 | nodes. | |
314 | ||
ff2ce160 MS |
315 | @code{FOR_EACH_PHI_ARG} Works exactly like |
316 | @code{FOR_EACH_SSA_USE_OPERAND}, except it works over @code{PHI} arguments | |
f47c96aa AM |
317 | instead of statement operands. |
318 | ||
319 | @smallexample | |
320 | /* Look at every virtual PHI use. */ | |
321 | FOR_EACH_PHI_ARG (use_p, phi_stmt, iter, SSA_OP_VIRTUAL_USES) | |
322 | @{ | |
323 | my_code; | |
324 | @} | |
325 | ||
326 | /* Look at every real PHI use. */ | |
327 | FOR_EACH_PHI_ARG (use_p, phi_stmt, iter, SSA_OP_USES) | |
328 | my_code; | |
329 | ||
d1facce0 | 330 | /* Look at every PHI use. */ |
f47c96aa AM |
331 | FOR_EACH_PHI_ARG (use_p, phi_stmt, iter, SSA_OP_ALL_USES) |
332 | my_code; | |
333 | @end smallexample | |
334 | ||
ff2ce160 | 335 | @code{FOR_EACH_PHI_OR_STMT_@{USE,DEF@}} works exactly like |
f47c96aa AM |
336 | @code{FOR_EACH_SSA_@{USE,DEF@}_OPERAND}, except it will function on |
337 | either a statement or a @code{PHI} node. These should be used when it is | |
ff2ce160 | 338 | appropriate but they are not quite as efficient as the individual |
f47c96aa AM |
339 | @code{FOR_EACH_PHI} and @code{FOR_EACH_SSA} routines. |
340 | ||
341 | @smallexample | |
342 | FOR_EACH_PHI_OR_STMT_USE (use_operand_p, stmt, iter, flags) | |
343 | @{ | |
344 | my_code; | |
345 | @} | |
346 | ||
347 | FOR_EACH_PHI_OR_STMT_DEF (def_operand_p, phi, iter, flags) | |
348 | @{ | |
349 | my_code; | |
350 | @} | |
351 | @end smallexample | |
6de9cd9a | 352 | |
43ae1e1c AM |
353 | @subsection Immediate Uses |
354 | @cindex Immediate Uses | |
355 | ||
ff2ce160 | 356 | Immediate use information is now always available. Using the immediate use |
43ae1e1c | 357 | iterators, you may examine every use of any @code{SSA_NAME}. For instance, |
6c00f606 AM |
358 | to change each use of @code{ssa_var} to @code{ssa_var2} and call fold_stmt on |
359 | each stmt after that is done: | |
43ae1e1c AM |
360 | |
361 | @smallexample | |
c3689350 AM |
362 | use_operand_p imm_use_p; |
363 | imm_use_iterator iterator; | |
6c00f606 | 364 | tree ssa_var, stmt; |
c3689350 | 365 | |
6c00f606 AM |
366 | |
367 | FOR_EACH_IMM_USE_STMT (stmt, iterator, ssa_var) | |
368 | @{ | |
369 | FOR_EACH_IMM_USE_ON_STMT (imm_use_p, iterator) | |
370 | SET_USE (imm_use_p, ssa_var_2); | |
371 | fold_stmt (stmt); | |
372 | @} | |
43ae1e1c AM |
373 | @end smallexample |
374 | ||
1eaf20ec RW |
375 | There are 2 iterators which can be used. @code{FOR_EACH_IMM_USE_FAST} is |
376 | used when the immediate uses are not changed, i.e., you are looking at the | |
ff2ce160 | 377 | uses, but not setting them. |
43ae1e1c | 378 | |
ff2ce160 MS |
379 | If they do get changed, then care must be taken that things are not changed |
380 | under the iterators, so use the @code{FOR_EACH_IMM_USE_STMT} and | |
381 | @code{FOR_EACH_IMM_USE_ON_STMT} iterators. They attempt to preserve the | |
382 | sanity of the use list by moving all the uses for a statement into | |
383 | a controlled position, and then iterating over those uses. Then the | |
6c00f606 | 384 | optimization can manipulate the stmt when all the uses have been |
ff2ce160 MS |
385 | processed. This is a little slower than the FAST version since it adds a |
386 | placeholder element and must sort through the list a bit for each statement. | |
387 | This placeholder element must be also be removed if the loop is | |
640296c3 AO |
388 | terminated early; a destructor takes care of that when leaving the |
389 | @code{FOR_EACH_IMM_USE_STMT} scope. | |
43ae1e1c AM |
390 | |
391 | There are checks in @code{verify_ssa} which verify that the immediate use list | |
ff2ce160 MS |
392 | is up to date, as well as checking that an optimization didn't break from the |
393 | loop without using this macro. It is safe to simply 'break'; from a | |
43ae1e1c AM |
394 | @code{FOR_EACH_IMM_USE_FAST} traverse. |
395 | ||
396 | Some useful functions and macros: | |
397 | @enumerate | |
398 | @item @code{has_zero_uses (ssa_var)} : Returns true if there are no uses of | |
399 | @code{ssa_var}. | |
ff2ce160 | 400 | @item @code{has_single_use (ssa_var)} : Returns true if there is only a |
43ae1e1c AM |
401 | single use of @code{ssa_var}. |
402 | @item @code{single_imm_use (ssa_var, use_operand_p *ptr, tree *stmt)} : | |
403 | Returns true if there is only a single use of @code{ssa_var}, and also returns | |
d1facce0 | 404 | the use pointer and statement it occurs in, in the second and third parameters. |
43ae1e1c | 405 | @item @code{num_imm_uses (ssa_var)} : Returns the number of immediate uses of |
c3689350 | 406 | @code{ssa_var}. It is better not to use this if possible since it simply |
43ae1e1c AM |
407 | utilizes a loop to count the uses. |
408 | @item @code{PHI_ARG_INDEX_FROM_USE (use_p)} : Given a use within a @code{PHI} | |
409 | node, return the index number for the use. An assert is triggered if the use | |
410 | isn't located in a @code{PHI} node. | |
c3689350 | 411 | @item @code{USE_STMT (use_p)} : Return the statement a use occurs in. |
43ae1e1c AM |
412 | @end enumerate |
413 | ||
414 | Note that uses are not put into an immediate use list until their statement is | |
ff2ce160 | 415 | actually inserted into the instruction stream via a @code{bsi_*} routine. |
43ae1e1c | 416 | |
ff2ce160 MS |
417 | It is also still possible to utilize lazy updating of statements, but this |
418 | should be used only when absolutely required. Both alias analysis and the | |
419 | dominator optimizations currently do this. | |
43ae1e1c | 420 | |
ff2ce160 | 421 | When lazy updating is being used, the immediate use information is out of date |
c3689350 | 422 | and cannot be used reliably. Lazy updating is achieved by simply marking |
98c39652 | 423 | statements modified via calls to @code{gimple_set_modified} instead of |
ff2ce160 MS |
424 | @code{update_stmt}. When lazy updating is no longer required, all the |
425 | modified statements must have @code{update_stmt} called in order to bring them | |
426 | up to date. This must be done before the optimization is finished, or | |
c3689350 | 427 | @code{verify_ssa} will trigger an abort. |
43ae1e1c AM |
428 | |
429 | This is done with a simple loop over the instruction stream: | |
430 | @smallexample | |
431 | block_stmt_iterator bsi; | |
432 | basic_block bb; | |
433 | FOR_EACH_BB (bb) | |
434 | @{ | |
435 | for (bsi = bsi_start (bb); !bsi_end_p (bsi); bsi_next (&bsi)) | |
436 | update_stmt_if_modified (bsi_stmt (bsi)); | |
437 | @} | |
438 | @end smallexample | |
439 | ||
6de9cd9a DN |
440 | @node SSA |
441 | @section Static Single Assignment | |
442 | @cindex SSA | |
443 | @cindex static single assignment | |
444 | ||
445 | Most of the tree optimizers rely on the data flow information provided | |
446 | by the Static Single Assignment (SSA) form. We implement the SSA form | |
447 | as described in @cite{R. Cytron, J. Ferrante, B. Rosen, M. Wegman, and | |
8a36672b JM |
448 | K. Zadeck. Efficiently Computing Static Single Assignment Form and the |
449 | Control Dependence Graph. ACM Transactions on Programming Languages | |
6de9cd9a DN |
450 | and Systems, 13(4):451-490, October 1991}. |
451 | ||
452 | The SSA form is based on the premise that program variables are | |
453 | assigned in exactly one location in the program. Multiple assignments | |
454 | to the same variable create new versions of that variable. Naturally, | |
455 | actual programs are seldom in SSA form initially because variables | |
456 | tend to be assigned multiple times. The compiler modifies the program | |
457 | representation so that every time a variable is assigned in the code, | |
458 | a new version of the variable is created. Different versions of the | |
459 | same variable are distinguished by subscripting the variable name with | |
460 | its version number. Variables used in the right-hand side of | |
461 | expressions are renamed so that their version number matches that of | |
462 | the most recent assignment. | |
463 | ||
464 | We represent variable versions using @code{SSA_NAME} nodes. The | |
465 | renaming process in @file{tree-ssa.c} wraps every real and | |
466 | virtual operand with an @code{SSA_NAME} node which contains | |
467 | the version number and the statement that created the | |
468 | @code{SSA_NAME}. Only definitions and virtual definitions may | |
469 | create new @code{SSA_NAME} nodes. | |
470 | ||
b814cc0a BF |
471 | @cindex PHI nodes |
472 | Sometimes, flow of control makes it impossible to determine the | |
6de9cd9a DN |
473 | most recent version of a variable. In these cases, the compiler |
474 | inserts an artificial definition for that variable called | |
475 | @dfn{PHI function} or @dfn{PHI node}. This new definition merges | |
476 | all the incoming versions of the variable to create a new name | |
477 | for it. For instance, | |
478 | ||
479 | @smallexample | |
923158be | 480 | if (@dots{}) |
6de9cd9a | 481 | a_1 = 5; |
923158be | 482 | else if (@dots{}) |
6de9cd9a DN |
483 | a_2 = 2; |
484 | else | |
485 | a_3 = 13; | |
486 | ||
487 | # a_4 = PHI <a_1, a_2, a_3> | |
488 | return a_4; | |
489 | @end smallexample | |
490 | ||
491 | Since it is not possible to determine which of the three branches | |
492 | will be taken at runtime, we don't know which of @code{a_1}, | |
493 | @code{a_2} or @code{a_3} to use at the return statement. So, the | |
494 | SSA renamer creates a new version @code{a_4} which is assigned | |
495 | the result of ``merging'' @code{a_1}, @code{a_2} and @code{a_3}. | |
496 | Hence, PHI nodes mean ``one of these operands. I don't know | |
497 | which''. | |
498 | ||
c960732f | 499 | The following functions can be used to examine PHI nodes |
6de9cd9a | 500 | |
c960732f | 501 | @defun gimple_phi_result (@var{phi}) |
6de9cd9a | 502 | Returns the @code{SSA_NAME} created by PHI node @var{phi} (i.e., |
8a36672b | 503 | @var{phi}'s LHS)@. |
c960732f | 504 | @end defun |
6de9cd9a | 505 | |
c960732f | 506 | @defun gimple_phi_num_args (@var{phi}) |
6de9cd9a DN |
507 | Returns the number of arguments in @var{phi}. This number is exactly |
508 | the number of incoming edges to the basic block holding @var{phi}@. | |
c960732f | 509 | @end defun |
6de9cd9a | 510 | |
c960732f ML |
511 | @defun gimple_phi_arg (@var{phi}, @var{i}) |
512 | Returns @var{i}th argument of @var{phi}@. | |
513 | @end defun | |
6de9cd9a | 514 | |
c960732f | 515 | @defun gimple_phi_arg_edge (@var{phi}, @var{i}) |
6de9cd9a | 516 | Returns the incoming edge for the @var{i}th argument of @var{phi}. |
c960732f | 517 | @end defun |
6de9cd9a | 518 | |
c960732f | 519 | @defun gimple_phi_arg_def (@var{phi}, @var{i}) |
6de9cd9a | 520 | Returns the @code{SSA_NAME} for the @var{i}th argument of @var{phi}. |
c960732f | 521 | @end defun |
6de9cd9a DN |
522 | |
523 | ||
524 | @subsection Preserving the SSA form | |
0bca51f0 | 525 | @findex update_ssa |
6de9cd9a DN |
526 | @cindex preserving SSA form |
527 | Some optimization passes make changes to the function that | |
528 | invalidate the SSA property. This can happen when a pass has | |
0bca51f0 DN |
529 | added new symbols or changed the program so that variables that |
530 | were previously aliased aren't anymore. Whenever something like this | |
ff2ce160 | 531 | happens, the affected symbols must be renamed into SSA form again. |
0bca51f0 DN |
532 | Transformations that emit new code or replicate existing statements |
533 | will also need to update the SSA form@. | |
534 | ||
535 | Since GCC implements two different SSA forms for register and virtual | |
536 | variables, keeping the SSA form up to date depends on whether you are | |
537 | updating register or virtual names. In both cases, the general idea | |
538 | behind incremental SSA updates is similar: when new SSA names are | |
539 | created, they typically are meant to replace other existing names in | |
540 | the program@. | |
541 | ||
542 | For instance, given the following code: | |
543 | ||
544 | @smallexample | |
6ccde948 RW |
545 | 1 L0: |
546 | 2 x_1 = PHI (0, x_5) | |
547 | 3 if (x_1 < 10) | |
548 | 4 if (x_1 > 7) | |
549 | 5 y_2 = 0 | |
550 | 6 else | |
551 | 7 y_3 = x_1 + x_7 | |
552 | 8 endif | |
553 | 9 x_5 = x_1 + 1 | |
0bca51f0 | 554 | 10 goto L0; |
6ccde948 | 555 | 11 endif |
0bca51f0 DN |
556 | @end smallexample |
557 | ||
558 | Suppose that we insert new names @code{x_10} and @code{x_11} (lines | |
559 | @code{4} and @code{8})@. | |
560 | ||
561 | @smallexample | |
6ccde948 RW |
562 | 1 L0: |
563 | 2 x_1 = PHI (0, x_5) | |
564 | 3 if (x_1 < 10) | |
565 | 4 x_10 = @dots{} | |
566 | 5 if (x_1 > 7) | |
567 | 6 y_2 = 0 | |
568 | 7 else | |
569 | 8 x_11 = @dots{} | |
570 | 9 y_3 = x_1 + x_7 | |
571 | 10 endif | |
572 | 11 x_5 = x_1 + 1 | |
573 | 12 goto L0; | |
574 | 13 endif | |
0bca51f0 DN |
575 | @end smallexample |
576 | ||
577 | We want to replace all the uses of @code{x_1} with the new definitions | |
578 | of @code{x_10} and @code{x_11}. Note that the only uses that should | |
579 | be replaced are those at lines @code{5}, @code{9} and @code{11}. | |
580 | Also, the use of @code{x_7} at line @code{9} should @emph{not} be | |
581 | replaced (this is why we cannot just mark symbol @code{x} for | |
582 | renaming)@. | |
583 | ||
584 | Additionally, we may need to insert a PHI node at line @code{11} | |
585 | because that is a merge point for @code{x_10} and @code{x_11}. So the | |
586 | use of @code{x_1} at line @code{11} will be replaced with the new PHI | |
587 | node. The insertion of PHI nodes is optional. They are not strictly | |
588 | necessary to preserve the SSA form, and depending on what the caller | |
589 | inserted, they may not even be useful for the optimizers@. | |
590 | ||
591 | Updating the SSA form is a two step process. First, the pass has to | |
592 | identify which names need to be updated and/or which symbols need to | |
593 | be renamed into SSA form for the first time. When new names are | |
594 | introduced to replace existing names in the program, the mapping | |
595 | between the old and the new names are registered by calling | |
596 | @code{register_new_name_mapping} (note that if your pass creates new | |
597 | code by duplicating basic blocks, the call to @code{tree_duplicate_bb} | |
1491b564 | 598 | will set up the necessary mappings automatically). |
0bca51f0 DN |
599 | |
600 | After the replacement mappings have been registered and new symbols | |
601 | marked for renaming, a call to @code{update_ssa} makes the registered | |
602 | changes. This can be done with an explicit call or by creating | |
603 | @code{TODO} flags in the @code{tree_opt_pass} structure for your pass. | |
0fa2e4df | 604 | There are several @code{TODO} flags that control the behavior of |
0bca51f0 DN |
605 | @code{update_ssa}: |
606 | ||
607 | @itemize @bullet | |
608 | @item @code{TODO_update_ssa}. Update the SSA form inserting PHI nodes | |
6ccde948 RW |
609 | for newly exposed symbols and virtual names marked for updating. |
610 | When updating real names, only insert PHI nodes for a real name | |
611 | @code{O_j} in blocks reached by all the new and old definitions for | |
612 | @code{O_j}. If the iterated dominance frontier for @code{O_j} | |
613 | is not pruned, we may end up inserting PHI nodes in blocks that | |
614 | have one or more edges with no incoming definition for | |
615 | @code{O_j}. This would lead to uninitialized warnings for | |
616 | @code{O_j}'s symbol@. | |
0bca51f0 DN |
617 | |
618 | @item @code{TODO_update_ssa_no_phi}. Update the SSA form without | |
6ccde948 RW |
619 | inserting any new PHI nodes at all. This is used by passes that |
620 | have either inserted all the PHI nodes themselves or passes that | |
621 | need only to patch use-def and def-def chains for virtuals | |
622 | (e.g., DCE)@. | |
0bca51f0 DN |
623 | |
624 | ||
625 | @item @code{TODO_update_ssa_full_phi}. Insert PHI nodes everywhere | |
6ccde948 RW |
626 | they are needed. No pruning of the IDF is done. This is used |
627 | by passes that need the PHI nodes for @code{O_j} even if it | |
628 | means that some arguments will come from the default definition | |
629 | of @code{O_j}'s symbol (e.g., @code{pass_linear_transform})@. | |
0bca51f0 | 630 | |
6ccde948 RW |
631 | WARNING: If you need to use this flag, chances are that your |
632 | pass may be doing something wrong. Inserting PHI nodes for an | |
633 | old name where not all edges carry a new replacement may lead to | |
634 | silent codegen errors or spurious uninitialized warnings@. | |
0bca51f0 DN |
635 | |
636 | @item @code{TODO_update_ssa_only_virtuals}. Passes that update the | |
6ccde948 RW |
637 | SSA form on their own may want to delegate the updating of |
638 | virtual names to the generic updater. Since FUD chains are | |
639 | easier to maintain, this simplifies the work they need to do. | |
640 | NOTE: If this flag is used, any OLD->NEW mappings for real names | |
641 | are explicitly destroyed and only the symbols marked for | |
642 | renaming are processed@. | |
0bca51f0 DN |
643 | @end itemize |
644 | ||
6de9cd9a DN |
645 | @subsection Examining @code{SSA_NAME} nodes |
646 | @cindex examining SSA_NAMEs | |
647 | ||
648 | The following macros can be used to examine @code{SSA_NAME} nodes | |
649 | ||
650 | @defmac SSA_NAME_DEF_STMT (@var{var}) | |
651 | Returns the statement @var{s} that creates the @code{SSA_NAME} | |
652 | @var{var}. If @var{s} is an empty statement (i.e., @code{IS_EMPTY_STMT | |
653 | (@var{s})} returns @code{true}), it means that the first reference to | |
654 | this variable is a USE or a VUSE@. | |
655 | @end defmac | |
656 | ||
657 | @defmac SSA_NAME_VERSION (@var{var}) | |
658 | Returns the version number of the @code{SSA_NAME} object @var{var}. | |
659 | @end defmac | |
660 | ||
661 | ||
6de9cd9a DN |
662 | @subsection Walking the dominator tree |
663 | ||
664 | @deftypefn {Tree SSA function} void walk_dominator_tree (@var{walk_data}, @var{bb}) | |
665 | ||
666 | This function walks the dominator tree for the current CFG calling a | |
667 | set of callback functions defined in @var{struct dom_walk_data} in | |
668 | @file{domwalk.h}. The call back functions you need to define give you | |
669 | hooks to execute custom code at various points during traversal: | |
670 | ||
671 | @enumerate | |
672 | @item Once to initialize any local data needed while processing | |
6ccde948 RW |
673 | @var{bb} and its children. This local data is pushed into an |
674 | internal stack which is automatically pushed and popped as the | |
675 | walker traverses the dominator tree. | |
6de9cd9a DN |
676 | |
677 | @item Once before traversing all the statements in the @var{bb}. | |
678 | ||
679 | @item Once for every statement inside @var{bb}. | |
680 | ||
681 | @item Once after traversing all the statements and before recursing | |
6ccde948 | 682 | into @var{bb}'s dominator children. |
6de9cd9a DN |
683 | |
684 | @item It then recurses into all the dominator children of @var{bb}. | |
685 | ||
686 | @item After recursing into all the dominator children of @var{bb} it | |
6ccde948 RW |
687 | can, optionally, traverse every statement in @var{bb} again |
688 | (i.e., repeating steps 2 and 3). | |
6de9cd9a DN |
689 | |
690 | @item Once after walking the statements in @var{bb} and @var{bb}'s | |
6ccde948 RW |
691 | dominator children. At this stage, the block local data stack |
692 | is popped. | |
6de9cd9a DN |
693 | @end enumerate |
694 | @end deftypefn | |
695 | ||
696 | @node Alias analysis | |
697 | @section Alias analysis | |
698 | @cindex alias | |
699 | @cindex flow-sensitive alias analysis | |
700 | @cindex flow-insensitive alias analysis | |
701 | ||
5006671f RG |
702 | Alias analysis in GIMPLE SSA form consists of two pieces. First |
703 | the virtual SSA web ties conflicting memory accesses and provides | |
704 | a SSA use-def chain and SSA immediate-use chains for walking | |
705 | possibly dependent memory accesses. Second an alias-oracle can | |
706 | be queried to disambiguate explicit and implicit memory references. | |
6de9cd9a DN |
707 | |
708 | @enumerate | |
5006671f | 709 | @item Memory SSA form. |
c75ab022 | 710 | |
5006671f RG |
711 | All statements that may use memory have exactly one accompanied use of |
712 | a virtual SSA name that represents the state of memory at the | |
713 | given point in the IL. | |
714 | ||
715 | All statements that may define memory have exactly one accompanied | |
716 | definition of a virtual SSA name using the previous state of memory | |
717 | and defining the new state of memory after the given point in the IL. | |
c75ab022 DB |
718 | |
719 | @smallexample | |
5006671f RG |
720 | int i; |
721 | int foo (void) | |
c75ab022 | 722 | @{ |
5006671f RG |
723 | # .MEM_3 = VDEF <.MEM_2(D)> |
724 | i = 1; | |
725 | # VUSE <.MEM_3> | |
726 | return i; | |
c75ab022 DB |
727 | @} |
728 | @end smallexample | |
729 | ||
5006671f RG |
730 | The virtual SSA names in this case are @code{.MEM_2(D)} and |
731 | @code{.MEM_3}. The store to the global variable @code{i} | |
732 | defines @code{.MEM_3} invalidating @code{.MEM_2(D)}. The | |
733 | load from @code{i} uses that new state @code{.MEM_3}. | |
734 | ||
735 | The virtual SSA web serves as constraints to SSA optimizers | |
736 | preventing illegitimate code-motion and optimization. It | |
737 | also provides a way to walk related memory statements. | |
c75ab022 | 738 | |
6ccde948 | 739 | @item Points-to and escape analysis. |
6de9cd9a | 740 | |
5006671f RG |
741 | Points-to analysis builds a set of constraints from the GIMPLE |
742 | SSA IL representing all pointer operations and facts we do | |
743 | or do not know about pointers. Solving this set of constraints | |
744 | yields a conservatively correct solution for each pointer | |
745 | variable in the program (though we are only interested in | |
746 | SSA name pointers) as to what it may possibly point to. | |
747 | ||
748 | This points-to solution for a given SSA name pointer is stored | |
749 | in the @code{pt_solution} sub-structure of the | |
750 | @code{SSA_NAME_PTR_INFO} record. The following accessor | |
751 | functions are available: | |
6de9cd9a | 752 | |
6ccde948 | 753 | @itemize @bullet |
5006671f RG |
754 | @item @code{pt_solution_includes} |
755 | @item @code{pt_solutions_intersect} | |
6ccde948 | 756 | @end itemize |
6de9cd9a | 757 | |
5006671f RG |
758 | Points-to analysis also computes the solution for two special |
759 | set of pointers, @code{ESCAPED} and @code{CALLUSED}. Those | |
760 | represent all memory that has escaped the scope of analysis | |
761 | or that is used by pure or nested const calls. | |
6de9cd9a | 762 | |
5006671f | 763 | @item Type-based alias analysis |
6de9cd9a | 764 | |
5006671f RG |
765 | Type-based alias analysis is frontend dependent though generic |
766 | support is provided by the middle-end in @code{alias.c}. TBAA | |
767 | code is used by both tree optimizers and RTL optimizers. | |
d2927bd5 JC |
768 | |
769 | Every language that wishes to perform language-specific alias analysis | |
770 | should define a function that computes, given a @code{tree} | |
771 | node, an alias set for the node. Nodes in different alias sets are not | |
772 | allowed to alias. For an example, see the C front-end function | |
773 | @code{c_get_alias_set}. | |
6de9cd9a | 774 | |
5006671f | 775 | @item Tree alias-oracle |
6de9cd9a | 776 | |
5006671f RG |
777 | The tree alias-oracle provides means to disambiguate two memory |
778 | references and memory references against statements. The following | |
779 | queries are available: | |
6de9cd9a | 780 | |
5006671f RG |
781 | @itemize @bullet |
782 | @item @code{refs_may_alias_p} | |
783 | @item @code{ref_maybe_used_by_stmt_p} | |
784 | @item @code{stmt_may_clobber_ref_p} | |
785 | @end itemize | |
6de9cd9a | 786 | |
5006671f RG |
787 | In addition to those two kind of statement walkers are available |
788 | walking statements related to a reference ref. | |
789 | @code{walk_non_aliased_vuses} walks over dominating memory defining | |
790 | statements and calls back if the statement does not clobber ref | |
791 | providing the non-aliased VUSE. The walk stops at | |
792 | the first clobbering statement or if asked to. | |
793 | @code{walk_aliased_vdefs} walks over dominating memory defining | |
794 | statements and calls back on each statement clobbering ref | |
795 | providing its aliasing VDEF. The walk stops if asked to. | |
6de9cd9a | 796 | |
6de9cd9a | 797 | @end enumerate |
5006671f | 798 | |
4d7a65ea RG |
799 | |
800 | @node Memory model | |
801 | @section Memory model | |
802 | @cindex memory model | |
803 | ||
804 | The memory model used by the middle-end models that of the C/C++ | |
805 | languages. The middle-end has the notion of an effective type | |
806 | of a memory region which is used for type-based alias analysis. | |
807 | ||
808 | The following is a refinement of ISO C99 6.5/6, clarifying the block copy case | |
809 | to follow common sense and extending the concept of a dynamic effective | |
810 | type to objects with a declared type as required for C++. | |
811 | ||
812 | @smallexample | |
813 | The effective type of an object for an access to its stored value is | |
814 | the declared type of the object or the effective type determined by | |
815 | a previous store to it. If a value is stored into an object through | |
816 | an lvalue having a type that is not a character type, then the | |
817 | type of the lvalue becomes the effective type of the object for that | |
818 | access and for subsequent accesses that do not modify the stored value. | |
819 | If a value is copied into an object using @code{memcpy} or @code{memmove}, | |
820 | or is copied as an array of character type, then the effective type | |
821 | of the modified object for that access and for subsequent accesses that | |
822 | do not modify the value is undetermined. For all other accesses to an | |
823 | object, the effective type of the object is simply the type of the | |
824 | lvalue used for the access. | |
825 | @end smallexample | |
826 |